Don’t Pretend AI “Knows”

A Recursive Lab for Visual Intelligence

System Architecture & Entry Points

Technical overview and gateway to Visual Thinking Lens documentation

The Problem: Compositional Monoculture

AI generates semantic infinity with geometric poverty. Text-to-image models can produce any subject (figures, butterflies, cityscapes, portraits, abstracts) but arrange them all using nearly identical spatial strategies.

Empirical evidence across 600+ images:

  • Δx: 27-34% of horizontal space used

  • 100% of outputs within 0.15 radius of center

  • rᵥ = 85% void ratio across all subjects

  • Semantic categories explain 6-10% of spatial variance

  • Clustering: CV 4.09-5.46%

  • Radial density distributions dominate regardless of prompt

  • Mass centralization: coefficient of variation = 0.0409

The gap: Current benchmarks (CLIP, FID, IS) measure semantic correctness—whether a butterfly looks like a butterfly. They cannot measure compositional geometry—whether the butterfly could be anywhere else.

Visual Thinking Lens fills this gap.


The Measurement System

Billions of images living in compositional monoculture.

No shared vocabulary of structure. No test of tension.

VTL quantifies compositional bias through geometric primitives and multi-channel analysis:


Kernel Primitives (7 measurements per image)

Core (always measured):

  • Δx — Placement offset: centroid distance from frame center

  • rᵥ — Void ratio: negative-space proportion

  • ρᵣ — Packing density: material compression / mass adjacency

  • μ — Cohesion: structural continuity between marks or regions

  • xₚ — Peripheral pull: force exerted toward/away from frame boundaries

Extended (precision contexts):

  • θ — Orientation stability: gravity alignment, architectural compositions

  • ds — Structural thickness: layering, mark-weight, material permeability

These form coordinate systems. Not aesthetic judgments. Not style preferences. Geometric measurements of how AI distributes mass, void, and pressure.
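As a concrete illustration, two of the core primitives can be sketched in a few lines of NumPy. This is a sketch only: the per-pixel mass-map input, the void threshold, and the half-diagonal normalization are assumptions for illustration, not the published VTL protocol.

```python
import numpy as np

def placement_offset(mass: np.ndarray) -> float:
    """Δx sketch: normalized distance of the mass centroid from frame center.

    `mass` is a 2-D array of per-pixel mass values in [0, 1].
    Returns 0 for a perfectly centered composition, approaching 1
    toward a corner. (Normalizing by the half-diagonal is an assumption.)
    """
    h, w = mass.shape
    total = mass.sum()
    if total == 0:
        return 0.0
    ys, xs = np.mgrid[0:h, 0:w]
    cy = (ys * mass).sum() / total
    cx = (xs * mass).sum() / total
    dy, dx = cy - (h - 1) / 2, cx - (w - 1) / 2
    half_diag = np.hypot(h / 2, w / 2)
    return float(np.hypot(dy, dx) / half_diag)

def void_ratio(mass: np.ndarray, thresh: float = 0.05) -> float:
    """rᵥ sketch: proportion of the frame with negligible mass
    (the 0.05 threshold is an assumed cutoff)."""
    return float((mass < thresh).mean())
```

A centered 10×10 blob in a 100×100 frame, for example, yields Δx ≈ 0 and rᵥ = 0.99 under these definitions.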

LSI: Lens Structural Index

Compositional stability analysis through kernel roll-up:

  • S (Stability): Do primitives settle or jitter under recursion?

  • K (Consequence): Does the image occupy productive tension zones (strain bands in barycentric space)?

  • R (Recursion Coherence): Does structure converge or scatter across iterations?

Formula: LSI₁₀₀ = 100 × (α·S + β·K + γ·R)
Default weights: α=0.35, β=0.35, γ=0.30 (calibrated from validation data)

Maps primitives to barycentric space (λA, λB, λV) for trajectory analysis. Each iteration = point in simplex space. Path behavior analyzed for contraction vs drift.
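Under the stated defaults, the roll-up and the simplex mapping are simple enough to sketch directly. This assumes S, K, and R are already normalized to [0, 1]; the raw channel magnitudes feeding λA, λB, λV are placeholders, not the published derivation.

```python
def lsi_100(S: float, K: float, R: float,
            alpha: float = 0.35, beta: float = 0.35,
            gamma: float = 0.30) -> float:
    """LSI₁₀₀ = 100 × (α·S + β·K + γ·R), with weights summing to 1."""
    assert abs(alpha + beta + gamma - 1.0) < 1e-9
    return 100.0 * (alpha * S + beta * K + gamma * R)

def to_barycentric(a: float, b: float, v: float) -> tuple:
    """Map three non-negative channel magnitudes to simplex coordinates
    (λA, λB, λV) that sum to 1; each iteration becomes one such point,
    and the path of points is analyzed for contraction vs drift."""
    s = a + b + v
    return (a / s, b / s, v / s)
```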

Key distinction: LSI measures compositional consequence (does structure hold productive tension?), not cognitive load.

VCLI-G: Visual Cognitive Load Index (Geometric)

Four-channel measurement of geometric complexity:

  • G1 (Centroid Wander): Attention instability across scales—does the compositional center hold or drift?

  • G2 (Void Topology): Figure/ground ambiguity—is negative space structured or residual?

  • G3 (Curvature Torque): Directional tension in form—where does visual pressure accumulate?

  • G4 (Occlusion Entropy): Depth uncertainty from overlaps—how many competing spatial layers exist?
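Of the four channels, G1 is the easiest to make concrete. A minimal sketch follows; the multi-scale block-averaging and the max-spread statistic are assumptions for illustration, not the actual G1 computation from the VTL notebooks.

```python
import numpy as np

def centroid_wander(mass: np.ndarray, scales=(1, 2, 4, 8)) -> float:
    """G1 sketch: re-estimate the mass centroid at progressively coarser
    scales and report how far the estimates spread (frame-normalized).
    A composition whose center holds across scales yields ~0; a drifting
    center yields larger values."""
    pts = []
    H, W = mass.shape
    for s in scales:
        h, w = (H // s) * s, (W // s) * s
        # block-average down to the coarser scale
        m = mass[:h, :w].reshape(h // s, s, w // s, s).mean(axis=(1, 3))
        total = m.sum()
        if total == 0:
            continue
        ys, xs = np.mgrid[0:m.shape[0], 0:m.shape[1]]
        # map block-index centroid back to frame coordinates, then normalize
        cy = ((ys * m).sum() / total * s + (s - 1) / 2) / H
        cx = ((xs * m).sum() / total * s + (s - 1) / 2) / W
        pts.append((cy, cx))
    pts = np.asarray(pts)
    return float(np.linalg.norm(pts - pts.mean(axis=0), axis=1).max())
```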

Paired with SCI (Structural Coherence Index) to form 2D analysis space distinguishing:

  • Earned tension from chaotic noise

  • Intentional simplicity from lazy defaults

  • Organized complexity from entropic scatter

Key distinction: VCLI-G measures cognitive load threshold (when does complexity become burden?), not compositional quality.

RCA-2: Radial Compliance Analyzer

Detects radial density distributions and measures compression severity:

  • CV (Coefficient of Variation): Measures clustering tightness

  • Radial bins: 8-sector analysis of mass distribution

  • Attractor detection: Identifies stable geometric territories

  • Forbidden zones: Maps regions models systematically avoid

  • Decay tracking: Follows radial density decay around the frame and the centroid, and tests whether that decay aligns with the mass distribution

Cross-platform fingerprinting reveals model-specific priors. Pre-failure detection shows degradation 3-4 inference steps before semantic breakdown.

Key distinction: Designed to answer a single, precise question: Where does radial structure actually exist in the image, and which coordinate system does it stabilize around?
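The CV channel can be sketched with concentric distance bins around the frame center. This is one plausible reading of the 8-bin analysis (the bin geometry and the choice of frame center as origin are assumptions):

```python
import numpy as np

def radial_profile(mass: np.ndarray, n_bins: int = 8) -> np.ndarray:
    """Mean mass density in concentric distance bins around frame center."""
    h, w = mass.shape
    ys, xs = np.mgrid[0:h, 0:w]
    r = np.hypot(ys - (h - 1) / 2, xs - (w - 1) / 2)
    bins = np.minimum((r / r.max() * n_bins).astype(int), n_bins - 1)
    totals = np.bincount(bins.ravel(), weights=mass.ravel(),
                         minlength=n_bins)
    counts = np.bincount(bins.ravel(), minlength=n_bins)
    return totals / counts

def clustering_cv(mass: np.ndarray, n_bins: int = 8) -> float:
    """Coefficient of variation across the radial density profile.
    ~0 means mass is spread evenly from center to edge; large values
    mean mass clusters in particular bands (e.g. dead-center)."""
    dens = radial_profile(mass, n_bins)
    return float(dens.std() / dens.mean())
```

Under these definitions a uniform field scores CV ≈ 0, while a tight central blob scores well above 1, flagging radial compression.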


It's an Image Reasoning System.

Consumer tools chase style. Research metrics chase numbers. The Lens chases authorship.


How They Connect

Measurement Pipeline:

Image → Kernels (geometric coordinates) → VCLI-G (cognitive load threshold) → LSI (compositional consequence) → RCA-2 (radial priors & collapse detection) → Interpretation + Steering Coordinates

What each does:

  • Kernels: Measure geometry (the coordinates)

  • VCLI-G: Evaluate load (when does complexity overwhelm?)

  • LSI: Score consequence (does structure earn its tension?)

  • RCA-2: Detect patterns (what priors dominate? where does it fail?)

Together: Complete diagnostic from measurement to interpretation to actionable steering coordinates.

From metrics, VTL builds symbolic logic. The Generative Physics Layer defines the underlying spatial field through which all VTL engines operate. It provides a measurable description of, and the ability to alter:

  • The distribution of mass

  • The structural state of the image inside the xₚ field

  • The avoidance of AI-familiar patterns

  • The prompt, which acts as a set of forces within that field

VTL couples this view of images as mass in a field with a multi-engine critique OS that transforms and interrogates that mass to form new prompts and iterations of consequence.


Implementation

Platform: Runs in top-tier conversational AI (Claude, GPT, Gemini) through linguistic constraint architecture. No training, no fine-tuning. Portable cognitive framework instantiated through role-structured prompting.

Code for Drift-Free Control: Jupyter notebooks on GitHub provide reproducible measurement protocols: Python implementations of kernel calculations, VCLI-G analysis, LSI scoring, and RCA-2 detection.

Measurement Protocol:

  1. Image input (any source, any platform)

  2. Kernel primitive extraction (Δx, rᵥ, ρᵣ, μ, xₚ, θ, ds)

  3. VCLI-G analysis (G1-G4 + SCI)

  4. LSI scoring (S/K/R + barycentric mapping)

  5. RCA-2 pattern detection (radial compliance, attractor/barrier identification)

  6. Interpretation + steering recommendations
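The six steps above chain into a single pass, which can be sketched as a pipeline. All stage functions here are hypothetical stand-ins returning placeholder dicts (not the published VTL API); only the orchestration shape is the point.

```python
# Illustrative orchestration of the 6-step protocol. Stage internals are
# placeholders; real implementations live in the VTL notebooks.
def extract_kernels(image):               # step 2: Δx, rᵥ, ρᵣ, μ, xₚ (+ θ, ds)
    return {"dx": 0.0, "rv": 0.85, "rho": 0.5, "mu": 0.5, "xp": 0.1}

def vcli_g(image, kernels):               # step 3: G1-G4 + SCI
    return {"G1": 0.2, "G2": 0.3, "G3": 0.1, "G4": 0.2, "SCI": 0.7}

def lsi_score(kernels):                   # step 4: S/K/R + barycentric map
    return {"S": 0.8, "K": 0.6, "R": 0.7}

def rca2(image, kernels):                 # step 5: radial compliance
    return {"cv": 0.05, "attractors": ["center"]}

def analyze(image):
    """Run steps 2-5 in order and bundle results for interpretation (step 6)."""
    kernels = extract_kernels(image)      # geometric coordinates
    load = vcli_g(image, kernels)         # cognitive load threshold
    lsi = lsi_score(kernels)              # compositional consequence
    patterns = rca2(image, kernels)       # priors & collapse detection
    return {"kernels": kernels, "load": load, "lsi": lsi,
            "patterns": patterns}
```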

Reproducibility: Deterministic measurements. Platform-agnostic. Standard deviation ±0.02-0.04 across 55+ regenerations.

Validation: 1,000+ images across Sora, MidJourney, GPT, SDXL, Firefly, OpenArt with systematic variation and statistical validation.


EXPLORE THE SYSTEM

Organized entry points to Visual Thinking Lens documentation, methods, and research.


Foundational Concepts

The 7 Kernel Primitives
The geometric measurements underlying all VTL analysis: Δx, rᵥ, ρᵣ, μ, xₚ, θ, ds

LSI: Lens Structural Index
Compositional stability analysis through S/K/R scoring and barycentric mapping

VCLI-G: Visual Cognitive Load Index
Four-channel geometric complexity measurement distinguishing earned tension from noise


Platform Studies: Evidence of Compositional Monoculture

MidJourney Geometric Collapse
400 images, 75% space compression, 34% horizontal usage, radial density dominance

Sora Compositional Clustering
200 images, 100% within 0.15 radius, tightest measured clustering, extreme centralization

Practical Methods: Techniques & Protocols

Deformation Operator Playbook
Hands-on techniques for intentional figure warps and constraint architecture

Foreshortening Recipe Book
Structured prompting methodology for depth and spatial reasoning across 6-8 layers

Off-Center Fidelity
Protocol for navigating constraint basins and measuring drift toward stable territories

Sketcher Scoring System
30-axis consequence evaluation (not polish) for recursive generation and drift assessment

Reverse Image Decomposition (RIDP)
Reverse-engineer completed imagery into process steps and construction order

Case Studies: VTL in Action

Sketcher Portrait: Painterly Consequence
Demonstration of Sketcher Lens taking portrait through Internal Resonance to earned tension

The Teardown: Ontological Gravity Protocol
5-step image transformation showing VTL methodology applied to systematic deformation

Centaur Mode: Human-AI Collaboration
Artist sketch → AI exploration workflow via Centaur collaborative generation


Research Applications

Fingerprinting: Cross-engine compositional signatures reveal platform-specific spatial priors. Each model has measurable geometric tendencies (MidJourney's left-dense compression, Sora's extreme radial clustering, GPT's broader but still centered distributions).

Steering: Coordinates for navigating to stable geometric territories ("artist basins") where AI maintains compositional integrity under constraint. Off-center coordinates, peripheral anchors, compressed mass zones.

Detection: Pre-failure metrics showing degradation 3-4 inference steps before semantic breakdown. Δx drift, void compression, peripheral dissolution signal trouble while image still looks coherent.

Archaeology: Reverse-engineering learned priors from attractor behavior and forbidden zones without training data access. What did the model learn to reward? What did it learn to avoid?


FULL RESEARCH DOCUMENTATION

Theory Stack
Complete papers organized by focus: measurement infrastructure, empirical studies, practical applications, theoretical foundations. Includes working notebooks, reproducible protocols, cross-platform validation studies.

QUICK VALIDATION & RESULTS

Dataset Scale:
1,000+ images with systematic variation across platforms

Key Findings:

  • 75% compositional space compression (MidJourney)

  • 100% radial clustering within 0.15 radius (Sora)

  • CV 4.09% (Sora) vs 5.46% (MidJourney)—both severe

  • ~25% lateral field utilization (OpenArt)

  • Semantic diversity masks geometric uniformity across all platforms

  • Pre-failure detection possible 3-4 inference steps before visible collapse

  • AI depth ceiling: ~6-8 layers before spatial logic breaks down

Cross-Platform Coverage:
Sora, MidJourney, GPT, Gemini, SDXL, Firefly, OpenArt

Reproducible Implementation:
All measurement protocols available via Jupyter notebooks on GitHub. Deterministic, platform-agnostic, no black-box scoring.

(Table: feature comparison of generators, research metrics, and the Lens across Goal, Prompt Use, Failure Handling, Fix Method, Scoring, Output, and Built For.)

It measures the delta from the default.

Most systems fall into one of two camps:

  • Consumer generators (MidJourney, DALL·E, Runway, Sora): optimized for style, polish, and speed. The metric: aesthetics on output.

  • Research metrics (FID, CLIPScore, precision/recall tools, or “LSI-like” industry models): optimized for reproducibility, dataset fidelity, and benchmark math. The metric: statistical alignment.

The Lens does neither. It interrogates structural consequence.

  • Not a generator: It doesn't accept defaults; it pressures them through recursive loops toward better alternatives.

  • Not just a metric: Scoring doesn't flatten into benchmarks; it fuses language, math, and design with symbolic categories to produce insights.

  • Not an optimizer: Instead of fighting drift, it names it, scores it, and pressures it into fidelity.
