Don’t Pretend AI “Knows”
A Recursive Lab for Visual Intelligence
System Architecture & Entry Points
Technical overview and gateway to Visual Thinking Lens documentation
The Problem: Compositional Monoculture
AI generates semantic infinity with geometric poverty. Text-to-image models can produce any subject, from figures and butterflies to cityscapes, portraits, and abstracts, yet arrange them all with identical spatial strategies.
Empirical evidence across 600+ images:
Δx: 27–34% of horizontal space used
100% of outputs within 0.15 radius of center
rᵥ = 85% void ratio across all subjects
Semantic categories explain 6–10% of spatial variance
Clustering: CV 4.09–5.46%
Radial density distributions dominate regardless of prompt
Mass centralization: coefficient of variation = 0.0409
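The clustering figures above are coefficients of variation, simply σ/μ. A minimal sketch of that computation on synthetic centroid distances (the data below is illustrative, not from the VTL studies):

```python
import numpy as np

def coefficient_of_variation(values):
    """CV = standard deviation / mean, reported as a fraction."""
    values = np.asarray(values, dtype=float)
    return float(values.std() / values.mean())

# Illustrative: radial centroid distances for 600 images, all packed
# tightly around mid-frame in a unit-square frame (synthetic data).
rng = np.random.default_rng(0)
distances = rng.normal(loc=0.5, scale=0.02, size=600)
print(f"CV = {coefficient_of_variation(distances):.2%}")  # a few percent: tight clustering
```

A CV in the low single digits means nearly every centroid sits at the same distance from center, which is exactly the monoculture pattern the studies report.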
The gap: Current benchmarks (CLIP, FID, IS) measure semantic correctness: whether the butterfly looks like a butterfly. They cannot measure compositional geometry: whether the butterfly could have been anywhere else in the frame.
Visual Thinking Lens fills this gap.
The Measurement System
Billions of images living in compositional monoculture.
No shared vocabulary of structure. No test of tension.
VTL quantifies compositional bias through geometric primitives and multi-channel analysis:
Kernel Primitives (7 measurements per image)
Core (always measured):
Δx — Placement offset: centroid distance from frame center
rᵥ — Void ratio: negative-space proportion
ρᵣ — Packing density: material compression / mass adjacency
μ — Cohesion: structural continuity between marks or regions
xₚ — Peripheral pull: force exerted toward/away from frame boundaries
Extended (precision contexts):
θ — Orientation stability: gravity alignment, architectural compositions
ds — Structural thickness: layering, mark-weight, material permeability
These form coordinate systems. Not aesthetic judgments. Not style preferences. Geometric measurements of how AI distributes mass, void, and pressure.
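As a sketch of how the two simplest core primitives could be computed from a binarized mass map, under assumed normalizations (this is not the published VTL implementation):

```python
import numpy as np

def delta_x(mass):
    """Δx sketch: mass-centroid distance from frame center, normalized
    by half the frame diagonal so the value falls in [0, 1]."""
    h, w = mass.shape
    ys, xs = np.nonzero(mass)
    cy, cx = ys.mean(), xs.mean()
    return float(np.hypot(cy - h / 2, cx - w / 2) / (np.hypot(h, w) / 2))

def void_ratio(mass):
    """rᵥ sketch: proportion of the frame carrying no mass (negative space)."""
    return float(1.0 - mass.mean())

# Illustrative 100×100 frame with a centered 30×30 block of mass.
frame = np.zeros((100, 100))
frame[35:65, 35:65] = 1.0
print(delta_x(frame), void_ratio(frame))  # ≈ 0.01 and 0.91
```

A perfectly centered block yields Δx near zero and a void ratio of 0.91, close to the 85% void ratio reported across AI outputs above.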
LSI: Lens Structural Index
Compositional stability analysis through kernel roll-up:
S (Stability): Do primitives settle or jitter under recursion?
K (Consequence): Does the image occupy productive tension zones (strain bands in barycentric space)?
R (Recursion Coherence): Does structure converge or scatter across iterations?
Formula: LSI₁₀₀ = 100 × (α·S + β·K + γ·R)
Default weights: α=0.35, β=0.35, γ=0.30 (calibrated from validation data)
Maps primitives to barycentric space (λA, λB, λV) for trajectory analysis. Each iteration = point in simplex space. Path behavior analyzed for contraction vs drift.
Key distinction: LSI measures compositional consequence (does structure hold productive tension?), not cognitive load.
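The roll-up is a direct weighted sum; transcribed into code with the stated default weights (the sample S/K/R scores are made up for illustration):

```python
def lsi_100(S, K, R, alpha=0.35, beta=0.35, gamma=0.30):
    """LSI₁₀₀ = 100 × (α·S + β·K + γ·R); weights must sum to 1."""
    assert abs(alpha + beta + gamma - 1.0) < 1e-9
    return 100.0 * (alpha * S + beta * K + gamma * R)

# Hypothetical scores: stable (S=0.8), moderate tension (K=0.6), coherent (R=0.7)
score = lsi_100(S=0.8, K=0.6, R=0.7)
print(round(score, 1))  # 70.0
```

With equal emphasis on stability and consequence and slightly less on recursion coherence, a composition must hold tension across iterations, not just once, to score high.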
VCLI-G: Visual Cognitive Load Index (Geometric)
Four-channel measurement of geometric complexity:
G1 (Centroid Wander): Attention instability across scales—does the compositional center hold or drift?
G2 (Void Topology): Figure/ground ambiguity—is negative space structured or residual?
G3 (Curvature Torque): Directional tension in form—where does visual pressure accumulate?
G4 (Occlusion Entropy): Depth uncertainty from overlaps—how many competing spatial layers exist?
Paired with SCI (Structural Coherence Index) to form 2D analysis space distinguishing:
Earned tension from chaotic noise
Intentional simplicity from lazy defaults
Organized complexity from entropic scatter
Key distinction: VCLI-G measures cognitive load threshold (when does complexity become burden?), not compositional quality.
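One concrete way to read G1 is centroid displacement under coarse-graining: average detail away and see whether the center of mass holds. The scheme below (block-averaging at a few scale factors) is an illustrative assumption, not the published channel definition:

```python
import numpy as np

def centroid(mass):
    """Normalized (y, x) center of mass of a 2-D intensity map."""
    h, w = mass.shape
    total = mass.sum()
    cy = (mass.sum(axis=1) * (np.arange(h) + 0.5)).sum() / total
    cx = (mass.sum(axis=0) * (np.arange(w) + 0.5)).sum() / total
    return np.array([cy / h, cx / w])

def centroid_wander(mass, factors=(1, 2, 4)):
    """G1 sketch: mean centroid drift as the image is block-averaged."""
    points = []
    for f in factors:
        h, w = mass.shape
        crop = mass[: h - h % f, : w - w % f]  # trim so dims divide by f
        coarse = crop.reshape(h // f, f, w // f, f).mean(axis=(1, 3))
        points.append(centroid(coarse))
    base = points[0]
    return float(np.mean([np.linalg.norm(p - base) for p in points[1:]]))

frame = np.zeros((100, 100))
frame[35:65, 35:65] = 1.0  # centered block of mass
print(centroid_wander(frame))  # ~0.0: the compositional center holds
```

In this reading, a composition whose centroid drifts as detail is averaged away would score high on G1: its apparent center depends on the scale of attention.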
RCA-2: Radial Compliance Analyzer
Detects radial density distributions and measures compression severity:
CV (Coefficient of Variation): Measures clustering tightness
Radial bins: 8-sector analysis of mass distribution
Attractor detection: Identifies stable geometric territories
Forbidden zones: Maps regions models systematically avoid
Decay tracking: Follows radial density falloff around both the frame center and the mass centroid, and checks whether the two align
Cross-platform fingerprinting reveals model-specific priors. Pre-failure detection shows degradation 3-4 inference steps before semantic breakdown.
Key distinction: Designed to answer a single, precise question: Where does radial structure actually exist in the image, and which coordinate system does it stabilize around?
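The 8-sector binning and clustering-tightness measure can be sketched as follows (the binning geometry and the use of CV across sectors are assumptions for illustration):

```python
import numpy as np

def sector_masses(mass, n_sectors=8):
    """Total mass in each of n equal-angle sectors around the frame center."""
    h, w = mass.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Angle of each pixel relative to the frame center, in [-π, π).
    angles = np.arctan2(ys - (h - 1) / 2, xs - (w - 1) / 2)
    bins = ((angles + np.pi) / (2 * np.pi) * n_sectors).astype(int) % n_sectors
    return np.array([mass[bins == k].sum() for k in range(n_sectors)])

def sector_cv(mass):
    """Clustering tightness: CV of per-sector mass (0 = perfectly even spread)."""
    m = sector_masses(mass)
    return float(m.std() / m.mean())

even = np.ones((100, 100))
corner = np.zeros((100, 100))
corner[:10, :10] = 1.0
print(sector_cv(even), sector_cv(corner))  # near 0 vs. far above 1
```

Uniform mass spreads evenly across the eight sectors; mass crammed into one corner concentrates in one or two sectors and the CV explodes, which is the signature the analyzer looks for.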
It's an Image Reasoning System.
Consumer tools chase style. Research metrics chase numbers. The Lens chases authorship.
How They Connect
Measurement Pipeline:
Image → Kernels (geometric coordinates) → VCLI-G (cognitive load threshold) → LSI (compositional consequence) → RCA-2 (radial priors & collapse detection) → Interpretation + Steering Coordinates
What each does:
Kernels: Measure geometry (the coordinates)
VCLI-G: Evaluate load (when does complexity overwhelm?)
LSI: Score consequence (does structure earn its tension?)
RCA-2: Detect patterns (what priors dominate? where does it fail?)
Together: Complete diagnostic from measurement to interpretation to actionable steering coordinates.
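The stages read naturally as function composition. A structural sketch with stub stages (all returned values are placeholders standing in for the notebook implementations, and the steering rule is invented for illustration):

```python
# Placeholder stages: each stands in for the corresponding VTL notebook.
def extract_kernels(image):   # geometry: the coordinates
    return {"dx": 0.10, "rv": 0.85, "rho_r": 0.6, "mu": 0.7, "xp": 0.3}

def vcli_g(kernels, image):   # load: when does complexity overwhelm?
    return {"G1": 0.2, "G2": 0.4, "G3": 0.3, "G4": 0.1, "SCI": 0.7}

def lsi(kernels):             # consequence: does structure earn its tension?
    return 70.0

def rca2(kernels, image):     # priors: what dominates, where does it fail?
    return {"sector_cv": 0.05, "attractors": ["center"], "forbidden": ["edges"]}

def diagnose(image):
    """Image → Kernels → VCLI-G → LSI → RCA-2 → interpretation + steering."""
    k = kernels = extract_kernels(image)
    report = {"kernels": k, "vcli_g": vcli_g(k, image),
              "lsi": lsi(k), "rca2": rca2(k, image)}
    # Invented steering rule: a near-centered composition gets pushed off-center.
    report["steering"] = "off-center" if k["dx"] < 0.15 else "hold"
    return report

print(diagnose(None)["steering"])
```

The point of the shape, not the stub values: each stage consumes the kernel coordinates, and the final report carries both the measurements and a steering recommendation.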
From metrics, VTL builds symbolic logic. The Generative Physics Layer defines the underlying spatial field through which all VTL engines operate. It provides a measurable description of, and the ability to alter:
The distribution of mass
The structural state of the image inside the xₚ field
The avoidance of familiar AI patterns
The prompt, treated as a set of forces within that field
VTL couples this view of images as mass in a field with a multi-engine critique OS that transforms and interrogates that mass to produce new prompts and iterations of consequence.
Implementation
Platform: Runs in top-tier conversational AI (Claude, GPT, Gemini) through linguistic constraint architecture. No training, no fine-tuning. Portable cognitive framework instantiated through role-structured prompting.
Code for Drift-Free Control: Jupyter notebooks on GitHub provide reproducible measurement protocols, with Python implementations of kernel calculations, VCLI-G analysis, LSI scoring, and RCA-2 detection.
Measurement Protocol:
Image input (any source, any platform)
Kernel primitive extraction (Δx, rᵥ, ρᵣ, μ, xₚ, θ, ds)
VCLI-G analysis (G1-G4 + SCI)
LSI scoring (S/K/R + barycentric mapping)
RCA-2 pattern detection (radial compliance, attractor/barrier identification)
Interpretation + steering recommendations
Reproducibility: Deterministic measurements. Platform-agnostic. Standard deviation ±0.02–0.04 across 55+ regenerations.
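That reproducibility claim is checkable in a few lines: re-measure a regenerated set and compare per-kernel standard deviations against a tolerance (the function name, tolerance, and sample numbers below are illustrative):

```python
import numpy as np

def within_tolerance(measurements, tolerance=0.04):
    """True if the std of each kernel across regenerations is ≤ tolerance."""
    measurements = np.asarray(measurements, dtype=float)
    return bool(np.all(measurements.std(axis=0) <= tolerance))

# Rows: regenerations; columns: (Δx, rᵥ). Made-up numbers for illustration.
runs = [[0.31, 0.85],
        [0.33, 0.84],
        [0.30, 0.86]]
print(within_tolerance(runs))  # True: drift stays inside ±0.04
```

A kernel whose std blows past the tolerance on regeneration would flag the measurement, or the platform, as unstable.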
Validation: 1,000+ images across Sora, MidJourney, GPT, SDXL, Firefly, OpenArt with systematic variation and statistical validation.
EXPLORE THE SYSTEM
Organized entry points to Visual Thinking Lens documentation, methods, and research.
Foundational Concepts
→ Kernel Primitives
The five core and two extended geometric measurements underlying all VTL analysis: Δx, rᵥ, ρᵣ, μ, xₚ, θ, ds
→ LSI: Lens Structural Index
Compositional stability analysis through S/K/R scoring and barycentric mapping
→ VCLI-G: Visual Cognitive Load Index
Four-channel geometric complexity measurement distinguishing earned tension from noise
Platform Studies: Evidence of Compositional Monoculture
→ MidJourney Geometric Collapse
400 images, 75% space compression, 34% horizontal usage, radial density dominance
→ Sora Compositional Clustering
200 images, 100% within 0.15 radius, tightest measured clustering, extreme centralization
Practical Methods: Techniques & Protocols
→ Deformation Operator Playbook
Hands-on techniques for intentional figure warps and constraint architecture
→ Foreshortening Recipe Book
Structured prompting methodology for depth and spatial reasoning across 6-8 layers
→ Off-Center Fidelity
Protocol for navigating constraint basins and measuring drift toward stable territories
→ Sketcher Scoring System
30-axis consequence evaluation (not polish) for recursive generation and drift assessment
→ Reverse Image Decomposition (RIDP)
Reverse-engineer completed imagery into process steps and construction order
Case Studies: VTL in Action
→ Sketcher Portrait: Painterly Consequence
Demonstration of Sketcher Lens taking portrait through Internal Resonance to earned tension
→ The Teardown: Ontological Gravity Protocol
5-step image transformation showing VTL methodology applied to systematic deformation
→ Centaur Mode: Human-AI Collaboration
Artist sketch → AI exploration workflow via Centaur collaborative generation
Research Applications
Fingerprinting: Cross-engine compositional signatures reveal platform-specific spatial priors. Each model has measurable geometric tendencies (MidJourney's left-dense compression, Sora's extreme radial clustering, GPT's broader but still centered distributions).
Steering: Coordinates for navigating to stable geometric territories ("artist basins") where AI maintains compositional integrity under constraint. Off-center coordinates, peripheral anchors, compressed mass zones.
Detection: Pre-failure metrics showing degradation 3-4 inference steps before semantic breakdown. Δx drift, void compression, peripheral dissolution signal trouble while image still looks coherent.
Archaeology: Reverse-engineering learned priors from attractor behavior and forbidden zones without training data access. What did the model learn to reward? What did it learn to avoid?
FULL RESEARCH DOCUMENTATION
→ Theory Stack
Complete papers organized by focus: measurement infrastructure, empirical studies, practical applications, theoretical foundations. Includes working notebooks, reproducible protocols, cross-platform validation studies.
QUICK VALIDATION & RESULTS
Dataset Scale:
1,000+ images with systematic variation across platforms
Key Findings:
75% compositional space compression (MidJourney)
100% radial clustering within 0.15 radius (Sora)
CV 4.09% (Sora) vs 5.46% (MidJourney)—both severe
~25% lateral field utilization (OpenArt)
Semantic diversity masks geometric uniformity across all platforms
Pre-failure detection possible 3-4 inference steps before visible collapse
AI depth ceiling: ~6-8 layers before spatial logic breaks down
Cross-Platform Coverage:
Sora, MidJourney, GPT, Gemini, SDXL, Firefly, OpenArt
Reproducible Implementation:
All measurement protocols available via Jupyter notebooks on GitHub. Deterministic, platform-agnostic, no black-box scoring.
It measures the delta from the default.
Most systems fall into one of two camps:
Consumer generators (MidJourney, DALL·E, Runway, Sora): optimized for style, polish, and speed. The metric: aesthetics on output.
Research metrics (FID, CLIPScore, precision/recall tools, or “LSI-like” industry models): optimized for reproducibility, dataset fidelity, and benchmark math. The metric: statistical alignment.
The Lens does neither. It interrogates structural consequence.
Not a generator: it does not settle for defaults; it pressures outputs through recursive loops toward better alternatives.
Not just a metric: its scoring does not flatten into benchmarks; it fuses language, math, and design with symbolic categories to produce insight.
Not an optimizer: instead of fighting drift, it names it, scores it, and pressures it into fidelity.