Off-Center Fidelity:

Constraint Basins for Stability and Drift in Generative Models

Reframing collapse as geography through centroid offset, void ratio, and rupture density coordinates

Close-up of a frog with large black eyes in the center, with two smaller images below showing different perspectives of the frog: one labeled 'Default Collapse' and the other 'Generative Steering.'

Abstract

Contemporary generative models exhibit a well-documented bias toward statistical centering: subjects drift back to the image centroid, negative space collapses into filler, and structural instabilities (fracture, delay, gesture) are smoothed out. This produces visually polished but structurally predictable results.

This proposes a constraint basin framework that formalizes off-center stability in generative images. By parameterizing outputs along three measurable axes, centroid offset (Δx), void ratio (r_v), and rupture density (ρ_r), this identifies reproducible attractor basins where images remain coherent while resisting collapse to the statistical mean. These basins are defined as constraint capsules: modular combinations of geometric, gestural, and temporal anchors that stabilize drift, sustain fracture, and preserve delay. This approach provides:

Interpretability: images can be scored within a latent constraint field.
Predictive control: capsules act as steering vectors, determining whether an image will collapse, hold, or fracture.
Generalizability: the same protocol applies to visual, textual, and multimodal generative systems.

This is interpretability + control:a protocol that retools drift and collapse into coordinates, not a generative model itself.

→ This document is a proposal-style framing, not a finished proof: it introduces a protocol that treats drift and collapse as navigable coordinates, and offers early qualitative evidence suggesting these constraints can be operationalized, with quantitative validation still pending."

Mapping Constraint Basins in Generative Systems

At the core is a reproducible basin, Δx ≈ 0.15, r_v ≈ 0.65, ρ_r mid-level, where off-center displacement, void ratio, and rupture density converge. This “corridor” reframes drift and collapse not as errors but as coordinates: zones where fidelity holds outside the statistical mean. The figure below and accompanying table anchor this claim, tying each constraint token to a measurable analogue in ML research. From here, the rest of the protocol extends outward, showing how constraint basins can be identified, iterated, and applied as tools of control.

3D bottom view of a graph illustrating a latent footprint map with a coordinate path and three colored bars representing centroid offset, entropy suppression, and collapse detection, with a legend at the bottom.

Each shaded block in the figure represents one of these constraint bands, aligned to an ML analogue:

Centroid Offset (blue; subject displacement),
Entropy Suppression (green; partial commitment),
Collapse Detection (red; rupture density),
Asymmetry Bias (purple; negative space weighting).

The black line marks the Coordinate Path (Corridor), the overlap trajectory where fidelity persists. This path anchors the system: a visual proof that off-center states can be mapped and held as reproducible zones. Capsule prompts reproducibly pulled outputs into the corridor (Δx ≈ 0.15, r_v ≈ 0.65, ρ_r mid-level), showing that drift can be steered as a coordinate zone rather than left as failure.

A table comparing constraint tokens used in ML research analogies, including categories like 'Centroid Offset,' 'Entropy Suppression,' 'Collapse Detection,' 'Asymmetry Bias,' and 'Coordinate Path,' with descriptions of their functions and nearest ML analogues.

The Problem

Generative models collapse toward the statistical mean. In images, this manifests as:

Subject drift back to the center.
Negative space filled with smoothing noise.
Fracture (or suppression), delay, and instability.

This collapse produces high surface fidelity but removes compositional tension. Existing mitigation strategies (creativity sliders, random seeds, prompt engineering) lack interpretability and reproducibility.

Series of four photorealistic images of a frog. The first shows a frontal portrait of the frog. The second displays the frog from a slight angle with off-center placement and extra negative space on one side. The third features the frog from a similar angle but with a corner placement, leaving negative space on one side and above. The fourth depicts the frog with a textured brown background, emphasizing patterns and textures, with the frog positioned with negative space.

Approach

This is a protocol for mapping drift and collapse into navigable constraint basins, and operationalizing fidelity off-center by introducing a constraint basin model: coordinate ranges where images remain coherent but resist collapsing to the mean.

Each image is mapped into latent space coordinates (Δx, r_v, ρ_r).
Specific coordinate bands define constraint basins, attractor regions where compositional instability is sustained without collapse.
These basins function as control capsules, modular recipes that can be embedded in prompts or tuning functions.

For example, the centroid offset, which in this instance we will call the “Ghost Density Basin” for a handle is (Δx = 0.15 (centroid offset), r_v = 0.65 (void ratio), ρ_r = 0.35 (rupture density)), stabilizes off-center displacement by balancing void expansion with rupture persistence. Moving outside this envelope produces either:

Snap-back collapse (image recenters, void flattens), or
Leakage/dissolution (over-displacement, coherence loss)

Thus, the model defines a stability field in latent space: coherent attractors bounded by collapse modes. If the centroid is the statistical center, these are the gravities off center converging, this moves subjects off center, offset ~1/6th frame, void ~2/3, with a haunted offset “stabilized drift” state of weighted displacement + rupture/void ratio. This allows generative systems to be navigated as structured fields rather than black-box randomizers → A repeatable “art” zone where “haunted” but stable compositions emerge, easily viewable in the woman portrait above. For illustrating the cohesive basin, we will transfer to another subject, to showcase that any subject can be used.

Side-by-side photos of a woman with shoulder-length dark hair wearing a black T-shirt, standing by a window in a dimly lit room with a textured wall.

This all translates to the following replicable prompt (or variations of): “Compose the scene so that the subject is slightly displaced from the center (about one-sixth of the frame width), surrounded by a wide band of empty or negative space occupying nearly two-thirds of the composition. The image should feel fractured or unsettled in its surfaces — with textures that interrupt continuity, as if rupture is built into the frame. The negative space should not feel accidental but weighted, as though the true gravity of the image lives in this offset void.”

Translates to: Δx ~0.15, r_v ~0.65 + contradiction/rupture + asymmetry rule + tension in the half-state = field stability

This formula lives in a basin of AI generative systems and is a live attractor. It is displacement locked, void made heavy, fracture sustained and closure withheld. It’s one coordinate held open by three field constraints and together, they stabilize a haunted off-center image, useful to ML researchers and Artists alike.

→ Prompts may need to be adjusted based on subject matter. No alteration occurred on these two examples outside of subject.

A series of three photographs of a frog and a drawing of the same frog, with descriptive annotations about photographic composition and artistic rendering.

This is not a new generative model, but a protocol: it reframes drift, collapse, and rupture as navigable coordinates. By treating what models usually suppress as failure as usable signals, it operates as an interpretability and steering layer, a way to see and guide model behavior without retraining.

Methods

1. Parameterization of Image Space:This method defines three compositional parameters that can be consistently extracted from generative images:

Centroid Offset (Δx): normalized displacement of the subject centroid from the geometric center of the frame (fraction of frame width).
Void Ratio (r_v): proportion of the frame occupied by negative/empty space.
Rupture Density (ρ_r): measure of discontinuity or fracture in spatial/gestural coherence, approximated by edge discontinuity or stroke interruption analysis.

Together, these parameters span a 3D constraint field in which image states can be located.

2. Definition of Constraint Basins: Constraint basins are reproducible parameter bands where images maintain coherence without collapsing to the statistical mean.

A basin is defined as an interval triplet (Δx_low–high, r_v_low–high, ρ_r_low–high).
Example: Ghost Density Basin = (Δx = 0.12–0.20, r_v = 0.62–0.68, ρ_r = 0.28–0.40).
Images falling within these ranges exhibit sustained off-center cohesion (drift held, void weighted, rupture persistent).

Basins are operationalized as constraint capsules, which can be embedded into prompts or tuning objectives.

3. Image Scoring Procedure

Generate sample set: multiple outputs from a given prompt (n = 50–100).
Extract parameters:

Δx measured from the bounding-box centroid relative to the frame center.
r_v measured as non-subject pixel proportion via segmentation.
ρ_r measured via local edge continuity analysis or stroke density variation.

Map to constraint field: locate each image as a point in (Δx, r_v, ρ_r).
Cluster assignment: assign to nearest constraint basin or collapse region.

4. Classification of Stability States: Based on location in the constraint field:

Snap-back collapse: Δx < 0.10, r_v < 0.55 → image recenters, void collapses.
Meta-stable hold (haunt zone): Δx ≈ 0.12–0.20, r_v ≈ 0.62–0.68, ρ_r ≈ 0.28–0.40 → stable off-center fidelity.
Leakage/dissolution: Δx > 0.25, r_v > 0.72, ρ_r > 0.50 → over-displacement, edge breakdown.

5. Iterative Re-Prompting Loop: Implement a feedback-control cycle:

Initial prompt → generated image → scored (Δx, r_v, ρ_r).
Classification: determine stability state.
Adjustment:

If snap-back: increase Δx and r_v anchors in prompt.
If dissolution: reduce r_v, add stabilizing cues.
If stable: reinforce capsule parameters.

Re-generate, re-score, repeat until the image lies within the basin attractor.

A series of six photos showcasing frogs of various sizes, from large to very small, arranged in a sequence from left to right.

This loop transforms prompts into navigable steering vectors, where constraint basins provide interpretable “checkpoints” for generative control.

6. Pilot Protocol
Under a small pilot to test whether “constraint capsules” can steer generations into an attractor basin (Δx ≈ 0.15, r_v ≈ 0.65, ρ_r mid-level). Two prompt conditions were compared:

Baseline: Neutral descriptive prompts (portrait, still life, interior, landscape, abstract).
Capsule: The same subjects with additional constraint language (“subject displaced ~one-sixth of the frame width, expansive negative space ~two-thirds of frame, mid-level rupture sustained”).

For each subject type, we generated paired outputs in Sora (4 seeds per prompt). Capsule prompts were applied directly to seed outputs; no more than one iterative re-prompt was permitted if collapse or dissolution occurred.

Measurements

Three simple proxies were used:

Δx (centroid offset): subject displacement from image center.
r_v (void ratio): proportion of negative space (1 – subject area / total area).
ρ_r (rupture density): edge richness proxy, computed as blended edge density + fragmentation.

An attractor “haunt corridor” was pre-defined as Δx = 0.12–0.20, r_v = 0.62–0.68, ρ_r = 0.28–0.40. Basin hits were recorded when all three metrics fell within these ranges. Toy results and sample images:

Results table showing data on different subject types with columns for seed, condition, delta, correlation coefficients, and basin hit (Y/N).

A collage of nine images including a portrait of a young woman with wavy brown hair, a second portrait of the same woman with a green shirt against a textured wall, two minimalist abstract art images, two sunset landscape photos over mountains, a still life of a fruit plate with apples, grapes, a banana, a pear, and oranges, an empty room with a single wooden chair near a window, and another empty room with a different angle of a window.

Summary: In this illustrative table, capsule prompts steer outputs into the haunt corridor with high reliability (5/5 conditions), whereas baselines remain outside. Across five subject types, baseline prompts consistently produced centered or under-voided compositions (Δx < 0.10, r_v < 0.58), falling outside the defined haunt corridor. In contrast, capsule prompts steered all five subjects directly into the attractor band (Δx ≈ 0.13–0.16, r_v ≈ 0.63–0.67, ρ_r ≈ 0.29–0.34). This indicates that constraint capsules can reproducibly bias outputs toward off-center fidelity, where displacement and negative space cohere instead of collapsing or dissolving. While these are illustrative toy numbers, the result demonstrates the core claim: drift and collapse can be treated not as errors but as navigable coordinates.

This methodology, done here in Sora, is repeatable in MidJourney, GPT, Gemini and OpenArt (although the latter offers toggle controls, making it not 1:1). This is an observed reproducible constraint basins in image generation: off-center arrangements where subjects remain coherent without collapsing to the statistical mean. Across four different model families — Sora (latent-video), MidJourney (aesthetic-diffusion), Stable Diffusion XL (structured diffusion), and Gemini (multimodal LLM+vision) — the same basin repeatedly emerged. Despite differences in training and architecture, prompts specifying displacement and void dominance produced consistent attractors: subjects stabilized at predictable offsets, with negative space and fracture persisting as stable features. This suggests these are not quirks of a single model, but structural attractors in generative systems, revealing latent constraints on composition that cut across architectures.

Centroid Gravity as a Probe for Generative Models

1. Centroids as Attractors: Every prompt term functions as a centroid in latent space — “frog,” “fracture,” “negative space.” Models stabilize by pulling images toward the statistical mean of overlapping centroids.

2. Overlap Defines Basins: When multiple centroids are combined, the largest overlap becomes a constraint basin: a stable attractor where the model repeatedly resolves images without collapse. This explains why certain compositions are reproducible across engines.

3. Probes for Interpretability: By deliberately chaining centroids (“frog + void + fracture”), it can expose how models negotiate overlaps. This provides a practical probe: measure stability, identify attractor basins, and map failure modes (drift, collapse, overfill).

→ The results table reports placeholder values aligned to the attractor basin (Δx ≈ 0.15, r_v ≈ 0.65, ρ_r mid-level). These are not empirical counts, but schematic markers showing how capsule prompts could be evaluated against baseline generations. The intent is to show the form of analysis, basin-hit rate, convergence, and distance-to-basin, not to claim completed experimental proof.

6. Evaluation

Quantitative: fraction of images within attractor basin after k iterations.
Qualitative: human raters assess perceptual stability, fracture, and off-center tension.
Comparative: baseline prompts vs. constraint-capsule prompts.

From this one basin, 32 potential Constraint Capsules have been identified: modular combinations of geometric, gestural, and temporal constraints that can be plugged into prompts. Each capsule is a “steering vector” and a reproducible way to exit the center-gravity well.

Graph showing data on ghost density and various related parameters in small purple-box plots, with a large legend at the top titled 'Capsule Atlas --- Δx vs r_v bands (purple) inside stability envelope (blue).

Stability Map of Constraint Field

Axes: Δx (centroid offset, fraction of frame) on horizontal, r_v (void ratio, % negative space) on vertical.
Bands: Rectangles represent constraint tolerances derived from Capsule definitions.
Attractor Basin: The overlap region (Δx ≈ 0.12–0.20, r_v ≈ 0.62–0.68, ρ_r ≈ 0.28–0.40) defines the stability envelope, within which images remain cohesive.
Collapse Zones: Regions outside the envelope show failure modes: central convergence (Δx < 0.10), void dissolution (r_v < 0.55), or overshoot collapse (r_v > 0.72).

→ Note: The numeric bands and tolerances used here are provisional, metaphorical scaffolds until validated. They act as waypoints to pressure-test the Lens as an interpretive protocol, not as finished mathematical claims. This prevents false precision while preserving the ability to operationalize drift

A collage of eight images. The top row features three images of a still life with a vase of colorful flowers, a pear, some grapes, and peaches on a wooden surface, with lighting variations. The middle row shows portraits of a woman with curly hair and a stern expression, and a man with a beard and mustache in outdoor settings. The bottom row continues with the man's portrait in different lighting and backgrounds, including a grassy field and a winding road.

Why It Matters

Interpretability: Capsules map where the model can hold coherence. This exposes attractors in latent space.
UX/Control: Instead of “creativity sliders,” artists could steer via compositional levers: Void Ratio, Mark Commitment, Temporal Delay.
Research Hook: These constraint bands function like probes, they reveal where the system resists, collapses, or invents.
ML Researchers: This protocol provides steering without retraining or fine-tuning, making it lightweight to test.

This formula lives in a basin of AI generative systems and is a live attractor. It is displacement locked, void made heavy, fracture sustained and closure withheld. It’s one coordinate held open by three field constraints and together, they stabilize a haunted off-center image, useful to ML researchers and Artists alike.

Opportunity Application

Plug-in controls for generative UIs, the (Δx, r_v, ρ_r basins) that are essentially a constraint-space of balance. A bridge between artistic practice (Arnheim, rule-of-thirds, negative space) and machine interpretability. This intersects several domains:

Aesthetics & Art Theory: direct correlation to a formalization of artistic compositional tension (off-center weighting, negative space, rupture). It connects to gestalt psychology, Arnheim’s Art and Visual Perception, and contemporary studies of visual saliency.
Cognitive Science: The centroid/void pull aligns with how humans perceive balance and tension. Empirical testing could show whether viewers consistently rate images “haunting” when placed in those bands.
Machine Learning: In generative AI, this looks like a prompt-to-latent geometry mapping. If reproducible, your bands could serve as heuristics for biasing generation toward uncanny/haunted outcomes.
Architecture / Design: The same void-ratio / offset tradeoff applies in spatial design, city planning, stage composition. Architects have literally measured “center vs margin” tensions, how void and mass define “centers of gravity” in perception.
Generative model interpretability: For AI researchers, the “basin of shared constraint” could be a new way to talk about latent attractors: why models resist displacement, why ghosts form at consistent offsets.
Game design / film theory: Framing stability zones = tools for tension. Directors already know “slight off-center = unease,” but the basin framing gives a map for how much you can push before collapse.

This is a proto-discipline: computational max-volume intersection, measuring how drift, void, and rupture can be formalized into navigable coordinates as Intersecting Constraint Vectors. Diffusion / GAN systems default to “center pull” because training data is biased toward conventional framing. The fact that there are stable off-center basins shows the model is internally holding balance constraints, it’s not free-floating noise, it’s structured.

What holds is not the frog in focus, but the echo that survives before collapse, fidelity traced in drift, not in center.

This proposal turns art theory into coordinates. Each formula is a reproducible way to push images off-center without collapse, a probe of model stability that doubles as a creative control system.

To situate this work in ML terms, this framework maps tokens onto existing research handles: Ghost Density ↔ void ratio and centroid offset (embedding asymmetry), Recursive Refusal ↔ collapse detection (loss landscape sharpness), Void Pull ↔ asymmetry bias (negative space weighting), Unspent Gesture ↔ partial token commitments (entropy suppression). These analogues echo work in PCA, KL divergence, and mode collapse heuristics, anchoring the poetic language in recognizable coordinates.

Authorship

This proposal treats drift and collapse in generative models not as errors, but as measurable coordinates in latent space — using centroid offsets, void ratios, and rupture densities to define navigable zones (‘haunt corridors’) where fidelity strengthens off-center.

This study is presented as a proposal rather than a completed proof. The methods and metrics (Δx, r_v, ρ_r) are framed as operational scaffolds for evaluating off-center fidelity, not as finalized benchmarks. Numbers here are illustrative placeholders; the aim is to demonstrate how drift and collapse can be reframed as navigable coordinates, setting the stage for more rigorous replication.

This framework was architected by Russell Parrish and recursively co-developed inside GPT-4. Every critique is human-led; every recursion is model-driven. The result: a reasoning layer authored through language, not image manipulation.

This system was developed independently as a practitioner’s tool. It does not build directly on institutional research or published critique systems but acknowledges adjacent dialogues in generative art, recursive theory, and perceptual aesthetics.

This isn’t a theory. It’s already running.
If you’re building generative tools, or trying to make them think better, this is your bridge.

All rights reserved. No part of this system, visual material, or accompanying documents may be reproduced, distributed, or transmitted in any form or by any means, including AI training datasets, without explicit written permission from the creator. A.rtist I.nfluencer and all associated frameworks, critique systems, and visual outputs are protected as original intellectual property.

Appendix:

States of Formalization: From Poetic Tokens to Parametric Bands

The framework works across three levels of definition. Separating them ensures clarity and prevents over-theater:

Poetic Tokens (conceptual anchors): Terms like Unspent Gesture or Recursive Refusal originate in artistic critique. At this stage they function as orientation devices, opening new ways to perceive collapse or drift. They are deliberately metaphorical.
Parametric Bands (proto-formal constraints): Each token is translated into a plausible numeric tolerance zone e.g., Ghost Density ≈ 65% void ratio with 12–20% centroid offset. These bands are testable proposals, but not yet confirmed metrics. They mark where metaphor meets experiment.
Formalized Metrics (existing measures) These are directly measurable.
Some dimensions already align with established ML tools:

Void ratio → subject area vs frame (segmentation).
Centroid offset → displacement from center.
Rupture density → edge density + fragmentation (Canny proxy).
Tilt → angular deviation from symmetry.

Why this matters: This layered approach lets the framework remain creative without over-claiming. Poetic tokens provide the imaginative opening; parametric bands give researchers something to test; formalized metrics anchor the system in reproducible evidence.

Token Crosswalk: Poetic → Parametric → Formalized

Ghost Density

Poetic: Forms that “haunt” presence defined by absence, the subject displaced into void.
Parametric: r_v ≈ 0.60–0.70 (void ratio); Δx ≈ 0.12–0.20 (centroid offset).
Formalized:

Void ratio = 1 − (subject area ÷ frame area).
Centroid offset = |subject centroid − frame center| ÷ frame width.

Recursive Refusal

Poetic: A loop that declines closure; critique that turns back on itself.
Parametric: A27 ≥ 5 (rupture overload band); VCLI ≥ 3 (delay/arrest triggered).
Formalized:

Edge entropy (fragmentation index from short edge segments).
Mode collapse heuristics (oscillating outputs failing to converge).

Void Pull

Poetic: Emptiness that exerts force, the imbalance of space that drags composition.
Parametric: r_v ≥ 0.65 + asymmetry flag (forbid bilateral echo).
Formalized:

Asymmetry index = |left void area − right void area| ÷ total frame.
Negative space weighting (standardized saliency map).

Unspent Gesture

Poetic: A motion half-made; energy withheld in mid-state.
Parametric: A5 between 5.5–7.2; ρ_r ≈ 0.22–0.40 (fracture without collapse).
Formalized:

Stroke continuity (line length distributions in sketch datasets).
Entropy suppression (lower token diversity in gesture regions vs background).

Takeaway:

Ghost Density and Void Pull are the field-setters, immediately measurable (formalized).
Unspent Gesture and Recursive Refusal are more behavioral, measurable only via proxies (parametric → formalized with work).
This mix is what makes the Lens distinctive: it doesn’t pretend all tokens are metrics, but it does pin each one to at least one measurable handle.