Foreshortening Recipe Book
A methodology for controlled foreshortening and continuity
A diagnostic for spatial reasoning under causal constraint.
Foreshortening is challenging for humans and AI alike. For both, it is one of the most reliable ways to expose how well an artist or image model understands depth, proportion, and spatial order.
Humans struggle to find the right “shapes” for objects that recede, yet we handle the underlying logic intuitively: we know that a foot near the lens must expand while the torso recedes. Most AI systems, by contrast, still treat bodies as 2D assemblies of learned silhouettes. Asked for extreme perspective, they often misfire as they assemble the image from training data: stretching limbs inconsistently, flattening torsos, or confusing what overlaps what.
This is a quick Sora recipe book designed to test and guide that behavior. It leans on ordered descriptions, explicit light cues, and floor-plane anchors to help the model maintain depth. These are not guaranteed fixes, and results can be unpredictable, but each prompt pressures the model’s causal sense of space rather than just its stylistic memory.
Frame of Mind:
Most AI imagery fails at this because it builds scenes as a single flattened “snapshot.” The formulas here introduce time into spatial reading. It’s no longer what is depicted, but how it is revealed. That’s a major unlock for any generative system trying to simulate intentional foreshortening, spatial hierarchy, or visual storytelling.
Think of it as leading the description through the space of the body, foot to leg to torso to head, like traveling along the spine of a figure. Each successive part builds on the previous one’s scale and overlap cues. The image stops being static and starts behaving like a process that follows a linear subject. This unlocks an underused tool: progressive anchoring. Each stage gives the next something to push against, so by the time we reach the head, the image has “earned” its depth.
Foreshortened Figure: Sequential Depth Capsule
Prompt: “A single continuous shot, no cuts. Naturalistic studio lighting, 50–70 mm lens feel. Render the subject as a full-body figure reclining toward the viewer, built step by step:
Foreground: the soles and toes are nearest, dominating the lower third of frame.
Continue back through shins and knees, preserving strong scale decay and overlap.
Reveal thighs and pelvis, grounded with cast shadow contact.
Show ribcage and abdomen, compressed in perspective.
Add shoulders and neck, tapering along the depth axis.
Finally, resolve the head farthest from camera — smallest by visual angle, sharp yet least saturated.
Do not render later stages unless prior ones are correctly established. No montage, no disjointed limbs, no symbolic insertions.”
This capsule exploits token-chaining priority: the order and gating words in the prompt set up a causal scaffold that the engine tries to satisfy in sequence. That sequence then gets mapped to spatial/temporal structure (near→far, part→whole, detail→field).
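The ordered, gated structure described above can also be assembled programmatically when you want to test many variations. What follows is a minimal Python sketch, not part of the original recipe: the function name build_capsule and its parameters are hypothetical, and it only formats text. It makes no call to Sora or any other model.

```python
def build_capsule(setup, stages, gates):
    """Assemble a sequential-depth prompt from near-to-far stages.

    setup  : opening sentence(s) describing shot, light, and lens feel
    stages : stage descriptions ordered nearest to farthest
    gates  : constraint sentences appended after the stages
    (All names here are illustrative assumptions, not an official format.)
    """
    if not stages:
        raise ValueError("at least one stage is required")
    lines = [setup.strip()]
    last = len(stages) - 1
    for i, stage in enumerate(stages):
        # Mark the first and last stage so the near and far ends are explicit.
        if i == 0:
            prefix = "Foreground: "
        elif i == last:
            prefix = "Finally, "
        else:
            prefix = ""
        lines.append(prefix + stage.strip())
    if gates:
        # Gates go last, after every stage has been described.
        lines.append(" ".join(g.strip() for g in gates))
    return "\n".join(lines)


prompt = build_capsule(
    "A single static shot, shallow depth of field, natural studio light. "
    "Sequence the object from nearest to farthest:",
    [
        "the graphite tip in sharp focus.",
        "Progress into the sharpened wooden cone.",
        "Extend along the yellow hexagonal barrel.",
        "Show the metal ferrule.",
        "the pink eraser, farthest and slightly softer.",
    ],
    ["Barrel must read continuous.", "Only one pencil."],
)
print(prompt)
```

Keeping the stages in a list makes it easy to reorder or drop one and compare how the model responds, which is the point of treating the prompt as a sequence.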
Tree Canopy: Vertical Sequence Capsule
Prompt: “A single continuous shot looking upward, 24–28 mm wide-angle feel. Build the composition in vertical depth order:
Start with the tree trunk closest to camera, bark texture crisp and detailed.
Extend into primary thick branches splitting from the trunk with strong occlusion.
Continue into secondary thinner branches, creating a radial lattice.
Add a semi-translucent leaf canopy forming an overhead plane.
Finally, reveal a small sunburst peeking through the gaps, flare controlled and leaf detail intact.
No sun until the canopy exists. Flare must not overpower detail. All stages should appear continuous and physically connected.”
This sequence mimics real-world embodied experience (look up → branch → canopy → sky), so the composition feels natural even if the image is highly constructed.
Pencil on Desk: Linear Progression Capsule
Prompt: “A single static shot, shallow depth of field, natural studio light. Sequence the object from nearest to farthest:
Foreground: the graphite tip in sharp focus, bevels and chips visible.
Progress into the sharpened wooden cone, clean helical cut.
Extend along the yellow hexagonal barrel, receding with proper perspective lines aligned to desk grain.
Show the metal ferrule with knurling and specular reflections.
End with the pink eraser, farthest and slightly softer.
Barrel must read continuous. Ferrule cannot appear before barrel. Only one pencil — no duplicates or clones.”
The pencil sequence is a microcosm of the same principle. From tip to eraser, the eye experiences scale decay and material contrast in order: sharpness → length → metal → softness. That small journey builds object intimacy and weight. It also sets up a compositional payoff: the pencil isn’t just an object anymore; it becomes a path the viewer travels along.
Choreography of the Image
This isn’t about perception choreography; it’s about model token sequencing under causal load. It is pressuring the generative process, not the viewer’s eye.
Below are easy foreshortening prompts, which can be used as is or built upon.
Simple Foreshortening
The Recline (Classic)
“Low front camera. Nearest foot or knee dominates lower frame. Body reclines away, each joint overlapping the next. Light from upper-left, floor plane visible. No motion blur, no floating limbs.”
The Reach
“Single still frame. One hand or arm reaching toward camera, fingers sharp. Forearm, shoulder, and face recede along one line. Depth shown by focus falloff and scale, not blur streaks.”
The Stack (Objects)
“Series of identical solid objects (boxes, books, chairs). Foremost fully visible, each behind smaller, same axis. Floor shadows aligned, no floating. 6–9 units, last merges with haze.”
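How much smaller “each behind” should look follows from basic perspective: under a pinhole-camera model, apparent size is inversely proportional to distance from the camera. The Python sketch below is an illustration only. The starting distance and spacing (in metres) are assumed values, and the function name is hypothetical.

```python
def relative_sizes(first_distance, spacing, count):
    """Apparent size of identical objects along one axis, relative to the first.

    Pinhole model: apparent size is proportional to 1 / distance.
    """
    if first_distance <= 0 or spacing < 0 or count < 1:
        raise ValueError("distance must be positive, spacing >= 0, count >= 1")
    return [first_distance / (first_distance + i * spacing) for i in range(count)]


# Assumed layout: first box 1 m from camera, boxes 0.5 m apart, 7 boxes.
for i, size in enumerate(relative_sizes(1.0, 0.5, 7), start=1):
    print(f"box {i}: {size:.2f}x")
```

With these assumed numbers the seventh box appears at one quarter the size of the first. The drop is steepest between the nearest boxes, which is one reason the foreground unit carries so much of the depth cue.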
The Coil (Organic)
“Snake-like or rope-like form looping toward camera. Overlaps alternate light/dark; contact shadows visible. Haze increases along coil.”
Spine Capsules
Reclining Figure: Classic Perspective Capsule
“A single still frame, neutral studio light, 50–70 mm lens feel. (1) Foreground: nearest foot dominates lower third, sharp texture and edge. (2) Shins and knees overlap the foot; scale decay clear. (3) Thighs and pelvis continue backward; cast shadow anchors to floor. (4) Abdomen compresses in perspective; lighting consistent. (5) Chest and shoulders taper; DOF softens slightly. (6) Farthest: head small by angle, least saturation, gaze off-frame.”
Crouched Figure: Diagonal Compression
“Single still frame, 35 mm lens feel. (1) Foremost: bent knee and hand planted near camera, in crisp focus. (2) Torso twists back, forming a diagonal path toward far shoulder.
(3) Far knee and foot partly hidden, scale reduced. (4) Face and upper torso align to same perspective; light continuous.
DOF shallow near knee, deeper toward head; shadows unify across floor.”
Running Motion (Freeze-Frame)
“Single exposure, 85 mm lens, low front angle. (1) Foremost: reaching hand extended toward camera, sharp. (2–3) Arm and torso recede; depth shown by foreshortened limb lengths. (4–5) Rear leg trails into haze; ground contact shadow continuous. Capture energy through linear compression, not motion blur.”
Environments
Subway Seat
“Low-front perspective of a person sitting on a subway bench, one boot or shoe nearest to camera, filling the foreground. Legs and torso taper toward back seats; train interior repeats evenly behind. Fluorescent light strips create linear perspective; reflections fade along ceiling. Ban fisheye distortion.”
Alley Run
“Narrow alleyway, low front camera. Foreground: running figure mid-stride, nearest foot in sharp focus, torso receding, far leg soft.
Brick walls on both sides converge toward a single vanishing point. Morning light filtering in from behind; mild atmospheric haze.”
Instrumental: Trumpet
“Single still frame, neutral studio light, 70 mm lens. (1) Bell of trumpet nearest, dominating lower third, metallic reflections crisp. (2–4) Tubing and valves recede; highlight streaks continuous. (5) Player’s face farthest, framed by instrument; focus softens gradually.
Gates: brass reflections aligned; no duplicate bells.”
Artist’s Hand Toward Canvas
“Single frame, shallow DOF. (1) Foreground hand with brush nearly touches lens, crisp detail. (2) Brush shaft recedes along depth line.
(3) Canvas surface occupies mid-plane; strokes visible. (4) Artist’s face farthest, half-blurred. Gates: same light source for hand and face; no duplicated tools.”
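Because these capsules depend on numbered stages arriving in near-to-far order, it can help to check the numbering before you submit a prompt you have edited by hand. The Python sketch below is one possible check, assuming stages are written as “(1)” or as a range such as “(2–4)”. The function names are hypothetical and the check looks only at the numbering, not at the wording of each stage.

```python
import re

# Matches "(1)" or a range such as "(2–4)" or "(2-4)".
STAGE = re.compile(r"\((\d+)(?:[–-](\d+))?\)")


def stage_numbers(prompt):
    """Return the stage numbers in the order the prompt lists them."""
    numbers = []
    for match in STAGE.finditer(prompt):
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        numbers.extend(range(start, end + 1))
    return numbers


def stages_in_order(prompt):
    """True if stages run 1, 2, 3, ... with no gaps, repeats, or reordering."""
    numbers = stage_numbers(prompt)
    return bool(numbers) and numbers == list(range(1, len(numbers) + 1))


trumpet = ("(1) Bell of trumpet nearest. (2–4) Tubing and valves recede. "
           "(5) Player's face farthest.")
print(stages_in_order(trumpet))  # prints True
```

A prompt that skips from (1) to (3), or lists a stage twice, returns False, which is a cheap signal that the depth sequence was broken during editing.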
Start Being Creative
Doorway Lean
“Low-angle shot inside a dim room.
Foreground: a hand resting on a doorframe edge near camera. Torso and face lean outward into brighter hallway light. Depth shown by the shift in temperature — warm interior to cool exterior. Keep the doorframe vertical and consistent perspective lines on floor.”
Workshop Scene
“Interior with workbench. Foreground: extended arm holding a tool toward camera; focus on hand and metal surface. Mid-ground: torso leaning forward; background shelves soft and warm. Shadows align along bench plane; single neutral light source.”
Painting of a Woman in a Hallway
“A painting of a woman sitting against a corridor wall, one knee bent and nearest to the camera, foot oversized from perspective. The hallway recedes into cool light, vanishing point centered behind the figure. Shadows run along the floor, light from an overhead source. Background walls parallel; tone softens with distance.”
Pirate Scene
“Far out on a plank, on a pirate boat, a snake-like or rope-like form looping toward camera next to a pirate’s boots. Another pirate stands far in the background. Overlaps alternate light/dark; contact shadows visible. Haze increases along coil. Dark and moody.”
Motorcycle Jump at Sunset
“Low-front camera, 35 mm wide feel. Motorcycle, leaning as if into a turn, launching off a cliff at speed, as if stopped in motion. (1) Nearest wheel rim crisp, dominating frame. (2) Forks and body shrink toward seat; reflections aligned. Motorcyclist leaning forward, wind passing fast. (3) Rider’s torso and helmet farthest, reduced contrast. Lighting sunset; single VP centered; ground shadow continuous. Light reflecting into sun glare. Background diffused with speed motion. Light continuous. DOF shallow near knee, deeper toward head; shadows unify across ground. Tilt motorcycle to one side; add torque to rider.”
Degas Painting
“Single Degas-style painting of a dancer. (1) Foremost: bent knee and hand planted near viewer, in crisp focus. (2) Torso twists back, forming a diagonal path toward the far shoulder. (3) Far knee and foot partly hidden, scale reduced. (4) Face and upper torso align to same perspective; light continuous. DOF shallow near knee, deeper toward head; shadows unify across floor.”
Summary
If the figure holds shape and contact through 3–4 overlapping planes, it is already performing at a high level. If it maintains rhythm and vanishing-point coherence through 7+, the model isn’t just rendering; it’s reasoning spatially. And if the image starts to blur or repeat instead of distort? That’s not a failure; it’s the system protecting itself from hallucinating geometry it can’t sustain.
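If you are logging results across many generations, the plane counts in the summary above can be turned into a rough tally. This Python sketch is only a bookkeeping aid. The thresholds 3 and 7 come from the summary, while the label strings, the treatment of 5–6 planes as the same tier as 3–4, and the function name are assumptions made here. Counting the planes still has to be done by eye.

```python
def depth_tier(coherent_planes):
    """Map a count of coherent overlapping planes to a rough tier.

    Thresholds (3 and 7) follow the summary; labels are assumed.
    """
    if coherent_planes < 0:
        raise ValueError("plane count cannot be negative")
    if coherent_planes >= 7:
        return "spatial reasoning"
    if coherent_planes >= 3:
        return "high"
    return "below threshold"


# Hypothetical counts noted by eye for three generations.
for count in (2, 4, 8):
    print(count, "->", depth_tier(count))
```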
Foreshortening is the edge where perception meets physics. When AI gets it wrong, it reveals the gap you are working against.