Visual Topology of Light
A physics-grounded, modality-agnostic image coordinate framework for semiconductor process monitoring. 15 coordinates. No learned parameters. Cold transfer across SEM, wafer maps, and aerial simulation.
METHOD
The strongest evidence is not accuracy.
It is the prediction that came true.
Semiconductor process monitoring generates imaging data across structurally distinct modalities: aerial images from lithography simulation, SEM images for metrology and defect review, and binary wafer maps from electrical test. Historically, each modality has required its own analysis pipeline. Deep learning has achieved strong per-domain results but requires large labeled datasets, modality-specific retraining when tools or layers change, and produces outputs that are opaque to process engineers who need to act on them.
We address a different question: can a fixed set of interpretable structural coordinates, derived from image formation physics rather than fitted to data, transfer across imaging modalities without retraining? If a coordinate is truly grounded in how images are formed, its behavior should change in the direction physics predicts as the image formation mechanism changes, not arbitrarily, and not because a training set reinforced a particular pattern.
Two results stand out. First, VTL achieves competitive accuracy across both imaging modalities with the same coordinates and no retraining; no other method in the comparison does. Second, the pattern of coordinate dominance across modalities is consistent with known image formation behavior. That consistency, pre-stated and then observed, is a stronger claim than a classification number.
15
coordinates derived from image formation physics, no learned parameters
93.4%
balanced accuracy on Carinthia production SEM — first among all baselines
0
retraining steps across three independent datasets spanning two imaging modalities
Pre-validation — before real data was examined
"LER must be identically zero on coherent aerial simulation images. Coherent defocus moves all edges uniformly without roughening them."
Confirmed. LER averages 0.000–0.001 across all focus levels on simulated aerial images. On synthetic CDSEM it rises to 0.004–0.006 — a 5–7× increase from Poisson shot noise and edge discretization. On production SEM it becomes the dominant discriminator (ANOVA F = 8,586). On binary wafer maps it recedes to approximately 11th place. The coordinate follows the expected physical pathway across four distinct image regimes without modification.
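The reasoning behind this prediction can be illustrated with a toy experiment. The sketch below is not the VTL definition of LER, which is not given here; it assumes a hypothetical proxy (`toy_ler`, the standard deviation of the sub-pixel edge position across rows of a straight vertical edge), a logistic edge profile whose width stands in for defocus, and an arbitrary dose of 50 counts per pixel for the Poisson noise.

```python
import numpy as np

def edge_positions(img, threshold=0.5):
    """Sub-pixel column where each row first reaches `threshold` (rising edge)."""
    positions = []
    for row in img:
        above = np.nonzero(row >= threshold)[0]
        if above.size == 0 or above[0] == 0:
            continue  # no usable crossing in this row
        j = above[0]
        # Linear interpolation between the two samples bracketing the threshold.
        frac = (threshold - row[j - 1]) / (row[j] - row[j - 1])
        positions.append(j - 1 + frac)
    return np.asarray(positions)

def toy_ler(img, threshold=0.5):
    """Hypothetical LER proxy: std of edge position along the edge, in pixels."""
    pos = edge_positions(img, threshold)
    return float(pos.std()) if pos.size > 1 else 0.0

def blurred_step(height=64, width=64, edge=32.0, sigma=2.0):
    """Straight vertical edge with a logistic profile; sigma mimics defocus blur."""
    x = np.arange(width)
    profile = 1.0 / (1.0 + np.exp(-(x - edge) / sigma))
    return np.tile(profile, (height, 1))

rng = np.random.default_rng(0)
clean_sharp = blurred_step(sigma=1.0)
clean_defocused = blurred_step(sigma=4.0)

# Poisson shot noise at an assumed dose of 50 counts per pixel at full intensity.
dose = 50.0
noisy = rng.poisson(clean_sharp * dose) / dose

ler_sharp = toy_ler(clean_sharp)
ler_defocused = toy_ler(clean_defocused)
ler_noisy = toy_ler(noisy)
print(ler_sharp, ler_defocused, ler_noisy)
```

In this toy setting, blurring changes the edge slope identically in every row, so the edge position has no row-to-row spread and the proxy stays at zero. Shot noise perturbs each row independently, which is what lifts the proxy above zero.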
Figure. Three-stage pipeline at nominal focus. Left: Hopkins aerial image (LER ≈ 0). Center: resist pattern. Right: synthetic CDSEM with edge halos and Poisson shot noise (LER elevated 5–7×).
Cold Transfer
No Retraining Required
The same 15 coordinates, identically defined, transfer from simulated aerial images to production SEM defect classification to binary wafer maps. When the imaging modality changes, the dominant coordinate changes — not the coordinate definitions.
Modality Shift
Hierarchy Reorganizes, Not Redesigns
LER dominates production SEM (F = 8,586). Spectral band coordinates dominate wafer maps (F = 11,389 / 9,483 / 8,675). The shift is consistent with image formation physics — a physical prediction, not a dataset artifact.
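The F values quoted above are one-way ANOVA statistics computed per coordinate. A minimal sketch of that ranking follows, assuming a feature matrix `X` (one column per coordinate) and integer class labels; the function names and the synthetic data are illustrative and are not taken from the VTL notebooks.

```python
import numpy as np

def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single coordinate across class labels."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    grand_mean = values.mean()
    ss_between = 0.0
    ss_within = 0.0
    for c in classes:
        group = values[labels == c]
        ss_between += group.size * (group.mean() - grand_mean) ** 2
        ss_within += ((group - group.mean()) ** 2).sum()
    df_between = classes.size - 1
    df_within = values.size - classes.size
    # Note: a coordinate with zero within-class variance is not handled here.
    return (ss_between / df_between) / (ss_within / df_within)

def rank_coordinates(X, labels, names):
    """Return (name, F) pairs sorted from most to least discriminative."""
    scores = [anova_f(X[:, j], labels) for j in range(X.shape[1])]
    order = np.argsort(scores)[::-1]
    return [(names[j], float(scores[j])) for j in order]

# Hand-checkable case: between-class SS = 26, within-class SS = 6, so F = 13.
f_small = anova_f([1, 2, 3, 2, 3, 4, 5, 6, 7], [0, 0, 0, 1, 1, 1, 2, 2, 2])

# Synthetic example: one class-dependent column, one pure-noise column.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 100)
X = rng.normal(size=(300, 2))
X[:, 0] += 2.0 * labels
ranking = rank_coordinates(X, labels, ["informative", "noise"])
print(f_small, ranking)
```

Running the same ranking on each modality's feature matrix is one way to reproduce a per-modality hierarchy from a fixed coordinate set.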
Operational Value
A Consistent Interpretive Language
When a fab brings up a new inspection tool, a learned classifier retrains. VTL reorganizes. A process engineer reading NILS and LER on a new tool reads the same quantities, defined the same way, as on the previous tool.
Negative Result
Perona-Malik Has No Effect
Anisotropic diffusion pre-processing produces statistically indistinguishable results (95.7% vs 95.7%, CI overlap >99%). VTL operates above the SEM grain noise floor at 480×480. LER and NILS are measuring real structural properties.
Feature Efficiency
15 Coordinates vs 1,764
VTL matches or beats Haralick GLCM (48 features), Zernike (45), LBP (26), HOG (1,764), and Hu moments (7) across both modalities simultaneously. No other method maintains competitive performance on both SEM and wafer maps.
Hu Moments
The Cross-Modality Failure Case
Hu moments achieve 71.4% balanced accuracy on Carinthia SEM and collapse to exactly chance (12.5%) on MixedWM38 wafer maps — consistent with their assumption of continuous intensity distributions, which binary wafer maps violate.
Framework
15 Coordinates, One Intensity Map
All coordinates are computed from a single intensity map I ∈ [0,1], obtained by min-max normalizing the grayscale image. No learned parameters at any stage. The v1 kernel (9 coordinates) forms the spatial geometry foundation. v2 adds NILS. v3 adds LER. v3.1 adds edge density and spectral band decomposition; the three spectral coordinates rank 1st–3rd on wafer maps by ANOVA F-statistic. The 15 coordinates are:
delta_x: Centroid X offset from image center
delta_y: Centroid Y offset from image center
r_v: Radial variance of intensity mass
mu: Mean intensity — encodes defect density
sdi: Normalized Shannon entropy of histogram
rho_r: Radial correlation: bright center vs periphery
x_p: Peak X offset from centroid
theta: Dominant gradient orientation
d_s: Spectral spread — center of FFT mass
nils: Edge sharpness (NILS) — focus indicator
ler: Edge roughness — stochastic process variation
edge_density: Canny edge fraction — structural complexity
ds_low: Low-frequency spectral power
ds_mid: Mid-frequency spectral power
ds_high: High-frequency spectral power
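To make the list concrete, the following sketch computes a handful of these quantities (delta_x, delta_y, mu, sdi, and the three spectral bands) with NumPy. The exact VTL formulas are not given here, so the offset normalization, the 64-bin histogram, and the band edges at 0.1 and 0.3 cycles per pixel are all assumptions chosen for illustration.

```python
import numpy as np

def normalize(img):
    """Min-max normalize to [0, 1]; a constant image maps to all zeros."""
    img = np.asarray(img, dtype=float)
    span = img.max() - img.min()
    return (img - img.min()) / span if span > 0 else np.zeros_like(img)

def centroid_offset(I):
    """(delta_x, delta_y): intensity centroid minus image center, as a fraction of size."""
    h, w = I.shape
    total = I.sum()
    if total == 0:
        return 0.0, 0.0
    ys, xs = np.mgrid[0:h, 0:w]
    cx = (xs * I).sum() / total
    cy = (ys * I).sum() / total
    return float((cx - (w - 1) / 2) / w), float((cy - (h - 1) / 2) / h)

def histogram_entropy(I, bins=64):
    """Shannon entropy of the intensity histogram, divided by log2(bins)."""
    counts, _ = np.histogram(I, bins=bins, range=(0.0, 1.0))
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum() / np.log2(bins))

def spectral_bands(I, edges=(0.1, 0.3)):
    """Fractions of non-DC FFT power in low/mid/high radial-frequency bands."""
    h, w = I.shape
    power = np.abs(np.fft.fft2(I - I.mean())) ** 2
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.hypot(fx, fy)  # radial frequency in cycles per pixel
    total = power.sum()
    if total == 0:
        return 0.0, 0.0, 0.0
    low = power[radius < edges[0]].sum() / total
    mid = power[(radius >= edges[0]) & (radius < edges[1])].sum() / total
    high = power[radius >= edges[1]].sum() / total
    return float(low), float(mid), float(high)

# Toy image: a bright block to the right of center on a dark background.
img = np.zeros((32, 32))
img[10:20, 20:30] = 200.0
I = normalize(img)
dx, dy = centroid_offset(I)
mu = float(I.mean())
sdi = histogram_entropy(I)
bands = spectral_bands(I)
print(dx, dy, mu, sdi, bands)
```

Every function takes only the normalized map as input and has no fitted state, which is the property the framework relies on for transfer without retraining.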
The Central Observation
Same coordinates. Different physics. Different hierarchy.
The coordinate ranking shifts systematically with modality in a pattern consistent with image formation physics. The coordinate set is fixed; what it finds informative changes with the mechanism of image formation.
Results
Baseline Comparison — Two Modalities
Identical protocol: StandardScaler + SVM probe + 5-fold stratified CV, balanced accuracy. VTL v3.1 is the only method that ranks first on both modalities with the same 15 coordinates and no retraining.
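The protocol above maps directly onto scikit-learn. The sketch below is one plausible reading of it: the RBF kernel, the shuffling seed, and the synthetic 15-column feature matrix are assumptions, since the kernel and fold seeds are not stated here.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def probe_balanced_accuracy(X, y, seed=0):
    """Mean and std of balanced accuracy over 5 stratified folds."""
    # Scaling lives inside the pipeline, so it is re-fit on each training fold
    # and never sees the held-out fold.
    pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))  # kernel is an assumption
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    scores = cross_val_score(pipe, X, y, cv=cv, scoring="balanced_accuracy")
    return float(scores.mean()), float(scores.std())

# Synthetic stand-in for a 15-coordinate feature matrix with 4 classes.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(4), 50)
centers = rng.normal(scale=3.0, size=(4, 15))
X = centers[y] + rng.normal(size=(200, 15))

mean_acc, std_acc = probe_balanced_accuracy(X, y)
print(mean_acc, std_acc)
```

Applying the same function to each baseline's feature matrix keeps the comparison limited to the features themselves, since the classifier and the folds are held constant.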
Figure. UMAP projection of VTL v3.1 coordinates for 4,579 Carinthia production SEM images. All four defect classes form well-separated clusters — zero-parameter visualization of coordinate space structure.
Coordinate Stability Under Imaging Conditions
Validated against 3,402 NIST mds2-3838 simulated SEM images at 27 noise levels and 21 contrast levels. All 15 coordinates are contrast-immune: R² against contrast stays below 0.10 for every coordinate.
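A sensitivity score of this kind can be sketched as the R² of a straight-line fit of one coordinate against one imaging condition. Whether the reported values come from a linear fit is an assumption, and the data below is synthetic rather than drawn from the NIST set.

```python
import numpy as np

def r_squared(x, y):
    """R² of a least-squares line y ~ a*x + b; returns 0.0 if y is constant."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ss_tot = ((y - y.mean()) ** 2).sum()
    if ss_tot == 0:
        return 0.0
    a, b = np.polyfit(x, y, 1)
    ss_res = ((y - (a * x + b)) ** 2).sum()
    return float(1.0 - ss_res / ss_tot)

# Exact linear relation: R² should be 1 up to rounding.
r2_exact = r_squared([0, 1, 2, 3], [1, 3, 5, 7])

# Synthetic sweep: 21 contrast levels with 50 images per level.
rng = np.random.default_rng(0)
contrast = np.repeat(np.linspace(0.2, 1.0, 21), 50)
robust_coord = rng.normal(size=contrast.size)  # independent of contrast
sensitive_coord = 2.0 * contrast + 0.05 * rng.normal(size=contrast.size)

r2_robust = r_squared(contrast, robust_coord)
r2_sensitive = r_squared(contrast, sensitive_coord)
print(r2_exact, r2_robust, r2_sensitive)
```

Under this reading, a coordinate counts as contrast-immune when its score stays below the 0.10 threshold, as `r2_robust` does here.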
Figure. VTL coordinate sensitivity heatmap — NIST mds2-3838 (3,402 simulated SEM images). R² of each coordinate against noise, contrast, and SNR. Green = robust, red = sensitive. All coordinates show R²/contrast < 0.10.
Paper
Visual Topology of Light v3.2
24 pages. 13 figures. 9 tables. All results reproduced from executed Jupyter notebooks.
Status: research preprint. Public structural evidence. Overlay correlation untested — no public dataset pairing SEM images with overlay measurements currently exists.
Related Work
PTD-Z: Pattern Topology: Drift Monitor
The same measurement kernel applied to a different problem: routed structural telemetry for semiconductor-like inspection imagery. PTD-Z asks whether an organized pattern has stopped behaving like the structure it was expected to be — decomposing image evidence into interpretable routes, testing residual signal against classical baselines, and refusing process-causal claims without process-linked data.
Where VTL is a coordinate framework, PTD-Z is a signal router with explicit refusal logic. Two instantiations of the same underlying methodology.