diff --git a/CLAUDE.md b/CLAUDE.md
index 16a6207..1edd33b 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -37,7 +37,7 @@ You are a **principal Python engineer** maintaining a CLI tool and library for r
 - `metadata.py` — `scan_head(path, size=1MB)` is the shared input for every C2PA/AIGC/IPTC byte scan: first `size` bytes plus the payloads of any provenance metadata found beyond that window — for ISOBMFF, the late provenance boxes from `isobmff.scan_c2pa_region` (catches a manifest after a large `mdat`); for **PNG**, the late `tEXt`/`iTXt`/`zTXt`/`eXIf`/`iCCP` chunks from `_png_late_metadata` (catches an XMP/EXIF packet appended after a large `IDAT`, e.g. a TC260 AIGC label at ~2.7 MB). Behavior-neutral (`f.read(size)`) for non-ISOBMFF inputs and for any file that fits within `size`. Use it instead of `open().read(1MB)` for any new marker scan. `synthid_source(path)` returns the vendor name(s) if the C2PA manifest implies a SynthID pixel watermark, else None. Format-agnostic: PNG via the caBX parser, JPEG/WebP/AVIF/HEIF/JXL via a binary scan (C2PA marker + SynthID issuer + AI-source marker). `get_ai_metadata` surfaces the verdict, and `metadata --check` prints it as a callout. Both `get_ai_metadata` and `has_ai_metadata` guard the PIL open with `except Exception` (HEIC/unknown formats raise non-OSError) and fall through to the binary scan. `xai_signature(path)` detects xAI/Grok's EXIF-only scheme (`ImageDescription` = `Signature: <base64>` + UUID `Artist`); it feeds `has_ai_metadata`, `get_ai_metadata` (key `xai_signature`), and `identify`. `iptc_ai_system(path)` detects the IPTC Photo Metadata 2025.1 AI-disclosure XMP properties (`IPTC_AI_FIELD_MARKERS` = `AISystemUsed`/`AISystemVersionUsed`/`AIPromptInformation`/`AIPromptWriterName`) and returns the `AISystemUsed` generator name (or `"fields present"`). `remove_ai_metadata` routes **ISOBMFF video** (`.mp4`/`.mov`/`.m4v`) through the same `isobmff.strip_c2pa_boxes` as AVIF/HEIF (MP4 is ISOBMFF), and `_scrub_ai_exif` removes the xAI signature + AI-generator EXIF tags on JPEG output. 
`strip_c2pa_boxes` is **fail-safe** on a malformed box: it returns the original bytes unchanged with a logged warning instead of truncating the tail to EOF (detection-only `scan_c2pa_region` still stops at a malformed box). `_png_late_metadata` clamps each late-chunk read to the remaining file size (`safe_length = min(length, remaining)`) so a malformed `length` cannot drive a multi-GB allocation.
 - `identify.py` — the OpenAI rollout caveat is keyed on `_vendor_of(synthid) == "OpenAI"` (not a raw substring over the issuer + verdict blob). `identify(path)` aggregates every locally-readable signal (C2PA issuer→platform, C2PA soft-binding forensic-watermark vendor, IPTC "Made with AI" + IPTC 2025.1 `AISystemUsed`, embedded SD/ComfyUI params, SynthID proxy, xAI/Grok EXIF signature via `metadata.xai_signature`, the China TC260 AIGC label via `metadata.aigc_label`, the HuggingFace `hf-job-id` job marker via `metadata.huggingface_job`, the Samsung Galaxy AI editing marker via `metadata.samsung_genai`, the visible marks — Gemini sparkle plus the ByteDance Doubao 豆包AI生成 / Jimeng 即梦AI text marks via the `watermark_registry` — open invisible watermark, Adobe TrustMark via `trustmark_detector`) into one `ProvenanceReport`. `is_ai_generated` is True or None (never asserted False — stripped metadata is not proof of clean origin). The `hf_job`, visible-mark, and Samsung `samsung_genai` signals are **medium** confidence: each lifts an otherwise-Unknown verdict to a tentative AI (`hf_only` / `visible_only` / `samsung_only`, parallel branches; `visible_only` fires on any `visible_*` signal) but is excluded from the high-confidence `ai_from_metadata` set, so none overrides a hard metadata signal. **Visible-mark detection** (`check_visible`, signals `visible_sparkle` / `visible_doubao` / `visible_jimeng`): the Gemini sparkle keeps its own file-level path (`_visible_sparkle` → `gemini_engine.detect_sparkle_confidence`, promoted only at confidence ≥ `_SPARKLE_THRESHOLD` 0.5; corpus-tuned to separate Gemini sparkles ≥0.56 from non-sparkle ≤0.49), while Doubao/Jimeng reuse the registry detectors (`_visible_text_marks` → `watermark_registry`), each gated by its own engine NCC threshold via `MarkDetection.detected` (Doubao 0.4, Jimeng 0.45). Doubao/Jimeng are normally also caught by the TC260 AIGC metadata label, so the visible path is their stripped-metadata fallback. 
Visible marks set `platform` only when no harder signal already did, and (like the sparkle) are excluded from integrity-clash vendor claims. The cv2 dependency lives in the engines, not here. **`import identify` is deliberately light** (~21 MB; ~36 MB with cv2 loaded by a visible-mark run, ~106 MB for a full `check_visible` run): it imports only the pure `noai.c2pa`/`noai.constants` submodules, and `noai/__init__` is lazy (see "Test and lint"), so torch/diffusers are NOT pulled at import even in a full `gpu`/`detect` install — fits a 512 MB host. The heavy paths are opt-in: `check_invisible=True` needs the `detect`/`trustmark` extras (each pulls **torch**; TrustMark also **downloads weights**), so on a core-only deploy leave `check_invisible` off (it is a no-op there anyway). Before the lazy `__init__`, the mere presence of torch in the env inflated `import identify` to ~420 MB. **C2PA platform attribution is device-token-first, issuer-scan fallback** (`_device_platform` scans manifest bytes for `_DEVICE_C2PA_PLATFORM` tokens, then `_attribute_platform`/`_ISSUER_PLATFORM`). **Why, verified on real signed files 2026-05-26:** the old issuer-only byte-scan matched ANY issuer substring anywhere, so multi-entity manifests mis-attributed -- Leica→"Truepic" (a signing authority in the trust chain), Nikon→"Adobe Firefly" (XMP-toolkit "Adobe" + the sample's "Adobe_MAX" name), Pixel→"Google (Gemini)" ("Google LLC" cert org), Truepic→"Google". A distinctive device token wins instead. **Token distinctiveness is load-bearing:** bare `b"Truepic"` mis-fires (it appears in unrelated trust chains -- it mis-attributed the OpenAI `chatgpt-1.png` fixture), so the token is the specific `b"Truepic_Lens"` from the Lens SDK claim generator; likewise `b"Pixel Camera"` (cert CN) not bare `b"Pixel"`. 
`_DEVICE_C2PA_PLATFORM` lists ONLY tokens **verified against a real C2PA file**: Leica (`lc_c2pa`/`Leica Camera`), Nikon (`NIKON`), Pixel (`Pixel Camera` -- from a real Pixel 10 Pro file attached to c2pa-rs issue #1609/#1554), Sony (`sony.sig`/`sony.cert` -- Sony's own C2PA assertion namespace, verified on a real Sony PXW-Z300 file; NOT bare "Sony" which is a common EXIF Make), Truepic (`Truepic_Lens`). Canon/Bria have **no public direct-download C2PA sample** (checked exhaustively: GitHub issue/PR attachments, contentcredentials gallery, HF datasets -- all upload-to-verify or token-gated; Canon's only public file was a self-signed hobbyist CR3, not factory), so they stay unmapped until a real file is captured (same fixture discipline as Grok/Doubao). The Sony sample is video (MP4) -- our ISOBMFF C2PA path detects it; Sony Alpha stills likely share the `sony.*` namespace but are not separately verified. **Samsung Galaxy + ASUS Gallery live in a separate `_SIGNER_C2PA_PLATFORM` (scanned after `_device_platform`, before the issuer fallback), NOT in `_DEVICE_C2PA_PLATFORM`** — verified on real signed files 2026-05-29. Reason: a Galaxy phone stamps BOTH its device cert AND a `trainedAlgorithmicMedia`/genAIType AI marker on a Generative-Edit image, so treating it as a "genuine camera capture" would false-fire integrity-clash rule 2 on every Galaxy AI edit. The signer tokens (`b"Samsung Galaxy"` cert org — distinct from the EXIF `SM-xxxx` model string on ordinary Samsung photos; `b"com.asus.gallery"` claim generator) only resolve the platform label; the AI verdict still comes from the source-type / genAIType. ASUS Gallery is a C2PA-signed edit with no AI marker, so it attributes the platform without asserting `is_ai`. 
**Samsung's `genAIType` (in the proprietary `PhotoEditor_Re_Edit_Data` JSON) is an undocumented Galaxy-AI editing marker** (`metadata.samsung_genai`, gated on the `PhotoEditor_Re_Edit_Data` container; non-zero value = AI tool used, values {1,5} observed): medium-confidence because the field has no public spec (verified 2026-05-29: absent from C2PA spec + Samsung docs), but it co-occurred with `trainedAlgorithmicMedia` in 3/3 verified files that record a source-type and was the SOLE AI marker on a Galaxy S24 file that omits the source type. Camera C2PA marks capture authenticity, not AI (Pixel carries `computationalCapture`, not `trainedAlgorithmicMedia`), so these never set `is_ai` -- that stays driven by digital-source-type. `c2pa.cbor_text_after` (now public) is best-effort for the `generator` detail string only and can be None when the manifest keys it `claim_generator_info` (Pixel). **Issuer→generator mapping is `is_ai`-gated** (`_attribute_platform(issuers, is_ai=c2pa_is_ai)`): a specific AI-generator platform is named only when the digital-source-type is `trainedAlgorithmicMedia`; on a non-AI source an issuer substring is treated as incidental (an "Adobe XMP" toolkit string in an *unmapped* Canon/Sony capture would otherwise mislabel it "Adobe Firefly"), so it degrades to the neutral "C2PA signer: X" label. Real Firefly/OpenAI/Google output carries the AI source-type, so it is unaffected (verified: chatgpt-1.png→OpenAI, firefly-1.png→Adobe Firefly still attribute). `_attribute_platform` defaults `is_ai=True` so the mapping stays unit-testable in isolation. Add capture-camera tokens to `_DEVICE_C2PA_PLATFORM`, editing-app/AI-device signer tokens to `_SIGNER_C2PA_PLATFORM`, generator/issuer platforms to `_ISSUER_PLATFORM`, not inline. For non-PNG containers (JPEG/WebP/AVIF/HEIF/JXL) the caBX parser returns nothing, so issuer (`_issuers_in`) and generator (`_ai_tools_in`, reusing `C2PA_AI_TOOLS`) are recovered by binary-scanning the first MB. 
EXIF `Software` / `Make` / `Artist` / `ImageDescription` and XMP `CreatorTool` generator tags are read by `metadata.exif_generator` (PIL+piexif for any format PIL opens incl. AVIF, plus a container-agnostic XMP raw-byte scan that also covers HEIF/JXL), matched against `AI_GENERATOR_TOKENS` so ordinary editors (plain "Adobe Photoshop") and real-camera `Make` ("Apple"/"Canon") are not flagged. **Ideogram tags its output with EXIF `Make="Ideogram AI"`** (verified on a real download 2026-05-24) — that's why `Make` is read. **Integrity-clash detection** (`_integrity_clashes`, surfaced as `ProvenanceReport.integrity_clashes`, printed in red by `identify` and serialized to `--json`): contradictions between independent generator stamps are a laundering/spoofing tell. Two rules: (1) two or more distinct AI-origin vendors named by **independent** signals (e.g. C2PA OpenAI + EXIF `Make="Ideogram AI"`), and (2) a camera-capture C2PA device (`_DEVICE_C2PA_PLATFORM`) coexisting with any AI-generation marker. **Independence is source-grouped (`_CLASH_SOURCE`, added 2026-06-02):** the C2PA issuer attribution (`c2pa`) and the SynthID proxy (`synthid`) are NOT independent — the proxy is inferred from the *same* manifest — so they share one source and two vendors named within a single manifest do not clash. This killed a false-positive class found on the spaces corpus: legitimate multi-actor manifests where a product wraps another vendor's engine (Microsoft Designer on OpenAI → `OpenAI, Microsoft`; Microsoft on Google → `Microsoft, Google LLC, Google C2PA Core Generator Library`) or an edit chain re-signs (Adobe over a Gemini original → Adobe c2pa + Google synthid) — 19 such files across the 2026-06-01/02 batches read as clashes before the fix. 
Rule 1 still fires when a manifest vendor disagrees with a genuinely independent stamp (EXIF/XMP generator, IPTC `AISystemUsed`, AIGC, xAI); each non-`c2pa`/`synthid` family is its own source (`test_identify.py::TestIntegrityClashes::{test_multi_actor_manifest_no_clash,test_manifest_vendor_vs_independent_signal_clashes}`). Vendor normalization is `_vendor_of` over `_AI_VENDOR_TOKENS` (so a C2PA "Google (Gemini)" issuer and a SynthID-Google proxy agree, while different vendors clash). **High-precision by design:** only hard generator stamps feed it (C2PA-issuer when source is AI, SynthID, EXIF/XMP generator, IPTC `AISystemUsed`, xAI, AIGC); the fuzzy visible sparkle and the open invisible watermark are **excluded** (the latter can be a by-product of our own SDXL removal pass). The c2pa vendor is classified from the issuer attribution / generator, NOT the resolved `platform` (a camera label like "Google Pixel" would mis-normalize to "Google"). All real single-origin fixtures (chatgpt/firefly/doubao/grok/mj) verified to produce **zero** clashes (false-positive guard in `test_identify.py::TestRealSamplesHaveNoClash`).
 - `watermark_registry.py` — **single catalog of known visible watermarks**, the unified "find known marks in their usual places, recognize, remove" entry. **Reverse-alpha based by policy**: a mark is listed only once a real alpha map has been captured for it, and removal inverts that map (`original = (wm - a*logo)/(1-a)`) — Gemini recovers cleanly with no inpaint (its sparkle alpha comes from a pure-black capture, so it is near-exact), while **Doubao and Jimeng both add an always-on THIN residual inpaint** over the glyph footprint (their text marks re-rasterize + jitter a few px per image, so a single capture cannot pixel-cancel them; the inpaint blends into the reverse-alpha-recovered pixels). Arbitrary-region inpainting still lives in `region_eraser`/`erase`. Each `KnownMark` ties a key to {usual `location`, `in_auto` flag, `recovery` (="reverse-alpha"), a `detect` adapter → uniform `MarkDetection`, a `remove` adapter}. Entries today: `gemini` (bottom-right sparkle), `doubao` (bottom-right "豆包AI生成"), and `jimeng` (bottom-right "★ 即梦AI"). `detect_marks` scans all; `best_auto_mark` picks the highest-confidence detection. **Cross-engine confidences aren't directly comparable**, so the gemini adapter applies the corpus-validated 0.5 sparkle threshold (`_GEMINI_AUTO_MIN_CONF`) for its `detected` flag — otherwise the gemini engine's loose internal threshold weakly fires (~0.36) on the Doubao text and hijacks `auto`. The shape-keyed Doubao/Jimeng NCC detectors don't cross-fire (jimeng scores ~0.22 on the Doubao strip, well under its 0.45 threshold), so `auto` picks the right one on a Doubao vs Jimeng image. `cli.cmd_visible` is registry-driven: `--mark auto` → `best_auto_mark`, `--mark <key>` → that mark; `--mark` choices come from `mark_keys()`. `_doubao_remove`/`_jimeng_remove` apply reverse-alpha only when the mark is detected AND `reverse_alpha_available`; outside that, removal is **skipped** (not inpainted). 
Add a new visible mark = one `KnownMark` entry + its engine (with a captured alpha map); do not re-add per-mark `if` branches in the CLI. **Alpha-on-save policy (issue #30):** `cli._write_bgr_with_alpha` rejoins the input's alpha plane **unchanged** — it must NOT zero alpha in the watermark bbox. Reverse-alpha (and `erase` inpaint) recover real pixels there, so zeroing alpha punched a transparent hole that renders as a solid **white box** on any non-transparent viewer (Gemini app exports are opaque RGBA, so every user hit it; regression-guarded by `test_visible_keeps_alpha_opaque_in_watermark_region`). The registry `remove()` still returns its region (used for `inpaint_residual` positioning), but the CLI no longer uses it to clear alpha.
-- `gemini_engine.py` — visible Gemini-sparkle remover/detector (cv2/numpy, no GPU). `detect_sparkle_confidence(path)` is the file-level entry point used by `identify.py`. The public entry points normalize a grayscale (2D) or RGBA (4-channel) input to BGR up front so a non-BGR image does not crash the cv2 pipeline. **Detection localization (issue #36):** `detect_watermark`'s global multi-scale NCC search applies a size weight (`(scale/96)**0.5`) that suppresses tiny-patch false positives but can let a larger, mediocre match (e.g. a bright collar in a portrait) outrank a small, near-perfect sparkle in the corner — so a faint sparkle on a busy background scored below threshold and read as clean (the regression osachub reported from widening the search window 256px->512px between v0.7.2 and v0.8.8). `_corner_promote` adds a bottom-right-corner raw-NCC pass on top of the global search: a match with raw NCC >= `_CORNER_PROMOTE_NCC` 0.85 that beats the global pick overrides it (it only ever replaces a lower-fidelity pick, so it cannot weaken an existing detection), rescuing the buried sparkle without reverting the wider window. The corner side is **relative-clamped** (`_CORNER_PROMOTE_FRAC` 0.20 of the short side, clamped to `[_CORNER_PROMOTE_MIN` 96, `_CORNER_PROMOTE_MAX` 384`]`): a fixed 256px is a true corner on a large image but covers ~70% of a small portrait, where a real photo raw-matches the star at ~0.81 (relative tightening drops that worst case to ~0.69, while the upper clamp stops the corner ballooning on huge images where a real photo reached ~0.83 at 512px). The 0.85 gate sits midway between the worst real-photo corner match (~0.78 across native + downscaled negatives) and a genuine faint sparkle (~0.93), so promotion adds true detections with zero corpus false positives (Gemini's sparkle sits ~60-160px from the corner at fixed margins, covered by the [96, 384] band at every measured size). Regression-guarded by `test_gemini_engine.py::TestCornerPromotion`. 
**Removal is reverse-alpha with an over-subtraction guard** (`remove_watermark` → `_reverse_alpha_blend`, else `_inpaint_footprint`): the sparkle alpha is computed (`alpha = max(R,G,B)/255`) from the bundled sparkle-on-black captures `assets/gemini_bg_{96,48}.png` (the capture max is ~130, NOT 255 — the sparkle is a ~51%-opaque white overlay, so `alpha` maxes at ~0.51, which is CORRECT for the capture, not under-exposed). The alpha is near-exact only when the real mark's effective opacity matches the capture, which holds on bright/flat backgrounds — re-verified clean on `demo_banana_before.png` 2026-05-31. **Issue #30 (dark-background black pit):** on a dark/textured background (e.g. grass, ~73) the real sparkle's effective opacity is LOWER than the captured 0.51, so the fixed-alpha reverse blend OVER-subtracts (`watermarked - a*logo` goes negative) and drives the footprint to black — the white sparkle becomes a black diamond. `remove_watermark` now detects this via `_reverse_alpha_oversubtracts` (fraction of footprint pixels with `alpha >= _FOOTPRINT_ALPHA` 0.1 whose numerator < 0 exceeds `_OVERSUB_FOOTPRINT_FRAC` 0.05) and **inpaints the footprint** (`_inpaint_footprint`, cv2 NS over the dilated alpha mask) from the surrounding pixels instead. **Behavior-neutral on the working case:** a bright background over-subtracts at ~0% so reverse-alpha is used and the output is byte-identical to before (verified: demo_banana 0.0 frac vs issue-#30 grass 0.61 frac; regression-guarded by `test_gemini_engine.py::TestOverSubtractionGuard`, which composites the sparkle at a reduced effective alpha to reproduce the mismatch). The registry's optional `inpaint_residual` (edge cleanup) is a no-op on a clean reverse-alpha removal; an earlier "Gemini smears" read was a misjudged soft-fur original, not an artifact. 
**The bg assets are now rebuilt from OUR OWN controlled captures** (`data/gemini_capture/captures/`, committed) by `scripts/visible_alpha_solve.py gemini`, which locates the 96px sparkle on the black capture and crops it to the two logo sizes; our capture matched the previously third-party-sourced `gemini_bg_96.png` to **NCC 0.9998**, validating the asset and making it reproducible. Gemini's multi-size fixed-slot model is genuinely different from the Doubao/Jimeng text-strip engines (so it stays a separate engine, not part of the shared-base refactor).
+- `gemini_engine.py` — visible Gemini-sparkle remover/detector (cv2/numpy, no GPU). `detect_sparkle_confidence(path)` is the file-level entry point used by `identify.py`. The public entry points normalize a grayscale (2D) or RGBA (4-channel) input to BGR up front so a non-BGR image does not crash the cv2 pipeline. **Detection localization (issue #36):** `detect_watermark`'s global multi-scale NCC search applies a size weight (`(scale/96)**0.5`) that suppresses tiny-patch false positives but can let a larger, mediocre match (e.g. a bright collar in a portrait) outrank a small, near-perfect sparkle in the corner — so a faint sparkle on a busy background scored below threshold and read as clean (the regression osachub reported from widening the search window 256px->512px between v0.7.2 and v0.8.8). `_corner_promote` adds a bottom-right-corner raw-NCC pass on top of the global search: a match with raw NCC >= `_CORNER_PROMOTE_NCC` 0.85 that beats the global pick overrides it (it only ever replaces a lower-fidelity pick, so it cannot weaken an existing detection), rescuing the buried sparkle without reverting the wider window. The corner side is **relative-clamped** (`_CORNER_PROMOTE_FRAC` 0.20 of the short side, clamped to `[_CORNER_PROMOTE_MIN` 96, `_CORNER_PROMOTE_MAX` 384`]`): a fixed 256px is a true corner on a large image but covers ~70% of a small portrait, where a real photo raw-matches the star at ~0.81 (relative tightening drops that worst case to ~0.69, while the upper clamp stops the corner ballooning on huge images where a real photo reached ~0.83 at 512px). The 0.85 gate sits midway between the worst real-photo corner match (~0.78 across native + downscaled negatives) and a genuine faint sparkle (~0.93), so promotion adds true detections with zero corpus false positives (Gemini's sparkle sits ~60-160px from the corner at fixed margins, covered by the [96, 384] band at every measured size). Regression-guarded by `test_gemini_engine.py::TestCornerPromotion`. 
**Removal is reverse-alpha with an over-subtraction guard** (`remove_watermark` → `_reverse_alpha_blend`, else `_inpaint_footprint`): the sparkle alpha is computed (`alpha = max(R,G,B)/255`) from the bundled sparkle-on-black captures `assets/gemini_bg_{96,48}.png` (the capture max is ~130, NOT 255 — the sparkle is a ~51%-opaque white overlay, so `alpha` maxes at ~0.51, which is CORRECT for the capture, not under-exposed). The alpha is near-exact only when the real mark's effective opacity matches the capture, which holds on bright/flat backgrounds — re-verified clean on `demo_banana_before.png` 2026-05-31. **Issue #30 (dark-background black pit):** on a dark/textured background (e.g. grass, ~73) the real sparkle's effective opacity is LOWER than the captured 0.51, so the fixed-alpha reverse blend OVER-subtracts (`watermarked - a*logo` goes negative) and drives the footprint to black — the white sparkle becomes a black diamond. `remove_watermark` now detects this via `_reverse_alpha_oversubtracts` (fraction of footprint pixels with `alpha >= _FOOTPRINT_ALPHA` 0.1 whose numerator < 0 exceeds `_OVERSUB_FOOTPRINT_FRAC` 0.05) and **inpaints the footprint** (`_inpaint_footprint`, cv2 NS over the dilated alpha mask) from the surrounding pixels instead. **Behavior-neutral on the working case:** a bright background over-subtracts at ~0% so reverse-alpha is used and the output is byte-identical to before (verified: demo_banana 0.0 frac vs issue-#30 grass 0.61 frac; regression-guarded by `test_gemini_engine.py::TestOverSubtractionGuard`, which composites the sparkle at a reduced effective alpha to reproduce the mismatch). 
**Under-subtraction (the symmetric case, fixed 2026-06-03):** some real Gemini sparkles are rendered MORE opaque than the captured ~0.51, so the fixed-alpha reverse blend UNDER-subtracts and leaves a bright sparkle residual that the detector still fires on. Measured on the spaces corpus: a visible-removal audit through the registry path left a detectable sparkle on a meaningful fraction of marks, all of them under-removals and NOT a background-brightness class (failures and successes had the same input confidence and the same background-luma distribution; the discriminator was the removal delta itself). `remove_watermark` now estimates a per-image alpha gain (`_estimate_alpha_gain`: effective sparkle opacity at the bright core vs the local background ring, `a_eff/a_cap`, clamped to `[1.0, _ALPHA_GAIN_MAX` 1.94`]`) and scales the alpha to match before the over-sub/blend branch. The gain separates cleanly on the corpus (under-removed marks ~1.47, cleanly-removed ~1.00), and a deadband (`_ALPHA_GAIN_DEADBAND` 1.05) keeps a matching sparkle **byte-identical** to the pre-fix output, so the fix is purely additive (0 regressions on the audit set; the over-sub guard still runs on the scaled alpha as the safety net for an over-shooting estimate). Regression-guarded by `test_gemini_engine.py::TestUnderSubtractionGain`, which composites a more-opaque-than-capture sparkle and **asserts on footprint pixels, NOT the detector**: the detector's NCC is degenerate on a flat synthetic background, so a re-detect conf is meaningless there (on the real corpus, removal drops the detector conf from ~0.80 to ~0.27). The registry's optional `inpaint_residual` (edge cleanup) is a no-op on a clean reverse-alpha removal; an earlier "Gemini smears" read was a misjudged soft-fur original, not an artifact. 
**The bg assets are now rebuilt from OUR OWN controlled captures** (`data/gemini_capture/captures/`, committed) by `scripts/visible_alpha_solve.py gemini`, which locates the 96px sparkle on the black capture and crops it to the two logo sizes; our capture matched the previously third-party-sourced `gemini_bg_96.png` to **NCC 0.9998**, validating the asset and making it reproducible. Gemini's multi-size fixed-slot model is genuinely different from the Doubao/Jimeng text-strip engines (so it stays a separate engine, not part of the shared-base refactor).
 - `doubao_engine.py` — visible Doubao "豆包AI生成" remover/detector (cv2/numpy, no GPU). `DoubaoEngine.locate` anchors a bottom-right box by **geometry** (mark scales with image WIDTH), `extract_mask` pulls the light, low-chroma glyphs (the detection candidate) using a per-pixel channel-spread proxy `sat = roi.max(axis=2) - roi.min(axis=2)` (no HSV conversion). `detect` is **shape-consistent**: it matches the bundled alpha glyph silhouette (`assets/doubao_alpha.png`) against the candidate via zero-mean normalized correlation (`_template_match_score`, cv2 `TM_CCOEFF_NORMED`), gated at `DETECT_NCC_THRESHOLD` 0.4 over a small `DETECT_MIN_COVERAGE` floor. Keying on glyph SHAPE (not coverage heuristics) fixed #23 (corpus FP 7/1243). **Removal = reverse-alpha + thin residual inpaint** (`remove_watermark_reverse_alpha`): `original = (wm - a*logo)/(1-a)` from the bundled alpha map + `_ALPHA_LOGO_BGR` (pure white) + `_ALPHA_*_FRAC` geometry, then a deliberately THIN inpaint (`_RESIDUAL_*`, `INPAINT_NS`) over the glyph footprint clears leftover edges without smearing. **Alpha is rebuilt by `scripts/visible_alpha_solve.py` (the careful gray-self solve: cubic background fit, mean over channels, full halo, unblurred), same recipe as Jimeng** — the captures are committed in `data/doubao_capture/captures/`. **Removal aligns ALWAYS** (no `_ALPHA_NATIVE_BAND` fast-path): it tries fixed geometry AND `_aligned_alpha_map`'s `TM_CCOEFF_NORMED` scale+position search and keeps the lower-residual one — the mark is re-rasterized and a few px off per image, so fixed geometry alone leaves a visible outline even at 2048. 
**The locate box (`WM_*`) is generous (0.22 wide, margins 0.004) and reaches close to the corner** — a tight box (the old 0.185 / margin 0.012) let a corner-ward shift fall OUTSIDE the alignment search, so the align missed and a readable outline survived; regression-guarded by `test_recovers_shifted_mark_on_texture` (composes the alpha shifted on a known texture; old box ~29 vs new ~1 mean residual). **Issue #13 follow-up defect (found 2026-05-31): the SHIPPED Doubao removal left a clearly READABLE "豆包AI生成" outline on the real `doubao-1.png` sample, while `detect` returned conf 0.0 (it is fooled by a thin outline) so `test_reverse_alpha_removes_mark` passed and the old "56/56 clean" claim was detector-measured, not visual.** Root cause: bad alpha (under-estimated, max ~0.65) + fixed-no-inpaint + tight box; the careful rebuild + always-align + thin inpaint + wide box takes it from a readable outline to faint texture-level traces (parity with Jimeng — a single capture cannot pixel-cancel a per-image re-rasterized mark). **Lesson: a detector-only removal test is insufficient; assert visual residual (the textured-shift test).** **`extract_mask` guards a degenerate ROI (`bh < 16 or bw < 16` -> empty mask, skips cv2):** the always-align removal scores each placement with a residual `detect(out)`, and on an extremely wide/short image (e.g. 2048x1, `test_wide_short_does_not_raise`) that fed cv2's GaussianBlur a ~1-px-tall ROI and **faulted natively on Windows py3.12 (access violation, non-deterministic — one CI cell went red while a re-run passed)**; the old at-native path never ran `detect` on degenerate sizes. Real images always clear the guard (the `WM_*` box floors are `max(16, …)` height / `max(40, …)` width), so it only short-circuits slivers. `reverse_alpha_available` is just "asset present"; the registry gates removal on `detect`. The shipped third-party `_refs/zhengsuanfa_doubao_alpha_120x20.png` is NOT a usable alpha (verified 2026-05-29). 
Arbitrary-region inpainting is `region_eraser`/`erase`.
 - `jimeng_engine.py` — visible Jimeng / Dreamina "★ 即梦AI" remover/detector (cv2/numpy, no GPU), built 2026-05-30 from issue #13's solid captures (@powersee). Mirrors `doubao_engine`: `locate` anchors a bottom-right box by **geometry** (scales with WIDTH), `extract_mask` pulls the light low-chroma glyphs (white top-hat + grayish + min-luma), `detect` matches the bundled "即梦AI" glyph silhouette (`assets/jimeng_alpha.png`) via `TM_CCOEFF_NORMED` over a coverage floor. Threshold `DETECT_NCC_THRESHOLD` **0.45** cleanly separates real Jimeng marks (>=0.81) from the Doubao strip (0.21) and other AI output (0.0), so the two ByteDance marks don't cross-fire in `--mark auto`. **Logo is pure white (255,255,255)** (`_ALPHA_LOGO_BGR`; the white capture + an L-pair-solve confirm ~254.6); compositing is **sRGB, not linear** (a linear-light solve tripled the cross-residual). **Alpha rebuilt by `scripts/visible_alpha_solve.py` from the GRAY capture** (`data/jimeng_capture/captures/`, the solid captures now committed): `a = (I - B)/(255 - B)`, B a per-capture **cubic** background fit over the non-glyph pixels, **averaged over channels, full halo extent (down to a~0.02), unblurred**. Gray (bg ~132) is the deliberate choice over black: it is the best proxy for real content (the mark sits on bright photo areas, not on black), and the careful build drops the gray self-residual to ~1.3. **The mask quality, not the method, was the earlier limit** — a max-channel / quadratic-bg / blurred / halo-truncated build (and a black-dominated LS) left a visible outline (lesson from issue #13: when reverse-alpha leaves a ghost, suspect the captured alpha map before adding heuristics or switching method). Geometry emitted by the solver at `_ALPHA_NATIVE_WIDTH` 2048: `_ALPHA_WIDTH_FRAC` 0.202, `_ALPHA_HEIGHT_FRAC` 0.058, margins ~0.029. 
**Removal = reverse-alpha + a deliberately THIN residual inpaint** (`remove_watermark_reverse_alpha`, `_RESIDUAL_DILATE` 5 over the `_RESIDUAL_ALPHA_FLOOR` 0.05 footprint, `_RESIDUAL_INPAINT_RADIUS` 2, `INPAINT_NS`): a single 2048 alpha cannot pixel-cancel the mark re-rasterized at another resolution (alpha maps from independent captures correlate 0.998, not 1.0; off-native reverse-alpha alone only halves the mark), so a tight inpaint clears the residual edges WITHOUT the texture/edge smear a wide full-footprint pass caused. **Placement ALWAYS tries fixed geometry AND `_aligned_alpha_map`'s NCC scale+position search, keeping the lower-residual** — the mark re-rasterizes + jitters a few px per image even at the captured width, so fixed geometry alone misses (there is no `_ALPHA_NATIVE_BAND` fast-path; the scale search `_ALPHA_ALIGN_SEARCH` is fine-stepped, and the `WM_*` locate box is generous so a corner-ward shift stays inside the search — the same widen that fixed Doubao). Verified clean on the solid captures (native 2048; faint self-residual ~1.3 visible only on a dead-flat field, hidden by real texture) and a real 1440-wide Jimeng download (off-native, table edge preserved). `reverse_alpha_available` is just "asset present"; the registry gates on `detect`. **No committed real sample** (the real content download stays gitignored; only the solid calibration captures are committed) — `tests/test_jimeng_engine.py` synthesizes a mark from the bundled alpha asset, and `test_recovers_shifted_mark_on_texture` guards the align-on-shift path that the Doubao defect exposed. Jimeng images are independently caught by the China TC260 AIGC label in `metadata`/`identify`, so this engine is the visible-mark *removal* path, not a new `identify` signal.
 - `region_eraser.py` — universal region eraser (`erase` CLI). `erase(image, boxes=|mask=, backend=)` normalizes grayscale (2D) and RGBA (4-channel) inputs up front (`erase_cv2` splits off any alpha plane and re-attaches it on the result): `boxes_to_mask` → `cv2.inpaint` (`cv2` backend, default, no deps) or big-LaMa via onnxruntime (`lama` backend, extra `lama`, `Carve/LaMa-ONNX` Apache-2.0 model downloaded on first use, never bundled). `erase_lama` crops a padded region around the mask, runs LaMa at its fixed 512² input, pastes only masked pixels back (untouched areas stay pixel-exact). Lazy `_get_lama_session` singleton; `lama_available()` guards the optional import. **LaMa-ONNX costs ~3.5-4 GB peak RAM and ~5-6 s/call on CPU** (FFC working set, not arena — `enable_cpu_mem_arena=False` does not help), so it does NOT fit a minimal droplet; the cv2 backend (tens of MB, ~30 ms) does. LaMa quality at low RAM = serverless/GPU, mirroring how raiw.cc offloads SDXL to fal.
diff --git a/README.md b/README.md
index 2f42f36..e916480 100644
--- a/README.md
+++ b/README.md
@@ -17,7 +17,7 @@ If this tool saves you time, consider [sponsoring its development](https://githu
 
 ## Features
 
-- **Visible watermark removal** — a registry of known marks in their usual places: the Gemini / Nano Banana sparkle, the Doubao "豆包AI生成" text strip, and the Jimeng "★ 即梦AI" wordmark. Each is removed by **reverse-alpha blending** against a captured alpha map (`original = (wm − α·logo)/(1−α)`), recovering the true pixels rather than inpainting a guess. The Gemini sparkle recovers cleanly on its own on bright backgrounds (on a dark background, where the fixed alpha would over-subtract and leave a dark spot, it automatically inpaints the small sparkle footprint instead); the Doubao and Jimeng text marks re-rasterize slightly per image, so a thin residual inpaint over the glyph footprint clears the leftover edges (the alpha maps are reproducibly rebuilt from controlled captures by `scripts/visible_alpha_solve.py`). Fast, offline, no GPU. `visible --mark auto` finds and removes the strongest detected mark. (For arbitrary logos/objects, see `erase`.)
+- **Visible watermark removal** — a registry of known marks in their usual places: the Gemini / Nano Banana sparkle, the Doubao "豆包AI生成" text strip, and the Jimeng "★ 即梦AI" wordmark. Each is removed by **reverse-alpha blending** against a captured alpha map (`original = (wm − α·logo)/(1−α)`), recovering the true pixels rather than inpainting a guess. The Gemini sparkle recovers cleanly on its own on bright backgrounds, and the engine scales the alpha to each image's sparkle opacity, so a more-opaque-than-captured sparkle is still fully removed. On a dark background, where the fixed alpha would over-subtract and leave a dark spot, it automatically inpaints the small sparkle footprint instead. The Doubao and Jimeng text marks re-rasterize slightly per image, so a thin residual inpaint over the glyph footprint clears the leftover edges. The alpha maps are reproducibly rebuilt from controlled captures by `scripts/visible_alpha_solve.py`. Fast, offline, no GPU. `visible --mark auto` finds and removes the strongest detected mark. (For arbitrary logos/objects, see `erase`.)
 - **Universal region eraser (`erase`)** — remove any logo / watermark / object inside boxes you specify, regardless of position or colour. Default cv2 inpainting (CPU, instant); optional big-LaMa via onnxruntime (`lama` extra) for higher quality
 - **Invisible watermark removal** — SynthID, StableSignature, TreeRing via diffusion-based regeneration (needs a local GPU, or run it with no setup on [raiw.cc](https://raiw.cc))
 - **AI metadata stripping** — EXIF, PNG text chunks, C2PA provenance manifests (PNG / JPEG / AVIF / HEIF / JPEG-XL, **MP4 / MOV / M4V / M4A** at the container level, and **WebM / MP3 / WAV / FLAC / OGG** losslessly via ffmpeg), XMP DigitalSourceType
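
A minimal sketch of the arithmetic behind the visible-removal entry above, on a single grayscale pixel rather than an image. It assumes a pure-white logo (255) and the captured peak alpha of 0.51 and gain cap of 1.94 quoted in the Gemini engine comments. `composite`, `reverse_alpha`, and `estimate_gain` are illustrative names, not the engine's API.

```python
# Illustrative single-pixel model; all function names are hypothetical.
LOGO = 255.0       # assumed pure-white sparkle
A_CAPTURE = 0.51   # captured peak alpha quoted in the engine comments
GAIN_MAX = 1.94    # gain cap quoted in the engine comments


def composite(bg: float, alpha: float) -> float:
    """Forward blend: wm = alpha * logo + (1 - alpha) * bg."""
    return alpha * LOGO + (1.0 - alpha) * bg


def reverse_alpha(wm: float, alpha: float) -> float:
    """Invert the blend: original = (wm - alpha * logo) / (1 - alpha)."""
    return (wm - alpha * LOGO) / (1.0 - alpha)


def estimate_gain(core_obs: float, bg: float, a_cap: float = A_CAPTURE) -> float:
    """Ratio of effective to captured opacity, clamped to [1.0, GAIN_MAX]."""
    if 255.0 - bg < 5.0:  # background too bright to measure opacity
        return 1.0
    a_eff = min(max((core_obs - bg) / (255.0 - bg), 0.0), 0.99)
    return min(max(a_eff / a_cap, 1.0), GAIN_MAX)


bg = 80.0
wm = composite(bg, A_CAPTURE * 1.3)       # sparkle rendered 1.3x more opaque
under = reverse_alpha(wm, A_CAPTURE)      # fixed alpha: bright residual remains
gain = estimate_gain(wm, bg)              # recovers the 1.3x opacity ratio
fixed = reverse_alpha(wm, min(A_CAPTURE * gain, 0.99))
print(round(under, 1), round(gain, 2), round(fixed, 1))
```

With the fixed alpha the recovered value lands around 135 against a true background of 80, which is the bright residual the gain is meant to remove; with the scaled alpha the background is recovered.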
diff --git a/scripts/visible_removal_audit.py b/scripts/visible_removal_audit.py
new file mode 100644
index 0000000..208ecae
--- /dev/null
+++ b/scripts/visible_removal_audit.py
@@ -0,0 +1,138 @@
+"""Audit visible-watermark removal over a local image corpus.
+
+For every image the registry detects a known visible mark in, run that mark's
+removal and re-detect on the output, recording before/after confidence and
+whether the detector still fires. Also bucket the detected-positive originals
+into per-mark dataset dirs so the visible-mark corpora are reproducible.
+
+Detector-clean after removal is necessary but, for the Doubao/Jimeng text marks,
+NOT sufficient: their NCC detector can stop firing while a thin residual outline
+is still visible (see CLAUDE.md). Treat a detector-clean Doubao/Jimeng result as
+"detector passes"; visual residual is a separate check.
+
+Operates on gitignored data only (data/spaces/...); writes nothing tracked.
+
+    uv run python scripts/visible_removal_audit.py \
+        --corpus data/spaces/originals --out data/spaces/_visible_audit.csv \
+        --dataset-root data/spaces/_visible_datasets
+"""
+
+from __future__ import annotations
+
+import csv
+import logging
+import shutil
+from pathlib import Path
+
+import click
+
+from remove_ai_watermarks import image_io
+from remove_ai_watermarks.watermark_registry import detect_marks, get_mark
+
+log = logging.getLogger(__name__)
+
+_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".avif", ".heic"}
+
+
+def _rel(p: Path, corpus: Path) -> str:
+    try:
+        return str(p.relative_to(corpus))
+    except ValueError:
+        return p.name
+
+
+@click.command()
+@click.option(
+    "--corpus", type=click.Path(exists=True, file_okay=False, path_type=Path), default=Path("data/spaces/originals")
+)
+@click.option("--out", type=click.Path(path_type=Path), default=Path("data/spaces/_visible_audit.csv"))
+@click.option("--dataset-root", type=click.Path(path_type=Path), default=Path("data/spaces/_visible_datasets"))
+@click.option(
+    "--paths-file",
+    type=click.Path(exists=True, path_type=Path),
+    default=None,
+    help="Audit only these paths (one per line), skipping the full rglob.",
+)
+@click.option("--limit", type=int, default=0, help="Scan at most N files (0 = all).")
+def main(corpus: Path, out: Path, dataset_root: Path, paths_file: Path | None, limit: int) -> None:
+    logging.basicConfig(level=logging.WARNING, format="%(message)s")
+    if paths_file is not None:
+        files = [Path(s) for line in paths_file.read_text().splitlines() if (s := line.strip()) and Path(s).is_file()]
+    else:
+        files = sorted(p for p in corpus.rglob("*") if p.is_file() and p.suffix.lower() in _EXTS)
+    if limit:
+        files = files[:limit]
+    click.echo(f"Scanning {len(files)} files under {corpus} ...")
+
+    rows: list[dict[str, str]] = []
+    n_detected = 0
+    n_clean_after = 0
+    fails: list[tuple[str, str, float]] = []
+
+    with click.progressbar(files, label="audit") as bar:
+        for p in bar:
+            img = image_io.imread(p)
+            if img is None:
+                continue
+            for det in detect_marks(img, include_explicit=False):
+                if not det.detected:
+                    continue
+                n_detected += 1
+                mark = get_mark(det.key)
+                # Bucket the positive original into the per-mark dataset (by basename; first copy wins).
+                ddir = dataset_root / det.key
+                ddir.mkdir(parents=True, exist_ok=True)
+                if not (ddir / p.name).exists():
+                    shutil.copy2(p, ddir / p.name)
+                # Remove, then re-detect with the SAME mark's detector.
+                try:
+                    cleaned, _ = mark.remove(img)
+                    after = mark.detect(cleaned)
+                except Exception as exc:
+                    log.warning("remove failed on %s (%s): %s", p.name, det.key, exc)
+                    rows.append(
+                        {
+                            "path": _rel(p, corpus),
+                            "mark": det.key,
+                            "conf_before": f"{det.confidence:.3f}",
+                            "conf_after": "",
+                            "removed": "error",
+                        }
+                    )
+                    continue
+                removed = not after.detected
+                n_clean_after += int(removed)
+                if not removed:
+                    fails.append((_rel(p, corpus), det.key, after.confidence))
+                rows.append(
+                    {
+                        "path": _rel(p, corpus),
+                        "mark": det.key,
+                        "conf_before": f"{det.confidence:.3f}",
+                        "conf_after": f"{after.confidence:.3f}",
+                        "removed": str(removed),
+                    }
+                )
+
+    out.parent.mkdir(parents=True, exist_ok=True)
+    with out.open("w", newline="") as f:
+        w = csv.DictWriter(f, fieldnames=["path", "mark", "conf_before", "conf_after", "removed"])
+        w.writeheader()
+        w.writerows(rows)
+
+    by_mark: dict[str, list[bool]] = {}
+    for r in rows:
+        if r["removed"] in ("True", "False"):
+            by_mark.setdefault(r["mark"], []).append(r["removed"] == "True")
+    click.echo(f"\nDetected positives: {n_detected}; detector-clean after removal: {n_clean_after}")
+    for k, v in sorted(by_mark.items()):
+        click.echo(f"  {k:8} removed {sum(v)}/{len(v)} ({100 * sum(v) // max(1, len(v))}%)")
+    if fails:
+        click.echo(f"\nDetector still fires after removal ({len(fails)}):")
+        for path, key, conf in fails[:30]:
+            click.echo(f"  {key:8} {conf:.3f}  {path}")
+    click.echo(f"\nReport: {out}  |  Datasets: {dataset_root}/<mark>/")
+
+
+if __name__ == "__main__":
+    main()
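
The report written by this script has the columns `path`, `mark`, `conf_before`, `conf_after`, and `removed`, where `removed` is `True`, `False`, or `error`. A minimal sketch of re-reading such a report to get per-mark tallies with error rows counted separately follows. The helper name `summarize`, the sample rows, and the mark keys in them are made up for illustration.

```python
import csv
import io
from collections import Counter


def summarize(report) -> dict[str, Counter]:
    """Tally the `removed` column per mark from an audit CSV file object."""
    tallies: dict[str, Counter] = {}
    for row in csv.DictReader(report):
        tallies.setdefault(row["mark"], Counter())[row["removed"]] += 1
    return tallies


# Made-up sample rows in the report's column layout.
sample = io.StringIO(
    "path,mark,conf_before,conf_after,removed\n"
    "a.png,gemini,0.912,0.104,True\n"
    "b.png,gemini,0.803,0.611,False\n"
    "c.jpg,doubao,0.774,,error\n"
)
tallies = summarize(sample)
for mark, counts in sorted(tallies.items()):
    done = counts["True"] + counts["False"]
    print(f"{mark:8} removed {counts['True']}/{done}, errors {counts['error']}")
```

Keeping `error` as its own bucket mirrors the script's summary, which only counts rows whose `removed` value is `True` or `False`.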
diff --git a/src/remove_ai_watermarks/gemini_engine.py b/src/remove_ai_watermarks/gemini_engine.py
index da552c0..8635ac9 100644
--- a/src/remove_ai_watermarks/gemini_engine.py
+++ b/src/remove_ai_watermarks/gemini_engine.py
@@ -140,6 +140,23 @@ class GeminiEngine:
     # gate separates them with a wide margin.
     _OVERSUB_FOOTPRINT_FRAC = 0.05
 
+    # Per-image alpha gain (under-subtraction fix). The captured alpha peaks ~0.51
+    # (a ~51%-opaque sparkle). Some real Gemini sparkles are rendered MORE opaque,
+    # so the fixed alpha under-subtracts and reverse-alpha leaves a bright residual
+    # the detector still fires on (~11% of marks on the spaces corpus). Estimate
+    # this image's effective sparkle opacity from the bright core vs the local
+    # background and scale the alpha to match, capped so alpha stays <= 0.99. The
+    # gain is clamped to >= 1.0 so it only ever STRENGTHENS removal: ~1.0 when the
+    # sparkle matches the capture (working cases unchanged), >1 when more opaque.
+    # On the spaces corpus the gain cleanly separates -- under-removed marks ~1.47,
+    # cleanly-removed ~1.00. 1.94 is the cap that reaches alpha 0.99 from 0.51.
+    _ALPHA_GAIN_MAX = 1.94
+    _ALPHA_GAIN_CORE_FRAC = 0.8  # body pixels at >= this * peak alpha define the core
+    # Deadband: apply the gain only above this, so a sparkle that already matches the
+    # capture (estimated gain ~1.0-1.04 from background noise) stays byte-identical to
+    # the pre-fix output. Under-removed marks estimate >= 1.26, well clear of the band.
+    _ALPHA_GAIN_DEADBAND = 1.05
+
     # Corner promotion (issue #36): the size weight that suppresses tiny-patch
     # false positives also buries a small, near-perfect sparkle when a larger,
     # mediocre match sits elsewhere (e.g. a bright collar in a portrait). A small
@@ -446,6 +463,12 @@ class GeminiEngine:
 
         pos = (detection.region[0], detection.region[1])
         alpha_map = self.get_interpolated_alpha(detection.region[2])
+        # Match the captured alpha to this image's sparkle opacity (under-subtraction
+        # fix): a more-opaque-than-captured sparkle would otherwise leave a bright
+        # residual. A gain at or below the deadband leaves the working cases byte-identical.
+        gain = self._estimate_alpha_gain(result, alpha_map, pos)
+        if gain > self._ALPHA_GAIN_DEADBAND:
+            alpha_map = np.clip(alpha_map * gain, 0.0, 0.99)
         logger.debug(
             "Removing watermark at (%d, %d) size %dx%d [conf=%.3f]",
             pos[0],
@@ -525,6 +548,49 @@ class GeminiEngine:
         alpha_roi = alpha_map[ay1 : ay1 + (y2 - y1), ax1 : ax1 + (x2 - x1)]
         return alpha_roi, (y1, y2, x1, x2)
 
+    def _estimate_alpha_gain(
+        self,
+        image: NDArray[Any],
+        alpha_map: NDArray[Any],
+        position: tuple[int, int],
+    ) -> float:
+        """Scale factor matching the captured alpha to this image's sparkle opacity.
+
+        The captured alpha (peak ~0.51) under-represents sparkles rendered more
+        opaque; reverse-alpha then leaves a bright residual. Estimate the effective
+        opacity at the sparkle core (observed brightness vs the local background
+        ring) and return ``a_eff / a_capture``, clamped to ``[1.0, _ALPHA_GAIN_MAX]``
+        so it only ever STRENGTHENS removal (1.0 = no change on a matching sparkle).
+        Returns 1.0 when the footprint, core, or background cannot be estimated reliably.
+        """
+        placed = self._footprint_indices(alpha_map, position, image.shape)
+        if placed is None:
+            return 1.0
+        alpha_roi, (y1, y2, x1, x2) = placed
+        a_cap = float(alpha_roi.max())
+        if a_cap < 0.2:
+            return 1.0
+        gray = image.astype(np.float32).mean(axis=2)
+        core = alpha_roi >= a_cap * self._ALPHA_GAIN_CORE_FRAC
+        if not bool(core.any()):
+            return 1.0
+        core_obs = float(np.percentile(gray[y1:y2, x1:x2][core], 75))
+        # Local background = a ring just outside the footprint box.
+        ih, iw = image.shape[:2]
+        pad = int((x2 - x1) * 0.7)
+        ry1, ry2 = max(0, y1 - pad), min(ih, y2 + pad)
+        rx1, rx2 = max(0, x1 - pad), min(iw, x2 + pad)
+        ring = gray[ry1:ry2, rx1:rx2]
+        ring_mask = np.ones(ring.shape, dtype=bool)
+        ring_mask[y1 - ry1 : y2 - ry1, x1 - rx1 : x2 - rx1] = False
+        if int(ring_mask.sum()) < 10:
+            return 1.0
+        bg = float(np.median(ring[ring_mask]))
+        if 255.0 - bg < 5.0:
+            return 1.0
+        a_eff = float(np.clip((core_obs - bg) / (255.0 - bg), 0.0, 0.99))
+        return float(np.clip(a_eff / a_cap, 1.0, self._ALPHA_GAIN_MAX))
+
     def _reverse_alpha_oversubtracts(
         self,
         image: NDArray[Any],
diff --git a/tests/test_gemini_engine.py b/tests/test_gemini_engine.py
index 37451ab..c04e609 100644
--- a/tests/test_gemini_engine.py
+++ b/tests/test_gemini_engine.py
@@ -286,6 +286,66 @@ class TestOverSubtractionGuard:
         assert self.engine._reverse_alpha_oversubtracts(dark, dalpha, (dpos[0], dpos[1])) is True
 
 
+class TestUnderSubtractionGain:
+    """Under-subtraction fix: a sparkle MORE opaque than the captured alpha must not
+    survive removal. The captured alpha (~0.51) under-represents such marks, so the
+    fixed-alpha reverse blend leaves a bright residual; the per-image gain scales the
+    alpha up to match this image's opacity. Mirror of TestOverSubtractionGuard.
+    """
+
+    @pytest.fixture(autouse=True)
+    def _setup_engine(self):
+        self.engine = GeminiEngine()
+
+    def _composite_sparkle(self, bg_value: int, alpha_scale: float, size: int = 1400):
+        """Flat ``bg_value`` image with the sparkle composited at ``alpha_scale`` opacity.
+
+        ``alpha_scale`` > 1 makes the mark MORE opaque than the engine's captured alpha,
+        reproducing the under-subtraction case (real under-removed marks estimate ~1.47).
+        """
+        img = np.full((size, size, 3), bg_value, dtype=np.float32)
+        config = get_watermark_config(size, size)
+        x, y = config.get_position(size, size)
+        alpha = self.engine.get_alpha_map(WatermarkSize.LARGE)
+        ah, aw = alpha.shape[:2]
+        a = np.clip(alpha * alpha_scale, 0.0, 1.0)[:, :, None]
+        roi = img[y : y + ah, x : x + aw]
+        img[y : y + ah, x : x + aw] = a * 255.0 + (1.0 - a) * roi
+        return np.clip(img, 0, 255).astype(np.uint8), (x, y, aw, ah)
+
+    def test_more_opaque_sparkle_estimates_gain_above_deadband(self):
+        image, pos = self._composite_sparkle(bg_value=80, alpha_scale=1.3)
+        alpha = self.engine.get_interpolated_alpha(pos[2])
+        gain = self.engine._estimate_alpha_gain(image, alpha, (pos[0], pos[1]))
+        assert gain > self.engine._ALPHA_GAIN_DEADBAND, f"gain {gain} did not exceed deadband"
+
+    def test_matching_sparkle_estimates_unit_gain(self):
+        """A sparkle that matches the captured opacity gets ~1.0 (no scaling)."""
+        image, pos = self._composite_sparkle(bg_value=80, alpha_scale=1.0)
+        alpha = self.engine.get_interpolated_alpha(pos[2])
+        gain = self.engine._estimate_alpha_gain(image, alpha, (pos[0], pos[1]))
+        assert gain <= self.engine._ALPHA_GAIN_DEADBAND, f"matching sparkle scaled by {gain}"
+
+    def test_more_opaque_sparkle_is_removed(self):
+        """The gain-scaled removal clears a more-opaque sparkle without a black pit.
+
+        Asserted on the footprint PIXELS, not the detector: the detector's NCC is
+        degenerate on a perfectly flat synthetic background (zero-variance regions
+        spuriously match), so a re-detect conf is meaningless here -- on real textured
+        images the same removal drops the detector from ~0.80 to ~0.27 (spaces corpus).
+        """
+        image, (x, y, w, h) = self._composite_sparkle(bg_value=80, alpha_scale=1.3)
+        assert self.engine.detect_watermark(image).detected
+        before_max = int(image[y : y + h, x : x + w].max())  # bright sparkle present
+        assert before_max > 150
+        out = self.engine.remove_watermark(image)
+        footprint = out[y : y + h, x : x + w]
+        # Sparkle gone: no bright residual, no black pit, footprint reads like the bg.
+        assert int(footprint.max()) < 80 + 30, f"bright residual: max={footprint.max()}"
+        assert int(footprint.min()) > 25, f"black pit: min={footprint.min()}"
+        assert abs(float(footprint.mean()) - 80.0) < 20.0
+
+
 class TestCornerPromotion:
     """Issue #36: a small sparkle in the corner must not be lost to a larger decoy.