fix(gemini): inpaint sparkle footprint when reverse-alpha over-subtracts (#30)

On a dark/textured background (e.g. grass) the captured alpha map over-estimates the real Gemini sparkle's effective opacity (~0.51 captured vs ~0.31 effective), so the fixed-alpha reverse blend over-subtracts (watermarked - alpha*logo goes negative) and drives the footprint to black -- the white sparkle turns into a black diamond (issue #30, reported by @CoolZimo1).

remove_watermark now detects this via _reverse_alpha_oversubtracts (fraction of footprint pixels with a negative numerator > 5%) and inpaints the small sparkle footprint from the surrounding pixels (cv2 NS, cropped to a padded box) instead.

Behavior-neutral on the working case: a bright background over-subtracts at ~0%, so reverse-alpha is used and the output is byte-identical to before (verified: demo_banana 0.0 frac vs the issue-#30 grass image 0.61 frac; issue-#30 footprint recovers to background grass with no pit, residual sparkle conf 0.25 < 0.35).

Guard is scoped to GeminiEngine: doubao/jimeng already NCC-align their alpha to the actual mark per image, which sidesteps the fixed-alpha mismatch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-21 07:00:56 +02:00 · 2026-06-02 09:17:32 -07:00
parent b25276c4f2
commit 9ca2811938
5 changed files with 166 additions and 19 deletions
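
The over-subtraction described in the commit message can be checked by hand. The sketch below is illustrative only: it assumes a single flat grey pixel at the sparkle's peak, uses the ~0.51 captured and ~0.31 effective opacities quoted above, and the helper names `composite` and `reverse_alpha` are hypothetical, not functions from the repository.

```python
# Worked arithmetic for the fixed-alpha mismatch (single pixel, flat background).
# Assumptions: white logo (255), captured alpha 0.51, effective alpha 0.31.
LOGO = 255.0
CAPTURED_ALPHA = 0.51   # what the engine's alpha map claims
EFFECTIVE_ALPHA = 0.31  # what the image was actually composited with


def composite(background: float, alpha: float, logo: float = LOGO) -> float:
    """Forward blend: watermarked = alpha*logo + (1 - alpha)*background."""
    return alpha * logo + (1.0 - alpha) * background


def reverse_alpha(watermarked: float, alpha: float, logo: float = LOGO) -> tuple[float, float]:
    """Return (numerator, recovered) for original = (wm - alpha*logo)/(1 - alpha)."""
    numerator = watermarked - alpha * logo
    return numerator, numerator / (1.0 - alpha)


# Dark background: the numerator goes negative, so the result clips to black.
num_dark, rec_dark = reverse_alpha(composite(60.0, EFFECTIVE_ALPHA), CAPTURED_ALPHA)

# Bright background: the numerator stays positive, so the guard does not fire.
num_bright, rec_bright = reverse_alpha(composite(230.0, EFFECTIVE_ALPHA), CAPTURED_ALPHA)

print(f"dark:   numerator={num_dark:.2f} recovered={rec_dark:.2f}")
print(f"bright: numerator={num_bright:.2f} recovered={rec_bright:.2f}")
```

Under these assumed numbers the dark pixel's numerator is negative while the bright pixel's is comfortably positive, which is the sign difference the new guard keys on.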
@@ -17,7 +17,7 @@ If this tool saves you time, consider [sponsoring its development](https://githu

 ## Features

- **Visible watermark removal** — a registry of known marks in their usual places: the Gemini / Nano Banana sparkle, the Doubao "豆包AI生成" text strip, and the Jimeng "★ 即梦AI" wordmark. Each is removed by **reverse-alpha blending** against a captured alpha map (`original = (wm − α·logo)/(1−α)`), recovering the true pixels rather than inpainting a guess. The Gemini sparkle recovers cleanly on its own; the Doubao and Jimeng text marks re-rasterize slightly per image, so a thin residual inpaint over the glyph footprint clears the leftover edges (the alpha maps are reproducibly rebuilt from controlled captures by `scripts/visible_alpha_solve.py`). Fast, offline, no GPU. `visible --mark auto` finds and removes the strongest detected mark. (For arbitrary logos/objects, see `erase`.)
+- **Visible watermark removal** — a registry of known marks in their usual places: the Gemini / Nano Banana sparkle, the Doubao "豆包AI生成" text strip, and the Jimeng "★ 即梦AI" wordmark. Each is removed by **reverse-alpha blending** against a captured alpha map (`original = (wm − α·logo)/(1−α)`), recovering the true pixels rather than inpainting a guess. The Gemini sparkle recovers cleanly on its own on bright backgrounds (on a dark background, where the fixed alpha would over-subtract and leave a dark spot, it automatically inpaints the small sparkle footprint instead); the Doubao and Jimeng text marks re-rasterize slightly per image, so a thin residual inpaint over the glyph footprint clears the leftover edges (the alpha maps are reproducibly rebuilt from controlled captures by `scripts/visible_alpha_solve.py`). Fast, offline, no GPU. `visible --mark auto` finds and removes the strongest detected mark. (For arbitrary logos/objects, see `erase`.)
 - **Universal region eraser (`erase`)** — remove any logo / watermark / object inside boxes you specify, regardless of position or colour. Default cv2 inpainting (CPU, instant); optional big-LaMa via onnxruntime (`lama` extra) for higher quality
 - **Invisible watermark removal** — SynthID, StableSignature, TreeRing via diffusion-based regeneration (needs a local GPU, or run it with no setup on [raiw.cc](https://raiw.cc))
 - **AI metadata stripping** — EXIF, PNG text chunks, C2PA provenance manifests (PNG / JPEG / AVIF / HEIF / JPEG-XL, **MP4 / MOV / M4V / M4A** at the container level, and **WebM / MP3 / WAV / FLAC / OGG** losslessly via ffmpeg), XMP DigitalSourceType
@@ -127,6 +127,17 @@ class GeminiEngine:
    This is a Python port of the GeminiWatermarkTool C++ engine.
    """

+    # Footprint pixels with alpha at/above this are the sparkle body; below it the
+    # mark barely affects the pixel, so those are excluded from both the
+    # over-subtraction test and the inpaint mask.
+    _FOOTPRINT_ALPHA = 0.1
+    # If more than this fraction of footprint pixels over-subtract (numerator < 0),
+    # the fixed alpha does not match this image's sparkle and reverse-alpha would
+    # punch a dark pit -- inpaint instead. demo_banana measures 0.0 (reverse-alpha
+    # kept), the issue #30 dark-grass image measures ~0.61 (inpaint), so the 0.05
+    # gate separates them with a wide margin.
+    _OVERSUB_FOOTPRINT_FRAC = 0.05
+
    def __init__(self, logo_value: float = 255.0) -> None:
        """Initialize the engine with embedded alpha maps.

@@ -365,7 +376,21 @@ class GeminiEngine:
            detection.confidence,
        )

-        self._reverse_alpha_blend(result, alpha_map, pos)
+        # The captured alpha map (max ~0.51 = a ~50%-opaque white sparkle) is exact
+        # only when the real mark's effective opacity matches it. On a dark/textured
+        # background the sparkle's effective alpha is lower than the capture, so the
+        # fixed-alpha reverse blend OVER-subtracts and drives the footprint to black --
+        # the "white sparkle turns into a black pit" bug (issue #30). The signature is
+        # a large fraction of footprint pixels whose numerator (watermarked - a*logo)
+        # goes negative, which is physically impossible under a brightening overlay.
+        # In that case inpaint the small footprint from the surrounding pixels instead;
+        # on a bright background no pixel over-subtracts, so reverse-alpha is used and
+        # the result is byte-identical to before (demo_banana: 0%; issue #30 image: 61%).
+        if self._reverse_alpha_oversubtracts(result, alpha_map, pos):
+            logger.debug("Reverse-alpha over-subtracts on this background; inpainting sparkle footprint.")
+            self._inpaint_footprint(result, alpha_map, pos)
+        else:
+            self._reverse_alpha_blend(result, alpha_map, pos)
        return result

    def remove_watermark_custom(
@@ -399,6 +424,84 @@ class GeminiEngine:
        self._reverse_alpha_blend(result, alpha, (x, y))
        return result

+    def _footprint_indices(
+        self,
+        alpha_map: NDArray[Any],
+        position: tuple[int, int],
+        image_shape: tuple[int, ...],
+    ) -> tuple[NDArray[Any], tuple[int, int, int, int]] | None:
+        """Return (alpha_roi, (y1, y2, x1, x2)) for the placed footprint, or None.
+
+        Shared by the over-subtraction test and the inpaint mask so both operate on
+        exactly the same clipped, in-bounds region.
+        """
+        x, y = position
+        ah, aw = alpha_map.shape[:2]
+        ih, iw = image_shape[:2]
+        x1, y1 = max(0, x), max(0, y)
+        x2, y2 = min(iw, x + aw), min(ih, y + ah)
+        if x1 >= x2 or y1 >= y2:
+            return None
+        ax1, ay1 = x1 - x, y1 - y
+        alpha_roi = alpha_map[ay1 : ay1 + (y2 - y1), ax1 : ax1 + (x2 - x1)]
+        return alpha_roi, (y1, y2, x1, x2)
+
+    def _reverse_alpha_oversubtracts(
+        self,
+        image: NDArray[Any],
+        alpha_map: NDArray[Any],
+        position: tuple[int, int],
+    ) -> bool:
+        """True when reverse-alpha would drive the footprint dark (issue #30).
+
+        Tests the numerator ``watermarked - alpha*logo`` over the sparkle body: a
+        brightening overlay can never make it negative, so a large negative fraction
+        means the fixed alpha over-estimates this image's opacity.
+        """
+        placed = self._footprint_indices(alpha_map, position, image.shape)
+        if placed is None:
+            return False
+        alpha_roi, (y1, y2, x1, x2) = placed
+        body = alpha_roi >= self._FOOTPRINT_ALPHA
+        if not bool(body.any()):
+            return False
+        roi = image[y1:y2, x1:x2].astype(np.float32)
+        numerator = roi.mean(axis=2) - np.clip(alpha_roi, 0.0, 0.99) * self.logo_value
+        frac = float((numerator[body] < 0).sum()) / float(body.sum())
+        return frac > self._OVERSUB_FOOTPRINT_FRAC
+
+    def _inpaint_footprint(
+        self,
+        image: NDArray[Any],
+        alpha_map: NDArray[Any],
+        position: tuple[int, int],
+    ) -> None:
+        """Inpaint the sparkle body from surrounding pixels, in-place.
+
+        Fallback for backgrounds where reverse-alpha over-subtracts: a small mask of
+        the footprint (alpha >= threshold, dilated) is reconstructed by cv2 NS inpaint
+        from the continuous surroundings, so the sparkle is replaced by plausible
+        background instead of a black pit.
+        """
+        placed = self._footprint_indices(alpha_map, position, image.shape)
+        if placed is None:
+            return
+        alpha_roi, (y1, y2, x1, x2) = placed
+        # Inpaint only a padded crop around the footprint, not the whole image: the
+        # mask is zero outside a ~96x96 corner, so inpainting the full (multi-MP)
+        # image would be hundreds of times more work for an identical result. The
+        # padding gives cv2 enough surrounding context to reconstruct the sparkle.
+        ih, iw = image.shape[:2]
+        pad = 24
+        cy1, cy2 = max(0, y1 - pad), min(ih, y2 + pad)
+        cx1, cx2 = max(0, x1 - pad), min(iw, x2 + pad)
+        crop = image[cy1:cy2, cx1:cx2]
+        mask = np.zeros(crop.shape[:2], dtype=np.uint8)
+        mask[y1 - cy1 : y2 - cy1, x1 - cx1 : x2 - cx1] = (alpha_roi >= self._FOOTPRINT_ALPHA).astype(np.uint8) * 255
+        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
+        mask = cv2.dilate(mask, kernel, iterations=2)
+        image[cy1:cy2, cx1:cx2] = cv2.inpaint(crop, mask, 6, cv2.INPAINT_NS)
+
    def _reverse_alpha_blend(
        self,
        image: NDArray[Any],
@@ -409,22 +512,10 @@ class GeminiEngine:

        Formula: original = (watermarked - a * logo) / (1 - a)
        """
-        x, y = position
-        ah, aw = alpha_map.shape[:2]
-        ih, iw = image.shape[:2]
-
-        # Clip to bounds
-        x1 = max(0, x)
-        y1 = max(0, y)
-        x2 = min(iw, x + aw)
-        y2 = min(ih, y + ah)
-
-        if x1 >= x2 or y1 >= y2:
+        placed = self._footprint_indices(alpha_map, position, image.shape)
+        if placed is None:
            return
-
-        # Get ROIs
-        ax1, ay1 = x1 - x, y1 - y
-        alpha_roi = alpha_map[ay1 : ay1 + (y2 - y1), ax1 : ax1 + (x2 - x1)]
+        alpha_roi, (y1, y2, x1, x2) = placed
        image_roi = image[y1:y2, x1:x2].astype(np.float32)

        alpha_threshold = 0.002
@@ -8,7 +8,9 @@ present.
 **Reverse-alpha based.** A known mark is a fixed semi-transparent overlay, so it
 is removed by inverting the alpha blend against a captured alpha map
 (``original = (wm - a*logo)/(1-a)``) -- recovering the true pixels rather than
-inpainting a guess. Gemini and Doubao recover exactly with no inpaint at native;
+inpainting a guess. Gemini and Doubao recover exactly with no inpaint at native on
+bright/flat backgrounds (Gemini falls back to inpainting the sparkle footprint when
+reverse-alpha would over-subtract on a dark background -- issue #30, see gemini_engine);
 Jimeng adds a thin residual inpaint over the glyph footprint to clear the outline
 its per-image render variation leaves behind (still seeded by the reverse-alpha
 recovery, not a blind inpaint). Detection is consistent with that: each mark is
@@ -230,3 +230,57 @@ class TestInpainting:
        original = image.copy()
        self.engine.inpaint_residual(image, (150, 150, 48, 48))
        np.testing.assert_array_equal(image, original)
+
+
+class TestOverSubtractionGuard:
+    """Issue #30: reverse-alpha must not turn the sparkle into a black pit.
+
+    On a dark background the captured alpha over-estimates the real sparkle opacity,
+    so the fixed-alpha reverse blend over-subtracts and drives the footprint to black.
+    The engine detects this and inpaints the footprint instead.
+    """
+
+    # Composite the mark at ~60% of the captured opacity: the engine's alpha maxes at
+    # ~0.51, real dark-background sparkles sit nearer ~0.31, so 0.6x reproduces the
+    # capture-over-estimates-reality mismatch that triggers the bug.
+    _REALISTIC_ALPHA_SCALE = 0.6
+
+    @pytest.fixture(autouse=True)
+    def _setup_engine(self):
+        self.engine = GeminiEngine()
+
+    def _composite_sparkle(self, bg_value: int, size: int = 1400, alpha_scale: float = _REALISTIC_ALPHA_SCALE):
+        """Build a flat BGR image of ``bg_value`` with the sparkle composited in.
+
+        The mark is composited at a LOWER effective opacity than the engine's captured
+        alpha map (``alpha_scale`` < 1), reproducing the real-world mismatch behind
+        issue #30: the captured alpha (~0.51) over-estimates a real sparkle whose
+        effective opacity is lower, so the fixed-alpha reverse blend over-subtracts.
+        Placed at the configured large-image position so the detector locates it.
+        """
+        img = np.full((size, size, 3), bg_value, dtype=np.float32)
+        config = get_watermark_config(size, size)
+        x, y = config.get_position(size, size)
+        alpha = self.engine.get_alpha_map(WatermarkSize.LARGE)
+        ah, aw = alpha.shape[:2]
+        a = (alpha * alpha_scale)[:, :, None]
+        roi = img[y : y + ah, x : x + aw]
+        img[y : y + ah, x : x + aw] = a * 255.0 + (1.0 - a) * roi
+        return np.clip(img, 0, 255).astype(np.uint8), (x, y, aw, ah)
+
+    def test_dark_background_does_not_leave_black_pit(self):
+        image, (x, y, w, h) = self._composite_sparkle(bg_value=60)
+        out = self.engine.remove_watermark(image)
+        footprint = out[y : y + h, x : x + w]
+        # The recovered footprint must read like the dark background, not a black hole.
+        assert footprint.min() > 25, f"black pit: min={footprint.min()}"
+        assert abs(float(footprint.mean()) - 60.0) < 25.0
+
+    def test_bright_background_keeps_reverse_alpha(self):
+        """A bright background does not over-subtract, so reverse-alpha is used."""
+        bright, pos = self._composite_sparkle(bg_value=230)
+        alpha = self.engine.get_interpolated_alpha(pos[2])
+        assert self.engine._reverse_alpha_oversubtracts(bright, alpha, (pos[0], pos[1])) is False
+        dark, dpos = self._composite_sparkle(bg_value=60)
+        dalpha = self.engine.get_interpolated_alpha(dpos[2])
+        assert self.engine._reverse_alpha_oversubtracts(dark, dalpha, (dpos[0], dpos[1])) is True
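
The gate exercised by these tests can be sketched without NumPy or OpenCV. This is a simplified, assumption-laden model rather than the engine's code: the footprint is a flat list of pixels, the captured alpha values are made up for illustration, and `oversubtract_fraction` and `synthetic_footprint` are hypothetical helpers. The 0.1 body threshold, the 0.05 gate, the 0.99 alpha clip, and the 0.6 composite scale are the values that appear in the diff.

```python
# Pure-Python sketch of the over-subtraction gate on a 1-D "footprint".
FOOTPRINT_ALPHA = 0.1  # body threshold, as in the diff
OVERSUB_FRAC = 0.05    # gate, as in the diff
LOGO = 255.0


def oversubtract_fraction(pixels: list[float], alphas: list[float], logo: float = LOGO) -> float:
    """Fraction of body pixels whose numerator (pixel - alpha*logo) is negative."""
    body = [(p, min(max(a, 0.0), 0.99)) for p, a in zip(pixels, alphas) if a >= FOOTPRINT_ALPHA]
    if not body:
        return 0.0
    negative = sum(1 for p, a in body if p - a * logo < 0)
    return negative / len(body)


def synthetic_footprint(background: float, alphas: list[float],
                        alpha_scale: float = 0.6, logo: float = LOGO) -> list[float]:
    """Composite the logo at a reduced opacity, mirroring the test fixture's idea."""
    return [alpha_scale * a * logo + (1.0 - alpha_scale * a) * background for a in alphas]


# Made-up captured alpha samples; the first falls below the body threshold.
captured = [0.05, 0.15, 0.30, 0.45, 0.50, 0.51]

dark_frac = oversubtract_fraction(synthetic_footprint(60.0, captured), captured)
bright_frac = oversubtract_fraction(synthetic_footprint(230.0, captured), captured)

use_inpaint_dark = dark_frac > OVERSUB_FRAC      # guard fires: inpaint
use_inpaint_bright = bright_frac > OVERSUB_FRAC  # guard stays off: reverse-alpha
print(dark_frac, bright_frac, use_inpaint_dark, use_inpaint_bright)
```

In this toy setup only the high-alpha samples over-subtract on the dark background, so the fraction is driven by the sparkle's core rather than its faint edge, which is one plausible reason a fractional gate is more robust than testing a single pixel.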