fix(gemini): inpaint sparkle footprint when reverse-alpha over-subtracts (#30)

On a dark/textured background (e.g. grass) the captured alpha map over-estimates
the real Gemini sparkle's effective opacity (~0.51 captured vs ~0.31 effective),
so the fixed-alpha reverse blend over-subtracts (watermarked - alpha*logo goes
negative) and drives the footprint to black -- the white sparkle turns into a
black diamond (issue #30, reported by @CoolZimo1).
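
The arithmetic behind the over-subtraction can be sketched for a single pixel. This is a minimal illustration, assuming the ~0.51 captured and ~0.31 effective alphas quoted above, a white (255) logo, and hypothetical flat background values of 60 (dark) and 230 (bright). The helper name is invented for this example and is not part of the engine.

```python
LOGO = 255.0            # white sparkle
CAPTURED_ALPHA = 0.51   # peak of the captured alpha map (quoted above)
EFFECTIVE_ALPHA = 0.31  # opacity the image was actually blended with (quoted above)


def reverse_alpha_numerator(background: float) -> float:
    """Numerator of the reverse blend for one pixel (hypothetical helper)."""
    # Forward blend with the real, lower opacity.
    watermarked = EFFECTIVE_ALPHA * LOGO + (1.0 - EFFECTIVE_ALPHA) * background
    # Reverse blend subtracts the captured, higher opacity.
    return watermarked - CAPTURED_ALPHA * LOGO


dark = reverse_alpha_numerator(60.0)     # assumed dark background value
bright = reverse_alpha_numerator(230.0)  # assumed bright background value
print(f"dark numerator:   {dark:.2f}")
print(f"bright numerator: {bright:.2f}")
# A negative numerator divided by (1 - alpha) stays negative and clips to 0 (black).
print(f"dark recovered:   {dark / (1.0 - CAPTURED_ALPHA):.2f}")
```

Under these assumed numbers the numerator is negative on the dark background and positive on the bright one; the sign flips at a background value of roughly 74.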

remove_watermark now detects this via _reverse_alpha_oversubtracts (fraction of
footprint pixels with a negative numerator > 5%) and inpaints the small sparkle
footprint from the surrounding pixels (cv2 NS, cropped to a padded box) instead.
Behavior is unchanged on the working case: on a bright background ~0% of the
footprint over-subtracts, so reverse-alpha is still used and the output is
byte-identical to before. Verified: the over-subtraction fraction is 0.0 on
demo_banana vs 0.61 on the issue-#30 grass image, whose footprint now recovers to
background grass with no pit (residual sparkle conf 0.25 < 0.35).
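
The detection gate described above can be sketched on synthetic data. This is an illustration rather than the engine's code: it works on a single-channel image instead of averaging BGR channels, uses a Gaussian bump as a stand-in for the real alpha map, and the function and variable names are hypothetical. The 0.1 body threshold, the 0.05 gate, and the 0.6 opacity scale are taken from this commit.

```python
import numpy as np

LOGO = 255.0
BODY_ALPHA = 0.1     # pixels at/above this alpha count as sparkle body
OVERSUB_FRAC = 0.05  # gate: more than 5% negative numerators -> inpaint


def oversubtract_fraction(gray: np.ndarray, alpha: np.ndarray) -> float:
    """Fraction of body pixels whose reverse-alpha numerator is negative."""
    body = alpha >= BODY_ALPHA
    if not body.any():
        return 0.0
    numerator = gray.astype(np.float32) - np.clip(alpha, 0.0, 0.99) * LOGO
    return float((numerator[body] < 0).sum()) / float(body.sum())


# Stand-in alpha map: a Gaussian bump peaking at 0.51 (NOT the real sparkle).
yy, xx = np.mgrid[0:48, 0:48]
alpha = 0.51 * np.exp(-((xx - 24) ** 2 + (yy - 24) ** 2) / (2.0 * 8.0**2))


def composite(bg: float, scale: float = 0.6) -> np.ndarray:
    """Blend a white mark at a lower effective opacity than the stand-in map."""
    a = alpha * scale
    return a * LOGO + (1.0 - a) * bg


frac_bright = oversubtract_fraction(composite(230.0), alpha)
frac_dark = oversubtract_fraction(composite(60.0), alpha)
# True means "inpaint instead of reverse-alpha" under the 5% gate.
print(frac_bright > OVERSUB_FRAC, frac_dark > OVERSUB_FRAC)
```

With this stand-in map, no body pixel over-subtracts on the bright background, while the dark background clears the 5% gate. The exact fractions depend on the shape of the alpha map, so they will not match the 0.61 measured on the real grass image.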

Guard is scoped to GeminiEngine: doubao/jimeng already NCC-align their alpha to
the actual mark per image, which sidesteps the fixed-alpha mismatch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Victor Kuznetsov
2026-06-02 09:17:32 -07:00
parent b25276c4f2
commit 9ca2811938
5 changed files with 166 additions and 19 deletions
+1 -1
File diff suppressed because one or more lines are too long
+1 -1
@@ -17,7 +17,7 @@ If this tool saves you time, consider [sponsoring its development](https://githu
## Features
- **Visible watermark removal** — a registry of known marks in their usual places: the Gemini / Nano Banana sparkle, the Doubao "豆包AI生成" text strip, and the Jimeng "★ 即梦AI" wordmark. Each is removed by **reverse-alpha blending** against a captured alpha map (`original = (wm − α·logo)/(1−α)`), recovering the true pixels rather than inpainting a guess. The Gemini sparkle recovers cleanly on its own; the Doubao and Jimeng text marks re-rasterize slightly per image, so a thin residual inpaint over the glyph footprint clears the leftover edges (the alpha maps are reproducibly rebuilt from controlled captures by `scripts/visible_alpha_solve.py`). Fast, offline, no GPU. `visible --mark auto` finds and removes the strongest detected mark. (For arbitrary logos/objects, see `erase`.)
- **Visible watermark removal** — a registry of known marks in their usual places: the Gemini / Nano Banana sparkle, the Doubao "豆包AI生成" text strip, and the Jimeng "★ 即梦AI" wordmark. Each is removed by **reverse-alpha blending** against a captured alpha map (`original = (wm − α·logo)/(1−α)`), recovering the true pixels rather than inpainting a guess. The Gemini sparkle recovers cleanly on its own on bright backgrounds (on a dark background, where the fixed alpha would over-subtract and leave a dark spot, it automatically inpaints the small sparkle footprint instead); the Doubao and Jimeng text marks re-rasterize slightly per image, so a thin residual inpaint over the glyph footprint clears the leftover edges (the alpha maps are reproducibly rebuilt from controlled captures by `scripts/visible_alpha_solve.py`). Fast, offline, no GPU. `visible --mark auto` finds and removes the strongest detected mark. (For arbitrary logos/objects, see `erase`.)
- **Universal region eraser (`erase`)** — remove any logo / watermark / object inside boxes you specify, regardless of position or colour. Default cv2 inpainting (CPU, instant); optional big-LaMa via onnxruntime (`lama` extra) for higher quality
- **Invisible watermark removal** — SynthID, StableSignature, TreeRing via diffusion-based regeneration (needs a local GPU, or run it with no setup on [raiw.cc](https://raiw.cc))
- **AI metadata stripping** — EXIF, PNG text chunks, C2PA provenance manifests (PNG / JPEG / AVIF / HEIF / JPEG-XL, **MP4 / MOV / M4V / M4A** at the container level, and **WebM / MP3 / WAV / FLAC / OGG** losslessly via ffmpeg), XMP DigitalSourceType
+107 -16
@@ -127,6 +127,17 @@ class GeminiEngine:
This is a Python port of the GeminiWatermarkTool C++ engine.
"""
# Footprint pixels with alpha at/above this are the sparkle body; below it the
# mark barely affects the pixel, so those are excluded from both the
# over-subtraction test and the inpaint mask.
_FOOTPRINT_ALPHA = 0.1
# If more than this fraction of footprint pixels over-subtract (numerator < 0),
# the fixed alpha does not match this image's sparkle and reverse-alpha would
# punch a dark pit -- inpaint instead. demo_banana measures 0.0 (reverse-alpha
# kept), the issue #30 dark-grass image measures ~0.61 (inpaint), so the 0.05
# gate separates them with a wide margin.
_OVERSUB_FOOTPRINT_FRAC = 0.05
def __init__(self, logo_value: float = 255.0) -> None:
"""Initialize the engine with embedded alpha maps.
@@ -365,7 +376,21 @@ class GeminiEngine:
detection.confidence,
)
self._reverse_alpha_blend(result, alpha_map, pos)
# The captured alpha map (max ~0.51 = a ~50%-opaque white sparkle) is exact
# only when the real mark's effective opacity matches it. On a dark/textured
# background the sparkle's effective alpha is lower than the capture, so the
# fixed-alpha reverse blend OVER-subtracts and drives the footprint to black --
# the "white sparkle turns into a black pit" bug (issue #30). The signature is
# a large fraction of footprint pixels whose numerator (watermarked - a*logo)
# goes negative, which is physically impossible under a brightening overlay.
# In that case inpaint the small footprint from the surrounding pixels instead;
# on a bright background no pixel over-subtracts, so reverse-alpha is used and
# the result is byte-identical to before (verified on demo_banana: 0% vs 61%).
if self._reverse_alpha_oversubtracts(result, alpha_map, pos):
logger.debug("Reverse-alpha over-subtracts on this background; inpainting sparkle footprint.")
self._inpaint_footprint(result, alpha_map, pos)
else:
self._reverse_alpha_blend(result, alpha_map, pos)
return result
def remove_watermark_custom(
@@ -399,6 +424,84 @@ class GeminiEngine:
self._reverse_alpha_blend(result, alpha, (x, y))
return result
def _footprint_indices(
self,
alpha_map: NDArray[Any],
position: tuple[int, int],
image_shape: tuple[int, ...],
) -> tuple[NDArray[Any], tuple[int, int, int, int]] | None:
"""Return (alpha_roi, (y1, y2, x1, x2)) for the placed footprint, or None.
Shared by the over-subtraction test and the inpaint mask so both operate on
exactly the same clipped, in-bounds region.
"""
x, y = position
ah, aw = alpha_map.shape[:2]
ih, iw = image_shape[:2]
x1, y1 = max(0, x), max(0, y)
x2, y2 = min(iw, x + aw), min(ih, y + ah)
if x1 >= x2 or y1 >= y2:
return None
ax1, ay1 = x1 - x, y1 - y
alpha_roi = alpha_map[ay1 : ay1 + (y2 - y1), ax1 : ax1 + (x2 - x1)]
return alpha_roi, (y1, y2, x1, x2)
def _reverse_alpha_oversubtracts(
self,
image: NDArray[Any],
alpha_map: NDArray[Any],
position: tuple[int, int],
) -> bool:
"""True when reverse-alpha would drive the footprint dark (issue #30).
Tests the numerator ``watermarked - alpha*logo`` over the sparkle body: a
brightening overlay can never make it negative, so a large negative fraction
means the fixed alpha over-estimates this image's opacity.
"""
placed = self._footprint_indices(alpha_map, position, image.shape)
if placed is None:
return False
alpha_roi, (y1, y2, x1, x2) = placed
body = alpha_roi >= self._FOOTPRINT_ALPHA
if not bool(body.any()):
return False
roi = image[y1:y2, x1:x2].astype(np.float32)
numerator = roi.mean(axis=2) - np.clip(alpha_roi, 0.0, 0.99) * self.logo_value
frac = float((numerator[body] < 0).sum()) / float(body.sum())
return frac > self._OVERSUB_FOOTPRINT_FRAC
def _inpaint_footprint(
self,
image: NDArray[Any],
alpha_map: NDArray[Any],
position: tuple[int, int],
) -> None:
"""Inpaint the sparkle body from surrounding pixels, in-place.
Fallback for backgrounds where reverse-alpha over-subtracts: a small mask of
the footprint (alpha >= threshold, dilated) is reconstructed by cv2 NS inpaint
from the continuous surroundings, so the sparkle is replaced by plausible
background instead of a black pit.
"""
placed = self._footprint_indices(alpha_map, position, image.shape)
if placed is None:
return
alpha_roi, (y1, y2, x1, x2) = placed
# Inpaint only a padded crop around the footprint, not the whole image: the
# mask is zero outside a ~96x96 corner, so inpainting the full (multi-MP)
# image would be hundreds of times more work for an identical result. The
# padding gives cv2 enough surrounding context to reconstruct the sparkle.
ih, iw = image.shape[:2]
pad = 24
cy1, cy2 = max(0, y1 - pad), min(ih, y2 + pad)
cx1, cx2 = max(0, x1 - pad), min(iw, x2 + pad)
crop = image[cy1:cy2, cx1:cx2]
mask = np.zeros(crop.shape[:2], dtype=np.uint8)
mask[y1 - cy1 : y2 - cy1, x1 - cx1 : x2 - cx1] = (alpha_roi >= self._FOOTPRINT_ALPHA).astype(np.uint8) * 255
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
mask = cv2.dilate(mask, kernel, iterations=2)
image[cy1:cy2, cx1:cx2] = cv2.inpaint(crop, mask, 6, cv2.INPAINT_NS)
def _reverse_alpha_blend(
self,
image: NDArray[Any],
@@ -409,22 +512,10 @@ class GeminiEngine:
Formula: original = (watermarked - a * logo) / (1 - a)
"""
x, y = position
ah, aw = alpha_map.shape[:2]
ih, iw = image.shape[:2]
# Clip to bounds
x1 = max(0, x)
y1 = max(0, y)
x2 = min(iw, x + aw)
y2 = min(ih, y + ah)
if x1 >= x2 or y1 >= y2:
placed = self._footprint_indices(alpha_map, position, image.shape)
if placed is None:
return
# Get ROIs
ax1, ay1 = x1 - x, y1 - y
alpha_roi = alpha_map[ay1 : ay1 + (y2 - y1), ax1 : ax1 + (x2 - x1)]
alpha_roi, (y1, y2, x1, x2) = placed
image_roi = image[y1:y2, x1:x2].astype(np.float32)
alpha_threshold = 0.002
@@ -8,7 +8,9 @@ present.
**Reverse-alpha based.** A known mark is a fixed semi-transparent overlay, so it
is removed by inverting the alpha blend against a captured alpha map
(``original = (wm - a*logo)/(1-a)``) -- recovering the true pixels rather than
inpainting a guess. Gemini and Doubao recover exactly with no inpaint at native;
inpainting a guess. Gemini and Doubao recover exactly with no inpaint at native on
bright/flat backgrounds (Gemini falls back to inpainting the sparkle footprint when
reverse-alpha would over-subtract on a dark background -- issue #30, see gemini_engine);
Jimeng adds a thin residual inpaint over the glyph footprint to clear the outline
its per-image render variation leaves behind (still seeded by the reverse-alpha
recovery, not a blind inpaint). Detection is consistent with that: each mark is
+54
@@ -230,3 +230,57 @@ class TestInpainting:
original = image.copy()
self.engine.inpaint_residual(image, (150, 150, 48, 48))
np.testing.assert_array_equal(image, original)
class TestOverSubtractionGuard:
"""Issue #30: reverse-alpha must not turn the sparkle into a black pit.
On a dark background the captured alpha over-estimates the real sparkle opacity,
so the fixed-alpha reverse blend over-subtracts and drives the footprint to black.
The engine detects this and inpaints the footprint instead.
"""
# Composite the mark at ~60% of the captured opacity: the engine's alpha maxes at
# ~0.51, real dark-background sparkles sit nearer ~0.31, so 0.6x reproduces the
# capture-over-estimates-reality mismatch that triggers the bug.
_REALISTIC_ALPHA_SCALE = 0.6
@pytest.fixture(autouse=True)
def _setup_engine(self):
self.engine = GeminiEngine()
def _composite_sparkle(self, bg_value: int, size: int = 1400, alpha_scale: float = _REALISTIC_ALPHA_SCALE):
"""Build a flat BGR image of ``bg_value`` with the sparkle composited in.
The mark is composited at a LOWER effective opacity than the engine's captured
alpha map (``alpha_scale`` < 1), reproducing the real-world mismatch behind
issue #30: the captured alpha (~0.51) over-estimates a real sparkle whose
effective opacity is lower, so the fixed-alpha reverse blend over-subtracts.
Placed at the configured large-image position so the detector locates it.
"""
img = np.full((size, size, 3), bg_value, dtype=np.float32)
config = get_watermark_config(size, size)
x, y = config.get_position(size, size)
alpha = self.engine.get_alpha_map(WatermarkSize.LARGE)
ah, aw = alpha.shape[:2]
a = (alpha * alpha_scale)[:, :, None]
roi = img[y : y + ah, x : x + aw]
img[y : y + ah, x : x + aw] = a * 255.0 + (1.0 - a) * roi
return np.clip(img, 0, 255).astype(np.uint8), (x, y, aw, ah)
def test_dark_background_does_not_leave_black_pit(self):
image, (x, y, w, h) = self._composite_sparkle(bg_value=60)
out = self.engine.remove_watermark(image)
footprint = out[y : y + h, x : x + w]
# The recovered footprint must read like the dark background, not a black hole.
assert footprint.min() > 25, f"black pit: min={footprint.min()}"
assert abs(float(footprint.mean()) - 60.0) < 25.0
def test_bright_background_keeps_reverse_alpha(self):
"""A bright background does not over-subtract (reverse-alpha is kept); a dark one does."""
bright, pos = self._composite_sparkle(bg_value=230)
alpha = self.engine.get_interpolated_alpha(pos[2])
assert self.engine._reverse_alpha_oversubtracts(bright, alpha, (pos[0], pos[1])) is False
dark, dpos = self._composite_sparkle(bg_value=60)
dalpha = self.engine.get_interpolated_alpha(dpos[2])
assert self.engine._reverse_alpha_oversubtracts(dark, dalpha, (dpos[0], dpos[1])) is True