mirror of
https://github.com/wiltodelta/remove-ai-watermarks.git
synced 2026-06-05 10:38:00 +02:00
fix(gemini): inpaint sparkle footprint when reverse-alpha over-subtracts (#30)
On a dark/textured background (e.g. grass) the captured alpha map over-estimates the real Gemini sparkle's effective opacity (~0.51 captured vs ~0.31 effective), so the fixed-alpha reverse blend over-subtracts (watermarked - alpha*logo goes negative) and drives the footprint to black -- the white sparkle turns into a black diamond (issue #30, reported by @CoolZimo1).

remove_watermark now detects this via _reverse_alpha_oversubtracts (fraction of footprint pixels with a negative numerator > 5%) and inpaints the small sparkle footprint from the surrounding pixels (cv2 NS, cropped to a padded box) instead.

Behavior-neutral on the working case: a bright background over-subtracts at ~0%, so reverse-alpha is used and the output is byte-identical to before (verified: demo_banana 0.0 frac vs the issue-#30 grass image 0.61 frac; issue-#30 footprint recovers to background grass with no pit, residual sparkle conf 0.25 < 0.35).

Guard is scoped to GeminiEngine: doubao/jimeng already NCC-align their alpha to the actual mark per image, which sidesteps the fixed-alpha mismatch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
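The over-subtraction described in this commit message can be checked with scalar arithmetic. The following is a minimal sketch, assuming a single grey pixel, a white logo value of 255, and the approximate opacities quoted in the message (0.51 captured, 0.31 effective). The function names and the background values 60 and 230 are illustrative and are not taken from the repository.

```python
LOGO = 255.0  # white sparkle, matching the engine's default logo_value

# Approximate opacities quoted in the commit message.
CAPTURED, EFFECTIVE = 0.51, 0.31


def composite(background: float, alpha: float) -> float:
    """Forward blend: the pixel value once the sparkle is overlaid."""
    return alpha * LOGO + (1.0 - alpha) * background


def numerator(watermarked: float, alpha: float) -> float:
    """Numerator of the reverse blend, (watermarked - alpha * logo)."""
    return watermarked - alpha * LOGO


# The mark is really applied at EFFECTIVE opacity, but removal assumes CAPTURED.
dark = numerator(composite(60.0, EFFECTIVE), CAPTURED)     # hypothetical dark background
bright = numerator(composite(230.0, EFFECTIVE), CAPTURED)  # hypothetical bright background

print(f"dark numerator:   {dark:.2f}")
print(f"bright numerator: {bright:.2f}")
```

Under these assumptions the dark-background numerator comes out negative (about -9.6), which is the signature the guard looks for, while the bright-background numerator stays well above zero.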
@@ -17,7 +17,7 @@ If this tool saves you time, consider [sponsoring its development](https://githu
 
 ## Features
 
-- **Visible watermark removal** — a registry of known marks in their usual places: the Gemini / Nano Banana sparkle, the Doubao "豆包AI生成" text strip, and the Jimeng "★ 即梦AI" wordmark. Each is removed by **reverse-alpha blending** against a captured alpha map (`original = (wm − α·logo)/(1−α)`), recovering the true pixels rather than inpainting a guess. The Gemini sparkle recovers cleanly on its own; the Doubao and Jimeng text marks re-rasterize slightly per image, so a thin residual inpaint over the glyph footprint clears the leftover edges (the alpha maps are reproducibly rebuilt from controlled captures by `scripts/visible_alpha_solve.py`). Fast, offline, no GPU. `visible --mark auto` finds and removes the strongest detected mark. (For arbitrary logos/objects, see `erase`.)
+- **Visible watermark removal** — a registry of known marks in their usual places: the Gemini / Nano Banana sparkle, the Doubao "豆包AI生成" text strip, and the Jimeng "★ 即梦AI" wordmark. Each is removed by **reverse-alpha blending** against a captured alpha map (`original = (wm − α·logo)/(1−α)`), recovering the true pixels rather than inpainting a guess. The Gemini sparkle recovers cleanly on its own on bright backgrounds (on a dark background, where the fixed alpha would over-subtract and leave a dark spot, it automatically inpaints the small sparkle footprint instead); the Doubao and Jimeng text marks re-rasterize slightly per image, so a thin residual inpaint over the glyph footprint clears the leftover edges (the alpha maps are reproducibly rebuilt from controlled captures by `scripts/visible_alpha_solve.py`). Fast, offline, no GPU. `visible --mark auto` finds and removes the strongest detected mark. (For arbitrary logos/objects, see `erase`.)
 - **Universal region eraser (`erase`)** — remove any logo / watermark / object inside boxes you specify, regardless of position or colour. Default cv2 inpainting (CPU, instant); optional big-LaMa via onnxruntime (`lama` extra) for higher quality
 - **Invisible watermark removal** — SynthID, StableSignature, TreeRing via diffusion-based regeneration (needs a local GPU, or run it with no setup on [raiw.cc](https://raiw.cc))
 - **AI metadata stripping** — EXIF, PNG text chunks, C2PA provenance manifests (PNG / JPEG / AVIF / HEIF / JPEG-XL, **MP4 / MOV / M4V / M4A** at the container level, and **WebM / MP3 / WAV / FLAC / OGG** losslessly via ffmpeg), XMP DigitalSourceType
@@ -127,6 +127,17 @@ class GeminiEngine:
     This is a Python port of the GeminiWatermarkTool C++ engine.
     """
 
+    # Footprint pixels with alpha at/above this are the sparkle body; below it the
+    # mark barely affects the pixel, so those are excluded from both the
+    # over-subtraction test and the inpaint mask.
+    _FOOTPRINT_ALPHA = 0.1
+    # If more than this fraction of footprint pixels over-subtract (numerator < 0),
+    # the fixed alpha does not match this image's sparkle and reverse-alpha would
+    # punch a dark pit -- inpaint instead. demo_banana measures 0.0 (reverse-alpha
+    # kept), the issue #30 dark-grass image measures ~0.61 (inpaint), so the 0.05
+    # gate separates them with a wide margin.
+    _OVERSUB_FOOTPRINT_FRAC = 0.05
+
     def __init__(self, logo_value: float = 255.0) -> None:
         """Initialize the engine with embedded alpha maps.
 
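The two class constants above define a simple fraction gate. As a rough, dependency-free sketch of that decision (the engine itself works on NumPy arrays; the function name and the sample pixel values here are hypothetical), the test can be written over plain lists of grey values and alphas:

```python
FOOTPRINT_ALPHA = 0.1          # mirrors _FOOTPRINT_ALPHA
OVERSUB_FOOTPRINT_FRAC = 0.05  # mirrors _OVERSUB_FOOTPRINT_FRAC
LOGO = 255.0                   # default logo_value


def oversubtracts(pixels: list[float], alphas: list[float]) -> bool:
    """True when too many sparkle-body pixels have a negative numerator."""
    # Keep only the sparkle body and clamp alpha the same way the diff does (0..0.99).
    body = [
        (p, min(max(a, 0.0), 0.99))
        for p, a in zip(pixels, alphas)
        if a >= FOOTPRINT_ALPHA
    ]
    if not body:
        return False
    negative = sum(1 for p, a in body if p - a * LOGO < 0)
    return negative / len(body) > OVERSUB_FOOTPRINT_FRAC


# Hypothetical watermarked grey values for four pixels of increasing alpha.
alphas = [0.05, 0.2, 0.4, 0.5]
dark_pixels = [62.0, 85.0, 100.0, 110.0]
bright_pixels = [231.0, 235.0, 240.0, 243.0]

print(oversubtracts(dark_pixels, alphas))
print(oversubtracts(bright_pixels, alphas))
```

In this made-up data the first pixel is excluded (alpha below 0.1), two of the three remaining dark pixels go negative, and none of the bright ones do, so only the dark case trips the gate.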
@@ -365,7 +376,21 @@ class GeminiEngine:
             detection.confidence,
         )
 
-        self._reverse_alpha_blend(result, alpha_map, pos)
+        # The captured alpha map (max ~0.51 = a ~50%-opaque white sparkle) is exact
+        # only when the real mark's effective opacity matches it. On a dark/textured
+        # background the sparkle's effective alpha is lower than the capture, so the
+        # fixed-alpha reverse blend OVER-subtracts and drives the footprint to black --
+        # the "white sparkle turns into a black pit" bug (issue #30). The signature is
+        # a large fraction of footprint pixels whose numerator (watermarked - a*logo)
+        # goes negative, which is physically impossible under a brightening overlay.
+        # In that case inpaint the small footprint from the surrounding pixels instead;
+        # on a bright background no pixel over-subtracts, so reverse-alpha is used and
+        # the result is byte-identical to before (verified on demo_banana: 0% vs 61%).
+        if self._reverse_alpha_oversubtracts(result, alpha_map, pos):
+            logger.debug("Reverse-alpha over-subtracts on this background; inpainting sparkle footprint.")
+            self._inpaint_footprint(result, alpha_map, pos)
+        else:
+            self._reverse_alpha_blend(result, alpha_map, pos)
         return result
 
     def remove_watermark_custom(
@@ -399,6 +424,84 @@ class GeminiEngine:
         self._reverse_alpha_blend(result, alpha, (x, y))
         return result
 
+    def _footprint_indices(
+        self,
+        alpha_map: NDArray[Any],
+        position: tuple[int, int],
+        image_shape: tuple[int, ...],
+    ) -> tuple[NDArray[Any], tuple[int, int, int, int]] | None:
+        """Return (alpha_roi, (y1, y2, x1, x2)) for the placed footprint, or None.
+
+        Shared by the over-subtraction test and the inpaint mask so both operate on
+        exactly the same clipped, in-bounds region.
+        """
+        x, y = position
+        ah, aw = alpha_map.shape[:2]
+        ih, iw = image_shape[:2]
+        x1, y1 = max(0, x), max(0, y)
+        x2, y2 = min(iw, x + aw), min(ih, y + ah)
+        if x1 >= x2 or y1 >= y2:
+            return None
+        ax1, ay1 = x1 - x, y1 - y
+        alpha_roi = alpha_map[ay1 : ay1 + (y2 - y1), ax1 : ax1 + (x2 - x1)]
+        return alpha_roi, (y1, y2, x1, x2)
+
+    def _reverse_alpha_oversubtracts(
+        self,
+        image: NDArray[Any],
+        alpha_map: NDArray[Any],
+        position: tuple[int, int],
+    ) -> bool:
+        """True when reverse-alpha would drive the footprint dark (issue #30).
+
+        Tests the numerator ``watermarked - alpha*logo`` over the sparkle body: a
+        brightening overlay can never make it negative, so a large negative fraction
+        means the fixed alpha over-estimates this image's opacity.
+        """
+        placed = self._footprint_indices(alpha_map, position, image.shape)
+        if placed is None:
+            return False
+        alpha_roi, (y1, y2, x1, x2) = placed
+        body = alpha_roi >= self._FOOTPRINT_ALPHA
+        if not bool(body.any()):
+            return False
+        roi = image[y1:y2, x1:x2].astype(np.float32)
+        numerator = roi.mean(axis=2) - np.clip(alpha_roi, 0.0, 0.99) * self.logo_value
+        frac = float((numerator[body] < 0).sum()) / float(body.sum())
+        return frac > self._OVERSUB_FOOTPRINT_FRAC
+
+    def _inpaint_footprint(
+        self,
+        image: NDArray[Any],
+        alpha_map: NDArray[Any],
+        position: tuple[int, int],
+    ) -> None:
+        """Inpaint the sparkle body from surrounding pixels, in-place.
+
+        Fallback for backgrounds where reverse-alpha over-subtracts: a small mask of
+        the footprint (alpha >= threshold, dilated) is reconstructed by cv2 NS inpaint
+        from the continuous surroundings, so the sparkle is replaced by plausible
+        background instead of a black pit.
+        """
+        placed = self._footprint_indices(alpha_map, position, image.shape)
+        if placed is None:
+            return
+        alpha_roi, (y1, y2, x1, x2) = placed
+        # Inpaint only a padded crop around the footprint, not the whole image: the
+        # mask is zero outside a ~96x96 corner, so inpainting the full (multi-MP)
+        # image would be ~hundreds of times more work for an identical result. The
+        # padding gives cv2 enough surrounding context to reconstruct the sparkle.
+        ih, iw = image.shape[:2]
+        pad = 24
+        cy1, cy2 = max(0, y1 - pad), min(ih, y2 + pad)
+        cx1, cx2 = max(0, x1 - pad), min(iw, x2 + pad)
+        crop = image[cy1:cy2, cx1:cx2]
+        mask = np.zeros(crop.shape[:2], dtype=np.uint8)
+        mask[y1 - cy1 : y2 - cy1, x1 - cx1 : x2 - cx1] = (alpha_roi >= self._FOOTPRINT_ALPHA).astype(np.uint8) * 255
+        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
+        mask = cv2.dilate(mask, kernel, iterations=2)
+        image[cy1:cy2, cx1:cx2] = cv2.inpaint(crop, mask, 6, cv2.INPAINT_NS)
+
     def _reverse_alpha_blend(
         self,
         image: NDArray[Any],
@@ -409,22 +512,10 @@ class GeminiEngine:
 
         Formula: original = (watermarked - a * logo) / (1 - a)
         """
-        x, y = position
-        ah, aw = alpha_map.shape[:2]
-        ih, iw = image.shape[:2]
-
-        # Clip to bounds
-        x1 = max(0, x)
-        y1 = max(0, y)
-        x2 = min(iw, x + aw)
-        y2 = min(ih, y + ah)
-
-        if x1 >= x2 or y1 >= y2:
+        placed = self._footprint_indices(alpha_map, position, image.shape)
+        if placed is None:
             return
-
-        # Get ROIs
-        ax1, ay1 = x1 - x, y1 - y
-        alpha_roi = alpha_map[ay1 : ay1 + (y2 - y1), ax1 : ax1 + (x2 - x1)]
+        alpha_roi, (y1, y2, x1, x2) = placed
         image_roi = image[y1:y2, x1:x2].astype(np.float32)
 
         alpha_threshold = 0.002
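The bounds clipping that `_footprint_indices` now centralises can be illustrated without NumPy. This is a small sketch of the same arithmetic on plain tuples; `clip_footprint` is a hypothetical stand-in that returns index ranges instead of an array slice, and the sizes used below are made up for the example.

```python
from typing import Optional, Tuple

Range = Tuple[int, int]
Box = Tuple[int, int, int, int]


def clip_footprint(
    position: Tuple[int, int],
    alpha_shape: Tuple[int, int],
    image_shape: Tuple[int, int],
) -> Optional[Tuple[Tuple[Range, Range], Box]]:
    """Clip an alpha map placed at `position` to the image bounds.

    Returns ((alpha_rows, alpha_cols), (y1, y2, x1, x2)), or None when the
    placed map lies entirely outside the image.
    """
    x, y = position
    ah, aw = alpha_shape
    ih, iw = image_shape
    x1, y1 = max(0, x), max(0, y)
    x2, y2 = min(iw, x + aw), min(ih, y + ah)
    if x1 >= x2 or y1 >= y2:
        return None
    # Offsets into the alpha map that correspond to the visible part.
    ax1, ay1 = x1 - x, y1 - y
    alpha_rows = (ay1, ay1 + (y2 - y1))
    alpha_cols = (ax1, ax1 + (x2 - x1))
    return (alpha_rows, alpha_cols), (y1, y2, x1, x2)


# A 96x96 map hanging off the left and bottom edges of a 100x50 (w x h) image.
partial = clip_footprint((-10, 5), (96, 96), (50, 100))
# The same map placed completely outside the image.
outside = clip_footprint((200, 200), (96, 96), (50, 100))
print(partial)
print(outside)
```

Because the over-subtraction test, the inpaint mask, and the reverse blend all go through one clipping helper, they cannot disagree about which pixels belong to the footprint.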
@@ -8,7 +8,9 @@ present.
 **Reverse-alpha based.** A known mark is a fixed semi-transparent overlay, so it
 is removed by inverting the alpha blend against a captured alpha map
 (``original = (wm - a*logo)/(1-a)``) -- recovering the true pixels rather than
-inpainting a guess. Gemini and Doubao recover exactly with no inpaint at native;
+inpainting a guess. Gemini and Doubao recover exactly with no inpaint at native on
+bright/flat backgrounds (Gemini falls back to inpainting the sparkle footprint when
+reverse-alpha would over-subtract on a dark background -- issue #30, see gemini_engine);
 Jimeng adds a thin residual inpaint over the glyph footprint to clear the outline
 its per-image render variation leaves behind (still seeded by the reverse-alpha
 recovery, not a blind inpaint). Detection is consistent with that: each mark is
@@ -230,3 +230,57 @@ class TestInpainting:
         original = image.copy()
         self.engine.inpaint_residual(image, (150, 150, 48, 48))
         np.testing.assert_array_equal(image, original)
+
+
+class TestOverSubtractionGuard:
+    """Issue #30: reverse-alpha must not turn the sparkle into a black pit.
+
+    On a dark background the captured alpha over-estimates the real sparkle opacity,
+    so the fixed-alpha reverse blend over-subtracts and drives the footprint to black.
+    The engine detects this and inpaints the footprint instead.
+    """
+
+    # Composite the mark at ~60% of the captured opacity: the engine's alpha maxes at
+    # ~0.51, real dark-background sparkles sit nearer ~0.31, so 0.6x reproduces the
+    # capture-over-estimates-reality mismatch that triggers the bug.
+    _REALISTIC_ALPHA_SCALE = 0.6
+
+    @pytest.fixture(autouse=True)
+    def _setup_engine(self):
+        self.engine = GeminiEngine()
+
+    def _composite_sparkle(self, bg_value: int, size: int = 1400, alpha_scale: float = _REALISTIC_ALPHA_SCALE):
+        """Build a flat BGR image of ``bg_value`` with the sparkle composited in.
+
+        The mark is composited at a LOWER effective opacity than the engine's captured
+        alpha map (``alpha_scale`` < 1), reproducing the real-world mismatch behind
+        issue #30: the captured alpha (~0.51) over-estimates a real sparkle whose
+        effective opacity is lower, so the fixed-alpha reverse blend over-subtracts.
+        Placed at the configured large-image position so the detector locates it.
+        """
+        img = np.full((size, size, 3), bg_value, dtype=np.float32)
+        config = get_watermark_config(size, size)
+        x, y = config.get_position(size, size)
+        alpha = self.engine.get_alpha_map(WatermarkSize.LARGE)
+        ah, aw = alpha.shape[:2]
+        a = (alpha * alpha_scale)[:, :, None]
+        roi = img[y : y + ah, x : x + aw]
+        img[y : y + ah, x : x + aw] = a * 255.0 + (1.0 - a) * roi
+        return np.clip(img, 0, 255).astype(np.uint8), (x, y, aw, ah)
+
+    def test_dark_background_does_not_leave_black_pit(self):
+        image, (x, y, w, h) = self._composite_sparkle(bg_value=60)
+        out = self.engine.remove_watermark(image)
+        footprint = out[y : y + h, x : x + w]
+        # The recovered footprint must read like the dark background, not a black hole.
+        assert footprint.min() > 25, f"black pit: min={footprint.min()}"
+        assert abs(float(footprint.mean()) - 60.0) < 25.0
+
+    def test_bright_background_keeps_reverse_alpha(self):
+        """A bright background does not over-subtract, so reverse-alpha is used."""
+        bright, pos = self._composite_sparkle(bg_value=230)
+        alpha = self.engine.get_interpolated_alpha(pos[2])
+        assert self.engine._reverse_alpha_oversubtracts(bright, alpha, (pos[0], pos[1])) is False
+        dark, dpos = self._composite_sparkle(bg_value=60)
+        dalpha = self.engine.get_interpolated_alpha(dpos[2])
+        assert self.engine._reverse_alpha_oversubtracts(dark, dalpha, (dpos[0], dpos[1])) is True
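Why the tests above separate `bg_value=60` from `bg_value=230` can be worked out in closed form. If the mark is composited at `s` times the captured alpha `a` over a flat background `bg`, the reverse-blend numerator is `bg - a * (logo * (1 - s) + s * bg)`, which turns negative once `a` exceeds `bg / (logo * (1 - s) + s * bg)`. The sketch below evaluates that threshold; it assumes a flat grey background and ignores the uint8 rounding the test applies, and `critical_alpha` is a hypothetical helper, not part of the test suite.

```python
def critical_alpha(bg: float, scale: float, logo: float = 255.0) -> float:
    """Captured alpha above which (watermarked - alpha * logo) goes negative.

    Assumes watermarked = a*scale*logo + (1 - a*scale)*bg on a flat background.
    """
    return bg / (logo * (1.0 - scale) + scale * bg)


SCALE = 0.6       # _REALISTIC_ALPHA_SCALE in the test class
MAX_ALPHA = 0.51  # approximate peak of the captured alpha map

dark_threshold = critical_alpha(60.0, SCALE)
bright_threshold = critical_alpha(230.0, SCALE)

print(f"dark background:   alpha > {dark_threshold:.3f} over-subtracts")
print(f"bright background: alpha > {bright_threshold:.3f} over-subtracts")
print("bright case reachable:", bright_threshold <= MAX_ALPHA)
```

On the dark background the threshold (roughly 0.43) sits below the map's peak of about 0.51, so the brightest part of the sparkle body over-subtracts; on the bright background the threshold is far above any alpha in the map, so no pixel can. Whether the dark case exceeds the 5% gate depends on how much of the alpha map lies above that threshold, which the test checks empirically.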