mirror of https://github.com/wiltodelta/remove-ai-watermarks.git, synced 2026-06-04 18:18:00 +02:00
fix(gemini): remove more-opaque sparkles via per-image alpha gain
The captured sparkle alpha peaks ~0.51, but some real Gemini sparkles are rendered more opaque. The fixed-alpha reverse blend then UNDER-subtracts and leaves a bright residual the detector still fires on. A visible-removal audit through the registry path on the spaces corpus showed this as a meaningful fraction of marks -- all under-removals, not a background-brightness class (failures and successes had the same input confidence and background luma; the discriminator was the removal delta itself).

remove_watermark now estimates a per-image alpha gain (_estimate_alpha_gain: effective sparkle opacity at the bright core vs the local background ring, a_eff/a_cap, clamped [1.0, 1.94]) and scales the alpha to match before the over-sub/blend branch. A 1.05 deadband keeps a sparkle that already matches the capture byte-identical to the pre-fix output, so the fix is purely additive (0 regressions on the audit set; failures dropped substantially). The over-sub guard still runs on the scaled alpha as the safety net for an over-shoot.

- _estimate_alpha_gain + _ALPHA_GAIN_MAX/_DEADBAND/_CORE_FRAC in gemini_engine.
- TestUnderSubtractionGain asserts on footprint pixels, NOT the detector (its NCC is degenerate on a flat synthetic bg; the real corpus removal drops the detector ~0.80 -> ~0.27).
- scripts/visible_removal_audit.py: the detect -> remove -> re-detect audit tool that found and validated this (operates on gitignored data/spaces only).
- CLAUDE.md + README: document the under-subtraction gain.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
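The arithmetic behind the under-subtraction can be sketched on a single pixel. This is a minimal illustration, not the engine's code: it assumes a pure-white sparkle (logo value 255), a flat background of 80, and a sparkle rendered 1.3x more opaque than the captured peak of 0.51. The helper names (`composite`, `reverse_alpha`, `estimate_gain`) are hypothetical; only the constants 0.51, 1.94, 1.05 and the 0.99 alpha cap come from the commit.

```python
# Single-pixel sketch of fixed-alpha vs gain-scaled reverse-alpha blending.
# Assumptions (not from the engine): white logo, flat background, scalar math.

A_CAP = 0.51      # peak of the captured alpha map
GAIN_MAX = 1.94   # upper clamp; 0.51 * 1.94 is roughly 0.99
DEADBAND = 1.05   # gains at or below this are ignored
LOGO = 255.0      # assumed sparkle colour (pure white)


def composite(bg: float, alpha: float) -> float:
    """Forward blend: what the watermarked pixel looks like."""
    return alpha * LOGO + (1.0 - alpha) * bg


def reverse_alpha(wm: float, alpha: float) -> float:
    """Reverse blend: original = (wm - alpha * logo) / (1 - alpha)."""
    return (wm - alpha * LOGO) / (1.0 - alpha)


def estimate_gain(core_obs: float, bg: float, a_cap: float = A_CAP) -> float:
    """Effective opacity at the core vs the background, as a ratio to a_cap."""
    a_eff = min(max((core_obs - bg) / (LOGO - bg), 0.0), 0.99)
    return min(max(a_eff / a_cap, 1.0), GAIN_MAX)


bg = 80.0
observed = composite(bg, A_CAP * 1.3)   # sparkle more opaque than the capture

fixed = reverse_alpha(observed, A_CAP)  # under-subtracts: stays well above bg

gain = estimate_gain(observed, bg)
alpha = min(A_CAP * gain, 0.99) if gain > DEADBAND else A_CAP
scaled = reverse_alpha(observed, alpha) # recovers the background

print(f"fixed-alpha result: {fixed:.1f}, gain: {gain:.2f}, scaled result: {scaled:.1f}")
```

With the fixed alpha the recovered pixel stays far brighter than the background, which is the residual the detector keeps firing on; with the estimated gain applied, the same pixel returns to the background value.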
@@ -17,7 +17,7 @@ If this tool saves you time, consider [sponsoring its development](https://githu
## Features
- **Visible watermark removal** — a registry of known marks in their usual places: the Gemini / Nano Banana sparkle, the Doubao "豆包AI生成" text strip, and the Jimeng "★ 即梦AI" wordmark. Each is removed by **reverse-alpha blending** against a captured alpha map (`original = (wm − α·logo)/(1−α)`), recovering the true pixels rather than inpainting a guess. The Gemini sparkle recovers cleanly on its own on bright backgrounds (on a dark background, where the fixed alpha would over-subtract and leave a dark spot, it automatically inpaints the small sparkle footprint instead); the Doubao and Jimeng text marks re-rasterize slightly per image, so a thin residual inpaint over the glyph footprint clears the leftover edges (the alpha maps are reproducibly rebuilt from controlled captures by `scripts/visible_alpha_solve.py`). Fast, offline, no GPU. `visible --mark auto` finds and removes the strongest detected mark. (For arbitrary logos/objects, see `erase`.)
- **Visible watermark removal** — a registry of known marks in their usual places: the Gemini / Nano Banana sparkle, the Doubao "豆包AI生成" text strip, and the Jimeng "★ 即梦AI" wordmark. Each is removed by **reverse-alpha blending** against a captured alpha map (`original = (wm − α·logo)/(1−α)`), recovering the true pixels rather than inpainting a guess. The Gemini sparkle recovers cleanly on its own on bright backgrounds; it adapts the alpha to each image's sparkle opacity, so a more-opaque-than-captured sparkle is still fully removed (and on a dark background, where the fixed alpha would over-subtract and leave a dark spot, it automatically inpaints the small sparkle footprint instead); the Doubao and Jimeng text marks re-rasterize slightly per image, so a thin residual inpaint over the glyph footprint clears the leftover edges (the alpha maps are reproducibly rebuilt from controlled captures by `scripts/visible_alpha_solve.py`). Fast, offline, no GPU. `visible --mark auto` finds and removes the strongest detected mark. (For arbitrary logos/objects, see `erase`.)
- **Universal region eraser (`erase`)** — remove any logo / watermark / object inside boxes you specify, regardless of position or colour. Default cv2 inpainting (CPU, instant); optional big-LaMa via onnxruntime (`lama` extra) for higher quality
- **Invisible watermark removal** — SynthID, StableSignature, TreeRing via diffusion-based regeneration (needs a local GPU, or run it with no setup on [raiw.cc](https://raiw.cc))
- **AI metadata stripping** — EXIF, PNG text chunks, C2PA provenance manifests (PNG / JPEG / AVIF / HEIF / JPEG-XL, **MP4 / MOV / M4V / M4A** at the container level, and **WebM / MP3 / WAV / FLAC / OGG** losslessly via ffmpeg), XMP DigitalSourceType
@@ -0,0 +1,138 @@
"""Audit visible-watermark removal over a local image corpus.

For every image the registry detects a known visible mark in, run that mark's
removal and re-detect on the output, recording before/after confidence and
whether the detector still fires. Also bucket the detected-positive originals
into per-mark dataset dirs so the visible-mark corpora are reproducible.

Detector-clean after removal is necessary but, for the Doubao/Jimeng text marks,
NOT sufficient (their NCC detector is fooled by a thin residual outline -- see
CLAUDE.md). Treat a detector-clean Doubao/Jimeng as "detector passes"; visual
residual is a separate check.

Operates on gitignored data only (data/spaces/...); writes nothing tracked.

    uv run python scripts/visible_removal_audit.py \
        --corpus data/spaces/originals --out data/spaces/_visible_audit.csv \
        --dataset-root data/spaces/_visible_datasets
"""

from __future__ import annotations

import csv
import logging
import shutil
from pathlib import Path

import click

from remove_ai_watermarks import image_io
from remove_ai_watermarks.watermark_registry import detect_marks, get_mark

log = logging.getLogger(__name__)

_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".avif", ".heic"}


def _rel(p: Path, corpus: Path) -> str:
    try:
        return str(p.relative_to(corpus))
    except ValueError:
        return p.name


@click.command()
@click.option(
    "--corpus", type=click.Path(exists=True, file_okay=False, path_type=Path), default=Path("data/spaces/originals")
)
@click.option("--out", type=click.Path(path_type=Path), default=Path("data/spaces/_visible_audit.csv"))
@click.option("--dataset-root", type=click.Path(path_type=Path), default=Path("data/spaces/_visible_datasets"))
@click.option(
    "--paths-file",
    type=click.Path(exists=True, path_type=Path),
    default=None,
    help="Audit only these paths (one per line), skipping the full rglob.",
)
@click.option("--limit", type=int, default=0, help="Scan at most N files (0 = all).")
def main(corpus: Path, out: Path, dataset_root: Path, paths_file: Path | None, limit: int) -> None:
    logging.basicConfig(level=logging.WARNING, format="%(message)s")
    if paths_file is not None:
        files = [Path(s) for line in paths_file.read_text().splitlines() if (s := line.strip()) and Path(s).is_file()]
    else:
        files = sorted(p for p in corpus.rglob("*") if p.is_file() and p.suffix.lower() in _EXTS)
    if limit:
        files = files[:limit]
    click.echo(f"Scanning {len(files)} files under {corpus} ...")

    rows: list[dict[str, str]] = []
    n_detected = 0
    n_clean_after = 0
    fails: list[tuple[str, str, float]] = []

    with click.progressbar(files, label="audit") as bar:
        for p in bar:
            img = image_io.imread(p)
            if img is None:
                continue
            for det in detect_marks(img, include_explicit=False):
                if not det.detected:
                    continue
                n_detected += 1
                mark = get_mark(det.key)
                # Bucket the positive original into the per-mark dataset.
                ddir = dataset_root / det.key
                ddir.mkdir(parents=True, exist_ok=True)
                if not (ddir / p.name).exists():
                    shutil.copy2(p, ddir / p.name)
                # Remove, then re-detect with the SAME mark's detector.
                try:
                    cleaned, _ = mark.remove(img)
                    after = mark.detect(cleaned)
                except Exception as exc:
                    log.warning("remove failed on %s (%s): %s", p.name, det.key, exc)
                    rows.append(
                        {
                            "path": _rel(p, corpus),
                            "mark": det.key,
                            "conf_before": f"{det.confidence:.3f}",
                            "conf_after": "",
                            "removed": "error",
                        }
                    )
                    continue
                removed = not after.detected
                n_clean_after += int(removed)
                if not removed:
                    fails.append((_rel(p, corpus), det.key, after.confidence))
                rows.append(
                    {
                        "path": _rel(p, corpus),
                        "mark": det.key,
                        "conf_before": f"{det.confidence:.3f}",
                        "conf_after": f"{after.confidence:.3f}",
                        "removed": str(removed),
                    }
                )

    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", newline="") as f:
        w = csv.DictWriter(f, fieldnames=["path", "mark", "conf_before", "conf_after", "removed"])
        w.writeheader()
        w.writerows(rows)

    by_mark: dict[str, list[bool]] = {}
    for r in rows:
        if r["removed"] in ("True", "False"):
            by_mark.setdefault(r["mark"], []).append(r["removed"] == "True")
    click.echo(f"\nDetected positives: {n_detected}; detector-clean after removal: {n_clean_after}")
    for k, v in sorted(by_mark.items()):
        click.echo(f" {k:8} removed {sum(v)}/{len(v)} ({100 * sum(v) // max(1, len(v))}%)")
    if fails:
        click.echo(f"\nDetector still fires after removal ({len(fails)}):")
        for path, key, conf in fails[:30]:
            click.echo(f" {key:8} {conf:.3f} {path}")
    click.echo(f"\nReport: {out} | Datasets: {dataset_root}/<mark>/")


if __name__ == "__main__":
    main()
@@ -140,6 +140,23 @@ class GeminiEngine:
    # gate separates them with a wide margin.
    _OVERSUB_FOOTPRINT_FRAC = 0.05

    # Per-image alpha gain (under-subtraction fix). The captured alpha peaks ~0.51
    # (a ~51%-opaque sparkle). Some real Gemini sparkles are rendered MORE opaque,
    # so the fixed alpha under-subtracts and reverse-alpha leaves a bright residual
    # the detector still fires on (~11% of marks on the spaces corpus). Estimate
    # this image's effective sparkle opacity from the bright core vs the local
    # background and scale the alpha to match, capped so alpha stays < 0.99. The
    # gain is clamped to >= 1.0 so it only ever STRENGTHENS removal: ~1.0 when the
    # sparkle matches the capture (working cases unchanged), >1 when more opaque.
    # On the spaces corpus the gain cleanly separates -- under-removed marks ~1.47,
    # cleanly-removed ~1.00. 1.94 is the cap that reaches alpha 0.99 from 0.51.
    _ALPHA_GAIN_MAX = 1.94
    _ALPHA_GAIN_CORE_FRAC = 0.8  # body pixels at >= this * peak alpha define the core
    # Deadband: apply the gain only above this, so a sparkle that already matches the
    # capture (estimated gain ~1.0-1.04 from background noise) stays byte-identical to
    # the pre-fix output. Under-removed marks estimate >= 1.26, well clear of the band.
    _ALPHA_GAIN_DEADBAND = 1.05

    # Corner promotion (issue #36): the size weight that suppresses tiny-patch
    # false positives also buries a small, near-perfect sparkle when a larger,
    # mediocre match sits elsewhere (e.g. a bright collar in a portrait). A small
@@ -446,6 +463,12 @@ class GeminiEngine:
        pos = (detection.region[0], detection.region[1])
        alpha_map = self.get_interpolated_alpha(detection.region[2])
        # Match the captured alpha to this image's sparkle opacity (under-subtraction
        # fix): a more-opaque-than-captured sparkle would otherwise leave a bright
        # residual. gain == 1.0 leaves the working cases byte-identical.
        gain = self._estimate_alpha_gain(result, alpha_map, pos)
        if gain > self._ALPHA_GAIN_DEADBAND:
            alpha_map = np.clip(alpha_map * gain, 0.0, 0.99)
        logger.debug(
            "Removing watermark at (%d, %d) size %dx%d [conf=%.3f]",
            pos[0],
@@ -525,6 +548,49 @@ class GeminiEngine:
        alpha_roi = alpha_map[ay1 : ay1 + (y2 - y1), ax1 : ax1 + (x2 - x1)]
        return alpha_roi, (y1, y2, x1, x2)

    def _estimate_alpha_gain(
        self,
        image: NDArray[Any],
        alpha_map: NDArray[Any],
        position: tuple[int, int],
    ) -> float:
        """Scale factor matching the captured alpha to this image's sparkle opacity.

        The captured alpha (peak ~0.51) under-represents sparkles rendered more
        opaque; reverse-alpha then leaves a bright residual. Estimate the effective
        opacity at the sparkle core (observed brightness vs the local background
        ring) and return ``a_eff / a_capture``, clamped to ``[1.0, _ALPHA_GAIN_MAX]``
        so it only ever STRENGTHENS removal (1.0 = no change on a matching sparkle).
        Returns 1.0 when the background cannot be estimated reliably.
        """
        placed = self._footprint_indices(alpha_map, position, image.shape)
        if placed is None:
            return 1.0
        alpha_roi, (y1, y2, x1, x2) = placed
        a_cap = float(alpha_roi.max())
        if a_cap < 0.2:
            return 1.0
        gray = image.astype(np.float32).mean(axis=2)
        core = alpha_roi >= a_cap * self._ALPHA_GAIN_CORE_FRAC
        if not bool(core.any()):
            return 1.0
        core_obs = float(np.percentile(gray[y1:y2, x1:x2][core], 75))
        # Local background = a ring just outside the footprint box.
        ih, iw = image.shape[:2]
        pad = int((x2 - x1) * 0.7)
        ry1, ry2 = max(0, y1 - pad), min(ih, y2 + pad)
        rx1, rx2 = max(0, x1 - pad), min(iw, x2 + pad)
        ring = gray[ry1:ry2, rx1:rx2]
        ring_mask = np.ones(ring.shape, dtype=bool)
        ring_mask[y1 - ry1 : y2 - ry1, x1 - rx1 : x2 - rx1] = False
        if int(ring_mask.sum()) < 10:
            return 1.0
        bg = float(np.median(ring[ring_mask]))
        if 255.0 - bg < 5.0:
            return 1.0
        a_eff = float(np.clip((core_obs - bg) / (255.0 - bg), 0.0, 0.99))
        return float(np.clip(a_eff / a_cap, 1.0, self._ALPHA_GAIN_MAX))

    def _reverse_alpha_oversubtracts(
        self,
        image: NDArray[Any],
@@ -286,6 +286,66 @@ class TestOverSubtractionGuard:
        assert self.engine._reverse_alpha_oversubtracts(dark, dalpha, (dpos[0], dpos[1])) is True


class TestUnderSubtractionGain:
    """Under-subtraction fix: a sparkle MORE opaque than the captured alpha must not
    survive removal. The captured alpha (~0.51) under-represents such marks, so the
    fixed-alpha reverse blend leaves a bright residual; the per-image gain scales the
    alpha up to match this image's opacity. Mirror of TestOverSubtractionGuard.
    """

    @pytest.fixture(autouse=True)
    def _setup_engine(self):
        self.engine = GeminiEngine()

    def _composite_sparkle(self, bg_value: int, alpha_scale: float, size: int = 1400):
        """Flat ``bg_value`` image with the sparkle composited at ``alpha_scale`` opacity.

        ``alpha_scale`` > 1 makes the mark MORE opaque than the engine's captured alpha,
        reproducing the under-subtraction case (real under-removed marks estimate ~1.47).
        """
        img = np.full((size, size, 3), bg_value, dtype=np.float32)
        config = get_watermark_config(size, size)
        x, y = config.get_position(size, size)
        alpha = self.engine.get_alpha_map(WatermarkSize.LARGE)
        ah, aw = alpha.shape[:2]
        a = np.clip(alpha * alpha_scale, 0.0, 1.0)[:, :, None]
        roi = img[y : y + ah, x : x + aw]
        img[y : y + ah, x : x + aw] = a * 255.0 + (1.0 - a) * roi
        return np.clip(img, 0, 255).astype(np.uint8), (x, y, aw, ah)

    def test_more_opaque_sparkle_estimates_gain_above_deadband(self):
        image, pos = self._composite_sparkle(bg_value=80, alpha_scale=1.3)
        alpha = self.engine.get_interpolated_alpha(pos[2])
        gain = self.engine._estimate_alpha_gain(image, alpha, (pos[0], pos[1]))
        assert gain > self.engine._ALPHA_GAIN_DEADBAND, f"gain {gain} did not exceed deadband"

    def test_matching_sparkle_estimates_unit_gain(self):
        """A sparkle that matches the captured opacity gets ~1.0 (no scaling)."""
        image, pos = self._composite_sparkle(bg_value=80, alpha_scale=1.0)
        alpha = self.engine.get_interpolated_alpha(pos[2])
        gain = self.engine._estimate_alpha_gain(image, alpha, (pos[0], pos[1]))
        assert gain <= self.engine._ALPHA_GAIN_DEADBAND, f"matching sparkle scaled by {gain}"

    def test_more_opaque_sparkle_is_removed(self):
        """The gain-scaled removal clears a more-opaque sparkle without a black pit.

        Asserted on the footprint PIXELS, not the detector: the detector's NCC is
        degenerate on a perfectly flat synthetic background (zero-variance regions
        spuriously match), so a re-detect conf is meaningless here -- on real textured
        images the same removal drops the detector from ~0.80 to ~0.27 (spaces corpus).
        """
        image, (x, y, w, h) = self._composite_sparkle(bg_value=80, alpha_scale=1.3)
        assert self.engine.detect_watermark(image).detected
        before_max = int(image[y : y + h, x : x + w].max())  # bright sparkle present
        assert before_max > 150
        out = self.engine.remove_watermark(image)
        footprint = out[y : y + h, x : x + w]
        # Sparkle gone: no bright residual, no black pit, footprint reads like the bg.
        assert int(footprint.max()) < 80 + 30, f"bright residual: max={footprint.max()}"
        assert int(footprint.min()) > 25, f"black pit: min={footprint.min()}"
        assert abs(float(footprint.mean()) - 80.0) < 20.0


class TestCornerPromotion:
    """Issue #36: a small sparkle in the corner must not be lost to a larger decoy.