From 89f427852f7ecaa92b8812770277c291b4c0e20b Mon Sep 17 00:00:00 2001
From: Victor Kuznetsov
Date: Sat, 30 May 2026 12:27:37 -0700
Subject: [PATCH] Fix #30 white box: stop zeroing alpha in the watermark region on save

On RGBA inputs the CLI forced the watermark bbox alpha to 0 on save, so the removed-sparkle area became a transparent hole that renders as a solid white box on any non-transparent viewer. The Gemini app exports opaque RGBA, so every user hit it. Reverse-alpha already recovers the real pixels there (and `erase` inpaints them), so there is no artifact to hide -- the hole was the bug, introduced as an over-correction in d091b9f.

`_write_bgr_with_alpha` now rejoins the input alpha plane unchanged (drops the `clear_region`/`pad` params); the `visible` / `erase` / `all` / `batch` call sites drop the cleared-region argument and the orphaned region bookkeeping. The registry `remove()` still returns the mark bbox (used for inpaint_residual positioning); the CLI just no longer clears alpha with it.

Inverts the test that locked in the old behavior into a #30 regression guard (watermark-region alpha stays opaque, no pixel forced transparent). Verified end-to-end on a real Gemini RGBA export: sparkle gone, zero transparent pixels, clean over a white background.

Co-Authored-By: Claude Opus 4.8
---
 CLAUDE.md                                     |  2 +-
 src/remove_ai_watermarks/cli.py               | 49 +++++++------------
 .../watermark_registry.py                     |  6 ++-
 tests/test_cli.py                             | 16 +++---
 4 files changed, 32 insertions(+), 41 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index f85ebd8..7a912d6 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -34,7 +34,7 @@ You are a **principal Python engineer** maintaining a CLI tool and library for r
 - `noai/constants.py` — PNG_SIGNATURE, C2PA_CHUNK_TYPE, C2PA_SIGNATURES, C2PA_ISSUERS, `SYNTHID_C2PA_ISSUERS` (issuers that pair SynthID with C2PA: Google, OpenAI), and `C2PA_SOFT_BINDINGS` (soft-binding `alg` prefix → forensic-watermark vendor: Adobe TrustMark, Digimarc, Imatag, Steg.AI, Microsoft, ...). Add a new issuer/binding here, not inline.
 - `metadata.py` — `scan_head(path, size=1MB)` is the shared input for every C2PA/AIGC/IPTC byte scan: first `size` bytes plus the payloads of any provenance metadata found beyond that window — for ISOBMFF, the late provenance boxes from `isobmff.scan_c2pa_region` (catches a manifest after a large `mdat`); for **PNG**, the late `tEXt`/`iTXt`/`zTXt`/`eXIf`/`iCCP` chunks from `_png_late_metadata` (catches an XMP/EXIF packet appended after a large `IDAT`, e.g. a TC260 AIGC label at ~2.7 MB). Behavior-neutral (`f.read(size)`) for non-ISOBMFF inputs and for any file that fits within `size`. Use it instead of `open().read(1MB)` for any new marker scan. `synthid_source(path)` returns the vendor name(s) if the C2PA manifest implies a SynthID pixel watermark, else None. Format-agnostic: PNG via the caBX parser, JPEG/WebP/AVIF/HEIF/JXL via a binary scan (C2PA marker + SynthID issuer + AI-source marker). `get_ai_metadata` surfaces the verdict, and `metadata --check` prints it as a callout. Both `get_ai_metadata` and `has_ai_metadata` guard the PIL open with `except Exception` (HEIC/unknown formats raise non-OSError) and fall through to the binary scan.
`xai_signature(path)` detects xAI/Grok's EXIF-only scheme (`ImageDescription` = `Signature: ` + UUID `Artist`); it feeds `has_ai_metadata`, `get_ai_metadata` (key `xai_signature`), and `identify`. `iptc_ai_system(path)` detects the IPTC Photo Metadata 2025.1 AI-disclosure XMP properties (`IPTC_AI_FIELD_MARKERS` = `AISystemUsed`/`AISystemVersionUsed`/`AIPromptInformation`/`AIPromptWriterName`) and returns the `AISystemUsed` generator name (or `"fields present"`). `remove_ai_metadata` routes **ISOBMFF video** (`.mp4`/`.mov`/`.m4v`) through the same `isobmff.strip_c2pa_boxes` as AVIF/HEIF (MP4 is ISOBMFF), and `_scrub_ai_exif` removes the xAI signature + AI-generator EXIF tags on JPEG output. - `identify.py` — `identify(path)` aggregates every locally-readable signal (C2PA issuer→platform, C2PA soft-binding forensic-watermark vendor, IPTC "Made with AI" + IPTC 2025.1 `AISystemUsed`, embedded SD/ComfyUI params, SynthID proxy, xAI/Grok EXIF signature via `metadata.xai_signature`, the China TC260 AIGC label via `metadata.aigc_label`, the HuggingFace `hf-job-id` job marker via `metadata.huggingface_job`, the Samsung Galaxy AI editing marker via `metadata.samsung_genai`, visible Gemini sparkle, open invisible watermark, Adobe TrustMark via `trustmark_detector`) into one `ProvenanceReport`. `is_ai_generated` is True or None (never asserted False — stripped metadata is not proof of clean origin). The `hf_job`, visible-sparkle, and Samsung `samsung_genai` signals are **medium** confidence: each lifts an otherwise-Unknown verdict to a tentative AI (`hf_only` / `visible_only` / `samsung_only`, parallel branches) but is excluded from the high-confidence `ai_from_metadata` set, so none overrides a hard metadata signal. Visible-sparkle is promoted only at confidence ≥ `_SPARKLE_THRESHOLD` (0.5; corpus-tuned to separate Gemini sparkles ≥0.56 from non-sparkle ≤0.49). The cv2 dependency lives in `gemini_engine.detect_sparkle_confidence`, not here. 
**C2PA platform attribution is device-token-first, issuer-scan fallback** (`_device_platform` scans manifest bytes for `_DEVICE_C2PA_PLATFORM` tokens, then `_attribute_platform`/`_ISSUER_PLATFORM`). **Why, verified on real signed files 2026-05-26:** the old issuer-only byte-scan matched ANY issuer substring anywhere, so multi-entity manifests mis-attributed -- Leica→"Truepic" (a signing authority in the trust chain), Nikon→"Adobe Firefly" (XMP-toolkit "Adobe" + the sample's "Adobe_MAX" name), Pixel→"Google (Gemini)" ("Google LLC" cert org), Truepic→"Google". A distinctive device token wins instead. **Token distinctiveness is load-bearing:** bare `b"Truepic"` mis-fires (it appears in unrelated trust chains -- it mis-attributed the OpenAI `chatgpt-1.png` fixture), so the token is the specific `b"Truepic_Lens"` from the Lens SDK claim generator; likewise `b"Pixel Camera"` (cert CN) not bare `b"Pixel"`. `_DEVICE_C2PA_PLATFORM` lists ONLY tokens **verified against a real C2PA file**: Leica (`lc_c2pa`/`Leica Camera`), Nikon (`NIKON`), Pixel (`Pixel Camera` -- from a real Pixel 10 Pro file attached to c2pa-rs issue #1609/#1554), Sony (`sony.sig`/`sony.cert` -- Sony's own C2PA assertion namespace, verified on a real Sony PXW-Z300 file; NOT bare "Sony" which is a common EXIF Make), Truepic (`Truepic_Lens`). Canon/Bria have **no public direct-download C2PA sample** (checked exhaustively: GitHub issue/PR attachments, contentcredentials gallery, HF datasets -- all upload-to-verify or token-gated; Canon's only public file was a self-signed hobbyist CR3, not factory), so they stay unmapped until a real file is captured (same fixture discipline as Grok/Doubao). The Sony sample is video (MP4) -- our ISOBMFF C2PA path detects it; Sony Alpha stills likely share the `sony.*` namespace but are not separately verified. 
**Samsung Galaxy + ASUS Gallery live in a separate `_SIGNER_C2PA_PLATFORM` (scanned after `_device_platform`, before the issuer fallback), NOT in `_DEVICE_C2PA_PLATFORM`** — verified on real signed files 2026-05-29. Reason: a Galaxy phone stamps BOTH its device cert AND a `trainedAlgorithmicMedia`/genAIType AI marker on a Generative-Edit image, so treating it as a "genuine camera capture" would false-fire integrity-clash rule 2 on every Galaxy AI edit. The signer tokens (`b"Samsung Galaxy"` cert org — distinct from the EXIF `SM-xxxx` model string on ordinary Samsung photos; `b"com.asus.gallery"` claim generator) only resolve the platform label; the AI verdict still comes from the source-type / genAIType. ASUS Gallery is a C2PA-signed edit with no AI marker, so it attributes the platform without asserting `is_ai`. **Samsung's `genAIType` (in the proprietary `PhotoEditor_Re_Edit_Data` JSON) is an undocumented Galaxy-AI editing marker** (`metadata.samsung_genai`, gated on the `PhotoEditor_Re_Edit_Data` container; non-zero value = AI tool used, values {1,5} observed): medium-confidence because the field has no public spec (verified 2026-05-29: absent from C2PA spec + Samsung docs), but it co-occurred with `trainedAlgorithmicMedia` in 3/3 verified files that record a source-type and was the SOLE AI marker on a Galaxy S24 file that omits the source type. Camera C2PA marks capture authenticity, not AI (Pixel carries `computationalCapture`, not `trainedAlgorithmicMedia`), so these never set `is_ai` -- that stays driven by digital-source-type. `c2pa.cbor_text_after` (now public) is best-effort for the `generator` detail string only and can be None when the manifest keys it `claim_generator_info` (Pixel). 
**Issuer→generator mapping is `is_ai`-gated** (`_attribute_platform(issuers, is_ai=c2pa_is_ai)`): a specific AI-generator platform is named only when the digital-source-type is `trainedAlgorithmicMedia`; on a non-AI source an issuer substring is treated as incidental (an "Adobe XMP" toolkit string in an *unmapped* Canon/Sony capture would otherwise mislabel it "Adobe Firefly"), so it degrades to the neutral "C2PA signer: X" label. Real Firefly/OpenAI/Google output carries the AI source-type, so it is unaffected (verified: chatgpt-1.png→OpenAI, firefly-1.png→Adobe Firefly still attribute). `_attribute_platform` defaults `is_ai=True` so the mapping stays unit-testable in isolation. Add capture-camera tokens to `_DEVICE_C2PA_PLATFORM`, editing-app/AI-device signer tokens to `_SIGNER_C2PA_PLATFORM`, generator/issuer platforms to `_ISSUER_PLATFORM`, not inline. For non-PNG containers (JPEG/WebP/AVIF/HEIF/JXL) the caBX parser returns nothing, so issuer (`_issuers_in`) and generator (`_ai_tools_in`, reusing `C2PA_AI_TOOLS`) are recovered by binary-scanning the first MB. EXIF `Software` / `Make` / `Artist` / `ImageDescription` and XMP `CreatorTool` generator tags are read by `metadata.exif_generator` (PIL+piexif for any format PIL opens incl. AVIF, plus a container-agnostic XMP raw-byte scan that also covers HEIF/JXL), matched against `AI_GENERATOR_TOKENS` so ordinary editors (plain "Adobe Photoshop") and real-camera `Make` ("Apple"/"Canon") are not flagged. **Ideogram tags its output with EXIF `Make="Ideogram AI"`** (verified on a real download 2026-05-24) — that's why `Make` is read. **Integrity-clash detection** (`_integrity_clashes`, surfaced as `ProvenanceReport.integrity_clashes`, printed in red by `identify` and serialized to `--json`): contradictions between independent generator stamps are a laundering/spoofing tell. Two rules: (1) two or more distinct AI-origin vendors named by independent signals (e.g. 
C2PA OpenAI + EXIF `Make="Ideogram AI"`), and (2) a camera-capture C2PA device (`_DEVICE_C2PA_PLATFORM`) coexisting with any AI-generation marker. Vendor normalization is `_vendor_of` over `_AI_VENDOR_TOKENS` (so a C2PA "Google (Gemini)" issuer and a SynthID-Google proxy agree, while different vendors clash). **High-precision by design:** only hard generator stamps feed it (C2PA-issuer when source is AI, SynthID, EXIF/XMP generator, IPTC `AISystemUsed`, xAI, AIGC); the fuzzy visible sparkle and the open invisible watermark are **excluded** (the latter can be a by-product of our own SDXL removal pass). The c2pa vendor is classified from the issuer attribution / generator, NOT the resolved `platform` (a camera label like "Google Pixel" would mis-normalize to "Google"). All real single-origin fixtures (chatgpt/firefly/doubao/grok/mj) verified to produce **zero** clashes (false-positive guard in `test_identify.py::TestRealSamplesHaveNoClash`).
-- `watermark_registry.py` — **single catalog of known visible watermarks**, the unified "find known marks in their usual places, recognize, remove" entry. **Reverse-alpha only by policy**: a mark is listed only once a real alpha map has been captured for it, and removal inverts that map (`original = (wm - a*logo)/(1-a)`, exact recovery) — no inpaint/heuristic removal here (arbitrary-region inpainting lives in `region_eraser`/`erase`). Each `KnownMark` ties a key to {usual `location`, `in_auto` flag, `recovery` (="reverse-alpha"), a `detect` adapter → uniform `MarkDetection`, a `remove` adapter}. Entries today: `gemini` (bottom-right sparkle) and `doubao` (bottom-right "豆包AI生成"). `detect_marks` scans all; `best_auto_mark` picks the highest-confidence detection. **Cross-engine confidences aren't directly comparable**, so the gemini adapter applies the corpus-validated 0.5 sparkle threshold (`_GEMINI_AUTO_MIN_CONF`) for its `detected` flag — otherwise the gemini engine's loose internal threshold weakly fires (~0.36) on the Doubao text and hijacks `auto`. `cli.cmd_visible` is registry-driven: `--mark auto` → `best_auto_mark`, `--mark ` → that mark; `--mark` choices come from `mark_keys()`. `_doubao_remove` applies reverse-alpha only when the mark is detected AND `reverse_alpha_available` (resolution in the alpha band); outside that, removal is **skipped** (not inpainted). Add a new visible mark = one `KnownMark` entry + its engine (with a captured alpha map); do not re-add per-mark `if` branches in the CLI.
+- `watermark_registry.py` — **single catalog of known visible watermarks**, the unified "find known marks in their usual places, recognize, remove" entry. **Reverse-alpha only by policy**: a mark is listed only once a real alpha map has been captured for it, and removal inverts that map (`original = (wm - a*logo)/(1-a)`, exact recovery) — no inpaint/heuristic removal here (arbitrary-region inpainting lives in `region_eraser`/`erase`). Each `KnownMark` ties a key to {usual `location`, `in_auto` flag, `recovery` (="reverse-alpha"), a `detect` adapter → uniform `MarkDetection`, a `remove` adapter}. Entries today: `gemini` (bottom-right sparkle) and `doubao` (bottom-right "豆包AI生成"). `detect_marks` scans all; `best_auto_mark` picks the highest-confidence detection. **Cross-engine confidences aren't directly comparable**, so the gemini adapter applies the corpus-validated 0.5 sparkle threshold (`_GEMINI_AUTO_MIN_CONF`) for its `detected` flag — otherwise the gemini engine's loose internal threshold weakly fires (~0.36) on the Doubao text and hijacks `auto`. `cli.cmd_visible` is registry-driven: `--mark auto` → `best_auto_mark`, `--mark ` → that mark; `--mark` choices come from `mark_keys()`. `_doubao_remove` applies reverse-alpha only when the mark is detected AND `reverse_alpha_available` (resolution in the alpha band); outside that, removal is **skipped** (not inpainted). Add a new visible mark = one `KnownMark` entry + its engine (with a captured alpha map); do not re-add per-mark `if` branches in the CLI. **Alpha-on-save policy (issue #30):** `cli._write_bgr_with_alpha` rejoins the input's alpha plane **unchanged** — it must NOT zero alpha in the watermark bbox. Reverse-alpha (and `erase` inpaint) recover real pixels there, so zeroing alpha punched a transparent hole that renders as a solid **white box** on any non-transparent viewer (Gemini app exports are opaque RGBA, so every user hit it; regression-guarded by `test_visible_keeps_alpha_opaque_in_watermark_region`). The registry `remove()` still returns its region (used for `inpaint_residual` positioning), but the CLI no longer uses it to clear alpha.
 - `gemini_engine.py` — visible Gemini-sparkle remover/detector (cv2/numpy, no GPU). `detect_sparkle_confidence(path)` is the file-level entry point used by `identify.py`.
 - `doubao_engine.py` — visible Doubao "豆包AI生成" remover/detector (cv2/numpy, no GPU), **reverse-alpha only**. `DoubaoEngine.locate` anchors a bottom-right box by **geometry** (mark scales with image WIDTH), `extract_mask` pulls the light low-saturation glyphs (the detection candidate). `detect` is **reverse-alpha-consistent**: it matches the bundled alpha glyph silhouette (`assets/doubao_alpha.png`, the exact shape we invert) against the candidate via zero-mean normalized correlation (`_template_match_score`, cv2 `TM_CCOEFF_NORMED`), gated at `DETECT_NCC_THRESHOLD` 0.4 over a small `DETECT_MIN_COVERAGE` floor. Keying on glyph SHAPE (not coverage/structure heuristics) fixed #23: corpus FP fell to 7/1243 (0.6%); old coverage-only fired on ~28%.
**Removal is exact reverse-alpha** (`remove_watermark_reverse_alpha`): `original = (wm - a*logo)/(1-a)` from the bundled alpha map + `_ALPHA_LOGO_BGR` (near-white ~253) + `_ALPHA_*_FRAC` geometry. The alpha map + logo were **solved from real black+gray Doubao captures** (`data/doubao_capture/captures/`, gitignored): on black `captured = a*logo`, the black/gray pair solves `a` per-pixel without assuming the logo colour (white capture cross-validates: mark → flat fill). The single captured alpha map (at width 2048) **generalizes to any resolution**: at (near) the captured width (`_ALPHA_NATIVE_BAND` of `_ALPHA_NATIVE_WIDTH`) `_fixed_alpha_map` places it by exact width-relative geometry (pixel-exact recovery, ~0.9 mean error — the whole point of reverse-alpha); off that width it **tries BOTH placements -- fixed geometry AND `_aligned_alpha_map`'s `TM_CCOEFF_NORMED` scale+position search (`_ALPHA_ALIGN_SEARCH`) -- and keeps whichever leaves the least residual mark** (re-`detect` confidence on the bare reverse-alpha). On a faint/busy-background mark the NCC peak wanders a few px and geometry wins; on a clear mark alignment wins -- no magic threshold, it just picks the better removal. Verified **56/56 real detected-Doubao removed clean across all corpus resolutions** (2048 fixed 27/27, 1773 22/22, plus 1185/1187/1535/1672); a single fixed-vs-aligned choice left 2/56 busy-background residuals, try-both fixed them. `reverse_alpha_available` is just "asset present"; the registry still gates removal on `detect` so a clean corner is never touched. **Residual inpaint is off-native-only:** at the captured width the fixed-geometry recovery is exact, so it is returned untouched -- inpainting over exactly-recovered interior pixels only swaps them for a cv2 hallucination (measured worse, native textured-bg error vs true bg **1.6 reverse-alpha-only vs 2.6 with the old always-on full-footprint inpaint**; regression-guarded by `test_native_returns_exact_reverse_alpha_no_inpaint`). 
Off-native the NCC alignment is only sub-pixel-approximate, so the interior is no longer exact and a residual inpaint over the glyph footprint cleans the seam (costs nothing there and reliably clears the mark). The shipped third-party `_refs/zhengsuanfa_doubao_alpha_120x20.png` is NOT a usable alpha (≈0.85 everywhere → blacks out on inversion; wrong resolution/version), verified 2026-05-29. There is no inpaint-based removal here (removed 2026-05-29; arbitrary-region inpainting is `region_eraser`/`erase`).
 - `region_eraser.py` — universal region eraser (`erase` CLI). `erase(image, boxes=|mask=, backend=)`: `boxes_to_mask` → `cv2.inpaint` (`cv2` backend, default, no deps) or big-LaMa via onnxruntime (`lama` backend, extra `lama`, `Carve/LaMa-ONNX` Apache-2.0 model downloaded on first use, never bundled). `erase_lama` crops a padded region around the mask, runs LaMa at its fixed 512² input, pastes only masked pixels back (untouched areas stay pixel-exact). Lazy `_get_lama_session` singleton; `lama_available()` guards the optional import. **LaMa-ONNX costs ~3.5-4 GB peak RAM and ~5-6 s/call on CPU** (FFC working set, not arena — `enable_cpu_mem_arena=False` does not help), so it does NOT fit a minimal droplet; the cv2 backend (tens of MB, ~30 ms) does. LaMa quality at low RAM = serverless/GPU, mirroring how raiw.cc offloads SDXL to fal.
diff --git a/src/remove_ai_watermarks/cli.py b/src/remove_ai_watermarks/cli.py
index b33643c..598db45 100644
--- a/src/remove_ai_watermarks/cli.py
+++ b/src/remove_ai_watermarks/cli.py
@@ -101,15 +101,15 @@ def _write_bgr_with_alpha(
     path: Path,
     bgr: NDArray[Any],
     alpha: NDArray[Any] | None,
-    clear_region: tuple[int, int, int, int] | None = None,
-    pad: int = 6,
 ) -> None:
     """Write BGR (with optional alpha) to ``path``.

-    When ``alpha`` is provided and the output extension supports it, writes a
-    4-channel image. If ``clear_region`` is given as ``(x, y, w, h)``, alpha is
-    forced to 0 inside that bbox (expanded by ``pad`` px) so the watermark area
-    becomes fully transparent in the saved file.
+    When ``alpha`` is provided and the output extension supports it, the original
+    alpha plane is rejoined unchanged. The watermark region is NOT made
+    transparent: reverse-alpha (and inpaint) recover real pixels there, so
+    zeroing alpha would punch a transparent hole that renders as a white box on
+    any non-transparent viewer (issue #30). Preserving the input alpha keeps
+    genuinely transparent backgrounds intact without inventing new holes.
     """
     import numpy as np

@@ -119,17 +119,7 @@ def _write_bgr_with_alpha(
         image_io.imwrite(path, bgr)
         return

-    alpha_out = alpha
-    if clear_region is not None:
-        alpha_out = alpha.copy()
-        x, y, w, h = clear_region
-        height, width = alpha.shape[:2]
-        x0, y0 = max(0, x - pad), max(0, y - pad)
-        x1, y1 = min(width, x + w + pad), min(height, y + h + pad)
-        if x1 > x0 and y1 > y0:
-            alpha_out[y0:y1, x0:x1] = 0
-
-    bgra = np.dstack([bgr, alpha_out])
+    bgra = np.dstack([bgr, alpha])
     image_io.imwrite(path, bgra)


@@ -246,7 +236,7 @@ def cmd_visible(
     method: Literal["telea", "ns"] = "ns" if inpaint_method == "ns" else "telea"
     t0 = time.monotonic()
     with console.status(f"[cyan]Removing {chosen.label}… ({chosen.recovery})[/]"):
-        result, region = chosen.remove(
+        result, _ = chosen.remove(
             image,
             inpaint_method=method,
             inpaint=inpaint,
@@ -255,9 +245,9 @@ def cmd_visible(
         )
     elapsed = time.monotonic() - t0

-    # Save (preserves transparency by clearing alpha in the watermark region)
+    # Save (rejoins the original alpha plane unchanged)
     output.parent.mkdir(parents=True, exist_ok=True)
-    _write_bgr_with_alpha(output, result, alpha, clear_region=region)
+    _write_bgr_with_alpha(output, result, alpha)

     # Strip metadata
     if strip_metadata:
@@ -349,8 +339,7 @@ def cmd_erase(
     elapsed = time.monotonic() - t0

     output.parent.mkdir(parents=True, exist_ok=True)
-    clear = boxes[0] if len(boxes) == 1 else None
-    _write_bgr_with_alpha(output, result, alpha, clear_region=clear)
+    _write_bgr_with_alpha(output, result, alpha)

     if strip_metadata:
         try:
@@ -695,7 +684,6 @@ def cmd_all(
         h, w = image.shape[:2]
         console.print(f" [dim]Input:[/] {source.name} ({w}x{h})")

-        region: tuple[int, int, int, int] | None = None
         with console.status("[cyan]Removing visible watermark…[/]"):
             det = engine.detect_watermark(image)
             if det.detected:
@@ -709,7 +697,7 @@ def cmd_all(
                 console.print(" [dim]Skipped (no visible watermark detected)[/]")

         # Save to temp file for invisible engine input (preserve alpha if present)
-        _write_bgr_with_alpha(tmp_path, result, alpha, clear_region=region)
+        _write_bgr_with_alpha(tmp_path, result, alpha)

         # ── Step 2: Invisible watermark ──────────────────────────────
         console.print("\n [bold cyan]② Invisible watermark removal[/]")
@@ -761,14 +749,14 @@ def cmd_all(

         # ── Write final result ────────────────────────────────────────
         # The invisible step (and downstream cv2.IMREAD_COLOR paths) drops alpha,
-        # so re-attach the original alpha (with the watermark region cleared)
-        # when writing the final output for transparent formats.
+        # so re-attach the original alpha plane unchanged when writing the final
+        # output for transparent formats.
         output.parent.mkdir(parents=True, exist_ok=True)
         final_bgr, _ = _read_bgr_and_alpha(tmp_path)
         if final_bgr is None:
             console.print(f"[red]Error:[/] Failed to read intermediate file: {tmp_path}")
             raise SystemExit(1)
-        _write_bgr_with_alpha(output, final_bgr, alpha, clear_region=region)
+        _write_bgr_with_alpha(output, final_bgr, alpha)

     finally:
         # Clean up temp file if it still exists
@@ -808,7 +796,6 @@ def _process_batch_image(
         ValueError: If the image cannot be opened.
     """
     saved_alpha: NDArray[Any] | None = None
-    saved_region: tuple[int, int, int, int] | None = None

     if mode in ("visible", "all"):
         from remove_ai_watermarks.gemini_engine import GeminiEngine
@@ -823,7 +810,6 @@ def _process_batch_image(
         if image is None:
             raise ValueError("Failed to read image")

-        region: tuple[int, int, int, int] | None = None
         det = engine.detect_watermark(image)
         if det.detected:
             result = engine.remove_watermark(image)
@@ -834,9 +820,8 @@ def _process_batch_image(
         else:
             result = image.copy()

-        _write_bgr_with_alpha(out_path, result, alpha, clear_region=region)
+        _write_bgr_with_alpha(out_path, result, alpha)
         saved_alpha = alpha
-        saved_region = region

     if mode in ("invisible", "all"):
         from remove_ai_watermarks.invisible_engine import (
@@ -873,7 +858,7 @@ def _process_batch_image(
     if mode == "all" and saved_alpha is not None:
         final_bgr, _ = _read_bgr_and_alpha(out_path)
         if final_bgr is not None:
-            _write_bgr_with_alpha(out_path, final_bgr, saved_alpha, clear_region=saved_region)
+            _write_bgr_with_alpha(out_path, final_bgr, saved_alpha)


 @main.command("batch")
diff --git a/src/remove_ai_watermarks/watermark_registry.py b/src/remove_ai_watermarks/watermark_registry.py
index 7fb130d..a7cb6ce 100644
--- a/src/remove_ai_watermarks/watermark_registry.py
+++ b/src/remove_ai_watermarks/watermark_registry.py
@@ -71,8 +71,10 @@ class KnownMark:
         inpaint_strength: float = 0.85,
         force: bool = False,
     ) -> tuple[NDArray[Any], Region | None]:
-        """Remove this mark by reverse-alpha; returns ``(result, cleared_region)``
-        (region for clearing alpha on save, or None if nothing was removed).
+        """Remove this mark by reverse-alpha; returns ``(result, region)`` where
+        ``region`` is the removed mark's bbox (for residual-inpaint positioning),
+        or None if nothing was removed. NB: the CLI does NOT use ``region`` to
+        clear alpha on save -- that zeroing caused the issue-#30 white box.

         ``inpaint`` / ``inpaint_strength`` / ``inpaint_method`` tune the Gemini
         reverse-alpha edge-residual cleanup only. ``force`` removes at the mark's
diff --git a/tests/test_cli.py b/tests/test_cli.py
index 3a53c62..08e0339 100644
--- a/tests/test_cli.py
+++ b/tests/test_cli.py
@@ -198,15 +198,17 @@ class TestVisibleCommand:
         # which doesn't overlap the centre square at 200x200).
         assert out[100, 100, 3] == 255

-    def test_visible_clears_alpha_in_watermark_region(self, runner, tmp_path):
-        """When inpainting an RGBA image, the watermark region must be cleared
-        in the alpha channel so the sparkle area becomes transparent, not opaque-black.
+    def test_visible_keeps_alpha_opaque_in_watermark_region(self, runner, tmp_path):
+        """Regression for issue #30 (white box): on an opaque RGBA image, the
+        watermark region must stay OPAQUE. Reverse-alpha recovers real pixels
+        there, so zeroing alpha would punch a transparent hole that renders as a
+        solid white box on any non-transparent viewer.
         """
         rgba = np.full((200, 200, 4), 255, dtype=np.uint8)  # fully opaque white
         src = tmp_path / "rgba_full.png"
         cv2.imwrite(str(src), rgba)

-        output = tmp_path / "rgba_cleared.png"
+        output = tmp_path / "rgba_kept.png"
         result = runner.invoke(
             main,
             ["visible", str(src), "-o", str(output), "--no-detect"],
@@ -215,13 +217,15 @@ class TestVisibleCommand:
         assert result.exit_code == 0, result.output
         out = cv2.imread(str(output), cv2.IMREAD_UNCHANGED)
         assert out.shape[2] == 4
-        # Default sparkle position is in the bottom-right; alpha there must be 0.
+        # Default sparkle position is in the bottom-right; alpha there must stay 255.
         from remove_ai_watermarks.gemini_engine import get_watermark_config

         cfg = get_watermark_config(200, 200)
         px, py = cfg.get_position(200, 200)
         size = cfg.logo_size
-        assert out[py + size // 2, px + size // 2, 3] == 0, "alpha in the watermark region was not cleared"
+        assert out[py + size // 2, px + size // 2, 3] == 255, "watermark region alpha was zeroed (white-box regression)"
+        # No pixel anywhere should have been forced transparent.
+        assert int((out[:, :, 3] == 0).sum()) == 0, "spurious transparent pixels introduced"

     def test_visible_rgb_input_stays_rgb(self, runner, sample_png, tmp_path):
         """Regression: a plain RGB PNG must NOT gain a spurious alpha channel."""
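
Reviewer note: the white-box mechanism described in the commit message can be reproduced with plain arithmetic on a single pixel. The sketch below is not code from this repository. `composite_over`, `reverse_alpha`, and the sample values (a dark pixel of 40, a blend factor of 0.3) are hypothetical, chosen only for illustration. The logo value 253 follows the near-white `_ALPHA_LOGO_BGR` mentioned in CLAUDE.md. It assumes a viewer that blends the image over a white page with the usual source-over rule.

```python
def composite_over(bg: int, fg: int, alpha: int) -> int:
    """Source-over blend of one 8-bit channel: fg over bg, alpha in [0, 255]."""
    a = alpha / 255.0
    return round(a * fg + (1.0 - a) * bg)


def reverse_alpha(wm: float, logo: float, a: float) -> float:
    """Invert wm = (1 - a) * original + a * logo for one channel (a in [0, 1))."""
    return (wm - a * logo) / (1.0 - a)


WHITE = 255          # page background of a non-transparent viewer (assumption)
original = 40.0      # hypothetical dark pixel under the sparkle
logo, a = 253.0, 0.3 # near-white logo; 0.3 is an illustrative blend factor

# Forward watermarking, then exact recovery by reverse-alpha.
watermarked = (1.0 - a) * original + a * logo
recovered = reverse_alpha(watermarked, logo, a)

# Old behaviour: alpha forced to 0 in the bbox, so the viewer shows the page.
old_view = composite_over(WHITE, round(recovered), 0)
# New behaviour: input alpha (opaque, 255) rejoined unchanged.
new_view = composite_over(WHITE, round(recovered), 255)

print(old_view, new_view)
```

With alpha zeroed, the recovered pixel never reaches the screen and the viewer shows its own background. With the input alpha rejoined unchanged, the recovered value is displayed as-is.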