fix: harden metadata parsers and engines; sync docs (full-repo review)

Apply fixes from a full-repo review (code, tests, docs).

Security / correctness:
- Clamp attacker-controlled PNG/caBX chunk lengths to the remaining file
  size in metadata.py and noai/c2pa.py (a malformed length no longer drives
  a multi-GB read); skipped chunks seek instead of read.
- noai/isobmff.strip_c2pa_boxes is now fail-safe on a malformed box: return
  the original bytes with a warning instead of silently truncating the tail,
  so metadata --remove can no longer emit a corrupt file.
- doubao_engine._fixed_alpha_map clamps the glyph box to the image (no
  ValueError on a degenerate wide/short input such as 2048x1).
- watermark_remover._run_region_hires gates the phaseCorrelate offset on
  response and magnitude (a spurious shift no longer garbles text) and drops
  the generator after a CPU fallback (no MPS/CPU device mismatch).

Robustness:
- gemini_engine, doubao_engine, region_eraser normalize grayscale and RGBA
  inputs to BGR at the engine entry points.
- image_io.imwrite returns False on an unwritable path (matches cv2).
- invisible_engine guards a None imread result before use.
- trustmark_detector._decoder uses a double-checked threading lock.
- ctrlregen.tiling.tile_positions raises on overlap >= tile.
- humanizer chromatic shift no longer wraps opposite-edge pixels.
- identify: the OpenAI caveat is keyed on the normalized SynthID vendor, not
  a raw "OpenAI" substring.
- Remove the dead "visible --detect-threshold" CLI option.
- publish.yml verifies the release tag matches the package version.

Docs:
- README strength updated from 0.05 to 0.10; .env.example HF_TOKEN marked optional;
  doubao_capture README updated to reverse-alpha-only; CLAUDE.md synced with
  the new behaviors and the batch command.

Tests: new test_security_clamp.py for the read clamp and isobmff fail-safe;
erase CLI coverage; integrity-clash rule 2 end-to-end; multi-tag EXIF
survival and cross-format strip guards; channel/size, tiling, humanizer, and
imwrite regressions. Full suite 493 passed, 2 skipped; ruff and pyright src/
clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Victor Kuznetsov
2026-05-30 18:00:39 -07:00
parent 5298dcc6a3
commit 5d0e6c3a65
29 changed files with 580 additions and 43 deletions
+1 -1
@@ -1,3 +1,3 @@
-# HuggingFace token (required for invisible watermark removal)
+# HuggingFace token (optional; only needed for gated/private models)
# Get yours at: https://huggingface.co/settings/tokens
HF_TOKEN=
+10
@@ -17,6 +17,16 @@ jobs:
- uses: astral-sh/setup-uv@v7
- name: Verify release tag matches package version
run: |
tag="${{ github.event.release.tag_name }}"
version="$(grep -m1 '^version = ' pyproject.toml | sed -E 's/^version = "([^"]+)"/\1/')"
if [ "${tag#v}" != "$version" ]; then
echo "Release tag '$tag' does not match pyproject.toml version '$version'" >&2
exit 1
fi
echo "Release tag '$tag' matches package version '$version'"
- name: Build package
run: uv build
+13 -11
File diff suppressed because one or more lines are too long
+2 -2
@@ -98,11 +98,11 @@ The removal pipeline (default profile, SDXL):
```text
image → encode to latent space (VAE) at native resolution
→ add controlled noise (forward diffusion)
-→ denoise (reverse diffusion, ~50 steps at strength 0.05)
+→ denoise (reverse diffusion, ~50 steps at strength 0.10)
→ decode back to pixels (VAE)
```
By default the image is processed at its **native resolution** with no pre-downscale, matching the hosted raiw.cc backend (fal `fast-sdxl`, which is `stabilityai/stable-diffusion-xl-base-1.0` — the same checkpoint the CLI defaults to). At strength ~0.05 SDXL img2img does not need the input shrunk, and the old forced downscale-to-1024 then upscale-back round-trip was the main quality loss. Pass `--max-resolution N` to cap the long side only when a very large image runs out of GPU/MPS memory (it reintroduces that lossy round-trip).
- Native resolution avoids shrinking the input to 1024 px first; that down-then-up round-trip was the main quality loss (issue #10). Use `--max-resolution N` only to cap GPU/MPS memory on very large inputs.
SDXL is the default since May 2026: empirically defeats SynthID v2 on Gemini 3 Pro outputs, where the older SD-1.5 pipeline at 768 px did not. The SD-1.5 path was removed once it was verified not to handle v2. Note the scope: this defeats the SynthID *verifier*, which is not the same as being forensically indistinguishable from a real photo. Recent work ([arXiv:2605.09203](https://arxiv.org/abs/2605.09203)) shows watermark-removal pipelines leave detectable traces, so a separate "this image was processed" classifier can still flag the output.
+10 -2
@@ -1,5 +1,11 @@
# Doubao visible watermark capture
> **Status (completed 2026-05-29):** the capture described below was carried out (black + gray
> Doubao captures) and the exact alpha map was solved. Removal is now **reverse-alpha only**: at the
> captured native width recovery is pixel-exact and inpaint is OFF; a residual inpaint runs off-native
> only. See the `doubao_engine.py` notes in the root `CLAUDE.md`. The text below is kept as the
> historical capture plan.
Goal: capture the Doubao "豆包AI生成" visible watermark over known flat backgrounds so we can
build a per-pixel alpha map and a reverse-alpha-blend remover, the same way the Gemini sparkle
engine works (`src/remove_ai_watermarks/gemini_engine.py`).
@@ -16,8 +22,10 @@ engine works (`src/remove_ai_watermarks/gemini_engine.py`).
- Size **scales with resolution**. Third-party numbers (~90x18 at <=1024, ~180x40 at >1024) are
approximate and calibrated for ~1024-1280 outputs; at 2048 the strip is much larger. A shipped
third-party alpha map is only 120x20, too small for our 2K/4K target -> capture fresh.
-- In practice clean inversion leaves residue on textured backgrounds, so the remover pairs the alpha
-map with inpainting (our Gemini engine already does gradient-masked inpainting for residual edges).
+- The planning assumption was that clean inversion leaves residue on textured backgrounds, so the
+remover would pair the alpha map with inpainting. After the capture this turned out unnecessary at
+the native width (recovery is pixel-exact there and inpaint is off); the shipped remover is
+reverse-alpha only, with a residual inpaint applied off-native only.
## Use doubao.com specifically
-2
@@ -160,7 +160,6 @@ def main(ctx: click.Context, verbose: bool) -> None:
)
@click.option("--inpaint-strength", type=float, default=0.85, help="Inpainting blend strength (0.0-1.0).")
@click.option("--detect/--no-detect", default=True, help="Detect watermark before removal.")
-@click.option("--detect-threshold", type=float, default=0.25, help="Detection confidence threshold.")
@click.option(
"--mark",
type=click.Choice(["auto", *watermark_registry.mark_keys()]),
@@ -178,7 +177,6 @@ def cmd_visible(
inpaint_method: Literal["ns", "telea", "gaussian"],
inpaint_strength: float,
detect: bool,
-detect_threshold: float,
mark: str,
strip_metadata: bool,
) -> None:
+20 -2
@@ -222,7 +222,14 @@ class DoubaoEngine:
"""
h, w = image.shape[:2]
x, y, bw, bh = loc.bbox
-roi = image[y : y + bh, x : x + bw].astype(np.float32)
+# Normalize the ROI to 3-channel BGR: a 2D grayscale or 4-channel BGRA
+# input would otherwise break the axis=2 channel reductions below.
+roi = image[y : y + bh, x : x + bw]
+if roi.ndim == 2:
+roi = cv2.cvtColor(roi, cv2.COLOR_GRAY2BGR)
+elif roi.shape[2] == 4:
+roi = cv2.cvtColor(roi, cv2.COLOR_BGRA2BGR)
+roi = roi.astype(np.float32)
luma = roi.mean(axis=2)
sat = roi.max(axis=2) - roi.min(axis=2)
@@ -290,7 +297,12 @@ class DoubaoEngine:
if at is None:
return None
h, w = image.shape[:2]
-gw, gh = max(1, int(_ALPHA_WIDTH_FRAC * w)), max(1, int(_ALPHA_HEIGHT_FRAC * w))
+# Glyph box scales with WIDTH; on a wide/short image the height-from-width
+# box can exceed the image height. Clamp both dims so the slice assignment
+# below cannot overflow (a degenerate 2048x1 input otherwise raised
+# ValueError on the broadcast). Normal images are unaffected.
+gw = min(w, max(1, int(_ALPHA_WIDTH_FRAC * w)))
+gh = min(h, max(1, int(_ALPHA_HEIGHT_FRAC * w)))
ax = max(0, w - int(_ALPHA_MARGIN_RIGHT_FRAC * w) - gw)
ay = max(0, h - int(_ALPHA_MARGIN_BOTTOM_FRAC * w) - gh)
amap = np.zeros((h, w), np.float32)
@@ -353,6 +365,12 @@ class DoubaoEngine:
inpaint there costs nothing and reliably clears the mark).
Call only when :meth:`reverse_alpha_available` and the mark is detected.
"""
# Normalize to 3-channel BGR so a 2D grayscale or 4-channel BGRA input
# does not break the reverse-alpha math (which assumes a 3-channel logo).
if image.ndim == 2:
image = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
elif image.shape[2] == 4:
image = cv2.cvtColor(image, cv2.COLOR_BGRA2BGR)
at_native = abs(image.shape[1] / _ALPHA_NATIVE_WIDTH - 1.0) <= _ALPHA_NATIVE_BAND
if at_native:
amap = self._fixed_alpha_map(image)
+9 -4
@@ -329,16 +329,21 @@ class GeminiEngine:
"""
result = image.copy()
-# Handle alpha channel
-if result.shape[2] == 4:
+# Normalize to 3-channel BGR up front: 2D grayscale (no channel axis) and
+# 4-channel BGRA both reach this public entry point and would otherwise
+# crash on the channel-count checks / downstream 3-channel math.
+if result.ndim == 2:
+result = cv2.cvtColor(result, cv2.COLOR_GRAY2BGR)
+elif result.shape[2] == 4:
result = cv2.cvtColor(result, cv2.COLOR_BGRA2BGR)
elif result.shape[2] == 1:
result = cv2.cvtColor(result, cv2.COLOR_GRAY2BGR)
size = force_size or get_watermark_size(result.shape[1], result.shape[0])
-# Detect dynamic position & size
-detection = self.detect_watermark(image, force_size=size)
+# Detect dynamic position & size (on the normalized 3-channel image so a
+# grayscale/BGRA input does not crash the detector).
+detection = self.detect_watermark(result, force_size=size)
if not detection.detected:
logger.debug(
+5 -1
@@ -36,10 +36,14 @@ def apply_analog_humanizer(image: NDArray, grain_intensity: float = 4.0, chromat
b, g, r = cv2.split(image)
# 1. Chromatic Aberration
-# Shift R channel left, B channel right
+# Shift R channel left, B channel right. np.roll is circular, so it wraps
+# the opposite edge into a thin colored fringe at the L/R borders; replicate
+# the original edge columns there to keep the intended offset interior-only.
if chromatic_shift > 0:
r = np.roll(r, -chromatic_shift, axis=1)
+r[:, -chromatic_shift:] = r[:, -chromatic_shift - 1 : -chromatic_shift]
b = np.roll(b, chromatic_shift, axis=1)
+b[:, :chromatic_shift] = b[:, chromatic_shift : chromatic_shift + 1]
merged = cv2.merge((b, g, r))
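The edge handling in the hunk above can be illustrated without NumPy. The following is a minimal pure-Python sketch on a single row, assuming `0 < abs(shift) < len(row)`; `shift_row_replicate` is a hypothetical helper written for illustration and is not a function from the repository.

```python
from __future__ import annotations


def shift_row_replicate(row: list[int], shift: int) -> list[int]:
    """Shift a row by `shift` columns, replicating the edge instead of wrapping.

    Positive `shift` moves pixels right (as the B channel does in the diff),
    negative moves them left (as the R channel does).
    Assumption for this sketch: 0 < abs(shift) < len(row).
    """
    if shift > 0:
        # Vacated left columns repeat the original first pixel.
        return [row[0]] * shift + row[:-shift]
    if shift < 0:
        # Vacated right columns repeat the original last pixel.
        return row[-shift:] + [row[-1]] * (-shift)
    return list(row)


ramp = [0, 50, 100, 150, 200, 250]
print(shift_row_replicate(ramp, 2))   # left border stays dark
print(shift_row_replicate(ramp, -2))  # right border stays bright
```

A circular shift would instead place the bright right-hand values at the left border, which is the colored fringe the regression test guards against.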
+1 -1
@@ -431,7 +431,7 @@ def identify(image_path: Path, *, check_visible: bool = True, check_invisible: b
if synthid:
watermarks.append(f"SynthID pixel watermark ({synthid})")
caveats.append(_SYNTHID_CAVEAT)
-if "OpenAI" in (" ".join(issuers) + synthid):
+if _vendor_of(synthid) == "OpenAI":
caveats.append(_OPENAI_CAVEAT)
if v := _vendor_of(synthid):
ai_vendor_claims["synthid"] = v
+6 -2
@@ -54,7 +54,8 @@ def imwrite(path: str | Path, img: NDArray[Any]) -> bool:
The output format is taken from the path extension (e.g. ``.png``), exactly
like ``cv2.imwrite``. Returns ``True`` on success, ``False`` if the codec
-rejects the image.
+rejects the image or the path cannot be written (matching ``cv2.imwrite``,
+which returns ``False`` rather than raising on an unwritable path).
"""
import cv2
@@ -62,5 +63,8 @@ def imwrite(path: str | Path, img: NDArray[Any]) -> bool:
ok, buf = cv2.imencode(ext, img)
if not ok:
return False
-buf.tofile(str(path))
+try:
+buf.tofile(str(path))
+except OSError:
+return False
return True
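The "return False instead of raising" contract adopted above can be shown with the standard library alone. This is a minimal sketch under stated assumptions: `write_bytes_or_false` is a hypothetical stand-in for the encode-then-write step and does not use OpenCV or the repository's `image_io` module.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def write_bytes_or_false(path: str | Path, payload: bytes) -> bool:
    """Write `payload` to `path`; report failure as False rather than raising."""
    try:
        Path(path).write_bytes(payload)
    except OSError:
        # Covers a missing parent directory, permission errors, and similar.
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    print(write_bytes_or_false(Path(tmp) / "out.bin", b"data"))
    print(write_bytes_or_false(Path(tmp) / "no-such-dir" / "out.bin", b"data"))
```

Callers that already branch on the boolean result then need no extra exception handling for an unwritable destination.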
+3 -1
@@ -73,7 +73,7 @@ class InvisibleEngine:
"""
# SDXL base is the default since May 2026: empirically defeats SynthID v2
-# at strength=0.05 / steps=50 / native ~1024px. See CLAUDE.md "Known
+# at strength=0.10 / steps=50 / native ~1024px. See CLAUDE.md "Known
# limitations" for the regression evidence ruling out SD-1.5 pipelines.
DEFAULT_MODEL_ID = "stabilityai/stable-diffusion-xl-base-1.0"
CTRLREGEN_MODEL_ID = "yepengliu/ctrlregen"
@@ -227,6 +227,8 @@ class InvisibleEngine:
from remove_ai_watermarks import image_io
out_cv = image_io.imread(out_path, cv2.IMREAD_COLOR)
if out_cv is None:
return out_path
if protect_faces and original_faces:
if self._progress_callback:
+6 -1
@@ -190,6 +190,8 @@ def _png_late_metadata(image_path: Path, window: int) -> bytes:
with open(image_path, "rb") as f:
if f.read(8) != b"\x89PNG\r\n\x1a\n":
return b""
f.seek(0, 2)
file_size = f.tell()
pos = 8
while True:
f.seek(pos)
@@ -201,9 +203,12 @@ def _png_late_metadata(image_path: Path, window: int) -> bytes:
if chunk_type == b"IEND":
break
data_start = pos + 8
# Clamp the attacker-controlled 32-bit length to the bytes that
# actually remain, so a malformed huge length can't allocate GBs.
safe_length = max(0, min(length, file_size - data_start))
if chunk_type in _PNG_META_CHUNKS and data_start >= window:
f.seek(data_start)
-out += f.read(length)
+out += f.read(safe_length)
pos = data_start + length + 4 # data + CRC
except OSError as exc:
logger.debug("PNG late-metadata scan failed on %s: %s", image_path, exc)
+24 -6
@@ -55,6 +55,9 @@ def has_c2pa_metadata(image_path: Path) -> bool:
if signature != PNG_SIGNATURE:
return False
file_size = f.seek(0, 2)
f.seek(8)
while True:
chunk_header = f.read(8)
if len(chunk_header) < 8:
@@ -62,9 +65,12 @@ def has_c2pa_metadata(image_path: Path) -> bool:
length = struct.unpack(">I", chunk_header[:4])[0]
chunk_type = chunk_header[4:8]
# Clamp the attacker-controlled 32-bit length to the bytes that
# actually remain, so a malformed huge length can't allocate GBs.
safe_length = max(0, min(length, file_size - f.tell()))
if chunk_type == C2PA_CHUNK_TYPE:
-chunk_data = f.read(length)
+chunk_data = f.read(safe_length)
# Check for any C2PA signature
for sig in C2PA_SIGNATURES:
if sig in chunk_data:
@@ -74,7 +80,7 @@ def has_c2pa_metadata(image_path: Path) -> bool:
return True
f.read(4)
else:
-f.read(length + 4)
+f.seek(safe_length + 4, 1)
if chunk_type == b"IEND":
break
@@ -108,6 +114,9 @@ def extract_c2pa_info(image_path: Path) -> dict[str, Any]:
if signature != PNG_SIGNATURE:
return c2pa_info
file_size = f.seek(0, 2)
f.seek(8)
while True:
chunk_header = f.read(8)
if len(chunk_header) < 8:
@@ -115,13 +124,16 @@ def extract_c2pa_info(image_path: Path) -> dict[str, Any]:
length = struct.unpack(">I", chunk_header[:4])[0]
chunk_type = chunk_header[4:8]
# Clamp the attacker-controlled 32-bit length to the bytes that
# actually remain, so a malformed huge length can't allocate GBs.
safe_length = max(0, min(length, file_size - f.tell()))
if chunk_type == C2PA_CHUNK_TYPE:
-chunk_data = f.read(length)
+chunk_data = f.read(safe_length)
_parse_c2pa_chunk(chunk_data, c2pa_info)
f.read(4)
else:
-f.read(length + 4)
+f.seek(safe_length + 4, 1)
if chunk_type == b"IEND":
break
@@ -278,6 +290,9 @@ def extract_c2pa_chunk(image_path: Path) -> bytes | None:
if signature != PNG_SIGNATURE:
return None
file_size = f.seek(0, 2)
f.seek(8)
while True:
chunk_header = f.read(8)
if len(chunk_header) < 8:
@@ -285,9 +300,12 @@ def extract_c2pa_chunk(image_path: Path) -> bytes | None:
length = struct.unpack(">I", chunk_header[:4])[0]
chunk_type = chunk_header[4:8]
# Clamp the attacker-controlled 32-bit length to the bytes that
# actually remain, so a malformed huge length can't allocate GBs.
safe_length = max(0, min(length, file_size - f.tell()))
if chunk_type == C2PA_CHUNK_TYPE:
-chunk_data = f.read(length)
+chunk_data = f.read(safe_length)
crc = f.read(4)
# Check for any C2PA signature
@@ -299,7 +317,7 @@ def extract_c2pa_chunk(image_path: Path) -> bytes | None:
if b"jumb" in chunk_data.lower() or b"c2pa" in chunk_data.lower():
return chunk_header + chunk_data + crc
else:
-f.read(length + 4)
+f.seek(safe_length + 4, 1)
if chunk_type == b"IEND":
break
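The length clamp repeated across the PNG hunks above follows one pattern: measure the file once, then never read more than the bytes that remain. The sketch below is a self-contained illustration of that pattern; `iter_png_chunks` is a hypothetical helper, not a function in `noai/c2pa.py`, and CRCs are skipped without verification.

```python
from __future__ import annotations

import io
import struct
from typing import BinaryIO, Iterator

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_png_chunks(f: BinaryIO, wanted: set[bytes]) -> Iterator[tuple[bytes, bytes]]:
    """Yield (type, data) for chunks in `wanted`; seek past all other chunks."""
    if f.read(8) != PNG_SIGNATURE:
        return
    file_size = f.seek(0, 2)  # seeking to EOF returns the absolute size
    f.seek(8)
    while True:
        header = f.read(8)
        if len(header) < 8:
            break
        length = struct.unpack(">I", header[:4])[0]
        chunk_type = header[4:8]
        # The declared 32-bit length is untrusted: clamp it to what remains.
        safe_length = max(0, min(length, file_size - f.tell()))
        if chunk_type in wanted:
            data = f.read(safe_length)
            f.seek(4, 1)  # skip the CRC
            yield chunk_type, data
        else:
            f.seek(safe_length + 4, 1)  # skip data + CRC without reading it
        if chunk_type == b"IEND":
            break


# A caBX chunk that claims ~4 GiB but carries only five bytes.
blob = PNG_SIGNATURE + struct.pack(">I", 0xFFFFFFFF) + b"caBX" + b"hello"
print(list(iter_png_chunks(io.BytesIO(blob), {b"caBX"})))
```

With the clamp in place the read is bounded by the file size, so a malformed header costs at most one pass over the real bytes.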
@@ -20,6 +20,8 @@ from PIL import Image
def tile_positions(total: int, tile: int, overlap: int) -> list[int]:
"""Compute evenly-spaced tile start positions covering *total* pixels."""
if not (0 <= overlap < tile):
raise ValueError(f"overlap must satisfy 0 <= overlap < tile (got overlap={overlap}, tile={tile})")
if total <= tile:
return [0]
n = max(2, math.ceil((total - overlap) / (tile - overlap)))
+21
@@ -17,6 +17,7 @@ Reference: ISO/IEC 14496-12 (ISOBMFF) and C2PA 2.1 spec §11.
from __future__ import annotations
import logging
import re
import struct
from typing import TYPE_CHECKING
@@ -32,6 +33,8 @@ from remove_ai_watermarks.metadata import (
IPTC_AI_MARKERS,
)
log = logging.getLogger(__name__)
# Top-level box types that may carry AI provenance. ``uuid`` boxes are checked
# against ``C2PA_UUID`` / AI-label markers before being stripped; ``jumb`` boxes
# are always stripped (JPEG-XL uses them exclusively for JUMBF).
@@ -126,6 +129,8 @@ def scan_c2pa_region(path: str | Path, *, max_total: int = 4 * 1024 * 1024) -> b
else:
size = size32
if size < (payload_off - pos) or pos + size > file_size:
# Detection-only: a malformed box halts the walk, so a manifest
# placed after it is missed (best-effort scan; no resync).
break
if box_type in C2PA_BOX_TYPES:
f.seek(payload_off)
@@ -162,7 +167,9 @@ def strip_c2pa_boxes(data: bytes) -> tuple[bytes, int]:
out = bytearray()
stripped = 0
consumed = 0
for start, end, box_type, payload_off in _iter_top_level_boxes(data):
consumed = end
if box_type == b"uuid":
# uuid boxes carry the 16-byte UUID immediately after the type.
is_c2pa = payload_off + 16 <= end and data[payload_off : payload_off + 16] == C2PA_UUID
@@ -174,6 +181,20 @@ def strip_c2pa_boxes(data: bytes) -> tuple[bytes, int]:
stripped += 1
continue
out.extend(data[start:end])
# Fail-safe: the walker returns early on a malformed box (bad size, or a box
# that runs past EOF), so anything after it was never visited. Emitting `out`
# would silently truncate the file from the bad box to EOF -- worse than not
# stripping. If the walk did not consume the whole input, return it unchanged.
if consumed != len(data):
log.warning(
"ISOBMFF box walk stopped at offset %d of %d (malformed box); "
"returning input unchanged to avoid truncation",
consumed,
len(data),
)
return data, 0
return bytes(out), stripped
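The fail-safe above reduces to one rule: if the box walk did not consume every input byte, return the input untouched. The following is a simplified sketch of that rule, assuming 32-bit box sizes only (the `size == 0` and 64-bit `largesize` forms are treated as malformed here); `strip_boxes` is a hypothetical helper, not the repository's `strip_c2pa_boxes`.

```python
from __future__ import annotations

import struct


def strip_boxes(data: bytes, drop: set[bytes]) -> tuple[bytes, int]:
    """Remove top-level boxes whose type is in `drop`; fail safe on bad input."""
    out = bytearray()
    stripped = 0
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack(">I4s", data[pos : pos + 8])
        if size < 8 or pos + size > len(data):
            break  # malformed box: stop walking, the tail was never visited
        if box_type in drop:
            stripped += 1
        else:
            out += data[pos : pos + size]
        pos += size
    if pos != len(data):
        # Emitting `out` here would silently truncate the file.
        return data, 0
    return bytes(out), stripped


ftyp = struct.pack(">I4s", 12, b"ftyp") + b"heic"
jumb = struct.pack(">I4s", 10, b"jumb") + b"xx"
print(strip_boxes(ftyp + jumb, {b"jumb"}))
print(strip_boxes(ftyp + jumb + b"\x00\x00\x00\x01uuid", {b"jumb"}))
```

The second call returns the original bytes with a count of zero, which mirrors the updated `test_truncated_largesize_terminates_safely` expectation.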
@@ -272,6 +272,12 @@ def _make_seed_generator(device: str, seed: int) -> Any:
return torch.Generator().manual_seed(seed) # type: ignore
def _generator_device(generator: Any) -> str:
"""Best-effort device type of a ``torch.Generator`` (e.g. ``"cpu"``, ``"mps"``)."""
device = getattr(generator, "device", None)
return getattr(device, "type", str(device)) if device is not None else "cpu"
# Keep legacy name available for backwards compatibility
_detect_model_profile_from_id = detect_model_profile
@@ -677,6 +683,14 @@ class WatermarkRemover:
base = self._run_img2img(init_image, strength, num_inference_steps, guidance_scale, generator)
# The base pass may have fallen back from MPS to CPU (it flips
# self.device). The generator was built for the original device, and
# diffusers rejects a device-mismatched generator ("Expected a 'cpu'
# device generator but found 'mps'"), so drop it for the per-region
# passes -- they then seed from the global RNG, which is fine here.
if generator is not None and self.device == "cpu" and _generator_device(generator) != "cpu":
generator = None
bgr = cv2.cvtColor(np.array(init_image), cv2.COLOR_RGB2BGR)
try:
boxes = text_protector.TextProtector().detect_text_boxes(bgr)
@@ -718,8 +732,13 @@ class WatermarkRemover:
# the composite even though the text is crisp.
cg = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY).astype(np.float32)
dg = cv2.cvtColor(down, cv2.COLOR_BGR2GRAY).astype(np.float32)
-(sx, sy), _resp = cv2.phaseCorrelate(cg, dg)
-if abs(sx) > 0.1 or abs(sy) > 0.1:
+(sx, sy), resp = cv2.phaseCorrelate(cg, dg)
+# Only correct for the real 1-2px round-trip shift. On a near-flat /
+# low-contrast crop phaseCorrelate returns a spurious large offset at
+# a tiny response (e.g. (19,19) at resp ~0.005); warping by that
+# garbles the composite -- the exact failure this was meant to
+# prevent. Gate on both a confident response and a plausible offset.
+if resp > 0.3 and abs(sx) < 4 and abs(sy) < 4 and (abs(sx) > 0.1 or abs(sy) > 0.1):
m = np.float32([[1, 0, -sx], [0, 1, -sy]])
down = cv2.warpAffine(down, m, (w, h), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REPLICATE)
out_bgr = text_protector.feather_paste(out_bgr, down, x, y)
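The gate in the hunk above combines three conditions. The sketch below isolates them in a pure function so each can be checked separately; `should_correct_shift` and its keyword names are hypothetical, while the default thresholds are the literals from the diff.

```python
def should_correct_shift(
    sx: float,
    sy: float,
    resp: float,
    *,
    min_resp: float = 0.3,
    max_shift: float = 4.0,
    min_shift: float = 0.1,
) -> bool:
    """Decide whether a phase-correlation offset is worth warping by."""
    confident = resp > min_resp  # a tiny response means the peak is noise
    plausible = abs(sx) < max_shift and abs(sy) < max_shift  # real drift is small
    nontrivial = abs(sx) > min_shift or abs(sy) > min_shift  # skip no-op warps
    return confident and plausible and nontrivial


print(should_correct_shift(19.0, 19.0, 0.005))  # spurious offset on a flat crop
print(should_correct_shift(1.2, -0.8, 0.9))     # small shift, strong response
```

Keeping the decision separate from the warp makes the thresholds easy to unit-test without constructing images.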
+9 -1
@@ -67,8 +67,16 @@ def erase_cv2(
method: Literal["telea", "ns"] = "telea",
radius: int = 6,
) -> NDArray[Any]:
-"""Inpaint ``mask`` with classical cv2 inpainting (CPU, no extra deps)."""
+"""Inpaint ``mask`` with classical cv2 inpainting (CPU, no extra deps).
+Accepts 1-/3-channel BGR (passed straight to ``cv2.inpaint``) and 4-channel
+BGRA: ``cv2.inpaint`` rejects 4 channels, so the alpha plane is split off,
+the BGR is inpainted, and alpha is re-attached unchanged.
+"""
flag = cv2.INPAINT_TELEA if method == "telea" else cv2.INPAINT_NS
if image_bgr.ndim == 3 and image_bgr.shape[2] == 4:
bgr = cv2.inpaint(image_bgr[:, :, :3], mask, radius, flag)
return np.dstack([bgr, image_bgr[:, :, 3]])
return cv2.inpaint(image_bgr, mask, radius, flag)
@@ -22,6 +22,7 @@ signal, not proof of AI origin.
from __future__ import annotations
import logging
import threading
from typing import TYPE_CHECKING, Any
if TYPE_CHECKING:
@@ -32,7 +33,9 @@ log = logging.getLogger(__name__)
# Adobe ships Variant P in production (com.adobe.trustmark.P).
_MODEL_TYPE = "P"
# Lazily constructed singleton -- model load + first-use download is expensive.
# Guarded by a lock so concurrent callers don't double-construct/double-download.
_tm: Any = None
_tm_lock = threading.Lock()
def is_available() -> bool:
@@ -45,9 +48,11 @@ def is_available() -> bool:
def _decoder() -> Any:
global _tm
if _tm is None:
-from trustmark import TrustMark
+with _tm_lock:
+if _tm is None:
+from trustmark import TrustMark
-_tm = TrustMark(verbose=False, model_type=_MODEL_TYPE)
+_tm = TrustMark(verbose=False, model_type=_MODEL_TYPE)
return _tm
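The double-checked lock above can be exercised in isolation. This is a minimal sketch with a stand-in for the expensive model load; `get_instance`, `expensive_build`, and `build_calls` are hypothetical names, and no TrustMark code is involved.

```python
import threading

instance = None
instance_lock = threading.Lock()
build_calls: list[int] = []  # records each construction, for illustration only


def expensive_build() -> object:
    """Stand-in for a slow model load or first-use download."""
    build_calls.append(1)
    return object()


def get_instance() -> object:
    global instance
    if instance is None:  # fast path: no lock once the singleton exists
        with instance_lock:
            if instance is None:  # re-check: another thread may have won the race
                instance = expensive_build()
    return instance


results: list[object] = []
threads = [threading.Thread(target=lambda: results.append(get_instance())) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(build_calls), all(r is results[0] for r in results))
```

The outer check keeps the common path lock-free; the inner check is what prevents a second construction when two threads pass the outer check at the same time.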
+75
@@ -541,3 +541,78 @@ class TestGpuHintMarkup:
with patch("remove_ai_watermarks.invisible_engine.is_available", return_value=False):
result = runner.invoke(main, ["all", str(sample_png)])
assert "remove-ai-watermarks[gpu]" in result.output
class TestEraseCommand:
"""Tests for the 'erase' universal region eraser subcommand."""
def test_erase_help(self, runner):
result = runner.invoke(main, ["erase", "--help"])
assert result.exit_code == 0
assert "--region" in result.output
assert "--backend" in result.output
def test_erase_single_region(self, runner, sample_png, tmp_path):
output = tmp_path / "erased.png"
result = runner.invoke(
main,
["erase", str(sample_png), "--region", "10,10,40,40", "-o", str(output)],
)
assert result.exit_code == 0, result.output
assert output.exists()
def test_erase_two_regions(self, runner, sample_png, tmp_path):
output = tmp_path / "erased2.png"
result = runner.invoke(
main,
[
"erase",
str(sample_png),
"--region",
"10,10,30,30",
"--region",
"120,120,30,30",
"-o",
str(output),
],
)
assert result.exit_code == 0, result.output
assert output.exists()
# The banner reports the region count it processed.
assert "2 region(s)" in result.output
def test_erase_default_output_name(self, runner, sample_png):
result = runner.invoke(main, ["erase", str(sample_png), "--region", "10,10,40,40"])
assert result.exit_code == 0, result.output
assert sample_png.with_stem(sample_png.stem + "_clean").exists()
def test_erase_malformed_region_exits_nonzero(self, runner, sample_png, tmp_path):
output = tmp_path / "x.png"
# Only three values: click.BadParameter -> non-zero exit, no output file.
result = runner.invoke(
main,
["erase", str(sample_png), "--region", "1,2,3", "-o", str(output)],
)
assert result.exit_code != 0
assert not output.exists()
def test_erase_nonexistent_file(self, runner):
result = runner.invoke(main, ["erase", "/nonexistent/file.png", "--region", "0,0,10,10"])
assert result.exit_code != 0
def test_erase_lama_backend_without_onnxruntime(self, runner, sample_png, tmp_path):
# The LaMa backend needs onnxruntime; without it the CLI must surface a
# clear error and exit non-zero rather than crash. When onnxruntime IS
# installed there is no missing-dep path to exercise, so skip.
from remove_ai_watermarks.region_eraser import lama_available
if lama_available():
pytest.skip("onnxruntime installed; missing-dep error path not reachable")
output = tmp_path / "y.png"
result = runner.invoke(
main,
["erase", str(sample_png), "--region", "10,10,40,40", "--backend", "lama", "-o", str(output)],
)
assert result.exit_code != 0
assert "onnxruntime" in result.output.lower()
assert not output.exists()
+25
@@ -161,3 +161,28 @@ class TestReverseAlpha:
assert float(np.abs(wm.astype(np.float32)[mark] - 100.0).mean()) > 15 # mark visible
out = eng.remove_watermark_reverse_alpha(wm).astype(np.float32)
assert float(np.abs(out[mark] - 100.0).mean()) < max_err
class TestDegenerateAndChannelInputs:
"""Removal must not crash on degenerate sizes or non-3-channel inputs."""
@pytest.mark.parametrize(("w", "h"), [(2048, 1), (1, 2048), (2048, 8)])
def test_wide_short_does_not_raise(self, w, h):
"""A wide/short image at native width makes the width-derived glyph box
taller than the image; the slice assignment must not ValueError."""
eng = DoubaoEngine()
img = np.zeros((h, w, 3), np.uint8)
out = eng.remove_watermark_reverse_alpha(img)
assert out.shape == img.shape
def test_grayscale_2d_does_not_raise(self):
eng = DoubaoEngine()
gray = np.zeros((2048, 2048), np.uint8)
out = eng.remove_watermark_reverse_alpha(gray)
assert out.shape == (2048, 2048, 3)
def test_bgra_4channel_does_not_raise(self):
eng = DoubaoEngine()
bgra = np.zeros((2048, 2048, 4), np.uint8)
out = eng.remove_watermark_reverse_alpha(bgra)
assert out.shape == (2048, 2048, 3)
+20
@@ -50,3 +50,23 @@ def test_invalid_shape():
img[0, 0] = 50
result = apply_analog_humanizer(img)
assert np.array_equal(img, result)
def test_chromatic_shift_does_not_wrap_opposite_edge():
# On a horizontal gradient (dark left, bright right), a circular np.roll
# would wrap the bright right edge into the R channel's left border and the
# dark left edge into the B channel's right border, producing a colored
# fringe. After the fix the border columns must replicate their own edge.
ramp = np.linspace(0, 255, 64, dtype=np.uint8)
gray = np.broadcast_to(ramp, (32, 64))
img = np.stack([gray, gray, gray], axis=2).copy() # B, G, R
shift = 3
result = apply_analog_humanizer(img, grain_intensity=0.0, chromatic_shift=shift)
# B (index 0) rolled right -> its left border must stay dark (near 0),
# NOT wrap the bright right edge.
assert result[:, :shift, 0].max() < 60
# R (index 2) rolled left -> its right border must stay bright (near 255),
# NOT wrap the dark left edge.
assert result[:, -shift:, 2].min() > 195
+63
@@ -389,6 +389,56 @@ class TestIdentifyCaveats:
assert len(r.caveats) == len(set(r.caveats))
class TestOpenAiCaveatVendorScoped:
"""The OpenAI rollout caveat keys on the normalized SynthID vendor, not a raw
"OpenAI" substring over the issuer + verdict blob -- so a Google-SynthID
manifest with an incidental "OpenAI" byte elsewhere is not mislabeled, while
a genuine OpenAI manifest still gets the hedge.
"""
@staticmethod
def _png_chunk(ctype: bytes, data: bytes) -> bytes:
import struct
import zlib
return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", zlib.crc32(ctype + data) & 0xFFFFFFFF)
def _png(self, tmp_path: Path, name: str, *extra: bytes) -> Path:
import struct
import zlib
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 6, 0, 0, 0)
body = (
b"\x89PNG\r\n\x1a\n"
+ self._png_chunk(b"IHDR", ihdr)
+ self._png_chunk(b"IDAT", zlib.compress(b"\x00" * 6, 9))
+ b"".join(extra)
+ self._png_chunk(b"IEND", b"")
)
path = tmp_path / name
path.write_bytes(body)
return path
def test_google_synthid_with_incidental_openai_byte_no_caveat(self, tmp_path: Path):
# Google C2PA/SynthID manifest in caBX; the byte "OpenAI" lives in a
# separate tEXt chunk (e.g. a trust-chain note), not as a SynthID vendor.
png = self._png(
tmp_path,
"g.png",
self._png_chunk(b"caBX", b"jumbc2pa Google ... trainedAlgorithmicMedia"),
self._png_chunk(b"tEXt", b"note\x00signed via OpenAI trust chain"),
)
r = identify(png, check_visible=False, check_invisible=False)
assert any("SynthID pixel watermark (likely present (Google" in w for w in r.watermarks)
assert not any("before the rollout" in c for c in r.caveats)
def test_openai_synthid_still_gets_caveat(self, tmp_path: Path):
png = self._png(tmp_path, "oa.png", self._png_chunk(b"caBX", b"jumbc2pa OpenAI ... trainedAlgorithmicMedia"))
r = identify(png, check_visible=False, check_invisible=False)
assert any("SynthID pixel watermark (likely present (OpenAI" in w for w in r.watermarks)
assert any("before the rollout" in c for c in r.caveats)
class TestReportSerializable:
def test_report_is_json_serializable(self, tmp_png_with_ai_metadata: Path):
# The CLI --json path relies on asdict + json.dumps(default=str).
@@ -657,6 +707,19 @@ class TestIntegrityClashEndToEnd:
r = identify(path, check_visible=False, check_invisible=False)
assert r.integrity_clashes == []
def test_camera_device_plus_ai_marker_clash(self, tmp_path: Path):
# Integrity-clash rule #2: a camera-capture C2PA device token (Pixel
# Camera) coexisting with an independent AI-generation marker (a China
# TC260 AIGC label) -- a genuine camera capture is not AI-generated, so
# the provenance is inconsistent (a laundering / spoofing tell).
path = self._c2pa_jpeg(
tmp_path,
b'Pixel Camera ... <TC260:AIGC>{"Label":"1","ContentProducer":"BYTEDANCE001"}</TC260:AIGC>',
)
r = identify(path, check_visible=False, check_invisible=False)
assert r.platform == "Google Pixel (camera, C2PA capture)"
assert any("Camera-capture C2PA credentials" in c and "AI-generation markers" in c for c in r.integrity_clashes)
def test_clash_serializes_to_json(self, tmp_path: Path):
path = self._c2pa_jpeg(tmp_path, b"OpenAI ... trainedAlgorithmicMedia ... TC260:AIGC label")
r = identify(path, check_visible=False, check_invisible=False)
+5
@@ -72,3 +72,8 @@ class TestFailureSemantics:
path = tmp_path / "garbage.png"
path.write_bytes(b"not an image")
assert image_io.imread(path) is None
def test_imwrite_to_missing_directory_returns_false(self, tmp_path: Path) -> None:
# An unwritable path must return False (cv2.imwrite contract), not raise.
path = tmp_path / "no-such-dir" / "out.png"
assert image_io.imwrite(path, _make_bgr()) is False
+60
@@ -445,6 +445,31 @@ class TestRemoveAiMetadata:
assert isinstance(result, Path)
assert result == output
def _sd_png(self, tmp_path: Path) -> Path:
img = Image.new("RGB", (32, 32), color=(80, 80, 80))
pnginfo = PngInfo()
pnginfo.add_text("parameters", "Steps: 20, Sampler: Euler")
img.save(tmp_path / "sd.png", pnginfo=pnginfo)
return tmp_path / "sd.png"
def test_png_to_jpeg_strips_ai(self, tmp_path):
# Cross-format output: the AI text chunk must not survive the PNG->JPEG
# re-encode, by detection AND by raw bytes.
out = tmp_path / "clean.jpg"
remove_ai_metadata(self._sd_png(tmp_path), out)
assert not has_ai_metadata(out)
body = out.read_bytes()
assert b"parameters" not in body
assert b"Steps" not in body
def test_png_to_webp_strips_ai(self, tmp_path):
out = tmp_path / "clean.webp"
remove_ai_metadata(self._sd_png(tmp_path), out)
assert not has_ai_metadata(out)
body = out.read_bytes()
assert b"parameters" not in body
assert b"Steps" not in body
def _img_with_software(tmp_path: Path, fmt: str, software: str) -> Path:
"""Write a tiny image carrying an EXIF Software tag."""
@@ -617,6 +642,41 @@ class TestRemoveAiExif:
         kept = piexif.load(Image.open(out).info["exif"])["0th"]
         assert kept.get(piexif.ImageIFD.Make) == b"Apple"
+
+    def test_xai_pair_stripped_but_genuine_camera_tags_kept(self, tmp_path: Path):
+        # An image carrying BOTH the xAI Signature pair (ImageDescription =
+        # "Signature: <base64>" + UUID Artist) AND genuine non-AI camera tags.
+        # The scrub must delete only the xAI pair, leaving the camera tags intact.
+        sig = "Signature: " + "A" * 120
+        artist = "12345678-1234-1234-1234-123456789abc"
+        exif = piexif.dump(
+            {
+                "0th": {
+                    piexif.ImageIFD.ImageDescription: sig.encode(),
+                    piexif.ImageIFD.Artist: artist.encode(),
+                    piexif.ImageIFD.Make: b"Canon",
+                    piexif.ImageIFD.Model: b"EOS R5",
+                },
+                "Exif": {piexif.ExifIFD.DateTimeOriginal: b"2024:01:01 12:00:00"},
+                "GPS": {piexif.GPSIFD.GPSLatitudeRef: b"N"},
+                "1st": {},
+            }
+        )
+        src = tmp_path / "grok_plus_cam.jpg"
+        Image.new("RGB", (32, 32)).save(src, exif=exif)
+        out = tmp_path / "scrubbed.jpg"
+        remove_ai_metadata(src, out)
+
+        # xAI signature pair is gone (xai_signature returns a bool, not None).
+        assert xai_signature(out) is False
+        kept = piexif.load(Image.open(out).info["exif"])
+        assert kept["0th"].get(piexif.ImageIFD.ImageDescription) is None
+        assert kept["0th"].get(piexif.ImageIFD.Artist) is None
+        # Genuine camera tags are preserved.
+        assert kept["0th"].get(piexif.ImageIFD.Make) == b"Canon"
+        assert kept["0th"].get(piexif.ImageIFD.Model) == b"EOS R5"
+        assert kept["Exif"].get(piexif.ExifIFD.DateTimeOriginal) == b"2024:01:01 12:00:00"
+        assert kept["GPS"].get(piexif.GPSIFD.GPSLatitudeRef) == b"N"
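The pair described in the test comment (an ImageDescription of the form "Signature: <base64>" plus a UUID-shaped Artist) could be recognized with a predicate along these lines. This is only a sketch: `looks_like_xai_pair` is a hypothetical name, and the project's `xai_signature` may apply different or stricter rules.

```python
from __future__ import annotations

import base64
import binascii
import uuid

_PREFIX = b"Signature: "


def looks_like_xai_pair(description: bytes | None, artist: bytes | None) -> bool:
    """True only when BOTH tags match the assumed xAI shape."""
    if not description or not artist:
        return False
    if not description.startswith(_PREFIX):
        return False
    try:
        # The payload after the prefix must be strict base64.
        base64.b64decode(description[len(_PREFIX):], validate=True)
        # The Artist tag must parse as a UUID.
        uuid.UUID(artist.decode("ascii"))
    except (binascii.Error, ValueError):
        return False
    return True


pair = looks_like_xai_pair(
    b"Signature: " + b"A" * 120, b"12345678-1234-1234-1234-123456789abc"
)
camera_only = looks_like_xai_pair(b"Holiday photo", b"Jane Doe")
```

Requiring both tags to match is what lets a scrub leave an ordinary ImageDescription or Artist value alone.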
 
 
 class TestAIGCLabel:
     """China TC260 AIGC labeling (Doubao and other China-served generators)."""
+5 -2
@@ -328,9 +328,12 @@ class TestISOBMFF:
     def test_truncated_largesize_terminates_safely(self):
         # size32==1 promises a 64-bit largesize, but the box ends after 8 bytes;
         # iteration must stop rather than read the missing largesize past EOF.
-        cleaned, stripped = strip_c2pa_boxes(FTYP + b"\x00\x00\x00\x01uuid")
+        # The walk halts before EOF, so the fail-safe returns the input unchanged
+        # (emitting only FTYP would silently truncate the file).
+        data = FTYP + b"\x00\x00\x00\x01uuid"
+        cleaned, stripped = strip_c2pa_boxes(data)
         assert stripped == 0
-        assert cleaned == FTYP
+        assert cleaned == data
 
 
 class TestC2PAInvalidSignature:
+17
@@ -66,6 +66,23 @@ class TestEraseCv2:
         assert np.array_equal(img, out)
 
 
+class TestNonBgrInputs:
+    """cv2.inpaint rejects 4-channel BGRA; BGRA and 2D grayscale inputs must still work."""
+
+    def test_grayscale_2d_does_not_raise(self):
+        gray = np.full((100, 100), 120, np.uint8)
+        out = erase(gray, boxes=[(40, 40, 20, 20)], backend="cv2")
+        assert out.shape == gray.shape
+
+    def test_bgra_preserves_alpha_and_does_not_raise(self):
+        bgra = np.full((100, 100, 4), 120, np.uint8)
+        bgra[..., 3] = 200  # opaque-ish alpha plane
+        out = erase(bgra, boxes=[(40, 40, 20, 20)], backend="cv2", dilate=0)
+        assert out.shape == bgra.shape
+        # alpha plane is carried through unchanged
+        assert np.array_equal(out[..., 3], bgra[..., 3])
+
+
 class TestLamaBackend:
     def test_lama_raises_when_unavailable(self):
         img = np.full((100, 100, 3), 50, np.uint8)
+130
@@ -0,0 +1,130 @@
"""Regression guards for malformed-length DoS and removal-truncation bugs.
+
+Three verified bugs, in two groups, are locked in here:
+
+1. PNG C2PA parsers (``c2pa.has_c2pa_metadata`` / ``extract_c2pa_info`` and
+   ``metadata._png_late_metadata`` via ``scan_head``) used the raw 32-bit chunk
+   ``length`` field directly in ``f.read(length)``. A crafted file can declare
+   ``length = 0x7FFFFFFF`` (~2 GiB) on a 60-byte file, forcing a multi-GB
+   allocation. The fix clamps ``length`` to the bytes actually remaining.
+
+2. ISOBMFF ``strip_c2pa_boxes`` truncated the file from a malformed box to EOF
+   (the box walker returns early), so ``remove_ai_metadata`` could emit a
+   shorter file and report success. The fix returns the input unchanged when the
+   walk does not reach EOF.
+"""
+
+from __future__ import annotations
+
+import struct
+import tracemalloc
+
+from remove_ai_watermarks import metadata
+from remove_ai_watermarks.noai import c2pa, isobmff
+
+PNG_SIG = b"\x89PNG\r\n\x1a\n"
+_HUGE = 0x7FFFFFFF  # ~2 GiB declared length on a tiny file
+
+
+def _png_with_huge_c2pa_chunk() -> bytes:
+    """A tiny (35-byte) 'PNG' whose caBX chunk header lies about its length."""
+    header = struct.pack(">I", _HUGE) + c2pa.C2PA_CHUNK_TYPE
+    body = b"jumbc2pa-not-really"  # far shorter than the declared length
+    return PNG_SIG + header + body
+
+
+class TestPngLengthClampNoAlloc:
+    """Clamping makes the parsers read only the real bytes, not the lie."""
+
+    def test_has_c2pa_metadata_is_bounded(self, tmp_path):
+        path = tmp_path / "evil.png"
+        path.write_bytes(_png_with_huge_c2pa_chunk())
+        tracemalloc.start()
+        try:
+            # Must return quickly without allocating gigabytes and without raising.
+            c2pa.has_c2pa_metadata(path)
+            _, peak = tracemalloc.get_traced_memory()
+        finally:
+            tracemalloc.stop()
+        assert peak < 50 * 1024 * 1024  # < 50 MB locks in the clamp
+
+    def test_extract_c2pa_info_is_bounded(self, tmp_path):
+        path = tmp_path / "evil.png"
+        path.write_bytes(_png_with_huge_c2pa_chunk())
+        tracemalloc.start()
+        try:
+            c2pa.extract_c2pa_info(path)
+            _, peak = tracemalloc.get_traced_memory()
+        finally:
+            tracemalloc.stop()
+        assert peak < 50 * 1024 * 1024
+
+    def test_extract_c2pa_chunk_is_bounded(self, tmp_path):
+        path = tmp_path / "evil.png"
+        path.write_bytes(_png_with_huge_c2pa_chunk())
+        tracemalloc.start()
+        try:
+            c2pa.extract_c2pa_chunk(path)
+            _, peak = tracemalloc.get_traced_memory()
+        finally:
+            tracemalloc.stop()
+        assert peak < 50 * 1024 * 1024
+
+    def test_png_late_metadata_scan_is_bounded(self, tmp_path):
+        # A PNG with a real IDAT pushing the late-scan window past 1 MB, then a
+        # tEXt chunk lying about its length. scan_head() -> _png_late_metadata().
+        idat = b"\x00" * (1024 * 1024 + 16)
+        text_header = struct.pack(">I", _HUGE) + b"tEXt"
+        blob = (
+            PNG_SIG
+            + struct.pack(">I", len(idat))
+            + b"IDAT"
+            + idat
+            + b"\x00\x00\x00\x00"  # fake CRC
+            + text_header
+            + b"AIGC short"
+        )
+        path = tmp_path / "evil_late.png"
+        path.write_bytes(blob)
+        tracemalloc.start()
+        try:
+            metadata.scan_head(path)
+            _, peak = tracemalloc.get_traced_memory()
+        finally:
+            tracemalloc.stop()
+        # head itself is ~1 MB; the clamp keeps the late read tiny. Generous cap.
+        assert peak < 50 * 1024 * 1024
+
+
+def _box(box_type: bytes, payload: bytes) -> bytes:
+    return struct.pack(">I", 8 + len(payload)) + box_type + payload
+
+
+class TestIsobmffStripFailSafe:
+    def test_well_formed_file_still_strips_uuid(self):
+        ftyp = _box(b"ftyp", b"isom\x00\x00\x00\x00mp42")
+        c2pa_box = _box(b"uuid", isobmff.C2PA_UUID + b"manifest-bytes")
+        mdat = _box(b"mdat", b"\x00" * 32)
+        data = ftyp + c2pa_box + mdat
+        cleaned, stripped = isobmff.strip_c2pa_boxes(data)
+        assert stripped == 1
+        assert len(cleaned) == len(data) - len(c2pa_box)
+        assert isobmff.C2PA_UUID not in cleaned
+
+    def test_malformed_box_does_not_truncate_tail(self):
+        ftyp = _box(b"ftyp", b"isom\x00\x00\x00\x00mp42")
+        c2pa_box = _box(b"uuid", isobmff.C2PA_UUID + b"manifest-bytes")
+        # A box claiming ~2 GiB before EOF: the walker returns early here.
+        bad_box = struct.pack(">I", _HUGE) + b"free" + b"\x00" * 16
+        data = ftyp + c2pa_box + bad_box
+        cleaned, stripped = isobmff.strip_c2pa_boxes(data)
+        # Fail-safe: input returned unchanged, nothing stripped, no truncation.
+        assert stripped == 0
+        assert cleaned == data
+        assert len(cleaned) == len(data)
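The length clamp these tests guard can be sketched as follows, assuming the parser knows the file size up front. `iter_png_chunks` is a hypothetical helper, not the code in metadata.py or noai/c2pa.py, and the `caBX` chunk type is taken from the commit description. Unlike the real parsers, which seek past chunks they skip, this sketch reads every payload.

```python
import io
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"


def iter_png_chunks(f, file_size):
    """Yield (chunk_type, payload), clamping each declared length to the file."""
    if f.read(8) != PNG_SIG:
        return
    while True:
        header = f.read(8)
        if len(header) < 8:
            return  # clean EOF or truncated chunk header
        (declared,) = struct.unpack(">I", header[:4])
        chunk_type = header[4:8]
        remaining = max(0, file_size - f.tell())
        length = min(declared, remaining)  # the clamp: never trust the header
        payload = f.read(length)
        yield chunk_type, payload
        if length < declared:
            return  # the chunk overstated its size; stop walking
        f.seek(4, io.SEEK_CUR)  # skip the CRC by seeking, not reading


evil = PNG_SIG + struct.pack(">I", 0x7FFFFFFF) + b"caBX" + b"jumbc2pa-not-really"
chunks = list(iter_png_chunks(io.BytesIO(evil), len(evil)))
```

With the clamp in place, the read request is bounded by the 19 bytes that actually follow the header, whatever the length field claims.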
+10
@@ -43,6 +43,16 @@ class TestTilePositions:
         # 1024 wide, 512 tile, no overlap -> two tiles butting at 512.
         assert tile_positions(1024, 512, 0) == [0, 512]
+
+    def test_overlap_equal_to_tile_raises(self):
+        # overlap == tile makes the stride denominator (tile - overlap) zero;
+        # reject up front instead of dividing by zero.
+        with pytest.raises(ValueError, match="overlap"):
+            tile_positions(2000, 512, 512)
+
+    def test_overlap_greater_than_tile_raises(self):
+        with pytest.raises(ValueError, match="overlap"):
+            tile_positions(2000, 512, 600)
 
 
 class TestMakeBlendWeight:
     def test_zero_overlap_is_all_ones(self):
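One way to satisfy the up-front validation these tests require is sketched below. `tile_positions_sketch` is a hypothetical reimplementation: the placement rule (fixed stride, last tile flush with the far edge) is an assumption and may differ from ctrlregen.tiling.tile_positions.

```python
from __future__ import annotations


def tile_positions_sketch(length: int, tile: int, overlap: int) -> list[int]:
    """Start offsets of tiles covering [0, length) with the given overlap."""
    if tile <= 0:
        raise ValueError(f"tile must be positive, got {tile}")
    if not 0 <= overlap < tile:
        # overlap >= tile would make the stride zero or negative.
        raise ValueError(f"overlap must be in [0, tile), got overlap={overlap}, tile={tile}")
    if length <= tile:
        return [0]
    stride = tile - overlap
    positions = list(range(0, length - tile, stride))
    positions.append(length - tile)  # final tile sits flush with the far edge
    return positions


two_tiles = tile_positions_sketch(1024, 512, 0)
```

Validating before any arithmetic turns a ZeroDivisionError or an endless loop into a ValueError that names the offending argument.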