mirror of
https://github.com/wiltodelta/remove-ai-watermarks.git
synced 2026-06-13 05:57:46 +02:00
fix: harden metadata parsers and engines; sync docs (full-repo review)
Apply fixes from a full-repo review (code, tests, docs).

Security / correctness:
- Clamp attacker-controlled PNG/caBX chunk lengths to the remaining file size
  in metadata.py and noai/c2pa.py (a malformed length no longer drives a
  multi-GB read); skipped chunks seek instead of read.
- noai/isobmff.strip_c2pa_boxes is now fail-safe on a malformed box: return the
  original bytes with a warning instead of silently truncating the tail, so
  metadata --remove can no longer emit a corrupt file.
- doubao_engine._fixed_alpha_map clamps the glyph box to the image (no crash on
  degenerate width-vs-height).
- watermark_remover._run_region_hires gates the phaseCorrelate offset on
  response and magnitude (a spurious shift no longer garbles text) and drops
  the generator after a CPU fallback (no MPS/CPU device mismatch).

Robustness:
- gemini_engine, doubao_engine, region_eraser normalize grayscale and RGBA
  inputs to BGR at the engine entry points.
- image_io.imwrite returns False on an unwritable path (matches cv2).
- invisible_engine guards a None imread result before use.
- trustmark_detector._decoder uses a double-checked threading lock.
- ctrlregen.tiling.tile_positions raises on overlap >= tile.
- humanizer chromatic shift no longer wraps opposite-edge pixels.
- identify OpenAI caveat keyed on the normalized vendor, not a substring.
- Remove the dead "visible --detect-threshold" CLI option.
- publish.yml verifies the release tag matches the package version.

Docs:
- README strength 0.05 to 0.10; .env.example HF_TOKEN marked optional;
  doubao_capture README updated to reverse-alpha-only; CLAUDE.md synced with
  the new behaviors and the batch command.

Tests:
- New test_security_clamp.py for the read clamp and isobmff fail-safe; erase
  CLI coverage; integrity-clash rule 2 end-to-end; multi-tag EXIF survival and
  cross-format strip guards; channel/size, tiling, humanizer, and imwrite
  regressions.

Full suite 493 passed, 2 skipped; ruff and pyright src/ clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -1,3 +1,3 @@
-# HuggingFace token (required for invisible watermark removal)
+# HuggingFace token (optional; only needed for gated/private models)
 # Get yours at: https://huggingface.co/settings/tokens
 HF_TOKEN=
@@ -17,6 +17,16 @@ jobs:

       - uses: astral-sh/setup-uv@v7

+      - name: Verify release tag matches package version
+        run: |
+          tag="${{ github.event.release.tag_name }}"
+          version="$(grep -m1 '^version = ' pyproject.toml | sed -E 's/^version = "([^"]+)"/\1/')"
+          if [ "${tag#v}" != "$version" ]; then
+            echo "Release tag '$tag' does not match pyproject.toml version '$version'" >&2
+            exit 1
+          fi
+          echo "Release tag '$tag' matches package version '$version'"
+
       - name: Build package
         run: uv build

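The workflow step above can be sketched in Python, for example to run the same check locally before tagging. This is a minimal sketch, not code from the repository: `read_project_version` and `tag_matches_version` are hypothetical names, and the regex assumes (as the `grep -m1` / `sed` pair does) that the first `version = "..."` line in `pyproject.toml` is the package version.

```python
import re

# First line of the form: version = "X.Y.Z" (mirrors grep -m1 '^version = ').
_VERSION_LINE = re.compile(r'^version = "([^"]+)"', re.MULTILINE)


def read_project_version(pyproject_text: str) -> str:
    """Return the first `version = "..."` value found in the text."""
    match = _VERSION_LINE.search(pyproject_text)
    if match is None:
        raise ValueError("no version line found in pyproject text")
    return match.group(1)


def tag_matches_version(tag: str, pyproject_text: str) -> bool:
    """Compare a release tag (optional leading 'v') with the package version."""
    # Same idea as the shell expansion ${tag#v}: strip one leading "v".
    return tag.removeprefix("v") == read_project_version(pyproject_text)
```

With a `pyproject.toml` declaring version `1.2.3`, both `v1.2.3` and `1.2.3` pass, while `v1.2.4` fails.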
@@ -98,11 +98,11 @@ The removal pipeline (default profile, SDXL):
 ```text
 image → encode to latent space (VAE) at native resolution
       → add controlled noise (forward diffusion)
-      → denoise (reverse diffusion, ~50 steps at strength 0.05)
+      → denoise (reverse diffusion, ~50 steps at strength 0.10)
       → decode back to pixels (VAE)
 ```

-By default the image is processed at its **native resolution** with no pre-downscale, matching the hosted raiw.cc backend (fal `fast-sdxl`, which is `stabilityai/stable-diffusion-xl-base-1.0` — the same checkpoint the CLI defaults to). At strength ~0.05 SDXL img2img does not need the input shrunk, and the old forced downscale-to-1024 then upscale-back round-trip was the main quality loss. Pass `--max-resolution N` to cap the long side only when a very large image runs out of GPU/MPS memory (it reintroduces that lossy round-trip).
+- Native resolution avoids shrinking the input to 1024 px first; that down-then-up round-trip was the main quality loss (issue #10). Use `--max-resolution N` only to cap GPU/MPS memory on very large inputs.

 SDXL is the default since May 2026: empirically defeats SynthID v2 on Gemini 3 Pro outputs, where the older SD-1.5 pipeline at 768 px did not. The SD-1.5 path was removed once it was verified not to handle v2. Note the scope: this defeats the SynthID *verifier*, which is not the same as being forensically indistinguishable from a real photo. Recent work ([arXiv:2605.09203](https://arxiv.org/abs/2605.09203)) shows watermark-removal pipelines leave detectable traces, so a separate "this image was processed" classifier can still flag the output.

@@ -1,5 +1,11 @@
 # Doubao visible watermark capture

+> **Status (completed 2026-05-29):** the capture described below was carried out (black + gray
+> Doubao captures) and the exact alpha map was solved. Removal is now **reverse-alpha only**: at the
+> captured native width recovery is pixel-exact and inpaint is OFF; a residual inpaint runs off-native
+> only. See the `doubao_engine.py` notes in the root `CLAUDE.md`. The text below is kept as the
+> historical capture plan.
+
 Goal: capture the Doubao "豆包AI生成" visible watermark over known flat backgrounds so we can
 build a per-pixel alpha map and a reverse-alpha-blend remover, the same way the Gemini sparkle
 engine works (`src/remove_ai_watermarks/gemini_engine.py`).
@@ -16,8 +22,10 @@ engine works (`src/remove_ai_watermarks/gemini_engine.py`).
 - Size **scales with resolution**. Third-party numbers (~90x18 at <=1024, ~180x40 at >1024) are
   approximate and calibrated for ~1024-1280 outputs; at 2048 the strip is much larger. A shipped
   third-party alpha map is only 120x20, too small for our 2K/4K target -> capture fresh.
-- In practice clean inversion leaves residue on textured backgrounds, so the remover pairs the alpha
-  map with inpainting (our Gemini engine already does gradient-masked inpainting for residual edges).
+- The planning assumption was that clean inversion leaves residue on textured backgrounds, so the
+  remover would pair the alpha map with inpainting. After the capture this turned out unnecessary at
+  the native width (recovery is pixel-exact there and inpaint is off); the shipped remover is
+  reverse-alpha only, with a residual inpaint applied off-native only.

 ## Use doubao.com specifically

@@ -160,7 +160,6 @@ def main(ctx: click.Context, verbose: bool) -> None:
 )
 @click.option("--inpaint-strength", type=float, default=0.85, help="Inpainting blend strength (0.0-1.0).")
 @click.option("--detect/--no-detect", default=True, help="Detect watermark before removal.")
-@click.option("--detect-threshold", type=float, default=0.25, help="Detection confidence threshold.")
 @click.option(
     "--mark",
     type=click.Choice(["auto", *watermark_registry.mark_keys()]),
@@ -178,7 +177,6 @@ def cmd_visible(
     inpaint_method: Literal["ns", "telea", "gaussian"],
     inpaint_strength: float,
     detect: bool,
-    detect_threshold: float,
     mark: str,
     strip_metadata: bool,
 ) -> None:
@@ -222,7 +222,14 @@ class DoubaoEngine:
         """
         h, w = image.shape[:2]
         x, y, bw, bh = loc.bbox
-        roi = image[y : y + bh, x : x + bw].astype(np.float32)
+        # Normalize the ROI to 3-channel BGR: a 2D grayscale or 4-channel BGRA
+        # input would otherwise break the axis=2 channel reductions below.
+        roi = image[y : y + bh, x : x + bw]
+        if roi.ndim == 2:
+            roi = cv2.cvtColor(roi, cv2.COLOR_GRAY2BGR)
+        elif roi.shape[2] == 4:
+            roi = cv2.cvtColor(roi, cv2.COLOR_BGRA2BGR)
+        roi = roi.astype(np.float32)

         luma = roi.mean(axis=2)
         sat = roi.max(axis=2) - roi.min(axis=2)
@@ -290,7 +297,12 @@ class DoubaoEngine:
         if at is None:
             return None
         h, w = image.shape[:2]
-        gw, gh = max(1, int(_ALPHA_WIDTH_FRAC * w)), max(1, int(_ALPHA_HEIGHT_FRAC * w))
+        # Glyph box scales with WIDTH; on a wide/short image the height-from-width
+        # box can exceed the image height. Clamp both dims so the slice assignment
+        # below cannot overflow (a degenerate 2048x1 input otherwise raised
+        # ValueError on the broadcast). Normal images are unaffected.
+        gw = min(w, max(1, int(_ALPHA_WIDTH_FRAC * w)))
+        gh = min(h, max(1, int(_ALPHA_HEIGHT_FRAC * w)))
         ax = max(0, w - int(_ALPHA_MARGIN_RIGHT_FRAC * w) - gw)
         ay = max(0, h - int(_ALPHA_MARGIN_BOTTOM_FRAC * w) - gh)
         amap = np.zeros((h, w), np.float32)
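The clamp in the hunk above can be illustrated in isolation. This is a minimal sketch: `clamped_glyph_box` is a hypothetical helper, and the fractions `0.10` and `0.02` used in the usage note are made-up stand-ins, since the real values of `_ALPHA_WIDTH_FRAC` and `_ALPHA_HEIGHT_FRAC` are not shown in the diff.

```python
def clamped_glyph_box(w: int, h: int, width_frac: float, height_frac: float) -> tuple[int, int]:
    """Return (gw, gh): a glyph box scaled from the image WIDTH, clamped to the image.

    Both dimensions are derived from `w`, so on a wide/short image the
    unclamped height can exceed `h`; min(h, ...) prevents that.
    """
    gw = min(w, max(1, int(width_frac * w)))
    gh = min(h, max(1, int(height_frac * w)))  # height also scales with width
    return gw, gh
```

With the illustrative fractions, a 2048x2048 image gets a 204x40 box, while a degenerate 2048x1 image gets 204x1 instead of a box taller than the image.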
@@ -353,6 +365,12 @@ class DoubaoEngine:
         inpaint there costs nothing and reliably clears the mark).
         Call only when :meth:`reverse_alpha_available` and the mark is detected.
         """
+        # Normalize to 3-channel BGR so a 2D grayscale or 4-channel BGRA input
+        # does not break the reverse-alpha math (which assumes a 3-channel logo).
+        if image.ndim == 2:
+            image = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
+        elif image.shape[2] == 4:
+            image = cv2.cvtColor(image, cv2.COLOR_BGRA2BGR)
         at_native = abs(image.shape[1] / _ALPHA_NATIVE_WIDTH - 1.0) <= _ALPHA_NATIVE_BAND
         if at_native:
             amap = self._fixed_alpha_map(image)
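The entry-point normalization above amounts to mapping every accepted layout to an `(H, W, 3)` array. The sketch below shows the same shape contract with NumPy only; it is a stand-in for the `cv2.cvtColor` calls in the diff (gray is replicated into three channels, alpha is dropped), and `to_three_channels` is a hypothetical name.

```python
import numpy as np


def to_three_channels(image: np.ndarray) -> np.ndarray:
    """Return an (H, W, 3) array from 2D gray, (H, W, 1), BGR, or BGRA input."""
    if image.ndim == 2:
        # 2D grayscale has no channel axis: replicate it so B = G = R.
        return np.stack([image, image, image], axis=2)
    if image.shape[2] == 1:
        return np.repeat(image, 3, axis=2)
    if image.shape[2] == 4:
        # BGRA: keep the color planes, drop alpha.
        return image[:, :, :3].copy()
    return image
```

Checking `ndim` before `shape[2]` matters: indexing `shape[2]` on a 2D array raises `IndexError`, which is the kind of crash the diff guards against.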
@@ -329,16 +329,21 @@ class GeminiEngine:
         """
         result = image.copy()

-        # Handle alpha channel
-        if result.shape[2] == 4:
+        # Normalize to 3-channel BGR up front: 2D grayscale (no channel axis) and
+        # 4-channel BGRA both reach this public entry point and would otherwise
+        # crash on the channel-count checks / downstream 3-channel math.
+        if result.ndim == 2:
+            result = cv2.cvtColor(result, cv2.COLOR_GRAY2BGR)
+        elif result.shape[2] == 4:
             result = cv2.cvtColor(result, cv2.COLOR_BGRA2BGR)
         elif result.shape[2] == 1:
             result = cv2.cvtColor(result, cv2.COLOR_GRAY2BGR)

         size = force_size or get_watermark_size(result.shape[1], result.shape[0])

-        # Detect dynamic position & size
-        detection = self.detect_watermark(image, force_size=size)
+        # Detect dynamic position & size (on the normalized 3-channel image so a
+        # grayscale/BGRA input does not crash the detector).
+        detection = self.detect_watermark(result, force_size=size)

         if not detection.detected:
             logger.debug(
@@ -36,10 +36,14 @@ def apply_analog_humanizer(image: NDArray, grain_intensity: float = 4.0, chromat
     b, g, r = cv2.split(image)

     # 1. Chromatic Aberration
-    # Shift R channel left, B channel right
+    # Shift R channel left, B channel right. np.roll is circular, so it wraps
+    # the opposite edge into a thin colored fringe at the L/R borders; replicate
+    # the original edge columns there to keep the intended offset interior-only.
     if chromatic_shift > 0:
         r = np.roll(r, -chromatic_shift, axis=1)
+        r[:, -chromatic_shift:] = r[:, -chromatic_shift - 1 : -chromatic_shift]
         b = np.roll(b, chromatic_shift, axis=1)
+        b[:, :chromatic_shift] = b[:, chromatic_shift : chromatic_shift + 1]

     merged = cv2.merge((b, g, r))

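The edge-replication trick in this hunk is easiest to see on a single row. The sketch below wraps it in a hypothetical helper, `shift_with_edge_replication`, and assumes `0 < abs(shift) < width`; it is an illustration of the technique, not the project's function.

```python
import numpy as np


def shift_with_edge_replication(channel: np.ndarray, shift: int) -> np.ndarray:
    """Shift columns by `shift` (positive = right, negative = left) without wrap-around."""
    out = np.roll(channel, shift, axis=1)  # circular shift; returns a new array
    if shift > 0:
        # Columns [0, shift) now hold wrapped right-edge pixels.
        # Column `shift` holds the original first column: replicate it.
        out[:, :shift] = out[:, shift : shift + 1]
    elif shift < 0:
        # The last |shift| columns hold wrapped left-edge pixels.
        # Column `shift - 1` holds the original last column: replicate it.
        out[:, shift:] = out[:, shift - 1 : shift]
    return out


row = np.arange(8, dtype=np.uint8).reshape(1, 8)
print(shift_with_edge_replication(row, -2))  # [[2 3 4 5 6 7 7 7]]
print(shift_with_edge_replication(row, 2))   # [[0 0 0 1 2 3 4 5]]
```

A plain `np.roll(row, -2, axis=1)` would end in `0 1`, which is the opposite-edge fringe the commit removes.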
@@ -431,7 +431,7 @@ def identify(image_path: Path, *, check_visible: bool = True, check_invisible: b
     if synthid:
         watermarks.append(f"SynthID pixel watermark ({synthid})")
         caveats.append(_SYNTHID_CAVEAT)
-        if "OpenAI" in (" ".join(issuers) + synthid):
+        if _vendor_of(synthid) == "OpenAI":
             caveats.append(_OPENAI_CAVEAT)
         if v := _vendor_of(synthid):
             ai_vendor_claims["synthid"] = v
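The difference between the two conditions can be sketched as follows. Everything here is illustrative: the real `_vendor_of` is not shown in the diff, so `vendor_of`, `KNOWN_VENDORS`, and the verdict format `"likely present (<Vendor>)"` are assumptions chosen only to contrast a substring test with a normalized-vendor test.

```python
from typing import Optional

KNOWN_VENDORS = ("Google", "OpenAI")  # hypothetical vendor list


def vendor_of(verdict: str) -> Optional[str]:
    """Hypothetical normalizer: the vendor named right after the opening parenthesis."""
    _, sep, rest = verdict.partition("(")
    if not sep:
        return None
    for vendor in KNOWN_VENDORS:
        if rest.startswith(vendor):
            return vendor
    return None


def caveat_by_substring(issuers: list, verdict: str) -> bool:
    """Old behavior: any 'OpenAI' text in the issuer blob or verdict triggers the caveat."""
    return "OpenAI" in (" ".join(issuers) + verdict)


def caveat_by_vendor(verdict: str) -> bool:
    """New behavior: only a verdict whose normalized vendor is OpenAI triggers it."""
    return vendor_of(verdict) == "OpenAI"
```

For a Google verdict whose issuer list happens to mention "OpenAI", the substring test fires and the vendor test does not, which is the mislabeling the commit fixes.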
@@ -54,7 +54,8 @@ def imwrite(path: str | Path, img: NDArray[Any]) -> bool:

     The output format is taken from the path extension (e.g. ``.png``), exactly
     like ``cv2.imwrite``. Returns ``True`` on success, ``False`` if the codec
-    rejects the image.
+    rejects the image or the path cannot be written (matching ``cv2.imwrite``,
+    which returns ``False`` rather than raising on an unwritable path).
     """
     import cv2

@@ -62,5 +63,8 @@ def imwrite(path: str | Path, img: NDArray[Any]) -> bool:
     ok, buf = cv2.imencode(ext, img)
     if not ok:
         return False
-    buf.tofile(str(path))
+    try:
+        buf.tofile(str(path))
+    except OSError:
+        return False
     return True
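The "return `False` instead of raising" contract can be sketched without OpenCV. The helper below is hypothetical (`write_encoded` is not part of the project) and writes already-encoded bytes with the standard library; only the error-handling shape matches the diff.

```python
from pathlib import Path


def write_encoded(path: Path, payload: bytes) -> bool:
    """Write already-encoded image bytes; report failure as False rather than raising."""
    try:
        path.write_bytes(payload)
    except OSError:
        # Covers a missing parent directory, permission errors, and similar.
        return False
    return True
```

Callers that already check a boolean result keep working when the destination directory does not exist or is read-only.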
@@ -73,7 +73,7 @@ class InvisibleEngine:
     """

     # SDXL base is the default since May 2026: empirically defeats SynthID v2
-    # at strength=0.05 / steps=50 / native ~1024px. See CLAUDE.md "Known
+    # at strength=0.10 / steps=50 / native ~1024px. See CLAUDE.md "Known
     # limitations" for the regression evidence ruling out SD-1.5 pipelines.
     DEFAULT_MODEL_ID = "stabilityai/stable-diffusion-xl-base-1.0"
     CTRLREGEN_MODEL_ID = "yepengliu/ctrlregen"
@@ -227,6 +227,8 @@ class InvisibleEngine:
         from remove_ai_watermarks import image_io

         out_cv = image_io.imread(out_path, cv2.IMREAD_COLOR)
+        if out_cv is None:
+            return out_path

         if protect_faces and original_faces:
             if self._progress_callback:
@@ -190,6 +190,8 @@ def _png_late_metadata(image_path: Path, window: int) -> bytes:
         with open(image_path, "rb") as f:
             if f.read(8) != b"\x89PNG\r\n\x1a\n":
                 return b""
+            f.seek(0, 2)
+            file_size = f.tell()
             pos = 8
             while True:
                 f.seek(pos)
@@ -201,9 +203,12 @@ def _png_late_metadata(image_path: Path, window: int) -> bytes:
                 if chunk_type == b"IEND":
                     break
                 data_start = pos + 8
+                # Clamp the attacker-controlled 32-bit length to the bytes that
+                # actually remain, so a malformed huge length can't allocate GBs.
+                safe_length = max(0, min(length, file_size - data_start))
                 if chunk_type in _PNG_META_CHUNKS and data_start >= window:
                     f.seek(data_start)
-                    out += f.read(length)
+                    out += f.read(safe_length)
                 pos = data_start + length + 4  # data + CRC
     except OSError as exc:
         logger.debug("PNG late-metadata scan failed on %s: %s", image_path, exc)
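The clamp can be exercised on an in-memory PNG-like byte string. This is a minimal sketch under stated assumptions: `png_chunk` and `read_chunks_clamped` are hypothetical helpers (not the project's functions), the walker collects every chunk rather than only metadata chunks, and it does not verify CRCs.

```python
import io
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(ctype: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: 4-byte length, type, data, CRC over type + data."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


def read_chunks_clamped(blob: bytes) -> list:
    """Walk PNG chunks, never asking for more bytes than actually remain."""
    f = io.BytesIO(blob)
    if f.read(8) != PNG_SIGNATURE:
        return []
    file_size = len(blob)
    chunks = []
    while True:
        header = f.read(8)
        if len(header) < 8:
            break
        length = struct.unpack(">I", header[:4])[0]
        ctype = header[4:8]
        # The declared length is untrusted: clamp it to the remaining bytes.
        safe_length = max(0, min(length, file_size - f.tell()))
        chunks.append((ctype, f.read(safe_length)))
        f.seek(4, 1)  # skip the CRC by seeking, not reading
        if ctype == b"IEND":
            break
    return chunks
```

A chunk header that declares `0xFFFFFFFF` bytes but is followed by only two bytes yields a two-byte payload, and the walk then ends at EOF instead of requesting a 4 GiB read.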
@@ -55,6 +55,9 @@ def has_c2pa_metadata(image_path: Path) -> bool:
             if signature != PNG_SIGNATURE:
                 return False

+            file_size = f.seek(0, 2)
+            f.seek(8)
+
             while True:
                 chunk_header = f.read(8)
                 if len(chunk_header) < 8:
@@ -62,9 +65,12 @@ def has_c2pa_metadata(image_path: Path) -> bool:

                 length = struct.unpack(">I", chunk_header[:4])[0]
                 chunk_type = chunk_header[4:8]
+                # Clamp the attacker-controlled 32-bit length to the bytes that
+                # actually remain, so a malformed huge length can't allocate GBs.
+                safe_length = max(0, min(length, file_size - f.tell()))

                 if chunk_type == C2PA_CHUNK_TYPE:
-                    chunk_data = f.read(length)
+                    chunk_data = f.read(safe_length)
                     # Check for any C2PA signature
                     for sig in C2PA_SIGNATURES:
                         if sig in chunk_data:
@@ -74,7 +80,7 @@ def has_c2pa_metadata(image_path: Path) -> bool:
                         return True
                     f.read(4)
                 else:
-                    f.read(length + 4)
+                    f.seek(safe_length + 4, 1)

                 if chunk_type == b"IEND":
                     break
@@ -108,6 +114,9 @@ def extract_c2pa_info(image_path: Path) -> dict[str, Any]:
             if signature != PNG_SIGNATURE:
                 return c2pa_info

+            file_size = f.seek(0, 2)
+            f.seek(8)
+
             while True:
                 chunk_header = f.read(8)
                 if len(chunk_header) < 8:
@@ -115,13 +124,16 @@ def extract_c2pa_info(image_path: Path) -> dict[str, Any]:

                 length = struct.unpack(">I", chunk_header[:4])[0]
                 chunk_type = chunk_header[4:8]
+                # Clamp the attacker-controlled 32-bit length to the bytes that
+                # actually remain, so a malformed huge length can't allocate GBs.
+                safe_length = max(0, min(length, file_size - f.tell()))

                 if chunk_type == C2PA_CHUNK_TYPE:
-                    chunk_data = f.read(length)
+                    chunk_data = f.read(safe_length)
                     _parse_c2pa_chunk(chunk_data, c2pa_info)
                     f.read(4)
                 else:
-                    f.read(length + 4)
+                    f.seek(safe_length + 4, 1)

                 if chunk_type == b"IEND":
                     break
@@ -278,6 +290,9 @@ def extract_c2pa_chunk(image_path: Path) -> bytes | None:
             if signature != PNG_SIGNATURE:
                 return None

+            file_size = f.seek(0, 2)
+            f.seek(8)
+
             while True:
                 chunk_header = f.read(8)
                 if len(chunk_header) < 8:
@@ -285,9 +300,12 @@ def extract_c2pa_chunk(image_path: Path) -> bytes | None:

                 length = struct.unpack(">I", chunk_header[:4])[0]
                 chunk_type = chunk_header[4:8]
+                # Clamp the attacker-controlled 32-bit length to the bytes that
+                # actually remain, so a malformed huge length can't allocate GBs.
+                safe_length = max(0, min(length, file_size - f.tell()))

                 if chunk_type == C2PA_CHUNK_TYPE:
-                    chunk_data = f.read(length)
+                    chunk_data = f.read(safe_length)
                     crc = f.read(4)

                     # Check for any C2PA signature
@@ -299,7 +317,7 @@ def extract_c2pa_chunk(image_path: Path) -> bytes | None:
                     if b"jumb" in chunk_data.lower() or b"c2pa" in chunk_data.lower():
                         return chunk_header + chunk_data + crc
                 else:
-                    f.read(length + 4)
+                    f.seek(safe_length + 4, 1)

                 if chunk_type == b"IEND":
                     break
@@ -20,6 +20,8 @@ from PIL import Image

 def tile_positions(total: int, tile: int, overlap: int) -> list[int]:
     """Compute evenly-spaced tile start positions covering *total* pixels."""
+    if not (0 <= overlap < tile):
+        raise ValueError(f"overlap must satisfy 0 <= overlap < tile (got overlap={overlap}, tile={tile})")
     if total <= tile:
         return [0]
     n = max(2, math.ceil((total - overlap) / (tile - overlap)))

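The guard matters because the tile count divides by `tile - overlap`: with `overlap == tile` that is a division by zero, and with `overlap > tile` the stride is negative. The sketch below reproduces the lines visible in the hunk and then adds one possible way to space the starts. That final spacing step is an assumption, since the rest of the function is not shown in the diff, and the name `tile_positions_sketch` is hypothetical.

```python
import math


def tile_positions_sketch(total: int, tile: int, overlap: int) -> list:
    """Evenly spaced tile starts covering `total` pixels (illustrative completion)."""
    if not (0 <= overlap < tile):
        raise ValueError(f"overlap must satisfy 0 <= overlap < tile (got overlap={overlap}, tile={tile})")
    if total <= tile:
        return [0]
    n = max(2, math.ceil((total - overlap) / (tile - overlap)))
    # ASSUMPTION: spread n starts evenly between 0 and the last valid start.
    last_start = total - tile
    return [round(i * last_start / (n - 1)) for i in range(n)]
```

For `total=100, tile=40, overlap=10` this gives three tiles starting at 0, 30, and 60, each overlapping its neighbor by 10 pixels.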
@@ -17,6 +17,7 @@ Reference: ISO/IEC 14496-12 (ISOBMFF) and C2PA 2.1 spec §11.

 from __future__ import annotations

+import logging
 import re
 import struct
 from typing import TYPE_CHECKING
@@ -32,6 +33,8 @@ from remove_ai_watermarks.metadata import (
     IPTC_AI_MARKERS,
 )

+log = logging.getLogger(__name__)
+
 # Top-level box types that may carry AI provenance. ``uuid`` boxes are checked
 # against ``C2PA_UUID`` / AI-label markers before being stripped; ``jumb`` boxes
 # are always stripped (JPEG-XL uses them exclusively for JUMBF).
@@ -126,6 +129,8 @@ def scan_c2pa_region(path: str | Path, *, max_total: int = 4 * 1024 * 1024) -> b
             else:
                 size = size32
             if size < (payload_off - pos) or pos + size > file_size:
+                # Detection-only: a malformed box halts the walk, so a manifest
+                # placed after it is missed (best-effort scan; no resync).
                 break
             if box_type in C2PA_BOX_TYPES:
                 f.seek(payload_off)
@@ -162,7 +167,9 @@ def strip_c2pa_boxes(data: bytes) -> tuple[bytes, int]:

     out = bytearray()
     stripped = 0
+    consumed = 0
     for start, end, box_type, payload_off in _iter_top_level_boxes(data):
+        consumed = end
         if box_type == b"uuid":
             # uuid boxes carry the 16-byte UUID immediately after the type.
             is_c2pa = payload_off + 16 <= end and data[payload_off : payload_off + 16] == C2PA_UUID
@@ -174,6 +181,20 @@ def strip_c2pa_boxes(data: bytes) -> tuple[bytes, int]:
             stripped += 1
             continue
         out.extend(data[start:end])
+
+    # Fail-safe: the walker returns early on a malformed box (bad size, or a box
+    # that runs past EOF), so anything after it was never visited. Emitting `out`
+    # would silently truncate the file from the bad box to EOF -- worse than not
+    # stripping. If the walk did not consume the whole input, return it unchanged.
+    if consumed != len(data):
+        log.warning(
+            "ISOBMFF box walk stopped at offset %d of %d (malformed box); "
+            "returning input unchanged to avoid truncation",
+            consumed,
+            len(data),
+        )
+        return data, 0
+
     return bytes(out), stripped


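The fail-safe can be sketched with a deliberately simplified box walker. The assumptions are stated loudly: `iter_boxes` and `strip_boxes` are hypothetical helpers, the walker only understands 32-bit sizes (no `size == 1` 64-bit form, no `size == 0` "to end of file" form), and it strips by box type only, without the UUID check the real code performs.

```python
import struct


def iter_boxes(data: bytes):
    """Yield (start, end, box_type) for well-formed top-level boxes; stop at the first bad one."""
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack(">I4s", data[pos : pos + 8])
        if size < 8 or pos + size > len(data):
            return  # malformed: size smaller than the header, or box runs past EOF
        yield pos, pos + size, box_type
        pos += size


def strip_boxes(data: bytes, drop: frozenset = frozenset({b"jumb"})) -> tuple:
    """Remove boxes whose type is in `drop`; return the input unchanged if the walk is incomplete."""
    out = bytearray()
    stripped = 0
    consumed = 0
    for start, end, box_type in iter_boxes(data):
        consumed = end
        if box_type in drop:
            stripped += 1
            continue
        out.extend(data[start:end])
    if consumed != len(data):
        # The walk stopped early: emitting `out` would truncate the tail.
        return data, 0
    return bytes(out), stripped
```

Tracking `consumed` is what distinguishes "every byte was visited" from "the walker gave up part-way", and only the first case is safe to rewrite.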
@@ -272,6 +272,12 @@ def _make_seed_generator(device: str, seed: int) -> Any:
     return torch.Generator().manual_seed(seed)  # type: ignore


+def _generator_device(generator: Any) -> str:
+    """Best-effort device type of a ``torch.Generator`` (e.g. ``"cpu"``, ``"mps"``)."""
+    device = getattr(generator, "device", None)
+    return getattr(device, "type", str(device)) if device is not None else "cpu"
+
+
 # Keep legacy name available for backwards compatibility
 _detect_model_profile_from_id = detect_model_profile

@@ -677,6 +683,14 @@ class WatermarkRemover:

         base = self._run_img2img(init_image, strength, num_inference_steps, guidance_scale, generator)

+        # The base pass may have fallen back from MPS to CPU (it flips
+        # self.device). The generator was built for the original device, and
+        # diffusers rejects a device-mismatched generator ("Expected a 'cpu'
+        # device generator but found 'mps'"), so drop it for the per-region
+        # passes -- they then seed from the global RNG, which is fine here.
+        if generator is not None and self.device == "cpu" and _generator_device(generator) != "cpu":
+            generator = None
+
         bgr = cv2.cvtColor(np.array(init_image), cv2.COLOR_RGB2BGR)
         try:
             boxes = text_protector.TextProtector().detect_text_boxes(bgr)
@@ -718,8 +732,13 @@ class WatermarkRemover:
             # the composite even though the text is crisp.
             cg = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY).astype(np.float32)
             dg = cv2.cvtColor(down, cv2.COLOR_BGR2GRAY).astype(np.float32)
-            (sx, sy), _resp = cv2.phaseCorrelate(cg, dg)
-            if abs(sx) > 0.1 or abs(sy) > 0.1:
+            (sx, sy), resp = cv2.phaseCorrelate(cg, dg)
+            # Only correct for the real 1-2px round-trip shift. On a near-flat /
+            # low-contrast crop phaseCorrelate returns a spurious large offset at
+            # a tiny response (e.g. (19,19) at resp ~0.005); warping by that
+            # garbles the composite -- the exact failure this was meant to
+            # prevent. Gate on both a confident response and a plausible offset.
+            if resp > 0.3 and abs(sx) < 4 and abs(sy) < 4 and (abs(sx) > 0.1 or abs(sy) > 0.1):
                 m = np.float32([[1, 0, -sx], [0, 1, -sy]])
                 down = cv2.warpAffine(down, m, (w, h), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REPLICATE)
             out_bgr = text_protector.feather_paste(out_bgr, down, x, y)
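The new gate on the `phaseCorrelate` result is a pure predicate, so it can be pulled out and checked on its own. In the sketch below, `should_correct_shift` is a hypothetical helper name; the default thresholds (0.3, 4, 0.1) are the ones in the diff.

```python
def should_correct_shift(
    sx: float,
    sy: float,
    resp: float,
    min_resp: float = 0.3,
    max_shift: float = 4.0,
    min_shift: float = 0.1,
) -> bool:
    """Decide whether a measured (sx, sy) offset is trustworthy enough to warp by."""
    confident = resp > min_resp                                # correlation peak is strong
    plausible = abs(sx) < max_shift and abs(sy) < max_shift    # small round-trip shift only
    nontrivial = abs(sx) > min_shift or abs(sy) > min_shift    # worth correcting at all
    return confident and plausible and nontrivial
```

The case named in the diff comment, an offset of (19, 19) at a response near 0.005, is rejected twice over: the response is too low and the offset is too large.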
@@ -67,8 +67,16 @@ def erase_cv2(
     method: Literal["telea", "ns"] = "telea",
     radius: int = 6,
 ) -> NDArray[Any]:
-    """Inpaint ``mask`` with classical cv2 inpainting (CPU, no extra deps)."""
+    """Inpaint ``mask`` with classical cv2 inpainting (CPU, no extra deps).
+
+    Accepts 1-/3-channel BGR (passed straight to ``cv2.inpaint``) and 4-channel
+    BGRA: ``cv2.inpaint`` rejects 4 channels, so the alpha plane is split off,
+    the BGR is inpainted, and alpha is re-attached unchanged.
+    """
     flag = cv2.INPAINT_TELEA if method == "telea" else cv2.INPAINT_NS
+    if image_bgr.ndim == 3 and image_bgr.shape[2] == 4:
+        bgr = cv2.inpaint(image_bgr[:, :, :3], mask, radius, flag)
+        return np.dstack([bgr, image_bgr[:, :, 3]])
     return cv2.inpaint(image_bgr, mask, radius, flag)


@@ -22,6 +22,7 @@ signal, not proof of AI origin.
 from __future__ import annotations

 import logging
+import threading
 from typing import TYPE_CHECKING, Any

 if TYPE_CHECKING:
@@ -32,7 +33,9 @@ log = logging.getLogger(__name__)
 # Adobe ships Variant P in production (com.adobe.trustmark.P).
 _MODEL_TYPE = "P"
 # Lazily constructed singleton -- model load + first-use download is expensive.
+# Guarded by a lock so concurrent callers don't double-construct/double-download.
 _tm: Any = None
+_tm_lock = threading.Lock()


 def is_available() -> bool:
@@ -45,9 +48,11 @@ def is_available() -> bool:
 def _decoder() -> Any:
     global _tm
     if _tm is None:
-        from trustmark import TrustMark
+        with _tm_lock:
+            if _tm is None:
+                from trustmark import TrustMark

-        _tm = TrustMark(verbose=False, model_type=_MODEL_TYPE)
+                _tm = TrustMark(verbose=False, model_type=_MODEL_TYPE)
     return _tm


@@ -541,3 +541,78 @@ class TestGpuHintMarkup:
         with patch("remove_ai_watermarks.invisible_engine.is_available", return_value=False):
             result = runner.invoke(main, ["all", str(sample_png)])
         assert "remove-ai-watermarks[gpu]" in result.output
+
+
+class TestEraseCommand:
+    """Tests for the 'erase' universal region eraser subcommand."""
+
+    def test_erase_help(self, runner):
+        result = runner.invoke(main, ["erase", "--help"])
+        assert result.exit_code == 0
+        assert "--region" in result.output
+        assert "--backend" in result.output
+
+    def test_erase_single_region(self, runner, sample_png, tmp_path):
+        output = tmp_path / "erased.png"
+        result = runner.invoke(
+            main,
+            ["erase", str(sample_png), "--region", "10,10,40,40", "-o", str(output)],
+        )
+        assert result.exit_code == 0, result.output
+        assert output.exists()
+
+    def test_erase_two_regions(self, runner, sample_png, tmp_path):
+        output = tmp_path / "erased2.png"
+        result = runner.invoke(
+            main,
+            [
+                "erase",
+                str(sample_png),
+                "--region",
+                "10,10,30,30",
+                "--region",
+                "120,120,30,30",
+                "-o",
+                str(output),
+            ],
+        )
+        assert result.exit_code == 0, result.output
+        assert output.exists()
+        # The banner reports the region count it processed.
+        assert "2 region(s)" in result.output
+
+    def test_erase_default_output_name(self, runner, sample_png):
+        result = runner.invoke(main, ["erase", str(sample_png), "--region", "10,10,40,40"])
+        assert result.exit_code == 0, result.output
+        assert sample_png.with_stem(sample_png.stem + "_clean").exists()
+
+    def test_erase_malformed_region_exits_nonzero(self, runner, sample_png, tmp_path):
+        output = tmp_path / "x.png"
+        # Only three values: click.BadParameter -> non-zero exit, no output file.
+        result = runner.invoke(
+            main,
+            ["erase", str(sample_png), "--region", "1,2,3", "-o", str(output)],
+        )
+        assert result.exit_code != 0
+        assert not output.exists()
+
+    def test_erase_nonexistent_file(self, runner):
+        result = runner.invoke(main, ["erase", "/nonexistent/file.png", "--region", "0,0,10,10"])
+        assert result.exit_code != 0
+
+    def test_erase_lama_backend_without_onnxruntime(self, runner, sample_png, tmp_path):
+        # The LaMa backend needs onnxruntime; without it the CLI must surface a
+        # clear error and exit non-zero rather than crash. When onnxruntime IS
+        # installed there is no missing-dep path to exercise, so skip.
+        from remove_ai_watermarks.region_eraser import lama_available
+
+        if lama_available():
+            pytest.skip("onnxruntime installed; missing-dep error path not reachable")
+        output = tmp_path / "y.png"
+        result = runner.invoke(
+            main,
+            ["erase", str(sample_png), "--region", "10,10,40,40", "--backend", "lama", "-o", str(output)],
+        )
+        assert result.exit_code != 0
+        assert "onnxruntime" in result.output.lower()
+        assert not output.exists()
@@ -161,3 +161,28 @@ class TestReverseAlpha:
         assert float(np.abs(wm.astype(np.float32)[mark] - 100.0).mean()) > 15  # mark visible
         out = eng.remove_watermark_reverse_alpha(wm).astype(np.float32)
         assert float(np.abs(out[mark] - 100.0).mean()) < max_err
+
+
+class TestDegenerateAndChannelInputs:
+    """Removal must not crash on degenerate sizes or non-3-channel inputs."""
+
+    @pytest.mark.parametrize(("w", "h"), [(2048, 1), (1, 2048), (2048, 8)])
+    def test_wide_short_does_not_raise(self, w, h):
+        """A wide/short image at native width makes the width-derived glyph box
+        taller than the image; the slice assignment must not ValueError."""
+        eng = DoubaoEngine()
+        img = np.zeros((h, w, 3), np.uint8)
+        out = eng.remove_watermark_reverse_alpha(img)
+        assert out.shape == img.shape
+
+    def test_grayscale_2d_does_not_raise(self):
+        eng = DoubaoEngine()
+        gray = np.zeros((2048, 2048), np.uint8)
+        out = eng.remove_watermark_reverse_alpha(gray)
+        assert out.shape == (2048, 2048, 3)
+
+    def test_bgra_4channel_does_not_raise(self):
+        eng = DoubaoEngine()
+        bgra = np.zeros((2048, 2048, 4), np.uint8)
+        out = eng.remove_watermark_reverse_alpha(bgra)
+        assert out.shape == (2048, 2048, 3)
@@ -50,3 +50,23 @@ def test_invalid_shape():
     img[0, 0] = 50
     result = apply_analog_humanizer(img)
     assert np.array_equal(img, result)
+
+
+def test_chromatic_shift_does_not_wrap_opposite_edge():
+    # On a horizontal gradient (dark left, bright right), a circular np.roll
+    # would wrap the bright right edge into the R channel's left border and the
+    # dark left edge into the B channel's right border, producing a colored
+    # fringe. After the fix the border columns must replicate their own edge.
+    ramp = np.linspace(0, 255, 64, dtype=np.uint8)
+    gray = np.broadcast_to(ramp, (32, 64))
+    img = np.stack([gray, gray, gray], axis=2).copy()  # B, G, R
+
+    shift = 3
+    result = apply_analog_humanizer(img, grain_intensity=0.0, chromatic_shift=shift)
+
+    # B (index 0) rolled right -> its left border must stay dark (near 0),
+    # NOT wrap the bright right edge.
+    assert result[:, :shift, 0].max() < 60
+    # R (index 2) rolled left -> its right border must stay bright (near 255),
+    # NOT wrap the dark left edge.
+    assert result[:, -shift:, 2].min() > 195
@@ -389,6 +389,56 @@ class TestIdentifyCaveats:
        assert len(r.caveats) == len(set(r.caveats))


class TestOpenAiCaveatVendorScoped:
    """The OpenAI rollout caveat keys on the normalized SynthID vendor, not a raw
    "OpenAI" substring over the issuer + verdict blob -- so a Google-SynthID
    manifest with an incidental "OpenAI" byte elsewhere is not mislabeled, while
    a genuine OpenAI manifest still gets the hedge.
    """

    @staticmethod
    def _png_chunk(ctype: bytes, data: bytes) -> bytes:
        import struct
        import zlib

        return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", zlib.crc32(ctype + data) & 0xFFFFFFFF)

    def _png(self, tmp_path: Path, name: str, *extra: bytes) -> Path:
        import struct
        import zlib

        ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 6, 0, 0, 0)
        body = (
            b"\x89PNG\r\n\x1a\n"
            + self._png_chunk(b"IHDR", ihdr)
            + self._png_chunk(b"IDAT", zlib.compress(b"\x00" * 6, 9))
            + b"".join(extra)
            + self._png_chunk(b"IEND", b"")
        )
        path = tmp_path / name
        path.write_bytes(body)
        return path

    def test_google_synthid_with_incidental_openai_byte_no_caveat(self, tmp_path: Path):
        # Google C2PA/SynthID manifest in caBX; the byte string "OpenAI" lives in a
        # separate tEXt chunk (e.g. a trust-chain note), not as a SynthID vendor.
        png = self._png(
            tmp_path,
            "g.png",
            self._png_chunk(b"caBX", b"jumbc2pa Google ... trainedAlgorithmicMedia"),
            self._png_chunk(b"tEXt", b"note\x00signed via OpenAI trust chain"),
        )
        r = identify(png, check_visible=False, check_invisible=False)
        assert any("SynthID pixel watermark (likely present (Google" in w for w in r.watermarks)
        assert not any("before the rollout" in c for c in r.caveats)

    def test_openai_synthid_still_gets_caveat(self, tmp_path: Path):
        png = self._png(tmp_path, "oa.png", self._png_chunk(b"caBX", b"jumbc2pa OpenAI ... trainedAlgorithmicMedia"))
        r = identify(png, check_visible=False, check_invisible=False)
        assert any("SynthID pixel watermark (likely present (OpenAI" in w for w in r.watermarks)
        assert any("before the rollout" in c for c in r.caveats)


class TestReportSerializable:
    def test_report_is_json_serializable(self, tmp_png_with_ai_metadata: Path):
        # The CLI --json path relies on asdict + json.dumps(default=str).
@@ -657,6 +707,19 @@ class TestIntegrityClashEndToEnd:
        r = identify(path, check_visible=False, check_invisible=False)
        assert r.integrity_clashes == []

    def test_camera_device_plus_ai_marker_clash(self, tmp_path: Path):
        # Integrity-clash rule #2: a camera-capture C2PA device token (Pixel
        # Camera) coexisting with an independent AI-generation marker (a China
        # TC260 AIGC label) -- a genuine camera capture is not AI-generated, so
        # the provenance is inconsistent (a laundering / spoofing tell).
        path = self._c2pa_jpeg(
            tmp_path,
            b'Pixel Camera ... <TC260:AIGC>{"Label":"1","ContentProducer":"BYTEDANCE001"}</TC260:AIGC>',
        )
        r = identify(path, check_visible=False, check_invisible=False)
        assert r.platform == "Google Pixel (camera, C2PA capture)"
        assert any("Camera-capture C2PA credentials" in c and "AI-generation markers" in c for c in r.integrity_clashes)

    def test_clash_serializes_to_json(self, tmp_path: Path):
        path = self._c2pa_jpeg(tmp_path, b"OpenAI ... trainedAlgorithmicMedia ... TC260:AIGC label")
        r = identify(path, check_visible=False, check_invisible=False)

@@ -72,3 +72,8 @@ class TestFailureSemantics:
        path = tmp_path / "garbage.png"
        path.write_bytes(b"not an image")
        assert image_io.imread(path) is None

    def test_imwrite_to_missing_directory_returns_false(self, tmp_path: Path) -> None:
        # An unwritable path must return False (cv2.imwrite contract), not raise.
        path = tmp_path / "no-such-dir" / "out.png"
        assert image_io.imwrite(path, _make_bgr()) is False

@@ -445,6 +445,31 @@ class TestRemoveAiMetadata:
        assert isinstance(result, Path)
        assert result == output

    def _sd_png(self, tmp_path: Path) -> Path:
        img = Image.new("RGB", (32, 32), color=(80, 80, 80))
        pnginfo = PngInfo()
        pnginfo.add_text("parameters", "Steps: 20, Sampler: Euler")
        img.save(tmp_path / "sd.png", pnginfo=pnginfo)
        return tmp_path / "sd.png"

    def test_png_to_jpeg_strips_ai(self, tmp_path):
        # Cross-format output: the AI text chunk must not survive the PNG->JPEG
        # re-encode, by detection AND by raw bytes.
        out = tmp_path / "clean.jpg"
        remove_ai_metadata(self._sd_png(tmp_path), out)
        assert not has_ai_metadata(out)
        body = out.read_bytes()
        assert b"parameters" not in body
        assert b"Steps" not in body

    def test_png_to_webp_strips_ai(self, tmp_path):
        out = tmp_path / "clean.webp"
        remove_ai_metadata(self._sd_png(tmp_path), out)
        assert not has_ai_metadata(out)
        body = out.read_bytes()
        assert b"parameters" not in body
        assert b"Steps" not in body


def _img_with_software(tmp_path: Path, fmt: str, software: str) -> Path:
    """Write a tiny image carrying an EXIF Software tag."""
@@ -617,6 +642,41 @@ class TestRemoveAiExif:
        kept = piexif.load(Image.open(out).info["exif"])["0th"]
        assert kept.get(piexif.ImageIFD.Make) == b"Apple"

    def test_xai_pair_stripped_but_genuine_camera_tags_kept(self, tmp_path: Path):
        # An image carrying BOTH the xAI Signature pair (ImageDescription =
        # "Signature: <base64>" + UUID Artist) AND genuine non-AI camera tags.
        # The scrub must delete only the xAI pair, leaving the camera tags intact.
        sig = "Signature: " + "A" * 120
        artist = "12345678-1234-1234-1234-123456789abc"
        exif = piexif.dump(
            {
                "0th": {
                    piexif.ImageIFD.ImageDescription: sig.encode(),
                    piexif.ImageIFD.Artist: artist.encode(),
                    piexif.ImageIFD.Make: b"Canon",
                    piexif.ImageIFD.Model: b"EOS R5",
                },
                "Exif": {piexif.ExifIFD.DateTimeOriginal: b"2024:01:01 12:00:00"},
                "GPS": {piexif.GPSIFD.GPSLatitudeRef: b"N"},
                "1st": {},
            }
        )
        src = tmp_path / "grok_plus_cam.jpg"
        Image.new("RGB", (32, 32)).save(src, exif=exif)
        out = tmp_path / "scrubbed.jpg"
        remove_ai_metadata(src, out)

        # xAI signature pair is gone (xai_signature returns a bool, not None).
        assert xai_signature(out) is False
        kept = piexif.load(Image.open(out).info["exif"])
        assert kept["0th"].get(piexif.ImageIFD.ImageDescription) is None
        assert kept["0th"].get(piexif.ImageIFD.Artist) is None
        # Genuine camera tags are preserved.
        assert kept["0th"].get(piexif.ImageIFD.Make) == b"Canon"
        assert kept["0th"].get(piexif.ImageIFD.Model) == b"EOS R5"
        assert kept["Exif"].get(piexif.ExifIFD.DateTimeOriginal) == b"2024:01:01 12:00:00"
        assert kept["GPS"].get(piexif.GPSIFD.GPSLatitudeRef) == b"N"


class TestAIGCLabel:
    """China TC260 AIGC labeling (Doubao and other China-served generators)."""

+5
-2
@@ -328,9 +328,12 @@ class TestISOBMFF:
     def test_truncated_largesize_terminates_safely(self):
         # size32==1 promises a 64-bit largesize, but the box ends after 8 bytes;
         # iteration must stop rather than read the missing largesize past EOF.
-        cleaned, stripped = strip_c2pa_boxes(FTYP + b"\x00\x00\x00\x01uuid")
+        # The walk halts before EOF, so the fail-safe returns the input unchanged
+        # (emitting only FTYP would silently truncate the file).
+        data = FTYP + b"\x00\x00\x00\x01uuid"
+        cleaned, stripped = strip_c2pa_boxes(data)
         assert stripped == 0
-        assert cleaned == FTYP
+        assert cleaned == data
 
 
 class TestC2PAInvalidSignature:

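The fail-safe in this hunk can be pictured as a box walk that records how far it got, followed by a check that it reached EOF before any stripped output is trusted. The block below is a sketch under stated assumptions, not the repository's `strip_c2pa_boxes`: `walk_boxes`, `strip_marked_uuid_boxes` and `MARKER_UUID` are hypothetical names, and the UUID is a placeholder rather than the real C2PA box UUID.

```python
from __future__ import annotations

import struct

# Placeholder 16-byte UUID for the demo; NOT the real C2PA box UUID.
MARKER_UUID = bytes(range(16))


def walk_boxes(data: bytes):
    """Yield (start, payload_start, end, box_type) for well-formed top-level boxes.

    The walk stops at the first box whose declared size cannot fit in the
    data, so the caller can detect an incomplete walk from the last end offset.
    """
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:  # a 64-bit largesize follows the type field
            if pos + 16 > len(data):
                return  # largesize promised but missing
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:  # box runs to the end of the file
            size = len(data) - pos
        if size < header or pos + size > len(data):
            return  # declared size is impossible
        yield pos, pos + header, pos + size, box_type
        pos += size


def strip_marked_uuid_boxes(data: bytes) -> tuple[bytes, int]:
    """Drop uuid boxes tagged with MARKER_UUID; return the input unchanged on a partial walk."""
    kept: list[bytes] = []
    stripped = 0
    reached = 0
    for start, payload, end, box_type in walk_boxes(data):
        if box_type == b"uuid" and data[payload : payload + 16] == MARKER_UUID:
            stripped += 1
        else:
            kept.append(data[start:end])
        reached = end
    if reached != len(data):
        # Fail-safe: joining `kept` here would silently drop the unparsed tail.
        return data, 0
    return b"".join(kept), stripped
```

Returning `(data, 0)` on a partial walk trades a missed strip for a guarantee that the output is never shorter than the input for reasons the caller cannot see.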
@@ -66,6 +66,23 @@ class TestEraseCv2:
        assert np.array_equal(img, out)


class TestNonBgrInputs:
    """cv2.inpaint rejects 4-channel BGRA; BGRA and 2D grayscale inputs must still work at the entry point."""

    def test_grayscale_2d_does_not_raise(self):
        gray = np.full((100, 100), 120, np.uint8)
        out = erase(gray, boxes=[(40, 40, 20, 20)], backend="cv2")
        assert out.shape == gray.shape

    def test_bgra_preserves_alpha_and_does_not_raise(self):
        bgra = np.full((100, 100, 4), 120, np.uint8)
        bgra[..., 3] = 200  # opaque-ish alpha plane
        out = erase(bgra, boxes=[(40, 40, 20, 20)], backend="cv2", dilate=0)
        assert out.shape == bgra.shape
        # alpha plane is carried through unchanged
        assert np.array_equal(out[..., 3], bgra[..., 3])


class TestLamaBackend:
    def test_lama_raises_when_unavailable(self):
        img = np.full((100, 100, 3), 50, np.uint8)

@@ -0,0 +1,130 @@
"""Regression guards for malformed-length DoS and removal-truncation bugs.

Two verified bugs are locked in here:

1. PNG C2PA parsers (``c2pa.has_c2pa_metadata`` / ``extract_c2pa_info`` and
   ``metadata._png_late_metadata`` via ``scan_head``) used the raw 32-bit chunk
   ``length`` field directly in ``f.read(length)``. A crafted file can declare
   ``length = 0x7FFFFFFF`` (~2 GiB) on a 60-byte file, forcing a multi-GB
   allocation. The fix clamps ``length`` to the bytes actually remaining.

2. ISOBMFF ``strip_c2pa_boxes`` truncated the file from a malformed box to EOF
   (the box walker returns early), so ``remove_ai_metadata`` could emit a
   shorter file and report success. The fix returns the input unchanged when the
   walk does not reach EOF.
"""

from __future__ import annotations

import struct
import tracemalloc

from remove_ai_watermarks import metadata
from remove_ai_watermarks.noai import c2pa, isobmff

PNG_SIG = b"\x89PNG\r\n\x1a\n"
_HUGE = 0x7FFFFFFF  # ~2 GiB declared length on a tiny file


def _png_with_huge_c2pa_chunk() -> bytes:
    """A tiny 'PNG' (under 60 bytes) whose caBX chunk header lies about its length."""
    header = struct.pack(">I", _HUGE) + c2pa.C2PA_CHUNK_TYPE
    body = b"jumbc2pa-not-really"  # far shorter than the declared length
    return PNG_SIG + header + body


class TestPngLengthClampNoAlloc:
    """Clamping makes the parsers read only the real bytes, not the lie."""

    def test_has_c2pa_metadata_is_bounded(self, tmp_path):
        path = tmp_path / "evil.png"
        path.write_bytes(_png_with_huge_c2pa_chunk())

        tracemalloc.start()
        try:
            # Must return quickly without allocating gigabytes and without raising.
            c2pa.has_c2pa_metadata(path)
            _, peak = tracemalloc.get_traced_memory()
        finally:
            tracemalloc.stop()
        assert peak < 50 * 1024 * 1024  # < 50 MB locks in the clamp

    def test_extract_c2pa_info_is_bounded(self, tmp_path):
        path = tmp_path / "evil.png"
        path.write_bytes(_png_with_huge_c2pa_chunk())

        tracemalloc.start()
        try:
            c2pa.extract_c2pa_info(path)
            _, peak = tracemalloc.get_traced_memory()
        finally:
            tracemalloc.stop()
        assert peak < 50 * 1024 * 1024

    def test_extract_c2pa_chunk_is_bounded(self, tmp_path):
        path = tmp_path / "evil.png"
        path.write_bytes(_png_with_huge_c2pa_chunk())

        tracemalloc.start()
        try:
            c2pa.extract_c2pa_chunk(path)
            _, peak = tracemalloc.get_traced_memory()
        finally:
            tracemalloc.stop()
        assert peak < 50 * 1024 * 1024

    def test_png_late_metadata_scan_is_bounded(self, tmp_path):
        # A PNG with a real IDAT pushing the late-scan window past 1 MB, then a
        # tEXt chunk lying about its length. scan_head() -> _png_late_metadata().
        idat = b"\x00" * (1024 * 1024 + 16)
        text_header = struct.pack(">I", _HUGE) + b"tEXt"
        blob = (
            PNG_SIG
            + struct.pack(">I", len(idat))
            + b"IDAT"
            + idat
            + b"\x00\x00\x00\x00"  # fake CRC
            + text_header
            + b"AIGC short"
        )
        path = tmp_path / "evil_late.png"
        path.write_bytes(blob)

        tracemalloc.start()
        try:
            metadata.scan_head(path)
            _, peak = tracemalloc.get_traced_memory()
        finally:
            tracemalloc.stop()
        # head itself is ~1 MB; the clamp keeps the late read tiny. Generous cap.
        assert peak < 50 * 1024 * 1024


def _box(box_type: bytes, payload: bytes) -> bytes:
    return struct.pack(">I", 8 + len(payload)) + box_type + payload


class TestIsobmffStripFailSafe:
    def test_well_formed_file_still_strips_uuid(self):
        ftyp = _box(b"ftyp", b"isom\x00\x00\x00\x00mp42")
        c2pa_box = _box(b"uuid", isobmff.C2PA_UUID + b"manifest-bytes")
        mdat = _box(b"mdat", b"\x00" * 32)
        data = ftyp + c2pa_box + mdat

        cleaned, stripped = isobmff.strip_c2pa_boxes(data)
        assert stripped == 1
        assert len(cleaned) == len(data) - len(c2pa_box)
        assert isobmff.C2PA_UUID not in cleaned

    def test_malformed_box_does_not_truncate_tail(self):
        ftyp = _box(b"ftyp", b"isom\x00\x00\x00\x00mp42")
        c2pa_box = _box(b"uuid", isobmff.C2PA_UUID + b"manifest-bytes")
        # A box claiming ~2 GiB before EOF: the walker returns early here.
        bad_box = struct.pack(">I", _HUGE) + b"free" + b"\x00" * 16
        data = ftyp + c2pa_box + bad_box

        cleaned, stripped = isobmff.strip_c2pa_boxes(data)
        # Fail-safe: input returned unchanged, nothing stripped, no truncation.
        assert stripped == 0
        assert cleaned == data
        assert len(cleaned) == len(data)
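The length clamp that the PNG tests in this file lock in can be illustrated with a small chunk walker. This is a sketch only, assuming the parsers read chunk by chunk from a seekable file object. `iter_png_chunks` is a hypothetical name, not a function from `metadata.py` or `noai/c2pa.py`, and it does not verify CRCs.

```python
from __future__ import annotations

import io
import struct
from typing import BinaryIO, Iterator

PNG_SIG = b"\x89PNG\r\n\x1a\n"


def iter_png_chunks(f: BinaryIO, wanted: set[bytes]) -> Iterator[tuple[bytes, bytes]]:
    """Yield (type, data) for wanted chunks, never requesting more than the file holds."""
    f.seek(0, io.SEEK_END)
    file_size = f.tell()
    f.seek(0)
    if f.read(8) != PNG_SIG:
        return
    while True:
        header = f.read(8)
        if len(header) < 8:
            return  # no room for another chunk header
        declared, ctype = struct.unpack(">I4s", header)
        remaining = file_size - f.tell()
        length = min(declared, remaining)  # clamp the attacker-controlled field
        if ctype in wanted:
            yield ctype, f.read(length)
        else:
            f.seek(length, io.SEEK_CUR)  # skipped chunks are seeked over, not read
        if declared > remaining:
            return  # the chunk overran the file, so nothing valid can follow
        f.seek(4, io.SEEK_CUR)  # step over the CRC


# A tiny file whose caBX header claims ~2 GiB of payload.
evil = PNG_SIG + struct.pack(">I", 0x7FFFFFFF) + b"caBX" + b"jumbc2pa-not-really"
chunks = list(iter_png_chunks(io.BytesIO(evil), {b"caBX"}))
```

The clamp matters most on real file objects, where `read(n)` may size its buffer from `n` before discovering that the file is shorter.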
@@ -43,6 +43,16 @@ class TestTilePositions:
        # 1024 wide, 512 tile, no overlap -> two tiles butting at 512.
        assert tile_positions(1024, 512, 0) == [0, 512]

    def test_overlap_equal_to_tile_raises(self):
        # overlap == tile makes the stride denominator (tile - overlap) zero;
        # reject up front instead of dividing by zero.
        with pytest.raises(ValueError, match="overlap"):
            tile_positions(2000, 512, 512)

    def test_overlap_greater_than_tile_raises(self):
        with pytest.raises(ValueError, match="overlap"):
            tile_positions(2000, 512, 600)


class TestMakeBlendWeight:
    def test_zero_overlap_is_all_ones(self):
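The up-front overlap check exercised by these tests might look like the following. This is a hedged sketch, not the code in `ctrlregen.tiling`: `tile_positions_sketch` is a hypothetical stand-in, and the stride formula and the clamping of the last tile to the image edge are assumptions chosen only to agree with the assertions shown in this hunk.

```python
from __future__ import annotations

import math


def tile_positions_sketch(length: int, tile: int, overlap: int) -> list[int]:
    """Return tile start offsets covering `length` pixels with the given overlap.

    Hypothetical stand-in for illustration; the real implementation may differ.
    """
    if tile <= 0:
        raise ValueError(f"tile must be positive, got {tile}")
    if not 0 <= overlap < tile:
        # stride = tile - overlap would be zero or negative, so fail early
        raise ValueError(f"overlap must satisfy 0 <= overlap < tile, got overlap={overlap}, tile={tile}")
    if length <= tile:
        return [0]
    stride = tile - overlap
    count = math.ceil((length - tile) / stride) + 1
    last = length - tile  # the final tile is clamped so it ends at the image edge
    return [min(i * stride, last) for i in range(count)]


positions = tile_positions_sketch(1024, 512, 0)
```

Validating before computing the stride turns a `ZeroDivisionError` (or an endless loop, depending on how the positions are generated) into an error message that names the offending parameter.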