Extract _target_size helper + regression-test native resolution (v0.5.4)

The native-vs-downscale decision in InvisibleEngine.remove_watermark (the
issue #10/#15 fix: max_resolution=0 must not pre-downscale, since any
downscale both loses quality and lets SynthID survive) had no test. Extract
it into a pure helper invisible_engine._target_size(w, h, max_resolution)
and cover it with tests/test_invisible_engine.py::TestTargetSize so a
re-introduced forced downscale fails CI instead of silently regressing #15.

Also:
- Clamp the short side to >=1 in _target_size: extreme aspect ratios (e.g.
  5000x3 with --max-resolution 1024) truncated it to 0 and crashed
  image.resize(). Pre-existing in the inline math; fixed now that it is a
  named, tested function.
- Consolidate the two duplicated temp-file save blocks into one
  unconditional save (behavior unchanged: the EXIF-transposed image is
  still always persisted before WatermarkRemover reloads it by path), and
  drop the now-redundant `_tmp_path is not None` guard in finally.
- Bump version 0.5.3 -> 0.5.4 (pyproject, __init__, uv.lock); document the
  helper as the regression guard in CLAUDE.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
test-user
2026-05-25 14:09:33 -07:00
parent 28fe13db8f
commit d24d8a4b14
6 changed files with 80 additions and 30 deletions
+1 -1
View File
@@ -44,7 +44,7 @@ Who embeds what, and whether it is locally detectable (so we know which gaps are
## Known limitations
- `invisible` pipeline processes at **native resolution by default** (`max_resolution=0`), matching the hosted raiw.cc backend (fal fast-sdxl, no pre-downscale). The old forced downscale-to-1024 -> upscale-back round-trip was the main quality loss (issue #10) and is gone; at strength ~0.05 SDXL img2img does not need the ~1024 downscale. `--max-resolution N` re-introduces an opt-in long-side cap purely to bound GPU/MPS memory on very large inputs (it reintroduces the lossy round-trip). For huge images that OOM at native, tile-based diffusion is still the proper long-term fix. **Concrete MPS data point (verified 2026-05-25 on a 1254x1254 gpt-image SDXL run, fp32, 20 GB MPS ceiling):** native res OOMs at the *UNet* step (peak ~17 GiB), not only the VAE decode, and the auto-fallback in `img2img_runner` reloads on CPU and finishes (slow, ~13 min) -- the output is still weight-identical and defeats SynthID, so "looks hung/crashed" on Mac is usually this CPU fallback, not a pipeline error. Adding `enable_vae_tiling()` alone does NOT prevent it (the peak is the UNet, not the VAE). The fast Mac workarounds are fp16 on MPS (roughly halves memory) or `--max-resolution` to cap the long side; neither is wired as the default.
- `invisible` pipeline processes at **native resolution by default** (`max_resolution=0`), matching the hosted raiw.cc backend (fal fast-sdxl, no pre-downscale). The old forced downscale-to-1024 -> upscale-back round-trip was the main quality loss (issue #10) and is gone; at strength ~0.05 SDXL img2img does not need the ~1024 downscale. `--max-resolution N` re-introduces an opt-in long-side cap purely to bound GPU/MPS memory on very large inputs (it reintroduces the lossy round-trip). For huge images that OOM at native, tile-based diffusion is still the proper long-term fix. **Concrete MPS data point (verified 2026-05-25 on a 1254x1254 gpt-image SDXL run, fp32, 20 GB MPS ceiling):** native res OOMs at the *UNet* step (peak ~17 GiB), not only the VAE decode, and the auto-fallback in `img2img_runner` reloads on CPU and finishes (slow, ~13 min) -- the output is still weight-identical and defeats SynthID, so "looks hung/crashed" on Mac is usually this CPU fallback, not a pipeline error. Adding `enable_vae_tiling()` alone does NOT prevent it (the peak is the UNet, not the VAE). The fast Mac workarounds are fp16 on MPS (roughly halves memory) or `--max-resolution` to cap the long side; neither is wired as the default. The native-vs-downscale decision lives in the pure helper `invisible_engine._target_size(w, h, max_resolution)` (returns `None` for native, a clamped target tuple otherwise) so it is unit-tested (`tests/test_invisible_engine.py::TestTargetSize`, the #10/#15 regression guard) without loading the model -- keep that logic in the helper, don't re-inline it.
- Pyright first run is slow (2-3 min) due to ML deps (torch/diffusers/transformers stubs); full-project `uv run pyright` can stall for many minutes — scope it to changed files.
- `ultralytics` monkey-patches `PIL.Image.open` and tries to autoload `pi_heif`. When `pi_heif` is missing, opening files raises `ModuleNotFoundError`, not `UnidentifiedImageError`. Code that opens user-supplied or unknown-format files should `except Exception`, not just `OSError`/`UnidentifiedImageError`.
- Metadata detection for AVIF/HEIF/JPEG-XL relies on a binary scan for `C2PA_UUID` + `IPTC_AI_MARKERS`, plus EXIF `Software` / XMP `CreatorTool` generator tags via `metadata.exif_generator` (validated with synthesized AVIF/JPEG fixtures + an XMP raw-scan fixture). C2PA removal in those containers is implemented via `noai/isobmff.py` (top-level ``uuid`` / ``jumb`` box stripper, no re-encoding). EXIF/XMP boxes inside those containers are read for detection but not yet **scrubbed** on removal.
+1 -1
View File
@@ -1,6 +1,6 @@
[project]
name = "remove-ai-watermarks"
version = "0.5.3"
version = "0.5.4"
description = "Remove visible and invisible AI watermarks from images (Gemini / Nano Banana, ChatGPT, Stable Diffusion)"
readme = "README.md"
requires-python = ">=3.10"
+1 -1
View File
@@ -1,3 +1,3 @@
"""Remove-AI-Watermarks: Unified tool for removing visible and invisible AI watermarks."""
__version__ = "0.5.3"
__version__ = "0.5.4"
+33 -25
View File
@@ -42,6 +42,24 @@ def is_available() -> bool:
return False
def _target_size(width: int, height: int, max_resolution: int) -> tuple[int, int] | None:
"""Compute the downscaled (width, height) for a long-side cap, or None for native.
Returns None when no pre-downscale is needed: ``max_resolution <= 0`` (native
resolution, the default that matches the raiw.cc backend -- see issue #10) or
the long side already fits the cap. Otherwise scales the long side down to
``max_resolution`` preserving aspect ratio (integer-truncated, matching the
PIL ``resize`` call site). Pure function so the native-vs-downscale decision
is unit-testable without loading the diffusion model.
"""
if max_resolution > 0 and max(width, height) > max_resolution:
ratio = max_resolution / max(width, height)
# Clamp the short side to >=1: extreme aspect ratios (e.g. 5000x3 capped
# at 1024) would otherwise truncate it to 0 and crash image.resize().
return (max(1, int(width * ratio)), max(1, int(height * ratio)))
return None
class InvisibleEngine:
"""Remove invisible AI watermarks using diffusion model regeneration.
@@ -142,37 +160,26 @@ class InvisibleEngine:
image = Image.open(image_path)
image = ImageOps.exif_transpose(image)
orig_size = image.size # (width, height)
_tmp_path = None
if max_resolution > 0 and max(image.width, image.height) > max_resolution:
ratio = max_resolution / max(image.width, image.height)
new_size = (int(image.width * ratio), int(image.height * ratio))
# Optional long-side downscale; native resolution by default (issue #10).
target = _target_size(image.width, image.height, max_resolution)
if target is not None:
if self._progress_callback:
self._progress_callback(
f"Downscaling {image.width}x{image.height} "
f"to {new_size[0]}x{new_size[1]} "
f"to {target[0]}x{target[1]} "
f"(max-resolution cap {max_resolution}px)..."
)
image = image.resize(new_size, Image.Resampling.LANCZOS)
image = image.resize(target, Image.Resampling.LANCZOS)
# Save to a temp file instead of overwriting the original
_tmp_fd, _tmp_str = tempfile.mkstemp(suffix=image_path.suffix)
_tmp_path = Path(_tmp_str)
image.save(_tmp_path)
import os as _os
_os.close(_tmp_fd)
image_path = _tmp_path
else:
# We must save the transposed image back to a tmp file if it was rotated
# otherwise WatermarkRemover will reload it without EXIF rotation!
_tmp_fd, _tmp_str = tempfile.mkstemp(suffix=image_path.suffix)
_tmp_path = Path(_tmp_str)
image.save(_tmp_path)
import os as _os
_os.close(_tmp_fd)
image_path = _tmp_path
# Always persist to a temp file, even without downscaling: WatermarkRemover
# reloads by path, so the EXIF-transposed pixels must be saved or rotation
# is lost. Cleaned up in the finally block via _tmp_path.
_tmp_fd, _tmp_str = tempfile.mkstemp(suffix=image_path.suffix)
_tmp_path = Path(_tmp_str)
image.save(_tmp_path)
os.close(_tmp_fd)
image_path = _tmp_path
try:
# Optional: Face protection (Phase 1 - Extraction)
@@ -253,7 +260,8 @@ class InvisibleEngine:
return out_path
finally:
if _tmp_path is not None and _tmp_path.exists():
# _tmp_path is always set above (we persist the image unconditionally).
if _tmp_path.exists():
_tmp_path.unlink()
def remove_watermark_batch(
+43 -1
View File
@@ -2,7 +2,7 @@
from __future__ import annotations
from remove_ai_watermarks.invisible_engine import InvisibleEngine, is_available
from remove_ai_watermarks.invisible_engine import InvisibleEngine, _target_size, is_available
class TestIsAvailable:
@@ -26,3 +26,45 @@ class TestInvisibleEngineInit:
def test_ctrlregen_model_id(self):
assert InvisibleEngine.CTRLREGEN_MODEL_ID == "yepengliu/ctrlregen"
class TestTargetSize:
"""Regression guard for the native-resolution decision (issues #10 / #15).
max_resolution=0 must NOT downscale -- the forced downscale->upscale
round-trip was the quality loss in #10, and downscaling at all let SynthID
survive in #15 (the native SDXL pass at strength ~0.05 is what defeats it).
"""
def test_native_default_no_downscale(self):
# The default (0) means native resolution: no resize, regardless of size.
assert _target_size(4096, 4096, 0) is None
assert _target_size(123, 456, 0) is None
def test_negative_cap_treated_as_native(self):
assert _target_size(4096, 4096, -1) is None
def test_cap_below_long_side_downscales(self):
# 2000x1000, cap 1024 -> long side scaled to 1024, aspect preserved.
assert _target_size(2000, 1000, 1024) == (1024, 512)
def test_cap_uses_long_side_for_portrait(self):
# Portrait: height is the long side, so it drives the ratio.
assert _target_size(1000, 2000, 1024) == (512, 1024)
def test_cap_at_or_above_long_side_no_downscale(self):
# Already within the cap (and exactly equal) -> no resize.
assert _target_size(800, 600, 1024) is None
assert _target_size(1024, 768, 1024) is None
def test_integer_truncation_matches_pil_call_site(self):
# 1254x1254 (the gpt-image sample) capped at 1000: int(1254*1000/1254)=1000.
assert _target_size(1254, 1254, 1000) == (1000, 1000)
# Non-divisible ratio truncates toward zero like int() at the call site.
assert _target_size(1000, 333, 500) == (500, 166)
def test_extreme_aspect_ratio_clamps_short_side_to_one(self):
# 5000x3 capped at 1024: int(3 * 1024/5000) = 0 would crash resize();
# the short side must clamp to 1, never 0.
assert _target_size(5000, 3, 1024) == (1024, 1)
assert _target_size(3, 5000, 1024) == (1, 1024)
Generated
+1 -1
View File
@@ -2150,7 +2150,7 @@ wheels = [
[[package]]
name = "remove-ai-watermarks"
version = "0.5.3"
version = "0.5.4"
source = { editable = "." }
dependencies = [
{ name = "click" },