Regenerating pixels removes SynthID / open watermarks but degrades a real
photo, so running it on a clean image is the dominant paid score-0 cause on
no-watermark uploads. Gate invisible/all/batch on identify.has_invisible_target:
when no invisible AI signal is locally detectable and --force is unset, skip the
regeneration. Per-command semantics:
- invisible: write no output, exit EXIT_NO_INVISIBLE_SIGNAL (2)
- all: skip step 2 but keep visible-removed pixels + strip metadata, exit 0
- batch: skip the scrub; copy the input through in invisible mode
A skip never claims the image is clean (a pixel SynthID is undetectable once its
metadata proxy is gone); the message says so and routes to --force. The gate
fails safe: if the detector raises, the removal runs anyway.
has_invisible_target wraps identify(check_visible=False, check_invisible=True)
and returns the new ProvenanceReport.ai_from_metadata field (the confidence==high
union), so the raiw.cc worker can reuse the same gate. The gate is placed before
engine construction so the skip path is cheap; it is shared via
cli._should_skip_invisible_scrub.
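A minimal sketch of the skip decision described above, assuming the detector is
passed in as a callable. The function name, signature and the `path` argument are
illustrative assumptions, not the actual cli._should_skip_invisible_scrub code:

```python
from typing import Callable

EXIT_NO_INVISIBLE_SIGNAL = 2  # exit code the `invisible` command returns on a skip


def should_skip_invisible_scrub(
    has_invisible_target: Callable[[str], bool],  # hypothetical detector callable
    path: str,
    force: bool,
) -> bool:
    """Return True when the regeneration step should be skipped."""
    if force:
        return False  # --force always runs the removal
    try:
        detected = has_invisible_target(path)
    except Exception:
        return False  # fail safe: a detector error runs the removal
    return not detected  # skip only when no invisible AI signal is detectable
```

Taking the detector as a parameter keeps the decision testable without loading
any model, which matches the goal of keeping the skip path cheap.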
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- watermark_remover: _build_qwen_kwargs now passes explicit height/width (via
_qwen_target_size, floored to a multiple of 16). Without them, QwenImageImg2ImgPipeline
defaults to 1024x1024 and silently squishes non-square inputs, distorting the scene and
garbling text.
- watermark_profiles: resolve_strength gains a `pipeline` arg + a Qwen strength ladder
(_QWEN_VENDOR_STRENGTH, Gemini 0.25), so `--pipeline qwen` gets its certified floor
automatically; retires the manual "pass --strength 0.25 for Gemini on qwen" workaround.
- fidelity_metrics: replace per-face nearest matching (collided on multi-face images when a
variant dropped a face, corrupting the identity metric) with a collision-free one-to-one
assignment (assign_faces_one_to_one). lapvar/LPIPS were always bbox-anchored and immune.
Regression-guarded by tests/test_fidelity_matching.py.
- docs: record the measured outcomes of the qwen-improvement arc. The Qwen ControlNet
face-fix is CLOSED (no permissive Qwen detail/tile ControlNet exists; canny carries edges,
not skin grain). The `--pipeline auto` router + faces+text mixed dual-pass were prototyped
and DROPPED (controlnet wins faces AND display text: abba CER 0.114 vs qwen 0.379).
Z-Image-Turbo was tried and dropped (same regeneration limits). qwen stays a manual opt-in;
controlnet is the default for everything.
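To illustrate the fidelity_metrics fix above: a minimal sketch of a collision-free
one-to-one face assignment, assuming faces are matched by bounding-box centre
distance and resolved greedily from the closest pair outward. The real
assign_faces_one_to_one may use a different cost or solver; the names and the
max_dist parameter here are hypothetical:

```python
from math import hypot
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def _center(box: Box) -> Tuple[float, float]:
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def assign_one_to_one(ref: List[Box], var: List[Box], max_dist: float) -> Dict[int, int]:
    """Map reference face index -> variant face index, each used at most once."""
    pairs = []
    for i, rb in enumerate(ref):
        rx, ry = _center(rb)
        for j, vb in enumerate(var):
            vx, vy = _center(vb)
            dist = hypot(rx - vx, ry - vy)
            if dist <= max_dist:
                pairs.append((dist, i, j))
    pairs.sort()  # closest pairs claim their partner first

    used_ref, used_var = set(), set()
    assignment: Dict[int, int] = {}
    for _, i, j in pairs:
        if i in used_ref or j in used_var:
            continue  # a face on either side is never matched twice
        assignment[i] = j
        used_ref.add(i)
        used_var.add(j)
    return assignment
```

With three reference faces and a variant that dropped the middle one, per-face
nearest matching would attach the middle face to a neighbour's detection; here it
is simply left unmatched.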
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Cited deep-research report (22 sources, 3-vote adversarial verification, 5 refuted)
behind the "ship qwen as-is or improve first?" decision. Verdict: shippable now as
an opt-in text lane; strongest improvement lead is adding a Qwen-Image ControlNet
(InstantX / DiffSynth, Apache-2.0, diffusers QwenImageControlNetPipeline) for face/
skin structure; Z-Image-Turbo (6B, Apache-2.0) is the best cheaper text-preserving
substitute. None of these improvements has been measured for face fidelity at our
scrub floors yet -- validate with scripts/fidelity_metrics.py first. Linked from
known-limitations.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Measured (openai_1, 0.10, seeds 0-4): seed barely moves whole-image fidelity
(img LPIPS 0.062-0.065, SSIM/PSNR flat) but shifts text legibility (OCR CER
0.241-0.290, ~17% spread) -- it changes which details regenerate, not the level.
So per-image best-of-N-seed is a weak text-only lever (pin a seed in prod; reserve
best-of-N for text-heavy premium). Also retitle the qwen section "certified floors"
and drop the now-stale "uncertified / run seed-repeat / floor 0.30" tails.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Oracle seed-repeat + floor refinement (2026-06-20, data/qwen_in):
- OpenAI floor 0.10 is SEED-ROBUST: 0.05 and 0.075 still detected; 0.10 clean on
seeds 0-4 (5/5) -> a random seed is safe.
- Gemini floor lowered 0.30 -> 0.25 (0.20 still detected, 0.25 clean on both
images). Single-seed (seed 0): the Gemini oracle rate-limits volume seed-repeat,
so pin a seed in prod rather than relying on seed-robustness there.
Re-measured fidelity at the certified floors (controlnet 0.15 vs Qwen 0.25 for
Gemini): faces still favor controlnet (ArcFace 0.546 vs 0.382, lapvar 0.62 vs
0.40); the short-CJK text case is now a TIE (gemini_1 CER 0.037 vs 0.037 -- the
earlier Qwen 0.000 was at 0.30, not the floor). Qwen's text win holds on substantial
Latin/mixed text (OpenAI CER, controlnet vs Qwen: 0.385 vs 0.241 / 0.341 vs 0.290).
Update watermark_profiles comment, CLAUDE.md, module-internals, known-limitations.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The face fidelity numbers cited an equal-strength compare (both 0.15), but Qwen at
0.15 does NOT clear Gemini SynthID -- so that output is un-scrubbed and the compare
is invalid. Per the methodology rule (compare fidelity only between outputs where
SynthID is removed in BOTH), restate faces at each pipeline's scrub floor
(controlnet 0.15 / Qwen 0.30): ArcFace identity 0.546 vs 0.331, lapvar 0.62 vs 0.40,
face LPIPS 0.09 vs 0.19 -- controlnet still wins faces, conclusion unchanged. Drop
the "equal strength" framing in CLAUDE.md / module-internals / known-limitations.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
data/qwen_in/ground_truth.json is transcribed by vision (PaddleOCR mangled the
stylized Cyrillic), so the text metric scores variants against an accurate
reference instead of noisy OCR-vs-OCR. Re-measured text CER (controlnet vs qwen)
with this ground truth confirms qwen wins text across EN/RU/ZH: openai_1 0.385 vs
0.241, openai_2 0.341 vs 0.290, gemini_1 (ZH) 0.037 vs 0.000 (perfect Chinese even
at the higher 0.30 strength). Faces still favor controlnet. Refresh the numbers in
docs/known-limitations.md to this cleaner methodology.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add scripts/fidelity_metrics.py: an objective eval harness comparing
watermark-removal outputs against the original (reference) across four groups
-- OCR character error rate (EasyOCR), ArcFace identity cosine (insightface),
face texture (LPIPS + Laplacian-variance ratio), and whole-image LPIPS/SSIM/
PSNR. PEP 723 inline deps so it stays out of the package / uv.lock; metrics
self-gate (faces only where faces, text only where text).
The metrics overturned an eyeball conclusion: at EQUAL strength Qwen beats
controlnet on TEXT (OpenAI typography at strength 0.10: OCR CER 0.25 vs 0.37) but
controlnet beats Qwen on FACES (gemini_3, 18 faces, both at strength 0.15:
Laplacian-variance retention 0.62 vs 0.41, face LPIPS 0.09 vs 0.13 -- Qwen smooths
faces MORE; ArcFace identity ~tied). So Qwen is the better TEXT-preserving remover,
not a universal
fidelity win. Correct the earlier "qwen keeps faces faithful where controlnet
plasticizes" claim in CLAUDE.md, module-internals.md, known-limitations.md, README.
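For reference, a minimal sketch of the character error rate used for the text
group, assuming the usual definition (Levenshtein edit distance divided by the
reference length). The OCR step is omitted and the function names are
illustrative, not the ones in scripts/fidelity_metrics.py:

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert/delete/substitute, unit cost) via a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate of hypothesis against reference (lower is better)."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return levenshtein(reference, hypothesis) / len(reference)
```

A CER of 0.25 therefore means roughly one character in four of the reference
text had to be edited to reach the OCR output of the variant.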
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A third diffusion pipeline alongside sdxl/controlnet: Qwen-Image (20B MMDiT,
Apache-2.0 code AND weights) img2img. The scrub still comes from the img2img
strength; Qwen preserves text (incl. CJK) and structure markedly better than
SDXL at the scrub floor, so it over-regenerates real photos far less (directly
targets the controlnet over-regeneration that degrades real uploads).
- watermark_profiles: QWEN_MODEL_ID, normalize_profile accepts "qwen".
- WatermarkRemover: _load_qwen_pipeline (bf16, loads Qwen base unless --model
overridden, clear ImportError if diffusers lacks the class), _run_qwen (no
MPS fallback -- 20B is CUDA/cloud-class), dispatch in _generate_one/preload,
pure _build_qwen_kwargs (true_cfg_scale, not guidance_scale).
- Shared _base_load_kwargs() across all three loaders (dtype + token).
- CLI --pipeline gains "qwen"; invisible_engine threads it through.
- scripts/qwen_scrub_prototype.py: standalone PEP 723 GPU experiment.
Prototype oracle floors (Modal A100-80GB, single seed, controls SynthID-positive,
PENDING seed-repeat cert): OpenAI clears at strength ~0.10, Gemini at ~0.30 (0.20
still detected), with CJK text + faces faithful where controlnet plasticizes. The
Gemini floor is higher than the shared default ladder, so pass an explicit
--strength for Gemini on this pipeline until a Qwen-specific ladder is certified.
The model-running path is CUDA-only (untestable locally); unit tests cover the
pure call-shape (_build_qwen_kwargs) and profile normalization without torch.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a lossless alternative to the --max-resolution downscale for large
images that OOM on MPS/GPU: regenerate in overlapping, feather-blended
tiles at native resolution.
- noai/tiling.py: pure plan_tiles (uniform tiles, last flush to edge) +
feather_weights (strictly-positive separable taper -> partition-of-unity
blend) + run_tiled (per-tile generate callable, decoupled from the
pipeline). Unit-tested without the model.
- WatermarkRemover.remove_watermark: refactor _generate into _generate_one
+ a tiled branch that engages only when --tile is set and the long side
exceeds tile_size (ControlNet canny is rebuilt per tile).
- Thread tile/tile_size/tile_overlap through InvisibleEngine and the
invisible/all/batch CLI commands via a shared _tile_options decorator.
Verified end-to-end on the real SDXL pipeline (forced 2x2 tiling on a
1024px sample, MPS): non-degenerate output, no gross seam at tile borders.
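The tiling idea can be sketched in one dimension (the 2-D case applies the same
plan and weights per axis). This is an illustrative reduction under assumed
signatures, not the noai/tiling.py code; plan_tiles_1d, feather_weights_1d and
blend_1d are hypothetical names:

```python
from typing import List, Sequence, Tuple


def plan_tiles_1d(length: int, tile: int, overlap: int) -> List[Tuple[int, int]]:
    """Uniform tiles of size `tile`; the last one is placed flush to the edge."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    if length <= tile:
        return [(0, length)]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # final tile ends exactly at `length`
    return [(s, s + tile) for s in starts]


def feather_weights_1d(n: int, overlap: int) -> List[float]:
    """Linear taper toward both borders; every weight is strictly positive."""
    ramp = max(overlap, 1)
    return [min(1.0, (min(k, n - 1 - k) + 1) / (ramp + 1)) for k in range(n)]


def blend_1d(length: int, tiles: Sequence[Tuple[int, int]],
             values: Sequence[Sequence[float]], overlap: int) -> List[float]:
    """Weighted average of per-tile outputs; normalising makes weights sum to 1."""
    acc = [0.0] * length
    wsum = [0.0] * length
    for (start, end), vals in zip(tiles, values):
        weights = feather_weights_1d(end - start, overlap)
        for k, w in enumerate(weights):
            acc[start + k] += w * vals[k]
            wsum[start + k] += w
    return [a / w for a, w in zip(acc, wsum)]
```

Because the weights never reach zero, the normalising division is always defined,
and a constant signal passes through the blend unchanged.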
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Closes a documented coverage gap (P2#9): an AI Software/Make/Artist/ImageDescription
token in an EXIF item (its TIFF bytes live in mdat/idat) survived remove_ai_metadata
because the top-level box stripper and (absent pillow-heif) the PIL EXIF reader can't
reach it. New isobmff.blank_ai_exif_tokens finds EXIF TIFF blocks by their II/MM
byte-order header, validates each with piexif (a coincidental II/MM run in pixels
won't parse as a TIFF IFD, so it's ignored), and overwrites any AI_GENERATOR_TOKENS-
bearing value with same-length spaces -- so box sizes and iloc offsets stay valid and
the coded image is untouched (mirrors blank_ai_xmp_packets; no iinf/iloc surgery, no
exiftool dep). Camera/editor EXIF without an AI token is preserved. Wired into
remove_ai_metadata's ISOBMFF path. Covers the realistic AI-generator-token case; xAI-
signature-in-meta-box-EXIF (Grok is JPEG-only) stays out.
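A simplified sketch of the same-length blanking idea, assuming a plain byte
search. It blanks only the matched token (the change above blanks the whole
token-bearing value) and skips the piexif TIFF validation; the function name and
the token list are hypothetical:

```python
from typing import Iterable


def blank_tokens_same_length(data: bytes, tokens: Iterable[bytes]) -> bytes:
    """Overwrite each case-insensitive token match with spaces, length unchanged."""
    buf = bytearray(data)
    lowered = bytes(buf).lower()  # ASCII-only lowering, so offsets stay aligned
    for token in tokens:
        needle = token.lower()
        if not needle:
            continue
        start = lowered.find(needle)
        while start != -1:
            end = start + len(needle)
            buf[start:end] = b" " * len(needle)  # same length: offsets stay valid
            start = lowered.find(needle, end)
    return bytes(buf)
```

Keeping the byte count constant is what lets box sizes and iloc offsets remain
valid without rewriting the container.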
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A 2026-06-14 oracle re-test on the deployed Modal controlnet worker (v0.10.0)
cleared SynthID at OpenAI 0.10 (2 photoreal) and Google 0.15 (2 native
2816x1536, retiring the "native >= 0.30" guess), while a pixel sweep showed the
2026-06-04 cert floors (0.20/0.30) over-regenerated for no efficacy gain
(Google MAE -20% at 0.15). Lowers OPENAI_STRENGTH 0.20->0.10, GEMINI_STRENGTH
and UNKNOWN_STRENGTH 0.30->0.15.
Caveats documented in watermark_profiles.py + docs: removal near this floor is
seed-non-deterministic (a service must pin a verified seed), and the n=2 re-test
did not cover flat-graphic hard cases.
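A sketch of the lowered default ladder, assuming a simple vendor lookup with an
explicit override. The three constants and their values come from this change;
the lookup table, the vendor keys and the resolve_strength_sketch signature are
assumptions for illustration:

```python
from typing import Optional

OPENAI_STRENGTH = 0.10   # lowered from 0.20
GEMINI_STRENGTH = 0.15   # lowered from 0.30
UNKNOWN_STRENGTH = 0.15  # lowered from 0.30

# Hypothetical vendor keys; the real profile names may differ.
_VENDOR_STRENGTH = {"openai": OPENAI_STRENGTH, "gemini": GEMINI_STRENGTH}


def resolve_strength_sketch(vendor: Optional[str],
                            explicit: Optional[float] = None) -> float:
    """Pick the img2img strength: explicit value wins, else the vendor floor."""
    if explicit is not None:
        return explicit
    return _VENDOR_STRENGTH.get((vendor or "").lower(), UNKNOWN_STRENGTH)
```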
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>