remove-ai-watermarks

mirror of https://github.com/wiltodelta/remove-ai-watermarks.git synced 2026-07-05 07:57:50 +02:00

Author	SHA1	Message	Date
Victor Kuznetsov	7dddfef14e	docs: certify qwen scrub floors (OpenAI 0.10 seed-robust, Gemini 0.25) Oracle seed-repeat + floor refinement (2026-06-20, data/qwen_in): - OpenAI floor 0.10 is SEED-ROBUST: 0.05 and 0.075 still detected; 0.10 clean on seeds 0-4 (5/5) -> a random seed is safe. - Gemini floor lowered 0.30 -> 0.25 (0.20 still detected, 0.25 clean on both images). Single-seed (seed 0): the Gemini oracle rate-limits volume seed-repeat, so pin a seed in prod rather than relying on seed-robustness there. Re-measured fidelity at the certified floors (controlnet 0.15 vs Qwen 0.25 for Gemini): faces still favor controlnet (ArcFace 0.546 vs 0.382, lapvar 0.62 vs 0.40); the short-CJK text case is now a TIE (gemini_1 0.037 vs 0.037 -- the earlier Qwen 0.000 was at 0.30, not the floor). Qwen's text win holds on substantial Latin/mixed text (OpenAI 0.385 vs 0.241 / 0.341 vs 0.290). Update watermark_profiles comment, CLAUDE.md, module-internals, known-limitations. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-20 15:16:51 -07:00
Victor Kuznetsov	373b910a60	docs: fix the qwen-vs-controlnet face comparison to oracle-confirmed scrub floors The face fidelity numbers cited an equal-strength compare (both 0.15), but Qwen at 0.15 does NOT clear Gemini SynthID -- so that output is un-scrubbed and the compare is invalid. Per the methodology rule (compare fidelity only between outputs where SynthID is removed in BOTH), restate faces at each pipeline's scrub floor (controlnet 0.15 / Qwen 0.30): ArcFace identity 0.546 vs 0.331, lapvar 0.62 vs 0.40, face LPIPS 0.09 vs 0.19 -- controlnet still wins faces, conclusion unchanged. Drop the "equal strength" framing in CLAUDE.md / module-internals / known-limitations. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-20 14:33:11 -07:00
Victor Kuznetsov	2d5b26ed18	test(eval): vision-transcribed ground truth for qwen_in + clean text-CER numbers data/qwen_in/ground_truth.json is transcribed by vision (PaddleOCR mangled the stylized Cyrillic), so the text metric scores variants against an accurate reference instead of noisy OCR-vs-OCR. Re-measured text CER (controlnet vs qwen) with this ground truth confirms qwen wins text across EN/RU/ZH: openai_1 0.385 vs 0.241, openai_2 0.341 vs 0.290, gemini_1 (ZH) 0.037 vs 0.000 (perfect Chinese even at the higher 0.30 strength). Faces still favor controlnet. Refresh the numbers in docs/known-limitations.md to this cleaner methodology. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-20 14:26:23 -07:00
Victor Kuznetsov	a2c33af284	feat(scripts): fidelity_metrics.py + correct the qwen-vs-controlnet claim Add scripts/fidelity_metrics.py: an objective eval harness comparing watermark-removal outputs against the original (reference) across four groups -- OCR character error rate (EasyOCR), ArcFace identity cosine (insightface), face texture (LPIPS + Laplacian-variance ratio), and whole-image LPIPS/SSIM/PSNR. PEP 723 inline deps so it stays out of the package / uv.lock; metrics self-gate (faces only where faces, text only where text). The metrics overturned an eyeball conclusion: at EQUAL strength Qwen beats controlnet on TEXT (OpenAI typography 0.10: OCR CER 0.25 vs 0.37) but controlnet beats Qwen on FACES (gemini_3, 18 faces, 0.15 each: Laplacian-variance retention 0.62 vs 0.41, face LPIPS 0.09 vs 0.13 -- Qwen smooths faces MORE; ArcFace identity ~tied). So Qwen is the better TEXT-preserving remover, not a universal fidelity win. Correct the earlier "qwen keeps faces faithful where controlnet plasticizes" claim in CLAUDE.md, module-internals.md, known-limitations.md, README. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-20 09:58:22 -07:00
Victor Kuznetsov	76e3d4154c	feat(invisible): add Qwen-Image img2img pipeline (--pipeline qwen) A third diffusion pipeline alongside sdxl/controlnet: Qwen-Image (20B MMDiT, Apache-2.0 code AND weights) img2img. The scrub still comes from the img2img strength; Qwen preserves text (incl. CJK) and structure markedly better than SDXL at the scrub floor, so it over-regenerates real photos far less (directly targets the controlnet over-regeneration that degrades real uploads). - watermark_profiles: QWEN_MODEL_ID, normalize_profile accepts "qwen". - WatermarkRemover: _load_qwen_pipeline (bf16, loads Qwen base unless --model overridden, clear ImportError if diffusers lacks the class), _run_qwen (no MPS fallback -- 20B is CUDA/cloud-class), dispatch in _generate_one/preload, pure _build_qwen_kwargs (true_cfg_scale, not guidance_scale). - Shared _base_load_kwargs() across all three loaders (dtype + token). - CLI --pipeline gains "qwen"; invisible_engine threads it through. - scripts/qwen_scrub_prototype.py: standalone PEP 723 GPU experiment. Prototype oracle floors (Modal A100-80GB, single seed, controls SynthID-positive, PENDING seed-repeat cert): OpenAI clears at strength ~0.10, Gemini at ~0.30 (0.20 still detected), with CJK text + faces faithful where controlnet plasticizes. The Gemini floor is higher than the shared default ladder, so pass an explicit --strength for Gemini on this pipeline until a Qwen-specific ladder is certified. The model-running path is CUDA-only (untestable locally); unit tests cover the pure call-shape (_build_qwen_kwargs) and profile normalization without torch. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 20:44:36 -07:00
Victor Kuznetsov	0c0c6c6b03	feat(invisible): sliding-window tiled diffusion for large inputs (--tile) Add a lossless alternative to the --max-resolution downscale for large images that OOM on MPS/GPU: regenerate in overlapping, feather-blended tiles at native resolution. - noai/tiling.py: pure plan_tiles (uniform tiles, last flush to edge) + feather_weights (strictly-positive separable taper -> partition-of-unity blend) + run_tiled (per-tile generate callable, decoupled from the pipeline). Unit-tested without the model. - WatermarkRemover.remove_watermark: refactor _generate into _generate_one + a tiled branch that engages only when --tile is set and the long side exceeds tile_size (ControlNet canny is rebuilt per tile). - Thread tile/tile_size/tile_overlap through InvisibleEngine and the invisible/all/batch CLI commands via a shared _tile_options decorator. Verified end-to-end on the real SDXL pipeline (forced 2x2 tiling on a 1024px sample, MPS): non-degenerate output, no gross seam at tile borders. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 11:54:58 -07:00
Victor Kuznetsov	d5845a72f3	feat(metadata): blank AI-generator tokens in AVIF/HEIF Exif meta-box items Closes a documented coverage gap (P2#9): an AI Software/Make/Artist/ImageDescription token in an EXIF item (its TIFF bytes live in mdat/idat) survived remove_ai_metadata because the top-level box stripper and (absent pillow-heif) the PIL EXIF reader can't reach it. New isobmff.blank_ai_exif_tokens finds EXIF TIFF blocks by their II/MM byte-order header, validates each with piexif (a coincidental II/MM run in pixels won't parse as a TIFF IFD, so it's ignored), and overwrites any AI_GENERATOR_TOKENS-bearing value with same-length spaces -- so box sizes and iloc offsets stay valid and the coded image is untouched (mirrors blank_ai_xmp_packets; no iinf/iloc surgery, no exiftool dep). Camera/editor EXIF without an AI token is preserved. Wired into remove_ai_metadata's ISOBMFF path. Covers the realistic AI-generator-token case; xAI-signature-in-meta-box-EXIF (Grok is JPEG-only) stays out. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 10:43:35 -07:00
Victor Kuznetsov	4c6b56f888	lower(strength): drop vendor-adaptive floor to OpenAI 0.10 / Google 0.15 A 2026-06-14 oracle re-test on the deployed Modal controlnet worker (v0.10.0) cleared SynthID at OpenAI 0.10 (2 photoreal) and Google 0.15 (2 native 2816x1536, retiring the "native >= 0.30" guess), while a pixel sweep showed the 2026-06-04 cert floors (0.20/0.30) over-regenerated for no efficacy gain (Google MAE -20% at 0.15). Lowers OPENAI_STRENGTH 0.20->0.10, GEMINI_STRENGTH and UNKNOWN_STRENGTH 0.30->0.15. Caveats documented in watermark_profiles.py + docs: removal near this floor is seed-non-deterministic (a service must pin a verified seed), and the n=2 re-test did not cover flat-graphic hard cases. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-14 13:17:11 -07:00
Victor Kuznetsov	9feea4ac1e	Slim CLAUDE.md: move module internals, limitations, landscape research to docs Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 15:50:03 -07:00
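
Commit 0c0c6c6b03 describes sliding-window tiling: uniform tiles with the last one flush to the edge, a strictly positive feather taper, and a partition-of-unity blend. The sketch below illustrates that idea in one dimension only. The names plan_tiles, feather_weights and run_tiled come from the commit message, but the signatures, the triangular taper, and the 1D simplification are assumptions made here for illustration, not the contents of noai/tiling.py. Because the commit calls the taper separable, a 2D weight would presumably be the outer product of two such 1D tapers.

```python
from typing import Callable, Sequence


def plan_tiles(length: int, tile: int, overlap: int) -> list[tuple[int, int]]:
    """Return (start, end) spans of equal size covering [0, length).

    The last span is shifted back so it ends exactly at the edge.
    """
    if length <= tile:
        return [(0, length)]
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile flush to the edge
    return [(s, s + tile) for s in starts]


def feather_weights(n: int) -> list[float]:
    """Triangular taper that is strictly positive at every sample (assumed shape)."""
    return [float(min(i + 1, n - i)) for i in range(n)]


def run_tiled(
    signal: Sequence[float],
    tile: int,
    overlap: int,
    generate: Callable[[Sequence[float]], Sequence[float]],
) -> list[float]:
    """Run `generate` per tile and blend the results with normalized feather weights."""
    n = len(signal)
    acc = [0.0] * n
    wsum = [0.0] * n
    for start, end in plan_tiles(n, tile, overlap):
        out = generate(signal[start:end])
        weights = feather_weights(end - start)
        for k, (value, weight) in enumerate(zip(out, weights)):
            acc[start + k] += weight * value
            wsum[start + k] += weight
    # Dividing by the summed weights makes the effective weights sum to 1
    # at every sample, which is the partition-of-unity property.
    return [a / s for a, s in zip(acc, wsum)]
```

With an identity `generate`, the blended output reproduces the input up to floating-point error, which is a quick way to check that the blend itself introduces no seams.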

9 Commits
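
Commit d5845a72f3 relies on overwriting AI-generator tokens with same-length spaces so that box sizes and iloc offsets stay valid. The sketch below shows only that length-preserving overwrite on a raw byte string. The function name blank_tokens and the token list are hypothetical; the real isobmff.blank_ai_exif_tokens, per the commit message, also locates TIFF blocks by their II/MM header, validates them with piexif, and blanks the whole token-bearing value, none of which is reproduced here.

```python
def blank_tokens(payload: bytes, tokens: list[bytes]) -> bytes:
    """Overwrite each case-insensitive token match with spaces of equal length.

    Hypothetical helper: it blanks only the matched token, not the whole
    EXIF value, and does no TIFF parsing.
    """
    buf = bytearray(payload)
    lowered = payload.lower()  # ASCII-only lowering, so offsets are unchanged
    for token in tokens:
        needle = token.lower()
        if not needle:
            continue  # an empty token would match everywhere
        pos = lowered.find(needle)
        while pos != -1:
            buf[pos:pos + len(needle)] = b" " * len(needle)
            pos = lowered.find(needle, pos + len(needle))
    return bytes(buf)


# Example bytes are made up for illustration; they are not a valid EXIF block.
sample = b"Software\x00DALL-E 3\x00Make\x00Canon\x00"
cleaned = blank_tokens(sample, [b"dall-e"])
```

Since the output has exactly the same length as the input, any offsets that point past the edited bytes remain correct, which is the property the commit depends on.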