Files
remove-ai-watermarks/CLAUDE.md
T
test-user 626f43aec9 docs: correct SynthID spectral-carrier understanding
Deeper re-examination (2026-05-25) of github.com/aloshdenny/reverse-SynthID
on our own data corrects the earlier over-stated dead-end:

- The carrier IS real on solid fills -- measured via per-bin PHASE
  COHERENCE (the prior probe used spatial/FFT-magnitude NCC, which can't
  see a fixed-phase carrier). White gemini-2.5-flash fills: coherence 0.86
  at carriers (0,+/-7..12,20..23) vs 0.31 random; single-image phase-match
  +0.83 vs -0.24 for real photos.
- But it does not generalize: carriers are model-version/resolution/color
  specific (v4 codebook for 3.1-flash/nb-pro scores ~0.5 on 2.5-flash),
  and collapse on real content (coherence ~random; v4 content 0.518 vs
  neg 0.504, no separation).

Net: a controlled-fill characterizer, not a real-content detector.
Metadata proxy + visible sparkle + online oracles remain the ceiling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 08:57:35 -07:00

15 KiB

Remove-AI-Watermarks

You are a principal Python engineer maintaining a CLI tool and library for removing visible and invisible AI watermarks from images.

How to run

  • uv run remove-ai-watermarks all <image.png> -o <output.png>
  • uv run remove-ai-watermarks identify <image> — provenance verdict (platform + watermark inventory + confidence); --json for machine output, --no-visible to skip the cv2 sparkle detector
  • uv run remove-ai-watermarks metadata <image.png> --check — inspect AI metadata (C2PA, EXIF, PNG chunks)
  • uv run remove-ai-watermarks metadata <image.png> --remove -o <out.png> — strip all AI metadata

Test and lint

  • bash maintain.sh — uv-outdated, uv-secure, ruff check/fix, ruff format, pyright, pytest -n auto
  • maintain.sh does not currently finish green (pre-existing, not per-change): uv-secure aborts on a fixable transitive idna vuln, and strict pyright carries debt in remove_ai_metadata / cli.py (untyped piexif/PIL/click/rich). To gate a change, run uv run ruff check, uv run pyright <changed files>, uv run pytest directly.
  • Run uv run from the repo root — from another cwd it falls back to a bare env without numpy/cv2/torch.
  • Metadata/C2PA tests assert against real committed fixtures in data/samples/ (chatgpt-*.png = OpenAI C2PA, firefly-1.png = Adobe, not-ai-* = clean); synthetic byte blobs cover the JPEG/ISOBMFF format paths.
  • SynthID reference corpus: scripts/synthid_corpus.py ingests labeled images into data/synthid_corpus/ (manifest.csv tracked, images/ gitignored); see its README for the collection protocol and verification oracles.

Configuration

  • GPU/ML modules (invisible_engine, ctrlregen, watermark_remover) are optional — guard imports with is_available() checks
  • Tests for ML modules are limited to availability checks (require multi-GB downloads)

Key modules

  • noai/c2pa.py — PNG chunk parser; use extract_c2pa_chunk(path) to get raw caBX payload, has_c2pa_metadata(path) to detect. Do not reimplement chunk parsing. extract_c2pa_info(path) sets synthid_watermark/synthid_vendors when the manifest is signed by a SynthID-using vendor.
  • noai/constants.py — PNG_SIGNATURE, C2PA_CHUNK_TYPE, C2PA_SIGNATURES, C2PA_ISSUERS, and SYNTHID_C2PA_ISSUERS (issuers that pair SynthID with C2PA: Google, OpenAI). Add a new issuer here, not inline.
  • metadata.pysynthid_source(path) returns the vendor name(s) if the C2PA manifest implies a SynthID pixel watermark, else None. Format-agnostic: PNG via the caBX parser, JPEG/WebP/AVIF/HEIF/JXL via a binary scan (C2PA marker + SynthID issuer + AI-source marker). get_ai_metadata surfaces the verdict, and metadata --check prints it as a callout. Both get_ai_metadata and has_ai_metadata guard the PIL open with except Exception (HEIC/unknown formats raise non-OSError) and fall through to the binary scan.
  • identify.pyidentify(path) aggregates every locally-readable signal (C2PA issuer→platform, IPTC "Made with AI", embedded SD/ComfyUI params, SynthID proxy, visible Gemini sparkle) into one ProvenanceReport. is_ai_generated is True or None (never asserted False — stripped metadata is not proof of clean origin). Visible-sparkle is promoted only at confidence ≥ _SPARKLE_THRESHOLD (0.5; corpus-tuned to separate Gemini sparkles ≥0.56 from non-sparkle ≤0.49). The cv2 dependency lives in gemini_engine.detect_sparkle_confidence, not here. Add platform mappings to _ISSUER_PLATFORM, not inline. For non-PNG containers (JPEG/WebP/AVIF/HEIF/JXL) the caBX parser returns nothing, so issuer (_issuers_in) and generator (_ai_tools_in, reusing C2PA_AI_TOOLS) are recovered by binary-scanning the first MB. EXIF Software / Make / Artist / ImageDescription and XMP CreatorTool generator tags are read by metadata.exif_generator (PIL+piexif for any format PIL opens incl. AVIF, plus a container-agnostic XMP raw-byte scan that also covers HEIF/JXL), matched against AI_GENERATOR_TOKENS so ordinary editors (plain "Adobe Photoshop") and real-camera Make ("Apple"/"Canon") are not flagged. Ideogram tags its output with EXIF Make="Ideogram AI" (verified on a real download 2026-05-24) — that's why Make is read.
  • gemini_engine.py — visible Gemini-sparkle remover/detector (cv2/numpy, no GPU). detect_sparkle_confidence(path) is the file-level entry point used by identify.py.
  • invisible_watermark.pydetect_invisible_watermark(path) decodes the OPEN DWT-DCT watermarks (public decoder, no key) embedded by Stable Diffusion / SDXL / FLUX via the imwatermark library. Known fixed patterns (verified against upstream source) live in _BITS_48 (SDXL 48-bit, FLUX.2 48-bit) and _SD1_STRING ("StableDiffusionV1", SD 1.x/2.x). Optional dep (extra detect); returns None when absent. Unlike SynthID this is locally detectable, but the watermark is fragile (does not survive JPEG re-encode/resize — verified gone after JPEG q90), so it confirms origin only on pristine files. Add new known patterns here. The file carries a top-of-module pyright pragma because imwatermark/cv2 ship no type stubs.
  • face_protector.py — YOLO detect + soft-blend pattern; mirror this for any "protect region during diffusion" features

Watermarking landscape (research 2026-05-24)

Who embeds what, and whether it is locally detectable (so we know which gaps are fillable). See identify.py for what we read.

  • Locally detectable (open decoder, no key/API): Stable Diffusion / SDXL / FLUX via imwatermark DWT-DCT (now covered by invisible_watermark.py). FLUX uses the same library (black-forest-labs/flux2 src/flux2/watermark.py, 48-bit 0b001010101111111010000111100111001111010100101110); SDXL is the diffusers WATERMARK_MESSAGE (0b101100111110110010010000011110111011000110011110). Caveat: fragile to re-encoding.
  • C2PA / IPTC (covered by the issuer/marker scan): OpenAI, Google, Adobe Firefly, Microsoft (Designer + Bing Image Creator — collected 2026-05-24; Bing now runs Microsoft's own MAI-Image model, signs C2PA as "Microsoft", NOT OpenAI/DALL-E), and Stability AI (collected from Brand Studio / DreamStudio successor; signs C2PA as "Stability AI Ltd", no SynthID, no imwatermark on its current Stable Image model — issuer added to C2PA_ISSUERS). Still unsampled: Canva (its downloads are re-encoded design exports that strip C2PA, so a Canva "positive" is inconclusive — skipped), Getty, Shutterstock. Midjourney embeds NO C2PA and no invisible watermark (our mj-* sample carried only the IPTC tag).
  • EXIF/XMP generator tag (caught by exif_generator): Ideogram writes EXIF Make="Ideogram AI" (collected 2026-05-24 — no C2PA, no SynthID, no imwatermark; the Make tag is the only signal).
  • No detectable signal on download (correctly reported unknown): Recraft (PNG export is a re-encoded design export — strips everything), Krea hosting FLUX 2 (no imwatermark despite FLUX — the host omits the encoder, same as Stability's hosted SDXL), and Midjourney (embeds nothing). Lesson: the imwatermark detector only fires on pristine output from a pipeline that runs the encoder (diffusers default, official BFL), not from re-hosts (Krea/Stability) or re-encoded exports (Recraft/Canva).
  • Invisible but NOT locally detectable (proprietary, API/oracle only — same wall as SynthID): Amazon Titan Image Generator + Nova Canvas (Bedrock DetectGeneratedContent API), Kakao (new SynthID image adopter, May 2026), NVIDIA Cosmos (SynthID video). No local detector possible; treat like SynthID.

Known limitations

  • invisible pipeline downscales to model-native resolution (1024 px for SDXL) before diffusion. Degrades fine text in infographics. Tracked; fix is tile-based diffusion.
  • Pyright first run is slow (2-3 min) due to ML deps (torch/diffusers/transformers stubs); full-project uv run pyright can stall for many minutes — scope it to changed files.
  • ultralytics monkey-patches PIL.Image.open and tries to autoload pi_heif. When pi_heif is missing, opening files raises ModuleNotFoundError, not UnidentifiedImageError. Code that opens user-supplied or unknown-format files should except Exception, not just OSError/UnidentifiedImageError.
  • Metadata detection for AVIF/HEIF/JPEG-XL relies on a binary scan for C2PA_UUID + IPTC_AI_MARKERS, plus EXIF Software / XMP CreatorTool generator tags via metadata.exif_generator (validated with synthesized AVIF/JPEG fixtures + an XMP raw-scan fixture). C2PA removal in those containers is implemented via noai/isobmff.py (top-level uuid / jumb box stripper, no re-encoding). EXIF/XMP boxes inside those containers are read for detection but not yet scrubbed on removal.
  • SynthID detection is metadata-only. There is no reliable local detector of the SynthID pixel watermark — Google's decoder is proprietary, no public spec or API (only a waitlisted portal). We detect SynthID by its C2PA companion (synthid_source / SYNTHID_C2PA_ISSUERS), which is reliable while the manifest is intact but says nothing once C2PA is stripped. Surface-dependent blind spot (verified 2026-05-24): the same Google model emits different metadata per surface -- the Gemini app wraps outputs in Google C2PA, but the API/playground (AI Studio, Nano Banana / gemini-2.5-flash-image) emits the SynthID pixel watermark (confirmed via the Gemini-app oracle) + the visible sparkle but no C2PA/IPTC at all, so synthid_source returns None despite SynthID being present. Only the pixel oracle or the visible-sparkle detector catches those. (Meta AI is another surface mismatch: it writes the IPTC digitalSourceType=trainedAlgorithmicMedia marker, not C2PA and not SynthID.) Google→SynthID is long-standing; OpenAI→SynthID is confirmed by OpenAI's Help Center (ChatGPT/Codex/API "include both C2PA metadata and SynthID watermarks", updated 2026-05-21) but time-gated (pre-rollout OpenAI images carry C2PA without SynthID), so the OpenAI verdict is hedged "likely". Oracles: Gemini app "Verify with SynthID" (Google), openai.com/verify (OpenAI). The spectral phase-coherence approach from github.com/aloshdenny/reverse-SynthID was evaluated (May 2026) and does not work for real-content detection: on its own shipped codebook + validation set, watermarked and cleaned images were indistinguishable (conf within noise, cleaned often higher); it only fires on pure-black 1024x1024 reference images at exact resolution (the controlled case it was calibrated on). The README's "90% / conf=0.91" reproduces only in that lab condition. Do not build a production detector on it; if revisited, it is experimental/diagnostic only and needs a per-resolution, per-model reference corpus. A from-scratch gpt-image pilot (2026-05-24) confirmed this independently: 5 independent solid-black gpt-image outputs share a near-identical fixed signature (pairwise residual correlation 0.92, avg-template retains 97% energy), so the watermark/carrier IS strongly present and consistent on flat content — but the carrier frequencies extracted from it do NOT discriminate real content (carrier-to-random ratio: cleaned 1.86 > watermarked 1.53; a non-gpt-image image scored highest at 3.67). The signature drowns in content texture. Net: a perfectly consistent solid-color signature still yields no real-content pixel detector with magnitude/carrier methods. A corpus discrimination test (2026-05-24, scripts/synthid_pixel_probe.py, raw zero-mean residual NCC) independently re-confirms this: at matched resolution, SynthID positives do NOT cluster apart from negatives (within-Gemini 0.07; at 1024 px pos-vs-neg >= pos-vs-pos). The only high correlations were near-duplicate content (5 ChatGPT renders of one prompt at ~0.92, while a distinct ChatGPT image scored ~0 against them) — content, not a carrier. The probe is solid-fills-only and EXPERIMENTAL/DIAGNOSTIC; do not use it on real content. Correction (deeper re-examination 2026-05-25): the carrier IS real on solid fills — the earlier "no carrier" was a method artifact of using spatial / FFT-magnitude NCC, which can't see it. The carrier is a fixed phase at specific low frequencies, so the right metric is per-bin phase coherence. On 8 white gemini-2.5-flash-image fills (generated via the reverse-SynthID trick: identity-edit prompt "Recreate this image exactly as it is" on a synthetic pure-white PNG — this bypasses the recitation block that rejects text prompts for pure colors), phase coherence at the white carriers (0,±7..±12,±20..±23) = 0.86 vs 0.31 random; single-image leave-one-out phase-match +0.83 vs real photos -0.24. (Black 2.5-flash fills clip to std≈0 — SynthID can't push values below 0, so no carrier in black; the repo's dark carriers come from nano-banana-pro.) But it does not generalize: (a) carriers are model-version + resolution + color specific — the repo's v4 codebook (built for gemini-3.1-flash-image-preview + nano-banana-pro-preview) scores ~0.527 on my 2.5-flash white fills, indistinguishable from negatives (~0.50), i.e. carriers shift across model versions and need a per-model codebook; (b) on real content (30 2.5-flash images) the carrier collapses — set phase coherence at carriers 0.37 ≈ random 0.42, and the repo's v4 detector gives content 0.518 ≈ negatives 0.504 (no separation; a faint +0.24 single-image lean is likely a brightness confound). Net: the spectral/phase approach is a real controlled-fill characterizer, NOT an arbitrary-real-content detector, and is brittle to model version. Metadata proxy + visible sparkle + online oracles remain the ceiling for real content.
  • External AI-vs-real classifier models are out of scope (decided 2026-05-24). Generic HuggingFace detectors (Organika/sdxl-detector Swin Transformer, umm-maybe/AI-image-detector, and fine-tunes) exist and report ~0.98 on their own SDXL-vs-real validation sets, but they are per-generator and the model cards themselves note degraded accuracy off-distribution; they are untested on gpt-image / Gemini Nano Banana (the metadata-stripped surfaces we care about), and our own light SDXL pass would likely defeat them the same way it defeats SynthID. Detection here stays local + signal-based (metadata + visible sparkle); do not add a bundled classifier dependency.
  • SynthID v2 vs default pipeline: the SDXL-based default profile (since May 2026) defeats SynthID v2. Verified end-to-end (May 2026): local SDXL run on a Gemini 3 Pro output, checked via the Gemini app's "Verify with SynthID" feature, returned "no SynthID watermark detected". Also confirmed against OpenAI's SynthID (2026-05-23): a fresh ChatGPT/gpt-image output read "SynthID detected" on openai.com/verify before the local SDXL run and "SynthID not detected" after (corpus regression chain: pos 4ef377bd -> cleaned 47188e88). The same configuration is used in raiw-app production (fal-ai/fast-sdxl at native ~1024 px, strength 0.05, steps 50). SD-1.5 dreamshaper at 768 px was previously the default and does NOT defeat v2 — verified empirically against the same feature (strength 0.04, 0.10, and elastic warp α∈{5,8} all flagged positive). That SD-1.5 path was removed; only default (SDXL) and ctrlregen profiles remain.