mirror of https://github.com/wiltodelta/remove-ai-watermarks.git synced 2026-07-05 07:57:50 +02:00

Victor Kuznetsov 3f5d6a0af1 docs(landscape): back the DWT-DCT positive-only limitation with researched root cause + citations

Deep-research (2026-06-19, adversarially verified) confirms the open imwatermark
dwtDct mark is fragile by scheme, not by our usage: maintainers admit no 100%
clean-decode guarantee; measured ~0.79 bit accuracy clean (~38/48, below our 44
gate). Root causes (code-verified + locally reproduced): per-block max-coefficient
bit read (content flips bits) and YUV chroma 8-bit clamping on bright pixels (the
bright-flat / all-ones failure). No maintained fork or detector does this scheme
reliably (WAVES relegates it to an appendix; learned schemes are a different class;
dwtDctSvd cannot decode SDXL's dwtDct). Conclusion: keep it positive-only, rely on
C2PA. Sources: imwatermark READMEs, arXiv:2406.08337 (WMAdapter), arXiv:2401.08573
(WAVES), diffusers SDXL watermark.py.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-19 10:27:08 -07:00


Watermarking landscape (research 2026-05-24)

Relocated verbatim from CLAUDE.md on 2026-06-11 to keep the always-loaded context small. Long single-line entries were reformatted into paragraphs; no content was changed or summarized.

Who embeds what, and whether it is locally detectable (so we know which gaps are fillable). See identify.py for what we read.

Locally detectable (open decoder, no key/API): Stable Diffusion / SDXL / FLUX via imwatermark DWT-DCT (now covered by invisible_watermark.py). FLUX uses the same library (black-forest-labs/flux2 src/flux2/watermark.py, 48-bit 0b001010101111111010000111100111001111010100101110); SDXL is the diffusers WATERMARK_MESSAGE (0b101100111110110010010000011110111011000110011110).

Caveat: the imwatermark dwtDct decode is carrier-fragile on a broad class of real images, NOT just re-encode-fragile, and it is a POSITIVE-ONLY signal. A clean encode->decode round-trip (no re-encode at all) recovers 48/48 bits on some carriers (random noise, chatgpt-1.png 48/48, firefly-1.png 45/48) but FAILS on many others: verified 2026-06-19 that a known-embedded watermark only round-trips 28-39/48 (below the safe _MATCH_48 = 44 gate, random baseline ~24) on the FLUX fox sample (28), doubao-1.png (39), a 1024² minimalist-flat FLUX image (28), AND a clean synthetic bright-flat fill with NO watermark at all (28). The failure does NOT track texture (firefly lapvar ~11 passes; the flat FLUX lapvar ~56 fails); it correlates with a degenerate decode where the raw bits read all-ones (48/48 ones), which a clean synthetic image reproduces, so all-ones is a CARRIER ARTIFACT, NOT a watermark signal. (A double-embed test also showed a pre-existing embed does not corrupt a second embed: no interference.)

Net: trust a detect_invisible_watermark hit, but treat a None/no-match as inconclusive whenever a positive-control embed on the same carrier does not first recover >=44/48. The 44 gate is a deliberate precision choice (lowering it would admit false positives).
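The positive-only rule above can be sketched as a three-way verdict. This is a minimal illustration, not the repository's code: the helper names (`to_bits`, `match_count`, `verdict`) are hypothetical, and only the two 48-bit patterns and the 44 gate are taken from the notes.

```python
# Hypothetical helpers; only the bit patterns and the 44 gate come from the notes.
FLUX_BITS_48 = 0b001010101111111010000111100111001111010100101110
SDXL_BITS_48 = 0b101100111110110010010000011110111011000110011110
MATCH_GATE = 44  # mirrors the _MATCH_48 gate described above


def to_bits(value: int, width: int = 48) -> list[int]:
    """Expand an integer into a fixed-width, MSB-first list of 0/1 bits."""
    return [int(ch) for ch in format(value, f"0{width}b")]


def match_count(decoded: list[int], expected: list[int]) -> int:
    """Count the positions where decoded bits agree with the expected pattern."""
    if len(decoded) != len(expected):
        raise ValueError("bit lists must have the same length")
    return sum(d == e for d, e in zip(decoded, expected))


def verdict(decoded: list[int], expected: list[int], control_score) -> str:
    """Return 'match', 'no-match', or 'inconclusive'.

    control_score is the match count of a positive-control embed on the
    same carrier, or None if no control was run.
    """
    if match_count(decoded, expected) >= MATCH_GATE:
        return "match"  # a hit is trusted regardless of the control
    if control_score is None or control_score < MATCH_GATE:
        return "inconclusive"  # the carrier cannot hold the mark reliably
    return "no-match"
```

In this sketch an all-ones decode agrees with the FLUX pattern in 28 positions, because that pattern contains 28 one-bits, which lines up with the 28/48 failures reported above.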

Root cause and external confirmation (deep-research 2026-06-19, adversarially verified). This is the SCHEME's ceiling, not our usage; there is no better decoder to adopt. The imwatermark maintainers state verbatim (both the ShieldMnt and Stability-AI READMEs) that the algorithm "cannot guarantee to decode the original watermarks 100% accurately even though we don't apply any attack." Independent measurement (WMAdapter, arXiv:2406.08337 Table 2) puts dwtDct at only ~0.79 bit accuracy on CLEAN images (~38/48 bits, already below our 44 gate), collapsing to ~0.50 (chance) under crop/JPEG.

Two code-verified + locally-reproduced mechanisms drive the content-dependent failures: (1) the decoder reads each bit from the highest-magnitude DCT coefficient per block, so any content coefficient exceeding the encoded target flips the bit; (2) the default embed is in the YUV chroma channel, which 8-bit-clamps on white/bright pixels (a +36 chroma delta survives a white-fill round-trip as only +4, ~89% loss). Mechanism (2) is behind the bright-flat / minimalist failures and the all-ones degenerate decode.

No maintained fork or detector decodes this scheme reliably: the WAVES benchmark (arXiv:2401.08573) relegates DWT-DCT to supplementary appendix G.5 and targets Stable Signature / Tree-Ring / StegaStamp instead; learned encoder/decoder schemes reach ~0.98-0.99 clean but are a DIFFERENT watermark class (not what SDXL/FLUX stamp). dwtDctSvd does not help (SDXL embeds dwtDct; dwtDctSvd cannot decode it, and its clean accuracy ~0.72 is lower). Authoritative conclusion: the open DWT-DCT mark cannot be turned from positive-only into a reliable real-world detector; keep it positive-only and rely on C2PA. (Refuted along the way: that the library is unmaintained, and that it is robust to JPEG but only fails on geometric attacks; neither claim survived verification.)
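The chroma-clamping mechanism can be illustrated with a pure-Python round-trip. This is a sketch under stated assumptions, not imwatermark's code: it uses BT.601-style analog YUV coefficients as a stand-in for the library's color conversion, and the function names are hypothetical.

```python
def clamp8(x: float) -> int:
    """Round and clamp to the 8-bit range, as an 8-bit image buffer would."""
    return max(0, min(255, round(x)))


def rgb_to_yuv(r: int, g: int, b: int) -> tuple[int, int, int]:
    # Assumed BT.601-style coefficients with chroma offset 128.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y) + 128
    v = 0.877 * (r - y) + 128
    return clamp8(y), clamp8(u), clamp8(v)


def yuv_to_rgb(y: int, u: int, v: int) -> tuple[int, int, int]:
    r = y + 1.140 * (v - 128)
    g = y - 0.395 * (u - 128) - 0.581 * (v - 128)
    b = y + 2.032 * (u - 128)
    # Clamping here is where a bright pixel loses the chroma push.
    return clamp8(r), clamp8(g), clamp8(b)


def surviving_u_delta(rgb: tuple[int, int, int], delta: int) -> int:
    """Add delta to U, store as 8-bit RGB, re-read U, and report what is left."""
    y, u, v = rgb_to_yuv(*rgb)
    r, g, b = yuv_to_rgb(y, u + delta, v)
    _, u_after, _ = rgb_to_yuv(r, g, b)
    return u_after - u


white_loss = surviving_u_delta((255, 255, 255), 36)
gray_loss = surviving_u_delta((128, 128, 128), 36)
```

Under these assumed coefficients, the white pixel keeps only a small fraction of the +36 delta because its blue channel is already at 255 and cannot rise further, while the mid-gray pixel keeps nearly all of it.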

Consequence for the FLUX hosted-output question (BFL Playground, FLUX.2 [pro] + FLUX.1 [dev], 2026-06-19): all samples carry the signed C2PA manifest (issuer "Black Forest Labs"); the open DWT-DCT decode returned None, but every available FLUX carrier (textured fox AND a minimalist-flat generation) failed the positive control (28/48), so the detector is blind on them and whether BFL hosted output embeds the open pixel watermark is UNRESOLVED (an earlier note here wrongly asserted it absent — overstated; a later note blamed "high texture" — also wrong, flat carriers fail too). What IS established: C2PA is the reliable FLUX identifier; the _BITS_48 pattern is correct (round-trips on chatgpt/firefly/random). Resolving the hosted question needs a hosted FLUX carrier that first passes a >=44/48 positive control, which neither a textured nor a flat prompt produced — low priority (the open mark is only a stripped-metadata fallback).
C2PA / IPTC (covered by the issuer/marker scan): OpenAI, Google, Adobe Firefly, Microsoft (Designer + Bing Image Creator — collected 2026-05-24; Bing now runs Microsoft's own MAI-Image model, signs C2PA as "Microsoft", NOT OpenAI/DALL-E), Stability AI (collected from Brand Studio / DreamStudio successor; signs C2PA as "Stability AI Ltd", no SynthID, no imwatermark on its current Stable Image model — issuer added to C2PA_ISSUERS), and Canva (Magic Media signs C2PA as "Canva" + trainedAlgorithmicMedia with a generic c2pa-rs claim generator, no SynthID — issuer b"Canva" → "Canva (Magic Media)"; found on real production traffic 2026-06-19, which disproved the earlier assumption that Canva downloads are re-encoded exports that always strip C2PA). Still unsampled: Getty, Shutterstock. Midjourney embeds NO C2PA and no invisible watermark (our mj-* sample carried only the IPTC tag).

Samsung Galaxy AI (Generative Edit / Sketch to Image / Portrait Studio on Galaxy S23 FE / S24 / S25, One UI 7+) signs C2PA as "Samsung Galaxy" with the standard trainedAlgorithmicMedia source type AND a proprietary genAIType marker; verified on real signed files 2026-05-29 (the standard scan catches the source type; genAIType additionally catches a Galaxy S24 file that omits it). It ALSO burns a visible localized wordmark into the pixels — a sparkle + "generated with AI" string in the bottom-LEFT corner (issue #37; the Italian "✦ Contenuti generati dall'AI" variant is calibrated) — removed by samsung_engine.py / visible --mark samsung (reverse-alpha, see the engine bullet); detection feeds identify as the medium visible_samsung signal. The string is locale-specific, so each locale needs its own captured alpha template.

ASUS Gallery also signs edited photos as C2PA (com.asus.gallery) but with no AI source type — a signer, not an AI marker.

Black Forest Labs (FLUX) API output signs C2PA: claim_generator_info "Black Forest Labs API" + a c2pa.ai_generated_content assertion + trainedAlgorithmicMedia (issuer b"Black Forest Labs" added to C2PA_ISSUERS, platform "Black Forest Labs (FLUX)").

ByteDance Volcano Engine (Volcengine) — the cloud behind Doubao / Jimeng — signs its AI image output with a cert from certificate_center@volcengine.com + trainedAlgorithmicMedia (issuer b"volcengine" → "ByteDance (Volcano Engine)", platform "ByteDance (Doubao / Jimeng / Volcano Engine)"); note this is the C2PA-signed surface, distinct from the XMP/PNG TC260 AIGC label Doubao also uses. All three verified on real signed files 2026-05-29. ByteDance's international brand (BytePlus / Seedream / Seededit) signs the SAME content as "Byteplus Pte. Ltd." — the bare volcengine needle missed it, so real BytePlus output was mis-attributed to "Adobe Firefly" (an incidental "Adobe XMP" toolkit string in the file's XMP, picked up by the fallback byte-scan once the clean manifest issuer matched nothing). Added issuer b"Byteplus" → org "BytePlus (ByteDance)" (platform resolves to the shared "ByteDance (Doubao / Jimeng / Volcano Engine)" label via the common ByteDance needle) so the clean manifest issuer attributes it directly; found on real production traffic 2026-06-19.

EXIF/XMP generator tag (caught by exif_generator): Ideogram writes EXIF Make="Ideogram AI" (collected 2026-05-24; no C2PA, no SynthID, no imwatermark; the Make tag is the only signal).

xAI / Grok: its own EXIF signature scheme, NOT C2PA (DETECTED by metadata.xai_signature, built 2026-05-26).

Grok JPEG downloads (Aurora model) carry no C2PA, no XMP, no SynthID, no IPTC — only EXIF Artist = a UUID and EXIF ImageDescription = Signature: <base64> (a crypto signature, unverifiable locally without xAI's public key). This empirically kills the earlier unverified "xAI signs C2PA as xAI" lead — xAI is not even a C2PA member. exif_generator misses it (neither field holds an AI_GENERATOR_TOKENS token), so a dedicated detector xai_signature(path) matches the pair (ImageDescription ~ ^Signature: [A-Za-z0-9+/=]{64,} AND UUID Artist); wired into has_ai_metadata, get_ai_metadata (key xai_signature), and identify (signal xai_signature, platform "xAI (Grok / Aurora)").
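The pair match described above can be sketched as follows. The Signature pattern is the one quoted in the notes; the UUID pattern and the function name `looks_like_xai_pair` are assumptions for illustration, and the sketch takes the two EXIF strings as arguments rather than reading a file.

```python
import re

# Pattern quoted in the notes: "Signature: " followed by 64+ base64 characters.
_SIGNATURE_RE = re.compile(r"^Signature: [A-Za-z0-9+/=]{64,}")
# Assumed canonical hyphenated UUID form for the Artist tag.
_UUID_RE = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def looks_like_xai_pair(artist, description) -> bool:
    """True only when BOTH tags match: a UUID Artist and a Signature blob."""
    if not artist or not description:
        return False
    return bool(
        _UUID_RE.match(artist.strip())
        and _SIGNATURE_RE.match(description.strip())
    )
```

Requiring both fields keeps a lone UUID-like Artist value, or an unrelated description that happens to start with "Signature:", from triggering the signal on its own.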

Format confirmed stable across n=3 genuine generations: exactly three EXIF tags (Artist, ExifOffset, ImageDescription), Signature: prefix constant, base64 payload 300-1004 chars. Two capture facts: (a) the Artist UUID equals the public image id in the asset URL (https://imagine-public.x.ai/imagine-public/images/<uuid>.jpg), so it is NOT a private per-user secret — only the Signature blob is; (b) the Grok web-UI image is a re-encoded WebP with no signature — the EXIF survives only in the original JPEG (download button or that public tokenless URL), which is why screenshots / re-encodes are metadata-stripped. A real fixture data/samples/grok-1.jpg plus synthetic JPEG fixtures (fake UUID + fake Signature: blob) cover the detector; never add a real Grok image carrying private content (the repo is public).

Stripped on removal too: remove_ai_metadata now calls _scrub_ai_exif on the JPEG EXIF, which deletes the xAI Signature+UUID-Artist pair and any Software/Make/Artist/ImageDescription tag holding an AI_GENERATOR_TOKENS token (so Ideogram's Make="Ideogram AI" is scrubbed too), while keeping genuine camera/editor EXIF. The shared _is_xai_signature_pair helper (module-level compiled regexes) is the single source of truth for the pattern, used by both xai_signature and _scrub_ai_exif. (AVIF/HEIF/JXL still strip only C2PA boxes via isobmff, not EXIF — unchanged.)

China TC260 AIGC label (caught by AIGC_MARKERS / metadata.aigc_label, surfaced by identify as the aigc signal): China-served generators embed an XMP <TC260:AIGC>{"Label":"1","ContentProducer":...} block — China's mandatory AI-content labeling (TC260 namespace tc260.org.cn/ns/AIGC).

Doubao (ByteDance) uses it (verified on the real #13 sample 2026-05-25; ContentProducer 001191110102MACQD9K64010000, no C2PA/SynthID/imwatermark; the XMP block is the only signal; GitHub attachment upload did NOT strip it). The same standard is mandatory for Jimeng/Kling/Qwen/Ernie etc., so the one marker covers the whole China-AIGC-labeled ecosystem.

aigc_label reads four serializations through a shared _parse helper: (1) the HTML-entity-encoded XMP TC260:AIGC block in either RDF form, the nested element <TC260:AIGC>{...}</TC260:AIGC> (Doubao) or the attribute TC260:AIGC="{...}" (PicWish, ContentProducer="picwish", verified on the corpus 2026-05-30), via a container-agnostic raw-byte scan (any JSON object accepted); (2) a raw-JSON PNG AIGC tEXt chunk (Doubao also writes the label this way, no namespaced marker at all; confirmed on the corpus 2026-05-28, ContentProducer="doubao"); (3) a bare raw-JSON {"AIGC":{...}} object embedded in JPEG EXIF (UserComment) by some China-served generators, brace-matched from the scan head with json.JSONDecoder().raw_decode (no namespaced marker, no PNG chunk; confirmed on the corpus 2026-05-30, ContentProducer="001191440300708461136T1308L"); and (4) a bare AIGC{...} blob (the label glued straight to its JSON, no "AIGC": key wrapper) embedded in a JPEG APP segment near the JFIF header (confirmed on the corpus 2026-06-10, ContentProducer="00119144030008867405X210002"; 3 files read unknown before this form was added).

The two raw-JSON forms are scanned in one loop ('"AIGC"' then AIGC{) that falls through on a non-TC260 / undecodable hit instead of returning: a quoted "AIGC" can appear later in an XMP packet while the real label is a bare AIGC{...} earlier in the file, so an unconditional early return on the quoted form would shadow the bare form (the exact bug behind the 06-10 misses).

All three generic forms (the PNG chunk, the bare {"AIGC":...} object, and the bare AIGC{...} blob) are gated on at least one TC260 field (_TC260_FIELDS) so a generic AIGC key cannot false-positive; the namespaced XMP element is unambiguous and needs no gate. In identify, aigc fires on the parsed label or the AIGC_MARKERS byte scan (the latter preserves the laundering-tell case where the JSON payload is truncated).
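The fall-through scan over the two raw-JSON forms can be sketched like this. It is a minimal illustration rather than the repository's aigc_label: the function name is hypothetical, the contents of `_TC260_FIELDS` here are an assumed subset (only Label and ContentProducer appear in the notes), and the sketch checks every occurrence of each needle.

```python
import json

# Assumed subset; the real _TC260_FIELDS list is not shown in the notes.
_TC260_FIELDS = ("Label", "ContentProducer")
_DECODER = json.JSONDecoder()


def find_raw_aigc_label(head: bytes):
    """Return the first TC260-looking JSON object, or None."""
    text = head.decode("utf-8", errors="replace")
    # Quoted form first, then the bare blob; neither returns early on a miss.
    for needle in ('"AIGC"', "AIGC{"):
        start = 0
        while True:
            hit = text.find(needle, start)
            if hit < 0:
                break
            start = hit + 1
            brace = hit + len(needle)
            if needle.endswith("{"):
                brace -= 1  # the needle's own brace opens the JSON object
            else:
                # Skip the colon and whitespace between the key and its value.
                while brace < len(text) and text[brace] in " \t\r\n:":
                    brace += 1
            if brace >= len(text) or text[brace] != "{":
                continue  # a quoted AIGC with no object after it: fall through
            try:
                obj, _ = _DECODER.raw_decode(text, brace)
            except json.JSONDecodeError:
                continue  # undecodable hit: fall through
            # Gate on a TC260 field so a generic AIGC key cannot false-positive.
            if isinstance(obj, dict) and any(k in obj for k in _TC260_FIELDS):
                return obj
    return None
```

Because a failed quoted hit only continues the loop, a bare AIGC{...} blob earlier in the file is still found even when an unrelated "AIGC" string appears later.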

HuggingFace-hosted job (caught by metadata.huggingface_job, surfaced by identify as the hf_job signal, MEDIUM confidence): HuggingFace Jobs / Spaces stamp generated PNGs with an hf-job-id tEXt chunk holding the job UUID (3 on the corpus 2026-05-28, no other signal). It marks the hosting job, not a model — most commonly diffusion output — so it lifts an Unknown verdict to a tentative AI via hf_only (parallel to the visible sparkle) but never overrides a hard metadata signal; _HF_JOB_CAVEAT states the limit (job, not model; not proof of AI pixels). Stripped on removal (the PNG save whitelist keeps only STANDARD_METADATA_KEYS, so hf-job-id and the AIGC chunk are both dropped). The exact writer is not authoritatively documented (HF Jobs are generic GPU jobs), hence medium not high.
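Reading the hf-job-id chunk needs only a walk over the PNG chunk list. The sketch below is an assumption-labeled illustration, not metadata.huggingface_job: the function names are hypothetical, it operates on in-memory bytes, and it does not verify chunk CRCs.

```python
import struct

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def png_text_chunks(data: bytes) -> dict[str, str]:
    """Collect keyword -> text from every tEXt chunk in a PNG byte string."""
    if not data.startswith(PNG_MAGIC):
        return {}
    found: dict[str, str] = {}
    pos = len(PNG_MAGIC)
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt" and b"\x00" in body:
            key, _, value = body.partition(b"\x00")
            found[key.decode("latin-1")] = value.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 12 + length  # header (8) + data + CRC (4)
    return found


def hf_job_id(data: bytes):
    """Return the hf-job-id text if present, else None."""
    return png_text_chunks(data).get("hf-job-id")
```

A caller would treat a non-None result as the medium-confidence hosting signal described above, never as proof of AI pixels.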
No detectable signal on download (correctly reported unknown): Recraft (PNG export is a re-encoded design export — strips everything), Krea hosting FLUX 2 (no imwatermark despite FLUX — the host omits the encoder, same as Stability's hosted SDXL), and Midjourney (embeds nothing). Lesson: the imwatermark detector only fires on pristine output from a pipeline that runs the encoder (diffusers default, official BFL), not from re-hosts (Krea/Stability) or re-encoded exports (Recraft/Canva).
Invisible but NOT locally detectable (proprietary, API/oracle only — same wall as SynthID): Amazon Titan Image Generator + Nova Canvas (Bedrock DetectGeneratedContent API), Kakao (new SynthID image adopter, May 2026), NVIDIA Cosmos (SynthID video). No local detector possible; treat like SynthID.
C2PA 2.4 "Durable Content Credentials" (April 2026; verified against the spec) raise the bar for metadata stripping. 2.4 defines soft bindings (an invisible watermark or a content fingerprint) plus a server-side manifest repository and a new c2pa.repository-receipt assertion. Per the spec: "if a C2PA manifest is removed from an asset, but a copy of that manifest remains in a provenance store elsewhere, the manifest and asset may be matched using available soft bindings." So our local metadata --remove deletes the embedded manifest, but a fingerprint/watermark soft binding can still re-link the image to its manifest in a repository server-side. Stripping the file is becoming necessary-but-not-sufficient against durable provenance. (Our parsers target the stable embedded-manifest format documented in C2PA 2.1 §11; that format is unchanged in 2.4 -- the new pieces are repository/soft-binding infra, not the on-file box layout, so no parser change is implied.) Spec: https://spec.c2pa.org/specifications/specifications/2.4/specs/C2PA_Specification.html We now READ the soft-binding alg (C2PA_SOFT_BINDINGS / soft_binding_vendors_in) to name the forensic-watermark vendor, and locally DECODE the one open scheme, Adobe TrustMark (trustmark_detector); the rest (Digimarc/Imatag/Steg.AI/...) stay name-only (proprietary decoders).
Built 2026-05-26 (this batch): soft-binding alg vendor detection; IPTC Photo Metadata 2025.1 AI-disclosure fields (AISystemUsed etc.); video C2PA metadata detect + strip for MP4/MOV/M4V (free — isobmff.py is format-agnostic, MP4 is ISOBMFF); Adobe TrustMark open decoder. NOT done (out of cheap reach, per the feasibility review): visible video-logo removal (needs a video frame pipeline) and audio (SynthID/ElevenLabs/Resemble/Suno all oracle-only or unmarked).

Box detection window — now handled (v0.6.8): detection no longer relies on a fixed first-MB read. metadata.scan_head(path, size) reads the first size bytes and, for ISOBMFF, appends the payloads of late provenance boxes found by isobmff.scan_c2pa_region (a file-seeking top-level box walker that skips past mdat by size without reading it), so a C2PA/AIGC/IPTC manifest placed AFTER a large mdat in a streaming/non-faststart MP4 is now caught. Every C2PA/marker byte scan (has_ai_metadata, aigc_label, iptc_ai_system, synthid_source, exif_generator XMP, get_ai_metadata soft-binding, and identify) goes through scan_head; it is behavior-neutral for non-ISOBMFF inputs (exactly f.read(size)).
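The box-walking idea can be sketched as below. This is a simplified illustration, not isobmff.scan_c2pa_region or metadata.scan_head: the function names are hypothetical, the set of wanted box types (`uuid` only) is an assumption, and the caller is assumed to have already established that the input is ISOBMFF.

```python
import io
import struct


def iter_top_level_boxes(f, file_size: int):
    """Yield (type, payload_offset, payload_size), seeking past each payload."""
    pos = 0
    while pos + 8 <= file_size:
        f.seek(pos)
        header = f.read(8)
        if len(header) < 8:
            break
        size, btype = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:  # a 64-bit largesize follows the type
            ext = f.read(8)
            if len(ext) < 8:
                break
            size = struct.unpack(">Q", ext)[0]
            header_len = 16
        elif size == 0:  # box runs to the end of the file
            size = file_size - pos
        if size < header_len:
            break  # malformed size; stop rather than loop forever
        yield btype, pos + header_len, size - header_len
        pos += size  # skips the payload (for example mdat) without reading it


def scan_head_sketch(f, file_size: int, head_size: int, wanted=(b"uuid",)) -> bytes:
    """First head_size bytes plus payloads of wanted boxes that end beyond them."""
    f.seek(0)
    head = f.read(head_size)
    late = []
    for btype, offset, psize in iter_top_level_boxes(f, file_size):
        if btype in wanted and offset + psize > head_size:
            f.seek(offset)
            late.append(f.read(psize))
    return head + b"".join(late)
```

Only box headers and the wanted payloads are read, so a large mdat ahead of the provenance box costs one seek rather than a full read.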

Meta-box XMP removal — now handled (v0.6.9): an AI-label XMP packet stored as a meta-box mime item (HEIF/AVIF; out of reach of the top-level box stripper) is blanked in place by isobmff.blank_ai_xmp_packets — it locates the packet by its <?xpacket begin … end?> delimiters and, if it carries an AI marker (_AI_LABEL_MARKERS), overwrites it with spaces of the SAME length, so box sizes / iloc offsets stay valid and the coded image is untouched (selective: plain non-AI XMP is left alone, mirroring the top-level uuid logic). Wired into remove_ai_metadata's ISOBMFF branch after strip_c2pa_boxes. The remaining gap is an Exif meta-box item (rare; the AI labels are XMP) — still needs iinf/iloc surgery or exiftool.
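Same-length blanking can be sketched on a byte string as follows. This is an illustration, not isobmff.blank_ai_xmp_packets: the function name is hypothetical, and the marker tuple is an assumed example drawn from signals named elsewhere in these notes rather than the real _AI_LABEL_MARKERS list.

```python
# Assumed example markers; the real _AI_LABEL_MARKERS contents are not shown.
EXAMPLE_AI_MARKERS = (b"trainedAlgorithmicMedia", b"TC260:AIGC")


def blank_ai_xmp_sketch(data: bytes, markers=EXAMPLE_AI_MARKERS) -> bytes:
    """Overwrite AI-labelled XMP packets with spaces, keeping the length fixed."""
    out = bytearray(data)
    pos = 0
    while True:
        begin = data.find(b"<?xpacket begin", pos)
        if begin < 0:
            break
        end_tag = data.find(b"<?xpacket end", begin)
        if end_tag < 0:
            break
        close = data.find(b"?>", end_tag)
        if close < 0:
            break
        end = close + 2
        packet = data[begin:end]
        # Selective: plain non-AI XMP is left untouched.
        if any(marker in packet for marker in markers):
            # Same-length fill keeps box sizes and item offsets valid.
            out[begin:end] = b" " * (end - begin)
        pos = end
    return bytes(out)
```

Because the output has exactly the input's length, nothing that points into the file by offset needs to be rewritten.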

Regulatory driver (context, not a code change): AI-content labeling mandates are expanding, which pushes more generators toward exactly the C2PA + watermark signals we read. The full per-jurisdiction table lives in README "## Legal" -- keep it there, not duplicated here. Newly added + primary-source verified 2026-05-26: EU AI Act Article 50 machine-readable marking applicable 2026-08-02 (verified against the article text); South Korea AI Framework Act Art. 31(3) in force since 2026-01-22 (verified via Kim & Chang + FPF/Korea Times; Enforcement Decree accepts an invisible-watermark label); California AB 853 (amends the CA AI Transparency Act) latent-disclosure duty operative 2026-08-02, requiring a disclosure "permanent or extraordinarily difficult to remove" (verified against the leginfo bill text -- this is the exact disclosure our tool strips); India IT Amendment Rules 2026 in force 2026-02-20 (verified via Chambers), which require a prominent label and a permanent provenance ID on all synthetic media AND expressly prohibit removing/suppressing the label or metadata -- the first major all-content removal ban outside China.

Removal liability (README "## Legal" disclaimer): the tool is lawful general-purpose software; liability sits with the remover and is intent-gated -- downstream acts (fraud/deception/IP), plus US DMCA 17 USC 1202 (removing copyright-management info to conceal infringement), plus the removal-as-such bans in China + India. When extending the README table, verify each date/article against the statute/bill text before committing, not against search summaries.
