From a0a349cc6687a58a6b5711320fd0fcb04494d6c4 Mon Sep 17 00:00:00 2001
From: Victor Kuznetsov <kuznetsov.va@gmail.com>
Date: Fri, 19 Jun 2026 09:48:21 -0700
Subject: [PATCH] docs: correct overstated FLUX open-watermark claim; record
 detector content-fragility

Earlier notes asserted BFL hosted output has no open DWT-DCT watermark. That was
overstated: the test carriers were high-texture fox images where a clean
encode->decode round-trip of a KNOWN-embedded watermark recovers only 28-35/48
bits (below the safe 44 gate), so the detector would miss a present mark there --
the None is inconclusive, not proof of absence.

Verified positive-control (2026-06-19): imwatermark dwtDct round-trips 48/48 on
synthetic carriers and on chatgpt-1.png (48/48) / firefly-1.png (45/48), but
FAILS on flux-1.png (28/48) and doubao-1.png (39/48). So invisible_watermark
detection is a positive-only signal: trust a hit, treat a miss on busy content as
inconclusive. Affects all open SD/SDXL/FLUX DWT-DCT detection. C2PA stays the
reliable FLUX identifier; whether BFL hosted embeds the open mark is unresolved
(needs a low-texture hosted sample).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 CLAUDE.md                      | 4 ++--
 docs/watermarking-landscape.md | 4 +++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index df60779..cac6cef 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -36,7 +36,7 @@ Consequences for contributors (do not drift back into the stock niche just becau
 - Run `uv run` from the repo root — from another cwd it falls back to a bare env without numpy/cv2/torch.
 - **Stale `trustmark` remnant in site-packages after an extras change:** the `trustmark` package downloads model weights INTO its own package dir, so when a narrower `uv sync` prunes the package, a `trustmark/models/` directory survives as an empty namespace package. Symptom: pyright `"TrustMark" is unknown import symbol` on `trustmark_detector.py` and `find_spec("trustmark")` returning a loader-less spec (so `is_available()` lies True). Fix: `rm -rf .venv/lib/python3.12/site-packages/trustmark` (regenerable weights cache).
 - To add a dev tool (pytest/ruff/pyright) into the env, use `uv sync --frozen --extra dev --extra gpu`, **never `uv pip install`** — `uv pip install` re-resolves and rewrites `uv.lock`, which silently bumped `transformers` to a build incompatible with the pinned `diffusers` (`cannot import name 'Qwen3VLForConditionalGeneration'`) and broke every `identify`/metadata import. Recovery: `git checkout uv.lock && uv sync --frozen --extra gpu --extra dev`. The `gpu` extra holds `diffusers`/`transformers`/`torch`, so a bare `uv sync` (no extras) removes them; `noai/__init__` is now **lazy** (PEP 562 `__getattr__`, so importing `identify`/`metadata` no longer pulls `watermark_remover`/torch), so a bare env breaks only when the removal pipeline is actually invoked, not on import. `maintain.sh`'s `uv sync --all-extras` also pulls the heavy `trustmark`/`lama` wheels (pytorch-lightning, onnxruntime) — fine on a good connection, but on flaky DNS sync only `--extra gpu --extra dev` and run the lint/test steps by hand.
-- Metadata/C2PA tests assert against real committed fixtures in `data/samples/` (`chatgpt-*.png` = OpenAI C2PA, `firefly-1.png` = Adobe, `mj-*` = Midjourney IPTC, `doubao-1.png` = ByteDance Doubao with the China TC260 `<TC260:AIGC>` XMP label **and** a visible "豆包AI生成" text mark bottom-right; `grok-1.jpg` = xAI Grok with its EXIF-only `Signature:` blob + UUID `Artist` and no C2PA/SynthID/IPTC; `flux-1.png` / `flux-1.jpg` = real Black Forest Labs FLUX.2 Playground output, signed C2PA (issuer "Black Forest Labs" + `trainedAlgorithmicMedia`) -- `flux-1.jpg` is the first committed **JPEG-with-C2PA** fixture, exercising the c2pa-python non-PNG reader path end to end; note BFL's hosted output carries C2PA only, NOT the open DWT-DCT pixel watermark, see `docs/watermarking-landscape.md`); synthetic byte blobs cover the remaining JPEG/ISOBMFF format paths. The "non-AI / clean photo" control is no longer in `data/samples/` -- the `clean_photo` conftest fixture serves a verified-negative image from the corpus `neg/` set (skips if the corpus is absent).
+- Metadata/C2PA tests assert against real committed fixtures in `data/samples/` (`chatgpt-*.png` = OpenAI C2PA, `firefly-1.png` = Adobe, `mj-*` = Midjourney IPTC, `doubao-1.png` = ByteDance Doubao with the China TC260 `<TC260:AIGC>` XMP label **and** a visible "豆包AI生成" text mark bottom-right; `grok-1.jpg` = xAI Grok with its EXIF-only `Signature:` blob + UUID `Artist` and no C2PA/SynthID/IPTC; `flux-1.png` / `flux-1.jpg` = real Black Forest Labs FLUX.2 Playground output, signed C2PA (issuer "Black Forest Labs" + `trainedAlgorithmicMedia`) -- `flux-1.jpg` is the first committed **JPEG-with-C2PA** fixture, exercising the c2pa-python non-PNG reader path end to end; whether BFL hosted output also embeds the open DWT-DCT pixel watermark is UNRESOLVED -- our detector returns None on these fox samples, but they are high-texture carriers where even a known-embedded watermark fails the round-trip, see the content-fragility caveat in `docs/watermarking-landscape.md`); synthetic byte blobs cover the remaining JPEG/ISOBMFF format paths. The "non-AI / clean photo" control is no longer in `data/samples/` -- the `clean_photo` conftest fixture serves a verified-negative image from the corpus `neg/` set (skips if the corpus is absent).
 - SynthID reference corpus: `scripts/synthid_corpus.py` ingests labeled images into `data/synthid_corpus/`. The labeled `images/` (`pos/` `neg/` `cleaned/`) are **committed** (public repo -- review every image for private content before adding; `manifest.csv` is kept in sync with the files on disk, one row per tracked image); only the synthetic `refs/` calibration fills are gitignored. See its README for the collection protocol and verification oracles. **`cleaned/` examples must be produced by a CURRENT shipped removal method** -- the default SDXL img2img pass (optionally `--max-resolution`). Do NOT archive cleaned outputs from methods that are no longer in the pipeline (ctrlregen, the old text/face-protection, IP-Adapter FaceID, CodeFormer) or from the experimental opt-in paths (controlnet, face restore) as corpus examples; a cleaned reference should represent the canonical removal, and a removed method's output is not a reproducible example. Keep those experiment outputs in a local working dir, never in the committed corpus.
 
 ## Configuration
@@ -59,7 +59,7 @@ Compact map. The full per-module detail (design decisions, tuned thresholds, cal
 - `_text_mark_engine.py` — shared base for the three reverse-alpha text-mark engines (extracted 2026-06-09); the per-engine modules are config-only subclasses. New text mark = a `TextMarkConfig` + a thin subclass + one registry row. Gemini stays a separate engine (different model).
 - `doubao_engine.py` / `jimeng_engine.py` / `samsung_engine.py` — thin `TextMarkEngine` subclasses: Doubao "豆包AI生成" (bottom-right), Jimeng "★ 即梦AI" (bottom-right), Samsung Galaxy AI "✦ Contenuti generati dall'AI" (bottom-LEFT, locale-specific — Italian variant calibrated). Removal = reverse-alpha (always-align) + thin residual inpaint. A detector-only removal test is insufficient — assert visual residual (the textured-shift tests).
 - `region_eraser.py` — universal region eraser (`erase` CLI): cv2 backend default (no deps), optional big-LaMa via onnxruntime (~3.5-4 GB peak RAM, ~5-6 s/call CPU — does not fit a minimal droplet).
-- `invisible_watermark.py` — decodes the OPEN DWT-DCT watermarks (SD / SDXL / FLUX) via `imwatermark` (extra `detect`, pulls torch). Fragile: does not survive JPEG re-encode/resize, so it confirms origin only on pristine files.
+- `invisible_watermark.py` — decodes the OPEN DWT-DCT watermarks (SD / SDXL / FLUX) via `imwatermark` (extra `detect`, pulls torch). Fragile two ways: (1) does not survive JPEG re-encode/resize; (2) **content-fragile even on pristine files** -- a clean encode->decode round-trip recovers 48/48 bits on synthetic + many real images (chatgpt 48/48, firefly 45/48) but FAILS on high-texture content (flux fox 28/48, doubao 39/48, below the safe `_MATCH_48`=44 gate). So a `None`/no-match is a positive-only signal: it confirms origin when it fires but is **inconclusive** when it doesn't (it would miss a present mark on busy content). Verified 2026-06-19; see the content-fragility caveat in `docs/watermarking-landscape.md`.
 - `trustmark_detector.py` — Adobe TrustMark open decoder (extra `trustmark`). Do NOT remove the JPEG re-encode false-positive gate — a lone TrustMark hit without it is almost always content noise.
 - `noai/watermark_remover.py` — `WatermarkRemover` with two diffusion pipelines selected by the explicit `pipeline` ctor arg, never inferred from `model_id`: `sdxl` (plain SDXL img2img) and `controlnet` (SDXL + canny ControlNet, **the DEFAULT since 2026-06-09**). Removal comes from the img2img `strength`; ControlNet only preserves text/face STRUCTURE — SynthID CAN survive controlnet on photoreal content at low strength. No face-restore extra ships, by validated decision (every restore approach looked MORE AI-generated).
 - `auto_config.py` + the content-detection layer were REMOVED 2026-06-09; `--auto` is a deprecated no-op (controlnet is the default pipeline and the adaptive polish is ON by default and self-gates to a no-op where there is no detail deficit).
diff --git a/docs/watermarking-landscape.md b/docs/watermarking-landscape.md
index 397c2fb..8753d0a 100644
--- a/docs/watermarking-landscape.md
+++ b/docs/watermarking-landscape.md
@@ -5,7 +5,9 @@
 > no content was changed or summarized.
 
 Who embeds what, and whether it is locally detectable (so we know which gaps are fillable). See `identify.py` for what we read.
-- **Locally detectable (open decoder, no key/API):** Stable Diffusion / SDXL / FLUX via `imwatermark` DWT-DCT (now covered by `invisible_watermark.py`). FLUX uses the same library (`black-forest-labs/flux2` `src/flux2/watermark.py`, 48-bit `0b001010101111111010000111100111001111010100101110`); SDXL is the diffusers `WATERMARK_MESSAGE` (`0b101100111110110010010000011110111011000110011110`). Caveat: fragile to re-encoding. **The FLUX open DWT-DCT is OPTIONAL dev-inference-code only and is NOT applied by Black Forest Labs' hosted surface** — verified 2026-06-19 on real BFL Playground outputs from BOTH FLUX.2 [pro] and FLUX.1 [dev] (lossless PNG + JPG each): all carry the signed C2PA manifest (issuer "Black Forest Labs") but the DWT-DCT decode lands at chance level (FLUX.2 [pro] PNG decodes to the degenerate all-ones; FLUX.1 [dev] lossless PNG to 17/48 ones; matches to the FLUX reference 19-28/48, threshold 44), so the open pixel watermark is absent on hosted output regardless of model line or container. Practical consequence: a hosted FLUX.2 image is identified by C2PA only; once C2PA is stripped there is NO open-pixel fallback for it (the 48-bit pattern in `_BITS_48` is correct and would fire only on a locally-generated FLUX.2 with the watermark flag explicitly enabled).
+- **Locally detectable (open decoder, no key/API):** Stable Diffusion / SDXL / FLUX via `imwatermark` DWT-DCT (now covered by `invisible_watermark.py`). FLUX uses the same library (`black-forest-labs/flux2` `src/flux2/watermark.py`, 48-bit `0b001010101111111010000111100111001111010100101110`); SDXL is the diffusers `WATERMARK_MESSAGE` (`0b101100111110110010010000011110111011000110011110`). **Caveat: the `imwatermark` dwtDct decode is content-fragile, NOT just re-encode-fragile.** A clean encode->decode round-trip (no re-encode at all) recovers 48/48 bits on synthetic carriers (random / flat / gradient) AND on some real images (chatgpt-1.png 48/48, firefly-1.png 45/48), but FAILS on high-texture/high-frequency content — the FLUX fox sample recovers only 28/48 and doubao-1.png 39/48, both below the safe `_MATCH_48` = 44 gate (random baseline ~24). So a `None` from `detect_invisible_watermark` on a textured image is **inconclusive** — the decoder would miss even a present watermark. The 44 gate is a deliberate precision choice (lowering it to catch textured carriers would admit false positives); the blind spot is the imwatermark method on high-frequency content.
+
+  Consequence for the FLUX hosted-output question (BFL Playground, FLUX.2 [pro] + FLUX.1 [dev], 2026-06-19): both carry the signed C2PA manifest (issuer "Black Forest Labs"); the open DWT-DCT decode returned `None`, BUT the test carriers were the high-texture fox images where even a *known-embedded* watermark only round-trips 28-35/48 — so **whether BFL hosted output embeds the open pixel watermark is UNRESOLVED** (an earlier note here wrongly asserted it absent; that was overstated — the carrier defeats the detector). What IS established: C2PA is the reliable FLUX identifier; the open DWT-DCT is OPTIONAL in the FLUX dev inference code and the `_BITS_48` pattern is correct (round-trips on low/medium-texture carriers). To resolve the hosted question, test a LOW-texture hosted FLUX image (where the round-trip is first validated to recover >=44/48).
 - **C2PA / IPTC (covered by the issuer/marker scan):** OpenAI, Google, Adobe Firefly, Microsoft (Designer + **Bing Image Creator** — collected 2026-05-24; Bing now runs Microsoft's own **MAI-Image** model, signs C2PA as "Microsoft", NOT OpenAI/DALL-E), **Stability AI** (collected from Brand Studio / DreamStudio successor; signs C2PA as "Stability AI Ltd", no SynthID, no imwatermark on its current Stable Image model — issuer added to `C2PA_ISSUERS`), and **Canva** (Magic Media signs C2PA as "Canva" + `trainedAlgorithmicMedia` with a generic `c2pa-rs` claim generator, no SynthID — issuer `b"Canva"` → "Canva (Magic Media)"; found on real production traffic 2026-06-19, which **disproved the earlier assumption** that Canva downloads are re-encoded exports that always strip C2PA). Still unsampled: Getty, Shutterstock. Midjourney embeds NO C2PA and no invisible watermark (our `mj-*` sample carried only the IPTC tag).
 
 **Samsung Galaxy AI** (Generative Edit / Sketch to Image / Portrait Studio on Galaxy S23 FE / S24 / S25, One UI 7+) signs C2PA as "Samsung Galaxy" with the standard `trainedAlgorithmicMedia` source type AND a proprietary `genAIType` marker; verified on real signed files 2026-05-29 (the standard scan catches the source type; `genAIType` additionally catches a Galaxy S24 file that omits it). It ALSO burns a **visible** localized wordmark into the pixels — a sparkle + "generated with AI" string in the bottom-LEFT corner (issue #37; the Italian "✦ Contenuti generati dall'AI" variant is calibrated) — removed by `samsung_engine.py` / `visible --mark samsung` (reverse-alpha, see the engine bullet); detection feeds `identify` as the medium `visible_samsung` signal. The string is locale-specific, so each locale needs its own captured alpha template.