docs: record verified fal fast-sdxl checkpoint + native-resolution updates

- fal's llms.txt confirms fast-sdxl is stabilityai/stable-diffusion-xl-base-1.0,
  the exact checkpoint the local CLI defaults to -> local == prod weights.
  Recorded in CLAUDE.md and README.
- README How it works + sample README: replace the old downscale->upscale
  description with native-resolution processing (matches the #10 fix);
  document --max-resolution as an opt-in OOM cap.
- README roadmap: idna already bumped (uv-secure clean).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
test-user
2026-05-25 09:57:03 -07:00
parent 18740969ae
commit fb42295b3a
3 changed files with 16 additions and 12 deletions
+1 -1
View File
@@ -49,4 +49,4 @@ Who embeds what, and whether it is locally detectable (so we know which gaps are
- Metadata detection for AVIF/HEIF/JPEG-XL relies on a binary scan for `C2PA_UUID` + `IPTC_AI_MARKERS`, plus EXIF `Software` / XMP `CreatorTool` generator tags via `metadata.exif_generator` (validated with synthesized AVIF/JPEG fixtures + an XMP raw-scan fixture). C2PA removal in those containers is implemented via `noai/isobmff.py` (top-level ``uuid`` / ``jumb`` box stripper, no re-encoding). EXIF/XMP boxes inside those containers are read for detection but not yet **scrubbed** on removal.
- **SynthID detection is metadata-only.** There is no reliable *local* detector of the SynthID *pixel* watermark — Google's decoder is proprietary, no public spec or API (only a waitlisted portal). We detect SynthID by its C2PA companion (`synthid_source` / `SYNTHID_C2PA_ISSUERS`), which is reliable while the manifest is intact but says nothing once C2PA is stripped. **Surface-dependent blind spot (verified 2026-05-24):** the same Google model emits different metadata per surface -- the Gemini *app* wraps outputs in Google C2PA, but the *API/playground* (AI Studio, Nano Banana / gemini-2.5-flash-image) emits the SynthID *pixel* watermark (confirmed via the Gemini-app oracle) + the visible sparkle but **no C2PA/IPTC at all**, so `synthid_source` returns None despite SynthID being present. Only the pixel oracle or the visible-sparkle detector catches those. (Meta AI is another surface mismatch: it writes the IPTC `digitalSourceType=trainedAlgorithmicMedia` marker, not C2PA and not SynthID.) Google→SynthID is long-standing; OpenAI→SynthID is confirmed by OpenAI's Help Center (ChatGPT/Codex/API "include both C2PA metadata and SynthID watermarks", updated 2026-05-21) but time-gated (pre-rollout OpenAI images carry C2PA without SynthID), so the OpenAI verdict is hedged "likely". Oracles: Gemini app "Verify with SynthID" (Google), openai.com/verify (OpenAI). The spectral phase-coherence approach from `github.com/aloshdenny/reverse-SynthID` was evaluated (May 2026) and **does not work for real-content detection**: on its own shipped codebook + validation set, watermarked and cleaned images were indistinguishable (conf within noise, cleaned often higher); it only fires on pure-black 1024x1024 reference images at exact resolution (the controlled case it was calibrated on). The README's "90% / conf=0.91" reproduces only in that lab condition. Do not build a production detector on it; if revisited, it is experimental/diagnostic only and needs a per-resolution, per-model reference corpus. A from-scratch gpt-image pilot (2026-05-24) confirmed this independently: 5 independent solid-black gpt-image outputs share a near-identical fixed signature (pairwise residual correlation **0.92**, avg-template retains 97% energy), so the watermark/carrier IS strongly present and consistent on flat content — but the carrier frequencies extracted from it do NOT discriminate real content (carrier-to-random ratio: cleaned 1.86 > watermarked 1.53; a non-gpt-image image scored highest at 3.67). The signature drowns in content texture. Net: a perfectly consistent solid-color signature still yields no real-content pixel detector with magnitude/carrier methods. A corpus discrimination test (2026-05-24, `scripts/synthid_pixel_probe.py`, raw zero-mean residual NCC) independently re-confirms this: at matched resolution, SynthID positives do NOT cluster apart from negatives (within-Gemini 0.07; at 1024 px pos-vs-neg >= pos-vs-pos). The only high correlations were near-duplicate *content* (5 ChatGPT renders of one prompt at ~0.92, while a distinct ChatGPT image scored ~0 against them) — content, not a carrier. The probe is solid-fills-only and EXPERIMENTAL/DIAGNOSTIC; do not use it on real content. **Correction (deeper re-examination 2026-05-25):** the carrier IS real on solid fills — the earlier "no carrier" was a *method* artifact of using spatial / FFT-magnitude NCC, which can't see it. The carrier is a fixed *phase* at specific low frequencies, so the right metric is **per-bin phase coherence**. On 8 white `gemini-2.5-flash-image` fills (generated via the reverse-SynthID trick: identity-edit prompt "Recreate this image exactly as it is" on a synthetic pure-white PNG — this bypasses the recitation block that rejects text prompts for pure colors), phase coherence at the white carriers `(0,±7..±12,±20..±23)` = **0.86** vs **0.31** random; single-image leave-one-out phase-match **+0.83** vs real photos **-0.24**. (Black `2.5-flash` fills clip to std≈0 — SynthID can't push values below 0, so no carrier in black; the repo's dark carriers come from nano-banana-pro.) **But it does not generalize:** (a) carriers are model-version + resolution + color specific — the repo's v4 codebook (built for `gemini-3.1-flash-image-preview` + `nano-banana-pro-preview`) scores ~0.527 on my 2.5-flash white fills, indistinguishable from negatives (~0.50), i.e. carriers shift across model versions and need a per-model codebook; (b) on real content (30 `2.5-flash` images) the carrier collapses — set phase coherence at carriers 0.37 ≈ random 0.42, and the repo's v4 detector gives content 0.518 ≈ negatives 0.504 (no separation; a faint +0.24 single-image lean is likely a brightness confound). Net: the spectral/phase approach is a real *controlled-fill* characterizer, NOT an arbitrary-real-content detector, and is brittle to model version. Metadata proxy + visible sparkle + online oracles remain the ceiling for real content.
- **External AI-vs-real classifier models are out of scope (decided 2026-05-24).** Generic HuggingFace detectors (`Organika/sdxl-detector` Swin Transformer, `umm-maybe/AI-image-detector`, and fine-tunes) exist and report ~0.98 on their *own* SDXL-vs-real validation sets, but they are per-generator and the model cards themselves note degraded accuracy off-distribution; they are untested on gpt-image / Gemini Nano Banana (the metadata-stripped surfaces we care about), and our own light SDXL pass would likely defeat them the same way it defeats SynthID. Detection here stays local + signal-based (metadata + visible sparkle); do not add a bundled classifier dependency.
- **SynthID v2 vs default pipeline:** the SDXL-based default profile (since May 2026) defeats SynthID v2. **Verified end-to-end (May 2026):** local SDXL run on a Gemini 3 Pro output, checked via the Gemini app's "Verify with SynthID" feature, returned "no SynthID watermark detected". Also confirmed against **OpenAI's** SynthID (2026-05-23): a fresh ChatGPT/gpt-image output read "SynthID detected" on openai.com/verify before the local SDXL run and "SynthID not detected" after (corpus regression chain: pos `4ef377bd` -> cleaned `47188e88`). The same configuration is used in raiw-app production (`fal-ai/fast-sdxl` at native ~1024 px, strength 0.05, steps 50). SD-1.5 dreamshaper at 768 px was previously the default and does NOT defeat v2 — verified empirically against the same feature (strength 0.04, 0.10, and elastic warp α∈{5,8} all flagged positive). That SD-1.5 path was removed; only `default` (SDXL) and `ctrlregen` profiles remain.
- **SynthID v2 vs default pipeline:** the SDXL-based default profile (since May 2026) defeats SynthID v2. **Verified end-to-end (May 2026):** local SDXL run on a Gemini 3 Pro output, checked via the Gemini app's "Verify with SynthID" feature, returned "no SynthID watermark detected". Also confirmed against **OpenAI's** SynthID (2026-05-23): a fresh ChatGPT/gpt-image output read "SynthID detected" on openai.com/verify before the local SDXL run and "SynthID not detected" after (corpus regression chain: pos `4ef377bd` -> cleaned `47188e88`). The same configuration is used in raiw-app production (`fal-ai/fast-sdxl/image-to-image`, strength 0.05, steps 50, guidance 7.5, no pre-downscale). fal's own `llms.txt` for `fast-sdxl` names the base checkpoint as `stabilityai/stable-diffusion-xl-base-1.0` (verified 2026-05-25) -- the exact checkpoint the local CLI defaults to (`DEFAULT_MODEL_ID`). So the local `invisible` default is weight-for-weight identical to prod; "fast-sdxl" is fal's optimized serving, not different weights. After the native-resolution fix the local pipeline matches prod on weights + strength + steps + guidance + resolution. SD-1.5 dreamshaper at 768 px was previously the default and does NOT defeat v2 — verified empirically against the same feature (strength 0.04, 0.10, and elastic warp α∈{5,8} all flagged positive). That SD-1.5 path was removed; only `default` (SDXL) and `ctrlregen` profiles remain.
+14 -10
View File
@@ -28,7 +28,7 @@ Strips SynthID, C2PA Content Credentials, EXIF/XMP "Made with AI" labels, and vi
| AI model | Visible watermark | Invisible watermark | Metadata | Our approach |
| --- | --- | --- | --- | --- |
| **Google Gemini / Nano Banana / Gemini 3 Pro** | ✅ Sparkle logo | ✅ SynthID v1 + v2 (default SDXL pipeline at native ~1024 px) | ✅ C2PA + EXIF | Alpha reversal + diffusion + metadata strip |
| **Google Gemini / Nano Banana / Gemini 3 Pro** | ✅ Sparkle logo | ✅ SynthID v1 + v2 (default SDXL pipeline, native resolution) | ✅ C2PA + EXIF | Alpha reversal + diffusion + metadata strip |
| **OpenAI DALL-E 3 / ChatGPT** | — | — | ✅ C2PA manifest | Metadata strip |
| **OpenAI ChatGPT Images 2.0** (gpt-image-2) | — | ⚠️ imperceptible pixel watermark (no public detector yet) | ✅ C2PA manifest (verified) | Diffusion regeneration + metadata strip |
| **Stable Diffusion / SDXL (AUTOMATIC1111, ComfyUI)** | — | ✅ DWT-DCT (imwatermark — locally detectable) | ✅ PNG text chunks | Diffusion regeneration + metadata strip |
@@ -72,12 +72,14 @@ Google embeds **SynthID** into every image generated by Gemini / Nano Banana. Ot
The removal pipeline (default profile, SDXL):
```text
image → resize to ~1024px (SDXL native) → encode to latent space (VAE)
image → encode to latent space (VAE) at native resolution
→ add controlled noise (forward diffusion)
→ denoise (reverse diffusion, ~50 steps at strength 0.05)
→ decode back to pixels (VAE) → upscale to original resolution
→ decode back to pixels (VAE)
```
By default the image is processed at its **native resolution** with no pre-downscale, matching the hosted raiw.cc backend (fal `fast-sdxl`, which is `stabilityai/stable-diffusion-xl-base-1.0` — the same checkpoint the CLI defaults to). At strength ~0.05 SDXL img2img does not need the input shrunk, and the old forced downscale-to-1024 then upscale-back round-trip was the main quality loss. Pass `--max-resolution N` to cap the long side only when a very large image runs out of GPU/MPS memory (it reintroduces that lossy round-trip).
SDXL is the default since May 2026: empirically defeats SynthID v2 on Gemini 3 Pro outputs, where the older SD-1.5 pipeline at 768 px did not. The SD-1.5 path was removed once it was verified not to handle v2.
**Face Protection**: before diffusion, YOLO detects people in the image and extracts them. After diffusion, the original faces are blended back with a soft elliptical mask to prevent AI distortion of facial features.
@@ -208,6 +210,8 @@ remove-ai-watermarks visible image.png -o clean.png
# Invisible watermark only (SynthID etc.) — requires GPU
remove-ai-watermarks invisible image.png -o clean.png --humanize 4.0
# Runs at native resolution by default. On a very large image that OOMs the
# GPU/MPS, cap the long side: --max-resolution 2048
# Check / strip AI metadata (C2PA, EXIF, "Made with AI" labels)
# --check also flags SynthID-bearing sources: a C2PA manifest signed by
@@ -284,7 +288,7 @@ Tracked but not yet implemented:
- **Local SynthID *pixel* detector**. Not feasible today: Google's decoder is proprietary, and magnitude/carrier spectral methods do not separate real content (confirmed by three independent evaluations, including a from-scratch gpt-image pilot; see CLAUDE.md). Blocked on either (a) a programmatic generation path (OpenAI / Gemini API) to build a per-(model, resolution) labeled corpus at scale, or (b) a raw watermarked-output dataset. If data arrives, the next approach to try is a learned classifier on diverse content rather than a fixed carrier codebook.
- **Grow the SynthID reference corpus** (`data/synthid_corpus/`) with oracle-labeled samples per model and resolution (Gemini app for Google, openai.com/verify for OpenAI). Prerequisite for any pixel-detector attempt and for an automated removal-regression set.
- **Real non-PNG C2PA fixtures**. SynthID-source detection for JPEG / WebP / AVIF is currently covered only by synthetic byte blobs; replace with real vendor-emitted files to ground the binary-scan path.
- **Maintenance debt**. Bump transitive `idna` to 3.15 (blocks `uv-secure`); clear strict-pyright debt in `remove_ai_metadata` / `cli.py` (untyped piexif / PIL / click / rich) so `maintain.sh` can finish green.
- **Maintenance debt**. Clear strict-pyright debt in `remove_ai_metadata` / `cli.py` (untyped piexif / PIL / click / rich) so `maintain.sh` can finish green. (`uv-secure` is already clean since `idna` was bumped to 3.16.)
- **AVIF / HEIF / JPEG-XL detection limits**. Removal strips top-level C2PA `uuid` and JUMBF `jumb` boxes. EXIF/XMP boxes inside these containers are not yet scrubbed (PNG and JPEG are fully covered).
- **Video pipeline (`noai-video`)**: per-frame inpainting and tracking for Sora 2 dynamic logo, Veo 3.1 badge, Kling, Runway. Separate package, not folded into this repo.
@@ -298,17 +302,17 @@ Watermarking and provenance for AI-generated content is now regulated in several
| Jurisdiction | Instrument | Status (May 2026) | Relevance |
| --- | --- | --- | --- |
| EU | AI Act, Article 50(2) | Marking obligations postponed to **2 December 2026** under the December 2025 omnibus agreement. Code of Practice finalising May/June 2026. | Removing mandated provenance markers with intent to deceive may be sanctioned under national implementations. |
| US (federal) | COPIED Act | Enacted 2025. | Criminalises removal of provenance information with intent to deceive about content origin. The tool itself is lawful; usage may not be. |
| US (state) | CA AB 2655, TX SB 751, similar | In force. | Content-specific (election deepfakes, sexual deepfakes). Not tool-specific. |
| China | Deep Synthesis Regulation, 2025 updates | In force. | Mandatory visible label for AI content. Removal is an administrative offence. |
| UK | Online Safety Act, 2025 transparency extension | In force. | Platform obligations, not user obligations. |
| EU | AI Act, Article 50 | Transparency duties apply from **2 August 2026**. Legacy generative systems (placed on the market before that date) get a grandfathering period to **2 December 2026** for the Article 50(2) marking duty, under the Digital Omnibus (Commission proposal Nov 2025; co-legislator political agreement 7 May 2026). Article 50 guidelines and a marking Code of Practice are being finalised through 2026. | Removing mandated provenance markers with intent to deceive may be sanctioned under national implementations. |
| US (federal) | COPIED Act | **Reintroduced April 2025; not enacted** (pending in the Senate). | If passed, would set NIST provenance standards and prohibit tampering with / removing provenance information. The tool itself is lawful; usage may not be. |
| US (state) | CA AB 2655, TX SB 751, similar | TX SB 751 in force; **CA AB 2655 struck down** by a federal court (Aug 2025, Section 230 / First Amendment). | Content-specific (election deepfakes, sexual deepfakes). Not tool-specific. |
| China | Measures for Labeling AI-Generated Content (+ GB 45438-2025) | In force since **1 September 2025**. | Mandatory explicit (visible) + implicit (metadata) labels for AI content; tampering with or removing labels is prohibited. |
| UK | Online Safety Act 2023 / Ofcom guidance | In force, but **no statutory AI-provenance or watermarking obligation**. | Ofcom encourages watermarking / provenance metadata as voluntary "attribution measures"; platform duties, not user obligations. |
## Threat model
This tool defends already-distributed AI imagery against automatic detection systems (social-platform "Made with AI" labels, third-party classifiers, content-policy filters). It does **not** retroactively anonymise generation.
In particular, **SynthID-Image v2** (Google, deployed October 2025 with Gemini 3 Pro / Nano Banana Pro / Imagen 4 / Veo) embeds a **136-bit payload** ([arxiv 2510.09263](https://arxiv.org/abs/2510.09263)). The payload is believed to encode a user / session identifier. If the original watermarked file ever passed through a system controlled by the prompt originator (a saved Gemini account history, a screenshot uploaded to a Google product, a backup), Google retains the ability to link that original to the generating account. Stripping the watermark from a copy you possess does not erase Google's server-side record.
In particular, **SynthID-Image** (Google, deployed 2025 with Gemini 3 Pro / Nano Banana Pro / Imagen 4 / Veo) carries a **multi-bit payload** — the paper's SynthID-O variant encodes 136-bit payloads in 512x512 images ([arxiv 2510.09263](https://arxiv.org/abs/2510.09263)). The payload is believed to encode a user / session identifier. If the original watermarked file ever passed through a system controlled by the prompt originator (a saved Gemini account history, a screenshot uploaded to a Google product, a backup), Google retains the ability to link that original to the generating account. Stripping the watermark from a copy you possess does not erase Google's server-side record.
Use cases where the threat model fits:
- You generated the image yourself, want to publish it as your own work, and accept the consequences if Google ever publishes their detector logs.
+1 -1
View File
@@ -38,4 +38,4 @@ remove-ai-watermarks metadata amur-leopard.png --remove -o amur-leopard.clean.pn
remove-ai-watermarks all amur-leopard.png -o amur-leopard.all.png --device mps
```
Note: the diffusion step downscales to 768 px, which degrades fine text on text-heavy infographics like this one. Tracked as a known limitation; see project TODO.
Note: the diffusion step runs at native resolution by default (no pre-downscale), so fine text on a text-heavy infographic like this one is preserved. Pass `--max-resolution N` only if a very large image OOMs the GPU/MPS (that reintroduces a lossy downscale then upscale round-trip).