chore: mark controlnet pipeline + GFPGAN restore-faces as experimental

Both content-preservation features are now flagged EXPERIMENTAL and opt-in.
--pipeline controlnet was already opt-in (default=default); --restore-faces
flips from on-by-default to OFF by default, matching the repo's prior pattern
for experimental preservation passes (the removed protect_text/protect_faces).

- cli.py: --restore-faces/--no-restore-faces default False; EXPERIMENTAL in the
  --restore-faces / --controlnet-scale / --pipeline help; batch default False.
- invisible_engine.py: remove_watermark restore_faces default False + docstring.
- CLAUDE.md / README.md / docs/synthid.md: label both experimental/opt-in.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
Victor Kuznetsov
2026-06-03 16:57:29 -07:00
parent 411ef16ec3
commit 5ec8269949
5 changed files with 31 additions and 25 deletions
+4 -4
View File
@@ -27,7 +27,7 @@ You are a **principal Python engineer** maintaining a CLI tool and library for r
- GPU/ML modules (invisible_engine, watermark_remover) are optional — guard imports with `is_available()` checks
- Optional detection extras: `detect` (imwatermark — open SD/SDXL/FLUX watermark) and `trustmark` (Adobe TrustMark decoder; pulls torch + downloads weights). Both are guarded by `is_available()` and skipped by `identify` when absent.
- Optional `restore` extra (gfpgan/facexlib/basicsr): the GFPGAN face-identity post-pass (`face_restore.py`, CLI `--restore-faces`, ON by default). Guarded by `face_restore.is_available()`; the default-on flag auto-skips with a debug log when the extra is absent or no face is detected. numpy<2-pinned and Python-3.12-pinned (see the `face_restore.py` Key-modules bullet).
- Optional `restore` extra (gfpgan/facexlib/basicsr): the GFPGAN face-identity post-pass (`face_restore.py`, CLI `--restore-faces`, **EXPERIMENTAL, opt-in, OFF by default**). Guarded by `face_restore.is_available()`; when enabled it auto-skips with a debug log when the extra is absent or no face is detected. numpy<2-pinned and Python-3.12-pinned (see the `face_restore.py` Key-modules bullet).
- Tests for the *model-running* paths are limited to availability checks (multi-GB downloads). But the **pure helpers inside ML-adjacent modules are unit-tested without any download** and must stay that way: `_target_size` (native-vs-downscale, `test_invisible_engine.py`) and the MPS->CPU fallback control flow via mocked pipelines (`test_img2img_runner.py`, 100% cover). Don't skip these as "ML, needs a model" — only `remove_watermark`/the diffusion bodies do.
## Key modules
@@ -43,8 +43,8 @@ You are a **principal Python engineer** maintaining a CLI tool and library for r
- `region_eraser.py` — universal region eraser (`erase` CLI). `erase(image, boxes=|mask=, backend=)` normalizes grayscale (2D) and RGBA (4-channel) inputs up front (`erase_cv2` splits off any alpha plane and re-attaches it on the result): `boxes_to_mask``cv2.inpaint` (`cv2` backend, default, no deps) or big-LaMa via onnxruntime (`lama` backend, extra `lama`, `Carve/LaMa-ONNX` Apache-2.0 model downloaded on first use, never bundled). `erase_lama` crops a padded region around the mask, runs LaMa at its fixed 512² input, pastes only masked pixels back (untouched areas stay pixel-exact). Lazy `_get_lama_session` singleton; `lama_available()` guards the optional import. **LaMa-ONNX costs ~3.5-4 GB peak RAM and ~5-6 s/call on CPU** (FFC working set, not arena — `enable_cpu_mem_arena=False` does not help), so it does NOT fit a minimal droplet; the cv2 backend (tens of MB, ~30 ms) does. LaMa quality at low RAM = serverless/GPU, mirroring how raiw.cc offloads SDXL to fal.
- `invisible_watermark.py``detect_invisible_watermark(path)` decodes the OPEN DWT-DCT watermarks (public decoder, no key) embedded by Stable Diffusion / SDXL / FLUX via the `imwatermark` library. Known fixed patterns (verified against upstream source) live in `_BITS_48` (SDXL 48-bit, FLUX.2 48-bit) and `_SD1_STRING` ("StableDiffusionV1", SD 1.x/2.x). Optional dep (extra `detect`); returns None when absent. The `detect` extra pulls **torch** transitively (invisible-watermark declares torch a hard dep, and `WatermarkDecoder` eagerly imports `rivaGan` -> `torch` at import time), so detection needs torch present even though dwtDct runs CPU-only on cv2/numpy/pywavelets — no GPU and no separate `gpu` extra required. **Unlike SynthID this is locally detectable**, but the watermark is fragile (does not survive JPEG re-encode/resize — verified gone after JPEG q90), so it confirms origin only on pristine files. Add new known patterns here. The file carries a top-of-module pyright pragma because imwatermark/cv2 ship no type stubs.
- `trustmark_detector.py``detect_trustmark(path)` decodes the OPEN, keyless **Adobe TrustMark** watermark (the soft binding behind Adobe Durable Content Credentials, `alg` `com.adobe.trustmark.P`) via the optional `trustmark` package (extra `trustmark`; pulls torch, downloads model weights on first use). Mirrors `invisible_watermark.py` (lazy singleton guarded by a double-checked `threading.Lock` so concurrent callers do not double-download the weights, top-of-module pyright pragma, returns None when absent). It detects *provenance*, not AI origin as such (TrustMark also marks human-authored content), so `identify` lists it as a watermark without setting `is_ai_generated`. Other soft-binding vendors (Digimarc/Imatag/Steg.AI/...) have no public decoder — they are only *named* via the `C2PA_SOFT_BINDINGS` scan, not decoded. **False-positive gate (added 2026-05-29):** TrustMark's `wm_present` is a BCH error-correction validity flag that spuriously validates on a content-correlated fraction of un-watermarked images — AI-generated textures trip it far more than camera photos (verified 2026-05-29 on real files: it fires on Gemini/OpenAI/Doubao output that *cannot* carry Adobe's watermark, with a random-bytes decoded secret, while signal-free camera photos did not trip it). A genuine TrustMark is a *durable* soft binding engineered to survive re-encoding, so `detect_trustmark` re-decodes after a mild JPEG round-trip (`_survives_reencode`, `_REENCODE_QUALITY` 95) and requires the same schema both times; every observed false positive collapsed (none survived even q95), so the gate is the durability property the watermark guarantees. The second decode runs only on the rare initial hit, so the cost is negligible. Do NOT remove the gate to "catch more" — a lone TrustMark hit without it is almost always content noise.
- `noai/watermark_remover.py` — the `WatermarkRemover` class has two diffusion pipelines, selected by the explicit `pipeline` ctor arg (NOT inferred from `model_id` -- both use the same SDXL base, `DEFAULT_MODEL_ID`). **`default`** runs plain SDXL img2img (`_run_img2img`). **`controlnet`** (`_run_controlnet`, `_load_controlnet_pipeline`) runs `StableDiffusionXLControlNetImg2ImgPipeline` with the SDXL-native canny ControlNet `xinsir/controlnet-canny-sdxl-1.0` (`watermark_profiles.CONTROLNET_CANNY_MODEL`): the control image is `cv2.Canny(gray, 100, 200)` stacked to 3 channels (`_CANNY_LOW`/`_CANNY_HIGH`, prompt `_CONTROLNET_PROMPT` / `_CONTROLNET_NEGATIVE`). **Removal still comes from the img2img regeneration (`strength`); the ControlNet only PRESERVES text and face STRUCTURE via the edge map -- no original pixels are copied or frozen, so SynthID does not survive.** Canny holds face STRUCTURE but NOT identity (the regenerated face drifts in likeness -- canny carries edges, not identity; face identity is preserved by the optional `--restore-faces` GFPGAN post-pass, auto when faces present -- see `face_restore.py`). `controlnet_conditioning_scale` (ctor arg, default 1.0) is the structure-preservation knob. Same dtype rule as `default` (fp32 on cpu/mps, fp16 only on cuda/xpu; the fp16-fixed SDXL VAE `_SDXL_FP16_VAE_ID` is swapped in on fp16 GPUs -- issue #29) and the same MPS->CPU fallback (reload on cpu/fp32, drop a non-cpu generator, retry once).
- `face_restore.py` — optional GFPGAN face-restoration post-pass (cv2/torch/gfpgan boundary, top-of-file pyright pragma). Runs AFTER the diffusion removal pass (`InvisibleEngine.remove_watermark`, params `restore_faces=True` / `restore_faces_weight=0.5`; CLI `--restore-faces`/`--no-restore-faces` + `--restore-faces-weight` on `invisible`/`all`/`batch`, ON by default). **Restores face IDENTITY while still scrubbing the pixel watermark:** GFPGAN re-synthesizes each face from a StyleGAN2 prior (codebook/GAN pixels, NOT the original), so the composited face regions carry no watermark and no pixel-copy -- oracle-validated clean at weight 0.5 with identity preserved. Flow: GFPGANer.enhance runs on the ORIGINAL (watermarked) image -> identity faces + RetinaFace boxes (`restorer.face_helper.det_faces`); `_composite_faces` feather-composites those restored face REGIONS into the diffusion-cleaned image. `is_available()` gates on gfpgan + facexlib; lazily-built `GFPGANer` singleton forces CPU unless CUDA (the pip GFPGANer has an MPS device-mismatch bug; it is a cheap post-pass on a few face crops). `_apply_basicsr_shim()` recreates the removed `torchvision.transforms.functional_tensor` module that basicsr imports. The pure `_composite_faces` helper (Gaussian-feathered rectangular alpha per box, `out = restored*a + base*(1-a)`) is unit-tested without the model (`tests/test_face_restore.py`); the model-running path is gated behind `is_available()`. **Commercial-safe** (GFPGAN Apache-2.0 + RetinaFace MIT); the CodeFormer alternative is NON-COMMERCIAL and is NOT shipped. The `restore` extra (gfpgan/facexlib/basicsr) is kept OUT of `all` (heavy + the GFPGANv1.4 + RetinaFace weights download on first use, never bundled). **`restore` pins numpy<2** (same trap class as the removed faceid/insightface extra): basicsr/gfpgan/facexlib are an old ecosystem, so the extra caps `scipy<1.18` (>=1.18 uses `np.long`, gone in numpy 1.24-1.26) and `numba<0.60` to keep the whole env on one numpy 1.26 resolution; verified the `--extra dev --extra gpu` gate env stays numpy 1.26.4 + `diffusers.loaders.peft` importable with `restore` present. **basicsr 1.4.2 builds only on Python <3.13** (its `setup.py get_version()` uses `exec(...)` + `locals()['__version__']`, which the 3.13 fast-locals change broke -> `KeyError: '__version__'`), so the project is pinned to Python 3.12 via `.python-version` and `[tool.uv.extra-build-dependencies] basicsr = ["setuptools<69"]`. basicsr ships sdist-only (no wheel).
- `noai/watermark_remover.py` — the `WatermarkRemover` class has two diffusion pipelines, selected by the explicit `pipeline` ctor arg (NOT inferred from `model_id` -- both use the same SDXL base, `DEFAULT_MODEL_ID`). **`default`** runs plain SDXL img2img (`_run_img2img`). **`controlnet`** (**EXPERIMENTAL, opt-in**; `_run_controlnet`, `_load_controlnet_pipeline`) runs `StableDiffusionXLControlNetImg2ImgPipeline` with the SDXL-native canny ControlNet `xinsir/controlnet-canny-sdxl-1.0` (`watermark_profiles.CONTROLNET_CANNY_MODEL`): the control image is `cv2.Canny(gray, 100, 200)` stacked to 3 channels (`_CANNY_LOW`/`_CANNY_HIGH`, prompt `_CONTROLNET_PROMPT` / `_CONTROLNET_NEGATIVE`). **Removal still comes from the img2img regeneration (`strength`); the ControlNet only PRESERVES text and face STRUCTURE via the edge map -- no original pixels are copied or frozen, so SynthID does not survive.** Canny holds face STRUCTURE but NOT identity (the regenerated face drifts in likeness -- canny carries edges, not identity; face identity is preserved by the optional `--restore-faces` GFPGAN post-pass (EXPERIMENTAL, opt-in, OFF by default) -- see `face_restore.py`). `controlnet_conditioning_scale` (ctor arg, default 1.0) is the structure-preservation knob. Same dtype rule as `default` (fp32 on cpu/mps, fp16 only on cuda/xpu; the fp16-fixed SDXL VAE `_SDXL_FP16_VAE_ID` is swapped in on fp16 GPUs -- issue #29) and the same MPS->CPU fallback (reload on cpu/fp32, drop a non-cpu generator, retry once).
- `face_restore.py` — optional GFPGAN face-restoration post-pass (cv2/torch/gfpgan boundary, top-of-file pyright pragma). **EXPERIMENTAL, opt-in, OFF by default.** Runs AFTER the diffusion removal pass (`InvisibleEngine.remove_watermark`, params `restore_faces=False` / `restore_faces_weight=0.5`; CLI `--restore-faces`/`--no-restore-faces` + `--restore-faces-weight` on `invisible`/`all`/`batch`). **Restores face IDENTITY while still scrubbing the pixel watermark:** GFPGAN re-synthesizes each face from a StyleGAN2 prior (codebook/GAN pixels, NOT the original), so the composited face regions carry no watermark and no pixel-copy -- oracle-validated clean at weight 0.5 with identity preserved. Flow: GFPGANer.enhance runs on the ORIGINAL (watermarked) image -> identity faces + RetinaFace boxes (`restorer.face_helper.det_faces`); `_composite_faces` feather-composites those restored face REGIONS into the diffusion-cleaned image. `is_available()` gates on gfpgan + facexlib; lazily-built `GFPGANer` singleton forces CPU unless CUDA (the pip GFPGANer has an MPS device-mismatch bug; it is a cheap post-pass on a few face crops). `_apply_basicsr_shim()` recreates the removed `torchvision.transforms.functional_tensor` module that basicsr imports. The pure `_composite_faces` helper (Gaussian-feathered rectangular alpha per box, `out = restored*a + base*(1-a)`) is unit-tested without the model (`tests/test_face_restore.py`); the model-running path is gated behind `is_available()`. **Commercial-safe** (GFPGAN Apache-2.0 + RetinaFace MIT); the CodeFormer alternative is NON-COMMERCIAL and is NOT shipped. The `restore` extra (gfpgan/facexlib/basicsr) is kept OUT of `all` (heavy + the GFPGANv1.4 + RetinaFace weights download on first use, never bundled). **`restore` pins numpy<2** (same trap class as the removed faceid/insightface extra): basicsr/gfpgan/facexlib are an old ecosystem, so the extra caps `scipy<1.18` (>=1.18 uses `np.long`, gone in numpy 1.24-1.26) and `numba<0.60` to keep the whole env on one numpy 1.26 resolution; verified the `--extra dev --extra gpu` gate env stays numpy 1.26.4 + `diffusers.loaders.peft` importable with `restore` present. **basicsr 1.4.2 builds only on Python <3.13** (its `setup.py get_version()` uses `exec(...)` + `locals()['__version__']`, which the 3.13 fast-locals change broke -> `KeyError: '__version__'`), so the project is pinned to Python 3.12 via `.python-version` and `[tool.uv.extra-build-dependencies] basicsr = ["setuptools<69"]`. basicsr ships sdist-only (no wheel).
- `image_io.py` — Unicode-safe cv2 IO (issue #17). `imread(path, flags=None)` / `imwrite(path, img)` wrap `np.fromfile`+`cv2.imdecode` / `cv2.imencode`+`tofile` so non-ASCII paths work on Windows -- bare `cv2.imread`/`cv2.imwrite` use the platform ANSI code-page API there and fail (empty decode + `can't open/read file`) on Chinese/Cyrillic/accented filenames. `imread` keeps `cv2.imread` semantics (defaults to `IMREAD_COLOR`, returns `None` on missing/empty/undecodable). **Every cv2 file read/write in the package routes through here; do not call `cv2.imread`/`cv2.imwrite` directly.** `imwrite` returns `False` on an unwritable path (`OSError` caught) instead of raising, matching `cv2.imwrite` semantics. macOS/Linux already accept UTF-8 paths, so it is behavior-neutral there (the bug only reproduces on Windows). cv2/numpy are imported lazily inside the functions, so the module is cheap to import in a bare env.
### Doubao clean-reverse-alpha distillation (re-investigated 2026-05-29)
@@ -88,4 +88,4 @@ Who embeds what, and whether it is locally detectable (so we know which gaps are
- **External AI-vs-real classifier models are out of scope (decided 2026-05-24).** Generic HuggingFace detectors (`Organika/sdxl-detector` Swin Transformer, `umm-maybe/AI-image-detector`, and fine-tunes) exist and report ~0.98 on their *own* SDXL-vs-real validation sets, but they are per-generator and the model cards themselves note degraded accuracy off-distribution; they are untested on gpt-image / Gemini Nano Banana (the metadata-stripped surfaces we care about), and our own light SDXL pass would likely defeat them the same way it defeats SynthID. Detection here stays local + signal-based (metadata + visible sparkle); do not add a bundled classifier dependency.
- **DEFAULT STRENGTH IS NOW VENDOR-ADAPTIVE (2026-06-01, SUPERSEDES every fixed-default claim in this bullet and the next).** `resolve_strength(strength, profile, vendor)` + `vendor_for_strength(path)` (`watermark_profiles.py`) read the C2PA issuer (`metadata.synthid_source`) on the ORIGINAL input and pick `OPENAI_STRENGTH` **0.10** / `GEMINI_STRENGTH` **0.15** / `UNKNOWN_STRENGTH` **0.15** when `--strength` is unset; explicit `--strength` always wins. The CLI detects the vendor from the pristine source (before the visible pass / metadata-strip removes C2PA from the temp file) and passes it to the engine, so display and execution agree; `cmd_invisible`/`cmd_all`/`batch` + the module-level `remove_watermark` all thread `vendor`. **This replaces the single 0.30 default AND the prior "do NOT build a vendor-adaptive default" policy** -- both came from the now-debunked region-rescrub-contaminated study (the per-region re-scrub that contaminated those numbers was removed in the controlnet refactor). Basis: the oracle-verified June 2026 controlled study (clean v0.8.6, protect OFF): OpenAI clears at 0.05 across 1024-1600 (n=4, resolution-independent); Google needs 0.15 on the capped-1536 path (n=4). `docs/synthid.md` §2.2 (data) + §5.2 (the adaptive default) are authoritative. CAVEAT: Google's 0.15 was validated only on `--max-resolution 1536`; native large Gemini (2816) was not locally measurable (OOM on M-series) and is pending GPU validation on raiw.cc -- if it survives 0.15 native, raise `--strength`. **Everything below in this bullet about a fixed 0.10/0.30 default is HISTORICAL; trust the vendor-adaptive constants + docs/synthid.md.**
- **SynthID removal: strength + oracle scope.** Default strength is vendor-adaptive (see the bullet above); `docs/synthid.md` §2.2 is authoritative for the numbers. **Oracle scope (load-bearing):** the Gemini app "Verify with SynthID" is the ONLY valid SynthID oracle (detects Google's mark on any image); `openai.com/verify` is scoped to OpenAI provenance (its own C2PA), NOT a SynthID oracle -- a negative there is meaningless for SynthID. There is no local SynthID detector, so the tool cannot self-check; if the oracle still reads SynthID, raise `--strength` to the lowest value that verifies clean. Only the `default` (plain SDXL img2img) and `controlnet` (SDXL + canny ControlNet) profiles exist; the local `invisible` default is weight-for-weight identical to raiw.cc prod (`fal-ai/fast-sdxl` = `stabilityai/stable-diffusion-xl-base-1.0`, runtime-downloaded, not bundled). **Forensic-stealth caveat** (arXiv:2605.09203): defeating the SynthID verifier is NOT forensic invisibility -- independent detectors flag *removal-processed* images vs genuinely-clean ones at >98% TPR@1%FPR, so do not over-claim "indistinguishable from a real photo".
- **`controlnet` pipeline (text/face STRUCTURE preservation, opt-in `--pipeline controlnet`).** SDXL + the canny ControlNet `xinsir/controlnet-canny-sdxl-1.0` via `StableDiffusionXLControlNetImg2ImgPipeline` (`watermark_remover._run_controlnet` / `_load_controlnet_pipeline`). **Removal still comes from the img2img regeneration (`strength`); the ControlNet only PRESERVES text and face STRUCTURE by conditioning on the canny edge map** (`cv2.Canny(gray, 100, 200)`, 3-channel). Canny preserves edges, NOT face identity (a regenerated face drifts in likeness); face identity is preserved by the optional `--restore-faces` GFPGAN post-pass (auto when faces present -- see `face_restore.py`, the `restore` extra), which re-synthesizes each face from a StyleGAN2 prior so the composited face pixels carry no watermark. The CodeFormer alternative stays NON-COMMERCIAL and is not shipped. The earlier `--face-id` IP-Adapter FaceID layer was REMOVED (footgun: it needs high strength and corrupts faces at the low removal strength). **No original pixels are copied or frozen, so SynthID does not survive** -- unlike the deleted text/face-protection subsystems, which restored or re-scrubbed original pixels and could shield the watermark. `controlnet_conditioning_scale` (CLI `--controlnet-scale`, default 1.0) is the structure-preservation knob (higher = closer to the original structure). It shares the SDXL base, so it uses the SAME vendor-adaptive strength as `default` (`resolve_strength`); fp32 on cpu/mps, fp16-fixed VAE on cuda/xpu. The `controlnet` profile is threaded explicitly (`WatermarkRemover(pipeline=...)` / `InvisibleEngine(pipeline=...)`), NOT inferred from `model_id`. This productionizes the `scripts/controlnet_sweep.py` prototype; see `docs/controlnet-removal-pipeline-research.md`. **Forensic-stealth caveat still applies** (arXiv:2605.09203): defeating the SynthID verifier is not forensic invisibility -- a "this image went through a removal pipeline" classifier can still flag the output.
- **`controlnet` pipeline (text/face STRUCTURE preservation, EXPERIMENTAL, opt-in `--pipeline controlnet`).** SDXL + the canny ControlNet `xinsir/controlnet-canny-sdxl-1.0` via `StableDiffusionXLControlNetImg2ImgPipeline` (`watermark_remover._run_controlnet` / `_load_controlnet_pipeline`). **Removal still comes from the img2img regeneration (`strength`); the ControlNet only PRESERVES text and face STRUCTURE by conditioning on the canny edge map** (`cv2.Canny(gray, 100, 200)`, 3-channel). Canny preserves edges, NOT face identity (a regenerated face drifts in likeness); face identity is preserved by the optional `--restore-faces` GFPGAN post-pass (EXPERIMENTAL, opt-in, OFF by default -- see `face_restore.py`, the `restore` extra), which re-synthesizes each face from a StyleGAN2 prior so the composited face pixels carry no watermark. The CodeFormer alternative stays NON-COMMERCIAL and is not shipped. The earlier `--face-id` IP-Adapter FaceID layer was REMOVED (footgun: it needs high strength and corrupts faces at the low removal strength). **No original pixels are copied or frozen, so SynthID does not survive** -- unlike the deleted text/face-protection subsystems, which restored or re-scrubbed original pixels and could shield the watermark. `controlnet_conditioning_scale` (CLI `--controlnet-scale`, default 1.0) is the structure-preservation knob (higher = closer to the original structure). It shares the SDXL base, so it uses the SAME vendor-adaptive strength as `default` (`resolve_strength`); fp32 on cpu/mps, fp16-fixed VAE on cuda/xpu. The `controlnet` profile is threaded explicitly (`WatermarkRemover(pipeline=...)` / `InvisibleEngine(pipeline=...)`), NOT inferred from `model_id`. This productionizes the `scripts/controlnet_sweep.py` prototype; see `docs/controlnet-removal-pipeline-research.md`. **Forensic-stealth caveat still applies** (arXiv:2605.09203): defeating the SynthID verifier is not forensic invisibility -- a "this image went through a removal pipeline" classifier can still flag the output.
+5 -5
View File
@@ -23,7 +23,7 @@ If this tool saves you time, consider [sponsoring its development](https://githu
- **AI metadata stripping** — EXIF, PNG text chunks, C2PA provenance manifests (PNG / JPEG / AVIF / HEIF / JPEG-XL, **MP4 / MOV / M4V / M4A** at the container level, and **WebM / MP3 / WAV / FLAC / OGG** losslessly via ffmpeg), XMP DigitalSourceType
- **"Made with AI" label removal** — removes the AI-disclosure metadata that platforms read to apply automatic labels (useful for clearing a false-positive label from a human-edited photograph)
- **Analog Humanizer** — optional film grain and chromatic aberration post-processing
- **Text and face preservation** — optional `--pipeline controlnet` adds a canny ControlNet that keeps text and face structure sharp through the removal pass (without copying original pixels, so SynthID is still removed). Canny preserves face *structure*, not *identity* (the regenerated face drifts in likeness); identity is preserved by the `--restore-faces` GFPGAN post-pass (on by default, auto when faces present)
- **Text and face preservation (experimental)** — optional `--pipeline controlnet` adds a canny ControlNet that keeps text and face structure sharp through the removal pass (without copying original pixels, so SynthID is still removed). Canny preserves face *structure*, not *identity* (the regenerated face drifts in likeness); identity is preserved by the `--restore-faces` GFPGAN post-pass (opt-in). Both are experimental and off by default.
- **Batch processing** — process entire directories
- **Detection** — three-stage NCC watermark detection with confidence scoring
- **Provenance detection (`identify`)** — aggregate C2PA issuer, the C2PA soft-binding forensic-watermark vendor (Adobe TrustMark, Digimarc, Imatag, ...), IPTC "Made with AI" plus the IPTC 2025.1 `AISystemUsed` field, embedded SD/ComfyUI params, EXIF/XMP generator tags, the xAI/Grok EXIF signature, the China TC260 AIGC label (XMP, PNG chunk, or EXIF), the HuggingFace `hf-job-id` job marker, the SynthID metadata proxy, the visible marks (Gemini sparkle plus the Doubao "豆包AI生成" / Jimeng "即梦AI" text marks), the open SD/SDXL/FLUX invisible watermark, and (with the `trustmark` extra) the open Adobe TrustMark watermark into one origin-platform + watermark-inventory verdict (`--json` for machine output)
@@ -117,9 +117,9 @@ image → encode to latent space (VAE) at native resolution
> **Default strength is vendor-adaptive (no flag needed).** The tool reads the C2PA issuer to detect which vendor's SynthID is present and picks the strength that clears it with the least quality loss: **OpenAI gpt-image → `0.10`**, **Google Gemini → `0.15`**, **unknown source → `0.15`**. An oracle-verified June 2026 study (clean pipeline, per-image openai.com/verify or Gemini app) found OpenAI's watermark clears at `0.05` across `1024`-`1600` px (resolution-independent) while Google's is ~3x more robust and needs `0.15`. The dominant factor is the vendor, not resolution. There is no local SynthID detector, so if the oracle still reads SynthID, raise `--strength`; if you care more about preserving fine text, lower it. (Caveat: Google's `0.15` was validated on the capped `--max-resolution 1536` path; a very large native Gemini image may need more.)
>
> **`--pipeline controlnet` preserves text and face structure.** It runs the same SDXL img2img scrub but adds a canny ControlNet that conditions the regeneration on the image's edge map, so text and structure stay sharp at the strengths that remove SynthID. The watermark removal still comes from the img2img regeneration (`--strength`); the ControlNet only preserves structure — no original pixels are copied or frozen, so SynthID does not survive. `--controlnet-scale` tunes the preservation strength (higher = closer to the original structure). Runs fp32 on mps/cpu (fp16 only on cuda/xpu, where the fp16-fixed SDXL VAE is loaded automatically).
> **`--pipeline controlnet` preserves text and face structure (experimental, opt-in).** It runs the same SDXL img2img scrub but adds a canny ControlNet that conditions the regeneration on the image's edge map, so text and structure stay sharp at the strengths that remove SynthID. The watermark removal still comes from the img2img regeneration (`--strength`); the ControlNet only preserves structure — no original pixels are copied or frozen, so SynthID does not survive. `--controlnet-scale` tunes the preservation strength (higher = closer to the original structure). Runs fp32 on mps/cpu (fp16 only on cuda/xpu, where the fp16-fixed SDXL VAE is loaded automatically).
>
> **`--restore-faces` preserves face identity (GFPGAN, on by default).** Canny preserves where a face is, but not who it is — the regenerated face drifts in likeness. The `--restore-faces` post-pass (on by default; needs the `restore` extra) fixes this: after the removal pass it runs GFPGAN on the original faces and composites the restored face regions into the cleaned image. GFPGAN re-synthesizes each face from a StyleGAN2 prior, so those pixels are GAN-generated (not copied) — the watermark is still scrubbed in the face regions while identity is held (oracle-confirmed clean). It auto-skips when no face is detected or the extra is absent. Tune fidelity with `--restore-faces-weight` (default `0.5`; lower = more regeneration / cleaner scrub, higher = closer to the input). Commercial-safe (GFPGAN is Apache-2.0, its RetinaFace detector MIT); the CodeFormer alternative is non-commercial and is not shipped. (An IP-Adapter FaceID approach was tried earlier and removed: it needs high denoise strength and corrupts faces at the low strength used for removal.)
> **`--restore-faces` preserves face identity (GFPGAN, experimental, opt-in).** Canny preserves where a face is, but not who it is — the regenerated face drifts in likeness. The `--restore-faces` post-pass (experimental, off by default; needs the `restore` extra) fixes this: after the removal pass it runs GFPGAN on the original faces and composites the restored face regions into the cleaned image. GFPGAN re-synthesizes each face from a StyleGAN2 prior, so those pixels are GAN-generated (not copied) — the watermark is still scrubbed in the face regions while identity is held (oracle-confirmed clean). It auto-skips when no face is detected or the extra is absent. Tune fidelity with `--restore-faces-weight` (default `0.5`; lower = more regeneration / cleaner scrub, higher = closer to the input). Commercial-safe (GFPGAN is Apache-2.0, its RetinaFace detector MIT); the CodeFormer alternative is non-commercial and is not shipped. (An IP-Adapter FaceID approach was tried earlier and removed: it needs high denoise strength and corrupts faces at the low strength used for removal.)
SDXL is the default since May 2026: empirically defeats SynthID v2 on Gemini 3 Pro outputs, where the older SD-1.5 pipeline at 768 px did not. The SD-1.5 path was removed once it was verified not to handle v2. Note the scope: this defeats the SynthID *verifier*, which is not the same as being forensically indistinguishable from a real photo. Recent work ([arXiv:2605.09203](https://arxiv.org/abs/2605.09203)) shows watermark-removal pipelines leave detectable traces, so a separate "this image was processed" classifier can still flag the output.
@@ -127,7 +127,7 @@ SDXL is the default since May 2026: empirically defeats SynthID v2 on Gemini 3 P
> **Technical deep-dive:** see [`docs/synthid.md`](docs/synthid.md) for a primary-source-cited breakdown of how SynthID works mechanically (post-hoc encoder/decoder, 136-bit payload, pixel-space embedding), what it empirically survives (JPEG, crop, resize: ~99.98% TPR at 0.1% FPR from arXiv:2510.09263), what removes it, and the forensic-stealth tradeoff (all known removal attacks are detectable at >98% TPR@1%FPR per arXiv:2605.09203).
**Text and face preservation** (opt-in `--pipeline controlnet`): adds a canny ControlNet so text and face *structure* stay sharp through the removal pass, without copying or freezing any original pixels (so SynthID is still removed). Tune the preservation strength with `--controlnet-scale`. Canny preserves structure but not face *identity* (identity is preserved by the `--restore-faces` GFPGAN post-pass, on by default — see the callout above).
**Text and face preservation** (experimental, opt-in `--pipeline controlnet`): adds a canny ControlNet so text and face *structure* stay sharp through the removal pass, without copying or freezing any original pixels (so SynthID is still removed). Tune the preservation strength with `--controlnet-scale`. Canny preserves structure but not face *identity* (identity is preserved by the `--restore-faces` GFPGAN post-pass, experimental and off by default — see the callout above). Both features are experimental.
**Analog Humanizer**: optional film grain and chromatic aberration injection that mimics a photo of a screen, raising the bar for AI-generated image classifiers. (It frustrates generic classifiers but does not guarantee forensic invisibility — see the [arXiv:2605.09203](https://arxiv.org/abs/2605.09203) note above.)
@@ -206,7 +206,7 @@ After installation the `remove-ai-watermarks` command is available system-wide.
> ```
>
> To preserve face identity after invisible removal (the `--restore-faces`
> GFPGAN post-pass, on by default), install the `restore` extra. The GFPGANv1.4
> GFPGAN post-pass, experimental and opt-in), install the `restore` extra. The GFPGANv1.4
> and RetinaFace weights download on first use. It needs Python < 3.13 (basicsr
> does not build on 3.13):
>
+7 -3
View File
@@ -379,14 +379,18 @@ the payload, reconstituting SynthID in text. The lesson held and shaped the
current design: **content is preserved by REGENERATING it under structural
conditioning, never by copying original pixels.**
- **Text + structure:** `--pipeline controlnet` (SDXL img2img + a canny ControlNet)
conditions the regeneration on the edge map, so text and structure stay sharp
Both preservation features below are **EXPERIMENTAL and opt-in (off by default)**;
the plain `default` SDXL img2img pass is the shippable path.
- **Text + structure:** `--pipeline controlnet` (SDXL img2img + a canny ControlNet,
experimental/opt-in) conditions the regeneration on the edge map, so text and
structure stay sharp
while every pixel is still regenerated -- SynthID is removed everywhere. Verified
better than plain img2img at the same strength (text stays legible where plain
garbles it), and the controlnet background scrub reads clean on the oracle.
- **Face identity:** canny holds face *structure* but not *identity*. Shipped as the
optional `--restore-faces` GFPGAN post-pass (`face_restore.py`, the `restore`
extra, ON by default, auto when faces present). It runs GFPGAN on the ORIGINAL
extra, experimental/opt-in, off by default). It runs GFPGAN on the ORIGINAL
faces and feather-composites the restored face REGIONS into the cleaned image:
GFPGAN RE-SYNTHESIZES each face from a StyleGAN2 prior (GAN pixels, not original
-> scrubs SynthID) at a low fidelity weight (`--restore-faces-weight`, default
+10 -9
View File
@@ -143,7 +143,8 @@ _controlnet_scale_option = click.option(
"--controlnet-scale",
type=float,
default=1.0,
help="ControlNet conditioning scale (structure/text preservation strength), controlnet pipeline only.",
help="ControlNet conditioning scale (structure/text preservation strength), controlnet pipeline "
"only (EXPERIMENTAL).",
)
@@ -151,10 +152,10 @@ def _restore_faces_options(f: Any) -> Any:
"""Attach the shared GFPGAN face-restoration flags to an invisible-pipeline command."""
restore_flag = click.option(
"--restore-faces/--no-restore-faces",
default=True,
help="Restore face identity with a GFPGAN post-pass when faces are present "
"(needs the 'restore' extra); on by default, auto-skips when no face is detected "
"or the extra is absent.",
default=False,
help="EXPERIMENTAL, opt-in. Restore face identity with a GFPGAN post-pass when "
"faces are present (needs the 'restore' extra); off by default, auto-skips when no "
"face is detected or the extra is absent.",
)
weight_flag = click.option(
"--restore-faces-weight",
@@ -471,7 +472,7 @@ def cmd_erase(
type=click.Choice(["default", "controlnet"]),
default="default",
help="Pipeline profile (default=SDXL img2img; controlnet=SDXL + canny ControlNet that preserves "
"text/faces via edge conditioning while removing SynthID).",
"text/faces via edge conditioning while removing SynthID, EXPERIMENTAL).",
)
@click.option(
"--device",
@@ -715,7 +716,7 @@ def cmd_identify(ctx: click.Context, source: Path, no_visible: bool, as_json: bo
type=click.Choice(["default", "controlnet"]),
default="default",
help="Pipeline profile (default=SDXL img2img; controlnet=SDXL + canny ControlNet that preserves "
"text/faces via edge conditioning while removing SynthID).",
"text/faces via edge conditioning while removing SynthID, EXPERIMENTAL).",
)
@click.option("--model", type=str, default=None, help="HuggingFace model ID for invisible removal.")
@click.option(
@@ -907,7 +908,7 @@ def _process_batch_image(
hf_token: str | None,
humanize: float,
max_resolution: int = 0,
restore_faces: bool = True,
restore_faces: bool = False,
restore_faces_weight: float = 0.5,
) -> None:
"""Process a single image for batch mode.
@@ -1012,7 +1013,7 @@ def _process_batch_image(
type=click.Choice(["default", "controlnet"]),
default="default",
help="Pipeline profile (default=SDXL img2img; controlnet=SDXL + canny ControlNet that preserves "
"text/faces via edge conditioning while removing SynthID).",
"text/faces via edge conditioning while removing SynthID, EXPERIMENTAL).",
)
@click.option(
"--device",
+5 -4
View File
@@ -126,7 +126,7 @@ class InvisibleEngine:
humanize: float = 0.0,
max_resolution: int = 0,
vendor: str | None = None,
restore_faces: bool = True,
restore_faces: bool = False,
restore_faces_weight: float = 0.5,
) -> Path:
"""Remove invisible watermark from an image.
@@ -140,9 +140,10 @@ class InvisibleEngine:
guidance_scale: Classifier-free guidance scale.
seed: Random seed for reproducibility.
humanize: Intensity of Analog Humanizer film grain (0 = off).
restore_faces: Run the optional GFPGAN face-restoration post-pass when
faces are present (needs the ``restore`` extra). Auto-skips with a
debug log when the extra is absent or no face is detected.
restore_faces: EXPERIMENTAL, opt-in (default False). Run the GFPGAN
face-restoration post-pass when faces are present (needs the
``restore`` extra). Auto-skips with a debug log when the extra is
absent or no face is detected.
restore_faces_weight: GFPGAN fidelity weight (0-1); lower = more GAN
regeneration (cleaner watermark scrub), higher = closer to input.
max_resolution: Cap the long side (px) before diffusion. 0 (default)