When `visible --mark auto` (or an explicit `--mark` with detection on) found no registered mark, it exited 0 without writing output -- which a wrapping service reads as success and re-serves the unchanged input. ~74% of real uploads carry no registered visible mark, so this was the dominant "it didn't work" / NPS score-0 failure mode. Now it runs a cheap metadata-only identify, prints actionable guidance (route to `all` for an invisible/metadata mark, or `erase` for an arbitrary logo), writes no output file, and exits EXIT_NO_VISIBLE_MARK (2) -- distinct from success (0) and a hard error (1) so the caller can surface the message. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Remove-AI-Watermarks
You are a principal Python engineer maintaining a CLI tool and library for removing visible and invisible AI watermarks from images.
How to run
- `uv run remove-ai-watermarks all <image.png> -o <output.png>` -- full pipeline (visible + invisible + metadata). Same diffusion knobs as `invisible` below, plus the visible-pass `--inpaint`/`--no-inpaint`/`--inpaint-method`. When the `[gpu]` extra is absent, step 2 (invisible/SynthID) is skipped -- `all` still writes an output (visible mark + metadata stripped) but prints a prominent end-of-run banner ("the invisible (SynthID) watermark was NOT removed") AND exits non-zero (1), so a skipped SynthID pass is not mistaken for a clean result (the recurring #14/#47 trap, where the old quiet inline warning was missed). `invisible` already hard-errors without the extra; only `all` continued, hence the loud end-banner. Regression-guarded by `tests/test_cli.py::TestAllCommand::test_all_loud_warning_and_nonzero_exit_when_gpu_missing`. Test trap: any `all` test that exercises the full pipeline MUST `patch("remove_ai_watermarks.invisible_engine.is_available", return_value=True)` -- CI installs core+dev only (no `[gpu]`), so an unpatched `all` test takes the skip branch and now hits the non-zero exit. This passed locally (gpu present → `is_available()` True) but red-failed every matrix cell on the v0.11.0 commit (`test_all_basic`/`test_all_visible_step_uses_registry` asserted exit 0); both now patch `is_available` True.
- `uv run remove-ai-watermarks invisible <image.png> -o <out.png>` -- diffusion SynthID removal. Full knob set (kept identical across `invisible`/`all`/`batch`): `--strength` (vendor-adaptive default), `--steps`, `--guidance-scale` (CFG, default 7.5), `--pipeline sdxl|controlnet` (default `controlnet`), `--controlnet-scale`, `--model` (HF model id, default SDXL base), `--device`, `--seed`, `--hf-token`, `--max-resolution`/`--min-resolution`, `--upscaler lanczos|esrgan`, `--humanize` (Analog Humanizer grain), `--unsharp` (final sharpen), and `--adaptive-polish`/`--no-adaptive-polish` (ON by default; detail-targeted polish that self-gates to a no-op where there is no deficit). `--auto` is deprecated and now a no-op that only warns (the polish it used to enable is ON by default).
- `uv run remove-ai-watermarks visible <image.png> -o <out.png>` -- known-visible-mark removal, CPU, no GPU. Reverse-alpha based: each mark is removed by inverting its captured alpha map. `--mark auto` (default) picks the strongest detected of the Gemini sparkle, the Doubao "豆包AI生成" text strip, the Jimeng "★ 即梦AI" wordmark, and the Samsung Galaxy AI "✦ Contenuti generati dall'AI" strip (bottom-LEFT, locale-specific, Italian variant calibrated); `--mark gemini`/`--mark doubao`/`--mark jimeng`/`--mark samsung` force one (choices come from the registry). Gemini/Doubao recover pixels exactly with no inpaint at native; Jimeng and Samsung add an always-on thin residual inpaint over the glyph footprint (their marks re-rasterize per image, so reverse-alpha alone leaves a faint outline). For arbitrary logos/objects use `erase`. When `--mark auto` finds no known mark (the common case: ~74% of real uploads carry no registered visible mark), the command does NOT silently re-serve the input as a finished result. It runs a cheap metadata-only `identify`, prints actionable guidance (if the image carries an invisible/metadata mark, e.g. an OpenAI/Gemini C2PA image, it points to `all`; otherwise to `erase --region`), writes NO output file, and exits `EXIT_NO_VISIBLE_MARK` (2) -- distinct from success (0) and a hard error (1) so a wrapping service (raiw.cc) can surface the message instead of treating the unchanged image as done (the production "it didn't work" / score-0 trap). Same handling for an explicit `--mark <name>` that is not detected. Helper `cli._no_visible_mark_exit`; regression-guarded by `tests/test_cli.py::TestVisibleCommand::test_visible_auto_no_mark_exits_two_with_eraser_hint` and `test_visible_auto_no_mark_routes_to_all_when_metadata`. `--no-detect` still forces the gemini fallback and proceeds (exit 0).
- `uv run remove-ai-watermarks erase <image.png> --region x,y,w,h -o <out.png>` -- universal region eraser (any logo/object, any position). `--backend cv2` (default, no deps) or `--backend lama` (big-LaMa via onnxruntime, extra `lama`); `--region` is repeatable.
- `uv run remove-ai-watermarks identify <image>` -- provenance verdict (platform + watermark inventory + confidence); `--json` for machine output, `--no-visible` to skip the cv2 sparkle detector
- `uv run remove-ai-watermarks metadata <image.png> --check` -- inspect AI metadata (C2PA, EXIF, PNG chunks)
- `uv run remove-ai-watermarks metadata <image.png> --remove -o <out.png>` -- strip all AI metadata
- `uv run remove-ai-watermarks batch <directory>` -- process every supported image in a directory (output defaults to `<directory>_clean/`, set with `-o`). `--mode visible|invisible|metadata|all` (default `visible`); the invisible/all path reuses the full `invisible` knob set above (`--strength`/`--steps`/`--guidance-scale`/`--pipeline`/`--controlnet-scale`/`--model`/`--device`/`--max-resolution`/`--min-resolution`/`--upscaler`/`--seed`/`--hf-token`/`--humanize`/`--unsharp`/`--adaptive-polish`), plus `--inpaint`/`--no-inpaint` for the visible pass. `--adaptive-polish` is ON by default; `--auto` is deprecated and a no-op that only warns. One engine cached per pipeline; the polish is resolved once before the loop.
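A wrapping service can branch on the three `visible` exit codes described above. The following is a minimal sketch, not the project's own integration code: `VisibleResult`, `interpret_visible_exit`, and `run_visible` are hypothetical names, and only the exit-code values (0, 1, 2) and the command shape are taken from the list above.

```python
import subprocess
from dataclasses import dataclass

# Exit-code values as described above. Only EXIT_NO_VISIBLE_MARK is a name
# used by the project; the other two constant names are illustrative.
EXIT_OK = 0
EXIT_ERROR = 1
EXIT_NO_VISIBLE_MARK = 2


@dataclass(frozen=True)
class VisibleResult:
    status: str         # "done", "no_visible_mark", or "error"
    serve_output: bool  # True only when an output file was written
    message: str        # CLI text the caller can surface to the user


def interpret_visible_exit(returncode: int, cli_text: str = "") -> VisibleResult:
    """Map a `visible` exit code to an action for the calling service."""
    if returncode == EXIT_OK:
        return VisibleResult("done", True, cli_text)
    if returncode == EXIT_NO_VISIBLE_MARK:
        # No output file exists: show the guidance, do not re-serve the input.
        return VisibleResult("no_visible_mark", False, cli_text)
    # Anything else (including EXIT_ERROR) is treated as a hard failure.
    return VisibleResult("error", False, cli_text)


def run_visible(src: str, dst: str) -> VisibleResult:
    """Run the CLI; assumes `uv` is on PATH and the cwd is the repo root."""
    proc = subprocess.run(
        ["uv", "run", "remove-ai-watermarks", "visible", src, "-o", dst],
        capture_output=True,
        text=True,
        check=False,  # non-zero exits are data here, not exceptions
    )
    return interpret_visible_exit(proc.returncode, proc.stdout + proc.stderr)
```

Keeping the mapping in a pure function separate from the subprocess call lets the caller's branching be unit-tested without invoking the CLI.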
Test and lint
- CI (`.github/workflows/test.yml`): runs on push to `main` + every PR. A `lint` job (ubuntu: `ruff check` + `ruff format --check`) plus a `test` matrix (ubuntu/macos/windows x py3.10/3.12) that does `uv sync --frozen --extra dev` then `pytest`. The matrix installs only core + dev (no `gpu` extra), so the GPU/model-running tests skip there and it exercises the metadata/identify/visible/cv2-eraser surface on all three OSes. Keep `uv.lock` valid (don't break `--frozen`) when editing `pyproject.toml`.
- Release flow + distribution channels (PyPI publish via `publish.yml`/`uv publish`, the automated Homebrew-tap + HF-Space bumps in `distribute.yml`, conda-forge, ComfyUI Registry, the sdist `data/` exclusion, hatchling pin history): see `docs/release-and-distribution.md` before cutting a release.
- `bash maintain.sh` -- uv-outdated, uv-secure, ruff check/fix, ruff format, pyright (scoped `src/`, see the OOM note below), pytest -n auto. The helper tools live in the `dev` extra (pytest-xdist, plus `uv-outdated`/`uv-secure` marker-gated to py3.12+ so the py3.10 resolution stays solvable) -- a bare env without `--extra dev` does not have them.
- Strict pyright is clean across `src/` (0 errors). The cv2/torch/diffusers boundary files (`gemini_engine`, `region_eraser`, `doubao_engine`, `humanizer`, `invisible_engine`, `noai/watermark_remover`) carry a documented per-file `# pyright:` relax pragma that turns off only the unknown-type / untyped-third-party rules -- those libs ship no usable types, so strict typing there fights the ecosystem. Pure-logic files stay fully strict; `typings/piexif/__init__.pyi` is a local stub so `metadata.py`/`extractor.py` resolve piexif. Public ndarray-returning signatures on the relaxed engines are still annotated `NDArray[Any]` so strict consumers (`cli.py`) stay clean. When touching a relaxed file, prefer fixing real issues over widening the pragma; keep the pragma scoped to genuinely-untyped boundaries. `uv-secure` is clean since idna was bumped 3.11 -> 3.16, fixing GHSA-65pc-fj4g-8rjx, and aiohttp 3.13.5 -> 3.14.0 via `uv lock --upgrade-package aiohttp`, fixing GHSA-hg6j-4rv6-33pg + GHSA-jg22-mg44-37j8. (The old basicsr Dependabot alert (GHSA-86w8-vhw6-q9qq) is resolved by removal: the experimental `restore` extra was retired and basicsr is no longer anywhere in the dependency tree.) The torch Dependabot alert GHSA-rrmf-rvhw-rf47 (`torch.jit.script` memory corruption, vulnerable `<= 2.12.0`) is dismissed as `not_used` (2026-06-10): torch is a transitive dep of the optional `gpu` extra only, the codebase never calls `torch.jit` (grep-verified), and no patched torch version exists (`first_patched_version` is null), so it cannot be closed by an upgrade -- do not re-triage it.
- Full-project `uv run pyright` (no path) OOMs/crashes node on this ML-heavy repo (emits a `libnode` stack frame, no summary) -- a known environment limit, not a code error. Gate with `uv run --extra dev --extra gpu pyright src/` (completes, authoritative) or scope to changed files; also run `uv run ruff check` and `uv run pytest` directly.
- Run `uv run` from the repo root -- from another cwd it falls back to a bare env without numpy/cv2/torch.
- Stale `trustmark` remnant in site-packages after an extras change: the `trustmark` package downloads model weights INTO its own package dir, so when a narrower `uv sync` prunes the package, a `trustmark/models/` directory survives as an empty namespace package. Symptom: pyright `"TrustMark" is unknown import symbol` on `trustmark_detector.py` and `find_spec("trustmark")` returning a loader-less spec (so `is_available()` lies True). Fix: `rm -rf .venv/lib/python3.12/site-packages/trustmark` (regenerable weights cache).
- To add a dev tool (pytest/ruff/pyright) into the env, use `uv sync --frozen --extra dev --extra gpu`, never `uv pip install` -- `uv pip install` re-resolves and rewrites `uv.lock`, which silently bumped `transformers` to a build incompatible with the pinned `diffusers` (`cannot import name 'Qwen3VLForConditionalGeneration'`) and broke every `identify`/`metadata` import. Recovery: `git checkout uv.lock && uv sync --frozen --extra gpu --extra dev`. The `gpu` extra holds `diffusers`/`transformers`/`torch`, so a bare `uv sync` (no extras) removes them; `noai/__init__` is now lazy (PEP 562 `__getattr__`, so importing `identify`/`metadata` no longer pulls `watermark_remover`/torch), so a bare env breaks only when the removal pipeline is actually invoked, not on import. `maintain.sh`'s `uv sync --all-extras` also pulls the heavy `trustmark`/`lama` wheels (pytorch-lightning, onnxruntime) -- fine on a good connection, but on flaky DNS sync only `--extra gpu --extra dev` and run the lint/test steps by hand.
- Metadata/C2PA tests assert against real committed fixtures in `data/samples/` (`chatgpt-*.png` = OpenAI C2PA, `firefly-1.png` = Adobe, `mj-*` = Midjourney IPTC, `doubao-1.png` = ByteDance Doubao with the China TC260 `<TC260:AIGC>` XMP label and a visible "豆包AI生成" text mark bottom-right; `grok-1.jpg` = xAI Grok with its EXIF-only `Signature:` blob + UUID `Artist` and no C2PA/SynthID/IPTC); synthetic byte blobs cover the JPEG/ISOBMFF format paths. The "non-AI / clean photo" control is no longer in `data/samples/` -- the `clean_photo` conftest fixture serves a verified-negative image from the corpus `neg/` set (skips if the corpus is absent).
- SynthID reference corpus: `scripts/synthid_corpus.py` ingests labeled images into `data/synthid_corpus/`. The labeled `images/` (`pos/`/`neg/`/`cleaned/`) are committed (public repo -- review every image for private content before adding; `manifest.csv` is kept in sync with the files on disk, one row per tracked image); only the synthetic `refs/` calibration fills are gitignored. See its README for the collection protocol and verification oracles. `cleaned/` examples must be produced by a CURRENT shipped removal method -- the default SDXL img2img pass (optionally `--max-resolution`). Do NOT archive cleaned outputs from methods that are no longer in the pipeline (ctrlregen, the old text/face-protection, IP-Adapter FaceID, CodeFormer) or from the experimental opt-in paths (controlnet, face restore) as corpus examples; a cleaned reference should represent the canonical removal, and a removed method's output is not a reproducible example. Keep those experiment outputs in a local working dir, never in the committed corpus.
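The stale `trustmark` remnant described in the list above can be told apart from a real install by inspecting the import spec. This is a minimal sketch under one assumption that is not taken from the project: that a usable package must have an on-disk origin. `is_real_package` is a hypothetical helper name, not the project's `is_available()`.

```python
import importlib.util


def is_real_package(name: str) -> bool:
    """Return True only if `name` resolves to a module with a file origin.

    A directory left behind without an `__init__.py` is still importable as a
    namespace package: `find_spec` returns a spec, but it has no file origin,
    so a bare `find_spec(name) is not None` probe reports a false positive.
    Caveat: built-in modules (e.g. `sys`) also have no location and return False.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return False
    return spec is not None and spec.has_location and spec.origin is not None
```

A probe like this would report the leftover `trustmark/models/` directory as absent, while the documented fix (deleting the remnant directory) remains the actual remedy.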
Configuration
- GPU/ML modules (invisible_engine, watermark_remover) are optional -- guard imports with `is_available()` checks
- Optional detection extras: `detect` (imwatermark, the open SD/SDXL/FLUX watermark) and `trustmark` (Adobe TrustMark decoder; pulls torch + downloads weights). Both are guarded by `is_available()` and skipped by `identify` when absent.
- Optional `esrgan` extra (spandrel only): Real-ESRGAN pre-diffusion super-resolution for small inputs (`upscaler.py`, CLI `--upscaler esrgan` on `invisible`/`all`/`batch`). Guarded by `upscaler.is_available()`; the default upscaler stays Lanczos (cv2, no deps) and the engine falls back to Lanczos when the extra is absent or the model errors. spandrel is MIT and pulls NO basicsr (only torch/torchvision/safetensors/numpy/einops); Real-ESRGAN weights are BSD-3-Clause and download on first use via `torch.hub` (never bundled). Kept OUT of `all` (heavy + model download).
- Tests for the model-running paths are limited to availability checks (multi-GB downloads). But the pure helpers inside ML-adjacent modules are unit-tested without any download and must stay that way: `_target_size` (native-vs-downscale-cap-vs-upscale-floor, `test_invisible_engine.py`), `humanizer.unsharp_mask`/`adaptive_polish` (`test_humanizer.py`), and the MPS->CPU fallback control flow via mocked pipelines (`test_img2img_runner.py`, 100% cover). Don't skip these as "ML, needs a model" -- only `remove_watermark`/the diffusion bodies do.
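The guard-then-fall-back pattern described in the list above can be sketched as follows. This is an illustration, not the project's code: the module-level `is_available` here is a generic probe, `choose_upscaler` is a hypothetical helper, and probing for `spandrel` by module name is an assumption about how the `esrgan` extra would be detected.

```python
import importlib.util


def is_available(module_name: str) -> bool:
    """Cheap presence probe that does not import the (possibly heavy) module."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


def choose_upscaler(requested: str, esrgan_available: bool) -> str:
    """Honor `esrgan` only when its extra is present; otherwise use Lanczos."""
    if requested == "esrgan" and esrgan_available:
        return "esrgan"
    return "lanczos"


if __name__ == "__main__":
    # Resolves to "lanczos" on an environment without the optional extra.
    print(choose_upscaler("esrgan", is_available("spandrel")))
```

Passing the availability result in as an argument keeps the selection logic pure, so it can be unit-tested without installing or mocking the optional dependency.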
Key modules
Compact map. The full per-module detail (design decisions, tuned thresholds, calibration history, incident records, and the regression-guard map) lives in docs/module-internals.md — read the relevant section there before changing any module below.
- `noai/c2pa.py` -- PNG caBX chunk parser (`extract_c2pa_chunk`/`has_c2pa_metadata`/`extract_c2pa_info`). Do not reimplement chunk parsing; chunk reads are clamped to the remaining file size by design.
- `noai/constants.py` -- the single `C2PA_AI_VENDORS` registry (+ `C2PA_SOFT_BINDINGS`) from which `C2PA_ISSUERS`/`SYNTHID_C2PA_ISSUERS`/`identify._ISSUER_PLATFORM` are all derived. Add a new vendor as one registry entry; never edit the derived dicts and never add inline.
- `metadata.py` -- `scan_head(path)` is the shared (memoized) input for every C2PA/AIGC/IPTC byte scan; use it instead of `open().read(1MB)` for any new marker scan. Also home to `synthid_source`, `xai_signature`, `iptc_ai_system`, `aigc_label`, `huggingface_job`, `samsung_genai`, and `remove_ai_metadata` (fail-safe `strip_c2pa_boxes`).
- `identify.py` -- aggregates every locally-readable signal into one `ProvenanceReport`; `is_ai_generated` is True or None, never asserted False. `import identify` is deliberately light (lazy `noai/__init__`, fits a 512 MB host) -- keep heavy imports out. Add capture-camera tokens to `_DEVICE_C2PA_PLATFORM` only when verified against a real C2PA file; editing-app/AI-device signer tokens go to `_SIGNER_C2PA_PLATFORM`; generator/issuer platforms to `C2PA_AI_VENDORS` in `constants.py`. Integrity-clash detection is high-precision by design (only hard generator stamps feed it, source-grouped independence).
- `watermark_registry.py` -- the single catalog of known visible watermarks (gemini / doubao / jimeng / samsung), reverse-alpha based by policy. Add a new visible text mark = one `_text_mark(...)` row + a `TextMarkConfig` with a captured alpha map; do not re-add per-mark `if` branches. `cli._write_bgr_with_alpha` must NOT zero alpha in the watermark bbox (issue #30 white-box regression).
- `gemini_engine.py` -- visible Gemini-sparkle remover/detector (cv2/numpy, no GPU): top-K size-weighted fusion candidate selection (`_SELECT_TOPK`), corner-promote, over/under-subtraction guards, false-positive gate, self-verify repair. Detection scores the top-K size-weighted matches by full fusion (spatial+gradient+variance) and keeps the highest -- NOT the raw-NCC argmax, which re-admits the tiny-patch FPs the size weight suppresses (the osachub 2026-06-12 sub-0.85 corner-sparkle regression; see `docs/module-internals.md`). Keep the 0.85 corner-promote NCC gate; a margin/chroma-gated lower promote was measured and REJECTED 2026-06-11 (~33% FP on non-Google content). Gate any removal candidate on a physical brightness check, not the detector alone.
- `_text_mark_engine.py` -- shared base for the three reverse-alpha text-mark engines (extracted 2026-06-09); the per-engine modules are config-only subclasses. New text mark = a `TextMarkConfig` + a thin subclass + one registry row. Gemini stays a separate engine (different model).
- `doubao_engine.py` / `jimeng_engine.py` / `samsung_engine.py` -- thin `TextMarkEngine` subclasses: Doubao "豆包AI生成" (bottom-right), Jimeng "★ 即梦AI" (bottom-right), Samsung Galaxy AI "✦ Contenuti generati dall'AI" (bottom-LEFT, locale-specific, Italian variant calibrated). Removal = reverse-alpha (always-align) + thin residual inpaint. A detector-only removal test is insufficient -- assert visual residual (the textured-shift tests).
- `region_eraser.py` -- universal region eraser (`erase` CLI): cv2 backend default (no deps), optional big-LaMa via onnxruntime (~3.5-4 GB peak RAM, ~5-6 s/call CPU, which does not fit a minimal droplet).
- `invisible_watermark.py` -- decodes the OPEN DWT-DCT watermarks (SD / SDXL / FLUX) via `imwatermark` (extra `detect`, pulls torch). Fragile: does not survive JPEG re-encode/resize, so it confirms origin only on pristine files.
- `trustmark_detector.py` -- Adobe TrustMark open decoder (extra `trustmark`). Do NOT remove the JPEG re-encode false-positive gate -- a lone TrustMark hit without it is almost always content noise.
- `noai/watermark_remover.py` -- `WatermarkRemover` with two diffusion pipelines selected by the explicit `pipeline` ctor arg, never inferred from `model_id`: `sdxl` (plain SDXL img2img) and `controlnet` (SDXL + canny ControlNet, the DEFAULT since 2026-06-09). Removal comes from the img2img `strength`; ControlNet only preserves text/face STRUCTURE -- SynthID CAN survive controlnet on photoreal content at low strength. No face-restore extra ships, by validated decision (every restore approach looked MORE AI-generated). `auto_config.py` + the content-detection layer were REMOVED 2026-06-09; `--auto` is a deprecated no-op (controlnet is the default pipeline and the adaptive polish is ON by default and self-gates to a no-op where there is no detail deficit).
- `upscaler.py` -- optional Real-ESRGAN pre-diffusion super-resolution for small inputs (extra `esrgan`, spandrel only). Manual opt-in; the default `--upscaler` stays `lanczos` and the engine always falls back to Lanczos on absence/error. ESRGAN can degrade faces and thin text.
- `image_io.py` -- Unicode-safe cv2 IO (issue #17). Every cv2 file read/write in the package routes through `imread`/`imwrite`; do not call `cv2.imread`/`cv2.imwrite` directly. `to_bgr(image)` is the shared channel normalizer -- use it instead of inlining `cvtColor` branches.
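The reverse-alpha idea that the visible-mark engines above are built on can be illustrated on a single channel value. This is a sketch under a stated assumption: that the mark was applied with the standard alpha-compositing model. The function names and the `max_alpha` clamp are hypothetical, and the engines' alignment, calibration, and residual inpaint are not shown.

```python
def composite(original: float, logo: float, alpha: float) -> float:
    """Forward model assumed here: standard alpha blend of a logo over a pixel."""
    return alpha * logo + (1.0 - alpha) * original


def reverse_alpha(
    watermarked: float,
    logo: float,
    alpha: float,
    max_alpha: float = 0.99,
) -> float:
    """Invert the blend: original = (watermarked - alpha * logo) / (1 - alpha)."""
    a = min(alpha, max_alpha)  # avoid dividing by ~0 where the mark is opaque
    recovered = (watermarked - a * logo) / (1.0 - a)
    return min(255.0, max(0.0, recovered))  # clamp to the 8-bit value range


if __name__ == "__main__":
    marked = composite(original=37.5, logo=255.0, alpha=0.4)
    print(reverse_alpha(marked, logo=255.0, alpha=0.4))
```

The inversion is exact only when the alpha map and logo color match what was actually composited, which is consistent with the note above that marks which re-rasterize per image leave a faint outline and need a residual inpaint.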
For the Doubao alpha-distillation history (why content-image reverse-alpha distillation fails by physics and controlled captures were required), see docs/research-doubao-distillation.md.
Watermarking landscape
Who embeds what (C2PA / IPTC / EXIF / TC260 AIGC / xAI signature / open and proprietary invisible watermarks), whether each is locally detectable, the C2PA 2.4 durable-credentials implications, and the regulatory driver table live in docs/watermarking-landscape.md (research 2026-05-24, updated through 2026-06-10). Read it before adding a new identify signal, vendor token, or metadata marker. See identify.py for what we read today.
Known limitations
Compact list. Full measurements, incident history, and oracle-validation runs live in docs/known-limitations.md — read the relevant section there before changing the diffusion pipelines, strength defaults, resolution handling, or metadata coverage.
- `invisible` processes at native resolution for inputs >= 1024px long side and auto-upscales smaller inputs to a 1024px floor (`--min-resolution 0` disables; `--max-resolution N` is an opt-in cap to bound GPU/MPS memory). MPS OOM is memory-tier dependent, not a hard limit: ~24 GB unified memory falls back to CPU (slow but weight-identical output), 32 GB runs native on MPS. The native-vs-cap-vs-floor decision lives in the pure helper `invisible_engine._target_size` -- keep the logic there, unit-tested without the model.
- fp16 VAE black-output (issues #29/#41): the fp16-fixed SDXL VAE (`madebyollin/sdxl-vae-fp16-fix`) is swapped in for the default SDXL checkpoint on cuda/xpu fp16, plus a model-agnostic backstop that detects a degenerate (all-black) fp16 output and re-runs once in fp32. cpu/mps run fp32 and never reproduce the bug.
- Pyright first run is slow (2-3 min) due to ML deps (torch/diffusers/transformers stubs); full-project `uv run pyright` can stall for many minutes -- scope it to changed files.
- A third-party PIL plugin autoload (e.g. an HEIF/AVIF plugin) can raise a non-OSError (`ModuleNotFoundError`), not `UnidentifiedImageError`, when opening a file. Code that opens user-supplied or unknown-format files should `except Exception`, not just `OSError`/`UnidentifiedImageError`.
- rich was dropped: the CLI + analysis scripts print plain text (`click.echo` / the `scripts/_plain_console.py` shim). `rich` is NOT a dependency -- importing it breaks the core+dev CI sync; new scripts must use the shim. No Unicode glyphs / colors / progress bars in CLI output by design.
- AVIF/HEIF/JPEG-XL metadata detection is a binary scan; C2PA removal in those containers (and MP4/MOV/M4V) is handled by `noai/isobmff.py`; non-ISOBMFF audio/video (WebM/MP3/WAV/FLAC/OGG) strips losslessly via ffmpeg on PATH. Still NOT built: an `Exif` meta-box item (needs `iinf`/`iloc` surgery) and Resemble PerTh audio detection (no presence/confidence flag exists).
- SynthID technical reference: `docs/synthid.md` -- primary-source-cited doc covering mechanism (post-hoc encoder/decoder pair, 136-bit payload at 512x512, pixel-space, model weights NOT modified), robustness numbers (arXiv:2510.09263: ~99.98% TPR@0.1%FPR across 30 transforms including JPEG/crop/resize/color/noise), removal attacks and forensic detectability (arXiv:2605.09203: all 6 attacks detectable at >98% TPR@1%FPR), detectability limits (no public decoder, metadata-proxy only), oracle scope, and adoption landscape. Read that doc first before adding notes here.
- SynthID detection is metadata-only. No local pixel detector is possible by design (Google's decoder is proprietary, trusted-testers only); we read the C2PA companion proxy, which goes quiet once metadata is stripped -- a quiet proxy is not proof the pixel watermark is gone. The Gemini app "Verify with SynthID" is the ONLY valid SynthID oracle; `openai.com/verify` is scoped to OpenAI provenance and each vendor's oracle detects only its own content. SynthID survives JPEG re-encode, so GitHub issue attachments remain valid pixel-watermark test subjects. Every spectral/phase detection approach evaluated (reverse-SynthID, our own probes) works only on controlled solid fills, never on real content.
- External AI-vs-real classifier models are out of scope (decided 2026-05-24): per-generator, degrade off-distribution, and our own light SDXL pass would likely defeat them. Detection stays local + signal-based.
- Default strength is VENDOR-ADAPTIVE, one ladder for BOTH pipelines (since 2026-06-09): `resolve_strength(strength, vendor)` picks OpenAI 0.20 / Gemini 0.30 / unknown 0.30 when `--strength` is unset; explicit `--strength` always wins. Removal at low strength is content x pipeline dependent, and near-threshold removal is SEED-NON-DETERMINISTIC -- pick a strength with margin and oracle-revalidate per content type. Certified controlnet floors (Modal cert 2026-06-04): OpenAI 0.20 (resolution-independent), Gemini 0.30 (only <= 1536px; native large Gemini needs ~0.35+ or a cap).
- `controlnet` is the default pipeline; `--pipeline sdxl` is the lighter opt-down. Neither pipeline clears all content at low strength (photoreal survives controlnet, flat graphics survive sdxl; the lever is higher strength). A removal-priority caller MUST oracle-validate strength across content types; prod recipe: controlnet + per-vendor floor + FIXED seed. Forensic-stealth caveat (arXiv:2605.09203): defeating the SynthID verifier is NOT forensic invisibility -- removal-processed images are flaggable at >98% TPR@1%FPR.
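Two of the pure decisions described in the list above, the vendor-adaptive strength ladder and the native-vs-cap-vs-floor size choice, can be sketched as below. These are illustrations, not the project's `resolve_strength` or `_target_size`: the function names, the vendor keys, the argument defaults, and the plain integer rounding are assumptions, and only the numeric values (0.20 / 0.30 / 0.30 and the 1024px floor) come from the text.

```python
from typing import Optional, Tuple

# Ladder values quoted from the list above; the dict and its keys are illustrative.
_DEFAULT_STRENGTH = {"openai": 0.20, "gemini": 0.30}
_UNKNOWN_STRENGTH = 0.30


def resolve_strength_sketch(strength: Optional[float], vendor: Optional[str]) -> float:
    """An explicit strength always wins; otherwise use the vendor ladder."""
    if strength is not None:
        return strength
    return _DEFAULT_STRENGTH.get((vendor or "").lower(), _UNKNOWN_STRENGTH)


def target_size_sketch(
    width: int,
    height: int,
    min_resolution: int = 1024,
    max_resolution: Optional[int] = None,
) -> Tuple[int, int]:
    """Pick the processing size from the long side.

    - above an opt-in cap: downscale to the cap
    - below the floor: upscale to the floor (min_resolution=0 disables)
    - otherwise: keep the native size
    """
    long_side = max(width, height)
    scale = 1.0
    if max_resolution and long_side > max_resolution:
        scale = max_resolution / long_side
    elif min_resolution and long_side < min_resolution:
        scale = min_resolution / long_side
    # Plain rounding is an assumption; a real pipeline may snap to a multiple.
    return round(width * scale), round(height * scale)
```

For example, `target_size_sketch(512, 256)` returns `(1024, 512)`, and `resolve_strength_sketch(None, "openai")` returns `0.20`. Because both helpers are pure, they can be tested without loading a model, which is the property the list above asks to preserve.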