mirror of
https://github.com/wiltodelta/remove-ai-watermarks.git
synced 2026-07-05 07:57:50 +02:00
09fdb4544ad703fb503e47868662486526f9fc40
95 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
09fdb4544a | fix(invisible): preserve native output dimensions | ||
|
|
61aa76a591 |
perf(identify): decode the image once for all visible-mark detectors
identify(check_visible=True) ran the Gemini-sparkle detector and the Doubao/Jimeng text-mark detector each with its own image_io.imread, so the same bitmap was fully decoded twice. On a memory-constrained host (the raiw.cc 512 MB web worker, which runs identify on every upload) that doubled the peak decode allocation and contributed to OOM restarts. Decode once in identify() and pass the BGR array to both detectors. The detect methods already accept an NDArray, so this only threads the pre-decoded array through: detect_sparkle_confidence and the two _visible_* helpers gain an optional image= param that, when None, preserves the old self-read behavior (so direct callers and the cv2-missing/unreadable paths are unchanged). Only the visible path is deduplicated; the optional check_invisible decoders are unaffected (and off on the web hot path). Adds a test asserting identify(check_visible=True, check_invisible=False) decodes exactly once. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
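A minimal sketch of the decode-once pattern described in 61aa76a591. It assumes only what the message states: each detector takes an optional `image=` argument and falls back to reading the file itself when it is `None`. The reader, the detector bodies, and the function names below are hypothetical stand-ins, not the project's code.

```python
from typing import Callable, Dict, List, Optional, Tuple

Image = List[List[int]]  # stand-in for a decoded BGR array


def make_counting_reader(image: Image) -> Tuple[Callable[[str], Image], Dict[str, int]]:
    """Return a fake imread plus a counter of how often it was called."""
    calls = {"n": 0}

    def read(path: str) -> Image:
        calls["n"] += 1
        return image

    return read, calls


def detect_sparkle(path: str, read: Callable[[str], Image], image: Optional[Image] = None) -> bool:
    # image=None preserves the old self-read behavior for direct callers.
    if image is None:
        image = read(path)
    return len(image) > 0  # placeholder verdict, not a real detector


def detect_text_mark(path: str, read: Callable[[str], Image], image: Optional[Image] = None) -> bool:
    if image is None:
        image = read(path)
    return len(image) > 0  # placeholder verdict, not a real detector


def identify_visible(path: str, read: Callable[[str], Image]) -> Dict[str, bool]:
    # Decode once, then thread the same array through both detectors.
    image = read(path)
    return {
        "sparkle": detect_sparkle(path, read, image=image),
        "text_mark": detect_text_mark(path, read, image=image),
    }
```

Counting reader calls is also how a test can assert that the visible path decodes exactly once.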
|
|
4c6b56f888 |
lower(strength): drop vendor-adaptive floor to OpenAI 0.10 / Google 0.15
A 2026-06-14 oracle re-test on the deployed Modal controlnet worker (v0.10.0) cleared SynthID at OpenAI 0.10 (2 photoreal) and Google 0.15 (2 native 2816x1536, retiring the "native >= 0.30" guess), while a pixel sweep showed the 2026-06-04 cert floors (0.20/0.30) over-regenerated for no efficacy gain (Google MAE -20% at 0.15). Lowers OPENAI_STRENGTH 0.20->0.10, GEMINI_STRENGTH and UNKNOWN_STRENGTH 0.30->0.15. Caveats documented in watermark_profiles.py + docs: removal near this floor is seed-non-deterministic (a service must pin a verified seed), and the n=2 re-test did not cover flat-graphic hard cases. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
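The lowered floors in 4c6b56f888 can be pictured as a small lookup. The three constant names and values come from the commit message; the function name, its signature, and the vendor strings are assumptions made for illustration and may not match `watermark_profiles.py`.

```python
from typing import Optional

# Values quoted in the commit message (lowered from 0.20 / 0.30 / 0.30).
OPENAI_STRENGTH = 0.10
GEMINI_STRENGTH = 0.15
UNKNOWN_STRENGTH = 0.15


def resolve_strength_sketch(vendor: Optional[str], override: Optional[float] = None) -> float:
    """Pick a diffusion strength: an explicit override wins, else the vendor floor.

    The vendor keys ("openai", "google", "gemini") are hypothetical.
    """
    if override is not None:
        return override
    key = (vendor or "").lower()
    if key == "openai":
        return OPENAI_STRENGTH
    if key in ("google", "gemini"):
        return GEMINI_STRENGTH
    return UNKNOWN_STRENGTH
```

Because removal near the floor is described as seed-non-deterministic, a service using such a lookup would also need to pin a verified seed; that part is not shown here.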
|
|
41a2af2ecb |
fix(cli): preserve SynthID uncertainty in no-visible-mark message
The 'no signal' branch of the visible no-mark path claimed 'No AI provenance signal found either', which reads as 'the image is clean'. A missing metadata proxy is not proof an invisible pixel watermark (SynthID) is absent: it cannot be detected once metadata is gone and may have been stripped upstream. The message now preserves that uncertainty and routes to both 'all' (regenerate pixels) and 'erase'. Regression-guarded by the SynthID/all asserts in test_cli.py. CLAUDE.md visible-command note updated to match. Also adds a 'Scope and non-goals' section (CLAUDE.md + README): removing AI-provenance marks on the user's own content is in scope; stripping stock/paid-content watermarks (Shutterstock/Getty/iStock, classifieds) is out of scope by principle, not by difficulty. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
30b56f0ea3 |
fix(cli): stop silent passthrough when visible finds no known mark
When `visible --mark auto` (or an explicit `--mark` with detection on) found no registered mark, it exited 0 without writing output -- which a wrapping service reads as success and re-serves the unchanged input. ~74% of real uploads carry no registered visible mark, so this was the dominant "it didn't work" / NPS score-0 failure mode. Now it runs a cheap metadata-only identify, prints actionable guidance (route to `all` for an invisible/metadata mark, or `erase` for an arbitrary logo), writes no output file, and exits EXIT_NO_VISIBLE_MARK (2) -- distinct from success (0) and a hard error (1) so the caller can surface the message. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
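For a wrapping service, the three exit codes in 30b56f0ea3 suggest a three-way branch. The sketch below is an assumption about how a caller might react; only the numeric codes (0, 1, 2) and the fact that no output file is written on code 2 come from the commit message. The enum and the action strings are hypothetical.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    OK = 0
    ERROR = 1
    NO_VISIBLE_MARK = 2  # EXIT_NO_VISIBLE_MARK in the commit message


def plan_response(returncode: int, output_exists: bool) -> str:
    """Decide what a wrapping service should do after the CLI returns."""
    if returncode == ExitCode.OK and output_exists:
        return "serve_output"
    if returncode == ExitCode.NO_VISIBLE_MARK:
        # No file was written: surface the CLI guidance instead of
        # re-serving the unchanged input.
        return "show_guidance"
    return "report_error"
```

Treating "exit 0 but no output file" as an error guards against the silent-passthrough failure the commit fixes.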
|
|
28569bd05d |
fix(gemini): recover sub-0.85 corner sparkles via top-K fusion selection
The 256->512 detection-search widening (v0.8) let a large, low-gradient shape match outrank a genuine mid-size corner sparkle whose raw NCC sits below the 0.85 corner-promote gate, so `identify` read `unknown` on Gemini images that v0.7.2 caught (reporter osachub: scale-48 sparkle on light bedding -- true sparkle spatial 0.775 / grad 0.960 / fusion 0.676, but the size-weighted argmax locked onto a decoy at spatial 0.628 / grad 0.036). detect_watermark now keeps the top-K (_SELECT_TOPK=3) size-weighted candidates (NMS-deduped) plus the corner-promote candidate, scores each by full fusion (spatial+gradient+variance) via the extracted _grad_var_scores helper, and selects the highest -- the gradient term lifts the true sparkle over the decoy. Ranking by the SIZE-WEIGHTED score (not a raw-NCC argmax) preserves tiny-patch suppression: a raw-NCC argmax re-admitted 16-18px content false positives (14/65 doubao + 4/11 jimeng visible images). Top-K adds zero flips on the doubao/jimeng corpora and leaves the 495-image Gemini set unchanged (479 detected) while recovering the reporter's image at 0.676. - _grad_var_scores: gradient/variance scoring factored out of detect_watermark - confidence = best_fused (drop the duplicated fusion recompute) - tests: rename test_promotion_is_what_rescues_it -> test_size_weighted_search_alone_traps_on_the_decoy (corner-promote is no longer the sole rescue path); add a deterministic regression test mirroring the real spatial/grad signature - docs: module-internals.md detector section + CLAUDE.md mechanism map Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
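The selection change in 28569bd05d can be sketched as "shortlist by size-weighted score, then pick by fused score". The spatial and gradient values below are the ones quoted for the reporter's image; the fusion weights, the variance values, the size-weighted scores, and every name except `_SELECT_TOPK` are hypothetical, so the fused numbers will not reproduce the reported 0.676. NMS de-duplication is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional

_SELECT_TOPK = 3  # value quoted in the commit message


@dataclass(frozen=True)
class Candidate:
    name: str
    size_weighted: float  # ranking score that suppresses tiny patches
    spatial: float
    grad: float
    var: float


def fused_score(c: Candidate) -> float:
    # Hypothetical weights; the real fusion formula is not given in the log.
    return 0.5 * c.spatial + 0.3 * c.grad + 0.2 * c.var


def select_candidate(
    candidates: List[Candidate],
    corner_promote: Optional[Candidate] = None,
    k: int = _SELECT_TOPK,
) -> Optional[Candidate]:
    # Shortlist by the size-weighted score (not raw NCC), so tiny
    # high-NCC patches never reach the fusion stage.
    shortlist = sorted(candidates, key=lambda c: c.size_weighted, reverse=True)[:k]
    if corner_promote is not None and corner_promote not in shortlist:
        shortlist.append(corner_promote)
    if not shortlist:
        return None
    # Full fusion decides among the shortlist; the gradient term is what
    # separates a real sparkle from a low-gradient decoy.
    return max(shortlist, key=fused_score)
```

With a decoy at spatial 0.628 / grad 0.036 and a sparkle at spatial 0.775 / grad 0.960, a size-weighted argmax alone would pick the decoy, while the fused pick over the shortlist favors the sparkle.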
|
|
3055aa6c4a |
test: patch is_available in full-pipeline all tests (fix no-gpu CI)
test_all_basic / test_all_visible_step_uses_registry asserted exit 0 but did not patch is_available, so on CI (core+dev only, no gpu) they took the skip branch and hit the new non-zero exit. Passed locally where gpu is present. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> |
||
|
|
a8e218acf6 |
Make all fail loudly when the gpu extra is missing
Step 2 (invisible/SynthID) was skipped with a quiet inline warning and the run still exited 0, so a missing [gpu] extra was mistaken for a clean result (recurring #14/#47). Add a prominent end-of-run banner and a non-zero exit. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> |
||
|
|
ad7e4ee08b |
feat(identify): close 3 detector gaps found on the spaces corpus (06-05..06-11)
- AIGC: parse the bare ``AIGC{...}`` blob form (label glued to its JSON in a
JPEG APP segment near the JFIF header), and scan both raw-JSON forms in one
fall-through loop so a quoted ``"AIGC"`` later in an XMP packet no longer
shadows a real bare label earlier in the file (3 files read unknown before).
- Integrity clash rule 2: a camera device + an AI marker from the SAME C2PA
manifest (Google Pixel Magic Editor / Pixel Studio edit chain) is a legitimate
edit chain, not a contradiction. Fire only when the AI marker's source is
independent of the camera's manifest; pure cameras (Leica/Sony/Nikon) are
unaffected (2 Pixel files mis-flagged before).
- New c2pa_cloud_manifest detector: surface a C2PA 2.4 Durable Content
Credentials cloud-manifest reference (Adobe cai-manifests.adobe.com) as a
medium provenance signal when the embedded manifest is stripped. Provenance
only, never asserts is_ai (2 files read fully unknown before).
identify reuses its already-loaded scan head for the cloud check (no second
read). +7 tests; CLAUDE.md + README synced.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
||
|
|
295e7ada2b |
chore: project review (dev tools in extras, dep upgrades, optional-deps guard, stale cleanup)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> |
||
|
|
2fcd00ced0 |
fix: address whole-project code review (visible all/batch, engine consolidation, I/O)
Nine findings from a high-effort project-wide review, fixed and verified (571 passed, ruff/pyright clean):
Correctness:
- all/batch now remove Doubao/Jimeng/Samsung visible text marks: the visible step routes through the registry (new cli._remove_visible_auto) instead of a hardcoded GeminiEngine, so they no longer leave the wordmark intact.
- batch always reads the original source (dropped the out_path-reuse that re-processed already-cleaned outputs on a re-run).
- img2img_runner only retries the diffusion call on the deprecated-callback TypeError; any other TypeError now propagates instead of double-running.
- gemini detect/remove and the reverse-alpha engines normalize channels via a new image_io.to_bgr, fixing a grayscale/BGRA crash in the FP-gate path.
- _png_late_metadata advances its cursor by the clamped length, so a malformed chunk length no longer aborts the late AI-label scan.
Cleanup / efficiency:
- Consolidate the ~90%-identical Doubao/Jimeng/Samsung engines into a shared config-driven _text_mark_engine.TextMarkEngine base; each engine is now a thin subclass (TextMarkConfig + test shims). Behavior is byte-exact (the three engine test suites pass unchanged). Registry adapters collapse to one _text_mark(...) row each. Gemini stays a separate engine.
- scan_head is memoized per (path, size, mtime), so identify() reads the file head once instead of ~8 times.
- invisible_engine post-processing decodes/encodes the output once (chained in memory) instead of 2-4 times across stages.
- Remove the orphaned get_model_id_for_profile (+ CONTROLNET_PROFILE); derive the --strength help from the strength constants (strength_default_help) so it cannot drift; share the --pipeline/--strength click options; simplify the retired --auto resolver.
Net -835 lines. Tests added for the registry-routed visible pass, to_bgr, the polish/model/guidance wiring, and strength_default_help.
CLAUDE.md updated for the new base module, the engine/registry changes, image_io.to_bgr, and the scan_head cache.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
b1189549b8 |
feat(invisible): controlnet default, unified strength, retire --auto, add --model/--guidance-scale
Overhaul the diffusion-removal surface around a single robust default and a complete, consistent CLI. Pipeline + strength: - controlnet is now the DEFAULT pipeline (CLI --pipeline + both engine ctors). With the certified higher strength it clears both photoreal and flat-graphic content, whereas plain SDXL left SynthID on flat graphics. - Rename the plain-SDXL profile default -> sdxl; "default" stays as a back-compat alias (normalize_profile + a click callback that warns). - Unify the strength ladder: resolve_strength applies ONE vendor-adaptive ladder (the certified controlnet floors OpenAI 0.20 / Google 0.30 / unknown 0.30) to both pipelines. sdxl is the weaker remover on its own hard case (flat fills), so the certified floor is the right floor for it too. CLI completeness: - Add --model (HF model id) to invisible + batch (was only on all) and --guidance-scale (CFG) to all three diffusion commands; both were library knobs the CLI did not expose. - Flip --adaptive-polish to ON by default (it self-gates to a no-op where there is no detail deficit, so default-on is safe). - Share --pipeline / --strength / --model / --guidance-scale as single decorators so invisible/all/batch keep an identical surface; the --strength help is derived from the strength constants (strength_default_help) so it can never drift from the ladder. Removals: - Delete the auto_config content-detection planner + its YuNet/DBNet assets (~2.6 MB): with controlnet always the pipeline and the polish self-gating, the face/text/edge detection no longer changed behavior. --auto is now a deprecated no-op that only warns (the polish it enabled is the default). Docs (README, CLAUDE.md, docs/synthid.md) updated throughout; added an InvisibleEngine Python API example. Tests cover the alias warnings, the polish default, and the --model/--guidance-scale wiring. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
20d7eda96a |
remove: drop all face-restore code (regeneration, not preservation)
Empirical conclusion from the 2026-06-04 - 2026-06-08 Modal cert sweeps: every face-restore approach we built (GFPGAN-on-cleaned, PhotoMaker-V2, InstantID txt2img, InstantID img2img-on-cleaned at three parameter settings) regenerates the face via SDXL diffusion rather than preserves it. Output face pixels are diffusion-fresh, so the regenerated face inherits SDXL "clean skin" aesthetic and loses original identity precision -- it looks MORE AI-generated than the cleaned image, not less. The cleaned image from the main controlnet 0.20 removal pass is the least-AI face state we can reach without re-introducing SynthID. Nothing in the restore family achieves the actual goal (preserve the original person's face). Keeping them around as opt-in invites users to ship something that defeats the point. Removing entirely. Library changes: - Deleted src/remove_ai_watermarks/instantid_restore.py - Deleted src/remove_ai_watermarks/photomaker_restore.py - Deleted tests/test_instantid_restore.py - Deleted tests/test_photomaker_restore.py - Removed `instantid` and `photomaker` extras from pyproject.toml - Removed `[tool.hatch.metadata] allow-direct-references = true` (was only needed for the photomaker git+ URL) - InvisibleEngine.remove_watermark: dropped `restore_faces` + `restore_faces_method` params, removed both `_restore_faces_instantid` and `_restore_faces_photomaker` private methods, removed dispatch - CLI: dropped `_restore_faces_options` decorator, all four cmd_* signatures lose `restore_faces` + `restore_faces_method`, kwarg passes to remove_watermark dropped - _apply_auto: dropped `restore_faces` from tuple shape (was unused after the engine no longer takes it) - auto_config.AutoConfig: dropped `restore_faces` field; `plan()` no longer sets it; `reason` no longer mentions it - Tests updated accordingly (test_auto_config.TestReason no longer asserts "face-restore on" in the reason string) Docs updated: - CLAUDE.md: removed the photomaker extras bullet, the Face restore 
trade-off bullet, the instantid_restore.py + photomaker_restore.py module bullets; replaced restore mentions in watermark_remover and controlnet bullets and prod recipe with the empirical conclusion - README.md: removed both `--restore-faces` callouts and the install snippet; the feature bullet and auto-mode comment updated - docs/synthid-robust-identity-research.md: added Status-retired notice at the top pointing at the 2026-06-08 followup raiw-app: - modal_cert.py: dropped `--restore-faces` flag entirely; sweep() no longer takes restore_faces; pinned _LIB_SPEC to `[gpu]` extras (no `photomaker` / `instantid` extras), points at main ruff + strict pyright clean; 569 tests pass; 18 restore-specific tests gone. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
7c0c16fd66 |
test(instantid): update composite assertion to survive color-match
Last commit added `_color_match` which shifts the face crop's mean to the canvas mean -- the old test fed a uniform face (210) into a uniform cleaned canvas (90), so after color-match the face was uniform 90 and the composite was undetectable by value. Switched the fake pipeline to a gradient face so the color-match preserves variance, and the assertion now checks that the face region has non-zero std (composite injected gradient pixels) instead of a value threshold. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
70e8b3a517 |
feat(face-restore): add InstantID as the default non-commercial restore path
Per the 2026-06-08 deep-research synthesis (docs/synthid-robust-identity-research-2026-06-08.md), the entire ArcFace-class identity-adapter ecosystem for SDXL is blocked from commercial use by InsightFace's non-commercial model packs (antelopev2 / buffalo_l). No commercial-safe ArcFace-grade identity stack exists today. The user explicitly opted into shipping a non-commercial restore path (research / personal use; raiw.cc must NOT install the extra).
Architectural choice: InstantID over PhotoMaker-V2 as the default.
- PhotoMaker-V2 (CLIP+ArcFace dual encoder, txt2img only): documented upstream identity drift on Asian male faces, visually confirmed in our cert sweep (tatsunari rendered as a generic woman; group photo collapsed into a patchwork).
- InstantID (ArcFace cross-attention + landmark ControlNet): semantic identity branch + spatial weak landmark control, decoupled. Per InstantID paper (arXiv:2401.07519) and the research report, stronger identity fidelity on single portraits. Critically: NO original face pixels enter the diffusion (ArcFace embedding is semantic, landmark stick figure is pure geometry), so SynthID is not transported.
Implementation:
- New `src/remove_ai_watermarks/instantid_restore.py` mirrors the `photomaker_restore.py` shape (lazy singletons for pipeline + FaceAnalysis, per-face crop + _composite_faces from photomaker_restore). Loads the InstantID community pipeline via `DiffusionPipeline.from_pretrained(custom_pipeline="pipeline_stable_diffusion_xl_instantid")` -- no upstream Python package needed; diffusers fetches the file from its community examples.
- New `instantid` extra in pyproject (insightface + onnxruntime + huggingface-hub). NON-COMMERCIAL block in the comment explains why.
- CLI: `--restore-faces-method [instantid|photomaker]`, default `instantid`. Both methods explicitly labeled NON-COMMERCIAL in the help text.
- Engine: dispatch on `restore_faces_method` to either `_restore_faces_instantid` or `_restore_faces_photomaker`.
- 9 control-flow tests for InstantID without model download (mirror the photomaker_restore.py test pattern + draw_kps helper checks). 587/587 pass.
Diffusers-0.38 compat verified by upstream code inspection: the InstantID pipeline inherits from `StableDiffusionXLControlNetPipeline`, uses only public diffusers APIs (`encode_prompt`, `prepare_image`, `prepare_latents`, `get_guidance_scale_embedding`), uses legacy attention processor API which diffusers preserves for backward compat. No PhotoMaker-V1-style internal text_encoder access. End-to-end execution will be validated by the Modal cert sweep in the next step.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
37817a610f |
test(photomaker): stub face_analyser + analyze_faces in the control-flow test
The previous commit added a real call into FaceAnalysis2 / analyze_faces inside restore_faces_photomaker, which broke the model-free control-flow test. Stub it: - monkeypatch _get_face_analyser to return a sentinel - install a fake `photomaker` module with analyze_faces returning a single 512-d zero embedding - add dtype=torch.float32 to the fake pipeline class so .to(device, dtype=...) works 11/11 green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
65de8df5c5 |
refactor(face-restore): drop GFPGAN, ship PhotoMaker-V2 as the sole restore (non-commercial)
Visual review of the GFPGAN-on-cleaned output (9-face grid, 1448x1086) showed it only polished the already-drifted face without restoring identity — useless for the "restore who is in the photo" intent. Dropping it. The shipped restore path is now PhotoMaker-V2, which delivers true identity-from- embedding face regeneration via a CLIP+ArcFace dual encoder. The ArcFace branch pulls InsightFace antelopev2/buffalo_l model packs at runtime, which InsightFace releases under a research-only license, so the whole extra is **NON-COMMERCIAL**. raiw.cc and any monetized deployment must NOT install the `photomaker` extra. This is called out at every entry point: CLI flag help, module docstring, pyproject extra block, CLAUDE.md extras bullet, README install snippet. Changes: - Deleted `src/remove_ai_watermarks/face_restore.py` and its tests. - Deleted the `restore` extra (gfpgan/facexlib/basicsr + scipy<1.18 / numba<0.60 pins) and the basicsr setuptools<69 build pin from pyproject.toml. - Restored `src/remove_ai_watermarks/photomaker_restore.py` (V2 this time: `TencentARC/PhotoMaker-V2`, `photomaker-v2.bin`, no `pm_version='v1'` override). - Restored the `photomaker` extra in pyproject with all the upstream-compat pins (einops, peft, onnxruntime, insightface) and the `allow-direct-references` hatch metadata block. - `InvisibleEngine` swapped `_restore_faces` -> `_restore_faces_photomaker`; `--restore-faces-method` removed (only one method, no choice). - CLI flag help, CLAUDE.md, README, docs/synthid.md, and docs/controlnet-removal-pipeline-research.md all updated. - docs/synthid-robust-identity-research.md status notice rewritten to list both abandoned commercial-safe attempts (V1 + GFPGAN-on-cleaned) and the non-commercial trade-off we accepted. ruff + strict pyright(src/) clean; 578 tests pass (the 9 GFPGAN tests are gone, the 11 PhotoMaker tests stay green). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
01fe98bf54 |
refactor(face-restore): rollback PhotoMaker, restore GFPGAN on the CLEANED image
After 7 cascading upstream-compat fixes (insightface dep, peft dep, pm_version, device, etc.), the PhotoMaker V1 cert sweep still hit a CFG batch-dim mismatch inside the denoising loop. The upstream PhotoMaker `pipeline.py` is forked from diffusers v0.29.1 and our env runs 0.38; SDXL prompt-encoder handling changed significantly between those versions, so making PhotoMaker work end-to-end needs a proper fork or a diffusers downgrade — both expensive. Not worth shipping today. Pivot: restore `face_restore.py` (GFPGAN) with a single-line fix that makes it SynthID-safe by construction. The previous design ran GFPGAN.enhance on the ORIGINAL watermarked image and was oracle-confirmed to re-add SynthID via the weight-0.5 pixel blend. The fix is to run GFPGAN on the diffusion-CLEANED image — whatever pixels GFPGAN derives from are already SynthID-free, so the partial blend cannot transport the watermark. Identity fidelity is lower than a true identity-as-embedding stack would deliver, but it ships and works. Changes: - `src/remove_ai_watermarks/face_restore.py` restored from pre-wipe state with one line changed: `restorer.enhance(cleaned_bgr, ...)` instead of `restorer.enhance(original_bgr, ...)`. `original_bgr` is kept as an unused positional argument for API stability. - `src/remove_ai_watermarks/photomaker_restore.py` and its tests REMOVED. The research note (`docs/synthid-robust-identity-research.md`) keeps a "status notice" documenting why PhotoMaker is parked for now and what the path back in would look like. - `pyproject.toml` `restore` extra restored (gfpgan/facexlib/basicsr + scipy<1.18 + numba<0.60 pins + the basicsr setuptools<69 build pin), plus `photomaker` extra (with its einops/insightface/peft pile) and the `[tool.hatch.metadata] allow-direct-references = true` block REMOVED. - `InvisibleEngine._restore_faces_photomaker` removed; `_restore_faces` restored. The `--restore-faces` CLI flag and its plumbing through cmd_* signatures are unchanged. 
- CLAUDE.md, README.md, docs/synthid.md, docs/controlnet-removal-pipeline-research.md updated to describe the shipped GFPGAN-on-cleaned design and to reference PhotoMaker only as the parked alternative. ruff + strict pyright(src/) clean; 578 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
860bde4a26 |
fix(photomaker extra): pin insightface for import resolution (MIT code only)
The upstream PhotoMaker package's `__init__.py` unconditionally imports a face-analyser class from its `insightface_package` submodule, so JUST importing `PhotoMakerStableDiffusionXLPipeline` (the V1 pipeline class we use) raises `ModuleNotFoundError: No module named 'insightface'` if insightface isn't present in the env. The Modal cert sweep caught this on the V1 image. Resolution: pin `insightface>=0.7.3` (and its `onnxruntime` runtime dep) in the `photomaker` extra. The PyPI insightface package is MIT-licensed CODE; the non-commercial restriction sits on the pretrained model packs (antelopev2, buffalo_l) which download only when `FaceAnalysis()` is instantiated. Our V1 path never instantiates the face-analyser -- it loads photomaker-v1.bin (CLIP-only encoder) via `load_photomaker_adapter` -- so the model-pack license does not bind us; we depend only on the MIT code for the import to resolve. Safety guards: - Runtime check in `_get_pipeline`: raises if `_PHOTOMAKER_FILE` is ever pointed at v2 (so a future maintainer can't silently regress to the InsightFace path). - New test class `TestV1OnlyCommercialSafetyGuard`: asserts repo + filename pin to V1 AND asserts the module source never references the face-analyser class (a static check that our codepath stays out of the runtime that would pull the non-commercial model packs). Docs: documented the import dance + legal split inline at the top of `photomaker_restore.py`. ruff clean; 581 tests pass (the 9 PhotoMaker tests plus 3 new V1-guard tests). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
439eeadc07 |
refactor(face-restore): wipe GFPGAN path, --restore-faces is PhotoMaker-only
The GFPGAN `restore` extra and its `face_restore.py` module are gone. They were oracle-confirmed to re-introduce SynthID by blending watermarked original face pixels at fidelity weight 0.5 (clean A/B: gemini_3 controlnet 0.20 detected WITH GFPGAN, clean WITHOUT). Keeping them as the default restore method was a footgun for the removal pipeline. PhotoMaker-V2 (added in the previous commit) is the single shipped restore path now -- identity-as-embedding, SynthID-safe by construction. Removed: - src/remove_ai_watermarks/face_restore.py + tests/test_face_restore.py - pyproject.toml `restore` extra (gfpgan/facexlib/basicsr + scipy/numba pins) - pyproject.toml `[tool.uv.extra-build-dependencies] basicsr = [...]` build pin - CLI: `--restore-faces-method` and `--restore-faces-weight` (no method choice to make, no GFPGAN weight knob to expose) - InvisibleEngine._restore_faces method (only _restore_faces_photomaker remains) - All restore-faces-method / restore-faces-weight threading through cmd_* signatures and _process_batch_image Kept: - `--restore-faces / --no-restore-faces`: now binds to PhotoMaker-V2. - All adopted oracle findings about GFPGAN re-introducing SynthID (kept in the research docs as historical context that explains why the path was removed). Docs updated: CLAUDE.md (restore extras bullet collapsed to photomaker, removed face_restore Key-modules bullet, several inline GFPGAN refs scrubbed), README.md (face-identity callout + install section now point to the photomaker extra), docs/synthid.md 5.5 (net recipe), docs/controlnet-removal-pipeline-research.md (recommendations). ruff + strict pyright (src/) clean; 578 tests pass (the 9 GFPGAN tests are gone, the 9 PhotoMaker tests stay green). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
1439eb0714 |
feat(photomaker): SynthID-safe face-identity restoration via PhotoMaker-V2
Adds the second face-restore mechanism, selectable via the new CLI option `--restore-faces-method=photomaker`. Unlike the existing GFPGAN path (which runs on the watermarked ORIGINAL and was oracle-confirmed to re-introduce SynthID by partial pixel blending), PhotoMaker carries identity in a SynthID-invariant OpenCLIP embedding and regenerates fresh face pixels conditioned on it — the pixels in the output are diffusion-fresh, so the watermark cannot be transported. The load-bearing assumption (embedding invariance to SynthID-magnitude pixel noise) was empirically validated in the prior commit (smoke test): cosine drift 0.002 under a ±2 LSB low-freq carrier, an order of magnitude less than JPEG90 drift which SynthID survives at >=99% TPR. End-to-end commercial-safe: - PhotoMaker-V2 weights: Apache-2.0 (TencentARC) - ID encoder: OpenCLIP-ViT-H/14 (MIT) - SDXL base: shared with the main pipeline - NO InsightFace (the non-commercial blocker for IP-Adapter FaceID / InstantID / PuLID / Arc2Face) Two-pass architecture (PhotoMaker has no ControlNetImg2img class in diffusers): 1) main controlnet/default removal pass cleans SynthID + drifts faces 2) PhotoMaker txt2img regenerates each face from its embedding, feather-composited back into the cleaned image New module `photomaker_restore.py` mirrors `face_restore.py`: lazy pipeline singleton (double-checked lock), `is_available()` gate, pure `_face_crop_square` and `_composite_faces` helpers, all unit-tested without the model (9 new tests). New `InvisibleEngine._restore_faces_photomaker` runs after the diffusion pass, mirroring `_restore_faces`. CLI flag `--restore-faces-method=[gfpgan|photomaker]` threaded through `cmd_invisible`/`cmd_all`/`cmd_batch` + `_process_batch_image`. New optional `photomaker` extra (Apache-2.0 + Apache-2.0/MIT deps, no basicsr). `[tool.hatch.metadata] allow-direct-references = true` is required because the upstream PhotoMaker package lives only on GitHub. 
The next step (separate work) is oracle validation: run a 6-image cert sweep through the new pipeline (default/controlnet at the certified strength + --restore-faces-method=photomaker) and confirm SynthID stays clean while face identity is recovered. The required infrastructure (`raiw-app/modal_cert.py`) is already in place. ruff + strict pyright(src/) clean; 586 tests pass (+ 9 new in tests/test_photomaker_restore.py). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
3aea21e632 |
feat(visible): Samsung Galaxy AI mark removal (bottom-left reverse-alpha, #37)
New samsung_engine.py mirrors the jimeng engine but anchors bottom-left; wired into watermark_registry, the CLI (--mark samsung / auto), and identify (visible_samsung, medium). visible_alpha_solve.py gains a corner=bl mode; samsung_alpha.png solved from @f-liva's flat captures. Calibrated for the Italian "Contenuti generati dall'AI" variant. Flat black/gray/white captures committed, real photos gitignored. Tests + docs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
6f4aa4c7b1 |
fix(invisible): retry in fp32 on a degenerate fp16 output (#41)
The fp16-fix VAE swap (#29) is gated to the default SDXL checkpoint, so a custom model_id, a stale pre-fix install, or a fal/custom loader can still decode to an all-black/NaN frame in fp16 (reporter: gpt-image 1448x1086, the `image_processor.py invalid value encountered in cast` warning). Add a model-agnostic backstop in remove_watermark: after generation, if the run was fp16 and the output is degenerate (_is_degenerate_image: near-zero mean and variance), rebuild the pipeline in fp32 on the same device and re-run once. fp32 is the verified-clean path, so a black image is never returned regardless of model_id or version. Mirrors the MPS->CPU fallback's self-mutation pattern; batch inherits it. Verified e2e on MPS by forcing fp16 with the swap disabled (first pass black, guard fired, retry clean). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
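The fp16 backstop in 6f4aa4c7b1 amounts to "check the output, retry once in fp32". The sketch below works on a flat list of pixel values in [0, 1] to stay dependency-free; the threshold, the function names, and the `run(dtype)` callback are hypothetical, and only the rule (near-zero mean and variance, or NaN, after an fp16 run triggers one fp32 re-run) follows the commit message.

```python
import math
import statistics
from typing import Callable, Sequence


def is_degenerate(pixels: Sequence[float], eps: float = 1e-3) -> bool:
    """True for an empty, NaN-containing, or all-black (near-zero mean and variance) frame."""
    if not pixels or any(math.isnan(p) for p in pixels):
        return True
    mean = statistics.fmean(pixels)
    var = statistics.pvariance(pixels, mu=mean)
    return mean < eps and var < eps


def generate_with_fp32_retry(
    run: Callable[[str], Sequence[float]], dtype: str = "fp16"
) -> Sequence[float]:
    """Run generation; if an fp16 run yields a degenerate frame, re-run once in fp32."""
    out = run(dtype)
    if dtype == "fp16" and is_degenerate(out):
        # Stand-in for rebuilding the pipeline in fp32 on the same device.
        out = run("fp32")
    return out
```

The check is model-agnostic by design: it inspects only the output, so it still fires for a custom `model_id` that the VAE swap does not cover.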
|
|
2c0b174dfa |
fix(gemini): self-verify repair for under-removed sparkles
After reverse-alpha, re-detect the sparkle; when one survives at or above the registry fail line (conf >= 0.5) -- an alpha mismatch the per-image gain estimate could not fully correct -- inpaint the footprint and keep that only when it lowers the re-detect confidence. The footprint inpaint reconstructs the slot from its darker surroundings, so it physically removes the bright sparkle; purely additive, the common clean removal re-detects below 0.5 and is returned untouched. Measured on the spaces visible-removal audit: gemini removal-audit failures drop 15 -> 11 (4 genuine rescues), doubao 65/65 and jimeng 11/11 unchanged, zero regressions on the 468 already-clean removals. An offset+scale alignment search was prototyped on the remaining 11 fails and rejected: an audit "ceiling" suggested +4 more, but those were NCC-gaming -- the lower-scoring placement left the sparkle as bright or brighter, just reshaping the residual so the contrast-invariant shape-NCC scored lower (a5a9: first-pass slot ~76 at background level vs the "aligned win" ~164). A brightness sanity check rejected every one, so it contributed nothing and was removed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
6d11c11b52 |
feat(auto): DBNet text detector, Real-ESRGAN upscaler, batch --auto
Three content-quality features for the invisible/all/batch pipeline.
DBNet text detector (auto_config): replace the MSER text heuristic with
PP-OCRv3 differentiable-binarization via cv2.dnn.TextDetectionModel_DB,
using a bundled 2.4 MB Apache-2.0 model (en/cn detection nets are
byte-identical, so it ships language-neutral). cv2.dnn is core OpenCV, so
no new pip dep. MSER stays as the fallback when the model can't load.
Validated on real images: matches MSER everywhere and additionally catches
the Doubao CJK mark MSER missed; routing decisions unchanged otherwise.
Real-ESRGAN upscaler (new upscaler.py, esrgan extra): optional
pre-diffusion super-resolution for the min-resolution floor upscale, loaded
via spandrel (MIT, no basicsr) with BSD-3-Clause weights downloaded on
first use. New --upscaler {lanczos,esrgan} on invisible/all/batch; default
stays lanczos and the engine falls back to lanczos when the extra is absent
or the model errors (never breaks removal). It is a manual opt-in knob (the
auto plan never selects it) -- as a generic GAN it sharpens photo/texture
content strongly but can degrade faces (the diffusion pass regenerates
them) and thin text, documented accordingly.
batch --auto: wire the content-adaptive --auto (+ --adaptive-polish) into
cmd_batch. The plan is recomputed per image and the invisible engine is
cached per resolved pipeline (default/controlnet), so a mixed directory
builds at most one engine of each kind. Verified end-to-end: 3 mixed
images routed correctly with only 2 pipeline loads (controlnet reused).
ruff + strict pyright(src/) clean; 558 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
4a6cd71ab2 |
Merge branch 'claude/silly-northcutt-c2bf06': unify C2PA vendor registry + code-health + uv publish
Brings in commit
|
||
|
|
32a0779e1d |
fix(gemini): demote sparkle false positives with a core-brightness gate
detect_watermark's shape-only NCC (spatial/gradient/var fusion) fires on ornate or flat content (text strips, banners, hatching) that coincidentally matches the diamond shape. The NCC is contrast-invariant, so it cannot see the defining property of a real Gemini sparkle: a bright WHITE overlay whose core sits above the local background. The fusion now demotes (caps confidence to 0.30) a match that is BOTH low-confidence (< _SPARKLE_FP_CONF 0.65) AND has a low core-ring brightness margin (_core_ring_margin < _SPARKLE_FP_MARGIN 5). Real sparkles escape via EITHER high confidence (white-bg sparkles score >=0.79 despite a low margin) OR high margin (dark/mid backgrounds, incl. the #36 faint-corner case), so both must fail to demote. The gate is monotonic -- it only removes detections, never adds -- so it cannot regress the verified-negative corpus (already 0 FPs). On the spaces corpus it demoted 16/495 flagged sparkles (13 no AI metadata = content FPs; the 3 AI-meta ones were visually FPs / a near-invisible white-on-white sparkle whose AI verdict is held by metadata), and dropped the removal-audit failures 20 -> 15. - _core_and_bg shared helper (core 75th-pct brightness vs background-ring median); _estimate_alpha_gain refactored onto it, new _core_ring_margin wrapper. - TestSparkleFalsePositiveGate: margin high/low, strong-sparkle kept (incl. on white via high conf), blurred no-core blob demoted. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
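The gate logic can be written in a few lines. A minimal sketch using the constants quoted above; the function name `gate_sparkle` is hypothetical and the real fusion code may be structured differently.

```python
SPARKLE_FP_CONF = 0.65    # below this a match counts as low-confidence
SPARKLE_FP_MARGIN = 5.0   # core-minus-ring brightness margin
DEMOTED_CONF = 0.30       # cap applied to a demoted match


def gate_sparkle(conf: float, core_ring_margin: float) -> float:
    """Cap confidence only when BOTH confidence and margin are low."""
    if conf < SPARKLE_FP_CONF and core_ring_margin < SPARKLE_FP_MARGIN:
        return min(conf, DEMOTED_CONF)
    return conf
```

The `min` keeps the gate monotonic: it can lower a confidence but never raise one.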
||
|
|
b686dbdd79 |
feat(auto): adaptive detail-targeting polish + --adaptive-polish flag
The fixed mild auto polish (unsharp 0.5 / grain 2.0) under-corrected soft photo/face output (gemini_3 stayed at lap-var 84 vs its 592 original) and its grain speckled small text. Replace it with humanizer.adaptive_polish: target the input's Laplacian variance with a capped unsharp scaled to the deficit + edge-masked grain (smooth regions only), calibrated by a short sigma search. Self-limiting on text/graphics -- already high-frequency, so almost no polish lands and text edges are masked out. Validated on the spaces corpus (gemini_3 84 -> 334 end-to-end; openai_1 text near-untouched). Interface: every --auto decision is now independently overridable -- add --adaptive-polish/--no-adaptive-polish (matching --restore-faces; works without --auto too) so the polish can be disabled or used manually. _apply_auto overrides exactly the three content-adaptive modes (pipeline, restore-faces, adaptive-polish); --unsharp/--humanize stay independent fixed filters. cv2-only, no new deps. Threaded through invisible/all (not batch). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
5cf68a6a3d |
refactor: unify C2PA vendor registry + code-health fixes + uv publish
Three P2 cleanups from a library-wide review.
Detection -- single C2PA_AI_VENDORS registry (noai/constants.py):
- C2PA_ISSUERS, SYNTHID_C2PA_ISSUERS, and identify._ISSUER_PLATFORM now derive from one C2paAiVendor table, so adding a C2PA vendor is one entry instead of edits in three places across two files. Behavior-identical (262 detection tests pass; the kept `needle` field is load-bearing -- it differs from `org` for Google and ByteDance, with no mechanical derivation).
Code-health:
- region_eraser.erase_lama now accepts grayscale/BGRA like erase_cv2 (it crashed on grayscale and silently dropped alpha on BGRA). +2 regression tests.
- batch frees the device cache between images via a shared try_empty_device_cache helper (generalized from the MPS-only _try_clear_mps_cache, now reused by both the MPS->CPU fallback and the batch loop).
- batch gained --controlnet-scale (parity with invisible/all).
CI / packaging:
- publish.yml uploads via `uv publish` (PyPI trusted publishing over OIDC), replacing pypa/gh-action-pypi-publish so uploads no longer depend on that action's bundled twine accepting the Metadata-Version. Workflow filename + pypi environment unchanged, so PyPI's trusted-publisher entry still matches.
- hatchling pin relaxed <1.28 -> <1.31 (verified against hatch's changelog: 1.30.0 made Metadata 2.5 the default, 1.30.1 reverted to 2.4; 1.27-1.29 were always 2.4). Kept as belt-and-suspenders so the first uv-publish release ships 2.4, isolating the uploader swap from the metadata-version bump.
Docs (CLAUDE.md, pyproject) synced; corrected the inaccurate "hatchling 1.28+ emits 2.5" note.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
9bd2c17cc4 |
feat(auto): content-adaptive --auto quality mode, Phase 1
Add `auto_config.plan(image_path) -> AutoConfig`, the first step of the invisible/all pipeline: it inspects the input image (before the diffusion model loads) and picks the quality modes so the run adapts to content. Quality-priority routing -- ControlNet (text/face-structure preservation) is the default, skipped for plain SDXL only on a clearly structure-less image; GFPGAN face restore when a face is present; a mild sharpen + grain polish when a smoothing pass ran. Exposed as `--auto` on `all`/`invisible` (`_apply_auto`; explicit flags override via click's parameter source). Not wired into batch (its engine is cached per-mode). Detection is cv2-only and torch-free (~100 MB peak RSS, a few ms): OpenCV YuNet (`cv2.FaceDetectorYN`, MIT, 232 KB model bundled in assets/) for faces, a Canny edge-density + MSER heuristic for text/structure (a rough Phase-1 placeholder; DBNet via cv2.dnn is the planned upgrade). ZERO new pip deps. Designed to run wherever the pipeline runs -- the raiw.cc Modal GPU worker -- never on the 512 MB web host. Real-ESRGAN-via-Spandrel upscaling (a new `esrgan` extra) and an adaptive Laplacian-variance polish are deferred to later phases. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
e7fb64dca1 |
fix(gemini): remove more-opaque sparkles via per-image alpha gain
The captured sparkle alpha peaks ~0.51, but some real Gemini sparkles are rendered more opaque. The fixed-alpha reverse blend then UNDER-subtracts and leaves a bright residual the detector still fires on. A visible-removal audit through the registry path on the spaces corpus showed this as a meaningful fraction of marks -- all under-removals, not a background-brightness class (failures and successes had the same input confidence and background luma; the discriminator was the removal delta itself). remove_watermark now estimates a per-image alpha gain (_estimate_alpha_gain: effective sparkle opacity at the bright core vs the local background ring, a_eff/a_cap, clamped [1.0, 1.94]) and scales the alpha to match before the over-sub/blend branch. A 1.05 deadband keeps a sparkle that already matches the capture byte-identical to the pre-fix output, so the fix is purely additive (0 regressions on the audit set; failures dropped substantially). The over-sub guard still runs on the scaled alpha as the safety net for an over-shoot. - _estimate_alpha_gain + _ALPHA_GAIN_MAX/_DEADBAND/_CORE_FRAC in gemini_engine. - TestUnderSubtractionGain asserts on footprint pixels, NOT the detector (its NCC is degenerate on a flat synthetic bg; the real corpus removal drops the detector ~0.80 -> ~0.27). - scripts/visible_removal_audit.py: the detect -> remove -> re-detect audit tool that found and validated this (operates on gitignored data/spaces only). - CLAUDE.md + README: document the under-subtraction gain. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
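The gain estimate can be illustrated with the blend equation for a white overlay, core = a*255 + (1-a)*bg. A sketch only: the constants come from the commit message, while the scalar `core`/`bg` inputs and the exact opacity formula are assumptions about how `_estimate_alpha_gain` works.

```python
A_CAP = 0.51                # peak of the captured alpha map
ALPHA_GAIN_MAX = 1.94       # upper clamp
ALPHA_GAIN_DEADBAND = 1.05  # below this, leave alpha unchanged


def estimate_alpha_gain(core: float, bg: float) -> float:
    """Gain a_eff / a_cap for a white sparkle, clamped to [1.0, ALPHA_GAIN_MAX]."""
    if bg >= 255.0:
        return 1.0  # no headroom to measure opacity on a white background
    a_eff = (core - bg) / (255.0 - bg)  # solve core = a*255 + (1-a)*bg for a
    gain = a_eff / A_CAP
    if gain < ALPHA_GAIN_DEADBAND:
        return 1.0  # deadband: output stays identical to the fixed-alpha path
    return min(gain, ALPHA_GAIN_MAX)
```

Note that 1.94 * 0.51 is about 0.99, so the clamp keeps the scaled alpha just under full opacity.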
||
|
|
d7e4fe8835 |
feat(invisible): upscale-floor for small inputs + unsharp post-filter
Two quality knobs for the SDXL invisible pass: - min_resolution floor (default 1024, --min-resolution): small inputs are upscaled to a 1024px long-side floor before diffusion, since SDXL img2img distorts on a tiny latent (a 381x512 portrait wrecks at native). The output is restored to the original input size, so it is a transparent quality boost; it adds time/memory on small inputs. 0 disables. Extends the pure _target_size helper (now cap-or-floor-or-native, min skipped on a min>max misconfig), unit-tested without a model. - unsharp post-filter (humanizer.unsharp_mask, --unsharp, opt-in default 0): applied LAST, after the GFPGAN face pass (a pre-GFPGAN sharpen would be smoothed back over), to counter the soft/over-smoothed look that diffusion + restoration leave behind (an AI tell). Pairs with --humanize (grain). Both threaded through invisible/all/batch + the module-level helper. Verified end-to-end on a 381x512 portrait: upscaled to 1024, sharpened, restored to 381x512. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
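The cap-or-floor-or-native rule can be sketched as a pure function. This is an illustration of the described behavior, not the actual `_target_size`: the signature, the rounding, and treating 0 as "disabled" for both knobs are assumptions.

```python
def target_size(width: int, height: int,
                max_resolution: int = 0, min_resolution: int = 1024):
    """Return the diffusion size: cap, floor, or native (long-side based)."""
    long_side = max(width, height)
    misconfigured = bool(max_resolution) and min_resolution > max_resolution
    if max_resolution and long_side > max_resolution:
        scale = max_resolution / long_side        # cap large inputs
    elif min_resolution and long_side < min_resolution and not misconfigured:
        scale = min_resolution / long_side        # floor small inputs
    else:
        return width, height                      # native
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For the 381x512 portrait mentioned above this yields 762x1024; the caller would then resize the output back to 381x512.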
||
|
|
411ef16ec3 |
feat: GFPGAN face-identity restoration post-pass
Add an optional, commercial-safe face-restoration post-pass that recovers face identity the diffusion removal pass drifts (canny holds structure, not likeness) while still scrubbing the pixel watermark in the face regions. - face_restore.py: GFPGANer singleton (CPU unless CUDA), the basicsr torchvision.transforms.functional_tensor shim, and the pure feather _composite_faces helper (unit-tested without the model). GFPGAN re-synthesizes each face from a StyleGAN2 prior, so composited face pixels are GAN-generated (no watermark, no pixel-copy) -- oracle-clean at weight 0.5 with identity preserved. - InvisibleEngine.remove_watermark: restore_faces / restore_faces_weight, best-effort, auto-skips when the extra is absent or no face is detected. - CLI --restore-faces/--no-restore-faces + --restore-faces-weight on invisible/all/batch (on by default). - restore extra (gfpgan/facexlib/basicsr), numpy<2-pinned (scipy<1.18, numba<0.60) and kept out of `all`; basicsr needs Python <3.13 + setuptools<69 to build, so pin .python-version 3.12. Commercial-safe: GFPGAN Apache-2.0, RetinaFace MIT. The CodeFormer alternative is non-commercial and is not shipped. The earlier IP-Adapter FaceID layer was removed (footgun: needs high strength, corrupts faces at the low removal strength). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
d90d5d886a |
feat: controlnet pipeline for text/face-structure preservation
Add `--pipeline controlnet` (SDXL base + xinsir canny ControlNet via StableDiffusionXLControlNetImg2ImgPipeline): the canny edge map conditions the img2img regeneration so text and face STRUCTURE stay sharp, while the watermark is still removed by the regeneration (`strength`) -- no original pixels are copied or frozen, so SynthID does not survive. Oracle-verified clean on OpenAI with better text/structure fidelity than plain img2img at equal strength. `--controlnet-scale` tunes structure preservation; fp32 on mps/cpu (fp16-fixed VAE on cuda/xpu). Shares the img2img runner (live progress + MPS->CPU fallback) and the fp16-VAE-fix / device-move helpers with the default pipeline. Remove the superseded subsystems -- ctrlregen (SD1.5 clean-noise), text-protection (differential / region-hires) and face-protection: they either destroyed real content or shielded the watermark by re-using original pixels. controlnet replaces them by regenerating everything under edge conditioning. Canny preserves face structure but not identity; face IDENTITY is a separate face-restoration post-pass (CodeFormer/GFPGAN), researched + prototyped but not yet shipped. An IP-Adapter FaceID attempt was built and removed (footgun: needs high strength, corrupts faces at removal strength). Docs: docs/controlnet-removal-pipeline-research.md, scripts/controlnet_sweep.py. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
175609b60a |
fix(gemini): rescue small corner sparkle buried by the size weight (#36)
detect_watermark's size-weighted global NCC search lets a larger, mediocre match (e.g. a bright collar in a portrait) outrank a small, near-perfect sparkle in the bottom-right corner, so a faint sparkle on a busy background scored below threshold and the image read as clean -- the regression from widening the search window 256px->512px between v0.7.2 and v0.8.8. Add _corner_promote: a bottom-right-corner raw-NCC pass that overrides the global pick when the corner holds a match with raw NCC >= 0.85 that beats it. It only ever replaces a lower-fidelity pick (cannot weaken an existing detection) and keeps the wider window for variant margins. The corner side is relative-clamped (0.20 of the short side, [96, 384]) so it stays a true corner at every scale: a fixed 256px covers ~70% of a small portrait, where a real photo raw-matches the star at ~0.81; relative tightening drops that to ~0.69. The 0.85 gate sits between the worst real-photo corner match (~0.78) and a genuine faint sparkle (~0.93): zero false positives across native + downscaled negatives, headshot rescued from below-threshold to 0.71. Factor the shared multi-scale matchTemplate loop into _scan_scales. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
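The corner sizing and the promotion gate are simple enough to sketch. The numbers (0.20, [96, 384], 0.85) come from the commit message; the function names and the use of plain raw-NCC scalars are hypothetical.

```python
def corner_side(width: int, height: int,
                frac: float = 0.20, lo: int = 96, hi: int = 384) -> int:
    """Corner window side: a fraction of the short side, clamped to [lo, hi]."""
    return int(min(max(frac * min(width, height), lo), hi))


def corner_promote(global_raw_ncc: float, corner_raw_ncc: float,
                   gate: float = 0.85) -> bool:
    """True when the corner match should override the global pick."""
    return corner_raw_ncc >= gate and corner_raw_ncc > global_raw_ncc
```

On a 381x512 portrait the window is 96 px instead of a fixed 256 px, which is what keeps it a true corner at small scales.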
||
|
|
df0fafe94e |
fix(identify): stop flagging multi-actor C2PA manifests as integrity clashes
The C2PA issuer attribution (`c2pa`) and the SynthID proxy (`synthid`) are derived from the same manifest, so treating them as independent signals made rule 1 fire on legitimate multi-actor manifests where a product wraps another vendor's engine (Microsoft Designer on OpenAI, Microsoft on Google) or an edit chain re-signs (Adobe over a Gemini original). 19 such files in the 2026-06-01/02 spaces batches read as "likely spoofed/laundered" before this. Group `c2pa` + `synthid` into one provenance source via `_CLASH_SOURCE`; rule 1 now requires two vendors from different sources. A manifest vendor still clashes with a genuinely independent stamp (EXIF/XMP generator, IPTC AISystemUsed, AIGC, xAI). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
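The grouping can be sketched as a source map consulted before comparing vendors. Only the `c2pa` + `synthid` grouping is taken from the commit message; the other signal names, the `(signal, vendor)` claim shape, and the function name are assumptions for illustration.

```python
# c2pa and synthid are derived from the same manifest, so they share a source.
CLASH_SOURCE = {"c2pa": "manifest", "synthid": "manifest"}


def vendors_clash(claims) -> bool:
    """claims: iterable of (signal, vendor). Clash needs two different
    vendors that come from two different provenance sources."""
    by_source = {}
    for signal, vendor in claims:
        source = CLASH_SOURCE.get(signal, signal)  # ungrouped signals stand alone
        by_source.setdefault(source, set()).add(vendor)
    groups = list(by_source.values())
    for i, left in enumerate(groups):
        for right in groups[i + 1:]:
            if any(a != b for a in left for b in right):
                return True
    return False
```

A Microsoft-signed manifest wrapping an OpenAI engine therefore reads as one source, while a manifest vendor against a different EXIF generator still clashes.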
||
|
|
9ca2811938 |
fix(gemini): inpaint sparkle footprint when reverse-alpha over-subtracts (#30)
On a dark/textured background (e.g. grass) the captured alpha map over-estimates the real Gemini sparkle's effective opacity (~0.51 captured vs ~0.31 effective), so the fixed-alpha reverse blend over-subtracts (watermarked - alpha*logo goes negative) and drives the footprint to black -- the white sparkle turns into a black diamond (issue #30, reported by @CoolZimo1). remove_watermark now detects this via _reverse_alpha_oversubtracts (fraction of footprint pixels with a negative numerator > 5%) and inpaints the small sparkle footprint from the surrounding pixels (cv2 NS, cropped to a padded box) instead. Behavior-neutral on the working case: a bright background over-subtracts at ~0%, so reverse-alpha is used and the output is byte-identical to before (verified: demo_banana 0.0 frac vs the issue-#30 grass image 0.61 frac; issue-#30 footprint recovers to background grass with no pit, residual sparkle conf 0.25 < 0.35). Guard is scoped to GeminiEngine: doubao/jimeng already NCC-align their alpha to the actual mark per image, which sidesteps the fixed-alpha mismatch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
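The over-subtraction check can be illustrated on a single-channel footprint. A sketch assuming a white logo (255) and a grayscale image; the real `_reverse_alpha_oversubtracts` may work per channel and on a cropped region.

```python
import numpy as np


def oversubtract_fraction(watermarked, alpha, logo: float = 255.0) -> float:
    """Fraction of footprint pixels where watermarked - alpha*logo < 0."""
    wm = np.asarray(watermarked, dtype=np.float64)
    a = np.asarray(alpha, dtype=np.float64)
    footprint = a > 0
    if not footprint.any():
        return 0.0
    numerator = wm - a * logo  # numerator of the reverse-alpha blend
    return float((numerator[footprint] < 0).mean())


def should_inpaint(watermarked, alpha, limit: float = 0.05) -> bool:
    """Switch from reverse-alpha to inpainting above the 5% limit."""
    return oversubtract_fraction(watermarked, alpha) > limit
```

With a captured alpha of 0.51 and an effective opacity of 0.31 on a dark background, every footprint pixel goes negative, which is the black-diamond case from issue #30.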
||
|
|
96038f960f |
feat(invisible): vendor-adaptive default strength (OpenAI 0.10 / Google 0.15)
The default img2img strength is now chosen from the detected SynthID vendor (C2PA issuer) instead of a single fixed 0.30: OpenAI gpt-image -> 0.10, Google Gemini -> 0.15, unknown source -> 0.15. Explicit --strength always wins. Basis: an oracle-verified June 2026 controlled study (clean v0.8.6, text/face protection OFF, per-image openai.com/verify or Gemini-app verdict). OpenAI's SynthID clears at 0.05 across 1024-1600 px (n=4, resolution-independent); Google's is ~3x more robust and needs 0.15 on the capped-1536 path (n=4). The dominant factor is the VENDOR, not resolution. The earlier single 0.30 default and the "resolution dependence" lore came from contaminated tests run with the protect-text bug ON (issue #14) -- re-running those same 1600x1600 images clean removes SynthID at 0.05. `vendor_for_strength(path)` reads metadata.synthid_source on the ORIGINAL input and is threaded through cli (invisible/all/batch) -> invisible_engine -> watermark_remover -> resolve_strength(strength, profile, vendor), so display and execution use the same vendor (the engine sees a temp path whose C2PA the visible pass already stripped, so detection must happen in the CLI on the pristine source). Caveat: Google's 0.15 was validated only on --max-resolution 1536; native 2816 Gemini was not locally measurable (OOM on Apple Silicon) and is pending GPU validation on raiw.cc. Docs: docs/synthid.md sections 2.2/4.4/5.2 corrected (the contaminated resolution-dependence findings replaced with the clean oracle-verified table); README and CLAUDE.md updated; CLI --strength help reflects the adaptive default. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
e501bec9ff |
feat(identify): detect visible Doubao/Jimeng marks; keep identify import torch-free
identify previously ran only the Gemini sparkle as a visible detector, so a Doubao/Jimeng image with stripped TC260 metadata had no visible fallback. Add `_visible_text_marks` (registry-backed) so the ByteDance Doubao 豆包AI生成 and Jimeng 即梦AI marks are detected too, each gated by its own engine NCC threshold via MarkDetection.detected. New signals `visible_doubao` / `visible_jimeng` (medium), same stripped-metadata fallback role as the sparkle; excluded from integrity-clash vendor claims; set platform only when no harder signal did. Also make `noai/__init__` lazy (PEP 562 __getattr__): importing the light `noai.c2pa` / `noai.constants` submodules (which identify needs) no longer eagerly pulls `watermark_remover`, which imports torch + diffusers at module top. `import remove_ai_watermarks.identify` drops from ~420 MB to ~21 MB in a full gpu/detect install (torch not loaded), so it fits a 512 MB host; the removal API resolves lazily on first access. Guarded by TestIdentifyImportIsLight. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
cddbaf6413 |
fix(invisible): raise default strength 0.10 -> 0.30 (current SynthID threshold); flag ctrlregen experimental
An oracle-verified GPU strength study (Modal A100, native res, Gemini-app 'Verify with SynthID', n=3 fresh Gemini images, protect_text/faces off) found the current Google SynthID survives strength 0.10/0.15/0.2 and is removed only at 0.3. The previous 0.10 default (set from an n=1 result) no longer clears it -- Google hardened SynthID and the threshold has climbed 0.05 -> 0.10 -> ~0.3. Bump DEFAULT_STRENGTH to 0.30; OpenAI/ChatGPT carry C2PA not SynthID, so 0.10 is plenty there (pass --strength 0.10). Note protect_text shields the text regions SynthID hides in (use --no-protect-text for full removal on text-heavy images). The same study found ctrlregen at clean-noise strength DESTROYS real images (hallucinated micro-text in smooth regions), with no usable middle setting, so the literature's 'clean-noise is the lever' did not hold empirically. Flag ctrlregen EXPERIMENTAL in the CLI --pipeline help, README, and watermark_profiles; SDXL img2img at ~0.3 stays the shippable path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
e42b7e9d6a |
refactor(cli): plain-text console output; drop rich; quiet transformers
cli.py now emits plain ASCII through a small click.echo shim (_Console / _Table / _Progress) instead of rich: no colors, markup tags, panels, progress bar, or Unicode glyphs (Warning: / -> / ... and dropped checkmark/cross marks). identify and metadata tables render as indented plain lines.
- drop rich from dependencies (pyproject.toml + uv.lock)
- __init__: set TRANSFORMERS_VERBOSITY=error (setdefault) plus a warnings filter so the transformers Siglip2ImageProcessorFast deprecation no longer prints at CLI startup (it fires from the eager noai import)
- TestGpuHintMarkup: the [gpu] hint is now printed verbatim; docstring updated
- CLAUDE.md: replace the obsolete rich-markup lesson, note the verbosity fix
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
2d49c3cb58 |
fix(invisible): ctrlregen defaults to clean-noise strength, not the SDXL 0.10
The ctrlregen profile inherited the SDXL img2img --strength default (0.10), a near-identity pass that loaded ControlNet + DINOv2-giant and barely changed the image -- a no-op for removal. resolve_strength() now resolves an unset strength per profile: 0.10 for the SDXL default, CTRLREGEN_DEFAULT_STRENGTH (1.0, clean-noise) for ctrlregen. It checks `is None` rather than falsiness, so an explicit 0.0 is respected (the old `strength or DEFAULT` swallowed it). Research basis: CtrlRegen (ICLR 2025, arXiv:2410.05470) removes robust watermarks by regenerating from clean Gaussian noise; partial-noise img2img retains watermark info that diffuses back, so a high (clean-noise) strength is the lever, not a knob on the light SDXL pass. CLI wiring (--strength default None) lands with the cli refactor. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
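The `is None` versus falsiness distinction is the core of this fix. A minimal sketch using the defaults named in this commit; the two-argument signature is a simplification and the profile strings are assumptions.

```python
from typing import Optional

DEFAULT_STRENGTH = 0.10            # SDXL img2img default at this commit
CTRLREGEN_DEFAULT_STRENGTH = 1.0   # clean-noise regeneration


def resolve_strength(strength: Optional[float], profile: str = "default") -> float:
    """Resolve an unset strength per profile; an explicit 0.0 is respected."""
    if strength is None:
        if profile == "ctrlregen":
            return CTRLREGEN_DEFAULT_STRENGTH
        return DEFAULT_STRENGTH
    return strength
```

The old `strength or DEFAULT` pattern treats 0.0 as unset, which is why the explicit `None` check matters.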
||
|
|
e572767555 |
feat(visible): add Jimeng remover, fix Doubao outline defect, reproducible mask build
Visible-watermark work across all three corner-mark engines plus a committed,
reproducible alpha-build pipeline (scripts/visible_alpha_solve.py) fed by committed
solid black/gray/white captures.
- jimeng: new "即梦AI" wordmark remover (reverse-alpha + thin residual inpaint,
always NCC-aligned -- the mark re-rasterizes/jitters per image). Detect via glyph
silhouette NCC (0.45 threshold; does not cross-fire with Doubao). Registered in the
visible-mark catalog; `visible --mark jimeng` / `--mark auto`.
- doubao: fix a real production defect -- the shipped remover left a READABLE
"豆包AI生成" outline on real samples while detect() returned conf 0.0 (fooled by a
thin outline), so the test passed and the "56/56 clean" claim was detector-measured,
not visual. Root cause: under-estimated alpha + fixed-geometry-no-inpaint + tight
locate box. Rebuilt alpha (careful gray-self solve), always-align, thin inpaint,
widened locate box -> readable outline becomes faint texture-level traces.
- gemini: rebuild gemini_bg_{96,48} from our own controlled captures (validated NCC
0.9998 vs the prior third-party asset); removal re-verified clean, no behaviour change.
- tests: add textured-shift regression to both engines (guards the align-on-shift path
the Doubao defect exposed; lesson: a detector-only removal test is insufficient,
assert visual residual).
- docs: CLAUDE.md, README, capture READMEs and docstrings synced; stale
"exact/pixel-exact/56-clean" claims removed.
Also includes a SynthID label-wording clarification in identify.py/cli.py
("SynthID pixel watermark" -> "SynthID watermark, inferred from C2PA metadata").
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
5d0e6c3a65 |
fix: harden metadata parsers and engines; sync docs (full-repo review)
Apply fixes from a full-repo review (code, tests, docs).
Security / correctness:
- Clamp attacker-controlled PNG/caBX chunk lengths to the remaining file size in metadata.py and noai/c2pa.py (a malformed length no longer drives a multi-GB read); skipped chunks seek instead of read.
- noai/isobmff.strip_c2pa_boxes is now fail-safe on a malformed box: return the original bytes with a warning instead of silently truncating the tail, so metadata --remove can no longer emit a corrupt file.
- doubao_engine._fixed_alpha_map clamps the glyph box to the image (no crash on degenerate width-vs-height).
- watermark_remover._run_region_hires gates the phaseCorrelate offset on response and magnitude (a spurious shift no longer garbles text) and drops the generator after a CPU fallback (no MPS/CPU device mismatch).
Robustness:
- gemini_engine, doubao_engine, region_eraser normalize grayscale and RGBA inputs to BGR at the engine entry points.
- image_io.imwrite returns False on an unwritable path (matches cv2).
- invisible_engine guards a None imread result before use.
- trustmark_detector._decoder uses a double-checked threading lock.
- ctrlregen.tiling.tile_positions raises on overlap >= tile.
- humanizer chromatic shift no longer wraps opposite-edge pixels.
- identify OpenAI caveat keyed on the normalized vendor, not a substring.
- Remove the dead "visible --detect-threshold" CLI option.
- publish.yml verifies the release tag matches the package version.
Docs:
- README strength 0.05 to 0.10; .env.example HF_TOKEN marked optional; doubao_capture README updated to reverse-alpha-only; CLAUDE.md synced with the new behaviors and the batch command.
Tests: new test_security_clamp.py for the read clamp and isobmff fail-safe; erase CLI coverage; integrity-clash rule 2 end-to-end; multi-tag EXIF survival and cross-format strip guards; channel/size, tiling, humanizer, and imwrite regressions.
Full suite 493 passed, 2 skipped; ruff and pyright src/ clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
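The chunk-length clamp from the security fixes can be sketched for the PNG case. This is an illustration of the technique only: `iter_png_chunks` and its `wanted` parameter are hypothetical names, and the actual parsers in metadata.py and noai/c2pa.py may differ.

```python
import io
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"


def iter_png_chunks(stream, wanted=(b"tEXt", b"iTXt")):
    """Yield (type, data) for wanted chunks, never trusting the length field."""
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    stream.seek(0)
    if stream.read(8) != PNG_SIG:
        return
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return
        length, ctype = struct.unpack(">I4s", header)
        remaining = max(size - stream.tell(), 0)
        length = min(length, remaining)        # clamp attacker-controlled length
        if ctype in wanted:
            yield ctype, stream.read(length)
        else:
            stream.seek(length, io.SEEK_CUR)   # skipped chunks seek, not read
        stream.seek(4, io.SEEK_CUR)            # skip the CRC
        if ctype == b"IEND":
            return
```

A chunk that claims 0xFFFFFFFF bytes can then read at most what is left in the file.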
||
|
|
d88b87ca4e |
Fix #29 black output: use fp16-fixed SDXL VAE on fp16 GPUs
The stock SDXL VAE overflows to NaN in fp16, so the plain img2img path decodes to an all-black image on a CUDA/XPU fp16 backend. This is the raiw.cc black result HitaoLin reported (a 1086x1448 input came back uniformly black). cpu/mps run fp32 and never hit it, and the differential / region-hires pipeline already upcasts the VAE itself, so only the plain path on a fp16 GPU was exposed. `_load_pipeline` now loads `madebyollin/sdxl-vae-fp16-fix` for the default SDXL checkpoint when running fp16, gated by the pure helper `_needs_fp16_vae_fix`. A custom non-SDXL model keeps its own VAE. The decision logic is unit-tested without a download (TestFp16VaeFix). The black->clean recovery itself needs a CUDA GPU and was not verifiable on this MPS machine; it must be confirmed on the backend. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
29da3c52b6 |
Raise default SynthID-removal strength 0.05 → 0.10 (current Google SynthID) (#32)
* Raise default SynthID-removal strength 0.05 -> 0.10 (current Google SynthID)
The old default (0.04/0.05) no longer removes the CURRENT Google SynthID (Nano Banana / Gemini 3): verified 2026-05-30 via the Gemini 'Verify with SynthID' oracle on a real image -- 0.05 still detected, 0.10 not detected (OpenAI's was already cleared at 0.05). Add DEFAULT_STRENGTH=0.10 in watermark_profiles, route the engine + CLI defaults to it. At 0.10 small text deforms more, which is why text protection (_run_region_hires) runs by default. CLAUDE.md SynthID note corrected. CAVEAT: n=1 Google + n=1 OpenAI; broad corpus oracle validation pending (task tracked).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Drop unused LOW/MEDIUM/HIGH strength profiles; CLI --strength defaults to DEFAULT_STRENGTH
The fixed strength presets (and get_recommended_strength) were dead -- nothing in the pipeline used them, only tests. One knob now: DEFAULT_STRENGTH (0.10), overridable per-call via the CLI --strength flag, which now defaults to that constant (single source of truth). Removed the WatermarkRemover.LOW/MEDIUM/HIGH class attrs and the get_recommended_strength tests.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
e4f558dccf |
Add per-region high-resolution text protection (regenerate crisp, scrub everywhere) (#31)
Replace the default text-protection path. Differential Diffusion froze text in latent space, which left SynthID intact inside text (violating remove-everywhere) and still softened sub-8px strokes (VAE latent limit). _run_region_hires instead scrubs the whole image, then re-scrubs each detected text block at high resolution and feather-composites it back: every pixel is regenerated (watermark removed everywhere) while small text stays crisp (high-res strokes span >1 latent cell). merge_text_regions + feather_paste are pure and unit-tested; each re-scrubbed patch is phase-correlated back to the original crop to null the ~1-2px round-trip offset. Synthetic 18px multilingual text: text-region SSIM 0.28 -> 0.48, visually garbled -> readable across Latin/Cyrillic/CJK. Legacy _run_differential / build_change_map remain but are no longer the default. Prod use still requires confirming via the SynthID oracle that re-scrubbed text zones read watermark-free. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
89f427852f |
Fix #30 white box: stop zeroing alpha in the watermark region on save
On RGBA inputs the CLI forced the watermark bbox alpha to 0 on save, so the
removed-sparkle area became a transparent hole that renders as a solid white
box on any non-transparent viewer. The Gemini app exports opaque RGBA, so
every user hit it. Reverse-alpha already recovers the real pixels there (and
`erase` inpaints them), so there is no artifact to hide -- the hole was the
bug, introduced as an over-correction in
|
||
|
|
25a1acc53b |
Detect TC260 AIGC label in JPEG EXIF and late/attribute PNG XMP
A corpus audit surfaced China TC260 AIGC-labeled images that `identify`
missed. Three detection gaps in `aigc_label`, all fixed:
- raw-JSON `{"AIGC":{...}}` in JPEG EXIF (UserComment): brace-matched from
the scan head with `json.raw_decode`, gated on a TC260 field like the
PNG-chunk path. (Doubao-class output via that export surface.)
- XMP attribute form `TC260:AIGC="{...}"` (PicWish): folded into the
element regex as a second alternation.
- TC260 XMP packet appended after a large `IDAT`, past the 1 MB scan
window: `scan_head` now appends late PNG metadata chunks via
`_png_late_metadata`, mirroring the existing ISOBMFF late-box scan.
Adds `scripts/corpus_gap_scan.py`: runs `identify` over a corpus, writes
the per-file report CSV, and flags `unknown` files that carry a known
marker in their metadata region (the audit that found these gaps).
Scanning only the metadata region — not the whole file — avoids the
random short-token collisions inside compressed PNG/JPEG streams.
On the local corpus this lifts 3 files from `unknown` to AI (China AIGC)
and leaves zero false gap candidates. Synthetic piexif/PngInfo fixtures
cover all three forms.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
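The raw-JSON path (the first gap above) can be sketched with the standard library's `json.JSONDecoder.raw_decode`, which parses one JSON value starting at an offset and ignores trailing bytes. The function name and the `TC260_FIELDS` gate list are assumptions for illustration; the sketch only matches the exact `{"AIGC"` prefix with no whitespace variants.

```python
import json

# Assumed gate fields; the commit only says "gated on a TC260 field".
TC260_FIELDS = ("Label", "ContentProducer", "ProduceID")


def find_aigc_label(head: bytes):
    """Return the AIGC dict embedded in a metadata scan head, or None."""
    text = head.decode("utf-8", errors="ignore")
    decoder = json.JSONDecoder()
    start = text.find('{"AIGC"')
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(text, start)  # brace-matched parse
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and isinstance(obj.get("AIGC"), dict):
            label = obj["AIGC"]
            if any(field in label for field in TC260_FIELDS):
                return label
        start = text.find('{"AIGC"', start + 1)
    return None
```

Gating on a known field keeps an unrelated `AIGC` key from counting as a label.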
|
||
|
|
58bdf51c59 |
Visible-watermark registry: reverse-alpha-only Doubao + Gemini, exact native recovery (#28)
* fix(trustmark): gate detection on re-encode durability to kill false positives
TrustMark's wm_present flag is a BCH validity check that spuriously validates on a content-correlated fraction of un-watermarked images (AI textures trip it more than camera photos). On a 1343-image set all 20 raw detections were false, several on Gemini/OpenAI/Doubao output that cannot carry Adobe's watermark, with random-bytes secrets. A genuine TrustMark is a durable soft binding that survives re-encoding, so detect_trustmark now re-decodes after a mild JPEG round-trip and requires the same schema both times. Every observed false positive collapsed under this gate; the second decode runs only on the rare hit.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(identify): Samsung Galaxy AI, FLUX, ByteDance C2PA; fix C2PA substring FP
Detection extensions verified on real signed files (2026-05-29):
- Samsung Galaxy AI: signer attribution via a new _SIGNER_C2PA_PLATFORM (Samsung Galaxy / ASUS Gallery) kept separate from the capture-camera _DEVICE_C2PA_PLATFORM so a Galaxy AI edit (device cert + AI source type) does not trip the camera-vs-AI integrity clash. Plus metadata.samsung_genai: the proprietary genAIType marker in PhotoEditor_Re_Edit_Data, a medium-confidence AI-editing signal (samsung_only branch).
- Black Forest Labs (FLUX) and ByteDance Volcano Engine (Doubao/Jimeng) added as C2PA issuers + issuer->platform mappings.
- fix: C2PA presence required only the bare 4-byte 'c2pa' substring, which false-positives on compressed pixel data (a recompressed PNG IDAT re-flagged C2PA after its manifest was correctly stripped). New c2pa_marker_in() requires the JUMBF wrapper (jumb+c2pa) or the C2PA uuid box; applied in identify + metadata. Verified: all 535 real C2PA files carry jumb.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(doubao): gate detection on text structure to cut ~95% of false positives (#23)
Coverage alone over-fired: any textured bottom-right corner cleared the threshold, so the detector false-positived on ~28% of arbitrary images. The real '豆包AI生成' mark is six glyphs in one row, so detect now also requires the text-structure signature (_glyph_structure): many connected components, no single dominant blob, concentration in a thin horizontal band. False positives dropped 343 -> 17 across the corpus while keeping real-mark recall and the doubao-1.png sample. Also accept a no-op force kwarg for remover-interface symmetry.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(samsung): add Samsung Galaxy AI visible-badge remover
New samsung_engine.py removes the bottom-left sparkle + localized 'AI-generated content' badge that Galaxy AI tools stamp. Mirrors the Doubao locate->mask->inpaint pattern but bottom-left, with a dual-polarity top-hat mask (the badge is light-on-dark or dark-on-light). Detection gates on a band + left-anchor signature (the Doubao CJK-component gate does not transfer: Latin badge letters connect into few blobs). Explicit-only -- tuned on few real badges with a ~4% FP floor, so it is not used in auto. Synthetic byte-blob fixtures (real badges are user content, not shipped).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(visible): unified known-watermark registry + LaMa inpaint backend
watermark_registry.py is a single catalog of known visible marks, each tying {usual location, in_auto flag, recovery strategy, detect adapter, remove adapter}: gemini (reverse-alpha, exact), doubao, samsung. cmd_visible is now registry-driven (best_auto_mark for --mark auto; mark_keys() feeds the CLI choices) -- the per-mark _run_doubao/_run_samsung helper branches are gone.
Cross-engine confidences are not comparable, so the gemini adapter applies the corpus-validated 0.5 sparkle threshold for auto arbitration (its engine flag is loose and weakly fired ~0.36 on Doubao text, hijacking auto).
--backend auto|cv2|lama chooses background reconstruction for the mask-based marks; auto = LaMa when onnxruntime is present, else cv2. For LaMa the mask is the FILLED glyph bounding box (sparse glyph masks leave anti-aliased edges behind). cv2 stays the zero-dependency fallback.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs: watermark registry, Samsung/FLUX/ByteDance detection, LaMa backend, trustmark gate
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(doubao): exact reverse-alpha removal from captured alpha map
The Doubao '豆包AI生成' mark is a fixed semi-transparent white overlay, so given its alpha map the original pixels are recovered exactly: original = (wm - a*logo)/(1-a) -- no inpaint hallucination. The alpha map + logo colour were solved from real black+gray Doubao captures on a controlled background: on black captured = a*logo, and the black/gray pair solves a per-pixel without assuming the logo colour (a_max~0.65, logo near-white); the white capture cross-validates (mark vanishes to a flat fill). Bundled as assets/doubao_alpha.png + geometry constants.
remove_watermark_reverse_alpha applies it scaled to image width; exact at the captured width, so the registry routes doubao through it only when reverse_alpha_available (width within the calibrated band) and the mark is detected, falling back to mask inpaint (cv2/LaMa) otherwise. A light residual inpaint cleans the sub-pixel rescaling error. Add captures at more resolutions to widen exact coverage.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(visible): reverse-alpha only -- drop inpaint removal + heuristic detection
Per the principle that we only remove/detect what we can do exactly, the visible-mark path is now reverse-alpha only:
- Doubao detect is reverse-alpha-consistent: match the bundled alpha glyph silhouette against the corner via TM_CCOEFF_NORMED (DETECT_NCC_THRESHOLD 0.4) -- keys on the '豆包AI生成' SHAPE, not coverage/structure heuristics. FP 7/1243 (0.6%). Removes the cv2 inpaint path + the _glyph_structure gate.
- Registry is reverse-alpha only: dropped the cv2/LaMa backend (_glyph_remove, _lama_box_inpaint, default_backend, --backend) and the Samsung entry. Doubao outside the alpha resolution band is skipped, never inpainted.
- Removed samsung_engine.py + tests + --mark samsung (no alpha map captured; Samsung C2PA/genAIType metadata detection in identify is unaffected).
- The universal erase --region (cv2/LaMa) is unchanged -- arbitrary-region inpainting stays a user-directed tool, separate from the known-mark registry.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(doubao): NCC sub-pixel alignment -> reverse-alpha at any resolution
A pure width-scale of the captured alpha map is only sub-pixel-accurate at the captured width and leaves a faint ghost elsewhere. remove_watermark_reverse_alpha now registers the alpha glyph to the actual mark via a TM_CCOEFF_NORMED scale+position search (_aligned_alpha_map) before inverting the blend, so the single 2048 capture works at any resolution -- verified clean on the 1773x2364 (3:4) corpus size, the biggest coverage gap (23 files). reverse_alpha_available is now just 'asset present' (no width band); the registry still gates removal on detect so a clean corner is never touched. Drops the _ALPHA_WIDTH_TOLERANCE gate.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(doubao): keep native recovery exact -- fixed geometry at captured width
Integer-pixel NCC alignment landed ~1px off at the captured width, degrading the otherwise-exact native reverse-alpha (synthetic recovery error 0.94 -> 1.39). remove_watermark_reverse_alpha now uses exact width-relative geometry within _ALPHA_NATIVE_BAND of the captured width and the NCC search only off it -- best of both: native back to 0.94, other resolutions still aligned.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(doubao): harden alignment -- try fixed+aligned, keep least residual (56/56)
On a faint/busy-background mark the NCC alignment peak can wander a few px off the true mark and leave a residual (2/56 real corpus files). Off the captured width, remove_watermark_reverse_alpha now builds BOTH the fixed-geometry and the NCC-aligned alpha map, applies each, and keeps whichever leaves the least residual mark (re-detect confidence on the bare reverse-alpha) -- geometry wins on faint marks, alignment on clear ones, no magic threshold. Real-file round-trip now removes 56/56 detected Doubao clean across every corpus resolution (was 54).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* perf(doubao): skip residual inpaint at native width for exact recovery
At the captured width the fixed-geometry reverse-alpha is pixel-exact, so inpainting over it only replaced exactly-recovered interior pixels with a cv2 hallucination -- measured worse on a textured background (native error vs true bg 1.6 reverse-alpha-only vs 2.6 with the old always-on full-footprint inpaint). Native now returns the bare recovery untouched; off-native, where NCC alignment is only sub-pixel-approximate, the footprint inpaint stays to clean the seam. Real round-trip still 56/56 across all corpus resolutions; negatives 0/60, Gemini unaffected.
Add test_native_returns_exact_reverse_alpha_no_inpaint as the regression guard. Sync CLAUDE.md + README (the table cell and prose described the pre-NCC "skipped off native / cv2-LaMa" behavior, now stale). Gitignore the session scheduled_tasks.lock, and add the text-protection research note.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
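The reverse-alpha arithmetic in the Doubao commits above can be checked on a single pixel. A minimal sketch, assuming the standard blend `wm = a*logo + (1-a)*original`; the function names and sample values are illustrative and are not the project's `remove_watermark_reverse_alpha`.

```python
def blend(original, alpha, logo):
    """Forward model: a semi-transparent logo composited over the original pixel."""
    return alpha * logo + (1.0 - alpha) * original


def solve_alpha(on_black, on_gray, gray_level):
    """Recover alpha from captures on black and gray without knowing the logo colour.

    on_black = a*logo and on_gray = a*logo + (1-a)*gray_level,
    so on_gray - on_black = (1-a)*gray_level.
    """
    return 1.0 - (on_gray - on_black) / gray_level


def reverse_alpha(wm, alpha, logo):
    """Invert the blend: original = (wm - a*logo) / (1 - a). Requires alpha < 1."""
    return (wm - alpha * logo) / (1.0 - alpha)


# Illustrative values: alpha 0.65 (the reported a_max), near-white logo at 250.
true_alpha, logo = 0.65, 250.0
on_black = blend(0.0, true_alpha, logo)
on_gray = blend(128.0, true_alpha, logo)

alpha = solve_alpha(on_black, on_gray, 128.0)
logo_est = on_black / alpha  # logo colour follows from the black capture
recovered = reverse_alpha(blend(40.0, true_alpha, logo), alpha, logo_est)
print(round(alpha, 6), round(logo_est, 6), round(recovered, 6))
```

Because the inversion divides by `1 - a`, any error in the alpha map is amplified where the mark is most opaque, which is consistent with the commits' concern about sub-pixel misalignment off the captured width.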