remove-ai-watermarks

mirror of https://github.com/wiltodelta/remove-ai-watermarks.git synced 2026-06-10 12:53:56 +02:00

Author	SHA1	Message	Date
Victor Kuznetsov	3d00fed00c	fix(photomaker-v2): compute id_embeds via FaceAnalysis2 before pipeline call The Modal cert sweep against V2 hit the next layer of the API: PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken.forward() missing 1 required positional argument: 'id_embeds' V2 forward takes BOTH the CLIP image embedding (computed inside the pipeline from input_id_images) AND an ArcFace identity embedding (id_embeds) that the caller must compute. The upstream pipeline does NOT auto-compute it -- inference_pmv2.py shows the caller using FaceAnalysis2 + analyze_faces to extract the ArcFace vector from each input ID image and passing id_embeds=torch.stack([...]) into pipe(...). Wired the same flow here: - New _get_face_analyser() singleton (double-checked lock) builds FaceAnalysis2(['CUDAExecutionProvider' | 'CPUExecutionProvider']).prepare(...). This is the non-commercial step (antelopev2/buffalo_l auto-download on first use). Module docstring already calls it out. - Per face: analyze_faces() -> torch.from_numpy(embedding) -> .unsqueeze(0) to match the pipeline's expected (B, D) shape, casting to pipeline.device/dtype. Faces InsightFace can't detect inside the crop get skipped (the most likely cause would be the diffusion-cleaned face being too small or stylised after the main pass; YuNet already gated us into having a face per crop, so this should be rare). - id_embeds= keyword threaded into the pipeline call site alongside the existing input_id_images=. Tests untouched (the V1-only safety guard was already removed in the previous commit when we swapped V1->V2; the existing 11 tests still pass). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 18:49:10 -07:00
Victor Kuznetsov	65de8df5c5	refactor(face-restore): drop GFPGAN, ship PhotoMaker-V2 as the sole restore (non-commercial) Visual review of the GFPGAN-on-cleaned output (9-face grid, 1448x1086) showed it only polished the already-drifted face without restoring identity -- useless for the "restore who is in the photo" intent. Dropping it. The shipped restore path is now PhotoMaker-V2, which delivers true identity-from-embedding face regeneration via a CLIP+ArcFace dual encoder. The ArcFace branch pulls InsightFace antelopev2/buffalo_l model packs at runtime, which InsightFace releases under a research-only license, so the whole extra is NON-COMMERCIAL. raiw.cc and any monetized deployment must NOT install the `photomaker` extra. This is called out at every entry point: CLI flag help, module docstring, pyproject extra block, CLAUDE.md extras bullet, README install snippet. Changes: - Deleted `src/remove_ai_watermarks/face_restore.py` and its tests. - Deleted the `restore` extra (gfpgan/facexlib/basicsr + scipy<1.18 / numba<0.60 pins) and the basicsr setuptools<69 build pin from pyproject.toml. - Restored `src/remove_ai_watermarks/photomaker_restore.py` (V2 this time: `TencentARC/PhotoMaker-V2`, `photomaker-v2.bin`, no `pm_version='v1'` override). - Restored the `photomaker` extra in pyproject with all the upstream-compat pins (einops, peft, onnxruntime, insightface) and the `allow-direct-references` hatch metadata block. - `InvisibleEngine` swapped `_restore_faces` -> `_restore_faces_photomaker`; `--restore-faces-method` removed (only one method, no choice). - CLI flag help, CLAUDE.md, README, docs/synthid.md, and docs/controlnet-removal-pipeline-research.md all updated. - docs/synthid-robust-identity-research.md status notice rewritten to list both abandoned commercial-safe attempts (V1 + GFPGAN-on-cleaned) and the non-commercial trade-off we accepted. ruff + strict pyright(src/) clean; 578 tests pass (the 9 GFPGAN tests are gone, the 11 PhotoMaker tests stay green).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 18:41:01 -07:00
Victor Kuznetsov	01fe98bf54	refactor(face-restore): rollback PhotoMaker, restore GFPGAN on the CLEANED image After 7 cascading upstream-compat fixes (insightface dep, peft dep, pm_version, device, etc.), the PhotoMaker V1 cert sweep still hit a CFG batch-dim mismatch inside the denoising loop. The upstream PhotoMaker `pipeline.py` is forked from diffusers v0.29.1 and our env runs 0.38; SDXL prompt-encoder handling changed significantly between those versions, so making PhotoMaker work end-to-end needs a proper fork or a diffusers downgrade — both expensive. Not worth shipping today. Pivot: restore `face_restore.py` (GFPGAN) with a single-line fix that makes it SynthID-safe by construction. The previous design ran GFPGAN.enhance on the ORIGINAL watermarked image and was oracle-confirmed to re-add SynthID via the weight-0.5 pixel blend. The fix is to run GFPGAN on the diffusion-CLEANED image — whatever pixels GFPGAN derives from are already SynthID-free, so the partial blend cannot transport the watermark. Identity fidelity is lower than a true identity-as-embedding stack would deliver, but it ships and works. Changes: - `src/remove_ai_watermarks/face_restore.py` restored from pre-wipe state with one line changed: `restorer.enhance(cleaned_bgr, ...)` instead of `restorer.enhance(original_bgr, ...)`. `original_bgr` is kept as an unused positional argument for API stability. - `src/remove_ai_watermarks/photomaker_restore.py` and its tests REMOVED. The research note (`docs/synthid-robust-identity-research.md`) keeps a "status notice" documenting why PhotoMaker is parked for now and what the path back in would look like. - `pyproject.toml` `restore` extra restored (gfpgan/facexlib/basicsr + scipy<1.18 + numba<0.60 pins + the basicsr setuptools<69 build pin), plus `photomaker` extra (with its einops/insightface/peft pile) and the `[tool.hatch.metadata] allow-direct-references = true` block REMOVED. 
- `InvisibleEngine._restore_faces_photomaker` removed; `_restore_faces` restored. The `--restore-faces` CLI flag and its plumbing through cmd_* signatures are unchanged. - CLAUDE.md, README.md, docs/synthid.md, docs/controlnet-removal-pipeline-research.md updated to describe the shipped GFPGAN-on-cleaned design and to reference PhotoMaker only as the parked alternative. ruff + strict pyright(src/) clean; 578 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:55:45 -07:00
Victor Kuznetsov	d1b85ee6a8	fix(photomaker): drop explicit negative_prompt to fix CFG batch mismatch Modal cert sweep #6 made it INTO the denoising loop and died with "Sizes of tensors must match except in dimension 1. Expected size 2 but got size 1 for tensor number 1 in the list." In the PhotoMaker pipeline's denoising loop, the per-step embeddings are built as torch.cat([negative_prompt_embeds, prompt_embeds(_text_only)], dim=0). The text-encoder + ID-encoder flow can leave the negative branch at batch=2 and the ID-injected branch at batch=1 when a custom negative_prompt is passed, so the cat fails. The upstream gradio demo just passes no negative_prompt and relies on the pipeline's empty default; do the same. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:35:40 -07:00
Victor Kuznetsov	031c38dc7f	fix(photomaker): place id_encoder on the right device + dtype Modal cert sweep #5 made it through component load (V1 id_encoder + lora_weights) and died at inference with the classic "Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor) should be the same" — id_encoder lived on CPU/fp32 while the rest of the pipeline ran on CUDA/fp16. Two fixes: 1. Call `pipe.to(device)` BEFORE `load_photomaker_adapter` so the loader picks the right device/dtype from `self.device` / `self.unet.dtype` when it builds the encoder. 2. Belt: after load, explicitly `pipe.id_encoder.to(device, dtype)` because some torch/diffusers combos leave custom attributes on the old device even when `pipe.to` ran first. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:29:00 -07:00
Victor Kuznetsov	1fb2a64b56	fix(photomaker): pass pm_version='v1' to load_photomaker_adapter Modal cert sweep #3 ran past the `insightface` import error and into a real state_dict mismatch: Error(s) in loading state_dict for PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken: Missing key(s) ... qformer_perceiver.token_proj.0.weight ... The upstream `load_photomaker_adapter` defaults to `pm_version='v2'` regardless of the .bin file passed -- the loader builds a V2 encoder (PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken) and then tries to load V1 weights into it. We must pass `pm_version='v1'` explicitly so the loader instantiates the CLIP-only PhotoMakerIDEncoder. The pipeline-level `input_id_images` API is the same across V1 and V2, so the call site does not change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:18:52 -07:00
Victor Kuznetsov	860bde4a26	fix(photomaker extra): pin `insightface` for import resolution (MIT code only) The upstream PhotoMaker package's `__init__.py` unconditionally imports a face-analyser class from its `insightface_package` submodule, so JUST importing `PhotoMakerStableDiffusionXLPipeline` (the V1 pipeline class we use) raises `ModuleNotFoundError: No module named 'insightface'` if insightface isn't present in the env. The Modal cert sweep caught this on the V1 image. Resolution: pin `insightface>=0.7.3` (and its `onnxruntime` runtime dep) in the `photomaker` extra. The PyPI insightface package is MIT-licensed CODE; the non-commercial restriction sits on the pretrained model packs (antelopev2, buffalo_l) which download only when `FaceAnalysis()` is instantiated. Our V1 path never instantiates the face-analyser -- it loads photomaker-v1.bin (CLIP-only encoder) via `load_photomaker_adapter` -- so the model-pack license does not bind us; we depend only on the MIT code for the import to resolve. Safety guards: - Runtime check in `_get_pipeline`: raises if `_PHOTOMAKER_FILE` is ever pointed at v2 (so a future maintainer can't silently regress to the InsightFace path). - New test class `TestV1OnlyCommercialSafetyGuard`: asserts repo + filename pin to V1 AND asserts the module source never references the face-analyser class (a static check that our codepath stays out of the runtime that would pull the non-commercial model packs). Docs: documented the import dance + legal split inline at the top of `photomaker_restore.py`. ruff clean; 581 tests pass (the 9 PhotoMaker tests plus 3 new V1-guard tests). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:13:20 -07:00
Victor Kuznetsov	dfa5181309	fix(photomaker): switch to V1 -- V2 actually requires InsightFace (non-commercial) A Modal cert sweep caught what the research doc missed: PhotoMaker-V2 fails at import without InsightFace ("No module named 'insightface'"). Reading the upstream source confirms it: `photomaker/__init__.py` imports `FaceAnalysis2` (an InsightFace wrapper) at module load, V2's encoder is named `PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken`, and `model_v2.py`'s forward takes an `id_embeds` argument that the pipeline computes via `insightface.app.FaceAnalysis(name='antelopev2', ...)`. So V2 is a DUAL encoder (CLIP + ArcFace), not CLIP-only as the model card line "id_encoder includes finetuned OpenCLIP-ViT-H-14 and a few fuse layers" implied. InsightFace's pretrained model packs (antelopev2, buffalo_l) are research/non-commercial only per their own README: "The pretrained models we provided with this library are available for non-commercial research purposes only." So V2 is blocked for a paid service like raiw.cc. PhotoMaker-V1 is the commercial-safe alternative -- its `PhotoMakerIDEncoder` (model.py) forward takes only `(id_pixel_values, prompt_embeds, class_tokens_mask)`, no ArcFace branch. Identity is CLIP-only, license is Apache-2.0, no InsightFace. Code change: swap the repo + filename constants in `photomaker_restore.py` (TencentARC/PhotoMaker, photomaker-v1.bin). Tests still pass (the 9 PhotoMaker tests use a fake pipeline, so the model swap is transparent to them). Doc correction: rewrote the verdict / license table / section 5 of `docs/synthid-robust-identity-research.md` to lead with V1 and add a correction notice explaining the V2 misread. Bulk-renamed `PhotoMaker-V2` to `PhotoMaker-V1` across CLAUDE.md, README.md, docs/synthid.md, and docs/controlnet-removal-pipeline-research.md (kept V2 only in the correction notice, the license table, and the anchor reference). ruff clean; 578 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:05:58 -07:00
Victor Kuznetsov	439eeadc07	refactor(face-restore): wipe GFPGAN path, --restore-faces is PhotoMaker-only The GFPGAN `restore` extra and its `face_restore.py` module are gone. They were oracle-confirmed to re-introduce SynthID by blending watermarked original face pixels at fidelity weight 0.5 (clean A/B: gemini_3 controlnet 0.20 detected WITH GFPGAN, clean WITHOUT). Keeping them as the default restore method was a footgun for the removal pipeline. PhotoMaker-V2 (added in the previous commit) is the single shipped restore path now -- identity-as-embedding, SynthID-safe by construction. Removed: - src/remove_ai_watermarks/face_restore.py + tests/test_face_restore.py - pyproject.toml `restore` extra (gfpgan/facexlib/basicsr + scipy/numba pins) - pyproject.toml `[tool.uv.extra-build-dependencies] basicsr = [...]` build pin - CLI: `--restore-faces-method` and `--restore-faces-weight` (no method choice to make, no GFPGAN weight knob to expose) - InvisibleEngine._restore_faces method (only _restore_faces_photomaker remains) - All restore-faces-method / restore-faces-weight threading through cmd_* signatures and _process_batch_image Kept: - `--restore-faces / --no-restore-faces`: now binds to PhotoMaker-V2. - All adopted oracle findings about GFPGAN re-introducing SynthID (kept in the research docs as historical context that explains why the path was removed). Docs updated: CLAUDE.md (restore extras bullet collapsed to photomaker, removed face_restore Key-modules bullet, several inline GFPGAN refs scrubbed), README.md (face-identity callout + install section now point to the photomaker extra), docs/synthid.md 5.5 (net recipe), docs/controlnet-removal-pipeline-research.md (recommendations). ruff + strict pyright (src/) clean; 578 tests pass (the 9 GFPGAN tests are gone, the 9 PhotoMaker tests stay green). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 15:35:37 -07:00
Victor Kuznetsov	1439eb0714	feat(photomaker): SynthID-safe face-identity restoration via PhotoMaker-V2 Adds the second face-restore mechanism, selectable via the new CLI option `--restore-faces-method=photomaker`. Unlike the existing GFPGAN path (which runs on the watermarked ORIGINAL and was oracle-confirmed to re-introduce SynthID by partial pixel blending), PhotoMaker carries identity in a SynthID-invariant OpenCLIP embedding and regenerates fresh face pixels conditioned on it -- the pixels in the output are diffusion-fresh, so the watermark cannot be transported. The load-bearing assumption (embedding invariance to SynthID-magnitude pixel noise) was empirically validated in the prior commit (smoke test): cosine drift 0.002 under a ±2 LSB low-freq carrier, an order of magnitude less than JPEG90 drift which SynthID survives at >=99% TPR. End-to-end commercial-safe: - PhotoMaker-V2 weights: Apache-2.0 (TencentARC) - ID encoder: OpenCLIP-ViT-H/14 (MIT) - SDXL base: shared with the main pipeline - NO InsightFace (the non-commercial blocker for IP-Adapter FaceID / InstantID / PuLID / Arc2Face) Two-pass architecture (PhotoMaker has no ControlNetImg2img class in diffusers): 1) main controlnet/default removal pass cleans SynthID + drifts faces 2) PhotoMaker txt2img regenerates each face from its embedding, feather-composited back into the cleaned image New module `photomaker_restore.py` mirrors `face_restore.py`: lazy pipeline singleton (double-checked lock), `is_available()` gate, pure `_face_crop_square` and `_composite_faces` helpers, all unit-tested without the model (9 new tests). New `InvisibleEngine._restore_faces_photomaker` runs after the diffusion pass, mirroring `_restore_faces`. CLI flag `--restore-faces-method=[gfpgan|photomaker]` threaded through `cmd_invisible`/`cmd_all`/`cmd_batch` + `_process_batch_image`. New optional `photomaker` extra (Apache-2.0 + Apache-2.0/MIT deps, no basicsr).
`[tool.hatch.metadata] allow-direct-references = true` is required because the upstream PhotoMaker package lives only on GitHub. The next step (separate work) is oracle validation: run a 6-image cert sweep through the new pipeline (default/controlnet at the certified strength + --restore-faces-method=photomaker) and confirm SynthID stays clean while face identity is recovered. The required infrastructure (`raiw-app/modal_cert.py`) is already in place. ruff + strict pyright(src/) clean; 586 tests pass (+ 9 new in tests/test_photomaker_restore.py). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 15:20:29 -07:00
Victor Kuznetsov	3aea21e632	feat(visible): Samsung Galaxy AI mark removal (bottom-left reverse-alpha, #37) New samsung_engine.py mirrors the jimeng engine but anchors bottom-left; wired into watermark_registry, the CLI (--mark samsung / auto), and identify (visible_samsung, medium). visible_alpha_solve.py gains a corner=bl mode; samsung_alpha.png solved from @f-liva's flat captures. Calibrated for the Italian "Contenuti generati dall'AI" variant. Flat black/gray/white captures committed, real photos gitignored. Tests + docs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 10:27:44 -07:00
Victor Kuznetsov	6f4aa4c7b1	fix(invisible): retry in fp32 on a degenerate fp16 output (#41) The fp16-fix VAE swap (#29) is gated to the default SDXL checkpoint, so a custom model_id, a stale pre-fix install, or a fal/custom loader can still decode to an all-black/NaN frame in fp16 (reporter: gpt-image 1448x1086, the `image_processor.py invalid value encountered in cast` warning). Add a model-agnostic backstop in remove_watermark: after generation, if the run was fp16 and the output is degenerate (_is_degenerate_image: near-zero mean and variance), rebuild the pipeline in fp32 on the same device and re-run once. fp32 is the verified-clean path, so a black image is never returned regardless of model_id or version. Mirrors the MPS->CPU fallback's self-mutation pattern; batch inherits it. Verified e2e on MPS by forcing fp16 with the swap disabled (first pass black, guard fired, retry clean). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 17:43:27 -07:00
Victor Kuznetsov	2c0b174dfa	fix(gemini): self-verify repair for under-removed sparkles After reverse-alpha, re-detect the sparkle; when one survives at or above the registry fail line (conf >= 0.5) -- an alpha mismatch the per-image gain estimate could not fully correct -- inpaint the footprint and keep that only when it lowers the re-detect confidence. The footprint inpaint reconstructs the slot from its darker surroundings, so it physically removes the bright sparkle; purely additive, the common clean removal re-detects below 0.5 and is returned untouched. Measured on the spaces visible-removal audit: gemini removal-audit failures drop 15 -> 11 (4 genuine rescues), doubao 65/65 and jimeng 11/11 unchanged, zero regressions on the 468 already-clean removals. An offset+scale alignment search was prototyped on the remaining 11 fails and rejected: an audit "ceiling" suggested +4 more, but those were NCC-gaming -- the lower-scoring placement left the sparkle as bright or brighter, just reshaping the residual so the contrast-invariant shape-NCC scored lower (a5a9: first-pass slot ~76 at background level vs the "aligned win" ~164). A brightness sanity check rejected every one, so it contributed nothing and was removed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 16:45:18 -07:00
Victor Kuznetsov	6d11c11b52	feat(auto): DBNet text detector, Real-ESRGAN upscaler, batch --auto Three content-quality features for the invisible/all/batch pipeline. DBNet text detector (auto_config): replace the MSER text heuristic with PP-OCRv3 differentiable-binarization via cv2.dnn.TextDetectionModel_DB, using a bundled 2.4 MB Apache-2.0 model (en/cn detection nets are byte-identical, so it ships language-neutral). cv2.dnn is core OpenCV, so no new pip dep. MSER stays as the fallback when the model can't load. Validated on real images: matches MSER everywhere and additionally catches the Doubao CJK mark MSER missed; routing decisions unchanged otherwise. Real-ESRGAN upscaler (new upscaler.py, esrgan extra): optional pre-diffusion super-resolution for the min-resolution floor upscale, loaded via spandrel (MIT, no basicsr) with BSD-3-Clause weights downloaded on first use. New --upscaler {lanczos,esrgan} on invisible/all/batch; default stays lanczos and the engine falls back to lanczos when the extra is absent or the model errors (never breaks removal). It is a manual opt-in knob (the auto plan never selects it) -- as a generic GAN it sharpens photo/texture content strongly but can degrade faces (the diffusion pass regenerates them) and thin text, documented accordingly. batch --auto: wire the content-adaptive --auto (+ --adaptive-polish) into cmd_batch. The plan is recomputed per image and the invisible engine is cached per resolved pipeline (default/controlnet), so a mixed directory builds at most one engine of each kind. Verified end-to-end: 3 mixed images routed correctly with only 2 pipeline loads (controlnet reused). ruff + strict pyright(src/) clean; 558 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-04 16:04:33 -07:00
Victor Kuznetsov	4a6cd71ab2	Merge branch 'claude/silly-northcutt-c2bf06': unify C2PA vendor registry + code-health + uv publish Brings in commit `5cf68a6` (single C2PA_AI_VENDORS registry, erase_lama grayscale/BGRA support, batch device-cache clearing + --controlnet-scale, uv publish via OIDC, hatchling pin <1.31). Auto-merged with no conflicts; ruff/pytest(544)/pyright all clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 22:10:25 -07:00
Victor Kuznetsov	32a0779e1d	fix(gemini): demote sparkle false positives with a core-brightness gate detect_watermark's shape-only NCC (spatial/gradient/var fusion) fires on ornate or flat content (text strips, banners, hatching) that coincidentally matches the diamond shape. The NCC is contrast-invariant, so it cannot see the defining property of a real Gemini sparkle: a bright WHITE overlay whose core sits above the local background. The fusion now demotes (caps confidence to 0.30) a match that is BOTH low-confidence (< _SPARKLE_FP_CONF 0.65) AND has a low core-ring brightness margin (_core_ring_margin < _SPARKLE_FP_MARGIN 5). Real sparkles escape via EITHER high confidence (white-bg sparkles score >=0.79 despite a low margin) OR high margin (dark/mid backgrounds, incl. the #36 faint-corner case), so both must fail to demote. The gate is monotonic -- it only removes detections, never adds -- so it cannot regress the verified-negative corpus (already 0 FPs). On the spaces corpus it demoted 16/495 flagged sparkles (13 no AI metadata = content FPs; the 3 AI-meta ones were visually FPs / a near-invisible white-on-white sparkle whose AI verdict is held by metadata), and dropped the removal-audit failures 20 -> 15. - _core_and_bg shared helper (core 75th-pct brightness vs background-ring median); _estimate_alpha_gain refactored onto it, new _core_ring_margin wrapper. - TestSparkleFalsePositiveGate: margin high/low, strong-sparkle kept (incl. on white via high conf), blurred no-core blob demoted. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 22:02:28 -07:00
Victor Kuznetsov	b686dbdd79	feat(auto): adaptive detail-targeting polish + --adaptive-polish flag The fixed mild auto polish (unsharp 0.5 / grain 2.0) under-corrected soft photo/face output (gemini_3 stayed at lap-var 84 vs its 592 original) and its grain speckled small text. Replace it with humanizer.adaptive_polish: target the input's Laplacian variance with a capped unsharp scaled to the deficit + edge-masked grain (smooth regions only), calibrated by a short sigma search. Self-limiting on text/graphics -- already high-frequency, so almost no polish lands and text edges are masked out. Validated on the spaces corpus (gemini_3 84 -> 334 end-to-end; openai_1 text near-untouched). Interface: every --auto decision is now independently overridable -- add --adaptive-polish/--no-adaptive-polish (matching --restore-faces; works without --auto too) so the polish can be disabled or used manually. _apply_auto overrides exactly the three content-adaptive modes (pipeline, restore-faces, adaptive-polish); --unsharp/--humanize stay independent fixed filters. cv2-only, no new deps. Threaded through invisible/all (not batch). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 21:49:08 -07:00
Victor Kuznetsov	5cf68a6a3d	refactor: unify C2PA vendor registry + code-health fixes + uv publish Three P2 cleanups from a library-wide review. Detection -- single C2PA_AI_VENDORS registry (noai/constants.py): - C2PA_ISSUERS, SYNTHID_C2PA_ISSUERS, and identify._ISSUER_PLATFORM now derive from one C2paAiVendor table, so adding a C2PA vendor is one entry instead of edits in three places across two files. Behavior-identical (262 detection tests pass; the kept `needle` field is load-bearing -- it differs from `org` for Google and ByteDance, with no mechanical derivation). Code-health: - region_eraser.erase_lama now accepts grayscale/BGRA like erase_cv2 (it crashed on grayscale and silently dropped alpha on BGRA). +2 regression tests. - batch frees the device cache between images via a shared try_empty_device_cache helper (generalized from the MPS-only _try_clear_mps_cache, now reused by both the MPS->CPU fallback and the batch loop). - batch gained --controlnet-scale (parity with invisible/all). CI / packaging: - publish.yml uploads via `uv publish` (PyPI trusted publishing over OIDC), replacing pypa/gh-action-pypi-publish so uploads no longer depend on that action's bundled twine accepting the Metadata-Version. Workflow filename + pypi environment unchanged, so PyPI's trusted-publisher entry still matches. - hatchling pin relaxed <1.28 -> <1.31 (verified against hatch's changelog: 1.30.0 made Metadata 2.5 the default, 1.30.1 reverted to 2.4; 1.27-1.29 were always 2.4). Kept as belt-and-suspenders so the first uv-publish release ships 2.4, isolating the uploader swap from the metadata-version bump. Docs (CLAUDE.md, pyproject) synced; corrected the inaccurate "hatchling 1.28+ emits 2.5" note. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 21:01:07 -07:00
Victor Kuznetsov	9bd2c17cc4	feat(auto): content-adaptive --auto quality mode, Phase 1 Add `auto_config.plan(image_path) -> AutoConfig`, the first step of the invisible/all pipeline: it inspects the input image (before the diffusion model loads) and picks the quality modes so the run adapts to content. Quality-priority routing -- ControlNet (text/face-structure preservation) is the default, skipped for plain SDXL only on a clearly structure-less image; GFPGAN face restore when a face is present; a mild sharpen + grain polish when a smoothing pass ran. Exposed as `--auto` on `all`/`invisible` (`_apply_auto`; explicit flags override via click's parameter source). Not wired into batch (its engine is cached per-mode). Detection is cv2-only and torch-free (~100 MB peak RSS, a few ms): OpenCV YuNet (`cv2.FaceDetectorYN`, MIT, 232 KB model bundled in assets/) for faces, a Canny edge-density + MSER heuristic for text/structure (a rough Phase-1 placeholder; DBNet via cv2.dnn is the planned upgrade). ZERO new pip deps. Designed to run wherever the pipeline runs -- the raiw.cc Modal GPU worker -- never on the 512 MB web host. Real-ESRGAN-via-Spandrel upscaling (a new `esrgan` extra) and an adaptive Laplacian-variance polish are deferred to later phases. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 20:52:17 -07:00
Victor Kuznetsov	e7fb64dca1	fix(gemini): remove more-opaque sparkles via per-image alpha gain The captured sparkle alpha peaks ~0.51, but some real Gemini sparkles are rendered more opaque. The fixed-alpha reverse blend then UNDER-subtracts and leaves a bright residual the detector still fires on. A visible-removal audit through the registry path on the spaces corpus showed this as a meaningful fraction of marks -- all under-removals, not a background-brightness class (failures and successes had the same input confidence and background luma; the discriminator was the removal delta itself). remove_watermark now estimates a per-image alpha gain (_estimate_alpha_gain: effective sparkle opacity at the bright core vs the local background ring, a_eff/a_cap, clamped [1.0, 1.94]) and scales the alpha to match before the over-sub/blend branch. A 1.05 deadband keeps a sparkle that already matches the capture byte-identical to the pre-fix output, so the fix is purely additive (0 regressions on the audit set; failures dropped substantially). The over-sub guard still runs on the scaled alpha as the safety net for an over-shoot. - _estimate_alpha_gain + _ALPHA_GAIN_MAX/_DEADBAND/_CORE_FRAC in gemini_engine. - TestUnderSubtractionGain asserts on footprint pixels, NOT the detector (its NCC is degenerate on a flat synthetic bg; the real corpus removal drops the detector ~0.80 -> ~0.27). - scripts/visible_removal_audit.py: the detect -> remove -> re-detect audit tool that found and validated this (operates on gitignored data/spaces only). - CLAUDE.md + README: document the under-subtraction gain. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 19:48:40 -07:00
Victor Kuznetsov	d7e4fe8835	feat(invisible): upscale-floor for small inputs + unsharp post-filter Two quality knobs for the SDXL invisible pass: - min_resolution floor (default 1024, --min-resolution): small inputs are upscaled to a 1024px long-side floor before diffusion, since SDXL img2img distorts on a tiny latent (a 381x512 portrait wrecks at native). The output is restored to the original input size, so it is a transparent quality boost; it adds time/memory on small inputs. 0 disables. Extends the pure _target_size helper (now cap-or-floor-or-native, min skipped on a min>max misconfig), unit-tested without a model. - unsharp post-filter (humanizer.unsharp_mask, --unsharp, opt-in default 0): applied LAST, after the GFPGAN face pass (a pre-GFPGAN sharpen would be smoothed back over), to counter the soft/over-smoothed look that diffusion + restoration leave behind (an AI tell). Pairs with --humanize (grain). Both threaded through invisible/all/batch + the module-level helper. Verified end-to-end on a 381x512 portrait: upscaled to 1024, sharpened, restored to 381x512. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 18:30:39 -07:00
Victor Kuznetsov	5ec8269949	chore: mark controlnet pipeline + GFPGAN restore-faces as experimental Both content-preservation features are now flagged EXPERIMENTAL and opt-in. --pipeline controlnet was already opt-in (default=default); --restore-faces flips from on-by-default to OFF by default, matching the repo's prior pattern for experimental preservation passes (the removed protect_text/protect_faces). - cli.py: --restore-faces/--no-restore-faces default False; EXPERIMENTAL in the --restore-faces / --controlnet-scale / --pipeline help; batch default False. - invisible_engine.py: remove_watermark restore_faces default False + docstring. - CLAUDE.md / README.md / docs/synthid.md: label both experimental/opt-in. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 16:59:28 -07:00
Victor Kuznetsov	411ef16ec3	feat: GFPGAN face-identity restoration post-pass Add an optional, commercial-safe face-restoration post-pass that recovers face identity the diffusion removal pass drifts (canny holds structure, not likeness) while still scrubbing the pixel watermark in the face regions. - face_restore.py: GFPGANer singleton (CPU unless CUDA), the basicsr torchvision.transforms.functional_tensor shim, and the pure feather _composite_faces helper (unit-tested without the model). GFPGAN re-synthesizes each face from a StyleGAN2 prior, so composited face pixels are GAN-generated (no watermark, no pixel-copy) -- oracle-clean at weight 0.5 with identity preserved. - InvisibleEngine.remove_watermark: restore_faces / restore_faces_weight, best-effort, auto-skips when the extra is absent or no face is detected. - CLI --restore-faces/--no-restore-faces + --restore-faces-weight on invisible/all/batch (on by default). - restore extra (gfpgan/facexlib/basicsr), numpy<2-pinned (scipy<1.18, numba<0.60) and kept out of `all`; basicsr needs Python <3.13 + setuptools<69 to build, so pin .python-version 3.12. Commercial-safe: GFPGAN Apache-2.0, RetinaFace MIT. The CodeFormer alternative is non-commercial and is not shipped. The earlier IP-Adapter FaceID layer was removed (footgun: needs high strength, corrupts faces at the low removal strength). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 16:59:28 -07:00
Victor Kuznetsov	d90d5d886a	feat: controlnet pipeline for text/face-structure preservation Add `--pipeline controlnet` (SDXL base + xinsir canny ControlNet via StableDiffusionXLControlNetImg2ImgPipeline): the canny edge map conditions the img2img regeneration so text and face STRUCTURE stay sharp, while the watermark is still removed by the regeneration (`strength`) -- no original pixels are copied or frozen, so SynthID does not survive. Oracle-verified clean on OpenAI with better text/structure fidelity than plain img2img at equal strength. `--controlnet-scale` tunes structure preservation; fp32 on mps/cpu (fp16-fixed VAE on cuda/xpu). Shares the img2img runner (live progress + MPS->CPU fallback) and the fp16-VAE-fix / device-move helpers with the default pipeline. Remove the superseded subsystems -- ctrlregen (SD1.5 clean-noise), text-protection (differential / region-hires) and face-protection: they either destroyed real content or shielded the watermark by re-using original pixels. controlnet replaces them by regenerating everything under edge conditioning. Canny preserves face structure but not identity; face IDENTITY is a separate face-restoration post-pass (CodeFormer/GFPGAN), researched + prototyped but not yet shipped. An IP-Adapter FaceID attempt was built and removed (footgun: needs high strength, corrupts faces at removal strength). Docs: docs/controlnet-removal-pipeline-research.md, scripts/controlnet_sweep.py. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 16:59:28 -07:00
Victor Kuznetsov	175609b60a	fix(gemini): rescue small corner sparkle buried by the size weight (#36) detect_watermark's size-weighted global NCC search lets a larger, mediocre match (e.g. a bright collar in a portrait) outrank a small, near-perfect sparkle in the bottom-right corner, so a faint sparkle on a busy background scored below threshold and the image read as clean -- the regression from widening the search window 256px->512px between v0.7.2 and v0.8.8. Add _corner_promote: a bottom-right-corner raw-NCC pass that overrides the global pick when the corner holds a match with raw NCC >= 0.85 that beats it. It only ever replaces a lower-fidelity pick (cannot weaken an existing detection) and keeps the wider window for variant margins. The corner side is relative-clamped (0.20 of the short side, [96, 384]) so it stays a true corner at every scale: a fixed 256px covers ~70% of a small portrait, where a real photo raw-matches the star at ~0.81; relative tightening drops that to ~0.69. The 0.85 gate sits between the worst real-photo corner match (~0.78) and a genuine faint sparkle (~0.93): zero false positives across native + downscaled negatives, headshot rescued from below-threshold to 0.71. Factor the shared multi-scale matchTemplate loop into _scan_scales. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 16:51:03 -07:00
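The corner sizing and the promotion gate from 175609b60a can be sketched as below. The function names and the reading of "beats it" as a strict raw-NCC comparison are assumptions; the constants (0.20 of the short side, the [96, 384] clamp, the 0.85 gate) are taken from the commit message.

```python
def corner_side(width, height, frac=0.20, lo=96, hi=384):
    """Side of the bottom-right search square, relative to the short side
    and clamped to [lo, hi] so it stays a true corner at every scale."""
    return int(min(hi, max(lo, round(frac * min(width, height)))))


def should_promote(corner_raw_ncc, global_raw_ncc, gate=0.85):
    """Hypothetical gate: the corner match overrides the global pick only if
    it clears the gate AND has higher raw NCC, so it can never replace a
    better detection with a worse one."""
    return corner_raw_ncc >= gate and corner_raw_ncc > global_raw_ncc
```

For a 381x512 portrait this gives a 96px corner rather than a fixed 256px one, which is the tightening the message credits with lowering the real-photo raw match.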
Victor Kuznetsov	35116d5e97	chore(release): v0.8.9 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 19:04:32 -07:00
Victor Kuznetsov	df0fafe94e	fix(identify): stop flagging multi-actor C2PA manifests as integrity clashes The C2PA issuer attribution (`c2pa`) and the SynthID proxy (`synthid`) are derived from the same manifest, so treating them as independent signals made rule 1 fire on legitimate multi-actor manifests where a product wraps another vendor's engine (Microsoft Designer on OpenAI, Microsoft on Google) or an edit chain re-signs (Adobe over a Gemini original). 19 such files in the 2026-06-01/02 spaces batches read as "likely spoofed/laundered" before this. Group `c2pa` + `synthid` into one provenance source via `_CLASH_SOURCE`; rule 1 now requires two vendors from different sources. A manifest vendor still clashes with a genuinely independent stamp (EXIF/XMP generator, IPTC AISystemUsed, AIGC, xAI). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 19:02:35 -07:00
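A rough sketch of the source grouping in df0fafe94e, assuming vendor claims arrive as (signal, vendor) pairs. `CLASH_SOURCE` mirrors the `_CLASH_SOURCE` idea from the message; the "manifest" label, the `exif` signal name and the function itself are hypothetical.

```python
# Signals derived from the same C2PA manifest share one provenance source.
CLASH_SOURCE = {"c2pa": "manifest", "synthid": "manifest"}


def vendors_clash(claims):
    """Return True only when two DIFFERENT sources name different vendors.

    claims: iterable of (signal, vendor) pairs. Signals missing from
    CLASH_SOURCE count as their own independent source.
    """
    by_source = {}
    for signal, vendor in claims:
        source = CLASH_SOURCE.get(signal, signal)
        by_source.setdefault(source, set()).add(vendor)
    groups = list(by_source.values())
    for i, vendors_a in enumerate(groups):
        for vendors_b in groups[i + 1:]:
            if any(a != b for a in vendors_a for b in vendors_b):
                return True
    return False
```

Under this grouping a Microsoft-signed manifest wrapping an OpenAI engine is a single source and no longer clashes, while a manifest vendor that disagrees with an independent stamp still does.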
Victor Kuznetsov	9cb66992bd	chore(release): v0.8.8 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 09:18:02 -07:00
Victor Kuznetsov	9ca2811938	fix(gemini): inpaint sparkle footprint when reverse-alpha over-subtracts (#30) On a dark/textured background (e.g. grass) the captured alpha map over-estimates the real Gemini sparkle's effective opacity (~0.51 captured vs ~0.31 effective), so the fixed-alpha reverse blend over-subtracts (watermarked - alpha*logo goes negative) and drives the footprint to black -- the white sparkle turns into a black diamond (issue #30, reported by @CoolZimo1). remove_watermark now detects this via _reverse_alpha_oversubtracts (fraction of footprint pixels with a negative numerator > 5%) and inpaints the small sparkle footprint from the surrounding pixels (cv2 NS, cropped to a padded box) instead. Behavior-neutral on the working case: a bright background over-subtracts at ~0%, so reverse-alpha is used and the output is byte-identical to before (verified: demo_banana 0.0 frac vs the issue-#30 grass image 0.61 frac; issue-#30 footprint recovers to background grass with no pit, residual sparkle conf 0.25 < 0.35). Guard is scoped to GeminiEngine: doubao/jimeng already NCC-align their alpha to the actual mark per image, which sidesteps the fixed-alpha mismatch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 09:17:32 -07:00
Victor Kuznetsov	b25276c4f2	chore(release): v0.8.7	2026-06-01 19:33:08 -07:00
Victor Kuznetsov	96038f960f	feat(invisible): vendor-adaptive default strength (OpenAI 0.10 / Google 0.15) The default img2img strength is now chosen from the detected SynthID vendor (C2PA issuer) instead of a single fixed 0.30: OpenAI gpt-image -> 0.10, Google Gemini -> 0.15, unknown source -> 0.15. Explicit --strength always wins. Basis: an oracle-verified June 2026 controlled study (clean v0.8.6, text/face protection OFF, per-image openai.com/verify or Gemini-app verdict). OpenAI's SynthID clears at 0.05 across 1024-1600 px (n=4, resolution-independent); Google's is ~3x more robust and needs 0.15 on the capped-1536 path (n=4). The dominant factor is the VENDOR, not resolution. The earlier single 0.30 default and the "resolution dependence" lore came from contaminated tests run with the protect-text bug ON (issue #14) -- re-running those same 1600x1600 images clean removes SynthID at 0.05. `vendor_for_strength(path)` reads metadata.synthid_source on the ORIGINAL input and is threaded through cli (invisible/all/batch) -> invisible_engine -> watermark_remover -> resolve_strength(strength, profile, vendor), so display and execution use the same vendor (the engine sees a temp path whose C2PA the visible pass already stripped, so detection must happen in the CLI on the pristine source). Caveat: Google's 0.15 was validated only on --max-resolution 1536; native 2816 Gemini was not locally measurable (OOM on Apple Silicon) and is pending GPU validation on raiw.cc. Docs: docs/synthid.md sections 2.2/4.4/5.2 corrected (the contaminated resolution-dependence findings replaced with the clean oracle-verified table); README and CLAUDE.md updated; CLI --strength help reflects the adaptive default. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 19:29:47 -07:00
Victor Kuznetsov	1708857772	fix(gemini): expand sparkle search area 256 -> 512px from corner The 256px limit caused misses when Gemini places the sparkle further from the corner than the standard 160px (margin 64 + logo 96). Observed variant at ~300px reported in issue #30. 512px covers all known Gemini margin variations with room to spare; matchTemplate on a 512x512 region is still fast on CPU. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 10:42:04 -07:00
Victor Kuznetsov	25cc4750df	chore(release): v0.8.5	2026-06-01 10:31:59 -07:00
Victor Kuznetsov	4b0b370ac0	fix(invisible): disable protect-text/protect-faces by default; add docs/synthid.md Both text and face protection were shielding SynthID from removal. The text-protection high-res re-scrub regenerates pixels at an upscaled resolution where the per-region pass may not be strong enough to re-destroy the SynthID payload, allowing it to survive in text areas. Face protection has an even more direct mechanism: it pastes back the original (pre-diffusion, watermarked) face pixels after the global pass, guaranteeing SynthID survives in face regions regardless of strength. Both --protect-text and --protect-faces are now off by default and opt-in. Rename from --no-protect-text / --no-protect-faces to --protect-text / --protect-faces. Extract shared click.option decorators to module-level constants (_protect_text_option, _protect_faces_option) to eliminate copy-paste between cmd_invisible and cmd_all. Add docs/synthid.md: primary-source-cited technical reference for SynthID-Image covering mechanism (post-hoc encoder/decoder, 136-bit payload, pixel-space, no model-weight modification), robustness numbers (arXiv:2510.09263: ~99.98% TPR at 0.1% FPR across 30 transforms), removal attacks and forensic detectability (arXiv:2605.09203: all 6 attacks detectable >98% TPR@1%FPR), detectability limits, oracle scope, adoption landscape, and practical implications including the protect-text/faces SynthID-preservation finding. Verified June 2026 on gpt-image 1600x1600 via openai.com/verify: with --protect-text SynthID detected; without, SynthID removed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 10:28:34 -07:00
Victor Kuznetsov	72812de03c	chore(release): v0.8.4	2026-05-31 20:46:52 -07:00
Victor Kuznetsov	e501bec9ff	feat(identify): detect visible Doubao/Jimeng marks; keep identify import torch-free identify previously ran only the Gemini sparkle as a visible detector, so a Doubao/Jimeng image with stripped TC260 metadata had no visible fallback. Add `_visible_text_marks` (registry-backed) so the ByteDance Doubao 豆包AI生成 and Jimeng 即梦AI marks are detected too, each gated by its own engine NCC threshold via MarkDetection.detected. New signals `visible_doubao` / `visible_jimeng` (medium), same stripped-metadata fallback role as the sparkle; excluded from integrity-clash vendor claims; set platform only when no harder signal did. Also make `noai/__init__` lazy (PEP 562 __getattr__): importing the light `noai.c2pa` / `noai.constants` submodules (which identify needs) no longer eagerly pulls `watermark_remover`, which imports torch + diffusers at module top. `import remove_ai_watermarks.identify` drops from ~420 MB to ~21 MB in a full gpu/detect install (torch not loaded), so it fits a 512 MB host; the removal API resolves lazily on first access. Guarded by TestIdentifyImportIsLight. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 20:43:52 -07:00
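The lazy-package pattern from e501bec9ff (PEP 562 module-level `__getattr__`) can be illustrated without the real package. This sketch builds a throwaway module object and uses the standard-library `json` module as a stand-in for the heavy import; `make_lazy_package` and the attribute map are hypothetical.

```python
import importlib
import types


def make_lazy_package(name, lazy):
    """Build a module whose listed attributes are imported on first access.

    lazy maps attribute name -> module that defines it. Nothing is imported
    until the attribute is actually requested.
    """
    pkg = types.ModuleType(name)

    def _lazy_getattr(attr):
        if attr in lazy:
            value = getattr(importlib.import_module(lazy[attr]), attr)
            setattr(pkg, attr, value)  # cache so later lookups skip this hook
            return value
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    # PEP 562: a module-level __getattr__ is consulted only when normal
    # attribute lookup on the module fails.
    pkg.__getattr__ = _lazy_getattr
    return pkg


pkg = make_lazy_package("demo_pkg", {"dumps": "json"})
```

In a real package the same hook would be a plain `def __getattr__(name)` in `__init__.py`, so importing a light submodule does not pull in the heavy one.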
Victor Kuznetsov	c155f81078	chore(release): v0.8.3 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 17:41:10 -07:00
Victor Kuznetsov	b991b11a19	docs(synthid): correct protect_text guidance -- it does NOT block removal (keep ON) An A/B at strength 0.3 on a real e-commerce infographic (updated GPU study) reverses the earlier claim: SynthID is a GLOBAL watermark, so 0.3 removes it whether protect_text is on or off, and protection SALVAGES text fidelity (medium headings/body stay readable; off, they garble). The earlier 'protect_text shields the watermark, use --no-protect-text' was wrong -- it mistook the 0.10 strength failure for a protection effect. Recommended SynthID config: ~0.3 + protect_text ON (the default). Also document the oracle scope: the Gemini app 'Verify with SynthID' is the only valid SynthID oracle; openai.com/verify is provenance-scoped (C2PA) and does NOT measure SynthID. Corrects CLAUDE.md + README + watermark_profiles comment shipped in `cddbaf6`. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 16:50:13 -07:00
Victor Kuznetsov	cddbaf6413	fix(invisible): raise default strength 0.10 -> 0.30 (current SynthID threshold); flag ctrlregen experimental An oracle-verified GPU strength study (Modal A100, native res, Gemini-app 'Verify with SynthID', n=3 fresh Gemini images, protect_text/faces off) found the current Google SynthID survives strength 0.10/0.15/0.2 and is removed only at 0.3. The previous 0.10 default (set from an n=1 result) no longer clears it -- Google hardened SynthID and the threshold has climbed 0.05 -> 0.10 -> ~0.3. Bump DEFAULT_STRENGTH to 0.30; OpenAI/ChatGPT carry C2PA not SynthID, so 0.10 is plenty there (pass --strength 0.10). Note protect_text shields the text regions SynthID hides in (use --no-protect-text for full removal on text-heavy images). The same study found ctrlregen at clean-noise strength DESTROYS real images (hallucinated micro-text in smooth regions), with no usable middle setting, so the literature's 'clean-noise is the lever' did not hold empirically. Flag ctrlregen EXPERIMENTAL in the CLI --pipeline help, README, and watermark_profiles; SDXL img2img at ~0.3 stays the shippable path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 16:38:49 -07:00
Victor Kuznetsov	729f5f2ecd	chore(release): v0.8.2 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 15:46:47 -07:00
Victor Kuznetsov	f16216cabc	feat(cli): add --no-protect-faces to invisible/all (skip the YOLO face detector) Mirrors --no-protect-text: when the image has no people, skip loading and running the YOLO face detector entirely. The heavy extract+blend already only ran when a face was found, but the detector itself always loaded+inferred to decide; this flag lets callers skip that fixed cost. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 15:27:14 -07:00
Victor Kuznetsov	e42b7e9d6a	refactor(cli): plain-text console output; drop rich; quiet transformers cli.py now emits plain ASCII through a small click.echo shim (_Console / _Table / _Progress) instead of rich: no colors, markup tags, panels, progress bar, or Unicode glyphs (Warning: / -> / ... and dropped checkmark/cross marks). identify and metadata tables render as indented plain lines. - drop rich from dependencies (pyproject.toml + uv.lock) - __init__: set TRANSFORMERS_VERBOSITY=error (setdefault) plus a warnings filter so the transformers Siglip2ImageProcessorFast deprecation no longer prints at CLI startup (it fires from the eager noai import) - TestGpuHintMarkup: the [gpu] hint is now printed verbatim; docstring updated - CLAUDE.md: replace the obsolete rich-markup lesson, note the verbosity fix Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 15:21:29 -07:00
Victor Kuznetsov	2d49c3cb58	fix(invisible): ctrlregen defaults to clean-noise strength, not the SDXL 0.10 The ctrlregen profile inherited the SDXL img2img --strength default (0.10), a near-identity pass that loaded ControlNet + DINOv2-giant and barely changed the image -- a no-op for removal. resolve_strength() now resolves an unset strength per profile: 0.10 for the SDXL default, CTRLREGEN_DEFAULT_STRENGTH (1.0, clean-noise) for ctrlregen. It checks `is None` rather than falsiness, so an explicit 0.0 is respected (the old `strength or DEFAULT` swallowed it). Research basis: CtrlRegen (ICLR 2025, arXiv:2410.05470) removes robust watermarks by regenerating from clean Gaussian noise; partial-noise img2img retains watermark info that diffuses back, so a high (clean-noise) strength is the lever, not a knob on the light SDXL pass. CLI wiring (--strength default None) lands with the cli refactor. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 15:07:19 -07:00
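The `is None` versus falsiness point in 2d49c3cb58 is easy to show in isolation. The two default values and the name `CTRLREGEN_DEFAULT_STRENGTH` come from the message; the `DEFAULT_STRENGTH` name, the two-argument signature and the profile strings are assumptions for this sketch.

```python
DEFAULT_STRENGTH = 0.10            # SDXL img2img default at this commit
CTRLREGEN_DEFAULT_STRENGTH = 1.0   # clean-noise regeneration


def resolve_strength(strength, profile="default"):
    """Resolve an unset strength per profile.

    Only None means "unset": an explicit 0.0 is a real value and must be
    kept, which `strength or DEFAULT_STRENGTH` would silently replace.
    """
    if strength is None:
        if profile == "ctrlregen":
            return CTRLREGEN_DEFAULT_STRENGTH
        return DEFAULT_STRENGTH
    return strength
```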
Victor Kuznetsov	33bd401e2a	fix(visible): guard remove_watermark_reverse_alpha on tiny images too The previous commit guarded extract_mask, but the 2048x1 crash was actually in _fixed_alpha_map's cv2.resize to a ~1-px-tall target (Windows: "Unknown C++ exception" / access violation). Return image.copy() up front when h < 32 or w < 64 (no real watermarked image is that small), before any cv2 call. Same guard in both Doubao and Jimeng. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 14:00:52 -07:00
Victor Kuznetsov	7167d2bae7	fix(visible): guard extract_mask against degenerate ROIs (Windows CI crash) The always-align removal scores each placement with a residual detect(), which on an extremely wide/short image (2048x1, test_wide_short_does_not_raise) fed cv2 GaussianBlur a ~1-px-tall ROI and faulted natively on Windows py3.12 (access violation, non-deterministic -- one CI cell went red, a re-run passed). The old at-native path never ran detect() on degenerate sizes. Skip the cv2 pipeline and return an empty mask when bh < 16 or bw < 16; real images always clear the guard (the WM_* box floors are max(16,..) / max(40,..)). Same fix in both Doubao and Jimeng. Also sync the stale Doubao module docstring. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 13:25:57 -07:00
Victor Kuznetsov	e7c57e3892	chore(release): v0.8.1 — exclude data/ from sdist The 0.8.0 PyPI publish uploaded the wheel but the sdist was rejected (400 File too large): hatchling's default sdist bundled the committed data/ test corpora (synthid_corpus images + the new visible-mark captures), pushing it past PyPI's per-project file-size limit. Add a sdist target that excludes /data, dropping it ~85 MB -> 9.8 MB. The wheel already ships only src/ and is unaffected. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 12:57:46 -07:00
Victor Kuznetsov	315320056b	chore(release): v0.8.0 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 12:25:02 -07:00
Victor Kuznetsov	e572767555	feat(visible): add Jimeng remover, fix Doubao outline defect, reproducible mask build Visible-watermark work across all three corner-mark engines plus a committed, reproducible alpha-build pipeline (scripts/visible_alpha_solve.py) fed by committed solid black/gray/white captures. - jimeng: new "即梦AI" wordmark remover (reverse-alpha + thin residual inpaint, always NCC-aligned -- the mark re-rasterizes/jitters per image). Detect via glyph silhouette NCC (0.45 threshold; does not cross-fire with Doubao). Registered in the visible-mark catalog; `visible --mark jimeng` / `--mark auto`. - doubao: fix a real production defect -- the shipped remover left a READABLE "豆包AI生成" outline on real samples while detect() returned conf 0.0 (fooled by a thin outline), so the test passed and the "56/56 clean" claim was detector-measured, not visual. Root cause: under-estimated alpha + fixed-geometry-no-inpaint + tight locate box. Rebuilt alpha (careful gray-self solve), always-align, thin inpaint, widened locate box -> readable outline becomes faint texture-level traces. - gemini: rebuild gemini_bg_{96,48} from our own controlled captures (validated NCC 0.9998 vs the prior third-party asset); removal re-verified clean, no behaviour change. - tests: add textured-shift regression to both engines (guards the align-on-shift path the Doubao defect exposed; lesson: a detector-only removal test is insufficient, assert visual residual). - docs: CLAUDE.md, README, capture READMEs and docstrings synced; stale "exact/pixel-exact/56-clean" claims removed. Also includes a SynthID label-wording clarification in identify.py/cli.py ("SynthID pixel watermark" -> "SynthID watermark, inferred from C2PA metadata"). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 12:20:19 -07:00
Victor Kuznetsov	5d0e6c3a65	fix: harden metadata parsers and engines; sync docs (full-repo review) Apply fixes from a full-repo review (code, tests, docs). Security / correctness: - Clamp attacker-controlled PNG/caBX chunk lengths to the remaining file size in metadata.py and noai/c2pa.py (a malformed length no longer drives a multi-GB read); skipped chunks seek instead of read. - noai/isobmff.strip_c2pa_boxes is now fail-safe on a malformed box: return the original bytes with a warning instead of silently truncating the tail, so metadata --remove can no longer emit a corrupt file. - doubao_engine._fixed_alpha_map clamps the glyph box to the image (no crash on degenerate width-vs-height). - watermark_remover._run_region_hires gates the phaseCorrelate offset on response and magnitude (a spurious shift no longer garbles text) and drops the generator after a CPU fallback (no MPS/CPU device mismatch). Robustness: - gemini_engine, doubao_engine, region_eraser normalize grayscale and RGBA inputs to BGR at the engine entry points. - image_io.imwrite returns False on an unwritable path (matches cv2). - invisible_engine guards a None imread result before use. - trustmark_detector._decoder uses a double-checked threading lock. - ctrlregen.tiling.tile_positions raises on overlap >= tile. - humanizer chromatic shift no longer wraps opposite-edge pixels. - identify OpenAI caveat keyed on the normalized vendor, not a substring. - Remove the dead "visible --detect-threshold" CLI option. - publish.yml verifies the release tag matches the package version. Docs: - README strength 0.05 to 0.10; .env.example HF_TOKEN marked optional; doubao_capture README updated to reverse-alpha-only; CLAUDE.md synced with the new behaviors and the batch command. 
Tests: new test_security_clamp.py for the read clamp and isobmff fail-safe; erase CLI coverage; integrity-clash rule 2 end-to-end; multi-tag EXIF survival and cross-format strip guards; channel/size, tiling, humanizer, and imwrite regressions. Full suite 493 passed, 2 skipped; ruff and pyright src/ clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-30 18:00:39 -07:00
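A minimal sketch of the length clamp described in 5d0e6c3a65, assuming a plain PNG chunk walk (8-byte signature, then 4-byte big-endian length, 4-byte type, data, 4-byte CRC). `iter_png_chunks` is a hypothetical helper, not the code in `metadata.py`; it skips CRC validation and only illustrates "clamp to the remaining bytes, seek past unwanted chunks".

```python
import io
import struct


def iter_png_chunks(stream, wanted=frozenset()):
    """Yield (type, data) per chunk; data is None for chunks that are skipped.

    The declared length is attacker-controlled, so it is clamped to the bytes
    actually left in the file before any read or seek.
    """
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    stream.seek(8)  # skip the PNG signature
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # truncated or end of file
        length, ctype = struct.unpack(">I4s", header)
        length = min(length, max(0, size - stream.tell()))  # the clamp
        if ctype in wanted:
            yield ctype, stream.read(length)
        else:
            stream.seek(length, io.SEEK_CUR)  # skip without reading
            yield ctype, None
        stream.seek(4, io.SEEK_CUR)  # step over the CRC


sig = b"\x89PNG\r\n\x1a\n"
good = struct.pack(">I4s", 5, b"tEXt") + b"hello" + b"\x00" * 4
bad = struct.pack(">I4s", 0xFFFFFFFF, b"caBX") + b"abc"  # lies about its size
chunks = list(iter_png_chunks(io.BytesIO(sig + good + bad), {b"tEXt", b"caBX"}))
```

The second chunk declares roughly 4 GB but only three bytes exist, so the read is bounded by the file rather than by the header.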
Victor Kuznetsov	5298dcc6a3	chore(release): v0.7.2 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-30 14:35:04 -07:00

120 Commits