remove-ai-watermarks

mirror of https://github.com/wiltodelta/remove-ai-watermarks.git synced 2026-07-26 17:31:06 +02:00

Author	SHA1	Message	Date
Victor Kuznetsov	53d753f2ad	fix(instantid): pre-fetch antelopev2 from HF mirror (InsightFace auto-link is broken) InsightFace's built-in auto-download for the antelopev2 model pack (github.com/deepinsight/insightface/releases/download/v0.7/antelopev2.zip) has been broken since at least 2024 (upstream issues #2517, #2766, called out in InstantID's README: "manually download via this URL to models/antelopev2 as the default link is invalid"). When the .onnx files aren't in place, FaceAnalysis.prepare() raises `assert 'detection' in self.models` -- which is exactly what our Modal cert sweep hit on the first real run. Fix: a tiny pre-flight `_ensure_antelopev2()` that pulls the five expected .onnx files (1k3d68, 2d106det, genderage, glintr100, scrfd_10g_bnkps) from the HuggingFace mirror `kidyu/antelopev2-for-InstantID-ComfyUI` into ./models/antelopev2/ before FaceAnalysis is instantiated. Idempotent (skips files that already exist); uses huggingface_hub's cache for free caching on the Modal volume. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 19:58:40 -07:00
Victor Kuznetsov	00c559482f	fix(invisible-engine): log exc_info + exception class on restore_faces failure The InstantID cert sweep emitted `restore_faces post-pass failed ()` -- the exception's str() was empty so the log line told us nothing about what actually failed. Adding `exc_info=True` plus `type(e).__name__` so the full traceback and exception class land in the log even when the message is empty. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 19:53:07 -07:00
Victor Kuznetsov	a296d5fe46	fix(instantid): inline YuNet detection (the imagined _get_yunet doesn't exist) The InstantID restore module imported `_get_yunet` from `auto_config`, but auto_config doesn't export that function -- the YuNet singleton lives inline inside `detect_face()`. Caught by the Modal cert sweep: restore_faces post-pass failed (cannot import name '_get_yunet' from 'remove_ai_watermarks.auto_config'); keeping un-restored output Inline the YuNet builder the same way `photomaker_restore` does (read `auto_config._FACE_SCORE` and the bundled `face_detection_yunet_2023mar.onnx` asset, build a fresh `FaceDetectorYN` per call). This is the proven pattern from PhotoMaker and avoids a private-API drift between the modules. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 19:48:21 -07:00
Victor Kuznetsov	70e8b3a517	feat(face-restore): add InstantID as the default non-commercial restore path Per the 2026-06-08 deep-research synthesis (docs/synthid-robust-identity-research-2026-06-08.md), the entire ArcFace-class identity-adapter ecosystem for SDXL is blocked from commercial use by InsightFace's non-commercial model packs (antelopev2 / buffalo_l). No commercial-safe ArcFace-grade identity stack exists today. The user explicitly opted into shipping a non-commercial restore path (research / personal use; raiw.cc must NOT install the extra). Architectural choice: InstantID over PhotoMaker-V2 as the default. - PhotoMaker-V2 (CLIP+ArcFace dual encoder, txt2img only): documented upstream identity drift on Asian male faces, visually confirmed in our cert sweep (tatsunari rendered as a generic woman; group photo collapsed into a patchwork). - InstantID (ArcFace cross-attention + landmark ControlNet): semantic identity branch + spatial weak landmark control, decoupled. Per InstantID paper (arXiv:2401.07519) and the research report, stronger identity fidelity on single portraits. Critically: NO original face pixels enter the diffusion (ArcFace embedding is semantic, landmark stick figure is pure geometry), so SynthID is not transported. Implementation: - New `src/remove_ai_watermarks/instantid_restore.py` mirrors the `photomaker_restore.py` shape (lazy singletons for pipeline + FaceAnalysis, per-face crop + _composite_faces from photomaker_restore). Loads the InstantID community pipeline via `DiffusionPipeline.from_pretrained(custom_pipeline="pipeline_stable_diffusion_xl_instantid")` -- no upstream Python package needed; diffusers fetches the file from its community examples. - New `instantid` extra in pyproject (insightface + onnxruntime + huggingface-hub). NON-COMMERCIAL block in the comment explains why. - CLI: `--restore-faces-method [instantid|photomaker]`, default `instantid`. Both methods explicitly labeled NON-COMMERCIAL in the help text. 
- Engine: dispatch on `restore_faces_method` to either `_restore_faces_instantid` or `_restore_faces_photomaker`. - 9 control-flow tests for InstantID without model download (mirror the photomaker_restore.py test pattern + draw_kps helper checks). 587/587 pass. Diffusers-0.38 compat verified by upstream code inspection: the InstantID pipeline inherits from `StableDiffusionXLControlNetPipeline`, uses only public diffusers APIs (`encode_prompt`, `prepare_image`, `prepare_latents`, `get_guidance_scale_embedding`), uses legacy attention processor API which diffusers preserves for backward compat. No PhotoMaker-V1-style internal text_encoder access. End-to-end execution will be validated by the Modal cert sweep in the next step. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 19:44:17 -07:00
Victor Kuznetsov	c486badaa8	fix(photomaker-v2): render at SDXL native 1024, use upstream prompt + neg_prompt The 9-face grid + single-face cert outputs were still mosaic of training-time faces even after the id_embeds shape fix. WebFetch of the upstream inference_pmv2.py revealed three mismatches: 1. SDXL at width=height=512 falls into its low-res failure mode (small-detail collage / mosaic) on the V2 LoRA. Render at native 1024 then downscale into the original face bbox at composite time. 2. Upstream prompt is descriptive ("instagram photo, portrait photo of a woman img, colorful, perfect face, natural skin, hard shadows, film grain, best quality"). Our generic prompt let SDXL drift away from the ID embedding. Adopted the upstream pattern. 3. Upstream V2 explicitly passes negative_prompt; the CFG batch-mismatch we hit on V1 isn't a V2 issue. Re-added negative_prompt with the upstream wording (asymmetry/worst quality/etc). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 19:11:48 -07:00
Victor Kuznetsov	b1fed810fd	fix(photomaker-v2): don't pre-unsqueeze id_embeds (the pipeline does it) V2's pipeline forward at line 705 of upstream pipeline.py calls `id_embeds.unsqueeze(0)` itself to add a batch dim, so callers pass a 2-D (N_faces, 512) tensor and the pipeline turns it into 3-D. Upstream inference_pmv2.py shows the canonical form: torch.stack([...]) of per-image embeddings. Our previous call .unsqueeze(0)'d on the way in, which the pipeline then .unsqueeze(0)'d again, giving a (1, 1, 512) shape that the V2 id_encoder consumed as garbage -- the resulting output was a training-time face collage (verified visually 2026-06-04 against tatsunari + gemini_3 + the 9-face grid). Fix: pass torch.stack([torch.from_numpy(embedding)]) -- shape (1, 512) -- so the pipeline's internal unsqueeze gives the expected (1, 1, 512) inside the forward. Don't pre-cast dtype either; the pipeline handles that internally. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 19:03:04 -07:00
Victor Kuznetsov	37817a610f	test(photomaker): stub face_analyser + analyze_faces in the control-flow test The previous commit added a real call into FaceAnalysis2 / analyze_faces inside restore_faces_photomaker, which broke the model-free control-flow test. Stub it: - monkeypatch _get_face_analyser to return a sentinel - install a fake `photomaker` module with analyze_faces returning a single 512-d zero embedding - add dtype=torch.float32 to the fake pipeline class so .to(device, dtype=...) works 11/11 green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 18:51:26 -07:00
Victor Kuznetsov	3d00fed00c	fix(photomaker-v2): compute id_embeds via FaceAnalysis2 before pipeline call The Modal cert sweep against V2 hit the next layer of the API: PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken.forward() missing 1 required positional argument: 'id_embeds' V2 forward takes BOTH the CLIP image embedding (computed inside the pipeline from input_id_images) AND an ArcFace identity embedding (id_embeds) that the caller must compute. The upstream pipeline does NOT auto-compute it -- inference_pmv2.py shows the caller using FaceAnalysis2 + analyze_faces to extract the ArcFace vector from each input ID image and passing id_embeds=torch.stack([...]) into pipe(...). Wired the same flow here: - New _get_face_analyser() singleton (double-checked lock) builds FaceAnalysis2(['CUDAExecutionProvider' | 'CPUExecutionProvider']).prepare(...). This is the non-commercial step (antelopev2/buffalo_l auto-download on first use). Module docstring already calls it out. - Per face: analyze_faces() -> torch.from_numpy(embedding) -> .unsqueeze(0) to match the pipeline's expected (B, D) shape, casting to pipeline.device/dtype. Faces InsightFace can't detect inside the crop get skipped (the most likely cause would be the diffusion-cleaned face being too small or stylised after the main pass; YuNet already gated us into having a face per crop, so this should be rare). - id_embeds= keyword threaded into the pipeline call site alongside the existing input_id_images=. Tests untouched (the V1-only safety guard was already removed in the previous commit when we swapped V1->V2; the existing 11 tests still pass). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 18:49:10 -07:00
Victor Kuznetsov	65de8df5c5	refactor(face-restore): drop GFPGAN, ship PhotoMaker-V2 as the sole restore (non-commercial) Visual review of the GFPGAN-on-cleaned output (9-face grid, 1448x1086) showed it only polished the already-drifted face without restoring identity — useless for the "restore who is in the photo" intent. Dropping it. The shipped restore path is now PhotoMaker-V2, which delivers true identity-from- embedding face regeneration via a CLIP+ArcFace dual encoder. The ArcFace branch pulls InsightFace antelopev2/buffalo_l model packs at runtime, which InsightFace releases under a research-only license, so the whole extra is NON-COMMERCIAL. raiw.cc and any monetized deployment must NOT install the `photomaker` extra. This is called out at every entry point: CLI flag help, module docstring, pyproject extra block, CLAUDE.md extras bullet, README install snippet. Changes: - Deleted `src/remove_ai_watermarks/face_restore.py` and its tests. - Deleted the `restore` extra (gfpgan/facexlib/basicsr + scipy<1.18 / numba<0.60 pins) and the basicsr setuptools<69 build pin from pyproject.toml. - Restored `src/remove_ai_watermarks/photomaker_restore.py` (V2 this time: `TencentARC/PhotoMaker-V2`, `photomaker-v2.bin`, no `pm_version='v1'` override). - Restored the `photomaker` extra in pyproject with all the upstream-compat pins (einops, peft, onnxruntime, insightface) and the `allow-direct-references` hatch metadata block. - `InvisibleEngine` swapped `_restore_faces` -> `_restore_faces_photomaker`; `--restore-faces-method` removed (only one method, no choice). - CLI flag help, CLAUDE.md, README, docs/synthid.md, and docs/controlnet-removal-pipeline-research.md all updated. - docs/synthid-robust-identity-research.md status notice rewritten to list both abandoned commercial-safe attempts (V1 + GFPGAN-on-cleaned) and the non-commercial trade-off we accepted. ruff + strict pyright(src/) clean; 578 tests pass (the 9 GFPGAN tests are gone, the 11 PhotoMaker tests stay green). 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 18:41:01 -07:00
Victor Kuznetsov	01fe98bf54	refactor(face-restore): rollback PhotoMaker, restore GFPGAN on the CLEANED image After 7 cascading upstream-compat fixes (insightface dep, peft dep, pm_version, device, etc.), the PhotoMaker V1 cert sweep still hit a CFG batch-dim mismatch inside the denoising loop. The upstream PhotoMaker `pipeline.py` is forked from diffusers v0.29.1 and our env runs 0.38; SDXL prompt-encoder handling changed significantly between those versions, so making PhotoMaker work end-to-end needs a proper fork or a diffusers downgrade — both expensive. Not worth shipping today. Pivot: restore `face_restore.py` (GFPGAN) with a single-line fix that makes it SynthID-safe by construction. The previous design ran GFPGAN.enhance on the ORIGINAL watermarked image and was oracle-confirmed to re-add SynthID via the weight-0.5 pixel blend. The fix is to run GFPGAN on the diffusion-CLEANED image — whatever pixels GFPGAN derives from are already SynthID-free, so the partial blend cannot transport the watermark. Identity fidelity is lower than a true identity-as-embedding stack would deliver, but it ships and works. Changes: - `src/remove_ai_watermarks/face_restore.py` restored from pre-wipe state with one line changed: `restorer.enhance(cleaned_bgr, ...)` instead of `restorer.enhance(original_bgr, ...)`. `original_bgr` is kept as an unused positional argument for API stability. - `src/remove_ai_watermarks/photomaker_restore.py` and its tests REMOVED. The research note (`docs/synthid-robust-identity-research.md`) keeps a "status notice" documenting why PhotoMaker is parked for now and what the path back in would look like. - `pyproject.toml` `restore` extra restored (gfpgan/facexlib/basicsr + scipy<1.18 + numba<0.60 pins + the basicsr setuptools<69 build pin), plus `photomaker` extra (with its einops/insightface/peft pile) and the `[tool.hatch.metadata] allow-direct-references = true` block REMOVED. 
- `InvisibleEngine._restore_faces_photomaker` removed; `_restore_faces` restored. The `--restore-faces` CLI flag and its plumbing through cmd_* signatures are unchanged. - CLAUDE.md, README.md, docs/synthid.md, docs/controlnet-removal-pipeline-research.md updated to describe the shipped GFPGAN-on-cleaned design and to reference PhotoMaker only as the parked alternative. ruff + strict pyright(src/) clean; 578 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:55:45 -07:00
Victor Kuznetsov	d1b85ee6a8	fix(photomaker): drop explicit negative_prompt to fix CFG batch mismatch Modal cert sweep #6 made it INTO the denoising loop and died with "Sizes of tensors must match except in dimension 1. Expected size 2 but got size 1 for tensor number 1 in the list." In the PhotoMaker pipeline's denoising loop, the per-step embeddings are built as torch.cat([negative_prompt_embeds, prompt_embeds(_text_only)], dim=0). The text-encoder + ID-encoder flow can leave the negative branch at batch=2 and the ID-injected branch at batch=1 when a custom negative_prompt is passed, so the cat fails. The upstream gradio demo just passes no negative_prompt and relies on the pipeline's empty default; do the same. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:35:40 -07:00
Victor Kuznetsov	031c38dc7f	fix(photomaker): place id_encoder on the right device + dtype Modal cert sweep #5 made it through component load (V1 id_encoder + lora_weights) and died at inference with the classic "Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor) should be the same" — id_encoder lived on CPU/fp32 while the rest of the pipeline ran on CUDA/fp16. Two fixes: 1. Call `pipe.to(device)` BEFORE `load_photomaker_adapter` so the loader picks the right device/dtype from `self.device` / `self.unet.dtype` when it builds the encoder. 2. Belt: after load, explicitly `pipe.id_encoder.to(device, dtype)` because some torch/diffusers combos leave custom attributes on the old device even when `pipe.to` ran first. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:29:00 -07:00
Victor Kuznetsov	9435e12ce6	fix(photomaker extra): add peft dep (required by pipe.fuse_lora) Modal cert sweep #4 got further -- PhotoMaker V1 components actually loaded ("Loading PhotoMaker v1 components [1] id_encoder ... [2] lora_weights") -- and died on the next step: "PEFT backend is required for this method." That's diffusers' fuse_lora call gated on the peft library, which PhotoMaker doesn't declare in its install_requires either. Pin peft>=0.10.0 in the photomaker extra. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:23:32 -07:00
Victor Kuznetsov	1fb2a64b56	fix(photomaker): pass pm_version='v1' to load_photomaker_adapter Modal cert sweep #3 ran past the `insightface` import error and into a real state_dict mismatch: Error(s) in loading state_dict for PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken: Missing key(s) ... qformer_perceiver.token_proj.0.weight ... The upstream `load_photomaker_adapter` defaults to `pm_version='v2'` regardless of the .bin file passed -- the loader builds a V2 encoder (PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken) and then tries to load V1 weights into it. We must pass `pm_version='v1'` explicitly so the loader instantiates the CLIP-only PhotoMakerIDEncoder. The pipeline-level `input_id_images` API is the same across V1 and V2, so the call site does not change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:18:52 -07:00
Victor Kuznetsov	860bde4a26	fix(photomaker extra): pin `insightface` for import resolution (MIT code only) The upstream PhotoMaker package's `__init__.py` unconditionally imports a face-analyser class from its `insightface_package` submodule, so JUST importing `PhotoMakerStableDiffusionXLPipeline` (the V1 pipeline class we use) raises `ModuleNotFoundError: No module named 'insightface'` if insightface isn't present in the env. The Modal cert sweep caught this on the V1 image. Resolution: pin `insightface>=0.7.3` (and its `onnxruntime` runtime dep) in the `photomaker` extra. The PyPI insightface package is MIT-licensed CODE; the non-commercial restriction sits on the pretrained model packs (antelopev2, buffalo_l) which download only when `FaceAnalysis()` is instantiated. Our V1 path never instantiates the face-analyser -- it loads photomaker-v1.bin (CLIP-only encoder) via `load_photomaker_adapter` -- so the model-pack license does not bind us; we depend only on the MIT code for the import to resolve. Safety guards: - Runtime check in `_get_pipeline`: raises if `_PHOTOMAKER_FILE` is ever pointed at v2 (so a future maintainer can't silently regress to the InsightFace path). - New test class `TestV1OnlyCommercialSafetyGuard`: asserts repo + filename pin to V1 AND asserts the module source never references the face-analyser class (a static check that our codepath stays out of the runtime that would pull the non-commercial model packs). Docs: documented the import dance + legal split inline at the top of `photomaker_restore.py`. ruff clean; 581 tests pass (the 9 PhotoMaker tests plus 3 new V1-guard tests). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:13:20 -07:00
Victor Kuznetsov	dfa5181309	fix(photomaker): switch to V1 — V2 actually requires InsightFace (non-commercial) A Modal cert sweep caught what the research doc missed: PhotoMaker-V2 fails at import without InsightFace ("No module named 'insightface'"). Reading the upstream source confirms it: `photomaker/__init__.py` imports `FaceAnalysis2` (an InsightFace wrapper) at module load, V2's encoder is named `PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken`, and `model_v2.py`'s forward takes an `id_embeds` argument that the pipeline computes via `insightface.app.FaceAnalysis(name='antelopev2', ...)`. So V2 is a DUAL encoder (CLIP + ArcFace), not CLIP-only as the model card line "id_encoder includes finetuned OpenCLIP-ViT-H-14 and a few fuse layers" implied. InsightFace's pretrained model packs (antelopev2, buffalo_l) are research/ non-commercial only per their own README: "The pretrained models we provided with this library are available for non-commercial research purposes only." So V2 is blocked for a paid service like raiw.cc. PhotoMaker-V1 is the commercial-safe alternative — its `PhotoMakerIDEncoder` (model.py) forward takes only `(id_pixel_values, prompt_embeds, class_tokens_mask)`, no ArcFace branch. Identity is CLIP-only, license is Apache-2.0, no InsightFace. Code change: swap the repo + filename constants in `photomaker_restore.py` (TencentARC/PhotoMaker, photomaker-v1.bin). Tests still pass (the 9 PhotoMaker tests use a fake pipeline, so the model swap is transparent to them). Doc correction: rewrote the verdict / license table / section 5 of `docs/synthid-robust-identity-research.md` to lead with V1 and add a correction notice explaining the V2 misread. Bulk-renamed `PhotoMaker-V2` to `PhotoMaker-V1` across CLAUDE.md, README.md, docs/synthid.md, and docs/controlnet-removal-pipeline-research.md (kept V2 only in the correction notice, the license table, and the anchor reference). ruff clean; 578 tests pass. 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:05:58 -07:00
Victor Kuznetsov	7e6fc8bfb9	fix(photomaker extra): add einops explicitly (upstream missed it) PhotoMaker imports einops in its forward path but its install_requires doesn't declare it, so the photomaker extra resolved without einops on a clean install and the Modal cert sweep died at the restore-faces step with "No module named 'einops'" -- the post-pass failed gracefully and returned the un-restored cleaned output, so the cert artifact had no face recovery. Pin einops>=0.7.0 in the photomaker extra so the extra is self-contained. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 15:46:28 -07:00
Victor Kuznetsov	439eeadc07	refactor(face-restore): wipe GFPGAN path, --restore-faces is PhotoMaker-only The GFPGAN `restore` extra and its `face_restore.py` module are gone. They were oracle-confirmed to re-introduce SynthID by blending watermarked original face pixels at fidelity weight 0.5 (clean A/B: gemini_3 controlnet 0.20 detected WITH GFPGAN, clean WITHOUT). Keeping them as the default restore method was a footgun for the removal pipeline. PhotoMaker-V2 (added in the previous commit) is the single shipped restore path now -- identity-as-embedding, SynthID-safe by construction. Removed: - src/remove_ai_watermarks/face_restore.py + tests/test_face_restore.py - pyproject.toml `restore` extra (gfpgan/facexlib/basicsr + scipy/numba pins) - pyproject.toml `[tool.uv.extra-build-dependencies] basicsr = [...]` build pin - CLI: `--restore-faces-method` and `--restore-faces-weight` (no method choice to make, no GFPGAN weight knob to expose) - InvisibleEngine._restore_faces method (only _restore_faces_photomaker remains) - All restore-faces-method / restore-faces-weight threading through cmd_* signatures and _process_batch_image Kept: - `--restore-faces / --no-restore-faces`: now binds to PhotoMaker-V2. - All adopted oracle findings about GFPGAN re-introducing SynthID (kept in the research docs as historical context that explains why the path was removed). Docs updated: CLAUDE.md (restore extras bullet collapsed to photomaker, removed face_restore Key-modules bullet, several inline GFPGAN refs scrubbed), README.md (face-identity callout + install section now point to the photomaker extra), docs/synthid.md 5.5 (net recipe), docs/controlnet-removal-pipeline-research.md (recommendations). ruff + strict pyright (src/) clean; 578 tests pass (the 9 GFPGAN tests are gone, the 9 PhotoMaker tests stay green). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 15:35:37 -07:00
Victor Kuznetsov	1439eb0714	feat(photomaker): SynthID-safe face-identity restoration via PhotoMaker-V2 Adds the second face-restore mechanism, selectable via the new CLI option `--restore-faces-method=photomaker`. Unlike the existing GFPGAN path (which runs on the watermarked ORIGINAL and was oracle-confirmed to re-introduce SynthID by partial pixel blending), PhotoMaker carries identity in a SynthID-invariant OpenCLIP embedding and regenerates fresh face pixels conditioned on it -- the pixels in the output are diffusion-fresh, so the watermark cannot be transported. The load-bearing assumption (embedding invariance to SynthID-magnitude pixel noise) was empirically validated in the prior commit (smoke test): cosine drift 0.002 under a ±2 LSB low-freq carrier, an order of magnitude less than JPEG90 drift which SynthID survives at >=99% TPR. End-to-end commercial-safe: - PhotoMaker-V2 weights: Apache-2.0 (TencentARC) - ID encoder: OpenCLIP-ViT-H/14 (MIT) - SDXL base: shared with the main pipeline - NO InsightFace (the non-commercial blocker for IP-Adapter FaceID / InstantID / PuLID / Arc2Face) Two-pass architecture (PhotoMaker has no ControlNetImg2img class in diffusers): 1) main controlnet/default removal pass cleans SynthID + drifts faces 2) PhotoMaker txt2img regenerates each face from its embedding, feather-composited back into the cleaned image New module `photomaker_restore.py` mirrors `face_restore.py`: lazy pipeline singleton (double-checked lock), `is_available()` gate, pure `_face_crop_square` and `_composite_faces` helpers, all unit-tested without the model (9 new tests). New `InvisibleEngine._restore_faces_photomaker` runs after the diffusion pass, mirroring `_restore_faces`. CLI flag `--restore-faces-method=[gfpgan|photomaker]` threaded through `cmd_invisible`/`cmd_all`/`cmd_batch` + `_process_batch_image`. New optional `photomaker` extra (Apache-2.0 + Apache-2.0/MIT deps, no basicsr). 
`[tool.hatch.metadata] allow-direct-references = true` is required because the upstream PhotoMaker package lives only on GitHub. The next step (separate work) is oracle validation: run a 6-image cert sweep through the new pipeline (default/controlnet at the certified strength + --restore-faces-method=photomaker) and confirm SynthID stays clean while face identity is recovered. The required infrastructure (`raiw-app/modal_cert.py`) is already in place. ruff + strict pyright(src/) clean; 586 tests pass (+ 9 new in tests/test_photomaker_restore.py). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 15:20:29 -07:00
Victor Kuznetsov	f8f247308b	docs(identity): smoke test confirms OpenCLIP embedding is invariant to SynthID-magnitude noise Empirical confirmation of the load-bearing assumption in the PhotoMaker-V2 path: the identity embedding cannot transport an invisible pixel watermark. Tested OpenCLIP-ViT-H/14 (laion2B-s32B-b79K — the same encoder PhotoMaker-V2 fine-tunes) on 31 face crops from gemini_3/gemini_4/openai_3 grid. cosine similarity between embed(orig) and embed(perturbed): - synthid_proxy (±2 LSB low-frequency noise, the regime SynthID actually lives in): mean 0.9977, min 0.9937. Embedding moves by 0.002 — an order of magnitude less than JPEG90 (mean 0.928), which SynthID survives at >=99% TPR by design. - noise3 / jpeg70 / blur1: 0.89-0.95, all clearly above the SynthID floor. - self check: 1.0000 (pipeline sane). So the embedder discards exactly the dimensions SynthID hides in. PhotoMaker-V2 conditioned on a watermarked face will see the same identity vector as a clean face of that person, so the generated face inherits identity, not the watermark. This unblocks step 2 of the research plan: prototype PhotoMaker-V2 in the controlnet pipeline. The previously logged ad-hoc "cos(orig, SDXL-cleaned)" numbers (0.56-0.93) measured diffusion drift, not watermark invariance, and are not relevant to the hypothesis. Docs only. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 15:05:15 -07:00
Victor Kuznetsov	310ce912ba	docs: SynthID-robust identity research -- PhotoMaker-V2 is the only commercial-safe SDXL stack After GFPGAN restore was oracle-confirmed to RE-INTRODUCE SynthID (it is a fidelity-restoration net conditioned on the watermarked input), the only identity path that will not transport the watermark is identity-by-EMBEDDING: a semantic vector that conditions a fresh generation. That requires a face-recognition / ArcFace-class or CLIP-image embedder. Verified the license stack of every credible 2025-2026 SDXL identity adapter by fetching primary sources directly (HuggingFace model cards, insightface.ai): - IP-Adapter FaceID family, InstantID, PuLID, Arc2Face -> all blocked. Each depends at runtime on InsightFace's antelopev2/buffalo_l ArcFace packs, and insightface.ai explicitly states "Code is MIT licensed; models require separate commercial licensing." IP-Adapter FaceID's own model card flags itself non-commercial for the same reason. - PhotoMaker-V2 is the single commercial-safe end-to-end stack today: Apache-2.0 adapter weights with identity encoded as a fine-tuned OpenCLIP-ViT-H/14 (the model card's exact phrase: "id_encoder includes finetuned OpenCLIP-ViT-H-14 and a few fuse layers"). No InsightFace. Mechanistic argument that an identity embedding cannot transport SynthID: the embedder is trained to be invariant to low-amplitude pixel changes (JPEG, resize, brightness, noise), which is exactly the regime SynthID hides in by design. So the embedding extracted from a watermarked face should be ~identical to the embedding from the cleaned face, and the embedding cannot carry the watermark into a freshly generated face. Flagged explicitly as not-yet-measured -- the first integration step is a cosine-similarity smoke test (no codegen) before investing in a PhotoMaker prototype. 
Process note: the deep-research harness was run but its verifier subagents failed to call StructuredOutput (same harness bug as a prior session), so its synthesis was unusable; the license claims here are direct quotes from the primary sources, fetched and verified, not from the workflow synthesis. Docs only. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 14:58:11 -07:00
Victor Kuznetsov	be14eca207	docs: certified controlnet strength floors from the Modal GPU oracle sweep Ran the isolated raiw-controlnet-cert Modal app (raiw-app/modal_cert.py) over a strength x seed grid, restore OFF, --max-resolution 1536, each vendor checked on its OWN oracle (OpenAI -> openai.com/verify, Gemini -> the Gemini app). Certified controlnet SynthID-removal floors: - OpenAI 0.20: 2 photoreal images (9-face grid + bracelet) x seed {1,2,3} = 6/6 clean; the bracelet that flipped at 0.15 is seed-robust at 0.20. Transfers to prod (OpenAI removal is resolution-independent). - Gemini 0.30: 0.20 detected -> 0.30 clean on 2/2 seeds (hardest face). Holds only at <= 1536; Gemini is resolution-sensitive and raiw.cc runs NATIVE, so cap Gemini <= 1536 + use 0.30, or native-calibrate (~0.35+). Prod recipe recorded: controlnet + a controlnet-specific per-vendor schedule in resolve_strength (OpenAI 0.20 / Gemini 0.30, NOT the default 0.10/0.15 ladder) + FIXED prod seed (kills the near-threshold non-determinism) + restore reworked/off. Added to docs/controlnet-removal-pipeline-research.md (certified floors table), docs/synthid.md 5.5, and the CLAUDE.md controlnet bullet. Docs only. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 12:44:56 -07:00
Victor Kuznetsov	d38b9a6122	docs: correct controlnet/restore SynthID-removal claims from the 2026-06-04 oracle pass Oracle validation (openai.com/verify + the Gemini app) overturned three claims that were on main, and consolidates the controlnet findings into one authoritative place. - controlnet does NOT reliably remove SynthID at the low vendor-adaptive strength: removal is content x pipeline dependent and the survivors FLIP by content type (photoreal survives controlnet / clears default; flat graphic survives default / clears controlnet; flat text clears both). Root cause is insufficient strength, not the pipeline; controlnet needs a higher, per-vendor floor than default. - removal near the threshold is SEED-non-deterministic (same image+pipeline+strength can pass or fail run-to-run); a single clean run does not certify a strength. - `--restore-faces` RE-INTRODUCES SynthID: GFPGAN runs on the ORIGINAL watermarked face at weight 0.5 and composites it back over the cleaned result (clean A/B: a Gemini face stayed detected through controlnet 0.15/0.20/0.25 WITH restore, cleared at 0.20 with --no-restore-faces). The old "GFPGAN scrubs SynthID" claim was wrong. Corrected in CLAUDE.md (watermark_remover controlnet bullet, controlnet Known-limitations bullet, face_restore bullet, vendor-adaptive strength bullet) and docs/synthid.md (5.1 controlnet/face-identity, 5.2 strength floors, new 5.5 oracle validation log). docs/controlnet-removal-pipeline-research.md gains an authoritative "Oracle validation 2026-06-04" section that the others point to as the single source. Docs only; no code change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 12:22:43 -07:00
Victor Kuznetsov	3aea21e632	feat(visible): Samsung Galaxy AI mark removal (bottom-left reverse-alpha, #37) New samsung_engine.py mirrors the jimeng engine but anchors bottom-left; wired into watermark_registry, the CLI (--mark samsung / auto), and identify (visible_samsung, medium). visible_alpha_solve.py gains a corner=bl mode; samsung_alpha.png solved from @f-liva's flat captures. Calibrated for the Italian "Contenuti generati dall'AI" variant. Flat black/gray/white captures committed, real photos gitignored. Tests + docs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 10:27:44 -07:00
Victor Kuznetsov	6f4aa4c7b1	fix(invisible): retry in fp32 on a degenerate fp16 output (#41) The fp16-fix VAE swap (#29) is gated to the default SDXL checkpoint, so a custom model_id, a stale pre-fix install, or a fal/custom loader can still decode to an all-black/NaN frame in fp16 (reporter: gpt-image 1448x1086, the `image_processor.py invalid value encountered in cast` warning). Add a model-agnostic backstop in remove_watermark: after generation, if the run was fp16 and the output is degenerate (_is_degenerate_image: near-zero mean and variance), rebuild the pipeline in fp32 on the same device and re-run once. fp32 is the verified-clean path, so a black image is never returned regardless of model_id or version. Mirrors the MPS->CPU fallback's self-mutation pattern; batch inherits it. Verified e2e on MPS by forcing fp16 with the swap disabled (first pass black, guard fired, retry clean). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 17:43:27 -07:00
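The fp16 backstop described in 6f4aa4c7b1 can be sketched roughly as follows. This is a minimal, stdlib-only illustration, not the repository's code: the thresholds `MEAN_EPS` and `VAR_EPS`, the flattened-pixel representation, and the `generate` callback are all assumptions made for the example (the commit only says "near-zero mean and variance").

```python
import math
from typing import Callable, Sequence, Tuple

Pixels = Sequence[float]  # flattened pixel values in [0, 255] (assumed representation)

# Hypothetical thresholds; the commit does not state the real values.
MEAN_EPS = 1.0
VAR_EPS = 1.0


def is_degenerate(pixels: Pixels) -> bool:
    """True for an empty, NaN-containing, or essentially all-black frame."""
    if not pixels or any(math.isnan(p) for p in pixels):
        return True
    mean = sum(pixels) / len(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return mean < MEAN_EPS and var < VAR_EPS


def run_with_fp32_retry(
    generate: Callable[[str], Pixels], dtype: str = "fp16"
) -> Tuple[Pixels, str]:
    """Run once; if the run was fp16 and the output is degenerate, re-run once in fp32."""
    out = generate(dtype)
    if dtype == "fp16" and is_degenerate(out):
        dtype = "fp32"
        out = generate(dtype)
    return out, dtype
```

The retry is attempted at most once and only from fp16, so a run that is already fp32 is returned as-is even if it looks degenerate.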
Victor Kuznetsov	ec549b5c55	chore(deps): bump aiohttp 3.13.5 -> 3.14.0 for GHSA-hg6j-4rv6-33pg + GHSA-jg22-mg44-37j8 Targeted `uv lock --upgrade-package aiohttp`; only the aiohttp pin changes (no other package added/removed). Clears the two moderate Dependabot alerts on the transitive aiohttp. The third alert (basicsr GHSA-86w8-vhw6-q9qq, command injection, no patch) is accepted: basicsr is the optional, off-by-default `restore` extra pinned to 1.4.2 as the only buildable version. Imports + targeted suite (identify/metadata/gemini) green after the bump. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 16:57:08 -07:00
Victor Kuznetsov	2c0b174dfa	fix(gemini): self-verify repair for under-removed sparkles After reverse-alpha, re-detect the sparkle; when one survives at or above the registry fail line (conf >= 0.5) -- an alpha mismatch the per-image gain estimate could not fully correct -- inpaint the footprint and keep that only when it lowers the re-detect confidence. The footprint inpaint reconstructs the slot from its darker surroundings, so it physically removes the bright sparkle; purely additive, the common clean removal re-detects below 0.5 and is returned untouched. Measured on the spaces visible-removal audit: gemini removal-audit failures drop 15 -> 11 (4 genuine rescues), doubao 65/65 and jimeng 11/11 unchanged, zero regressions on the 468 already-clean removals. An offset+scale alignment search was prototyped on the remaining 11 fails and rejected: an audit "ceiling" suggested +4 more, but those were NCC-gaming -- the lower-scoring placement left the sparkle as bright or brighter, just reshaping the residual so the contrast-invariant shape-NCC scored lower (a5a9: first-pass slot ~76 at background level vs the "aligned win" ~164). A brightness sanity check rejected every one, so it contributed nothing and was removed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 16:45:18 -07:00
Victor Kuznetsov	6d11c11b52	feat(auto): DBNet text detector, Real-ESRGAN upscaler, batch --auto Three content-quality features for the invisible/all/batch pipeline. DBNet text detector (auto_config): replace the MSER text heuristic with PP-OCRv3 differentiable-binarization via cv2.dnn.TextDetectionModel_DB, using a bundled 2.4 MB Apache-2.0 model (en/cn detection nets are byte-identical, so it ships language-neutral). cv2.dnn is core OpenCV, so no new pip dep. MSER stays as the fallback when the model can't load. Validated on real images: matches MSER everywhere and additionally catches the Doubao CJK mark MSER missed; routing decisions unchanged otherwise. Real-ESRGAN upscaler (new upscaler.py, esrgan extra): optional pre-diffusion super-resolution for the min-resolution floor upscale, loaded via spandrel (MIT, no basicsr) with BSD-3-Clause weights downloaded on first use. New --upscaler {lanczos,esrgan} on invisible/all/batch; default stays lanczos and the engine falls back to lanczos when the extra is absent or the model errors (never breaks removal). It is a manual opt-in knob (the auto plan never selects it) -- as a generic GAN it sharpens photo/texture content strongly but can degrade faces (the diffusion pass regenerates them) and thin text, documented accordingly. batch --auto: wire the content-adaptive --auto (+ --adaptive-polish) into cmd_batch. The plan is recomputed per image and the invisible engine is cached per resolved pipeline (default/controlnet), so a mixed directory builds at most one engine of each kind. Verified end-to-end: 3 mixed images routed correctly with only 2 pipeline loads (controlnet reused). ruff + strict pyright(src/) clean; 558 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-04 16:04:33 -07:00
Victor Kuznetsov	4a6cd71ab2	Merge branch 'claude/silly-northcutt-c2bf06': unify C2PA vendor registry + code-health + uv publish Brings in commit `5cf68a6` (single C2PA_AI_VENDORS registry, erase_lama grayscale/BGRA support, batch device-cache clearing + --controlnet-scale, uv publish via OIDC, hatchling pin <1.31). Auto-merged with no conflicts; ruff/pytest(544)/pyright all clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 22:10:25 -07:00
Victor Kuznetsov	32a0779e1d	fix(gemini): demote sparkle false positives with a core-brightness gate detect_watermark's shape-only NCC (spatial/gradient/var fusion) fires on ornate or flat content (text strips, banners, hatching) that coincidentally matches the diamond shape. The NCC is contrast-invariant, so it cannot see the defining property of a real Gemini sparkle: a bright WHITE overlay whose core sits above the local background. The fusion now demotes (caps confidence to 0.30) a match that is BOTH low-confidence (< _SPARKLE_FP_CONF 0.65) AND has a low core-ring brightness margin (_core_ring_margin < _SPARKLE_FP_MARGIN 5). Real sparkles escape via EITHER high confidence (white-bg sparkles score >=0.79 despite a low margin) OR high margin (dark/mid backgrounds, incl. the #36 faint-corner case), so both must fail to demote. The gate is monotonic -- it only removes detections, never adds -- so it cannot regress the verified-negative corpus (already 0 FPs). On the spaces corpus it demoted 16/495 flagged sparkles (13 no AI metadata = content FPs; the 3 AI-meta ones were visually FPs / a near-invisible white-on-white sparkle whose AI verdict is held by metadata), and dropped the removal-audit failures 20 -> 15. - _core_and_bg shared helper (core 75th-pct brightness vs background-ring median); _estimate_alpha_gain refactored onto it, new _core_ring_margin wrapper. - TestSparkleFalsePositiveGate: margin high/low, strong-sparkle kept (incl. on white via high conf), blurred no-core blob demoted. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 22:02:28 -07:00
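The demotion rule in 32a0779e1d reduces to a two-condition gate. The sketch below uses the constants quoted in the commit (0.65, 5, 0.30); the function name `gate_confidence` and the assumption that the margin is measured in 8-bit brightness levels are illustrative only.

```python
SPARKLE_FP_CONF = 0.65   # commit: _SPARKLE_FP_CONF
SPARKLE_FP_MARGIN = 5.0  # commit: _SPARKLE_FP_MARGIN (units assumed: brightness levels)
DEMOTED_CONF = 0.30      # commit: demoted matches are capped to this confidence


def gate_confidence(conf: float, core_ring_margin: float) -> float:
    """Cap the confidence only when BOTH the confidence and the margin are low."""
    if conf < SPARKLE_FP_CONF and core_ring_margin < SPARKLE_FP_MARGIN:
        # min() keeps the gate monotonic: it can lower a score, never raise it.
        return min(conf, DEMOTED_CONF)
    return conf
```

A white-background sparkle (high confidence, low margin) and a faint dark-background sparkle (low confidence, high margin) both pass through unchanged, which matches the "EITHER ... OR" escape described in the commit.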
Victor Kuznetsov	b686dbdd79	feat(auto): adaptive detail-targeting polish + --adaptive-polish flag The fixed mild auto polish (unsharp 0.5 / grain 2.0) under-corrected soft photo/face output (gemini_3 stayed at lap-var 84 vs its 592 original) and its grain speckled small text. Replace it with humanizer.adaptive_polish: target the input's Laplacian variance with a capped unsharp scaled to the deficit + edge-masked grain (smooth regions only), calibrated by a short sigma search. Self-limiting on text/graphics -- already high-frequency, so almost no polish lands and text edges are masked out. Validated on the spaces corpus (gemini_3 84 -> 334 end-to-end; openai_1 text near-untouched). Interface: every --auto decision is now independently overridable -- add --adaptive-polish/--no-adaptive-polish (matching --restore-faces; works without --auto too) so the polish can be disabled or used manually. _apply_auto overrides exactly the three content-adaptive modes (pipeline, restore-faces, adaptive-polish); --unsharp/--humanize stay independent fixed filters. cv2-only, no new deps. Threaded through invisible/all (not batch). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 21:49:08 -07:00
Victor Kuznetsov	5cf68a6a3d	refactor: unify C2PA vendor registry + code-health fixes + uv publish Three P2 cleanups from a library-wide review. Detection -- single C2PA_AI_VENDORS registry (noai/constants.py): - C2PA_ISSUERS, SYNTHID_C2PA_ISSUERS, and identify._ISSUER_PLATFORM now derive from one C2paAiVendor table, so adding a C2PA vendor is one entry instead of edits in three places across two files. Behavior-identical (262 detection tests pass; the kept `needle` field is load-bearing -- it differs from `org` for Google and ByteDance, with no mechanical derivation). Code-health: - region_eraser.erase_lama now accepts grayscale/BGRA like erase_cv2 (it crashed on grayscale and silently dropped alpha on BGRA). +2 regression tests. - batch frees the device cache between images via a shared try_empty_device_cache helper (generalized from the MPS-only _try_clear_mps_cache, now reused by both the MPS->CPU fallback and the batch loop). - batch gained --controlnet-scale (parity with invisible/all). CI / packaging: - publish.yml uploads via `uv publish` (PyPI trusted publishing over OIDC), replacing pypa/gh-action-pypi-publish so uploads no longer depend on that action's bundled twine accepting the Metadata-Version. Workflow filename + pypi environment unchanged, so PyPI's trusted-publisher entry still matches. - hatchling pin relaxed <1.28 -> <1.31 (verified against hatch's changelog: 1.30.0 made Metadata 2.5 the default, 1.30.1 reverted to 2.4; 1.27-1.29 were always 2.4). Kept as belt-and-suspenders so the first uv-publish release ships 2.4, isolating the uploader swap from the metadata-version bump. Docs (CLAUDE.md, pyproject) synced; corrected the inaccurate "hatchling 1.28+ emits 2.5" note. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 21:01:07 -07:00
Victor Kuznetsov	9bd2c17cc4	feat(auto): content-adaptive --auto quality mode, Phase 1 Add `auto_config.plan(image_path) -> AutoConfig`, the first step of the invisible/all pipeline: it inspects the input image (before the diffusion model loads) and picks the quality modes so the run adapts to content. Quality-priority routing -- ControlNet (text/face-structure preservation) is the default, skipped for plain SDXL only on a clearly structure-less image; GFPGAN face restore when a face is present; a mild sharpen + grain polish when a smoothing pass ran. Exposed as `--auto` on `all`/`invisible` (`_apply_auto`; explicit flags override via click's parameter source). Not wired into batch (its engine is cached per-mode). Detection is cv2-only and torch-free (~100 MB peak RSS, a few ms): OpenCV YuNet (`cv2.FaceDetectorYN`, MIT, 232 KB model bundled in assets/) for faces, a Canny edge-density + MSER heuristic for text/structure (a rough Phase-1 placeholder; DBNet via cv2.dnn is the planned upgrade). ZERO new pip deps. Designed to run wherever the pipeline runs -- the raiw.cc Modal GPU worker -- never on the 512 MB web host. Real-ESRGAN-via-Spandrel upscaling (a new `esrgan` extra) and an adaptive Laplacian-variance polish are deferred to later phases. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 20:52:17 -07:00
Victor Kuznetsov	ea59bdc3e2	chore(scripts): add invisible-removal quality audit tool Pairs <hash>_src / <hash>_clean outputs, computes SSIM + detail/resolution proxies, ranks the worst-preserved images for visual classification. Used to characterize the classes the SDXL scrub degrades (line-art, faces, dense text). Operates on gitignored data/spaces only; writes nothing tracked. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 19:56:49 -07:00
Victor Kuznetsov	e7fb64dca1	fix(gemini): remove more-opaque sparkles via per-image alpha gain The captured sparkle alpha peaks ~0.51, but some real Gemini sparkles are rendered more opaque. The fixed-alpha reverse blend then UNDER-subtracts and leaves a bright residual the detector still fires on. A visible-removal audit through the registry path on the spaces corpus showed this as a meaningful fraction of marks -- all under-removals, not a background-brightness class (failures and successes had the same input confidence and background luma; the discriminator was the removal delta itself). remove_watermark now estimates a per-image alpha gain (_estimate_alpha_gain: effective sparkle opacity at the bright core vs the local background ring, a_eff/a_cap, clamped [1.0, 1.94]) and scales the alpha to match before the over-sub/blend branch. A 1.05 deadband keeps a sparkle that already matches the capture byte-identical to the pre-fix output, so the fix is purely additive (0 regressions on the audit set; failures dropped substantially). The over-sub guard still runs on the scaled alpha as the safety net for an over-shoot. - _estimate_alpha_gain + _ALPHA_GAIN_MAX/_DEADBAND/_CORE_FRAC in gemini_engine. - TestUnderSubtractionGain asserts on footprint pixels, NOT the detector (its NCC is degenerate on a flat synthetic bg; the real corpus removal drops the detector ~0.80 -> ~0.27). - scripts/visible_removal_audit.py: the detect -> remove -> re-detect audit tool that found and validated this (operates on gitignored data/spaces only). - CLAUDE.md + README: document the under-subtraction gain. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 19:48:40 -07:00
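One way to read the per-image alpha gain in e7fb64dca1 is shown below. The clamp range [1.0, 1.94], the 1.05 deadband, and the ~0.51 captured alpha peak come from the commit; the white-overlay blend model used to derive `a_eff`, and the function signature, are assumptions for illustration and may differ from `_estimate_alpha_gain`.

```python
ALPHA_GAIN_MAX = 1.94  # commit: upper clamp
DEADBAND = 1.05        # commit: gains below this leave the alpha untouched


def estimate_alpha_gain(
    core: float, background: float, alpha_cap: float = 0.51, logo: float = 255.0
) -> float:
    """Return a_eff / a_cap, clamped to [1.0, ALPHA_GAIN_MAX].

    Assumed model: core = (1 - a_eff) * background + a_eff * logo,
    so a_eff = (core - background) / (logo - background).
    """
    headroom = logo - background
    if headroom <= 0:
        return 1.0  # background already at logo brightness: nothing to estimate
    a_eff = (core - background) / headroom
    gain = a_eff / alpha_cap
    if gain < DEADBAND:
        return 1.0  # matches the capture closely enough; keep the original alpha
    return min(gain, ALPHA_GAIN_MAX)
```

Under this model the maximum scaled alpha is 0.51 x 1.94, which stays just below 1.0, so the reverse blend never divides by zero.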
Victor Kuznetsov	d7e4fe8835	feat(invisible): upscale-floor for small inputs + unsharp post-filter Two quality knobs for the SDXL invisible pass: - min_resolution floor (default 1024, --min-resolution): small inputs are upscaled to a 1024px long-side floor before diffusion, since SDXL img2img distorts on a tiny latent (a 381x512 portrait wrecks at native). The output is restored to the original input size, so it is a transparent quality boost; it adds time/memory on small inputs. 0 disables. Extends the pure _target_size helper (now cap-or-floor-or-native, min skipped on a min>max misconfig), unit-tested without a model. - unsharp post-filter (humanizer.unsharp_mask, --unsharp, opt-in default 0): applied LAST, after the GFPGAN face pass (a pre-GFPGAN sharpen would be smoothed back over), to counter the soft/over-smoothed look that diffusion + restoration leave behind (an AI tell). Pairs with --humanize (grain). Both threaded through invisible/all/batch + the module-level helper. Verified end-to-end on a 381x512 portrait: upscaled to 1024, sharpened, restored to 381x512. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 18:30:39 -07:00
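The "cap-or-floor-or-native" sizing from d7e4fe8835 can be sketched as a pure function. This is not the repository's `_target_size`: the signature, the use of 0 to mean "disabled" for the cap, and plain rounding (without snapping to a multiple of 8) are assumptions for the example.

```python
from typing import Tuple


def target_size(
    width: int, height: int, max_resolution: int = 0, min_resolution: int = 1024
) -> Tuple[int, int]:
    """Pick the working size: cap large inputs, floor small ones, else keep native.

    0 disables either bound. A floor above the cap is treated as a
    misconfiguration and the floor is skipped.
    """
    long_side = max(width, height)
    if min_resolution and max_resolution and min_resolution > max_resolution:
        min_resolution = 0  # min > max misconfig: ignore the floor
    if max_resolution and long_side > max_resolution:
        scale = max_resolution / long_side
    elif min_resolution and long_side < min_resolution:
        scale = min_resolution / long_side
    else:
        return width, height
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For the 381x512 portrait mentioned in the commit, this gives a 762x1024 working size; the caller would then resize the result back to 381x512.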
Victor Kuznetsov	a57af5da21	docs(claude): corpus cleaned/ examples must come from a shipped removal method Capture the rule: archive only cleaned outputs from the current default SDXL img2img pass; never archive examples from removed methods (ctrlregen, old text/face protection, FaceID, CodeFormer) or experimental opt-in paths (controlnet, GFPGAN). A removed method's output is not a reproducible example. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 17:13:56 -07:00
Victor Kuznetsov	8523f48fb6	data(corpus): archive June 2026 SynthID strength-study subjects Back docs/synthid.md section 2.2 with the actual test set: the per-image oracle-verified subjects were only in a local working dir, while the doc claimed they were recorded in data/synthid_corpus/. Ingest the key pos+cleaned pairs so the claim holds. - pos: openai_1/2/3 originals (gpt-image, openai-verify) + gemini_1/2/3/4 originals (Gemini app, gemini-app); all probe as C2PA-SynthID present. - cleaned: OpenAI at strength 0.05 (openai_2 only s010 captured) + Gemini at 0.15 --max-resolution 1536; oracle: SynthID NOT detected. Metadata stripped, so no C2PA on the cleaned rows. - Excluded the third-party issue #14 image (pic3): oracle-verified but not committed to the public corpus. - docs/synthid.md 2.2: state OpenAI n=4 = 3 archived + 1 external-only. - CLAUDE.md: drop the drift-prone "~65 MB" corpus size from the sdist note. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 17:09:58 -07:00
Victor Kuznetsov	5ec8269949	chore: mark controlnet pipeline + GFPGAN restore-faces as experimental Both content-preservation features are now flagged EXPERIMENTAL and opt-in. --pipeline controlnet was already opt-in (default=default); --restore-faces flips from on-by-default to OFF by default, matching the repo's prior pattern for experimental preservation passes (the removed protect_text/protect_faces). - cli.py: --restore-faces/--no-restore-faces default False; EXPERIMENTAL in the --restore-faces / --controlnet-scale / --pipeline help; batch default False. - invisible_engine.py: remove_watermark restore_faces default False + docstring. - CLAUDE.md / README.md / docs/synthid.md: label both experimental/opt-in. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 16:59:28 -07:00
Victor Kuznetsov	411ef16ec3	feat: GFPGAN face-identity restoration post-pass Add an optional, commercial-safe face-restoration post-pass that recovers face identity the diffusion removal pass drifts (canny holds structure, not likeness) while still scrubbing the pixel watermark in the face regions. - face_restore.py: GFPGANer singleton (CPU unless CUDA), the basicsr torchvision.transforms.functional_tensor shim, and the pure feather _composite_faces helper (unit-tested without the model). GFPGAN re-synthesizes each face from a StyleGAN2 prior, so composited face pixels are GAN-generated (no watermark, no pixel-copy) -- oracle-clean at weight 0.5 with identity preserved. - InvisibleEngine.remove_watermark: restore_faces / restore_faces_weight, best-effort, auto-skips when the extra is absent or no face is detected. - CLI --restore-faces/--no-restore-faces + --restore-faces-weight on invisible/all/batch (on by default). - restore extra (gfpgan/facexlib/basicsr), numpy<2-pinned (scipy<1.18, numba<0.60) and kept out of `all`; basicsr needs Python <3.13 + setuptools<69 to build, so pin .python-version 3.12. Commercial-safe: GFPGAN Apache-2.0, RetinaFace MIT. The CodeFormer alternative is non-commercial and is not shipped. The earlier IP-Adapter FaceID layer was removed (footgun: needs high strength, corrupts faces at the low removal strength). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 16:59:28 -07:00
Victor Kuznetsov	d90d5d886a	feat: controlnet pipeline for text/face-structure preservation Add `--pipeline controlnet` (SDXL base + xinsir canny ControlNet via StableDiffusionXLControlNetImg2ImgPipeline): the canny edge map conditions the img2img regeneration so text and face STRUCTURE stay sharp, while the watermark is still removed by the regeneration (`strength`) -- no original pixels are copied or frozen, so SynthID does not survive. Oracle-verified clean on OpenAI with better text/structure fidelity than plain img2img at equal strength. `--controlnet-scale` tunes structure preservation; fp32 on mps/cpu (fp16-fixed VAE on cuda/xpu). Shares the img2img runner (live progress + MPS->CPU fallback) and the fp16-VAE-fix / device-move helpers with the default pipeline. Remove the superseded subsystems -- ctrlregen (SD1.5 clean-noise), text-protection (differential / region-hires) and face-protection: they either destroyed real content or shielded the watermark by re-using original pixels. controlnet replaces them by regenerating everything under edge conditioning. Canny preserves face structure but not identity; face IDENTITY is a separate face-restoration post-pass (CodeFormer/GFPGAN), researched + prototyped but not yet shipped. An IP-Adapter FaceID attempt was built and removed (footgun: needs high strength, corrupts faces at removal strength). Docs: docs/controlnet-removal-pipeline-research.md, scripts/controlnet_sweep.py. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 16:59:28 -07:00
Victor Kuznetsov	175609b60a	fix(gemini): rescue small corner sparkle buried by the size weight (#36) detect_watermark's size-weighted global NCC search lets a larger, mediocre match (e.g. a bright collar in a portrait) outrank a small, near-perfect sparkle in the bottom-right corner, so a faint sparkle on a busy background scored below threshold and the image read as clean -- the regression from widening the search window 256px->512px between v0.7.2 and v0.8.8. Add _corner_promote: a bottom-right-corner raw-NCC pass that overrides the global pick when the corner holds a match with raw NCC >= 0.85 that beats it. It only ever replaces a lower-fidelity pick (cannot weaken an existing detection) and keeps the wider window for variant margins. The corner side is relative-clamped (0.20 of the short side, [96, 384]) so it stays a true corner at every scale: a fixed 256px covers ~70% of a small portrait, where a real photo raw-matches the star at ~0.81; relative tightening drops that to ~0.69. The 0.85 gate sits between the worst real-photo corner match (~0.78) and a genuine faint sparkle (~0.93): zero false positives across native + downscaled negatives, headshot rescued from below-threshold to 0.71. Factor the shared multi-scale matchTemplate loop into _scan_scales. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 16:51:03 -07:00
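The corner geometry and gate from 175609b60a come down to a clamp and a comparison. The constants (0.20, [96, 384], 0.85) are taken from the commit; the function names and the assumption that "beats it" compares raw NCC values are illustrative.

```python
CORNER_FRAC = 0.20               # fraction of the short side
CORNER_MIN, CORNER_MAX = 96, 384  # clamp range in pixels
CORNER_NCC_GATE = 0.85           # minimum raw NCC for a corner override


def corner_side(width: int, height: int) -> int:
    """Side of the bottom-right search square, relative to the short side and clamped."""
    side = round(CORNER_FRAC * min(width, height))
    return max(CORNER_MIN, min(CORNER_MAX, side))


def should_promote(corner_ncc: float, global_ncc: float) -> bool:
    """Override the global pick only for a high-fidelity corner match that beats it."""
    return corner_ncc >= CORNER_NCC_GATE and corner_ncc > global_ncc
```

With the figures quoted in the commit, a genuine faint sparkle (~0.93) is promoted while the worst real-photo corner match (~0.78) is not.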
Victor Kuznetsov	07c96bed53	docs: mark fp16 VAE black-output fix (#29) verified on CUDA Confirmed on real CUDA hardware 2026-06-03: `all` on a 1086x1448 OpenAI gpt-image at fp16 produces a normal (non-black) output, so the fp16-fix VAE swap resolves the all-black decode. Removes the prior "NOT verifiable on this MPS machine" caveat. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 19:18:04 -07:00
Victor Kuznetsov	35116d5e97	chore(release): v0.8.9 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> v0.8.9	2026-06-02 19:04:32 -07:00
Victor Kuznetsov	df0fafe94e	fix(identify): stop flagging multi-actor C2PA manifests as integrity clashes The C2PA issuer attribution (`c2pa`) and the SynthID proxy (`synthid`) are derived from the same manifest, so treating them as independent signals made rule 1 fire on legitimate multi-actor manifests where a product wraps another vendor's engine (Microsoft Designer on OpenAI, Microsoft on Google) or an edit chain re-signs (Adobe over a Gemini original). 19 such files in the 2026-06-01/02 spaces batches read as "likely spoofed/laundered" before this. Group `c2pa` + `synthid` into one provenance source via `_CLASH_SOURCE`; rule 1 now requires two vendors from different sources. A manifest vendor still clashes with a genuinely independent stamp (EXIF/XMP generator, IPTC AISystemUsed, AIGC, xAI). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 19:02:35 -07:00
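The source-grouping idea in df0fafe94e can be illustrated as follows. Only the grouping of `c2pa` and `synthid` into one provenance source is taken from the commit; the group label `"manifest"`, the `(signal, vendor)` tuple shape, and the function name are assumptions, and the real rule 1 in `identify` may carry additional conditions.

```python
from typing import Dict, Iterable, Set, Tuple

# c2pa and synthid are derived from the same manifest, so they share one source.
CLASH_SOURCE: Dict[str, str] = {"c2pa": "manifest", "synthid": "manifest"}


def has_vendor_clash(signals: Iterable[Tuple[str, str]]) -> bool:
    """True when two different vendors are asserted by two different sources."""
    by_source: Dict[str, Set[str]] = {}
    for signal, vendor in signals:
        source = CLASH_SOURCE.get(signal, signal)  # ungrouped signals are their own source
        by_source.setdefault(source, set()).add(vendor)
    groups = list(by_source.values())
    for i, vendors_a in enumerate(groups):
        for vendors_b in groups[i + 1:]:
            if any(a != b for a in vendors_a for b in vendors_b):
                return True
    return False
```

A Microsoft-signed manifest wrapping an OpenAI engine yields two vendors from a single source and therefore no clash, while the same manifest alongside a differing EXIF generator stamp still does.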
Victor Kuznetsov	9cb66992bd	chore(release): v0.8.8 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> v0.8.8	2026-06-02 09:18:02 -07:00
Victor Kuznetsov	9ca2811938	fix(gemini): inpaint sparkle footprint when reverse-alpha over-subtracts (#30) On a dark/textured background (e.g. grass) the captured alpha map over-estimates the real Gemini sparkle's effective opacity (~0.51 captured vs ~0.31 effective), so the fixed-alpha reverse blend over-subtracts (watermarked - alpha*logo goes negative) and drives the footprint to black -- the white sparkle turns into a black diamond (issue #30, reported by @CoolZimo1). remove_watermark now detects this via _reverse_alpha_oversubtracts (fraction of footprint pixels with a negative numerator > 5%) and inpaints the small sparkle footprint from the surrounding pixels (cv2 NS, cropped to a padded box) instead. Behavior-neutral on the working case: a bright background over-subtracts at ~0%, so reverse-alpha is used and the output is byte-identical to before (verified: demo_banana 0.0 frac vs the issue-#30 grass image 0.61 frac; issue-#30 footprint recovers to background grass with no pit, residual sparkle conf 0.25 < 0.35). Guard is scoped to GeminiEngine: doubao/jimeng already NCC-align their alpha to the actual mark per image, which sidesteps the fixed-alpha mismatch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 09:17:32 -07:00
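The over-subtraction check in 9ca2811938 is a fraction test on the reverse-blend numerator `watermarked - alpha * logo`. The sketch below works on flattened single-channel lists; defining the footprint as pixels with `alpha > 0`, the white logo value of 255, and the function names are assumptions for the example. The 5% threshold is from the commit.

```python
from typing import Sequence


def oversubtract_fraction(
    watermarked: Sequence[float], alpha: Sequence[float], logo: float = 255.0
) -> float:
    """Fraction of footprint pixels whose reverse-blend numerator is negative."""
    footprint = [(w, a) for w, a in zip(watermarked, alpha) if a > 0]
    if not footprint:
        return 0.0
    negative = sum(1 for w, a in footprint if w - a * logo < 0)
    return negative / len(footprint)


def reverse_alpha_oversubtracts(
    watermarked: Sequence[float], alpha: Sequence[float], threshold: float = 0.05
) -> bool:
    """True when reverse-alpha would drive too much of the footprint below zero."""
    return oversubtract_fraction(watermarked, alpha) > threshold
```

When the check fires, the caller would skip the reverse blend and inpaint the footprint instead; on a bright background the fraction is near zero and the reverse blend is kept.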
Victor Kuznetsov	b25276c4f2	chore(release): v0.8.7 v0.8.7	2026-06-01 19:33:08 -07:00
Victor Kuznetsov	96038f960f	feat(invisible): vendor-adaptive default strength (OpenAI 0.10 / Google 0.15) The default img2img strength is now chosen from the detected SynthID vendor (C2PA issuer) instead of a single fixed 0.30: OpenAI gpt-image -> 0.10, Google Gemini -> 0.15, unknown source -> 0.15. Explicit --strength always wins. Basis: an oracle-verified June 2026 controlled study (clean v0.8.6, text/face protection OFF, per-image openai.com/verify or Gemini-app verdict). OpenAI's SynthID clears at 0.05 across 1024-1600 px (n=4, resolution-independent); Google's is ~3x more robust and needs 0.15 on the capped-1536 path (n=4). The dominant factor is the VENDOR, not resolution. The earlier single 0.30 default and the "resolution dependence" lore came from contaminated tests run with the protect-text bug ON (issue #14) -- re-running those same 1600x1600 images clean removes SynthID at 0.05. `vendor_for_strength(path)` reads metadata.synthid_source on the ORIGINAL input and is threaded through cli (invisible/all/batch) -> invisible_engine -> watermark_remover -> resolve_strength(strength, profile, vendor), so display and execution use the same vendor (the engine sees a temp path whose C2PA the visible pass already stripped, so detection must happen in the CLI on the pristine source). Caveat: Google's 0.15 was validated only on --max-resolution 1536; native 2816 Gemini was not locally measurable (OOM on Apple Silicon) and is pending GPU validation on raiw.cc. Docs: docs/synthid.md sections 2.2/4.4/5.2 corrected (the contaminated resolution-dependence findings replaced with the clean oracle-verified table); README and CLAUDE.md updated; CLI --strength help reflects the adaptive default. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 19:29:47 -07:00
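The vendor-adaptive default from 96038f960f amounts to a lookup with an explicit override. The strength values are the ones stated in the commit; the function name `pick_strength`, the lowercase vendor keys, and the omission of the `profile` argument that the real `resolve_strength(strength, profile, vendor)` takes are simplifications made for this sketch.

```python
from typing import Optional

# Defaults from the commit: OpenAI 0.10, Google 0.15, unknown source 0.15.
VENDOR_STRENGTH = {"openai": 0.10, "google": 0.15}
UNKNOWN_STRENGTH = 0.15


def pick_strength(strength: Optional[float], vendor: Optional[str]) -> float:
    """An explicit --strength always wins; otherwise use the vendor default."""
    if strength is not None:
        return strength
    return VENDOR_STRENGTH.get((vendor or "").lower(), UNKNOWN_STRENGTH)
```

As the commit notes, the vendor has to be detected on the original input in the CLI, because the temp file the engine sees has already had its C2PA metadata stripped.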
Victor Kuznetsov	1708857772	fix(gemini): expand sparkle search area 256 -> 512px from corner The 256px limit caused misses when Gemini places the sparkle further from the corner than the standard 160px (margin 64 + logo 96). Observed variant at ~300px reported in issue #30. 512px covers all known Gemini margin variations with room to spare; matchTemplate on a 512x512 region is still fast on CPU. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> v0.8.6	2026-06-01 10:42:04 -07:00


195 Commits