Commit Graph

120 Commits

Author SHA1 Message Date
Victor Kuznetsov 3d00fed00c fix(photomaker-v2): compute id_embeds via FaceAnalysis2 before pipeline call
The Modal cert sweep against V2 hit the next layer of the API:

  PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken.forward() missing 1 required
  positional argument: 'id_embeds'

V2 forward takes BOTH the CLIP image embedding (computed inside the pipeline from
input_id_images) AND an ArcFace identity embedding (id_embeds) that the caller
must compute. The upstream pipeline does NOT auto-compute it -- inference_pmv2.py
shows the caller using FaceAnalysis2 + analyze_faces to extract the ArcFace
vector from each input ID image and passing id_embeds=torch.stack([...]) into
pipe(...).

Wired the same flow here:
- New _get_face_analyser() singleton (double-checked lock) builds
  FaceAnalysis2(['CUDAExecutionProvider' | 'CPUExecutionProvider']).prepare(...).
  This is the non-commercial step (antelopev2/buffalo_l auto-download on first
  use). Module docstring already calls it out.
- Per face: analyze_faces() -> torch.from_numpy(embedding) -> .unsqueeze(0) to
  match the pipeline's expected (B, D) shape, casting to pipeline.device/dtype.
  Faces InsightFace can't detect inside the crop get skipped (the most likely
  cause would be the diffusion-cleaned face being too small or stylised after
  the main pass; YuNet already gated us into having a face per crop, so this
  should be rare).
- id_embeds= keyword threaded into the pipeline call site alongside the existing
  input_id_images=.
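
The double-checked-lock singleton from the first bullet can be sketched roughly as below. This is an illustration only, not the module's code: `build_analyser` is a hypothetical stand-in for the `FaceAnalysis2(...)` construction and `.prepare(...)` call so the sketch runs without InsightFace, and `build_log` exists only to show how often the builder ran.

```python
import threading

_analyser = None
_analyser_lock = threading.Lock()
build_log = []  # illustration only: one entry per expensive build


def build_analyser():
    """Hypothetical stand-in for the expensive FaceAnalysis2 setup."""
    build_log.append("built")
    return object()


def get_face_analyser():
    """Return the shared analyser, building it at most once across threads."""
    global _analyser
    if _analyser is None:  # first check: lock-free fast path
        with _analyser_lock:
            if _analyser is None:  # second check: another thread may have won
                _analyser = build_analyser()
    return _analyser
```

The second check inside the lock is what keeps two threads that both saw `None` from each building (and downloading) the model packs.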

Tests untouched (the V1-only safety guard was already removed in the previous
commit when we swapped V1->V2; the existing 11 tests still pass).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 18:49:10 -07:00
Victor Kuznetsov 65de8df5c5 refactor(face-restore): drop GFPGAN, ship PhotoMaker-V2 as the sole restore (non-commercial)
Visual review of the GFPGAN-on-cleaned output (9-face grid, 1448x1086) showed it
only polished the already-drifted face without restoring identity — useless for the
"restore who is in the photo" intent. Dropping it.

The shipped restore path is now PhotoMaker-V2, which delivers true identity-from-
embedding face regeneration via a CLIP+ArcFace dual encoder. The ArcFace branch
pulls InsightFace antelopev2/buffalo_l model packs at runtime, which InsightFace
releases under a research-only license, so the whole extra is **NON-COMMERCIAL**.
raiw.cc and any monetized deployment must NOT install the `photomaker` extra.
This is called out at every entry point: CLI flag help, module docstring,
pyproject extra block, CLAUDE.md extras bullet, README install snippet.

Changes:
- Deleted `src/remove_ai_watermarks/face_restore.py` and its tests.
- Deleted the `restore` extra (gfpgan/facexlib/basicsr + scipy<1.18 / numba<0.60
  pins) and the basicsr setuptools<69 build pin from pyproject.toml.
- Restored `src/remove_ai_watermarks/photomaker_restore.py` (V2 this time:
  `TencentARC/PhotoMaker-V2`, `photomaker-v2.bin`, no `pm_version='v1'` override).
- Restored the `photomaker` extra in pyproject with all the upstream-compat
  pins (einops, peft, onnxruntime, insightface) and the `allow-direct-references`
  hatch metadata block.
- `InvisibleEngine` swapped `_restore_faces` -> `_restore_faces_photomaker`;
  `--restore-faces-method` removed (only one method, no choice).
- CLI flag help, CLAUDE.md, README, docs/synthid.md, and
  docs/controlnet-removal-pipeline-research.md all updated.
- docs/synthid-robust-identity-research.md status notice rewritten to list both
  abandoned commercial-safe attempts (V1 + GFPGAN-on-cleaned) and the
  non-commercial trade-off we accepted.

ruff + strict pyright(src/) clean; 578 tests pass (the 9 GFPGAN tests are gone,
the 11 PhotoMaker tests stay green).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 18:41:01 -07:00
Victor Kuznetsov 01fe98bf54 refactor(face-restore): rollback PhotoMaker, restore GFPGAN on the CLEANED image
After 7 cascading upstream-compat fixes (insightface dep, peft dep, pm_version,
device, etc.), the PhotoMaker V1 cert sweep still hit a CFG batch-dim mismatch
inside the denoising loop. The upstream PhotoMaker `pipeline.py` is forked from
diffusers v0.29.1 and our env runs 0.38; SDXL prompt-encoder handling changed
significantly between those versions, so making PhotoMaker work end-to-end
needs a proper fork or a diffusers downgrade — both expensive. Not worth
shipping today.

Pivot: restore `face_restore.py` (GFPGAN) with a single-line fix that makes it
SynthID-safe by construction. The previous design ran GFPGAN.enhance on the
ORIGINAL watermarked image and was oracle-confirmed to re-add SynthID via the
weight-0.5 pixel blend. The fix is to run GFPGAN on the diffusion-CLEANED
image — whatever pixels GFPGAN derives from are already SynthID-free, so the
partial blend cannot transport the watermark. Identity fidelity is lower than
a true identity-as-embedding stack would deliver, but it ships and works.
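
A worked example of why the blend source matters, under a deliberately simplified model: the blend is treated as a per-pixel linear mix, and the restorer output is held fixed for both cases. The pixel values and the `blend` helper are hypothetical and are not GFPGAN's actual compositing code.

```python
def blend(restored: float, source: float, weight: float = 0.5) -> float:
    """Linear blend: `weight` of the restored pixel, the rest from the source."""
    return weight * restored + (1.0 - weight) * source


clean_pixel = 120.0     # hypothetical watermark-free value
carrier = 2.0           # hypothetical watermark perturbation on the original
restored_pixel = 118.0  # hypothetical restorer output for this pixel

from_original = blend(restored_pixel, clean_pixel + carrier)
from_cleaned = blend(restored_pixel, clean_pixel)

# The only difference between the two outputs is the carried-over perturbation,
# scaled by the share of the source that survives the blend.
carried = from_original - from_cleaned
```

In this model, blending against the original keeps `(1 - weight)` of the carrier, while blending against the cleaned image has no carrier left to keep.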

Changes:
- `src/remove_ai_watermarks/face_restore.py` restored from pre-wipe state with
  one line changed: `restorer.enhance(cleaned_bgr, ...)` instead of
  `restorer.enhance(original_bgr, ...)`. `original_bgr` is kept as an unused
  positional argument for API stability.
- `src/remove_ai_watermarks/photomaker_restore.py` and its tests REMOVED. The
  research note (`docs/synthid-robust-identity-research.md`) keeps a "status
  notice" documenting why PhotoMaker is parked for now and what the path back
  in would look like.
- `pyproject.toml`: the `restore` extra is restored (gfpgan/facexlib/basicsr +
  scipy<1.18 + numba<0.60 pins + the basicsr setuptools<69 build pin); the
  `photomaker` extra (with its einops/insightface/peft pile) and the
  `[tool.hatch.metadata] allow-direct-references = true` block are REMOVED.
- `InvisibleEngine._restore_faces_photomaker` removed; `_restore_faces`
  restored. The `--restore-faces` CLI flag and its plumbing through cmd_*
  signatures are unchanged.
- CLAUDE.md, README.md, docs/synthid.md, and
  docs/controlnet-removal-pipeline-research.md updated to describe the shipped
  GFPGAN-on-cleaned design and to reference PhotoMaker only as the parked
  alternative.

ruff + strict pyright(src/) clean; 578 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 16:55:45 -07:00
Victor Kuznetsov d1b85ee6a8 fix(photomaker): drop explicit negative_prompt to fix CFG batch mismatch
Modal cert sweep #6 made it INTO the denoising loop and died with
"Sizes of tensors must match except in dimension 1. Expected size 2 but got size 1
for tensor number 1 in the list."

In the PhotoMaker pipeline's denoising loop, the per-step embeddings are built
as torch.cat([negative_prompt_embeds, prompt_embeds(_text_only)], dim=0). The
text-encoder + ID-encoder flow can leave the negative branch at batch=2 and the
ID-injected branch at batch=1 when a custom negative_prompt is passed, so the
cat fails. The upstream gradio demo just passes no negative_prompt and relies
on the pipeline's empty default; do the same.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 16:35:40 -07:00
Victor Kuznetsov 031c38dc7f fix(photomaker): place id_encoder on the right device + dtype
Modal cert sweep #5 made it through component load (V1 id_encoder + lora_weights)
and died at inference with the classic
"Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor) should be
the same" — id_encoder lived on CPU/fp32 while the rest of the pipeline ran on
CUDA/fp16. Two fixes:

1. Call `pipe.to(device)` BEFORE `load_photomaker_adapter` so the loader picks the
   right device/dtype from `self.device` / `self.unet.dtype` when it builds the
   encoder.
2. Belt-and-suspenders: after load, explicitly call `pipe.id_encoder.to(device, dtype)` because some
   torch/diffusers combos leave custom attributes on the old device even when
   `pipe.to` ran first.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 16:29:00 -07:00
Victor Kuznetsov 1fb2a64b56 fix(photomaker): pass pm_version='v1' to load_photomaker_adapter
Modal cert sweep #3 ran past the `insightface` import error and into a real
state_dict mismatch:

  Error(s) in loading state_dict for PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken:
    Missing key(s) ... qformer_perceiver.token_proj.0.weight ...

The upstream `load_photomaker_adapter` defaults to `pm_version='v2'` regardless of
the .bin file passed -- the loader builds a V2 encoder
(PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken) and then tries to load V1 weights
into it. We must pass `pm_version='v1'` explicitly so the loader instantiates the
CLIP-only PhotoMakerIDEncoder. The pipeline-level `input_id_images` API is the
same across V1 and V2, so the call site does not change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 16:18:52 -07:00
Victor Kuznetsov 860bde4a26 fix(photomaker extra): pin insightface for import resolution (MIT code only)
The upstream PhotoMaker package's `__init__.py` unconditionally imports a
face-analyser class from its `insightface_package` submodule, so JUST importing
`PhotoMakerStableDiffusionXLPipeline` (the V1 pipeline class we use) raises
`ModuleNotFoundError: No module named 'insightface'` if insightface isn't
present in the env. The Modal cert sweep caught this on the V1 image.

Resolution: pin `insightface>=0.7.3` (and its `onnxruntime` runtime dep) in the
`photomaker` extra. The PyPI insightface package is MIT-licensed CODE; the
non-commercial restriction sits on the pretrained model packs (antelopev2,
buffalo_l) which download only when `FaceAnalysis()` is instantiated. Our V1 path
never instantiates the face-analyser -- it loads photomaker-v1.bin (CLIP-only
encoder) via `load_photomaker_adapter` -- so the model-pack license does not
bind us; we depend only on the MIT code for the import to resolve.

Safety guards:
- Runtime check in `_get_pipeline`: raises if `_PHOTOMAKER_FILE` is ever pointed
  at v2 (so a future maintainer can't silently regress to the InsightFace path).
- New test class `TestV1OnlyCommercialSafetyGuard`: asserts repo + filename
  pin to V1 AND asserts the module source never references the face-analyser
  class (a static check that our codepath stays out of the runtime that would
  pull the non-commercial model packs).
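
A runtime guard of the kind described in the first bullet might look like the sketch below. The function name and the substring check are assumptions for illustration; only the two filenames come from these commit messages.

```python
PHOTOMAKER_FILE = "photomaker-v1.bin"


def assert_v1_weights(filename: str) -> None:
    """Raise if the configured weights file looks like a V2 checkpoint."""
    if "v2" in filename.lower():
        raise RuntimeError(
            f"{filename!r} looks like a PhotoMaker-V2 checkpoint; V2 pulls the "
            "non-commercial InsightFace model packs. Use photomaker-v1.bin."
        )


assert_v1_weights(PHOTOMAKER_FILE)  # passes silently for the V1 file
```

Failing loudly at pipeline construction is the point: a silent swap to V2 would otherwise only surface as a licensing problem.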

Docs: documented the import dance + legal split inline at the top of
`photomaker_restore.py`.

ruff clean; 581 tests pass (the 9 PhotoMaker tests plus 3 new V1-guard tests).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 16:13:20 -07:00
Victor Kuznetsov dfa5181309 fix(photomaker): switch to V1 — V2 actually requires InsightFace (non-commercial)
A Modal cert sweep caught what the research doc missed: PhotoMaker-V2 fails at
import without InsightFace ("No module named 'insightface'"). Reading the upstream
source confirms it: `photomaker/__init__.py` imports `FaceAnalysis2` (an InsightFace
wrapper) at module load, V2's encoder is named
`PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken`, and `model_v2.py`'s forward
takes an `id_embeds` argument that the pipeline computes via
`insightface.app.FaceAnalysis(name='antelopev2', ...)`. So V2 is a DUAL encoder
(CLIP + ArcFace), not CLIP-only as the model card line "id_encoder includes
finetuned OpenCLIP-ViT-H-14 and a few fuse layers" implied.

InsightFace's pretrained model packs (antelopev2, buffalo_l) are research/
non-commercial only per their own README:
  "The pretrained models we provided with this library are available for
   non-commercial research purposes only."
So V2 is blocked for a paid service like raiw.cc.

PhotoMaker-V1 is the commercial-safe alternative — its `PhotoMakerIDEncoder`
(model.py) forward takes only `(id_pixel_values, prompt_embeds, class_tokens_mask)`,
no ArcFace branch. Identity is CLIP-only, license is Apache-2.0, no InsightFace.

Code change: swap the repo + filename constants in `photomaker_restore.py`
(TencentARC/PhotoMaker, photomaker-v1.bin). Tests still pass (the 9 PhotoMaker
tests use a fake pipeline, so the model swap is transparent to them).

Doc correction: rewrote the verdict / license table / section 5 of
`docs/synthid-robust-identity-research.md` to lead with V1 and add a correction
notice explaining the V2 misread. Bulk-renamed `PhotoMaker-V2` to `PhotoMaker-V1`
across CLAUDE.md, README.md, docs/synthid.md, and
docs/controlnet-removal-pipeline-research.md (kept V2 only in the correction
notice, the license table, and the anchor reference).

ruff clean; 578 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 16:05:58 -07:00
Victor Kuznetsov 439eeadc07 refactor(face-restore): wipe GFPGAN path, --restore-faces is PhotoMaker-only
The GFPGAN `restore` extra and its `face_restore.py` module are gone. They were
oracle-confirmed to re-introduce SynthID by blending watermarked original face
pixels at fidelity weight 0.5 (clean A/B: gemini_3 controlnet 0.20 detected WITH
GFPGAN, clean WITHOUT). Keeping them as the default restore method was a footgun
for the removal pipeline. PhotoMaker-V2 (added in the previous commit) is the
single shipped restore path now -- identity-as-embedding, SynthID-safe by
construction.

Removed:
- src/remove_ai_watermarks/face_restore.py + tests/test_face_restore.py
- pyproject.toml `restore` extra (gfpgan/facexlib/basicsr + scipy/numba pins)
- pyproject.toml `[tool.uv.extra-build-dependencies] basicsr = [...]` build pin
- CLI: `--restore-faces-method` and `--restore-faces-weight` (no method choice
  to make, no GFPGAN weight knob to expose)
- InvisibleEngine._restore_faces method (only _restore_faces_photomaker remains)
- All restore-faces-method / restore-faces-weight threading through cmd_*
  signatures and _process_batch_image

Kept:
- `--restore-faces / --no-restore-faces`: now binds to PhotoMaker-V2.
- All adopted oracle findings about GFPGAN re-introducing SynthID (kept in the
  research docs as historical context that explains why the path was removed).

Docs updated: CLAUDE.md (restore extras bullet collapsed to photomaker, removed
face_restore Key-modules bullet, several inline GFPGAN refs scrubbed), README.md
(face-identity callout + install section now point to the photomaker extra),
docs/synthid.md 5.5 (net recipe), docs/controlnet-removal-pipeline-research.md
(recommendations).

ruff + strict pyright (src/) clean; 578 tests pass (the 9 GFPGAN tests are gone,
the 9 PhotoMaker tests stay green).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 15:35:37 -07:00
Victor Kuznetsov 1439eb0714 feat(photomaker): SynthID-safe face-identity restoration via PhotoMaker-V2
Adds the second face-restore mechanism, selectable via the new CLI option
`--restore-faces-method=photomaker`. Unlike the existing GFPGAN path (which runs on
the watermarked ORIGINAL and was oracle-confirmed to re-introduce SynthID by partial
pixel blending), PhotoMaker carries identity in a SynthID-invariant OpenCLIP
embedding and regenerates fresh face pixels conditioned on it — the pixels in the
output are diffusion-fresh, so the watermark cannot be transported.

The load-bearing assumption (embedding invariance to SynthID-magnitude pixel noise)
was empirically validated in the prior commit (smoke test): cosine drift 0.002
under a ±2 LSB low-freq carrier, an order of magnitude less than JPEG90 drift
which SynthID survives at >=99% TPR.
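
For reference, a minimal sketch of a cosine-drift measurement, assuming "drift" means one minus the cosine similarity between the embedding of the clean image and the embedding of the perturbed one (the message does not spell out the definition). The embeddings here are plain Python sequences; the encoder itself is out of scope.

```python
import math
from typing import Sequence


def cosine_drift(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0.0 means the two embeddings point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```

Under that definition, identical embeddings give a drift of 0 and orthogonal ones give 1, so a value of 0.002 sits very close to "unchanged".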

End-to-end commercial-safe:
- PhotoMaker-V2 weights: Apache-2.0 (TencentARC)
- ID encoder: OpenCLIP-ViT-H/14 (MIT)
- SDXL base: shared with the main pipeline
- NO InsightFace (the non-commercial blocker for IP-Adapter FaceID / InstantID /
  PuLID / Arc2Face)

Two-pass architecture (PhotoMaker has no ControlNetImg2img class in diffusers):
1) main controlnet/default removal pass cleans SynthID + drifts faces
2) PhotoMaker txt2img regenerates each face from its embedding, feather-composited
   back into the cleaned image

New module `photomaker_restore.py` mirrors `face_restore.py`: lazy pipeline
singleton (double-checked lock), `is_available()` gate, pure `_face_crop_square` and
`_composite_faces` helpers, all unit-tested without the model (9 new tests). New
`InvisibleEngine._restore_faces_photomaker` runs after the diffusion pass, mirroring
`_restore_faces`. CLI flag `--restore-faces-method=[gfpgan|photomaker]` threaded
through `cmd_invisible`/`cmd_all`/`cmd_batch` + `_process_batch_image`.

New optional `photomaker` extra (Apache-2.0 + Apache-2.0/MIT deps, no basicsr).
`[tool.hatch.metadata] allow-direct-references = true` is required because the
upstream PhotoMaker package lives only on GitHub.

The next step (separate work) is oracle validation: run a 6-image cert sweep
through the new pipeline (default/controlnet at the certified strength +
--restore-faces-method=photomaker) and confirm SynthID stays clean while face
identity is recovered. The required infrastructure (`raiw-app/modal_cert.py`) is
already in place.

ruff + strict pyright(src/) clean; 586 tests pass (+ 9 new in
tests/test_photomaker_restore.py).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 15:20:29 -07:00
Victor Kuznetsov 3aea21e632 feat(visible): Samsung Galaxy AI mark removal (bottom-left reverse-alpha, #37)
New samsung_engine.py mirrors the jimeng engine but anchors bottom-left; wired
into watermark_registry, the CLI (--mark samsung / auto), and identify
(visible_samsung, medium). visible_alpha_solve.py gains a corner=bl mode;
samsung_alpha.png solved from @f-liva's flat captures. Calibrated for the
Italian "Contenuti generati dall'AI" variant. Flat black/gray/white captures
committed, real photos gitignored. Tests + docs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 10:27:44 -07:00
Victor Kuznetsov 6f4aa4c7b1 fix(invisible): retry in fp32 on a degenerate fp16 output (#41)
The fp16-fix VAE swap (#29) is gated to the default SDXL checkpoint, so a
custom model_id, a stale pre-fix install, or a fal/custom loader can still
decode to an all-black/NaN frame in fp16 (reporter: gpt-image 1448x1086,
the `image_processor.py invalid value encountered in cast` warning).

Add a model-agnostic backstop in remove_watermark: after generation, if the
run was fp16 and the output is degenerate (_is_degenerate_image: near-zero
mean and variance), rebuild the pipeline in fp32 on the same device and
re-run once. fp32 is the verified-clean path, so a black image is never
returned regardless of model_id or version. Mirrors the MPS->CPU fallback's
self-mutation pattern; batch inherits it. Verified e2e on MPS by forcing
fp16 with the swap disabled (first pass black, guard fired, retry clean).
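
The backstop can be sketched as below, with several assumptions: pixels are a flat sequence of floats in [0, 1], the `eps` threshold is made up, the explicit NaN check is an addition for the sketch, and `generate` is a hypothetical callable standing in for "build the pipeline in this dtype and run it".

```python
import math
from statistics import fmean, pvariance
from typing import Callable, Sequence


def is_degenerate(pixels: Sequence[float], eps: float = 1e-3) -> bool:
    """True for a NaN-containing or all-black (near-zero mean and variance) frame."""
    if any(math.isnan(p) for p in pixels):
        return True
    return fmean(pixels) < eps and pvariance(pixels) < eps


def generate_with_fp32_retry(
    generate: Callable[[str], Sequence[float]], dtype: str = "fp16"
) -> Sequence[float]:
    """Run once in `dtype`; on a degenerate fp16 result, re-run once in fp32."""
    out = generate(dtype)
    if dtype == "fp16" and is_degenerate(out):
        out = generate("fp32")
    return out
```

The retry happens at most once and only for fp16 runs, so a healthy run pays nothing beyond the mean/variance check.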

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 17:43:27 -07:00
Victor Kuznetsov 2c0b174dfa fix(gemini): self-verify repair for under-removed sparkles
After reverse-alpha, re-detect the sparkle; when one survives at or above the
registry fail line (conf >= 0.5) -- an alpha mismatch the per-image gain estimate
could not fully correct -- inpaint the footprint and keep that only when it lowers
the re-detect confidence. The footprint inpaint reconstructs the slot from its
darker surroundings, so it physically removes the bright sparkle; purely additive,
the common clean removal re-detects below 0.5 and is returned untouched.
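
The control flow reads roughly as follows. `detect` and `inpaint` are hypothetical callables standing in for the sparkle re-detector and the footprint inpaint; only the 0.5 fail line is taken from the text above.

```python
from typing import Callable, TypeVar

Image = TypeVar("Image")
FAIL_LINE = 0.5  # registry fail line quoted in the commit message


def self_verify_repair(
    removed: Image,
    detect: Callable[[Image], float],
    inpaint: Callable[[Image], Image],
) -> Image:
    """Inpaint only when the mark survives, and keep it only if it helps."""
    conf = detect(removed)
    if conf < FAIL_LINE:
        return removed  # common case: clean removal, returned untouched
    repaired = inpaint(removed)
    return repaired if detect(repaired) < conf else removed
```

Because the inpainted result is kept only when it lowers the re-detect confidence, the repair step can never make the detector score worse than the first pass.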

Measured on the spaces visible-removal audit: gemini removal-audit failures drop
15 -> 11 (4 genuine rescues), doubao 65/65 and jimeng 11/11 unchanged, zero
regressions on the 468 already-clean removals.

An offset+scale alignment search was prototyped on the remaining 11 fails and
rejected: an audit "ceiling" suggested +4 more, but those were NCC-gaming -- the
lower-scoring placement left the sparkle as bright or brighter, just reshaping the
residual so the contrast-invariant shape-NCC scored lower (a5a9: first-pass slot
~76 at background level vs the "aligned win" ~164). A brightness sanity check
rejected every one, so it contributed nothing and was removed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 16:45:18 -07:00
Victor Kuznetsov 6d11c11b52 feat(auto): DBNet text detector, Real-ESRGAN upscaler, batch --auto
Three content-quality features for the invisible/all/batch pipeline.

DBNet text detector (auto_config): replace the MSER text heuristic with
PP-OCRv3 differentiable-binarization via cv2.dnn.TextDetectionModel_DB,
using a bundled 2.4 MB Apache-2.0 model (en/cn detection nets are
byte-identical, so it ships language-neutral). cv2.dnn is core OpenCV, so
no new pip dep. MSER stays as the fallback when the model can't load.
Validated on real images: matches MSER everywhere and additionally catches
the Doubao CJK mark MSER missed; routing decisions unchanged otherwise.

Real-ESRGAN upscaler (new upscaler.py, esrgan extra): optional
pre-diffusion super-resolution for the min-resolution floor upscale, loaded
via spandrel (MIT, no basicsr) with BSD-3-Clause weights downloaded on
first use. New --upscaler {lanczos,esrgan} on invisible/all/batch; default
stays lanczos and the engine falls back to lanczos when the extra is absent
or the model errors (never breaks removal). It is a manual opt-in knob (the
auto plan never selects it) -- as a generic GAN it sharpens photo/texture
content strongly but can degrade faces (the diffusion pass regenerates
them) and thin text, documented accordingly.

batch --auto: wire the content-adaptive --auto (+ --adaptive-polish) into
cmd_batch. The plan is recomputed per image and the invisible engine is
cached per resolved pipeline (default/controlnet), so a mixed directory
builds at most one engine of each kind. Verified end-to-end: 3 mixed
images routed correctly with only 2 pipeline loads (controlnet reused).
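
The per-pipeline engine cache amounts to a keyed lazy build. The sketch below is an assumption about its shape, with `EngineCache` and the `build` callable as hypothetical names; `loads` is there only to make the reuse visible.

```python
from typing import Callable, Dict


class EngineCache:
    """Build one engine per pipeline name and reuse it for later images."""

    def __init__(self, build: Callable[[str], object]) -> None:
        self._build = build
        self._engines: Dict[str, object] = {}
        self.loads = 0

    def get(self, pipeline: str) -> object:
        if pipeline not in self._engines:
            self._engines[pipeline] = self._build(pipeline)
            self.loads += 1
        return self._engines[pipeline]
```

With a per-image plan of `controlnet`, `default`, `controlnet`, this cache performs two loads and hands the third image the first engine again.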

ruff + strict pyright(src/) clean; 558 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 16:04:33 -07:00
Victor Kuznetsov 4a6cd71ab2 Merge branch 'claude/silly-northcutt-c2bf06': unify C2PA vendor registry + code-health + uv publish
Brings in commit 5cf68a6 (single C2PA_AI_VENDORS registry, erase_lama
grayscale/BGRA support, batch device-cache clearing + --controlnet-scale,
uv publish via OIDC, hatchling pin <1.31). Auto-merged with no conflicts;
ruff/pytest(544)/pyright all clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 22:10:25 -07:00
Victor Kuznetsov 32a0779e1d fix(gemini): demote sparkle false positives with a core-brightness gate
detect_watermark's shape-only NCC (spatial/gradient/var fusion) fires on ornate
or flat content (text strips, banners, hatching) that coincidentally matches the
diamond shape. The NCC is contrast-invariant, so it cannot see the defining
property of a real Gemini sparkle: a bright WHITE overlay whose core sits above
the local background.

The fusion now demotes (caps confidence to 0.30) a match that is BOTH
low-confidence (< _SPARKLE_FP_CONF 0.65) AND has a low core-ring brightness
margin (_core_ring_margin < _SPARKLE_FP_MARGIN 5). Real sparkles escape via
EITHER high confidence (white-bg sparkles score >=0.79 despite a low margin) OR
high margin (dark/mid backgrounds, incl. the #36 faint-corner case), so both
must fail to demote. The gate is monotonic -- it only removes detections, never
adds -- so it cannot regress the verified-negative corpus (already 0 FPs).
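
The gate reduces to a two-condition check. The thresholds below are the ones quoted above; the function name and signature are hypothetical, and the margin is assumed to be precomputed.

```python
SPARKLE_FP_CONF = 0.65   # below this a match counts as low-confidence
SPARKLE_FP_MARGIN = 5.0  # below this the core is not clearly brighter than its ring
DEMOTED_CONF = 0.30      # cap applied to a demoted match


def gate_confidence(conf: float, core_ring_margin: float) -> float:
    """Demote only when BOTH the confidence and the brightness margin are low."""
    if conf < SPARKLE_FP_CONF and core_ring_margin < SPARKLE_FP_MARGIN:
        return min(conf, DEMOTED_CONF)
    return conf
```

Using `min` keeps the gate monotonic: a confidence can be lowered to the cap but is never raised.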

On the spaces corpus it demoted 16/495 flagged sparkles: 13 had no AI metadata
(content FPs), and the 3 with AI metadata were either visual FPs or a
near-invisible white-on-white sparkle whose AI verdict is held by metadata.
Removal-audit failures dropped 20 -> 15.

- _core_and_bg shared helper (core 75th-pct brightness vs background-ring median);
  _estimate_alpha_gain refactored onto it, new _core_ring_margin wrapper.
- TestSparkleFalsePositiveGate: margin high/low, strong-sparkle kept (incl. on
  white via high conf), blurred no-core blob demoted.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 22:02:28 -07:00
Victor Kuznetsov b686dbdd79 feat(auto): adaptive detail-targeting polish + --adaptive-polish flag
The fixed mild auto polish (unsharp 0.5 / grain 2.0) under-corrected soft
photo/face output (gemini_3 stayed at lap-var 84 vs its 592 original) and its
grain speckled small text. Replace it with humanizer.adaptive_polish: target the
input's Laplacian variance with a capped unsharp scaled to the deficit + edge-
masked grain (smooth regions only), calibrated by a short sigma search. Self-
limiting on text/graphics -- already high-frequency, so almost no polish lands
and text edges are masked out. Validated on the spaces corpus (gemini_3 84 -> 334
end-to-end; openai_1 text near-untouched).

Interface: every --auto decision is now independently overridable -- add
--adaptive-polish/--no-adaptive-polish (matching --restore-faces; works without
--auto too) so the polish can be disabled or used manually. _apply_auto overrides
exactly the three content-adaptive modes (pipeline, restore-faces, adaptive-
polish); --unsharp/--humanize stay independent fixed filters.

cv2-only, no new deps. Threaded through invisible/all (not batch).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 21:49:08 -07:00
Victor Kuznetsov 5cf68a6a3d refactor: unify C2PA vendor registry + code-health fixes + uv publish
Three P2 cleanups from a library-wide review.

Detection -- single C2PA_AI_VENDORS registry (noai/constants.py):
- C2PA_ISSUERS, SYNTHID_C2PA_ISSUERS, and identify._ISSUER_PLATFORM now derive
  from one C2paAiVendor table, so adding a C2PA vendor is one entry instead of
  edits in three places across two files. Behavior-identical (262 detection
  tests pass; the kept `needle` field is load-bearing -- it differs from `org`
  for Google and ByteDance, with no mechanical derivation).
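
The single-table idea can be sketched as below. Only `C2paAiVendor`, `C2PA_AI_VENDORS`, `org`, `needle`, and the derived constant names come from the text above; the `platform` and `synthid` fields and both vendor rows are hypothetical placeholders, not the project's data.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class C2paAiVendor:
    org: str       # issuer organisation name
    needle: str    # substring searched for; may differ from `org`
    platform: str  # assumed field: label reported by identify
    synthid: bool  # assumed field: issuer also implies SynthID


# Hypothetical placeholder rows, not real vendor entries.
C2PA_AI_VENDORS: Tuple[C2paAiVendor, ...] = (
    C2paAiVendor("Example Labs Inc.", "Example Labs", "example", synthid=True),
    C2paAiVendor("Sample AI", "Sample AI", "sample", synthid=False),
)

# Every lookup is derived from the one table above.
C2PA_ISSUERS = tuple(v.needle for v in C2PA_AI_VENDORS)
SYNTHID_C2PA_ISSUERS = tuple(v.needle for v in C2PA_AI_VENDORS if v.synthid)
ISSUER_PLATFORM: Dict[str, str] = {v.needle: v.platform for v in C2PA_AI_VENDORS}
```

Adding a vendor then means appending one row; the three derived structures pick it up without further edits.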

Code-health:
- region_eraser.erase_lama now accepts grayscale/BGRA like erase_cv2 (it
  crashed on grayscale and silently dropped alpha on BGRA). +2 regression tests.
- batch frees the device cache between images via a shared try_empty_device_cache
  helper (generalized from the MPS-only _try_clear_mps_cache, now reused by both
  the MPS->CPU fallback and the batch loop).
- batch gained --controlnet-scale (parity with invisible/all).

CI / packaging:
- publish.yml uploads via `uv publish` (PyPI trusted publishing over OIDC),
  replacing pypa/gh-action-pypi-publish so uploads no longer depend on that
  action's bundled twine accepting the Metadata-Version. Workflow filename +
  pypi environment unchanged, so PyPI's trusted-publisher entry still matches.
- hatchling pin relaxed <1.28 -> <1.31 (verified against hatch's changelog:
  1.30.0 made Metadata 2.5 the default, 1.30.1 reverted to 2.4; 1.27-1.29 were
  always 2.4). Kept as belt-and-suspenders so the first uv-publish release ships
  2.4, isolating the uploader swap from the metadata-version bump.

Docs (CLAUDE.md, pyproject) synced; corrected the inaccurate "hatchling 1.28+
emits 2.5" note.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 21:01:07 -07:00
Victor Kuznetsov 9bd2c17cc4 feat(auto): content-adaptive --auto quality mode, Phase 1
Add `auto_config.plan(image_path) -> AutoConfig`, the first step of the
invisible/all pipeline: it inspects the input image (before the diffusion model
loads) and picks the quality modes so the run adapts to content. Quality-priority
routing -- ControlNet (text/face-structure preservation) is the default and is swapped for
plain SDXL only on a clearly structure-less image; GFPGAN face restore when a face is
present; a mild sharpen + grain polish when a smoothing pass ran. Exposed as `--auto`
on `all`/`invisible` (`_apply_auto`; explicit flags override via click's parameter
source). Not wired into batch (its engine is cached per-mode).
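
The routing rules can be sketched as a pure function over precomputed signals. This is an assumption about the shape of the decision, not the real `auto_config.plan`: the three boolean inputs and the `AutoConfig` field names are hypothetical, and the detectors that would produce the signals are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoConfig:
    pipeline: str
    restore_faces: bool
    polish: bool


def plan_from_signals(
    has_faces: bool, has_structure: bool, smoothing_pass: bool
) -> AutoConfig:
    """Quality-priority routing from content signals computed beforehand."""
    # ControlNet is the default; plain SDXL only for clearly structure-less input.
    pipeline = "controlnet" if (has_faces or has_structure) else "default"
    return AutoConfig(
        pipeline=pipeline,
        restore_faces=has_faces,  # face restore only when a face is present
        polish=smoothing_pass,    # mild sharpen + grain only after smoothing
    )
```

Keeping the decision separate from detection makes the routing testable without loading any model.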

Detection is cv2-only and torch-free (~100 MB peak RSS, a few ms): OpenCV YuNet
(`cv2.FaceDetectorYN`, MIT, 232 KB model bundled in assets/) for faces, a Canny
edge-density + MSER heuristic for text/structure (a rough Phase-1 placeholder; DBNet
via cv2.dnn is the planned upgrade). ZERO new pip deps. Designed to run wherever the
pipeline runs -- the raiw.cc Modal GPU worker -- never on the 512 MB web host.

Real-ESRGAN-via-Spandrel upscaling (a new `esrgan` extra) and an adaptive
Laplacian-variance polish are deferred to later phases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 20:52:17 -07:00
Victor Kuznetsov e7fb64dca1 fix(gemini): remove more-opaque sparkles via per-image alpha gain
The captured sparkle alpha peaks ~0.51, but some real Gemini sparkles are
rendered more opaque. The fixed-alpha reverse blend then UNDER-subtracts and
leaves a bright residual the detector still fires on. A visible-removal audit
through the registry path on the spaces corpus showed this as a meaningful
fraction of marks -- all under-removals, not a background-brightness class
(failures and successes had the same input confidence and background luma; the
discriminator was the removal delta itself).

remove_watermark now estimates a per-image alpha gain (_estimate_alpha_gain:
effective sparkle opacity at the bright core vs the local background ring,
a_eff/a_cap, clamped [1.0, 1.94]) and scales the alpha to match before the
over-sub/blend branch. A 1.05 deadband keeps a sparkle that already matches the
capture byte-identical to the pre-fix output, so the fix is purely additive
(0 regressions on the audit set; failures dropped substantially). The over-sub
guard still runs on the scaled alpha as the safety net for an over-shoot.
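
A sketch of the gain arithmetic, under stated assumptions: the sparkle is modelled as a white overlay so that `core = a * white + (1 - a) * background`, and a gain inside the deadband is treated as exactly 1.0. The 1.94 clamp and 1.05 deadband are the values quoted above; the function names and signatures are hypothetical.

```python
ALPHA_GAIN_MAX = 1.94
DEADBAND = 1.05


def effective_alpha(core: float, background: float, white: float = 255.0) -> float:
    """Opacity of a white overlay, solved from core = a*white + (1-a)*background.

    Assumes background < white; a fully white background has no solution.
    """
    return (core - background) / (white - background)


def estimate_alpha_gain(a_eff: float, a_cap: float) -> float:
    """Ratio of observed to captured opacity, clamped to [1.0, ALPHA_GAIN_MAX]."""
    gain = a_eff / a_cap
    if gain < DEADBAND:
        return 1.0  # close enough to the capture: leave the alpha unchanged
    return min(gain, ALPHA_GAIN_MAX)
```

For example, a core of 177.5 over a background of 100 gives an effective alpha of 0.5 in this model; against a captured alpha of 0.51 that falls inside the deadband, so the gain stays at 1.0.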

- _estimate_alpha_gain + _ALPHA_GAIN_MAX/_DEADBAND/_CORE_FRAC in gemini_engine.
- TestUnderSubtractionGain asserts on footprint pixels, NOT the detector (its
  NCC is degenerate on a flat synthetic bg; the real corpus removal drops the
  detector ~0.80 -> ~0.27).
- scripts/visible_removal_audit.py: the detect -> remove -> re-detect audit tool
  that found and validated this (operates on gitignored data/spaces only).
- CLAUDE.md + README: document the under-subtraction gain.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 19:48:40 -07:00
Victor Kuznetsov d7e4fe8835 feat(invisible): upscale-floor for small inputs + unsharp post-filter
Two quality knobs for the SDXL invisible pass:

- min_resolution floor (default 1024, --min-resolution): small inputs are
  upscaled to a 1024px long-side floor before diffusion, since SDXL img2img
  distorts on a tiny latent (a 381x512 portrait is wrecked at native resolution). The output
  is restored to the original input size, so it is a transparent quality boost;
  it adds time/memory on small inputs. 0 disables. Extends the pure _target_size
  helper (now cap-or-floor-or-native, min skipped on a min>max misconfig),
  unit-tested without a model.

- unsharp post-filter (humanizer.unsharp_mask, --unsharp, opt-in default 0):
  applied LAST, after the GFPGAN face pass (a pre-GFPGAN sharpen would be
  smoothed back over), to counter the soft/over-smoothed look that diffusion +
  restoration leave behind (an AI tell). Pairs with --humanize (grain).
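
The cap-or-floor-or-native rule from the first bullet can be sketched as below. The signature, the `max_res=0` "no cap" default, and the plain rounding are assumptions; the real `_target_size` may differ, for instance by snapping to a multiple that the VAE requires.

```python
from typing import Tuple


def target_size(
    width: int, height: int, max_res: int = 0, min_res: int = 1024
) -> Tuple[int, int]:
    """Cap, floor, or keep the native size, based on the long side.

    0 disables either bound; the floor is ignored when min_res > max_res.
    """
    long_side = max(width, height)
    if max_res and min_res > max_res:
        min_res = 0  # misconfiguration: skip the floor, keep the cap
    if max_res and long_side > max_res:
        scale = max_res / long_side
    elif min_res and long_side < min_res:
        scale = min_res / long_side
    else:
        return width, height
    return round(width * scale), round(height * scale)
```

With these assumptions the 381x512 portrait is diffused at 762x1024, and the caller resizes the result back to 381x512 afterwards.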

Both threaded through invisible/all/batch + the module-level helper. Verified
end-to-end on a 381x512 portrait: upscaled to 1024, sharpened, restored to
381x512.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 18:30:39 -07:00
Victor Kuznetsov 5ec8269949 chore: mark controlnet pipeline + GFPGAN restore-faces as experimental
Both content-preservation features are now flagged EXPERIMENTAL and opt-in.
--pipeline controlnet was already opt-in (--pipeline defaults to `default`); --restore-faces
flips from on-by-default to OFF by default, matching the repo's prior pattern
for experimental preservation passes (the removed protect_text/protect_faces).

- cli.py: --restore-faces/--no-restore-faces default False; EXPERIMENTAL in the
  --restore-faces / --controlnet-scale / --pipeline help; batch default False.
- invisible_engine.py: remove_watermark restore_faces default False + docstring.
- CLAUDE.md / README.md / docs/synthid.md: label both experimental/opt-in.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 16:59:28 -07:00
Victor Kuznetsov 411ef16ec3 feat: GFPGAN face-identity restoration post-pass
Add an optional, commercial-safe face-restoration post-pass that recovers
the face identity that drifts during the diffusion removal pass (canny holds
structure, not likeness) while still scrubbing the pixel watermark in the face
regions.

- face_restore.py: GFPGANer singleton (CPU unless CUDA), the basicsr
  torchvision.transforms.functional_tensor shim, and the pure feather
  _composite_faces helper (unit-tested without the model). GFPGAN
  re-synthesizes each face from a StyleGAN2 prior, so composited face pixels
  are GAN-generated (no watermark, no pixel-copy) -- oracle-clean at weight 0.5
  with identity preserved.
- InvisibleEngine.remove_watermark: restore_faces / restore_faces_weight,
  best-effort, auto-skips when the extra is absent or no face is detected.
- CLI --restore-faces/--no-restore-faces + --restore-faces-weight on
  invisible/all/batch (on by default).
- restore extra (gfpgan/facexlib/basicsr), numpy<2-pinned (scipy<1.18,
  numba<0.60) and kept out of `all`; basicsr needs Python <3.13 + setuptools<69
  to build, so pin .python-version 3.12.

Commercial-safe: GFPGAN Apache-2.0, RetinaFace MIT. The CodeFormer alternative
is non-commercial and is not shipped. The earlier IP-Adapter FaceID layer was
removed (footgun: needs high strength, corrupts faces at the low removal
strength).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 16:59:28 -07:00
Victor Kuznetsov d90d5d886a feat: controlnet pipeline for text/face-structure preservation
Add `--pipeline controlnet` (SDXL base + xinsir canny ControlNet via
StableDiffusionXLControlNetImg2ImgPipeline): the canny edge map conditions the
img2img regeneration so text and face STRUCTURE stay sharp, while the watermark
is still removed by the regeneration (`strength`) -- no original pixels are
copied or frozen, so SynthID does not survive. Oracle-verified clean on OpenAI
with better text/structure fidelity than plain img2img at equal strength.
`--controlnet-scale` tunes structure preservation; fp32 on mps/cpu (fp16-fixed
VAE on cuda/xpu). Shares the img2img runner (live progress + MPS->CPU fallback)
and the fp16-VAE-fix / device-move helpers with the default pipeline.

Remove the superseded subsystems -- ctrlregen (SD1.5 clean-noise),
text-protection (differential / region-hires) and face-protection: they either
destroyed real content or shielded the watermark by re-using original pixels.
controlnet replaces them by regenerating everything under edge conditioning.

Canny preserves face structure but not identity; face IDENTITY is a separate
face-restoration post-pass (CodeFormer/GFPGAN), researched + prototyped but not
yet shipped. An IP-Adapter FaceID attempt was built and removed (footgun: needs
high strength, corrupts faces at removal strength).

Docs: docs/controlnet-removal-pipeline-research.md, scripts/controlnet_sweep.py.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 16:59:28 -07:00
Victor Kuznetsov 175609b60a fix(gemini): rescue small corner sparkle buried by the size weight (#36)
detect_watermark's size-weighted global NCC search lets a larger, mediocre
match (e.g. a bright collar in a portrait) outrank a small, near-perfect
sparkle in the bottom-right corner, so a faint sparkle on a busy background
scored below threshold and the image read as clean -- the regression from
widening the search window 256px->512px between v0.7.2 and v0.8.8.

Add _corner_promote: a bottom-right-corner raw-NCC pass that overrides the
global pick when the corner holds a match with raw NCC >= 0.85 that beats it.
It only ever replaces a lower-fidelity pick (cannot weaken an existing
detection) and keeps the wider window for variant margins. The corner side is
relative-clamped (0.20 of the short side, [96, 384]) so it stays a true corner
at every scale: a fixed 256px covers ~70% of a small portrait, where a real
photo raw-matches the star at ~0.81; relative tightening drops that to ~0.69.
The 0.85 gate sits between the worst real-photo corner match (~0.78) and a
genuine faint sparkle (~0.93): zero false positives across native + downscaled
negatives, headshot rescued from below-threshold to 0.71.

Factor the shared multi-scale matchTemplate loop into _scan_scales.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 16:51:03 -07:00
Victor Kuznetsov 35116d5e97 chore(release): v0.8.9
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 19:04:32 -07:00
Victor Kuznetsov df0fafe94e fix(identify): stop flagging multi-actor C2PA manifests as integrity clashes
The C2PA issuer attribution (`c2pa`) and the SynthID proxy (`synthid`) are
derived from the same manifest, so treating them as independent signals made
rule 1 fire on legitimate multi-actor manifests where a product wraps another
vendor's engine (Microsoft Designer on OpenAI, Microsoft on Google) or an edit
chain re-signs (Adobe over a Gemini original). 19 such files in the
2026-06-01/02 spaces batches read as "likely spoofed/laundered" before this.

Group `c2pa` + `synthid` into one provenance source via `_CLASH_SOURCE`; rule 1
now requires two vendors from different sources. A manifest vendor still clashes
with a genuinely independent stamp (EXIF/XMP generator, IPTC AISystemUsed, AIGC,
xAI).
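
A rough sketch of the grouping, assuming vendor claims arrive as
(signal, vendor) pairs. The mapping contents, the "manifest" label, and
has_vendor_clash are illustrative only; only the name _CLASH_SOURCE and the
rule itself come from this commit.

```python
# Signals derived from the same C2PA manifest share one provenance source.
_CLASH_SOURCE = {"c2pa": "manifest", "synthid": "manifest"}


def has_vendor_clash(claims):
    """Rule 1 sketch: two different vendors claimed by two different sources."""
    vendors_by_source = {}
    for signal, vendor in claims:
        source = _CLASH_SOURCE.get(signal, signal)  # ungrouped signals stand alone
        vendors_by_source.setdefault(source, set()).add(vendor)

    groups = list(vendors_by_source.values())
    for i, vendors_a in enumerate(groups):
        for vendors_b in groups[i + 1:]:
            if any(a != b for a in vendors_a for b in vendors_b):
                return True
    return False


# Microsoft Designer wrapping OpenAI: one manifest, so no clash.
print(has_vendor_clash([("c2pa", "microsoft"), ("synthid", "openai")]))
```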

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 19:02:35 -07:00
Victor Kuznetsov 9cb66992bd chore(release): v0.8.8
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 09:18:02 -07:00
Victor Kuznetsov 9ca2811938 fix(gemini): inpaint sparkle footprint when reverse-alpha over-subtracts (#30)
On a dark/textured background (e.g. grass) the captured alpha map over-estimates
the real Gemini sparkle's effective opacity (~0.51 captured vs ~0.31 effective),
so the fixed-alpha reverse blend over-subtracts (watermarked - alpha*logo goes
negative) and drives the footprint to black -- the white sparkle turns into a
black diamond (issue #30, reported by @CoolZimo1).

remove_watermark now detects this via _reverse_alpha_oversubtracts (fraction of
footprint pixels with a negative numerator > 5%) and inpaints the small sparkle
footprint from the surrounding pixels (cv2 NS, cropped to a padded box) instead.
Behavior-neutral on the working case: a bright background over-subtracts at ~0%,
so reverse-alpha is used and the output is byte-identical to before (verified:
demo_banana 0.0 frac vs the issue-#30 grass image 0.61 frac; issue-#30 footprint
recovers to background grass with no pit, residual sparkle conf 0.25 < 0.35).
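
The over-subtraction check can be illustrated on flat pixel lists. This is a
sketch, not the shipped _reverse_alpha_oversubtracts: the footprint definition
(alpha > 0), the white logo value of 255, and the function names are
assumptions; only the negative-numerator test and the 5% threshold come from
this commit.

```python
def oversubtract_fraction(watermarked, alpha, logo=255.0):
    """Fraction of footprint pixels where watermarked - alpha*logo < 0."""
    footprint = [(w, a) for w, a in zip(watermarked, alpha) if a > 0]
    if not footprint:
        return 0.0
    negative = sum(1 for w, a in footprint if w - a * logo < 0)
    return negative / len(footprint)


def should_inpaint(watermarked, alpha, threshold=0.05):
    """Fall back to inpainting when reverse-alpha would drive pixels black."""
    return oversubtract_fraction(watermarked, alpha) > threshold


bright = [250.0, 250.0, 250.0, 250.0]   # bright background under the sparkle
dark = [80.0, 90.0, 200.0, 100.0]       # dark grass under the sparkle
print(should_inpaint(bright, [0.5, 0.5, 0.0, 0.5]),
      should_inpaint(dark, [0.51] * 4))
```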

Guard is scoped to GeminiEngine: doubao/jimeng already NCC-align their alpha to
the actual mark per image, which sidesteps the fixed-alpha mismatch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 09:17:32 -07:00
Victor Kuznetsov b25276c4f2 chore(release): v0.8.7
2026-06-01 19:33:08 -07:00
Victor Kuznetsov 96038f960f feat(invisible): vendor-adaptive default strength (OpenAI 0.10 / Google 0.15)
The default img2img strength is now chosen from the detected SynthID vendor
(C2PA issuer) instead of a single fixed 0.30: OpenAI gpt-image -> 0.10, Google
Gemini -> 0.15, unknown source -> 0.15. Explicit --strength always wins.

Basis: an oracle-verified June 2026 controlled study (clean v0.8.6, text/face
protection OFF, per-image openai.com/verify or Gemini-app verdict). OpenAI's
SynthID clears at 0.05 across 1024-1600 px (n=4, resolution-independent);
Google's is ~3x more robust and needs 0.15 on the capped-1536 path (n=4). The
dominant factor is the VENDOR, not resolution. The earlier single 0.30 default
and the "resolution dependence" lore came from contaminated tests run with the
protect-text bug ON (issue #14) -- re-running those same 1600x1600 images clean
removes SynthID at 0.05.

`vendor_for_strength(path)` reads metadata.synthid_source on the ORIGINAL input
and is threaded through cli (invisible/all/batch) -> invisible_engine ->
watermark_remover -> resolve_strength(strength, profile, vendor), so display and
execution use the same vendor (the engine sees a temp path whose C2PA the visible
pass already stripped, so detection must happen in the CLI on the pristine
source). Caveat: Google's 0.15 was validated only on --max-resolution 1536;
native 2816 Gemini was not locally measurable (OOM on Apple Silicon) and is
pending GPU validation on raiw.cc.
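
A simplified sketch of the vendor-adaptive default. The vendor keys and the
pick_strength name are hypothetical, and the profile argument of the real
resolve_strength(strength, profile, vendor) is left out; the strengths are the
ones stated in this commit.

```python
VENDOR_STRENGTH = {"openai": 0.10, "google": 0.15}
UNKNOWN_VENDOR_STRENGTH = 0.15


def pick_strength(strength, vendor):
    """Explicit --strength always wins; otherwise use the per-vendor default."""
    if strength is not None:
        return strength
    return VENDOR_STRENGTH.get(vendor, UNKNOWN_VENDOR_STRENGTH)


# The vendor must be detected on the ORIGINAL input, before the visible pass
# strips the C2PA manifest, and then passed down unchanged.
print(pick_strength(None, "openai"), pick_strength(0.30, "openai"))
```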

Docs: docs/synthid.md sections 2.2/4.4/5.2 corrected (the contaminated
resolution-dependence findings replaced with the clean oracle-verified table);
README and CLAUDE.md updated; CLI --strength help reflects the adaptive default.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 19:29:47 -07:00
Victor Kuznetsov 1708857772 fix(gemini): expand sparkle search area 256 -> 512px from corner
The 256px limit caused misses when Gemini places the sparkle further from the
corner than the standard 160px (margin 64 + logo 96). Observed variant at ~300px
reported in issue #30. 512px covers all known Gemini margin variations with room
to spare; matchTemplate on a 512x512 region is still fast on CPU.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 10:42:04 -07:00
Victor Kuznetsov 25cc4750df chore(release): v0.8.5
2026-06-01 10:31:59 -07:00
Victor Kuznetsov 4b0b370ac0 fix(invisible): disable protect-text/protect-faces by default; add docs/synthid.md
Both text and face protection were shielding SynthID from removal. The
text-protection high-res re-scrub regenerates pixels at an upscaled resolution
where the per-region pass may not be strong enough to re-destroy the SynthID
payload, allowing it to survive in text areas. Face protection has an even more
direct mechanism: it pastes back the original (pre-diffusion, watermarked) face
pixels after the global pass, guaranteeing SynthID survives in face regions
regardless of strength.

Both --protect-text and --protect-faces are now off by default and opt-in.
Rename from --no-protect-text / --no-protect-faces to --protect-text /
--protect-faces. Extract shared click.option decorators to module-level
constants (_protect_text_option, _protect_faces_option) to eliminate
copy-paste between cmd_invisible and cmd_all.

Add docs/synthid.md: primary-source-cited technical reference for SynthID-Image
covering mechanism (post-hoc encoder/decoder, 136-bit payload, pixel-space, no
model-weight modification), robustness numbers (arXiv:2510.09263: ~99.98% TPR
at 0.1% FPR across 30 transforms), removal attacks and forensic detectability
(arXiv:2605.09203: all 6 attacks detectable >98% TPR@1%FPR), detectability
limits, oracle scope, adoption landscape, and practical implications including
the protect-text/faces SynthID-preservation finding.

Verified June 2026 on gpt-image 1600x1600 via openai.com/verify: with
--protect-text SynthID detected; without, SynthID removed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 10:28:34 -07:00
Victor Kuznetsov 72812de03c chore(release): v0.8.4
2026-05-31 20:46:52 -07:00
Victor Kuznetsov e501bec9ff feat(identify): detect visible Doubao/Jimeng marks; keep identify import torch-free
identify previously ran only the Gemini sparkle as a visible detector, so a
Doubao/Jimeng image with stripped TC260 metadata had no visible fallback. Add
`_visible_text_marks` (registry-backed) so the ByteDance Doubao 豆包AI生成 and
Jimeng 即梦AI marks are detected too, each gated by its own engine NCC threshold
via MarkDetection.detected. New signals `visible_doubao` / `visible_jimeng`
(medium), same stripped-metadata fallback role as the sparkle; excluded from
integrity-clash vendor claims; set platform only when no harder signal did.

Also make `noai/__init__` lazy (PEP 562 __getattr__): importing the light
`noai.c2pa` / `noai.constants` submodules (which identify needs) no longer
eagerly pulls `watermark_remover`, which imports torch + diffusers at module
top. `import remove_ai_watermarks.identify` drops from ~420 MB to ~21 MB in a
full gpu/detect install (torch not loaded), so it fits a 512 MB host; the
removal API resolves lazily on first access. Guarded by TestIdentifyImportIsLight.
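
For readers unfamiliar with PEP 562, the lazy-package pattern looks roughly
like this. The sketch builds a throwaway module object so it runs standalone;
in a real package the hook is simply `def __getattr__(name)` in `__init__.py`.
The _LAZY table and the use of `json` as a stand-in for the heavy
torch-importing submodule are assumptions, not the noai code.

```python
import importlib
import types

# Hypothetical table: public attribute -> module that really defines it.
_LAZY = {"dumps": "json"}

pkg = types.ModuleType("lazy_pkg")


def _lazy_getattr(name):
    """PEP 562 hook: runs only when normal attribute lookup on the module fails."""
    if name in _LAZY:
        value = getattr(importlib.import_module(_LAZY[name]), name)
        setattr(pkg, name, value)  # cache so the hook is not hit again
        return value
    raise AttributeError(f"module {pkg.__name__!r} has no attribute {name!r}")


# Equivalent to defining __getattr__ at module top level.
pkg.__getattr__ = _lazy_getattr
```

Nothing is imported until the first `pkg.dumps` access, which is why light
submodules can be imported without paying for the heavy one.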

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 20:43:52 -07:00
Victor Kuznetsov c155f81078 chore(release): v0.8.3
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 17:41:10 -07:00
Victor Kuznetsov b991b11a19 docs(synthid): correct protect_text guidance -- it does NOT block removal (keep ON)
An A/B at strength 0.3 on a real e-commerce infographic (updated GPU study)
reverses the earlier claim: SynthID is a GLOBAL watermark, so 0.3 removes it
whether protect_text is on or off, and protection SALVAGES text fidelity (medium
headings/body stay readable; off, they garble). The earlier 'protect_text shields
the watermark, use --no-protect-text' was wrong -- it mistook the 0.10 strength
failure for a protection effect. Recommended SynthID config: ~0.3 + protect_text ON
(the default). Also document the oracle scope: the Gemini app 'Verify with SynthID'
is the only valid SynthID oracle; openai.com/verify is provenance-scoped (C2PA) and
does NOT measure SynthID. Corrects CLAUDE.md + README + watermark_profiles comment
shipped in cddbaf6.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 16:50:13 -07:00
Victor Kuznetsov cddbaf6413 fix(invisible): raise default strength 0.10 -> 0.30 (current SynthID threshold); flag ctrlregen experimental
An oracle-verified GPU strength study (Modal A100, native res, Gemini-app
'Verify with SynthID', n=3 fresh Gemini images, protect_text/faces off) found the
current Google SynthID survives strength 0.10/0.15/0.2 and is removed only at 0.3.
The previous 0.10 default (set from an n=1 result) no longer clears it -- Google
hardened SynthID and the threshold has climbed 0.05 -> 0.10 -> ~0.3. Bump
DEFAULT_STRENGTH to 0.30; OpenAI/ChatGPT carry C2PA not SynthID, so 0.10 is plenty
there (pass --strength 0.10). Note protect_text shields the text regions SynthID
hides in (use --no-protect-text for full removal on text-heavy images).

The same study found ctrlregen at clean-noise strength DESTROYS real images
(hallucinated micro-text in smooth regions), with no usable middle setting, so the
literature's 'clean-noise is the lever' did not hold empirically. Flag ctrlregen
EXPERIMENTAL in the CLI --pipeline help, README, and watermark_profiles; SDXL
img2img at ~0.3 stays the shippable path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 16:38:49 -07:00
Victor Kuznetsov 729f5f2ecd chore(release): v0.8.2
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 15:46:47 -07:00
Victor Kuznetsov f16216cabc feat(cli): add --no-protect-faces to invisible/all (skip the YOLO face detector)
Mirrors --no-protect-text: when the image has no people, skip loading and
running the YOLO face detector entirely. The heavy extract+blend already only
ran when a face was found, but the detector itself was always loaded and run
to make that decision; this flag lets callers skip that fixed cost.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 15:27:14 -07:00
Victor Kuznetsov e42b7e9d6a refactor(cli): plain-text console output; drop rich; quiet transformers
cli.py now emits plain ASCII through a small click.echo shim
(_Console / _Table / _Progress) instead of rich: no colors, markup tags,
panels, progress bar, or Unicode glyphs (glyphs become "Warning:", "->", and
"..."; checkmark/cross marks are dropped). identify and metadata tables render
as indented plain lines.

- drop rich from dependencies (pyproject.toml + uv.lock)
- __init__: set TRANSFORMERS_VERBOSITY=error (setdefault) plus a warnings
  filter so the transformers Siglip2ImageProcessorFast deprecation no
  longer prints at CLI startup (it fires from the eager noai import)
- TestGpuHintMarkup: the [gpu] hint is now printed verbatim; docstring updated
- CLAUDE.md: replace the obsolete rich-markup lesson, note the verbosity fix

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 15:21:29 -07:00
Victor Kuznetsov 2d49c3cb58 fix(invisible): ctrlregen defaults to clean-noise strength, not the SDXL 0.10
The ctrlregen profile inherited the SDXL img2img --strength default (0.10), a
near-identity pass that loaded ControlNet + DINOv2-giant and barely changed the
image -- a no-op for removal. resolve_strength() now resolves an unset strength
per profile: 0.10 for the SDXL default, CTRLREGEN_DEFAULT_STRENGTH (1.0,
clean-noise) for ctrlregen. It checks `is None` rather than falsiness, so an
explicit 0.0 is respected (the old `strength or DEFAULT` swallowed it).
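
The falsiness bug is easy to reproduce in isolation. The constants below match
the values in this commit; the two-argument signatures are simplified for
illustration.

```python
DEFAULT_STRENGTH = 0.10
CTRLREGEN_DEFAULT_STRENGTH = 1.0


def resolve_strength_old(strength):
    # Old behavior: `or` treats an explicit 0.0 as "unset".
    return strength or DEFAULT_STRENGTH


def resolve_strength(strength, profile="default"):
    # New behavior: only None means "unset"; the default depends on the profile.
    if strength is None:
        if profile == "ctrlregen":
            return CTRLREGEN_DEFAULT_STRENGTH
        return DEFAULT_STRENGTH
    return strength


print(resolve_strength_old(0.0), resolve_strength(0.0))
```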

Research basis: CtrlRegen (ICLR 2025, arXiv:2410.05470) removes robust
watermarks by regenerating from clean Gaussian noise; partial-noise img2img
retains watermark info that diffuses back, so a high (clean-noise) strength is
the lever, not a knob on the light SDXL pass. CLI wiring (--strength default
None) lands with the cli refactor.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 15:07:19 -07:00
Victor Kuznetsov 33bd401e2a fix(visible): guard remove_watermark_reverse_alpha on tiny images too
The previous commit guarded extract_mask, but the 2048x1 crash was
actually in _fixed_alpha_map's cv2.resize to a ~1-px-tall target (Windows:
"Unknown C++ exception" / access violation). Return image.copy() up front
when h < 32 or w < 64 (no real watermarked image is that small), before any
cv2 call. Same guard in both Doubao and Jimeng.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 14:00:52 -07:00
Victor Kuznetsov 7167d2bae7 fix(visible): guard extract_mask against degenerate ROIs (Windows CI crash)
The always-align removal scores each placement with a residual detect(),
which on an extremely wide/short image (2048x1, test_wide_short_does_not_raise)
fed cv2 GaussianBlur a ~1-px-tall ROI and faulted natively on Windows py3.12
(access violation, non-deterministic -- one CI cell went red, a re-run passed).
The old at-native path never ran detect() on degenerate sizes. Skip the cv2
pipeline and return an empty mask when bh < 16 or bw < 16; real images always
clear the guard (the WM_* box floors are max(16,..) / max(40,..)). Same fix in
both Doubao and Jimeng. Also sync the stale Doubao module docstring.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 13:25:57 -07:00
Victor Kuznetsov e7c57e3892 chore(release): v0.8.1 — exclude data/ from sdist
The 0.8.0 PyPI publish uploaded the wheel but the sdist was rejected
(400 File too large): hatchling's default sdist bundled the committed
data/ test corpora (synthid_corpus images + the new visible-mark
captures), pushing it past PyPI's per-project file-size limit. Add a
sdist target that excludes /data, dropping it ~85 MB -> 9.8 MB. The
wheel already ships only src/ and is unaffected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 12:57:46 -07:00
Victor Kuznetsov 315320056b chore(release): v0.8.0
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 12:25:02 -07:00
Victor Kuznetsov e572767555 feat(visible): add Jimeng remover, fix Doubao outline defect, reproducible mask build
Visible-watermark work across all three corner-mark engines plus a committed,
reproducible alpha-build pipeline (scripts/visible_alpha_solve.py) fed by committed
solid black/gray/white captures.

- jimeng: new "即梦AI" wordmark remover (reverse-alpha + thin residual inpaint,
  always NCC-aligned -- the mark re-rasterizes/jitters per image). Detect via glyph
  silhouette NCC (0.45 threshold; does not cross-fire with Doubao). Registered in the
  visible-mark catalog; `visible --mark jimeng` / `--mark auto`.
- doubao: fix a real production defect -- the shipped remover left a READABLE
  "豆包AI生成" outline on real samples while detect() returned conf 0.0 (fooled by a
  thin outline), so the test passed and the "56/56 clean" claim was detector-measured,
  not visual. Root cause: under-estimated alpha + fixed-geometry-no-inpaint + tight
  locate box. Rebuilt alpha (careful gray-self solve), always-align, thin inpaint,
  widened locate box -> readable outline becomes faint texture-level traces.
- gemini: rebuild gemini_bg_{96,48} from our own controlled captures (validated NCC
  0.9998 vs the prior third-party asset); removal re-verified clean, no behaviour change.
- tests: add textured-shift regression to both engines (guards the align-on-shift path
  the Doubao defect exposed; lesson: a detector-only removal test is insufficient,
  assert visual residual).
- docs: CLAUDE.md, README, capture READMEs and docstrings synced; stale
  "exact/pixel-exact/56-clean" claims removed.

Also includes a SynthID label-wording clarification in identify.py/cli.py
("SynthID pixel watermark" -> "SynthID watermark, inferred from C2PA metadata").

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 12:20:19 -07:00
Victor Kuznetsov 5d0e6c3a65 fix: harden metadata parsers and engines; sync docs (full-repo review)
Apply fixes from a full-repo review (code, tests, docs).

Security / correctness:
- Clamp attacker-controlled PNG/caBX chunk lengths to the remaining file
  size in metadata.py and noai/c2pa.py (a malformed length no longer drives
  a multi-GB read); skipped chunks seek instead of read.
- noai/isobmff.strip_c2pa_boxes is now fail-safe on a malformed box: return
  the original bytes with a warning instead of silently truncating the tail,
  so metadata --remove can no longer emit a corrupt file.
- doubao_engine._fixed_alpha_map clamps the glyph box to the image (no crash
  on degenerate width-vs-height).
- watermark_remover._run_region_hires gates the phaseCorrelate offset on
  response and magnitude (a spurious shift no longer garbles text) and drops
  the generator after a CPU fallback (no MPS/CPU device mismatch).
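
As an illustration of the length clamp in the first item, here is a minimal
PNG chunk walker. It is a sketch under assumptions, not the metadata.py code:
iter_png_chunks is a hypothetical name, and CRCs are skipped rather than
verified.

```python
import io
import struct


def iter_png_chunks(stream, file_size):
    """Yield (type, data_offset, length), clamping declared lengths to the file."""
    stream.seek(8)  # skip the 8-byte PNG signature
    while True:
        header = stream.read(8)  # 4-byte big-endian length + 4-byte type
        if len(header) < 8:
            return
        declared, chunk_type = struct.unpack(">I4s", header)
        remaining = max(0, file_size - stream.tell())
        length = min(declared, remaining)  # a forged length cannot exceed the file
        yield chunk_type, stream.tell(), length
        stream.seek(length + 4, io.SEEK_CUR)  # seek past data + CRC, no read


# A chunk that claims ~4 GB of data but only has 10 bytes behind it.
blob = b"\x89PNG\r\n\x1a\n" + struct.pack(">I4s", 0xFFFFFFFF, b"tEXt") + b"x" * 10
print(list(iter_png_chunks(io.BytesIO(blob), len(blob))))
```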

Robustness:
- gemini_engine, doubao_engine, region_eraser normalize grayscale and RGBA
  inputs to BGR at the engine entry points.
- image_io.imwrite returns False on an unwritable path (matches cv2).
- invisible_engine guards a None imread result before use.
- trustmark_detector._decoder uses a double-checked threading lock.
- ctrlregen.tiling.tile_positions raises on overlap >= tile.
- humanizer chromatic shift no longer wraps opposite-edge pixels.
- identify OpenAI caveat keyed on the normalized vendor, not a substring.
- Remove the dead "visible --detect-threshold" CLI option.
- publish.yml verifies the release tag matches the package version.
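
The double-checked lock mentioned for trustmark_detector._decoder follows the
usual singleton shape. A generic sketch, with _build_decoder as a hypothetical
stand-in for the expensive model load:

```python
import threading

_decoder = None
_decoder_lock = threading.Lock()
build_count = 0


def _build_decoder():
    """Stand-in for an expensive one-time construction."""
    global build_count
    build_count += 1
    return object()


def get_decoder():
    global _decoder
    if _decoder is None:              # fast path: no lock once initialized
        with _decoder_lock:
            if _decoder is None:      # re-check: another thread may have won
                _decoder = _build_decoder()
    return _decoder


results = []
threads = [threading.Thread(target=lambda: results.append(get_decoder()))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All eight threads end up with the same object and the builder runs once.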

Docs:
- README strength 0.05 to 0.10; .env.example HF_TOKEN marked optional;
  doubao_capture README updated to reverse-alpha-only; CLAUDE.md synced with
  the new behaviors and the batch command.

Tests: new test_security_clamp.py for the read clamp and isobmff fail-safe;
erase CLI coverage; integrity-clash rule 2 end-to-end; multi-tag EXIF
survival and cross-format strip guards; channel/size, tiling, humanizer, and
imwrite regressions. Full suite 493 passed, 2 skipped; ruff and pyright src/
clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 18:00:39 -07:00
Victor Kuznetsov 5298dcc6a3 chore(release): v0.7.2
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 14:35:04 -07:00