Commit Graph

256 Commits

Author SHA1 Message Date
Victor Kuznetsov c1971a3e8d feat(invisible): region-targeted regeneration for AI-enhanced composites
For AI-enhanced composites (digitalSourceType compositeWithTrainedAlgorithmicMedia,
identify ai_source_kind == "enhanced"; roadmap P1#8): regenerate ONLY the AI
region and preserve the real photo elsewhere, instead of regenerating the whole
frame.

- noai.tiling.feather_region_composite(base, regenerated, box, *, feather): pure,
  model-free compositor that blends the regenerated AI box back over the original
  with a feathered seam, leaving pixels OUTSIDE the box exactly equal to base.
  Fully unit-tested (outside-box exactness, interior == regenerated, hard paste at
  feather 0, monotonic seam ramp, dtype/grayscale/clamp/empty-box/shape-mismatch).
- WatermarkRemover.remove_watermark(region=, region_feather=) and the module-level
  convenience function thread it through: the remover regenerates (or tiles) the
  frame, then composites only the AI box back over the original input. The box is
  caller-supplied -- a C2PA composite manifest carries no reliable machine-readable
  region, so none is fabricated. The no-model lossless region path stays
  region_eraser.erase.
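
A minimal sketch of the compositing contract described above, not the project's implementation. The function name, the (x0, y0, x1, y1) box convention with exclusive right/bottom edges, and the linear ramp are assumptions made for illustration.

```python
import numpy as np


def feather_region_composite_sketch(base, regenerated, box, *, feather=0):
    """Blend `regenerated` over `base` inside `box` only (illustrative sketch).

    `box` is assumed to be (x0, y0, x1, y1) with exclusive right/bottom edges.
    """
    if base.shape != regenerated.shape:
        raise ValueError("base and regenerated must have the same shape")
    h, w = base.shape[:2]
    x0, y0, x1, y1 = box
    x0, y0, x1, y1 = max(0, x0), max(0, y0), min(w, x1), min(h, y1)
    out = base.copy()  # pixels outside the box are never written again
    if x1 <= x0 or y1 <= y0:
        return out  # empty box: nothing to composite

    bh, bw = y1 - y0, x1 - x0
    # 1-based distance of each row/column to the nearest box edge.
    dy = np.minimum(np.arange(bh), np.arange(bh)[::-1]) + 1
    dx = np.minimum(np.arange(bw), np.arange(bw)[::-1]) + 1
    dist = np.minimum(dy[:, None], dx[None, :]).astype(np.float32)
    # Linear ramp: weight reaches 1.0 once a pixel is more than `feather` px inside.
    weight = np.clip(dist / (max(feather, 0) + 1), 0.0, 1.0)
    if base.ndim == 3:
        weight = weight[..., None]

    b = base[y0:y1, x0:x1].astype(np.float32)
    r = regenerated[y0:y1, x0:x1].astype(np.float32)
    blended = b + weight * (r - b)
    if np.issubdtype(base.dtype, np.integer):
        info = np.iinfo(base.dtype)
        blended = np.clip(np.rint(blended), info.min, info.max)
    out[y0:y1, x0:x1] = blended.astype(base.dtype)
    return out
```

With feather=0 the weight is 1.0 everywhere inside the box, which gives the hard paste; with a positive feather the seam ramps monotonically from the box edge inward.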

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 15:34:39 -07:00
Victor Kuznetsov 33fddbc6fa fix(visible): over-subtraction guard for Doubao/Jimeng/Samsung text marks
Port the Gemini sparkle dark-pit guard (commit 41f6797) to the shared
TextMarkEngine reverse-alpha base (roadmap P0#8): on a dark or mid-tone
background the captured alpha can over-estimate this image's mark opacity, and
reverse-alpha leaves a darker-than-background glyph ghost instead of recovering
the true pixels. The sparkle-only fix left the text marks unhandled.

_reverse_alpha_oversubtracts predicts the reverse-alpha output PER PIXEL over the
glyph body from the INPUT ((obs - a*logo)/(1-a), the remover's own math); when
the predicted body lands more than _OVERSUB_DARK_MARGIN (25) gray levels below
the local background ring it abandons the reverse-alpha output for the footprint
and inpaints it from the original surroundings (_inpaint_footprint, wider dilate/
radius than the thin residual pass). Predicting per-pixel from the input (not the
produced output, which depends on which placement the remover picked) keeps a
cleanly captured full-strength mark byte-identical -- it predicts back to the
background everywhere, so the guard never trips on it (verified across all three
engines on white/mid/dark/midgray backgrounds).
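
The prediction step can be sketched as follows. Only the formula (obs - a*logo)/(1-a) and the margin of 25 come from the message above; the function name, the median aggregate, and the alpha cutoff that defines the glyph body are assumptions for illustration.

```python
import numpy as np

OVERSUB_DARK_MARGIN = 25  # gray levels, as quoted in the commit message


def predicts_oversubtraction(obs, alpha, logo, background_level,
                             margin=OVERSUB_DARK_MARGIN, body_alpha=0.05):
    """Return True when reverse-alpha is predicted to leave a dark ghost.

    obs              -- observed gray values over the mark footprint
    alpha            -- captured alpha map for the same footprint
    logo             -- gray level of the mark itself (e.g. 255 for white)
    background_level -- gray level of the local background ring
    """
    obs = np.asarray(obs, dtype=np.float32)
    alpha = np.asarray(alpha, dtype=np.float32)
    body = alpha > body_alpha  # assumed definition of the glyph body
    if not body.any():
        return False
    a = np.minimum(alpha[body], 0.99)  # keep the division well-defined
    predicted = (obs[body] - a * logo) / (1.0 - a)
    # Aggregating with the median is an assumption of this sketch.
    return float(np.median(predicted)) < background_level - margin
```

A mark blended at the captured opacity predicts back to the background, so the check stays False; a mark that is fainter than the captured alpha predicts far below the background and trips it.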

Regression-guarded by tests/test_text_mark_oversubtraction.py: predicate True on
faint / False on clean, end-to-end no-dark-pit acceptance, clean-mark byte
identity, and textured-background footprint recovery.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 15:34:39 -07:00
Victor Kuznetsov 0c215b5b2f feat(identify): C2PA vendor coverage, AI-enhanced split, detect/remove threshold unify
Retained-corpus mining (2026-06-20) surfaced three provenance gaps; all are
oracle-free and regression-guarded.

- C2PA vendor coverage (roadmap): register Volcano Engine under its Chinese
  legal entity 北京火山引擎科技有限公司 (the Latin "volcengine" needle misses
  those certs) -> normalizes to the same ByteDance platform; register ElevenLabs
  ("Eleven Labs Inc.", pure generative-AI) as a generator. Document the
  deliberate exclusion of TikTok Inc. and PixelBin.io/"Fynd" (provenance/transform
  signers, not generators) so they are not re-added.

- AI-generated vs AI-enhanced (roadmap): ProvenanceReport.ai_source_kind splits
  the C2PA digital-source-type into "generated" (trainedAlgorithmicMedia) vs
  "enhanced" (compositeWithTrainedAlgorithmicMedia) so a caller branches a
  full-frame scrub from a region-targeted clean. Parsed once in
  noai.c2pa._populate_registry_fields (PNG + any c2pa-python-readable container),
  with a raw head-scan fallback in identify for the non-PNG raw-blob path. CLI
  verdict reads "AI-generated (fully synthetic)" vs "AI-enhanced (real content
  with an AI-composited region)"; surfaced in --json.

- Detect-vs-remove threshold desync (P0#7): identify's sparkle threshold and the
  removal arbitration gate were two independent 0.5 constants. Unify them into the
  single GEMINI_SPARKLE_TRUST_CONF (identify imports it) so they can never drift.
  Lowering the gate to recover faint sub-0.5 sparkles was evaluated and REJECTED:
  a real Doubao text mark scores ~0.40-0.42 as a gemini match with a higher
  core-ring brightness margin than a genuine faint sparkle, so neither confidence
  nor the brightness gate separates them in [0.35, 0.5) -- lowering would trade a
  rare miss for false-positive removals on clean images. Regression-guarded by
  TestSparkleDetectRemoveAlignment (real demo sparkle at borderline opacities;
  identify and best_auto_mark must agree on either side of the line).
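
The generated-vs-enhanced split can be illustrated with a small mapping. This is a sketch with a hypothetical function name, not the code in noai.c2pa; it assumes the source type arrives either as a bare term or as an IPTC digital-source-type URI ending in that term.

```python
from typing import Optional

# Mapping taken from the bullet above: term -> ai_source_kind value.
_KIND_BY_SOURCE_TYPE = {
    "trainedAlgorithmicMedia": "generated",
    "compositeWithTrainedAlgorithmicMedia": "enhanced",
}


def ai_source_kind_sketch(digital_source_type: Optional[str]) -> Optional[str]:
    """Map a C2PA digital-source-type value to "generated"/"enhanced"/None."""
    if not digital_source_type:
        return None
    # Compare the last path segment exactly, so the composite term is never
    # confused with the plain one by substring matching.
    term = digital_source_type.rstrip("/").rsplit("/", 1)[-1]
    return _KIND_BY_SOURCE_TYPE.get(term)
```

A caller could then branch on the result: "generated" for a full-frame scrub, "enhanced" for a region-targeted clean.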

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 15:34:20 -07:00
Victor Kuznetsov 373b910a60 docs: fix the qwen-vs-controlnet face comparison to oracle-confirmed scrub floors
The face fidelity numbers cited an equal-strength compare (both 0.15), but Qwen at
0.15 does NOT clear Gemini SynthID -- so that output is un-scrubbed and the compare
is invalid. Per the methodology rule (compare fidelity only between outputs where
SynthID is removed in BOTH), restate faces at each pipeline's scrub floor
(controlnet 0.15 / Qwen 0.30): ArcFace identity 0.546 vs 0.331, lapvar 0.62 vs 0.40,
face LPIPS 0.09 vs 0.19 -- controlnet still wins faces, conclusion unchanged. Drop
the "equal strength" framing in CLAUDE.md / module-internals / known-limitations.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 14:33:11 -07:00
Victor Kuznetsov 2d5b26ed18 test(eval): vision-transcribed ground truth for qwen_in + clean text-CER numbers
data/qwen_in/ground_truth.json is transcribed by vision (PaddleOCR mangled the
stylized Cyrillic), so the text metric scores variants against an accurate
reference instead of noisy OCR-vs-OCR. Re-measured text CER (controlnet vs qwen)
with this ground truth confirms qwen wins text across EN/RU/ZH: openai_1 0.385 vs
0.241, openai_2 0.341 vs 0.290, gemini_1 (ZH) 0.037 vs 0.000 (perfect Chinese even
at the higher 0.30 strength). Faces still favor controlnet. Refresh the numbers in
docs/known-limitations.md to this cleaner methodology.
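
For reference, character error rate is conventionally the edit distance between the OCR output and the reference, divided by the reference length. The sketch below shows that definition in plain Python; it is not taken from scripts/fidelity_metrics.py, and any text normalization the script applies is not modeled here.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert/delete/substitute) via a rolling DP row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate of `hypothesis` against a ground-truth reference."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return levenshtein(reference, hypothesis) / len(reference)
```

Scoring each pipeline's OCR read against one fixed transcription keeps OCR noise on the reference side out of the comparison.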

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 14:26:23 -07:00
Victor Kuznetsov e29c156279 test(eval): fix the qwen_in pipeline-fidelity eval set + PaddleOCR ground-truth flow
- data/qwen_in/: a stable, committed set of 4 AI-generated images (OpenAI +
  Google, carrying SynthID/C2PA -- same class as data/samples fixtures) used to
  compare the controlnet/sdxl/qwen pipelines for fidelity. Two multi-script text
  images (incl. RU/CJK), one EN poster, one face grid. README documents the set + the
  ground-truth workflow. data/ is sdist-excluded so the wheel is unaffected.
- scripts/fidelity_metrics.py: switch text OCR from EasyOCR to PaddleOCR
  (PP-OCRv6, higher accuracy esp. CJK, single multilingual stack); split into
  `ocr` (seed a {basename: text} ground truth) and `compare` (--ground-truth for
  a clean CER vs the hand-verified reference instead of noisy OCR-vs-OCR). Spatial
  IoU-NMS keeps the best-scoring read per line so wrong-script models don't inject
  garbage over Cyrillic/CJK.
- Oracle methodology: validate the OpenAI arm FIRST (openai.com/verify is more
  accessible and the strongest Playwright/Chrome-MCP automation candidate; the
  Gemini app is more manual). Recorded in CLAUDE.md + docs/synthid.md.
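
The spatial IoU-NMS step mentioned in the list above can be sketched as a greedy pass over OCR reads. The read format (a dict with "box", "text", "score"), the function names, and the 0.5 threshold are assumptions for illustration, not the script's actual data model.

```python
def iou(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def keep_best_reads(reads, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring read per overlapping text line."""
    kept = []
    for read in sorted(reads, key=lambda r: r["score"], reverse=True):
        if all(iou(read["box"], k["box"]) < iou_threshold for k in kept):
            kept.append(read)
    return kept
```

When two recognizers read the same line, the lower-scoring (typically wrong-script) read overlaps the better one and is dropped.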

Ground-truth JSON (data/qwen_in/ground_truth.json) lands in a follow-up once
hand-verified.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 14:17:04 -07:00
Victor Kuznetsov a2c33af284 feat(scripts): fidelity_metrics.py + correct the qwen-vs-controlnet claim
Add scripts/fidelity_metrics.py: an objective eval harness comparing
watermark-removal outputs against the original (reference) across four groups
-- OCR character error rate (EasyOCR), ArcFace identity cosine (insightface),
face texture (LPIPS + Laplacian-variance ratio), and whole-image LPIPS/SSIM/
PSNR. PEP 723 inline deps so it stays out of the package / uv.lock; metrics
self-gate (faces only where faces, text only where text).
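
Two of the simpler metrics named above can be written with NumPy alone. This is a sketch of the standard definitions (PSNR, and a Laplacian-variance retention ratio as a texture proxy), with hypothetical function names; the script's own preprocessing, face cropping, and kernel choice are not shown in this log.

```python
import numpy as np


def psnr(reference, output, max_value=255.0):
    """Peak signal-to-noise ratio in dB (infinite for identical images)."""
    diff = reference.astype(np.float64) - output.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10 * np.log10(max_value ** 2 / mse))


def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over the image interior."""
    g = gray.astype(np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())


def texture_retention(reference_gray, output_gray):
    """Ratio below 1.0 means the output is smoother than the reference."""
    ref = laplacian_variance(reference_gray)
    return laplacian_variance(output_gray) / ref if ref > 0 else float("nan")
```

Under this definition a retention of 0.62 versus 0.41 reads as the first output keeping more of the original high-frequency texture.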

The metrics overturned an eyeball conclusion: at EQUAL strength Qwen beats
controlnet on TEXT (OpenAI typography 0.10: OCR CER 0.25 vs 0.37) but controlnet
beats Qwen on FACES (gemini_3, 18 faces, 0.15 each: Laplacian-variance retention
0.62 vs 0.41, face LPIPS 0.09 vs 0.13 -- Qwen smooths faces MORE; ArcFace
identity ~tied). So Qwen is the better TEXT-preserving remover, not a universal
fidelity win. Correct the earlier "qwen keeps faces faithful where controlnet
plasticizes" claim in CLAUDE.md, module-internals.md, known-limitations.md, README.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 09:58:22 -07:00
Victor Kuznetsov 76e3d4154c feat(invisible): add Qwen-Image img2img pipeline (--pipeline qwen)
A third diffusion pipeline alongside sdxl/controlnet: Qwen-Image (20B MMDiT,
Apache-2.0 code AND weights) img2img. The scrub still comes from the img2img
strength; Qwen preserves text (incl. CJK) and structure markedly better than
SDXL at the scrub floor, so it over-regenerates real photos far less (directly
targets the controlnet over-regeneration that degrades real uploads).

- watermark_profiles: QWEN_MODEL_ID, normalize_profile accepts "qwen".
- WatermarkRemover: _load_qwen_pipeline (bf16, loads Qwen base unless --model
  overridden, clear ImportError if diffusers lacks the class), _run_qwen (no
  MPS fallback -- 20B is CUDA/cloud-class), dispatch in _generate_one/preload,
  pure _build_qwen_kwargs (true_cfg_scale, not guidance_scale).
- Shared _base_load_kwargs() across all three loaders (dtype + token).
- CLI --pipeline gains "qwen"; invisible_engine threads it through.
- scripts/qwen_scrub_prototype.py: standalone PEP 723 GPU experiment.

Prototype oracle floors (Modal A100-80GB, single seed, controls SynthID-positive,
PENDING seed-repeat cert): OpenAI clears at strength ~0.10, Gemini at ~0.30 (0.20
still detected), with CJK text + faces faithful where controlnet plasticizes. The
Gemini floor is higher than the shared default ladder, so pass an explicit
--strength for Gemini on this pipeline until a Qwen-specific ladder is certified.

The model-running path is CUDA-only (untestable locally); unit tests cover the
pure call-shape (_build_qwen_kwargs) and profile normalization without torch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 20:44:36 -07:00
Victor Kuznetsov 0c0c6c6b03 feat(invisible): sliding-window tiled diffusion for large inputs (--tile)
Add a lossless alternative to the --max-resolution downscale for large
images that OOM on MPS/GPU: regenerate in overlapping, feather-blended
tiles at native resolution.

- noai/tiling.py: pure plan_tiles (uniform tiles, last flush to edge) +
  feather_weights (strictly-positive separable taper -> partition-of-unity
  blend) + run_tiled (per-tile generate callable, decoupled from the
  pipeline). Unit-tested without the model.
- WatermarkRemover.remove_watermark: refactor _generate into _generate_one
  + a tiled branch that engages only when --tile is set and the long side
  exceeds tile_size (ControlNet canny is rebuilt per tile).
- Thread tile/tile_size/tile_overlap through InvisibleEngine and the
  invisible/all/batch CLI commands via a shared _tile_options decorator.
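
The planning and blending idea in the list above can be sketched as below. The function names, signatures, and the linear taper are assumptions for illustration and are not the contents of noai/tiling.py; `generate` stands in for the per-tile model call.

```python
import numpy as np


def plan_tiles_1d(length, tile, overlap):
    """Uniform (start, stop) spans along one axis; the last is flush to the edge."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    if tile >= length:
        return [(0, length)]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride)) + [length - tile]
    return [(s, s + tile) for s in starts]


def feather_weights_1d(n, overlap):
    """Strictly positive taper: ramps up over the overlap, flat at 1.0 inside."""
    ramp = np.minimum(np.arange(n), np.arange(n)[::-1]) + 1.0
    return np.minimum(ramp / (overlap + 1), 1.0)


def run_tiled_sketch(image, generate, tile, overlap):
    """Run `generate` per tile and blend with normalized feather weights."""
    h, w = image.shape[:2]
    acc = np.zeros(image.shape, dtype=np.float64)
    norm = np.zeros((h, w), dtype=np.float64)
    for y0, y1 in plan_tiles_1d(h, tile, overlap):
        for x0, x1 in plan_tiles_1d(w, tile, overlap):
            wgt = (feather_weights_1d(y1 - y0, overlap)[:, None]
                   * feather_weights_1d(x1 - x0, overlap)[None, :])
            out = generate(image[y0:y1, x0:x1]).astype(np.float64)
            acc[y0:y1, x0:x1] += out * (wgt[..., None] if image.ndim == 3 else wgt)
            norm[y0:y1, x0:x1] += wgt
    # Dividing by the summed weights makes the blend a partition of unity.
    return acc / (norm[..., None] if image.ndim == 3 else norm)
```

Because every weight is strictly positive, the normalizer is never zero, and an identity `generate` reproduces the input.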

Verified end-to-end on the real SDXL pipeline (forced 2x2 tiling on a
1024px sample, MPS): non-degenerate output, no gross seam at tile borders.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 11:54:58 -07:00
Victor Kuznetsov d5845a72f3 feat(metadata): blank AI-generator tokens in AVIF/HEIF Exif meta-box items
Closes a documented coverage gap (P2#9): an AI Software/Make/Artist/ImageDescription
token in an EXIF item (its TIFF bytes live in mdat/idat) survived remove_ai_metadata
because the top-level box stripper and (absent pillow-heif) the PIL EXIF reader can't
reach it. New isobmff.blank_ai_exif_tokens finds EXIF TIFF blocks by their II/MM
byte-order header, validates each with piexif (a coincidental II/MM run in pixels
won't parse as a TIFF IFD, so it's ignored), and overwrites any AI_GENERATOR_TOKENS-
bearing value with same-length spaces -- so box sizes and iloc offsets stay valid and
the coded image is untouched (mirrors blank_ai_xmp_packets; no iinf/iloc surgery, no
exiftool dep). Camera/editor EXIF without an AI token is preserved. Wired into
remove_ai_metadata's ISOBMFF path. Covers the realistic AI-generator-token case; xAI-
signature-in-meta-box-EXIF (Grok is JPEG-only) stays out.
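
The same-length overwrite can be illustrated on a plain byte string. This sketch only shows why the blob length (and therefore any box size or offset that depends on it) is unchanged; it does not locate TIFF headers or validate with piexif as the message describes. The function name and the NUL-delimited value assumption are hypothetical.

```python
def blank_token_values(blob: bytes, tokens) -> bytes:
    """Overwrite every NUL-delimited ASCII value containing a token with spaces."""
    buf = bytearray(blob)
    lower = blob.lower()  # case-insensitive search over the original bytes
    for token in tokens:
        needle = token.lower()
        pos = lower.find(needle)
        while pos != -1:
            start = lower.rfind(b"\x00", 0, pos) + 1  # value starts after previous NUL
            end = lower.find(b"\x00", pos)            # value ends before next NUL
            if end == -1:
                end = len(buf)
            buf[start:end] = b" " * (end - start)     # same length, so offsets stay valid
            pos = lower.find(needle, end)
    return bytes(buf)
```

Values without a token are left byte-for-byte intact, which mirrors the stated goal of preserving ordinary camera or editor EXIF.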

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 10:43:35 -07:00
Victor Kuznetsov 3f5d6a0af1 docs(landscape): back the DWT-DCT positive-only limitation with researched root cause + citations
Deep-research (2026-06-19, adversarially verified) confirms the open imwatermark
dwtDct mark is fragile by scheme, not by our usage: maintainers admit no 100%
clean-decode guarantee; measured ~0.79 bit accuracy clean (~38/48, below our 44
gate). Root causes (code-verified + locally reproduced): per-block max-coefficient
bit read (content flips bits) and YUV chroma 8-bit clamping on bright pixels (the
bright-flat / all-ones failure). No maintained fork or detector does this scheme
reliably (WAVES relegates it to an appendix; learned schemes are a different class;
dwtDctSvd cannot decode SDXL's dwtDct). Conclusion: keep it positive-only, rely on
C2PA. Sources: imwatermark READMEs, arXiv:2406.08337 (WMAdapter), arXiv:2401.08573
(WAVES), diffusers SDXL watermark.py.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 10:27:08 -07:00
Victor Kuznetsov f97fdc5b92 chore(release): v0.11.4
v0.11.4
2026-06-19 10:06:04 -07:00
Victor Kuznetsov 4c8a57ec7b docs: dwtDct detector is carrier-fragile (all-ones = artifact), FLUX open-mark unresolvable
Final characterization after a positive-control sweep. The imwatermark dwtDct
round-trip fails (28-39/48, below the 44 gate) not on "high texture" as a prior
note claimed, but on a broad carrier class: the FLUX fox, doubao, a minimalist-FLAT
FLUX generation, AND a clean synthetic bright-flat fill with NO watermark all fail
identically. The degenerate all-ones decode is therefore a CARRIER ARTIFACT, not a
watermark (the no-watermark synthetic image reproduces it; a double-embed test shows
no interference). detect_invisible_watermark is positive-only: trust a hit, treat a
None as inconclusive unless a same-carrier positive control first recovers >=44.
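
The positive-only rule can be written as a small three-way decision. The 44-of-48 gate comes from the message above; the function names and the "absent" label for a miss backed by a passing same-carrier control are this sketch's reading of the rule, not the project's API.

```python
from typing import Optional, Sequence

BIT_GATE = 44  # out of 48 payload bits, as quoted above


def matched_bits(decoded: Sequence[int], reference: Sequence[int]) -> int:
    """Number of positions where the decoded bits agree with the reference."""
    return sum(int(d == r) for d, r in zip(decoded, reference))


def interpret_decode(decoded: Optional[Sequence[int]],
                     reference: Sequence[int],
                     control_matched: Optional[int] = None,
                     gate: int = BIT_GATE) -> str:
    """Return "present", "absent", or "inconclusive" for one decode attempt.

    control_matched -- bits recovered by a known-embedded positive control on
                       the same carrier, or None if no control was run.
    """
    if decoded is not None and matched_bits(decoded, reference) >= gate:
        return "present"       # a hit is trusted
    if control_matched is not None and control_matched >= gate:
        return "absent"        # the carrier is known to round-trip, so a miss counts
    return "inconclusive"      # the carrier may simply not hold the mark
```

On the carriers listed above the control itself recovers only 28-39 bits, so every miss there falls into the "inconclusive" branch.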

Consequence: whether BFL hosted FLUX embeds the open DWT-DCT is unresolvable with
this detector on the available carriers (textured AND flat FLUX both fail the
control). C2PA stays the reliable FLUX signal. Low priority to chase further.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 10:03:34 -07:00
Victor Kuznetsov a0a349cc66 docs: correct overstated FLUX open-watermark claim; record detector content-fragility
Earlier notes asserted BFL hosted output has no open DWT-DCT watermark. That was
overstated: the test carriers were high-texture fox images where a clean
encode->decode round-trip of a KNOWN-embedded watermark recovers only 28-35/48
bits (below the safe 44 gate), so the detector would miss a present mark there --
the None is inconclusive, not proof of absence.

Verified positive-control (2026-06-19): imwatermark dwtDct round-trips 48/48 on
synthetic carriers and on chatgpt-1.png (48/48) / firefly-1.png (45/48), but
FAILS on flux-1.png (28/48) and doubao-1.png (39/48). So invisible_watermark
detection is a positive-only signal: trust a hit, treat a miss on busy content as
inconclusive. Affects all open SD/SDXL/FLUX DWT-DCT detection. C2PA stays the
reliable FLUX identifier; whether BFL hosted embeds the open mark is unresolved
(needs a low-texture hosted sample).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 10:03:34 -07:00
Victor Kuznetsov 99e57c872f perf(text-mark): footprint-sized arrays in reverse-alpha CPU path
The reverse-alpha text-mark engine (Doubao/Jimeng/Samsung) allocated
full-frame arrays where only the glyph footprint is ever read:

  - _fixed_alpha_map / _aligned_alpha_map each built a full (h, w) float32
    alpha map non-zero only inside the glyph box, and two were held at once
    during removal (~96 MB of mostly-zeros on a 12 MP frame);
  - extract_mask built a full (h, w) uint8 mask that every caller cropped to
    the located box (~12 MB, rebuilt per text-mark detector on the
    memory-tight identify path).

Both now return footprint-sized arrays: the alpha helpers return the
glyph-sized block plus its placement (ax, ay, gw, gh), and extract_mask
returns the box-sized mask. _apply_reverse_alpha consumes the block
directly; the residual inpaint embeds it into one full-frame uint8 mask only
at cv2.inpaint time (which needs a full-frame mask).
remove_watermark_reverse_alpha tracks the winning region alongside best_amap
to place it.

Peak allocation drops from O(image*4)x2 + O(image) to O(footprint)x2 +
one gated O(image*1) uint8 mask -- a win every consumer gets, motivated by
the 512 MB raiw.cc worker that OOMs on large decodes. GPU path untouched.
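
The quoted figures follow from simple arithmetic, reproduced here as a check. The 12 MP frame size comes from the message; the 400x120 glyph footprint is a hypothetical size chosen only to show the order of magnitude.

```python
def array_megabytes(elements: int, bytes_per_element: int, copies: int = 1) -> float:
    """Size of `copies` dense arrays in decimal megabytes."""
    return elements * bytes_per_element * copies / 1e6


FRAME_PIXELS = 12_000_000        # 12 MP frame, as in the message
FOOTPRINT_PIXELS = 400 * 120     # hypothetical glyph footprint

old_alpha_mb = array_megabytes(FRAME_PIXELS, 4, copies=2)      # two float32 maps
old_mask_mb = array_megabytes(FRAME_PIXELS, 1)                 # one uint8 mask
new_alpha_mb = array_megabytes(FOOTPRINT_PIXELS, 4, copies=2)  # footprint-sized blocks

print(old_alpha_mb, old_mask_mb, new_alpha_mb)
```

Two full-frame float32 maps account for the ~96 MB, the uint8 mask for the ~12 MB, and the footprint-sized replacement in this example is well under 1 MB.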

Byte-identical to the old full-frame path (verified: 17 output hashes
across the three engines, inpaint/no-inpaint, detect, and the real
doubao-1.png fixture, unchanged before/after). tests/test_text_mark_memory.py
guards it by reconstructing the old full-frame path inline and asserting
equality, so the proof survives a cv2/asset bump, and pins the O(footprint)
shape so a regression to full-frame fails loudly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 10:01:07 -07:00
Victor Kuznetsov 9614615001 docs(landscape): confirm BFL hosted = C2PA-only on FLUX.1 [dev] too
Lossless-PNG check across both BFL Playground model lines (FLUX.2 [pro] and
FLUX.1 [dev]) confirms the open DWT-DCT pixel watermark is absent on hosted
output regardless of model or container; only the signed C2PA manifest is present.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 09:42:03 -07:00
Victor Kuznetsov 9e307d020e test(c2pa): add real FLUX.2 BFL C2PA fixtures (PNG + JPEG)
flux-1.png / flux-1.jpg are real Black Forest Labs FLUX.2 [pro] Playground
outputs (signed C2PA, issuer "Black Forest Labs" + trainedAlgorithmicMedia,
manifests verified to contain no personal data). flux-1.jpg is the first
committed JPEG-with-C2PA fixture, exercising the c2pa-python non-PNG reader path
end to end. Regression tests assert both attribute to "Black Forest Labs (FLUX)".

Also documents the verified finding (n=2, 2026-06-19): BFL's hosted output carries
the signed C2PA manifest but NOT the open invisible-watermark DWT-DCT (decodes to
degenerate all-ones, chance-level vs the FLUX reference) -- the open pixel mark is
dev-inference-code-optional only. So a hosted FLUX.2 image is identified by C2PA
alone, with no open-pixel fallback once C2PA is stripped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 09:37:40 -07:00
Victor Kuznetsov d4d9429328 feat(identify): attribute Canva and BytePlus C2PA; fix BytePlus->Adobe mislabel
Mining the local production corpus (25,725 imgs) surfaced two AI vendors signing
C2PA that the registry missed:
- Canva (Magic Media) signed "Canva" + trainedAlgorithmicMedia -> detected AI but
  no platform attributed (disproves the old "Canva exports strip C2PA" assumption).
- BytePlus (ByteDance international: Seedream/Seededit) signs "Byteplus Pte. Ltd.";
  the bare volcengine needle missed it, so its output was mis-attributed to "Adobe
  Firefly" via an incidental "Adobe XMP" string the fallback byte-scan picked up.

Adding both to C2PA_AI_VENDORS lets the clean manifest issuer attribute them
directly. Corpus re-run: 16 platform changes, all improvements (3 Adobe->ByteDance
fixes, 4 None/TC260->ByteDance, 9 None->Canva), 0 regressions. An attempted
signer-based attribution fallback was measured and dropped: it regressed 18 images
(friendly ByteDance label -> raw Chinese cert org; IPTC tool name pre-empted).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 21:57:21 -07:00
Victor Kuznetsov be566e8868 chore(release): v0.11.3
v0.11.3
2026-06-18 17:28:21 -07:00
Victor Kuznetsov 9f6c26a439 refactor(c2pa): read manifests via official c2pa-python, keep byte-scan fallback
extract_c2pa_info now uses the c2pa-python Reader first (any container, whole
manifest store incl. ingredient manifests), falling back to the hand-rolled caBX
parser for blobs the validator rejects (synthetic/partial, broken wheel). The
issuer/source-type/SynthID/soft-binding registry scan is shared by both paths
(_populate_registry_fields), so the return-dict contract is unchanged. Also
replaces the dead `from c2pa import has_c2pa_metadata` import in metadata.py with
a real Reader presence check. c2pa-python added as a core dep (MIT/Apache, ~+5MB
RSS, no torch; wheels cover the CI matrix).

Validated on the full local spaces corpus (25,725 imgs): 0 regressions; 384
manifests newly parsed (379 non-PNG JPEG/WebP + 2 PNGs the byte-scanner missed);
3 false Adobe/Microsoft->Google attributions fixed via real-manifest parsing.

The docs/module-internals.md section for this change already landed in 41f6797.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 17:24:58 -07:00
Victor Kuznetsov 41f67973ce fix(visible): inpaint mid-tone Gemini sparkle instead of a dark diamond
The free `visible` path over-subtracted a faint Gemini sparkle on a
mid-tone background into a darker-than-background brown diamond instead
of removing it (2026-06-18 prod NPS report, "the watermark was not
removed, just its color changed"). The existing over-subtraction guard
only tripped when reverse-alpha drove a footprint pixel fully negative
(the issue #30 dark-background black-pit case); on a mid-tone background
the over-subtraction darkens the core well below the background without
any pixel crossing zero, so the gate missed it and shipped the dark mark.

Add a second over-subtraction signal to `_reverse_alpha_oversubtracts`:
predict the reverse-alpha output at the bright core, (core - a*logo)/(1-a),
and route to the footprint inpaint when it lands more than
`_OVERSUB_DARK_MARGIN` (25) gray levels below the local background ring.
Calibrated wide: clean removals predict within ~12 of background
(demo_banana ~-1), the prod regression ~-40, the issue #30 dark case ~-82.
Corpus-validated on the 479 detected Gemini images: 10 switch reverse-alpha
to inpaint, all of them dark-diamond cases that improve or match; the
other 469 stay byte-identical. demo_banana stays on the reverse-alpha
path (byte-identical).

Also crop both reverse-alpha helpers to the region they actually touch,
a pure O(image) -> O(mark) win that is byte-identical to the full-frame
math (a uint8<->float32 round-trip is exact):
- `GeminiEngine._core_and_bg` converts only the footprint+ring crop to
  gray, not the whole frame (~70 ms -> 0.1 ms on a 12 MP image; it runs
  for both the alpha-gain estimate and the new gate). Verified identical
  across 479 images; detector confidence unchanged.
- `TextMarkEngine._apply_reverse_alpha` computes the blend on the glyph
  crop only (`amap` is zero outside it, so the math is a no-op there):
  ~275 ms -> ~2 ms per placement on a 12 MP frame, up to 2 placements per
  removal. Verified identical across 142 Doubao/Jimeng placements.
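
Why the crop is byte-identical can be checked on synthetic data: where alpha is zero, (obs - 0*logo)/(1 - 0) returns the observed value exactly. The sketch below uses hypothetical helper names and a white logo level of 255; it is a toy check, not the engine's code.

```python
import numpy as np


def reverse_alpha_block(obs_block, alpha_block, logo=255.0):
    """Reverse-alpha on one block: (obs - a*logo) / (1 - a), rounded to uint8."""
    obs = obs_block.astype(np.float32)
    a = alpha_block.astype(np.float32)[..., None]
    restored = (obs - a * logo) / (1.0 - a)
    return np.clip(np.rint(restored), 0, 255).astype(np.uint8)


def full_frame_path(image, amap):
    """Old shape of the computation: blend over the whole frame."""
    return reverse_alpha_block(image, amap)


def cropped_path(image, alpha_block, x, y):
    """New shape: blend only the glyph crop and leave the rest untouched."""
    h, w = alpha_block.shape
    out = image.copy()
    out[y:y + h, x:x + w] = reverse_alpha_block(image[y:y + h, x:x + w], alpha_block)
    return out
```

Comparing the two outputs with np.array_equal on random frames is a compact stand-in for the corpus-level hash comparison the message reports.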

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 17:19:41 -07:00
Victor Kuznetsov 09fdb4544a fix(invisible): preserve native output dimensions
2026-06-18 16:44:21 -07:00
Victor Kuznetsov 61aa76a591 perf(identify): decode the image once for all visible-mark detectors
identify(check_visible=True) ran the Gemini-sparkle detector and the
Doubao/Jimeng text-mark detector each with its own image_io.imread, so the
same bitmap was fully decoded twice. On a memory-constrained host (the raiw.cc
512 MB web worker, which runs identify on every upload) that doubled the peak
decode allocation and contributed to OOM restarts.

Decode once in identify() and pass the BGR array to both detectors. The detect
methods already accept an NDArray, so this only threads the pre-decoded array
through: detect_sparkle_confidence and the two _visible_* helpers gain an
optional image= param that, when None, preserves the old self-read behavior
(so direct callers and the cv2-missing/unreadable paths are unchanged).

Only the visible path is deduplicated; the optional check_invisible decoders
are unaffected (and off on the web hot path). Adds a test asserting
identify(check_visible=True, check_invisible=False) decodes exactly once.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-18 11:13:17 -07:00
dependabot[bot] cd0a79df38 chore(deps): bump the minor-and-patch group with 5 updates (#50)
Bumps the minor-and-patch group with 5 updates:

| Package | From | To |
| --- | --- | --- |
| [transformers](https://github.com/huggingface/transformers) | `5.10.2` | `5.12.1` |
| [accelerate](https://github.com/huggingface/accelerate) | `1.13.0` | `1.14.0` |
| [huggingface-hub](https://github.com/huggingface/huggingface_hub) | `1.18.0` | `1.19.0` |
| [pytest](https://github.com/pytest-dev/pytest) | `9.0.3` | `9.1.0` |
| [ruff](https://github.com/astral-sh/ruff) | `0.15.16` | `0.15.17` |


Updates `transformers` from 5.10.2 to 5.12.1
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v5.10.2...v5.12.1)

Updates `accelerate` from 1.13.0 to 1.14.0
- [Release notes](https://github.com/huggingface/accelerate/releases)
- [Commits](https://github.com/huggingface/accelerate/compare/v1.13.0...v1.14.0)

Updates `huggingface-hub` from 1.18.0 to 1.19.0
- [Release notes](https://github.com/huggingface/huggingface_hub/releases)
- [Commits](https://github.com/huggingface/huggingface_hub/compare/v1.18.0...v1.19.0)

Updates `pytest` from 9.0.3 to 9.1.0
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/9.0.3...9.1.0)

Updates `ruff` from 0.15.16 to 0.15.17
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.15.16...0.15.17)

---
updated-dependencies:
- dependency-name: transformers
  dependency-version: 5.12.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: minor-and-patch
- dependency-name: accelerate
  dependency-version: 1.14.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: minor-and-patch
- dependency-name: huggingface-hub
  dependency-version: 1.19.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: minor-and-patch
- dependency-name: pytest
  dependency-version: 9.1.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: minor-and-patch
- dependency-name: ruff
  dependency-version: 0.15.17
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: minor-and-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-18 10:07:41 -07:00
Victor Kuznetsov 4c6b56f888 lower(strength): drop vendor-adaptive floor to OpenAI 0.10 / Google 0.15
A 2026-06-14 oracle re-test on the deployed Modal controlnet worker (v0.10.0)
cleared SynthID at OpenAI 0.10 (2 photoreal) and Google 0.15 (2 native
2816x1536, retiring the "native >= 0.30" guess), while a pixel sweep showed the
2026-06-04 cert floors (0.20/0.30) over-regenerated for no efficacy gain
(Google MAE -20% at 0.15). Lowers OPENAI_STRENGTH 0.20->0.10, GEMINI_STRENGTH
and UNKNOWN_STRENGTH 0.30->0.15.

Caveats documented in watermark_profiles.py + docs: removal near this floor is
seed-non-deterministic (a service must pin a verified seed), and the n=2 re-test
did not cover flat-graphic hard cases.
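
A vendor-adaptive ladder of this kind can be sketched as a lookup with an explicit override. The three constants and their values come from the message above; the function name, the vendor keys, and the override behaviour are assumptions for illustration and may differ from resolve_strength in watermark_profiles.py.

```python
from typing import Optional

OPENAI_STRENGTH = 0.10
GEMINI_STRENGTH = 0.15
UNKNOWN_STRENGTH = 0.15

_LADDER = {"openai": OPENAI_STRENGTH, "google": GEMINI_STRENGTH}


def pick_strength(vendor: Optional[str], explicit: Optional[float] = None) -> float:
    """Explicit strength wins; otherwise use the vendor floor or the unknown default."""
    if explicit is not None:
        return explicit
    return _LADDER.get((vendor or "").lower(), UNKNOWN_STRENGTH)
```

Keeping the explicit value as the first branch is what lets a service pass its own verified strength for hard cases near the floor.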

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 13:17:11 -07:00
Victor Kuznetsov 41a2af2ecb fix(cli): preserve SynthID uncertainty in no-visible-mark message
The 'no signal' branch of the visible no-mark path claimed 'No AI provenance
signal found either', which reads as 'the image is clean'. A missing metadata
proxy is not proof an invisible pixel watermark (SynthID) is absent: it cannot
be detected once metadata is gone and may have been stripped upstream. The
message now preserves that uncertainty and routes to both 'all' (regenerate
pixels) and 'erase'. Regression-guarded by the SynthID/all asserts in
test_cli.py. CLAUDE.md visible-command note updated to match.

Also adds a 'Scope and non-goals' section (CLAUDE.md + README): removing
AI-provenance marks on the user's own content is in scope; stripping
stock/paid-content watermarks (Shutterstock/Getty/iStock, classifieds) is out
of scope by principle, not by difficulty.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 19:30:49 -07:00
Victor Kuznetsov d8cdc9f478 docs: correct stale strength-ladder values in remove_watermark docstring
The convenience wrapper's docstring still quoted the pre-2026-06 ladder
(0.10 OpenAI / 0.15 Google / 0.15 unknown). The live constants in
watermark_profiles.py are 0.20 / 0.30 / 0.30, applied to both the controlnet
and sdxl pipelines. Docstring only; behaviour was already correct via
vendor_for_strength + resolve_strength.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 09:51:09 -07:00
Victor Kuznetsov 6237429610 chore(release): v0.11.2
v0.11.2
2026-06-12 21:37:04 -07:00
Victor Kuznetsov 30b56f0ea3 fix(cli): stop silent passthrough when visible finds no known mark
When `visible --mark auto` (or an explicit `--mark` with detection on) found
no registered mark, it exited 0 without writing output -- which a wrapping
service reads as success and re-serves the unchanged input. ~74% of real
uploads carry no registered visible mark, so this was the dominant "it didn't
work" / NPS score-0 failure mode.

Now it runs a cheap metadata-only identify, prints actionable guidance (route
to `all` for an invisible/metadata mark, or `erase` for an arbitrary logo),
writes no output file, and exits EXIT_NO_VISIBLE_MARK (2) -- distinct from
success (0) and a hard error (1) so the caller can surface the message.
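
A wrapping service might branch on the three exit codes roughly as follows. The code values come from the message above; the function name and the status strings are hypothetical.

```python
EXIT_OK = 0
EXIT_ERROR = 1
EXIT_NO_VISIBLE_MARK = 2  # nothing written; guidance is printed instead


def classify_visible_exit(returncode: int) -> str:
    """Translate the CLI exit code into a status a service can act on."""
    if returncode == EXIT_OK:
        return "cleaned"            # an output file was written
    if returncode == EXIT_NO_VISIBLE_MARK:
        return "no_visible_mark"    # do not re-serve the input as a result
    return "error"                  # treat any other code as a hard failure
```

The value would typically come from `subprocess.run(...).returncode`; the key point is that code 2 must not be reported to the user as a successful removal.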

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 21:36:56 -07:00
Victor Kuznetsov b08405bece chore(release): v0.11.1
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v0.11.1
2026-06-12 12:15:20 -07:00
Victor Kuznetsov 28569bd05d fix(gemini): recover sub-0.85 corner sparkles via top-K fusion selection
The 256->512 detection-search widening (v0.8) let a large, low-gradient
shape match outrank a genuine mid-size corner sparkle whose raw NCC sits
below the 0.85 corner-promote gate, so `identify` read `unknown` on Gemini
images that v0.7.2 caught (reporter osachub: scale-48 sparkle on light
bedding -- true sparkle spatial 0.775 / grad 0.960 / fusion 0.676, but the
size-weighted argmax locked onto a decoy at spatial 0.628 / grad 0.036).

detect_watermark now keeps the top-K (_SELECT_TOPK=3) size-weighted
candidates (NMS-deduped) plus the corner-promote candidate, scores each by
full fusion (spatial+gradient+variance) via the extracted _grad_var_scores
helper, and selects the highest -- the gradient term lifts the true sparkle
over the decoy. Ranking by the SIZE-WEIGHTED score (not a raw-NCC argmax)
preserves tiny-patch suppression: a raw-NCC argmax re-admitted 16-18px
content false positives (14/65 doubao + 4/11 jimeng visible images). Top-K
adds zero flips on the doubao/jimeng corpora and leaves the 495-image Gemini
set unchanged (479 detected) while recovering the reporter's image at 0.676.
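
The selection step described above might look roughly like this. It is a sketch under stated assumptions, not the project's code: `Candidate`, `fused`, and `select` are hypothetical names, the equal-weight fusion is invented because the log does not give the real weights, and NMS de-duplication is omitted. Only `_SELECT_TOPK = 3` and the spatial/grad figures in the usage note come from the message.

```python
from dataclasses import dataclass

_SELECT_TOPK = 3  # value taken from the commit message


@dataclass(frozen=True)
class Candidate:
    name: str
    size_weighted: float  # score used only to build the shortlist
    spatial: float
    grad: float
    var: float


def fused(c: Candidate) -> float:
    # Hypothetical equal-weight fusion; the real weighting is not in the log.
    return (c.spatial + c.grad + c.var) / 3.0


def select(candidates, corner_promote=None, k=_SELECT_TOPK):
    """Shortlist by size-weighted score, then pick the best fused score."""
    # Ranking by the size-weighted score keeps tiny patches out of the shortlist.
    shortlist = sorted(candidates, key=lambda c: c.size_weighted, reverse=True)[:k]
    if corner_promote is not None and corner_promote not in shortlist:
        shortlist.append(corner_promote)
    # The final choice uses full fusion, so a strong gradient can beat a larger decoy.
    return max(shortlist, key=fused)
```

With the reporter's figures (decoy spatial 0.628 / grad 0.036, sparkle spatial 0.775 / grad 0.960) and made-up size-weighted and variance values, a plain argmax over `size_weighted` would pick the decoy, while `select` returns the sparkle because the gradient term dominates the fused score.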

- _grad_var_scores: gradient/variance scoring factored out of detect_watermark
- confidence = best_fused (drop the duplicated fusion recompute)
- tests: rename test_promotion_is_what_rescues_it ->
  test_size_weighted_search_alone_traps_on_the_decoy (corner-promote is no
  longer the sole rescue path); add a deterministic regression test mirroring
  the real spatial/grad signature
- docs: module-internals.md detector section + CLAUDE.md mechanism map

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 12:04:20 -07:00
Victor Kuznetsov 9feea4ac1e Slim CLAUDE.md: move module internals, limitations, landscape research to docs
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 15:50:03 -07:00
Victor Kuznetsov 3055aa6c4a test: patch is_available in full-pipeline all tests (fix no-gpu CI)
test_all_basic / test_all_visible_step_uses_registry asserted exit 0 but did
not patch is_available, so on CI (core+dev only, no gpu) they took the skip
branch and hit the new non-zero exit. Passed locally where gpu is present.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 10:07:05 -07:00
Victor Kuznetsov c8bc4b7c68 chore(release): v0.11.0
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
v0.11.0
2026-06-11 10:03:51 -07:00
Victor Kuznetsov a8e218acf6 Make all fail loudly when the gpu extra is missing

Step 2 (invisible/SynthID) was skipped with a quiet inline warning and the
run still exited 0, so a missing [gpu] extra was mistaken for a clean result
(recurring #14/#47). Add a prominent end-of-run banner and a non-zero exit.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 09:58:49 -07:00
Victor Kuznetsov ad7e4ee08b feat(identify): close 3 detector gaps found on the spaces corpus (06-05..06-11)
- AIGC: parse the bare ``AIGC{...}`` blob form (label glued to its JSON in a
  JPEG APP segment near the JFIF header), and scan both raw-JSON forms in one
  fall-through loop so a quoted ``"AIGC"`` later in an XMP packet no longer
  shadows a real bare label earlier in the file (3 files read unknown before).
- Integrity clash rule 2: a camera device + an AI marker from the SAME C2PA
  manifest (Google Pixel Magic Editor / Pixel Studio edit chain) is a legitimate
  edit chain, not a contradiction. Fire only when the AI marker's source is
  independent of the camera's manifest; pure cameras (Leica/Sony/Nikon) are
  unaffected (2 Pixel files mis-flagged before).
- New c2pa_cloud_manifest detector: surface a C2PA 2.4 Durable Content
  Credentials cloud-manifest reference (Adobe cai-manifests.adobe.com) as a
  medium provenance signal when the embedded manifest is stripped. Provenance
  only, never asserts is_ai (2 files read fully unknown before).
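
As an illustration of the first item, a bare `AIGC{...}` blob can be located and parsed with the standard library alone. This sketch covers only the bare form, not the quoted form or the fall-through ordering, and `find_bare_aigc` is a hypothetical helper, not the project's function.

```python
import json


def find_bare_aigc(head: bytes):
    """Return the first JSON object glued to a bare ``AIGC`` label, or None."""
    # latin-1 maps one byte to one character, so binary data never fails to
    # decode. Non-ASCII JSON would need extra care; this is a sketch.
    text = head.decode("latin-1")
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        idx = text.find("AIGC{", pos)
        if idx == -1:
            return None
        try:
            # raw_decode parses one JSON value starting at the brace and
            # ignores whatever binary data follows it.
            obj, _end = decoder.raw_decode(text, idx + len("AIGC"))
            return obj
        except json.JSONDecodeError:
            pos = idx + 1  # not valid JSON here; keep scanning
```

Continuing past a malformed match, rather than returning on the first hit, lets a later well-formed blob still be found.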

identify reuses its already-loaded scan head for the cloud check (no second
read). +7 tests; CLAUDE.md + README synced.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 09:28:15 -07:00
Victor Kuznetsov 22bc171806 ci: bump checkout to v6 (Node 24), note dismissed torch alert
actions/checkout@v4 ran on the deprecated Node 20; bump to v6 to match
test.yml/publish.yml. Document the dismissed Dependabot torch alert
(GHSA-rrmf-rvhw-rf47, not_used: no torch.jit usage, gpu-extra-only, no patch).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 16:00:35 -07:00
Victor Kuznetsov d763581ed3 chore(release): v0.10.3
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
v0.10.3
2026-06-10 15:53:50 -07:00
Victor Kuznetsov 0d99f403fb ci: auto-distribute releases to Homebrew tap + HF Space
distribute.yml fans a published GitHub Release out to the channels that
would otherwise be manual: it waits for the sdist on PyPI, bumps the
Homebrew formula (HOMEBREW_TAP_TOKEN) and factory-rebuilds the HF Space
(HF_TOKEN). PyPI stays on publish.yml; conda-forge on its autotick bot.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 15:47:03 -07:00
Victor Kuznetsov e78e5f1154 docs: address HN feedback in README (scope, limitations, honest use case)
From the HN front-page discussion (news.ycombinator.com/item?id=48200569):
- Threat model: drop the 'third-party classifiers' overclaim. State scope
  honestly: it removes SynthID / visible marks / provenance metadata, does NOT
  defeat trained AI-vs-real classifiers (Hive), and watermarks are a weak trust
  signal to begin with.
- Replace the 'preserving art / historical record' use case (criticized as not
  holding) with the defensible one: clearing an overstated AI label from your
  own lightly-AI-edited photo.
- Add a Limitations section: lossless visible/metadata vs lossy content-dependent
  SynthID path, no local self-verify, large images not tiled yet, out-of-scope.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 11:41:39 -07:00
Victor Kuznetsov 0a77d3198e chore(release): v0.10.2
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
v0.10.2
2026-06-10 10:38:50 -07:00
Victor Kuznetsov 9aea5f240f chore: improve discoverability (PyPI keywords/classifiers, README badges)
Research-informed metadata for organic dev discovery:
- pyproject: add a keywords field (was absent; biggest PyPI search gap) and
  expand classifiers (audience, console, security, AI, utilities); rewrite the
  summary noun-first, naming Nano Banana / SynthID / C2PA verbatim.
- README: add PyPI version, Python versions, downloads, and license badges.

GitHub topics (comfyui, watermark-remover) and the repo description were
updated out of band. PyPI metadata ships on the next release.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 10:34:43 -07:00
Victor Kuznetsov c3ddf8a801 docs: document Homebrew, conda-forge, and ComfyUI distribution channels
- README: add Homebrew install, conda (conda-forge, in review), and a
  ComfyUI custom-nodes section.
- CLAUDE.md: per-channel release/bump cadence (Homebrew formula, conda-forge
  autotick bot, ComfyUI Registry); note pip_check: false on the conda recipe.
- Add packaging/conda/recipe.yaml (v1, noarch core-only), verified green on
  conda-forge/staged-recipes PR #33674.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 19:29:40 -07:00
Victor Kuznetsov 5777458296 chore(release): v0.10.1
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
v0.10.1
2026-06-09 17:08:44 -07:00
Victor Kuznetsov 295e7ada2b chore: project review (dev tools in extras, dep upgrades, optional-deps guard, stale cleanup)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-09 17:03:17 -07:00
Victor Kuznetsov 826cfdb82a chore(release): v0.10.0
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
v0.10.0
2026-06-09 13:24:37 -07:00
Victor Kuznetsov 2fcd00ced0 fix: address whole-project code review (visible all/batch, engine consolidation, I/O)
Nine findings from a high-effort project-wide review, fixed and verified
(571 passed, ruff/pyright clean):

Correctness:
- all/batch now remove Doubao/Jimeng/Samsung visible text marks: the visible
  step routes through the registry (new cli._remove_visible_auto) instead of a
  hardcoded GeminiEngine, so they no longer leave the wordmark intact.
- batch always reads the original source (dropped the out_path-reuse that
  re-processed already-cleaned outputs on a re-run).
- img2img_runner only retries the diffusion call on the deprecated-callback
  TypeError; any other TypeError now propagates instead of double-running.
- gemini detect/remove and the reverse-alpha engines normalize channels via a
  new image_io.to_bgr, fixing a grayscale/BGRA crash in the FP-gate path.
- _png_late_metadata advances its cursor by the clamped length, so a malformed
  chunk length no longer aborts the late AI-label scan.
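
The clamped-cursor idea from the last item can be sketched as a tolerant PNG chunk walker. How `_png_late_metadata` actually clamps is not stated in the log; here the declared length is clamped to the bytes that remain, `iter_chunks` is a hypothetical name, and the PNG signature is skipped without being verified.

```python
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data: bytes):
    """Yield (type, payload) for each PNG chunk, tolerating a bad length field."""
    pos = len(PNG_SIG)  # assumes the signature is present; not verified here
    while pos + 8 <= len(data):
        (declared,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        # Clamp the declared length to the bytes that actually remain.
        available = max(0, len(data) - (pos + 8))
        length = min(declared, available)
        yield ctype, data[pos + 8:pos + 8 + length]
        # Advance by the clamped length plus header (8) and CRC (4),
        # so a corrupt length ends the walk cleanly instead of raising.
        pos += 8 + length + 4
```

Because the cursor always moves forward by at least 12 bytes, the loop terminates even on hostile input.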

Cleanup / efficiency:
- Consolidate the ~90%-identical Doubao/Jimeng/Samsung engines into a shared
  config-driven _text_mark_engine.TextMarkEngine base; each engine is now a thin
  subclass (TextMarkConfig + test shims). Behavior is byte-exact (the three
  engine test suites pass unchanged). Registry adapters collapse to one
  _text_mark(...) row each. Gemini stays a separate engine.
- scan_head is memoized per (path, size, mtime), so identify() reads the file
  head once instead of ~8 times.
- invisible_engine post-processing decodes/encodes the output once (chained in
  memory) instead of 2-4 times across stages.
- Remove the orphaned get_model_id_for_profile (+ CONTROLNET_PROFILE); derive
  the --strength help from the strength constants (strength_default_help) so it
  cannot drift; share the --pipeline/--strength click options; simplify the
  retired --auto resolver.
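
The `scan_head` memoization keyed on (path, size, mtime) could be sketched like this. The cache dictionary, the 64 KiB head size, and the name `scan_head_cached` are assumptions for illustration; only the cache key comes from the message.

```python
import os

HEAD_BYTES = 65536  # hypothetical head size; the real value is not in the log
_HEAD_CACHE: dict[tuple[str, int, int], bytes] = {}


def scan_head_cached(path: str) -> bytes:
    """Read the first HEAD_BYTES of a file once per (path, size, mtime)."""
    st = os.stat(path)
    # Size and mtime in the key invalidate the entry when the file changes.
    key = (os.path.abspath(path), st.st_size, st.st_mtime_ns)
    cached = _HEAD_CACHE.get(key)
    if cached is None:
        with open(path, "rb") as fh:
            cached = fh.read(HEAD_BYTES)
        _HEAD_CACHE[key] = cached
    return cached
```

Each call still costs one `os.stat`, which is far cheaper than re-reading the head, and a rewritten file gets a fresh entry instead of stale bytes.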

Net -835 lines. Tests added for the registry-routed visible pass, to_bgr,
the polish/model/guidance wiring, and strength_default_help. CLAUDE.md updated
for the new base module, the engine/registry changes, image_io.to_bgr, and the
scan_head cache.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 13:21:13 -07:00
Victor Kuznetsov b1189549b8 feat(invisible): controlnet default, unified strength, retire --auto, add --model/--guidance-scale
Overhaul the diffusion-removal surface around a single robust default and a
complete, consistent CLI.

Pipeline + strength:
- controlnet is now the DEFAULT pipeline (CLI --pipeline + both engine ctors).
  With the certified higher strength it clears both photoreal and flat-graphic
  content, whereas plain SDXL left SynthID on flat graphics.
- Rename the plain-SDXL profile default -> sdxl; "default" stays as a back-compat
  alias (normalize_profile + a click callback that warns).
- Unify the strength ladder: resolve_strength applies ONE vendor-adaptive ladder
  (the certified controlnet floors OpenAI 0.20 / Google 0.30 / unknown 0.30) to
  both pipelines. sdxl is the weaker remover on its own hard case (flat fills),
  so the certified floor is the right floor for it too.
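
A single vendor-adaptive ladder shared by both pipelines might be sketched as below. The three floor values are quoted from the message; the function name `resolve_strength_sketch`, the lower-cased vendor keys, and the rule that an explicit value wins are assumptions, since the real `resolve_strength` signature is not shown.

```python
# Floors quoted in the commit message; keys are lower-cased vendor names.
_VENDOR_FLOORS = {"openai": 0.20, "google": 0.30}
_UNKNOWN_FLOOR = 0.30


def resolve_strength_sketch(vendor, requested=None):
    """Return the strength to use; the same ladder applies to every pipeline."""
    if requested is not None:
        return requested  # assumption: an explicit user value wins
    key = (vendor or "").lower()
    return _VENDOR_FLOORS.get(key, _UNKNOWN_FLOOR)
```

Keeping the table in one place is what lets the `--strength` help text be derived from the same constants.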

CLI completeness:
- Add --model (HF model id) to invisible + batch (was only on all) and
  --guidance-scale (CFG) to all three diffusion commands; both were library
  knobs the CLI did not expose.
- Flip --adaptive-polish to ON by default (it self-gates to a no-op where there
  is no detail deficit, so default-on is safe).
- Share --pipeline / --strength / --model / --guidance-scale as single
  decorators so invisible/all/batch keep an identical surface; the --strength
  help is derived from the strength constants (strength_default_help) so it can
  never drift from the ladder.

Removals:
- Delete the auto_config content-detection planner + its YuNet/DBNet assets
  (~2.6 MB): with controlnet always the pipeline and the polish self-gating, the
  face/text/edge detection no longer changed behavior. --auto is now a deprecated
  no-op that only warns (the polish it enabled is the default).

Docs (README, CLAUDE.md, docs/synthid.md) updated throughout; added an
InvisibleEngine Python API example. Tests cover the alias warnings, the
polish default, and the --model/--guidance-scale wiring.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 12:40:45 -07:00
Victor Kuznetsov efc5b4a9af docs(auto): drop stale face-restore mentions from --auto
The face-restore family was removed in 20d7eda, but the auto_config
module docstring still claimed "PhotoMaker face restoration is enabled
when a face is present" and the --auto help text (CLI + README example)
listed "face restore" as something --auto picks. A detected face now
only routes to the controlnet pipeline (canny preserves face STRUCTURE,
not identity); there is no identity restoration. Comments/docstrings/help
only, no code behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 11:12:53 -07:00
Victor Kuznetsov ea098cf1be chore(release): v0.9.0
BREAKING:
- Drop `--restore-faces` / `--restore-faces-method` CLI flags
- Drop `restore`, `photomaker`, `instantid` extras
- Drop `restore_faces` / `restore_faces_method` params from
  InvisibleEngine.remove_watermark and AutoConfig

Rationale (full empirical record in
docs/synthid-robust-identity-research-2026-06-08.md "Empirical follow-up"):
every face-restore approach evaluated 2026-06-04 - 2026-06-08 (GFPGAN-on-
cleaned, PhotoMaker-V2, InstantID txt2img, InstantID img2img-on-cleaned
at three parameter sweeps) regenerates the face via SDXL diffusion --
output face pixels are diffusion-fresh, so the regenerated face inherits
SDXL's "clean skin" aesthetic and loses original identity precision. The
result looks MORE AI-generated than the cleaned image, not less. The
cleaned controlnet 0.20 image is the least-AI face state we can reach
without re-introducing SynthID.

License:
- MIT -> Apache 2.0 (Apache adds an explicit patent grant + trademark
  clause; better fit with the upstream Apache projects this library
  mirrors / depends on -- diffusers, transformers, controlnet-aux,
  xinsir's controlnet weights)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v0.9.0
2026-06-08 21:28:09 -07:00