162 Commits

Author SHA1 Message Date
Victor Kuznetsov ea59bdc3e2 chore(scripts): add invisible-removal quality audit tool
Pairs <hash>_src / <hash>_clean outputs, computes SSIM + detail/resolution
proxies, ranks the worst-preserved images for visual classification. Used to
characterize the classes the SDXL scrub degrades (line-art, faces, dense text).
Operates on gitignored data/spaces only; writes nothing tracked.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 19:56:49 -07:00
Victor Kuznetsov e7fb64dca1 fix(gemini): remove more-opaque sparkles via per-image alpha gain
The captured sparkle alpha peaks ~0.51, but some real Gemini sparkles are
rendered more opaque. The fixed-alpha reverse blend then UNDER-subtracts and
leaves a bright residual the detector still fires on. A visible-removal audit
through the registry path on the spaces corpus showed this as a meaningful
fraction of marks -- all under-removals, not a background-brightness class
(failures and successes had the same input confidence and background luma; the
discriminator was the removal delta itself).

remove_watermark now estimates a per-image alpha gain (_estimate_alpha_gain:
effective sparkle opacity at the bright core vs the local background ring,
a_eff/a_cap, clamped [1.0, 1.94]) and scales the alpha to match before the
over-sub/blend branch. A 1.05 deadband keeps a sparkle that already matches the
capture byte-identical to the pre-fix output, so the fix is purely additive
(0 regressions on the audit set; failures dropped substantially). The over-sub
guard still runs on the scaled alpha as the safety net for an over-shoot.
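
The gain rule described above can be sketched as follows. This is a minimal illustration, not the code in gemini_engine: the function name `alpha_gain`, the constant names, and the choice of `<=` at the deadband edge are assumptions; only the 1.05 deadband and the [1.0, 1.94] clamp come from the message.

```python
ALPHA_GAIN_MAX = 1.94   # upper clamp quoted in the commit message
ALPHA_DEADBAND = 1.05   # gains at or below this count as "already matches the capture"


def alpha_gain(a_eff: float, a_cap: float) -> float:
    """Return the factor to scale the captured alpha by (1.0 means unchanged).

    a_eff: effective sparkle opacity measured on the image (assumed input).
    a_cap: peak opacity of the captured alpha map (about 0.51 per the message).
    """
    if a_cap <= 0.0:
        return 1.0                      # nothing to scale against
    gain = a_eff / a_cap
    if gain <= ALPHA_DEADBAND:
        return 1.0                      # deadband: keep the pre-fix output untouched
    return min(gain, ALPHA_GAIN_MAX)    # clamp an implausibly opaque estimate
```

Under these assumptions a sparkle that matches the capture returns exactly 1.0, so the downstream blend is unchanged, while a more opaque sparkle scales the alpha up to at most 1.94.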

- _estimate_alpha_gain + _ALPHA_GAIN_MAX/_DEADBAND/_CORE_FRAC in gemini_engine.
- TestUnderSubtractionGain asserts on footprint pixels, NOT the detector (its
  NCC is degenerate on a flat synthetic bg; the real corpus removal drops the
  detector ~0.80 -> ~0.27).
- scripts/visible_removal_audit.py: the detect -> remove -> re-detect audit tool
  that found and validated this (operates on gitignored data/spaces only).
- CLAUDE.md + README: document the under-subtraction gain.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 19:48:40 -07:00
Victor Kuznetsov d7e4fe8835 feat(invisible): upscale-floor for small inputs + unsharp post-filter
Two quality knobs for the SDXL invisible pass:

- min_resolution floor (default 1024, --min-resolution): small inputs are
  upscaled to a 1024px long-side floor before diffusion, since SDXL img2img
  distorts on a tiny latent (a 381x512 portrait is wrecked at native size). The output
  is restored to the original input size, so it is a transparent quality boost;
  it adds time/memory on small inputs. 0 disables. Extends the pure _target_size
  helper (now cap-or-floor-or-native, min skipped on a min>max misconfig),
  unit-tested without a model.

- unsharp post-filter (humanizer.unsharp_mask, --unsharp, opt-in default 0):
  applied LAST, after the GFPGAN face pass (a pre-GFPGAN sharpen would be
  smoothed back over), to counter the soft/over-smoothed look that diffusion +
  restoration leave behind (an AI tell). Pairs with --humanize (grain).

Both threaded through invisible/all/batch + the module-level helper. Verified
end-to-end on a 381x512 portrait: upscaled to 1024, sharpened, restored to
381x512.
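
A rough sketch of the cap-or-floor-or-native decision, assuming both bounds apply to the long side and that 0 disables a bound. The name `target_size` and its signature are hypothetical stand-ins for the `_target_size` helper; rounding to a model-friendly multiple is deliberately left out.

```python
def target_size(width, height, max_resolution=0, min_resolution=1024):
    """Return the (width, height) to run diffusion at; a bound of 0 is disabled."""
    long_side = max(width, height)
    misconfigured = max_resolution > 0 and min_resolution > max_resolution
    if max_resolution > 0 and long_side > max_resolution:
        scale = max_resolution / long_side      # cap: shrink large inputs
    elif min_resolution > 0 and long_side < min_resolution and not misconfigured:
        scale = min_resolution / long_side      # floor: enlarge small inputs
    else:
        return width, height                    # native: already inside the bounds
    return max(1, round(width * scale)), max(1, round(height * scale))
```

With these assumptions `target_size(381, 512)` gives a 1024px long side, and the caller would resize the result back to 381x512 afterwards, as the message describes.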

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 18:30:39 -07:00
Victor Kuznetsov a57af5da21 docs(claude): corpus cleaned/ examples must come from a shipped removal method
Capture the rule: archive only cleaned outputs from the current default SDXL
img2img pass; never archive examples from removed methods (ctrlregen, old
text/face protection, FaceID, CodeFormer) or experimental opt-in paths
(controlnet, GFPGAN). A removed method's output is not a reproducible example.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 17:13:56 -07:00
Victor Kuznetsov 8523f48fb6 data(corpus): archive June 2026 SynthID strength-study subjects
Back docs/synthid.md section 2.2 with the actual test set: the per-image
oracle-verified subjects were only in a local working dir, while the doc claimed
they were recorded in data/synthid_corpus/. Ingest the key pos+cleaned pairs so
the claim holds.

- pos: openai_1/2/3 originals (gpt-image, openai-verify) + gemini_1/2/3/4
  originals (Gemini app, gemini-app); all probe as C2PA-SynthID present.
- cleaned: OpenAI at strength 0.05 (openai_2 only s010 captured) + Gemini at 0.15
  --max-resolution 1536; oracle: SynthID NOT detected. Metadata stripped, so no
  C2PA on the cleaned rows.
- Excluded the third-party issue #14 image (pic3): oracle-verified but not
  committed to the public corpus.
- docs/synthid.md 2.2: state OpenAI n=4 = 3 archived + 1 external-only.
- CLAUDE.md: drop the drift-prone "~65 MB" corpus size from the sdist note.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 17:09:58 -07:00
Victor Kuznetsov 5ec8269949 chore: mark controlnet pipeline + GFPGAN restore-faces as experimental
Both content-preservation features are now flagged EXPERIMENTAL and opt-in.
--pipeline controlnet was already opt-in (default=default); --restore-faces
flips from on-by-default to OFF by default, matching the repo's prior pattern
for experimental preservation passes (the removed protect_text/protect_faces).

- cli.py: --restore-faces/--no-restore-faces default False; EXPERIMENTAL in the
  --restore-faces / --controlnet-scale / --pipeline help; batch default False.
- invisible_engine.py: remove_watermark restore_faces default False + docstring.
- CLAUDE.md / README.md / docs/synthid.md: label both experimental/opt-in.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 16:59:28 -07:00
Victor Kuznetsov 411ef16ec3 feat: GFPGAN face-identity restoration post-pass
Add an optional, commercial-safe face-restoration post-pass that recovers the
face identity lost to drift in the diffusion removal pass (canny holds structure,
not likeness) while still scrubbing the pixel watermark in the face regions.

- face_restore.py: GFPGANer singleton (CPU unless CUDA), the basicsr
  torchvision.transforms.functional_tensor shim, and the pure feather
  _composite_faces helper (unit-tested without the model). GFPGAN
  re-synthesizes each face from a StyleGAN2 prior, so composited face pixels
  are GAN-generated (no watermark, no pixel-copy) -- oracle-clean at weight 0.5
  with identity preserved.
- InvisibleEngine.remove_watermark: restore_faces / restore_faces_weight,
  best-effort, auto-skips when the extra is absent or no face is detected.
- CLI --restore-faces/--no-restore-faces + --restore-faces-weight on
  invisible/all/batch (on by default).
- restore extra (gfpgan/facexlib/basicsr), numpy<2-pinned (scipy<1.18,
  numba<0.60) and kept out of `all`; basicsr needs Python <3.13 + setuptools<69
  to build, so pin .python-version 3.12.

Commercial-safe: GFPGAN Apache-2.0, RetinaFace MIT. The CodeFormer alternative
is non-commercial and is not shipped. The earlier IP-Adapter FaceID layer was
removed (footgun: needs high strength, corrupts faces at the low removal
strength).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 16:59:28 -07:00
Victor Kuznetsov d90d5d886a feat: controlnet pipeline for text/face-structure preservation
Add `--pipeline controlnet` (SDXL base + xinsir canny ControlNet via
StableDiffusionXLControlNetImg2ImgPipeline): the canny edge map conditions the
img2img regeneration so text and face STRUCTURE stay sharp, while the watermark
is still removed by the regeneration (`strength`) -- no original pixels are
copied or frozen, so SynthID does not survive. Oracle-verified clean on OpenAI
with better text/structure fidelity than plain img2img at equal strength.
`--controlnet-scale` tunes structure preservation; fp32 on mps/cpu (fp16-fixed
VAE on cuda/xpu). Shares the img2img runner (live progress + MPS->CPU fallback)
and the fp16-VAE-fix / device-move helpers with the default pipeline.

Remove the superseded subsystems -- ctrlregen (SD1.5 clean-noise),
text-protection (differential / region-hires) and face-protection: they either
destroyed real content or shielded the watermark by re-using original pixels.
controlnet replaces them by regenerating everything under edge conditioning.

Canny preserves face structure but not identity; face IDENTITY is a separate
face-restoration post-pass (CodeFormer/GFPGAN), researched + prototyped but not
yet shipped. An IP-Adapter FaceID attempt was built and removed (footgun: needs
high strength, corrupts faces at removal strength).

Docs: docs/controlnet-removal-pipeline-research.md, scripts/controlnet_sweep.py.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 16:59:28 -07:00
Victor Kuznetsov 175609b60a fix(gemini): rescue small corner sparkle buried by the size weight (#36)
detect_watermark's size-weighted global NCC search lets a larger, mediocre
match (e.g. a bright collar in a portrait) outrank a small, near-perfect
sparkle in the bottom-right corner, so a faint sparkle on a busy background
scored below threshold and the image read as clean -- the regression from
widening the search window 256px->512px between v0.7.2 and v0.8.8.

Add _corner_promote: a bottom-right-corner raw-NCC pass that overrides the
global pick when the corner holds a match with raw NCC >= 0.85 that beats it.
It only ever replaces a lower-fidelity pick (cannot weaken an existing
detection) and keeps the wider window for variant margins. The corner side is
relative-clamped (0.20 of the short side, [96, 384]) so it stays a true corner
at every scale: a fixed 256px covers ~70% of a small portrait, where a real
photo raw-matches the star at ~0.81; relative tightening drops that to ~0.69.
The 0.85 gate sits between the worst real-photo corner match (~0.78) and a
genuine faint sparkle (~0.93): zero false positives across native + downscaled
negatives, headshot rescued from below-threshold to 0.71.
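
The corner sizing and the promotion gate can be sketched as below. The constants come from the message; the function names and the reading of "beats it" as a plain raw-NCC comparison against the global pick are assumptions.

```python
CORNER_FRAC = 0.20                 # fraction of the short side
CORNER_MIN, CORNER_MAX = 96, 384   # clamp range in pixels
PROMOTE_NCC = 0.85                 # raw-NCC gate for a corner override


def corner_side(width: int, height: int) -> int:
    """Side length of the bottom-right search square, relative-clamped."""
    side = round(CORNER_FRAC * min(width, height))
    return max(CORNER_MIN, min(CORNER_MAX, side))


def pick_match(global_ncc: float, corner_ncc: float) -> str:
    """Return which candidate wins; the corner only replaces a weaker global pick."""
    if corner_ncc >= PROMOTE_NCC and corner_ncc > global_ncc:
        return "corner"
    return "global"
```

For a 381x512 portrait the relative side would be about 76px, so the lower clamp of 96px applies instead of a fixed 256px window.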

Factor the shared multi-scale matchTemplate loop into _scan_scales.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 16:51:03 -07:00
Victor Kuznetsov 07c96bed53 docs: mark fp16 VAE black-output fix (#29) verified on CUDA
Confirmed on real CUDA hardware 2026-06-03: `all` on a 1086x1448 OpenAI
gpt-image at fp16 produces a normal (non-black) output, so the fp16-fix VAE
swap resolves the all-black decode. Removes the prior "NOT verifiable on this
MPS machine" caveat.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 19:18:04 -07:00
Victor Kuznetsov 35116d5e97 chore(release): v0.8.9
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
v0.8.9
2026-06-02 19:04:32 -07:00
Victor Kuznetsov df0fafe94e fix(identify): stop flagging multi-actor C2PA manifests as integrity clashes
The C2PA issuer attribution (`c2pa`) and the SynthID proxy (`synthid`) are
derived from the same manifest, so treating them as independent signals made
rule 1 fire on legitimate multi-actor manifests where a product wraps another
vendor's engine (Microsoft Designer on OpenAI, Microsoft on Google) or an edit
chain re-signs (Adobe over a Gemini original). 19 such files in the
2026-06-01/02 spaces batches read as "likely spoofed/laundered" before this.

Group `c2pa` + `synthid` into one provenance source via `_CLASH_SOURCE`; rule 1
now requires two vendors from different sources. A manifest vendor still clashes
with a genuinely independent stamp (EXIF/XMP generator, IPTC AISystemUsed, AIGC,
xAI).
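
One way to picture the grouping, as a sketch only: the mapping name mirrors `_CLASH_SOURCE`, but the `(signal, vendor)` claim shape and the function `vendors_clash` are hypothetical.

```python
# Signals derived from the same manifest share one provenance source.
CLASH_SOURCE = {"c2pa": "manifest", "synthid": "manifest"}


def vendors_clash(claims):
    """claims: iterable of (signal, vendor) pairs; True if rule 1 would fire."""
    by_source = {}
    for signal, vendor in claims:
        source = CLASH_SOURCE.get(signal, signal)   # ungrouped signals stand alone
        by_source.setdefault(source, set()).add(vendor)
    groups = list(by_source.values())
    for i, vendors_a in enumerate(groups):
        for vendors_b in groups[i + 1:]:
            # A clash needs two different vendors from two different sources.
            if any(a != b for a in vendors_a for b in vendors_b):
                return True
    return False
```

Under this sketch a Microsoft-over-OpenAI manifest yields two vendors from a single source and no clash, while an EXIF generator stamp naming a different vendor still does.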

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 19:02:35 -07:00
Victor Kuznetsov 9cb66992bd chore(release): v0.8.8
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v0.8.8
2026-06-02 09:18:02 -07:00
Victor Kuznetsov 9ca2811938 fix(gemini): inpaint sparkle footprint when reverse-alpha over-subtracts (#30)
On a dark/textured background (e.g. grass) the captured alpha map over-estimates
the real Gemini sparkle's effective opacity (~0.51 captured vs ~0.31 effective),
so the fixed-alpha reverse blend over-subtracts (watermarked - alpha*logo goes
negative) and drives the footprint to black -- the white sparkle turns into a
black diamond (issue #30, reported by @CoolZimo1).

remove_watermark now detects this via _reverse_alpha_oversubtracts (fraction of
footprint pixels with a negative numerator > 5%) and inpaints the small sparkle
footprint from the surrounding pixels (cv2 NS, cropped to a padded box) instead.
Behavior-neutral on the working case: a bright background over-subtracts at ~0%,
so reverse-alpha is used and the output is byte-identical to before (verified:
demo_banana 0.0 frac vs the issue-#30 grass image 0.61 frac; issue-#30 footprint
recovers to background grass with no pit, residual sparkle conf 0.25 < 0.35).
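
The over-subtraction check can be illustrated on flat lists of pixel values. This is a simplified single-channel sketch with a white (255) logo assumed; the real `_reverse_alpha_oversubtracts` signature is not shown in the message, so the name and parameters here are hypothetical.

```python
def reverse_alpha_oversubtracts(pixels, alphas, logo=255.0, max_frac=0.05):
    """True when too many footprint pixels have a negative reverse-blend numerator.

    The reverse blend recovers original = (watermarked - alpha * logo) / (1 - alpha);
    a negative numerator means the captured alpha is larger than the real opacity.
    """
    footprint = [(p, a) for p, a in zip(pixels, alphas) if a > 0.0]
    if not footprint:
        return False
    negative = sum(1 for p, a in footprint if p - a * logo < 0.0)
    return negative / len(footprint) > max_frac
```

A dark footprint (value 100 at alpha 0.51) goes negative on every pixel and would take the inpaint branch, while a bright one (value 240) stays on the reverse-alpha path.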

Guard is scoped to GeminiEngine: doubao/jimeng already NCC-align their alpha to
the actual mark per image, which sidesteps the fixed-alpha mismatch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 09:17:32 -07:00
Victor Kuznetsov b25276c4f2 chore(release): v0.8.7
v0.8.7
2026-06-01 19:33:08 -07:00
Victor Kuznetsov 96038f960f feat(invisible): vendor-adaptive default strength (OpenAI 0.10 / Google 0.15)
The default img2img strength is now chosen from the detected SynthID vendor
(C2PA issuer) instead of a single fixed 0.30: OpenAI gpt-image -> 0.10, Google
Gemini -> 0.15, unknown source -> 0.15. Explicit --strength always wins.
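
As a minimal sketch of the selection rule, assuming the vendor arrives as a lowercase string or None (the table and function names are illustrative, not the project's `resolve_strength`):

```python
VENDOR_STRENGTH = {"openai": 0.10, "google": 0.15}   # values from the message
UNKNOWN_STRENGTH = 0.15                               # fallback for an unknown source


def default_strength(vendor=None, strength=None):
    """Pick the img2img strength; an explicit value always wins."""
    if strength is not None:
        return strength
    return VENDOR_STRENGTH.get(vendor, UNKNOWN_STRENGTH)
```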

Basis: an oracle-verified June 2026 controlled study (clean v0.8.6, text/face
protection OFF, per-image openai.com/verify or Gemini-app verdict). OpenAI's
SynthID clears at 0.05 across 1024-1600 px (n=4, resolution-independent);
Google's is ~3x more robust and needs 0.15 on the capped-1536 path (n=4). The
dominant factor is the VENDOR, not resolution. The earlier single 0.30 default
and the "resolution dependence" lore came from contaminated tests run with the
protect-text bug ON (issue #14) -- re-running those same 1600x1600 images clean
removes SynthID at 0.05.

`vendor_for_strength(path)` reads metadata.synthid_source on the ORIGINAL input
and is threaded through cli (invisible/all/batch) -> invisible_engine ->
watermark_remover -> resolve_strength(strength, profile, vendor), so display and
execution use the same vendor (the engine sees a temp path whose C2PA the visible
pass already stripped, so detection must happen in the CLI on the pristine
source). Caveat: Google's 0.15 was validated only on --max-resolution 1536;
native 2816 Gemini was not locally measurable (OOM on Apple Silicon) and is
pending GPU validation on raiw.cc.

Docs: docs/synthid.md sections 2.2/4.4/5.2 corrected (the contaminated
resolution-dependence findings replaced with the clean oracle-verified table);
README and CLAUDE.md updated; CLI --strength help reflects the adaptive default.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 19:29:47 -07:00
Victor Kuznetsov 1708857772 fix(gemini): expand sparkle search area 256 -> 512px from corner
The 256px limit caused misses when Gemini places the sparkle further from the
corner than the standard 160px (margin 64 + logo 96). Observed variant at ~300px
reported in issue #30. 512px covers all known Gemini margin variations with room
to spare; matchTemplate on a 512x512 region is still fast on CPU.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
v0.8.6
2026-06-01 10:42:04 -07:00
Victor Kuznetsov 25cc4750df chore(release): v0.8.5
v0.8.5
2026-06-01 10:31:59 -07:00
Victor Kuznetsov 4b0b370ac0 fix(invisible): disable protect-text/protect-faces by default; add docs/synthid.md
Both text and face protection were shielding SynthID from removal. The
text-protection high-res re-scrub regenerates pixels at an upscaled resolution
where the per-region pass may not be strong enough to re-destroy the SynthID
payload, allowing it to survive in text areas. Face protection has an even more
direct mechanism: it pastes back the original (pre-diffusion, watermarked) face
pixels after the global pass, guaranteeing SynthID survives in face regions
regardless of strength.

Both --protect-text and --protect-faces are now off by default and opt-in.
Rename from --no-protect-text / --no-protect-faces to --protect-text /
--protect-faces. Extract shared click.option decorators to module-level
constants (_protect_text_option, _protect_faces_option) to eliminate
copy-paste between cmd_invisible and cmd_all.

Add docs/synthid.md: primary-source-cited technical reference for SynthID-Image
covering mechanism (post-hoc encoder/decoder, 136-bit payload, pixel-space, no
model-weight modification), robustness numbers (arXiv:2510.09263: ~99.98% TPR
at 0.1% FPR across 30 transforms), removal attacks and forensic detectability
(arXiv:2605.09203: all 6 attacks detectable >98% TPR@1%FPR), detectability
limits, oracle scope, adoption landscape, and practical implications including
the protect-text/faces SynthID-preservation finding.

Verified June 2026 on gpt-image 1600x1600 via openai.com/verify: with
--protect-text SynthID detected; without, SynthID removed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 10:28:34 -07:00
Victor Kuznetsov 72812de03c chore(release): v0.8.4
v0.8.4
2026-05-31 20:46:52 -07:00
Victor Kuznetsov e501bec9ff feat(identify): detect visible Doubao/Jimeng marks; keep identify import torch-free
identify previously ran only the Gemini sparkle as a visible detector, so a
Doubao/Jimeng image with stripped TC260 metadata had no visible fallback. Add
`_visible_text_marks` (registry-backed) so the ByteDance Doubao 豆包AI生成 and
Jimeng 即梦AI marks are detected too, each gated by its own engine NCC threshold
via MarkDetection.detected. New signals `visible_doubao` / `visible_jimeng`
(medium), same stripped-metadata fallback role as the sparkle; excluded from
integrity-clash vendor claims; set platform only when no harder signal did.

Also make `noai/__init__` lazy (PEP 562 __getattr__): importing the light
`noai.c2pa` / `noai.constants` submodules (which identify needs) no longer
eagerly pulls `watermark_remover`, which imports torch + diffusers at module
top. `import remove_ai_watermarks.identify` drops from ~420 MB to ~21 MB in a
full gpu/detect install (torch not loaded), so it fits a 512 MB host; the
removal API resolves lazily on first access. Guarded by TestIdentifyImportIsLight.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 20:43:52 -07:00
Victor Kuznetsov 4b4049a6f1 docs(text-protection): update stale strength note (~0.05 -> ~0.30 SynthID threshold)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 17:53:48 -07:00
Victor Kuznetsov 2e23cf9c4b fix(build): pin hatchling<1.28 to keep Metadata-Version 2.4 (PyPI upload rejected 2.5)
hatchling 1.28+ emits Metadata-Version 2.5 (PEP 639); the twine in
pypa/gh-action-pypi-publish@release/v1 rejects it, which failed the v0.8.3 PyPI
upload (build + tag-match passed, upload step failed, nothing uploaded). 1.27.x
emits 2.4, which uploads fine (0.8.2). Pin the build backend; lift once the action
twine is 2.5-aware or the workflow uses uv publish.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
v0.8.3
2026-05-31 17:49:19 -07:00
Victor Kuznetsov c155f81078 chore(release): v0.8.3
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 17:41:10 -07:00
Victor Kuznetsov 6075ea2c55 docs(synthid): cite the genuine Gemini A/B (gemini_633uuy) for the protect_text finding
The protect_text correction was first cited only to qw1212ss_pic3, an OpenAI
image that carries no Google SynthID (so the Gemini oracle is not a valid check
for it). The updated study re-ran the 0.3 protect_text A/B on a genuine Gemini
SynthID image (gemini_633uuy, photo with a Chinese-text sign): SynthID removed
with protection ON and OFF, Gemini-oracle verified. Cite that as the load-bearing
evidence so the claim rests on a valid subject. Confirms the shipped 0.30 +
protect_text=ON default on a real Gemini target.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 17:39:16 -07:00
Victor Kuznetsov b991b11a19 docs(synthid): correct protect_text guidance -- it does NOT block removal (keep ON)
An A/B at strength 0.3 on a real e-commerce infographic (updated GPU study)
reverses the earlier claim: SynthID is a GLOBAL watermark, so 0.3 removes it
whether protect_text is on or off, and protection SALVAGES text fidelity (medium
headings/body stay readable; off, they garble). The earlier 'protect_text shields
the watermark, use --no-protect-text' was wrong -- it mistook the 0.10 strength
failure for a protection effect. Recommended SynthID config: ~0.3 + protect_text ON
(the default). Also document the oracle scope: the Gemini app 'Verify with SynthID'
is the only valid SynthID oracle; openai.com/verify is provenance-scoped (C2PA) and
does NOT measure SynthID. Corrects CLAUDE.md + README + watermark_profiles comment
shipped in cddbaf6.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 16:50:13 -07:00
Victor Kuznetsov cddbaf6413 fix(invisible): raise default strength 0.10 -> 0.30 (current SynthID threshold); flag ctrlregen experimental
An oracle-verified GPU strength study (Modal A100, native res, Gemini-app
'Verify with SynthID', n=3 fresh Gemini images, protect_text/faces off) found the
current Google SynthID survives strength 0.10/0.15/0.2 and is removed only at 0.3.
The previous 0.10 default (set from an n=1 result) no longer clears it -- Google
hardened SynthID and the threshold has climbed 0.05 -> 0.10 -> ~0.3. Bump
DEFAULT_STRENGTH to 0.30; OpenAI/ChatGPT carry C2PA not SynthID, so 0.10 is plenty
there (pass --strength 0.10). Note protect_text shields the text regions SynthID
hides in (use --no-protect-text for full removal on text-heavy images).

The same study found ctrlregen at clean-noise strength DESTROYS real images
(hallucinated micro-text in smooth regions), with no usable middle setting, so the
literature's 'clean-noise is the lever' did not hold empirically. Flag ctrlregen
EXPERIMENTAL in the CLI --pipeline help, README, and watermark_profiles; SDXL
img2img at ~0.3 stays the shippable path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 16:38:49 -07:00
Victor Kuznetsov 729f5f2ecd chore(release): v0.8.2
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
v0.8.2
2026-05-31 15:46:47 -07:00
Victor Kuznetsov b0aad476fb fix(scripts): drop rich import from analysis scripts (red CI after rich removal)
The cli refactor dropped rich from dependencies, but four scripts still did
`from rich.console import Console` / `rich.table import Table`. Their test
modules import the scripts, so a clean `uv sync --frozen` (CI: core+dev, no
rich) failed at collection with ModuleNotFoundError on macOS/Windows/Linux.

Add a shared plain-text shim `scripts/_plain_console.py` (Console/Table via
click.echo, markup stripped) and switch all four scripts to it. Verified: all
four import with rich blocked, and tests/test_synthid_corpus.py +
tests/test_synthid_pixel_probe.py pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 15:41:50 -07:00
Victor Kuznetsov f16216cabc feat(cli): add --no-protect-faces to invisible/all (skip the YOLO face detector)
Mirrors --no-protect-text: when the image has no people, skip loading and
running the YOLO face detector entirely. The heavy extract+blend already only
ran when a face was found, but the detector itself always loaded+inferred to
decide; this flag lets callers skip that fixed cost.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 15:27:14 -07:00
Victor Kuznetsov e42b7e9d6a refactor(cli): plain-text console output; drop rich; quiet transformers
cli.py now emits plain ASCII through a small click.echo shim
(_Console / _Table / _Progress) instead of rich: no colors, markup tags,
panels, progress bar, or Unicode glyphs (Warning: / -> / ... and dropped
checkmark/cross marks). identify and metadata tables render as indented
plain lines.

- drop rich from dependencies (pyproject.toml + uv.lock)
- __init__: set TRANSFORMERS_VERBOSITY=error (setdefault) plus a warnings
  filter so the transformers Siglip2ImageProcessorFast deprecation no
  longer prints at CLI startup (it fires from the eager noai import)
- TestGpuHintMarkup: the [gpu] hint is now printed verbatim; docstring updated
- CLAUDE.md: replace the obsolete rich-markup lesson, note the verbosity fix

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 15:21:29 -07:00
Victor Kuznetsov 2d49c3cb58 fix(invisible): ctrlregen defaults to clean-noise strength, not the SDXL 0.10
The ctrlregen profile inherited the SDXL img2img --strength default (0.10), a
near-identity pass that loaded ControlNet + DINOv2-giant and barely changed the
image -- a no-op for removal. resolve_strength() now resolves an unset strength
per profile: 0.10 for the SDXL default, CTRLREGEN_DEFAULT_STRENGTH (1.0,
clean-noise) for ctrlregen. It checks `is None` rather than falsiness, so an
explicit 0.0 is respected (the old `strength or DEFAULT` swallowed it).
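
The difference between the two checks, in a short sketch. The two-argument signature is a simplification and `resolve_strength_buggy` is a hypothetical reconstruction of the old `strength or DEFAULT` expression.

```python
SDXL_DEFAULT_STRENGTH = 0.10
CTRLREGEN_DEFAULT_STRENGTH = 1.0   # clean-noise


def resolve_strength(strength, profile="default"):
    """Fill in a per-profile default only when no strength was given."""
    if strength is None:
        if profile == "ctrlregen":
            return CTRLREGEN_DEFAULT_STRENGTH
        return SDXL_DEFAULT_STRENGTH
    return strength                 # an explicit 0.0 passes through


def resolve_strength_buggy(strength):
    """Old behaviour: 0.0 is falsy, so it is silently replaced by the default."""
    return strength or SDXL_DEFAULT_STRENGTH
```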

Research basis: CtrlRegen (ICLR 2025, arXiv:2410.05470) removes robust
watermarks by regenerating from clean Gaussian noise; partial-noise img2img
retains watermark info that diffuses back, so a high (clean-noise) strength is
the lever, not a knob on the light SDXL pass. CLI wiring (--strength default
None) lands with the cli refactor.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 15:07:19 -07:00
Victor Kuznetsov 33bd401e2a fix(visible): guard remove_watermark_reverse_alpha on tiny images too
The previous commit guarded extract_mask, but the 2048x1 crash was
actually in _fixed_alpha_map's cv2.resize to a ~1-px-tall target (Windows:
"Unknown C++ exception" / access violation). Return image.copy() up front
when h < 32 or w < 64 (no real watermarked image is that small), before any
cv2 call. Same guard in both Doubao and Jimeng.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 14:00:52 -07:00
Victor Kuznetsov 7167d2bae7 fix(visible): guard extract_mask against degenerate ROIs (Windows CI crash)
The always-align removal scores each placement with a residual detect(),
which on an extremely wide/short image (2048x1, test_wide_short_does_not_raise)
fed cv2 GaussianBlur a ~1-px-tall ROI and faulted natively on Windows py3.12
(access violation, non-deterministic -- one CI cell went red, a re-run passed).
The old at-native path never ran detect() on degenerate sizes. Skip the cv2
pipeline and return an empty mask when bh < 16 or bw < 16; real images always
clear the guard (the WM_* box floors are max(16,..) / max(40,..)). Same fix in
both Doubao and Jimeng. Also sync the stale Doubao module docstring.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 13:25:57 -07:00
Victor Kuznetsov 196e8ea35f docs(claude): record the release flow + sdist-must-exclude-data lesson
PyPI rejected the 0.8.0 sdist (data/ test corpora over the file-size
limit); document the release steps and the [tool.hatch.build.targets.sdist]
exclude so it does not recur.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 13:01:08 -07:00
Victor Kuznetsov e7c57e3892 chore(release): v0.8.1 — exclude data/ from sdist
The 0.8.0 PyPI publish uploaded the wheel but the sdist was rejected
(400 File too large): hatchling's default sdist bundled the committed
data/ test corpora (synthid_corpus images + the new visible-mark
captures), pushing it past PyPI's per-project file-size limit. Add a
sdist target that excludes /data, dropping it ~85 MB -> 9.8 MB. The
wheel already ships only src/ and is unaffected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
v0.8.1
2026-05-31 12:57:46 -07:00
Victor Kuznetsov 315320056b chore(release): v0.8.0
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
v0.8.0
2026-05-31 12:25:02 -07:00
Victor Kuznetsov e572767555 feat(visible): add Jimeng remover, fix Doubao outline defect, reproducible mask build
Visible-watermark work across all three corner-mark engines plus a committed,
reproducible alpha-build pipeline (scripts/visible_alpha_solve.py) fed by committed
solid black/gray/white captures.

- jimeng: new "即梦AI" wordmark remover (reverse-alpha + thin residual inpaint,
  always NCC-aligned -- the mark re-rasterizes/jitters per image). Detect via glyph
  silhouette NCC (0.45 threshold; does not cross-fire with Doubao). Registered in the
  visible-mark catalog; `visible --mark jimeng` / `--mark auto`.
- doubao: fix a real production defect -- the shipped remover left a READABLE
  "豆包AI生成" outline on real samples while detect() returned conf 0.0 (fooled by a
  thin outline), so the test passed and the "56/56 clean" claim was detector-measured,
  not visual. Root cause: under-estimated alpha + fixed-geometry-no-inpaint + tight
  locate box. Rebuilt alpha (careful gray-self solve), always-align, thin inpaint,
  widened locate box -> readable outline becomes faint texture-level traces.
- gemini: rebuild gemini_bg_{96,48} from our own controlled captures (validated NCC
  0.9998 vs the prior third-party asset); removal re-verified clean, no behaviour change.
- tests: add textured-shift regression to both engines (guards the align-on-shift path
  the Doubao defect exposed; lesson: a detector-only removal test is insufficient,
  assert visual residual).
- docs: CLAUDE.md, README, capture READMEs and docstrings synced; stale
  "exact/pixel-exact/56-clean" claims removed.

Also includes a SynthID label-wording clarification in identify.py/cli.py
("SynthID pixel watermark" -> "SynthID watermark, inferred from C2PA metadata").

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 12:20:19 -07:00
Victor Kuznetsov 5d0e6c3a65 fix: harden metadata parsers and engines; sync docs (full-repo review)
Apply fixes from a full-repo review (code, tests, docs).

Security / correctness:
- Clamp attacker-controlled PNG/caBX chunk lengths to the remaining file
  size in metadata.py and noai/c2pa.py (a malformed length no longer drives
  a multi-GB read); skipped chunks seek instead of read.
- noai/isobmff.strip_c2pa_boxes is now fail-safe on a malformed box: return
  the original bytes with a warning instead of silently truncating the tail,
  so metadata --remove can no longer emit a corrupt file.
- doubao_engine._fixed_alpha_map clamps the glyph box to the image (no crash
  on degenerate width-vs-height).
- watermark_remover._run_region_hires gates the phaseCorrelate offset on
  response and magnitude (a spurious shift no longer garbles text) and drops
  the generator after a CPU fallback (no MPS/CPU device mismatch).
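
The length clamp from the first item amounts to something like the following. It is a sketch with a hypothetical helper name; the actual parsers in metadata.py and noai/c2pa.py are not shown in the message.

```python
def clamp_chunk_length(declared: int, offset: int, file_size: int) -> int:
    """Bound a length field read from the file by the bytes that actually remain.

    declared:  length taken from the (untrusted) chunk header.
    offset:    position of the chunk payload in the file.
    file_size: total size of the file in bytes.
    """
    remaining = max(0, file_size - offset)
    return max(0, min(declared, remaining))
```

A header claiming 0xFFFFFFFF bytes in a 1000-byte file then requests at most the 900 bytes left after offset 100, not a multi-GB read.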

Robustness:
- gemini_engine, doubao_engine, region_eraser normalize grayscale and RGBA
  inputs to BGR at the engine entry points.
- image_io.imwrite returns False on an unwritable path (matches cv2).
- invisible_engine guards a None imread result before use.
- trustmark_detector._decoder uses a double-checked threading lock.
- ctrlregen.tiling.tile_positions raises on overlap >= tile.
- humanizer chromatic shift no longer wraps opposite-edge pixels.
- identify OpenAI caveat keyed on the normalized vendor, not a substring.
- Remove the dead "visible --detect-threshold" CLI option.
- publish.yml verifies the release tag matches the package version.

Docs:
- README strength 0.05 to 0.10; .env.example HF_TOKEN marked optional;
  doubao_capture README updated to reverse-alpha-only; CLAUDE.md synced with
  the new behaviors and the batch command.

Tests: new test_security_clamp.py for the read clamp and isobmff fail-safe;
erase CLI coverage; integrity-clash rule 2 end-to-end; multi-tag EXIF
survival and cross-format strip guards; channel/size, tiling, humanizer, and
imwrite regressions. Full suite 493 passed, 2 skipped; ruff and pyright src/
clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 18:00:39 -07:00
Victor Kuznetsov 5298dcc6a3 chore(release): v0.7.2
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
v0.7.2
2026-05-30 14:35:04 -07:00
Victor Kuznetsov d88b87ca4e Fix #29 black output: use fp16-fixed SDXL VAE on fp16 GPUs
The stock SDXL VAE overflows to NaN in fp16, so the plain img2img path decodes
to an all-black image on a CUDA/XPU fp16 backend. This is the raiw.cc black
result HitaoLin reported (a 1086x1448 input came back uniformly black). cpu/mps
run fp32 and never hit it, and the differential / region-hires pipeline already
upcasts the VAE itself, so only the plain path on an fp16 GPU was exposed.

`_load_pipeline` now loads `madebyollin/sdxl-vae-fp16-fix` for the default SDXL
checkpoint when running fp16, gated by the pure helper `_needs_fp16_vae_fix`. A
custom non-SDXL model keeps its own VAE.
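
The gating decision can be sketched as a pure function. This is an
illustration under assumptions: the signature below is hypothetical (the real
`_needs_fp16_vae_fix` is not shown here), the dtype is passed as a plain string
to avoid importing torch, and the default checkpoint id is left to the caller.

```python
FP16_FIX_VAE = "madebyollin/sdxl-vae-fp16-fix"  # repo id named in the commit


def needs_fp16_vae_fix(model_id: str, default_model_id: str, dtype: str) -> bool:
    """True only for the default SDXL checkpoint running in fp16.

    Hypothetical sketch: fp32 backends (cpu/mps) and custom models keep
    whatever VAE they already have.
    """
    return dtype == "float16" and model_id == default_model_id


def pick_vae(model_id: str, default_model_id: str, dtype: str):
    """Return the replacement VAE repo id, or None to keep the stock VAE."""
    if needs_fp16_vae_fix(model_id, default_model_id, dtype):
        return FP16_FIX_VAE
    return None
```

Keeping the decision free of any model loading is what lets it be tested
without a download.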

The decision logic is unit-tested without a download (TestFp16VaeFix). The
black->clean recovery itself needs a CUDA GPU and could not be verified on this
MPS machine; it must still be confirmed on an fp16 CUDA/XPU backend.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 14:31:51 -07:00
Victor Kuznetsov 9be66752c5 chore(release): v0.7.1
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
v0.7.1
2026-05-30 14:21:02 -07:00
Victor Kuznetsov 69559226d7 Clarify metadata command supports video/audio, drop misfiring format warning (#33)
The `metadata` command handles more than images: `remove_ai_metadata` strips
C2PA / AIGC provenance from MP4/MOV/M4V/M4A and from WebM/MP3/WAV/FLAC/OGG via
ffmpeg. But the help said "from images" and the shared `_validate_image` call
printed "Warning: .mp4 may not be supported" on exactly those supported
containers. The argument's `exists=True` already enforces the file exists, so
the validation call only added the wrong warning here.

Update the docstring to list the real format coverage and drop the
image-only validation from this command. The image commands keep it.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 13:20:04 -07:00
Victor Kuznetsov 29da3c52b6 Raise default SynthID-removal strength 0.05 → 0.10 (current Google SynthID) (#32)
* Raise default SynthID-removal strength 0.05 -> 0.10 (current Google SynthID)

The old default (0.04/0.05) no longer removes the CURRENT Google SynthID (Nano
Banana / Gemini 3): verified 2026-05-30 via the Gemini 'Verify with SynthID'
oracle on a real image -- 0.05 still detected, 0.10 not detected (OpenAI's was
already cleared at 0.05). Add DEFAULT_STRENGTH=0.10 in watermark_profiles, route
the engine + CLI defaults to it. At 0.10 small text deforms more, which is why
text protection (_run_region_hires) runs by default. CLAUDE.md SynthID note
corrected. CAVEAT: n=1 Google + n=1 OpenAI; broad corpus oracle validation
pending (task tracked).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Drop unused LOW/MEDIUM/HIGH strength profiles; CLI --strength defaults to DEFAULT_STRENGTH

The fixed strength presets (and get_recommended_strength) were dead -- nothing in
the pipeline used them, only tests. One knob now: DEFAULT_STRENGTH (0.10),
overridable per-call via the CLI --strength flag, which now defaults to that
constant (single source of truth). Removed the WatermarkRemover.LOW/MEDIUM/HIGH
class attrs and the get_recommended_strength tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 13:15:58 -07:00
Victor Kuznetsov e4f558dccf Add per-region high-resolution text protection (regenerate crisp, scrub everywhere) (#31)
Replace the default text-protection path. Differential Diffusion froze text in
latent space, which left SynthID intact inside text (violating remove-everywhere)
and still softened sub-8px strokes (VAE latent limit). _run_region_hires instead
scrubs the whole image, then re-scrubs each detected text block at high resolution
and feather-composites it back: every pixel is regenerated (watermark removed
everywhere) while small text stays crisp (high-res strokes span >1 latent cell).

merge_text_regions + feather_paste are pure and unit-tested; each re-scrubbed
patch is phase-correlated back to the original crop to null the ~1-2px round-trip
offset. Synthetic 18px multilingual text: text-region SSIM 0.28 -> 0.48, visually
garbled -> readable across Latin/Cyrillic/CJK. Legacy _run_differential /
build_change_map remain but are no longer the default. Prod use still requires
confirming via the SynthID oracle that re-scrubbed text zones read watermark-free.
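
The feather-composite step can be illustrated in one dimension. This is a
sketch only: `feather_weights` and `feather_paste_1d` are hypothetical names,
the linear ramp is an assumed profile, and the real feather_paste operates on
2-D image patches.

```python
def feather_weights(n: int, feather: int) -> list:
    """Blend weights for a patch of length n: 1.0 inside, linear ramp at both ends."""
    return [min(1.0, (min(i, n - 1 - i) + 1) / (feather + 1)) for i in range(n)]


def feather_paste_1d(base: list, patch: list, start: int, feather: int) -> list:
    """Composite `patch` onto `base` at `start`, fading in over `feather` samples."""
    out = list(base)
    weights = feather_weights(len(patch), feather)
    for i, (value, w) in enumerate(zip(patch, weights)):
        # Weighted blend: the patch dominates in the interior, the base at the seam.
        out[start + i] = w * value + (1.0 - w) * out[start + i]
    return out
```

With a ramp like this the re-scrubbed block replaces the interior fully while
its border fades into the surrounding scrub, which hides the seam.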

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 12:59:29 -07:00
Victor Kuznetsov c928ee6e42 chore(release): v0.7.0
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
v0.7.0
2026-05-30 12:35:32 -07:00
Victor Kuznetsov 89f427852f Fix #30 white box: stop zeroing alpha in the watermark region on save
On RGBA inputs the CLI forced the watermark bbox alpha to 0 on save, so the
removed-sparkle area became a transparent hole that renders as a solid white
box on any non-transparent viewer. The Gemini app exports opaque RGBA, so
every user hit it. Reverse-alpha already recovers the real pixels there (and
`erase` inpaints them), so there is no artifact to hide -- the hole was the
bug, introduced as an over-correction in d091b9f.

`_write_bgr_with_alpha` now rejoins the input alpha plane unchanged (drops the
`clear_region`/`pad` params); the `visible` / `erase` / `all` / `batch` call
sites drop the cleared-region argument and the orphaned region bookkeeping.
The registry `remove()` still returns the mark bbox (used for inpaint_residual
positioning); the CLI just no longer clears alpha with it.

Inverts the test that locked in the old behavior into a #30 regression guard
(watermark-region alpha stays opaque, no pixel forced transparent). Verified
end-to-end on a real Gemini RGBA export: sparkle gone, zero transparent
pixels, clean over a white background.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 12:27:37 -07:00
Victor Kuznetsov 25a1acc53b Detect TC260 AIGC label in JPEG EXIF and late/attribute PNG XMP
A corpus audit surfaced China TC260 AIGC-labeled images that `identify`
missed. Three detection gaps in `aigc_label`, all fixed:

- raw-JSON `{"AIGC":{...}}` in JPEG EXIF (UserComment): brace-matched from
  the scan head with `json.JSONDecoder.raw_decode`, gated on a TC260 field
  like the PNG-chunk path. (Doubao-class output via that export surface.)
- XMP attribute form `TC260:AIGC="{...}"` (PicWish): folded into the
  element regex as a second alternation.
- TC260 XMP packet appended after a large `IDAT`, past the 1 MB scan
  window: `scan_head` now appends late PNG metadata chunks via
  `_png_late_metadata`, mirroring the existing ISOBMFF late-box scan.
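
The brace-matching in the first item relies on the standard library's
`json.JSONDecoder.raw_decode`, which parses one JSON value starting at a given
index and ignores whatever follows it. A minimal sketch, assuming a
hypothetical helper name and a caller-supplied gate key (the field the real
code gates on is not shown here):

```python
import json


def find_aigc_json(text: str, required_key: str):
    """Return the first embedded {"AIGC": {...}} payload that has required_key.

    Hypothetical helper: tries every '{' as a candidate start, because the
    EXIF text around the JSON is arbitrary.
    """
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            obj, _end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            obj = None  # this brace did not open valid JSON; keep scanning
        if isinstance(obj, dict):
            label = obj.get("AIGC")
            if isinstance(label, dict) and required_key in label:
                return label
        pos = text.find("{", pos + 1)
    return None
```

Gating on a known field keeps an unrelated JSON blob that merely contains an
"AIGC" key from counting as a label.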

Adds `scripts/corpus_gap_scan.py`: runs `identify` over a corpus, writes
the per-file report CSV, and flags `unknown` files that carry a known
marker in their metadata region (the audit that found these gaps).
Scanning only the metadata region — not the whole file — avoids the
random short-token collisions inside compressed PNG/JPEG streams.

On the local corpus this lifts 3 files from `unknown` to AI (China AIGC)
and leaves zero false gap candidates. Synthetic piexif/PngInfo fixtures
cover all three forms.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 11:44:53 -07:00
Victor Kuznetsov 58bdf51c59 Visible-watermark registry: reverse-alpha-only Doubao + Gemini, exact native recovery (#28)
* fix(trustmark): gate detection on re-encode durability to kill false positives

TrustMark's wm_present flag is a BCH validity check that spuriously
validates on a content-correlated fraction of un-watermarked images
(AI textures trip it more than camera photos). On a 1343-image set all
20 raw detections were false, several on Gemini/OpenAI/Doubao output that
cannot carry Adobe's watermark, with random-bytes secrets.

A genuine TrustMark is a durable soft binding that survives re-encoding,
so detect_trustmark now re-decodes after a mild JPEG round-trip and
requires the same schema both times. Every observed false positive
collapsed under this gate; the second decode runs only on the rare hit.
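
The durability gate can be sketched with the decoder and the JPEG round-trip
passed in as callables. Everything named here is hypothetical (`decode`,
`reencode`, and the (present, schema) return shape are assumptions for
illustration, not TrustMark's API):

```python
from typing import Callable, Optional, Tuple

Decode = Callable[[bytes], Tuple[bool, Optional[str]]]


def durable_detection(image: bytes, decode: Decode,
                      reencode: Callable[[bytes], bytes]) -> bool:
    """Accept a watermark hit only if it survives a mild re-encode.

    Hypothetical sketch: `decode` returns (present, schema); `reencode`
    stands in for the JPEG round-trip.
    """
    present, schema = decode(image)
    if not present:
        return False  # common case: the second decode never runs
    present_again, schema_again = decode(reencode(image))
    return present_again and schema_again == schema
```

Because the second decode only runs after a first hit, the gate adds cost on
the rare detection path and nothing on the common miss.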

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(identify): Samsung Galaxy AI, FLUX, ByteDance C2PA; fix C2PA substring FP

Detection extensions verified on real signed files (2026-05-29):

- Samsung Galaxy AI: signer attribution via a new _SIGNER_C2PA_PLATFORM
  (Samsung Galaxy / ASUS Gallery) kept separate from the capture-camera
  _DEVICE_C2PA_PLATFORM so a Galaxy AI edit (device cert + AI source type)
  does not trip the camera-vs-AI integrity clash. Plus metadata.samsung_genai:
  the proprietary genAIType marker in PhotoEditor_Re_Edit_Data, a medium-
  confidence AI-editing signal (samsung_only branch).
- Black Forest Labs (FLUX) and ByteDance Volcano Engine (Doubao/Jimeng)
  added as C2PA issuers + issuer->platform mappings.
- fix: C2PA presence required only the bare 4-byte 'c2pa' substring, which
  false-positives on compressed pixel data (a recompressed PNG IDAT re-flagged
  C2PA after its manifest was correctly stripped). New c2pa_marker_in() requires
  the JUMBF wrapper (jumb+c2pa) or the C2PA uuid box; applied in identify +
  metadata. Verified: all 535 real C2PA files carry jumb.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(doubao): gate detection on text structure to cut ~95% of false positives (#23)

Coverage alone over-fired: any textured bottom-right corner cleared the
threshold, so the detector false-positived on ~28% of arbitrary images.
The real '豆包AI生成' mark is six glyphs in one row, so detect now also
requires the text-structure signature (_glyph_structure): many connected
components, no single dominant blob, concentration in a thin horizontal
band. False positives dropped 343 -> 17 across the corpus while keeping
real-mark recall and the doubao-1.png sample. Also accept a no-op force
kwarg for remover-interface symmetry.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(samsung): add Samsung Galaxy AI visible-badge remover

New samsung_engine.py removes the bottom-left sparkle + localized
'AI-generated content' badge that Galaxy AI tools stamp. Mirrors the
Doubao locate->mask->inpaint pattern but bottom-left, with a dual-polarity
top-hat mask (the badge is light-on-dark or dark-on-light). Detection gates
on a band + left-anchor signature (the Doubao CJK-component gate does not
transfer: Latin badge letters connect into few blobs). Explicit-only --
tuned on few real badges with a ~4% FP floor, so it is not used in auto.
Synthetic byte-blob fixtures (real badges are user content, not shipped).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(visible): unified known-watermark registry + LaMa inpaint backend

watermark_registry.py is a single catalog of known visible marks, each
tying {usual location, in_auto flag, recovery strategy, detect adapter,
remove adapter}: gemini (reverse-alpha, exact), doubao, samsung. cmd_visible
is now registry-driven (best_auto_mark for --mark auto; mark_keys() feeds the
CLI choices) -- the per-mark _run_doubao/_run_samsung helper branches are gone.

Cross-engine confidences are not comparable, so the gemini adapter applies the
corpus-validated 0.5 sparkle threshold for auto arbitration (its engine flag is
loose and weakly fired ~0.36 on Doubao text, hijacking auto).

--backend auto|cv2|lama chooses background reconstruction for the mask-based
marks; auto = LaMa when onnxruntime is present, else cv2. For LaMa the mask is
the FILLED glyph bounding box (sparse glyph masks leave anti-aliased edges
behind). cv2 stays the zero-dependency fallback.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: watermark registry, Samsung/FLUX/ByteDance detection, LaMa backend, trustmark gate

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(doubao): exact reverse-alpha removal from captured alpha map

The Doubao '豆包AI生成' mark is a fixed semi-transparent white overlay, so
given its alpha map the original pixels are recovered exactly:
original = (wm - a*logo)/(1-a) -- no inpaint hallucination.

The alpha map + logo colour were solved from real black+gray Doubao captures
on a controlled background: on black, captured = a*logo, and the black/gray pair
solves the alpha a per pixel without assuming the logo colour (a_max~0.65, logo
near-white); the white capture cross-validates (mark vanishes to a flat fill).
Bundled as assets/doubao_alpha.png + geometry constants.
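
The arithmetic behind the capture can be written out per pixel. With
captured = a*logo + (1-a)*bg, the black capture gives a*logo directly and the
gray capture adds (1-a)*gray, so the difference isolates a. A minimal scalar
sketch (function names are hypothetical; the real code works on whole arrays):

```python
def solve_alpha(on_black: float, on_gray: float, gray: float) -> float:
    """Alpha from a black/gray capture pair, independent of the logo colour.

    on_gray - on_black = (1 - a) * gray, hence a = 1 - (on_gray - on_black) / gray.
    """
    return 1.0 - (on_gray - on_black) / gray


def solve_logo(on_black: float, alpha: float) -> float:
    """Logo colour from the black capture, where captured = a * logo."""
    return on_black / alpha


def reverse_alpha(wm: float, alpha: float, logo: float) -> float:
    """Invert the blend: original = (wm - a*logo) / (1 - a)."""
    return (wm - alpha * logo) / (1.0 - alpha)
```

As a usage check with made-up numbers: for a = 0.65, logo = 250, and gray = 128,
the captures read 162.5 on black and 207.3 on gray; solve_alpha recovers 0.65,
and a watermarked value of 194 over an unknown background inverts back to 90.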

remove_watermark_reverse_alpha applies it scaled to image width; exact at the
captured width, so the registry routes doubao through it only when
reverse_alpha_available (width within the calibrated band) and the mark is
detected, falling back to mask inpaint (cv2/LaMa) otherwise. A light residual
inpaint cleans the sub-pixel rescaling error. Add captures at more resolutions
to widen exact coverage.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(visible): reverse-alpha only -- drop inpaint removal + heuristic detection

Per the principle that we only remove/detect what we can do exactly, the
visible-mark path is now reverse-alpha only:

- Doubao detect is reverse-alpha-consistent: match the bundled alpha glyph
  silhouette against the corner via TM_CCOEFF_NORMED (DETECT_NCC_THRESHOLD 0.4)
  -- keys on the '豆包AI生成' SHAPE, not coverage/structure heuristics. FP
  7/1243 (0.6%). Removes the cv2 inpaint path + the _glyph_structure gate.
- Registry is reverse-alpha only: dropped the cv2/LaMa backend (_glyph_remove,
  _lama_box_inpaint, default_backend, --backend) and the Samsung entry. Doubao
  outside the alpha resolution band is skipped, never inpainted.
- Removed samsung_engine.py + tests + --mark samsung (no alpha map captured;
  Samsung C2PA/genAIType metadata detection in identify is unaffected).
- The universal erase --region (cv2/LaMa) is unchanged -- arbitrary-region
  inpainting stays a user-directed tool, separate from the known-mark registry.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(doubao): NCC sub-pixel alignment -> reverse-alpha at any resolution

A pure width-scale of the captured alpha map is only sub-pixel-accurate at the
captured width and leaves a faint ghost elsewhere. remove_watermark_reverse_alpha
now registers the alpha glyph to the actual mark via a TM_CCOEFF_NORMED
scale+position search (_aligned_alpha_map) before inverting the blend, so the
single 2048 capture works at any resolution -- verified clean on the 1773x2364
(3:4) corpus size, the biggest coverage gap (23 files).

reverse_alpha_available is now just 'asset present' (no width band); the registry
still gates removal on detect so a clean corner is never touched. Drops the
_ALPHA_WIDTH_TOLERANCE gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(doubao): keep native recovery exact -- fixed geometry at captured width

Integer-pixel NCC alignment landed ~1px off at the captured width, degrading the
otherwise-exact native reverse-alpha (synthetic recovery error 0.94 -> 1.39).
remove_watermark_reverse_alpha now uses exact width-relative geometry within
_ALPHA_NATIVE_BAND of the captured width and the NCC search only off it -- best
of both: native back to 0.94, other resolutions still aligned.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(doubao): harden alignment -- try fixed+aligned, keep least residual (56/56)

On a faint/busy-background mark the NCC alignment peak can wander a few px off
the true mark and leave a residual (2/56 real corpus files). Off the captured
width, remove_watermark_reverse_alpha now builds BOTH the fixed-geometry and the
NCC-aligned alpha map, applies each, and keeps whichever leaves the least
residual mark (re-detect confidence on the bare reverse-alpha) -- geometry wins
on faint marks, alignment on clear ones, no magic threshold. The real-file
round-trip now cleanly removes 56/56 detected Doubao marks across every corpus
resolution (was 54).
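
The selection rule amounts to "apply every candidate, score each result, keep
the minimum". A generic sketch, with all names hypothetical (`apply_map` stands
in for the reverse-alpha step and `residual` for the re-detect confidence):

```python
from typing import Callable, Sequence, TypeVar

Image = TypeVar("Image")
AlphaMap = TypeVar("AlphaMap")


def remove_with_best_map(image: Image,
                         alpha_maps: Sequence[AlphaMap],
                         apply_map: Callable[[Image, AlphaMap], Image],
                         residual: Callable[[Image], float]) -> Image:
    """Apply each candidate alpha map and keep the result with the least residual."""
    results = [apply_map(image, alpha_map) for alpha_map in alpha_maps]
    # No threshold: the candidates are compared against each other only.
    return min(results, key=residual)
```

Comparing the candidates against each other, rather than against a cutoff, is
what lets the same code favour fixed geometry on faint marks and alignment on
clear ones.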

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* perf(doubao): skip residual inpaint at native width for exact recovery

At the captured width the fixed-geometry reverse-alpha is pixel-exact, so
inpainting over it only replaced exactly-recovered interior pixels with a
cv2 hallucination -- measured worse on a textured background (native error
against the true background: 1.6 with reverse-alpha only vs 2.6 with the old
always-on full-footprint inpaint). Native now returns the bare recovery untouched;
off-native, where NCC alignment is only sub-pixel-approximate, the footprint
inpaint stays to clean the seam. Real round-trip still 56/56 across all
corpus resolutions; negatives 0/60, Gemini unaffected.

Add test_native_returns_exact_reverse_alpha_no_inpaint as the regression
guard. Sync CLAUDE.md + README (the table cell and prose described the
pre-NCC "skipped off native / cv2-LaMa" behavior, now stale). Gitignore the
session scheduled_tasks.lock, and add the text-protection research note.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 19:49:09 -07:00
Victor Kuznetsov ef6fdaeeec Detect text at native resolution (capped), fixing small-text recall on large images (#27)
The text-protection detector scaled every image to a fixed 736 px long side, so
small text on large canvases (e.g. ~16 px on 2048) was downscaled below the
detector and missed -> deformed by the SDXL pass (issue #14). Detect at the
native long side capped at 1536, never upscaled (_detection_input_size, a pure
unit-tested helper). Detection is script-agnostic (DB segments regions, not
characters), so this is language-agnostic: a new benchmark
(scripts/text_detection_benchmark.py) measures recall across Latin/Cyrillic/CJK/
Hangul/Arabic/digits x sizes x canvas -> overall hit-rate 0.91 -> 1.00, worst
cell (2048/16 px) 0.06 -> 1.00. Docs updated.
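
The sizing rule is small enough to sketch. The signature and return shape below
are assumptions for illustration; only the rule itself (native long side,
capped at 1536, never upscaled) comes from the commit.

```python
def detection_input_size(width: int, height: int, cap: int = 1536) -> tuple:
    """Size to run text detection at: native, downscaled only if the long side exceeds cap."""
    long_side = max(width, height)
    scale = min(1.0, cap / long_side)  # never above 1.0, so no upscaling
    return round(width * scale), round(height * scale)
```

The worked numbers explain the recall jump: at the old fixed 736 px long side,
16 px text on a 2048 px canvas shrank to 16 * 736 / 2048 = 5.75 px, while at
the 1536 cap it stays at 16 * 1536 / 2048 = 12 px.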

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 12:28:30 -07:00