- watermark_remover: _build_qwen_kwargs now passes explicit height/width (via
_qwen_target_size, floored to a multiple of 16). Without it QwenImageImg2ImgPipeline
defaults to 1024x1024 and silently squishes non-square inputs, distorting the scene
and garbling text.
- watermark_profiles: resolve_strength gains a `pipeline` arg + a Qwen strength ladder
(_QWEN_VENDOR_STRENGTH, Gemini 0.25), so `--pipeline qwen` gets its certified floor
automatically; retires the manual "pass --strength 0.25 for Gemini on qwen" workaround.
- fidelity_metrics: replace per-face nearest matching (collided on multi-face images when a
variant dropped a face, corrupting the identity metric) with a collision-free one-to-one
assignment (assign_faces_one_to_one). lapvar/LPIPS were always bbox-anchored and immune.
Regression-guarded by tests/test_fidelity_matching.py.
- docs: record the measured outcomes of the qwen-improvement arc. The Qwen ControlNet
face-fix is CLOSED (no permissive Qwen detail/tile ControlNet exists; canny carries edges,
not skin grain). The `--pipeline auto` router + faces+text mixed dual-pass were prototyped
and DROPPED (controlnet wins faces AND display text: abba CER 0.114 vs qwen 0.379).
Z-Image-Turbo was tried and dropped (same regeneration limits). qwen stays a manual opt-in;
controlnet is the default for everything.
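
A minimal sketch of the explicit-size fix from the first bullet, assuming the helper simply floors each input dimension to a multiple of 16. The name `qwen_target_size` and the one-block minimum are illustrative assumptions, not the shipped `_qwen_target_size`.

```python
def qwen_target_size(width: int, height: int, multiple: int = 16) -> tuple[int, int]:
    """Floor each dimension to a multiple of `multiple`, keeping at least one block."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    floored_w = max(multiple, (width // multiple) * multiple)
    floored_h = max(multiple, (height // multiple) * multiple)
    return floored_w, floored_h


# A 1500x1000 input keeps its roughly 3:2 shape instead of being forced to 1024x1024.
w, h = qwen_target_size(1500, 1000)
call_kwargs = {"width": w, "height": h}  # hypothetical: merged into the pipeline kwargs
```

Passing both values explicitly is what keeps a non-square input from falling back to the square default.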
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
For AI-enhanced composites (digitalSourceType compositeWithTrainedAlgorithmicMedia,
identify ai_source_kind == "enhanced"; roadmap P1#8): regenerate ONLY the AI
region and preserve the real photo elsewhere, instead of regenerating the whole
frame.
- noai.tiling.feather_region_composite(base, regenerated, box, *, feather): pure,
model-free compositor that blends the regenerated AI box back over the original
with a feathered seam, leaving pixels OUTSIDE the box exactly equal to base.
Fully unit-tested (outside-box exactness, interior == regenerated, hard paste at
feather 0, monotonic seam ramp, dtype/grayscale/clamp/empty-box/shape-mismatch).
- WatermarkRemover.remove_watermark(region=, region_feather=) and the module-level
convenience function thread it through: the remover regenerates (or tiles) the
frame, then composites only the AI box back over the original input. The box is
caller-supplied -- a C2PA composite manifest carries no reliable machine-readable
region, so none is fabricated. The no-model lossless region path stays
region_eraser.erase.
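
The compositor contract above (outside-box exactness, interior equal to the regenerated frame, hard paste at feather 0, monotonic seam ramp) can be sketched in pure Python on grayscale nested lists. This is an illustration under assumptions: the half-open `(x0, y0, x1, y1)` box convention and the linear ramp are guesses, and the real `noai.tiling.feather_region_composite` may differ.

```python
def feather_region_composite(base, regenerated, box, *, feather=0):
    """Blend `regenerated` over `base` inside `box` = (x0, y0, x1, y1), half-open.

    Both images are equally sized 2-D lists of grayscale values (an assumption
    made to keep the sketch dependency-free).
    """
    h, w = len(base), len(base[0])
    if len(regenerated) != h or any(len(row) != w for row in regenerated):
        raise ValueError("base and regenerated must have the same shape")
    x0, y0, x1, y1 = box
    x0, y0, x1, y1 = max(0, x0), max(0, y0), min(w, x1), min(h, y1)
    out = [list(row) for row in base]  # pixels outside the box are copied untouched
    for y in range(y0, y1):
        for x in range(x0, x1):
            # 1-based distance from this pixel to the nearest box edge
            depth = min(x - x0, x1 - 1 - x, y - y0, y1 - 1 - y) + 1
            weight = 1.0 if feather <= 0 else min(1.0, depth / (feather + 1))
            out[y][x] = (1.0 - weight) * base[y][x] + weight * regenerated[y][x]
    return out


base = [[10] * 8 for _ in range(8)]
regen = [[200] * 8 for _ in range(8)]
blended = feather_region_composite(base, regen, (2, 2, 7, 7), feather=1)
```

With `feather=1` the outermost ring of the box is a 50/50 mix, and everything deeper is the regenerated value.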
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Port the Gemini sparkle dark-pit guard (commit 41f6797) to the shared
TextMarkEngine reverse-alpha base (roadmap P0#8): on a dark or mid-tone
background the captured alpha can over-estimate this image's mark opacity, and
reverse-alpha leaves a darker-than-background glyph ghost instead of recovering
the true pixels. The sparkle-only fix left the text marks unhandled.
_reverse_alpha_oversubtracts predicts the reverse-alpha output PER PIXEL over the
glyph body from the INPUT ((obs - a*logo)/(1-a), the remover's own math); when
the predicted body lands more than _OVERSUB_DARK_MARGIN (25) gray levels below
the local background ring it abandons the reverse-alpha output for the footprint
and inpaints it from the original surroundings (_inpaint_footprint, wider dilate/
radius than the thin residual pass). Predicting per-pixel from the input (not the
produced output, which depends on which placement the remover picked) keeps a
cleanly captured full-strength mark byte-identical -- it predicts back to the
background everywhere, so the guard never trips on it (verified across all three
engines on white/mid/dark/midgray backgrounds).
Regression-guarded by tests/test_text_mark_oversubtraction.py: predicate True on
faint / False on clean, end-to-end no-dark-pit acceptance, clean-mark byte
identity, and textured-background footprint recovery.
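
The guard described above can be sketched as follows, using the quoted formula (obs - a*logo)/(1-a) and the 25-level margin. Comparing the mean predicted body against the mean background ring is an assumption of this sketch, as are the function names; the real `_reverse_alpha_oversubtracts` may aggregate differently.

```python
OVERSUB_DARK_MARGIN = 25  # gray levels, the margin quoted in the message


def predict_reverse_alpha(obs: float, alpha: float, logo: float) -> float:
    """Invert obs = (1 - alpha) * true + alpha * logo for the true pixel."""
    return (obs - alpha * logo) / (1.0 - alpha)


def reverse_alpha_oversubtracts(body, alphas, logo, background_ring) -> bool:
    """True when the predicted glyph body lands too far below the background."""
    predicted = [predict_reverse_alpha(o, a, logo) for o, a in zip(body, alphas)]
    mean_predicted = sum(predicted) / len(predicted)
    mean_background = sum(background_ring) / len(background_ring)
    return mean_background - mean_predicted > OVERSUB_DARK_MARGIN


# Background 100, white logo 255, captured alpha 0.5.
ring = [100.0] * 4
clean_obs = [0.5 * 100 + 0.5 * 255] * 4   # mark really is at alpha 0.5
faint_obs = [0.7 * 100 + 0.3 * 255] * 4   # this image's mark is only at alpha 0.3

clean_trips = reverse_alpha_oversubtracts(clean_obs, [0.5] * 4, 255.0, ring)
faint_trips = reverse_alpha_oversubtracts(faint_obs, [0.5] * 4, 255.0, ring)
```

The clean mark predicts back to the background, so the guard stays quiet; the faint mark predicts about 62 levels too dark and would be routed to the footprint inpaint.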
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Retained-corpus mining (2026-06-20) surfaced three provenance gaps; all are
oracle-free and regression-guarded.
- C2PA vendor coverage (roadmap): register Volcano Engine under its Chinese
legal entity 北京火山引擎科技有限公司 (the Latin "volcengine" needle misses
those certs) -> normalizes to the same ByteDance platform; register ElevenLabs
("Eleven Labs Inc.", pure generative-AI) as a generator. Document the
deliberate exclusion of TikTok Inc. and PixelBin.io/"Fynd" (provenance/transform
signers, not generators) so they are not re-added.
- AI-generated vs AI-enhanced (roadmap): ProvenanceReport.ai_source_kind splits
the C2PA digital-source-type into "generated" (trainedAlgorithmicMedia) vs
"enhanced" (compositeWithTrainedAlgorithmicMedia) so a caller branches a
full-frame scrub from a region-targeted clean. Parsed once in
noai.c2pa._populate_registry_fields (PNG + any c2pa-python-readable container),
with a raw head-scan fallback in identify for the non-PNG raw-blob path. CLI
verdict reads "AI-generated (fully synthetic)" vs "AI-enhanced (real content
with an AI-composited region)"; surfaced in --json.
- Detect-vs-remove threshold desync (P0#7): identify's sparkle threshold and the
removal arbitration gate were two independent 0.5 constants. Unify them into the
single GEMINI_SPARKLE_TRUST_CONF (identify imports it) so they can never drift.
Lowering the gate to recover faint sub-0.5 sparkles was evaluated and REJECTED:
a real Doubao text mark scores ~0.40-0.42 as a gemini match with a higher
core-ring brightness margin than a genuine faint sparkle, so neither confidence
nor the brightness gate separates them in [0.35, 0.5) -- lowering would trade a
rare miss for false-positive removals on clean images. Regression-guarded by
TestSparkleDetectRemoveAlignment (real demo sparkle at borderline opacities;
identify and best_auto_mark must agree on either side of the line).
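
A small sketch of the generated-vs-enhanced split, assuming the kind is derived from the last path segment of the IPTC digital-source-type URI. The function name mirrors the `ai_source_kind` field, but the lookup table and signature are illustrative, not the code in `noai.c2pa`.

```python
from typing import Optional

# Exact-match lookup on the final URI segment, so the composite term can never
# be mistaken for the fully generated one by substring matching.
_KIND_BY_SOURCE_TYPE = {
    "trainedalgorithmicmedia": "generated",
    "compositewithtrainedalgorithmicmedia": "enhanced",
}


def ai_source_kind(digital_source_type: Optional[str]) -> Optional[str]:
    """Map a C2PA digitalSourceType value to "generated", "enhanced", or None."""
    if not digital_source_type:
        return None
    term = digital_source_type.rstrip("/").rsplit("/", 1)[-1]
    return _KIND_BY_SOURCE_TYPE.get(term.lower())


IPTC = "http://cv.iptc.org/newscodes/digitalsourcetype/"
kind = ai_source_kind(IPTC + "compositeWithTrainedAlgorithmicMedia")
```

A caller can then branch: a full-frame scrub for "generated", a region-targeted clean for "enhanced".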
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A third diffusion pipeline alongside sdxl/controlnet: Qwen-Image (20B MMDiT,
Apache-2.0 code AND weights) img2img. The scrub still comes from the img2img
strength; Qwen preserves text (incl. CJK) and structure markedly better than
SDXL at the scrub floor, so it over-regenerates real photos far less (directly
targets the controlnet over-regeneration that degrades real uploads).
- watermark_profiles: QWEN_MODEL_ID, normalize_profile accepts "qwen".
- WatermarkRemover: _load_qwen_pipeline (bf16, loads Qwen base unless --model
overridden, clear ImportError if diffusers lacks the class), _run_qwen (no
MPS fallback -- 20B is CUDA/cloud-class), dispatch in _generate_one/preload,
pure _build_qwen_kwargs (true_cfg_scale, not guidance_scale).
- Shared _base_load_kwargs() across all three loaders (dtype + token).
- CLI --pipeline gains "qwen"; invisible_engine threads it through.
- scripts/qwen_scrub_prototype.py: standalone PEP 723 GPU experiment.
Prototype oracle floors (Modal A100-80GB, single seed, controls SynthID-positive,
PENDING seed-repeat cert): OpenAI clears at strength ~0.10, Gemini at ~0.30 (0.20
still detected), with CJK text + faces faithful where controlnet plasticizes. The
Gemini floor is higher than the shared default ladder, so pass an explicit
--strength for Gemini on this pipeline until a Qwen-specific ladder is certified.
The model-running path is CUDA-only (untestable locally); unit tests cover the
pure call-shape (_build_qwen_kwargs) and profile normalization without torch.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a lossless alternative to the --max-resolution downscale for large
images that OOM on MPS/GPU: regenerate in overlapping, feather-blended
tiles at native resolution.
- noai/tiling.py: pure plan_tiles (uniform tiles, last flush to edge) +
feather_weights (strictly-positive separable taper -> partition-of-unity
blend) + run_tiled (per-tile generate callable, decoupled from the
pipeline). Unit-tested without the model.
- WatermarkRemover.remove_watermark: refactor _generate into _generate_one
+ a tiled branch that engages only when --tile is set and the long side
exceeds tile_size (ControlNet canny is rebuilt per tile).
- Thread tile/tile_size/tile_overlap through InvisibleEngine and the
invisible/all/batch CLI commands via a shared _tile_options decorator.
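
The tiling pieces listed above can be illustrated in one dimension. This is a sketch under assumptions: the linear taper and the 1-D signal are simplifications, and the signatures do not claim to match `noai/tiling.py`.

```python
def plan_tiles(length, tile_size, overlap):
    """Start offsets of uniform tiles; the last tile sits flush with the edge."""
    if not 0 <= overlap < tile_size:
        raise ValueError("overlap must satisfy 0 <= overlap < tile_size")
    if length <= tile_size:
        return [0]
    stride = tile_size - overlap
    starts = list(range(0, length - tile_size, stride))
    starts.append(length - tile_size)
    return starts


def feather_weights(size, overlap):
    """Strictly positive taper: ramps up over the overlap, flat in the middle."""
    return [min(1.0, (min(i, size - 1 - i) + 1) / (overlap + 1)) for i in range(size)]


def run_tiled(signal, tile_size, overlap, generate):
    """Run `generate` per tile and blend with normalized weights."""
    length = len(signal)
    size = min(tile_size, length)
    weights = feather_weights(size, overlap)
    acc = [0.0] * length
    wsum = [0.0] * length
    for start in plan_tiles(length, tile_size, overlap):
        tile_out = generate(signal[start:start + size])
        for i, value in enumerate(tile_out):
            acc[start + i] += weights[i] * value
            wsum[start + i] += weights[i]
    # Dividing by the accumulated weight makes the blend a partition of unity.
    return [a / w for a, w in zip(acc, wsum)]


signal = [float(i % 7) for i in range(50)]
blended = run_tiled(signal, tile_size=16, overlap=4, generate=lambda tile: list(tile))
```

Because the weights are strictly positive, every position has a non-zero weight sum, so an identity `generate` reproduces the input across the seams.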
Verified end-to-end on the real SDXL pipeline (forced 2x2 tiling on a
1024px sample, MPS): non-degenerate output, no gross seam at tile borders.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Closes a documented coverage gap (P2#9): an AI Software/Make/Artist/ImageDescription
token in an EXIF item (its TIFF bytes live in mdat/idat) survived remove_ai_metadata
because the top-level box stripper and (absent pillow-heif) the PIL EXIF reader can't
reach it. New isobmff.blank_ai_exif_tokens finds EXIF TIFF blocks by their II/MM
byte-order header, validates each with piexif (a coincidental II/MM run in pixels
won't parse as a TIFF IFD, so it's ignored), and overwrites any AI_GENERATOR_TOKENS-
bearing value with same-length spaces -- so box sizes and iloc offsets stay valid and
the coded image is untouched (mirrors blank_ai_xmp_packets; no iinf/iloc surgery, no
exiftool dep). Camera/editor EXIF without an AI token is preserved. Wired into
remove_ai_metadata's ISOBMFF path. Covers the realistic AI-generator-token case; xAI-
signature-in-meta-box-EXIF (Grok is JPEG-only) stays out.
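
The length-preserving overwrite is the key property: nothing shifts, so box sizes and iloc offsets stay valid. A deliberately simplified sketch, assuming a plain byte search that blanks only the token itself; the real `isobmff.blank_ai_exif_tokens` validates each TIFF block with piexif and blanks the whole value, and the token list here is illustrative.

```python
def blank_tokens_same_length(data: bytes, tokens) -> bytes:
    """Overwrite every case-insensitive token hit with the same number of spaces."""
    out = bytearray(data)
    haystack = data.lower()  # bytes.lower() folds ASCII only; length is unchanged
    for token in tokens:
        needle = token.lower().encode("ascii")
        if not needle:
            continue  # an empty token would never advance the search
        pos = haystack.find(needle)
        while pos != -1:
            out[pos:pos + len(needle)] = b" " * len(needle)
            pos = haystack.find(needle, pos + len(needle))
    return bytes(out)


# Hypothetical EXIF-like payload: one AI generator string, one camera string.
payload = b"II*\x00\x08\x00\x00\x00DALL-E 3\x00Canon EOS R5\x00"
cleaned = blank_tokens_same_length(payload, ["dall-e"])
```

The camera string survives untouched and the buffer length is identical, which is why no iinf/iloc surgery is needed.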
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The reverse-alpha text-mark engine (Doubao/Jimeng/Samsung) allocated
full-frame arrays where only the glyph footprint is ever read:
- _fixed_alpha_map / _aligned_alpha_map each built a full (h, w) float32
alpha map non-zero only inside the glyph box, and two were held at once
during removal (~96 MB of mostly-zeros on a 12 MP frame);
- extract_mask built a full (h, w) uint8 mask that every caller cropped to
the located box (~12 MB, rebuilt per text-mark detector on the
memory-tight identify path).
Both now return footprint-sized arrays: the alpha helpers return the
glyph-sized block plus its placement (ax, ay, gw, gh), and extract_mask
returns the box-sized mask. _apply_reverse_alpha consumes the block
directly; the residual inpaint embeds it into one full-frame uint8 mask only
at cv2.inpaint time (which needs a full-frame mask).
remove_watermark_reverse_alpha tracks the winning region alongside best_amap
to place it.
Peak allocation drops from O(image*4)x2 + O(image) to O(footprint)x2 +
one gated O(image*1) uint8 mask -- a win every consumer gets, motivated by
the 512 MB raiw.cc worker that OOMs on large decodes. GPU path untouched.
Byte-identical to the old full-frame path (verified: 17 output hashes
across the three engines, inpaint/no-inpaint, detect, and the real
doubao-1.png fixture, unchanged before/after). tests/test_text_mark_memory.py
guards it by reconstructing the old full-frame path inline and asserting
equality, so the proof survives a cv2/asset bump, and pins the O(footprint)
shape so a regression to full-frame fails loudly.
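
The allocation figures above can be checked with a few lines of arithmetic. The 4000x3000 frame is one way to reach 12 MP, and the 400x120 glyph footprint is a hypothetical size chosen only for illustration.

```python
def array_bytes(height: int, width: int, bytes_per_element: int) -> int:
    """Size of a dense (height, width) array in bytes."""
    return height * width * bytes_per_element


FRAME_H, FRAME_W = 3000, 4000   # 12 MP frame
GLYPH_H, GLYPH_W = 120, 400     # hypothetical glyph footprint

# Old path: two full-frame float32 alpha maps plus a full-frame uint8 mask.
old_alpha_maps = 2 * array_bytes(FRAME_H, FRAME_W, 4)
old_mask = array_bytes(FRAME_H, FRAME_W, 1)

# New path: two footprint-sized float32 blocks, plus one full-frame uint8 mask
# that is built only when the residual inpaint actually runs.
new_alpha_blocks = 2 * array_bytes(GLYPH_H, GLYPH_W, 4)
new_inpaint_mask = array_bytes(FRAME_H, FRAME_W, 1)

print(f"old: {(old_alpha_maps + old_mask) / 1e6:.0f} MB")
print(f"new, no inpaint: {new_alpha_blocks / 1e6:.3f} MB")
print(f"new, with inpaint: {(new_alpha_blocks + new_inpaint_mask) / 1e6:.3f} MB")
```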
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
flux-1.png / flux-1.jpg are real Black Forest Labs FLUX.2 [pro] Playground
outputs (signed C2PA, issuer "Black Forest Labs" + trainedAlgorithmicMedia,
manifests verified to contain no personal data). flux-1.jpg is the first
committed JPEG-with-C2PA fixture, exercising the c2pa-python non-PNG reader path
end to end. Regression tests assert both attribute to "Black Forest Labs (FLUX)".
Also documents the verified finding (n=2, 2026-06-19): BFL's hosted output carries
the signed C2PA manifest but NOT the open invisible-watermark DWT-DCT mark (it
decodes to degenerate all-ones, chance-level vs the FLUX reference) -- the open
pixel mark is an optional step found only in the dev inference code. So a hosted
FLUX.2 image is identified by C2PA alone, with no open-pixel fallback once C2PA is
stripped.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Mining the local production corpus (25,725 imgs) surfaced two AI vendors signing
C2PA that the registry missed:
- Canva (Magic Media) signed "Canva" + trainedAlgorithmicMedia -> detected AI but
no platform attributed (disproves the old "Canva exports strip C2PA" assumption).
- BytePlus (ByteDance international: Seedream/Seededit) signs "Byteplus Pte. Ltd.";
the bare volcengine needle missed it, so its output was mis-attributed to "Adobe
Firefly" via an incidental "Adobe XMP" string the fallback byte-scan picked up.
Adding both to C2PA_AI_VENDORS lets the clean manifest issuer attribute them
directly. Corpus re-run: 16 platform changes, all improvements (3 Adobe->ByteDance
fixes, 4 None/TC260->ByteDance, 9 None->Canva), 0 regressions. An attempted
signer-based attribution fallback was measured and dropped: it regressed 18 images
(friendly ByteDance label -> raw Chinese cert org; IPTC tool name pre-empted).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
extract_c2pa_info now uses the c2pa-python Reader first (any container, whole
manifest store incl. ingredient manifests), falling back to the hand-rolled caBX
parser for blobs the validator rejects (synthetic/partial, broken wheel). The
issuer/source-type/SynthID/soft-binding registry scan is shared by both paths
(_populate_registry_fields), so the return-dict contract is unchanged. Also
replaces the dead `from c2pa import has_c2pa_metadata` import in metadata.py with
a real Reader presence check. c2pa-python added as a core dep (MIT/Apache, ~+5MB
RSS, no torch; wheels cover the CI matrix).
Validated on the full local spaces corpus (25,725 imgs): 0 regressions; 384
manifests newly parsed (379 non-PNG JPEG/WebP + 2 PNGs the byte-scanner missed);
3 false Adobe/Microsoft->Google attributions fixed via real-manifest parsing.
The docs/module-internals.md section for this change already landed in 41f6797.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The free `visible` path over-subtracted a faint Gemini sparkle on a
mid-tone background into a darker-than-background brown diamond instead
of removing it (2026-06-18 prod NPS report, "the watermark was not
removed, just its color changed"). The existing over-subtraction guard
only tripped when reverse-alpha drove a footprint pixel fully negative
(the issue #30 dark-background black-pit case); on a mid-tone background
the over-subtraction darkens the core well below the background without
any pixel crossing zero, so the gate missed it and shipped the dark mark.
Add a second over-subtraction signal to `_reverse_alpha_oversubtracts`:
predict the reverse-alpha output at the bright core, (core - a*logo)/(1-a),
and route to the footprint inpaint when it lands more than
`_OVERSUB_DARK_MARGIN` (25) gray levels below the local background ring.
Calibrated wide: clean removals predict within ~12 of background
(demo_banana ~-1), the prod regression ~-40, the issue #30 dark case ~-82.
Corpus-validated on the 479 detected Gemini images: 10 switch reverse-alpha
to inpaint, all of them dark-diamond cases that improve or match; the
other 469 stay byte-identical. demo_banana stays on the reverse-alpha
path (byte-identical).
Also crop both reverse-alpha helpers to the region they actually touch,
a pure O(image) -> O(mark) win that is byte-identical to the full-frame
math (a uint8<->float32 round-trip is exact):
- `GeminiEngine._core_and_bg` converts only the footprint+ring crop to
gray, not the whole frame (~70 ms -> 0.1 ms on a 12 MP image; it runs
for both the alpha-gain estimate and the new gate). Verified identical
across 479 images; detector confidence unchanged.
- `TextMarkEngine._apply_reverse_alpha` computes the blend on the glyph
crop only (`amap` is zero outside it, so the math is a no-op there):
~275 ms -> ~2 ms per placement on a 12 MP frame, up to 2 placements per
removal. Verified identical across 142 Doubao/Jimeng placements.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
identify(check_visible=True) ran the Gemini-sparkle detector and the
Doubao/Jimeng text-mark detector each with its own image_io.imread, so the
same bitmap was fully decoded twice. On a memory-constrained host (the raiw.cc
512 MB web worker, which runs identify on every upload) that doubled the peak
decode allocation and contributed to OOM restarts.
Decode once in identify() and pass the BGR array to both detectors. The detect
methods already accept an NDArray, so this only threads the pre-decoded array
through: detect_sparkle_confidence and the two _visible_* helpers gain an
optional image= param that, when None, preserves the old self-read behavior
(so direct callers and the cv2-missing/unreadable paths are unchanged).
Only the visible path is deduplicated; the optional check_invisible decoders
are unaffected (and off on the web hot path). Adds a test asserting
identify(check_visible=True, check_invisible=False) decodes exactly once.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
A 2026-06-14 oracle re-test on the deployed Modal controlnet worker (v0.10.0)
cleared SynthID at OpenAI 0.10 (2 photoreal) and Google 0.15 (2 native
2816x1536, retiring the "native >= 0.30" guess), while a pixel sweep showed the
2026-06-04 cert floors (0.20/0.30) over-regenerated for no efficacy gain
(Google MAE -20% at 0.15). Lowers OPENAI_STRENGTH 0.20->0.10, GEMINI_STRENGTH
and UNKNOWN_STRENGTH 0.30->0.15.
Caveats documented in watermark_profiles.py + docs: removal near this floor is
seed-non-deterministic (a service must pin a verified seed), and the n=2 re-test
did not cover flat-graphic hard cases.
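
A sketch of how the lowered constants might feed a vendor-adaptive lookup. The constant names and values come from this message; the vendor keys, the `explicit` override parameter, and the function body are assumptions rather than the real `resolve_strength`.

```python
from typing import Optional

OPENAI_STRENGTH = 0.10
GEMINI_STRENGTH = 0.15
UNKNOWN_STRENGTH = 0.15

# Hypothetical vendor keys; the real ladder may key on different identifiers.
_VENDOR_STRENGTH = {
    "openai": OPENAI_STRENGTH,
    "google": GEMINI_STRENGTH,
}


def resolve_strength(vendor: Optional[str] = None, explicit: Optional[float] = None) -> float:
    """An explicit --strength wins; otherwise fall back to the vendor ladder."""
    if explicit is not None:
        return explicit
    return _VENDOR_STRENGTH.get((vendor or "").lower(), UNKNOWN_STRENGTH)


strengths = {v: resolve_strength(v) for v in ("openai", "google", None)}
```

Because removal near these floors is seed-non-deterministic, a service would pair the resolved strength with a pinned, verified seed.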
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The 'no signal' branch of the visible no-mark path claimed 'No AI provenance
signal found either', which reads as 'the image is clean'. A missing metadata
proxy is not proof that an invisible pixel watermark (SynthID) is absent: the mark
cannot be detected once the metadata is gone, and that metadata may have been
stripped upstream. The
message now preserves that uncertainty and routes to both 'all' (regenerate
pixels) and 'erase'. Regression-guarded by the SynthID/all asserts in
test_cli.py. CLAUDE.md visible-command note updated to match.
Also adds a 'Scope and non-goals' section (CLAUDE.md + README): removing
AI-provenance marks on the user's own content is in scope; stripping
stock/paid-content watermarks (Shutterstock/Getty/iStock, classifieds) is out
of scope by principle, not by difficulty.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When `visible --mark auto` (or an explicit `--mark` with detection on) found
no registered mark, it exited 0 without writing output -- which a wrapping
service reads as success and re-serves the unchanged input. ~74% of real
uploads carry no registered visible mark, so this was the dominant "it didn't
work" / NPS score-0 failure mode.
Now it runs a cheap metadata-only identify, prints actionable guidance (route
to `all` for an invisible/metadata mark, or `erase` for an arbitrary logo),
writes no output file, and exits EXIT_NO_VISIBLE_MARK (2) -- distinct from
success (0) and a hard error (1) so the caller can surface the message.
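
How a wrapping service might consume the three exit codes, as a sketch. `EXIT_NO_VISIBLE_MARK` and its value come from this message; the other constant names, the action strings, and `next_action` are hypothetical.

```python
EXIT_OK = 0
EXIT_ERROR = 1
EXIT_NO_VISIBLE_MARK = 2


def next_action(returncode: int, output_written: bool) -> str:
    """Decide what the caller should do after the `visible` command returns."""
    if returncode == EXIT_OK and output_written:
        return "serve_output"
    if returncode == EXIT_NO_VISIBLE_MARK:
        # No registered mark: surface the guidance (route to `all` or `erase`)
        # instead of re-serving the unchanged input as if it were cleaned.
        return "show_guidance"
    return "report_error"


action = next_action(EXIT_NO_VISIBLE_MARK, output_written=False)
```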
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The 256->512 detection-search widening (v0.8) let a large, low-gradient
shape match outrank a genuine mid-size corner sparkle whose raw NCC sits
below the 0.85 corner-promote gate, so `identify` read `unknown` on Gemini
images that v0.7.2 caught (reporter osachub: scale-48 sparkle on light
bedding -- true sparkle spatial 0.775 / grad 0.960 / fusion 0.676, but the
size-weighted argmax locked onto a decoy at spatial 0.628 / grad 0.036).
detect_watermark now keeps the top-K (_SELECT_TOPK=3) size-weighted
candidates (NMS-deduped) plus the corner-promote candidate, scores each by
full fusion (spatial+gradient+variance) via the extracted _grad_var_scores
helper, and selects the highest -- the gradient term lifts the true sparkle
over the decoy. Ranking by the SIZE-WEIGHTED score (not a raw-NCC argmax)
preserves tiny-patch suppression: a raw-NCC argmax re-admitted 16-18px
content false positives (14/65 doubao + 4/11 jimeng visible images). Top-K
adds zero flips on the doubao/jimeng corpora and leaves the 495-image Gemini
set unchanged (479 detected) while recovering the reporter's image at 0.676.
- _grad_var_scores: gradient/variance scoring factored out of detect_watermark
- confidence = best_fused (drop the duplicated fusion recompute)
- tests: rename test_promotion_is_what_rescues_it ->
test_size_weighted_search_alone_traps_on_the_decoy (corner-promote is no
longer the sole rescue path); add a deterministic regression test mirroring
the real spatial/grad signature
- docs: module-internals.md detector section + CLAUDE.md mechanism map
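
The two-stage selection can be sketched as: shortlist by size-weighted score, then pick by full fusion. The sparkle's 0.676 fusion comes from this message; every other number, the `Candidate` type, and the omission of NMS dedup are simplifications of this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

SELECT_TOPK = 3


@dataclass(frozen=True)
class Candidate:
    name: str
    size_weighted: float  # ranking score used for the shortlist
    fused: float          # spatial + gradient + variance fusion


def select_candidate(
    candidates: Sequence[Candidate],
    corner_promote: Optional[Candidate] = None,
    k: int = SELECT_TOPK,
) -> Optional[Candidate]:
    """Shortlist the top-k by size-weighted score, then take the best fusion."""
    pool = sorted(candidates, key=lambda c: c.size_weighted, reverse=True)[:k]
    if corner_promote is not None and corner_promote not in pool:
        pool.append(corner_promote)
    return max(pool, key=lambda c: c.fused) if pool else None


candidates = [
    Candidate("decoy", size_weighted=0.90, fused=0.30),    # large, low-gradient shape
    Candidate("sparkle", size_weighted=0.70, fused=0.676),
    Candidate("other", size_weighted=0.60, fused=0.20),
    Candidate("tiny_patch", size_weighted=0.10, fused=0.80),  # 16-18px content match
]
winner = select_candidate(candidates)
```

Shortlisting by the size-weighted score is what keeps the tiny patch out of the pool even though its fusion is highest; a size-weighted argmax alone would have returned the decoy.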
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
test_all_basic / test_all_visible_step_uses_registry asserted exit 0 but did
not patch is_available, so on CI (core+dev only, no gpu) they took the skip
branch and hit the new non-zero exit. They passed locally, where the gpu extra is
present.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Step 2 (invisible/SynthID) was skipped with a quiet inline warning and the
run still exited 0, so a missing [gpu] extra was mistaken for a clean result
(recurring #14/#47). Add a prominent end-of-run banner and a non-zero exit.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- AIGC: parse the bare ``AIGC{...}`` blob form (label glued to its JSON in a
JPEG APP segment near the JFIF header), and scan both raw-JSON forms in one
fall-through loop so a quoted ``"AIGC"`` later in an XMP packet no longer
shadows a real bare label earlier in the file (3 files read unknown before).
- Integrity clash rule 2: a camera device + an AI marker from the SAME C2PA
manifest (Google Pixel Magic Editor / Pixel Studio edit chain) is a legitimate
edit chain, not a contradiction. Fire only when the AI marker's source is
independent of the camera's manifest; pure cameras (Leica/Sony/Nikon) are
unaffected (2 Pixel files mis-flagged before).
- New c2pa_cloud_manifest detector: surface a C2PA 2.4 Durable Content
Credentials cloud-manifest reference (Adobe cai-manifests.adobe.com) as a
medium provenance signal when the embedded manifest is stripped. Provenance
only, never asserts is_ai (2 files read fully unknown before).
identify reuses its already-loaded scan head for the cloud check (no second
read). +7 tests; CLAUDE.md + README synced.
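
A sketch of parsing the bare `AIGC{...}` form from the first bullet, assuming the label is immediately followed by a JSON object. `json.JSONDecoder.raw_decode` stops at the end of the object, so trailing JPEG bytes are ignored. The function name and the sample keys are illustrative, and the quoted `"AIGC"` form is not handled here.

```python
import json
from typing import Optional


def parse_bare_aigc(data: bytes) -> Optional[dict]:
    """Return the first JSON object glued to a bare AIGC label, or None."""
    marker = b"AIGC{"
    decoder = json.JSONDecoder()
    pos = data.find(marker)
    while pos != -1:
        brace = pos + len(marker) - 1  # index of the opening "{"
        tail = data[brace:].decode("utf-8", errors="replace")
        try:
            obj, _end = decoder.raw_decode(tail)
        except json.JSONDecodeError:
            obj = None  # not valid JSON here; keep scanning for a later label
        if isinstance(obj, dict):
            return obj
        pos = data.find(marker, pos + 1)
    return None


# Hypothetical JPEG head: APP segment bytes, the glued label, then more binary.
head = b"\xff\xd8\xff\xe0JFIF\x00" + b'AIGC{"Label": "1", "ContentProducer": "demo"}' + b"\xff\xdb\x00"
label = parse_bare_aigc(head)
```

Scanning forward from each failed hit mirrors the fall-through idea: a malformed early match does not hide a valid later one.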
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Nine findings from a high-effort project-wide review, fixed and verified
(571 passed, ruff/pyright clean):
Correctness:
- all/batch now remove Doubao/Jimeng/Samsung visible text marks: the visible
step routes through the registry (new cli._remove_visible_auto) instead of a
hardcoded GeminiEngine, so they no longer leave the wordmark intact.
- batch always reads the original source (dropped the out_path-reuse that
re-processed already-cleaned outputs on a re-run).
- img2img_runner only retries the diffusion call on the deprecated-callback
TypeError; any other TypeError now propagates instead of double-running.
- gemini detect/remove and the reverse-alpha engines normalize channels via a
new image_io.to_bgr, fixing a grayscale/BGRA crash in the FP-gate path.
- _png_late_metadata advances its cursor by the clamped length, so a malformed
chunk length no longer aborts the late AI-label scan.
Cleanup / efficiency:
- Consolidate the ~90%-identical Doubao/Jimeng/Samsung engines into a shared
config-driven _text_mark_engine.TextMarkEngine base; each engine is now a thin
subclass (TextMarkConfig + test shims). Behavior is byte-exact (the three
engine test suites pass unchanged). Registry adapters collapse to one
_text_mark(...) row each. Gemini stays a separate engine.
- scan_head is memoized per (path, size, mtime), so identify() reads the file
head once instead of ~8 times.
- invisible_engine post-processing decodes/encodes the output once (chained in
memory) instead of 2-4 times across stages.
- Remove the orphaned get_model_id_for_profile (+ CONTROLNET_PROFILE); derive
the --strength help from the strength constants (strength_default_help) so it
cannot drift; share the --pipeline/--strength click options; simplify the
retired --auto resolver.
Net -835 lines. Tests added for the registry-routed visible pass, to_bgr,
the polish/model/guidance wiring, and strength_default_help. CLAUDE.md updated
for the new base module, the engine/registry changes, image_io.to_bgr, and the
scan_head cache.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Overhaul the diffusion-removal surface around a single robust default and a
complete, consistent CLI.
Pipeline + strength:
- controlnet is now the DEFAULT pipeline (CLI --pipeline + both engine ctors).
With the certified higher strength it clears both photoreal and flat-graphic
content, whereas plain SDXL left SynthID on flat graphics.
- Rename the plain-SDXL profile default -> sdxl; "default" stays as a back-compat
alias (normalize_profile + a click callback that warns).
- Unify the strength ladder: resolve_strength applies ONE vendor-adaptive ladder
(the certified controlnet floors OpenAI 0.20 / Google 0.30 / unknown 0.30) to
both pipelines. sdxl is the weaker remover on its own hard case (flat fills),
so the certified floor is the right floor for it too.
CLI completeness:
- Add --model (HF model id) to invisible + batch (was only on all) and
--guidance-scale (CFG) to all three diffusion commands; both were library
knobs the CLI did not expose.
- Flip --adaptive-polish to ON by default (it self-gates to a no-op where there
is no detail deficit, so default-on is safe).
- Share --pipeline / --strength / --model / --guidance-scale as single
decorators so invisible/all/batch keep an identical surface; the --strength
help is derived from the strength constants (strength_default_help) so it can
never drift from the ladder.
Removals:
- Delete the auto_config content-detection planner + its YuNet/DBNet assets
(~2.6 MB): with controlnet always the pipeline and the polish self-gating, the
face/text/edge detection no longer changed behavior. --auto is now a deprecated
no-op that only warns (the polish it enabled is the default).
Docs (README, CLAUDE.md, docs/synthid.md) updated throughout; added an
InvisibleEngine Python API example. Tests cover the alias warnings, the
polish default, and the --model/--guidance-scale wiring.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Empirical conclusion from the 2026-06-04 - 2026-06-08 Modal cert sweeps:
every face-restore approach we built (GFPGAN-on-cleaned, PhotoMaker-V2,
InstantID txt2img, InstantID img2img-on-cleaned at three parameter
settings) regenerates the face via SDXL diffusion rather than preserving
it. Output face pixels are diffusion-fresh, so the regenerated face
inherits the SDXL "clean skin" aesthetic and loses original identity
precision -- it looks MORE AI-generated than the cleaned image, not
less. The cleaned image from the main controlnet 0.20 removal pass is
the least-AI face state we can reach without re-introducing SynthID.
Nothing in the restore family achieves the actual goal (preserve the
original person's face). Keeping them around as opt-in invites users to
ship something that defeats the point. Removing entirely.
Library changes:
- Deleted src/remove_ai_watermarks/instantid_restore.py
- Deleted src/remove_ai_watermarks/photomaker_restore.py
- Deleted tests/test_instantid_restore.py
- Deleted tests/test_photomaker_restore.py
- Removed `instantid` and `photomaker` extras from pyproject.toml
- Removed `[tool.hatch.metadata] allow-direct-references = true` (was
only needed for the photomaker git+ URL)
- InvisibleEngine.remove_watermark: dropped `restore_faces` +
`restore_faces_method` params, removed both `_restore_faces_instantid`
and `_restore_faces_photomaker` private methods, removed dispatch
- CLI: dropped `_restore_faces_options` decorator, all four cmd_*
signatures lose `restore_faces` + `restore_faces_method`, kwarg passes
to remove_watermark dropped
- _apply_auto: dropped `restore_faces` from tuple shape (was unused after
the engine no longer takes it)
- auto_config.AutoConfig: dropped `restore_faces` field; `plan()` no
longer sets it; `reason` no longer mentions it
- Tests updated accordingly (test_auto_config.TestReason no longer asserts
"face-restore on" in the reason string)
Docs updated:
- CLAUDE.md: removed the photomaker extras bullet, the Face restore
trade-off bullet, the instantid_restore.py + photomaker_restore.py
module bullets; replaced restore mentions in watermark_remover and
controlnet bullets and prod recipe with the empirical conclusion
- README.md: removed both `--restore-faces` callouts and the install
snippet; the feature bullet and auto-mode comment updated
- docs/synthid-robust-identity-research.md: added Status-retired notice
at the top pointing at the 2026-06-08 followup
raiw-app:
- modal_cert.py: dropped `--restore-faces` flag entirely; sweep() no
longer takes restore_faces; pinned _LIB_SPEC to `[gpu]` extras (no
`photomaker` / `instantid` extras), points at main
ruff + strict pyright clean; 569 tests pass; 18 restore-specific tests
gone.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Last commit added `_color_match` which shifts the face crop's mean to the
canvas mean -- the old test fed a uniform face (210) into a uniform cleaned
canvas (90), so after color-match the face was uniform 90 and the
composite was undetectable by value. Switched the fake pipeline to a
gradient face so the color-match preserves variance, and the assertion
now checks that the face region has non-zero std (composite injected
gradient pixels) instead of a value threshold.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per the 2026-06-08 deep-research synthesis
(docs/synthid-robust-identity-research-2026-06-08.md), the entire
ArcFace-class identity-adapter ecosystem
for SDXL is blocked from commercial use by InsightFace's non-commercial model
packs (antelopev2 / buffalo_l). No commercial-safe ArcFace-grade identity
stack exists today. The user explicitly opted into shipping a non-commercial
restore path (research / personal use; raiw.cc must NOT install the extra).
Architectural choice: InstantID over PhotoMaker-V2 as the default.
- PhotoMaker-V2 (CLIP+ArcFace dual encoder, txt2img only): documented upstream
identity drift on Asian male faces, visually confirmed in our cert sweep
(tatsunari rendered as a generic woman; group photo collapsed into a
patchwork).
- InstantID (ArcFace cross-attention + landmark ControlNet): a semantic
identity branch plus weak spatial landmark control, decoupled. Per the InstantID
paper (arXiv:2401.07519) and the research report, stronger identity fidelity
on single portraits. Critically: NO original face pixels enter the diffusion
(ArcFace embedding is semantic, landmark stick figure is pure geometry), so
SynthID is not transported.
Implementation:
- New `src/remove_ai_watermarks/instantid_restore.py` mirrors the
`photomaker_restore.py` shape (lazy singletons for pipeline + FaceAnalysis,
per-face crop + _composite_faces from photomaker_restore). Loads the
InstantID community pipeline via `DiffusionPipeline.from_pretrained(
custom_pipeline="pipeline_stable_diffusion_xl_instantid")` -- no upstream
Python package needed; diffusers fetches the file from its community
examples.
- New `instantid` extra in pyproject (insightface + onnxruntime +
huggingface-hub). NON-COMMERCIAL block in the comment explains why.
- CLI: `--restore-faces-method [instantid|photomaker]`, default `instantid`.
Both methods explicitly labeled NON-COMMERCIAL in the help text.
- Engine: dispatch on `restore_faces_method` to either
`_restore_faces_instantid` or `_restore_faces_photomaker`.
- 9 control-flow tests for InstantID without model download (mirror the
photomaker_restore.py test pattern + draw_kps helper checks). 587/587 pass.
Diffusers-0.38 compat verified by upstream code inspection: the InstantID
pipeline inherits from `StableDiffusionXLControlNetPipeline`, uses only
public diffusers APIs (`encode_prompt`, `prepare_image`, `prepare_latents`,
`get_guidance_scale_embedding`), and uses the legacy attention-processor API,
which diffusers preserves for backward compat. No PhotoMaker-V1-style internal
text_encoder access. End-to-end execution will be validated by the Modal
cert sweep in the next step.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The previous commit added a real call into FaceAnalysis2 / analyze_faces inside
restore_faces_photomaker, which broke the model-free control-flow test. Stub it:
- monkeypatch _get_face_analyser to return a sentinel
- install a fake `photomaker` module with analyze_faces returning a single
512-d zero embedding
- add dtype=torch.float32 to the fake pipeline class so .to(device, dtype=...) works
11/11 green.
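A minimal sketch of the fake-module stub described above, assuming the code under test does `from photomaker import analyze_faces` at call time. The helper name `install_fake_photomaker`, the `analyze_faces` parameter names, and the dict-shaped face record are illustrative assumptions, not the repository's test code; a plain list stands in for the embedding array.

```python
import sys
import types


def install_fake_photomaker(embedding_dim=512):
    """Register a stand-in `photomaker` module so the import resolves
    without the real package or any model download (hypothetical helper)."""
    fake = types.ModuleType("photomaker")

    def analyze_faces(face_analyser, image, det_size=(640, 640)):
        # Assumed return shape: one detected face carrying an all-zero
        # embedding. A plain list is used here instead of a numpy array.
        return [{"embedding": [0.0] * embedding_dim}]

    fake.analyze_faces = analyze_faces
    sys.modules["photomaker"] = fake
    return fake
```

Inside a pytest test, `monkeypatch.setitem(sys.modules, "photomaker", fake)` would do the same registration and undo it automatically at teardown.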
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Visual review of the GFPGAN-on-cleaned output (9-face grid, 1448x1086) showed it
only polished the already-drifted face without restoring identity — useless for the
"restore who is in the photo" intent. Dropping it.
The shipped restore path is now PhotoMaker-V2, which delivers true identity-from-
embedding face regeneration via a CLIP+ArcFace dual encoder. The ArcFace branch
pulls InsightFace antelopev2/buffalo_l model packs at runtime, which InsightFace
releases under a research-only license, so the whole extra is **NON-COMMERCIAL**.
raiw.cc and any monetized deployment must NOT install the `photomaker` extra.
This is called out at every entry point: CLI flag help, module docstring,
pyproject extra block, CLAUDE.md extras bullet, README install snippet.
Changes:
- Deleted `src/remove_ai_watermarks/face_restore.py` and its tests.
- Deleted the `restore` extra (gfpgan/facexlib/basicsr + scipy<1.18 / numba<0.60
pins) and the basicsr setuptools<69 build pin from pyproject.toml.
- Restored `src/remove_ai_watermarks/photomaker_restore.py` (V2 this time:
`TencentARC/PhotoMaker-V2`, `photomaker-v2.bin`, no `pm_version='v1'` override).
- Restored the `photomaker` extra in pyproject with all the upstream-compat
pins (einops, peft, onnxruntime, insightface) and the `allow-direct-references`
hatch metadata block.
- `InvisibleEngine` swapped `_restore_faces` -> `_restore_faces_photomaker`;
`--restore-faces-method` removed (only one method, no choice).
- CLI flag help, CLAUDE.md, README, docs/synthid.md, and
docs/controlnet-removal-pipeline-research.md all updated.
- docs/synthid-robust-identity-research.md status notice rewritten to list both
abandoned commercial-safe attempts (V1 + GFPGAN-on-cleaned) and the
non-commercial trade-off we accepted.
ruff + strict pyright(src/) clean; 578 tests pass (the 9 GFPGAN tests are gone,
the 11 PhotoMaker tests stay green).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
After 7 cascading upstream-compat fixes (insightface dep, peft dep, pm_version,
device, etc.), the PhotoMaker V1 cert sweep still hit a CFG batch-dim mismatch
inside the denoising loop. The upstream PhotoMaker `pipeline.py` is forked from
diffusers v0.29.1 and our env runs 0.38; SDXL prompt-encoder handling changed
significantly between those versions, so making PhotoMaker work end-to-end
needs a proper fork or a diffusers downgrade — both expensive. Not worth
shipping today.
Pivot: restore `face_restore.py` (GFPGAN) with a single-line fix that makes it
SynthID-safe by construction. The previous design ran GFPGAN.enhance on the
ORIGINAL watermarked image and was oracle-confirmed to re-add SynthID via the
weight-0.5 pixel blend. The fix is to run GFPGAN on the diffusion-CLEANED
image — whatever pixels GFPGAN derives from are already SynthID-free, so the
partial blend cannot transport the watermark. Identity fidelity is lower than
a true identity-as-embedding stack would deliver, but it ships and works.
Changes:
- `src/remove_ai_watermarks/face_restore.py` restored from pre-wipe state with
one line changed: `restorer.enhance(cleaned_bgr, ...)` instead of
`restorer.enhance(original_bgr, ...)`. `original_bgr` is kept as an unused
positional argument for API stability.
- `src/remove_ai_watermarks/photomaker_restore.py` and its tests REMOVED. The
research note (`docs/synthid-robust-identity-research.md`) keeps a "status
notice" documenting why PhotoMaker is parked for now and what the path back
in would look like.
- `pyproject.toml` `restore` extra restored (gfpgan/facexlib/basicsr +
scipy<1.18 + numba<0.60 pins + the basicsr setuptools<69 build pin), plus
`photomaker` extra (with its einops/insightface/peft pile) and the
`[tool.hatch.metadata] allow-direct-references = true` block REMOVED.
- `InvisibleEngine._restore_faces_photomaker` removed; `_restore_faces`
restored. The `--restore-faces` CLI flag and its plumbing through cmd_*
signatures are unchanged.
- CLAUDE.md, README.md, docs/synthid.md, docs/controlnet-removal-pipeline-
research.md updated to describe the shipped GFPGAN-on-cleaned design and to
reference PhotoMaker only as the parked alternative.
ruff + strict pyright(src/) clean; 578 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The upstream PhotoMaker package's `__init__.py` unconditionally imports a
face-analyser class from its `insightface_package` submodule, so JUST importing
`PhotoMakerStableDiffusionXLPipeline` (the V1 pipeline class we use) raises
`ModuleNotFoundError: No module named 'insightface'` if insightface isn't
present in the env. The Modal cert sweep caught this on the V1 image.
Resolution: pin `insightface>=0.7.3` (and its `onnxruntime` runtime dep) in the
`photomaker` extra. The PyPI insightface package is MIT-licensed CODE; the
non-commercial restriction sits on the pretrained model packs (antelopev2,
buffalo_l) which download only when `FaceAnalysis()` is instantiated. Our V1 path
never instantiates the face-analyser -- it loads photomaker-v1.bin (CLIP-only
encoder) via `load_photomaker_adapter` -- so the model-pack license does not
bind us; we depend only on the MIT code for the import to resolve.
Safety guards:
- Runtime check in `_get_pipeline`: raises if `_PHOTOMAKER_FILE` is ever pointed
at v2 (so a future maintainer can't silently regress to the InsightFace path).
- New test class `TestV1OnlyCommercialSafetyGuard`: asserts repo + filename
pin to V1 AND asserts the module source never references the face-analyser
class (a static check that our codepath stays out of the runtime that would
pull the non-commercial model packs).
Docs: documented the import dance + legal split inline at the top of
`photomaker_restore.py`.
ruff clean; 581 tests pass (the 9 PhotoMaker tests plus 3 new V1-guard tests).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The GFPGAN `restore` extra and its `face_restore.py` module are gone. They were
oracle-confirmed to re-introduce SynthID by blending watermarked original face
pixels at fidelity weight 0.5 (clean A/B: gemini_3 controlnet 0.20 detected WITH
GFPGAN, clean WITHOUT). Keeping them as the default restore method was a footgun
for the removal pipeline. PhotoMaker-V2 (added in the previous commit) is the
single shipped restore path now -- identity-as-embedding, SynthID-safe by
construction.
Removed:
- src/remove_ai_watermarks/face_restore.py + tests/test_face_restore.py
- pyproject.toml `restore` extra (gfpgan/facexlib/basicsr + scipy/numba pins)
- pyproject.toml `[tool.uv.extra-build-dependencies] basicsr = [...]` build pin
- CLI: `--restore-faces-method` and `--restore-faces-weight` (no method choice
to make, no GFPGAN weight knob to expose)
- InvisibleEngine._restore_faces method (only _restore_faces_photomaker remains)
- All restore-faces-method / restore-faces-weight threading through cmd_*
signatures and _process_batch_image
Kept:
- `--restore-faces / --no-restore-faces`: now binds to PhotoMaker-V2.
- All adopted oracle findings about GFPGAN re-introducing SynthID (kept in the
research docs as historical context that explains why the path was removed).
Docs updated: CLAUDE.md (restore extras bullet collapsed to photomaker, removed
face_restore Key-modules bullet, several inline GFPGAN refs scrubbed), README.md
(face-identity callout + install section now point to the photomaker extra),
docs/synthid.md 5.5 (net recipe), docs/controlnet-removal-pipeline-research.md
(recommendations).
ruff + strict pyright (src/) clean; 578 tests pass (the 9 GFPGAN tests are gone,
the 9 PhotoMaker tests stay green).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds the second face-restore mechanism, selectable via the new CLI option
`--restore-faces-method=photomaker`. Unlike the existing GFPGAN path (which runs on
the watermarked ORIGINAL and was oracle-confirmed to re-introduce SynthID by partial
pixel blending), PhotoMaker carries identity in a SynthID-invariant OpenCLIP
embedding and regenerates fresh face pixels conditioned on it — the pixels in the
output are diffusion-fresh, so the watermark cannot be transported.
The load-bearing assumption (embedding invariance to SynthID-magnitude pixel noise)
was empirically validated in the prior commit (smoke test): cosine drift 0.002
under a ±2 LSB low-freq carrier, an order of magnitude less than JPEG90 drift
which SynthID survives at >=99% TPR.
End-to-end commercial-safe:
- PhotoMaker-V2 weights: Apache-2.0 (TencentARC)
- ID encoder: OpenCLIP-ViT-H/14 (MIT)
- SDXL base: shared with the main pipeline
- NO InsightFace (the non-commercial blocker for IP-Adapter FaceID / InstantID /
PuLID / Arc2Face)
Two-pass architecture (PhotoMaker has no ControlNetImg2img class in diffusers):
1) main controlnet/default removal pass cleans SynthID + drifts faces
2) PhotoMaker txt2img regenerates each face from its embedding, feather-composited
back into the cleaned image
New module `photomaker_restore.py` mirrors `face_restore.py`: lazy pipeline
singleton (double-checked lock), `is_available()` gate, pure `_face_crop_square` and
`_composite_faces` helpers, all unit-tested without the model (9 new tests). New
`InvisibleEngine._restore_faces_photomaker` runs after the diffusion pass, mirroring
`_restore_faces`. CLI flag `--restore-faces-method=[gfpgan|photomaker]` threaded
through `cmd_invisible`/`cmd_all`/`cmd_batch` + `_process_batch_image`.
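The lazy pipeline singleton with a double-checked lock can be sketched as follows. This is an illustration of the pattern only; `get_pipeline` and the injected `loader` callable are hypothetical names, not the module's actual API.

```python
import threading

_pipeline = None
_pipeline_lock = threading.Lock()


def get_pipeline(loader):
    """Return the shared pipeline, building it at most once.

    `loader` is a hypothetical zero-argument callable that performs the
    expensive model load.
    """
    global _pipeline
    if _pipeline is None:  # fast path: no lock once the pipeline exists
        with _pipeline_lock:
            if _pipeline is None:  # re-check: another thread may have won
                _pipeline = loader()
    return _pipeline
```

The second check under the lock is what keeps two threads that both saw `None` from loading the model twice.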
New optional `photomaker` extra (Apache-2.0 + Apache-2.0/MIT deps, no basicsr).
`[tool.hatch.metadata] allow-direct-references = true` is required because the
upstream PhotoMaker package lives only on GitHub.
The next step (separate work) is oracle validation: run a 6-image cert sweep
through the new pipeline (default/controlnet at the certified strength +
--restore-faces-method=photomaker) and confirm SynthID stays clean while face
identity is recovered. The required infrastructure (`raiw-app/modal_cert.py`) is
already in place.
ruff + strict pyright(src/) clean; 586 tests pass (+ 9 new in
tests/test_photomaker_restore.py).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
New samsung_engine.py mirrors the jimeng engine but anchors bottom-left; wired
into watermark_registry, the CLI (--mark samsung / auto), and identify
(visible_samsung, medium). visible_alpha_solve.py gains a corner=bl mode;
samsung_alpha.png solved from @f-liva's flat captures. Calibrated for the
Italian "Contenuti generati dall'AI" variant. Flat black/gray/white captures
committed, real photos gitignored. Tests + docs.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The fp16-fix VAE swap (#29) is gated to the default SDXL checkpoint, so a
custom model_id, a stale pre-fix install, or a fal/custom loader can still
decode to an all-black/NaN frame in fp16 (reporter: gpt-image 1448x1086,
the `image_processor.py invalid value encountered in cast` warning).
Add a model-agnostic backstop in remove_watermark: after generation, if the
run was fp16 and the output is degenerate (_is_degenerate_image: near-zero
mean and variance), rebuild the pipeline in fp32 on the same device and
re-run once. fp32 is the verified-clean path, so a black image is never
returned regardless of model_id or version. Mirrors the MPS->CPU fallback's
self-mutation pattern; batch inherits it. Verified e2e on MPS by forcing
fp16 with the swap disabled (first pass black, guard fired, retry clean).
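A rough sketch of the backstop logic, under stated assumptions: the image is modeled as a flat list of pixel values rather than an array, the thresholds `mean_eps` and `var_eps` are invented for illustration, and `run` is a hypothetical callable that generates an image at a given precision.

```python
import math
from statistics import fmean, pvariance


def is_degenerate(pixels, mean_eps=1.0, var_eps=1.0):
    """True for an all-black or NaN frame: near-zero mean AND variance.

    Thresholds are illustrative assumptions, not the shipped values.
    """
    values = [float(p) for p in pixels]
    if not values or any(math.isnan(v) for v in values):
        return True
    return fmean(values) < mean_eps and pvariance(values) < var_eps


def generate_with_fp32_backstop(run, dtype="fp16"):
    """Run once; if an fp16 run produced a degenerate frame, re-run once in fp32."""
    output = run(dtype)
    if dtype == "fp16" and is_degenerate(output):
        output = run("fp32")
    return output
```

The retry fires only for fp16 runs, so an fp32 run that legitimately produces a dark image is returned as-is.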
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
After reverse-alpha, re-detect the sparkle; when one survives at or above the
registry fail line (conf >= 0.5) -- an alpha mismatch the per-image gain estimate
could not fully correct -- inpaint the footprint and keep that only when it lowers
the re-detect confidence. The footprint inpaint reconstructs the slot from its
darker surroundings, so it physically removes the bright sparkle. The change is
purely additive: the common clean removal re-detects below 0.5 and is returned
untouched.
Measured on the spaces visible-removal audit: gemini removal-audit failures drop
15 -> 11 (4 genuine rescues), doubao 65/65 and jimeng 11/11 unchanged, zero
regressions on the 468 already-clean removals.
An offset+scale alignment search was prototyped on the remaining 11 fails and
rejected: an audit "ceiling" suggested +4 more, but those were NCC-gaming -- the
lower-scoring placement left the sparkle as bright or brighter, just reshaping the
residual so the contrast-invariant shape-NCC scored lower (a5a9: first-pass slot
~76 at background level vs the "aligned win" ~164). A brightness sanity check
rejected every one, so it contributed nothing and was removed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Three content-quality features for the invisible/all/batch pipeline.
DBNet text detector (auto_config): replace the MSER text heuristic with
PP-OCRv3 differentiable-binarization via cv2.dnn.TextDetectionModel_DB,
using a bundled 2.4 MB Apache-2.0 model (en/cn detection nets are
byte-identical, so it ships language-neutral). cv2.dnn is core OpenCV, so
no new pip dep. MSER stays as the fallback when the model can't load.
Validated on real images: matches MSER everywhere and additionally catches
the Doubao CJK mark MSER missed; routing decisions unchanged otherwise.
Real-ESRGAN upscaler (new upscaler.py, esrgan extra): optional
pre-diffusion super-resolution for the min-resolution floor upscale, loaded
via spandrel (MIT, no basicsr) with BSD-3-Clause weights downloaded on
first use. New --upscaler {lanczos,esrgan} on invisible/all/batch; default
stays lanczos and the engine falls back to lanczos when the extra is absent
or the model errors (never breaks removal). It is a manual opt-in knob (the
auto plan never selects it) -- as a generic GAN it sharpens photo/texture
content strongly but can degrade faces (the diffusion pass regenerates
them) and thin text, documented accordingly.
batch --auto: wire the content-adaptive --auto (+ --adaptive-polish) into
cmd_batch. The plan is recomputed per image and the invisible engine is
cached per resolved pipeline (default/controlnet), so a mixed directory
builds at most one engine of each kind. Verified end-to-end: 3 mixed
images routed correctly with only 2 pipeline loads (controlnet reused).
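A small sketch of the per-pipeline engine cache, assuming a hypothetical `build` factory keyed by the resolved pipeline name. The class name `EngineCache` and its `loads` counter are illustrative only.

```python
class EngineCache:
    """Build at most one engine per resolved pipeline name."""

    def __init__(self, build):
        self._build = build  # hypothetical factory: pipeline name -> engine
        self._engines = {}
        self.loads = 0

    def get(self, pipeline):
        if pipeline not in self._engines:
            self._engines[pipeline] = self._build(pipeline)
            self.loads += 1
        return self._engines[pipeline]
```

With per-image plans of `controlnet`, `default`, `controlnet`, the cache builds two engines and reuses the first for the third image.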
ruff + strict pyright(src/) clean; 558 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Brings in commit 5cf68a6 (single C2PA_AI_VENDORS registry, erase_lama
grayscale/BGRA support, batch device-cache clearing + --controlnet-scale,
uv publish via OIDC, hatchling pin <1.31). Auto-merged with no conflicts;
ruff/pytest(544)/pyright all clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
detect_watermark's shape-only NCC (spatial/gradient/var fusion) fires on ornate
or flat content (text strips, banners, hatching) that coincidentally matches the
diamond shape. The NCC is contrast-invariant, so it cannot see the defining
property of a real Gemini sparkle: a bright WHITE overlay whose core sits above
the local background.
The fusion now demotes (caps confidence to 0.30) a match that is BOTH
low-confidence (< _SPARKLE_FP_CONF 0.65) AND has a low core-ring brightness
margin (_core_ring_margin < _SPARKLE_FP_MARGIN 5). Real sparkles escape via
EITHER high confidence (white-bg sparkles score >=0.79 despite a low margin) OR
high margin (dark/mid backgrounds, incl. the #36 faint-corner case), so both
must fail to demote. The gate is monotonic -- it only removes detections, never
adds -- so it cannot regress the verified-negative corpus (already 0 FPs).
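The two-condition gate can be sketched as below. The constants mirror the numbers quoted above; the function name and the `min` cap (so an already-lower confidence is never raised) are assumptions about how the demotion is applied.

```python
SPARKLE_FP_CONF = 0.65    # below this the match is "low-confidence"
SPARKLE_FP_MARGIN = 5.0   # below this the core barely exceeds the ring
DEMOTED_CONF = 0.30       # confidence cap for a demoted match


def gate_confidence(confidence, core_ring_margin):
    """Demote only when BOTH the confidence and the brightness margin are low."""
    if confidence < SPARKLE_FP_CONF and core_ring_margin < SPARKLE_FP_MARGIN:
        return min(confidence, DEMOTED_CONF)  # only ever lowers the score
    return confidence
```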
On the spaces corpus it demoted 16/495 flagged sparkles (13 carried no AI
metadata and are content FPs; the 3 with AI metadata were either visual FPs or a
near-invisible white-on-white sparkle whose AI verdict is still held by
metadata), and dropped the removal-audit failures 20 -> 15.
- _core_and_bg shared helper (core 75th-pct brightness vs background-ring median);
_estimate_alpha_gain refactored onto it, new _core_ring_margin wrapper.
- TestSparkleFalsePositiveGate: margin high/low, strong-sparkle kept (incl. on
white via high conf), blurred no-core blob demoted.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The fixed mild auto polish (unsharp 0.5 / grain 2.0) under-corrected soft
photo/face output (gemini_3 stayed at lap-var 84 vs its 592 original) and its
grain speckled small text. Replace it with humanizer.adaptive_polish: target the
input's Laplacian variance with a capped unsharp scaled to the deficit + edge-
masked grain (smooth regions only), calibrated by a short sigma search. Self-
limiting on text/graphics -- already high-frequency, so almost no polish lands
and text edges are masked out. Validated on the spaces corpus (gemini_3 84 -> 334
end-to-end; openai_1 text near-untouched).
Interface: every --auto decision is now independently overridable -- add
--adaptive-polish/--no-adaptive-polish (matching --restore-faces; works without
--auto too) so the polish can be disabled or used manually. _apply_auto overrides
exactly the three content-adaptive modes (pipeline, restore-faces, adaptive-
polish); --unsharp/--humanize stay independent fixed filters.
cv2-only, no new deps. Threaded through invisible/all (not batch).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Three P2 cleanups from a library-wide review.
Detection -- single C2PA_AI_VENDORS registry (noai/constants.py):
- C2PA_ISSUERS, SYNTHID_C2PA_ISSUERS, and identify._ISSUER_PLATFORM now derive
from one C2paAiVendor table, so adding a C2PA vendor is one entry instead of
edits in three places across two files. Behavior-identical (262 detection
tests pass; the kept `needle` field is load-bearing -- it differs from `org`
for Google and ByteDance, with no mechanical derivation).
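As an illustration of the single-table idea, the sketch below derives three lookups from one tuple of vendor records. Only `org` and `needle` are named in the text above; the `platform` and `synthid` fields, the shapes of the derived structures, and both placeholder vendors are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class C2paAiVendor:
    org: str        # issuer organisation as reported
    needle: str     # substring matched in the manifest; may differ from org
    platform: str   # hypothetical field: user-facing platform label
    synthid: bool   # hypothetical field: issuer implies a SynthID proxy


# Placeholder entries, NOT real registry data.
C2PA_AI_VENDORS = (
    C2paAiVendor(org="Example Labs", needle="example", platform="ExampleGen", synthid=True),
    C2paAiVendor(org="Sample Corp", needle="Sample Corp", platform="SampleArt", synthid=False),
)

# Every lookup derives from the one table (shapes assumed), so a new vendor
# is a single new entry.
C2PA_ISSUERS = {v.needle: v.org for v in C2PA_AI_VENDORS}
SYNTHID_C2PA_ISSUERS = {v.needle for v in C2PA_AI_VENDORS if v.synthid}
ISSUER_PLATFORM = {v.org: v.platform for v in C2PA_AI_VENDORS}
```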
Code-health:
- region_eraser.erase_lama now accepts grayscale/BGRA like erase_cv2 (it
crashed on grayscale and silently dropped alpha on BGRA). +2 regression tests.
- batch frees the device cache between images via a shared try_empty_device_cache
helper (generalized from the MPS-only _try_clear_mps_cache, now reused by both
the MPS->CPU fallback and the batch loop).
- batch gained --controlnet-scale (parity with invisible/all).
CI / packaging:
- publish.yml uploads via `uv publish` (PyPI trusted publishing over OIDC),
replacing pypa/gh-action-pypi-publish so uploads no longer depend on that
action's bundled twine accepting the Metadata-Version. Workflow filename +
pypi environment unchanged, so PyPI's trusted-publisher entry still matches.
- hatchling pin relaxed <1.28 -> <1.31 (verified against hatch's changelog:
1.30.0 made Metadata 2.5 the default, 1.30.1 reverted to 2.4; 1.27-1.29 were
always 2.4). Kept as belt-and-suspenders so the first uv-publish release ships
2.4, isolating the uploader swap from the metadata-version bump.
Docs (CLAUDE.md, pyproject) synced; corrected the inaccurate "hatchling 1.28+
emits 2.5" note.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add `auto_config.plan(image_path) -> AutoConfig`, the first step of the
invisible/all pipeline: it inspects the input image (before the diffusion model
loads) and picks the quality modes so the run adapts to content. Quality-priority
routing -- ControlNet (text/face-structure preservation) is the default, dropped in
favor of plain SDXL only on a clearly structure-less image; GFPGAN face restore runs
when a face is present; a mild sharpen + grain polish runs when a smoothing pass ran.
Exposed as `--auto`
on `all`/`invisible` (`_apply_auto`; explicit flags override via click's parameter
source). Not wired into batch (its engine is cached per-mode).
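The routing rule can be sketched from pre-computed detector signals. Everything here is a simplification under assumptions: the field names, the `structureless_below` threshold, and the function name are hypothetical, and the polish decision is left out because the text does not define which pass counts as smoothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoConfig:
    pipeline: str        # "controlnet" or "default"
    restore_faces: bool


def plan_from_signals(has_face, has_text, edge_density, structureless_below=0.02):
    """Quality-priority routing: ControlNet unless the image clearly lacks structure."""
    structureless = (
        not has_face and not has_text and edge_density < structureless_below
    )
    pipeline = "default" if structureless else "controlnet"
    return AutoConfig(pipeline=pipeline, restore_faces=has_face)
```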
Detection is cv2-only and torch-free (~100 MB peak RSS, a few ms): OpenCV YuNet
(`cv2.FaceDetectorYN`, MIT, 232 KB model bundled in assets/) for faces, a Canny
edge-density + MSER heuristic for text/structure (a rough Phase-1 placeholder; DBNet
via cv2.dnn is the planned upgrade). ZERO new pip deps. Designed to run wherever the
pipeline runs -- the raiw.cc Modal GPU worker -- never on the 512 MB web host.
Real-ESRGAN-via-Spandrel upscaling (a new `esrgan` extra) and an adaptive
Laplacian-variance polish are deferred to later phases.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The captured sparkle alpha peaks ~0.51, but some real Gemini sparkles are
rendered more opaque. The fixed-alpha reverse blend then UNDER-subtracts and
leaves a bright residual the detector still fires on. A visible-removal audit
through the registry path on the spaces corpus showed this as a meaningful
fraction of marks -- all under-removals, not a background-brightness class
(failures and successes had the same input confidence and background luma; the
discriminator was the removal delta itself).
remove_watermark now estimates a per-image alpha gain (_estimate_alpha_gain:
effective sparkle opacity at the bright core vs the local background ring,
a_eff/a_cap, clamped [1.0, 1.94]) and scales the alpha to match before the
over-sub/blend branch. A 1.05 deadband keeps a sparkle that already matches the
capture byte-identical to the pre-fix output, so the fix is purely additive
(0 regressions on the audit set; failures dropped substantially). The over-sub
guard still runs on the scaled alpha as the safety net for an over-shoot.
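A sketch of the gain estimate, assuming a pure-white logo so that the observed core brightness is `a_eff * 255 + (1 - a_eff) * background`. That blend model and the function signature are assumptions; the 0.51 capture peak, the 1.05 deadband and the 1.94 cap are the figures quoted above.

```python
ALPHA_CAP = 0.51        # peak alpha of the captured sparkle
ALPHA_GAIN_MAX = 1.94   # upper clamp on the gain
DEADBAND = 1.05         # gains below this are treated as "already matches"


def estimate_alpha_gain(core, background, logo=255.0):
    """Return a_eff / a_cap clamped to [1.0, ALPHA_GAIN_MAX], with a deadband."""
    headroom = logo - background
    if headroom <= 0:
        return 1.0  # background already at logo brightness: nothing to infer
    a_eff = (core - background) / headroom  # effective opacity at the core
    gain = a_eff / ALPHA_CAP
    if gain < DEADBAND:
        return 1.0  # inside the deadband: keep the pre-fix behaviour
    return min(gain, ALPHA_GAIN_MAX)
```

Note that 0.51 x 1.94 is about 0.99, so under this model the cap keeps the scaled peak alpha just below full opacity.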
- _estimate_alpha_gain + _ALPHA_GAIN_MAX/_DEADBAND/_CORE_FRAC in gemini_engine.
- TestUnderSubtractionGain asserts on footprint pixels, NOT the detector (its
NCC is degenerate on a flat synthetic bg; the real corpus removal drops the
detector ~0.80 -> ~0.27).
- scripts/visible_removal_audit.py: the detect -> remove -> re-detect audit tool
that found and validated this (operates on gitignored data/spaces only).
- CLAUDE.md + README: document the under-subtraction gain.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two quality knobs for the SDXL invisible pass:
- min_resolution floor (default 1024, --min-resolution): small inputs are
upscaled to a 1024px long-side floor before diffusion, since SDXL img2img
distorts on a tiny latent (a 381x512 portrait is wrecked at native size). The output
is restored to the original input size, so it is a transparent quality boost;
it adds time/memory on small inputs. 0 disables. Extends the pure _target_size
helper (now cap-or-floor-or-native, min skipped on a min>max misconfig),
unit-tested without a model.
- unsharp post-filter (humanizer.unsharp_mask, --unsharp, opt-in default 0):
applied LAST, after the GFPGAN face pass (a pre-GFPGAN sharpen would be
smoothed back over), to counter the soft/over-smoothed look that diffusion +
restoration leave behind (an AI tell). Pairs with --humanize (grain).
Both threaded through invisible/all/batch + the module-level helper. Verified
end-to-end on a 381x512 portrait: upscaled to 1024, sharpened, restored to
381x512.
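The cap-or-floor-or-native rule might look like the sketch below. The signature and the rounding are assumptions (the real `_target_size` may, for example, snap to a latent-friendly multiple); the min-skipped-when-min>max behaviour follows the description above.

```python
def target_size(width, height, max_resolution=0, min_resolution=1024):
    """Pick the diffusion working size from the long side; 0 disables a bound."""
    long_side = max(width, height)
    misconfigured = bool(max_resolution) and min_resolution > max_resolution
    if max_resolution and long_side > max_resolution:
        scale = max_resolution / long_side          # cap
    elif min_resolution and long_side < min_resolution and not misconfigured:
        scale = min_resolution / long_side          # floor
    else:
        return width, height                        # native
    return round(width * scale), round(height * scale)
```

For the 381x512 portrait this yields a 762x1024 working size; the output is then resized back to 381x512 by the caller.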
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add an optional, commercial-safe face-restoration post-pass that recovers
face identity the diffusion removal pass drifts (canny holds structure, not
likeness) while still scrubbing the pixel watermark in the face regions.
- face_restore.py: GFPGANer singleton (CPU unless CUDA), the basicsr
torchvision.transforms.functional_tensor shim, and the pure feather
_composite_faces helper (unit-tested without the model). GFPGAN
re-synthesizes each face from a StyleGAN2 prior, so composited face pixels
are GAN-generated (no watermark, no pixel-copy) -- oracle-clean at weight 0.5
with identity preserved.
- InvisibleEngine.remove_watermark: restore_faces / restore_faces_weight,
best-effort, auto-skips when the extra is absent or no face is detected.
- CLI --restore-faces/--no-restore-faces + --restore-faces-weight on
invisible/all/batch (on by default).
- restore extra (gfpgan/facexlib/basicsr), numpy<2-pinned (scipy<1.18,
numba<0.60) and kept out of `all`; basicsr needs Python <3.13 + setuptools<69
to build, so pin .python-version 3.12.
Commercial-safe: GFPGAN Apache-2.0, RetinaFace MIT. The CodeFormer alternative
is non-commercial and is not shipped. The earlier IP-Adapter FaceID layer was
removed (footgun: needs high strength, corrupts faces at the low removal
strength).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add `--pipeline controlnet` (SDXL base + xinsir canny ControlNet via
StableDiffusionXLControlNetImg2ImgPipeline): the canny edge map conditions the
img2img regeneration so text and face STRUCTURE stay sharp, while the watermark
is still removed by the regeneration (`strength`) -- no original pixels are
copied or frozen, so SynthID does not survive. Oracle-verified clean on OpenAI
with better text/structure fidelity than plain img2img at equal strength.
`--controlnet-scale` tunes structure preservation; fp32 on mps/cpu (fp16-fixed
VAE on cuda/xpu). Shares the img2img runner (live progress + MPS->CPU fallback)
and the fp16-VAE-fix / device-move helpers with the default pipeline.
Remove the superseded subsystems -- ctrlregen (SD1.5 clean-noise),
text-protection (differential / region-hires) and face-protection: they either
destroyed real content or shielded the watermark by re-using original pixels.
controlnet replaces them by regenerating everything under edge conditioning.
Canny preserves face structure but not identity; face IDENTITY is a separate
face-restoration post-pass (CodeFormer/GFPGAN), researched + prototyped but not
yet shipped. An IP-Adapter FaceID attempt was built and removed (footgun: needs
high strength, corrupts faces at removal strength).
Docs: docs/controlnet-removal-pipeline-research.md, scripts/controlnet_sweep.py.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
detect_watermark's size-weighted global NCC search lets a larger, mediocre
match (e.g. a bright collar in a portrait) outrank a small, near-perfect
sparkle in the bottom-right corner, so a faint sparkle on a busy background
scored below threshold and the image read as clean -- the regression from
widening the search window 256px->512px between v0.7.2 and v0.8.8.
Add _corner_promote: a bottom-right-corner raw-NCC pass that overrides the
global pick when the corner holds a match with raw NCC >= 0.85 that beats it.
It only ever replaces a lower-fidelity pick (cannot weaken an existing
detection) and keeps the wider window for variant margins. The corner side is
relative-clamped (0.20 of the short side, [96, 384]) so it stays a true corner
at every scale: a fixed 256px covers ~70% of a small portrait, where a real
photo raw-matches the star at ~0.81; relative tightening drops that to ~0.69.
The 0.85 gate sits between the worst real-photo corner match (~0.78) and a
genuine faint sparkle (~0.93): zero false positives across native + downscaled
negatives, headshot rescued from below-threshold to 0.71.
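The relative clamp and the promotion gate can be sketched as follows. Function names are hypothetical, and "beats it" is read here as a strictly higher raw NCC than the global pick, which is an assumption.

```python
def corner_side(width, height, frac=0.20, lo=96, hi=384):
    """Corner window side: a fraction of the short side, clamped to [lo, hi]."""
    return int(min(max(frac * min(width, height), lo), hi))


def pick_match(global_ncc, corner_ncc, gate=0.85):
    """Promote the corner match only if it clears the gate AND beats the global pick."""
    if corner_ncc >= gate and corner_ncc > global_ncc:
        return "corner"
    return "global"
```

A 381x512 portrait gets the 96px minimum instead of a fixed 256px window, which is what keeps the search a true corner at small scales.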
Factor the shared multi-scale matchTemplate loop into _scan_scales.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The C2PA issuer attribution (`c2pa`) and the SynthID proxy (`synthid`) are
derived from the same manifest, so treating them as independent signals made
rule 1 fire on legitimate multi-actor manifests where a product wraps another
vendor's engine (Microsoft Designer on OpenAI, Microsoft on Google) or an edit
chain re-signs (Adobe over a Gemini original). 19 such files in the
2026-06-01/02 spaces batches read as "likely spoofed/laundered" before this.
Group `c2pa` + `synthid` into one provenance source via `_CLASH_SOURCE`; rule 1
now requires two vendors from different sources. A manifest vendor still clashes
with a genuinely independent stamp (EXIF/XMP generator, IPTC AISystemUsed, AIGC,
xAI).
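The grouping can be illustrated with a small sketch. The `signals` mapping (signal name to vendor), the group label `"manifest"`, and the function name are assumptions for the example; only the idea of folding `c2pa` and `synthid` into one source comes from the text above.

```python
# c2pa and synthid both come from the same manifest, so they share a source.
CLASH_SOURCE = {"c2pa": "manifest", "synthid": "manifest"}


def vendors_clash(signals):
    """True when two DIFFERENT vendors come from two DIFFERENT sources."""
    by_source = {}
    for signal, vendor in signals.items():
        source = CLASH_SOURCE.get(signal, signal)  # ungrouped signals stand alone
        by_source.setdefault(source, set()).add(vendor)
    groups = list(by_source.values())
    for i, first in enumerate(groups):
        for second in groups[i + 1:]:
            if any(a != b for a in first for b in second):
                return True
    return False
```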
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
On a dark/textured background (e.g. grass) the captured alpha map over-estimates
the real Gemini sparkle's effective opacity (~0.51 captured vs ~0.31 effective),
so the fixed-alpha reverse blend over-subtracts (watermarked - alpha*logo goes
negative) and drives the footprint to black -- the white sparkle turns into a
black diamond (issue #30, reported by @CoolZimo1).
remove_watermark now detects this via _reverse_alpha_oversubtracts (fraction of
footprint pixels with a negative numerator > 5%) and inpaints the small sparkle
footprint from the surrounding pixels (cv2 NS, cropped to a padded box) instead.
Behavior-neutral on the working case: a bright background over-subtracts at ~0%,
so reverse-alpha is used and the output is byte-identical to before (verified:
demo_banana 0.0 frac vs the issue-#30 grass image 0.61 frac; issue-#30 footprint
recovers to background grass with no pit, residual sparkle conf 0.25 < 0.35).
Guard is scoped to GeminiEngine: doubao/jimeng already NCC-align their alpha to
the actual mark per image, which sidesteps the fixed-alpha mismatch.
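A sketch of the over-subtraction check, modeling the footprint as parallel lists of pixel values and alpha values and assuming a white (255) logo. The function names are illustrative; the 5% limit is the one quoted above.

```python
def oversubtract_fraction(pixels, alphas, logo=255.0):
    """Fraction of footprint pixels whose reverse-alpha numerator goes negative."""
    if not pixels:
        return 0.0
    negative = sum(1 for p, a in zip(pixels, alphas) if p - a * logo < 0)
    return negative / len(pixels)


def should_inpaint(pixels, alphas, limit=0.05):
    """Fall back to inpainting when more than `limit` of the footprint over-subtracts."""
    return oversubtract_fraction(pixels, alphas) > limit
```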
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The default img2img strength is now chosen from the detected SynthID vendor
(C2PA issuer) instead of a single fixed 0.30: OpenAI gpt-image -> 0.10, Google
Gemini -> 0.15, unknown source -> 0.15. Explicit --strength always wins.
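The vendor ladder reduces to a lookup with an explicit override. This sketch uses a simplified two-argument signature and hypothetical vendor keys (`"openai"`, `"google"`); the real `resolve_strength` also takes a profile.

```python
VENDOR_STRENGTH = {"openai": 0.10, "google": 0.15}  # keys are assumed labels
UNKNOWN_STRENGTH = 0.15


def resolve_strength(strength, vendor):
    """An explicit --strength always wins; otherwise use the vendor default."""
    if strength is not None:
        return strength
    return VENDOR_STRENGTH.get(vendor, UNKNOWN_STRENGTH)
```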
Basis: an oracle-verified June 2026 controlled study (clean v0.8.6, text/face
protection OFF, per-image openai.com/verify or Gemini-app verdict). OpenAI's
SynthID clears at 0.05 across 1024-1600 px (n=4, resolution-independent);
Google's is ~3x more robust and needs 0.15 on the capped-1536 path (n=4). The
dominant factor is the VENDOR, not resolution. The earlier single 0.30 default
and the "resolution dependence" lore came from contaminated tests run with the
protect-text bug ON (issue #14) -- re-running those same 1600x1600 images clean
removes SynthID at 0.05.
`vendor_for_strength(path)` reads metadata.synthid_source on the ORIGINAL input
and is threaded through cli (invisible/all/batch) -> invisible_engine ->
watermark_remover -> resolve_strength(strength, profile, vendor), so display and
execution use the same vendor (the engine sees a temp path whose C2PA the visible
pass already stripped, so detection must happen in the CLI on the pristine
source). Caveat: Google's 0.15 was validated only on --max-resolution 1536;
native 2816 Gemini was not locally measurable (OOM on Apple Silicon) and is
pending GPU validation on raiw.cc.
Docs: docs/synthid.md sections 2.2/4.4/5.2 corrected (the contaminated
resolution-dependence findings replaced with the clean oracle-verified table);
README and CLAUDE.md updated; CLI --strength help reflects the adaptive default.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>