The upstream PhotoMaker package's `__init__.py` unconditionally imports a
face-analyser class from its `insightface_package` submodule, so JUST importing
`PhotoMakerStableDiffusionXLPipeline` (the V1 pipeline class we use) raises
`ModuleNotFoundError: No module named 'insightface'` if insightface isn't
present in the env. The Modal cert sweep caught this on the V1 image.
Resolution: pin `insightface>=0.7.3` (and its `onnxruntime` runtime dep) in the
`photomaker` extra. The PyPI insightface package is MIT-licensed CODE; the
non-commercial restriction sits on the pretrained model packs (antelopev2,
buffalo_l) which download only when `FaceAnalysis()` is instantiated. Our V1 path
never instantiates the face-analyser -- it loads photomaker-v1.bin (CLIP-only
encoder) via `load_photomaker_adapter` -- so the model-pack license does not
bind us; we depend only on the MIT code for the import to resolve.
Safety guards:
- Runtime check in `_get_pipeline`: raises if `_PHOTOMAKER_FILE` is ever pointed
at v2 (so a future maintainer can't silently regress to the InsightFace path).
- New test class `TestV1OnlyCommercialSafetyGuard`: asserts repo + filename
pin to V1 AND asserts the module source never references the face-analyser
class (a static check that our codepath stays out of the runtime that would
pull the non-commercial model packs).
Docs: documented the import dance + legal split inline at the top of
`photomaker_restore.py`.
ruff clean; 581 tests pass (the 9 PhotoMaker tests plus 3 new V1-guard tests).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A Modal cert sweep caught what the research doc missed: PhotoMaker-V2 fails at
import without InsightFace ("No module named 'insightface'"). Reading the upstream
source confirms it: `photomaker/__init__.py` imports `FaceAnalysis2` (an InsightFace
wrapper) at module load, V2's encoder is named
`PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken`, and `model_v2.py`'s forward
takes an `id_embeds` argument that the pipeline computes via
`insightface.app.FaceAnalysis(name='antelopev2', ...)`. So V2 is a DUAL encoder
(CLIP + ArcFace), not CLIP-only as the model card line "id_encoder includes
finetuned OpenCLIP-ViT-H-14 and a few fuse layers" implied.
InsightFace's pretrained model packs (antelopev2, buffalo_l) are research/
non-commercial only per their own README:
"The pretrained models we provided with this library are available for
non-commercial research purposes only."
So V2 is blocked for a paid service like raiw.cc.
PhotoMaker-V1 is the commercial-safe alternative — its `PhotoMakerIDEncoder`
(model.py) forward takes only `(id_pixel_values, prompt_embeds, class_tokens_mask)`,
no ArcFace branch. Identity is CLIP-only, license is Apache-2.0, no InsightFace.
Code change: swap the repo + filename constants in `photomaker_restore.py`
(TencentARC/PhotoMaker, photomaker-v1.bin). Tests still pass (the 9 PhotoMaker
tests use a fake pipeline, so the model swap is transparent to them).
Doc correction: rewrote the verdict / license table / section 5 of
`docs/synthid-robust-identity-research.md` to lead with V1 and add a correction
notice explaining the V2 misread. Bulk-renamed `PhotoMaker-V2` to `PhotoMaker-V1`
across CLAUDE.md, README.md, docs/synthid.md, and
docs/controlnet-removal-pipeline-research.md (kept V2 only in the correction
notice, the license table, and the anchor reference).
ruff clean; 578 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PhotoMaker imports einops in its forward path but its install_requires doesn't
declare it, so the photomaker extra resolved without einops on a clean install
and the Modal cert sweep died at the restore-faces step with
"No module named 'einops'" -- the post-pass failed gracefully and returned the
un-restored cleaned output, so the cert artifact had no face recovery.
Pin einops>=0.7.0 in the photomaker extra so the extra is self-contained.
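One way to catch an undeclared import like this before a cert sweep is a resolution check on a clean install. This is a sketch under assumptions: `missing_modules` is a hypothetical helper, it only handles top-level module names, and it is not part of the repo.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be resolved here."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling it with the modules an extra is expected to provide (for example `["einops"]`) returns an empty list when the extra is self-contained.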
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The GFPGAN `restore` extra and its `face_restore.py` module are gone. They were
oracle-confirmed to re-introduce SynthID by blending watermarked original face
pixels at fidelity weight 0.5 (clean A/B: gemini_3 controlnet 0.20 detected WITH
GFPGAN, clean WITHOUT). Keeping them as the default restore method was a footgun
for the removal pipeline. PhotoMaker-V2 (added in the previous commit) is the
single shipped restore path now -- identity-as-embedding, SynthID-safe by
construction.
Removed:
- src/remove_ai_watermarks/face_restore.py + tests/test_face_restore.py
- pyproject.toml `restore` extra (gfpgan/facexlib/basicsr + scipy/numba pins)
- pyproject.toml `[tool.uv.extra-build-dependencies] basicsr = [...]` build pin
- CLI: `--restore-faces-method` and `--restore-faces-weight` (no method choice
to make, no GFPGAN weight knob to expose)
- InvisibleEngine._restore_faces method (only _restore_faces_photomaker remains)
- All restore-faces-method / restore-faces-weight threading through cmd_*
signatures and _process_batch_image
Kept:
- `--restore-faces / --no-restore-faces`: now binds to PhotoMaker-V2.
- All adopted oracle findings about GFPGAN re-introducing SynthID (kept in the
research docs as historical context that explains why the path was removed).
Docs updated: CLAUDE.md (restore extras bullet collapsed to photomaker, removed
face_restore Key-modules bullet, several inline GFPGAN refs scrubbed), README.md
(face-identity callout + install section now point to the photomaker extra),
docs/synthid.md 5.5 (net recipe), docs/controlnet-removal-pipeline-research.md
(recommendations).
ruff + strict pyright (src/) clean; 578 tests pass (the 9 GFPGAN tests are gone,
the 9 PhotoMaker tests stay green).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds the second face-restore mechanism, selectable via the new CLI option
`--restore-faces-method=photomaker`. Unlike the existing GFPGAN path (which runs on
the watermarked ORIGINAL and was oracle-confirmed to re-introduce SynthID by partial
pixel blending), PhotoMaker carries identity in a SynthID-invariant OpenCLIP
embedding and regenerates fresh face pixels conditioned on it — the pixels in the
output are diffusion-fresh, so the watermark cannot be transported.
The load-bearing assumption (embedding invariance to SynthID-magnitude pixel noise)
was empirically validated in the prior commit (smoke test): cosine drift 0.002
under a ±2 LSB low-freq carrier, an order of magnitude less than JPEG90 drift
which SynthID survives at >=99% TPR.
End-to-end commercial-safe:
- PhotoMaker-V2 weights: Apache-2.0 (TencentARC)
- ID encoder: OpenCLIP-ViT-H/14 (MIT)
- SDXL base: shared with the main pipeline
- NO InsightFace (the non-commercial blocker for IP-Adapter FaceID / InstantID /
PuLID / Arc2Face)
Two-pass architecture (PhotoMaker has no ControlNetImg2img class in diffusers):
1) main controlnet/default removal pass cleans SynthID + drifts faces
2) PhotoMaker txt2img regenerates each face from its embedding, feather-composited
back into the cleaned image
New module `photomaker_restore.py` mirrors `face_restore.py`: lazy pipeline
singleton (double-checked lock), `is_available()` gate, pure `_face_crop_square` and
`_composite_faces` helpers, all unit-tested without the model (9 new tests). New
`InvisibleEngine._restore_faces_photomaker` runs after the diffusion pass, mirroring
`_restore_faces`. CLI flag `--restore-faces-method=[gfpgan|photomaker]` threaded
through `cmd_invisible`/`cmd_all`/`cmd_batch` + `_process_batch_image`.
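The lazy singleton with a double-checked lock can be sketched as follows. The names `get_pipeline` and `loader` are hypothetical; the real module's function names and loader arguments are not shown in this message.

```python
import threading

_pipeline = None
_pipeline_lock = threading.Lock()


def get_pipeline(loader):
    """Build the pipeline once; later calls return the cached instance."""
    global _pipeline
    if _pipeline is None:              # fast path: no lock once initialised
        with _pipeline_lock:
            if _pipeline is None:      # re-check: another thread may have won
                _pipeline = loader()
    return _pipeline
```

The second check under the lock is what stops two threads that both saw `None` from each loading the model.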
New optional `photomaker` extra (Apache-2.0 + Apache-2.0/MIT deps, no basicsr).
`[tool.hatch.metadata] allow-direct-references = true` is required because the
upstream PhotoMaker package lives only on GitHub.
The next step (separate work) is oracle validation: run a 6-image cert sweep
through the new pipeline (default/controlnet at the certified strength +
--restore-faces-method=photomaker) and confirm SynthID stays clean while face
identity is recovered. The required infrastructure (`raiw-app/modal_cert.py`) is
already in place.
ruff + strict pyright(src/) clean; 586 tests pass (+ 9 new in
tests/test_photomaker_restore.py).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Empirical confirmation of the load-bearing assumption in the PhotoMaker-V2 path: the
identity embedding cannot transport an invisible pixel watermark.
Tested OpenCLIP-ViT-H/14 (laion2B-s32B-b79K — the same encoder PhotoMaker-V2
fine-tunes) on 31 face crops from the gemini_3/gemini_4/openai_3 grid. Cosine
similarity between embed(orig) and embed(perturbed):
- synthid_proxy (±2 LSB low-frequency noise, the regime SynthID actually lives in):
mean 0.9977, min 0.9937. Embedding moves by 0.002 — an order of magnitude less
than JPEG90 (mean 0.928), which SynthID survives at >=99% TPR by design.
- noise3 / jpeg70 / blur1: 0.89-0.95, all clearly above the SynthID floor.
- self check: 1.0000 (pipeline sane).
So the embedder discards exactly the dimensions SynthID hides in. PhotoMaker-V2
conditioned on a watermarked face will see the same identity vector as a clean
face of that person, so the generated face inherits identity, not the watermark.
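For reference, the drift figures above are one minus cosine similarity. A minimal sketch on plain Python lists, with hypothetical function names and no claim about how the smoke test itself was written:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def embedding_drift(orig, perturbed):
    """How far the embedding moved: 0.0 means identical direction."""
    return 1.0 - cosine_similarity(orig, perturbed)
```

With the reported means, a similarity of 0.9977 corresponds to a drift of about 0.002 and 0.928 to about 0.07.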
This unblocks step 2 of the research plan: prototype PhotoMaker-V2 in the
controlnet pipeline. The previously logged ad-hoc "cos(orig, SDXL-cleaned)"
numbers (0.56-0.93) measured diffusion drift, not watermark invariance, and are
not relevant to the hypothesis.
Docs only.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
After GFPGAN restore was oracle-confirmed to RE-INTRODUCE SynthID (it is a fidelity-
restoration net conditioned on the watermarked input), the only identity path that
will not transport the watermark is identity-by-EMBEDDING: a semantic vector that
conditions a fresh generation. That requires a face-recognition / ArcFace-class or
CLIP-image embedder.
Verified the license stack of every credible 2025-2026 SDXL identity adapter by
fetching primary sources directly (HuggingFace model cards, insightface.ai):
- IP-Adapter FaceID family, InstantID, PuLID, Arc2Face -> all blocked. Each
depends at runtime on InsightFace's antelopev2/buffalo_l ArcFace packs, and
insightface.ai explicitly states "Code is MIT licensed; models require separate
commercial licensing." IP-Adapter FaceID's own model card flags itself non-
commercial for the same reason.
- PhotoMaker-V2 is the single commercial-safe end-to-end stack today: Apache-2.0
adapter weights with identity encoded as a fine-tuned OpenCLIP-ViT-H/14 (the
model card's exact phrase: "id_encoder includes finetuned OpenCLIP-ViT-H-14
and a few fuse layers"). No InsightFace.
Mechanistic argument that an identity embedding cannot transport SynthID: the
embedder is trained to be invariant to low-amplitude pixel changes (JPEG, resize,
brightness, noise), which is exactly the regime SynthID hides in by design. So
the embedding extracted from a watermarked face should be ~identical to the
embedding from the cleaned face, and the embedding cannot carry the watermark
into a freshly generated face. Flagged explicitly as not-yet-measured -- the
first integration step is a cosine-similarity smoke test (no codegen) before
investing in a PhotoMaker prototype.
Process note: the deep-research harness was run but its verifier subagents failed
to call StructuredOutput (same harness bug as a prior session), so its synthesis
was unusable; the license claims here are direct quotes from the primary
sources, fetched and verified, not from the workflow synthesis.
Docs only.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Ran the isolated raiw-controlnet-cert Modal app (raiw-app/modal_cert.py) over a
strength x seed grid, restore OFF, --max-resolution 1536, each vendor checked on its
OWN oracle (OpenAI -> openai.com/verify, Gemini -> the Gemini app). Certified
controlnet SynthID-removal floors:
- OpenAI 0.20: 2 photoreal images (9-face grid + bracelet) x seed {1,2,3} = 6/6 clean;
the bracelet that flipped at 0.15 is seed-robust at 0.20. Transfers to prod (OpenAI
removal is resolution-independent).
- Gemini 0.30: 0.20 detected -> 0.30 clean on 2/2 seeds (hardest face). Holds only at
<= 1536; Gemini is resolution-sensitive and raiw.cc runs NATIVE, so cap Gemini
<= 1536 + use 0.30, or native-calibrate (~0.35+).
Prod recipe recorded: controlnet + a controlnet-specific per-vendor schedule in
resolve_strength (OpenAI 0.20 / Gemini 0.30, NOT the default 0.10/0.15 ladder) +
FIXED prod seed (kills the near-threshold non-determinism) + restore reworked/off.
Added to docs/controlnet-removal-pipeline-research.md (certified floors table),
docs/synthid.md 5.5, and the CLAUDE.md controlnet bullet. Docs only.
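The per-vendor controlnet schedule amounts to a lookup. This sketch uses only the two certified floors from this message; `resolve_controlnet_strength` and its `fallback` parameter are hypothetical, and the real `resolve_strength` signature is not shown here.

```python
# Certified controlnet floors from the sweep above (restore OFF, <= 1536).
CONTROLNET_FLOORS = {"openai": 0.20, "gemini": 0.30}


def resolve_controlnet_strength(vendor: str, fallback: float) -> float:
    """Return the certified floor for a vendor, else the caller's fallback."""
    return CONTROLNET_FLOORS.get(vendor.lower(), fallback)
```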
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Oracle validation (openai.com/verify + the Gemini app) overturned three claims that
were on main; this commit consolidates the controlnet findings into one authoritative place.
- controlnet does NOT reliably remove SynthID at the low vendor-adaptive strength:
removal is content x pipeline dependent and the survivors FLIP by content type
(photoreal survives controlnet / clears default; flat graphic survives default /
clears controlnet; flat text clears both). Root cause is insufficient strength,
not the pipeline; controlnet needs a higher, per-vendor floor than default.
- removal near the threshold is SEED-non-deterministic (same image+pipeline+strength
can pass or fail run-to-run); a single clean run does not certify a strength.
- `--restore-faces` RE-INTRODUCES SynthID: GFPGAN runs on the ORIGINAL watermarked
face at weight 0.5 and composites it back over the cleaned result (clean A/B:
a Gemini face stayed detected through controlnet 0.15/0.20/0.25 WITH restore,
cleared at 0.20 with --no-restore-faces). The old "GFPGAN scrubs SynthID" claim
was wrong.
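Because removal near the threshold is seed-non-deterministic, a strength only counts as certified when every seed at that strength is clean. A sketch of that rule, with a hypothetical `certified_floor` helper and a hypothetical `min_runs` parameter:

```python
def certified_floor(results, min_runs=2):
    """Lowest strength whose runs are all oracle-clean.

    results maps strength -> list of booleans (True = oracle clean), one per
    seed. A strength with fewer than min_runs runs never certifies.
    """
    clean = [
        strength
        for strength, runs in results.items()
        if len(runs) >= min_runs and all(runs)
    ]
    return min(clean) if clean else None
```

This simplification does not require the higher strengths to be clean as well; a stricter version would.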
Corrected in CLAUDE.md (watermark_remover controlnet bullet, controlnet
Known-limitations bullet, face_restore bullet, vendor-adaptive strength bullet) and
docs/synthid.md (5.1 controlnet/face-identity, 5.2 strength floors, new 5.5 oracle
validation log). docs/controlnet-removal-pipeline-research.md gains an authoritative
"Oracle validation 2026-06-04" section that the others point to as the single source.
Docs only; no code change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
New samsung_engine.py mirrors the jimeng engine but anchors bottom-left; wired
into watermark_registry, the CLI (--mark samsung / auto), and identify
(visible_samsung, medium). visible_alpha_solve.py gains a corner=bl mode;
samsung_alpha.png solved from @f-liva's flat captures. Calibrated for the
Italian "Contenuti generati dall'AI" variant. Flat black/gray/white captures
committed, real photos gitignored. Tests + docs.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The fp16-fix VAE swap (#29) is gated to the default SDXL checkpoint, so a
custom model_id, a stale pre-fix install, or a fal/custom loader can still
decode to an all-black/NaN frame in fp16 (reporter: gpt-image 1448x1086,
the `image_processor.py invalid value encountered in cast` warning).
Add a model-agnostic backstop in remove_watermark: after generation, if the
run was fp16 and the output is degenerate (_is_degenerate_image: near-zero
mean and variance), rebuild the pipeline in fp32 on the same device and
re-run once. fp32 is the verified-clean path, so a black image is never
returned regardless of model_id or version. Mirrors the MPS->CPU fallback's
self-mutation pattern; batch inherits it. Verified e2e on MPS by forcing
fp16 with the swap disabled (first pass black, guard fired, retry clean).
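The backstop can be sketched on a flat list of pixel values. The thresholds, the `is_degenerate` and `generate_with_fp32_retry` names, and the `run(dtype)` callable are assumptions for illustration; the real `_is_degenerate_image` works on image arrays.

```python
import math
import statistics


def is_degenerate(pixels, mean_eps=1.0, var_eps=1.0):
    """True for an empty, NaN-containing, or near-black flat frame."""
    values = list(pixels)
    if not values or any(math.isnan(v) for v in values):
        return True
    near_zero_mean = statistics.fmean(values) < mean_eps
    near_zero_var = statistics.pvariance(values) < var_eps
    return near_zero_mean and near_zero_var


def generate_with_fp32_retry(run, dtype="fp16"):
    """Run once; if an fp16 run came back degenerate, re-run once in fp32."""
    output = run(dtype)
    if dtype == "fp16" and is_degenerate(output):
        output = run("fp32")  # single retry on the verified-clean path
    return output
```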
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Targeted `uv lock --upgrade-package aiohttp`; only the aiohttp pin changes (no
other package added/removed). Clears the two moderate Dependabot alerts on the
transitive aiohttp. The third alert (basicsr GHSA-86w8-vhw6-q9qq, command
injection, no patch) is accepted: basicsr is the optional, off-by-default
`restore` extra pinned to 1.4.2 as the only buildable version.
Imports + targeted suite (identify/metadata/gemini) green after the bump.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
After reverse-alpha, re-detect the sparkle; when one survives at or above the
registry fail line (conf >= 0.5) -- an alpha mismatch the per-image gain estimate
could not fully correct -- inpaint the footprint and keep that only when it lowers
the re-detect confidence. The footprint inpaint reconstructs the slot from its
darker surroundings, so it physically removes the bright sparkle. The change is purely
additive: the common clean removal re-detects below 0.5 and is returned untouched.
Measured on the spaces visible-removal audit: gemini removal-audit failures drop
15 -> 11 (4 genuine rescues), doubao 65/65 and jimeng 11/11 unchanged, zero
regressions on the 468 already-clean removals.
An offset+scale alignment search was prototyped on the remaining 11 fails and
rejected: an audit "ceiling" suggested +4 more, but those were NCC-gaming -- the
lower-scoring placement left the sparkle as bright or brighter, just reshaping the
residual so the contrast-invariant shape-NCC scored lower (a5a9: first-pass slot
~76 at background level vs the "aligned win" ~164). A brightness sanity check
rejected every one, so it contributed nothing and was removed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Three content-quality features for the invisible/all/batch pipeline.
DBNet text detector (auto_config): replace the MSER text heuristic with
PP-OCRv3 differentiable-binarization via cv2.dnn.TextDetectionModel_DB,
using a bundled 2.4 MB Apache-2.0 model (en/cn detection nets are
byte-identical, so it ships language-neutral). cv2.dnn is core OpenCV, so
no new pip dep. MSER stays as the fallback when the model can't load.
Validated on real images: matches MSER everywhere and additionally catches
the Doubao CJK mark MSER missed; routing decisions unchanged otherwise.
Real-ESRGAN upscaler (new upscaler.py, esrgan extra): optional
pre-diffusion super-resolution for the min-resolution floor upscale, loaded
via spandrel (MIT, no basicsr) with BSD-3-Clause weights downloaded on
first use. New --upscaler {lanczos,esrgan} on invisible/all/batch; default
stays lanczos and the engine falls back to lanczos when the extra is absent
or the model errors (never breaks removal). It is a manual opt-in knob (the
auto plan never selects it) -- as a generic GAN it sharpens photo/texture
content strongly but can degrade faces (the diffusion pass regenerates
them) and thin text, documented accordingly.
batch --auto: wire the content-adaptive --auto (+ --adaptive-polish) into
cmd_batch. The plan is recomputed per image and the invisible engine is
cached per resolved pipeline (default/controlnet), so a mixed directory
builds at most one engine of each kind. Verified end-to-end: 3 mixed
images routed correctly with only 2 pipeline loads (controlnet reused).
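The per-pipeline engine cache is essentially a keyed memo. A sketch with a hypothetical `EngineCache` class and `build` callable; the real cmd_batch wiring is not shown in this message.

```python
class EngineCache:
    """Build at most one engine per resolved pipeline name."""

    def __init__(self, build):
        self._build = build      # hypothetical factory: pipeline name -> engine
        self._engines = {}
        self.loads = 0

    def get(self, pipeline):
        if pipeline not in self._engines:
            self._engines[pipeline] = self._build(pipeline)
            self.loads += 1
        return self._engines[pipeline]
```

Routing three images as controlnet, default, controlnet through this cache triggers two loads, matching the end-to-end check above.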
ruff + strict pyright(src/) clean; 558 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Brings in commit 5cf68a6 (single C2PA_AI_VENDORS registry, erase_lama
grayscale/BGRA support, batch device-cache clearing + --controlnet-scale,
uv publish via OIDC, hatchling pin <1.31). Auto-merged with no conflicts;
ruff/pytest(544)/pyright all clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
detect_watermark's shape-only NCC (spatial/gradient/var fusion) fires on ornate
or flat content (text strips, banners, hatching) that coincidentally matches the
diamond shape. The NCC is contrast-invariant, so it cannot see the defining
property of a real Gemini sparkle: a bright WHITE overlay whose core sits above
the local background.
The fusion now demotes (caps confidence to 0.30) a match that is BOTH
low-confidence (< _SPARKLE_FP_CONF 0.65) AND has a low core-ring brightness
margin (_core_ring_margin < _SPARKLE_FP_MARGIN 5). Real sparkles escape via
EITHER high confidence (white-bg sparkles score >=0.79 despite a low margin) OR
high margin (dark/mid backgrounds, incl. the #36 faint-corner case), so both
must fail to demote. The gate is monotonic -- it only removes detections, never
adds -- so it cannot regress the verified-negative corpus (already 0 FPs).
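The gate logic, sketched with the constants named above. The function name `gate_confidence` and the `_DEMOTED_CONF` constant name are hypothetical.

```python
_SPARKLE_FP_CONF = 0.65    # below this a match counts as low-confidence
_SPARKLE_FP_MARGIN = 5     # below this the core barely clears the background
_DEMOTED_CONF = 0.30       # cap applied to a demoted match


def gate_confidence(conf, core_ring_margin):
    """Demote only when BOTH confidence and brightness margin are low."""
    if conf < _SPARKLE_FP_CONF and core_ring_margin < _SPARKLE_FP_MARGIN:
        return min(conf, _DEMOTED_CONF)  # never raises a confidence
    return conf
```

Using `min` keeps the gate monotonic: the returned confidence is never higher than the input.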
On the spaces corpus it demoted 16/495 flagged sparkles (13 had no AI metadata, i.e.
content FPs; the 3 with AI metadata were either visual FPs or a near-invisible
white-on-white sparkle whose AI verdict is held by metadata), and dropped the
removal-audit failures 20 -> 15.
- _core_and_bg shared helper (core 75th-pct brightness vs background-ring median);
_estimate_alpha_gain refactored onto it, new _core_ring_margin wrapper.
- TestSparkleFalsePositiveGate: margin high/low, strong-sparkle kept (incl. on
white via high conf), blurred no-core blob demoted.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The fixed mild auto polish (unsharp 0.5 / grain 2.0) under-corrected soft
photo/face output (gemini_3 stayed at lap-var 84 vs its 592 original) and its
grain speckled small text. Replace it with humanizer.adaptive_polish: target the
input's Laplacian variance with a capped unsharp scaled to the deficit + edge-
masked grain (smooth regions only), calibrated by a short sigma search. Self-
limiting on text/graphics -- already high-frequency, so almost no polish lands
and text edges are masked out. Validated on the spaces corpus (gemini_3 84 -> 334
end-to-end; openai_1 text near-untouched).
Interface: every --auto decision is now independently overridable -- add
--adaptive-polish/--no-adaptive-polish (matching --restore-faces; works without
--auto too) so the polish can be disabled or used manually. _apply_auto overrides
exactly the three content-adaptive modes (pipeline, restore-faces, adaptive-
polish); --unsharp/--humanize stay independent fixed filters.
cv2-only, no new deps. Threaded through invisible/all (not batch).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Three P2 cleanups from a library-wide review.
Detection -- single C2PA_AI_VENDORS registry (noai/constants.py):
- C2PA_ISSUERS, SYNTHID_C2PA_ISSUERS, and identify._ISSUER_PLATFORM now derive
from one C2paAiVendor table, so adding a C2PA vendor is one entry instead of
edits in three places across two files. Behavior-identical (262 detection
tests pass; the kept `needle` field is load-bearing -- it differs from `org`
for Google and ByteDance, with no mechanical derivation).
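A sketch of the single-table idea. The vendor rows are placeholders, the `platform` and `synthid` field names are assumptions, and the shapes of the three derived views are guesses; only `org`, `needle`, `C2paAiVendor`, and the view names come from this message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class C2paAiVendor:
    org: str        # issuer organisation name
    needle: str     # substring matched in the manifest; may differ from org
    platform: str   # hypothetical field: platform label used by identify
    synthid: bool   # hypothetical field: issuer implies SynthID


# Placeholder rows, not the real registry contents.
C2PA_AI_VENDORS = (
    C2paAiVendor("Example Org A", "example-a", "Platform A", True),
    C2paAiVendor("Example Org B", "Example Org B", "Platform B", False),
)

# Every view derives from the one table, so a new vendor is one new row.
C2PA_ISSUERS = {v.needle: v.org for v in C2PA_AI_VENDORS}
SYNTHID_C2PA_ISSUERS = frozenset(v.needle for v in C2PA_AI_VENDORS if v.synthid)
ISSUER_PLATFORM = {v.org: v.platform for v in C2PA_AI_VENDORS}
```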
Code-health:
- region_eraser.erase_lama now accepts grayscale/BGRA like erase_cv2 (it
crashed on grayscale and silently dropped alpha on BGRA). +2 regression tests.
- batch frees the device cache between images via a shared try_empty_device_cache
helper (generalized from the MPS-only _try_clear_mps_cache, now reused by both
the MPS->CPU fallback and the batch loop).
- batch gained --controlnet-scale (parity with invisible/all).
CI / packaging:
- publish.yml uploads via `uv publish` (PyPI trusted publishing over OIDC),
replacing pypa/gh-action-pypi-publish so uploads no longer depend on that
action's bundled twine accepting the Metadata-Version. Workflow filename +
pypi environment unchanged, so PyPI's trusted-publisher entry still matches.
- hatchling pin relaxed <1.28 -> <1.31 (verified against hatch's changelog:
1.30.0 made Metadata 2.5 the default, 1.30.1 reverted to 2.4; 1.27-1.29 were
always 2.4). Kept as belt-and-suspenders so the first uv-publish release ships
2.4, isolating the uploader swap from the metadata-version bump.
Docs (CLAUDE.md, pyproject) synced; corrected the inaccurate "hatchling 1.28+
emits 2.5" note.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add `auto_config.plan(image_path) -> AutoConfig`, the first step of the
invisible/all pipeline: it inspects the input image (before the diffusion model
loads) and picks the quality modes so the run adapts to content. Quality-priority
routing -- ControlNet (text/face-structure preservation) is the default, skipped for
plain SDXL only on a clearly structure-less image; GFPGAN face restore when a face is
present; a mild sharpen + grain polish when a smoothing pass ran. Exposed as `--auto`
on `all`/`invisible` (`_apply_auto`; explicit flags override via click's parameter
source). Not wired into batch (its engine is cached per-mode).
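The routing rule can be sketched as a pure function over detection signals. The `AutoConfig` fields, `plan_from_signals`, and `apply_explicit` are hypothetical; the polish rule is left out because "a smoothing pass ran" is not defined in this message.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AutoConfig:
    pipeline: str         # "controlnet" or "default"
    restore_faces: bool


def plan_from_signals(has_face, has_text, has_structure):
    """ControlNet by default; plain SDXL only on a structure-less image."""
    structureless = not (has_face or has_text or has_structure)
    return AutoConfig(
        pipeline="default" if structureless else "controlnet",
        restore_faces=has_face,
    )


def apply_explicit(plan, explicit):
    """Flags the user set explicitly win over the automatic plan."""
    return replace(plan, **explicit)
```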
Detection is cv2-only and torch-free (~100 MB peak RSS, a few ms): OpenCV YuNet
(`cv2.FaceDetectorYN`, MIT, 232 KB model bundled in assets/) for faces, a Canny
edge-density + MSER heuristic for text/structure (a rough Phase-1 placeholder; DBNet
via cv2.dnn is the planned upgrade). ZERO new pip deps. Designed to run wherever the
pipeline runs -- the raiw.cc Modal GPU worker -- never on the 512 MB web host.
Real-ESRGAN-via-Spandrel upscaling (a new `esrgan` extra) and an adaptive
Laplacian-variance polish are deferred to later phases.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pairs <hash>_src / <hash>_clean outputs, computes SSIM + detail/resolution
proxies, ranks the worst-preserved images for visual classification. Used to
characterize the classes the SDXL scrub degrades (line-art, faces, dense text).
Operates on gitignored data/spaces only; writes nothing tracked.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The captured sparkle alpha peaks ~0.51, but some real Gemini sparkles are
rendered more opaque. The fixed-alpha reverse blend then UNDER-subtracts and
leaves a bright residual the detector still fires on. A visible-removal audit
through the registry path on the spaces corpus showed this as a meaningful
fraction of marks -- all under-removals, not a background-brightness class
(failures and successes had the same input confidence and background luma; the
discriminator was the removal delta itself).
remove_watermark now estimates a per-image alpha gain (_estimate_alpha_gain:
effective sparkle opacity at the bright core vs the local background ring,
a_eff/a_cap, clamped [1.0, 1.94]) and scales the alpha to match before the
over-sub/blend branch. A 1.05 deadband keeps a sparkle that already matches the
capture byte-identical to the pre-fix output, so the fix is purely additive
(0 regressions on the audit set; failures dropped substantially). The over-sub
guard still runs on the scaled alpha as the safety net for an over-shoot.
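A sketch of the gain estimate under one stated assumption: the sparkle is a white overlay, so the observed core is `bg * (1 - a) + 255 * a`. The function name and the scalar inputs are hypothetical; the 0.51 capture peak, the [1.0, 1.94] clamp, and the 1.05 deadband come from this message.

```python
_ALPHA_GAIN_MAX = 1.94
_DEADBAND = 1.05


def estimate_alpha_gain(core, background, alpha_cap=0.51):
    """Gain = effective opacity / captured opacity, clamped to [1.0, 1.94].

    core and background are brightness values in 0..255.
    """
    headroom = 255.0 - background
    if headroom <= 0:
        return 1.0                      # white background: nothing to estimate
    a_eff = (core - background) / headroom
    gain = a_eff / alpha_cap
    if gain < _DEADBAND:
        return 1.0                      # matches the capture: leave alpha as is
    return min(gain, _ALPHA_GAIN_MAX)
```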
- _estimate_alpha_gain + _ALPHA_GAIN_MAX/_DEADBAND/_CORE_FRAC in gemini_engine.
- TestUnderSubtractionGain asserts on footprint pixels, NOT the detector (its
NCC is degenerate on a flat synthetic bg; the real corpus removal drops the
detector ~0.80 -> ~0.27).
- scripts/visible_removal_audit.py: the detect -> remove -> re-detect audit tool
that found and validated this (operates on gitignored data/spaces only).
- CLAUDE.md + README: document the under-subtraction gain.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two quality knobs for the SDXL invisible pass:
- min_resolution floor (default 1024, --min-resolution): small inputs are
upscaled to a 1024px long-side floor before diffusion, since SDXL img2img
distorts on a tiny latent (a 381x512 portrait is wrecked at native size). The output
is restored to the original input size, so it is a transparent quality boost;
it adds time/memory on small inputs. 0 disables. Extends the pure _target_size
helper (now cap-or-floor-or-native, min skipped on a min>max misconfig),
unit-tested without a model.
- unsharp post-filter (humanizer.unsharp_mask, --unsharp, opt-in default 0):
applied LAST, after the GFPGAN face pass (a pre-GFPGAN sharpen would be
smoothed back over), to counter the soft/over-smoothed look that diffusion +
restoration leave behind (an AI tell). Pairs with --humanize (grain).
Both threaded through invisible/all/batch + the module-level helper. Verified
end-to-end on a 381x512 portrait: upscaled to 1024, sharpened, restored to
381x512.
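The cap-or-floor-or-native rule can be sketched as below. The signature and the rounding are assumptions, and any rounding to a latent-friendly multiple that the real `_target_size` may do is omitted.

```python
def target_size(width, height, max_resolution=0, min_resolution=1024):
    """Return the (width, height) to diffuse at; 0 disables a bound."""
    long_side = max(width, height)
    # A floor above the cap is a misconfiguration, so the floor is skipped.
    floor_ok = min_resolution and not (
        max_resolution and min_resolution > max_resolution
    )
    if max_resolution and long_side > max_resolution:
        scale = max_resolution / long_side      # cap: downscale
    elif floor_ok and long_side < min_resolution:
        scale = min_resolution / long_side      # floor: upscale
    else:
        return width, height                    # native
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The caller would resize the output back to the original input size afterwards.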
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Capture the rule: archive only cleaned outputs from the current default SDXL
img2img pass; never archive examples from removed methods (ctrlregen, old
text/face protection, FaceID, CodeFormer) or experimental opt-in paths
(controlnet, GFPGAN). A removed method's output is not a reproducible example.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Back docs/synthid.md section 2.2 with the actual test set: the per-image
oracle-verified subjects were only in a local working dir, while the doc claimed
they were recorded in data/synthid_corpus/. Ingest the key pos+cleaned pairs so
the claim holds.
- pos: openai_1/2/3 originals (gpt-image, openai-verify) + gemini_1/2/3/4
originals (Gemini app, gemini-app); all probe as C2PA-SynthID present.
- cleaned: OpenAI at strength 0.05 (openai_2 only s010 captured) + Gemini at 0.15
--max-resolution 1536; oracle: SynthID NOT detected. Metadata stripped, so no
C2PA on the cleaned rows.
- Excluded the third-party issue #14 image (pic3): oracle-verified but not
committed to the public corpus.
- docs/synthid.md 2.2: state OpenAI n=4 = 3 archived + 1 external-only.
- CLAUDE.md: drop the drift-prone "~65 MB" corpus size from the sdist note.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Both content-preservation features are now flagged EXPERIMENTAL and opt-in.
--pipeline controlnet was already opt-in (default=default); --restore-faces
flips from on-by-default to OFF by default, matching the repo's prior pattern
for experimental preservation passes (the removed protect_text/protect_faces).
- cli.py: --restore-faces/--no-restore-faces default False; EXPERIMENTAL in the
--restore-faces / --controlnet-scale / --pipeline help; batch default False.
- invisible_engine.py: remove_watermark restore_faces default False + docstring.
- CLAUDE.md / README.md / docs/synthid.md: label both experimental/opt-in.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add an optional, commercial-safe face-restoration post-pass that recovers
face identity the diffusion removal pass drifts (canny holds structure, not
likeness) while still scrubbing the pixel watermark in the face regions.
- face_restore.py: GFPGANer singleton (CPU unless CUDA), the basicsr
torchvision.transforms.functional_tensor shim, and the pure feather
_composite_faces helper (unit-tested without the model). GFPGAN
re-synthesizes each face from a StyleGAN2 prior, so composited face pixels
are GAN-generated (no watermark, no pixel-copy) -- oracle-clean at weight 0.5
with identity preserved.
- InvisibleEngine.remove_watermark: restore_faces / restore_faces_weight,
best-effort, auto-skips when the extra is absent or no face is detected.
- CLI --restore-faces/--no-restore-faces + --restore-faces-weight on
invisible/all/batch (on by default).
- restore extra (gfpgan/facexlib/basicsr), numpy<2-pinned (scipy<1.18,
numba<0.60) and kept out of `all`; basicsr needs Python <3.13 + setuptools<69
to build, so pin .python-version 3.12.
Commercial-safe: GFPGAN Apache-2.0, RetinaFace MIT. The CodeFormer alternative
is non-commercial and is not shipped. The earlier IP-Adapter FaceID layer was
removed (footgun: needs high strength, corrupts faces at the low removal
strength).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add `--pipeline controlnet` (SDXL base + xinsir canny ControlNet via
StableDiffusionXLControlNetImg2ImgPipeline): the canny edge map conditions the
img2img regeneration so text and face STRUCTURE stay sharp, while the watermark
is still removed by the regeneration (`strength`) -- no original pixels are
copied or frozen, so SynthID does not survive. Oracle-verified clean on OpenAI
with better text/structure fidelity than plain img2img at equal strength.
`--controlnet-scale` tunes structure preservation; fp32 on mps/cpu (fp16-fixed
VAE on cuda/xpu). Shares the img2img runner (live progress + MPS->CPU fallback)
and the fp16-VAE-fix / device-move helpers with the default pipeline.
Remove the superseded subsystems -- ctrlregen (SD1.5 clean-noise),
text-protection (differential / region-hires) and face-protection: they either
destroyed real content or shielded the watermark by re-using original pixels.
controlnet replaces them by regenerating everything under edge conditioning.
Canny preserves face structure but not identity; face IDENTITY is a separate
face-restoration post-pass (CodeFormer/GFPGAN), researched + prototyped but not
yet shipped. An IP-Adapter FaceID attempt was built and removed (footgun: needs
high strength, corrupts faces at removal strength).
Docs: docs/controlnet-removal-pipeline-research.md, scripts/controlnet_sweep.py.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
detect_watermark's size-weighted global NCC search lets a larger, mediocre
match (e.g. a bright collar in a portrait) outrank a small, near-perfect
sparkle in the bottom-right corner, so a faint sparkle on a busy background
scored below threshold and the image read as clean -- the regression from
widening the search window 256px->512px between v0.7.2 and v0.8.8.
Add _corner_promote: a bottom-right-corner raw-NCC pass that overrides the
global pick when the corner holds a match with raw NCC >= 0.85 that beats it.
It only ever replaces a lower-fidelity pick (cannot weaken an existing
detection) and keeps the wider window for variant margins. The corner side is
relative-clamped (0.20 of the short side, [96, 384]) so it stays a true corner
at every scale: a fixed 256px covers ~70% of a small portrait, where a real
photo raw-matches the star at ~0.81; relative tightening drops that to ~0.69.
The 0.85 gate sits between the worst real-photo corner match (~0.78) and a
genuine faint sparkle (~0.93): zero false positives across native + downscaled
negatives, headshot rescued from below-threshold to 0.71.
Factor the shared multi-scale matchTemplate loop into _scan_scales.
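Illustrative sketch of the corner-side clamp and the 0.85 gate described above.
The `Match` type and both function names are assumptions for this example; the
real `_corner_promote` works on matchTemplate results, which are not modelled
here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Match:
    raw_ncc: float  # un-weighted normalized cross-correlation of the match
    where: str      # label for the example only


def corner_side(width, height, frac=0.20, lo=96, hi=384):
    """Corner window side: `frac` of the short side, clamped to [lo, hi] pixels."""
    return int(max(lo, min(hi, round(frac * min(width, height)))))


def promote(global_pick: Optional[Match], corner_pick: Optional[Match], gate=0.85):
    """Prefer the corner match only if it passes the gate and beats the global pick."""
    if corner_pick is None or corner_pick.raw_ncc < gate:
        return global_pick
    if global_pick is not None and corner_pick.raw_ncc <= global_pick.raw_ncc:
        return global_pick
    return corner_pick
```

With these defaults a 1086x1448 image gets a 217px corner, a 300x400 image is
clamped up to 96px, and a ~0.78 real-photo corner match never displaces the
global pick.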
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Confirmed on real CUDA hardware 2026-06-03: `all` on a 1086x1448 OpenAI
gpt-image at fp16 produces a normal (non-black) output, so the fp16-fix VAE
swap resolves the all-black decode. Removes the prior "NOT verifiable on this
MPS machine" caveat.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The C2PA issuer attribution (`c2pa`) and the SynthID proxy (`synthid`) are
derived from the same manifest, so treating them as independent signals made
rule 1 fire on legitimate multi-actor manifests where a product wraps another
vendor's engine (Microsoft Designer on OpenAI, Microsoft on Google) or an edit
chain re-signs (Adobe over a Gemini original). 19 such files in the
2026-06-01/02 spaces batches read as "likely spoofed/laundered" before this.
Group `c2pa` + `synthid` into one provenance source via `_CLASH_SOURCE`; rule 1
now requires two vendors from different sources. A manifest vendor still clashes
with a genuinely independent stamp (EXIF/XMP generator, IPTC AISystemUsed, AIGC,
xAI).
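Illustrative sketch of the grouping rule. The mapping contents, the signal
names other than `c2pa` / `synthid`, and the `vendors_clash` helper are
assumptions for this example, not the shipped table.

```python
# Signals that share one underlying provenance source map to the same label.
# Any signal not listed counts as its own independent source.
_CLASH_SOURCE = {"c2pa": "manifest", "synthid": "manifest"}


def vendors_clash(claims):
    """claims: iterable of (signal, vendor). True if two different vendors
    are asserted by two different provenance sources."""
    resolved = [(_CLASH_SOURCE.get(signal, signal), vendor) for signal, vendor in claims]
    for i, (source_a, vendor_a) in enumerate(resolved):
        for source_b, vendor_b in resolved[i + 1:]:
            if vendor_a != vendor_b and source_a != source_b:
                return True
    return False
```

Under this rule a Microsoft-signed manifest carrying an OpenAI SynthID proxy is
one source with two actors, so it does not clash; an EXIF generator naming a
different vendor still does.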
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
On a dark/textured background (e.g. grass) the captured alpha map over-estimates
the real Gemini sparkle's effective opacity (~0.51 captured vs ~0.31 effective),
so the fixed-alpha reverse blend over-subtracts (watermarked - alpha*logo goes
negative) and drives the footprint to black -- the white sparkle turns into a
black diamond (issue #30, reported by @CoolZimo1).
remove_watermark now detects this via _reverse_alpha_oversubtracts (fraction of
footprint pixels with a negative numerator > 5%) and inpaints the small sparkle
footprint from the surrounding pixels (cv2 NS, cropped to a padded box) instead.
Behavior-neutral on the working case: a bright background over-subtracts at ~0%,
so reverse-alpha is used and the output is byte-identical to before (verified:
demo_banana 0.0 frac vs the issue-#30 grass image 0.61 frac; issue-#30 footprint
recovers to background grass with no pit, residual sparkle conf 0.25 < 0.35).
Guard is scoped to GeminiEngine: doubao/jimeng already NCC-align their alpha to
the actual mark per image, which sidesteps the fixed-alpha mismatch.
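Illustrative sketch of the over-subtraction check, using the opacities quoted
above. The flat-list pixel layout, the white (255) logo value, the
`footprint_min_alpha` cutoff, and the function names are assumptions for this
example; the shipped `_reverse_alpha_oversubtracts` may differ.

```python
def oversubtract_fraction(watermarked, alpha, logo=255.0, footprint_min_alpha=0.05):
    """Fraction of footprint pixels where watermarked - alpha * logo is negative."""
    footprint = [(w, a) for w, a in zip(watermarked, alpha) if a > footprint_min_alpha]
    if not footprint:
        return 0.0
    negative = sum(1 for w, a in footprint if w - a * logo < 0.0)
    return negative / len(footprint)


def should_inpaint(watermarked, alpha, limit=0.05):
    """True when reverse-alpha would drive more than `limit` of the footprint below zero."""
    return oversubtract_fraction(watermarked, alpha) > limit


def blend(background, effective_alpha, logo=255.0):
    """Forward model for the example: how a sparkle pixel is rendered."""
    return effective_alpha * logo + (1.0 - effective_alpha) * background
```

A dark pixel (background 40) rendered at effective alpha 0.31 is about 106.7;
subtracting the captured 0.51 * 255 = 130.05 goes negative, so the guard fires.
A bright pixel (background 200) stays positive and keeps the reverse-alpha path.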
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The default img2img strength is now chosen from the detected SynthID vendor
(C2PA issuer) instead of a single fixed 0.30: OpenAI gpt-image -> 0.10, Google
Gemini -> 0.15, unknown source -> 0.15. Explicit --strength always wins.
Basis: an oracle-verified June 2026 controlled study (clean v0.8.6, text/face
protection OFF, per-image openai.com/verify or Gemini-app verdict). OpenAI's
SynthID clears at 0.05 across 1024-1600 px (n=4, resolution-independent);
Google's is ~3x more robust and needs 0.15 on the capped-1536 path (n=4). The
dominant factor is the VENDOR, not resolution. The earlier single 0.30 default
and the "resolution dependence" lore came from contaminated tests run with the
protect-text bug ON (issue #14) -- re-running those same 1600x1600 images clean
removes SynthID at 0.05.
`vendor_for_strength(path)` reads metadata.synthid_source on the ORIGINAL input
and is threaded through cli (invisible/all/batch) -> invisible_engine ->
watermark_remover -> resolve_strength(strength, profile, vendor), so display and
execution use the same vendor (the engine sees a temp path whose C2PA the visible
pass already stripped, so detection must happen in the CLI on the pristine
source). Caveat: Google's 0.15 was validated only on --max-resolution 1536;
native 2816 Gemini was not locally measurable (OOM on Apple Silicon) and is
pending GPU validation on raiw.cc.
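Illustrative sketch of the vendor-adaptive default. The vendor key strings and
the `pick_strength` name are assumptions, and the `profile` argument of the
real `resolve_strength(strength, profile, vendor)` is left out for brevity.

```python
# Defaults quoted in this commit message.
VENDOR_STRENGTH = {"openai": 0.10, "google": 0.15}
UNKNOWN_VENDOR_STRENGTH = 0.15


def pick_strength(explicit, vendor):
    """An explicit --strength always wins; otherwise choose by detected vendor."""
    if explicit is not None:
        return explicit
    key = (vendor or "").lower()
    return VENDOR_STRENGTH.get(key, UNKNOWN_VENDOR_STRENGTH)
```

The vendor has to be resolved once, on the original input, and passed down, so
that the value shown to the user and the value the engine runs with agree.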
Docs: docs/synthid.md sections 2.2/4.4/5.2 corrected (the contaminated
resolution-dependence findings replaced with the clean oracle-verified table);
README and CLAUDE.md updated; CLI --strength help reflects the adaptive default.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The 256px limit caused misses when Gemini places the sparkle further from the
corner than the standard 160px (margin 64 + logo 96). Observed variant at ~300px
reported in issue #30. 512px covers all known Gemini margin variations with room
to spare; matchTemplate on a 512x512 region is still fast on CPU.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Both text and face protection were shielding SynthID from removal. The
text-protection high-res re-scrub regenerates pixels at an upscaled resolution
where the per-region pass may not be strong enough to re-destroy the SynthID
payload, allowing it to survive in text areas. Face protection has an even more
direct mechanism: it pastes back the original (pre-diffusion, watermarked) face
pixels after the global pass, guaranteeing SynthID survives in face regions
regardless of strength.
Both --protect-text and --protect-faces are now off by default and opt-in.
Rename from --no-protect-text / --no-protect-faces to --protect-text /
--protect-faces. Extract shared click.option decorators to module-level
constants (_protect_text_option, _protect_faces_option) to eliminate
copy-paste between cmd_invisible and cmd_all.
Add docs/synthid.md: primary-source-cited technical reference for SynthID-Image
covering mechanism (post-hoc encoder/decoder, 136-bit payload, pixel-space, no
model-weight modification), robustness numbers (arXiv:2510.09263: ~99.98% TPR
at 0.1% FPR across 30 transforms), removal attacks and forensic detectability
(arXiv:2605.09203: all 6 attacks detectable >98% TPR@1%FPR), detectability
limits, oracle scope, adoption landscape, and practical implications including
the protect-text/faces SynthID-preservation finding.
Verified June 2026 on gpt-image 1600x1600 via openai.com/verify: with
--protect-text SynthID detected; without, SynthID removed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
identify previously ran only the Gemini sparkle as a visible detector, so a
Doubao/Jimeng image with stripped TC260 metadata had no visible fallback. Add
`_visible_text_marks` (registry-backed) so the ByteDance Doubao 豆包AI生成 and
Jimeng 即梦AI marks are detected too, each gated by its own engine NCC threshold
via MarkDetection.detected. New signals `visible_doubao` / `visible_jimeng`
(medium), same stripped-metadata fallback role as the sparkle; excluded from
integrity-clash vendor claims; set platform only when no harder signal did.
Also make `noai/__init__` lazy (PEP 562 __getattr__): importing the light
`noai.c2pa` / `noai.constants` submodules (which identify needs) no longer
eagerly pulls `watermark_remover`, which imports torch + diffusers at module
top. `import remove_ai_watermarks.identify` drops from ~420 MB to ~21 MB in a
full gpu/detect install (torch not loaded), so it fits a 512 MB host; the
removal API resolves lazily on first access. Guarded by TestIdentifyImportIsLight.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
hatchling 1.28+ emits Metadata-Version 2.5 (PEP 639); the twine in
pypa/gh-action-pypi-publish@release/v1 rejects it, which failed the v0.8.3 PyPI
upload (build + tag-match passed, upload step failed, nothing uploaded). 1.27.x
emits 2.4, which uploads fine (0.8.2). Pin the build backend; lift once the action
twine is 2.5-aware or the workflow uses uv publish.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The protect_text correction first rested only on qw1212ss_pic3, an OpenAI
image that carries no Google SynthID (so the Gemini oracle is not a valid check
for it). The updated study re-ran the 0.3 protect_text A/B on a genuine Gemini
SynthID image (gemini_633uuy, photo with a Chinese-text sign): SynthID removed
with protection ON and OFF, Gemini-oracle verified. Cite that as the load-bearing
evidence so the claim rests on a valid subject. Confirms the shipped 0.30 +
protect_text=ON default on a real Gemini target.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
An A/B at strength 0.3 on a real e-commerce infographic (updated GPU study)
reverses the earlier claim: SynthID is a GLOBAL watermark, so 0.3 removes it
whether protect_text is on or off, and protection SALVAGES text fidelity (medium
headings/body stay readable; off, they garble). The earlier 'protect_text shields
the watermark, use --no-protect-text' was wrong -- it mistook the 0.10 strength
failure for a protection effect. Recommended SynthID config: ~0.3 + protect_text ON
(the default). Also document the oracle scope: the Gemini app 'Verify with SynthID'
is the only valid SynthID oracle; openai.com/verify is provenance-scoped (C2PA) and
does NOT measure SynthID. Corrects CLAUDE.md + README + watermark_profiles comment
shipped in cddbaf6.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
An oracle-verified GPU strength study (Modal A100, native res, Gemini-app
'Verify with SynthID', n=3 fresh Gemini images, protect_text/faces off) found the
current Google SynthID survives strength 0.10/0.15/0.2 and is removed only at 0.3.
The previous 0.10 default (set from an n=1 result) no longer clears it -- Google
hardened SynthID and the threshold has climbed 0.05 -> 0.10 -> ~0.3. Bump
DEFAULT_STRENGTH to 0.30; OpenAI/ChatGPT carry C2PA not SynthID, so 0.10 is plenty
there (pass --strength 0.10). Note protect_text shields the text regions SynthID
hides in (use --no-protect-text for full removal on text-heavy images).
The same study found ctrlregen at clean-noise strength DESTROYS real images
(hallucinated micro-text in smooth regions), with no usable middle setting, so the
literature's 'clean-noise is the lever' did not hold empirically. Flag ctrlregen
EXPERIMENTAL in the CLI --pipeline help, README, and watermark_profiles; SDXL
img2img at ~0.3 stays the shippable path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The cli refactor dropped rich from dependencies, but four scripts still did
`from rich.console import Console` / `from rich.table import Table`. Their test
modules import the scripts, so a clean `uv sync --frozen` (CI: core+dev, no
rich) failed at collection with ModuleNotFoundError on macOS/Windows/Linux.
Add a shared plain-text shim `scripts/_plain_console.py` (Console/Table via
click.echo, markup stripped) and switch all four scripts to it. Verified: all
four import with rich blocked, and tests/test_synthid_corpus.py +
tests/test_synthid_pixel_probe.py pass.
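Illustrative sketch of such a shim, not the contents of
`scripts/_plain_console.py`. It uses an injectable `emit` callable (default
`print`) in place of click.echo so it runs on the stdlib alone, and the tag
regex is a naive assumption rather than rich's real markup grammar.

```python
import re

# Matches rich-style tags such as [bold], [bold red], [/bold] and [/].
# Naive: it would also strip a literal bracketed word like [gpu].
_TAG = re.compile(r"\[/?[A-Za-z#][^\[\]]*\]|\[/\]")


def strip_markup(text):
    return _TAG.sub("", text)


class Table:
    def __init__(self, title=""):
        self.title = title
        self.columns = []
        self.rows = []

    def add_column(self, header):
        self.columns.append(strip_markup(header))

    def add_row(self, *cells):
        self.rows.append([strip_markup(str(cell)) for cell in cells])

    def lines(self):
        """Render as an optional title line plus one indented line per row."""
        out = [strip_markup(self.title)] if self.title else []
        for row in self.rows:
            out.append("  " + ", ".join(f"{h}: {c}" for h, c in zip(self.columns, row)))
        return out


class Console:
    def __init__(self, emit=print):
        self._emit = emit

    def print(self, obj=""):
        if isinstance(obj, Table):
            for line in obj.lines():
                self._emit(line)
        else:
            self._emit(strip_markup(str(obj)))
```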
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add --no-protect-faces, mirroring --no-protect-text: when the image has no
people, skip loading and running the YOLO face detector entirely. The heavy
extract+blend already only
ran when a face was found, but the detector itself always loaded+inferred to
decide; this flag lets callers skip that fixed cost.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
cli.py now emits plain ASCII through a small click.echo shim
(_Console / _Table / _Progress) instead of rich: no colors, markup tags,
panels, progress bar, or Unicode glyphs (plain `Warning:`, `->` and `...`
replace them; checkmark/cross marks are dropped). identify and metadata tables
render as indented plain lines.
- drop rich from dependencies (pyproject.toml + uv.lock)
- __init__: set TRANSFORMERS_VERBOSITY=error (setdefault) plus a warnings
filter so the transformers Siglip2ImageProcessorFast deprecation no
longer prints at CLI startup (it fires from the eager noai import)
- TestGpuHintMarkup: the [gpu] hint is now printed verbatim; docstring updated
- CLAUDE.md: replace the obsolete rich-markup lesson, note the verbosity fix
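Illustrative sketch of the startup quieting in the `__init__` bullet above. The
function name, the injectable `env` argument, and the message pattern are
assumptions for this example; the real filter may match a different message or
warning category.

```python
import os
import warnings


def quiet_transformers_startup(env=None):
    """Lower transformers log verbosity and hide one known deprecation warning."""
    env = os.environ if env is None else env
    # setdefault keeps any value the user already exported.
    env.setdefault("TRANSFORMERS_VERBOSITY", "error")
    # `message` is a regex matched against the start of the warning text.
    warnings.filterwarnings("ignore", message=r".*Siglip2ImageProcessorFast.*")
```

Both calls have to run before the import that triggers the warning, which is
why they live at the top of the package `__init__`.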
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>