The default img2img strength is now chosen from the detected SynthID vendor
(C2PA issuer) instead of a single fixed 0.30: OpenAI gpt-image -> 0.10, Google
Gemini -> 0.15, unknown source -> 0.15. Explicit --strength always wins.
Basis: an oracle-verified June 2026 controlled study (clean v0.8.6, text/face
protection OFF, per-image openai.com/verify or Gemini-app verdict). OpenAI's
SynthID clears at 0.05 across 1024-1600 px (n=4, resolution-independent);
Google's is ~3x more robust and needs 0.15 on the capped-1536 path (n=4). The
dominant factor is the VENDOR, not resolution. The earlier single 0.30 default
and the "resolution dependence" lore came from contaminated tests run with the
protect-text bug ON (issue #14) -- re-running those same 1600x1600 images clean
removes SynthID at 0.05.
`vendor_for_strength(path)` reads metadata.synthid_source on the ORIGINAL input
and is threaded through cli (invisible/all/batch) -> invisible_engine ->
watermark_remover -> resolve_strength(strength, profile, vendor), so display and
execution use the same vendor (the engine sees a temp path whose C2PA the visible
pass already stripped, so detection must happen in the CLI on the pristine
source). Caveat: Google's 0.15 was validated only on --max-resolution 1536;
native 2816 Gemini was not locally measurable (OOM on Apple Silicon) and is
pending GPU validation on raiw.cc.
Docs: docs/synthid.md sections 2.2/4.4/5.2 corrected (the contaminated
resolution-dependence findings replaced with the clean oracle-verified table);
README and CLAUDE.md updated; CLI --strength help reflects the adaptive default.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
An oracle-verified GPU strength study (Modal A100, native res, Gemini-app
'Verify with SynthID', n=3 fresh Gemini images, protect_text/faces off) found the
current Google SynthID survives strength 0.10/0.15/0.2 and is removed only at 0.3.
The previous 0.10 default (set from an n=1 result) no longer clears it -- Google
hardened SynthID and the threshold has climbed 0.05 -> 0.10 -> ~0.3. Bump
DEFAULT_STRENGTH to 0.30; OpenAI/ChatGPT carry C2PA not SynthID, so 0.10 is plenty
there (pass --strength 0.10). Note protect_text shields the text regions SynthID
hides in (use --no-protect-text for full removal on text-heavy images).
The same study found ctrlregen at clean-noise strength DESTROYS real images
(hallucinated micro-text in smooth regions), with no usable middle setting, so the
literature's 'clean-noise is the lever' did not hold empirically. Flag ctrlregen
EXPERIMENTAL in the CLI --pipeline help, README, and watermark_profiles; SDXL
img2img at ~0.3 stays the shippable path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The ctrlregen profile inherited the SDXL img2img --strength default (0.10), a
near-identity pass that loaded ControlNet + DINOv2-giant and barely changed the
image -- a no-op for removal. resolve_strength() now resolves an unset strength
per profile: 0.10 for the SDXL default, CTRLREGEN_DEFAULT_STRENGTH (1.0,
clean-noise) for ctrlregen. It checks `is None` rather than falsiness, so an
explicit 0.0 is respected (the old `strength or DEFAULT` swallowed it).
Research basis: CtrlRegen (ICLR 2025, arXiv:2410.05470) removes robust
watermarks by regenerating from clean Gaussian noise; partial-noise img2img
retains watermark info that diffuses back, so a high (clean-noise) strength is
the lever, not a knob on the light SDXL pass. CLI wiring (--strength default
None) lands with the cli refactor.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The stock SDXL VAE overflows to NaN in fp16, so the plain img2img path decodes
to an all-black image on a CUDA/XPU fp16 backend. This is the raiw.cc black
result HitaoLin reported (a 1086x1448 input came back uniformly black). cpu/mps
run fp32 and never hit it, and the differential / region-hires pipeline already
upcasts the VAE itself, so only the plain path on a fp16 GPU was exposed.
`_load_pipeline` now loads `madebyollin/sdxl-vae-fp16-fix` for the default SDXL
checkpoint when running fp16, gated by the pure helper `_needs_fp16_vae_fix`. A
custom non-SDXL model keeps its own VAE.
The decision logic is unit-tested without a download (TestFp16VaeFix). The
black->clean recovery itself needs a CUDA GPU and was not verifiable on this MPS
machine; it must be confirmed on the backend.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* Raise default SynthID-removal strength 0.05 -> 0.10 (current Google SynthID)
The old default (0.04/0.05) no longer removes the CURRENT Google SynthID (Nano
Banana / Gemini 3): verified 2026-05-30 via the Gemini 'Verify with SynthID'
oracle on a real image -- 0.05 still detected, 0.10 not detected (OpenAI's was
already cleared at 0.05). Add DEFAULT_STRENGTH=0.10 in watermark_profiles, route
the engine + CLI defaults to it. At 0.10 small text deforms more, which is why
text protection (_run_region_hires) runs by default. CLAUDE.md SynthID note
corrected. CAVEAT: n=1 Google + n=1 OpenAI; broad corpus oracle validation
pending (task tracked).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Drop unused LOW/MEDIUM/HIGH strength profiles; CLI --strength defaults to DEFAULT_STRENGTH
The fixed strength presets (and get_recommended_strength) were dead -- nothing in
the pipeline used them, only tests. One knob now: DEFAULT_STRENGTH (0.10),
overridable per-call via the CLI --strength flag, which now defaults to that
constant (single source of truth). Removed the WatermarkRemover.LOW/MEDIUM/HIGH
class attrs and the get_recommended_strength tests.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(device): support xpu backend
* Fall back to CPU seed generator when device RNG unsupported (xpu)
Some torch-xpu builds have no device-side RNG, so torch.Generator(device="xpu")
raises when --seed is used. _make_seed_generator tries the device generator and
falls back to a backend-agnostic CPU generator. Adds a fallback unit test.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Victor Kuznetsov <kuznetsov.va@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Add cross-platform CI test matrix, PyPI classifiers
CI: new test.yml runs lint (ubuntu) + a test matrix (ubuntu/macos/windows
x py3.10/3.12, core+dev, GPU tests skip) on push to main and PRs, closing the
gap where only the release publish.yml ran (ubuntu, no tests). Add PyPI
classifiers (OS/Python/topic). README Tests badge, CLAUDE.md CI note.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Make availability tests reflect installed deps, not assume gpu extra
The new core+dev CI matrix has no diffusers, so the invisible-engine
availability tests (asserting is_available() is True unconditionally) and the
two mocked invisible CLI tests (whose command gates on is_available before the
mock) failed. Assert availability == actual importability of torch+diffusers,
and patch the CLI availability gate so the mocked-engine tests run regardless of
the gpu extra.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Detect SynthID-bearing images via their C2PA companion: a manifest signed by a
SynthID-using vendor (Google/OpenAI) on AI-generated content implies an
invisible SynthID pixel watermark. Verified end-to-end against the vendor
oracles (openai.com/verify, Gemini "Verify with SynthID").
- metadata: synthid_source() + synthid_watermark verdict in get_ai_metadata,
surfaced as a `metadata --check` callout. Format-agnostic (PNG caBX parser +
JPEG/WebP/AVIF/HEIF/JXL binary scan).
- constants: SYNTHID_C2PA_ISSUERS {Google, OpenAI}; +opened/placed actions.
- c2pa: single CBOR-aware parser (_cbor_text_after) replaces glitchy regex
(fixes fGPT-4o claim_generator); removed duplicate _scan_png_c2pa_chunk from
metadata; shared synthid_verdict / synthid_vendors_in helpers.
- corpus: scripts/synthid_corpus.py ingest tool + data/synthid_corpus/
(manifest tracked, images gitignored) for a labeled reference set.
- tests: +38 across C2PA parser internals, extract/inject round-trip, ISOBMFF
container stripping, all IPTC AI markers, and invisible watermark strength
tiers (SynthID/StableSignature/TreeRing/StegaStamp/RingID/RivaGAN/...).
Pixel-level SynthID detection remains out of reach locally (Google's decoder is
proprietary); a from-scratch spectral pilot confirmed it does not separate real
content. See CLAUDE.md for the full evaluation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SD-1.5 dreamshaper at 768 px did not defeat SynthID v2 on Gemini 3 Pro
outputs (verified May 2026 via Gemini app's "Verify with SynthID"). Switch
the default invisible engine to SDXL at 1024 px, matching the raiw-app
production config (strength 0.05, steps 50). Drop the SD-1.5 pipeline.
Metadata layer: add C2PA UUID and IPTC AI marker byte-scan detection
across all formats, plus an ISOBMFF box walker (noai/isobmff.py) that
strips top-level C2PA uuid and JUMBF jumb boxes from AVIF/HEIF/JPEG-XL
containers without re-encoding.
README gets a Legal table and a Threat-model section about SynthID v2's
136-bit payload. CLAUDE.md tracks the SD-1.5 regression as historical
context.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- CLI with visible, invisible, all, metadata, and batch commands
- Gemini watermark removal via reverse alpha blending
- Invisible watermark removal via diffusion regeneration (SynthID, TreeRing)
- AI metadata stripping (EXIF, PNG text, C2PA)
- Face protection (YOLO/Haar) and analog humanizer
- 137 tests covering all CLI modes and core engines
- Ruff and Pyright clean