diff --git a/docs/known-limitations.md b/docs/known-limitations.md
index 0703278..cc18f00 100644
--- a/docs/known-limitations.md
+++ b/docs/known-limitations.md
@@ -71,11 +71,11 @@ Generic HuggingFace detectors (`Organika/sdxl-detector` Swin Transformer, `umm-m
 ## Default strength is vendor-adaptive, one ladder for both pipelines

-**DEFAULT STRENGTH IS VENDOR-ADAPTIVE, ONE LADDER FOR BOTH PIPELINES (raised + unified 2026-06-09; vendor-adaptive since 2026-06-01, SUPERSEDES every fixed-default claim in this bullet and the next).**
+**DEFAULT STRENGTH IS VENDOR-ADAPTIVE, ONE LADDER FOR BOTH PIPELINES (LOWERED 2026-06-14; raised + unified 2026-06-09; vendor-adaptive since 2026-06-01, SUPERSEDES every fixed-default claim in this bullet and the next).**

-`resolve_strength(strength, vendor)` + `vendor_for_strength(path)` (`watermark_profiles.py`) read the C2PA issuer (`metadata.synthid_source`) on the ORIGINAL input and pick `OPENAI_STRENGTH` **0.20** / `GEMINI_STRENGTH` **0.30** / `UNKNOWN_STRENGTH` **0.30** when `--strength` is unset; explicit `--strength` always wins.
+`resolve_strength(strength, vendor)` + `vendor_for_strength(path)` (`watermark_profiles.py`) read the C2PA issuer (`metadata.synthid_source`) on the ORIGINAL input and pick `OPENAI_STRENGTH` **0.10** / `GEMINI_STRENGTH` **0.15** / `UNKNOWN_STRENGTH` **0.15** when `--strength` is unset; explicit `--strength` always wins.

-**The SAME ladder applies to BOTH pipelines** (`sdxl` and `controlnet`) -- these are the 2026-06-04 Modal-cert controlnet floors.
+**The SAME ladder applies to BOTH pipelines** (`sdxl` and `controlnet`). **2026-06-14: lowered from the 2026-06-04 cert floors (OpenAI 0.20 / Google 0.30) back toward the original 2026-06-01 study (OpenAI ~0.05-0.10 / Google 0.15).** A re-test on the deployed Modal controlnet worker cleared SynthID on the oracle at OpenAI 0.10 (2 photoreal, 1402/1448 px) and Google 0.15 (2 NATIVE 2816x1536 images -- retiring the "native ~2816 likely needs >=0.30" guess), while a pixel sweep showed 0.20/0.30 over-regenerated for no efficacy gain (Google MAE -20% at 0.15). See `watermark_profiles.py` "Data basis". **CAVEATS that stand:** (1) removal near this floor is SEED-NON-DETERMINISTIC (the 2026-06-09 finding below) -- a SERVICE on this ladder must pin a fixed, oracle-verified seed, not rely on a random one; (2) the re-test is n=2 per vendor on photoreal/landscape, NOT flat graphics (the `sdxl` weak spot), so raise `--strength` if an oracle reads SynthID on a flat output.
 **Why one ladder (NOT a per-pipeline split):** the cert was run on controlnet and does NOT transfer to `sdxl` by symmetry (opposite hard cases -- controlnet leaves SynthID on photoreal, `sdxl` on flat graphics), BUT on its OWN hard case (flat fills) `sdxl` is the WEAKER remover (plain img2img barely perturbs a flat region at low strength), so it needs AT LEAST controlnet's strength -- hence the certified floor is the right floor for `sdxl` too. It is a MARGIN argument for `sdxl`, not a fresh certification (no local SynthID detector to self-verify); raise `--strength` if an oracle still reads a flat `sdxl` output. The higher strength costs little quality because `controlnet` is now the default pipeline AND the only `--auto` pick, so `sdxl` is reached only via an explicit `--pipeline sdxl` (a deliberate opt-down for inputs without faces/text), where over-regeneration has nothing to damage.
 (A short-lived per-pipeline split ladder -- `sdxl` 0.15/0.20 vs controlnet 0.20/0.30 -- existed on 2026-06-09 before being unified the same day; the `resolve_strength` `pipeline` param and the `CONTROLNET_*_STRENGTH` constants were removed.) The CLI detects the vendor from the pristine source (before the visible pass / metadata-strip removes C2PA from the temp file) and passes it to display calls so display and execution agree; `cmd_invisible`/`cmd_all`/`batch` thread `vendor`.
diff --git a/docs/synthid.md b/docs/synthid.md
index b42bd6c..6a61345 100644
--- a/docs/synthid.md
+++ b/docs/synthid.md
@@ -453,9 +453,18 @@ study (section 2.2) gives empirical floors:

 The default is **vendor-adaptive** (`watermark_profiles.resolve_strength` +
 `vendor_for_strength`): the tool reads the C2PA issuer on the original input and picks
-`OPENAI_STRENGTH` 0.20 / `GEMINI_STRENGTH` 0.30 / `UNKNOWN_STRENGTH` 0.30. **The SAME
-ladder applies to both pipelines** (`sdxl` and `controlnet`) -- these are the
-oracle-certified controlnet floors (§5.5, the 2026-06-04 Modal cert). Why one ladder
+`OPENAI_STRENGTH` 0.10 / `GEMINI_STRENGTH` 0.15 / `UNKNOWN_STRENGTH` 0.15 **(LOWERED
+2026-06-14 from the 2026-06-04 cert floors 0.20/0.30/0.30)**. **The SAME ladder applies
+to both pipelines** (`sdxl` and `controlnet`). The 2026-06-14 re-test on the deployed
+Modal controlnet worker (v0.10.0) cleared SynthID on the oracle at OpenAI 0.10 (2
+photoreal) and Google 0.15 (2 NATIVE 2816x1536, contradicting the "native >= 0.30" guess
+on line above), and a pixel sweep showed 0.20/0.30 over-regenerated for no efficacy gain.
+**This re-opens a genuine tension with the 2026-06-04 pass, which found photoreal STILL
+detected after controlnet at 0.10/0.15 (lines above):** either the v0.10.0 controlnet
+default improved the floor, or n=2 landed on the lucky side of the seed-non-determinism
+(§5.5). So a SERVICE on this ladder MUST pin a fixed, oracle-verified seed (not random),
+and flat-graphic hard cases (NOT in the n=2 re-test) still need a per-content oracle
+recheck -- raise `--strength` there. The prior cert floors are the §5.5 record. Why one ladder
 covers plain `sdxl` too: the certification was run on controlnet and does NOT transfer
 by symmetry (the two pipelines have OPPOSITE hard cases -- controlnet leaves SynthID on
 photoreal, `sdxl` on flat graphics, the §5.1 content-x-pipeline table), BUT on its own
diff --git a/src/remove_ai_watermarks/noai/watermark_profiles.py b/src/remove_ai_watermarks/noai/watermark_profiles.py
index b63a0d2..96c9995 100644
--- a/src/remove_ai_watermarks/noai/watermark_profiles.py
+++ b/src/remove_ai_watermarks/noai/watermark_profiles.py
@@ -37,12 +37,22 @@ CONTROLNET_CANNY_MODEL = "xinsir/controlnet-canny-sdxl-1.0"
 # applies to BOTH pipelines (`sdxl` plain img2img and `controlnet`) -- see "why one
 # ladder" below.
 #
-# Data basis (see docs/synthid.md sections 2.2 / 5.5): the values are the ORACLE-
-# CERTIFIED controlnet floors (2026-06-04, isolated Modal cert app, each vendor on its
-# own verifier): OpenAI 0.20 (2 photoreal x 3 seeds = 6/6 clean, resolution-independent),
-# Google 0.30 (clean on 2/2 seeds, validated ONLY at <= 1536 -- Gemini is resolution-
-# sensitive, native ~2816 likely needs ~0.35+). Unknown vendor gets the Google (more
-# robust watermark) value: safe-by-default.
+# Data basis (see docs/synthid.md sections 2.2 / 5.5): ORACLE-CERTIFIED controlnet floors.
+# A 2026-06-14 re-test on the deployed Modal worker (the production controlnet pipeline)
+# LOWERED the ladder back to OpenAI 0.10 / Google 0.15: each output verified on its own
+# oracle (openai.com/verify for OpenAI, the Google Gemini app for Google), all clean ->
+# - OpenAI 0.10: 2 photoreal images (1402 / 1448 px), SynthID not found on either.
+# - Google 0.15: 2 NATIVE-resolution images (both 2816x1536), SynthID not found on
+# either -- this directly retires the earlier "native ~2816 likely needs ~0.35+"
+# guess, which was speculative and never oracle-checked at that resolution.
+# This supersedes the 2026-06-04 cert (OpenAI 0.20 / Google 0.30), whose higher floor a
+# pixel-fidelity sweep showed was ~2x the removal floor and over-regenerated for no
+# efficacy gain (Google MAE -20% at 0.15 vs 0.30, no SynthID returning). Unknown vendor
+# tracks the Google (more robust watermark) value -> 0.15, still safe-by-default and the
+# floor that real (no-vendor) photos hit, so it also minimizes damage when there is in
+# fact nothing to remove. CAVEAT: the re-test is n=2 per vendor on photoreal / landscape
+# content; FLAT-GRAPHIC hard cases (the historical `sdxl` weak spot) were NOT in the
+# sample, so if an oracle still reads SynthID on a flat output, raise `--strength`.
 #
 # Why ONE ladder for both pipelines (2026-06-09): the certification was run on
 # controlnet, and it does NOT transfer to `sdxl` by symmetry -- the two pipelines have
@@ -57,9 +67,9 @@ CONTROLNET_CANNY_MODEL = "xinsir/controlnet-canny-sdxl-1.0"
 # this is a MARGIN argument for `sdxl`, not a fresh certification -- there is no local
 # SynthID detector, so if an oracle still reads SynthID on a flat `sdxl` output, raise
 # `--strength`.
-OPENAI_STRENGTH = 0.20
-GEMINI_STRENGTH = 0.30
-UNKNOWN_STRENGTH = 0.30
+OPENAI_STRENGTH = 0.10
+GEMINI_STRENGTH = 0.15
+UNKNOWN_STRENGTH = 0.15
 # Backwards-compatible alias: the vendor-unknown value (what a caller gets without a
 # detected vendor). Kept as DEFAULT_STRENGTH for existing references.
 DEFAULT_STRENGTH = UNKNOWN_STRENGTH
diff --git a/tests/test_platform.py b/tests/test_platform.py
index 2e5d3b3..463e62b 100644
--- a/tests/test_platform.py
+++ b/tests/test_platform.py
@@ -140,11 +140,12 @@ class TestResolveStrength:
         assert resolve_strength(None, "adobe") == UNKNOWN_STRENGTH

     def test_ladder_is_the_certified_controlnet_floors(self):
-        # The unified ladder == the oracle-certified controlnet floors (OpenAI 0.20,
-        # Google/unknown 0.30); Google is the more-robust watermark, so it is higher.
-        assert OPENAI_STRENGTH == 0.20
-        assert GEMINI_STRENGTH == 0.30
-        assert UNKNOWN_STRENGTH == 0.30
+        # The unified ladder == the oracle-certified controlnet floors. Lowered on the
+        # 2026-06-14 Modal re-test (OpenAI 0.10, Google/unknown 0.15); Google is the
+        # more-robust watermark, so it is higher.
+        assert OPENAI_STRENGTH == 0.10
+        assert GEMINI_STRENGTH == 0.15
+        assert UNKNOWN_STRENGTH == 0.15
         assert OPENAI_STRENGTH < GEMINI_STRENGTH

     def test_default_strength_alias_is_unknown_vendor_value(self):
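
Reviewer note: the diff changes only the three constants, so it may help to see the resolution order they feed into. The block below is a minimal sketch of the behaviour the docs describe (explicit `--strength` always wins, otherwise the vendor picks the rung, otherwise the unknown-vendor value). It is not the actual body of `resolve_strength` in `watermark_profiles.py`, which this diff does not show. The function name `resolve_strength_sketch`, the `_VENDOR_LADDER` mapping, and the vendor labels `"openai"` and `"google"` are assumptions made for illustration; only the constant values and the `(strength, vendor)` argument order come from the diff.

```python
from typing import Optional

# Constant values as set by this diff.
OPENAI_STRENGTH = 0.10
GEMINI_STRENGTH = 0.15
UNKNOWN_STRENGTH = 0.15
DEFAULT_STRENGTH = UNKNOWN_STRENGTH  # backwards-compatible alias

# ASSUMPTION: the vendor labels below are hypothetical; the real labels
# returned by vendor_for_strength() are not visible in this diff.
_VENDOR_LADDER = {
    "openai": OPENAI_STRENGTH,
    "google": GEMINI_STRENGTH,
}


def resolve_strength_sketch(strength: Optional[float], vendor: Optional[str]) -> float:
    """Illustrative resolution order, not the project's implementation."""
    # 1. An explicit --strength always wins.
    if strength is not None:
        return strength
    # 2. Otherwise use the vendor's rung; any unrecognised or missing
    #    vendor falls back to the unknown-vendor value.
    return _VENDOR_LADDER.get((vendor or "").lower(), UNKNOWN_STRENGTH)
```

Under these assumptions, an unrecognised issuer such as `"adobe"` resolves to `UNKNOWN_STRENGTH`, matching the existing test, and an explicit value is returned unchanged regardless of vendor.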