Mirror of https://github.com/wiltodelta/remove-ai-watermarks.git (synced 2026-07-05 16:07:49 +02:00)
lower(strength): drop vendor-adaptive floor to OpenAI 0.10 / Google 0.15
A 2026-06-14 oracle re-test on the deployed Modal controlnet worker (v0.10.0) cleared SynthID at OpenAI 0.10 (2 photoreal) and Google 0.15 (2 native 2816x1536, retiring the "native >= 0.30" guess), while a pixel sweep showed the 2026-06-04 cert floors (0.20/0.30) over-regenerated for no efficacy gain (Google MAE -20% at 0.15).

Lowers OPENAI_STRENGTH 0.20->0.10, GEMINI_STRENGTH and UNKNOWN_STRENGTH 0.30->0.15. Caveats documented in watermark_profiles.py + docs: removal near this floor is seed-non-deterministic (a service must pin a verified seed), and the n=2 re-test did not cover flat-graphic hard cases.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -71,11 +71,11 @@ Generic HuggingFace detectors (`Organika/sdxl-detector` Swin Transformer, `umm-m
## Default strength is vendor-adaptive, one ladder for both pipelines
-**DEFAULT STRENGTH IS VENDOR-ADAPTIVE, ONE LADDER FOR BOTH PIPELINES (raised + unified 2026-06-09; vendor-adaptive since 2026-06-01, SUPERSEDES every fixed-default claim in this bullet and the next).**
+**DEFAULT STRENGTH IS VENDOR-ADAPTIVE, ONE LADDER FOR BOTH PIPELINES (LOWERED 2026-06-14; raised + unified 2026-06-09; vendor-adaptive since 2026-06-01, SUPERSEDES every fixed-default claim in this bullet and the next).**
-`resolve_strength(strength, vendor)` + `vendor_for_strength(path)` (`watermark_profiles.py`) read the C2PA issuer (`metadata.synthid_source`) on the ORIGINAL input and pick `OPENAI_STRENGTH` **0.20** / `GEMINI_STRENGTH` **0.30** / `UNKNOWN_STRENGTH` **0.30** when `--strength` is unset; explicit `--strength` always wins.
+`resolve_strength(strength, vendor)` + `vendor_for_strength(path)` (`watermark_profiles.py`) read the C2PA issuer (`metadata.synthid_source`) on the ORIGINAL input and pick `OPENAI_STRENGTH` **0.10** / `GEMINI_STRENGTH` **0.15** / `UNKNOWN_STRENGTH` **0.15** when `--strength` is unset; explicit `--strength` always wins.
-**The SAME ladder applies to BOTH pipelines** (`sdxl` and `controlnet`) -- these are the 2026-06-04 Modal-cert controlnet floors.
+**The SAME ladder applies to BOTH pipelines** (`sdxl` and `controlnet`). **2026-06-14: lowered from the 2026-06-04 cert floors (OpenAI 0.20 / Google 0.30) back toward the original 2026-06-01 study (OpenAI ~0.05-0.10 / Google 0.15).** A re-test on the deployed Modal controlnet worker cleared SynthID on the oracle at OpenAI 0.10 (2 photoreal, 1402/1448 px) and Google 0.15 (2 NATIVE 2816x1536 images -- retiring the "native ~2816 likely needs >=0.30" guess), while a pixel sweep showed 0.20/0.30 over-regenerated for no efficacy gain (Google MAE -20% at 0.15). See `watermark_profiles.py` "Data basis". **CAVEATS that stand:** (1) removal near this floor is SEED-NON-DETERMINISTIC (the 2026-06-09 finding below) -- a SERVICE on this ladder must pin a fixed, oracle-verified seed, not rely on a random one; (2) the re-test is n=2 per vendor on photoreal/landscape, NOT flat graphics (the `sdxl` weak spot), so raise `--strength` if an oracle reads SynthID on a flat output.
**Why one ladder (NOT a per-pipeline split):** the cert was run on controlnet and does NOT transfer to `sdxl` by symmetry (opposite hard cases -- controlnet leaves SynthID on photoreal, `sdxl` on flat graphics), BUT on its OWN hard case (flat fills) `sdxl` is the WEAKER remover (plain img2img barely perturbs a flat region at low strength), so it needs AT LEAST controlnet's strength -- hence the certified floor is the right floor for `sdxl` too. It is a MARGIN argument for `sdxl`, not a fresh certification (no local SynthID detector to self-verify); raise `--strength` if an oracle still reads a flat `sdxl` output. The higher strength costs little quality because `controlnet` is now the default pipeline AND the only `--auto` pick, so `sdxl` is reached only via an explicit `--pipeline sdxl` (a deliberate opt-down for inputs without faces/text), where over-regeneration has nothing to damage. (A short-lived per-pipeline split ladder -- `sdxl` 0.15/0.20 vs controlnet 0.20/0.30 -- existed on 2026-06-09 before being unified the same day; the `resolve_strength` `pipeline` param and the `CONTROLNET_*_STRENGTH` constants were removed.) The CLI detects the vendor from the pristine source (before the visible pass / metadata-strip removes C2PA from the temp file) and passes it to display calls so display and execution agree; `cmd_invisible`/`cmd_all`/`batch` thread `vendor`.
+12
-3
@@ -453,9 +453,18 @@ study (section 2.2) gives empirical floors:
The default is **vendor-adaptive** (`watermark_profiles.resolve_strength` +
`vendor_for_strength`): the tool reads the C2PA issuer on the original input and picks
-`OPENAI_STRENGTH` 0.20 / `GEMINI_STRENGTH` 0.30 / `UNKNOWN_STRENGTH` 0.30. **The SAME
-ladder applies to both pipelines** (`sdxl` and `controlnet`) -- these are the
-oracle-certified controlnet floors (§5.5, the 2026-06-04 Modal cert). Why one ladder
+`OPENAI_STRENGTH` 0.10 / `GEMINI_STRENGTH` 0.15 / `UNKNOWN_STRENGTH` 0.15 **(LOWERED
+2026-06-14 from the 2026-06-04 cert floors 0.20/0.30/0.30)**. **The SAME ladder applies
+to both pipelines** (`sdxl` and `controlnet`). The 2026-06-14 re-test on the deployed
+Modal controlnet worker (v0.10.0) cleared SynthID on the oracle at OpenAI 0.10 (2
+photoreal) and Google 0.15 (2 NATIVE 2816x1536, contradicting the "native >= 0.30" guess
+on line above), and a pixel sweep showed 0.20/0.30 over-regenerated for no efficacy gain.
+**This re-opens a genuine tension with the 2026-06-04 pass, which found photoreal STILL
+detected after controlnet at 0.10/0.15 (lines above):** either the v0.10.0 controlnet
+default improved the floor, or n=2 landed on the lucky side of the seed-non-determinism
+(§5.5). So a SERVICE on this ladder MUST pin a fixed, oracle-verified seed (not random),
+and flat-graphic hard cases (NOT in the n=2 re-test) still need a per-content oracle
+recheck -- raise `--strength` there. The prior cert floors are the §5.5 record. Why one ladder
covers plain `sdxl` too: the certification was run on controlnet and does NOT transfer
by symmetry (the two pipelines have OPPOSITE hard cases -- controlnet leaves SynthID on
photoreal, `sdxl` on flat graphics, the §5.1 content-x-pipeline table), BUT on its own
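
For readers skimming the diff, the default-selection rule that both hunks describe (explicit `--strength` wins, otherwise a per-vendor default) can be sketched roughly as follows. This is a minimal illustration and not the code in `watermark_profiles.py`: the constant names, their 2026-06-14 values, and the `resolve_strength(strength, vendor)` signature come from the text above, while the function body and the vendor labels `"openai"` / `"google"` are assumptions made for the example.

```python
from typing import Optional

# Values as stated for the 2026-06-14 ladder in the diff above.
OPENAI_STRENGTH = 0.10
GEMINI_STRENGTH = 0.15
UNKNOWN_STRENGTH = 0.15

# ASSUMPTION: these vendor labels are hypothetical; the real values returned by
# vendor_for_strength(path) are not shown in the diff.
_VENDOR_DEFAULTS = {
    "openai": OPENAI_STRENGTH,
    "google": GEMINI_STRENGTH,
}


def resolve_strength(strength: Optional[float], vendor: Optional[str]) -> float:
    """Return the explicit strength if one was given, else the vendor default."""
    # An explicit value always wins, including an explicit 0.0.
    if strength is not None:
        return strength
    # Normalise the label; anything unrecognised falls back to UNKNOWN_STRENGTH.
    key = (vendor or "").strip().lower()
    return _VENDOR_DEFAULTS.get(key, UNKNOWN_STRENGTH)
```

Under these assumptions, `resolve_strength(None, "openai")` yields the OpenAI default and `resolve_strength(0.4, "openai")` returns the caller's 0.4 unchanged. A table lookup with a single fallback keeps "one ladder for both pipelines" visible in the code, since no `pipeline` argument takes part in the decision.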