docs(synthid): correct protect_text guidance -- it does NOT block removal (keep ON)

An A/B test at strength 0.3 on a real e-commerce infographic (updated GPU study)
reverses the earlier claim. SynthID is a GLOBAL watermark, so 0.3 removes it
whether protect_text is on or off, and protection SALVAGES text fidelity (with it
on, medium headings and body text stay readable; with it off, they garble). The
earlier guidance, 'protect_text shields the watermark, use --no-protect-text', was
wrong -- it mistook the 0.10 strength failure for a protection effect. Recommended
SynthID config: ~0.3 + protect_text ON (the default). Also document the oracle
scope: the Gemini app's 'Verify with SynthID' is the only valid SynthID oracle;
openai.com/verify is provenance-scoped (C2PA) and does NOT measure SynthID.
Corrects the CLAUDE.md, README, and watermark_profiles comments shipped in cddbaf6.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Victor Kuznetsov
2026-05-31 16:50:13 -07:00
parent cddbaf6413
commit b991b11a19
3 changed files with 10 additions and 7 deletions
+1 -1
File diff suppressed because one or more lines are too long
+1 -1
@@ -116,7 +116,7 @@ image → encode to latent space (VAE) at native resolution
> **Default strength is `0.30`, tuned to remove the current Google SynthID.** An oracle-verified study (fresh Gemini images, "Verify with SynthID") found the current SynthID survives `0.10`/`0.15`/`0.20` and clears only at `0.30`. SynthID is a moving target (the threshold has climbed `0.05`→`0.10`→`~0.30` as Google hardens it), and there is no local SynthID detector, so the tool cannot self-check and auto-tune. If the oracle still reads SynthID, raise `--strength` further; if you care more about preserving fine text, lower it. `0.30` softens dense typography somewhat, so use the lowest value that comes back clean on the oracle.
>
-> **For SynthID in text, also pass `--no-protect-text`.** Text protection preserves text regions, but SynthID hides in them, so on text-heavy images the watermark can survive inside text at `0.30` unless protection is off. This trades text crispness for full removal — a genuine tradeoff, not a bug.
+> **Keep text protection on (the default) — it does not block SynthID removal.** SynthID is a global watermark, so strength `0.30` clears it whether or not text is protected, and text protection keeps headings and body text readable through the pass (only the very finest print still softens at `0.30`). You do not need to disable it for removal; `--no-protect-text` only trades text quality for a faster run.
>
> **OpenAI / ChatGPT images do not carry Google SynthID** (they use C2PA metadata, stripped by the metadata step), so `0.30` is overkill there; `--strength 0.10` preserves quality and the metadata strip is what matters.
>
@@ -18,11 +18,14 @@ CTRLREGEN_MODEL_ID = "yepengliu/ctrlregen"
# treat this as a moving target and re-test against fresh Gemini output periodically.
# Cost of 0.3: SSIM ~0.97 vs original (modest), but fine/dense typography softens, and
# it is OVERKILL for non-SynthID sources (OpenAI/ChatGPT carry C2PA, not Google SynthID
-# -- 0.10 is plenty there). Two known tensions, documented but not auto-handled here:
-# (1) higher strength deforms text more (why text protection runs by default), and
-# (2) `protect_text` SHIELDS the text regions where SynthID hides, so text-region
-# SynthID can survive at 0.3 unless `--no-protect-text` is passed. (Fixed LOW/MEDIUM/
-# HIGH presets were removed -- the one knob is this default + the per-call override.)
+# -- 0.10 is plenty there). protect_text is RECOMMENDED ON for SynthID removal (A/B
+# verified 2026-05-31): SynthID is GLOBAL, so 0.3 clears it whether protection is on or
+# off, and protection salvages medium-text fidelity (~3x runtime); only the very finest
+# text still softens at 0.3. (An earlier comment claimed protect_text shields the
+# watermark -- that was wrong, it mistook the 0.10 strength failure for a protection
+# effect.) The only true tension is the finest typography softening at this aggressive
+# strength. (Fixed LOW/MEDIUM/HIGH presets were removed -- the one knob is this default
+# plus the per-call override.)
DEFAULT_STRENGTH = 0.30
# CtrlRegen removes watermarks by regenerating from (near) clean Gaussian noise,