feat(invisible): vendor-adaptive default strength (OpenAI 0.10 / Google 0.15)

The default img2img strength is now chosen from the detected SynthID vendor (C2PA issuer) instead of a single fixed 0.30:

- OpenAI gpt-image -> 0.10
- Google Gemini -> 0.15
- unknown source -> 0.15

Explicit --strength always wins.

Basis: an oracle-verified June 2026 controlled study (clean v0.8.6, text/face protection OFF, per-image openai.com/verify or Gemini-app verdict). OpenAI's SynthID clears at 0.05 across 1024-1600 px (n=4, resolution-independent); Google's is ~3x more robust and needs 0.15 on the capped-1536 path (n=4). The dominant factor is the VENDOR, not resolution. The earlier single 0.30 default and the "resolution dependence" lore came from contaminated tests run with the protect-text bug ON (issue #14) -- re-running those same 1600x1600 images clean removes SynthID at 0.05.

`vendor_for_strength(path)` reads metadata.synthid_source on the ORIGINAL input and is threaded through cli (invisible/all/batch) -> invisible_engine -> watermark_remover -> resolve_strength(strength, profile, vendor), so display and execution use the same vendor. The engine sees a temp path whose C2PA the visible pass already stripped, so detection must happen in the CLI on the pristine source.

Caveat: Google's 0.15 was validated only on --max-resolution 1536; native 2816 Gemini was not locally measurable (OOM on Apple Silicon) and is pending GPU validation on raiw.cc.

Docs: docs/synthid.md sections 2.2/4.4/5.2 corrected (the contaminated resolution-dependence findings replaced with the clean oracle-verified table); README and CLAUDE.md updated; CLI --strength help reflects the adaptive default.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-20 06:37:20 +02:00 · 2026-06-01 19:29:47 -07:00
parent 1708857772
commit 96038f960f
8 changed files with 243 additions and 87 deletions
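The vendor normalization described in the commit message can be sketched in isolation. This is a minimal illustration, not the repository's code: `normalize_issuer` is a hypothetical helper that takes the issuer string directly instead of reading it from a file, and it only mirrors the substring checks and the "Google wins when both appear" rule that the diff's `vendor_for_strength` documents.

```python
from __future__ import annotations

from typing import Literal, Optional

Vendor = Literal["openai", "google"]


def normalize_issuer(issuer: Optional[str]) -> Optional[Vendor]:
    """Map a raw issuer string to "openai" / "google" / None.

    Hypothetical helper for illustration only: the real function in the diff
    reads the issuer from image metadata; here it is passed in as a string.
    """
    text = (issuer or "").lower()
    # Google is checked first, so a string naming both issuers resolves to
    # the vendor that needs the higher default.
    if "google" in text:
        return "google"
    if "openai" in text:
        return "openai"
    # Missing metadata or any other issuer is treated as an unknown vendor.
    return None


if __name__ == "__main__":
    for raw in ("OpenAI", "Google", "OpenAI, Google", "Adobe", None):
        print(repr(raw), "->", normalize_issuer(raw))
```

Checking Google before OpenAI is what makes the ambiguous case resolve to Google; swapping the two `if` blocks would silently change that.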
@@ -108,19 +108,18 @@ The removal pipeline (default profile, SDXL):
 ```text
 image → encode to latent space (VAE) at native resolution
      → add controlled noise (forward diffusion)
-      → denoise (reverse diffusion, ~50 steps at strength 0.30)
+      → denoise (reverse diffusion, ~50 steps; strength is vendor-adaptive:
+        0.10 OpenAI / 0.15 Google / 0.15 unknown, override with --strength)
      → decode back to pixels (VAE)
 ```

 - Native resolution avoids shrinking the input to 1024 px first; that down-then-up round-trip was the main quality loss (issue #10). Use `--max-resolution N` only to cap GPU/MPS memory on very large inputs.

-> **Default strength is `0.30`, tuned to remove the current Google SynthID.** An oracle-verified study (fresh Gemini images, "Verify with SynthID") found the current SynthID survives `0.10`/`0.15`/`0.20` and clears only at `0.30`. SynthID is a moving target (the threshold has climbed `0.05` → `0.10` → `~0.30` as Google hardens it), and there is no local SynthID detector, so the tool cannot self-check and auto-tune. If the oracle still reads SynthID, raise `--strength` further; if you care more about preserving fine text, lower it. `0.30` softens dense typography somewhat, so use the lowest value that comes back clean on the oracle.
+> **Default strength is vendor-adaptive (no flag needed).** The tool reads the C2PA issuer to detect which vendor's SynthID is present and picks the strength that clears it with the least quality loss: **OpenAI gpt-image → `0.10`**, **Google Gemini → `0.15`**, **unknown source → `0.15`**. An oracle-verified June 2026 study (clean pipeline, per-image openai.com/verify or Gemini app) found OpenAI's watermark clears at `0.05` across `1024`-`1600` px (resolution-independent) while Google's is ~3x more robust and needs `0.15`. The dominant factor is the vendor, not resolution. There is no local SynthID detector, so if the oracle still reads SynthID, raise `--strength`; if you care more about preserving fine text, lower it. (Caveat: Google's `0.15` was validated on the capped `--max-resolution 1536` path; a very large native Gemini image may need more.)
 >
 > **Text and face protection are OFF by default.** The high-resolution text re-scrub can shield SynthID in text regions, leaving the watermark intact there even after the global pass clears it everywhere else (verified June 2026: same image, with `--protect-text` → SynthID detected; without → SynthID removed). Both features are opt-in with `--protect-text` / `--protect-faces` and considered **experimental**. If you enable them, verify the result with the oracle.
 >
-> **OpenAI / ChatGPT images do not carry Google SynthID** (they use C2PA metadata, stripped by the metadata step), so `0.30` is overkill there; `--strength 0.10` preserves quality and the metadata strip is what matters.
->
-> **`--pipeline ctrlregen` is experimental and not recommended.** On paper CtrlRegen ([ICLR 2025](https://github.com/yepengliu/CtrlRegen)) regenerates from near-clean Gaussian noise to defeat robust watermarks, but in testing on real images it **destroys content** — smooth and background regions fill with hallucinated micro-text — and it is heavy (several GB of extra models, minutes per image). It has no usable middle setting (too low removes nothing, high enough to remove wrecks the image), so the shippable path is the default SDXL pipeline at `~0.30`. CtrlRegen stays available for experimentation only.
+> **`--pipeline ctrlregen` is experimental and not recommended.** On paper CtrlRegen ([ICLR 2025](https://github.com/yepengliu/CtrlRegen)) regenerates from near-clean Gaussian noise to defeat robust watermarks, but in testing on real images it **destroys content** — smooth and background regions fill with hallucinated micro-text — and it is heavy (several GB of extra models, minutes per image). It has no usable middle setting (too low removes nothing, high enough to remove wrecks the image), so the shippable path is the default SDXL pipeline at the vendor-adaptive strength. CtrlRegen stays available for experimentation only.

 SDXL is the default since May 2026: empirically defeats SynthID v2 on Gemini 3 Pro outputs, where the older SD-1.5 pipeline at 768 px did not. The SD-1.5 path was removed once it was verified not to handle v2. Note the scope: this defeats the SynthID *verifier*, which is not the same as being forensically indistinguishable from a real photo. Recent work ([arXiv:2605.09203](https://arxiv.org/abs/2605.09203)) shows watermark-removal pipelines leave detectable traces, so a separate "this image was processed" classifier can still flag the output.

@@ -274,7 +273,9 @@ remove-ai-watermarks erase image.png --region 1640,1930,400,100 -o clean.png
 remove-ai-watermarks invisible image.png -o clean.png --humanize 4.0
 # Runs at native resolution by default. On a very large image that OOMs the
 # GPU/MPS, cap the long side: --max-resolution 2048
-# Text / CJK glyphs are preserved automatically; disable with --no-protect-text
+# Strength is vendor-adaptive by default (OpenAI 0.10 / Google 0.15); override
+# with --strength. Text/face protection is opt-in (--protect-text /
+# --protect-faces, experimental: they can shield SynthID).

 # Check / strip AI metadata (C2PA, EXIF, "Made with AI" labels)
 # --check also flags SynthID-bearing sources: a C2PA manifest signed by
@@ -169,29 +169,51 @@ Limitations section of the paper (Section 10) was not recoverable from the
 public HTML version of arXiv:2510.09263v1 due to a rendering failure in the
 conversion (the body text of Section 10 is absent from the HTML).

-**What is known empirically from independent work and our own testing:**
+**What is known empirically from our own oracle-verified testing.**

- **Diffusion regeneration / img2img** at sufficient strength degrades or
-  removes the watermark. Our testing (May-June 2026, Gemini oracle):
-  - strength 0.05: insufficient for current Gemini SynthID (survives)
-  - strength 0.10: removes Gemini SynthID (verified via Gemini app oracle, n=1)
-  - strength 0.30: current DEFAULT; removes Gemini SynthID (verified n=3 via
-    Gemini app oracle on fresh Gemini images, June 2026 oracle study)
-  - strength 0.30: **does NOT reliably remove OpenAI gpt-image SynthID on
-    1600x1600 images** (verified via openai.com/verify, issue #14 reports,
-    June 2026)
-  - strength 0.35 and 0.40: not yet oracle-verified on 1600x1600 gpt-image;
-    0.40 visibly corrupts text-heavy images
-  - **Resolution dependence confirmed**: same strength removes watermark on
-    small images (376x429) but not on large ones (1600x1600) -- larger images
-    appear to carry a stronger or more spatially distributed signal
-  - The production SynthID has been progressively hardened: 0.05 worked earlier
-    (pre-May 2026 Gemini), then 0.10 was needed, now 0.30 for Gemini and still
-    failing at 0.30 for 1600x1600 gpt-image. It is a moving target.
+A controlled study (June 2026, clean v0.8.6 with text/face protection OFF,
+native resolution on this repo's default SDXL pipeline) measured the minimum
+img2img strength that removes the SynthID pixel watermark, verified per image on
+the vendor's own oracle (openai.com/verify for OpenAI, the Gemini app "Verify
+with SynthID" for Google). The test set and per-image results are recorded in
+`data/synthid_corpus/` (manifest `verified_via` = `openai-verify` / `gemini-app`).

- **Heavy JPEG compression** (quality < ~50-60): not specifically tested with
-  oracle verification; the DL approach is more robust than DWT-DCT but Google
-  acknowledges limits at "extreme" manipulation.
+| Vendor | Images | Resolution(s) | Pipeline | Removed at |
+|--------|--------|---------------|----------|------------|
+| OpenAI (gpt-image) | n=4 | 1024x1536 .. 1600x1600 | native | **0.05** |
+| Google (Gemini)    | n=4 | 2816x1536 -> capped 1536 | `--max-resolution 1536` | **0.15** (0.05 and 0.10 do NOT clear) |
+
+**Two findings, both oracle-verified:**
+
+1. **Vendor is the dominant factor, not resolution.** Google's SynthID is
+   roughly 3x more robust than OpenAI's: at a comparable (small) working
+   resolution, OpenAI clears at 0.05 while Google needs 0.15. This matches
+   Google having hardened SynthID more aggressively over time.
+
+2. **OpenAI SynthID removal is resolution-independent in the tested range.**
+   All four OpenAI images (including a 1600x1600) cleared at 0.05.
+
+**CORRECTION (supersedes the earlier "resolution dependence" claim).** A prior
+version of this doc and CLAUDE.md stated that strength 0.30 failed to remove
+SynthID on 1600x1600 gpt-image and that removal was resolution-dependent. That
+was an **artifact of the text-protection bug** (issue #14): those tests ran a
+build where `protect_text` was ON by default, and the high-resolution text
+re-scrub re-introduced SynthID in the dense-text regions of the infographic
+images tested. Re-running the *same* 1600x1600 image on clean v0.8.6 (protect
+OFF) removes SynthID at **0.05**. The "large images resist removal" conclusion
+was false; the resistance was the protect-text shielding, now fixed (v0.8.5).
+
+**Open / not locally testable:**
+
+- **Native large Gemini (2816x1536, ~4.3 MP).** The Gemini floor of 0.15 was
+  measured on the *capped* (`--max-resolution 1536`) path, which is the
+  practical local route on Apple-Silicon (native 2816 OOMs / falls back to slow
+  CPU on a 32 GB M-series). Native large Gemini was not measured here; the
+  vendor and resolution effects would stack, so it plausibly needs >= 0.30 or a
+  discrete GPU. Confirm on a CUDA box if needed.
+- **Heavy JPEG compression** (quality < ~50-60): not oracle-tested; the DL
+  approach is more robust than DWT-DCT but Google acknowledges limits at
+  "extreme" manipulation.

 ### 2.3 Removal attacks and forensic detectability

@@ -331,15 +353,17 @@ empirically from oracle tests:

 - **Before May 2026 (Gemini)**: strength 0.05 removed the watermark
 - **May 2026 (Gemini)**: strength 0.05 insufficient; 0.10 required
- **Current (Gemini, June 2026)**: strength 0.10 insufficient for fresh images;
-  0.30 verified clean (Gemini app oracle, n=3, A100 GPU, native resolution)
- **Current (OpenAI gpt-image 1600x1600, June 2026)**: strength 0.30 still
-  detected by openai.com/verify (issue #14, user qw1212ss report)
+- **Current (Gemini, June 2026)**: on the capped 1536 path, 0.05 and 0.10 do
+  NOT clear; 0.15 clears (n=4, Gemini app oracle). See section 2.2.
+- **OpenAI (June 2026)**: clears at 0.05 across 1024-1600 (n=4, clean v0.8.6).
+  The earlier "0.30 still detected on 1600x1600" report (issue #14) was the
+  text-protection bug, not a hardening of the watermark -- see the correction in
+  section 2.2.

-The progression suggests Google has progressively hardened the watermark -- the
-embedding signal strength or spatial distribution has increased across model
-generations. No Google announcement confirms this; the observation is purely
-empirical from oracle tests.
+Google's SynthID is more robust than OpenAI's (vendor gap measured at ~3x
+strength), but the chronological "0.05 -> 0.10 -> 0.30" progression above
+conflates a real hardening trend with the now-debunked protect-text artifact;
+treat only the section 2.2 controlled numbers as authoritative.

 ---

@@ -373,16 +397,23 @@ watermark removal completeness, and always verify the result with the oracle.

 ### 5.2 Strength setting

-There is no single permanent correct strength. The default 0.30 was set based
-on the June 2026 oracle study (Gemini, n=3). Known gaps:
+There is no single permanent correct strength, but the controlled June 2026
+study (section 2.2) gives empirical floors:

- **OpenAI gpt-image at 1600x1600**: 0.30 does not clear it (oracle-verified,
-  June 2026). 0.35 and 0.40 untested with oracle. 0.40 visibly corrupts text.
- **Resolution matters**: the same strength that clears a 376x429 image fails
-  at 1600x1600 (qw1212ss observation, issue #14, multiple images)
+- **OpenAI**: 0.05 clears across 1024-1600 (n=4). 0.30 is large overkill here.
+- **Google (capped 1536)**: 0.15 (n=4); 0.05 and 0.10 do not clear.
+- **Google native 2816**: not locally measured; likely needs >= 0.30 (vendor +
+  resolution stack). Use a GPU or `--max-resolution 1536`.

-If the watermark survives at 0.30, the correct guidance is to try 0.35 then
-0.40, using the lowest value that reads clean on the vendor oracle.
+The default is **vendor-adaptive** (`watermark_profiles.resolve_strength` +
+`vendor_for_strength`): the tool reads the C2PA issuer on the original input and
+picks `OPENAI_STRENGTH` 0.10 / `GEMINI_STRENGTH` 0.15 / `UNKNOWN_STRENGTH` 0.15.
+This uses the vendor signal we DO have locally (the C2PA SynthID proxy) to avoid
+the overkill of a single high default on OpenAI images, without needing a local
+pixel detector. An explicit `--strength` always wins. If the watermark still
+survives (e.g. a large native Gemini beyond the capped-1536 validation), raise
+toward 0.30 then 0.35-0.40 (0.40 visibly corrupts dense text), using the lowest
+value that reads clean on the oracle.
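The precedence described above (explicit value, then profile, then vendor, then the unknown-vendor fallback) can be sketched as a small lookup. This is an illustrative sketch under stated assumptions, not the module itself: the constants copy the values quoted in this section, and `resolve_strength_sketch` is a hypothetical name chosen to keep it distinct from the real `resolve_strength`.

```python
from __future__ import annotations

from typing import Optional

# Values as quoted in this section; assumed here for illustration.
OPENAI_STRENGTH = 0.10
GEMINI_STRENGTH = 0.15
UNKNOWN_STRENGTH = 0.15
CTRLREGEN_DEFAULT_STRENGTH = 1.0

_VENDOR_STRENGTH = {"openai": OPENAI_STRENGTH, "google": GEMINI_STRENGTH}


def resolve_strength_sketch(
    strength: Optional[float], profile: str, vendor: Optional[str] = None
) -> float:
    """Return the strength to use, in order of precedence."""
    # 1. An explicit value always wins, including 0.0 (test `is None`,
    #    not truthiness).
    if strength is not None:
        return strength
    # 2. The ctrlregen profile ignores the vendor entirely.
    if profile == "ctrlregen":
        return CTRLREGEN_DEFAULT_STRENGTH
    # 3. Known vendor -> its default; anything else -> unknown default.
    return _VENDOR_STRENGTH.get(vendor or "", UNKNOWN_STRENGTH)


if __name__ == "__main__":
    print(resolve_strength_sketch(None, "default", "openai"))
    print(resolve_strength_sketch(None, "default", "google"))
    print(resolve_strength_sketch(None, "default", None))
    print(resolve_strength_sketch(0.0, "default", "google"))
    print(resolve_strength_sketch(None, "ctrlregen", "openai"))
```

Using `is not None` rather than `strength or default` is the design choice that matters here: it keeps an explicit `0.0` from being mistaken for "unset".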

 ### 5.3 Test methodology

@@ -18,7 +18,7 @@ from typing import TYPE_CHECKING, Any, Literal
 import click

 from remove_ai_watermarks import __version__, watermark_registry
-from remove_ai_watermarks.noai.watermark_profiles import resolve_strength
+from remove_ai_watermarks.noai.watermark_profiles import resolve_strength, vendor_for_strength

 if TYPE_CHECKING:
    from collections.abc import Generator
@@ -452,7 +452,8 @@ def cmd_erase(
    "--strength",
    type=float,
    default=None,
-    help="Denoising strength (0.0-1.0). Default: 0.30 (SDXL SynthID threshold); ctrlregen uses 1.0.",
+    help="Denoising strength (0.0-1.0). Default: vendor-adaptive (OpenAI 0.10 / Google 0.15 / "
+    "unknown 0.15, from the C2PA issuer); ctrlregen uses 1.0.",
 )
@click.option("--steps", type=int, default=50, help="Number of denoising steps. Default: 50.")
@click.option(
@@ -527,9 +528,12 @@ def cmd_invisible(
        progress_callback=progress_cb,
    )

+    # Detect the SynthID vendor from the ORIGINAL (before processing strips C2PA) so the
+    # displayed and executed strength agree on the vendor-adaptive default.
+    vendor = vendor_for_strength(source)
    console.print(f"  Input:    {source.name}")
    console.print(f"  Pipeline: {pipeline}")
-    console.print(f"  Strength: {resolve_strength(strength, pipeline)}  Steps: {steps}")
+    console.print(f"  Strength: {resolve_strength(strength, pipeline, vendor)}  Steps: {steps}")

    t0 = time.monotonic()
    result_path = engine.remove_watermark(
@@ -543,6 +547,7 @@ def cmd_invisible(
        protect_text=protect_text,
        protect_faces=protect_faces,
        max_resolution=max_resolution,
+        vendor=vendor,
    )
    elapsed = time.monotonic() - t0

@@ -689,7 +694,8 @@ def cmd_identify(ctx: click.Context, source: Path, no_visible: bool, as_json: bo
    "--strength",
    type=float,
    default=None,
-    help="Invisible watermark denoising strength. Default: 0.30 (SDXL); ctrlregen uses 1.0.",
+    help="Invisible watermark denoising strength. Default: vendor-adaptive "
+    "(OpenAI 0.10 / Google 0.15 / unknown 0.15); ctrlregen uses 1.0.",
 )
@click.option("--steps", type=int, default=50, help="Number of denoising steps for invisible removal.")
@click.option(
@@ -818,7 +824,11 @@ def cmd_all(
                progress_callback=progress_cb,
            )

-            console.print(f"    Strength: {resolve_strength(strength, pipeline)}  Steps: {steps}")
+            # Detect the vendor from the pristine ORIGINAL (`source`); `tmp_path` has
+            # already lost its C2PA to the visible-removal pass, so reading it would
+            # always resolve to the unknown-vendor default.
+            vendor = vendor_for_strength(source)
+            console.print(f"    Strength: {resolve_strength(strength, pipeline, vendor)}  Steps: {steps}")
            inv_engine.remove_watermark(
                image_path=tmp_path,
                output_path=tmp_path,
@@ -829,6 +839,7 @@ def cmd_all(
                protect_text=protect_text,
                protect_faces=protect_faces,
                max_resolution=max_resolution,
+                vendor=vendor,
            )
            console.print("    Invisible watermark removed")

@@ -941,6 +952,9 @@ def _process_batch_image(
                seed=seed,
                humanize=humanize,
                max_resolution=max_resolution,
+                # Detect the vendor from the pristine original (`img_path`), not the
+                # visible-processed `out_path` whose C2PA is already gone.
+                vendor=vendor_for_strength(img_path),
            )

    if mode in ("metadata", "all"):
@@ -128,6 +128,7 @@ class InvisibleEngine:
        protect_faces: bool = False,
        protect_text: bool = False,
        max_resolution: int = 0,
+        vendor: str | None = None,
    ) -> Path:
        """Remove invisible watermark from an image.

@@ -217,6 +218,7 @@ class InvisibleEngine:
                guidance_scale=guidance_scale,
                seed=seed,
                protect_text=protect_text,
+                vendor=vendor,
            )

            # Optional: Face restoration & Humanizer (Phase 2 - Post-processing)
@@ -5,28 +5,41 @@ Pure configuration and lookup functions with no ML dependencies.

 from __future__ import annotations

+from typing import TYPE_CHECKING, Literal
+
+if TYPE_CHECKING:
+    from pathlib import Path
+
 DEFAULT_MODEL_ID = "stabilityai/stable-diffusion-xl-base-1.0"
 CTRLREGEN_MODEL_ID = "yepengliu/ctrlregen"

-# Single default denoising strength for the SDXL img2img scrub, overridable from
-# the CLI (`--strength`). Raised 0.10 -> 0.30 after an oracle-verified GPU strength
-# study (2026-05-31, Modal A100, native res, Gemini-app "Verify with SynthID", n=3
-# FRESH Gemini images + protect_text/faces OFF): the CURRENT Google SynthID survives
-# 0.10/0.15/0.2 and is only REMOVED at 0.3 (0.3 is the threshold; 0.2 still present).
-# This supersedes the earlier n=1 "0.10 removes it" note, which is now stale -- Google
-# has hardened SynthID and the threshold has climbed 0.05 -> 0.10 -> ~0.3 over time, so
-# treat this as a moving target and re-test against fresh Gemini output periodically.
-# Cost of 0.3: SSIM ~0.97 vs original (modest), but fine/dense typography softens, and
-# it is OVERKILL for non-SynthID sources (OpenAI/ChatGPT carry C2PA, not Google SynthID
-# -- 0.10 is plenty there). protect_text is RECOMMENDED ON for SynthID removal (A/B
-# verified 2026-05-31): SynthID is GLOBAL, so 0.3 clears it whether protection is on or
-# off, and protection salvages medium-text fidelity (~3x runtime); only the very finest
-# text still softens at 0.3. (An earlier comment claimed protect_text shields the
-# watermark -- that was wrong, it mistook the 0.10 strength failure for a protection
-# effect.) The only true tension is the finest typography softening at this aggressive
-# strength. (Fixed LOW/MEDIUM/HIGH presets were removed -- the one knob is this default
-# plus the per-call override.)
-DEFAULT_STRENGTH = 0.30
+# Vendor-adaptive default denoising strength for the SDXL img2img scrub, overridable
+# from the CLI (`--strength`). The right strength depends on which vendor's SynthID is
+# present, detected from the C2PA issuer (metadata.synthid_source). Oracle-verified
+# controlled study (2026-06-01, clean v0.8.6 with protect_text/faces OFF, per-image
+# openai.com/verify or Gemini-app verdict; see docs/synthid.md section 2.2):
+#   - OpenAI gpt-image: removed at 0.05 across 1024-1600 (n=4), resolution-independent.
+#     OPENAI_STRENGTH 0.10 = the 0.05 floor plus a 2x margin (keeps quality high).
+#   - Google Gemini: removed at 0.15 on the capped-1536 path (n=4); 0.05/0.10 do NOT
+#     clear. GEMINI_STRENGTH 0.15. CAVEAT: 0.15 was validated only on
+#     `--max-resolution 1536`; native 2816 (the default path) was not locally
+#     measurable (OOM on Apple Silicon) and may need more -- pending GPU validation on
+#     the raiw.cc backend. If a native large Gemini still verifies positive at 0.15,
+#     raise `--strength`.
+#   - Unknown vendor (metadata stripped, or non-OpenAI/Google C2PA): UNKNOWN_STRENGTH
+#     0.15, equal to the Google value, so it clears both vendors at the tested
+#     resolutions.
+# The dominant factor is VENDOR, not resolution: Google's SynthID is ~3x more robust
+# than OpenAI's. The earlier single 0.30 default (and the "resolution dependence" lore)
+# came from contaminated tests run with protect_text ON -- see docs/synthid.md 2.2.
+OPENAI_STRENGTH = 0.10
+GEMINI_STRENGTH = 0.15
+UNKNOWN_STRENGTH = 0.15
+# Backwards-compatible alias: the vendor-unknown default (what a caller gets without a
+# detected vendor). Kept as DEFAULT_STRENGTH for existing references.
+DEFAULT_STRENGTH = UNKNOWN_STRENGTH
+
+# Detected-vendor -> default strength. Vendor strings come from `vendor_for_strength`.
+_VENDOR_STRENGTH = {"openai": OPENAI_STRENGTH, "google": GEMINI_STRENGTH}

 # CtrlRegen removes watermarks by regenerating from (near) clean Gaussian noise,
 # NOT by the light-touch partial-noise img2img the SDXL default uses. The research
@@ -52,18 +65,45 @@ DEFAULT_STRENGTH = 0.30
 CTRLREGEN_DEFAULT_STRENGTH = 1.0


-def resolve_strength(strength: float | None, profile: str) -> float:
-    """Resolve the denoising strength, applying the profile-specific default when unset.
+def resolve_strength(strength: float | None, profile: str, vendor: str | None = None) -> float:
+    """Resolve the denoising strength, applying the profile/vendor default when unset.

-    ``None`` means "the user did not pass ``--strength``": the SDXL default profile
-    resolves to ``DEFAULT_STRENGTH`` (the SynthID-removal default, ~0.3), while
-    ``ctrlregen`` resolves to ``CTRLREGEN_DEFAULT_STRENGTH`` (clean-noise regeneration).
-    An explicit value always wins. Shared by the CLI (for display) and the engine (for
-    execution) so the two never disagree.
+    ``None`` means "the user did not pass ``--strength``". ``ctrlregen`` resolves to
+    ``CTRLREGEN_DEFAULT_STRENGTH`` (clean-noise regeneration). The SDXL default profile
+    resolves **vendor-adaptively**: ``vendor`` (``"openai"`` / ``"google"`` / None, from
+    ``vendor_for_strength``) selects ``OPENAI_STRENGTH`` / ``GEMINI_STRENGTH`` /
+    ``UNKNOWN_STRENGTH``. An explicit value always wins (including ``0.0`` -- the check is
+    ``is None``, not falsiness). Shared by the CLI (for display) and the engine (for
+    execution) so the two never disagree -- both must pass the SAME ``vendor``.
    """
    if strength is not None:
        return strength
-    return CTRLREGEN_DEFAULT_STRENGTH if profile == "ctrlregen" else DEFAULT_STRENGTH
+    if profile == "ctrlregen":
+        return CTRLREGEN_DEFAULT_STRENGTH
+    return _VENDOR_STRENGTH.get(vendor or "", UNKNOWN_STRENGTH)
+
+
+def vendor_for_strength(image_path: Path) -> Literal["openai", "google"] | None:
+    """Detect the SynthID vendor for strength selection: ``"openai"`` / ``"google"`` / None.
+
+    Reads the C2PA SynthID proxy (``metadata.synthid_source``) on the ORIGINAL input,
+    so it must run before any pass that strips metadata. When both issuers appear (a
+    rare multi-sign anomaly) Google wins -- the more-robust watermark -> safer (higher)
+    strength. Returns None when metadata is stripped or the issuer is neither vendor,
+    which maps to ``UNKNOWN_STRENGTH``. Lazy-imports ``metadata`` to keep this module
+    dependency-light.
+    """
+    try:
+        from remove_ai_watermarks.metadata import synthid_source
+
+        src = (synthid_source(image_path) or "").lower()
+    except Exception:  # metadata unreadable -> treat as unknown vendor
+        return None
+    if "google" in src:
+        return "google"
+    if "openai" in src:
+        return "openai"
+    return None


 def get_model_id_for_profile(profile: str) -> str:
@@ -447,13 +447,15 @@ class WatermarkRemover:
        guidance_scale: float | None = None,
        seed: int | None = None,
        protect_text: bool = True,
+        vendor: str | None = None,
    ) -> Path:
        """Remove watermark from an image using regeneration attack.

        Args:
            image_path: Path to the watermarked image.
            output_path: Path for the cleaned image. If None, modifies in place.
-            strength: Denoising strength (0.0-1.0).
+            strength: Denoising strength (0.0-1.0). None -> the vendor-adaptive
+                default (see ``vendor``).
            num_inference_steps: Number of denoising steps.
            guidance_scale: Classifier-free guidance scale.
            seed: Random seed for reproducibility.
@@ -461,6 +463,11 @@ class WatermarkRemover:
                Diffusion when any are found (SDXL default profile only). On by
                default; the detector decides per image, and text-free inputs run
                the standard pass at no extra cost.
+            vendor: SynthID vendor (``"openai"`` / ``"google"`` / None) used to pick the
+                default strength when ``strength`` is None. Detect it from the ORIGINAL
+                input with ``watermark_profiles.vendor_for_strength`` before processing
+                strips the metadata; the caller passes it down so display and execution
+                agree.

        Returns:
            Path to the cleaned image.
@@ -475,7 +482,7 @@ class WatermarkRemover:
        if output_path is None:
            output_path = image_path

-        strength = resolve_strength(strength, self.model_profile)
+        strength = resolve_strength(strength, self.model_profile, vendor)

        if not 0.0 <= strength <= 1.0:
            raise ValueError(f"Strength must be between 0.0 and 1.0, got {strength}")
@@ -902,12 +909,16 @@ def remove_watermark(
 ) -> Path:
    """Convenience function to remove watermark from an image.

-    ``strength=None`` lets the profile pick its default (0.10 for SDXL, clean-noise
-    1.0 for ctrlregen); pass a value to override.
+    ``strength=None`` lets the profile pick its default: vendor-adaptive for SDXL
+    (0.10 OpenAI / 0.15 Google / 0.15 unknown, from the C2PA SynthID proxy on the
+    input), clean-noise 1.0 for ctrlregen. Pass a value to override.
    """
+    from remove_ai_watermarks.noai.watermark_profiles import vendor_for_strength
+
    remover = WatermarkRemover(model_id=model_id, device=device, hf_token=hf_token)
    return remover.remove_watermark(
        image_path=image_path,
        output_path=output_path,
        strength=strength,
+        vendor=vendor_for_strength(image_path),
    )
@@ -6,6 +6,7 @@ code paths work correctly on CPU, MPS (macOS), and CUDA (Linux/Windows).

 from __future__ import annotations

+from pathlib import Path
 from unittest.mock import MagicMock, patch

 import pytest
@@ -15,6 +16,9 @@ from remove_ai_watermarks.noai.utils import get_image_format, is_supported_forma
 from remove_ai_watermarks.noai.watermark_profiles import (
    CTRLREGEN_DEFAULT_STRENGTH,
    DEFAULT_STRENGTH,
+    GEMINI_STRENGTH,
+    OPENAI_STRENGTH,
+    UNKNOWN_STRENGTH,
    detect_model_profile,
    get_model_id_for_profile,
    resolve_strength,
@@ -125,19 +129,31 @@ class TestModelProfiles:


 class TestResolveStrength:
-    """resolve_strength applies the profile default only when strength is unset."""
+    """resolve_strength applies the profile/vendor default only when strength is unset."""

-    def test_none_default_profile_uses_sdxl_default(self):
-        assert resolve_strength(None, "default") == DEFAULT_STRENGTH
+    def test_none_default_profile_is_vendor_adaptive(self):
+        # No vendor -> unknown default; OpenAI lower, Google == unknown.
+        assert resolve_strength(None, "default") == UNKNOWN_STRENGTH
+        assert resolve_strength(None, "default", "openai") == OPENAI_STRENGTH
+        assert resolve_strength(None, "default", "google") == GEMINI_STRENGTH
+        assert resolve_strength(None, "default", None) == UNKNOWN_STRENGTH
+        # An unrecognized vendor string falls through to the unknown default.
+        assert resolve_strength(None, "default", "adobe") == UNKNOWN_STRENGTH
+
+    def test_default_strength_alias_is_unknown_vendor_value(self):
+        assert DEFAULT_STRENGTH == UNKNOWN_STRENGTH
+        assert OPENAI_STRENGTH < UNKNOWN_STRENGTH

    def test_none_ctrlregen_uses_clean_noise_default(self):
-        # ctrlregen must NOT inherit the SDXL DEFAULT_STRENGTH (that makes it a no-op);
-        # clean-noise regeneration is the lever against robust marks.
+        # ctrlregen must NOT inherit the SDXL vendor defaults (that makes it a no-op);
+        # clean-noise regeneration is the lever against robust marks. Vendor is ignored.
        assert resolve_strength(None, "ctrlregen") == CTRLREGEN_DEFAULT_STRENGTH
+        assert resolve_strength(None, "ctrlregen", "openai") == CTRLREGEN_DEFAULT_STRENGTH
        assert CTRLREGEN_DEFAULT_STRENGTH > DEFAULT_STRENGTH

-    def test_explicit_value_overrides_both_profiles(self):
+    def test_explicit_value_overrides_profile_and_vendor(self):
        assert resolve_strength(0.3, "default") == 0.3
+        assert resolve_strength(0.3, "default", "openai") == 0.3
        assert resolve_strength(0.3, "ctrlregen") == 0.3

    def test_explicit_zero_is_respected_not_treated_as_unset(self):
@@ -145,6 +161,46 @@ class TestResolveStrength:
        # (the old `strength or DEFAULT` bug would have). Range validation lives in
        # remove_watermark, not here.
        assert resolve_strength(0.0, "ctrlregen") == 0.0
+        assert resolve_strength(0.0, "default", "google") == 0.0
+
+
+class TestVendorForStrength:
+    """vendor_for_strength normalizes the C2PA SynthID proxy to openai/google/None."""
+
+    @staticmethod
+    def _patch(value):
+        return patch("remove_ai_watermarks.metadata.synthid_source", return_value=value)
+
+    def test_openai(self):
+        from remove_ai_watermarks.noai.watermark_profiles import vendor_for_strength
+
+        with self._patch("OpenAI"):
+            assert vendor_for_strength(Path("x.png")) == "openai"
+
+    def test_google(self):
+        from remove_ai_watermarks.noai.watermark_profiles import vendor_for_strength
+
+        with self._patch("Google"):
+            assert vendor_for_strength(Path("x.png")) == "google"
+
+    def test_both_issuers_google_wins(self):
+        # The more-robust watermark wins -> safer (higher) strength.
+        from remove_ai_watermarks.noai.watermark_profiles import vendor_for_strength
+
+        with self._patch("OpenAI, Google"):
+            assert vendor_for_strength(Path("x.png")) == "google"
+
+    def test_none_when_no_synthid_source(self):
+        from remove_ai_watermarks.noai.watermark_profiles import vendor_for_strength
+
+        with self._patch(None):
+            assert vendor_for_strength(Path("x.png")) is None
+
+    def test_unreadable_metadata_is_none(self):
+        from remove_ai_watermarks.noai.watermark_profiles import vendor_for_strength
+
+        with patch("remove_ai_watermarks.metadata.synthid_source", side_effect=OSError):
+            assert vendor_for_strength(Path("x.png")) is None


 # ── Format utilities ────────────────────────────────────────────────