feat(visible): Samsung Galaxy AI mark removal (bottom-left reverse-alpha, #37)
New samsung_engine.py mirrors the jimeng engine but anchors bottom-left; wired into watermark_registry, the CLI (--mark samsung / auto), and identify (visible_samsung, medium). visible_alpha_solve.py gains a corner=bl mode; samsung_alpha.png solved from @f-liva's flat captures. Calibrated for the Italian "Contenuti generati dall'AI" variant. Flat black/gray/white captures committed, real photos gitignored. Tests + docs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -46,6 +46,8 @@ data/jimeng_capture/seeds/
|
||||
data/jimeng_capture/captures/jimeng_content_*.png
|
||||
data/gemini_capture/seeds/
|
||||
data/gemini_capture/captures/gemini_content_*.png
|
||||
data/samsung_capture/seeds/
|
||||
data/samsung_capture/captures/samsung_content_*
|
||||
|
||||
# GFPGAN downloads its RetinaFace/parsing weights to a CWD ./gfpgan/weights/
|
||||
# working dir on first use (the restore extra). Runtime artifact, never committed.
|
||||
|
||||
@@ -17,7 +17,7 @@ If this tool saves you time, consider [sponsoring its development](https://githu
|
||||
|
||||
## Features
|
||||
|
||||
- **Visible watermark removal** — a registry of known marks in their usual places: the Gemini / Nano Banana sparkle, the Doubao "豆包AI生成" text strip, and the Jimeng "★ 即梦AI" wordmark. Each is removed by **reverse-alpha blending** against a captured alpha map (`original = (wm − α·logo)/(1−α)`), recovering the true pixels rather than inpainting a guess. The Gemini sparkle recovers cleanly on its own on bright backgrounds; it adapts the alpha to each image's sparkle opacity, so a more-opaque-than-captured sparkle is still fully removed (and on a dark background, where the fixed alpha would over-subtract and leave a dark spot, it automatically inpaints the small sparkle footprint instead); the Doubao and Jimeng text marks re-rasterize slightly per image, so a thin residual inpaint over the glyph footprint clears the leftover edges (the alpha maps are reproducibly rebuilt from controlled captures by `scripts/visible_alpha_solve.py`). Fast, offline, no GPU. `visible --mark auto` finds and removes the strongest detected mark. (For arbitrary logos/objects, see `erase`.)
|
||||
- **Visible watermark removal** — a registry of known marks in their usual places: the Gemini / Nano Banana sparkle, the Doubao "豆包AI生成" text strip, the Jimeng "★ 即梦AI" wordmark, and the Samsung Galaxy AI "✦ Contenuti generati dall'AI" strip (bottom-left, locale-specific). Each is removed by **reverse-alpha blending** against a captured alpha map (`original = (wm − α·logo)/(1−α)`), recovering the true pixels rather than inpainting a guess. The Gemini sparkle recovers cleanly on its own on bright backgrounds; it adapts the alpha to each image's sparkle opacity, so a more-opaque-than-captured sparkle is still fully removed (and on a dark background, where the fixed alpha would over-subtract and leave a dark spot, it automatically inpaints the small sparkle footprint instead); the Doubao, Jimeng, and Samsung text marks re-rasterize slightly per image, so a thin residual inpaint over the glyph footprint clears the leftover edges (the alpha maps are reproducibly rebuilt from controlled captures by `scripts/visible_alpha_solve.py`). Fast, offline, no GPU. `visible --mark auto` finds and removes the strongest detected mark. (For arbitrary logos/objects, see `erase`.)
|
||||
- **Universal region eraser (`erase`)** — remove any logo / watermark / object inside boxes you specify, regardless of position or colour. Default cv2 inpainting (CPU, instant); optional big-LaMa via onnxruntime (`lama` extra) for higher quality
|
||||
- **Invisible watermark removal** — SynthID, StableSignature, TreeRing via diffusion-based regeneration (needs a local GPU, or run it with no setup on [raiw.cc](https://raiw.cc))
|
||||
- **AI metadata stripping** — EXIF, PNG text chunks, C2PA provenance manifests (PNG / JPEG / AVIF / HEIF / JPEG-XL, **MP4 / MOV / M4V / M4A** at the container level, and **WebM / MP3 / WAV / FLAC / OGG** losslessly via ffmpeg), XMP DigitalSourceType
|
||||
@@ -26,7 +26,7 @@ If this tool saves you time, consider [sponsoring its development](https://githu
|
||||
- **Text and face preservation (experimental)** — optional `--pipeline controlnet` adds a canny ControlNet that keeps text and face structure sharp through the removal pass (without copying original pixels, so SynthID is still removed). Canny preserves face *structure*, not *identity* (the regenerated face drifts in likeness); identity is preserved by the `--restore-faces` GFPGAN post-pass (opt-in). Both are experimental and off by default.
|
||||
- **Batch processing** — process entire directories
|
||||
- **Detection** — three-stage NCC watermark detection with confidence scoring
|
||||
- **Provenance detection (`identify`)** — aggregate C2PA issuer, the C2PA soft-binding forensic-watermark vendor (Adobe TrustMark, Digimarc, Imatag, ...), IPTC "Made with AI" plus the IPTC 2025.1 `AISystemUsed` field, embedded SD/ComfyUI params, EXIF/XMP generator tags, the xAI/Grok EXIF signature, the China TC260 AIGC label (XMP, PNG chunk, or EXIF), the HuggingFace `hf-job-id` job marker, the SynthID metadata proxy, the visible marks (Gemini sparkle plus the Doubao "豆包AI生成" / Jimeng "即梦AI" text marks), the open SD/SDXL/FLUX invisible watermark, and (with the `trustmark` extra) the open Adobe TrustMark watermark into one origin-platform + watermark-inventory verdict (`--json` for machine output)
|
||||
- **Provenance detection (`identify`)** — aggregate C2PA issuer, the C2PA soft-binding forensic-watermark vendor (Adobe TrustMark, Digimarc, Imatag, ...), IPTC "Made with AI" plus the IPTC 2025.1 `AISystemUsed` field, embedded SD/ComfyUI params, EXIF/XMP generator tags, the xAI/Grok EXIF signature, the China TC260 AIGC label (XMP, PNG chunk, or EXIF), the HuggingFace `hf-job-id` job marker, the SynthID metadata proxy, the visible marks (Gemini sparkle plus the Doubao "豆包AI生成" / Jimeng "即梦AI" / Samsung Galaxy AI "Contenuti generati dall'AI" text marks), the open SD/SDXL/FLUX invisible watermark, and (with the `trustmark` extra) the open Adobe TrustMark watermark into one origin-platform + watermark-inventory verdict (`--json` for machine output)
|
||||
|
||||
## Examples
|
||||
|
||||
@@ -51,14 +51,14 @@ If this tool saves you time, consider [sponsoring its development](https://githu
|
||||
| **Meta AI** | — | — | ✅ IPTC "Made with AI" (digitalSourceType) | Metadata strip (removes the label) |
|
||||
| **Doubao** (ByteDance) / China AIGC generators | ✅ "豆包AI生成" text strip (bottom-right) | — | ✅ TC260 AIGC label (`<TC260:AIGC>` XMP, `AIGC` PNG chunk, or EXIF JSON) **+ C2PA** signed by ByteDance Volcano Engine (`volcengine`) | Reverse-alpha (captured α map) + thin residual inpaint, NCC-aligned across resolutions, + metadata strip |
|
||||
| **Jimeng / Dreamina** (即梦AI, ByteDance) | ✅ "★ 即梦AI" wordmark (bottom-right) | — | ✅ TC260 AIGC label + C2PA (Volcano Engine) | Reverse-alpha (captured α map) + residual inpaint over the glyph footprint, NCC-aligned across resolutions, + metadata strip |
|
||||
| **Samsung Galaxy AI** (Generative Edit, Sketch to Image, ...) | — | — | ✅ C2PA (signer "Samsung Galaxy") + `trainedAlgorithmicMedia` / proprietary `genAIType` marker | Detected (`identify`) + metadata strip |
|
||||
| **Samsung Galaxy AI** (Generative Edit, Sketch to Image, ...) | ✅ "✦ Contenuti generati dall'AI" strip (bottom-left, locale-specific) | — | ✅ C2PA (signer "Samsung Galaxy") + `trainedAlgorithmicMedia` / proprietary `genAIType` marker | Reverse-alpha (captured α map) + thin residual inpaint, NCC-aligned across resolutions, + metadata strip |
|
||||
| **Black Forest Labs** (FLUX API) | — | — | ✅ C2PA (`Black Forest Labs API` + `c2pa.ai_generated_content` + `trainedAlgorithmicMedia`) | Metadata strip |
|
||||
| **StableSignature** (Meta) | — | ✅ In-model watermark | — | Diffusion regeneration |
|
||||
| **TreeRing** | — | ✅ Latent space watermark | — | Diffusion regeneration |
|
||||
|
||||
> Visible overlays are used by Google Gemini / Nano Banana (sparkle logo) and by ByteDance's Doubao ("豆包AI生成" corner text) and Jimeng / Dreamina ("★ 即梦AI" wordmark). All are removed on CPU by reverse-alpha against a captured alpha map (Jimeng adds a residual inpaint over the glyph footprint, since its mark re-rasterizes per image). Other services rely on invisible watermarks and/or metadata; our diffusion-based regeneration works against any invisible watermark in pixel or frequency domain. For a visible mark from any other source (any position, any colour), use the universal `erase --region` command.
|
||||
> Visible overlays are used by Google Gemini / Nano Banana (sparkle logo), by ByteDance's Doubao ("豆包AI生成" corner text) and Jimeng / Dreamina ("★ 即梦AI" wordmark), and by Samsung Galaxy AI ("✦ Contenuti generati dall'AI" strip, bottom-left, locale-specific). All are removed on CPU by reverse-alpha against a captured alpha map (the Doubao, Jimeng, and Samsung text marks add a thin residual inpaint over the glyph footprint, since they re-rasterize per image). Other services rely on invisible watermarks and/or metadata; our diffusion-based regeneration works against any invisible watermark in pixel or frequency domain. For a visible mark from any other source (any position, any colour), use the universal `erase --region` command.
|
||||
|
||||
> **Detection:** `remove-ai-watermarks identify <image>` reports the origin platform and watermark inventory for all the signals above — C2PA issuer, the C2PA soft-binding forensic-watermark vendor (TrustMark / Digimarc / Imatag / ...), IPTC "Made with AI" plus the IPTC 2025.1 `AISystemUsed` field, the China TC260 AIGC label (XMP, PNG chunk, or EXIF), the HuggingFace `hf-job-id` job marker, embedded generation params, EXIF/XMP generator tags, the xAI/Grok EXIF signature, the SynthID metadata proxy, the visible marks (Gemini sparkle plus the Doubao "豆包AI生成" / Jimeng "即梦AI" text marks), and (with the `[detect]` / `[trustmark]` extras) the open SD/SDXL/FLUX and Adobe TrustMark invisible watermarks. SynthID and the proprietary soft-binding watermarks (Digimarc etc.) have no local decoder, so they are reported by metadata proxy / vendor name only.
|
||||
> **Detection:** `remove-ai-watermarks identify <image>` reports the origin platform and watermark inventory for all the signals above — C2PA issuer, the C2PA soft-binding forensic-watermark vendor (TrustMark / Digimarc / Imatag / ...), IPTC "Made with AI" plus the IPTC 2025.1 `AISystemUsed` field, the China TC260 AIGC label (XMP, PNG chunk, or EXIF), the HuggingFace `hf-job-id` job marker, embedded generation params, EXIF/XMP generator tags, the xAI/Grok EXIF signature, the SynthID metadata proxy, the visible marks (Gemini sparkle plus the Doubao "豆包AI生成" / Jimeng "即梦AI" / Samsung Galaxy AI "Contenuti generati dall'AI" text marks), and (with the `[detect]` / `[trustmark]` extras) the open SD/SDXL/FLUX and Adobe TrustMark invisible watermarks. SynthID and the proprietary soft-binding watermarks (Digimarc etc.) have no local decoder, so they are reported by metadata proxy / vendor name only.
|
||||
|
||||
## How it works
|
||||
|
||||
@@ -95,6 +95,15 @@ remove-ai-watermarks visible jimeng.png -o clean.png # --mark auto pi
|
||||
remove-ai-watermarks visible jimeng.png --mark jimeng -o clean.png
|
||||
```
|
||||
|
||||
### Removing the Samsung Galaxy AI "✦ Contenuti generati dall'AI" mark
|
||||
|
||||
Samsung's on-device Generative AI edits (Generative Edit, Sketch to Image, Portrait Studio) burn a visible sparkle + "generated with AI" string into the **bottom-left** corner as a faint, semi-transparent white overlay (peak alpha ~0.38). Its alpha map is solved from controlled black / gray / white captures the same way as Jimeng's, and the mark is removed by reverse-alpha plus a thin residual inpaint over the glyph footprint (the mark re-rasterizes per image, and the flat captures are smaller than real photos, so the alpha template is width-scaled and NCC-aligned to the actual mark). `visible --mark auto` detects and removes it (or force it with `--mark samsung`); because it sits bottom-left, it is never confused with the bottom-right Gemini/Doubao/Jimeng marks. The string is **locale-specific**: this build is calibrated for the Italian "Contenuti generati dall'AI" variant, and other locales need their own captured template (post a sample on issue #37).
|
||||
|
||||
```bash
|
||||
remove-ai-watermarks visible samsung.jpg -o clean.jpg # --mark auto picks Samsung
|
||||
remove-ai-watermarks visible samsung.jpg --mark samsung -o clean.jpg
|
||||
```
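
As a rough illustration of the reverse-alpha step described above, here is a minimal NumPy sketch. It is not the engine's actual code: it assumes a pure-white logo and an alpha map that is already aligned to the mark, and `reverse_alpha_white` and its `max_alpha` clamp are hypothetical names introduced only for this example.

```python
import numpy as np


def reverse_alpha_white(
    watermarked: np.ndarray, alpha: np.ndarray, max_alpha: float = 0.95
) -> np.ndarray:
    """Invert watermarked = a*255 + (1-a)*original for a pure-white logo.

    watermarked: HxWx3 uint8 crop covering the mark.
    alpha: HxW float map in [0, 1], already aligned to the crop.
    max_alpha: illustrative clamp (an assumption) that keeps 1-a away from zero.
    """
    a = np.clip(alpha.astype(np.float32), 0.0, max_alpha)[..., None]
    wm = watermarked.astype(np.float32)
    original = (wm - a * 255.0) / (1.0 - a)
    return np.clip(np.rint(original), 0, 255).astype(np.uint8)


# Round trip on synthetic data: composite a white overlay, then undo it.
rng = np.random.default_rng(0)
background = rng.integers(0, 256, size=(8, 16, 3), dtype=np.uint8)
alpha_map = np.zeros((8, 16), dtype=np.float32)
alpha_map[2:6, 4:12] = 0.38  # peak alpha reported for the Samsung mark
composited = np.rint(
    alpha_map[..., None] * 255.0 + (1.0 - alpha_map[..., None]) * background
).astype(np.uint8)
recovered = reverse_alpha_white(composited, alpha_map)
max_error = int(np.abs(recovered.astype(int) - background.astype(int)).max())
```

On synthetic data the round trip is limited only by 8-bit rounding. On real photos the alpha map is upscaled and the mark re-rasterizes per image, which is why the engine follows this step with a thin residual inpaint.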
|
||||
|
||||
### Universal region eraser
|
||||
|
||||
For any visible mark the dedicated engines do not cover — a logo anywhere, any colour — `erase --region x,y,w,h` inpaints the box you specify. The default `cv2` backend is instant and dependency-free; the optional `lama` backend (big-LaMa via onnxruntime, `lama` extra, ~200 MB model downloaded on first use) gives much cleaner fills on textured regions at the cost of ~3-4 GB RAM per call.
|
||||
@@ -276,8 +285,9 @@ remove-ai-watermarks batch ./images/ --mode all
|
||||
remove-ai-watermarks identify image.png
|
||||
|
||||
# Visible watermark only — fast, offline, CPU. --mark auto (default) finds the
|
||||
# strongest known mark (Gemini sparkle / Doubao "豆包AI生成" / Jimeng "即梦AI"); force
|
||||
# one with --mark gemini / doubao / jimeng. Removed by reverse-alpha (true-pixel recovery).
|
||||
# strongest known mark (Gemini sparkle / Doubao "豆包AI生成" / Jimeng "即梦AI" /
|
||||
# Samsung Galaxy AI "Contenuti generati dall'AI"); force one with
|
||||
# --mark gemini / doubao / jimeng / samsung. Removed by reverse-alpha (true-pixel recovery).
|
||||
remove-ai-watermarks visible image.png -o clean.png
|
||||
|
||||
# Erase arbitrary region(s) — universal, any logo/watermark/object, any position.
|
||||
|
||||
@@ -0,0 +1,66 @@
|
||||
# Samsung Galaxy AI visible watermark capture
|
||||
|
||||
> **Status (built 2026-06-05):** flat black/gray/white Samsung Galaxy AI captures
|
||||
> were obtained (issue #37, from @f-liva) and the alpha map was solved. Removal is
|
||||
> reverse-alpha plus a thin residual inpaint over the glyph footprint; see the
|
||||
> `samsung_engine.py` notes in the root `CLAUDE.md`. The text below is the capture
|
||||
> plan and the open quality follow-up.
|
||||
|
||||
Goal: capture the Samsung Galaxy AI "✦ Contenuti generati dall'AI" visible wordmark
|
||||
over known flat backgrounds so we can build a per-pixel alpha map and a reverse-alpha
|
||||
remover, the same way the Gemini sparkle and the Doubao / Jimeng strips work
|
||||
(`src/remove_ai_watermarks/gemini_engine.py`, `doubao_engine.py`, `jimeng_engine.py`).
|
||||
|
||||
## What we learned (verified from the captures, 2026-06-05)
|
||||
|
||||
- Mark: a sparkle icon followed by the locale string "Contenuti generati dall'AI"
|
||||
(Italian), a light low-opacity (peak alpha ~0.38) semi-transparent **white**
|
||||
overlay, anchored **bottom-LEFT** (Doubao/Jimeng are bottom-right). The string is
|
||||
locale-specific, so the alpha template is per-locale; this build is the Italian
|
||||
variant. Other locales need their own captured template.
|
||||
- Blend model: alpha compositing with a pure-white logo, `watermarked =
|
||||
a*255 + (1-a)*original`, solved from the GRAY capture (same careful recipe as
|
||||
Doubao/Jimeng: cubic-background fit, mean over channels, full halo extent,
|
||||
unblurred). The white capture confirms the logo is white; on white the mark is
|
||||
white-on-white and not detectable (no contrast), which is fine -- there is nothing
|
||||
to recover there.
|
||||
- Geometry (fraction of image WIDTH): asset width ~0.32, height ~0.038, left margin
|
||||
~0.011, bottom margin ~0.006. The mark scales with width: a 1086-wide flat capture
|
||||
and a 2958-wide real photo both measure width_frac ~0.31.
|
||||
- **Resolution caveat (open quality follow-up):** the flat black/gray/white captures
|
||||
arrived at the phone's flat-edit size (1086 wide and a landscape 1920 set), while
|
||||
the real photos are ~3000 wide, so the captured glyph (~334 px) is ~2.7x smaller
|
||||
than on a real photo (~900 px). The alpha is solved at the capture size and
|
||||
width-scaled + NCC-aligned per image, which removes the mark cleanly (verified on a
|
||||
real 2958-wide photo: re-detect 0.79 -> 0.00, no readable text or outline), but a
|
||||
flat capture taken at the real photo resolution (~3000 wide) would let the alpha be
|
||||
pixel-sharp instead of upscaled. Not a blocker; a quality upgrade if a full-res
|
||||
flat capture is provided.
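
The blend model and geometry notes above can be sketched as follows. This is a simplified illustration, not `scripts/visible_alpha_solve.py`: it assumes a perfectly flat gray background of known level (the real solver fits a cubic background), and `solve_alpha_from_gray`, `bottom_left_box`, and the 3944 px image height are hypothetical names and values chosen for the example.

```python
import numpy as np


def solve_alpha_from_gray(capture: np.ndarray, gray_level: float) -> np.ndarray:
    """Solve a from watermarked = a*255 + (1-a)*gray, averaged over channels."""
    wm = capture.astype(np.float32).mean(axis=2)
    alpha = (wm - gray_level) / (255.0 - gray_level)
    return np.clip(alpha, 0.0, 1.0)


def bottom_left_box(img_w: int, img_h: int) -> tuple[int, int, int, int]:
    """Pixel box (x, y, w, h) of the mark from the width-relative fractions."""
    width_frac, height_frac = 0.32, 0.038  # approximate values from the notes above
    margin_left_frac, margin_bottom_frac = 0.011, 0.006
    w = round(width_frac * img_w)
    h = round(height_frac * img_w)  # height also scales with image WIDTH
    x = round(margin_left_frac * img_w)
    y = img_h - round(margin_bottom_frac * img_w) - h
    return x, y, w, h


# Synthetic check: stamp a=0.38 white onto flat gray 128 and solve it back.
gray = np.full((10, 20, 3), 128, dtype=np.uint8)
stamped = gray.astype(np.float32)
stamped[3:7, 5:15] = 0.38 * 255.0 + 0.62 * 128.0
alpha = solve_alpha_from_gray(np.rint(stamped).astype(np.uint8), 128.0)

# Box for a 2958-wide photo; the 3944 height is an assumed value.
box = bottom_left_box(2958, 3944)
```

The solved alpha differs slightly from 0.38 because the stamped capture is quantized to 8 bits, which is one reason a higher-resolution, cleaner flat capture gives a sharper template.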
|
||||
|
||||
## Capture protocol (to re-capture or add a locale)
|
||||
|
||||
On a Samsung Galaxy AI device (set the UI language to the target locale):
|
||||
|
||||
1. Run the AI edit (Generative Edit / Sketch to Image) on a solid black image, so
|
||||
the overlay lands on a flat black background. Download the ORIGINAL output file
|
||||
(not a screenshot, no crop or re-save).
|
||||
2. Repeat over solid white and solid gray (those pin the exact glyph color).
|
||||
3. Ideally run all three flat edits at the same resolution as real photos (~3000
|
||||
wide) so the alpha map is pixel-sharp rather than upscaled.
|
||||
4. Also capture 3-5 real outputs with the visible mark over normal content, for validation.
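
The solid-color inputs for steps 1 and 2 can be generated synthetically. A minimal sketch, assuming a 3000x4000 portrait size and a mid-gray level of 128 (both illustrative choices, not values fixed by the protocol); `make_seed` is a hypothetical helper, and writing the arrays out as PNG files is left to whichever image library is at hand.

```python
import numpy as np


def make_seed(level: int, width: int = 3000, height: int = 4000) -> np.ndarray:
    """Return a solid HxWx3 uint8 image with every channel set to `level`."""
    if not 0 <= level <= 255:
        raise ValueError("level must be in [0, 255]")
    return np.full((height, width, 3), level, dtype=np.uint8)


# One seed per flat background used by the capture protocol.
seeds = {
    name: make_seed(level)
    for name, level in (("black", 0), ("gray", 128), ("white", 255))
}
```

Saving the seeds losslessly matters more than the exact gray level: any compression noise in the input ends up in the solved alpha map.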
|
||||
|
||||
## Files
|
||||
|
||||
- `captures/samsung_black_1.png`, `samsung_gray_1.png`, `samsung_white_1.png` --
|
||||
portrait flat edits (1086 wide), the primary calibration set.
|
||||
- `captures/samsung_black_2.png`, `samsung_gray_2.png`, `samsung_white_2.png` --
|
||||
a second (landscape 1920) set.
|
||||
- `captures/samsung_content_*` -- real-photo validation downloads, **gitignored**
|
||||
(user content, repo is public).
|
||||
- `seeds/` -- synthetic solid-color inputs, gitignored (regenerable).
|
||||
|
||||
Rebuild the alpha asset with:
|
||||
|
||||
```bash
|
||||
uv run python scripts/visible_alpha_solve.py samsung
|
||||
```
|
||||
@@ -68,6 +68,7 @@ class EngineSpec:
|
||||
gray: str
|
||||
asset: Path
|
||||
native_width: int = 2048
|
||||
corner: str = "br" # which corner the mark sits in: "br" (Doubao/Jimeng) or "bl" (Samsung)
|
||||
|
||||
|
||||
_SPECS: dict[str, EngineSpec] = {
|
||||
@@ -85,6 +86,18 @@ _SPECS: dict[str, EngineSpec] = {
|
||||
"jimeng_cap_C.png", # gray seed
|
||||
_ROOT / "src" / "remove_ai_watermarks" / "assets" / "jimeng_alpha.png",
|
||||
),
|
||||
"samsung": EngineSpec(
|
||||
"samsung",
|
||||
_ROOT / "data" / "samsung_capture" / "captures",
|
||||
"samsung_black_1.png", # black flat edit (mark on true black, bottom-left)
|
||||
"samsung_gray_1.png", # gray flat edit
|
||||
_ROOT / "src" / "remove_ai_watermarks" / "assets" / "samsung_alpha.png",
|
||||
# The flat captures arrive at the phone's flat-edit size (1086 wide); the
|
||||
# mark is a fixed FRACTION of width (~0.31), consistent with the 2958-wide
|
||||
# real photos, so geometry is emitted relative to the capture width.
|
||||
native_width=1086,
|
||||
corner="bl",
|
||||
),
|
||||
}
|
||||
|
||||
_CUBIC_BG_PAD = 30 # px of background margin around the mark for the cubic fit
|
||||
@@ -119,19 +132,26 @@ def _union_bbox(mask: NDArray[np.uint8], err: str) -> tuple[int, int, int, int]:
|
||||
return x0, x1, y0, y1
|
||||
|
||||
|
||||
def _locate_on_black(black: NDArray[np.float32]) -> tuple[int, int, int, int]:
|
||||
"""Bounding box of the white mark on the black capture (bottom-right).
|
||||
def _locate_on_black(black: NDArray[np.float32], corner: str = "br") -> tuple[int, int, int, int]:
|
||||
"""Bounding box of the white mark on the black capture, in the given corner.
|
||||
|
||||
Thresholds well above the blotchy near-black background, then unions the
|
||||
sufficiently-large bright components so the box spans the whole word.
|
||||
sufficiently-large bright components so the box spans the whole word. ``corner``
|
||||
is ``"br"`` (bottom-right, Doubao/Jimeng) or ``"bl"`` (bottom-left, Samsung).
|
||||
The horizontal window is kept generous (the Samsung text strip is ~0.31 of the
|
||||
width, so a corner *quarter* would clip it) while still excluding any centered
|
||||
generated content the flat edit hallucinated.
|
||||
"""
|
||||
h, w = black.shape[:2]
|
||||
lum = black.mean(axis=2)
|
||||
br = lum > 40 # comfortably above the ~5-30 background blotches
|
||||
br[: h * 3 // 4, :] = False # bottom quarter only
|
||||
br[:, : w * 3 // 4] = False # right quarter only
|
||||
if corner == "bl":
|
||||
br[:, w // 2 :] = False # left half only
|
||||
else:
|
||||
br[:, : w * 3 // 4] = False # right quarter only
|
||||
bright = cv2.morphologyEx(br.astype(np.uint8) * 255, cv2.MORPH_CLOSE, np.ones((9, 9), np.uint8))
|
||||
return _union_bbox(bright, "no mark found on the black capture (bottom-right is empty)")
|
||||
return _union_bbox(bright, f"no mark found on the black capture ({corner} corner is empty)")
|
||||
|
||||
|
||||
def _cubic_background(crop: NDArray[np.float32], glyph: NDArray[np.bool_]) -> NDArray[np.float32]:
|
||||
@@ -161,7 +181,7 @@ def solve_alpha(spec: EngineSpec) -> NDArray[np.uint8]:
|
||||
gray_f = gray.astype(np.float32)
|
||||
|
||||
img_h, img_w = black_f.shape[:2]
|
||||
mx0, mx1, my0, my1 = _locate_on_black(black_f)
|
||||
mx0, mx1, my0, my1 = _locate_on_black(black_f, spec.corner)
|
||||
pad = _CUBIC_BG_PAD
|
||||
rx0, rx1 = max(0, mx0 - pad), min(img_w, mx1 + pad)
|
||||
ry0, ry1 = max(0, my0 - pad), min(img_h, my1 + pad)
|
||||
@@ -186,16 +206,20 @@ def solve_alpha(spec: EngineSpec) -> NDArray[np.uint8]:
|
||||
aw, ah = tight.shape[1], tight.shape[0]
|
||||
# Absolute asset position in the capture, for the engine's geometry constants.
|
||||
abs_x0, abs_y0 = rx0 + cx0, ry0 + cy0
|
||||
# Horizontal margin depends on the anchor corner: left margin for "bl", right
|
||||
# margin (distance from the right edge) for "br".
|
||||
h_margin = abs_x0 if spec.corner == "bl" else img_w - (abs_x0 + aw)
|
||||
log.info(
|
||||
"%s: alpha %dx%d max %.3f | WIDTH_FRAC %.4f HEIGHT_FRAC %.4f "
|
||||
"MARGIN_RIGHT_FRAC %.4f MARGIN_BOTTOM_FRAC %.4f (native_width %d)",
|
||||
"MARGIN_%s_FRAC %.4f MARGIN_BOTTOM_FRAC %.4f (native_width %d)",
|
||||
spec.name,
|
||||
aw,
|
||||
ah,
|
||||
float(tight.max()),
|
||||
aw / spec.native_width,
|
||||
ah / spec.native_width,
|
||||
(img_w - (abs_x0 + aw)) / spec.native_width,
|
||||
"LEFT" if spec.corner == "bl" else "RIGHT",
|
||||
h_margin / spec.native_width,
|
||||
(img_h - (abs_y0 + ah)) / spec.native_width,
|
||||
spec.native_width,
|
||||
)
|
||||
@@ -234,7 +258,7 @@ def main(engine: str) -> None:
|
||||
raise OSError(f"failed to write {path}")
|
||||
log.info("%s: wrote %s", label, path.relative_to(_ROOT))
|
||||
|
||||
if engine in ("doubao", "jimeng", "all"):
|
||||
if engine in (*_SPECS, "all"):
|
||||
specs = list(_SPECS.values()) if engine == "all" else [_SPECS[engine]]
|
||||
for spec in specs:
|
||||
_write(spec.asset, solve_alpha(spec), spec.name)
|
||||
|
||||
@@ -358,6 +358,7 @@ def _visible_sparkle(image_path: Path) -> float | None:
|
||||
_VISIBLE_MARK_PLATFORM = {
|
||||
"doubao": "ByteDance Doubao (visible 豆包AI生成 mark detected)",
|
||||
"jimeng": "ByteDance Jimeng / Dreamina (visible 即梦AI mark detected)",
|
||||
"samsung": "Samsung Galaxy AI (visible 'Contenuti generati dall'AI' mark detected)",
|
||||
}
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,394 @@
|
||||
"""Samsung Galaxy AI visible watermark removal engine.
|
||||
|
||||
Samsung's on-device Generative AI photo edits (Generative Edit / Sketch to Image /
|
||||
Portrait Studio on Galaxy phones) stamp a visible localized wordmark -- a sparkle
|
||||
icon followed by a "generated with AI" string -- in the **bottom-left** corner: a
|
||||
light, low-opacity semi-transparent white overlay. The string is locale-specific;
|
||||
this engine is calibrated for the Italian "Contenuti generati dall'AI" variant
|
||||
(issue #37, captures from @f-liva). Other locales need their own captured alpha
|
||||
template, but the geometry and removal recipe are shared.
|
||||
|
||||
Like the Gemini sparkle and the Doubao / Jimeng marks it is a fixed overlay, so
|
||||
removal starts from **reverse-alpha blending** against a captured alpha map
|
||||
(``remove_watermark_reverse_alpha``): ``original = (wm - a*logo)/(1-a)``. The logo
|
||||
is pure white (255,255,255); the alpha map was solved from the GRAY Samsung capture
|
||||
(see ``data/samsung_capture/``), bundled as ``assets/samsung_alpha.png`` -- the same
|
||||
careful build as Jimeng/Doubao (cubic-background fit, mean over channels, full halo
|
||||
extent, unblurred). The Samsung mark is faint (peak alpha ~0.38), so the glyph reads
|
||||
as a soft light-gray strip.
|
||||
|
||||
The mark is anchored bottom-LEFT (Doubao/Jimeng are bottom-right) and scales with
|
||||
image WIDTH (~0.32 of width). The flat calibration captures arrive at the phone's
|
||||
flat-edit size (~1086 wide) while real photos are ~3000 wide, so a single alpha map
|
||||
cannot pixel-cancel the upscaled, per-image re-rasterized mark; removal therefore
|
||||
NCC-aligns the alpha to the actual mark (always), reverse-alphas, then clears the
|
||||
residual with a deliberately THIN inpaint over the glyph footprint -- the exact
|
||||
recipe Jimeng uses. Verified on the flat captures and a real ~2958-wide download.
|
||||
|
||||
Detection (``detect``) matches the bundled glyph silhouette against the corner
|
||||
candidate via normalized correlation, keying on the actual mark shape rather than
|
||||
coverage heuristics. Samsung edits also carry C2PA + the Galaxy ``genAIType``
|
||||
marker (see ``metadata``/``identify``), so the visible path is the stripped-metadata
|
||||
fallback / the *removal* path rather than the primary ``identify`` signal.
|
||||
|
||||
``locate`` (geometry box) and ``extract_mask`` (the candidate glyph mask the
|
||||
detector correlates) mirror the Doubao/Jimeng engines. Fast, offline, no GPU.
|
||||
Arbitrary-region inpainting still lives in ``region_eraser`` / the ``erase`` command.
|
||||
"""
|
||||
|
||||
# cv2/numpy boundary: third-party libs ship no usable element types; relax the
|
||||
# unknown-type rules for this file only.
|
||||
# pyright: reportUnknownMemberType=false, reportUnknownArgumentType=false, reportUnknownVariableType=false, reportUnknownParameterType=false, reportMissingTypeArgument=false, reportMissingTypeStubs=false, reportMissingImports=false, reportArgumentType=false, reportAssignmentType=false, reportReturnType=false, reportCallIssue=false, reportIndexIssue=false, reportOperatorIssue=false, reportOptionalMemberAccess=false, reportOptionalCall=false, reportOptionalSubscript=false, reportOptionalOperand=false, reportAttributeAccessIssue=false, reportPrivateImportUsage=false, reportPrivateUsage=false, reportInvalidTypeForm=false, reportConstantRedefinition=false, reportUnnecessaryComparison=false
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from dataclasses import dataclass
|
||||
from typing import TYPE_CHECKING, Any
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from pathlib import Path
|
||||
|
||||
from numpy.typing import NDArray
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
# Geometry as a fraction of image WIDTH. The Samsung mark scales with width and is
|
||||
# anchored bottom-LEFT. The box is intentionally generous (the glyph mask tightens
|
||||
# it and the alignment search refines position); values cover the 1086 flat captures
|
||||
# and the ~2958 real photos (both measured at width_frac ~0.31).
|
||||
WM_WIDTH_FRAC = 0.40
|
||||
WM_HEIGHT_FRAC = 0.060
|
||||
MARGIN_LEFT_FRAC = 0.004
|
||||
MARGIN_BOTTOM_FRAC = 0.002
|
||||
|
||||
# Glyph appearance: a low-saturation light gray rendered brighter than the
|
||||
# surrounding content (white top-hat), same polarity logic as Doubao/Jimeng so a
|
||||
# white-paper document is left untouched. LOGO_MIN_LUMA is lower than Jimeng's
|
||||
# because the Samsung mark is fainter (peak alpha ~0.38), so on a mid/dark
|
||||
# background the glyph luma is lower; the top-hat + NCC shape gate keep precision.
|
||||
MAX_SATURATION = 55 # max channel spread to count a pixel as "grayish"
|
||||
LOGO_MIN_LUMA = 110 # glyphs are at least this bright in absolute terms
|
||||
TOPHAT_DELTA = 8 # glyph must exceed the local background by this many levels
|
||||
|
||||
# Detection matches the bundled alpha-template glyph silhouette
|
||||
# (assets/samsung_alpha.png) against the candidate via zero-mean normalized
|
||||
# correlation (cv2 TM_CCOEFF_NORMED). A small coverage floor skips the template
|
||||
# match on a near-empty candidate box. The threshold is validated against the real
|
||||
# capture set and the other visible marks (Doubao/Jimeng/Gemini must not cross-fire).
|
||||
DETECT_MIN_COVERAGE = 0.01
|
||||
DETECT_NCC_THRESHOLD = 0.40
|
||||
|
||||
# ── Reverse-alpha (recovery, Gemini/Doubao/Jimeng-style) ─────────────
|
||||
# The Samsung mark is a fixed semi-transparent white overlay; given its alpha map
|
||||
# the original pixels are recovered by inverting the blend. The logo is pure white
|
||||
# (the white capture confirms it). The alpha map was solved from the GRAY capture by
|
||||
# scripts/visible_alpha_solve.py (cubic-background fit, mean over channels, full halo,
|
||||
# unblurred); the bundled asset (assets/samsung_alpha.png) is that template (a*255)
|
||||
# at the captured width. The mark scales with image WIDTH, and the flat captures are
|
||||
# ~2.7x smaller than real photos, so a pure width-scale is only approximate; removal
|
||||
# also registers the template to the actual mark via a TM_CCOEFF_NORMED scale+position
|
||||
# search (`_aligned_alpha_map`).
|
||||
_ALPHA_NATIVE_WIDTH = 1086
|
||||
_ALPHA_LOGO_BGR: tuple[float, float, float] = (255.0, 255.0, 255.0)
|
||||
# Geometry below is emitted by scripts/visible_alpha_solve.py for the bundled
|
||||
# asset -- keep them in sync when the asset is rebuilt.
|
||||
_ALPHA_WIDTH_FRAC = 0.3195 # asset width / image width -- the alignment scale seed
|
||||
_ALPHA_HEIGHT_FRAC = 0.0378
|
||||
# Margins (of image WIDTH) of the captured mark -- the geometry record / where to
|
||||
# seed; alignment refines the actual position, so these are not load-bearing.
|
||||
_ALPHA_MARGIN_LEFT_FRAC = 0.0110
|
||||
_ALPHA_MARGIN_BOTTOM_FRAC = 0.0064
|
||||
# Alignment scale search (np.linspace args) around the width-scaled glyph size --
|
||||
# wider than Jimeng's because the flat captures are far off the real-photo width, so
|
||||
# the per-image scale can drift more from the width-scaled seed.
|
||||
_ALPHA_ALIGN_SEARCH = (0.85, 1.18, 23)
|
||||
# Residual inpaint footprint: a single capture upscaled to the real-photo width
|
||||
# cannot pixel-cancel the re-rasterized mark, so the glyph footprint (alpha above
|
||||
# _RESIDUAL_ALPHA_FLOOR) is always inpainted after reverse-alpha (dilated by a
|
||||
# _RESIDUAL_DILATE kernel, cv2.INPAINT_NS).
|
||||
# Kept deliberately THIN -- reverse-alpha already recovers the true background under
|
||||
# the semi-transparent mark, so the inpaint only finishes the residual edges.
|
||||
_RESIDUAL_ALPHA_FLOOR = 0.05
|
||||
_RESIDUAL_DILATE = 5
|
||||
_RESIDUAL_INPAINT_RADIUS = 2
|
||||
_alpha_template_cache: NDArray[Any] | None = None
|
||||
|
||||
|
||||
def _alpha_template() -> NDArray[Any] | None:
    """Lazily load the bundled Samsung alpha template (float [0,1]), or None."""
    global _alpha_template_cache
    if _alpha_template_cache is None:
        from pathlib import Path

        from remove_ai_watermarks import image_io

        path = Path(__file__).parent / "assets" / "samsung_alpha.png"
        img = image_io.imread(str(path), cv2.IMREAD_GRAYSCALE)
        if img is None:
            return None
        _alpha_template_cache = img.astype(np.float32) / 255.0
    return _alpha_template_cache


@dataclass(frozen=True)
class SamsungLocation:
    """Located watermark box (bottom-left), in absolute pixel coordinates."""

    x: int
    y: int
    w: int
    h: int
    is_fallback: bool = True  # geometry anchor (no template match) -> always True for now

    @property
    def bbox(self) -> tuple[int, int, int, int]:
        return self.x, self.y, self.w, self.h


@dataclass
class SamsungDetection:
    """Result of visible Samsung Galaxy AI watermark detection."""

    detected: bool = False
    confidence: float = 0.0
    region: tuple[int, int, int, int] = (0, 0, 0, 0)
    coverage: float = 0.0  # fraction of the box occupied by glyph pixels


_silhouette_cache: NDArray[Any] | None = None


def _glyph_silhouette() -> NDArray[Any] | None:
    """Binary glyph silhouette (255 = glyph) from the bundled alpha map, used as the
    detection template. None if the alpha asset is missing. The threshold is a
    fraction of the (faint) peak alpha so the thin strokes survive."""
    global _silhouette_cache
    if _silhouette_cache is None:
        at = _alpha_template()
        if at is None:
            return None
        _silhouette_cache = (at > 0.10).astype(np.uint8) * 255
    return _silhouette_cache


def _template_match_score(box_mask: NDArray[Any], image_width: int) -> float:
    """Zero-mean normalized correlation of the alpha-template glyph silhouette
    (scaled to the mark's expected size) against the candidate ``box_mask``."""
    sil = _glyph_silhouette()
    if sil is None or box_mask.size == 0:
        return 0.0
    gw = min(box_mask.shape[1] - 1, max(16, int(_ALPHA_WIDTH_FRAC * image_width)))
    gh = min(box_mask.shape[0] - 1, max(4, int(_ALPHA_HEIGHT_FRAC * image_width)))
    if gw < 16 or gh < 4:
        return 0.0
    template = cv2.resize(sil, (gw, gh), interpolation=cv2.INTER_NEAREST)
    return float(cv2.matchTemplate(box_mask, template, cv2.TM_CCOEFF_NORMED).max())


class SamsungEngine:
    """Remove the visible Samsung Galaxy AI watermark (locate -> mask -> reverse-alpha)."""

    def __init__(
        self,
        *,
        width_frac: float = WM_WIDTH_FRAC,
        height_frac: float = WM_HEIGHT_FRAC,
        margin_left_frac: float = MARGIN_LEFT_FRAC,
        margin_bottom_frac: float = MARGIN_BOTTOM_FRAC,
    ) -> None:
        self.width_frac = width_frac
        self.height_frac = height_frac
        self.margin_left_frac = margin_left_frac
        self.margin_bottom_frac = margin_bottom_frac

    # ── Locate ────────────────────────────────────────────────────────

    def locate(self, image: NDArray[Any]) -> SamsungLocation:
        """Anchor the watermark box in the bottom-left corner by geometry."""
        h, w = image.shape[:2]
        wm_w = max(40, int(w * self.width_frac))
        wm_h = max(16, int(w * self.height_frac))
        margin_l = max(2, int(w * self.margin_left_frac))
        margin_b = max(2, int(w * self.margin_bottom_frac))
        x = min(margin_l, max(0, w - wm_w))
        y = max(0, h - margin_b - wm_h)
        wm_w = min(wm_w, w - x)
        wm_h = min(wm_h, h - y)
        return SamsungLocation(x=x, y=y, w=wm_w, h=wm_h, is_fallback=True)

    # ── Mask ──────────────────────────────────────────────────────────

    def extract_mask(self, image: NDArray[Any], loc: SamsungLocation) -> NDArray[Any]:
        """Build a full-image uint8 mask (255 = watermark glyph) for the box.

        Polarity-aware: the mark is a light, low-saturation gray rendered brighter
        than the local background (white top-hat), so a white-paper document is left
        untouched (nothing brighter than its surroundings is masked there).
        """
        h, w = image.shape[:2]
        x, y, bw, bh = loc.bbox
        # A degenerate ROI (a sliver from an extremely wide/short image) cannot hold
        # the mark and would feed cv2's GaussianBlur/morphology a ~1-px-tall array,
        # which can fault the native code on some platforms (mirrors the Doubao/Jimeng
        # guard). Skip the cv2 pipeline and return an empty mask there.
        if bh < 16 or bw < 16:
            return np.zeros((h, w), np.uint8)
        # Normalize the ROI to 3-channel BGR: a 2D grayscale or 4-channel BGRA input
        # would otherwise break the axis=2 channel reductions below.
        roi = image[y : y + bh, x : x + bw]
        if roi.ndim == 2:
            roi = cv2.cvtColor(roi, cv2.COLOR_GRAY2BGR)
        elif roi.shape[2] == 4:
            roi = cv2.cvtColor(roi, cv2.COLOR_BGRA2BGR)
        roi = roi.astype(np.float32)

        luma = roi.mean(axis=2)
        sat = roi.max(axis=2) - roi.min(axis=2)
        grayish = sat < MAX_SATURATION

        sigma = max(4.0, bh * 0.4)
        local_bg = cv2.GaussianBlur(luma, (0, 0), sigmaX=sigma, sigmaY=sigma)
        tophat = luma - local_bg

        cand = grayish & (tophat > TOPHAT_DELTA) & (luma > LOGO_MIN_LUMA)
        glyph = cand.astype(np.uint8) * 255
        glyph = cv2.morphologyEx(glyph, cv2.MORPH_CLOSE, np.ones((5, 5), np.uint8))
        glyph = cv2.morphologyEx(glyph, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))

        mask = np.zeros((h, w), np.uint8)
        mask[y : y + bh, x : x + bw] = glyph
        return mask

    # ── Detect ────────────────────────────────────────────────────────

    def detect(self, image: NDArray[Any]) -> SamsungDetection:
        """Detect the visible Samsung mark by matching the alpha-template glyph
        silhouette against the corner candidate (TM_CCOEFF_NORMED)."""
        det = SamsungDetection()
        if image is None or image.size == 0:
            return det
        loc = self.locate(image)
        mask = self.extract_mask(image, loc)
        x, y, bw, bh = loc.bbox
        box = mask[y : y + bh, x : x + bw]
        coverage = float((box > 0).sum()) / float(max(1, bw * bh))
        det.region = loc.bbox
        det.coverage = coverage
        if coverage >= DETECT_MIN_COVERAGE:
            score = _template_match_score(box, image.shape[1])
            det.confidence = score
            det.detected = score >= DETECT_NCC_THRESHOLD
            logger.debug("Samsung detect: coverage=%.3f ncc=%.2f detected=%s", coverage, score, det.detected)
        return det

    # ── Reverse-alpha (recovery + residual inpaint) ───────────────────

    def reverse_alpha_available(self, image: NDArray[Any]) -> bool:
        """True if the bundled alpha map is loadable (NCC alignment places it at any
        resolution; the caller still gates on ``detect``)."""
        return image is not None and image.size > 0 and _alpha_template() is not None

    def _fixed_alpha_map(self, image: NDArray[Any]) -> tuple[NDArray[Any], tuple[int, int, int, int]] | None:
        """Place the template by fixed width-relative geometry (bottom-left)."""
        at = _alpha_template()
        if at is None:
            return None
        h, w = image.shape[:2]
        gw = min(w, max(1, int(_ALPHA_WIDTH_FRAC * w)))
        gh = min(h, max(1, int(_ALPHA_HEIGHT_FRAC * w)))
        ax = min(max(0, int(_ALPHA_MARGIN_LEFT_FRAC * w)), max(0, w - gw))
        ay = max(0, h - int(_ALPHA_MARGIN_BOTTOM_FRAC * w) - gh)
        amap = np.zeros((h, w), np.float32)
        amap[ay : ay + gh, ax : ax + gw] = cv2.resize(at, (gw, gh), interpolation=cv2.INTER_LINEAR)
        return amap, (ax, ay, gw, gh)

    def _aligned_alpha_map(self, image: NDArray[Any]) -> tuple[NDArray[Any], tuple[int, int, int, int]] | None:
        """Register the captured template to the actual mark via a TM_CCOEFF_NORMED
        scale + position search -- so the single capture works off the captured
        width. Returns ``(alpha_map, glyph_bbox)`` or None."""
        at = _alpha_template()
        sil = _glyph_silhouette()
        if at is None or sil is None:
            return None
        h, w = image.shape[:2]
        loc = self.locate(image)
        bx, by, bw, bh = loc.bbox
        box_mask = self.extract_mask(image, loc)[by : by + bh, bx : bx + bw]
        expected = _ALPHA_WIDTH_FRAC * w
        best: tuple[float, int, int, int, int] | None = None
        for scale in np.linspace(*_ALPHA_ALIGN_SEARCH):
            gw, gh = int(expected * scale), int(_ALPHA_HEIGHT_FRAC * w * scale)
            if gw < 16 or gh < 4 or gw >= bw or gh >= bh:
                continue
            t = cv2.resize(sil, (gw, gh), interpolation=cv2.INTER_NEAREST)
            _, score, _, top_left = cv2.minMaxLoc(cv2.matchTemplate(box_mask, t, cv2.TM_CCOEFF_NORMED))
            if best is None or score > best[0]:
                best = (score, gw, gh, top_left[0], top_left[1])
        if best is None:
            return None
        _, gw, gh, ox, oy = best
        ax, ay = bx + ox, by + oy
        amap = np.zeros((h, w), np.float32)
        amap[ay : ay + gh, ax : ax + gw] = cv2.resize(at, (gw, gh), interpolation=cv2.INTER_LINEAR)
        return amap, (ax, ay, gw, gh)

    def _apply_reverse_alpha(self, image: NDArray[Any], amap: NDArray[Any]) -> NDArray[Any]:
        """Invert the alpha blend with ``amap``: ``original = (wm - a*logo)/(1-a)``."""
        a3 = np.clip(amap, 0.0, 1.0)[:, :, None]
        logo = np.array(_ALPHA_LOGO_BGR, np.float32)
        return np.clip((image.astype(np.float32) - a3 * logo) / np.clip(1.0 - a3, 0.25, 1.0), 0, 255).astype(np.uint8)

    def remove_watermark_reverse_alpha(self, image: NDArray[Any], *, residual_inpaint: bool = True) -> NDArray[Any]:
        """Recover the original pixels by inverting the alpha blend, then clear the
        residual outline with a thin inpaint over the glyph footprint.

        Placement: fixed geometry AND the NCC-aligned placement are always tried and
        the one leaving the least residual mark (lowest re-``detect`` confidence) is
        kept -- the flat capture is far off the real-photo width and the mark
        re-rasterizes per image, so fixed geometry alone is not reliable. A single
        capture cannot pixel-cancel the upscaled mark, so a deliberately THIN residual
        inpaint (``_RESIDUAL_*``) follows. Call only when
        :meth:`reverse_alpha_available` and the mark is detected.
        """
        # Normalize to 3-channel BGR so a 2D grayscale or 4-channel BGRA input does
        # not break the reverse-alpha math (which assumes a 3-channel logo).
        if image.ndim == 2:
            image = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
        elif image.shape[2] == 4:
            image = cv2.cvtColor(image, cv2.COLOR_BGRA2BGR)
        # An image too small to hold the mark would make the geometry boxes degenerate
        # and feed cv2.resize a ~1-px-tall target; skip cv2 entirely (mirrors Jimeng).
        h, w = image.shape[:2]
        if h < 32 or w < 64:
            return image.copy()
        maps = [c for c in (self._fixed_alpha_map(image), self._aligned_alpha_map(image)) if c is not None]
        if not maps:
            return image.copy()
        best_out: NDArray[Any] | None = None
        best_amap: NDArray[Any] | None = None
        best_residual = float("inf")
        for amap, _region in maps:
            out = self._apply_reverse_alpha(image, amap)
            residual = self.detect(out).confidence
            if residual < best_residual:
                best_residual, best_out, best_amap = residual, out, amap
        if best_out is None or best_amap is None:  # pragma: no cover - maps is non-empty
            return image.copy()
        if residual_inpaint:
            kernel = np.ones((_RESIDUAL_DILATE, _RESIDUAL_DILATE), np.uint8)
            rm = cv2.dilate((best_amap > _RESIDUAL_ALPHA_FLOOR).astype(np.uint8) * 255, kernel)
            best_out = cv2.inpaint(best_out, rm, _RESIDUAL_INPAINT_RADIUS, cv2.INPAINT_NS)
        return best_out


def load_image_bgr(path: str | Path) -> NDArray[Any]:
    """Read an image as BGR ndarray (helper for scripts/tests)."""
    from remove_ai_watermarks import image_io

    img = image_io.imread(path, cv2.IMREAD_COLOR)
    if img is None:
        raise FileNotFoundError(f"Failed to read image: {path}")
    return img
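Reviewer note: the reverse-alpha step in `_apply_reverse_alpha` can be sketched on a single channel value. This is a minimal illustration, not the engine's code: `blend` and `unblend` are hypothetical helper names, and only the white logo value (255) and the 0.25 denominator floor are taken from the engine above.

```python
def blend(background: float, alpha: float, logo: float = 255.0) -> float:
    """Forward model: composite the logo over the background with opacity alpha."""
    return alpha * logo + (1.0 - alpha) * background


def unblend(watermarked: float, alpha: float, logo: float = 255.0, floor: float = 0.25) -> float:
    """Invert the blend: original = (wm - a*logo) / (1 - a), with a floored denominator."""
    alpha = min(max(alpha, 0.0), 1.0)
    # Flooring (1 - alpha) keeps the division stable where the mark is nearly opaque.
    denominator = min(max(1.0 - alpha, floor), 1.0)
    recovered = (watermarked - alpha * logo) / denominator
    return min(max(recovered, 0.0), 255.0)


wm = blend(100.0, 0.3)       # a faint white mark over a mid-gray pixel
restored = unblend(wm, 0.3)  # approximately the original 100.0
```

For alpha above 0.75 the floored denominator under-corrects, so the recovered value no longer equals the original; in this sketch, a pixel blended at alpha 0.9 over 100.0 comes back as roughly 40.0.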
@@ -23,6 +23,7 @@ Entries:
- ``gemini`` -- Google Gemini / Nano Banana sparkle, bottom-right.
- ``doubao`` -- ByteDance Doubao "豆包AI生成" text strip, bottom-right.
- ``jimeng`` -- ByteDance Jimeng / Dreamina "★ 即梦AI" wordmark, bottom-right.
- ``samsung`` -- Samsung Galaxy AI "Contenuti generati dall'AI" strip, bottom-left.
"""

from __future__ import annotations
@@ -116,6 +117,10 @@ def _engine(key: str) -> Any:
            from remove_ai_watermarks.jimeng_engine import JimengEngine

            _engines[key] = JimengEngine()
        elif key == "samsung":
            from remove_ai_watermarks.samsung_engine import SamsungEngine

            _engines[key] = SamsungEngine()
        else:  # pragma: no cover - guarded by the registry keys
            raise KeyError(key)
    return _engines[key]
@@ -190,6 +195,24 @@ def _jimeng_remove(
    return image.copy(), None


def _samsung_detect(image: NDArray[Any]) -> MarkDetection:
    d = _engine("samsung").detect(image)
    return MarkDetection("samsung", "Samsung Galaxy AI text", "bottom-left", d.detected, d.confidence, d.region)


def _samsung_remove(
    image: NDArray[Any], _inpaint_method: InpaintMethod, _inpaint: bool, _strength: float, force: bool
) -> tuple[NDArray[Any], Region | None]:
    # Reverse-alpha (with an always-on thin residual inpaint over the glyph
    # footprint, see the engine): apply when the mark is present and the alpha asset
    # loads. Skipped otherwise (no hallucination on a clean corner).
    engine = _engine("samsung")
    det = engine.detect(image)
    if (det.detected or force) and engine.reverse_alpha_available(image):
        return engine.remove_watermark_reverse_alpha(image), (det.region if det.detected else None)
    return image.copy(), None


_REGISTRY: tuple[KnownMark, ...] = (
    KnownMark("gemini", "Google Gemini sparkle", "bottom-right", True, "reverse-alpha", _gemini_detect, _gemini_remove),
    KnownMark(
@@ -198,6 +221,9 @@ _REGISTRY: tuple[KnownMark, ...] = (
    KnownMark(
        "jimeng", "Jimeng 即梦AI wordmark", "bottom-right", True, "reverse-alpha", _jimeng_detect, _jimeng_remove
    ),
    KnownMark(
        "samsung", "Samsung Galaxy AI text", "bottom-left", True, "reverse-alpha", _samsung_detect, _samsung_remove
    ),
)


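Reviewer note: the gating in `_samsung_remove` can be read as a small decision table. The sketch below uses a stand-in `FakeEngine` and a `gated_remove` helper, both hypothetical and introduced only for illustration; the one thing taken from the registry code above is the condition `(detected or force) and available` and the rule that a region is reported only when the mark was actually detected.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Region = Tuple[int, int, int, int]


@dataclass
class FakeEngine:
    """Stand-in engine with a fixed detection answer and asset availability."""

    detected: bool
    available: bool
    region: Region = (10, 900, 340, 40)

    def remove(self, image: list) -> list:
        # Placeholder for the real removal; tags the output so the taken path is visible.
        return image + ["removed"]


def gated_remove(engine: FakeEngine, image: list, force: bool) -> Tuple[list, Optional[Region]]:
    """Run removal only if the mark is detected (or forced) and the asset is available."""
    if (engine.detected or force) and engine.available:
        return engine.remove(image), (engine.region if engine.detected else None)
    # Clean corner or missing asset: return an untouched copy and no region.
    return list(image), None
```

Under these assumptions a forced run on an undetected image still removes but reports no region, while a missing asset always short-circuits to an untouched copy.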
@@ -0,0 +1,192 @@
"""Tests for the Samsung Galaxy AI visible-watermark engine.

No real Samsung sample is committed (the real-photo captures are gitignored, repo
is public), so detection/removal is exercised against a watermark synthesized from
the bundled alpha asset itself -- self-consistent and download-free. The mark is
anchored bottom-LEFT (unlike the bottom-right Doubao/Jimeng marks).
"""

from __future__ import annotations

import cv2
import numpy as np
import pytest

from remove_ai_watermarks.samsung_engine import (
    _ALPHA_HEIGHT_FRAC,
    _ALPHA_LOGO_BGR,
    _ALPHA_MARGIN_BOTTOM_FRAC,
    _ALPHA_MARGIN_LEFT_FRAC,
    _ALPHA_NATIVE_WIDTH,
    _ALPHA_WIDTH_FRAC,
    DETECT_NCC_THRESHOLD,
    SamsungEngine,
    _alpha_template,
    _glyph_silhouette,
    _template_match_score,
)


def _compose(w: int, h: int, bg: float = 100.0):
    """Composite the real alpha (scaled to width ``w``) onto a flat bg by the
    engine's fixed bottom-left geometry. Returns ``(watermarked_uint8, mark_bool_mask)``."""
    img = np.full((h, w, 3), bg, np.float32)
    at = _alpha_template()
    gw, gh = int(_ALPHA_WIDTH_FRAC * w), int(_ALPHA_HEIGHT_FRAC * w)
    ax = int(_ALPHA_MARGIN_LEFT_FRAC * w)
    ay = h - int(_ALPHA_MARGIN_BOTTOM_FRAC * w) - gh
    amap = np.zeros((h, w), np.float32)
    amap[ay : ay + gh, ax : ax + gw] = cv2.resize(at, (gw, gh))
    a3 = amap[:, :, None]
    wm = (a3 * np.array(_ALPHA_LOGO_BGR, np.float32) + (1 - a3) * img).clip(0, 255).astype(np.uint8)
    return wm, amap > 0.15


class TestLocate:
    def test_box_anchored_bottom_left(self):
        eng = SamsungEngine()
        img = np.zeros((1448, 1086, 3), np.uint8)
        loc = eng.locate(img)
        assert loc.x < int(1086 * 0.03)  # hugs the left edge
        assert 1448 - (loc.y + loc.h) < int(1086 * 0.03)  # hugs the bottom

    def test_box_scales_with_width(self):
        eng = SamsungEngine()
        small = eng.locate(np.zeros((1024, 1024, 3), np.uint8))
        large = eng.locate(np.zeros((2048, 2048, 3), np.uint8))
        assert large.w == pytest.approx(small.w * 2, rel=0.1)


class TestDetect:
    def test_clean_gradient_not_detected(self):
        eng = SamsungEngine()
        ramp = np.tile(np.linspace(0, 255, 1086, dtype=np.uint8), (1086, 1))
        img = cv2.cvtColor(ramp, cv2.COLOR_GRAY2BGR)
        assert not eng.detect(img).detected

    def test_solid_blob_corner_not_detected(self):
        """A bright blob is not the glyph shape -> low correlation, not detected."""
        eng = SamsungEngine()
        img = np.zeros((1086, 1086, 3), np.uint8)
        x, y, bw, bh = eng.locate(img).bbox
        img[y + bh // 4 : y + bh * 3 // 4, x : x + bw // 2] = 200
        assert not eng.detect(img).detected

    def test_silhouette_loads(self):
        sil = _glyph_silhouette()
        assert sil is not None
        assert set(np.unique(sil)).issubset({0, 255})

    def test_match_score_shape_sensitive(self):
        """The glyph silhouette correlates with itself, not with a filled block."""
        sil = _glyph_silhouette()
        h, w = sil.shape
        box = np.zeros((h + 8, int(w / _ALPHA_WIDTH_FRAC * 0.2) + w), np.uint8)
        box[4 : 4 + h, 4 : 4 + w] = sil
        assert _template_match_score(box, _ALPHA_NATIVE_WIDTH) >= DETECT_NCC_THRESHOLD
        solid = np.full_like(box, 255)
        assert _template_match_score(solid, _ALPHA_NATIVE_WIDTH) < DETECT_NCC_THRESHOLD

    def test_synthetic_mark_detected(self):
        """A watermark composed from the real alpha is detected at its threshold."""
        eng = SamsungEngine()
        wm, _mark = _compose(_ALPHA_NATIVE_WIDTH, int(_ALPHA_NATIVE_WIDTH * 1.33))
        det = eng.detect(wm)
        assert det.detected
        assert det.confidence >= DETECT_NCC_THRESHOLD


class TestReverseAlpha:
    def test_alpha_asset_loads(self):
        at = _alpha_template()
        assert at is not None
        assert at.dtype.kind == "f"
        assert float(at.min()) >= 0.0
        assert float(at.max()) <= 1.0

    def test_logo_is_white(self):
        assert _ALPHA_LOGO_BGR == (255.0, 255.0, 255.0)

    def test_available_whenever_asset_present(self):
        eng = SamsungEngine()
        assert eng.reverse_alpha_available(np.zeros((1086, 1086, 3), np.uint8))
        assert eng.reverse_alpha_available(np.zeros((4054, 2958, 3), np.uint8))
        assert not eng.reverse_alpha_available(np.zeros((0, 0, 3), np.uint8))

    def test_removes_synthetic_mark(self):
        """Reverse-alpha + residual inpaint clears the composed mark (re-detect no
        longer fires)."""
        eng = SamsungEngine()
        wm, _mark = _compose(_ALPHA_NATIVE_WIDTH, int(_ALPHA_NATIVE_WIDTH * 1.33))
        assert eng.detect(wm).detected
        out = eng.remove_watermark_reverse_alpha(wm)
        assert not eng.detect(out).detected

    @pytest.mark.parametrize(
        ("w", "h", "max_err"),
        [
            (_ALPHA_NATIVE_WIDTH, int(_ALPHA_NATIVE_WIDTH * 1.33), 5.0),  # captured width
            (2958, 4054, 10.0),  # real-photo width (~2.7x native) -> NCC alignment generalizes
        ],
    )
    def test_recovers_flat_background(self, w, h, max_err):
        eng = SamsungEngine()
        wm, mark = _compose(w, h)
        assert float(np.abs(wm.astype(np.float32)[mark] - 100.0).mean()) > 15  # mark visible
        out = eng.remove_watermark_reverse_alpha(wm).astype(np.float32)
        assert float(np.abs(out[mark] - 100.0).mean()) < max_err

    def test_far_region_untouched(self):
        """The residual inpaint only touches the bottom-left footprint; the
        opposite (top-right) corner stays pixel-identical."""
        eng = SamsungEngine()
        wm, _mark = _compose(_ALPHA_NATIVE_WIDTH, int(_ALPHA_NATIVE_WIDTH * 1.33))
        out = eng.remove_watermark_reverse_alpha(wm)
        h, w = wm.shape[:2]
        assert np.array_equal(wm[: h // 2, w // 2 :], out[: h // 2, w // 2 :])

    def test_recovers_shifted_mark_on_texture(self):
        """A real mark is re-rasterized a few px off its fixed slot, so removal must
        NCC-align to it (a too-tight locate box would let a corner-ward shift escape
        the search and leave a readable outline). Composes the real alpha SHIFTED on
        a known texture and asserts the texture is recovered."""
        eng = SamsungEngine()
        w, h = _ALPHA_NATIVE_WIDTH, int(_ALPHA_NATIVE_WIDTH * 1.33)
        at = _alpha_template()
        gw, gh = int(_ALPHA_WIDTH_FRAC * w), int(_ALPHA_HEIGHT_FRAC * w)
        ax = max(0, int(_ALPHA_MARGIN_LEFT_FRAC * w) + 9)  # shift right of the fixed slot
        ay = h - int(_ALPHA_MARGIN_BOTTOM_FRAC * w) - gh - 7  # shift up
        amap = np.zeros((h, w), np.float32)
        amap[ay : ay + gh, ax : ax + gw] = cv2.resize(at, (gw, gh))
        a3 = amap[:, :, None]
        yy, xx = np.mgrid[0:h, 0:w].astype(np.float32)
        base = 120 + 40 * np.sin(xx / 90.0) + 30 * np.cos(yy / 70.0)
        bg = np.clip(np.stack([base, base * 0.95, base * 1.05], axis=-1), 0, 255)
        wm = (a3 * np.array(_ALPHA_LOGO_BGR, np.float32) + (1 - a3) * bg).clip(0, 255).astype(np.uint8)
        mark = amap > 0.15
        assert float(np.abs(wm.astype(np.float32)[mark] - bg[mark]).mean()) > 20  # mark clearly visible
        out = eng.remove_watermark_reverse_alpha(wm).astype(np.float32)
        assert float(np.abs(out[mark] - bg[mark]).mean()) < 10.0  # texture recovered, no outline


class TestDegenerateAndChannelInputs:
    """Removal must not crash on degenerate sizes or non-3-channel inputs."""

    @pytest.mark.parametrize(("w", "h"), [(2048, 1), (1, 2048), (2048, 8)])
    def test_wide_short_does_not_raise(self, w, h):
        eng = SamsungEngine()
        img = np.zeros((h, w, 3), np.uint8)
        out = eng.remove_watermark_reverse_alpha(img)
        assert out.shape == img.shape

    def test_grayscale_2d_does_not_raise(self):
        eng = SamsungEngine()
        gray = np.zeros((1448, 1086), np.uint8)
        out = eng.remove_watermark_reverse_alpha(gray)
        assert out.shape == (1448, 1086, 3)

    def test_bgra_4channel_does_not_raise(self):
        eng = SamsungEngine()
        bgra = np.zeros((1448, 1086, 4), np.uint8)
        out = eng.remove_watermark_reverse_alpha(bgra)
        assert out.shape == (1448, 1086, 3)
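Reviewer note: `test_match_score_shape_sensitive` relies on a zero-mean normalized correlation responding to shape rather than brightness. The sketch below computes that score at a single position in plain Python; `zero_mean_ncc` is a hypothetical helper, and returning 0.0 for a constant patch is this sketch's own convention, not a claim about how OpenCV treats that case.

```python
import math
from typing import Sequence


def zero_mean_ncc(patch: Sequence[float], template: Sequence[float]) -> float:
    """Zero-mean normalized correlation of two equally sized, flattened arrays."""
    if len(patch) != len(template) or not patch:
        raise ValueError("patch and template must be non-empty and equally sized")
    mean_p = sum(patch) / len(patch)
    mean_t = sum(template) / len(template)
    dp = [p - mean_p for p in patch]
    dt = [t - mean_t for t in template]
    denom = math.sqrt(sum(d * d for d in dp) * sum(d * d for d in dt))
    if denom == 0.0:
        # A constant patch has no shape to correlate with (sketch convention).
        return 0.0
    return sum(a * b for a, b in zip(dp, dt)) / denom


glyph = [0, 255, 255, 0, 0, 255, 0, 0]
same_shape_dimmer = [0, 100, 100, 0, 0, 100, 0, 0]
solid = [255] * 8
```

Here `glyph` against itself and against `same_shape_dimmer` both score about 1.0, while `solid` scores 0.0, which mirrors why a filled block fails the shape test even though it is bright.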
@@ -14,7 +14,7 @@ DOUBAO_SAMPLE = Path(__file__).resolve().parents[1] / "data" / "samples" / "doub

class TestCatalog:
    def test_keys(self):
-       assert reg.mark_keys() == ["gemini", "doubao", "jimeng"]
+       assert reg.mark_keys() == ["gemini", "doubao", "jimeng", "samsung"]

    def test_all_in_auto(self):
        assert all(m.in_auto for m in reg.known_marks())
@@ -28,6 +28,7 @@ class TestCatalog:
        assert by_key["gemini"].location == "bottom-right"
        assert by_key["doubao"].location == "bottom-right"
        assert by_key["jimeng"].location == "bottom-right"
        assert by_key["samsung"].location == "bottom-left"

    def test_get_mark_unknown_raises(self):
        with pytest.raises(KeyError):
@@ -38,7 +39,7 @@ class TestScan:
    def test_detect_marks_scans_all(self):
        img = np.zeros((256, 256, 3), np.uint8)
        keys = {d.key for d in reg.detect_marks(img)}
-       assert keys == {"gemini", "doubao", "jimeng"}
+       assert keys == {"gemini", "doubao", "jimeng", "samsung"}

    def test_blank_image_no_auto_mark(self):
        assert reg.best_auto_mark(np.zeros((256, 256, 3), np.uint8)) is None