mirror of
https://github.com/wiltodelta/remove-ai-watermarks.git
synced 2026-06-10 04:43:54 +02:00
refactor(face-restore): wipe GFPGAN path, --restore-faces is PhotoMaker-only
The GFPGAN `restore` extra and its `face_restore.py` module are gone. They were oracle-confirmed to re-introduce SynthID by blending watermarked original face pixels at fidelity weight 0.5 (clean A/B: gemini_3 controlnet 0.20 detected WITH GFPGAN, clean WITHOUT). Keeping them as the default restore method was a footgun for the removal pipeline. PhotoMaker-V2 (added in the previous commit) is the single shipped restore path now -- identity-as-embedding, SynthID-safe by construction.

Removed:
- src/remove_ai_watermarks/face_restore.py + tests/test_face_restore.py
- pyproject.toml `restore` extra (gfpgan/facexlib/basicsr + scipy/numba pins)
- pyproject.toml `[tool.uv.extra-build-dependencies] basicsr = [...]` build pin
- CLI: `--restore-faces-method` and `--restore-faces-weight` (no method choice to make, no GFPGAN weight knob to expose)
- InvisibleEngine._restore_faces method (only _restore_faces_photomaker remains)
- All restore-faces-method / restore-faces-weight threading through cmd_* signatures and _process_batch_image

Kept:
- `--restore-faces / --no-restore-faces`: now binds to PhotoMaker-V2.
- All adopted oracle findings about GFPGAN re-introducing SynthID (kept in the research docs as historical context that explains why the path was removed).

Docs updated: CLAUDE.md (restore extras bullet collapsed to photomaker, removed face_restore Key-modules bullet, several inline GFPGAN refs scrubbed), README.md (face-identity callout + install section now point to the photomaker extra), docs/synthid.md 5.5 (net recipe), docs/controlnet-removal-pipeline-research.md (recommendations).

ruff + strict pyright (src/) clean; 578 tests pass (the 9 GFPGAN tests are gone, the 9 PhotoMaker tests stay green).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -23,7 +23,7 @@ If this tool saves you time, consider [sponsoring its development](https://githu
|
||||
- **AI metadata stripping** — EXIF, PNG text chunks, C2PA provenance manifests (PNG / JPEG / AVIF / HEIF / JPEG-XL, **MP4 / MOV / M4V / M4A** at the container level, and **WebM / MP3 / WAV / FLAC / OGG** losslessly via ffmpeg), XMP DigitalSourceType
|
||||
- **"Made with AI" label removal** — removes the AI-disclosure metadata that platforms read to apply automatic labels (useful for clearing a false-positive label from a human-edited photograph)
|
||||
- **Analog Humanizer** — optional film grain and chromatic aberration post-processing
|
||||
- **Text and face preservation (experimental)** — optional `--pipeline controlnet` adds a canny ControlNet that keeps text and face structure sharp through the removal pass (without copying original pixels, so SynthID is still removed). Canny preserves face *structure*, not *identity* (the regenerated face drifts in likeness); identity is preserved by the `--restore-faces` GFPGAN post-pass (opt-in). Both are experimental and off by default.
|
||||
- **Text and face preservation (experimental)** — optional `--pipeline controlnet` adds a canny ControlNet that keeps text and face structure sharp through the removal pass (without copying original pixels, so SynthID is still removed). Canny preserves face *structure*, not *identity* (the regenerated face drifts in likeness); identity is preserved by the `--restore-faces` PhotoMaker-V2 post-pass (opt-in, SynthID-safe). Both are experimental and off by default.
|
||||
- **Batch processing** — process entire directories
|
||||
- **Detection** — three-stage NCC watermark detection with confidence scoring
|
||||
- **Provenance detection (`identify`)** — aggregate C2PA issuer, the C2PA soft-binding forensic-watermark vendor (Adobe TrustMark, Digimarc, Imatag, ...), IPTC "Made with AI" plus the IPTC 2025.1 `AISystemUsed` field, embedded SD/ComfyUI params, EXIF/XMP generator tags, the xAI/Grok EXIF signature, the China TC260 AIGC label (XMP, PNG chunk, or EXIF), the HuggingFace `hf-job-id` job marker, the SynthID metadata proxy, the visible marks (Gemini sparkle plus the Doubao "豆包AI生成" / Jimeng "即梦AI" / Samsung Galaxy AI "Contenuti generati dall'AI" text marks), the open SD/SDXL/FLUX invisible watermark, and (with the `trustmark` extra) the open Adobe TrustMark watermark into one origin-platform + watermark-inventory verdict (`--json` for machine output)
|
||||
@@ -128,7 +128,7 @@ image → encode to latent space (VAE) at native resolution
|
||||
>
|
||||
> **`--pipeline controlnet` preserves text and face structure (experimental, opt-in).** It runs the same SDXL img2img scrub but adds a canny ControlNet that conditions the regeneration on the image's edge map, so text and structure stay sharp at the strengths that remove SynthID. The watermark removal still comes from the img2img regeneration (`--strength`); the ControlNet only preserves structure — no original pixels are copied or frozen, so SynthID does not survive. `--controlnet-scale` tunes the preservation strength (higher = closer to the original structure). Runs fp32 on mps/cpu (fp16 only on cuda/xpu, where the fp16-fixed SDXL VAE is loaded automatically).
|
||||
>
|
||||
> **`--restore-faces` preserves face identity (GFPGAN, experimental, opt-in).** Canny preserves where a face is, but not who it is — the regenerated face drifts in likeness. The `--restore-faces` post-pass (experimental, off by default; needs the `restore` extra) fixes this: after the removal pass it runs GFPGAN on the original faces and composites the restored face regions into the cleaned image. GFPGAN re-synthesizes each face from a StyleGAN2 prior, so those pixels are GAN-generated (not copied) — the watermark is still scrubbed in the face regions while identity is held (oracle-confirmed clean). It auto-skips when no face is detected or the extra is absent. Tune fidelity with `--restore-faces-weight` (default `0.5`; lower = more regeneration / cleaner scrub, higher = closer to the input). Commercial-safe (GFPGAN is Apache-2.0, its RetinaFace detector MIT); the CodeFormer alternative is non-commercial and is not shipped. (An IP-Adapter FaceID approach was tried earlier and removed: it needs high denoise strength and corrupts faces at the low strength used for removal.)
|
||||
> **`--restore-faces` preserves face identity (PhotoMaker-V2, experimental, opt-in).** Canny preserves where a face is, but not who it is — the regenerated face drifts in likeness. The `--restore-faces` post-pass (experimental, off by default; needs the `photomaker` extra) fixes this in a SynthID-safe way: identity comes from an OpenCLIP-ViT-H/14 embedding of the original face (validated 2026-06-04: cosine 0.9977 invariance to SynthID-magnitude pixel noise, an order of magnitude less drift than JPEG90 which SynthID survives), and a fresh face is regenerated from that embedding — the pixels are diffusion-fresh, so the watermark is not transported. Commercial-safe end-to-end: PhotoMaker-V2 weights Apache-2.0, OpenCLIP-ViT-H/14 MIT, no InsightFace. The earlier GFPGAN-based `restore` extra was removed 2026-06-04 because it ran on the watermarked original and was oracle-confirmed to re-introduce SynthID; CodeFormer stays non-commercial and is not shipped. See `docs/synthid-robust-identity-research.md`.
|
||||
|
||||
SDXL is the default since May 2026: empirically defeats SynthID v2 on Gemini 3 Pro outputs, where the older SD-1.5 pipeline at 768 px did not. The SD-1.5 path was removed once it was verified not to handle v2. Note the scope: this defeats the SynthID *verifier*, which is not the same as being forensically indistinguishable from a real photo. Recent work ([arXiv:2605.09203](https://arxiv.org/abs/2605.09203)) shows watermark-removal pipelines leave detectable traces, so a separate "this image was processed" classifier can still flag the output.
|
||||
|
||||
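The invariance figure quoted in the callout (cosine 0.9977) is a cosine similarity between two identity embeddings. A minimal pure-Python sketch of that comparison follows. The vectors are made-up placeholders standing in for OpenCLIP outputs, and `cosine_similarity` is a hypothetical helper, not a function from this repository.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


# Placeholder embeddings (NOT real model outputs): the second is the first
# with a small perturbation, mimicking pixel-level noise on the input face.
clean_embedding = [0.12, -0.40, 0.33, 0.81, -0.05]
noisy_embedding = [0.13, -0.39, 0.33, 0.80, -0.06]

similarity = cosine_similarity(clean_embedding, noisy_embedding)
drift = 1.0 - similarity  # the "cosine change" that pairs with a similarity value
```

A similarity close to 1 (equivalently, a drift close to 0) is what the callout relies on: if pixel-level noise barely moves the embedding, the embedding is unlikely to carry the pixel-level watermark along with the identity.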
@@ -136,7 +136,7 @@ SDXL is the default since May 2026: empirically defeats SynthID v2 on Gemini 3 P
|
||||
|
||||
> **Technical deep-dive:** see [`docs/synthid.md`](docs/synthid.md) for a primary-source-cited breakdown of how SynthID works mechanically (post-hoc encoder/decoder, 136-bit payload, pixel-space embedding), what it empirically survives (JPEG, crop, resize: ~99.98% TPR at 0.1% FPR from arXiv:2510.09263), what removes it, and the forensic-stealth tradeoff (all known removal attacks are detectable at >98% TPR@1%FPR per arXiv:2605.09203).
|
||||
|
||||
**Text and face preservation** (experimental, opt-in `--pipeline controlnet`): adds a canny ControlNet so text and face *structure* stay sharp through the removal pass, without copying or freezing any original pixels (so SynthID is still removed). Tune the preservation strength with `--controlnet-scale`. Canny preserves structure but not face *identity* (identity is preserved by the `--restore-faces` GFPGAN post-pass, experimental and off by default — see the callout above). Both features are experimental.
|
||||
**Text and face preservation** (experimental, opt-in `--pipeline controlnet`): adds a canny ControlNet so text and face *structure* stay sharp through the removal pass, without copying or freezing any original pixels (so SynthID is still removed). Tune the preservation strength with `--controlnet-scale`. Canny preserves structure but not face *identity* (identity is preserved by the `--restore-faces` PhotoMaker-V2 post-pass, experimental and off by default — see the callout above). Both features are experimental.
|
||||
|
||||
**Analog Humanizer**: optional film grain and chromatic aberration injection that mimics a photo of a screen, raising the bar for AI-generated image classifiers. (It frustrates generic classifiers but does not guarantee forensic invisibility — see the [arXiv:2605.09203](https://arxiv.org/abs/2605.09203) note above.)
|
||||
|
||||
@@ -215,12 +215,13 @@ After installation the `remove-ai-watermarks` command is available system-wide.
 > ```
 >
 > To preserve face identity after invisible removal (the `--restore-faces`
-> GFPGAN post-pass, experimental and opt-in), install the `restore` extra. The GFPGANv1.4
-> and RetinaFace weights download on first use. It needs Python < 3.13 (basicsr
-> does not build on 3.13):
+> PhotoMaker-V2 post-pass, experimental and opt-in, SynthID-safe), install the
+> `photomaker` extra. The PhotoMaker-V2 adapter and SDXL base weights download on
+> first use (~4 GB total). Commercial-safe end-to-end (Apache-2.0 + MIT, no
+> InsightFace):
 >
 > ```bash
-> pip install -e ".[restore]" # or: uv pip install -e ".[restore]"
+> pip install -e ".[photomaker]" # or: uv pip install -e ".[photomaker]"
 > ```
 >
 > For sharper upscaling of small inputs before diffusion (`--upscaler esrgan`,
@@ -124,9 +124,11 @@ Gemini app; the two payloads are vendor-specific and never cross-checked):
|
||||
- **Fix the seed in prod.** The non-determinism is purely `seed=None` (random); a fixed
|
||||
`--seed` makes every run reproduce the certified-clean result, so you ship a
|
||||
deterministic, re-certifiable config (and the seed sweep collapses to one config).
|
||||
- **Rework `--restore-faces` before any removal use:** run GFPGAN on the diffusion-CLEANED
|
||||
image (not the original), or drop the weight well below 0.5, or leave it off — then
|
||||
re-validate on the oracle.
|
||||
- **`--restore-faces` is SynthID-safe by construction now (PhotoMaker-V2, 2026-06-04).**
|
||||
The GFPGAN-on-original path that re-added SynthID was removed; the shipped restore
|
||||
carries identity in a SynthID-invariant OpenCLIP embedding and regenerates fresh
|
||||
pixels conditioned on it. Needs the `photomaker` extra. See
|
||||
`docs/synthid-robust-identity-research.md`.
|
||||
- **No local SynthID detector exists** → the service can't self-verify; bake in strength
|
||||
margin and periodic oracle spot-checks.
|
||||
- **Lesson:** visual-quality / face-identity recovery does NOT prove removal — only the
|
||||
|
||||
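The seed recommendation in this hunk rests on a general property of pseudo-random generators: an unset seed gives a different draw on each run, while a fixed seed reproduces the same draw. A minimal stdlib sketch, assuming `random.Random` as a stand-in for the pipeline's actual noise generator (which this excerpt does not show); `sample_noise` is a hypothetical name.

```python
import random
from typing import Optional


def sample_noise(seed: Optional[int], n: int = 4) -> list[float]:
    """Draw n Gaussian samples; seed=None seeds from OS entropy (non-reproducible)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


# A fixed seed reproduces the exact same draw on every call.
first_run = sample_noise(seed=1234)
second_run = sample_noise(seed=1234)

# A different seed gives a different draw.
other_seed_run = sample_noise(seed=4321)
```

With the seed pinned, a configuration that was certified once produces the same starting noise on every later run, which is what makes re-certification meaningful.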
+3
-2
@@ -568,8 +568,9 @@ table.
|
||||
**Net for raiw.cc:** (1) controlnet needs a higher, per-vendor strength than
|
||||
`default` -- CERTIFIED OpenAI 0.20 / Gemini 0.30 (above); add a controlnet-specific
|
||||
schedule to `resolve_strength`, do not reuse the default ladder; (2) the
|
||||
`--restore-faces` pass can re-add SynthID and must be reworked (restore on the
|
||||
cleaned image / lower weight / off) before it is safe in a removal pipeline; (3)
|
||||
`--restore-faces` pass is now SynthID-safe by construction (the GFPGAN-on-original
|
||||
path that re-added SynthID was removed 2026-06-04; the shipped restore is
|
||||
PhotoMaker-V2, identity-as-embedding, see `synthid-robust-identity-research.md`); (3)
|
||||
removal near threshold is seed-non-deterministic -> FIX the prod seed (kills the
|
||||
coin-flip; ship a deterministic certified config).
|
||||
|
||||
|
||||
+9
-34
@@ -76,42 +76,25 @@ lama = [
|
||||
"onnxruntime>=1.16.0",
|
||||
"huggingface-hub>=0.20.0",
|
||||
]
|
||||
# Optional GFPGAN face-restoration post-pass (commercial-safe Apache-2.0 GFPGAN +
|
||||
# MIT RetinaFace). Re-synthesizes each face from a StyleGAN2 prior after the
|
||||
# diffusion removal pass, so it restores identity while still scrubbing the pixel
|
||||
# watermark. The GFPGANv1.4 weights + RetinaFace detector download on first use;
|
||||
# they are never bundled. gfpgan/basicsr/facexlib are an OLD ecosystem and must
|
||||
# stay on numpy < 2.0 to match the pinned gpu diffusion stack -- scipy is capped
|
||||
# < 1.18 (>= 1.18 uses np.long, gone in numpy 1.24-1.26) and numba < 0.60 to keep
|
||||
# the whole env on one numpy 1.26 resolution (same trap class as the removed
|
||||
# faceid/insightface extra). Kept OUT of `all` (heavy + model download).
|
||||
restore = [
|
||||
"gfpgan>=1.3.8",
|
||||
"facexlib>=0.3.0",
|
||||
"basicsr>=1.4.2",
|
||||
"scipy<1.18",
|
||||
"numba<0.60",
|
||||
]
|
||||
# Optional PhotoMaker-V2 face-identity restoration (commercial-safe end-to-end:
|
||||
# PhotoMaker-V2 weights Apache-2.0 + OpenCLIP-ViT-H/14 MIT, NO InsightFace). Unlike
|
||||
# the `restore` extra above (which runs GFPGAN on the watermarked ORIGINAL and was
|
||||
# oracle-confirmed to re-introduce SynthID), PhotoMaker carries identity in a
|
||||
# SEMANTIC EMBEDDING and generates fresh face pixels conditioned on it -- so the
|
||||
# pixel watermark is not transported. Empirically validated 2026-06-04: the OpenCLIP
|
||||
# embedding changes by cosine 0.002 under SynthID-magnitude pixel noise (an order of
|
||||
# magnitude less than JPEG90 drift, which SynthID survives). See
|
||||
# PhotoMaker-V2 weights Apache-2.0 + OpenCLIP-ViT-H/14 MIT, NO InsightFace). Carries
|
||||
# identity in a SEMANTIC EMBEDDING and generates fresh face pixels conditioned on it
|
||||
# -- so the pixel watermark is not transported. Empirically validated 2026-06-04: the
|
||||
# OpenCLIP embedding changes by cosine 0.002 under SynthID-magnitude pixel noise (an
|
||||
# order of magnitude less than JPEG90 drift, which SynthID survives). Replaces the
|
||||
# removed `restore` (GFPGAN) extra, which ran on the watermarked ORIGINAL and was
|
||||
# oracle-confirmed to re-introduce SynthID. See
|
||||
# docs/synthid-robust-identity-research.md and
|
||||
# src/remove_ai_watermarks/photomaker_restore.py. Weights (~3 GB SDXL + ~1 GB
|
||||
# PhotoMaker-V2 adapter) download on first use; never bundled. Kept OUT of `all`
|
||||
# (heavy + model download), same as `restore`/`esrgan`.
|
||||
# (heavy + model download), same as `esrgan`.
|
||||
photomaker = [
|
||||
"photomaker @ git+https://github.com/TencentARC/PhotoMaker.git",
|
||||
"huggingface-hub>=0.20.0",
|
||||
]
|
||||
# Optional pre-diffusion super-resolution for small inputs (Real-ESRGAN). Loaded via
|
||||
# spandrel (MIT) -- a pure model-loader with NO basicsr dependency (it pulls only
|
||||
# torch / torchvision / safetensors / numpy / einops), which sidesteps the
|
||||
# basicsr / torchvision.functional_tensor breakage that the `restore` extra fights.
|
||||
# torch / torchvision / safetensors / numpy / einops).
|
||||
# The Real-ESRGAN weights (BSD-3-Clause) download on first use and are cached; they
|
||||
# are never bundled. CPU works but is slow on large inputs -- it is meant for the
|
||||
# pre-diffusion upscale of SMALL inputs (and the GPU worker). Guarded by
|
||||
@@ -137,14 +120,6 @@ all = ["remove-ai-watermarks[gpu,detect,trustmark,lama,dev]"]
|
||||
[tool.uv]
|
||||
prerelease = "allow"
|
||||
|
||||
# basicsr 1.4.2 (pulled by the `restore` GFPGAN extra) ships sdist-only and its
|
||||
# setup.py get_version() reads basicsr/version.py in a way that newer setuptools
|
||||
# (>= 69) breaks with ``KeyError: '__version__'`` under isolated PEP 517 builds.
|
||||
# Pin an old setuptools as its build dependency so the sdist builds; this is
|
||||
# scoped to basicsr and does not affect the rest of the resolution.
|
||||
[tool.uv.extra-build-dependencies]
|
||||
basicsr = ["setuptools<69"]
|
||||
|
||||
# PyTorch Intel-GPU (XPU) wheel index. ``explicit = true`` keeps it inert for
|
||||
# the default CPU/CUDA install: uv consults it only when a torch install
|
||||
# explicitly targets it (see the ``gpu`` extra comment), so it does not alter
|
||||
|
||||
@@ -8,7 +8,7 @@ host (image work there OOM-crashes the container).
|
||||
|
||||
Routing is **quality-priority**: ControlNet (text/face-structure preservation) is the
|
||||
default; it is only skipped for a clearly structure-less image (no face, no text,
|
||||
near-zero edges), where plain SDXL is cheaper and just as good. GFPGAN face
|
||||
near-zero edges), where plain SDXL is cheaper and just as good. PhotoMaker face
|
||||
restoration is enabled when a face is present. When a smoothing pass (controlnet or
|
||||
face restore) ran, the **adaptive polish** (``humanizer.adaptive_polish``) restores
|
||||
the input's detail level -- a capped unsharp + edge-masked grain targeting the input's
|
||||
|
||||
@@ -236,32 +236,21 @@ def _warn_if_esrgan_unavailable(upscaler: str) -> None:
 
 
 def _restore_faces_options(f: Any) -> Any:
-    """Attach the shared face-restoration flags to an invisible-pipeline command."""
-    restore_flag = click.option(
+    """Attach the face-restoration flag to an invisible-pipeline command.
+
+    PhotoMaker-V2 is the only restoration method shipped (the prior GFPGAN path was
+    oracle-confirmed to re-introduce SynthID by partial pixel blending and has been
+    removed). PhotoMaker carries identity in a SynthID-invariant OpenCLIP embedding
+    and regenerates fresh face pixels conditioned on it -- see
+    ``docs/synthid-robust-identity-research.md``.
+    """
+    return click.option(
         "--restore-faces/--no-restore-faces",
         default=False,
-        help="EXPERIMENTAL, opt-in. Restore face identity with a post-pass when faces are "
-        "present; off by default, auto-skips when no face is detected or the chosen extra "
-        "is absent.",
-    )
-    method_flag = click.option(
-        "--restore-faces-method",
-        type=click.Choice(["gfpgan", "photomaker"]),
-        default="gfpgan",
-        help="Face-restore mechanism: 'gfpgan' (cheap, needs 'restore' extra, BUT runs on "
-        "the watermarked original and re-introduces SynthID) or 'photomaker' (PhotoMaker-V2, "
-        "needs the 'photomaker' extra; carries identity via a SynthID-invariant OpenCLIP "
-        "embedding so the regenerated face pixels are watermark-free). Default: gfpgan.",
-    )
-    weight_flag = click.option(
-        "--restore-faces-weight",
-        type=float,
-        default=0.5,
-        help="GFPGAN fidelity weight (0-1); lower = more GAN regeneration (cleaner "
-        "watermark scrub), higher = closer to the input. Ignored when "
-        "--restore-faces-method=photomaker.",
-    )
-    return restore_flag(method_flag(weight_flag(f)))
+        help="EXPERIMENTAL, opt-in. Restore face identity with the PhotoMaker-V2 post-pass "
+        "when faces are present (needs the 'photomaker' extra); off by default, auto-skips "
+        "when no face is detected or the extra is absent.",
+    )(f)
 
 
 def _watermark_region(det: DetectionResult, width: int, height: int) -> tuple[int, int, int, int]:
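The paired on/off flag that this hunk keeps can be sketched with the standard library alone. This is an analogue under stated assumptions, not the project's code: it uses `argparse.BooleanOptionalAction` (Python 3.9+) instead of the click decorator shown in the diff, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single opt-in boolean flag pair."""
    parser = argparse.ArgumentParser(prog="demo")
    # BooleanOptionalAction registers both --restore-faces and --no-restore-faces.
    parser.add_argument(
        "--restore-faces",
        action=argparse.BooleanOptionalAction,
        default=False,
        help="Opt-in face-identity post-pass (off by default).",
    )
    return parser


parser = build_parser()
default_args = parser.parse_args([])
enabled_args = parser.parse_args(["--restore-faces"])
# When both forms are given, the last one on the command line wins.
disabled_args = parser.parse_args(["--restore-faces", "--no-restore-faces"])
```

With only one restoration method left, a single boolean pair is enough; there is no method selector or weight value to thread through the command signatures.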
@@ -612,8 +601,6 @@ def cmd_invisible(
|
||||
min_resolution: int,
|
||||
controlnet_scale: float,
|
||||
restore_faces: bool,
|
||||
restore_faces_weight: float,
|
||||
restore_faces_method: str,
|
||||
upscaler: str,
|
||||
auto: bool,
|
||||
adaptive_polish: bool,
|
||||
@@ -676,8 +663,6 @@ def cmd_invisible(
|
||||
upscaler=upscaler,
|
||||
vendor=vendor,
|
||||
restore_faces=restore_faces,
|
||||
restore_faces_weight=restore_faces_weight,
|
||||
restore_faces_method=restore_faces_method,
|
||||
)
|
||||
elapsed = time.monotonic() - t0
|
||||
|
||||
@@ -879,8 +864,6 @@ def cmd_all(
|
||||
min_resolution: int,
|
||||
controlnet_scale: float,
|
||||
restore_faces: bool,
|
||||
restore_faces_weight: float,
|
||||
restore_faces_method: str,
|
||||
upscaler: str,
|
||||
auto: bool,
|
||||
adaptive_polish: bool,
|
||||
@@ -989,8 +972,6 @@ def cmd_all(
|
||||
upscaler=upscaler,
|
||||
vendor=vendor,
|
||||
restore_faces=restore_faces,
|
||||
restore_faces_weight=restore_faces_weight,
|
||||
restore_faces_method=restore_faces_method,
|
||||
)
|
||||
console.print(" Invisible watermark removed")
|
||||
|
||||
@@ -1046,8 +1027,6 @@ def _process_batch_image(
|
||||
max_resolution: int = 0,
|
||||
min_resolution: int = 1024,
|
||||
restore_faces: bool = False,
|
||||
restore_faces_weight: float = 0.5,
|
||||
restore_faces_method: str = "gfpgan",
|
||||
controlnet_scale: float = 1.0,
|
||||
upscaler: str = "lanczos",
|
||||
auto: bool = False,
|
||||
@@ -1126,8 +1105,6 @@ def _process_batch_image(
|
||||
min_resolution=min_resolution,
|
||||
upscaler=upscaler,
|
||||
restore_faces=restore_faces,
|
||||
restore_faces_weight=restore_faces_weight,
|
||||
restore_faces_method=restore_faces_method,
|
||||
# Detect the vendor from the pristine original (`img_path`), not the
|
||||
# visible-processed `out_path` whose C2PA is already gone.
|
||||
vendor=vendor_for_strength(img_path),
|
||||
@@ -1210,8 +1187,6 @@ def cmd_batch(
|
||||
max_resolution: int,
|
||||
min_resolution: int,
|
||||
restore_faces: bool,
|
||||
restore_faces_weight: float,
|
||||
restore_faces_method: str,
|
||||
controlnet_scale: float,
|
||||
upscaler: str,
|
||||
auto: bool,
|
||||
@@ -1271,8 +1246,6 @@ def cmd_batch(
|
||||
max_resolution=max_resolution,
|
||||
min_resolution=min_resolution,
|
||||
restore_faces=restore_faces,
|
||||
restore_faces_weight=restore_faces_weight,
|
||||
restore_faces_method=restore_faces_method,
|
||||
controlnet_scale=controlnet_scale,
|
||||
upscaler=upscaler,
|
||||
auto=auto,
|
||||
|
||||
@@ -1,191 +0,0 @@
|
||||
"""Optional GFPGAN face-restoration post-pass for the invisible removal pipeline.
|
||||
|
||||
The diffusion removal pass scrubs the watermark everywhere but lets faces drift in
|
||||
likeness (canny holds face *structure*, not *identity*). This module restores each
|
||||
face's identity by running GFPGAN on the ORIGINAL (watermarked) image and
|
||||
feather-compositing the restored face REGIONS into the cleaned image.
|
||||
|
||||
GFPGAN RE-SYNTHESIZES each face from a StyleGAN2 prior -- the composited pixels are
|
||||
GAN-generated, NOT copied from the original -- so the pixel watermark is scrubbed in
|
||||
the face regions too, while identity is preserved (oracle-validated at weight 0.5).
|
||||
Both GFPGAN (Apache-2.0) and its RetinaFace detector (MIT) are commercial-safe.
|
||||
|
||||
The GFPGANv1.4 weights and the RetinaFace detector download on first use and are
|
||||
never bundled. Requires the optional ``restore`` extra (gfpgan/facexlib/basicsr).
|
||||
"""
|
||||
|
||||
# cv2/torch/gfpgan boundary: gfpgan/basicsr/facexlib ship no usable type stubs and
|
||||
# this module wraps cv2 (feather composite) and torch; relax the unknown-type rules
|
||||
# for this file only.
|
||||
# pyright: reportUnknownMemberType=false, reportUnknownArgumentType=false, reportUnknownVariableType=false, reportUnknownParameterType=false, reportMissingTypeArgument=false, reportMissingTypeStubs=false, reportMissingImports=false, reportArgumentType=false, reportAssignmentType=false, reportReturnType=false, reportCallIssue=false, reportIndexIssue=false, reportOperatorIssue=false, reportOptionalMemberAccess=false, reportOptionalCall=false, reportOptionalSubscript=false, reportOptionalOperand=false, reportAttributeAccessIssue=false, reportPrivateImportUsage=false, reportPrivateUsage=false, reportInvalidTypeForm=false, reportConstantRedefinition=false, reportUnnecessaryComparison=false
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import sys
|
||||
import threading
|
||||
from typing import TYPE_CHECKING, Any
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from numpy.typing import NDArray
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# GFPGANv1.4 weights (Apache-2.0). Downloaded on first use, never bundled.
|
||||
_GFPGAN_MODEL_URL = "https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.4.pth"
|
||||
_GFPGAN_ARCH = "clean"
|
||||
_GFPGAN_CHANNEL_MULTIPLIER = 2
|
||||
|
||||
_restorer: Any | None = None
|
||||
_restorer_lock = threading.Lock()
|
||||
|
||||
|
||||
def is_available() -> bool:
|
||||
"""True when the optional GFPGAN face-restoration deps are importable."""
|
||||
import importlib.util
|
||||
|
||||
return importlib.util.find_spec("gfpgan") is not None and importlib.util.find_spec("facexlib") is not None
|
||||
|
||||
|
||||
def _apply_basicsr_shim() -> None:
|
||||
"""Install the ``torchvision.transforms.functional_tensor`` compatibility shim.
|
||||
|
||||
basicsr (a GFPGAN dependency) imports ``rgb_to_grayscale`` from the
|
||||
``torchvision.transforms.functional_tensor`` module, which newer torchvision
|
||||
removed. Recreate that module pointing at the public functional API. Idempotent:
|
||||
only installed when the real module is missing.
|
||||
"""
|
||||
import importlib.util
|
||||
|
||||
if importlib.util.find_spec("torchvision.transforms.functional_tensor") is not None:
|
||||
return
|
||||
if "torchvision.transforms.functional_tensor" in sys.modules:
|
||||
return
|
||||
|
||||
import types
|
||||
|
||||
import torchvision.transforms.functional as tv_functional
|
||||
|
||||
shim = types.ModuleType("torchvision.transforms.functional_tensor")
|
||||
shim.rgb_to_grayscale = tv_functional.rgb_to_grayscale
|
||||
sys.modules["torchvision.transforms.functional_tensor"] = shim
|
||||
|
||||
|
||||
def _select_device() -> str:
|
||||
"""Pick the GFPGAN device: CUDA when present, else CPU.
|
||||
|
||||
The pip GFPGANer has an MPS device-mismatch bug, and this is a cheap post-pass
|
||||
on a few face crops, so MPS is deliberately avoided -- CPU is the safe default
|
||||
on Apple silicon.
|
||||
"""
|
||||
try:
|
||||
import torch
|
||||
|
||||
if torch.cuda.is_available():
|
||||
return "cuda"
|
||||
except Exception as e:
|
||||
logger.debug("face_restore: CUDA probe failed (%s); using CPU", e)
|
||||
return "cpu"
|
||||
|
||||
|
||||
def _get_restorer() -> Any:
|
||||
"""Return the lazily-built GFPGANer singleton (downloads weights on first use)."""
|
||||
global _restorer
|
||||
if _restorer is not None:
|
||||
return _restorer
|
||||
with _restorer_lock:
|
||||
if _restorer is None:
|
||||
_apply_basicsr_shim()
|
||||
from gfpgan import GFPGANer
|
||||
|
||||
_restorer = GFPGANer(
|
||||
model_path=_GFPGAN_MODEL_URL,
|
||||
upscale=1,
|
||||
arch=_GFPGAN_ARCH,
|
||||
channel_multiplier=_GFPGAN_CHANNEL_MULTIPLIER,
|
||||
device=_select_device(),
|
||||
)
|
||||
return _restorer
|
||||
|
||||
|
||||
def _composite_faces(
|
||||
base_bgr: NDArray[Any],
|
||||
restored_bgr: NDArray[Any],
|
||||
boxes: list[tuple[float, float, float, float]],
|
||||
pad: int = 14,
|
||||
feather_div: int = 6,
|
||||
) -> NDArray[Any]:
|
||||
"""Feather-composite restored face regions from ``restored_bgr`` into ``base_bgr``.
|
||||
|
||||
Pure cv2/numpy helper (no gfpgan), so it is unit-testable without the model.
|
||||
For each ``(x1, y1, x2, y2)`` box: pad and clip to the image, build a Gaussian-
|
||||
feathered rectangular alpha, and blend ``restored * a + base * (1 - a)``. Boxes
|
||||
that fall fully outside the image (or an empty list) leave ``base_bgr`` unchanged.
|
||||
"""
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
out = base_bgr.astype(np.float32)
|
||||
h, w = base_bgr.shape[:2]
|
||||
|
||||
for box in boxes:
|
||||
x1 = int(box[0]) - pad
|
||||
y1 = int(box[1]) - pad
|
||||
x2 = int(box[2]) + pad
|
||||
y2 = int(box[3]) + pad
|
||||
x1 = max(0, min(x1, w))
|
||||
y1 = max(0, min(y1, h))
|
||||
x2 = max(0, min(x2, w))
|
||||
y2 = max(0, min(y2, h))
|
||||
bw = x2 - x1
|
||||
bh = y2 - y1
|
||||
if bw <= 0 or bh <= 0:
|
||||
continue
|
||||
|
||||
alpha = np.zeros((h, w), dtype=np.float32)
|
||||
alpha[y1:y2, x1:x2] = 1.0
|
||||
k = max(3, (min(bw, bh) // feather_div) | 1) # odd kernel >= 3
|
||||
alpha = cv2.GaussianBlur(alpha, (k, k), 0)
|
||||
alpha = alpha[:, :, None]
|
||||
out = restored_bgr.astype(np.float32) * alpha + out * (1.0 - alpha)
|
||||
|
||||
return np.clip(out, 0, 255).astype(np.uint8)
|
||||
|
||||
|
||||
def restore_faces(
|
||||
original_bgr: NDArray[Any],
|
||||
cleaned_bgr: NDArray[Any],
|
||||
weight: float = 0.5,
|
||||
pad: int = 14,
|
||||
feather_div: int = 6,
|
||||
) -> NDArray[Any]:
|
||||
"""Restore face identity in ``cleaned_bgr`` using GFPGAN on ``original_bgr``.
|
||||
|
||||
Runs GFPGAN on the ORIGINAL (watermarked) image to recover the true-identity,
|
||||
GAN-regenerated faces plus the RetinaFace boxes, then feather-composites those
|
||||
face regions into the cleaned image. The composited pixels are GFPGAN-generated
|
||||
(not original), so no watermark and no pixel-copy. Returns ``cleaned_bgr``
|
||||
unchanged when no face is detected.
|
||||
|
||||
Args:
|
||||
original_bgr: The original (watermarked) image as cv2 BGR.
|
||||
cleaned_bgr: The diffusion-cleaned image as cv2 BGR (faces drifted).
|
||||
weight: GFPGAN fidelity weight (0-1); lower = more GAN regeneration.
|
||||
pad: Pixels to grow each face box before compositing.
|
||||
feather_div: Larger = sharper composite edge (box-min // feather_div kernel).
|
||||
"""
|
||||
restorer = _get_restorer()
|
||||
_, _, restored_img = restorer.enhance(
|
||||
original_bgr,
|
||||
has_aligned=False,
|
||||
only_center_face=False,
|
||||
paste_back=True,
|
||||
weight=weight,
|
||||
)
|
||||
|
||||
det_faces = getattr(restorer.face_helper, "det_faces", None) or []
|
||||
boxes = [(float(b[0]), float(b[1]), float(b[2]), float(b[3])) for b in det_faces]
|
||||
if not boxes:
|
||||
logger.debug("face_restore: no faces detected; returning cleaned image unchanged")
|
||||
return cleaned_bgr
|
||||
|
||||
return _composite_faces(cleaned_bgr, restored_img, boxes, pad=pad, feather_div=feather_div)
|
||||
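The commit message attributes the SynthID regression to blending watermarked original pixels at fidelity weight 0.5. A toy model shows why any such blend is risky. It assumes a plain linear pixel blend and a simple correlation detector; neither is GFPGAN's or SynthID's actual mechanism, and every name and number below is hypothetical.

```python
def blend(original: list[float], regenerated: list[float], weight: float) -> list[float]:
    """Linear mix: weight=1.0 keeps the original, weight=0.0 keeps the regenerated pixels."""
    return [weight * o + (1.0 - weight) * r for o, r in zip(original, regenerated)]


def watermark_score(pixels: list[float], pattern: list[float]) -> float:
    """Mean correlation of the pixels with a zero-mean +/-1 watermark pattern."""
    return sum(v * p for v, p in zip(pixels, pattern)) / len(pixels)


# Toy data: a flat image plus a +/-1 pattern embedded at amplitude 2.
pattern = [1.0, -1.0] * 8
clean = [100.0] * 16
amplitude = 2.0
original = [c + amplitude * p for c, p in zip(clean, pattern)]
regenerated = list(clean)  # stands in for freshly generated, watermark-free pixels

# The surviving watermark signal scales linearly with the blend weight.
scores = {w: watermark_score(blend(original, regenerated, w), pattern) for w in (0.0, 0.5, 1.0)}
```

In this toy model the score at weight 0.5 is half the original amplitude rather than zero, so a detector with enough margin would still fire. Whether a real detector fires is an empirical question, which is why the commit relies on oracle checks rather than on reasoning alone.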
@@ -165,8 +165,6 @@ class InvisibleEngine:
|
||||
min_resolution: int = 1024,
|
||||
vendor: str | None = None,
|
||||
restore_faces: bool = False,
|
||||
restore_faces_weight: float = 0.5,
|
||||
restore_faces_method: str = "gfpgan",
|
||||
unsharp: float = 0.0,
|
||||
adaptive_polish: bool = False,
|
||||
upscaler: str = "lanczos",
|
||||
@@ -182,22 +180,16 @@ class InvisibleEngine:
             guidance_scale: Classifier-free guidance scale.
             seed: Random seed for reproducibility.
             humanize: Intensity of Analog Humanizer film grain (0 = off).
-            restore_faces: EXPERIMENTAL, opt-in (default False). Run the GFPGAN
-                face-restoration post-pass when faces are present (needs the
-                ``restore`` extra). Auto-skips with a debug log when the extra is
-                absent or no face is detected.
-            restore_faces_method: Which face-identity restoration mechanism to run after
-                the diffusion pass: ``"gfpgan"`` (default; cheap, but WARNING the GFPGAN
-                pass runs on the watermarked ORIGINAL and re-introduces SynthID -- see
-                ``face_restore.py``) or ``"photomaker"`` (PhotoMaker-V2; carries identity
-                via a SynthID-invariant OpenCLIP embedding and regenerates fresh face
-                pixels conditioned on it -- SynthID-safe, but heavier and requires the
-                ``photomaker`` extra). See ``docs/synthid-robust-identity-research.md``.
-            restore_faces_weight: GFPGAN fidelity weight (0-1); lower = more GAN
-                regeneration (cleaner watermark scrub), higher = closer to input.
+            restore_faces: EXPERIMENTAL, opt-in (default False). Run the PhotoMaker-V2
+                face-identity post-pass when faces are present (needs the
+                ``photomaker`` extra). Carries identity via a SynthID-invariant OpenCLIP
+                embedding and regenerates fresh face pixels conditioned on it, so the
+                pixel watermark is not transported. Auto-skips with a debug log when the
+                extra is absent or no face is detected. See
+                ``docs/synthid-robust-identity-research.md``.
             unsharp: Final unsharp-mask sharpening strength (0 = off, default).
                 Applied last (after face restoration) to counter the soft,
-                over-smoothed look of the diffusion/GFPGAN passes; ~0.5-0.8 is a
+                over-smoothed look of the diffusion + restoration; ~0.5-0.8 is a
                 safe range, higher risks edge halos.
             adaptive_polish: When True (the --auto mode default), restore the input's
                 detail level in the softened output instead of fixed unsharp/humanize:
@@ -320,19 +312,16 @@ class InvisibleEngine:
                 out_cv = cv2.resize(out_cv, orig_size, interpolation=cv2.INTER_LANCZOS4)
                 image_io.imwrite(out_path, out_cv)
 
-            # Optional GFPGAN face-restoration post-pass: restore face identity that
-            # the diffusion regeneration drifted, while still scrubbing the pixel
-            # watermark (GFPGAN re-synthesizes faces from a StyleGAN2 prior). Runs on
-            # the cleaned output at its final resolution; auto-skips when faces are
+            # Optional PhotoMaker-V2 face-identity post-pass: restore face identity that
+            # the diffusion regeneration drifted, carrying identity in a SynthID-invariant
+            # OpenCLIP embedding so the regenerated face pixels are watermark-free. Runs
+            # on the cleaned output at its final resolution; auto-skips when faces are
             # absent or the optional extra is not installed.
             if restore_faces:
-                if restore_faces_method == "photomaker":
-                    self._restore_faces_photomaker(out_path, image, seed)
-                else:
-                    self._restore_faces(out_path, image, restore_faces_weight)
+                self._restore_faces_photomaker(out_path, image, seed)
 
             # Final sharpening, LAST so it crisps the face-restored result too (a
-            # pre-GFPGAN sharpen would be smoothed back over by the face pass).
+            # pre-restore sharpen would be smoothed back over by the face pass).
             if unsharp > 0.0:
                 import cv2
 
@@ -368,55 +357,6 @@ class InvisibleEngine:
             if _tmp_path.exists():
                 _tmp_path.unlink()
 
-    def _restore_faces(
-        self,
-        out_path: Path,
-        original_image: Any,
-        weight: float,
-    ) -> None:
-        """Run the GFPGAN face-restoration post-pass on the cleaned ``out_path``.
-
-        Composites GFPGAN-restored (identity-preserving, watermark-scrubbed) face
-        regions from the ORIGINAL image into the cleaned output. Best-effort: any
-        failure logs a warning and leaves the un-restored cleaned output in place;
-        a missing ``restore`` extra is logged at debug and skipped (the default-on
-        flag must never error when the extra is absent or no face is present).
-        """
-        from remove_ai_watermarks import face_restore
-
-        if not face_restore.is_available():
-            logger.debug("restore_faces requested but the 'restore' extra is not installed; skipping")
-            return
-
-        try:
-            import cv2
-            import numpy as np
-
-            from remove_ai_watermarks import image_io
-
-            cleaned_bgr = image_io.imread(out_path, cv2.IMREAD_COLOR)
-            if cleaned_bgr is None:
-                logger.warning("restore_faces: could not read cleaned output %s; skipping", out_path)
-                return
-
-            # Original (EXIF-transposed) as BGR, aligned to the cleaned image so the
-            # GFPGAN face boxes land in the cleaned image's coordinate space. The
-            # cleaned output is already restored to the original resolution above, so
-            # this resize is normally a no-op (it only fires if a max-resolution cap
-            # left the source PIL image smaller).
-            original_rgb = original_image.convert("RGB")
-            original_bgr = cv2.cvtColor(np.array(original_rgb), cv2.COLOR_RGB2BGR)
-            cleaned_size = (cleaned_bgr.shape[1], cleaned_bgr.shape[0])
-            if (original_bgr.shape[1], original_bgr.shape[0]) != cleaned_size:
-                original_bgr = cv2.resize(original_bgr, cleaned_size, interpolation=cv2.INTER_LANCZOS4)
-
-            if self._progress_callback:
-                self._progress_callback("Restoring face identity (GFPGAN post-pass)...")
-            restored = face_restore.restore_faces(original_bgr, cleaned_bgr, weight=weight)
-            image_io.imwrite(out_path, restored)
-        except Exception as e:
-            logger.warning("restore_faces post-pass failed (%s); keeping un-restored output", e)
-
     def _restore_faces_photomaker(
         self,
         out_path: Path,
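The removed method followed a best-effort pattern that the surviving PhotoMaker path is described as keeping: skip quietly when the optional extra is missing, and never let a post-pass failure destroy the already-written output. A minimal standalone sketch of that pattern follows. The helper names `extra_available` and `run_optional_postpass` are hypothetical and do not appear in the repository; only the `importlib.util.find_spec` check mirrors what the deleted tests exercised.

```python
import importlib.util
import logging
from typing import Any, Callable, Sequence

logger = logging.getLogger("postpass_sketch")


def extra_available(modules: Sequence[str]) -> bool:
    """True when every top-level module of an optional extra is importable."""
    return all(importlib.util.find_spec(name) is not None for name in modules)


def run_optional_postpass(
    step: Callable[[Any], Any],
    image: Any,
    required_modules: Sequence[str],
) -> Any:
    """Apply `step` to `image` if possible; otherwise return `image` unchanged."""
    if not extra_available(required_modules):
        # Missing extra is an expected condition, so it is logged at debug only.
        logger.debug("post-pass requested but %s not installed; skipping", list(required_modules))
        return image
    try:
        return step(image)
    except Exception as exc:  # best-effort: keep the un-restored result
        logger.warning("post-pass failed (%s); keeping un-restored output", exc)
        return image
```

The design choice here is that an opt-in quality step should degrade to a no-op rather than fail the whole removal run, which is why the missing-extra case is logged at debug while genuine failures are logged as warnings.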
@@ -1,85 +0,0 @@
-"""Tests for the GFPGAN face-restoration post-pass.
-
-The pure feather-composite helper is unit-tested without the model; the
-model-running paths are gated behind ``is_available()`` (a multi-hundred-MB
-download), matching the discipline used for the other ML-adjacent modules.
-"""
-
-from __future__ import annotations
-
-import numpy as np
-import pytest
-
-from remove_ai_watermarks import face_restore
-
-
-class TestIsAvailable:
-    def test_returns_bool(self):
-        assert isinstance(face_restore.is_available(), bool)
-
-    def test_reflects_dependencies(self):
-        import importlib.util
-
-        expected = all(importlib.util.find_spec(m) is not None for m in ("gfpgan", "facexlib"))
-        assert face_restore.is_available() is expected
-
-
-class TestCompositeFaces:
-    """Unit tests for the pure ``_composite_faces`` helper (cv2/numpy only)."""
-
-    def _base_and_restored(self, h: int = 100, w: int = 120):
-        base = np.zeros((h, w, 3), dtype=np.uint8)  # black
-        restored = np.full((h, w, 3), 255, dtype=np.uint8)  # white
-        return base, restored
-
-    def test_output_shape_and_dtype(self):
-        base, restored = self._base_and_restored()
-        out = face_restore._composite_faces(base, restored, [(40.0, 30.0, 80.0, 70.0)])
-        assert out.shape == base.shape
-        assert out.dtype == np.uint8
-
-    def test_box_region_pulls_toward_restored(self):
-        base, restored = self._base_and_restored()
-        out = face_restore._composite_faces(base, restored, [(40.0, 30.0, 80.0, 70.0)])
-        # Center of the box should be near the restored (white) value.
-        cy, cx = 50, 60
-        assert out[cy, cx].mean() > 200
-
-    def test_far_from_box_stays_base(self):
-        base, restored = self._base_and_restored()
-        out = face_restore._composite_faces(base, restored, [(40.0, 30.0, 80.0, 70.0)], pad=2)
-        # Top-left corner is far from the box and feather, so it stays black.
-        assert out[0, 0].mean() < 5
-
-    def test_empty_boxes_returns_base_unchanged(self):
-        base, restored = self._base_and_restored()
-        out = face_restore._composite_faces(base, restored, [])
-        assert np.array_equal(out, base)
-
-    def test_box_fully_outside_is_skipped(self):
-        base, restored = self._base_and_restored(h=100, w=120)
-        # Box entirely beyond the right/bottom edge -> clipped to empty -> no-op.
-        out = face_restore._composite_faces(base, restored, [(200.0, 200.0, 260.0, 260.0)], pad=0)
-        assert np.array_equal(out, base)
-
-    def test_near_edge_box_clips_without_error(self):
-        base, restored = self._base_and_restored(h=100, w=120)
-        # Box reaching past the bottom-right corner must clip, not raise.
-        out = face_restore._composite_faces(base, restored, [(100.0, 80.0, 130.0, 110.0)], pad=10)
-        assert out.shape == base.shape
-        # The clipped in-bounds region still pulls toward white.
-        assert out[95, 115].mean() > 100
-
-
-@pytest.mark.skipif(not face_restore.is_available(), reason="requires the 'restore' extra (gfpgan/facexlib)")
-class TestRestoreFacesModel:
-    """Model-running smoke test, gated behind the optional extra."""
-
-    def test_no_faces_returns_cleaned_unchanged(self):
-        # A flat gray image has no faces; restore_faces must return the cleaned
-        # input unchanged (the no-op path).
-        cleaned = np.full((128, 128, 3), 127, dtype=np.uint8)
-        original = np.full((128, 128, 3), 127, dtype=np.uint8)
-        out = face_restore.restore_faces(original, cleaned)
-        assert out.shape == cleaned.shape
-        assert np.array_equal(out, cleaned)