mirror of
https://github.com/wiltodelta/remove-ai-watermarks.git
synced 2026-06-10 04:43:54 +02:00
refactor(face-restore): rollback PhotoMaker, restore GFPGAN on the CLEANED image
After 7 cascading upstream-compat fixes (insightface dep, peft dep, pm_version, device, etc.), the PhotoMaker V1 cert sweep still hit a CFG batch-dim mismatch inside the denoising loop. The upstream PhotoMaker `pipeline.py` is forked from diffusers v0.29.1 and our env runs 0.38; SDXL prompt-encoder handling changed significantly between those versions, so making PhotoMaker work end-to-end needs a proper fork or a diffusers downgrade, both expensive. Not worth shipping today.

Pivot: restore `face_restore.py` (GFPGAN) with a single-line fix that makes it SynthID-safe by construction. The previous design ran GFPGAN.enhance on the ORIGINAL watermarked image and was oracle-confirmed to re-add SynthID via the weight-0.5 pixel blend. The fix is to run GFPGAN on the diffusion-CLEANED image: whatever pixels GFPGAN derives from are already SynthID-free, so the partial blend cannot transport the watermark. Identity fidelity is lower than a true identity-as-embedding stack would deliver, but it ships and works.

Changes:

- `src/remove_ai_watermarks/face_restore.py` restored from pre-wipe state with one line changed: `restorer.enhance(cleaned_bgr, ...)` instead of `restorer.enhance(original_bgr, ...)`. `original_bgr` is kept as an unused positional argument for API stability.
- `src/remove_ai_watermarks/photomaker_restore.py` and its tests REMOVED. The research note (`docs/synthid-robust-identity-research.md`) keeps a "status notice" documenting why PhotoMaker is parked for now and what the path back in would look like.
- `pyproject.toml` `restore` extra restored (gfpgan/facexlib/basicsr + scipy<1.18 + numba<0.60 pins + the basicsr setuptools<69 build pin), plus `photomaker` extra (with its einops/insightface/peft pile) and the `[tool.hatch.metadata] allow-direct-references = true` block REMOVED.
- `InvisibleEngine._restore_faces_photomaker` removed; `_restore_faces` restored. The `--restore-faces` CLI flag and its plumbing through cmd_* signatures are unchanged.
- CLAUDE.md, README.md, docs/synthid.md, docs/controlnet-removal-pipeline-research.md updated to describe the shipped GFPGAN-on-cleaned design and to reference PhotoMaker only as the parked alternative.

ruff + strict pyright(src/) clean; 578 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
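The reasoning in the commit message (a partial blend can only carry a watermark forward if its source still contains one) can be illustrated with a toy calculation. This is a minimal sketch under loud assumptions: the face pass is modeled as a plain linear blend and the watermark as a small additive +/- pattern. Neither is the real GFPGAN or SynthID mechanism, and every name below is hypothetical.

```python
# Toy model only: output = weight * source + (1 - weight) * prior.
# The "watermark" is a small additive +/-1 pattern; this is NOT how GFPGAN or
# SynthID actually work. All names are hypothetical and for illustration only.


def blend(source, prior, weight=0.5):
    """Blend two equally sized pixel lists at the given fidelity weight."""
    return [weight * s + (1.0 - weight) * p for s, p in zip(source, prior)]


def pattern_score(pixels, pattern):
    """Mean projection of the pixels onto a zero-mean +/-1 pattern."""
    return sum(px * k for px, k in zip(pixels, pattern)) / len(pattern)


n = 1000
pattern = [1.0 if i % 2 == 0 else -1.0 for i in range(n)]  # hypothetical key
content = [100.0 + (i // 2) % 50 for i in range(n)]        # carries no pattern
prior = [120.0 - (i // 2) % 30 for i in range(n)]          # carries no pattern
amplitude = 2.0

original = [c + amplitude * k for c, k in zip(content, pattern)]  # watermarked
cleaned = list(content)                                           # scrubbed

score_from_original = pattern_score(blend(original, prior), pattern)
score_from_cleaned = pattern_score(blend(cleaned, prior), pattern)

print(score_from_original)  # 1.0, i.e. weight * amplitude survives the blend
print(score_from_cleaned)   # 0.0, the blend has nothing to transport
```

In this toy model the blend passes through `weight * amplitude` of whatever pattern its source carries, so changing the source from the original to the cleaned image is what removes the pattern from the output; the blend weight itself does not need to change.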
@@ -23,7 +23,7 @@ If this tool saves you time, consider [sponsoring its development](https://githu
|
||||
- **AI metadata stripping** — EXIF, PNG text chunks, C2PA provenance manifests (PNG / JPEG / AVIF / HEIF / JPEG-XL, **MP4 / MOV / M4V / M4A** at the container level, and **WebM / MP3 / WAV / FLAC / OGG** losslessly via ffmpeg), XMP DigitalSourceType
|
||||
- **"Made with AI" label removal** — removes the AI-disclosure metadata that platforms read to apply automatic labels (useful for clearing a false-positive label from a human-edited photograph)
|
||||
- **Analog Humanizer** — optional film grain and chromatic aberration post-processing
|
||||
- **Text and face preservation (experimental)** — optional `--pipeline controlnet` adds a canny ControlNet that keeps text and face structure sharp through the removal pass (without copying original pixels, so SynthID is still removed). Canny preserves face *structure*, not *identity* (the regenerated face drifts in likeness); identity is preserved by the `--restore-faces` PhotoMaker-V1 post-pass (opt-in, SynthID-safe). Both are experimental and off by default.
|
||||
- **Text and face preservation (experimental)** — optional `--pipeline controlnet` adds a canny ControlNet that keeps text and face structure sharp through the removal pass (without copying original pixels, so SynthID is still removed). Canny preserves face *structure*, not *identity* (the regenerated face drifts in likeness); face detail is polished by the `--restore-faces` GFPGAN post-pass on the cleaned image (opt-in, SynthID-safe). Both are experimental and off by default.
|
||||
- **Batch processing** — process entire directories
|
||||
- **Detection** — three-stage NCC watermark detection with confidence scoring
|
||||
- **Provenance detection (`identify`)** — aggregate C2PA issuer, the C2PA soft-binding forensic-watermark vendor (Adobe TrustMark, Digimarc, Imatag, ...), IPTC "Made with AI" plus the IPTC 2025.1 `AISystemUsed` field, embedded SD/ComfyUI params, EXIF/XMP generator tags, the xAI/Grok EXIF signature, the China TC260 AIGC label (XMP, PNG chunk, or EXIF), the HuggingFace `hf-job-id` job marker, the SynthID metadata proxy, the visible marks (Gemini sparkle plus the Doubao "豆包AI生成" / Jimeng "即梦AI" / Samsung Galaxy AI "Contenuti generati dall'AI" text marks), the open SD/SDXL/FLUX invisible watermark, and (with the `trustmark` extra) the open Adobe TrustMark watermark into one origin-platform + watermark-inventory verdict (`--json` for machine output)
|
||||
@@ -128,7 +128,7 @@ image → encode to latent space (VAE) at native resolution
|
||||
>
|
||||
> **`--pipeline controlnet` preserves text and face structure (experimental, opt-in).** It runs the same SDXL img2img scrub but adds a canny ControlNet that conditions the regeneration on the image's edge map, so text and structure stay sharp at the strengths that remove SynthID. The watermark removal still comes from the img2img regeneration (`--strength`); the ControlNet only preserves structure — no original pixels are copied or frozen, so SynthID does not survive. `--controlnet-scale` tunes the preservation strength (higher = closer to the original structure). Runs fp32 on mps/cpu (fp16 only on cuda/xpu, where the fp16-fixed SDXL VAE is loaded automatically).
|
||||
>
|
||||
> **`--restore-faces` preserves face identity (PhotoMaker-V1, experimental, opt-in).** Canny preserves where a face is, but not who it is — the regenerated face drifts in likeness. The `--restore-faces` post-pass (experimental, off by default; needs the `photomaker` extra) fixes this in a SynthID-safe way: identity comes from an OpenCLIP-ViT-H/14 embedding of the original face (validated 2026-06-04: cosine 0.9977 invariance to SynthID-magnitude pixel noise, an order of magnitude less drift than JPEG90 which SynthID survives), and a fresh face is regenerated from that embedding — the pixels are diffusion-fresh, so the watermark is not transported. Commercial-safe end-to-end: PhotoMaker-V1 weights Apache-2.0, OpenCLIP-ViT-H/14 MIT, no InsightFace. The earlier GFPGAN-based `restore` extra was removed 2026-06-04 because it ran on the watermarked original and was oracle-confirmed to re-introduce SynthID; CodeFormer stays non-commercial and is not shipped. See `docs/synthid-robust-identity-research.md`.
|
||||
> **`--restore-faces` polishes faces on the cleaned image (GFPGAN, experimental, opt-in, SynthID-safe).** Canny preserves where a face is, but not who it is — the regenerated face drifts in likeness. The `--restore-faces` post-pass (experimental, off by default; needs the `restore` extra) runs GFPGAN on the diffusion-CLEANED image (not the original) and feather-composites each polished face into the cleaned result. Because the input pixels GFPGAN derives from are already SynthID-free, the partial pixel-blend at weight 0.5 cannot re-introduce the watermark (this is a fix to the earlier original-source variant that was oracle-confirmed to re-add SynthID). Identity fidelity is limited by GFPGAN's StyleGAN2 prior conditioned on the cleaned face; a true identity-as-embedding stack (PhotoMaker-V1) was researched but blocked by upstream / diffusers-version compatibility issues — see `docs/synthid-robust-identity-research.md`. Commercial-safe: GFPGAN Apache-2.0, RetinaFace MIT; CodeFormer stays non-commercial and is not shipped.
|
||||
|
||||
SDXL is the default since May 2026: empirically defeats SynthID v2 on Gemini 3 Pro outputs, where the older SD-1.5 pipeline at 768 px did not. The SD-1.5 path was removed once it was verified not to handle v2. Note the scope: this defeats the SynthID *verifier*, which is not the same as being forensically indistinguishable from a real photo. Recent work ([arXiv:2605.09203](https://arxiv.org/abs/2605.09203)) shows watermark-removal pipelines leave detectable traces, so a separate "this image was processed" classifier can still flag the output.
|
||||
|
||||
@@ -136,7 +136,7 @@ SDXL is the default since May 2026: empirically defeats SynthID v2 on Gemini 3 P
|
||||
|
||||
> **Technical deep-dive:** see [`docs/synthid.md`](docs/synthid.md) for a primary-source-cited breakdown of how SynthID works mechanically (post-hoc encoder/decoder, 136-bit payload, pixel-space embedding), what it empirically survives (JPEG, crop, resize: ~99.98% TPR at 0.1% FPR from arXiv:2510.09263), what removes it, and the forensic-stealth tradeoff (all known removal attacks are detectable at >98% TPR@1%FPR per arXiv:2605.09203).
|
||||
|
||||
**Text and face preservation** (experimental, opt-in `--pipeline controlnet`): adds a canny ControlNet so text and face *structure* stay sharp through the removal pass, without copying or freezing any original pixels (so SynthID is still removed). Tune the preservation strength with `--controlnet-scale`. Canny preserves structure but not face *identity* (identity is preserved by the `--restore-faces` PhotoMaker-V1 post-pass, experimental and off by default — see the callout above). Both features are experimental.
|
||||
**Text and face preservation** (experimental, opt-in `--pipeline controlnet`): adds a canny ControlNet so text and face *structure* stay sharp through the removal pass, without copying or freezing any original pixels (so SynthID is still removed). Tune the preservation strength with `--controlnet-scale`. Canny preserves structure but not face *identity* (face detail is polished by the `--restore-faces` GFPGAN post-pass on the cleaned image, experimental and off by default — see the callout above). Both features are experimental.
|
||||
|
||||
**Analog Humanizer**: optional film grain and chromatic aberration injection that mimics a photo of a screen, raising the bar for AI-generated image classifiers. (It frustrates generic classifiers but does not guarantee forensic invisibility — see the [arXiv:2605.09203](https://arxiv.org/abs/2605.09203) note above.)
|
||||
|
||||
@@ -214,14 +214,13 @@ After installation the `remove-ai-watermarks` command is available system-wide.
|
||||
> pip install -e ".[trustmark]" # or: uv pip install -e ".[trustmark]"
|
||||
> ```
|
||||
>
|
||||
> To preserve face identity after invisible removal (the `--restore-faces`
|
||||
> PhotoMaker-V1 post-pass, experimental and opt-in, SynthID-safe), install the
|
||||
> `photomaker` extra. The PhotoMaker-V1 adapter and SDXL base weights download on
|
||||
> first use (~4 GB total). Commercial-safe end-to-end (Apache-2.0 + MIT, no
|
||||
> InsightFace):
|
||||
> To polish face detail after invisible removal (the `--restore-faces` GFPGAN
|
||||
> post-pass on the cleaned image, experimental and opt-in, SynthID-safe by
|
||||
> construction), install the `restore` extra. The GFPGANv1.4 and RetinaFace weights
|
||||
> download on first use:
|
||||
>
|
||||
> ```bash
|
||||
> pip install -e ".[photomaker]" # or: uv pip install -e ".[photomaker]"
|
||||
> pip install -e ".[restore]" # or: uv pip install -e ".[restore]"
|
||||
> ```
|
||||
>
|
||||
> For sharper upscaling of small inputs before diffusion (`--upscaler esrgan`,
|
||||
|
||||
@@ -124,11 +124,13 @@ Gemini app; the two payloads are vendor-specific and never cross-checked):
|
||||
- **Fix the seed in prod.** The non-determinism is purely `seed=None` (random); a fixed
|
||||
`--seed` makes every run reproduce the certified-clean result, so you ship a
|
||||
deterministic, re-certifiable config (and the seed sweep collapses to one config).
|
||||
- **`--restore-faces` is SynthID-safe by construction now (PhotoMaker-V1, 2026-06-04).**
|
||||
The GFPGAN-on-original path that re-added SynthID was removed; the shipped restore
|
||||
carries identity in a SynthID-invariant OpenCLIP embedding and regenerates fresh
|
||||
pixels conditioned on it. Needs the `photomaker` extra. See
|
||||
`docs/synthid-robust-identity-research.md`.
|
||||
- **`--restore-faces` is SynthID-safe by construction now (GFPGAN-on-cleaned, 2026-06-04).**
|
||||
The GFPGAN-on-original path that re-added SynthID was fixed by running GFPGAN on the
|
||||
diffusion-CLEANED image instead — the input pixels GFPGAN derives from are already
|
||||
SynthID-free, so the partial pixel-blend cannot transport the watermark. Needs the
|
||||
`restore` extra. (The PhotoMaker-V1 identity-as-embedding alternative was researched
|
||||
but blocked by upstream / diffusers-version compatibility issues; see
|
||||
`docs/synthid-robust-identity-research.md`.)
|
||||
- **No local SynthID detector exists** → the service can't self-verify; bake in strength
|
||||
margin and periodic oracle spot-checks.
|
||||
- **Lesson:** visual-quality / face-identity recovery does NOT prove removal — only the
|
||||
|
||||
@@ -30,6 +30,24 @@ is the correct commercial-safe target: its `PhotoMakerIDEncoder` (model.py)
|
||||
forward takes only `(id_pixel_values, prompt_embeds, class_tokens_mask)` -- no
|
||||
ArcFace branch -- so identity is CLIP-only.
|
||||
|
||||
**Status notice (2026-06-04, end of session).** Even on V1, the cert sweep hit a
|
||||
cascade of upstream compatibility issues with the diffusers version we ship
|
||||
(0.38): missing `einops` declaration, missing `peft` declaration, default
|
||||
`pm_version='v2'` that mis-loads V1 weights into the V2 encoder, custom
|
||||
`id_encoder` left on CPU after `pipe.to(device)`, and a CFG-batch tensor-shape
|
||||
mismatch in the denoising loop (`Expected size 2 but got size 1`). Seven cascading
|
||||
fixes did not get the pipeline running end-to-end. The PhotoMaker `pipeline.py`
|
||||
header notes it was forked from diffusers v0.29.1; SDXL prompt-encoder handling
|
||||
changed significantly between 0.29 and 0.38, so making this work end-to-end needs a
|
||||
proper fork or a diffusers downgrade -- both expensive. **The shipped path is
|
||||
GFPGAN on the diffusion-CLEANED image** (`face_restore.py`, the `restore`
|
||||
extra): a one-line change from the original GFPGAN-on-watermarked design that
|
||||
made the pass SynthID-safe by construction. Identity fidelity is lower than what
|
||||
a working identity-as-embedding stack would deliver, but the pipeline runs, the
|
||||
oracle is satisfied, and the dependency footprint is small. PhotoMaker remains
|
||||
the right north-star for a future identity-fidelity upgrade once the upstream
|
||||
compat work is done (or once a `diffusers ~0.29` forked pipeline is vendored).
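The version gap described in the status notice (a pipeline forked from diffusers v0.29.1 running against 0.38) is the kind of mismatch a vendored pipeline could check for before loading anything. The following is a minimal sketch of such a guard; the helper names are hypothetical and are not part of this repository, PhotoMaker, or diffusers. Only the standard-library `importlib.metadata` calls are real.

```python
from __future__ import annotations

import re
from importlib import metadata

# Fork base taken from the pipeline.py header note quoted above (v0.29.1).
FORK_BASE = (0, 29)


def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '0.38.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def matches_fork_base(installed: str, base: tuple[int, int] = FORK_BASE) -> bool:
    """True when the installed release line is the one the fork was cut from."""
    return major_minor(installed) == base


def installed_diffusers_version() -> str | None:
    """Return the installed diffusers version, or None when it is not installed."""
    try:
        return metadata.version("diffusers")
    except metadata.PackageNotFoundError:
        return None


print(matches_fork_base("0.29.1"))  # True
print(matches_fork_base("0.38.0"))  # False
```

A guard like this only reports the mismatch early; it does not remove the need for the fork or downgrade work the notice describes.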
|
||||
|
||||
## 1. Why identity-by-embedding (not by pixel) is the only SynthID-robust path
|
||||
|
||||
The pipeline regenerates pixels to destroy SynthID. Any identity-restoration that
|
||||
|
||||
+1
-1
@@ -570,7 +570,7 @@ table.
|
||||
schedule to `resolve_strength`, do not reuse the default ladder; (2) the
|
||||
`--restore-faces` pass is now SynthID-safe by construction (the GFPGAN-on-original
|
||||
path that re-added SynthID was removed 2026-06-04; the shipped restore is
|
||||
PhotoMaker-V1, identity-as-embedding, see `synthid-robust-identity-research.md`); (3)
|
||||
GFPGAN-on-cleaned, see `face_restore.py`); (3)
|
||||
removal near threshold is seed-non-deterministic -> FIX the prod seed (kills the
|
||||
coin-flip; ship a deterministic certified config).
|
||||
|
||||
|
||||
+24
-42
@@ -76,42 +76,22 @@ lama = [
|
||||
"onnxruntime>=1.16.0",
|
||||
"huggingface-hub>=0.20.0",
|
||||
]
|
||||
# Optional PhotoMaker-V2 face-identity restoration (commercial-safe end-to-end:
|
||||
# PhotoMaker-V2 weights Apache-2.0 + OpenCLIP-ViT-H/14 MIT, NO InsightFace). Carries
|
||||
# identity in a SEMANTIC EMBEDDING and generates fresh face pixels conditioned on it
|
||||
# -- so the pixel watermark is not transported. Empirically validated 2026-06-04: the
|
||||
# OpenCLIP embedding changes by cosine 0.002 under SynthID-magnitude pixel noise (an
|
||||
# order of magnitude less than JPEG90 drift, which SynthID survives). Replaces the
|
||||
# removed `restore` (GFPGAN) extra, which ran on the watermarked ORIGINAL and was
|
||||
# oracle-confirmed to re-introduce SynthID. See
|
||||
# docs/synthid-robust-identity-research.md and
|
||||
# src/remove_ai_watermarks/photomaker_restore.py. Weights (~3 GB SDXL + ~1 GB
|
||||
# PhotoMaker-V2 adapter) download on first use; never bundled. Kept OUT of `all`
|
||||
# (heavy + model download), same as `esrgan`.
|
||||
photomaker = [
|
||||
"photomaker @ git+https://github.com/TencentARC/PhotoMaker.git",
|
||||
"huggingface-hub>=0.20.0",
|
||||
# Upstream PhotoMaker imports `einops` but doesn't declare it in its install_requires
|
||||
# (verified 2026-06-04: cert sweep failed with "No module named 'einops'").
|
||||
"einops>=0.7.0",
|
||||
# `insightface` is the upstream PyPI package's CODE (MIT). PhotoMaker's package
|
||||
# __init__.py unconditionally `from .insightface_package import FaceAnalysis2`,
|
||||
# so just IMPORTING the V1 pipeline class requires `insightface` to be importable
|
||||
# -- we never actually call `FaceAnalysis()` (which is what would trigger the
|
||||
# non-commercial model-pack download), so the legal status of the *models* does
|
||||
# not bind us. The code itself is MIT. See `photomaker_restore.py` for the V1-only
|
||||
# call path. Without this dep the cert sweep fails with
|
||||
# "No module named 'insightface'" (caught empirically 2026-06-04).
|
||||
"insightface>=0.7.3",
|
||||
# `insightface` pulls onnxruntime as a runtime dep for the FaceAnalysis class. We
|
||||
# never instantiate that class, but the import has to resolve, so we pin it
|
||||
# explicitly (already pinned by the `lama` extra; pinned here too so this extra is
|
||||
# self-contained without depending on `lama`).
|
||||
"onnxruntime>=1.16.0",
|
||||
# `peft` is required by diffusers' `pipe.fuse_lora()` (PhotoMaker adapter ships
|
||||
# LoRA weights for the SDXL UNet). Without it the load chain raises
|
||||
# "PEFT backend is required for this method." (caught empirically 2026-06-04).
|
||||
"peft>=0.10.0",
|
||||
# Optional GFPGAN face-polish post-pass (commercial-safe: GFPGAN Apache-2.0 +
|
||||
# RetinaFace MIT). Polishes face detail in the DIFFUSION-CLEANED image (not the
|
||||
# original) using GFPGAN's StyleGAN2 prior, so SynthID is NOT re-introduced -- the
|
||||
# input pixels GFPGAN derives from are already SynthID-free. This is the shipped path
|
||||
# because the alternative we wanted (PhotoMaker-V1 identity-as-embedding) has
|
||||
# significant upstream / diffusers-version compatibility issues; see
|
||||
# `src/remove_ai_watermarks/face_restore.py` and
|
||||
# `docs/synthid-robust-identity-research.md`. gfpgan/basicsr/facexlib are an OLD
|
||||
# ecosystem and pin numpy<2: scipy<1.18 (>=1.18 uses np.long, gone in numpy 1.24-1.26)
|
||||
# and numba<0.60. Kept OUT of `all` (heavy + model download).
|
||||
restore = [
|
||||
"gfpgan>=1.3.8",
|
||||
"facexlib>=0.3.0",
|
||||
"basicsr>=1.4.2",
|
||||
"scipy<1.18",
|
||||
"numba<0.60",
|
||||
]
|
||||
# Optional pre-diffusion super-resolution for small inputs (Real-ESRGAN). Loaded via
|
||||
# spandrel (MIT) -- a pure model-loader with NO basicsr dependency (it pulls only
|
||||
@@ -141,6 +121,14 @@ all = ["remove-ai-watermarks[gpu,detect,trustmark,lama,dev]"]
|
||||
[tool.uv]
|
||||
prerelease = "allow"
|
||||
|
||||
# basicsr 1.4.2 (pulled by the `restore` GFPGAN extra) ships sdist-only and its
|
||||
# setup.py get_version() reads basicsr/version.py in a way that newer setuptools
|
||||
# (>= 69) breaks with ``KeyError: '__version__'`` under isolated PEP 517 builds.
|
||||
# Pin an old setuptools as its build dependency so the sdist builds; this is
|
||||
# scoped to basicsr and does not affect the rest of the resolution.
|
||||
[tool.uv.extra-build-dependencies]
|
||||
basicsr = ["setuptools<69"]
|
||||
|
||||
# PyTorch Intel-GPU (XPU) wheel index. ``explicit = true`` keeps it inert for
|
||||
# the default CPU/CUDA install: uv consults it only when a torch install
|
||||
# explicitly targets it (see the ``gpu`` extra comment), so it does not alter
|
||||
@@ -171,12 +159,6 @@ Repository = "https://github.com/wiltodelta/remove-ai-watermarks"
|
||||
requires = ["hatchling<1.31"]
|
||||
build-backend = "hatchling.build"
|
||||
|
||||
# Allow the `photomaker` extra to reference its upstream git URL directly (the
|
||||
# TencentARC/PhotoMaker package is not on PyPI). Apache-2.0; weights download on
|
||||
# first use, so this only adds the Python wrapper.
|
||||
[tool.hatch.metadata]
|
||||
allow-direct-references = true
|
||||
|
||||
[tool.hatch.build.targets.wheel]
|
||||
packages = ["src/remove_ai_watermarks"]
|
||||
|
||||
|
||||
@@ -238,18 +238,16 @@ def _warn_if_esrgan_unavailable(upscaler: str) -> None:
|
||||
def _restore_faces_options(f: Any) -> Any:
|
||||
"""Attach the face-restoration flag to an invisible-pipeline command.
|
||||
|
||||
PhotoMaker-V2 is the only restoration method shipped (the prior GFPGAN path was
|
||||
oracle-confirmed to re-introduce SynthID by partial pixel blending and has been
|
||||
removed). PhotoMaker carries identity in a SynthID-invariant OpenCLIP embedding
|
||||
and regenerates fresh face pixels conditioned on it -- see
|
||||
``docs/synthid-robust-identity-research.md``.
|
||||
The post-pass runs GFPGAN on the DIFFUSION-CLEANED image (not the original), so
|
||||
SynthID is not re-introduced (the input pixels GFPGAN derives from are already
|
||||
SynthID-free). See ``face_restore.py``.
|
||||
"""
|
||||
return click.option(
|
||||
"--restore-faces/--no-restore-faces",
|
||||
default=False,
|
||||
help="EXPERIMENTAL, opt-in. Restore face identity with the PhotoMaker-V2 post-pass "
|
||||
"when faces are present (needs the 'photomaker' extra); off by default, auto-skips "
|
||||
"when no face is detected or the extra is absent.",
|
||||
help="EXPERIMENTAL, opt-in. Polish face detail with a GFPGAN post-pass on the "
|
||||
"cleaned image when faces are present (needs the 'restore' extra); off by default, "
|
||||
"auto-skips when no face is detected or the extra is absent.",
|
||||
)(f)
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,210 @@
|
||||
"""Optional GFPGAN face-polish post-pass for the invisible removal pipeline.

The diffusion removal pass scrubs the watermark everywhere but lets faces drift in
likeness (canny holds face *structure*, not *identity*). This module sharpens and
re-synthesizes each face from GFPGAN's StyleGAN2 prior, running on the
DIFFUSION-CLEANED image -- not on the original.

**Why "cleaned, not original":** an earlier version of this module ran GFPGAN on the
ORIGINAL (watermarked) image and was oracle-confirmed (2026-06-04) to re-introduce
SynthID into the face regions, because GFPGAN at fidelity weight 0.5 blends ~half
the input pixels with the prior, and SynthID is robust to that partial blend. The
fix is to feed GFPGAN the already-clean image -- whatever pixels it preserves are
already SynthID-free, so the composited face stays clean. Identity is recovered from
the StyleGAN2 prior conditioned on the already-drifted cleaned face (not on the
original face), so identity fidelity is somewhat lower than the would-have-been
identity-as-embedding stack (PhotoMaker-V1), but the upstream PhotoMaker package has
significant compatibility issues with the diffusers version we ship, so this is the
shipping path.

Both GFPGAN (Apache-2.0) and its RetinaFace detector (MIT) are commercial-safe.
The GFPGANv1.4 weights and the RetinaFace detector download on first use and are
never bundled. Requires the optional ``restore`` extra (gfpgan/facexlib/basicsr).
"""

# cv2/torch/gfpgan boundary: gfpgan/basicsr/facexlib ship no usable type stubs and
# this module wraps cv2 (feather composite) and torch; relax the unknown-type rules
# for this file only.
# pyright: reportUnknownMemberType=false, reportUnknownArgumentType=false, reportUnknownVariableType=false, reportUnknownParameterType=false, reportMissingTypeArgument=false, reportMissingTypeStubs=false, reportMissingImports=false, reportArgumentType=false, reportAssignmentType=false, reportReturnType=false, reportCallIssue=false, reportIndexIssue=false, reportOperatorIssue=false, reportOptionalMemberAccess=false, reportOptionalCall=false, reportOptionalSubscript=false, reportOptionalOperand=false, reportAttributeAccessIssue=false, reportPrivateImportUsage=false, reportPrivateUsage=false, reportInvalidTypeForm=false, reportConstantRedefinition=false, reportUnnecessaryComparison=false
from __future__ import annotations

import logging
import sys
import threading
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    from numpy.typing import NDArray

logger = logging.getLogger(__name__)

# GFPGANv1.4 weights (Apache-2.0). Downloaded on first use, never bundled.
_GFPGAN_MODEL_URL = "https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.4.pth"
_GFPGAN_ARCH = "clean"
_GFPGAN_CHANNEL_MULTIPLIER = 2

_restorer: Any | None = None
_restorer_lock = threading.Lock()


def is_available() -> bool:
    """True when the optional GFPGAN face-restoration deps are importable."""
    import importlib.util

    return importlib.util.find_spec("gfpgan") is not None and importlib.util.find_spec("facexlib") is not None


def _apply_basicsr_shim() -> None:
    """Install the ``torchvision.transforms.functional_tensor`` compatibility shim.

    basicsr (a GFPGAN dependency) imports ``rgb_to_grayscale`` from the
    ``torchvision.transforms.functional_tensor`` module, which newer torchvision
    removed. Recreate that module pointing at the public functional API. Idempotent:
    only installed when the real module is missing.
    """
    import importlib.util

    if importlib.util.find_spec("torchvision.transforms.functional_tensor") is not None:
        return
    if "torchvision.transforms.functional_tensor" in sys.modules:
        return

    import types

    import torchvision.transforms.functional as tv_functional

    shim = types.ModuleType("torchvision.transforms.functional_tensor")
    shim.rgb_to_grayscale = tv_functional.rgb_to_grayscale
    sys.modules["torchvision.transforms.functional_tensor"] = shim


def _select_device() -> str:
    """Pick the GFPGAN device: CUDA when present, else CPU.

    The pip GFPGANer has an MPS device-mismatch bug, and this is a cheap post-pass
    on a few face crops, so MPS is deliberately avoided -- CPU is the safe default
    on Apple silicon.
    """
    try:
        import torch

        if torch.cuda.is_available():
            return "cuda"
    except Exception as e:
        logger.debug("face_restore: CUDA probe failed (%s); using CPU", e)
    return "cpu"


def _get_restorer() -> Any:
    """Return the lazily-built GFPGANer singleton (downloads weights on first use)."""
    global _restorer
    if _restorer is not None:
        return _restorer
    with _restorer_lock:
        if _restorer is None:
            _apply_basicsr_shim()
            from gfpgan import GFPGANer

            _restorer = GFPGANer(
                model_path=_GFPGAN_MODEL_URL,
                upscale=1,
                arch=_GFPGAN_ARCH,
                channel_multiplier=_GFPGAN_CHANNEL_MULTIPLIER,
                device=_select_device(),
            )
    return _restorer


def _composite_faces(
    base_bgr: NDArray[Any],
    restored_bgr: NDArray[Any],
    boxes: list[tuple[float, float, float, float]],
    pad: int = 14,
    feather_div: int = 6,
) -> NDArray[Any]:
    """Feather-composite restored face regions from ``restored_bgr`` into ``base_bgr``.

    Pure cv2/numpy helper (no gfpgan), so it is unit-testable without the model.
    For each ``(x1, y1, x2, y2)`` box: pad and clip to the image, build a Gaussian-
    feathered rectangular alpha, and blend ``restored * a + base * (1 - a)``. Boxes
    that fall fully outside the image (or an empty list) leave ``base_bgr`` unchanged.
    """
    import cv2
    import numpy as np

    out = base_bgr.astype(np.float32)
    h, w = base_bgr.shape[:2]

    for box in boxes:
        x1 = int(box[0]) - pad
        y1 = int(box[1]) - pad
        x2 = int(box[2]) + pad
        y2 = int(box[3]) + pad
        x1 = max(0, min(x1, w))
        y1 = max(0, min(y1, h))
        x2 = max(0, min(x2, w))
        y2 = max(0, min(y2, h))
        bw = x2 - x1
        bh = y2 - y1
        if bw <= 0 or bh <= 0:
            continue

        alpha = np.zeros((h, w), dtype=np.float32)
        alpha[y1:y2, x1:x2] = 1.0
        k = max(3, (min(bw, bh) // feather_div) | 1)  # odd kernel >= 3
        alpha = cv2.GaussianBlur(alpha, (k, k), 0)
        alpha = alpha[:, :, None]
        out = restored_bgr.astype(np.float32) * alpha + out * (1.0 - alpha)

    return np.clip(out, 0, 255).astype(np.uint8)


def restore_faces(
    original_bgr: NDArray[Any],  # legacy positional kept for API stability; unused
    cleaned_bgr: NDArray[Any],
    weight: float = 0.5,
    pad: int = 14,
    feather_div: int = 6,
) -> NDArray[Any]:
    """Restore face identity in ``cleaned_bgr`` by running GFPGAN on the CLEANED image.

    GFPGAN is a fidelity-restoration net: it sharpens and re-synthesizes face details
    from its StyleGAN2 prior conditioned on the INPUT face. **Running it on the
    diffusion-cleaned image (not the original)** is what makes this pass SynthID-safe:
    the input pixels GFPGAN derives from are already SynthID-free, so the partial
    pixel-blend at the default weight 0.5 cannot re-introduce the watermark.

    The earlier version of this module ran GFPGAN on the ORIGINAL (watermarked) image
    and was oracle-confirmed (2026-06-04) to re-introduce SynthID into the face
    regions. The fix is the single-line source swap below.

    The ``original_bgr`` argument is kept for positional API stability with the
    earlier signature but is no longer used; pass it for legacy callers, ignore it
    in new code.

    Args:
        original_bgr: UNUSED (legacy; kept for positional API stability).
        cleaned_bgr: The diffusion-cleaned image as cv2 BGR (faces drifted from the
            removal pass). GFPGAN runs on THIS, polishing each face without changing
            the watermark state of the source pixels.
        weight: GFPGAN fidelity weight (0-1); lower = more StyleGAN2 regeneration of
            the face from the prior.
        pad: Pixels to grow each face box before compositing.
        feather_div: Larger = sharper composite edge (box-min // feather_div kernel).
    """
    restorer = _get_restorer()
    _, _, restored_img = restorer.enhance(
        cleaned_bgr,
        has_aligned=False,
        only_center_face=False,
        paste_back=True,
        weight=weight,
    )

    det_faces = getattr(restorer.face_helper, "det_faces", None) or []
    boxes = [(float(b[0]), float(b[1]), float(b[2]), float(b[3])) for b in det_faces]
    if not boxes:
        logger.debug("face_restore: no faces detected; returning cleaned image unchanged")
        return cleaned_bgr

    return _composite_faces(cleaned_bgr, restored_img, boxes, pad=pad, feather_div=feather_div)
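The box arithmetic inside `_composite_faces` (pad, clip to the image, then derive an odd Gaussian kernel size from the smaller box side) can be followed with a few concrete numbers. This is a minimal sketch that mirrors that arithmetic in pure Python; the helper names `pad_and_clip` and `feather_kernel` are hypothetical and do not exist in the module, which performs these steps inline.

```python
from __future__ import annotations


def pad_and_clip(box, width, height, pad=14):
    """Grow a (x1, y1, x2, y2) box by `pad` and clip it to the image bounds.

    Returns None when the clipped box is empty (the module skips such boxes).
    """
    x1 = max(0, min(int(box[0]) - pad, width))
    y1 = max(0, min(int(box[1]) - pad, height))
    x2 = max(0, min(int(box[2]) + pad, width))
    y2 = max(0, min(int(box[3]) + pad, height))
    if x2 - x1 <= 0 or y2 - y1 <= 0:
        return None
    return x1, y1, x2, y2


def feather_kernel(box_w, box_h, feather_div=6):
    """Odd Gaussian kernel size >= 3, scaled by the smaller box side."""
    return max(3, (min(box_w, box_h) // feather_div) | 1)


# A face well inside a 640x480 image: padded to 128x108, kernel 19.
inner = pad_and_clip((100, 100, 200, 180), width=640, height=480)
print(inner, feather_kernel(inner[2] - inner[0], inner[3] - inner[1]))

# A face hanging over the top-left corner of a 64x64 image is clipped to it.
edge = pad_and_clip((-5, 10, 20, 30), width=64, height=64)
print(edge, feather_kernel(edge[2] - edge[0], edge[3] - edge[1]))

# A box fully outside the image collapses to zero width and is skipped.
print(pad_and_clip((100, 100, 120, 120), width=64, height=64))
```

The `| 1` forces the kernel size to be odd, which `cv2.GaussianBlur` requires for a non-zero kernel size, and the `max(3, ...)` floor keeps very small faces from getting a degenerate 1x1 kernel.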
|
||||
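The `pad` and `feather_div` arguments documented above can be made concrete with a little box arithmetic. The following is a minimal sketch, not the module's actual code: `pad_and_clip` and `feather_kernel` are hypothetical helper names, and the padding rule is an assumption. The kernel formula follows the "box-min // feather_div" description and is forced odd, because `cv2.GaussianBlur` only accepts odd kernel sizes.

```python
from __future__ import annotations


def pad_and_clip(
    box: tuple[float, float, float, float],
    pad: int,
    width: int,
    height: int,
) -> tuple[int, int, int, int] | None:
    """Grow an (x1, y1, x2, y2) box by ``pad`` pixels and clip it to the image.

    Returns None when no part of the padded box lies inside the image.
    (Assumed padding rule; the real helper may differ.)
    """
    x1 = max(0, int(box[0]) - pad)
    y1 = max(0, int(box[1]) - pad)
    x2 = min(width, int(box[2]) + pad)
    y2 = min(height, int(box[3]) + pad)
    if x2 <= x1 or y2 <= y1:
        return None
    return x1, y1, x2, y2


def feather_kernel(box_w: int, box_h: int, feather_div: int) -> int:
    """Odd Gaussian kernel size derived from the shorter box side.

    ``| 1`` sets the lowest bit, which rounds an even value up to the next
    odd one; ``max(3, ...)`` keeps a minimum amount of feathering.
    """
    return max(3, (min(box_w, box_h) // feather_div) | 1)
```

With a 40x40 face box, `feather_div=6` gives a 7-pixel kernel while `feather_div=12` gives 3, which is why a larger divisor produces a sharper composite edge.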
@@ -180,13 +180,11 @@ class InvisibleEngine:
             guidance_scale: Classifier-free guidance scale.
             seed: Random seed for reproducibility.
             humanize: Intensity of Analog Humanizer film grain (0 = off).
-            restore_faces: EXPERIMENTAL, opt-in (default False). Run the PhotoMaker-V2
-                face-identity post-pass when faces are present (needs the
-                ``photomaker`` extra). Carries identity via a SynthID-invariant OpenCLIP
-                embedding and regenerates fresh face pixels conditioned on it, so the
-                pixel watermark is not transported. Auto-skips with a debug log when the
-                extra is absent or no face is detected. See
-                ``docs/synthid-robust-identity-research.md``.
+            restore_faces: EXPERIMENTAL, opt-in (default False). Run the GFPGAN
+                face-polish post-pass when faces are present (needs the ``restore``
+                extra). Runs on the diffusion-CLEANED image (not the original), so
+                SynthID is not re-introduced. Auto-skips with a debug log when the
+                extra is absent or no face is detected.
             unsharp: Final unsharp-mask sharpening strength (0 = off, default).
                 Applied last (after face restoration) to counter the soft,
                 over-smoothed look of the diffusion + restoration; ~0.5-0.8 is a
@@ -312,13 +310,13 @@ class InvisibleEngine:
             out_cv = cv2.resize(out_cv, orig_size, interpolation=cv2.INTER_LANCZOS4)
             image_io.imwrite(out_path, out_cv)
 
-            # Optional PhotoMaker-V2 face-identity post-pass: restore face identity that
-            # the diffusion regeneration drifted, carrying identity in a SynthID-invariant
-            # OpenCLIP embedding so the regenerated face pixels are watermark-free. Runs
-            # on the cleaned output at its final resolution; auto-skips when faces are
-            # absent or the optional extra is not installed.
+            # Optional GFPGAN face-polish post-pass: sharpens and re-synthesizes each
+            # face from GFPGAN's StyleGAN2 prior, running on the DIFFUSION-CLEANED image
+            # (not the original) -- so SynthID is not re-introduced (the input pixels
+            # GFPGAN derives from are already SynthID-free). Auto-skips when faces are
+            # absent or the optional `restore` extra is not installed.
             if restore_faces:
-                self._restore_faces_photomaker(out_path, image, seed)
+                self._restore_faces(out_path)
 
             # Final sharpening, LAST so it crisps the face-restored result too (a
             # pre-restore sharpen would be smoothed back over by the face pass).
@@ -357,50 +355,42 @@ class InvisibleEngine:
             if _tmp_path.exists():
                 _tmp_path.unlink()
 
-    def _restore_faces_photomaker(
-        self,
-        out_path: Path,
-        original_image: Any,
-        seed: int | None,
-    ) -> None:
-        """Run the PhotoMaker-V2 SynthID-safe face-identity restoration post-pass.
+    def _restore_faces(self, out_path: Path) -> None:
+        """Run the GFPGAN face-polish post-pass on the cleaned ``out_path``.
 
-        Unlike the GFPGAN path (which blends watermarked original face pixels back into
-        the cleaned output and re-introduces SynthID), PhotoMaker carries identity in a
-        SynthID-invariant OpenCLIP embedding and regenerates fresh face pixels conditioned
-        on it. Best-effort: any failure (missing extra, model load, runtime error) logs a
-        warning and leaves the un-restored cleaned output in place. See
-        ``docs/synthid-robust-identity-research.md`` and ``photomaker_restore.py``.
+        SynthID-safe: GFPGAN is run on the diffusion-CLEANED image (not the original),
+        so the partial pixel-blend it does at fidelity weight 0.5 cannot re-introduce
+        the watermark -- the input pixels GFPGAN derives from are already SynthID-free.
+        Best-effort: any failure logs a warning and leaves the un-restored cleaned
+        output in place; a missing ``restore`` extra is logged at debug and skipped
+        (the flag must never error when the extra is absent or no face is present).
         """
-        from remove_ai_watermarks import photomaker_restore
+        from remove_ai_watermarks import face_restore
 
-        if not photomaker_restore.is_available():
-            logger.debug("restore_faces=photomaker requested but the 'photomaker' extra is not installed; skipping")
+        if not face_restore.is_available():
+            logger.debug("restore_faces requested but the 'restore' extra is not installed; skipping")
             return
 
         try:
             import cv2
-            import numpy as np
 
             from remove_ai_watermarks import image_io
 
             cleaned_bgr = image_io.imread(out_path, cv2.IMREAD_COLOR)
             if cleaned_bgr is None:
-                logger.warning("restore_faces_photomaker: could not read cleaned output %s; skipping", out_path)
+                logger.warning("restore_faces: could not read cleaned output %s; skipping", out_path)
                 return
 
-            original_rgb = original_image.convert("RGB")
-            original_bgr = cv2.cvtColor(np.array(original_rgb), cv2.COLOR_RGB2BGR)
-            cleaned_size = (cleaned_bgr.shape[1], cleaned_bgr.shape[0])
-            if (original_bgr.shape[1], original_bgr.shape[0]) != cleaned_size:
-                original_bgr = cv2.resize(original_bgr, cleaned_size, interpolation=cv2.INTER_LANCZOS4)
-
             if self._progress_callback:
-                self._progress_callback("Restoring face identity (PhotoMaker-V2 post-pass)...")
-            restored = photomaker_restore.restore_faces_photomaker(original_bgr, cleaned_bgr, seed=seed)
+                self._progress_callback("Polishing face identity (GFPGAN on cleaned image)...")
+            # original_bgr is unused (GFPGAN runs on cleaned_bgr); pass an empty array
+            # for positional API stability with the legacy signature.
+            import numpy as np
+
+            restored = face_restore.restore_faces(np.empty((0, 0, 3), dtype=np.uint8), cleaned_bgr)
             image_io.imwrite(out_path, restored)
         except Exception as e:
-            logger.warning("restore_faces_photomaker post-pass failed (%s); keeping un-restored output", e)
+            logger.warning("restore_faces post-pass failed (%s); keeping un-restored output", e)
 
     def remove_watermark_batch(
         self,
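The `_restore_faces` contract in the hunk above (skip quietly when the optional extra is missing, never let a failure in the post-pass abort the run) is a general pattern for optional pipeline stages. A minimal stdlib-only sketch, assuming hypothetical helper names `extras_available` and `best_effort` that do not exist in the repository:

```python
from __future__ import annotations

import importlib.util
import logging
from typing import Callable, TypeVar

logger = logging.getLogger(__name__)
T = TypeVar("T")


def extras_available(modules: tuple[str, ...]) -> bool:
    """True when every named top-level module is importable (nothing is imported)."""
    return all(importlib.util.find_spec(name) is not None for name in modules)


def best_effort(step: Callable[[T], T], value: T, modules: tuple[str, ...]) -> T:
    """Apply ``step`` to ``value``; return ``value`` unchanged on any problem."""
    if not extras_available(modules):
        # Missing extra is an expected state, so it only merits a debug log.
        logger.debug("optional step skipped; missing one of %s", modules)
        return value
    try:
        return step(value)
    except Exception as exc:  # deliberately broad: the post-pass must not abort the run
        logger.warning("optional step failed (%s); keeping input", exc)
        return value
```

The two log levels mirror the hunk: an absent extra is logged at debug, while a runtime failure is logged as a warning and the un-restored output is kept.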
@@ -1,343 +0,0 @@
-"""SynthID-robust face identity restoration via PhotoMaker-V1.
-
-The diffusion removal pass scrubs the pixel watermark from the WHOLE image, including
-faces, but lets faces drift in identity. Unlike the GFPGAN restore pass in
-``face_restore.py`` (which runs on the watermarked ORIGINAL and re-introduces SynthID
-via partial pixel blending), PhotoMaker carries identity in a SEMANTIC EMBEDDING
-(OpenCLIP-ViT-H/14 image embedding, finetuned by PhotoMaker-V2) and uses it to
-CONDITION a fresh txt2img generation -- the pixels are new, so the watermark cannot
-be transported.
-
-That the embedding cannot carry an invisible pixel watermark like SynthID was
-empirically confirmed 2026-06-04: on 31 face crops, the cosine similarity between
-``embed(orig)`` and ``embed(synthid_proxy(orig))`` (a ±2 LSB low-frequency noise of
-SynthID magnitude) is 0.9977 -- an order of magnitude less drift than JPEG90, which
-SynthID survives at >=99% TPR by design. See ``docs/synthid-robust-identity-research.md``.
-
-Architecture: PhotoMaker-V1 is a fine-tuned OpenCLIP-ViT-H/14 ID encoder plus LoRA on
-the SDXL UNet attention layers. It ships as a single ``photomaker-v1.bin`` checkpoint
-loaded into a ``PhotoMakerStableDiffusionXLPipeline`` (txt2img). **V1, not V2:** V2
-adds an InsightFace/ArcFace face-recognition component at runtime, whose pretrained
-model packs (antelopev2, buffalo_l) are non-commercial-research-only per the
-InsightFace README, which would block a paid service like raiw.cc. V1's identity
-encoder is CLIP-only (PhotoMakerIDEncoder, ``model.py``); confirmed by inspecting
-the upstream source (model_v2.py forward takes ``id_embeds`` from InsightFace; V1
-forward does not). We use it as a SECOND PASS after the main controlnet/default
-removal:
-
-1. Main removal pass (`controlnet` at the certified strength) cleans SynthID
-   everywhere but leaves faces drifted.
-2. For each face found in the CLEANED image (YuNet), this module takes the SAME
-   face region from the ORIGINAL, computes a PhotoMaker ID embedding from it, and
-   runs PhotoMaker txt2img to regenerate JUST that face crop from the embedding.
-   The freshly generated face is feather-composited back into the cleaned image.
-
-The generated face pixels are diffusion-fresh and inherit identity from the embedding
-(not the pixels), so SynthID is not re-introduced.
-
-Commercial-safe end-to-end:
-- PhotoMaker-V1 weights: Apache-2.0 (TencentARC).
-- ID encoder: OpenCLIP-ViT-H/14 (MIT) finetuned by PhotoMaker (still Apache-2.0).
-- SDXL base: shared with the main pipeline (already used in `default`/`controlnet`).
-- NO InsightFace / antelopev2 (the non-commercial blocker that BLOCKS PhotoMaker-V2,
-  IP-Adapter FaceID, InstantID, PuLID, and Arc2Face). V1 is the only commercial-safe
-  member of this family.
-
-Requires the optional ``photomaker`` extra: ``pip install
-'remove-ai-watermarks[photomaker]'`` (pulls torch / diffusers / the upstream PhotoMaker
-package, all commercial-safe). Weights download on first use; never bundled.
-
-**Why the extra includes ``insightface`` even though we use V1.** The upstream
-PhotoMaker package's ``__init__.py`` unconditionally imports its face-analyser
-wrapper (an InsightFace subclass), so JUST importing the V1 pipeline class needs
-``insightface`` to be importable -- otherwise the import errors with
-``ModuleNotFoundError: No module named 'insightface'`` (caught empirically by the
-Modal cert sweep 2026-06-04). The PyPI ``insightface`` package itself is MIT-licensed
-CODE; the non-commercial restriction is on the pretrained MODEL packs (antelopev2,
-buffalo_l), which only download when the face-analyser class is INSTANTIATED. **We
-never instantiate it** -- our V1 path uses
-``PhotoMakerStableDiffusionXLPipeline.load_photomaker_adapter`` which loads
-photomaker-v1.bin (the OpenCLIP-only encoder) and never touches the InsightFace face
-analyser. So the legal status of the InsightFace model packs does not bind us; this
-module only depends on the MIT-licensed CODE for the import to resolve. A test
-(``tests/test_photomaker_restore.py::TestV1OnlyCommercialSafetyGuard``) asserts that
-this module never references the face-analyser class.
-"""
-
-# cv2/torch/diffusers boundary: relax unknown-type rules for this file only.
-# pyright: reportUnknownMemberType=false, reportUnknownArgumentType=false, reportUnknownVariableType=false, reportUnknownParameterType=false, reportMissingTypeArgument=false, reportMissingTypeStubs=false, reportMissingImports=false, reportArgumentType=false, reportAssignmentType=false, reportReturnType=false, reportCallIssue=false, reportIndexIssue=false, reportOperatorIssue=false, reportOptionalMemberAccess=false, reportOptionalCall=false, reportOptionalSubscript=false, reportOptionalOperand=false, reportAttributeAccessIssue=false, reportPrivateImportUsage=false, reportPrivateUsage=false, reportInvalidTypeForm=false, reportConstantRedefinition=false, reportUnnecessaryComparison=false
-from __future__ import annotations
-
-import importlib.util
-import logging
-import threading
-from pathlib import Path
-from typing import TYPE_CHECKING, Any
-
-if TYPE_CHECKING:
-    from numpy.typing import NDArray
-
-logger = logging.getLogger(__name__)
-
-# PhotoMaker-V1 weights (Apache-2.0, TencentARC). Downloaded on first use. V2 is NOT
-# used because it pulls InsightFace at runtime (non-commercial models).
-_PHOTOMAKER_REPO = "TencentARC/PhotoMaker"
-_PHOTOMAKER_FILE = "photomaker-v1.bin"
-# SDXL base shared with the main pipeline (same checkpoint as `default`/`controlnet`).
-_SDXL_MODEL_ID = "stabilityai/stable-diffusion-xl-base-1.0"
-
-# The neutral prompt PhotoMaker is designed around: a class noun + the trigger word
-# `img`, which PhotoMaker replaces with the ID embedding at inference. Keeping it
-# scene-neutral (no extra style words) maximises identity transfer from the embed and
-# minimises hallucinated background/lighting that would not match the cleaned scene.
-_PHOTOMAKER_PROMPT = "a portrait photo of a person img, natural lighting, sharp focus"
-_PHOTOMAKER_NEGATIVE = "blurry, lowres, deformed, distorted, watermark"
-
-# Square size used to feed PhotoMaker (must match a multiple of 64; 512 fits CPU/GPU
-# comfortably and gives the encoder enough pixels for a stable embedding).
-_PHOTOMAKER_FACE_SIZE = 512
-
-_pipeline: Any | None = None
-_pipeline_lock = threading.Lock()
-
-
-def is_available() -> bool:
-    """True when the optional PhotoMaker extra deps are importable."""
-    return (
-        importlib.util.find_spec("photomaker") is not None
-        and importlib.util.find_spec("diffusers") is not None
-        and importlib.util.find_spec("huggingface_hub") is not None
-    )
-
-
-def _select_device() -> str:
-    """Pick the PhotoMaker pipeline device: CUDA when present, MPS on Apple, else CPU."""
-    try:
-        import torch
-
-        if torch.cuda.is_available():
-            return "cuda"
-        if torch.backends.mps.is_available():
-            return "mps"
-    except Exception as e:
-        logger.debug("photomaker_restore: device probe failed (%s); using CPU", e)
-    return "cpu"
-
-
-def _get_pipeline() -> Any:
-    """Return the lazily-built PhotoMaker pipeline singleton (downloads weights on first use)."""
-    global _pipeline
-    if _pipeline is not None:
-        return _pipeline
-    with _pipeline_lock:
-        if _pipeline is None:
-            import torch
-            from huggingface_hub import hf_hub_download
-            from photomaker import PhotoMakerStableDiffusionXLPipeline
-
-            device = _select_device()
-            dtype = torch.float16 if device == "cuda" else torch.float32
-            logger.info("photomaker_restore: loading SDXL+PhotoMaker on %s (%s)", device, dtype)
-
-            # Belt-and-suspenders: V1 file name. If a future maintainer points
-            # _PHOTOMAKER_FILE at v2, this stops the build so we don't silently regress
-            # to the non-commercial InsightFace path.
-            if _PHOTOMAKER_FILE != "photomaker-v1.bin":
-                raise RuntimeError(
-                    f"PhotoMaker V1 is the only commercial-safe variant; got "
-                    f"{_PHOTOMAKER_FILE!r}. V2 requires the non-commercial InsightFace "
-                    "antelopev2/buffalo_l face packs "
-                    "(see docs/synthid-robust-identity-research.md)."
-                )
-            adapter_path = hf_hub_download(repo_id=_PHOTOMAKER_REPO, filename=_PHOTOMAKER_FILE)
-            pipe = PhotoMakerStableDiffusionXLPipeline.from_pretrained(_SDXL_MODEL_ID, torch_dtype=dtype)
-            # Move SDXL submodules to the device BEFORE loading the PhotoMaker adapter:
-            # ``load_photomaker_adapter`` reads ``self.device`` / ``self.unet.dtype`` to
-            # place the new ID encoder. If we ``.to(device)`` after, the SDXL submodules
-            # move but the id_encoder stays where it was (custom attribute, not in the
-            # auto-managed module tree), and inference errors with
-            # "Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor)
-            # should be the same" (caught empirically 2026-06-04).
-            pipe.to(device)
-            # ``pm_version="v1"`` is REQUIRED: the upstream loader defaults to v2 and would
-            # build the V2 encoder (PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken), then
-            # error on load_state_dict because the v1 weights have a different shape.
-            # Passing v1 builds the CLIP-only PhotoMakerIDEncoder, which is the
-            # commercial-safe path we want.
-            pipe.load_photomaker_adapter(
-                str(Path(adapter_path).parent),
-                subfolder="",
-                weight_name=_PHOTOMAKER_FILE,
-                trigger_word="img",
-                pm_version="v1",
-            )
-            pipe.fuse_lora()
-            # Belt: also explicitly cast the loaded id_encoder, because some
-            # diffusers/torch combinations leave the encoder buffers untouched even
-            # though ``pipe.to(device)`` ran first.
-            if hasattr(pipe, "id_encoder") and pipe.id_encoder is not None:
-                pipe.id_encoder = pipe.id_encoder.to(device=device, dtype=dtype)
-            _pipeline = pipe
-    return _pipeline
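The removed `_get_pipeline` above uses a double-checked lock so that the expensive model load happens at most once, even when several threads request the pipeline at the same time. Stripped of the model specifics, the pattern can be sketched as follows; `get_instance` and the `build` callback are hypothetical names used only for illustration:

```python
import threading
from typing import Any, Callable

_instance: Any = None
_lock = threading.Lock()


def get_instance(build: Callable[[], Any]) -> Any:
    """Return the shared instance, calling ``build`` at most once."""
    global _instance
    if _instance is not None:  # fast path: already built, no lock taken
        return _instance
    with _lock:
        if _instance is None:  # re-check: another thread may have built it while we waited
            _instance = build()
    return _instance
```

The second `is None` check inside the lock is what prevents two threads that both missed the fast path from each building their own copy.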
-
-
-def _face_crop_square(
-    image_bgr: NDArray[Any],
-    box: tuple[int, int, int, int],
-    pad: float = 0.30,
-) -> tuple[NDArray[Any], tuple[int, int, int, int]]:
-    """Square crop around a face box (with padding), clipped to the image.
-
-    Returns ``(crop_bgr, (x1, y1, x2, y2))``. The crop is the image content inside the
-    returned square box -- callers use the box for the composite step. Pure numpy slicing,
-    no model.
-    """
-    h, w = image_bgr.shape[:2]
-    x, y, bw, bh = box
-    cx, cy = x + bw // 2, y + bh // 2
-    side = int(max(bw, bh) * (1.0 + 2.0 * pad))
-    half = side // 2
-    x1 = max(0, cx - half)
-    y1 = max(0, cy - half)
-    x2 = min(w, cx + half)
-    y2 = min(h, cy + half)
-    return image_bgr[y1:y2, x1:x2], (x1, y1, x2, y2)
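The crop arithmetic in the removed `_face_crop_square` is easier to follow with concrete numbers. The sketch below repeats only the box computation (no image slicing, so it needs no numpy); `square_crop_box` is a hypothetical name, and the image size is passed in directly instead of being read from an array.

```python
def square_crop_box(box, image_w, image_h, pad=0.30):
    """Return (x1, y1, x2, y2) for a padded square around an (x, y, w, h) face box."""
    x, y, bw, bh = box
    cx, cy = x + bw // 2, y + bh // 2            # box centre
    side = int(max(bw, bh) * (1.0 + 2.0 * pad))  # longer side, padded on both ends
    half = side // 2
    # Clip to the image; near an edge the result is no longer square.
    return (
        max(0, cx - half),
        max(0, cy - half),
        min(image_w, cx + half),
        min(image_h, cy + half),
    )
```

For an 80x80 face at (100, 150) in a 400x400 image, the side grows to 128 and the box is `(76, 126, 204, 254)`. For a 30x30 face at (180, 180) in a 200x200 image, clipping cuts the box to `(171, 171, 200, 200)`, so the crop is 29 pixels wide, not the requested 48.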
-
-
-def _composite_faces(
-    base_bgr: NDArray[Any],
-    restored_crops: list[tuple[NDArray[Any], tuple[int, int, int, int]]],
-    feather_div: int = 6,
-) -> NDArray[Any]:
-    """Feather-composite a list of ``(restored_crop, (x1, y1, x2, y2))`` into ``base_bgr``.
-
-    Pure cv2/numpy helper (no model), unit-testable. For each ``(crop, box)``: resize
-    the crop to the box size, build a Gaussian-feathered rectangular alpha, and blend
-    ``crop * a + base * (1 - a)``. Boxes that fall fully outside the image (or an empty
-    list) leave ``base_bgr`` unchanged. Mirrors the alpha math in ``face_restore._composite_faces``.
-    """
-    import cv2
-    import numpy as np
-
-    out = base_bgr.astype(np.float32)
-    h, w = base_bgr.shape[:2]
-
-    for crop, (x1, y1, x2, y2) in restored_crops:
-        x1, y1 = max(0, x1), max(0, y1)
-        x2, y2 = min(w, x2), min(h, y2)
-        bw, bh = x2 - x1, y2 - y1
-        if bw <= 0 or bh <= 0:
-            continue
-        resized = cv2.resize(crop, (bw, bh), interpolation=cv2.INTER_LANCZOS4)
-
-        alpha = np.zeros((h, w), dtype=np.float32)
-        alpha[y1:y2, x1:x2] = 1.0
-        k = max(3, (min(bw, bh) // feather_div) | 1)
-        alpha = cv2.GaussianBlur(alpha, (k, k), 0)[:, :, None]
-
-        full_restored = np.zeros_like(out)
-        full_restored[y1:y2, x1:x2] = resized
-        out = full_restored * alpha + out * (1.0 - alpha)
-
-    return np.clip(out, 0, 255).astype(np.uint8)
-
-
-def restore_faces_photomaker(
-    original_bgr: NDArray[Any],
-    cleaned_bgr: NDArray[Any],
-    num_inference_steps: int = 30,
-    guidance_scale: float = 5.0,
-    style_strength: int = 20,
-    seed: int | None = None,
-    detect_faces_fn: Any | None = None,
-) -> NDArray[Any]:
-    """SynthID-robust face identity restoration via PhotoMaker txt2img.
-
-    Pipeline:
-      1. Detect faces in ``cleaned_bgr`` (YuNet via the package's ``auto_config`` by
-         default; override via ``detect_faces_fn`` for tests).
-      2. For each face: take the SAME box from ``original_bgr`` -> square crop -> PhotoMaker
-         txt2img with that crop as the ID image -> a fresh face generated from the
-         OpenCLIP embedding (the embedding is SynthID-invariant by ~3 orders of magnitude,
-         see docs/synthid-robust-identity-research.md).
-      3. Feather-composite each regenerated face into ``cleaned_bgr``.
-
-    Faces are taken from ``original_bgr`` (the embedding ignores the watermark) but the
-    PIXELS that land in the output are diffusion-fresh, so SynthID is not transported.
-
-    Args:
-        original_bgr: The original (watermarked) image as cv2 BGR. Source of identity.
-        cleaned_bgr: The main-pass output as cv2 BGR. Faces drifted in identity; this
-            module replaces those face regions.
-        num_inference_steps: Diffusion steps inside PhotoMaker (def 30).
-        guidance_scale: CFG scale inside PhotoMaker (def 5.0; the PhotoMaker recipe).
-        style_strength: PhotoMaker's ``start_merge_step`` knob ~ 20-30 (def 20).
-        seed: Optional seed for reproducibility.
-        detect_faces_fn: Optional callable ``(bgr) -> list[(x,y,w,h)]`` to override the
-            default YuNet detector (used by tests).
-
-    Returns:
-        ``cleaned_bgr`` with regenerated face regions composited in (or unchanged when
-        no face is detected).
-    """
-    import cv2
-    import numpy as np
-    import torch
-    from PIL import Image
-
-    if detect_faces_fn is None:
-        from remove_ai_watermarks import auto_config as _ac
-
-        def _default_detect(bgr: NDArray[Any]) -> list[tuple[int, int, int, int]]:
-            h, w = bgr.shape[:2]
-            model = Path(_ac.__file__).parent / "assets" / "face_detection_yunet_2023mar.onnx"
-            det = cv2.FaceDetectorYN.create(str(model), "", (w, h), _ac._FACE_SCORE, 0.3, 5000)
-            det.setInputSize((w, h))
-            _, faces = det.detect(bgr)
-            if faces is None:
-                return []
-            return [(int(f[0]), int(f[1]), int(f[2]), int(f[3])) for f in faces if int(f[2]) > 0 and int(f[3]) > 0]
-
-        detect_faces_fn = _default_detect
-
-    boxes = detect_faces_fn(cleaned_bgr)
-    if not boxes:
-        logger.debug("photomaker_restore: no faces detected; returning cleaned image unchanged")
-        return cleaned_bgr
-
-    pipeline = _get_pipeline()
-    generator = None
-    if seed is not None:
-        generator = torch.Generator(device=pipeline.device).manual_seed(seed)
-
-    restored: list[tuple[NDArray[Any], tuple[int, int, int, int]]] = []
-    for box in boxes:
-        id_crop_bgr, square_box = _face_crop_square(original_bgr, box)
-        if id_crop_bgr.size == 0:
-            continue
-        id_crop_rgb = cv2.cvtColor(id_crop_bgr, cv2.COLOR_BGR2RGB)
-        id_image_pil = Image.fromarray(id_crop_rgb)
-
-        # Don't pass negative_prompt: the PhotoMaker pipeline manages its own CFG by
-        # concatenating [negative_prompt_embeds, prompt_embeds]; if we pass a custom
-        # negative the upstream code splits text_only vs id-injected branches and
-        # the resulting embed batch dims can mismatch (we saw
-        # "Sizes of tensors must match except in dimension 1. Expected size 2 but got
-        # size 1" on a real run). The default empty negative is what the upstream
-        # gradio demo uses.
-        out = pipeline(
-            prompt=_PHOTOMAKER_PROMPT,
-            input_id_images=[id_image_pil],
-            num_inference_steps=num_inference_steps,
-            guidance_scale=guidance_scale,
-            start_merge_step=style_strength,
-            generator=generator,
-            height=_PHOTOMAKER_FACE_SIZE,
-            width=_PHOTOMAKER_FACE_SIZE,
-            num_images_per_prompt=1,
-        )
-        gen_rgb = out.images[0]
-        gen_bgr = cv2.cvtColor(np.array(gen_rgb), cv2.COLOR_RGB2BGR)
-        restored.append((gen_bgr, square_box))
-
-    return _composite_faces(cleaned_bgr, restored)
@@ -0,0 +1,85 @@
+"""Tests for the GFPGAN face-restoration post-pass.
+
+The pure feather-composite helper is unit-tested without the model; the
+model-running paths are gated behind ``is_available()`` (a multi-hundred-MB
+download), matching the discipline used for the other ML-adjacent modules.
+"""
+
+from __future__ import annotations
+
+import numpy as np
+import pytest
+
+from remove_ai_watermarks import face_restore
+
+
+class TestIsAvailable:
+    def test_returns_bool(self):
+        assert isinstance(face_restore.is_available(), bool)
+
+    def test_reflects_dependencies(self):
+        import importlib.util
+
+        expected = all(importlib.util.find_spec(m) is not None for m in ("gfpgan", "facexlib"))
+        assert face_restore.is_available() is expected
+
+
+class TestCompositeFaces:
+    """Unit tests for the pure ``_composite_faces`` helper (cv2/numpy only)."""
+
+    def _base_and_restored(self, h: int = 100, w: int = 120):
+        base = np.zeros((h, w, 3), dtype=np.uint8)  # black
+        restored = np.full((h, w, 3), 255, dtype=np.uint8)  # white
+        return base, restored
+
+    def test_output_shape_and_dtype(self):
+        base, restored = self._base_and_restored()
+        out = face_restore._composite_faces(base, restored, [(40.0, 30.0, 80.0, 70.0)])
+        assert out.shape == base.shape
+        assert out.dtype == np.uint8
+
+    def test_box_region_pulls_toward_restored(self):
+        base, restored = self._base_and_restored()
+        out = face_restore._composite_faces(base, restored, [(40.0, 30.0, 80.0, 70.0)])
+        # Center of the box should be near the restored (white) value.
+        cy, cx = 50, 60
+        assert out[cy, cx].mean() > 200
+
+    def test_far_from_box_stays_base(self):
+        base, restored = self._base_and_restored()
+        out = face_restore._composite_faces(base, restored, [(40.0, 30.0, 80.0, 70.0)], pad=2)
+        # Top-left corner is far from the box and feather, so it stays black.
+        assert out[0, 0].mean() < 5
+
+    def test_empty_boxes_returns_base_unchanged(self):
+        base, restored = self._base_and_restored()
+        out = face_restore._composite_faces(base, restored, [])
+        assert np.array_equal(out, base)
+
+    def test_box_fully_outside_is_skipped(self):
+        base, restored = self._base_and_restored(h=100, w=120)
+        # Box entirely beyond the right/bottom edge -> clipped to empty -> no-op.
+        out = face_restore._composite_faces(base, restored, [(200.0, 200.0, 260.0, 260.0)], pad=0)
+        assert np.array_equal(out, base)
+
+    def test_near_edge_box_clips_without_error(self):
+        base, restored = self._base_and_restored(h=100, w=120)
+        # Box reaching past the bottom-right corner must clip, not raise.
+        out = face_restore._composite_faces(base, restored, [(100.0, 80.0, 130.0, 110.0)], pad=10)
+        assert out.shape == base.shape
+        # The clipped in-bounds region still pulls toward white.
+        assert out[95, 115].mean() > 100
+
+
+@pytest.mark.skipif(not face_restore.is_available(), reason="requires the 'restore' extra (gfpgan/facexlib)")
+class TestRestoreFacesModel:
+    """Model-running smoke test, gated behind the optional extra."""
+
+    def test_no_faces_returns_cleaned_unchanged(self):
+        # A flat gray image has no faces; restore_faces must return the cleaned
+        # input unchanged (the no-op path).
+        cleaned = np.full((128, 128, 3), 127, dtype=np.uint8)
+        original = np.full((128, 128, 3), 127, dtype=np.uint8)
+        out = face_restore.restore_faces(original, cleaned)
+        assert out.shape == cleaned.shape
+        assert np.array_equal(out, cleaned)
@@ -1,150 +0,0 @@
-"""Tests for the PhotoMaker-V2 face identity restoration helper.
-
-These tests cover the pure-Python parts (face crop math, composite, the no-faces
-no-op, the is_available guard) WITHOUT loading PhotoMaker or SDXL -- the model-loading
-path is gated behind ``is_available()`` and exercised manually via the Modal cert
-sweep, mirroring the convention used for ``face_restore`` and ``upscaler``.
-
-The end-to-end PhotoMaker run is monkey-patched: we replace ``_get_pipeline`` with a
-fake pipeline whose ``__call__`` returns a known constant-color face, so we can verify
-that the right boxes get the right pixels composited back.
-"""
-
-from __future__ import annotations
-
-from types import SimpleNamespace
-
-import cv2
-import numpy as np
-
-from remove_ai_watermarks import photomaker_restore
-
-
-class TestIsAvailable:
-    def test_returns_bool(self):
-        assert isinstance(photomaker_restore.is_available(), bool)
-
-
-class TestV1OnlyCommercialSafetyGuard:
-    """The module must lock to PhotoMaker-V1 (Apache + CLIP-only encoder).
-
-    V2 pulls InsightFace antelopev2/buffalo_l face packs which are non-commercial.
-    A maintainer touching ``_PHOTOMAKER_FILE`` for any reason must trip this guard.
-    """
-
-    def test_repo_is_v1(self):
-        assert photomaker_restore._PHOTOMAKER_REPO == "TencentARC/PhotoMaker"
-
-    def test_weight_filename_is_v1(self):
-        assert photomaker_restore._PHOTOMAKER_FILE == "photomaker-v1.bin"
-
-    def test_module_source_does_not_call_face_analysis(self):
-        """We may IMPORT `insightface` (transitive) but must never instantiate FaceAnalysis."""
-        import inspect
-
-        src = inspect.getsource(photomaker_restore)
-        assert "FaceAnalysis" not in src
-        assert "insightface.app" not in src
-
-
-class TestFaceCropSquare:
-    def test_centers_on_face_box(self):
-        img = np.full((400, 400, 3), 128, dtype=np.uint8)
-        crop, box = photomaker_restore._face_crop_square(img, (100, 150, 80, 80))
-        x1, y1, x2, y2 = box
-        # The crop covers the requested box (with padding)
-        assert x1 <= 100
-        assert y1 <= 150
-        assert x2 >= 180
-        assert y2 >= 230
-        assert crop.shape[0] == y2 - y1
-        assert crop.shape[1] == x2 - x1
-
-    def test_clips_at_image_edges(self):
-        img = np.full((200, 200, 3), 128, dtype=np.uint8)
-        crop, (x1, y1, x2, y2) = photomaker_restore._face_crop_square(img, (180, 180, 30, 30))
-        # Box must be clipped within the image
-        assert x1 >= 0
-        assert y1 >= 0
-        assert x2 <= 200
-        assert y2 <= 200
-        assert crop.shape[0] == y2 - y1
-        assert crop.shape[1] == x2 - x1
-
-    def test_pad_widens_the_crop(self):
-        img = np.full((400, 400, 3), 128, dtype=np.uint8)
-        _, no_pad = photomaker_restore._face_crop_square(img, (150, 150, 50, 50), pad=0.0)
-        _, with_pad = photomaker_restore._face_crop_square(img, (150, 150, 50, 50), pad=0.5)
-        assert (with_pad[2] - with_pad[0]) > (no_pad[2] - no_pad[0])
-
-
class TestCompositeFaces:
|
||||
def test_empty_list_returns_base_unchanged(self):
|
||||
base = np.full((100, 100, 3), 64, dtype=np.uint8)
|
||||
out = photomaker_restore._composite_faces(base, [])
|
||||
assert np.array_equal(out, base)
|
||||
|
||||
def test_box_outside_image_is_skipped(self):
|
||||
base = np.full((100, 100, 3), 64, dtype=np.uint8)
|
||||
crop = np.full((40, 40, 3), 200, dtype=np.uint8)
|
||||
out = photomaker_restore._composite_faces(base, [(crop, (200, 200, 240, 240))])
|
||||
assert np.array_equal(out, base)
|
||||
|
||||
def test_composited_box_pulls_pixel_value_toward_crop(self):
|
||||
base = np.full((200, 200, 3), 40, dtype=np.uint8)
|
||||
crop = np.full((50, 50, 3), 220, dtype=np.uint8)
|
||||
# Place the crop fully inside the image at (60, 60)..(110, 110)
|
||||
out = photomaker_restore._composite_faces(base, [(crop, (60, 60, 110, 110))])
|
||||
# The box center should be heavily biased toward the crop color (>120) ...
|
||||
assert out[85, 85, 0] > 120
|
||||
# ... and corners (well outside the feathered region) stay close to base
|
||||
assert int(out[0, 0, 0]) - int(base[0, 0, 0]) <= 1
|
||||
|
||||
|
||||
class TestRestoreFacesPhotomakerControlFlow:
    """End-to-end control flow with a fake pipeline -- no diffusion model loaded."""

    @staticmethod
    def _fake_pipeline_class(fill_value: int = 200):
        """Class-based fake (no ``__call__`` on a SimpleNamespace, which Python won't dispatch)."""
        from PIL import Image

        size = photomaker_restore._PHOTOMAKER_FACE_SIZE
        fake_face = Image.fromarray(np.full((size, size, 3), fill_value, dtype=np.uint8))

        class _FakePipe:
            device = "cpu"

            def __call__(self, **_kwargs):
                return SimpleNamespace(images=[fake_face])

        return _FakePipe()

    def test_no_faces_returns_cleaned_unchanged(self, monkeypatch):
        # Force is_available so we never hit the missing-extra branch
        monkeypatch.setattr(photomaker_restore, "is_available", lambda: True)
        monkeypatch.setattr(photomaker_restore, "_get_pipeline", lambda: self._fake_pipeline_class())

        orig = np.full((200, 200, 3), 30, dtype=np.uint8)
        cleaned = np.full((200, 200, 3), 90, dtype=np.uint8)
        out = photomaker_restore.restore_faces_photomaker(orig, cleaned, detect_faces_fn=lambda _b: [])
        assert np.array_equal(out, cleaned)

    def test_one_face_gets_composited_into_cleaned(self, monkeypatch):
        monkeypatch.setattr(photomaker_restore, "is_available", lambda: True)
        monkeypatch.setattr(photomaker_restore, "_get_pipeline", lambda: self._fake_pipeline_class(fill_value=210))

        orig = np.full((400, 400, 3), 30, dtype=np.uint8)
        cleaned = np.full((400, 400, 3), 90, dtype=np.uint8)
        # Mark the original face region with a distinctive color so we can confirm the
        # crop reached the pipeline (not strictly tested here, but useful sanity).
        cv2.rectangle(orig, (150, 150), (250, 250), (200, 100, 50), -1)

        out = photomaker_restore.restore_faces_photomaker(
            orig, cleaned, detect_faces_fn=lambda _b: [(150, 150, 100, 100)]
        )
        # The cleaned image should have shifted toward the fake-face fill (210) inside
        # the face region.
        assert out[200, 200, 0] > 150
        # And the corner pixels (well outside the feather) should still be near the base.
        assert int(out[0, 0, 0]) - int(cleaned[0, 0, 0]) <= 1
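
Since `photomaker_restore.py` is removed in this commit, the crop tests above are the only remaining description of what `_face_crop_square` did. The following is a minimal sketch of a helper that would satisfy those tests, not the deleted implementation. The name `face_crop_square`, the default `pad` value, and the rounding are assumptions made for illustration. Only the `(x, y, w, h)` input box, the `(x1, y1, x2, y2)` return box, the clipping to image bounds, and the widening effect of `pad` are inferred from the tests.

```python
import numpy as np


def face_crop_square(img, box, pad=0.25):
    """Hypothetical stand-in for the removed `_face_crop_square` helper.

    `box` is (x, y, w, h), as the tests pass it. Returns the cropped pixels
    and the clipped (x1, y1, x2, y2) box that was actually used.
    """
    h, w = img.shape[:2]
    x, y, bw, bh = box

    # Square side: the longer box edge, grown by `pad` on each side (assumed).
    side = max(bw, bh) * (1.0 + 2.0 * pad)
    cx, cy = x + bw / 2.0, y + bh / 2.0
    half = side / 2.0

    # Clip to the image; near an edge the result may no longer be square.
    x1 = max(0, int(round(cx - half)))
    y1 = max(0, int(round(cy - half)))
    x2 = min(w, int(round(cx + half)))
    y2 = min(h, int(round(cy + half)))

    return img[y1:y2, x1:x2], (x1, y1, x2, y2)


# Usage: a face box near the bottom-right corner gets clipped to the image.
img = np.full((200, 200, 3), 128, dtype=np.uint8)
crop, (x1, y1, x2, y2) = face_crop_square(img, (180, 180, 30, 30))
```

Returning the clipped box alongside the crop lets a later compositing step paste the restored face back at the same coordinates, which is how the tests above use the returned tuple.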