refactor(face-restore): rollback PhotoMaker, restore GFPGAN on the CLEANED image

After 7 cascading upstream-compat fixes (insightface dep, peft dep, pm_version,
device, etc.), the PhotoMaker V1 cert sweep still hit a CFG batch-dim mismatch
inside the denoising loop. The upstream PhotoMaker `pipeline.py` is forked from
diffusers v0.29.1 and our env runs 0.38; SDXL prompt-encoder handling changed
significantly between those versions, so making PhotoMaker work end-to-end needs
a proper fork or a diffusers downgrade, both expensive. Not worth shipping today.

Pivot: restore `face_restore.py` (GFPGAN) with a single-line fix that makes it
SynthID-safe by construction. The previous design ran GFPGAN.enhance on the
ORIGINAL watermarked image and was oracle-confirmed to re-add SynthID via the
weight-0.5 pixel blend. The fix is to run GFPGAN on the diffusion-CLEANED image:
whatever pixels GFPGAN derives from are already SynthID-free, so the partial
blend cannot transport the watermark. Identity fidelity is lower than a true
identity-as-embedding stack would deliver, but it ships and works.

Changes:

- `src/remove_ai_watermarks/face_restore.py` restored from pre-wipe state with
  one line changed: `restorer.enhance(cleaned_bgr, ...)` instead of
  `restorer.enhance(original_bgr, ...)`. `original_bgr` is kept as an unused
  positional argument for API stability.
- `src/remove_ai_watermarks/photomaker_restore.py` and its tests REMOVED. The
  research note (`docs/synthid-robust-identity-research.md`) keeps a "status
  notice" documenting why PhotoMaker is parked for now and what the path back in
  would look like.
- `pyproject.toml` `restore` extra restored (gfpgan/facexlib/basicsr +
  scipy<1.18 + numba<0.60 pins + the basicsr setuptools<69 build pin), plus
  `photomaker` extra (with its einops/insightface/peft pile) and the
  `[tool.hatch.metadata] allow-direct-references = true` block REMOVED.
- `InvisibleEngine._restore_faces_photomaker` removed; `_restore_faces`
  restored. The `--restore-faces` CLI flag and its plumbing through cmd_*
  signatures are unchanged.
- CLAUDE.md, README.md, docs/synthid.md,
  docs/controlnet-removal-pipeline-research.md updated to describe the shipped
  GFPGAN-on-cleaned design and to reference PhotoMaker only as the parked
  alternative.

ruff + strict pyright(src/) clean; 578 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
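
The blend argument in the commit message can be illustrated with a toy linear model. This is only a sketch under a loud assumption: it treats the fidelity weight as a plain per-pixel mix `weight * input + (1 - weight) * prior`, which simplifies what GFPGAN actually does. The names `blend` and `watermark_residual` are hypothetical and do not exist in the repository.

```python
def blend(input_px: float, prior_px: float, weight: float = 0.5) -> float:
    """Toy model (assumption): linear mix of an input pixel and a prior pixel."""
    return weight * input_px + (1.0 - weight) * prior_px


def watermark_residual(
    source_px: float, clean_px: float, prior_px: float, weight: float = 0.5
) -> float:
    """How much of (source_px - clean_px) is still present after the blend."""
    return blend(source_px, prior_px, weight) - blend(clean_px, prior_px, weight)


# Illustrative numbers only: a clean pixel, a prior pixel, and a small
# watermark perturbation added on top of the clean pixel.
clean, prior, mark = 120.0, 118.0, 2.0

# Old design: the blend input is the watermarked original, so a fraction of
# the perturbation (weight * mark) is carried into the output.
from_original = watermark_residual(clean + mark, clean, prior)

# New design: the blend input is already clean, so there is nothing to carry.
from_cleaned = watermark_residual(clean, clean, prior)

print(from_original, from_cleaned)
```

Under this model the residual depends only on what is fed into the blend, which is why swapping the input image addresses the problem without changing the weight.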
2026-06-10 12:53:56 +02:00 · 2026-06-08 16:55:45 -07:00
parent d1b85ee6a8
commit 01fe98bf54
13 changed files with 1273 additions and 851 deletions
@@ -238,18 +238,16 @@ def _warn_if_esrgan_unavailable(upscaler: str) -> None:
 def _restore_faces_options(f: Any) -> Any:
    """Attach the face-restoration flag to an invisible-pipeline command.

-    PhotoMaker-V2 is the only restoration method shipped (the prior GFPGAN path was
-    oracle-confirmed to re-introduce SynthID by partial pixel blending and has been
-    removed). PhotoMaker carries identity in a SynthID-invariant OpenCLIP embedding
-    and regenerates fresh face pixels conditioned on it -- see
-    ``docs/synthid-robust-identity-research.md``.
+    The post-pass runs GFPGAN on the DIFFUSION-CLEANED image (not the original), so
+    SynthID is not re-introduced (the input pixels GFPGAN derives from are already
+    SynthID-free). See ``face_restore.py``.
    """
    return click.option(
        "--restore-faces/--no-restore-faces",
        default=False,
-        help="EXPERIMENTAL, opt-in. Restore face identity with the PhotoMaker-V2 post-pass "
-        "when faces are present (needs the 'photomaker' extra); off by default, auto-skips "
-        "when no face is detected or the extra is absent.",
+        help="EXPERIMENTAL, opt-in. Polish face detail with a GFPGAN post-pass on the "
+        "cleaned image when faces are present (needs the 'restore' extra); off by default, "
+        "auto-skips when no face is detected or the extra is absent.",
    )(f)


@@ -0,0 +1,210 @@
+"""Optional GFPGAN face-polish post-pass for the invisible removal pipeline.
+
+The diffusion removal pass scrubs the watermark everywhere but lets faces drift in
+likeness (canny holds face *structure*, not *identity*). This module sharpens and
+re-synthesizes each face from GFPGAN's StyleGAN2 prior, running on the
+DIFFUSION-CLEANED image -- not on the original.
+
+**Why "cleaned, not original":** an earlier version of this module ran GFPGAN on the
+ORIGINAL (watermarked) image and was oracle-confirmed (2026-06-04) to re-introduce
+SynthID into the face regions, because GFPGAN at fidelity weight 0.5 blends ~half
+the input pixels with the prior, and SynthID is robust to that partial blend. The
+fix is to feed GFPGAN the already-clean image -- whatever pixels it preserves are
+already SynthID-free, so the composited face stays clean. Identity is recovered from
+the StyleGAN2 prior conditioned on the already-drifted cleaned face (not on the
+original face), so identity fidelity is somewhat lower than the would-have-been
+identity-as-embedding stack (PhotoMaker-V1), but the upstream PhotoMaker package has
+significant compatibility issues with the diffusers version we ship, so this is the
+shipping path.
+
+Both GFPGAN (Apache-2.0) and its RetinaFace detector (MIT) are commercial-safe.
+The GFPGANv1.4 weights and the RetinaFace detector download on first use and are
+never bundled. Requires the optional ``restore`` extra (gfpgan/facexlib/basicsr).
+"""
+
+# cv2/torch/gfpgan boundary: gfpgan/basicsr/facexlib ship no usable type stubs and
+# this module wraps cv2 (feather composite) and torch; relax the unknown-type rules
+# for this file only.
+# pyright: reportUnknownMemberType=false, reportUnknownArgumentType=false, reportUnknownVariableType=false, reportUnknownParameterType=false, reportMissingTypeArgument=false, reportMissingTypeStubs=false, reportMissingImports=false, reportArgumentType=false, reportAssignmentType=false, reportReturnType=false, reportCallIssue=false, reportIndexIssue=false, reportOperatorIssue=false, reportOptionalMemberAccess=false, reportOptionalCall=false, reportOptionalSubscript=false, reportOptionalOperand=false, reportAttributeAccessIssue=false, reportPrivateImportUsage=false, reportPrivateUsage=false, reportInvalidTypeForm=false, reportConstantRedefinition=false, reportUnnecessaryComparison=false
+from __future__ import annotations
+
+import logging
+import sys
+import threading
+from typing import TYPE_CHECKING, Any
+
+if TYPE_CHECKING:
+    from numpy.typing import NDArray
+
+logger = logging.getLogger(__name__)
+
+# GFPGANv1.4 weights (Apache-2.0). Downloaded on first use, never bundled.
+_GFPGAN_MODEL_URL = "https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.4.pth"
+_GFPGAN_ARCH = "clean"
+_GFPGAN_CHANNEL_MULTIPLIER = 2
+
+_restorer: Any | None = None
+_restorer_lock = threading.Lock()
+
+
+def is_available() -> bool:
+    """True when the optional GFPGAN face-restoration deps are importable."""
+    import importlib.util
+
+    return importlib.util.find_spec("gfpgan") is not None and importlib.util.find_spec("facexlib") is not None
+
+
+def _apply_basicsr_shim() -> None:
+    """Install the ``torchvision.transforms.functional_tensor`` compatibility shim.
+
+    basicsr (a GFPGAN dependency) imports ``rgb_to_grayscale`` from the
+    ``torchvision.transforms.functional_tensor`` module, which newer torchvision
+    removed. Recreate that module pointing at the public functional API. Idempotent:
+    only installed when the real module is missing.
+    """
+    import importlib.util
+
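+    # Check sys.modules first: find_spec() raises ValueError for a module that is
+    # registered without a __spec__, which is exactly what the shim below is.
+    if "torchvision.transforms.functional_tensor" in sys.modules:
+        return
+    if importlib.util.find_spec("torchvision.transforms.functional_tensor") is not None:
+        return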
+
+    import types
+
+    import torchvision.transforms.functional as tv_functional
+
+    shim = types.ModuleType("torchvision.transforms.functional_tensor")
+    shim.rgb_to_grayscale = tv_functional.rgb_to_grayscale
+    sys.modules["torchvision.transforms.functional_tensor"] = shim
+
+
+def _select_device() -> str:
+    """Pick the GFPGAN device: CUDA when present, else CPU.
+
+    The pip GFPGANer has an MPS device-mismatch bug, and this is a cheap post-pass
+    on a few face crops, so MPS is deliberately avoided -- CPU is the safe default
+    on Apple silicon.
+    """
+    try:
+        import torch
+
+        if torch.cuda.is_available():
+            return "cuda"
+    except Exception as e:
+        logger.debug("face_restore: CUDA probe failed (%s); using CPU", e)
+    return "cpu"
+
+
+def _get_restorer() -> Any:
+    """Return the lazily-built GFPGANer singleton (downloads weights on first use)."""
+    global _restorer
+    if _restorer is not None:
+        return _restorer
+    with _restorer_lock:
+        if _restorer is None:
+            _apply_basicsr_shim()
+            from gfpgan import GFPGANer
+
+            _restorer = GFPGANer(
+                model_path=_GFPGAN_MODEL_URL,
+                upscale=1,
+                arch=_GFPGAN_ARCH,
+                channel_multiplier=_GFPGAN_CHANNEL_MULTIPLIER,
+                device=_select_device(),
+            )
+    return _restorer
+
+
+def _composite_faces(
+    base_bgr: NDArray[Any],
+    restored_bgr: NDArray[Any],
+    boxes: list[tuple[float, float, float, float]],
+    pad: int = 14,
+    feather_div: int = 6,
+) -> NDArray[Any]:
+    """Feather-composite restored face regions from ``restored_bgr`` into ``base_bgr``.
+
+    Pure cv2/numpy helper (no gfpgan), so it is unit-testable without the model.
+    For each ``(x1, y1, x2, y2)`` box: pad and clip to the image, build a Gaussian-
+    feathered rectangular alpha, and blend ``restored * a + base * (1 - a)``. Boxes
+    that fall fully outside the image (or an empty list) leave ``base_bgr`` unchanged.
+    """
+    import cv2
+    import numpy as np
+
+    out = base_bgr.astype(np.float32)
+    h, w = base_bgr.shape[:2]
+
+    for box in boxes:
+        x1 = int(box[0]) - pad
+        y1 = int(box[1]) - pad
+        x2 = int(box[2]) + pad
+        y2 = int(box[3]) + pad
+        x1 = max(0, min(x1, w))
+        y1 = max(0, min(y1, h))
+        x2 = max(0, min(x2, w))
+        y2 = max(0, min(y2, h))
+        bw = x2 - x1
+        bh = y2 - y1
+        if bw <= 0 or bh <= 0:
+            continue
+
+        alpha = np.zeros((h, w), dtype=np.float32)
+        alpha[y1:y2, x1:x2] = 1.0
+        k = max(3, (min(bw, bh) // feather_div) | 1)  # odd kernel >= 3
+        alpha = cv2.GaussianBlur(alpha, (k, k), 0)
+        alpha = alpha[:, :, None]
+        out = restored_bgr.astype(np.float32) * alpha + out * (1.0 - alpha)
+
+    return np.clip(out, 0, 255).astype(np.uint8)
+
+
+def restore_faces(
+    original_bgr: NDArray[Any],  # legacy positional kept for API stability; unused
+    cleaned_bgr: NDArray[Any],
+    weight: float = 0.5,
+    pad: int = 14,
+    feather_div: int = 6,
+) -> NDArray[Any]:
+    """Restore face identity in ``cleaned_bgr`` by running GFPGAN on the CLEANED image.
+
+    GFPGAN is a fidelity-restoration net: it sharpens and re-synthesizes face details
+    from its StyleGAN2 prior conditioned on the INPUT face. **Running it on the
+    diffusion-cleaned image (not the original)** is what makes this pass SynthID-safe:
+    the input pixels GFPGAN derives from are already SynthID-free, so the partial
+    pixel-blend at the default weight 0.5 cannot re-introduce the watermark.
+
+    The earlier version of this module ran GFPGAN on the ORIGINAL (watermarked) image
+    and was oracle-confirmed (2026-06-04) to re-introduce SynthID into the face
+    regions. The fix is the single-line source swap below.
+
+    The ``original_bgr`` argument is kept for positional API stability with the
+    earlier signature but is no longer used; pass it for legacy callers, ignore it
+    in new code.
+
+    Args:
+        original_bgr: UNUSED (legacy; kept for positional API stability).
+        cleaned_bgr: The diffusion-cleaned image as cv2 BGR (faces drifted from the
+            removal pass). GFPGAN runs on THIS, polishing each face without changing
+            the watermark state of the source pixels.
+        weight: GFPGAN fidelity weight (0-1); lower = more StyleGAN2 regeneration of
+            the face from the prior.
+        pad: Pixels to grow each face box before compositing.
+        feather_div: Larger = sharper composite edge (box-min // feather_div kernel).
+    """
+    restorer = _get_restorer()
+    _, _, restored_img = restorer.enhance(
+        cleaned_bgr,
+        has_aligned=False,
+        only_center_face=False,
+        paste_back=True,
+        weight=weight,
+    )
+
+    det_faces = getattr(restorer.face_helper, "det_faces", None) or []
+    boxes = [(float(b[0]), float(b[1]), float(b[2]), float(b[3])) for b in det_faces]
+    if not boxes:
+        logger.debug("face_restore: no faces detected; returning cleaned image unchanged")
+        return cleaned_bgr
+
+    return _composite_faces(cleaned_bgr, restored_img, boxes, pad=pad, feather_div=feather_div)
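
The compatibility shim in `_apply_basicsr_shim` follows a general pattern: register a stand-in module in `sys.modules` so that a legacy import path resolves. The sketch below shows that pattern using only the standard library. The helper `install_shim` and the module name `hypothetical_legacy_mod` are illustrative assumptions, not part of this repository. The sketch checks `sys.modules` before calling `importlib.util.find_spec`, because `find_spec` raises `ValueError` for a registered module whose `__spec__` is `None`, which is the case for a bare `types.ModuleType`.

```python
import importlib.util
import sys
import types


def install_shim(name: str, **attrs: object) -> bool:
    """Register a stand-in module under ``name``; return True if one was installed."""
    # Already registered (real module or an earlier shim): nothing to do.
    if name in sys.modules:
        return False
    try:
        # The real module is importable: leave it alone.
        if importlib.util.find_spec(name) is not None:
            return False
    except ModuleNotFoundError:
        # A parent package of ``name`` is missing; fall through and shim it.
        pass
    shim = types.ModuleType(name)
    for attr_name, value in attrs.items():
        setattr(shim, attr_name, value)
    sys.modules[name] = shim
    return True


# Hypothetical legacy module name, used only for this demonstration.
installed = install_shim("hypothetical_legacy_mod", double=lambda x: 2 * x)
again = install_shim("hypothetical_legacy_mod", double=lambda x: 3 * x)

# The legacy import path now resolves to the shim registered by the first call.
from hypothetical_legacy_mod import double

print(installed, again, double(21))
```

The second call is a no-op, so the helper can be invoked safely on every lazy-initialization attempt, including a retry after a failed model load.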
@@ -180,13 +180,11 @@ class InvisibleEngine:
            guidance_scale: Classifier-free guidance scale.
            seed: Random seed for reproducibility.
            humanize: Intensity of Analog Humanizer film grain (0 = off).
-            restore_faces: EXPERIMENTAL, opt-in (default False). Run the PhotoMaker-V2
-                face-identity post-pass when faces are present (needs the
-                ``photomaker`` extra). Carries identity via a SynthID-invariant OpenCLIP
-                embedding and regenerates fresh face pixels conditioned on it, so the
-                pixel watermark is not transported. Auto-skips with a debug log when the
-                extra is absent or no face is detected. See
-                ``docs/synthid-robust-identity-research.md``.
+            restore_faces: EXPERIMENTAL, opt-in (default False). Run the GFPGAN
+                face-polish post-pass when faces are present (needs the ``restore``
+                extra). Runs on the diffusion-CLEANED image (not the original), so
+                SynthID is not re-introduced. Auto-skips with a debug log when the
+                extra is absent or no face is detected.
            unsharp: Final unsharp-mask sharpening strength (0 = off, default).
                Applied last (after face restoration) to counter the soft,
                over-smoothed look of the diffusion + restoration; ~0.5-0.8 is a
@@ -312,13 +310,13 @@ class InvisibleEngine:
                    out_cv = cv2.resize(out_cv, orig_size, interpolation=cv2.INTER_LANCZOS4)
                    image_io.imwrite(out_path, out_cv)

-            # Optional PhotoMaker-V2 face-identity post-pass: restore face identity that
-            # the diffusion regeneration drifted, carrying identity in a SynthID-invariant
-            # OpenCLIP embedding so the regenerated face pixels are watermark-free. Runs
-            # on the cleaned output at its final resolution; auto-skips when faces are
-            # absent or the optional extra is not installed.
+            # Optional GFPGAN face-polish post-pass: sharpens and re-synthesizes each
+            # face from GFPGAN's StyleGAN2 prior, running on the DIFFUSION-CLEANED image
+            # (not the original) -- so SynthID is not re-introduced (the input pixels
+            # GFPGAN derives from are already SynthID-free). Auto-skips when faces are
+            # absent or the optional `restore` extra is not installed.
            if restore_faces:
-                self._restore_faces_photomaker(out_path, image, seed)
+                self._restore_faces(out_path)

            # Final sharpening, LAST so it crisps the face-restored result too (a
            # pre-restore sharpen would be smoothed back over by the face pass).
@@ -357,50 +355,42 @@ class InvisibleEngine:
            if _tmp_path.exists():
                _tmp_path.unlink()

-    def _restore_faces_photomaker(
-        self,
-        out_path: Path,
-        original_image: Any,
-        seed: int | None,
-    ) -> None:
-        """Run the PhotoMaker-V2 SynthID-safe face-identity restoration post-pass.
+    def _restore_faces(self, out_path: Path) -> None:
+        """Run the GFPGAN face-polish post-pass on the cleaned ``out_path``.

-        Unlike the GFPGAN path (which blends watermarked original face pixels back into
-        the cleaned output and re-introduces SynthID), PhotoMaker carries identity in a
-        SynthID-invariant OpenCLIP embedding and regenerates fresh face pixels conditioned
-        on it. Best-effort: any failure (missing extra, model load, runtime error) logs a
-        warning and leaves the un-restored cleaned output in place. See
-        ``docs/synthid-robust-identity-research.md`` and ``photomaker_restore.py``.
+        SynthID-safe: GFPGAN is run on the diffusion-CLEANED image (not the original),
+        so the partial pixel-blend it does at fidelity weight 0.5 cannot re-introduce
+        the watermark -- the input pixels GFPGAN derives from are already SynthID-free.
+        Best-effort: any failure logs a warning and leaves the un-restored cleaned
+        output in place; a missing ``restore`` extra is logged at debug and skipped
+        (the flag must never error when the extra is absent or no face is present).
        """
-        from remove_ai_watermarks import photomaker_restore
+        from remove_ai_watermarks import face_restore

-        if not photomaker_restore.is_available():
-            logger.debug("restore_faces=photomaker requested but the 'photomaker' extra is not installed; skipping")
+        if not face_restore.is_available():
+            logger.debug("restore_faces requested but the 'restore' extra is not installed; skipping")
            return

        try:
            import cv2
-            import numpy as np

            from remove_ai_watermarks import image_io

            cleaned_bgr = image_io.imread(out_path, cv2.IMREAD_COLOR)
            if cleaned_bgr is None:
-                logger.warning("restore_faces_photomaker: could not read cleaned output %s; skipping", out_path)
+                logger.warning("restore_faces: could not read cleaned output %s; skipping", out_path)
                return

-            original_rgb = original_image.convert("RGB")
-            original_bgr = cv2.cvtColor(np.array(original_rgb), cv2.COLOR_RGB2BGR)
-            cleaned_size = (cleaned_bgr.shape[1], cleaned_bgr.shape[0])
-            if (original_bgr.shape[1], original_bgr.shape[0]) != cleaned_size:
-                original_bgr = cv2.resize(original_bgr, cleaned_size, interpolation=cv2.INTER_LANCZOS4)
-
            if self._progress_callback:
-                self._progress_callback("Restoring face identity (PhotoMaker-V2 post-pass)...")
-            restored = photomaker_restore.restore_faces_photomaker(original_bgr, cleaned_bgr, seed=seed)
+                self._progress_callback("Polishing face identity (GFPGAN on cleaned image)...")
+            # original_bgr is unused (GFPGAN runs on cleaned_bgr); pass an empty array
+            # for positional API stability with the legacy signature.
+            import numpy as np
+
+            restored = face_restore.restore_faces(np.empty((0, 0, 3), dtype=np.uint8), cleaned_bgr)
            image_io.imwrite(out_path, restored)
        except Exception as e:
-            logger.warning("restore_faces_photomaker post-pass failed (%s); keeping un-restored output", e)
+            logger.warning("restore_faces post-pass failed (%s); keeping un-restored output", e)

    def remove_watermark_batch(
        self,
@@ -1,343 +0,0 @@
-"""SynthID-robust face identity restoration via PhotoMaker-V1.
-
-The diffusion removal pass scrubs the pixel watermark from the WHOLE image, including
-faces, but lets faces drift in identity. Unlike the GFPGAN restore pass in
-``face_restore.py`` (which runs on the watermarked ORIGINAL and re-introduces SynthID
-via partial pixel blending), PhotoMaker carries identity in a SEMANTIC EMBEDDING
-(OpenCLIP-ViT-H/14 image embedding, finetuned by PhotoMaker-V2) and uses it to
-CONDITION a fresh txt2img generation -- the pixels are new, so the watermark cannot
-be transported.
-
-That the embedding cannot carry an invisible pixel watermark like SynthID was
-empirically confirmed 2026-06-04: on 31 face crops, the cosine similarity between
-``embed(orig)`` and ``embed(synthid_proxy(orig))`` (a ±2 LSB low-frequency noise of
-SynthID magnitude) is 0.9977 -- an order of magnitude less drift than JPEG90, which
-SynthID survives at >=99% TPR by design. See ``docs/synthid-robust-identity-research.md``.
-
-Architecture: PhotoMaker-V1 is a fine-tuned OpenCLIP-ViT-H/14 ID encoder plus LoRA on
-the SDXL UNet attention layers. It ships as a single ``photomaker-v1.bin`` checkpoint
-loaded into a ``PhotoMakerStableDiffusionXLPipeline`` (txt2img). **V1, not V2:** V2
-adds an InsightFace/ArcFace face-recognition component at runtime, whose pretrained
-model packs (antelopev2, buffalo_l) are non-commercial-research-only per the
-InsightFace README, which would block a paid service like raiw.cc. V1's identity
-encoder is CLIP-only (PhotoMakerIDEncoder, ``model.py``); confirmed by inspecting
-the upstream source (model_v2.py forward takes ``id_embeds`` from InsightFace; V1
-forward does not). We use it as a SECOND PASS after the main controlnet/default
-removal:
-
-  1. Main removal pass (`controlnet` at the certified strength) cleans SynthID
-     everywhere but leaves faces drifted.
-  2. For each face found in the CLEANED image (YuNet), this module takes the SAME
-     face region from the ORIGINAL, computes a PhotoMaker ID embedding from it, and
-     runs PhotoMaker txt2img to regenerate JUST that face crop from the embedding.
-     The freshly generated face is feather-composited back into the cleaned image.
-
-The generated face pixels are diffusion-fresh and inherit identity from the embedding
-(not the pixels), so SynthID is not re-introduced.
-
-Commercial-safe end-to-end:
- PhotoMaker-V1 weights: Apache-2.0 (TencentARC).
- ID encoder: OpenCLIP-ViT-H/14 (MIT) finetuned by PhotoMaker (still Apache-2.0).
- SDXL base: shared with the main pipeline (already used in `default`/`controlnet`).
- NO InsightFace / antelopev2 (the non-commercial blocker that BLOCKS PhotoMaker-V2,
-  IP-Adapter FaceID, InstantID, PuLID, and Arc2Face). V1 is the only commercial-safe
-  member of this family.
-
-Requires the optional ``photomaker`` extra: ``pip install
-'remove-ai-watermarks[photomaker]'`` (pulls torch / diffusers / the upstream PhotoMaker
-package, all commercial-safe). Weights download on first use; never bundled.
-
-**Why the extra includes ``insightface`` even though we use V1.** The upstream
-PhotoMaker package's ``__init__.py`` unconditionally imports its face-analyser
-wrapper (an InsightFace subclass), so JUST importing the V1 pipeline class needs
-``insightface`` to be importable -- otherwise the import errors with
-``ModuleNotFoundError: No module named 'insightface'`` (caught empirically by the
-Modal cert sweep 2026-06-04). The PyPI ``insightface`` package itself is MIT-licensed
-CODE; the non-commercial restriction is on the pretrained MODEL packs (antelopev2,
-buffalo_l), which only download when the face-analyser class is INSTANTIATED. **We
-never instantiate it** -- our V1 path uses
-``PhotoMakerStableDiffusionXLPipeline.load_photomaker_adapter`` which loads
-photomaker-v1.bin (the OpenCLIP-only encoder) and never touches the InsightFace face
-analyser. So the legal status of the InsightFace model packs does not bind us; this
-module only depends on the MIT-licensed CODE for the import to resolve. A test
-(``tests/test_photomaker_restore.py::TestV1OnlyCommercialSafetyGuard``) asserts that
-this module never references the face-analyser class.
-"""
-
-# cv2/torch/diffusers boundary: relax unknown-type rules for this file only.
-# pyright: reportUnknownMemberType=false, reportUnknownArgumentType=false, reportUnknownVariableType=false, reportUnknownParameterType=false, reportMissingTypeArgument=false, reportMissingTypeStubs=false, reportMissingImports=false, reportArgumentType=false, reportAssignmentType=false, reportReturnType=false, reportCallIssue=false, reportIndexIssue=false, reportOperatorIssue=false, reportOptionalMemberAccess=false, reportOptionalCall=false, reportOptionalSubscript=false, reportOptionalOperand=false, reportAttributeAccessIssue=false, reportPrivateImportUsage=false, reportPrivateUsage=false, reportInvalidTypeForm=false, reportConstantRedefinition=false, reportUnnecessaryComparison=false
-from __future__ import annotations
-
-import importlib.util
-import logging
-import threading
-from pathlib import Path
-from typing import TYPE_CHECKING, Any
-
-if TYPE_CHECKING:
-    from numpy.typing import NDArray
-
-logger = logging.getLogger(__name__)
-
-# PhotoMaker-V1 weights (Apache-2.0, TencentARC). Downloaded on first use. V2 is NOT
-# used because it pulls InsightFace at runtime (non-commercial models).
-_PHOTOMAKER_REPO = "TencentARC/PhotoMaker"
-_PHOTOMAKER_FILE = "photomaker-v1.bin"
-# SDXL base shared with the main pipeline (same checkpoint as `default`/`controlnet`).
-_SDXL_MODEL_ID = "stabilityai/stable-diffusion-xl-base-1.0"
-
-# The neutral prompt PhotoMaker is designed around: a class noun + the trigger word
-# `img`, which PhotoMaker replaces with the ID embedding at inference. Keeping it
-# scene-neutral (no extra style words) maximises identity transfer from the embed and
-# minimises hallucinated background/lighting that would not match the cleaned scene.
-_PHOTOMAKER_PROMPT = "a portrait photo of a person img, natural lighting, sharp focus"
-_PHOTOMAKER_NEGATIVE = "blurry, lowres, deformed, distorted, watermark"
-
-# Square size used to feed PhotoMaker (must match a multiple of 64; 512 fits CPU/GPU
-# comfortably and gives the encoder enough pixels for a stable embedding).
-_PHOTOMAKER_FACE_SIZE = 512
-
-_pipeline: Any | None = None
-_pipeline_lock = threading.Lock()
-
-
-def is_available() -> bool:
-    """True when the optional PhotoMaker extra deps are importable."""
-    return (
-        importlib.util.find_spec("photomaker") is not None
-        and importlib.util.find_spec("diffusers") is not None
-        and importlib.util.find_spec("huggingface_hub") is not None
-    )
-
-
-def _select_device() -> str:
-    """Pick the PhotoMaker pipeline device: CUDA when present, MPS on Apple, else CPU."""
-    try:
-        import torch
-
-        if torch.cuda.is_available():
-            return "cuda"
-        if torch.backends.mps.is_available():
-            return "mps"
-    except Exception as e:
-        logger.debug("photomaker_restore: device probe failed (%s); using CPU", e)
-    return "cpu"
-
-
-def _get_pipeline() -> Any:
-    """Return the lazily-built PhotoMaker pipeline singleton (downloads weights on first use)."""
-    global _pipeline
-    if _pipeline is not None:
-        return _pipeline
-    with _pipeline_lock:
-        if _pipeline is None:
-            import torch
-            from huggingface_hub import hf_hub_download
-            from photomaker import PhotoMakerStableDiffusionXLPipeline
-
-            device = _select_device()
-            dtype = torch.float16 if device == "cuda" else torch.float32
-            logger.info("photomaker_restore: loading SDXL+PhotoMaker on %s (%s)", device, dtype)
-
-            # Belt-and-suspenders: V1 file name. If a future maintainer points
-            # _PHOTOMAKER_FILE at v2, this stops the build so we don't silently regress
-            # to the non-commercial InsightFace path.
-            if _PHOTOMAKER_FILE != "photomaker-v1.bin":
-                raise RuntimeError(
-                    f"PhotoMaker V1 is the only commercial-safe variant; got "
-                    f"{_PHOTOMAKER_FILE!r}. V2 requires the non-commercial InsightFace "
-                    "antelopev2/buffalo_l face packs "
-                    "(see docs/synthid-robust-identity-research.md)."
-                )
-            adapter_path = hf_hub_download(repo_id=_PHOTOMAKER_REPO, filename=_PHOTOMAKER_FILE)
-            pipe = PhotoMakerStableDiffusionXLPipeline.from_pretrained(_SDXL_MODEL_ID, torch_dtype=dtype)
-            # Move SDXL submodules to the device BEFORE loading the PhotoMaker adapter:
-            # ``load_photomaker_adapter`` reads ``self.device`` / ``self.unet.dtype`` to
-            # place the new ID encoder. If we ``.to(device)`` after, the SDXL submodules
-            # move but the id_encoder stays where it was (custom attribute, not in the
-            # auto-managed module tree), and inference errors with
-            # "Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor)
-            # should be the same" (caught empirically 2026-06-04).
-            pipe.to(device)
-            # ``pm_version="v1"`` is REQUIRED: the upstream loader defaults to v2 and would
-            # build the V2 encoder (PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken), then
-            # error on load_state_dict because the v1 weights have a different shape.
-            # Passing v1 builds the CLIP-only PhotoMakerIDEncoder, which is the
-            # commercial-safe path we want.
-            pipe.load_photomaker_adapter(
-                str(Path(adapter_path).parent),
-                subfolder="",
-                weight_name=_PHOTOMAKER_FILE,
-                trigger_word="img",
-                pm_version="v1",
-            )
-            pipe.fuse_lora()
-            # Belt: also explicitly cast the loaded id_encoder, because some
-            # diffusers/torch combinations leave the encoder buffers untouched even
-            # though ``pipe.to(device)`` ran first.
-            if hasattr(pipe, "id_encoder") and pipe.id_encoder is not None:
-                pipe.id_encoder = pipe.id_encoder.to(device=device, dtype=dtype)
-            _pipeline = pipe
-    return _pipeline
-
-
-def _face_crop_square(
-    image_bgr: NDArray[Any],
-    box: tuple[int, int, int, int],
-    pad: float = 0.30,
-) -> tuple[NDArray[Any], tuple[int, int, int, int]]:
-    """Square crop around a face box (with padding), clipped to the image.
-
-    Returns ``(crop_bgr, (x1, y1, x2, y2))``. The crop is the image content inside the
-    returned square box -- callers use the box for the composite step. Pure numpy slicing,
-    no model.
-    """
-    h, w = image_bgr.shape[:2]
-    x, y, bw, bh = box
-    cx, cy = x + bw // 2, y + bh // 2
-    side = int(max(bw, bh) * (1.0 + 2.0 * pad))
-    half = side // 2
-    x1 = max(0, cx - half)
-    y1 = max(0, cy - half)
-    x2 = min(w, cx + half)
-    y2 = min(h, cy + half)
-    return image_bgr[y1:y2, x1:x2], (x1, y1, x2, y2)
-
-
-def _composite_faces(
-    base_bgr: NDArray[Any],
-    restored_crops: list[tuple[NDArray[Any], tuple[int, int, int, int]]],
-    feather_div: int = 6,
-) -> NDArray[Any]:
-    """Feather-composite a list of ``(restored_crop, (x1, y1, x2, y2))`` into ``base_bgr``.
-
-    Pure cv2/numpy helper (no model), unit-testable. For each ``(crop, box)``: resize
-    the crop to the box size, build a Gaussian-feathered rectangular alpha, and blend
-    ``crop * a + base * (1 - a)``. Boxes that fall fully outside the image (or an empty
-    list) leave ``base_bgr`` unchanged. Mirrors the alpha math in ``face_restore._composite_faces``.
-    """
-    import cv2
-    import numpy as np
-
-    out = base_bgr.astype(np.float32)
-    h, w = base_bgr.shape[:2]
-
-    for crop, (x1, y1, x2, y2) in restored_crops:
-        x1, y1 = max(0, x1), max(0, y1)
-        x2, y2 = min(w, x2), min(h, y2)
-        bw, bh = x2 - x1, y2 - y1
-        if bw <= 0 or bh <= 0:
-            continue
-        resized = cv2.resize(crop, (bw, bh), interpolation=cv2.INTER_LANCZOS4)
-
-        alpha = np.zeros((h, w), dtype=np.float32)
-        alpha[y1:y2, x1:x2] = 1.0
-        k = max(3, (min(bw, bh) // feather_div) | 1)
-        alpha = cv2.GaussianBlur(alpha, (k, k), 0)[:, :, None]
-
-        full_restored = np.zeros_like(out)
-        full_restored[y1:y2, x1:x2] = resized
-        out = full_restored * alpha + out * (1.0 - alpha)
-
-    return np.clip(out, 0, 255).astype(np.uint8)
-
-
-def restore_faces_photomaker(
-    original_bgr: NDArray[Any],
-    cleaned_bgr: NDArray[Any],
-    num_inference_steps: int = 30,
-    guidance_scale: float = 5.0,
-    style_strength: int = 20,
-    seed: int | None = None,
-    detect_faces_fn: Any | None = None,
-) -> NDArray[Any]:
-    """SynthID-robust face identity restoration via PhotoMaker txt2img.
-
-    Pipeline:
-      1. Detect faces in ``cleaned_bgr`` (YuNet via the package's ``auto_config`` by
-         default; override via ``detect_faces_fn`` for tests).
-      2. For each face: take the SAME box from ``original_bgr`` -> square crop -> PhotoMaker
-         txt2img with that crop as the ID image -> a fresh face generated from the
-         OpenCLIP embedding (the embedding is SynthID-invariant by ~3 orders of magnitude,
-         see docs/synthid-robust-identity-research.md).
-      3. Feather-composite each regenerated face into ``cleaned_bgr``.
-
-    Faces are taken from ``original_bgr`` (the embedding ignores the watermark) but the
-    PIXELS that land in the output are diffusion-fresh, so SynthID is not transported.
-
-    Args:
-        original_bgr: The original (watermarked) image as cv2 BGR. Source of identity.
-            Assumed to match ``cleaned_bgr`` in size, since face boxes are shared.
-        cleaned_bgr: The main-pass output as cv2 BGR. Its faces may have drifted in
-            identity; this function replaces those face regions.
-        num_inference_steps: Diffusion steps inside PhotoMaker (default 30).
-        guidance_scale: CFG scale inside PhotoMaker (default 5.0; the PhotoMaker recipe).
-        style_strength: Passed as PhotoMaker's ``start_merge_step``, ~20-30 (default 20).
-        seed: Optional seed for reproducibility.
-        detect_faces_fn: Optional callable ``(bgr) -> list[(x,y,w,h)]`` to override the
-            default YuNet detector (used by tests).
-
-    Returns:
-        ``cleaned_bgr`` with regenerated face regions composited in (or unchanged when
-        no face is detected).
-    """
-    import cv2
-    import numpy as np
-    import torch
-    from PIL import Image
-
-    if detect_faces_fn is None:
-        from remove_ai_watermarks import auto_config as _ac
-
-        def _default_detect(bgr: NDArray[Any]) -> list[tuple[int, int, int, int]]:
-            h, w = bgr.shape[:2]
-            model = Path(_ac.__file__).parent / "assets" / "face_detection_yunet_2023mar.onnx"
-            det = cv2.FaceDetectorYN.create(str(model), "", (w, h), _ac._FACE_SCORE, 0.3, 5000)
-            det.setInputSize((w, h))
-            _, faces = det.detect(bgr)
-            if faces is None:
-                return []
-            return [(int(f[0]), int(f[1]), int(f[2]), int(f[3])) for f in faces if int(f[2]) > 0 and int(f[3]) > 0]
-
-        detect_faces_fn = _default_detect
-
-    boxes = detect_faces_fn(cleaned_bgr)
-    if not boxes:
-        logger.debug("photomaker_restore: no faces detected; returning cleaned image unchanged")
-        return cleaned_bgr
-
-    pipeline = _get_pipeline()
-    generator = None
-    if seed is not None:
-        generator = torch.Generator(device=pipeline.device).manual_seed(seed)
-
-    restored: list[tuple[NDArray[Any], tuple[int, int, int, int]]] = []
-    for box in boxes:
-        id_crop_bgr, square_box = _face_crop_square(original_bgr, box)
-        if id_crop_bgr.size == 0:
-            continue
-        id_crop_rgb = cv2.cvtColor(id_crop_bgr, cv2.COLOR_BGR2RGB)
-        id_image_pil = Image.fromarray(id_crop_rgb)
-
-        # Don't pass negative_prompt: the PhotoMaker pipeline manages its own CFG by
-        # concatenating [negative_prompt_embeds, prompt_embeds]; if we pass a custom
-        # negative the upstream code splits text_only vs id-injected branches and
-        # the resulting embed batch dims can mismatch (we saw
-        # "Sizes of tensors must match except in dimension 1. Expected size 2 but got
-        # size 1" on a real run). The default empty negative is what the upstream
-        # gradio demo uses.
-        out = pipeline(
-            prompt=_PHOTOMAKER_PROMPT,
-            input_id_images=[id_image_pil],
-            num_inference_steps=num_inference_steps,
-            guidance_scale=guidance_scale,
-            start_merge_step=style_strength,
-            generator=generator,
-            height=_PHOTOMAKER_FACE_SIZE,
-            width=_PHOTOMAKER_FACE_SIZE,
-            num_images_per_prompt=1,
-        )
-        gen_rgb = out.images[0]
-        gen_bgr = cv2.cvtColor(np.array(gen_rgb), cv2.COLOR_RGB2BGR)
-        restored.append((gen_bgr, square_box))
-
-    return _composite_faces(cleaned_bgr, restored)