feat(instantid): tighter face ellipse + color match for cleaner multi-face composite

Second multi-face iteration. v1-rect: full-1024 frame + Gaussian rectangle ->
patchwork. v2-ellipse: tight crop + ellipse 0.45*bw x 0.55*bh -> the ellipse
exceeds the bbox vertically (2 * 0.55*bh = 1.10*bh) and clips forehead/chin
on a single portrait, and group-photo faces drift visibly cooler than the
warm bar background. v3 changes:

1. **Smaller ellipse axes**: 0.32*bw x 0.42*bh. Both fit inside the bbox (since
   axes are radii from center, 0.32*bw extends 0.64*bw total width and
   0.42*bh extends 0.84*bh total height) so no chin/forehead clip even on
   non-square boxes. Face shape: vertically elongated (0.42 vs 0.32),
   matching real face geometry.

2. **Wider feather**: `min(bw, bh) // 5` instead of `// 8`, with the minimum
   blur kernel raised from 3 to 7. Edges fade over a wider band so the
   elliptical seam is less visible.

3. **Per-channel mean color match** (`_color_match`): before compositing,
   shift the regenerated face's mean BGR to match the cleaned canvas region
   where it lands. Each InstantID generation has independent SDXL noise so
   white balance drifts -- matching means equalises tone (warm bar / cool
   face -> warm face) without rescaling contrast.
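
A minimal sketch of the arithmetic behind points 1 and 2, in plain Python. The helper names `ellipse_fits` and `feather_kernel` are hypothetical and exist only for this illustration; the formulas follow the `axes` and `k` expressions in the diff, and the 200x260 bbox is an assumed example size.

```python
def ellipse_fits(bw: int, bh: int, fx: float, fy: float) -> tuple[bool, bool]:
    """Report whether an ellipse with semi-axes (fx*bw, fy*bh) stays inside a bw x bh box."""
    ax = max(1, int(bw * fx))  # horizontal radius in pixels
    ay = max(1, int(bh * fy))  # vertical radius in pixels
    # The full extent is twice the radius, so compare 2*radius to the box side.
    return 2 * ax <= bw, 2 * ay <= bh


def feather_kernel(bw: int, bh: int, feather_div: int, floor: int) -> int:
    """Gaussian kernel size from the shorter bbox side; `| 1` forces it odd."""
    return max(floor, (min(bw, bh) // feather_div) | 1)


if __name__ == "__main__":
    bw, bh = 200, 260  # assumed example bbox
    print(ellipse_fits(bw, bh, 0.45, 0.55))  # v2 axes -> (True, False)
    print(ellipse_fits(bw, bh, 0.32, 0.42))  # v3 axes -> (True, True)
    print(feather_kernel(bw, bh, 8, 3))      # v2 feather -> 25
    print(feather_kernel(bw, bh, 5, 7))      # v3 feather -> 41
```

The check ignores the one-pixel offset that `bw // 2` introduces for odd sizes, so treat it as an approximation of the fit rather than an exact pixel test.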

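A worked example of the per-channel mean shift from point 3, written as a plain-Python sketch so the numbers are easy to follow. `mean_shift_match` is a hypothetical stand-in that mirrors the `src - src_mean + ref_mean` arithmetic on a flat list of BGR pixels; the pixel values are made up for illustration.

```python
Pixel = tuple[int, int, int]  # (b, g, r)


def channel_means(pixels: list[Pixel]) -> list[float]:
    """Mean of each of the three channels over all pixels."""
    n = len(pixels)
    return [sum(p[c] for p in pixels) / n for c in range(3)]


def mean_shift_match(src: list[Pixel], ref: list[Pixel]) -> list[Pixel]:
    """Shift src so its per-channel mean equals ref's, clipped to 0..255."""
    if not src or not ref:
        return list(src)  # nothing to match against
    src_mean = channel_means(src)
    ref_mean = channel_means(ref)
    out = []
    for p in src:
        shifted = [p[c] - src_mean[c] + ref_mean[c] for c in range(3)]
        out.append(tuple(int(min(255.0, max(0.0, v))) for v in shifted))
    return out


if __name__ == "__main__":
    cool_face = [(140, 120, 100), (160, 140, 120)]  # mean BGR (150, 130, 110)
    warm_canvas = [(90, 120, 170), (110, 140, 190)]  # mean BGR (100, 130, 180)
    print(mean_shift_match(cool_face, warm_canvas))
    # -> [(90, 120, 170), (110, 140, 190)]
```

Each channel moves by a constant (here -50, 0, +70), so the difference between the two face pixels stays (20, 20, 20). Only values that hit the 0..255 clip lose that spacing.
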
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Victor Kuznetsov
2026-06-08 20:25:34 -07:00
parent 92c7245e2d
commit cdd6bd1fea
+42 -14
@@ -456,21 +456,49 @@ def restore_faces_instantid(
    return _composite_faces_elliptical(cleaned_bgr, restored)


def _color_match(src_bgr: NDArray[Any], ref_bgr: NDArray[Any]) -> NDArray[Any]:
    """Shift ``src_bgr`` mean colour to ``ref_bgr`` mean colour, per channel.

    Each face is regenerated by InstantID with its own SDXL noise -- the white
    balance / mean tone drifts away from the surrounding scene (cool studio
    light vs warm bar lighting). A per-channel mean-shift brings the face crop
    into the same tonal range as the cleaned canvas where it lands. Contrast
    and saturation are preserved (we don't rescale variance).
    """
    import numpy as np

    src = src_bgr.astype(np.float32)
    ref = ref_bgr.astype(np.float32)
    if ref.size == 0:
        return src_bgr
    src_mean = src.mean(axis=(0, 1), keepdims=True)
    ref_mean = ref.mean(axis=(0, 1), keepdims=True)
    return np.clip(src - src_mean + ref_mean, 0, 255).astype(np.uint8)
def _composite_faces_elliptical(
    base_bgr: NDArray[Any],
    restored_crops: list[tuple[NDArray[Any], tuple[int, int, int, int]]],
-    feather_div: int = 8,
+    feather_div: int = 5,
) -> NDArray[Any]:
    """Composite face crops into ``base_bgr`` using an elliptical, feathered alpha.

    Unlike ``photomaker_restore._composite_faces`` which feathers a RECTANGULAR
    alpha over the whole crop bbox, this builds an ELLIPSE inscribed in each
    bbox and feathers the ellipse edge. The bbox corners (which carry
    regenerated-scene background pixels) fade to zero so the cleaned-image
    background stays intact, eliminating multi-face patchwork on group photos.
    The ellipse covers roughly the head silhouette which is what we want to
    replace; everything outside it -- hair edges, shoulders, scene context --
    stays from the cleaned canvas.
    Two changes vs the simpler rectangular Gaussian feather:
    - **Inscribed face-shaped ellipse.** Axes are ``(0.32*bw, 0.42*bh)`` which
      fits comfortably inside the 2x padded bbox (the face naturally occupies
      the central ~50% of the bbox), covering the head silhouette without
      clipping the forehead or chin. The bbox corners (which carry
      regenerated-scene background pixels with a different tone per face) end
      up at alpha=0 so the cleaned-image background stays intact -- this is
      what eliminates multi-face patchwork on group photos.
    - **Soft feather.** ``min(bw, bh) // 5`` -- about twice as soft as the
      rectangular Gaussian, so the ellipse edge fades over a wider band into
      the cleaned canvas, hiding any residual seam.
    Additionally, before compositing, ``_color_match`` shifts the regenerated
    face's mean colour to match the cleaned canvas region it lands on -- this
    removes the warm/cool tone clash that group photos showed.
    """
    import cv2
    import numpy as np
@@ -485,15 +513,15 @@ def _composite_faces_elliptical(
        if bw <= 0 or bh <= 0:
            continue
        resized = cv2.resize(crop, (bw, bh), interpolation=cv2.INTER_LANCZOS4)
        # Tone match the regenerated face to the cleaned canvas it sits on.
        ref_region = base_bgr[y1:y2, x1:x2]
        resized = _color_match(resized, ref_region)
        # Elliptical alpha inscribed in the bbox (axes slightly inside so the
        # feather edge tapers cleanly inside the rectangle), feathered with a
        # Gaussian sized by the shorter side.
        alpha_crop = np.zeros((bh, bw), dtype=np.float32)
        center = (bw // 2, bh // 2)
-        axes = (max(1, int(bw * 0.45)), max(1, int(bh * 0.55)))
+        axes = (max(1, int(bw * 0.32)), max(1, int(bh * 0.42)))
        cv2.ellipse(alpha_crop, center, axes, 0, 0, 360, 1.0, -1)
-        k = max(3, (min(bw, bh) // feather_div) | 1)
+        k = max(7, (min(bw, bh) // feather_div) | 1)
        alpha_crop = cv2.GaussianBlur(alpha_crop, (k, k), 0)
        alpha_full = np.zeros((h_b, w_b), dtype=np.float32)