fix(photomaker): drop explicit negative_prompt to fix CFG batch mismatch

Modal cert sweep #6 got as far as the denoising loop before failing with:

  Sizes of tensors must match except in dimension 1. Expected size 2 but got
  size 1 for tensor number 1 in the list.

In the PhotoMaker pipeline's denoising loop, the per-step embeddings are built
as torch.cat([negative_prompt_embeds, prompt_embeds(_text_only)], dim=0). When a
custom negative_prompt is passed, the text-encoder + ID-encoder flow can leave
the negative branch at batch=2 and the ID-injected branch at batch=1. A dim=0
cat tolerates differing batch sizes, and the error names dimension 1, so the op
that actually raises is presumably a later concat along dim 1 that receives
these mismatched batches. The upstream gradio demo passes no negative_prompt
and relies on the pipeline's empty default; do the same.
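
The concatenation shape rule behind that error can be modelled in a few lines
of plain Python. This is only a sketch: cat_shape is a hypothetical helper, not
a torch API, and the shapes are made up for illustration rather than taken from
the failing run.

```python
def cat_shape(shapes, dim):
    """Return the shape a concatenation along `dim` would produce.

    Pure-Python model of the rule: every dimension except `dim` must match.
    """
    first = shapes[0]
    for idx, shape in enumerate(shapes[1:], start=1):
        if len(shape) != len(first):
            raise ValueError(
                f"tensor {idx} has rank {len(shape)}, expected {len(first)}"
            )
        for axis, (want, got) in enumerate(zip(first, shape)):
            if axis != dim and want != got:
                raise ValueError(
                    f"sizes must match except in dimension {dim}: "
                    f"expected size {want} but got size {got} "
                    f"for tensor number {idx}"
                )
    out = list(first)
    out[dim] = sum(shape[dim] for shape in shapes)
    return tuple(out)


# Hypothetical shapes: one tensor doubled for CFG (batch 2), one not (batch 1).
doubled = (2, 1280)
single = (1, 6)

# Along dim 0, differing batch sizes are fine as long as the other dims agree.
stacked = cat_shape([(2, 77, 2048), (1, 77, 2048)], dim=0)

# Along dim 1, the batch dimension must agree, so 2 vs 1 is rejected.
try:
    cat_shape([doubled, single], dim=1)
    failed = False
    message = ""
except ValueError as exc:
    failed = True
    message = str(exc)
```

Under this model the dim=0 case succeeds and the dim=1 case raises with
"expected size 2 but got size 1", which matches the wording of the error quoted
above.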

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Victor Kuznetsov
2026-06-08 16:35:40 -07:00
parent 031c38dc7f
commit d1b85ee6a8
@@ -318,9 +318,15 @@ def restore_faces_photomaker(
     id_crop_rgb = cv2.cvtColor(id_crop_bgr, cv2.COLOR_BGR2RGB)
     id_image_pil = Image.fromarray(id_crop_rgb)
+    # Don't pass negative_prompt: the PhotoMaker pipeline manages its own CFG by
+    # concatenating [negative_prompt_embeds, prompt_embeds]; if we pass a custom
+    # negative the upstream code splits text_only vs id-injected branches and
+    # the resulting embed batch dims can mismatch (we saw
+    # "Sizes of tensors must match except in dimension 1. Expected size 2 but got
+    # size 1" on a real run). The default empty negative is what the upstream
+    # gradio demo uses.
     out = pipeline(
         prompt=_PHOTOMAKER_PROMPT,
-        negative_prompt=_PHOTOMAKER_NEGATIVE,
         input_id_images=[id_image_pil],
         num_inference_steps=num_inference_steps,
         guidance_scale=guidance_scale,