fix(photomaker): drop explicit negative_prompt to fix CFG batch mismatch

Modal cert sweep #6 made it INTO the denoising loop and died with:

"Sizes of tensors must match except in dimension 1. Expected size 2 but got size 1 for tensor number 1 in the list."

In the PhotoMaker pipeline's denoising loop, the per-step embeddings are built as torch.cat([negative_prompt_embeds, prompt_embeds(_text_only)], dim=0). The text-encoder + ID-encoder flow can leave the negative branch at batch=2 and the ID-injected branch at batch=1 when a custom negative_prompt is passed, so the cat fails.

The upstream gradio demo just passes no negative_prompt and relies on the pipeline's empty default; do the same.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
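For reading the error text, here is a minimal pure-Python sketch of the shape rule torch.cat enforces: every axis except the concatenation axis must match across inputs. The helper `cat_shape` is hypothetical, it raises ValueError where torch raises RuntimeError, and the (batch, tokens, width) shapes are illustrative rather than taken from the PhotoMaker pipeline. In torch.cat's wording, "except in dimension 1" names the concatenation axis, so a 2-vs-1 batch mismatch is rejected by a cat along dimension 1 while a cat along dimension 0 accepts it. That suggests the failing call may be a dimension-1 concatenation downstream of the dim=0 CFG cat; this is an inference from the message, not something the run log confirms.

```python
def cat_shape(shapes, dim):
    """Model of torch.cat's shape check (hypothetical helper, not a torch API).

    Returns the resulting shape, or raises ValueError with a message worded
    like the one torch emits when a non-concatenation axis differs.
    """
    first = shapes[0]
    dim = dim % len(first)  # allow negative axes such as -1
    out = list(first)
    for idx, shape in enumerate(shapes[1:], start=1):
        if len(shape) != len(first):
            raise ValueError("all inputs must have the same number of dimensions")
        for axis, (want, got) in enumerate(zip(first, shape)):
            # Only the concatenation axis is allowed to differ.
            if axis != dim and want != got:
                raise ValueError(
                    f"Sizes of tensors must match except in dimension {dim}. "
                    f"Expected size {want} but got size {got} "
                    f"for tensor number {idx} in the list."
                )
        out[dim] += shape[dim]
    return tuple(out)


# Illustrative shapes only: negative branch at batch=2, ID branch at batch=1.
negative_shape = (2, 77, 2048)
id_shape = (1, 77, 2048)

# Along dim=0 the batch sizes simply add up; no error is raised.
stacked = cat_shape([negative_shape, id_shape], dim=0)

# Along dim=1 the batch axis must match, so the 2-vs-1 mismatch is rejected.
try:
    cat_shape([negative_shape, id_shape], dim=1)
    message = None
except ValueError as exc:
    message = str(exc)
```

Under these assumptions, `stacked` is (3, 77, 2048) and `message` matches the text from the failed run.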
2026-06-10 12:53:56 +02:00 · 2026-06-08 16:35:40 -07:00
parent 031c38dc7f
commit d1b85ee6a8
1 changed files with 7 additions and 1 deletions
@@ -318,9 +318,15 @@ def restore_faces_photomaker(
        id_crop_rgb = cv2.cvtColor(id_crop_bgr, cv2.COLOR_BGR2RGB)
        id_image_pil = Image.fromarray(id_crop_rgb)

+        # Don't pass negative_prompt: the PhotoMaker pipeline manages its own CFG by
+        # concatenating [negative_prompt_embeds, prompt_embeds]; if we pass a custom
+        # negative the upstream code splits text_only vs id-injected branches and
+        # the resulting embed batch dims can mismatch (we saw
+        # "Sizes of tensors must match except in dimension 1. Expected size 2 but got
+        # size 1" on a real run). The default empty negative is what the upstream
+        # gradio demo uses.
        out = pipeline(
            prompt=_PHOTOMAKER_PROMPT,
-            negative_prompt=_PHOTOMAKER_NEGATIVE,
            input_id_images=[id_image_pil],
            num_inference_steps=num_inference_steps,
            guidance_scale=guidance_scale,