mirror of https://github.com/wiltodelta/remove-ai-watermarks.git, synced 2026-06-10 12:53:56 +02:00
fix(photomaker): drop explicit negative_prompt to fix CFG batch mismatch

Modal cert sweep #6 made it INTO the denoising loop and died with "Sizes of tensors must match except in dimension 1. Expected size 2 but got size 1 for tensor number 1 in the list."

In the PhotoMaker pipeline's denoising loop, the per-step embeddings are built as torch.cat([negative_prompt_embeds, prompt_embeds(_text_only)], dim=0). The text-encoder + ID-encoder flow can leave the negative branch at batch=2 and the ID-injected branch at batch=1 when a custom negative_prompt is passed, so the cat fails.

The upstream gradio demo just passes no negative_prompt and relies on the pipeline's empty default; do the same.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
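
The shape rule behind the quoted error can be sketched without loading any model. The helper below, `concat_shape`, is a hypothetical pure-Python stand-in for the check that a tensor concatenation performs (every dimension other than the one being joined must match); it is not part of PhotoMaker or PyTorch, it only accepts a non-negative `dim`, and the shapes are illustrative assumptions rather than the pipeline's real tensors.

```python
def concat_shape(shapes, dim):
    """Return the shape produced by concatenating inputs of `shapes` along `dim`.

    Hypothetical stand-in for a tensor library's shape check: all dimensions
    except `dim` must agree across the inputs. `dim` must be non-negative.
    """
    first = shapes[0]
    for index, shape in enumerate(shapes[1:], start=1):
        if len(shape) != len(first):
            raise ValueError("all inputs must have the same number of dimensions")
        for axis, (expected, got) in enumerate(zip(first, shape)):
            if axis != dim and expected != got:
                # Message format modelled on the error quoted in the commit message.
                raise ValueError(
                    f"Sizes of tensors must match except in dimension {dim}. "
                    f"Expected size {expected} but got size {got} "
                    f"for tensor number {index} in the list."
                )
    result = list(first)
    result[dim] = sum(shape[dim] for shape in shapes)
    return tuple(result)


# Illustrative shapes only: (batch, tokens, channels).
negative_branch = (2, 77, 2048)
id_branch = (1, 77, 2048)

# Joining along the batch dimension tolerates different batch sizes.
print(concat_shape([negative_branch, id_branch], dim=0))  # (3, 77, 2048)

# Joining along any other dimension requires equal batch sizes.
try:
    concat_shape([negative_branch, id_branch], dim=1)
except ValueError as err:
    print(err)
```

Under these assumed shapes, the second call reports the same wording as the failure above: a batch of 2 meeting a batch of 1 in a join along dimension 1. Whichever concatenation inside the pipeline actually raised it, keeping both branches at the same batch size (here, by leaving negative_prompt at its default) avoids the mismatch.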
@@ -318,9 +318,15 @@ def restore_faces_photomaker(
     id_crop_rgb = cv2.cvtColor(id_crop_bgr, cv2.COLOR_BGR2RGB)
     id_image_pil = Image.fromarray(id_crop_rgb)
 
+    # Don't pass negative_prompt: the PhotoMaker pipeline manages its own CFG by
+    # concatenating [negative_prompt_embeds, prompt_embeds]; if we pass a custom
+    # negative the upstream code splits text_only vs id-injected branches and
+    # the resulting embed batch dims can mismatch (we saw
+    # "Sizes of tensors must match except in dimension 1. Expected size 2 but got
+    # size 1" on a real run). The default empty negative is what the upstream
+    # gradio demo uses.
     out = pipeline(
         prompt=_PHOTOMAKER_PROMPT,
-        negative_prompt=_PHOTOMAKER_NEGATIVE,
         input_id_images=[id_image_pil],
         num_inference_steps=num_inference_steps,
         guidance_scale=guidance_scale,