remove-ai-watermarks

mirror of https://github.com/wiltodelta/remove-ai-watermarks.git synced 2026-07-04 23:47:49 +02:00

Author	SHA1	Message	Date
Victor Kuznetsov	41a2af2ecb	fix(cli): preserve SynthID uncertainty in no-visible-mark message The 'no signal' branch of the visible no-mark path claimed 'No AI provenance signal found either', which reads as 'the image is clean'. A missing metadata proxy is not proof an invisible pixel watermark (SynthID) is absent: it cannot be detected once metadata is gone and may have been stripped upstream. The message now preserves that uncertainty and routes to both 'all' (regenerate pixels) and 'erase'. Regression-guarded by the SynthID/all asserts in test_cli.py. CLAUDE.md visible-command note updated to match. Also adds a 'Scope and non-goals' section (CLAUDE.md + README): removing AI-provenance marks on the user's own content is in scope; stripping stock/paid-content watermarks (Shutterstock/Getty/iStock, classifieds) is out of scope by principle, not by difficulty. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-13 19:30:49 -07:00
Victor Kuznetsov	d8cdc9f478	docs: correct stale strength-ladder values in remove_watermark docstring The convenience wrapper's docstring still quoted the pre-2026-06 ladder (0.10 OpenAI / 0.15 Google / 0.15 unknown). The live constants in watermark_profiles.py are 0.20 / 0.30 / 0.30, applied to both the controlnet and sdxl pipelines. Docstring only; behaviour was already correct via vendor_for_strength + resolve_strength. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-13 09:51:09 -07:00
Victor Kuznetsov	6237429610	chore(release): v0.11.2 v0.11.2	2026-06-12 21:37:04 -07:00
Victor Kuznetsov	30b56f0ea3	fix(cli): stop silent passthrough when visible finds no known mark When `visible --mark auto` (or an explicit `--mark` with detection on) found no registered mark, it exited 0 without writing output -- which a wrapping service reads as success and re-serves the unchanged input. ~74% of real uploads carry no registered visible mark, so this was the dominant "it didn't work" / NPS score-0 failure mode. Now it runs a cheap metadata-only identify, prints actionable guidance (route to `all` for an invisible/metadata mark, or `erase` for an arbitrary logo), writes no output file, and exits EXIT_NO_VISIBLE_MARK (2) -- distinct from success (0) and a hard error (1) so the caller can surface the message. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-12 21:36:56 -07:00
Victor Kuznetsov	b08405bece	chore(release): v0.11.1 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> v0.11.1	2026-06-12 12:15:20 -07:00
Victor Kuznetsov	28569bd05d	fix(gemini): recover sub-0.85 corner sparkles via top-K fusion selection The 256->512 detection-search widening (v0.8) let a large, low-gradient shape match outrank a genuine mid-size corner sparkle whose raw NCC sits below the 0.85 corner-promote gate, so `identify` read `unknown` on Gemini images that v0.7.2 caught (reporter osachub: scale-48 sparkle on light bedding -- true sparkle spatial 0.775 / grad 0.960 / fusion 0.676, but the size-weighted argmax locked onto a decoy at spatial 0.628 / grad 0.036). detect_watermark now keeps the top-K (_SELECT_TOPK=3) size-weighted candidates (NMS-deduped) plus the corner-promote candidate, scores each by full fusion (spatial+gradient+variance) via the extracted _grad_var_scores helper, and selects the highest -- the gradient term lifts the true sparkle over the decoy. Ranking by the SIZE-WEIGHTED score (not a raw-NCC argmax) preserves tiny-patch suppression: a raw-NCC argmax re-admitted 16-18px content false positives (14/65 doubao + 4/11 jimeng visible images). Top-K adds zero flips on the doubao/jimeng corpora and leaves the 495-image Gemini set unchanged (479 detected) while recovering the reporter's image at 0.676. - _grad_var_scores: gradient/variance scoring factored out of detect_watermark - confidence = best_fused (drop the duplicated fusion recompute) - tests: rename test_promotion_is_what_rescues_it -> test_size_weighted_search_alone_traps_on_the_decoy (corner-promote is no longer the sole rescue path); add a deterministic regression test mirroring the real spatial/grad signature - docs: module-internals.md detector section + CLAUDE.md mechanism map Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-12 12:04:20 -07:00
Victor Kuznetsov	9feea4ac1e	Slim CLAUDE.md: move module internals, limitations, landscape research to docs Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 15:50:03 -07:00
Victor Kuznetsov	3055aa6c4a	test: patch is_available in full-pipeline all tests (fix no-gpu CI) test_all_basic / test_all_visible_step_uses_registry asserted exit 0 but did not patch is_available, so on CI (core+dev only, no gpu) they took the skip branch and hit the new non-zero exit. Passed locally where gpu is present. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 10:07:05 -07:00
Victor Kuznetsov	c8bc4b7c68	chore(release): v0.11.0 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> v0.11.0	2026-06-11 10:03:51 -07:00
Victor Kuznetsov	a8e218acf6	Make `all` fail loudly when the gpu extra is missing Step 2 (invisible/SynthID) was skipped with a quiet inline warning and the run still exited 0, so a missing [gpu] extra was mistaken for a clean result (recurring #14/#47). Add a prominent end-of-run banner and a non-zero exit. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 09:58:49 -07:00
Victor Kuznetsov	ad7e4ee08b	feat(identify): close 3 detector gaps found on the spaces corpus (06-05..06-11) - AIGC: parse the bare ``AIGC{...}`` blob form (label glued to its JSON in a JPEG APP segment near the JFIF header), and scan both raw-JSON forms in one fall-through loop so a quoted ``"AIGC"`` later in an XMP packet no longer shadows a real bare label earlier in the file (3 files read unknown before). - Integrity clash rule 2: a camera device + an AI marker from the SAME C2PA manifest (Google Pixel Magic Editor / Pixel Studio edit chain) is a legitimate edit chain, not a contradiction. Fire only when the AI marker's source is independent of the camera's manifest; pure cameras (Leica/Sony/Nikon) are unaffected (2 Pixel files mis-flagged before). - New c2pa_cloud_manifest detector: surface a C2PA 2.4 Durable Content Credentials cloud-manifest reference (Adobe cai-manifests.adobe.com) as a medium provenance signal when the embedded manifest is stripped. Provenance only, never asserts is_ai (2 files read fully unknown before). identify reuses its already-loaded scan head for the cloud check (no second read). +7 tests; CLAUDE.md + README synced. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 09:28:15 -07:00
Victor Kuznetsov	22bc171806	ci: bump checkout to v6 (Node 24), note dismissed torch alert actions/checkout@v4 ran on the deprecated Node 20; bump to v6 to match test.yml/publish.yml. Document the dismissed Dependabot torch alert (GHSA-rrmf-rvhw-rf47, not_used: no torch.jit usage, gpu-extra-only, no patch). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-10 16:00:35 -07:00
Victor Kuznetsov	d763581ed3	chore(release): v0.10.3 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> v0.10.3	2026-06-10 15:53:50 -07:00
Victor Kuznetsov	0d99f403fb	ci: auto-distribute releases to Homebrew tap + HF Space distribute.yml fans a published GitHub Release out to the channels that would otherwise be manual: it waits for the sdist on PyPI, bumps the Homebrew formula (HOMEBREW_TAP_TOKEN) and factory-rebuilds the HF Space (HF_TOKEN). PyPI stays on publish.yml; conda-forge on its autotick bot. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-10 15:47:03 -07:00
Victor Kuznetsov	e78e5f1154	docs: address HN feedback in README (scope, limitations, honest use case) From the HN front-page discussion (news.ycombinator.com/item?id=48200569): - Threat model: drop the 'third-party classifiers' overclaim. State scope honestly: it removes SynthID / visible marks / provenance metadata, does NOT defeat trained AI-vs-real classifiers (Hive), and watermarks are a weak trust signal to begin with. - Replace the 'preserving art / historical record' use case (criticized as not holding) with the defensible one: clearing an overstated AI label from your own lightly-AI-edited photo. - Add a Limitations section: lossless visible/metadata vs lossy content-dependent SynthID path, no local self-verify, large images not tiled yet, out-of-scope. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-10 11:41:39 -07:00
Victor Kuznetsov	0a77d3198e	chore(release): v0.10.2 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> v0.10.2	2026-06-10 10:38:50 -07:00
Victor Kuznetsov	9aea5f240f	chore: improve discoverability (PyPI keywords/classifiers, README badges) Research-informed metadata for organic dev discovery: - pyproject: add a keywords field (was absent; biggest PyPI search gap) and expand classifiers (audience, console, security, AI, utilities); rewrite the summary noun-first, naming Nano Banana / SynthID / C2PA verbatim. - README: add PyPI version, Python versions, downloads, and license badges. GitHub topics (comfyui, watermark-remover) and the repo description were updated out of band. PyPI metadata ships on the next release. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-10 10:34:43 -07:00
Victor Kuznetsov	c3ddf8a801	docs: document Homebrew, conda-forge, and ComfyUI distribution channels - README: add Homebrew install, conda (conda-forge, in review), and a ComfyUI custom-nodes section. - CLAUDE.md: per-channel release/bump cadence (Homebrew formula, conda-forge autotick bot, ComfyUI Registry); note pip_check: false on the conda recipe. - Add packaging/conda/recipe.yaml (v1, noarch core-only), verified green on conda-forge/staged-recipes PR #33674. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 19:29:40 -07:00
Victor Kuznetsov	5777458296	chore(release): v0.10.1 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> v0.10.1	2026-06-09 17:08:44 -07:00
Victor Kuznetsov	295e7ada2b	chore: project review (dev tools in extras, dep upgrades, optional-deps guard, stale cleanup) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-09 17:03:17 -07:00
Victor Kuznetsov	826cfdb82a	chore(release): v0.10.0 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> v0.10.0	2026-06-09 13:24:37 -07:00
Victor Kuznetsov	2fcd00ced0	fix: address whole-project code review (visible all/batch, engine consolidation, I/O) Nine findings from a high-effort project-wide review, fixed and verified (571 passed, ruff/pyright clean): Correctness: - all/batch now remove Doubao/Jimeng/Samsung visible text marks: the visible step routes through the registry (new cli._remove_visible_auto) instead of a hardcoded GeminiEngine, so they no longer leave the wordmark intact. - batch always reads the original source (dropped the out_path-reuse that re-processed already-cleaned outputs on a re-run). - img2img_runner only retries the diffusion call on the deprecated-callback TypeError; any other TypeError now propagates instead of double-running. - gemini detect/remove and the reverse-alpha engines normalize channels via a new image_io.to_bgr, fixing a grayscale/BGRA crash in the FP-gate path. - _png_late_metadata advances its cursor by the clamped length, so a malformed chunk length no longer aborts the late AI-label scan. Cleanup / efficiency: - Consolidate the ~90%-identical Doubao/Jimeng/Samsung engines into a shared config-driven _text_mark_engine.TextMarkEngine base; each engine is now a thin subclass (TextMarkConfig + test shims). Behavior is byte-exact (the three engine test suites pass unchanged). Registry adapters collapse to one _text_mark(...) row each. Gemini stays a separate engine. - scan_head is memoized per (path, size, mtime), so identify() reads the file head once instead of ~8 times. - invisible_engine post-processing decodes/encodes the output once (chained in memory) instead of 2-4 times across stages. - Remove the orphaned get_model_id_for_profile (+ CONTROLNET_PROFILE); derive the --strength help from the strength constants (strength_default_help) so it cannot drift; share the --pipeline/--strength click options; simplify the retired --auto resolver. Net -835 lines. 
Tests added for the registry-routed visible pass, to_bgr, the polish/model/guidance wiring, and strength_default_help. CLAUDE.md updated for the new base module, the engine/registry changes, image_io.to_bgr, and the scan_head cache. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 13:21:13 -07:00
Victor Kuznetsov	b1189549b8	feat(invisible): controlnet default, unified strength, retire --auto, add --model/--guidance-scale Overhaul the diffusion-removal surface around a single robust default and a complete, consistent CLI. Pipeline + strength: - controlnet is now the DEFAULT pipeline (CLI --pipeline + both engine ctors). With the certified higher strength it clears both photoreal and flat-graphic content, whereas plain SDXL left SynthID on flat graphics. - Rename the plain-SDXL profile default -> sdxl; "default" stays as a back-compat alias (normalize_profile + a click callback that warns). - Unify the strength ladder: resolve_strength applies ONE vendor-adaptive ladder (the certified controlnet floors OpenAI 0.20 / Google 0.30 / unknown 0.30) to both pipelines. sdxl is the weaker remover on its own hard case (flat fills), so the certified floor is the right floor for it too. CLI completeness: - Add --model (HF model id) to invisible + batch (was only on all) and --guidance-scale (CFG) to all three diffusion commands; both were library knobs the CLI did not expose. - Flip --adaptive-polish to ON by default (it self-gates to a no-op where there is no detail deficit, so default-on is safe). - Share --pipeline / --strength / --model / --guidance-scale as single decorators so invisible/all/batch keep an identical surface; the --strength help is derived from the strength constants (strength_default_help) so it can never drift from the ladder. Removals: - Delete the auto_config content-detection planner + its YuNet/DBNet assets (~2.6 MB): with controlnet always the pipeline and the polish self-gating, the face/text/edge detection no longer changed behavior. --auto is now a deprecated no-op that only warns (the polish it enabled is the default). Docs (README, CLAUDE.md, docs/synthid.md) updated throughout; added an InvisibleEngine Python API example. Tests cover the alias warnings, the polish default, and the --model/--guidance-scale wiring. 
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 12:40:45 -07:00
Victor Kuznetsov	efc5b4a9af	docs(auto): drop stale face-restore mentions from --auto The face-restore family was removed in `20d7eda`, but the auto_config module docstring still claimed "PhotoMaker face restoration is enabled when a face is present" and the --auto help text (CLI + README example) listed "face restore" as something --auto picks. A detected face now only routes to the controlnet pipeline (canny preserves face STRUCTURE, not identity); there is no identity restoration. Comments/docstrings/help only, no code behavior change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 11:12:53 -07:00
Victor Kuznetsov	ea098cf1be	chore(release): v0.9.0 BREAKING: - Drop `--restore-faces` / `--restore-faces-method` CLI flags - Drop `restore`, `photomaker`, `instantid` extras - Drop `restore_faces` / `restore_faces_method` params from InvisibleEngine.remove_watermark and AutoConfig Rationale (full empirical record in docs/synthid-robust-identity-research-2026-06-08.md "Empirical follow-up"): every face-restore approach evaluated 2026-06-04 - 2026-06-08 (GFPGAN-on- cleaned, PhotoMaker-V2, InstantID txt2img, InstantID img2img-on-cleaned at three parameter sweeps) regenerates the face via SDXL diffusion -- output face pixels are diffusion-fresh, so the regenerated face inherits SDXL's "clean skin" aesthetic and loses original identity precision. The result looks MORE AI-generated than the cleaned image, not less. The cleaned controlnet 0.20 image is the least-AI face state we can reach without re-introducing SynthID. License: - MIT -> Apache 2.0 (Apache adds an explicit patent grant + trademark clause; better fit with the upstream Apache projects this library mirrors / depends on -- diffusers, transformers, controlnet-aux, xinsir's controlnet weights) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> v0.9.0	2026-06-08 21:28:09 -07:00
Victor Kuznetsov	a4554bb5d3	chore(license): switch from MIT to Apache 2.0 Replace LICENSE with the canonical Apache License 2.0 text + a brief copyright notice for "wiltodelta 2025-2026". Update pyproject.toml's `license` field to "Apache-2.0" and the PyPI classifier to "Apache Software License". Update README's License section to point at the LICENSE file and name the copyright holder. Why: Apache 2.0 gives downstream users an explicit patent grant and the trademark-use clause, which MIT doesn't carry. It is also the more common license among the upstream projects this library depends on / mirrors (diffusers, transformers, controlnet-aux, xinsir's canny controlnet weights), so contributions can flow either way without a permission-shape mismatch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 21:23:57 -07:00
Victor Kuznetsov	20d7eda96a	remove: drop all face-restore code (regeneration, not preservation) Empirical conclusion from the 2026-06-04 - 2026-06-08 Modal cert sweeps: every face-restore approach we built (GFPGAN-on-cleaned, PhotoMaker-V2, InstantID txt2img, InstantID img2img-on-cleaned at three parameter settings) regenerates the face via SDXL diffusion rather than preserves it. Output face pixels are diffusion-fresh, so the regenerated face inherits SDXL "clean skin" aesthetic and loses original identity precision -- it looks MORE AI-generated than the cleaned image, not less. The cleaned image from the main controlnet 0.20 removal pass is the least-AI face state we can reach without re-introducing SynthID. Nothing in the restore family achieves the actual goal (preserve the original person's face). Keeping them around as opt-in invites users to ship something that defeats the point. Removing entirely. Library changes: - Deleted src/remove_ai_watermarks/instantid_restore.py - Deleted src/remove_ai_watermarks/photomaker_restore.py - Deleted tests/test_instantid_restore.py - Deleted tests/test_photomaker_restore.py - Removed `instantid` and `photomaker` extras from pyproject.toml - Removed `[tool.hatch.metadata] allow-direct-references = true` (was only needed for the photomaker git+ URL) - InvisibleEngine.remove_watermark: dropped `restore_faces` + `restore_faces_method` params, removed both `_restore_faces_instantid` and `_restore_faces_photomaker` private methods, removed dispatch - CLI: dropped `_restore_faces_options` decorator, all four cmd_* signatures lose `restore_faces` + `restore_faces_method`, kwarg passes to remove_watermark dropped - _apply_auto: dropped `restore_faces` from tuple shape (was unused after the engine no longer takes it) - auto_config.AutoConfig: dropped `restore_faces` field; `plan()` no longer sets it; `reason` no longer mentions it - Tests updated accordingly (test_auto_config.TestReason no longer asserts "face-restore on" in the 
reason string) Docs updated: - CLAUDE.md: removed the photomaker extras bullet, the Face restore trade-off bullet, the instantid_restore.py + photomaker_restore.py module bullets; replaced restore mentions in watermark_remover and controlnet bullets and prod recipe with the empirical conclusion - README.md: removed both `--restore-faces` callouts and the install snippet; the feature bullet and auto-mode comment updated - docs/synthid-robust-identity-research.md: added Status-retired notice at the top pointing at the 2026-06-08 followup raiw-app: - modal_cert.py: dropped `--restore-faces` flag entirely; sweep() no longer takes restore_faces; pinned _LIB_SPEC to `[gpu]` extras (no `photomaker` / `instantid` extras), points at main ruff + strict pyright clean; 569 tests pass; 18 restore-specific tests gone. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 21:21:58 -07:00
Victor Kuznetsov	567f3ae729	docs(restore): document that restore methods REGENERATE, not preserve Empirical conclusion from the 2026-06-04 - 2026-06-08 cert sweeps: every shipped face-restore method (GFPGAN-on-cleaned, PhotoMaker-V2, InstantID txt2img, InstantID img2img-on-cleaned at three parameter settings) regenerates the face from an ArcFace embedding via SDXL diffusion. Output face pixels are diffusion-fresh, which makes the regenerated face look MORE AI-generated than the cleaned image (gloss, symmetric pores, SDXL "clean skin" aesthetic) regardless of license. The cleaned image from the main controlnet 0.20 removal pass is the LEAST-AI state we can reach without re-introducing SynthID; any restore on top trades original-look for embedding-driven regeneration. The fundamental issue is structural: ArcFace encodes "general look" at 512 dimensions, SDXL decodes that into pixels with the inherent SDXL aesthetic. Stronger identity push (higher strength + IP-Adapter scale) makes the face closer to the embedding but more AI-looking; weaker push leaves identity to drift further. No parameter setting recovers original identity AND looks less AI than cleaned. Production conclusion: do not ship `--restore-faces` in any monetized deployment. The extras (`instantid`, `photomaker`) stay in the library for research / personal use where users explicitly want regeneration. 
Documented at every entry point: - CLAUDE.md: new "Face restore trade-off" bullet + every restore mention rewritten to "REGENERATES, does NOT recover"; controlnet bullet updated - README.md: feature bullet + callout + secondary mention all updated - docs/synthid-robust-identity-research-2026-06-08.md: appended "Empirical follow-up" section documenting the InstantID sweep phases (Phase 1 txt2img v1/v2/v3, Phase 2 img2img defaults + stronger params) - docs/controlnet-removal-pipeline-research.md: updated restore-faces bullet to reflect the empirical conclusion - CLI help: `_restore_faces_options` docstring + `--restore-faces` / `--restore-faces-method` help text all updated Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 21:08:11 -07:00
Victor Kuznetsov	7d8af7882a	tune(instantid): raise IP-Adapter + landmark scale + strength for stronger identity First img2img cert sweep: scene/lighting integration was excellent on both single (tatsunari) and group (gemini_3) photos, but the regenerated faces were "recognizable similar people" rather than the original individuals. The cleaned face crop (which has already drifted from original through the main controlnet 0.20 removal pass) was competing as a structural prior; at the previous parameter settings InstantID's ArcFace branch couldn't dominate it. Push the identity signal: - `ip_adapter_scale`: 0.8 -> 1.0 at load time (full IP-Adapter strength) - `controlnet_conditioning_scale`: 0.8 -> 1.0 default (landmark anchor) - `img2img_strength`: 0.55 -> 0.7 default (more denoise, less cleaned structure survives, more room for the diffusion to render ArcFace) The cleaned image already passed the SynthID oracle, so the absolute floor on strength is "any positive value" -- raising it only increases the freedom of the diffusion to inject identity (SynthID-safety isn't reduced by higher strength, because the noise injection only destroys more of the input pixels). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 20:54:41 -07:00
Victor Kuznetsov	8ed2d16a23	fix(instantid): pass trust_remote_code=True for local custom_pipeline The img2img run silently produced an identity output because DiffusionPipeline.from_pretrained refused to load the local custom_pipeline .py without `trust_remote_code=True` (emits a single-line warning to stderr, then falls back to a default class). load_ip_adapter_instantid then AttributeError'd, our outer except logged + skipped, and the saved file was the un-restored cleaned image (exact byte size match against the no-restore baseline -- 250988 bytes). We fetch the file from a pinned raw.githubusercontent URL we control, so trust_remote_code is safe to opt in here. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 20:47:26 -07:00
Victor Kuznetsov	2687604b24	feat(instantid): switch from txt2img to img2img on cleaned crop The txt2img architecture (generate face from scratch in a fresh 1024 scene) fundamentally couldn't fix multi-face patchwork: each face was a studio portrait that didn't belong in the surrounding scene (wrong lighting, frontal pose, neutral expression vs the original group photo's varied angles and smiles). Tight crop + elliptical alpha + color match smoothed the seams but didn't make the faces look like they were SHOT in the scene. Replacing with img2img-on-cleaned: feed the CLEANED face crop as the img2img source, so the diffusion sees the actual scene context (shoulders, hair edges, lighting direction, shadows) and harmonises the regenerated face with it. Identity still flows through the ArcFace embedding (from original) + landmark ControlNet (kps from original) -- both semantic / pure geometry, neither carries pixels. SynthID safety preserved by construction: - img2img source pixels = cleaned crop = already oracle-verified clean - ArcFace embedding = 512-d semantic vector from original, no pixel content - Landmark stick figure = colour-coded geometry, no source pixels - img2img noise injection at strength 0.55 destroys any residual high-freq pattern in the cleaned crop - Pipeline is the upstream StableDiffusionXLInstantIDImg2ImgPipeline, inherits from StableDiffusionXLControlNetImg2ImgPipeline; we still patch check_inputs to neutralise the same diffusers-0.38 positional shift the txt2img variant had Implementation: - New _fetch_img2img_pipeline_file() caches the upstream pipeline file from GitHub raw on first use (not on PyPI / HF Hub, has to be downloaded separately) - _get_pipeline() now loads StableDiffusionXLInstantIDImg2ImgPipeline via custom_pipeline=<cached path> - restore_faces_instantid() crops the SAME bbox from both original and cleaned, runs InsightFace on original (sharper embedding), feeds cleaned crop as img2img source, ArcFace+landmark as conditioning - 
New img2img_strength=0.55 parameter (was no strength knob in txt2img mode) - Composite path unchanged (elliptical alpha + color_match) - 9 control-flow tests still pass (the mock pipe call shape change is absorbed by the kwargs-only fake) Cert sweep will validate on tatsunari (single) first per user request. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 20:43:27 -07:00
Victor Kuznetsov	7c0c16fd66	test(instantid): update composite assertion to survive color-match Last commit added `_color_match` which shifts the face crop's mean to the canvas mean -- the old test fed a uniform face (210) into a uniform cleaned canvas (90), so after color-match the face was uniform 90 and the composite was undetectable by value. Switched the fake pipeline to a gradient face so the color-match preserves variance, and the assertion now checks that the face region has non-zero std (composite injected gradient pixels) instead of a value threshold. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 20:26:56 -07:00
Victor Kuznetsov	cdd6bd1fea	feat(instantid): tighter face ellipse + color match for cleaner multi-face composite Second multi-face iteration. v1-rect: full-1024 frame + Gaussian rectangle -> patchwork. v2-ellipse: tight crop + ellipse 0.45bw x 0.55bh -> ellipse exceeds bbox vertically and clips forehead/chin on single portrait, plus group-photo faces visibly drift cooler than the warm bar background. v3: 1. Smaller ellipse axes: 0.32bw x 0.42bh. Both fit inside the bbox (since axes are radii from center, 0.32bw extends 0.64bw total width and 0.42bh extends 0.84bh total height) so no chin/forehead clip even on non-square boxes. Face shape: vertically elongated (0.42 vs 0.32), matching real face geometry. 2. Wider feather: `min(bw, bh) // 5` instead of // 8. Edges fade over a wider band so the elliptical seam is less visible. 3. Per-channel mean color match (`_color_match`): before compositing, shift the regenerated face's mean BGR to match the cleaned canvas region where it lands. Each InstantID generation has independent SDXL noise so white balance drifts -- matching means equalises tone (warm bar / cool face -> warm face) without rescaling contrast. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 20:25:34 -07:00
Victor Kuznetsov	92c7245e2d	chore: drop unused _composite_faces import Linter caught it after the elliptical composite swap. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 20:18:37 -07:00
Victor Kuznetsov	1786f6de9f	feat(instantid): multi-face anti-patchwork (tight-crop + elliptical composite) Group-photo cert sweep last round produced the same "patchwork quilt" failure mode as PhotoMaker-V2: each face is regenerated as a fresh 1024x1024 SCENE (face + background + lighting), then composited as a Gaussian-feathered RECTANGLE into the 2x square box around the original face. The rectangle's corners carry regenerated background pixels with different colors / textures per face, and the rectangular Gaussian feather lets them bleed into the cleaned image -- 9 face renders with 9 different backgrounds -> patchwork. Two changes, both surgical: 1. Tight-crop the regenerated face before composite. After generation, run YuNet again on the 1024 frame to find where the face actually landed, then crop tightly around it (matching the 2x padding our input crop uses so the face fills its natural slot). Drops the regenerated background's peripheral pixels. 2. Elliptical composite alpha (`_composite_faces_elliptical`). Instead of reusing photomaker_restore's rectangular Gaussian alpha, inscribe an ellipse in each face bbox (axes ~0.45bw x 0.55bh so the feather edge tapers cleanly inside the rectangle, head-silhouette shape), feather only the ellipse edge. Bbox corners (regenerated scene context) end up at alpha=0 and the cleaned-canvas pixels there stay intact. Only the head region is replaced. Net result: faces stay identity-restored (semantic ArcFace + landmark control still drives generation) but the canvas around each face is the cleaned image, not a regenerated frame. No more multi-face patchwork. Single-portrait case unchanged: there's one face to composite and the cleaned canvas around it is mostly the background that was already there. All 9 InstantID control-flow tests still pass (the mock face analyser responds to both .get() calls with the same fake bbox, so the new generated-image YuNet step is exercised end-to-end). 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 20:18:15 -07:00
Victor Kuznetsov	4ec8ffec6b	fix(instantid): patch check_inputs for diffusers-0.38 + set scale at load time Two compat bugs caught by the Modal cert sweep, both rooted in diffusers 0.38 vs InstantID's community pipeline expectations: 1. Positional check_inputs misalignment. InstantID's __call__ calls `self.check_inputs(...)` POSITIONALLY using the parent's ~v0.29 signature. Diffusers 0.38 added two new parameters BEFORE `controlnet_conditioning_scale` in the parent's signature (`ip_adapter_image`, `ip_adapter_image_embeds`), which shifts every positional arg by two slots. The argument that lands in the parent's `controlnet_conditioning_scale` slot is actually InstantID's `control_guidance_end` -- which a few lines earlier was converted to `[1.0]` (a list) by InstantID's auto-broadcasting for the single-controlnet case. The parent's check then trips on `not isinstance([1.0], float)` -> TypeError. Our inputs are programmatic and validated by our own callers, so neutralising `pipe.check_inputs = lambda *a, **k: None` after load is safe. This is the standard workaround community ComfyUI ports use for the same compat break. 2. `ip_adapter_scale` was passed at call time and silently ignored. It's not in `StableDiffusionXLInstantIDPipeline.__call__`'s signature -- the upstream API sets the IP-Adapter weight on the ArcFace cross-attention branch at LOAD time via `load_ip_adapter_instantid(scale=...)`. Moved the 0.8 default there, dropped the call-time kwarg. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 20:07:31 -07:00
Victor Kuznetsov	53d753f2ad	fix(instantid): pre-fetch antelopev2 from HF mirror (InsightFace auto-link is broken) InsightFace's built-in auto-download for the antelopev2 model pack (github.com/deepinsight/insightface/releases/download/v0.7/antelopev2.zip) has been broken since at least 2024 (upstream issues #2517, #2766, called out in InstantID's README: "manually download via this URL to models/antelopev2 as the default link is invalid"). When the .onnx files aren't in place, FaceAnalysis.prepare() raises `assert 'detection' in self.models` -- which is exactly what our Modal cert sweep hit on the first real run. Fix: a tiny pre-flight `_ensure_antelopev2()` that pulls the five expected .onnx files (1k3d68, 2d106det, genderage, glintr100, scrfd_10g_bnkps) from the HuggingFace mirror `kidyu/antelopev2-for-InstantID-ComfyUI` into ./models/antelopev2/ before FaceAnalysis is instantiated. Idempotent (skips files that already exist); uses huggingface_hub's cache for free caching on the Modal volume. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 19:58:40 -07:00
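The idempotent pre-flight pattern from this commit might look roughly like the sketch below. It is not the repository's `_ensure_antelopev2()`: the download step is abstracted into a hypothetical `fetch(name, dest)` callable (the commit uses the Hugging Face mirror), and `ensure_model_pack` is an illustrative name.

```python
from pathlib import Path

# The five file stems are taken from the commit message; the .onnx suffix
# is appended here.
EXPECTED_FILES = tuple(
    f"{stem}.onnx"
    for stem in ("1k3d68", "2d106det", "genderage", "glintr100", "scrfd_10g_bnkps")
)


def ensure_model_pack(target_dir, fetch):
    """Make sure every expected file exists in target_dir.

    `fetch(name, dest)` is a hypothetical callable that writes the file
    `name` to the path `dest`. Files already present are skipped, so a
    second call does no work. Returns the names fetched on this call.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    fetched = []
    for name in EXPECTED_FILES:
        dest = target / name
        if dest.exists():
            continue  # already in place: nothing to download
        fetch(name, dest)
        fetched.append(name)
    return fetched
```

Calling it before the face analyser is constructed means a missing model pack shows up as a download step rather than as an assertion failure deep inside model preparation.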
Victor Kuznetsov	00c559482f	fix(invisible-engine): log exc_info + exception class on restore_faces failure The InstantID cert sweep emitted `restore_faces post-pass failed ()` -- the exception's str() was empty so the log line told us nothing about what actually failed. Adding `exc_info=True` plus `type(e).__name__` so the full traceback and exception class land in the log even when the message is empty. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 19:53:07 -07:00
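The logging change in this commit can be illustrated with the standard library alone. In the sketch below, `run_post_pass` and `demo` are hypothetical names and the log wording is only modelled on the message quoted above; the point is that `type(e).__name__` and `exc_info=True` still carry information when `str(e)` is empty.

```python
import io
import logging

logger = logging.getLogger("restore_demo")


def run_post_pass(step):
    """Run a post-pass step; on failure log class + traceback and return None."""
    try:
        return step()
    except Exception as e:
        logger.warning(
            "post-pass failed (%s: %s); keeping un-restored output",
            type(e).__name__, e, exc_info=True,
        )
        return None


def demo():
    """Trigger a failure whose message is empty and return (result, log text)."""
    buf = io.StringIO()
    handler = logging.StreamHandler(buf)
    logger.addHandler(handler)
    try:
        def failing_step():
            raise AssertionError()  # str() of this exception is ""
        result = run_post_pass(failing_step)
    finally:
        logger.removeHandler(handler)
    return result, buf.getvalue()
```

A bare `assert` failure is a typical source of an exception with an empty message, which is why the class name and traceback are the useful parts of the log line.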
Victor Kuznetsov	a296d5fe46	fix(instantid): inline YuNet detection (the imagined _get_yunet doesn't exist) The InstantID restore module imported `_get_yunet` from `auto_config`, but auto_config doesn't export that function -- the YuNet singleton lives inline inside `detect_face()`. Caught by the Modal cert sweep: restore_faces post-pass failed (cannot import name '_get_yunet' from 'remove_ai_watermarks.auto_config'); keeping un-restored output Inline the YuNet builder the same way `photomaker_restore` does (read `auto_config._FACE_SCORE` and the bundled `face_detection_yunet_2023mar.onnx` asset, build a fresh `FaceDetectorYN` per call). This is the proven pattern from PhotoMaker and avoids a private-API drift between the modules. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 19:48:21 -07:00
Victor Kuznetsov	70e8b3a517	feat(face-restore): add InstantID as the default non-commercial restore path Per the 2026-06-08 deep-research synthesis (docs/synthid-robust-identity-research-2026-06-08.md), the entire ArcFace-class identity-adapter ecosystem for SDXL is blocked from commercial use by InsightFace's non-commercial model packs (antelopev2 / buffalo_l). No commercial-safe ArcFace-grade identity stack exists today. The user explicitly opted into shipping a non-commercial restore path (research / personal use; raiw.cc must NOT install the extra). Architectural choice: InstantID over PhotoMaker-V2 as the default. - PhotoMaker-V2 (CLIP+ArcFace dual encoder, txt2img only): documented upstream identity drift on Asian male faces, visually confirmed in our cert sweep (tatsunari rendered as a generic woman; group photo collapsed into a patchwork). - InstantID (ArcFace cross-attention + landmark ControlNet): semantic identity branch + spatial weak landmark control, decoupled. Per InstantID paper (arXiv:2401.07519) and the research report, stronger identity fidelity on single portraits. Critically: NO original face pixels enter the diffusion (ArcFace embedding is semantic, landmark stick figure is pure geometry), so SynthID is not transported. Implementation: - New `src/remove_ai_watermarks/instantid_restore.py` mirrors the `photomaker_restore.py` shape (lazy singletons for pipeline + FaceAnalysis, per-face crop + _composite_faces from photomaker_restore). Loads the InstantID community pipeline via `DiffusionPipeline.from_pretrained( custom_pipeline="pipeline_stable_diffusion_xl_instantid")` -- no upstream Python package needed; diffusers fetches the file from its community examples. - New `instantid` extra in pyproject (insightface + onnxruntime + huggingface-hub). NON-COMMERCIAL block in the comment explains why. - CLI: `--restore-faces-method [instantid|photomaker]`, default `instantid`. Both methods explicitly labeled NON-COMMERCIAL in the help text. 
- Engine: dispatch on `restore_faces_method` to either `_restore_faces_instantid` or `_restore_faces_photomaker`. - 9 control-flow tests for InstantID without model download (mirror the photomaker_restore.py test pattern + draw_kps helper checks). 587/587 pass. Diffusers-0.38 compat verified by upstream code inspection: the InstantID pipeline inherits from `StableDiffusionXLControlNetPipeline`, uses only public diffusers APIs (`encode_prompt`, `prepare_image`, `prepare_latents`, `get_guidance_scale_embedding`), uses legacy attention processor API which diffusers preserves for backward compat. No PhotoMaker-V1-style internal text_encoder access. End-to-end execution will be validated by the Modal cert sweep in the next step. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 19:44:17 -07:00
Victor Kuznetsov	c486badaa8	fix(photomaker-v2): render at SDXL native 1024, use upstream prompt + neg_prompt The 9-face grid + single-face cert outputs were still mosaic of training-time faces even after the id_embeds shape fix. WebFetch of the upstream inference_pmv2.py revealed three mismatches: 1. SDXL at width=height=512 falls into its low-res failure mode (small-detail collage / mosaic) on the V2 LoRA. Render at native 1024 then downscale into the original face bbox at composite time. 2. Upstream prompt is descriptive ("instagram photo, portrait photo of a woman img, colorful, perfect face, natural skin, hard shadows, film grain, best quality"). Our generic prompt let SDXL drift away from the ID embedding. Adopted the upstream pattern. 3. Upstream V2 explicitly passes negative_prompt; the CFG batch-mismatch we hit on V1 isn't a V2 issue. Re-added negative_prompt with the upstream wording (asymmetry/worst quality/etc). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 19:11:48 -07:00
Victor Kuznetsov	b1fed810fd	fix(photomaker-v2): don't pre-unsqueeze id_embeds (the pipeline does it) V2's pipeline forward at line 705 of upstream pipeline.py calls `id_embeds.unsqueeze(0)` itself to add a batch dim, so callers pass a 2-D (N_faces, 512) tensor and the pipeline turns it into 3-D. Upstream inference_pmv2.py shows the canonical form: torch.stack([...]) of per-image embeddings. Our previous call .unsqueeze(0)'d on the way in, which the pipeline then .unsqueeze(0)'d again, giving a (1, 1, 512) shape that the V2 id_encoder consumed as garbage -- the resulting output was a training-time face collage (verified visually 2026-06-04 against tatsunari + gemini_3 + the 9-face grid). Fix: pass torch.stack([torch.from_numpy(embedding)]) -- shape (1, 512) -- so the pipeline's internal unsqueeze gives the expected (1, 1, 512) inside the forward. Don't pre-cast dtype either; the pipeline handles that internally. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 19:03:04 -07:00
Victor Kuznetsov	37817a610f	test(photomaker): stub face_analyser + analyze_faces in the control-flow test The previous commit added a real call into FaceAnalysis2 / analyze_faces inside restore_faces_photomaker, which broke the model-free control-flow test. Stub it: - monkeypatch _get_face_analyser to return a sentinel - install a fake `photomaker` module with analyze_faces returning a single 512-d zero embedding - add dtype=torch.float32 to the fake pipeline class so .to(device, dtype=...) works 11/11 green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 18:51:26 -07:00
Victor Kuznetsov	3d00fed00c	fix(photomaker-v2): compute id_embeds via FaceAnalysis2 before pipeline call The Modal cert sweep against V2 hit the next layer of the API: PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken.forward() missing 1 required positional argument: 'id_embeds' V2 forward takes BOTH the CLIP image embedding (computed inside the pipeline from input_id_images) AND an ArcFace identity embedding (id_embeds) that the caller must compute. The upstream pipeline does NOT auto-compute it -- inference_pmv2.py shows the caller using FaceAnalysis2 + analyze_faces to extract the ArcFace vector from each input ID image and passing id_embeds=torch.stack([...]) into pipe(...). Wired the same flow here: - New _get_face_analyser() singleton (double-checked lock) builds FaceAnalysis2(['CUDAExecutionProvider' | 'CPUExecutionProvider']).prepare(...). This is the non-commercial step (antelopev2/buffalo_l auto-download on first use). Module docstring already calls it out. - Per face: analyze_faces() -> torch.from_numpy(embedding) -> .unsqueeze(0) to match the pipeline's expected (B, D) shape, casting to pipeline.device/dtype. Faces InsightFace can't detect inside the crop get skipped (the most likely cause would be the diffusion-cleaned face being too small or stylised after the main pass; YuNet already gated us into having a face per crop, so this should be rare). - id_embeds= keyword threaded into the pipeline call site alongside the existing input_id_images=. Tests untouched (the V1-only safety guard was already removed in the previous commit when we swapped V1->V2; the existing 11 tests still pass). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 18:49:10 -07:00
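The "singleton (double-checked lock)" mentioned in this commit is a general pattern that can be sketched without any model code. Below, `get_face_analyser` takes a hypothetical `factory` callable in place of the real analyser construction; the module-level names are illustrative and not the repository's.

```python
import threading

_analyser = None
_analyser_lock = threading.Lock()


def get_face_analyser(factory):
    """Return the shared analyser, building it at most once.

    `factory` is a hypothetical zero-argument callable standing in for the
    expensive model construction.
    """
    global _analyser
    # First check without the lock: the common, already-initialised path.
    if _analyser is None:
        with _analyser_lock:
            # Second check under the lock: another thread may have finished
            # building the instance while this one was waiting.
            if _analyser is None:
                _analyser = factory()
    return _analyser
```

The unlocked first check keeps repeated calls cheap, while the second check under the lock prevents two threads from both running the expensive construction.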
Victor Kuznetsov	65de8df5c5	refactor(face-restore): drop GFPGAN, ship PhotoMaker-V2 as the sole restore (non-commercial) Visual review of the GFPGAN-on-cleaned output (9-face grid, 1448x1086) showed it only polished the already-drifted face without restoring identity — useless for the "restore who is in the photo" intent. Dropping it. The shipped restore path is now PhotoMaker-V2, which delivers true identity-from- embedding face regeneration via a CLIP+ArcFace dual encoder. The ArcFace branch pulls InsightFace antelopev2/buffalo_l model packs at runtime, which InsightFace releases under a research-only license, so the whole extra is NON-COMMERCIAL. raiw.cc and any monetized deployment must NOT install the `photomaker` extra. This is called out at every entry point: CLI flag help, module docstring, pyproject extra block, CLAUDE.md extras bullet, README install snippet. Changes: - Deleted `src/remove_ai_watermarks/face_restore.py` and its tests. - Deleted the `restore` extra (gfpgan/facexlib/basicsr + scipy<1.18 / numba<0.60 pins) and the basicsr setuptools<69 build pin from pyproject.toml. - Restored `src/remove_ai_watermarks/photomaker_restore.py` (V2 this time: `TencentARC/PhotoMaker-V2`, `photomaker-v2.bin`, no `pm_version='v1'` override). - Restored the `photomaker` extra in pyproject with all the upstream-compat pins (einops, peft, onnxruntime, insightface) and the `allow-direct-references` hatch metadata block. - `InvisibleEngine` swapped `_restore_faces` -> `_restore_faces_photomaker`; `--restore-faces-method` removed (only one method, no choice). - CLI flag help, CLAUDE.md, README, docs/synthid.md, and docs/controlnet-removal-pipeline-research.md all updated. - docs/synthid-robust-identity-research.md status notice rewritten to list both abandoned commercial-safe attempts (V1 + GFPGAN-on-cleaned) and the non-commercial trade-off we accepted. ruff + strict pyright(src/) clean; 578 tests pass (the 9 GFPGAN tests are gone, the 11 PhotoMaker tests stay green). 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 18:41:01 -07:00
Victor Kuznetsov	01fe98bf54	refactor(face-restore): rollback PhotoMaker, restore GFPGAN on the CLEANED image After 7 cascading upstream-compat fixes (insightface dep, peft dep, pm_version, device, etc.), the PhotoMaker V1 cert sweep still hit a CFG batch-dim mismatch inside the denoising loop. The upstream PhotoMaker `pipeline.py` is forked from diffusers v0.29.1 and our env runs 0.38; SDXL prompt-encoder handling changed significantly between those versions, so making PhotoMaker work end-to-end needs a proper fork or a diffusers downgrade — both expensive. Not worth shipping today. Pivot: restore `face_restore.py` (GFPGAN) with a single-line fix that makes it SynthID-safe by construction. The previous design ran GFPGAN.enhance on the ORIGINAL watermarked image and was oracle-confirmed to re-add SynthID via the weight-0.5 pixel blend. The fix is to run GFPGAN on the diffusion-CLEANED image — whatever pixels GFPGAN derives from are already SynthID-free, so the partial blend cannot transport the watermark. Identity fidelity is lower than a true identity-as-embedding stack would deliver, but it ships and works. Changes: - `src/remove_ai_watermarks/face_restore.py` restored from pre-wipe state with one line changed: `restorer.enhance(cleaned_bgr, ...)` instead of `restorer.enhance(original_bgr, ...)`. `original_bgr` is kept as an unused positional argument for API stability. - `src/remove_ai_watermarks/photomaker_restore.py` and its tests REMOVED. The research note (`docs/synthid-robust-identity-research.md`) keeps a "status notice" documenting why PhotoMaker is parked for now and what the path back in would look like. - `pyproject.toml` `restore` extra restored (gfpgan/facexlib/basicsr + scipy<1.18 + numba<0.60 pins + the basicsr setuptools<69 build pin), plus `photomaker` extra (with its einops/insightface/peft pile) and the `[tool.hatch.metadata] allow-direct-references = true` block REMOVED. 
- `InvisibleEngine._restore_faces_photomaker` removed; `_restore_faces` restored. The `--restore-faces` CLI flag and its plumbing through cmd_* signatures are unchanged. - CLAUDE.md, README.md, docs/synthid.md, docs/controlnet-removal-pipeline-research.md updated to describe the shipped GFPGAN-on-cleaned design and to reference PhotoMaker only as the parked alternative. ruff + strict pyright(src/) clean; 578 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:55:45 -07:00
Victor Kuznetsov	d1b85ee6a8	fix(photomaker): drop explicit negative_prompt to fix CFG batch mismatch Modal cert sweep #6 made it INTO the denoising loop and died with "Sizes of tensors must match except in dimension 1. Expected size 2 but got size 1 for tensor number 1 in the list." In the PhotoMaker pipeline's denoising loop, the per-step embeddings are built as torch.cat([negative_prompt_embeds, prompt_embeds(_text_only)], dim=0). The text-encoder + ID-encoder flow can leave the negative branch at batch=2 and the ID-injected branch at batch=1 when a custom negative_prompt is passed, so the cat fails. The upstream gradio demo just passes no negative_prompt and relies on the pipeline's empty default; do the same. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:35:40 -07:00
Victor Kuznetsov	031c38dc7f	fix(photomaker): place id_encoder on the right device + dtype Modal cert sweep #5 made it through component load (V1 id_encoder + lora_weights) and died at inference with the classic "Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor) should be the same" — id_encoder lived on CPU/fp32 while the rest of the pipeline ran on CUDA/fp16. Two fixes: 1. Call `pipe.to(device)` BEFORE `load_photomaker_adapter` so the loader picks the right device/dtype from `self.device` / `self.unet.dtype` when it builds the encoder. 2. Belt: after load, explicitly `pipe.id_encoder.to(device, dtype)` because some torch/diffusers combos leave custom attributes on the old device even when `pipe.to` ran first. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:29:00 -07:00
Victor Kuznetsov	9435e12ce6	fix(photomaker extra): add peft dep (required by pipe.fuse_lora) Modal cert sweep #4 got further -- PhotoMaker V1 components actually loaded ("Loading PhotoMaker v1 components [1] id_encoder ... [2] lora_weights") -- and died on the next step: "PEFT backend is required for this method." That's diffusers' fuse_lora call gated on the peft library, which PhotoMaker doesn't declare in its install_requires either. Pin peft>=0.10.0 in the photomaker extra. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:23:32 -07:00
Victor Kuznetsov	1fb2a64b56	fix(photomaker): pass pm_version='v1' to load_photomaker_adapter Modal cert sweep #3 ran past the `insightface` import error and into a real state_dict mismatch: Error(s) in loading state_dict for PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken: Missing key(s) ... qformer_perceiver.token_proj.0.weight ... The upstream `load_photomaker_adapter` defaults to `pm_version='v2'` regardless of the .bin file passed -- the loader builds a V2 encoder (PhotoMakerIDEncoder_CLIPInsightfaceExtendtoken) and then tries to load V1 weights into it. We must pass `pm_version='v1'` explicitly so the loader instantiates the CLIP-only PhotoMakerIDEncoder. The pipeline-level `input_id_images` API is the same across V1 and V2, so the call site does not change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 16:18:52 -07:00

231 Commits