remove-ai-watermarks

mirror of https://github.com/wiltodelta/remove-ai-watermarks.git synced 2026-07-05 07:57:50 +02:00

Author	SHA1	Message	Date
Victor Kuznetsov	76e3d4154c	feat(invisible): add Qwen-Image img2img pipeline (--pipeline qwen) A third diffusion pipeline alongside sdxl/controlnet: Qwen-Image (20B MMDiT, Apache-2.0 code AND weights) img2img. The scrub still comes from the img2img strength; Qwen preserves text (incl. CJK) and structure markedly better than SDXL at the scrub floor, so it over-regenerates real photos far less (directly targets the controlnet over-regeneration that degrades real uploads). - watermark_profiles: QWEN_MODEL_ID, normalize_profile accepts "qwen". - WatermarkRemover: _load_qwen_pipeline (bf16, loads Qwen base unless --model overridden, clear ImportError if diffusers lacks the class), _run_qwen (no MPS fallback -- 20B is CUDA/cloud-class), dispatch in _generate_one/preload, pure _build_qwen_kwargs (true_cfg_scale, not guidance_scale). - Shared _base_load_kwargs() across all three loaders (dtype + token). - CLI --pipeline gains "qwen"; invisible_engine threads it through. - scripts/qwen_scrub_prototype.py: standalone PEP 723 GPU experiment. Prototype oracle floors (Modal A100-80GB, single seed, controls SynthID-positive, PENDING seed-repeat cert): OpenAI clears at strength ~0.10, Gemini at ~0.30 (0.20 still detected), with CJK text + faces faithful where controlnet plasticizes. The Gemini floor is higher than the shared default ladder, so pass an explicit --strength for Gemini on this pipeline until a Qwen-specific ladder is certified. The model-running path is CUDA-only (untestable locally); unit tests cover the pure call-shape (_build_qwen_kwargs) and profile normalization without torch. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 20:44:36 -07:00
Victor Kuznetsov	0c0c6c6b03	feat(invisible): sliding-window tiled diffusion for large inputs (--tile) Add a lossless alternative to the --max-resolution downscale for large images that OOM on MPS/GPU: regenerate in overlapping, feather-blended tiles at native resolution. - noai/tiling.py: pure plan_tiles (uniform tiles, last flush to edge) + feather_weights (strictly-positive separable taper -> partition-of-unity blend) + run_tiled (per-tile generate callable, decoupled from the pipeline). Unit-tested without the model. - WatermarkRemover.remove_watermark: refactor _generate into _generate_one + a tiled branch that engages only when --tile is set and the long side exceeds tile_size (ControlNet canny is rebuilt per tile). - Thread tile/tile_size/tile_overlap through InvisibleEngine and the invisible/all/batch CLI commands via a shared _tile_options decorator. Verified end-to-end on the real SDXL pipeline (forced 2x2 tiling on a 1024px sample, MPS): non-degenerate output, no gross seam at tile borders. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 11:54:58 -07:00
Victor Kuznetsov	99e57c872f	perf(text-mark): footprint-sized arrays in reverse-alpha CPU path The reverse-alpha text-mark engine (Doubao/Jimeng/Samsung) allocated full-frame arrays where only the glyph footprint is ever read: - _fixed_alpha_map / _aligned_alpha_map each built a full (h, w) float32 alpha map non-zero only inside the glyph box, and two were held at once during removal (~96 MB of mostly-zeros on a 12 MP frame); - extract_mask built a full (h, w) uint8 mask that every caller cropped to the located box (~12 MB, rebuilt per text-mark detector on the memory-tight identify path). Both now return footprint-sized arrays: the alpha helpers return the glyph-sized block plus its placement (ax, ay, gw, gh), and extract_mask returns the box-sized mask. _apply_reverse_alpha consumes the block directly; the residual inpaint embeds it into one full-frame uint8 mask only at cv2.inpaint time (which needs a full-frame mask). remove_watermark_reverse_alpha tracks the winning region alongside best_amap to place it. Peak allocation drops from O(image*4)x2 + O(image) to O(footprint)x2 + one gated O(image*1) uint8 mask -- a win every consumer gets, motivated by the 512 MB raiw.cc worker that OOMs on large decodes. GPU path untouched. Byte-identical to the old full-frame path (verified: 17 output hashes across the three engines, inpaint/no-inpaint, detect, and the real doubao-1.png fixture, unchanged before/after). tests/test_text_mark_memory.py guards it by reconstructing the old full-frame path inline and asserting equality, so the proof survives a cv2/asset bump, and pins the O(footprint) shape so a regression to full-frame fails loudly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 10:01:07 -07:00
Victor Kuznetsov	41f67973ce	fix(visible): inpaint mid-tone Gemini sparkle instead of a dark diamond The free `visible` path over-subtracted a faint Gemini sparkle on a mid-tone background into a darker-than-background brown diamond instead of removing it (2026-06-18 prod NPS report, "the watermark was not removed, just its color changed"). The existing over-subtraction guard only tripped when reverse-alpha drove a footprint pixel fully negative (the issue #30 dark-background black-pit case); on a mid-tone background the over-subtraction darkens the core well below the background without any pixel crossing zero, so the gate missed it and shipped the dark mark. Add a second over-subtraction signal to `_reverse_alpha_oversubtracts`: predict the reverse-alpha output at the bright core, (core - a*logo)/(1-a), and route to the footprint inpaint when it lands more than `_OVERSUB_DARK_MARGIN` (25) gray levels below the local background ring. Calibrated wide: clean removals predict within ~12 of background (demo_banana ~-1), the prod regression ~-40, the issue #30 dark case ~-82. Corpus-validated on the 479 detected Gemini images: 10 switch reverse-alpha to inpaint, all of them dark-diamond cases that improve or match; the other 469 stay byte-identical. demo_banana stays on the reverse-alpha path (byte-identical). Also crop both reverse-alpha helpers to the region they actually touch, a pure O(image) -> O(mark) win that is byte-identical to the full-frame math (a uint8<->float32 round-trip is exact): - `GeminiEngine._core_and_bg` converts only the footprint+ring crop to gray, not the whole frame (~70 ms -> 0.1 ms on a 12 MP image; it runs for both the alpha-gain estimate and the new gate). Verified identical across 479 images; detector confidence unchanged. 
- `TextMarkEngine._apply_reverse_alpha` computes the blend on the glyph crop only (`amap` is zero outside it, so the math is a no-op there): ~275 ms -> ~2 ms per placement on a 12 MP frame, up to 2 placements per removal. Verified identical across 142 Doubao/Jimeng placements. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-18 17:19:41 -07:00
Victor Kuznetsov	28569bd05d	fix(gemini): recover sub-0.85 corner sparkles via top-K fusion selection The 256->512 detection-search widening (v0.8) let a large, low-gradient shape match outrank a genuine mid-size corner sparkle whose raw NCC sits below the 0.85 corner-promote gate, so `identify` read `unknown` on Gemini images that v0.7.2 caught (reporter osachub: scale-48 sparkle on light bedding -- true sparkle spatial 0.775 / grad 0.960 / fusion 0.676, but the size-weighted argmax locked onto a decoy at spatial 0.628 / grad 0.036). detect_watermark now keeps the top-K (_SELECT_TOPK=3) size-weighted candidates (NMS-deduped) plus the corner-promote candidate, scores each by full fusion (spatial+gradient+variance) via the extracted _grad_var_scores helper, and selects the highest -- the gradient term lifts the true sparkle over the decoy. Ranking by the SIZE-WEIGHTED score (not a raw-NCC argmax) preserves tiny-patch suppression: a raw-NCC argmax re-admitted 16-18px content false positives (14/65 doubao + 4/11 jimeng visible images). Top-K adds zero flips on the doubao/jimeng corpora and leaves the 495-image Gemini set unchanged (479 detected) while recovering the reporter's image at 0.676. - _grad_var_scores: gradient/variance scoring factored out of detect_watermark - confidence = best_fused (drop the duplicated fusion recompute) - tests: rename test_promotion_is_what_rescues_it -> test_size_weighted_search_alone_traps_on_the_decoy (corner-promote is no longer the sole rescue path); add a deterministic regression test mirroring the real spatial/grad signature - docs: module-internals.md detector section + CLAUDE.md mechanism map Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-12 12:04:20 -07:00
Victor Kuznetsov	9feea4ac1e	Slim CLAUDE.md: move module internals, limitations, landscape research to docs Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 15:50:03 -07:00
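
The tiling scheme described in commit 0c0c6c6b03 (uniform tiles with the last one flush to the edge, a strictly positive taper, and a normalized blend) can be illustrated in one dimension. This is a minimal sketch, not the code in noai/tiling.py: the names `plan_tiles_1d`, `feather_weights_1d`, and `blend_1d` are hypothetical, the linear ramp is an assumed taper shape, and the repository's version is described as separable over two axes.

```python
from typing import List, Sequence, Tuple


def plan_tiles_1d(length: int, tile: int, overlap: int) -> List[Tuple[int, int]]:
    """Return (start, stop) spans of equal size covering [0, length)."""
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    if length <= tile:
        return [(0, length)]  # input fits in a single tile
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush against the edge
    return [(s, s + tile) for s in starts]


def feather_weights_1d(n: int, ramp: int) -> List[float]:
    """Linear taper that never reaches zero (assumed shape, see lead-in)."""
    ramp = max(1, ramp)
    return [min(i + 1, n - i, ramp) / ramp for i in range(n)]


def blend_1d(
    length: int,
    tiles: Sequence[Tuple[int, int]],
    outputs: Sequence[Sequence[float]],
    overlap: int,
) -> List[float]:
    """Weighted average of per-tile outputs; dividing by the summed weights
    makes the effective weights add up to one at every position."""
    acc = [0.0] * length
    wsum = [0.0] * length
    for (start, stop), values in zip(tiles, outputs):
        weights = feather_weights_1d(stop - start, overlap)
        for offset, (w, v) in enumerate(zip(weights, values)):
            acc[start + offset] += w * v
            wsum[start + offset] += w
    # wsum is strictly positive wherever a tile covers the position
    return [a / w for a, w in zip(acc, wsum)]


tiles = plan_tiles_1d(length=10, tile=6, overlap=2)
outputs = [[5.0] * (stop - start) for start, stop in tiles]
blended = blend_1d(10, tiles, outputs, overlap=2)
```

Because the taper is strictly positive, the normalization never divides by zero, and two tiles that agree on a value reproduce that value exactly in the overlap.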

6 Commits
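
The second over-subtraction signal from commit 41f67973ce predicts the reverse-alpha output at the bright core as (core - a*logo)/(1-a) and routes to inpainting when that prediction lands more than 25 gray levels below the local background ring. A minimal sketch of that check follows. The function names are hypothetical, and the default logo intensity of 255 (a white sparkle) is an assumption that the commit message does not state.

```python
OVERSUB_DARK_MARGIN = 25.0  # gray levels, the value quoted in the commit message


def predicted_core(core: float, alpha: float, logo: float = 255.0) -> float:
    """Invert the blend core = alpha*logo + (1 - alpha)*original.

    logo=255.0 is an assumed intensity for illustration only.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    return (core - alpha * logo) / (1.0 - alpha)


def oversubtracts(
    core: float,
    background: float,
    alpha: float,
    logo: float = 255.0,
    margin: float = OVERSUB_DARK_MARGIN,
) -> bool:
    """True when the predicted result is darker than background - margin."""
    return predicted_core(core, alpha, logo) < background - margin


# Faint mark, modest alpha estimate: the prediction stays near the background.
keep_reverse_alpha = not oversubtracts(core=150.0, background=128.0, alpha=0.25)

# Same pixels, alpha estimate too high: the prediction falls far below the
# background, so this case would be routed to inpainting instead.
route_to_inpaint = oversubtracts(core=150.0, background=128.0, alpha=0.5)
```

The check uses only the core intensity, the background ring intensity, and the alpha estimate, so it can run before any pixels are modified.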