remove-ai-watermarks

CalvinBackup/remove-ai-watermarks

Fork 0

mirror of https://github.com/wiltodelta/remove-ai-watermarks.git synced 2026-06-10 12:53:56 +02:00

Commit Graph

Author	SHA1	Message	Date
Victor Kuznetsov	6d11c11b52	feat(auto): DBNet text detector, Real-ESRGAN upscaler, batch --auto Three content-quality features for the invisible/all/batch pipeline. DBNet text detector (auto_config): replace the MSER text heuristic with PP-OCRv3 differentiable-binarization via cv2.dnn.TextDetectionModel_DB, using a bundled 2.4 MB Apache-2.0 model (en/cn detection nets are byte-identical, so it ships language-neutral). cv2.dnn is core OpenCV, so no new pip dep. MSER stays as the fallback when the model can't load. Validated on real images: matches MSER everywhere and additionally catches the Doubao CJK mark MSER missed; routing decisions unchanged otherwise. Real-ESRGAN upscaler (new upscaler.py, esrgan extra): optional pre-diffusion super-resolution for the min-resolution floor upscale, loaded via spandrel (MIT, no basicsr) with BSD-3-Clause weights downloaded on first use. New --upscaler {lanczos,esrgan} on invisible/all/batch; default stays lanczos and the engine falls back to lanczos when the extra is absent or the model errors (never breaks removal). It is a manual opt-in knob (the auto plan never selects it) -- as a generic GAN it sharpens photo/texture content strongly but can degrade faces (the diffusion pass regenerates them) and thin text, documented accordingly. batch --auto: wire the content-adaptive --auto (+ --adaptive-polish) into cmd_batch. The plan is recomputed per image and the invisible engine is cached per resolved pipeline (default/controlnet), so a mixed directory builds at most one engine of each kind. Verified end-to-end: 3 mixed images routed correctly with only 2 pipeline loads (controlnet reused). ruff + strict pyright(src/) clean; 558 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-04 16:04:33 -07:00
Victor Kuznetsov	b686dbdd79	feat(auto): adaptive detail-targeting polish + --adaptive-polish flag The fixed mild auto polish (unsharp 0.5 / grain 2.0) under-corrected soft photo/face output (gemini_3 stayed at lap-var 84 vs its 592 original) and its grain speckled small text. Replace it with humanizer.adaptive_polish: target the input's Laplacian variance with a capped unsharp scaled to the deficit + edge- masked grain (smooth regions only), calibrated by a short sigma search. Self- limiting on text/graphics -- already high-frequency, so almost no polish lands and text edges are masked out. Validated on the spaces corpus (gemini_3 84 -> 334 end-to-end; openai_1 text near-untouched). Interface: every --auto decision is now independently overridable -- add --adaptive-polish/--no-adaptive-polish (matching --restore-faces; works without --auto too) so the polish can be disabled or used manually. _apply_auto overrides exactly the three content-adaptive modes (pipeline, restore-faces, adaptive- polish); --unsharp/--humanize stay independent fixed filters. cv2-only, no new deps. Threaded through invisible/all (not batch). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 21:49:08 -07:00
Victor Kuznetsov	9bd2c17cc4	feat(auto): content-adaptive --auto quality mode, Phase 1 Add `auto_config.plan(image_path) -> AutoConfig`, the first step of the invisible/all pipeline: it inspects the input image (before the diffusion model loads) and picks the quality modes so the run adapts to content. Quality-priority routing -- ControlNet (text/face-structure preservation) is the default, skipped for plain SDXL only on a clearly structure-less image; GFPGAN face restore when a face is present; a mild sharpen + grain polish when a smoothing pass ran. Exposed as `--auto` on `all`/`invisible` (`_apply_auto`; explicit flags override via click's parameter source). Not wired into batch (its engine is cached per-mode). Detection is cv2-only and torch-free (~100 MB peak RSS, a few ms): OpenCV YuNet (`cv2.FaceDetectorYN`, MIT, 232 KB model bundled in assets/) for faces, a Canny edge-density + MSER heuristic for text/structure (a rough Phase-1 placeholder; DBNet via cv2.dnn is the planned upgrade). ZERO new pip deps. Designed to run wherever the pipeline runs -- the raiw.cc Modal GPU worker -- never on the 512 MB web host. Real-ESRGAN-via-Spandrel upscaling (a new `esrgan` extra) and an adaptive Laplacian-variance polish are deferred to later phases. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 20:52:17 -07:00

Author

SHA1

Message

Date

Victor Kuznetsov

6d11c11b52

feat(auto): DBNet text detector, Real-ESRGAN upscaler, batch --auto

Three content-quality features for the invisible/all/batch pipeline.

DBNet text detector (auto_config): replace the MSER text heuristic with
PP-OCRv3 differentiable-binarization via cv2.dnn.TextDetectionModel_DB,
using a bundled 2.4 MB Apache-2.0 model (en/cn detection nets are
byte-identical, so it ships language-neutral). cv2.dnn is core OpenCV, so
no new pip dep. MSER stays as the fallback when the model can't load.
Validated on real images: matches MSER everywhere and additionally catches
the Doubao CJK mark MSER missed; routing decisions unchanged otherwise.

Real-ESRGAN upscaler (new upscaler.py, esrgan extra): optional
pre-diffusion super-resolution for the min-resolution floor upscale, loaded
via spandrel (MIT, no basicsr) with BSD-3-Clause weights downloaded on
first use. New --upscaler {lanczos,esrgan} on invisible/all/batch; default
stays lanczos and the engine falls back to lanczos when the extra is absent
or the model errors (never breaks removal). It is a manual opt-in knob (the
auto plan never selects it) -- as a generic GAN it sharpens photo/texture
content strongly but can degrade faces (the diffusion pass regenerates
them) and thin text, documented accordingly.

batch --auto: wire the content-adaptive --auto (+ --adaptive-polish) into
cmd_batch. The plan is recomputed per image and the invisible engine is
cached per resolved pipeline (default/controlnet), so a mixed directory
builds at most one engine of each kind. Verified end-to-end: 3 mixed
images routed correctly with only 2 pipeline loads (controlnet reused).

ruff + strict pyright(src/) clean; 558 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-04 16:04:33 -07:00

Victor Kuznetsov

b686dbdd79

feat(auto): adaptive detail-targeting polish + --adaptive-polish flag

The fixed mild auto polish (unsharp 0.5 / grain 2.0) under-corrected soft
photo/face output (gemini_3 stayed at lap-var 84 vs its 592 original) and its
grain speckled small text. Replace it with humanizer.adaptive_polish: target the
input's Laplacian variance with a capped unsharp scaled to the deficit + edge-
masked grain (smooth regions only), calibrated by a short sigma search. Self-
limiting on text/graphics -- already high-frequency, so almost no polish lands
and text edges are masked out. Validated on the spaces corpus (gemini_3 84 -> 334
end-to-end; openai_1 text near-untouched).

Interface: every --auto decision is now independently overridable -- add
--adaptive-polish/--no-adaptive-polish (matching --restore-faces; works without
--auto too) so the polish can be disabled or used manually. _apply_auto overrides
exactly the three content-adaptive modes (pipeline, restore-faces, adaptive-
polish); --unsharp/--humanize stay independent fixed filters.

cv2-only, no new deps. Threaded through invisible/all (not batch).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-03 21:49:08 -07:00

Victor Kuznetsov

9bd2c17cc4

feat(auto): content-adaptive --auto quality mode, Phase 1

Add `auto_config.plan(image_path) -> AutoConfig`, the first step of the
invisible/all pipeline: it inspects the input image (before the diffusion model
loads) and picks the quality modes so the run adapts to content. Quality-priority
routing -- ControlNet (text/face-structure preservation) is the default, skipped for
plain SDXL only on a clearly structure-less image; GFPGAN face restore when a face is
present; a mild sharpen + grain polish when a smoothing pass ran. Exposed as `--auto`
on `all`/`invisible` (`_apply_auto`; explicit flags override via click's parameter
source). Not wired into batch (its engine is cached per-mode).

Detection is cv2-only and torch-free (~100 MB peak RSS, a few ms): OpenCV YuNet
(`cv2.FaceDetectorYN`, MIT, 232 KB model bundled in assets/) for faces, a Canny
edge-density + MSER heuristic for text/structure (a rough Phase-1 placeholder; DBNet
via cv2.dnn is the planned upgrade). ZERO new pip deps. Designed to run wherever the
pipeline runs -- the raiw.cc Modal GPU worker -- never on the 512 MB web host.

Real-ESRGAN-via-Spandrel upscaling (a new `esrgan` extra) and an adaptive
Laplacian-variance polish are deferred to later phases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-03 20:52:17 -07:00

3 Commits