mirror of
https://github.com/wiltodelta/remove-ai-watermarks.git
synced 2026-06-10 12:53:56 +02:00
2fcd00ced0
Nine findings from a high-effort project-wide review, fixed and verified (571 passed, ruff/pyright clean):

Correctness:
- all/batch now remove Doubao/Jimeng/Samsung visible text marks: the visible step routes through the registry (new cli._remove_visible_auto) instead of a hardcoded GeminiEngine, so they no longer leave the wordmark intact.
- batch always reads the original source (dropped the out_path-reuse that re-processed already-cleaned outputs on a re-run).
- img2img_runner only retries the diffusion call on the deprecated-callback TypeError; any other TypeError now propagates instead of double-running.
- gemini detect/remove and the reverse-alpha engines normalize channels via a new image_io.to_bgr, fixing a grayscale/BGRA crash in the FP-gate path.
- _png_late_metadata advances its cursor by the clamped length, so a malformed chunk length no longer aborts the late AI-label scan.

Cleanup / efficiency:
- Consolidate the ~90%-identical Doubao/Jimeng/Samsung engines into a shared config-driven _text_mark_engine.TextMarkEngine base; each engine is now a thin subclass (TextMarkConfig + test shims). Behavior is byte-exact (the three engine test suites pass unchanged). Registry adapters collapse to one _text_mark(...) row each. Gemini stays a separate engine.
- scan_head is memoized per (path, size, mtime), so identify() reads the file head once instead of ~8 times.
- invisible_engine post-processing decodes/encodes the output once (chained in memory) instead of 2-4 times across stages.
- Remove the orphaned get_model_id_for_profile (+ CONTROLNET_PROFILE); derive the --strength help from the strength constants (strength_default_help) so it cannot drift; share the --pipeline/--strength click options; simplify the retired --auto resolver.

Net -835 lines. Tests added for the registry-routed visible pass, to_bgr, the polish/model/guidance wiring, and strength_default_help.
CLAUDE.md updated for the new base module, the engine/registry changes, image_io.to_bgr, and the scan_head cache.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
90 lines
3.2 KiB
Python
"""Unicode-safe cv2 image IO (issue #17).

``cv2.imread`` / ``cv2.imwrite`` pass the path to the platform C runtime, which
on Windows uses the narrow (ANSI) code-page API and therefore fails on paths
containing non-ASCII characters (Chinese, Cyrillic, ...). The symptom is a
``can't open/read file`` warning and a ``None`` decode even though the file
exists.

These wrappers route through numpy buffers instead: ``np.fromfile`` /
``ndarray.tofile`` open the path in Python (full Unicode), and
``cv2.imdecode`` / ``cv2.imencode`` do the codec work. The decoded/encoded
bytes are byte-for-byte identical to ``imread`` / ``imwrite``. On macOS/Linux
cv2 already accepts UTF-8 paths, so the wrappers are behavior-neutral there.

cv2/numpy are imported lazily inside the functions so importing this module
stays cheap in a bare environment (matching the rest of the package).
"""

# cv2 ships no type stubs; mirror the pragma used by the other cv2-using modules.
# pyright: reportMissingTypeStubs=false, reportUnknownMemberType=false, reportUnknownVariableType=false, reportUnknownArgumentType=false

from __future__ import annotations

from pathlib import Path
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    from numpy.typing import NDArray


def imread(path: str | Path, flags: int | None = None) -> NDArray[Any] | None:
    """Unicode-safe ``cv2.imread``.

    ``flags`` defaults to ``cv2.IMREAD_COLOR`` (same as ``cv2.imread``). Returns
    ``None`` when the file is missing or cannot be decoded, matching
    ``cv2.imread`` semantics so existing ``if img is None`` checks keep working.
    """
    import cv2
    import numpy as np

    if flags is None:
        flags = cv2.IMREAD_COLOR
    try:
        data = np.fromfile(str(path), dtype=np.uint8)
    except OSError:
        return None
    if data.size == 0:
        return None
    return cv2.imdecode(data, flags)


def to_bgr(image: NDArray[Any]) -> NDArray[Any]:
    """Return a 3-channel BGR view of ``image``, promoting grayscale and BGRA.

    The cv2-based engines (sparkle + the reverse-alpha text marks) assume a
    3-channel BGR array for their channel reductions (``mean(axis=2)``, the
    per-pixel logo subtraction). A 2D grayscale or 4-channel BGRA input -- a real
    Gemini-app export is opaque RGBA -- would otherwise crash or mis-broadcast.
    Centralizes the shape coercion that was inlined across the engines. A 3-channel
    input is returned unchanged (no copy).
    """
    import cv2

    if image.ndim == 2 or image.shape[2] == 1:
        return cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
    if image.shape[2] == 4:
        return cv2.cvtColor(image, cv2.COLOR_BGRA2BGR)
    return image


def imwrite(path: str | Path, img: NDArray[Any]) -> bool:
    """Unicode-safe ``cv2.imwrite``.

    The output format is taken from the path extension (e.g. ``.png``), exactly
    like ``cv2.imwrite``. Returns ``True`` on success, ``False`` if the codec
    rejects the image or the path cannot be written (matching ``cv2.imwrite``,
    which returns ``False`` rather than raising on an unwritable path).
    """
    import cv2

    ext = Path(path).suffix or ".png"
    ok, buf = cv2.imencode(ext, img)
    if not ok:
        return False
    try:
        buf.tofile(str(path))
    except OSError:
        return False
    return True
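
The path-handling idea behind these wrappers can be sketched without cv2. The snippet below is a minimal illustration and not part of the module: `roundtrip_bytes` and the file name are hypothetical, and it only checks that `ndarray.tofile` / `np.fromfile` accept a non-ASCII path and preserve the bytes. The codec step (`cv2.imencode` / `cv2.imdecode`) is deliberately left out.

```python
import tempfile
from pathlib import Path

import numpy as np


def roundtrip_bytes(path: Path, payload: bytes) -> bytes:
    """Write ``payload`` via ``ndarray.tofile`` and read it back via ``np.fromfile``.

    Hypothetical helper for illustration only; it mirrors the file access used
    by ``imwrite`` / ``imread`` above but skips the image encode/decode.
    """
    # The path is opened by numpy on the Python side, not by cv2.
    np.frombuffer(payload, dtype=np.uint8).tofile(str(path))
    return np.fromfile(str(path), dtype=np.uint8).tobytes()


with tempfile.TemporaryDirectory() as tmp:
    # File name mixing Chinese and Cyrillic characters (assumed example name).
    target = Path(tmp) / "图片_фото.bin"
    payload = b"\x89PNG\r\n\x1a\n"  # the 8-byte PNG signature, used as sample data
    restored = roundtrip_bytes(target, payload)
```

In the real wrappers the buffer written by `tofile` comes from `cv2.imencode`, and the buffer returned by `np.fromfile` is handed to `cv2.imdecode`.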