mirror of
https://github.com/wiltodelta/remove-ai-watermarks.git
synced 2026-06-10 12:53:56 +02:00
chore: project review (dev tools in extras, dep upgrades, optional-deps guard, stale cleanup)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
+3
-5
@@ -18,9 +18,6 @@ Thumbs.db
 *.swp
 *.swo

-# Test results
-data/results/
-
 # SynthID corpus reference fills (synthetic black/white calibration tiles,
 # regenerable; the labeled pos/neg/cleaned images ARE tracked, see README)
 data/synthid_corpus/refs/
@@ -49,6 +46,7 @@ data/gemini_capture/captures/gemini_content_*.png
 data/samsung_capture/seeds/
 data/samsung_capture/captures/samsung_content_*

-# GFPGAN downloads its RetinaFace/parsing weights to a CWD ./gfpgan/weights/
-# working dir on first use (the restore extra). Runtime artifact, never committed.
+# Leftover GFPGAN weights dir from the retired face-restore experiments
+# (GFPGAN wrote RetinaFace/parsing weights to a CWD ./gfpgan/weights/ working
+# dir on first use). Runtime artifact, never committed.
 gfpgan/
@@ -15,11 +15,12 @@ You are a **principal Python engineer** maintaining a CLI tool and library for r

 ## Test and lint

-- **CI** (`.github/workflows/test.yml`): runs on push to `main` + every PR. A `lint` job (ubuntu: `ruff check` + `ruff format --check`) plus a `test` matrix (ubuntu/macos/windows x py3.10/3.12) that does `uv sync --frozen --extra dev` then `pytest`. The matrix installs only core + dev (no `gpu` extra), so the GPU/model-running tests skip there and it exercises the metadata/identify/visible/cv2-eraser surface on all three OSes. Keep `uv.lock` valid (don't break `--frozen`) when editing `pyproject.toml`. `publish.yml` stays release-only and now verifies the release tag matches the `pyproject.toml` version (fails the build on a mismatch) before building, then uploads via `uv publish` (PyPI trusted publishing over OIDC, no token — replaced the `pypa/gh-action-pypi-publish` action so the upload no longer depends on that action's bundled twine accepting the Metadata-Version; the `id-token: write` permission + `pypi` environment + workflow filename are unchanged, so PyPI's trusted-publisher entry still matches). **Release flow:** bump the version in `pyproject.toml` + `src/remove_ai_watermarks/__init__.py` + `uv.lock` (the project's own `[[package]]` entry — find it with `grep -n 'name = "remove-ai-watermarks"' uv.lock`, the `version =` line right below it, ~line 2246), commit `chore(release): vX.Y.Z`, `git tag -a vX.Y.Z -m vX.Y.Z` (annotated — `git tag` without `-m` errors here), push `main` + the tag, then `gh release create vX.Y.Z` — **PyPI publish triggers on the GitHub Release `published` event, NOT on the tag push**, so the tag alone does not publish. **Sdist must exclude `data/`** (`[tool.hatch.build.targets.sdist] exclude = ["/data"]`): hatchling's default sdist bundles all VCS-tracked files, so the committed `data/` test corpora (the multi-hundred-MB synthid_corpus images + the visible-mark captures) pushed the **0.8.0** sdist past PyPI's per-project file-size limit (400 "File too large") — the wheel uploaded but the sdist was rejected, so 0.8.0 shipped wheel-only and 0.8.1 carried the fix. The wheel only ships `src/` (via `[tool.hatch.build.targets.wheel] packages`), so it was never affected. **A failed PyPI upload of one artifact still leaves the other live and you cannot re-upload the same version** — fix the build and cut the next patch. **Build backend is pinned `hatchling<1.31`** (`[build-system] requires`): hatchling **1.30.0** made **Metadata-Version 2.5** (PEP 794) the default, which the twine bundled in `pypa/gh-action-pypi-publish@release/v1` rejects (`"'2.5' is not a valid Metadata-Version"`) — this **failed the v0.8.3 PyPI upload on 2026-06-01** (tag-match + build passed, the upload step failed; nothing was uploaded, so the version stayed empty on PyPI), when unpinned `requires = ["hatchling"]` pulled 1.30.0. **hatchling 1.30.1 reverted the default back to 2.4** ("kept at 2.4 until more tools support 2.5"), and 1.27-1.29 always emitted 2.4 — so `<1.31` keeps `uv build` on a 2.4-emitting hatchling (it resolves to the latest allowed, **1.30.1**, which uploads fine). (The earlier "1.28+ emits 2.5" note was imprecise: the 2.5 default landed only in 1.30.0, verified against hatch's changelog.) The publish workflow **now uses `uv publish`** (its uploader handles 2.5), so this pin is no longer load-bearing — it stays as belt-and-suspenders so the first uv-publish release ships 2.4 metadata (isolating the uploader swap from the metadata-version bump); drop it to `requires = ["hatchling"]` once that release confirms the path.
-- `bash maintain.sh` — uv-outdated, uv-secure, ruff check/fix, ruff format, pyright, pytest -n auto
-- **Strict pyright is clean across `src/` (0 errors).** The cv2/torch/diffusers boundary files (`gemini_engine`, `region_eraser`, `doubao_engine`, `humanizer`, `invisible_engine`, `noai/watermark_remover`) carry a documented per-file `# pyright:` relax pragma that turns off only the unknown-type / untyped-third-party rules — those libs ship no usable types, so strict typing there fights the ecosystem. Pure-logic files stay fully strict; `typings/piexif/__init__.pyi` is a local stub so `metadata.py`/`extractor.py` resolve piexif. Public ndarray-returning signatures on the relaxed engines are still annotated `NDArray[Any]` so strict consumers (`cli.py`) stay clean. When touching a relaxed file, prefer fixing real issues over widening the pragma; keep the pragma scoped to genuinely-untyped boundaries. (`uv-secure` is clean since idna was bumped 3.11 -> 3.16, fixing GHSA-65pc-fj4g-8rjx, and aiohttp 3.13.5 -> 3.14.0 via `uv lock --upgrade-package aiohttp`, fixing GHSA-hg6j-4rv6-33pg + GHSA-jg22-mg44-37j8. (The basicsr Dependabot alert (GHSA-86w8-vhw6-q9qq, command injection, no patch, <= 1.4.2): accepted, not fixable -- basicsr is the optional `restore` extra pinned to 1.4.2 as the only buildable version, experimental and off by default.)
+- **CI** (`.github/workflows/test.yml`): runs on push to `main` + every PR. A `lint` job (ubuntu: `ruff check` + `ruff format --check`) plus a `test` matrix (ubuntu/macos/windows x py3.10/3.12) that does `uv sync --frozen --extra dev` then `pytest`. The matrix installs only core + dev (no `gpu` extra), so the GPU/model-running tests skip there and it exercises the metadata/identify/visible/cv2-eraser surface on all three OSes. Keep `uv.lock` valid (don't break `--frozen`) when editing `pyproject.toml`. `publish.yml` stays release-only and now verifies the release tag matches the `pyproject.toml` version (fails the build on a mismatch) before building, then uploads via `uv publish` (PyPI trusted publishing over OIDC, no token — replaced the `pypa/gh-action-pypi-publish` action so the upload no longer depends on that action's bundled twine accepting the Metadata-Version; the `id-token: write` permission + `pypi` environment + workflow filename are unchanged, so PyPI's trusted-publisher entry still matches). **Release flow:** bump the version in `pyproject.toml` + `src/remove_ai_watermarks/__init__.py` + `uv.lock` (the project's own `[[package]]` entry — find it with `grep -n 'name = "remove-ai-watermarks"' uv.lock`, the `version =` line right below it, ~line 2246), commit `chore(release): vX.Y.Z`, `git tag -a vX.Y.Z -m vX.Y.Z` (annotated — `git tag` without `-m` errors here), push `main` + the tag, then `gh release create vX.Y.Z` — **PyPI publish triggers on the GitHub Release `published` event, NOT on the tag push**, so the tag alone does not publish. **Sdist must exclude `data/`** (`[tool.hatch.build.targets.sdist] exclude = ["/data"]`): hatchling's default sdist bundles all VCS-tracked files, so the committed `data/` test corpora (the multi-hundred-MB synthid_corpus images + the visible-mark captures) pushed the **0.8.0** sdist past PyPI's per-project file-size limit (400 "File too large") — the wheel uploaded but the sdist was rejected, so 0.8.0 shipped wheel-only and 0.8.1 carried the fix. The wheel only ships `src/` (via `[tool.hatch.build.targets.wheel] packages`), so it was never affected. **A failed PyPI upload of one artifact still leaves the other live and you cannot re-upload the same version** — fix the build and cut the next patch. **Build backend is unpinned `hatchling`** (`[build-system] requires`) since 2026-06-09. History: it was pinned `<1.31` because hatchling 1.30.0 made Metadata-Version 2.5 (PEP 794) the default and the twine bundled in `pypa/gh-action-pypi-publish@release/v1` rejected it (`"'2.5' is not a valid Metadata-Version"`), which **failed the v0.8.3 PyPI upload on 2026-06-01**; hatchling 1.30.1 reverted the default to 2.4. After the workflow moved to `uv publish` (whose uploader accepts 2.5) the pin was belt-and-suspenders only, and once v0.9.0 + v0.10.0 both published wheel+sdist through that path (verified on PyPI) it was dropped. If a future hatchling flips the default to 2.5 again and some consumer chokes, re-pin with a dated comment.
+- `bash maintain.sh` — uv-outdated, uv-secure, ruff check/fix, ruff format, pyright (scoped `src/`, see the OOM note below), pytest -n auto. The helper tools live in the `dev` extra (`pytest-xdist`, plus `uv-outdated`/`uv-secure` marker-gated to py3.12+ so the py3.10 resolution stays solvable) — a bare env without `--extra dev` does not have them.
+- **Strict pyright is clean across `src/` (0 errors).** The cv2/torch/diffusers boundary files (`gemini_engine`, `region_eraser`, `doubao_engine`, `humanizer`, `invisible_engine`, `noai/watermark_remover`) carry a documented per-file `# pyright:` relax pragma that turns off only the unknown-type / untyped-third-party rules — those libs ship no usable types, so strict typing there fights the ecosystem. Pure-logic files stay fully strict; `typings/piexif/__init__.pyi` is a local stub so `metadata.py`/`extractor.py` resolve piexif. Public ndarray-returning signatures on the relaxed engines are still annotated `NDArray[Any]` so strict consumers (`cli.py`) stay clean. When touching a relaxed file, prefer fixing real issues over widening the pragma; keep the pragma scoped to genuinely-untyped boundaries. (`uv-secure` is clean since idna was bumped 3.11 -> 3.16, fixing GHSA-65pc-fj4g-8rjx, and aiohttp 3.13.5 -> 3.14.0 via `uv lock --upgrade-package aiohttp`, fixing GHSA-hg6j-4rv6-33pg + GHSA-jg22-mg44-37j8. (The old basicsr Dependabot alert (GHSA-86w8-vhw6-q9qq) is resolved by removal: the experimental `restore` extra was retired and basicsr is no longer anywhere in the dependency tree.)
 - **Full-project `uv run pyright` (no path) OOMs/crashes node on this ML-heavy repo** (emits a `libnode` stack frame, no summary) — a known environment limit, not a code error. Gate with `uv run --extra dev --extra gpu pyright src/` (completes, authoritative) or scope to changed files; also run `uv run ruff check` and `uv run pytest` directly.
 - Run `uv run` from the repo root — from another cwd it falls back to a bare env without numpy/cv2/torch.
+- **Stale `trustmark` remnant in site-packages after an extras change:** the `trustmark` package downloads model weights INTO its own package dir, so when a narrower `uv sync` prunes the package, a `trustmark/models/` directory survives as an empty namespace package. Symptom: pyright `"TrustMark" is unknown import symbol` on `trustmark_detector.py` and `find_spec("trustmark")` returning a loader-less spec (so `is_available()` lies True). Fix: `rm -rf .venv/lib/python3.12/site-packages/trustmark` (regenerable weights cache).
 - To add a dev tool (pytest/ruff/pyright) into the env, use `uv sync --frozen --extra dev --extra gpu`, **never `uv pip install`** — `uv pip install` re-resolves and rewrites `uv.lock`, which silently bumped `transformers` to a build incompatible with the pinned `diffusers` (`cannot import name 'Qwen3VLForConditionalGeneration'`) and broke every `identify`/metadata import. Recovery: `git checkout uv.lock && uv sync --frozen --extra gpu --extra dev`. The `gpu` extra holds `diffusers`/`transformers`/`torch`, so a bare `uv sync` (no extras) removes them; `noai/__init__` is now **lazy** (PEP 562 `__getattr__`, so importing `identify`/`metadata` no longer pulls `watermark_remover`/torch), so a bare env breaks only when the removal pipeline is actually invoked, not on import. `maintain.sh`'s `uv sync --all-extras` also pulls the heavy `trustmark`/`lama` wheels (pytorch-lightning, onnxruntime) — fine on a good connection, but on flaky DNS sync only `--extra gpu --extra dev` and run the lint/test steps by hand.
 - Metadata/C2PA tests assert against real committed fixtures in `data/samples/` (`chatgpt-*.png` = OpenAI C2PA, `firefly-1.png` = Adobe, `mj-*` = Midjourney IPTC, `doubao-1.png` = ByteDance Doubao with the China TC260 `<TC260:AIGC>` XMP label **and** a visible "豆包AI生成" text mark bottom-right; `grok-1.jpg` = xAI Grok with its EXIF-only `Signature:` blob + UUID `Artist` and no C2PA/SynthID/IPTC); synthetic byte blobs cover the JPEG/ISOBMFF format paths. The "non-AI / clean photo" control is no longer in `data/samples/` -- the `clean_photo` conftest fixture serves a verified-negative image from the corpus `neg/` set (skips if the corpus is absent).
 - SynthID reference corpus: `scripts/synthid_corpus.py` ingests labeled images into `data/synthid_corpus/`. The labeled `images/` (`pos/` `neg/` `cleaned/`) are **committed** (public repo -- review every image for private content before adding; `manifest.csv` is kept in sync with the files on disk, one row per tracked image); only the synthetic `refs/` calibration fills are gitignored. See its README for the collection protocol and verification oracles. **`cleaned/` examples must be produced by a CURRENT shipped removal method** -- the default SDXL img2img pass (optionally `--max-resolution`). Do NOT archive cleaned outputs from methods that are no longer in the pipeline (ctrlregen, the old text/face-protection, IP-Adapter FaceID, CodeFormer) or from the experimental opt-in paths (controlnet, face restore) as corpus examples; a cleaned reference should represent the canonical removal, and a removed method's output is not a reproducible example. Keep those experiment outputs in a local working dir, never in the committed corpus.
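The lazy `noai/__init__` mentioned in the dev-tool bullet relies on a PEP 562 module-level `__getattr__`. The sketch below shows that mechanism in isolation; it is not the project's actual code. `make_lazy_package`, `demo_pkg`, and the `heavy` attribute are hypothetical names, and the stdlib `colorsys` module stands in for a heavy dependency such as torch.

```python
from __future__ import annotations

import importlib
import types


def make_lazy_package(name: str, lazy_attrs: dict[str, str]) -> tuple[types.ModuleType, list[str]]:
    """Build a module whose listed attributes are imported on first access.

    Hypothetical helper: ``lazy_attrs`` maps attribute name -> module to import.
    Returns the module plus a log of the imports it actually performed.
    """
    module = types.ModuleType(name)
    performed: list[str] = []

    def __getattr__(attr: str) -> object:
        # PEP 562: called only when normal attribute lookup on the module fails.
        target = lazy_attrs.get(attr)
        if target is None:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        value = importlib.import_module(target)
        performed.append(target)
        setattr(module, attr, value)  # cache, so the hook is not hit again
        return value

    # Stored in the module's __dict__, which is where PEP 562 looks for it.
    module.__getattr__ = __getattr__
    return module, performed


pkg, performed = make_lazy_package("demo_pkg", {"heavy": "colorsys"})
assert performed == []  # building the package imported nothing
_ = pkg.heavy  # first access triggers the import
_ = pkg.heavy  # second access is served from the cached attribute
assert performed == ["colorsys"]
```

Under this pattern, a bare environment fails only at the point where the heavy attribute is first touched, which matches the behaviour the bullet describes.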
+3
-1
@@ -7,5 +7,7 @@ uv run uv-outdated
 uv run uv-secure --ignore-unfixed
 uv run ruff check --fix
 uv run ruff format
-uv run pyright
+# Scoped to src/: a full-project pyright run OOM-crashes node on this ML-heavy
+# repo (see CLAUDE.md "Test and lint"); src/ is the authoritative strict gate.
+uv run pyright src/
 uv run pytest -n auto
+6
-19
@@ -91,19 +91,17 @@ esrgan = [
 dev = [
     "pytest>=8.0.0",
     "pytest-cov>=4.1.0",
+    "pytest-xdist>=3.5.0",
     "ruff>=0.4.0",
     "pyright>=1.1.0",
     "invisible-watermark>=0.2.0",
+    # maintain.sh helpers; they only support newer Pythons, so gate them by
+    # marker to keep the py3.10 resolution (and CI matrix) solvable.
+    "uv-outdated>=0.1.0; python_version >= '3.12'",
+    "uv-secure>=0.12.0; python_version >= '3.12'",
 ]
 all = ["remove-ai-watermarks[gpu,detect,trustmark,lama,dev]"]

-# diffusers 0.38.0 (security fix for GHSA-98h9-4798-4q5v) declares a dependency
-# on safetensors>=0.8.0rc0 — a pre-release. Allow pre-releases globally so the
-# resolver can satisfy that. Drop once diffusers publishes a release with a
-# stable safetensors pin (or once safetensors 0.8.0 stable is out).
-[tool.uv]
-prerelease = "allow"
-
 # PyTorch Intel-GPU (XPU) wheel index. ``explicit = true`` keeps it inert for
 # the default CPU/CUDA install: uv consults it only when a torch install
 # explicitly targets it (see the ``gpu`` extra comment), so it does not alter
@@ -120,18 +118,7 @@ remove-ai-watermarks = "remove_ai_watermarks.cli:main"
 Repository = "https://github.com/wiltodelta/remove-ai-watermarks"

 [build-system]
-# Pin hatchling < 1.31. hatchling 1.30.0 made Metadata-Version 2.5 (PEP 794) the
-# default, which the twine bundled in pypa/gh-action-pypi-publish@release/v1 rejects
-# ("'2.5' is not a valid Metadata-Version"), failing the v0.8.3 PyPI upload
-# (2026-06-01) when unpinned requires = ["hatchling"] pulled 1.30.0. hatchling 1.30.1
-# reverted the default to 2.4 ("kept at 2.4 until more tools support 2.5"), and
-# 1.27-1.29 were always 2.4 -- so < 1.31 keeps `uv build` on a 2.4-emitting hatchling
-# (it resolves to the latest allowed, 1.30.1). The publish workflow now uses
-# `uv publish`, whose uploader accepts 2.5, so this pin is belt-and-suspenders, not
-# load-bearing: keeping it makes the first uv-publish release ship 2.4 metadata
-# (isolating the uploader swap from the metadata-version bump). Drop to
-# `requires = ["hatchling"]` once that release confirms the path.
-requires = ["hatchling<1.31"]
+requires = ["hatchling"]
 build-backend = "hatchling.build"

 [tool.hatch.build.targets.wheel]
@@ -18,6 +18,7 @@ from typing import TYPE_CHECKING, Any, Literal
 import click

 from remove_ai_watermarks import __version__, watermark_registry
+from remove_ai_watermarks.noai.constants import SUPPORTED_FORMATS
 from remove_ai_watermarks.noai.watermark_profiles import (
     resolve_strength,
     strength_default_help,
@@ -106,8 +107,6 @@ Progress = _Progress
 SpinnerColumn = BarColumn = TextColumn = TimeElapsedColumn = _column
 console = _Console()

-SUPPORTED_FORMATS = {".png", ".jpg", ".jpeg", ".webp"}
-

 def _setup_logging(verbose: bool) -> None:
     level = logging.DEBUG if verbose else logging.WARNING
@@ -2,8 +2,8 @@

 ``apply_analog_humanizer`` injects film grain and chromatic aberration to defeat
 digital AI-perfection classifiers (ported from NeuralBleach); ``unsharp_mask``
-counters the soft, over-smoothed look that diffusion + face-restoration leave
-behind (itself a common "this is AI" tell).
+counters the soft, over-smoothed look that the diffusion pass leaves behind
+(itself a common "this is AI" tell).
 """

 # cv2/numpy boundary: third-party libs ship no usable element types; relax the
@@ -63,7 +63,7 @@ def apply_analog_humanizer(image: NDArray, grain_intensity: float = 4.0, chromat
 def unsharp_mask(image: NDArray, amount: float = 0.5, sigma: float = 1.0) -> NDArray:
     """Sharpen via unsharp masking: ``out = image + amount * (image - blur(image))``.

-    Counters the soft, over-smoothed look of the diffusion + GFPGAN passes, which
+    Counters the soft, over-smoothed look of the diffusion pass, which
     reads as an AI tell. ``amount`` 0 = no-op (returns an unchanged copy); ~0.5-0.8
     is a safe range -- higher risks bright edge halos that are their own artifact.
     ``sigma`` is the Gaussian radius of the unsharp kernel.
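The formula in the `unsharp_mask` docstring, `out = image + amount * (image - blur(image))`, can be illustrated on a 1-D signal. This is a toy sketch, not the project's cv2 implementation: `box_blur` and `unsharp_mask_1d` are hypothetical names, a 3-tap mean blur stands in for the Gaussian (so there is no `sigma`), and clamping to the 0-255 range is an assumption.

```python
from __future__ import annotations


def box_blur(signal: list[float]) -> list[float]:
    """3-tap mean blur with edge replication (stand-in for the Gaussian blur)."""
    last = len(signal) - 1
    return [
        (signal[max(i - 1, 0)] + signal[i] + signal[min(i + 1, last)]) / 3.0
        for i in range(len(signal))
    ]


def unsharp_mask_1d(signal: list[float], amount: float = 0.5) -> list[float]:
    """out = signal + amount * (signal - blur(signal)), clamped to 0..255."""
    blurred = box_blur(signal)
    return [
        min(255.0, max(0.0, s + amount * (s - b)))
        for s, b in zip(signal, blurred)
    ]


step_edge = [50.0, 50.0, 50.0, 50.0, 200.0, 200.0, 200.0, 200.0]
print(unsharp_mask_1d(step_edge, amount=0.5))
# [50.0, 50.0, 50.0, 25.0, 225.0, 200.0, 200.0, 200.0]
print(unsharp_mask_1d(step_edge, amount=0.0) == step_edge)  # True: amount 0 is a no-op
```

Flat regions are untouched because `signal - blur(signal)` is zero there; only the two samples next to the edge move, darker on the low side and brighter on the high side. Raising `amount` widens that overshoot, which is the halo risk the docstring warns about.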
@@ -19,6 +19,8 @@ import warnings
 from pathlib import Path
 from typing import TYPE_CHECKING, Any

+from .noai.watermark_profiles import DEFAULT_MODEL_ID as DEFAULT_SDXL_MODEL_ID
+
 if TYPE_CHECKING:
     from collections.abc import Callable

@@ -37,9 +39,9 @@ logger = logging.getLogger(__name__)

 def is_available() -> bool:
     """Check if invisible watermark removal dependencies are installed."""
-    import importlib.util
+    from .optional_deps import module_available

-    return importlib.util.find_spec("diffusers") is not None and importlib.util.find_spec("torch") is not None
+    return module_available("diffusers", "torch")


 def _target_size(width: int, height: int, max_resolution: int, min_resolution: int = 0) -> tuple[int, int] | None:
@@ -83,7 +85,7 @@ class InvisibleEngine:

     # SDXL base is the default since May 2026; the vendor-adaptive strength
     # removes the current SynthID (see watermark_profiles + docs/synthid.md).
-    DEFAULT_MODEL_ID = "stabilityai/stable-diffusion-xl-base-1.0"
+    DEFAULT_MODEL_ID = DEFAULT_SDXL_MODEL_ID

     def __init__(
         self,
@@ -176,7 +178,7 @@ class InvisibleEngine:
             output_path: Output path (None = overwrite source).
             strength: Denoising strength (0.0-1.0). None -> the vendor-adaptive
                 default.
-            steps: Number of denoising steps.
+            num_inference_steps: Number of denoising steps.
             guidance_scale: Classifier-free guidance scale.
             seed: Random seed for reproducibility.
             humanize: Intensity of Analog Humanizer film grain (0 = off).
@@ -50,9 +50,9 @@ _MATCH_SD1_FRAC = 0.92 # fraction of the 136 string bits that must match

 def is_available() -> bool:
     """True if the optional imwatermark decoder is installed."""
-    import importlib.util
+    from .optional_deps import module_available

-    return importlib.util.find_spec("imwatermark") is not None
+    return module_available("imwatermark")


 def _bits_match(value: int, ref: int, width: int = 48) -> int:
@@ -20,6 +20,11 @@ if TYPE_CHECKING:

 logger = logging.getLogger(__name__)

+# Smaller scan_head window for the cheap marker checks (has_ai_metadata,
+# samsung_genai); the full-detail scans use scan_head's 1 MB default. Sharing
+# one constant also keeps both call sites on the same memoized cache entry.
+_QUICK_SCAN_BYTES = 512 * 1024
+
 # ── Known AI metadata keys ──────────────────────────────────────────

 AI_METADATA_KEYS: frozenset[str] = frozenset(
@@ -306,7 +311,7 @@ def has_ai_metadata(image_path: Path) -> bool:

     # Binary scan covers C2PA (PNG caBX, JPEG APP11, AVIF/HEIF/JXL uuid boxes)
     # and IPTC AI markers in XMP. First 512KB (plus late ISOBMFF provenance boxes).
-    data = scan_head(image_path, 512 * 1024)
+    data = scan_head(image_path, _QUICK_SCAN_BYTES)
     if c2pa_marker_in(data):
         return True
     if any(marker in data for marker in AIGC_MARKERS):
@@ -453,7 +458,7 @@ def samsung_genai(image_path: Path) -> int | None:
     gated on the ``PhotoEditor_Re_Edit_Data`` container so an incidental
     ``genAIType`` token cannot false-positive.
     """
-    head = scan_head(image_path, 512 * 1024)
+    head = scan_head(image_path, _QUICK_SCAN_BYTES)
     if _SAMSUNG_EDITOR_MARKER not in head:
         return None
     m = _SAMSUNG_GENAI_RE.search(head)
@@ -0,0 +1,28 @@
+"""Shared availability guard for optional dependencies.
+
+A bare ``importlib.util.find_spec(name) is not None`` check lies when only a
+leftover data directory exists in site-packages: e.g. ``trustmark`` downloads
+model weights into its own package dir, so after the package is uninstalled
+(``uv sync`` pruning an extra) a ``trustmark/models/`` remnant survives and
+``find_spec`` resolves it to a namespace-package spec (``loader is None``)
+while the actual import fails. Every ``is_available()`` guard routes through
+``module_available`` so a pure namespace package counts as absent.
+"""
+
+from __future__ import annotations
+
+import importlib.util
+
+
+def module_available(*names: str) -> bool:
+    """True when every named module resolves to a real, importable package.
+
+    A spec with ``loader is None`` is a pure namespace package -- for our
+    optional deps that means a stale directory remnant, not an installed
+    package -- so it is treated as not available.
+    """
+    for name in names:
+        spec = importlib.util.find_spec(name)
+        if spec is None or spec.loader is None:
+            return False
+    return True
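To see why the bare `find_spec(name) is not None` check and the guarded check disagree, the sketch below builds a loader-less spec by hand, the same way the accompanying tests do. It re-declares the guard locally so the snippet runs on its own; the injectable `find_spec` parameter and the `fake_find_spec` helper are illustrative assumptions, not part of the module's API.

```python
from __future__ import annotations

import importlib.machinery
import importlib.util
from collections.abc import Callable


def module_available(
    *names: str,
    find_spec: Callable[[str], importlib.machinery.ModuleSpec | None] = importlib.util.find_spec,
) -> bool:
    """Same rule as the guard; ``find_spec`` is injectable only for this demo."""
    for name in names:
        spec = find_spec(name)
        if spec is None or spec.loader is None:
            return False
    return True


# A loader-less spec, standing in for a leftover ``trustmark/models/`` directory.
remnant = importlib.machinery.ModuleSpec("trustmark", loader=None, is_package=True)


def fake_find_spec(name: str) -> importlib.machinery.ModuleSpec | None:
    """Pretend only the stale remnant exists in site-packages."""
    return {"trustmark": remnant}.get(name)


print(fake_find_spec("trustmark") is not None)                  # True: the bare check is fooled
print(module_available("trustmark", find_spec=fake_find_spec))  # False: the guard is not
print(module_available("json"))                                 # True: a real stdlib package
```

The conjunction over `*names` matters for extras that need several packages: one stale member is enough to report the whole extra as unavailable.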
@@ -82,9 +82,9 @@ def erase_cv2(

 def lama_available() -> bool:
     """True when the optional LaMa-ONNX backend can run (onnxruntime installed)."""
-    import importlib.util
+    from .optional_deps import module_available

-    return importlib.util.find_spec("onnxruntime") is not None
+    return module_available("onnxruntime")


 def _get_lama_session() -> object:
@@ -40,9 +40,9 @@ _tm_lock = threading.Lock()

 def is_available() -> bool:
     """True if the optional ``trustmark`` package is installed."""
-    import importlib.util
+    from .optional_deps import module_available

-    return importlib.util.find_spec("trustmark") is not None
+    return module_available("trustmark")


 def _decoder() -> Any:
@@ -8,8 +8,8 @@ The DEFAULT upscaler stays Lanczos (cv2, no deps); this is opt-in via the ``esrg
 extra and feeds the ``--upscaler esrgan`` path. ``spandrel`` is a pure model-loader
 (MIT) with NO basicsr dependency -- it pulls only torch/torchvision/safetensors/numpy/
 einops -- so it sidesteps the basicsr / ``torchvision.transforms.functional_tensor``
-breakage that the ``restore`` (GFPGAN) extra has to shim. Real-ESRGAN weights are
-BSD-3-Clause.
+breakage that the retired ``restore`` (GFPGAN) extra had to shim. Real-ESRGAN weights
+are BSD-3-Clause.

 CPU works but is slow on large inputs, so this is meant for the pre-diffusion upscale of
 SMALL inputs (and the GPU worker). On a memory-constrained host it is a no-op (the extra
@@ -21,7 +21,6 @@ is absent), and the caller falls back to Lanczos.
 # pyright: reportUnknownMemberType=false, reportUnknownArgumentType=false, reportUnknownVariableType=false, reportUnknownParameterType=false, reportMissingTypeArgument=false, reportMissingTypeStubs=false, reportMissingImports=false, reportArgumentType=false, reportAssignmentType=false, reportReturnType=false, reportCallIssue=false, reportIndexIssue=false, reportOperatorIssue=false, reportAttributeAccessIssue=false, reportPrivateImportUsage=false
 from __future__ import annotations

-import importlib.util
 import logging
 import threading
 from pathlib import Path
@@ -45,7 +44,9 @@ _lock = threading.Lock()

 def is_available() -> bool:
     """True if the ``esrgan`` extra (spandrel + torch) is importable."""
-    return importlib.util.find_spec("spandrel") is not None and importlib.util.find_spec("torch") is not None
+    from .optional_deps import module_available
+
+    return module_available("spandrel", "torch")


 def _model_cache_path() -> Path:
@@ -0,0 +1,58 @@
+"""Tests for the shared optional-dependency availability guard."""
+
+from __future__ import annotations
+
+import importlib.machinery
+import importlib.util
+
+from remove_ai_watermarks import optional_deps
+
+
+def _fake_find_spec(specs: dict[str, importlib.machinery.ModuleSpec | None]):
+    def find_spec(name: str) -> importlib.machinery.ModuleSpec | None:
+        return specs[name]
+
+    return find_spec
+
+
+def _real_spec(name: str) -> importlib.machinery.ModuleSpec:
+    spec = importlib.util.find_spec(name)
+    assert spec is not None
+    assert spec.loader is not None
+    return spec
+
+
+class TestModuleAvailable:
+    def test_installed_module_is_available(self):
+        assert optional_deps.module_available("json") is True
+
+    def test_missing_module_is_not_available(self, monkeypatch):
+        monkeypatch.setattr(importlib.util, "find_spec", _fake_find_spec({"ghost": None}))
+        assert optional_deps.module_available("ghost") is False
+
+    def test_namespace_package_remnant_is_not_available(self, monkeypatch):
+        # A leftover data dir in site-packages (e.g. trustmark/models/ surviving
+        # an uninstall) resolves to a namespace-package spec with loader=None;
+        # the guard must not report it as installed.
+        ns_spec = importlib.machinery.ModuleSpec("trustmark", loader=None, is_package=True)
+        assert ns_spec.loader is None
+        monkeypatch.setattr(importlib.util, "find_spec", _fake_find_spec({"trustmark": ns_spec}))
+        assert optional_deps.module_available("trustmark") is False
+
+    def test_any_namespace_member_fails_the_conjunction(self, monkeypatch):
+        ns_spec = importlib.machinery.ModuleSpec("spandrel", loader=None, is_package=True)
+        specs = {"spandrel": ns_spec, "torch": _real_spec("json")}
+        monkeypatch.setattr(importlib.util, "find_spec", _fake_find_spec(specs))
+        assert optional_deps.module_available("torch", "spandrel") is False
+
+    def test_all_real_members_are_available(self):
+        assert optional_deps.module_available("json", "logging") is True
+
+
+class TestGuardsUseSharedHelper:
+    def test_trustmark_is_available_rejects_namespace_remnant(self, monkeypatch):
+        from remove_ai_watermarks import trustmark_detector
+
+        ns_spec = importlib.machinery.ModuleSpec("trustmark", loader=None, is_package=True)
+        monkeypatch.setattr(importlib.util, "find_spec", _fake_find_spec({"trustmark": ns_spec}))
+        assert trustmark_detector.is_available() is False