test(samples): commit real Doubao fixture + AIGC real-sample test

data/samples/doubao-1.png is the real #13 sample: carries the China TC260
<TC260:AIGC> XMP label and a visible '豆包AI生成' text mark (bottom-right).
Grounds the AIGC detection on a real file (alongside the synthetic tests)
and serves as the fixture for visible-watermark removal work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
test-user
2026-05-25 12:37:15 -07:00
parent 1afc1e60ef
commit e27f24f520
2 changed files with 18 additions and 1 deletions
+1 -1
@@ -14,7 +14,7 @@ You are a **principal Python engineer** maintaining a CLI tool and library for r
- `bash maintain.sh` — uv-outdated, uv-secure, ruff check/fix, ruff format, pyright, pytest -n auto
- `maintain.sh` may not finish fully green (pre-existing, not per-change): strict pyright carries debt in `remove_ai_metadata` / `cli.py` (untyped piexif/PIL/click/rich). (`uv-secure` is clean since idna was bumped 3.11 -> 3.16, fixing GHSA-65pc-fj4g-8rjx.) To gate a change, run `uv run ruff check`, `uv run pyright <changed files>`, `uv run pytest` directly.
- Run `uv run` from the repo root — from another cwd it falls back to a bare env without numpy/cv2/torch.
-- Metadata/C2PA tests assert against real committed fixtures in `data/samples/` (`chatgpt-*.png` = OpenAI C2PA, `firefly-1.png` = Adobe, `not-ai-*` = clean); synthetic byte blobs cover the JPEG/ISOBMFF format paths.
+- Metadata/C2PA tests assert against real committed fixtures in `data/samples/` (`chatgpt-*.png` = OpenAI C2PA, `firefly-1.png` = Adobe, `mj-*` = Midjourney IPTC, `doubao-1.png` = ByteDance Doubao with the China TC260 `<TC260:AIGC>` XMP label **and** a visible "豆包AI生成" text mark bottom-right, `not-ai-*` = clean); synthetic byte blobs cover the JPEG/ISOBMFF format paths.
- SynthID reference corpus: `scripts/synthid_corpus.py` ingests labeled images into `data/synthid_corpus/` (`manifest.csv` tracked, `images/` gitignored); see its README for the collection protocol and verification oracles.
## Configuration
+17
@@ -433,3 +433,20 @@ class TestAIGCLabel:
         meta = get_ai_metadata(self._aigc_png(tmp_path))
         assert "aigc_label" in meta
         assert "TC260" in meta["aigc_label"]
+
+
+@pytest.mark.skipif(not (SAMPLES_DIR / "doubao-1.png").exists(), reason="doubao sample not present")
+class TestAIGCRealSample:
+    """Real Doubao (ByteDance) sample carries the China TC260 AIGC XMP label."""
+
+    def test_doubao_aigc_label(self):
+        from remove_ai_watermarks.metadata import aigc_label
+
+        info = aigc_label(SAMPLES_DIR / "doubao-1.png")
+        assert info is not None
+        assert info["Label"] == "1"
+        assert info["ContentProducer"]  # ByteDance producer code present
+
+    def test_doubao_detected_as_ai(self):
+        assert has_ai_metadata(SAMPLES_DIR / "doubao-1.png")
+        assert "aigc_label" in get_ai_metadata(SAMPLES_DIR / "doubao-1.png")
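
The new tests rely on `aigc_label` returning a mapping with `Label` and `ContentProducer` keys. The project's actual implementation is not part of this diff, so the following is only a minimal sketch of how such a lookup might work. It assumes the `<TC260:AIGC>` element holds a JSON object (possibly XML-entity-escaped) and that the XMP packet is stored uncompressed, so a raw byte scan can find it. `sketch_aigc_label` and the `EXAMPLE-PRODUCER` value are hypothetical names used for illustration.

```python
import html
import json
import re
from typing import Optional

# Hypothetical helper, NOT the project's aigc_label(): scans raw bytes for a
# <TC260:AIGC>...</TC260:AIGC> element and decodes its payload as JSON.
_AIGC_RE = re.compile(rb"<TC260:AIGC>(.*?)</TC260:AIGC>", re.DOTALL)


def sketch_aigc_label(data: bytes) -> Optional[dict]:
    """Return the decoded AIGC label mapping, or None if absent or malformed."""
    match = _AIGC_RE.search(data)
    if match is None:
        return None
    # XMP is XML, so quotes inside the payload may be escaped as &quot;.
    payload = html.unescape(match.group(1).decode("utf-8", errors="replace"))
    try:
        parsed = json.loads(payload)
    except json.JSONDecodeError:
        return None
    # Assumption: a valid label is a JSON object, not a list or scalar.
    return parsed if isinstance(parsed, dict) else None


# Synthetic bytes standing in for an image file with an embedded XMP packet.
sample = (
    b"\x89PNG..."
    b"<TC260:AIGC>{&quot;Label&quot;:&quot;1&quot;,"
    b"&quot;ContentProducer&quot;:&quot;EXAMPLE-PRODUCER&quot;}</TC260:AIGC>"
)
info = sketch_aigc_label(sample)
```

A raw byte scan like this would miss a label stored in a compressed PNG text chunk, which is one reason asserting against the committed `doubao-1.png` fixture is more convincing than a synthetic blob alone.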