Victor Kuznetsov 0c215b5b2f feat(identify): C2PA vendor coverage, AI-enhanced split, detect/remove threshold unify
Retained-corpus mining (2026-06-20) surfaced three provenance gaps; all are
oracle-free and regression-guarded.

- C2PA vendor coverage (roadmap): register Volcano Engine under its Chinese
  legal entity 北京火山引擎科技有限公司 (the Latin "volcengine" needle misses
  those certs) -> normalizes to the same ByteDance platform; register ElevenLabs
  ("Eleven Labs Inc.", pure generative-AI) as a generator. Document the
  deliberate exclusion of TikTok Inc. and PixelBin.io/"Fynd" (provenance/transform
  signers, not generators) so they are not re-added.

- AI-generated vs AI-enhanced (roadmap): ProvenanceReport.ai_source_kind splits
  the C2PA digital-source-type into "generated" (trainedAlgorithmicMedia) vs
  "enhanced" (compositeWithTrainedAlgorithmicMedia) so a caller can choose
  between a full-frame scrub and a region-targeted clean. Parsed once in
  noai.c2pa._populate_registry_fields (PNG + any c2pa-python-readable container),
  with a raw head-scan fallback in identify for the non-PNG raw-blob path. CLI
  verdict reads "AI-generated (fully synthetic)" vs "AI-enhanced (real content
  with an AI-composited region)"; surfaced in --json.

- Detect-vs-remove threshold desync (P0#7): identify's sparkle threshold and the
  removal arbitration gate were two independent 0.5 constants. Unify them into the
  single GEMINI_SPARKLE_TRUST_CONF (identify imports it) so they can never drift.
  Lowering the gate to recover faint sub-0.5 sparkles was evaluated and REJECTED:
  a real Doubao text mark scores ~0.40-0.42 as a gemini match with a higher
  core-ring brightness margin than a genuine faint sparkle, so neither confidence
  nor the brightness gate separates them in [0.35, 0.5) -- lowering the gate
  would trade a rare miss for false-positive removals on clean images.
  Regression-guarded by
  TestSparkleDetectRemoveAlignment (real demo sparkle at borderline opacities;
  identify and best_auto_mark must agree on either side of the line).
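
The generated/enhanced split described above can be sketched as a lookup on
the last path segment of the C2PA digital-source-type value. This is a minimal
illustration, not the code in noai.c2pa._populate_registry_fields: the helper
name classify_ai_source_kind and the table _KIND_BY_TERM are hypothetical, and
returning None for any other source type is an assumption.

```python
from typing import Optional

# Hypothetical table: the two digital-source-type terms named in the commit
# message, mapped to the ai_source_kind values it describes.
_KIND_BY_TERM = {
    "trainedAlgorithmicMedia": "generated",
    "compositeWithTrainedAlgorithmicMedia": "enhanced",
}


def classify_ai_source_kind(digital_source_type: Optional[str]) -> Optional[str]:
    """Return "generated", "enhanced", or None (assumed) for other values."""
    if not digital_source_type:
        return None
    # Accept either a bare term or a full IPTC URI by taking the last segment.
    term = digital_source_type.rstrip("/").rsplit("/", 1)[-1]
    return _KIND_BY_TERM.get(term)
```

A caller could then branch on the result: "generated" suggests a full-frame
scrub, "enhanced" a region-targeted clean.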
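
The threshold unification above amounts to both code paths reading one
constant. The sketch below only illustrates that idea: the constant name and
the 0.5 value come from this message, while the two function names and the
inclusive >= comparison are assumptions made for the example.

```python
# Single source of truth; the value 0.5 is the one stated in this commit message.
GEMINI_SPARKLE_TRUST_CONF = 0.5


def identify_reports_sparkle(confidence: float) -> bool:
    """Hypothetical stand-in for the identify-side sparkle check."""
    return confidence >= GEMINI_SPARKLE_TRUST_CONF  # >= is an assumption


def removal_accepts_sparkle(confidence: float) -> bool:
    """Hypothetical stand-in for the removal arbitration gate."""
    return confidence >= GEMINI_SPARKLE_TRUST_CONF  # same constant, same test
```

Because both checks reference the same name, a score such as the ~0.40-0.42
Doubao false match is rejected by both, and any future change to the constant
moves detection and removal together.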

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 15:34:20 -07:00