remove-ai-watermarks/data/synthid_corpus/manifest.csv
test-user 03fb460f77 Track the labeled SynthID corpus; complete metadata-source test coverage
Corpus images were gitignored (local-only). The negatives were reviewed and
cleared for publishing, so the labeled set is now committed (regular git, 65 MB
across 25 files) -- making the removal regression set reproducible and CI-able.

Corpus:
- Track data/synthid_corpus/images/ (pos 9, neg 15, cleaned 1); keep only the
  synthetic refs/ calibration fills gitignored.
- Reconcile manifest.csv to the on-disk files: 117 -> 25 rows (92 dangling rows
  for removed images pruned; dedup left one cleaned output, f6dd47a5).
- Rewrite the corpus README layout/policy (images committed; review every image
  for private content before adding -- public repo, permanent history).
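
The manifest reconciliation described above can be sketched as follows. This is a minimal illustration, not the script used for this commit: the function name `prune_dangling_rows` is hypothetical, and it assumes the manifest has a `filename` column and that images sit somewhere under the images directory (in subfolders or not).

```python
import csv
from pathlib import Path


def prune_dangling_rows(manifest: Path, images_dir: Path) -> tuple[int, int]:
    """Rewrite the manifest, keeping only rows whose file exists on disk.

    Hypothetical helper. Returns (rows_before, rows_after).
    """
    with manifest.open(newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        fieldnames = reader.fieldnames or []
        rows = list(reader)

    # Index every file by name, regardless of which subfolder holds it.
    on_disk = {p.name for p in images_dir.rglob("*") if p.is_file()}
    kept = [row for row in rows if row["filename"] in on_disk]

    with manifest.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(kept)
    return len(rows), len(kept)
```

Matching on file name alone keeps the sketch short; a stricter variant would also compare each file's hash against the `sha256` column.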

Test fixtures:
- Remove data/samples/not-ai-1/2/3 (personal iPhone photos, incl. GPS EXIF).
- Add the clean_photo conftest fixture serving a verified-negative image from
  the corpus neg/ set; repoint the three "non-AI / clean photo" tests onto it
  (skips if the corpus is absent).
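
A fixture along these lines might look like the sketch below. Only the fixture name `clean_photo` and the skip-when-absent behaviour come from this commit; the directory constant, the `first_negative` helper, and the "first file by name" selection rule are assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional

import pytest

# Assumed location of the committed negatives, relative to the repo root.
CORPUS_NEG_DIR = Path("data/synthid_corpus/images/neg")


def first_negative(neg_dir: Path) -> Optional[Path]:
    """Return the first file in neg_dir by name, or None if there is none."""
    if not neg_dir.is_dir():
        return None
    files = sorted(p for p in neg_dir.iterdir() if p.is_file())
    return files[0] if files else None


@pytest.fixture
def clean_photo() -> Path:
    """Serve a verified-negative corpus image; skip when the corpus is absent."""
    photo = first_negative(CORPUS_NEG_DIR)
    if photo is None:
        pytest.skip("SynthID corpus not present")
    return photo
```

Picking the first file by sorted name keeps the fixture deterministic across machines, so a test that uses it always sees the same image.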

Metadata-source coverage (close the last sub-variant gaps):
- c2pa digitalSourceType: algorithmicMedia (procedural, not flagged AI) and
  compositeWithTrainedAlgorithmicMedia (AI + SynthID proxy).
- exif_generator: EXIF Artist and ImageDescription fields (Software/Make/XMP
  CreatorTool were already covered).
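
The digitalSourceType distinction above can be illustrated with a small classifier. This is a sketch under stated assumptions, not the project's detection code: the function name is hypothetical, and treating `trainedAlgorithmicMedia` as AI is an assumption added here (the commit only names `algorithmicMedia` and `compositeWithTrainedAlgorithmicMedia`).

```python
# IPTC digital source type terms treated as AI-generated in this sketch.
# "trainedAlgorithmicMedia" is an assumption; the commit does not mention it.
AI_SOURCE_TYPES = {
    "trainedAlgorithmicMedia",
    "compositeWithTrainedAlgorithmicMedia",
}


def is_ai_source_type(digital_source_type: str) -> bool:
    """Return True when a C2PA digitalSourceType value denotes AI output.

    Accepts either a bare term or a full IPTC URI ending in the term.
    """
    term = digital_source_type.rstrip("/").rsplit("/", 1)[-1]
    return term in AI_SOURCE_TYPES
```

With this rule, `algorithmicMedia` (procedural generation) is not flagged, while `compositeWithTrainedAlgorithmicMedia` is.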

All 8 metadata-source kinds are now tested at both the unit and identify()
level. 313 tests pass. CLAUDE.md updated (corpus tracked, clean_photo fixture).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 14:46:47 -07:00

5.8 KiB

sha256,filename,label,source,model,width,height,format,c2pa_issuer,synthid_metadata,verified_via,added,notes
4ef377bde1a1d4eff141972841938643b173f5052992a018b9a21b31ac31731e,"4ef377bd-ChatGPT Image May 23, 2026, 02_43_02 PM.png",pos,ChatGPT,gpt-image,1254,1254,png,OpenAI,yes,openai-verify,2026-05-23T21:48:12Z,fresh post-rollout 2026-05-23; openai.com/verify: SynthID+C2PA detected
d09f84c0e4c6d8b336bf4a9a7277314e940dcb5052ae7051e785cbb3bb42d656,d09f84c0-Gemini_Generated_Image_vq7wkwvq7wkwvq7w.png,pos,Gemini app,gemini,2816,1536,png,Google LLC,yes,c2pa-metadata,2026-05-23T21:52:40Z,"user: latest Gemini, SynthID v2"
7b650522d42db09568e249c04d683c469fb3e280a2c53fcd1031cb9df27c619a,"7b650522-ChatGPT Image May 24, 2026, 12_19_54 PM.png",pos,ChatGPT,gpt-image,1602,982,png,OpenAI,yes,c2pa-metadata,2026-05-24T19:20:25Z,content: misty pine forest at dawn
fb28dba2a82cc101a92fdee5714867b32610d0564f37737fe4bb70782b8ecf32,fb28dba2-Gemini_Generated_Image_dsjlnsdsjlnsdsjl.png,pos,Gemini app,gemini,2816,1536,png,Google LLC,yes,c2pa-metadata,2026-05-24T19:30:25Z,content: elderly fisherman portrait
d20d4cc936dbdfe909c52502039a9e84ba93d97b42b24a0acee5b7d6c71930ae,d20d4cc9-Gemini_Generated_Image_ug6kdpug6kdpug6k.png,pos,Gemini app,gemini,2816,1536,png,Google LLC,yes,c2pa-metadata,2026-05-24T19:33:15Z,content: red coffee mug product shot
28f323345f6496d936c3f1a72f671ddf59d0f81565c24a63bf3286860f633afe,28f32334-chatgpt_fisherman.png,pos,ChatGPT,gpt-image,1023,1537,png,OpenAI,yes,c2pa-metadata,2026-05-24T19:39:20Z,content: elderly fisherman portrait (fetch+blob dl)
88e61a384c2e0b12d97bc66046e4a10542b2987448ba89c4b49e66311e969c84,88e61a38-chatgpt_tokyo.png,pos,ChatGPT,gpt-image,1023,1537,png,OpenAI,yes,c2pa-metadata,2026-05-24T19:42:02Z,content: tokyo street night (fetch+blob dl)
1fa5f77f710c11a4cc69fed60195450def734401ed57c2600a84ce191f985440,1fa5f77f-IMG_0786.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:47Z,post photo; no C2PA/SynthID (verified)
2ea228ed270cd169e9d38bc3f1a162de7edbf54b6e5ad3c701a3c90010ec7067,2ea228ed-IMG_1791.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:47Z,post photo; no C2PA/SynthID (verified)
0bb4c176d83fbcbf4e628de7587d6183b9134c4f3fa85f55b96e94185273c7f6,0bb4c176-IMG_1790.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:47Z,post photo; no C2PA/SynthID (verified)
3fc0c4253f86427777904f41355eacbdcc1a29f6897ec694c7fc68b6e7f70846,3fc0c425-IMG_1450.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:48Z,post photo; no C2PA/SynthID (verified)
2b148db2b1a314a87647a03828b3d235e7c4c939252f448b572b7658c7bb9723,2b148db2-IMG_1832.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:48Z,post photo; no C2PA/SynthID (verified)
2d02fed0b6d60ce1142eadac9837a83896569dddbdad7cc5cfdec018f0506d36,2d02fed0-IMG_1520.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:48Z,post photo; no C2PA/SynthID (verified)
8f170c06f843c2bcf4cf6e249dd76365287fdb3a322637ba93c570f50cb19772,8f170c06-IMG_1078.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:48Z,post photo; no C2PA/SynthID (verified)
3fc4c8316012abc462df5534d4d995e5af84d35076a2927331107ff994293d9e,3fc4c831-IMG_2566.HEIC,neg,iPhone (me/post_photos),,5712,4284,heic,,,none,2026-05-24T20:58:48Z,post photo; no C2PA/SynthID (verified)
5fe59521d579b536340253d2bcaa7c28ddd2485fa964fc27a9b9e5118ad0cdd1,5fe59521-IMG_1496.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:48Z,post photo; no C2PA/SynthID (verified)
4ead7918a5aeea7fba78b36af44358fea7ed1db7f46f4bc675152b9d04e68c38,4ead7918-IMG_1300.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:48Z,post photo; no C2PA/SynthID (verified)
4c795238b89178cf52a3674e721ed7f4cd5068028385491d12171a3c62545c35,4c795238-IMG_3034.HEIC,neg,iPhone (me/post_photos),,5712,4284,heic,,,none,2026-05-24T20:58:49Z,post photo; no C2PA/SynthID (verified)
5abfaccb37c549de67c7f3a751a528a423e24a82c643fdb29094be8debbb206d,5abfaccb-IMG_3018.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:49Z,post photo; no C2PA/SynthID (verified)
5fed1923d513c1e9ffcba2f240e617fa9344fb39d35144169570ece8b0bd0f33,5fed1923-IMG_0272.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:49Z,post photo; no C2PA/SynthID (verified)
06b04d8fe8e1cd6bee9a973f93bfda37586924cbfec7d372f59d52aa9196160b,06b04d8f-IMG_1474.HEIC,neg,iPhone (me/post_photos),,5712,4284,heic,,,none,2026-05-24T20:58:49Z,post photo; no C2PA/SynthID (verified)
8fdb574a94e65e14ac29017cf2d5a2ede18a8c4e3f12e04c64292b0d38570062,8fdb574a-IMG_3557.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:49Z,post photo; no C2PA/SynthID (verified)
c86973424817f62510e2a312b85c52e05adf47ace87a8e717fd442607596f501,c8697342-aistudio_lake.png,pos,Google AI Studio (Nano Banana),gemini-2.5-flash-image,1024,1024,png,,,gemini-app,2026-05-24T21:39:09Z,"API/playground: SynthID pixel CONFIRMED (Gemini-app oracle) + visible sparkle, but NO C2PA/IPTC -> synthid_source blind spot"
1f81827c06d67cf6f6c7f5d53ec8f9738183942a6d1d2717b161fea0fdcc540a,1f81827c-Designer.png,pos,Microsoft Designer,dall-e (Designer),1024,1024,png,"OpenAI, Microsoft",yes,c2pa-metadata,2026-05-24T22:18:40Z,C2PA issuer OpenAI+Microsoft; synthid_source=OpenAI (DALL-E surface inherits OpenAI SynthID+C2PA)
f6dd47a5ffd319aea21bf10dcf9877097666420b02c2620080bac12b03976e7e,f6dd47a5-4ef377bd-gpt-image-2-cleaned.png,cleaned,"our pipeline (invisible/SDXL, native-res default)",stabilityai/stable-diffusion-xl-base-1.0,1254,1254,png,,,openai-verify,2026-05-25T20:50:38Z,"cleaned from 4ef377bd via v0.5.3 'all' at native 1254x1254 (prod-equivalent); openai.com/verify: SynthID NOT detected. Re-confirms #10 native-res default defeats OpenAI SynthID (closes #15 root cause). Note: native res OOMs on 20GB MPS, auto-fell back to CPU."
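
Each manifest row pairs a SHA-256 digest with a file name that begins with the first eight hex characters of that digest. A consistency check built on that convention could look like the sketch below. The function names are hypothetical, and the check assumes the images live somewhere under the directory passed in.

```python
import csv
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large images are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_manifest(manifest: Path, images_dir: Path) -> tuple[Counter, list[str]]:
    """Return per-label row counts and a list of human-readable problems."""
    files = {p.name: p for p in images_dir.rglob("*") if p.is_file()}
    counts: Counter = Counter()
    problems: list[str] = []
    with manifest.open(newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            name, digest = row["filename"], row["sha256"]
            counts[row["label"]] += 1
            # Naming convention: "<first 8 hex chars of sha256>-<original name>".
            if not name.startswith(digest[:8] + "-"):
                problems.append(f"{name}: filename prefix does not match sha256")
            path = files.get(name)
            if path is None:
                problems.append(f"{name}: missing on disk")
            elif sha256_of(path) != digest:
                problems.append(f"{name}: sha256 mismatch")
    return counts, problems
```

Run in CI, an empty problem list together with label counts of 9 pos, 15 neg, and 1 cleaned would confirm that the manifest and the committed images agree.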