remove-ai-watermarks/data/synthid_corpus/manifest.csv
test-user 03fb460f77 Track the labeled SynthID corpus; complete metadata-source test coverage
Corpus images were gitignored (local-only). The negatives were reviewed and
cleared for publishing, so the labeled set is now committed (regular git, 65 MB
across 25 files) -- making the removal regression set reproducible and CI-able.

Corpus:
- Track data/synthid_corpus/images/ (pos 9, neg 15, cleaned 1); keep only the
  synthetic refs/ calibration fills gitignored.
- Reconcile manifest.csv to the on-disk files: 117 -> 25 rows (92 dangling rows
  for removed images pruned; dedup left one cleaned output, f6dd47a5).
- Rewrite the corpus README layout/policy (images committed; review every image
  for private content before adding -- public repo, permanent history).
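
A minimal sketch of the manifest-to-disk reconciliation described above, not
the script actually used for this commit. It assumes the `filename` column
holds the image's base name and that images sit somewhere under `images/`;
`reconcile` and `sha256_of` are hypothetical helper names.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large images are not loaded whole."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def reconcile(manifest: Path, images_dir: Path):
    """Return (dangling, untracked, mismatched) filename lists.

    dangling:   rows in the manifest with no file on disk
    untracked:  files on disk with no manifest row
    mismatched: files whose sha256 differs from the manifest value
    """
    with manifest.open(newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

    # Assumption: base names are unique across the label subdirectories.
    on_disk = {p.name: p for p in images_dir.rglob("*") if p.is_file()}
    listed = {row["filename"] for row in rows}

    dangling = sorted(listed - set(on_disk))
    untracked = sorted(set(on_disk) - listed)
    mismatched = sorted(
        row["filename"]
        for row in rows
        if row["filename"] in on_disk
        and sha256_of(on_disk[row["filename"]]) != row["sha256"]
    )
    return dangling, untracked, mismatched
```

Run against `data/synthid_corpus/manifest.csv` and `data/synthid_corpus/images/`,
three empty lists would mean the manifest and the committed files agree.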

Test fixtures:
- Remove data/samples/not-ai-1/2/3 (personal iPhone photos, incl. GPS EXIF).
- Add the clean_photo conftest fixture serving a verified-negative image from
  the corpus neg/ set; repoint the three "non-AI / clean photo" tests onto it
  (skips if the corpus is absent).
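
The fixture behaviour above could look roughly like the following. This is a
sketch under assumptions, not the repository's conftest: the `images/neg`
path and the `first_negative` helper are hypothetical, and picking the first
file by name is just one deterministic choice.

```python
from pathlib import Path
from typing import Optional

import pytest

# Assumed location relative to the repo root; adjust to the real layout.
CORPUS_NEG = Path("data/synthid_corpus/images/neg")


def first_negative(neg_dir: Path) -> Optional[Path]:
    """Return the first file in neg_dir (sorted by name), or None if missing/empty."""
    if not neg_dir.is_dir():
        return None
    files = sorted(p for p in neg_dir.iterdir() if p.is_file())
    return files[0] if files else None


@pytest.fixture
def clean_photo() -> Path:
    """Serve one verified-negative image; skip the test if the corpus is absent."""
    photo = first_negative(CORPUS_NEG)
    if photo is None:
        pytest.skip("SynthID corpus not present")
    return photo
```

Keeping the lookup in a plain function means it can be checked without running
pytest, while the fixture itself only decides between returning and skipping.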

Metadata-source coverage (close the last sub-variant gaps):
- c2pa digitalSourceType: algorithmicMedia (procedural, not flagged AI) and
  compositeWithTrainedAlgorithmicMedia (AI + SynthID proxy).
- exif_generator: EXIF Artist and ImageDescription fields (Software/Make/XMP
  CreatorTool were already covered).
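
To illustrate the digitalSourceType distinction above, here is a hedged
sketch of the classification. The function name and the set contents are
assumptions for illustration, not the project's code; treating
`trainedAlgorithmicMedia` as AI is likewise an assumption, since the commit
only names the other two terms. The URI prefix is the IPTC Digital Source
Type vocabulary.

```python
from typing import Optional

IPTC_PREFIX = "http://cv.iptc.org/newscodes/digitalsourcetype/"

# Assumed set of terms treated as AI-generated in this sketch.
AI_SOURCE_TYPES = {
    "trainedAlgorithmicMedia",
    "compositeWithTrainedAlgorithmicMedia",
}


def is_ai_digital_source_type(value: Optional[str]) -> bool:
    """True if the digitalSourceType term denotes trained-model output.

    Accepts either a full IPTC URI or a bare term. Procedural
    'algorithmicMedia' is deliberately not in the AI set.
    """
    if not value:
        return False
    term = value.rstrip("/").rsplit("/", 1)[-1]
    return term in AI_SOURCE_TYPES
```

Matching on the exact final path segment, rather than a substring, is what
keeps `algorithmicMedia` from being flagged just because the AI terms contain
the same word.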

All 8 metadata-source kinds are now tested at both the unit and identify()
level. 313 tests pass. CLAUDE.md updated (corpus tracked, clean_photo fixture).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 14:46:47 -07:00


sha256,filename,label,source,model,width,height,format,c2pa_issuer,synthid_metadata,verified_via,added,notes
4ef377bde1a1d4eff141972841938643b173f5052992a018b9a21b31ac31731e,"4ef377bd-ChatGPT Image May 23, 2026, 02_43_02 PM.png",pos,ChatGPT,gpt-image,1254,1254,png,OpenAI,yes,openai-verify,2026-05-23T21:48:12Z,fresh post-rollout 2026-05-23; openai.com/verify: SynthID+C2PA detected
d09f84c0e4c6d8b336bf4a9a7277314e940dcb5052ae7051e785cbb3bb42d656,d09f84c0-Gemini_Generated_Image_vq7wkwvq7wkwvq7w.png,pos,Gemini app,gemini,2816,1536,png,Google LLC,yes,c2pa-metadata,2026-05-23T21:52:40Z,"user: latest Gemini, SynthID v2"
7b650522d42db09568e249c04d683c469fb3e280a2c53fcd1031cb9df27c619a,"7b650522-ChatGPT Image May 24, 2026, 12_19_54 PM.png",pos,ChatGPT,gpt-image,1602,982,png,OpenAI,yes,c2pa-metadata,2026-05-24T19:20:25Z,content: misty pine forest at dawn
fb28dba2a82cc101a92fdee5714867b32610d0564f37737fe4bb70782b8ecf32,fb28dba2-Gemini_Generated_Image_dsjlnsdsjlnsdsjl.png,pos,Gemini app,gemini,2816,1536,png,Google LLC,yes,c2pa-metadata,2026-05-24T19:30:25Z,content: elderly fisherman portrait
d20d4cc936dbdfe909c52502039a9e84ba93d97b42b24a0acee5b7d6c71930ae,d20d4cc9-Gemini_Generated_Image_ug6kdpug6kdpug6k.png,pos,Gemini app,gemini,2816,1536,png,Google LLC,yes,c2pa-metadata,2026-05-24T19:33:15Z,content: red coffee mug product shot
28f323345f6496d936c3f1a72f671ddf59d0f81565c24a63bf3286860f633afe,28f32334-chatgpt_fisherman.png,pos,ChatGPT,gpt-image,1023,1537,png,OpenAI,yes,c2pa-metadata,2026-05-24T19:39:20Z,content: elderly fisherman portrait (fetch+blob dl)
88e61a384c2e0b12d97bc66046e4a10542b2987448ba89c4b49e66311e969c84,88e61a38-chatgpt_tokyo.png,pos,ChatGPT,gpt-image,1023,1537,png,OpenAI,yes,c2pa-metadata,2026-05-24T19:42:02Z,content: tokyo street night (fetch+blob dl)
1fa5f77f710c11a4cc69fed60195450def734401ed57c2600a84ce191f985440,1fa5f77f-IMG_0786.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:47Z,post photo; no C2PA/SynthID (verified)
2ea228ed270cd169e9d38bc3f1a162de7edbf54b6e5ad3c701a3c90010ec7067,2ea228ed-IMG_1791.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:47Z,post photo; no C2PA/SynthID (verified)
0bb4c176d83fbcbf4e628de7587d6183b9134c4f3fa85f55b96e94185273c7f6,0bb4c176-IMG_1790.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:47Z,post photo; no C2PA/SynthID (verified)
3fc0c4253f86427777904f41355eacbdcc1a29f6897ec694c7fc68b6e7f70846,3fc0c425-IMG_1450.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:48Z,post photo; no C2PA/SynthID (verified)
2b148db2b1a314a87647a03828b3d235e7c4c939252f448b572b7658c7bb9723,2b148db2-IMG_1832.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:48Z,post photo; no C2PA/SynthID (verified)
2d02fed0b6d60ce1142eadac9837a83896569dddbdad7cc5cfdec018f0506d36,2d02fed0-IMG_1520.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:48Z,post photo; no C2PA/SynthID (verified)
8f170c06f843c2bcf4cf6e249dd76365287fdb3a322637ba93c570f50cb19772,8f170c06-IMG_1078.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:48Z,post photo; no C2PA/SynthID (verified)
3fc4c8316012abc462df5534d4d995e5af84d35076a2927331107ff994293d9e,3fc4c831-IMG_2566.HEIC,neg,iPhone (me/post_photos),,5712,4284,heic,,,none,2026-05-24T20:58:48Z,post photo; no C2PA/SynthID (verified)
5fe59521d579b536340253d2bcaa7c28ddd2485fa964fc27a9b9e5118ad0cdd1,5fe59521-IMG_1496.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:48Z,post photo; no C2PA/SynthID (verified)
4ead7918a5aeea7fba78b36af44358fea7ed1db7f46f4bc675152b9d04e68c38,4ead7918-IMG_1300.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:48Z,post photo; no C2PA/SynthID (verified)
4c795238b89178cf52a3674e721ed7f4cd5068028385491d12171a3c62545c35,4c795238-IMG_3034.HEIC,neg,iPhone (me/post_photos),,5712,4284,heic,,,none,2026-05-24T20:58:49Z,post photo; no C2PA/SynthID (verified)
5abfaccb37c549de67c7f3a751a528a423e24a82c643fdb29094be8debbb206d,5abfaccb-IMG_3018.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:49Z,post photo; no C2PA/SynthID (verified)
5fed1923d513c1e9ffcba2f240e617fa9344fb39d35144169570ece8b0bd0f33,5fed1923-IMG_0272.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:49Z,post photo; no C2PA/SynthID (verified)
06b04d8fe8e1cd6bee9a973f93bfda37586924cbfec7d372f59d52aa9196160b,06b04d8f-IMG_1474.HEIC,neg,iPhone (me/post_photos),,5712,4284,heic,,,none,2026-05-24T20:58:49Z,post photo; no C2PA/SynthID (verified)
8fdb574a94e65e14ac29017cf2d5a2ede18a8c4e3f12e04c64292b0d38570062,8fdb574a-IMG_3557.HEIC,neg,iPhone (me/post_photos),,4032,3024,heic,,,none,2026-05-24T20:58:49Z,post photo; no C2PA/SynthID (verified)
c86973424817f62510e2a312b85c52e05adf47ace87a8e717fd442607596f501,c8697342-aistudio_lake.png,pos,Google AI Studio (Nano Banana),gemini-2.5-flash-image,1024,1024,png,,,gemini-app,2026-05-24T21:39:09Z,"API/playground: SynthID pixel CONFIRMED (Gemini-app oracle) + visible sparkle, but NO C2PA/IPTC -> synthid_source blind spot"
1f81827c06d67cf6f6c7f5d53ec8f9738183942a6d1d2717b161fea0fdcc540a,1f81827c-Designer.png,pos,Microsoft Designer,dall-e (Designer),1024,1024,png,"OpenAI, Microsoft",yes,c2pa-metadata,2026-05-24T22:18:40Z,C2PA issuer OpenAI+Microsoft; synthid_source=OpenAI (DALL-E surface inherits OpenAI SynthID+C2PA)
f6dd47a5ffd319aea21bf10dcf9877097666420b02c2620080bac12b03976e7e,f6dd47a5-4ef377bd-gpt-image-2-cleaned.png,cleaned,"our pipeline (invisible/SDXL, native-res default)",stabilityai/stable-diffusion-xl-base-1.0,1254,1254,png,,,openai-verify,2026-05-25T20:50:38Z,"cleaned from 4ef377bd via v0.5.3 'all' at native 1254x1254 (prod-equivalent); openai.com/verify: SynthID NOT detected. Re-confirms #10 native-res default defeats OpenAI SynthID (closes #15 root cause). Note: native res OOMs on 20GB MPS, auto-fell back to CPU."