mirror of
https://github.com/wiltodelta/remove-ai-watermarks.git
synced 2026-06-05 10:38:00 +02:00
03fb460f77
Corpus images were gitignored (local-only). The negatives were reviewed and cleared for publishing, so the labeled set is now committed (regular git, 65 MB across 25 files) -- making the removal regression set reproducible and CI-able.

Corpus:
- Track data/synthid_corpus/images/ (pos 9, neg 15, cleaned 1); keep only the synthetic refs/ calibration fills gitignored.
- Reconcile manifest.csv to the on-disk files: 117 -> 25 rows (92 dangling rows for removed images pruned; dedup left one cleaned output, f6dd47a5).
- Rewrite the corpus README layout/policy (images committed; review every image for private content before adding -- public repo, permanent history).

Test fixtures:
- Remove data/samples/not-ai-1/2/3 (personal iPhone photos, incl. GPS EXIF).
- Add the clean_photo conftest fixture serving a verified-negative image from the corpus neg/ set; repoint the three "non-AI / clean photo" tests onto it (skips if the corpus is absent).

Metadata-source coverage (close the last sub-variant gaps):
- c2pa digitalSourceType: algorithmicMedia (procedural, not flagged AI) and compositeWithTrainedAlgorithmicMedia (AI + SynthID proxy).
- exif_generator: EXIF Artist and ImageDescription fields (Software/Make/XMP CreatorTool were already covered).

All 8 metadata-source kinds are now tested at both the unit and identify() level.

313 tests pass. CLAUDE.md updated (corpus tracked, clean_photo fixture).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
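The manifest reconciliation described above (pruning rows whose image is no longer on disk) could look roughly like the sketch below. This is not the repository's actual script: the function name `reconcile_manifest` is hypothetical, and the sketch assumes only that manifest.csv has a `filename` column and that images live somewhere under the given directory.

```python
import csv
from pathlib import Path


def reconcile_manifest(manifest: Path, images_dir: Path) -> tuple[int, int]:
    """Drop manifest rows whose image file is missing.

    Returns (rows kept, rows pruned). Hypothetical helper, not the repo's API.
    """
    # Index every file under images_dir by bare filename, so the check does
    # not depend on the exact pos/neg/cleaned sub-directory layout.
    on_disk = {p.name for p in images_dir.rglob("*") if p.is_file()}

    with manifest.open(newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        fieldnames = reader.fieldnames or []
        rows = list(reader)

    # Keep only rows that still point at a file on disk.
    kept = [row for row in rows if row["filename"] in on_disk]

    # Rewrite the manifest in place with the surviving rows.
    with manifest.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(kept)

    return len(kept), len(rows) - len(kept)
```

Matching on the bare filename keeps the sketch layout-agnostic. A stricter version would also compare each file's SHA-256 against the `sha256` column.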
5.8 KiB
| sha256 | filename | label | source | model | width | height | format | c2pa_issuer | synthid_metadata | verified_via | added | notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4ef377bde1a1d4eff141972841938643b173f5052992a018b9a21b31ac31731e | 4ef377bd-ChatGPT Image May 23, 2026, 02_43_02 PM.png | pos | ChatGPT | gpt-image | 1254 | 1254 | png | OpenAI | yes | openai-verify | 2026-05-23T21:48:12Z | fresh post-rollout 2026-05-23; openai.com/verify: SynthID+C2PA detected |
| d09f84c0e4c6d8b336bf4a9a7277314e940dcb5052ae7051e785cbb3bb42d656 | d09f84c0-Gemini_Generated_Image_vq7wkwvq7wkwvq7w.png | pos | Gemini app | gemini | 2816 | 1536 | png | Google LLC | yes | c2pa-metadata | 2026-05-23T21:52:40Z | user: latest Gemini, SynthID v2 |
| 7b650522d42db09568e249c04d683c469fb3e280a2c53fcd1031cb9df27c619a | 7b650522-ChatGPT Image May 24, 2026, 12_19_54 PM.png | pos | ChatGPT | gpt-image | 1602 | 982 | png | OpenAI | yes | c2pa-metadata | 2026-05-24T19:20:25Z | content: misty pine forest at dawn |
| fb28dba2a82cc101a92fdee5714867b32610d0564f37737fe4bb70782b8ecf32 | fb28dba2-Gemini_Generated_Image_dsjlnsdsjlnsdsjl.png | pos | Gemini app | gemini | 2816 | 1536 | png | Google LLC | yes | c2pa-metadata | 2026-05-24T19:30:25Z | content: elderly fisherman portrait |
| d20d4cc936dbdfe909c52502039a9e84ba93d97b42b24a0acee5b7d6c71930ae | d20d4cc9-Gemini_Generated_Image_ug6kdpug6kdpug6k.png | pos | Gemini app | gemini | 2816 | 1536 | png | Google LLC | yes | c2pa-metadata | 2026-05-24T19:33:15Z | content: red coffee mug product shot |
| 28f323345f6496d936c3f1a72f671ddf59d0f81565c24a63bf3286860f633afe | 28f32334-chatgpt_fisherman.png | pos | ChatGPT | gpt-image | 1023 | 1537 | png | OpenAI | yes | c2pa-metadata | 2026-05-24T19:39:20Z | content: elderly fisherman portrait (fetch+blob dl) |
| 88e61a384c2e0b12d97bc66046e4a10542b2987448ba89c4b49e66311e969c84 | 88e61a38-chatgpt_tokyo.png | pos | ChatGPT | gpt-image | 1023 | 1537 | png | OpenAI | yes | c2pa-metadata | 2026-05-24T19:42:02Z | content: tokyo street night (fetch+blob dl) |
| 1fa5f77f710c11a4cc69fed60195450def734401ed57c2600a84ce191f985440 | 1fa5f77f-IMG_0786.HEIC | neg | iPhone (me/post_photos) | | 4032 | 3024 | heic | | | none | 2026-05-24T20:58:47Z | post photo; no C2PA/SynthID (verified) |
| 2ea228ed270cd169e9d38bc3f1a162de7edbf54b6e5ad3c701a3c90010ec7067 | 2ea228ed-IMG_1791.HEIC | neg | iPhone (me/post_photos) | | 4032 | 3024 | heic | | | none | 2026-05-24T20:58:47Z | post photo; no C2PA/SynthID (verified) |
| 0bb4c176d83fbcbf4e628de7587d6183b9134c4f3fa85f55b96e94185273c7f6 | 0bb4c176-IMG_1790.HEIC | neg | iPhone (me/post_photos) | | 4032 | 3024 | heic | | | none | 2026-05-24T20:58:47Z | post photo; no C2PA/SynthID (verified) |
| 3fc0c4253f86427777904f41355eacbdcc1a29f6897ec694c7fc68b6e7f70846 | 3fc0c425-IMG_1450.HEIC | neg | iPhone (me/post_photos) | | 4032 | 3024 | heic | | | none | 2026-05-24T20:58:48Z | post photo; no C2PA/SynthID (verified) |
| 2b148db2b1a314a87647a03828b3d235e7c4c939252f448b572b7658c7bb9723 | 2b148db2-IMG_1832.HEIC | neg | iPhone (me/post_photos) | | 4032 | 3024 | heic | | | none | 2026-05-24T20:58:48Z | post photo; no C2PA/SynthID (verified) |
| 2d02fed0b6d60ce1142eadac9837a83896569dddbdad7cc5cfdec018f0506d36 | 2d02fed0-IMG_1520.HEIC | neg | iPhone (me/post_photos) | | 4032 | 3024 | heic | | | none | 2026-05-24T20:58:48Z | post photo; no C2PA/SynthID (verified) |
| 8f170c06f843c2bcf4cf6e249dd76365287fdb3a322637ba93c570f50cb19772 | 8f170c06-IMG_1078.HEIC | neg | iPhone (me/post_photos) | | 4032 | 3024 | heic | | | none | 2026-05-24T20:58:48Z | post photo; no C2PA/SynthID (verified) |
| 3fc4c8316012abc462df5534d4d995e5af84d35076a2927331107ff994293d9e | 3fc4c831-IMG_2566.HEIC | neg | iPhone (me/post_photos) | | 5712 | 4284 | heic | | | none | 2026-05-24T20:58:48Z | post photo; no C2PA/SynthID (verified) |
| 5fe59521d579b536340253d2bcaa7c28ddd2485fa964fc27a9b9e5118ad0cdd1 | 5fe59521-IMG_1496.HEIC | neg | iPhone (me/post_photos) | | 4032 | 3024 | heic | | | none | 2026-05-24T20:58:48Z | post photo; no C2PA/SynthID (verified) |
| 4ead7918a5aeea7fba78b36af44358fea7ed1db7f46f4bc675152b9d04e68c38 | 4ead7918-IMG_1300.HEIC | neg | iPhone (me/post_photos) | | 4032 | 3024 | heic | | | none | 2026-05-24T20:58:48Z | post photo; no C2PA/SynthID (verified) |
| 4c795238b89178cf52a3674e721ed7f4cd5068028385491d12171a3c62545c35 | 4c795238-IMG_3034.HEIC | neg | iPhone (me/post_photos) | | 5712 | 4284 | heic | | | none | 2026-05-24T20:58:49Z | post photo; no C2PA/SynthID (verified) |
| 5abfaccb37c549de67c7f3a751a528a423e24a82c643fdb29094be8debbb206d | 5abfaccb-IMG_3018.HEIC | neg | iPhone (me/post_photos) | | 4032 | 3024 | heic | | | none | 2026-05-24T20:58:49Z | post photo; no C2PA/SynthID (verified) |
| 5fed1923d513c1e9ffcba2f240e617fa9344fb39d35144169570ece8b0bd0f33 | 5fed1923-IMG_0272.HEIC | neg | iPhone (me/post_photos) | | 4032 | 3024 | heic | | | none | 2026-05-24T20:58:49Z | post photo; no C2PA/SynthID (verified) |
| 06b04d8fe8e1cd6bee9a973f93bfda37586924cbfec7d372f59d52aa9196160b | 06b04d8f-IMG_1474.HEIC | neg | iPhone (me/post_photos) | | 5712 | 4284 | heic | | | none | 2026-05-24T20:58:49Z | post photo; no C2PA/SynthID (verified) |
| 8fdb574a94e65e14ac29017cf2d5a2ede18a8c4e3f12e04c64292b0d38570062 | 8fdb574a-IMG_3557.HEIC | neg | iPhone (me/post_photos) | | 4032 | 3024 | heic | | | none | 2026-05-24T20:58:49Z | post photo; no C2PA/SynthID (verified) |
| c86973424817f62510e2a312b85c52e05adf47ace87a8e717fd442607596f501 | c8697342-aistudio_lake.png | pos | Google AI Studio (Nano Banana) | gemini-2.5-flash-image | 1024 | 1024 | png | | | gemini-app | 2026-05-24T21:39:09Z | API/playground: SynthID pixel CONFIRMED (Gemini-app oracle) + visible sparkle, but NO C2PA/IPTC -> synthid_source blind spot |
| 1f81827c06d67cf6f6c7f5d53ec8f9738183942a6d1d2717b161fea0fdcc540a | 1f81827c-Designer.png | pos | Microsoft Designer | dall-e (Designer) | 1024 | 1024 | png | OpenAI, Microsoft | yes | c2pa-metadata | 2026-05-24T22:18:40Z | C2PA issuer OpenAI+Microsoft; synthid_source=OpenAI (DALL-E surface inherits OpenAI SynthID+C2PA) |
| f6dd47a5ffd319aea21bf10dcf9877097666420b02c2620080bac12b03976e7e | f6dd47a5-4ef377bd-gpt-image-2-cleaned.png | cleaned | our pipeline (invisible/SDXL, native-res default) | stabilityai/stable-diffusion-xl-base-1.0 | 1254 | 1254 | png | | | openai-verify | 2026-05-25T20:50:38Z | cleaned from 4ef377bd via v0.5.3 'all' at native 1254x1254 (prod-equivalent); openai.com/verify: SynthID NOT detected. Re-confirms #10 native-res default defeats OpenAI SynthID (closes #15 root cause). Note: native res OOMs on 20GB MPS, auto-fell back to CPU. |
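
Every filename in the table starts with the first eight hex characters of its `sha256` value followed by a hyphen. A small consistency check over a manifest row could be sketched as below. The helper names `sha256_of` and `row_problems` are hypothetical and not part of the repository; the content-hash comparison runs only when a path to the image is supplied.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Hex SHA-256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def row_problems(row: dict, image: Optional[Path] = None) -> list[str]:
    """Return a list of inconsistencies for one manifest row (empty if none)."""
    problems = []
    sha, name = row["sha256"], row["filename"]
    # Naming convention observed in the table: "<first 8 hex of sha256>-<name>".
    if not name.startswith(sha[:8] + "-"):
        problems.append("filename prefix does not match sha256")
    # Optional: compare against the actual file content when it is available.
    if image is not None and sha256_of(image) != sha:
        problems.append("file content does not match sha256")
    return problems


# Usage with the first row of the table (no image path, prefix check only).
first_row = {
    "sha256": "4ef377bde1a1d4eff141972841938643b173f5052992a018b9a21b31ac31731e",
    "filename": "4ef377bd-ChatGPT Image May 23, 2026, 02_43_02 PM.png",
}
print(row_problems(first_row))
```

Running a check like this in CI would catch a renamed or re-encoded image before it silently diverges from the manifest.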