Commit Graph

45 Commits

Author SHA1 Message Date
Alosh Denny 93cf20df1b Fix link to SynthID visualizer in README
Updated the link to the SynthID visualizer to the correct URL.
v3
2026-04-14 01:16:17 +05:30
Alosh Denny 130b9cf68e Update README with visualizer link for SynthID
Added a link to a visualizer for SynthID watermarking.
2026-04-14 01:15:25 +05:30
Alosh Denny d012872d7d Merge pull request #23 from mrbeandev/improve-detection-accuracy
Fix detection: empirically verified carrier frequencies
2026-04-12 19:30:20 +05:30
mrbeandev defeb41f74 Fix detection accuracy: replace wrong carrier frequencies with empirically verified ones
The hardcoded carrier positions (48,0), (96,0), (0,88) etc. had low phase
coherence on actual Gemini images (~0.16-0.55). Detection was 80% on
reference images with 100% false positive rate on non-watermarked images.

Root cause analysis across 291 watermarked + 16 non-watermarked images
revealed:

1. The watermark is content-adaptive — dark images use diagonal-grid
   carriers at (±3,±4), (±5,±3) etc. while white images use horizontal-
   axis carriers at (0,±7), (0,±8), (0,±9) etc.

2. Both sets have >0.95 intra-set phase coherence and >0.5 discriminative
   gap vs non-watermarked images.

3. Previous metrics (noise correlation, structure ratio bounds, raw carrier
   magnitude) had heavy overlap between watermarked and non-watermarked
   content images and were not discriminative.

Changes:
- Replace carrier list with empirically verified dark + white carrier sets
- Add per-set reference phase templates to codebook (carrier_refs)
- Rewrite detect_array to try both carrier sets and take best phase match
- Use phase agreement as primary signal (WM: 0.92-0.99 vs non-WM: 0.47-0.71)
- Add noise-domain carrier-vs-random ratio as supporting signal
- Skip expensive multi-scale consistency computation (phase match is decisive)
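
The phase-agreement signal named in the changes above can be illustrated with a small sketch. This is not the repository's detect_array: the function names, the (fy, fx) coordinate order, and the scoring formula (mean cosine of the phase difference, mapped to [0, 1]) are assumptions made for illustration, and the test image is synthetic.

```python
import numpy as np

def phase_agreement(image, carriers, ref_phases):
    """Score how closely the image's phases at the carrier bins match a template.

    image: 2D float array (grayscale).
    carriers: list of (fy, fx) integer frequency offsets (assumed order; may be negative).
    ref_phases: one reference phase in radians per carrier.
    Returns a value in [0, 1]; 1.0 means every phase matches the template.
    """
    spectrum = np.fft.fft2(image)
    h, w = image.shape
    # Negative offsets wrap around, matching NumPy's FFT bin layout.
    phases = np.array([np.angle(spectrum[fy % h, fx % w]) for fy, fx in carriers])
    diff = phases - np.asarray(ref_phases)
    # cos(diff) is 1 for a perfect match and -1 for opposite phase.
    return float((np.mean(np.cos(diff)) + 1.0) / 2.0)

def best_set_score(image, carrier_sets):
    """Try every carrier set (name -> (carriers, ref_phases)) and keep the best match."""
    return max(phase_agreement(image, c, p) for c, p in carrier_sets.values())

# Synthetic check: embed two cosine carriers with known phases, then score them.
h, w = 64, 64
yy, xx = np.mgrid[0:h, 0:w]
carriers = [(3, 4), (5, 3)]   # illustrative positions only
ref = [0.5, -1.0]             # illustrative phases only
img = sum(np.cos(2 * np.pi * (fy * yy / h + fx * xx / w) + p)
          for (fy, fx), p in zip(carriers, ref))
score = phase_agreement(img, carriers, ref)
```

Taking the maximum over carrier sets mirrors the "try both carrier sets and take best phase match" step; any decision threshold would have to be tuned on real data.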

Results on full dataset:
- Watermarked:     99.0% detection (was ~80%)
- Non-watermarked: 0% false positives (was 100%)
- Overall:         98.7% accuracy (was ~80% with no FP testing)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 14:43:04 +05:30
Alosh Denny 245c375d27 Merge pull request #22 from mrbeandev/migrate-images-to-huggingface
Migrate reference images to Hugging Face, remove from git
2026-04-11 11:03:17 +05:30
mrbeandev b308c4b341 Migrate reference images to Hugging Face, remove from git
Images are now hosted at https://huggingface.co/datasets/aoxo/reverse-synthid
to keep the git repo lightweight (~1.3GB of images removed).

Changes:
- Remove gemini_black/, gemini_white/, gemini_random/, gemini_*_nb_pro/ from git
- Add these folders to .gitignore
- Add scripts/download_images.py to fetch images from HF
- Update README: contribution guide points to HF dataset, add download instructions

Relates to #15
2026-04-11 08:06:05 +05:30
Alosh Denny d757d6aebe Update contact email for commercial license inquiries 2026-04-11 00:21:33 +05:30
Alosh Denny 725a675c3e Add support section to README
Added a section for supporting the research project through donations.
2026-04-11 00:01:39 +05:30
Alosh Denny b155a7b65e Change license to reverse-SynthID Research License v1.0
Updated the license to reverse-SynthID Research License v1.0 with specific terms for non-commercial use, attribution, and citation requirements.
2026-04-10 20:13:33 +05:30
Alosh Denny ccebc88a72 Merge pull request #18 from mrbeandev/fix/codebook-loading-npz-compat
Fix codebook loading error when .npz path is passed to extractor
2026-04-10 16:43:33 +05:30
mrbeandev 32a4b3d4b1 Fix codebook loading error when .npz path is passed to extractor
The extractor's load_codebook() was called with the .npz bypass codebook
path, but it only handles .pkl files. pickle.load() on an .npz file throws
a cryptic "persistent IDs" error, causing the extractor to silently fail.
This meant users got no before/after watermark verification during bypass.

Changes:
- load_codebook() now auto-discovers the .pkl codebook when given a .npz path
- Pickle save now uses protocol=4 for wider Python version compatibility
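
A minimal sketch of the loading behavior described above, under stated assumptions: the helper names `resolve_codebook_path` and `save_codebook` are hypothetical, and the discovery rule (a .pkl file with the same stem next to the .npz) is a guess, since the commit message does not say how the .pkl is located.

```python
import pickle
from pathlib import Path

def resolve_codebook_path(path):
    """Return a .pkl path to load, given either a .pkl or a .npz path.

    Assumption: the pickle codebook sits next to the .npz file and shares
    its stem. The actual discovery rule may differ.
    """
    p = Path(path)
    if p.suffix == ".npz":
        candidate = p.with_suffix(".pkl")
        if not candidate.exists():
            # Fail loudly instead of handing an .npz file to pickle.load().
            raise FileNotFoundError(f"No .pkl codebook found next to {p}")
        return candidate
    return p

def save_codebook(codebook, path):
    # Pickle protocol 4 is readable by Python 3.4 and later.
    with open(path, "wb") as f:
        pickle.dump(codebook, f, protocol=4)

def load_codebook(path):
    # Only unpickle files you trust; pickle can execute code on load.
    with open(resolve_codebook_path(path), "rb") as f:
        return pickle.load(f)
```

Raising FileNotFoundError when no sibling .pkl exists replaces the cryptic "persistent IDs" error with a message that names the missing file.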

Fixes #10, #9, #11
2026-04-10 16:03:38 +05:30
Alosh Denny b96c72b34f Merge pull request #17 from mrbeandev/add-nb-pro-reference-images
Add 93 black reference images (gemini_black_nb_pro)
2026-04-10 14:14:49 +05:30
mrbeandev a48b414623 Add 93 black reference images from Gemini (nb_pro) for SynthID analysis
Generated via Gemini Pro web UI with "Create image" tool enabled,
uploading a pure black (#000000) 1024x1024 image and prompting
"recreate this as it is". All images verified to contain SynthID
watermarks (confidence 0.52-0.67, carrier strength ~9300-9900).

These reference images are critical for carrier frequency discovery,
phase validation, and improving cross-resolution robustness.
2026-04-10 13:22:59 +05:30
Alosh Denny 2a303112e8 Merge pull request #14 from BensonRen/contrib/multi-resolution-references
Add 1062 reference images at 2 new resolutions + expanded codebook
2026-04-10 11:47:28 +05:30
Ben Ren 7bf9545df9 Add validation results for new codebook profiles
4 test images generated via Gemini API with real content:
  - 2x 9:16 (cat, mountain) at 1344x768
  - 2x 4:3 (coffee, city) at 864x1184

V3 bypass results with expanded codebook:
  | Image           | Resolution | PSNR    | SSIM   | Exact Match |
  | city (4:3)      | 864x1184   | 45.7 dB | 0.9972 | yes         |
  | coffee (4:3)    | 864x1184   | 50.0 dB | 0.9981 | yes         |
  | cat (9:16)      | 1344x768   | 50.2 dB | 0.9978 | yes         |
  | mountain (9:16) | 1344x768   | 49.1 dB | 0.9971 | yes         |

PSNR/SSIM are excellent. Phase coherence drop is near-zero, suggesting
the API-generated images may have weaker watermark embedding than
web-UI outputs, or the carrier extraction needs further tuning.
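
For readers unfamiliar with the PSNR column, here is the standard definition as a short sketch. It assumes 8-bit images with a peak value of 255; the function name `psnr` is illustrative, and the repository's own metric code may differ.

```python
import numpy as np

def psnr(original, processed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(processed, dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

# An image that differs by exactly one gray level everywhere has MSE = 1.
base = np.zeros((8, 8), dtype=np.uint8)
shifted = base + 1
value = psnr(base, shifted)
```

Under that 8-bit assumption, values between 45.7 and 50.2 dB correspond to a root-mean-square error of roughly 0.8 to 1.3 gray levels.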

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 05:37:22 +00:00
Ben Ren 1138f0b51a Add expanded codebook with 1344x768 and 864x1184 profiles
Extends spectral_codebook_v3 from 2 to 4 resolution profiles:
  - 1024x1024 (existing, 100b+100w refs)
  - 1536x2816 (existing, 88 watermarked refs)
  - 1344x768  (new, 154b+364w refs, top carrier coherence 0.946)
  - 864x1184  (new, 268b+295w refs, top carrier coherence 0.979)

Key findings at new resolutions:
  - 1344x768 carriers sit on the vertical axis (fy=0, fx=3..10)
  - 864x1184 carriers are at mid-frequency diagonals (13,-16), (23,-24)
  - Both show distinct carrier structures vs existing profiles,
    confirming resolution-dependent embedding

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 05:37:11 +00:00
Ben Ren 22d7acc575 Add 4:3 (864x1184) reference images from Gemini API
268 black + 281 white pure-color reference images at 864x1184 (4:3
landscape aspect ratio). Generated via gemini-2.5-flash-image model.

This complements the 9:16 portrait images and covers the classic photo
aspect ratio commonly used in Gemini outputs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 05:36:58 +00:00
Ben Ren dd04dbef41 Add 9:16 (1344x768) reference images from Gemini API
154 black + 359 white pure-color reference images at 1344x768 (9:16
portrait aspect ratio). Generated via gemini-2.5-flash-image model.

This resolution is not covered by the existing codebook and is one of
the most common mobile Gemini output formats.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 05:36:49 +00:00
Ben Ren 52133eadde Add reference image generation script and .env to gitignore
generate_references.py automates generating pure-black and pure-white
reference images via the Gemini API at multiple aspect ratios (9:16, 4:3).
Includes rate-limit retry logic and per-resolution output directories.
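
Rate-limit retry logic of this kind is commonly written as exponential backoff. The sketch below is generic and does not reproduce generate_references.py: `RateLimitError`, `with_retry`, and the delay parameters are hypothetical names and values, and no real API client is called.

```python
import time

class RateLimitError(Exception):
    """Hypothetical stand-in for whatever error the API client raises."""

def with_retry(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run call(), retrying on RateLimitError with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.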

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 05:36:42 +00:00
Alosh Denny ce1b20490e Update README with PitchHut link and badges
Added a link to PitchHut and updated badges.
2026-04-10 10:29:46 +05:30
Alosh Denny d8afdfd95c Add maintainer section with contact details
Added maintainer contact information for Alosh Denny.
2026-04-10 09:36:19 +05:30
Alosh Denny aa709bc9f4 Merge pull request #13 from hobostay/fix/multiple-bugs
Fix multiple bugs in extraction pipeline
2026-04-10 09:32:15 +05:30
Test User 49a7cfc9d8 Fix multiple bugs in extraction pipeline
1. bypass_v2() ignores iterations parameter — the function accepted
   `iterations` but ran the transform pipeline only once. Now properly
   loops, with diminishing strength on subsequent iterations.

2. denoise_bilateral() has identical if/else branches — both 2D and 3D
   cases called the same cv2.bilateralFilter(). Removed dead branch.

3. apply_noise_replacement() allows negative sigma — with passes > 5,
   the formula `sigma * (1 - i * 0.2)` produces negative values. Added
   clamping and early break.

4. Broken import paths — synthid_bypass.py and watermark_remover.py
   used bare module imports that fail when scripts are run from outside
   their directory. Added sys.path.insert like benchmark_extraction.py.

5. Misleading "Python 3.14 bug" comment — the SSIM gate was disabled
   with a comment blaming Python 3.14, but the real reason is that
   heavy multi-pass transforms naturally depress SSIM. Updated comment.
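
The sigma issue in item 3 comes down to simple arithmetic: `sigma * (1 - i * 0.2)` reaches zero at i = 5 and goes negative afterwards. A minimal sketch of a guarded schedule follows; the function name `sigma_schedule` is hypothetical, and stopping as soon as the value is no longer positive is one possible reading of "clamping and early break", not necessarily the fix that was committed.

```python
def sigma_schedule(sigma, passes, decay=0.2):
    """Per-pass sigma values sigma * (1 - i * decay), stopping before they reach zero."""
    values = []
    for i in range(passes):
        current = sigma * (1 - i * decay)
        if current <= 0:
            break  # later passes would use a zero or negative sigma
        values.append(current)
    return values

schedule = sigma_schedule(10.0, passes=8)
```

With these inputs the schedule stops after five passes, so no zero or negative sigma is ever produced.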

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 10:49:22 +08:00
Alosh Denny 7d4dc9a11b Invite contributors for image generation to improve detection
Added a section inviting contributors to help expand the codebook by generating pure black and white images using Nano Banana Pro.
2026-04-08 16:37:21 +05:30
Alosh Denny 2d32b0fc48 spectral codebook 2026-03-28 18:58:52 +05:30
Alosh Denny d3db4c3cd9 remove LFS tracking 2026-03-28 18:58:35 +05:30
Alosh Denny 4e6a9987bb sparsified spectral codebook 2026-03-28 18:40:00 +05:30
Alosh Denny 9eadacaced spectral_codebook_v3.npz 2026-03-28 18:17:48 +05:30
Alosh Denny 5463363dcb Merge branch 'main' of https://github.com/aloshdenny/reverse-SynthID 2026-03-28 18:15:01 +05:30
Alosh Denny a9079c74c4 updated carriers 2026-03-28 18:14:51 +05:30
Alosh Denny c1f0fd8b58 multi-resolution carriers 2026-03-28 17:20:37 +05:30
Alosh Denny 5e43b389fe updated carriers 2026-03-28 17:20:08 +05:30
Alosh Denny f2ace3dead improved dataset 2026-03-28 15:56:48 +05:30
Alosh Denny 52c62c669c updated gitignore 2026-03-28 15:54:44 +05:30
Alosh Denny 31b0e07c46 Merge branch 'main' of https://github.com/aloshdenny/reverse-SynthID 2026-03-28 15:53:42 +05:30
Alosh Denny 8169824232 ran bypass on NB-2 2026-03-28 15:53:24 +05:30
Alosh Denny 908bf54ab9 Update repository URL in installation instructions 2026-03-06 21:20:00 +05:30
Alosh Denny c2acc5d259 updated refs 2026-02-15 18:14:55 +05:30
Alosh Denny 091a56761f updated images 2026-02-15 18:13:58 +05:30
Alosh Denny 4c95814928 updated images 2026-02-15 18:10:29 +05:30
Alosh Denny 25483c159a v3 2026-02-15 18:04:20 +05:30
Alosh Denny ad79ba532f fix cv2.resize scale bug 2026-02-15 17:54:52 +05:30
Alosh Denny e02b4a11da watermark investigation 2025-12-16 16:34:03 +05:30
Alosh Denny 21b094474e codebook analysis 2025-12-16 09:18:47 +05:30
Alosh Denny 01d2b45dd4 codebook analysis 2025-12-15 22:11:23 +05:30