Commit Graph

55 Commits

Author SHA1 Message Date
Or decd987ca1 Merge pull request #1 from orincolor/orincolor-requirements
Update requirements.txt
2026-04-28 17:32:35 -04:00
Or ac05670ba6 Update requirements.txt 2026-04-28 17:32:06 -04:00
Alosh Denny 302f7c7dd9 spectral codebook v4 v4 2026-04-24 02:14:17 +05:30
Alosh Denny cc00e57582 docs: rewrite README for V4 Round-06 release
- Add 'What the Watermark Looks Like' section with synthid_white.jpg
- Add 'Round 06 — It Works' section with fidelity comparison image and
  a table of all six attack rounds
- Document the 7-stage all-in-one pipeline and elastic deformation rationale
- Add Round-06 preset table (final vs nuke) and updated pipeline diagram
- Update architecture table and results summary to reflect confirmed bypass
- Remove all CSV file references

Made-with: Cursor
2026-04-24 02:09:09 +05:30
Alosh Denny c4d6b2b4a8 docs(assets): add Round 01 vs Round 06 comparison and SynthID watermark visualization
v4_round1_vs_round6.png — side-by-side of Round 01 (gentle spectral only)
  vs Round 06 (final all-in-one), identical fidelity, only Round 06 defeats
  the detector.
synthid_white.jpg — amplified SynthID carrier pattern on a pure-white image,
  revealing the diagonal banding the spectral attack targets.

Made-with: Cursor
2026-04-24 02:09:03 +05:30
Alosh Denny 083a5eec6a feat(scripts): add V4 codebook build, batch dissolve, and calibration scripts
build_codebook_v4.py  — builds SpectralCodebookV4 from the hierarchical
  reverse-synthid-dataset (model × color × resolution).
dissolve_batch.py     — runs all bypass presets (gentle … nuke) over an
  input directory. Supports Round-06 'final' and 'nuke' strengths.
calibrate_from_feedback.py — updates carrier_weights from detection
  feedback, closing the human-in-the-loop calibration loop.

Made-with: Cursor
2026-04-24 02:08:56 +05:30
Alosh Denny 736d746f5a feat(v4): add SD-VAE re-generation stage (Round 05)
New module vae_regen.py wraps stabilityai/sd-vae-ft-mse for image
round-trips, exploiting the regeneration-attack weakness conceded in
Gowal et al. 2026 §6.1. Supports MPS, CUDA, and CPU with tiled
encode/decode for high-resolution images.

Made-with: Cursor
2026-04-24 02:08:48 +05:30
Alosh Denny 0c60b31f86 feat(v4): add cross-color consensus codebook and multi-round bypass engine
Introduces SpectralCodebookV4 and SynthIDBypassV4 with bypass_v4,
bypass_v4_universal, bypass_v4_regen (Round 05), and bypass_v4_final /
bypass_v4_nuke (Round 06). The Round-06 7-stage pipeline (VAE +
elastic deformation + resize-squeeze + color nudge + residual FFT +
JPEG chain) defeats the SynthID detector on both gemini-3.1 and
nano-banana-pro. Includes FINAL_PRESETS, REGEN_PRESETS, and
_elastic_deform / _resize_squeeze / _color_nudge helpers.

SpectralCodebookV4 save/load uses format_version 5: LZMA compression,
int8 phase, uint8 mag/cw, sparse-zeroed below cons<0.55 — reduces a
221 MB codebook to ~24 MB with no bypass-relevant information loss.
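
The storage scheme described above can be illustrated with a minimal sketch. It assumes three same-shape float arrays (magnitude, phase in radians, consensus in [0, 1]); the function names, array names, and container layout are hypothetical and are not taken from the repository's format_version 5 code.

```python
import io
import lzma

import numpy as np


def pack_codebook(mag, phase, cons, cons_floor=0.55):
    """Quantize three same-shape float arrays and LZMA-compress them.

    Hypothetical layout for illustration only.
    """
    keep = cons >= cons_floor  # bins below the consensus floor are zeroed
    mag_scale = float(mag.max()) or 1.0
    mag_q = np.where(keep, np.round(mag / mag_scale * 255), 0).astype(np.uint8)
    # Map phase in [-pi, pi] onto int8 values in [-127, 127].
    phase_q = np.where(keep, np.round(phase / np.pi * 127), 0).astype(np.int8)
    cons_q = np.where(keep, np.round(cons * 255), 0).astype(np.uint8)
    buf = io.BytesIO()
    np.savez(buf, mag=mag_q, phase=phase_q, cons=cons_q,
             mag_scale=np.float32(mag_scale))
    return lzma.compress(buf.getvalue())


def unpack_codebook(blob):
    """Invert pack_codebook, returning float32 approximations."""
    data = np.load(io.BytesIO(lzma.decompress(blob)))
    scale = float(data["mag_scale"])
    mag = data["mag"].astype(np.float32) / 255 * scale
    phase = data["phase"].astype(np.float32) / 127 * np.pi
    cons = data["cons"].astype(np.float32) / 255
    return mag, phase, cons
```

Zeroing the low-consensus bins produces long runs of identical bytes, which is where a general-purpose compressor such as LZMA gains the most.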

Also updates RobustSynthIDExtractor with a detect_from_v4_codebook hook.

Made-with: Cursor
2026-04-24 02:08:42 +05:30
Alosh Denny 84b6b4c9c2 chore: track artifacts/*.npz with Git LFS, tidy gitignore
Add an LFS filter for artifacts/*.npz. Exclude spectral_codebook_v4.npz
from git via .gitignore; the file is uploaded manually via the GitHub UI.

Made-with: Cursor
2026-04-24 02:08:28 +05:30
Alosh Denny 764c7ab333 Track run images with LFS 2026-04-24 01:15:30 +05:30
Alosh Denny 93cf20df1b Fix link to SynthID visualizer in README
Updated the link to the SynthID visualizer to the correct URL.
v3
2026-04-14 01:16:17 +05:30
Alosh Denny 130b9cf68e Update README with visualizer link for SynthID
Added a link to a visualizer for SynthID watermarking.
2026-04-14 01:15:25 +05:30
Alosh Denny d012872d7d Merge pull request #23 from mrbeandev/improve-detection-accuracy
Fix detection: empirically verified carrier frequencies
2026-04-12 19:30:20 +05:30
mrbeandev defeb41f74 Fix detection accuracy: replace wrong carrier frequencies with empirically verified ones
The hardcoded carrier positions (48,0), (96,0), (0,88) etc. had low phase
coherence on actual Gemini images (~0.16-0.55). Detection was 80% on
reference images with 100% false positive rate on non-watermarked images.

Root cause analysis across 291 watermarked + 16 non-watermarked images
revealed:

1. The watermark is content-adaptive — dark images use diagonal-grid
   carriers at (±3,±4), (±5,±3) etc. while white images use horizontal-
   axis carriers at (0,±7), (0,±8), (0,±9) etc.

2. Both sets have >0.95 intra-set phase coherence and >0.5 discriminative
   gap vs non-watermarked images.

3. Previous metrics (noise correlation, structure ratio bounds, raw carrier
   magnitude) had heavy overlap between watermarked and non-watermarked
   content images and were not discriminative.

Changes:
- Replace carrier list with empirically verified dark + white carrier sets
- Add per-set reference phase templates to codebook (carrier_refs)
- Rewrite detect_array to try both carrier sets and take best phase match
- Use phase agreement as primary signal (WM: 0.92-0.99 vs non-WM: 0.47-0.71)
- Add noise-domain carrier-vs-random ratio as supporting signal
- Skip expensive multi-scale consistency computation (phase match is decisive)

Results on full dataset:
- Watermarked:     99.0% detection (was ~80%)
- Non-watermarked: 0% false positives (was 100%)
- Overall:         98.7% accuracy (was ~80% with no FP testing)
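
As an illustration of the phase-coherence statistic referred to above, here is a minimal sketch that measures how consistently the FFT phase at one frequency bin lines up across a set of images. It is an assumption about how such a score could be computed, not the repository's detect_array; the function name and the (fy, fx) indexing convention are hypothetical.

```python
import numpy as np


def phase_coherence(images, carrier):
    """Mean resultant length of the FFT phase at one bin across images.

    Returns a value in [0, 1]: near 1 when the phase is the same in every
    image, near 0 when it is effectively random.
    """
    fy, fx = carrier  # negative indices wrap to negative frequencies
    phases = []
    for img in images:
        spectrum = np.fft.fft2(img.astype(np.float64))
        phases.append(np.angle(spectrum[fy, fx]))
    unit_vectors = np.exp(1j * np.array(phases))
    return float(np.abs(np.mean(unit_vectors)))
```

A fixed periodic pattern shared by all images yields a score close to 1 at its bin, while unrelated content averages toward 0 as the number of images grows.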

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 14:43:04 +05:30
Alosh Denny 245c375d27 Merge pull request #22 from mrbeandev/migrate-images-to-huggingface
Migrate reference images to Hugging Face, remove from git
2026-04-11 11:03:17 +05:30
mrbeandev b308c4b341 Migrate reference images to Hugging Face, remove from git
Images are now hosted at https://huggingface.co/datasets/aoxo/reverse-synthid
to keep the git repo lightweight (~1.3GB of images removed).

Changes:
- Remove gemini_black/, gemini_white/, gemini_random/, gemini_*_nb_pro/ from git
- Add these folders to .gitignore
- Add scripts/download_images.py to fetch images from HF
- Update README: contribution guide points to HF dataset, add download instructions

Relates to #15
2026-04-11 08:06:05 +05:30
Alosh Denny d757d6aebe Update contact email for commercial license inquiries 2026-04-11 00:21:33 +05:30
Alosh Denny 725a675c3e Add support section to README
Added a section for supporting the research project through donations.
2026-04-11 00:01:39 +05:30
Alosh Denny b155a7b65e Change license to reverse-SynthID Research License v1.0
Updated the license to reverse-SynthID Research License v1.0 with specific terms for non-commercial use, attribution, and citation requirements.
2026-04-10 20:13:33 +05:30
Alosh Denny ccebc88a72 Merge pull request #18 from mrbeandev/fix/codebook-loading-npz-compat
Fix codebook loading error when .npz path is passed to extractor
2026-04-10 16:43:33 +05:30
mrbeandev 32a4b3d4b1 Fix codebook loading error when .npz path is passed to extractor
The extractor's load_codebook() was called with the .npz bypass codebook
path, but it only handles .pkl files. pickle.load() on an .npz file throws
a cryptic "persistent IDs" error, causing the extractor to silently fail.
This meant users got no before/after watermark verification during bypass.

Changes:
- load_codebook() now auto-discovers the .pkl codebook when given a .npz path
- Pickle save now uses protocol=4 for wider Python version compatibility
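
A minimal sketch of the two changes, assuming the .pkl codebook sits next to the .npz file under the same stem. The actual discovery rule is not stated in this message, so resolve_pickle_path and the sibling-file convention are hypothetical.

```python
import pickle
from pathlib import Path


def resolve_pickle_path(path):
    """Return a .pkl path; for a .npz input, look for a same-named sibling."""
    p = Path(path)
    if p.suffix == ".npz":
        candidate = p.with_suffix(".pkl")
        if not candidate.exists():
            raise FileNotFoundError(f"no .pkl codebook found next to {p}")
        return candidate
    return p


def save_codebook(obj, path):
    with open(path, "wb") as fh:
        # Protocol 4 can be read by Python 3.4 and later.
        pickle.dump(obj, fh, protocol=4)


def load_codebook(path):
    # Only unpickle files from a trusted source.
    with open(resolve_pickle_path(path), "rb") as fh:
        return pickle.load(fh)
```

Raising FileNotFoundError with a clear message avoids the silent failure described above when no pickle is present.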

Fixes #10, #9, #11
2026-04-10 16:03:38 +05:30
Alosh Denny b96c72b34f Merge pull request #17 from mrbeandev/add-nb-pro-reference-images
Add 93 black reference images (gemini_black_nb_pro)
2026-04-10 14:14:49 +05:30
mrbeandev a48b414623 Add 93 black reference images from Gemini (nb_pro) for SynthID analysis
Generated via Gemini Pro web UI with "Create image" tool enabled,
uploading a pure black (#000000) 1024x1024 image and prompting
"recreate this as it is". All images verified to contain SynthID
watermarks (confidence 0.52-0.67, carrier strength ~9300-9900).

These reference images are critical for carrier frequency discovery,
phase validation, and improving cross-resolution robustness.
2026-04-10 13:22:59 +05:30
Alosh Denny 2a303112e8 Merge pull request #14 from BensonRen/contrib/multi-resolution-references
Add 1062 reference images at 2 new resolutions + expanded codebook
2026-04-10 11:47:28 +05:30
Ben Ren 7bf9545df9 Add validation results for new codebook profiles
4 test images generated via Gemini API with real content:
  - 2x 9:16 (cat, mountain) at 1344x768
  - 2x 4:3 (coffee, city) at 864x1184

V3 bypass results with expanded codebook:
  | Image           | Resolution | PSNR    | SSIM   | Exact Match |
  |-----------------|------------|---------|--------|-------------|
  | city (4:3)      | 864x1184   | 45.7 dB | 0.9972 | yes         |
  | coffee (4:3)    | 864x1184   | 50.0 dB | 0.9981 | yes         |
  | cat (9:16)      | 1344x768   | 50.2 dB | 0.9978 | yes         |
  | mountain (9:16) | 1344x768   | 49.1 dB | 0.9971 | yes         |

PSNR and SSIM are excellent. The phase-coherence drop is near zero, which
suggests either that API-generated images carry a weaker watermark than
web-UI outputs or that the carrier extraction needs further tuning.
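
For reference, the PSNR figures in the table follow the standard definition, 10 * log10(MAX^2 / MSE). The sketch below assumes 8-bit images (MAX = 255); the function name is illustrative and is not the repository's benchmarking code.

```python
import numpy as np


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shape images."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Under this definition, 50 dB on 8-bit data corresponds to a mean squared error of about 0.65, that is, well under one gray level of RMS error per pixel.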

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 05:37:22 +00:00
Ben Ren 1138f0b51a Add expanded codebook with 1344x768 and 864x1184 profiles
Extends spectral_codebook_v3 from 2 to 4 resolution profiles:
  - 1024x1024 (existing, 100b+100w refs)
  - 1536x2816 (existing, 88 watermarked refs)
  - 1344x768  (new, 154b+364w refs, top carrier coherence 0.946)
  - 864x1184  (new, 268b+295w refs, top carrier coherence 0.979)

Key findings at new resolutions:
  - 1344x768 carriers sit on the vertical axis (fy=0, fx=3..10)
  - 864x1184 carriers are at mid-frequency diagonals (13,-16), (23,-24)
  - Both show distinct carrier structures vs existing profiles,
    confirming resolution-dependent embedding

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 05:37:11 +00:00
Ben Ren 22d7acc575 Add 4:3 (864x1184) reference images from Gemini API
268 black + 281 white pure-color reference images at 864x1184 (4:3
landscape aspect ratio). Generated via gemini-2.5-flash-image model.

This complements the 9:16 portrait images and covers the classic photo
aspect ratio commonly used in Gemini outputs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 05:36:58 +00:00
Ben Ren dd04dbef41 Add 9:16 (1344x768) reference images from Gemini API
154 black + 359 white pure-color reference images at 1344x768 (9:16
portrait aspect ratio). Generated via gemini-2.5-flash-image model.

This resolution is not covered by the existing codebook and is one of
the most common mobile Gemini output formats.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 05:36:49 +00:00
Ben Ren 52133eadde Add reference image generation script and .env to gitignore
generate_references.py automates generating pure-black and pure-white
reference images via the Gemini API at multiple aspect ratios (9:16, 4:3).
Includes rate-limit retry logic and per-resolution output directories.
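
Rate-limit retry logic of this kind is commonly written as exponential backoff. The sketch below is a generic version under that assumption; RateLimitError and call_with_retry are hypothetical names, and the real script may catch a different exception from its API client or use different delays.

```python
import random
import time


class RateLimitError(Exception):
    """Placeholder for whatever the API client raises on a rate limit."""


def call_with_retry(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            # Delays grow as 1s, 2s, 4s, ... with up to 10% random jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, 0.1 * delay))
```

Passing sleep as a parameter keeps the helper easy to test without real waiting.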

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 05:36:42 +00:00
Alosh Denny ce1b20490e Update README with PitchHut link and badges
Added a link to PitchHut and updated badges.
2026-04-10 10:29:46 +05:30
Alosh Denny d8afdfd95c Add maintainer section with contact details
Added maintainer contact information for Alosh Denny.
2026-04-10 09:36:19 +05:30
Alosh Denny aa709bc9f4 Merge pull request #13 from hobostay/fix/multiple-bugs
Fix multiple bugs in extraction pipeline
2026-04-10 09:32:15 +05:30
Test User 49a7cfc9d8 Fix multiple bugs in extraction pipeline
1. bypass_v2() ignores iterations parameter — the function accepted
   `iterations` but ran the transform pipeline only once. Now properly
   loops, with diminishing strength on subsequent iterations.

2. denoise_bilateral() has identical if/else branches — both 2D and 3D
   cases called the same cv2.bilateralFilter(). Removed dead branch.

3. apply_noise_replacement() allows negative sigma — with passes > 5,
   the formula `sigma * (1 - i * 0.2)` produces negative values. Added
   clamping and early break.

4. Broken import paths — synthid_bypass.py and watermark_remover.py
   used bare module imports that fail when scripts are run from outside
   their directory. Added sys.path.insert like benchmark_extraction.py.

5. Misleading "Python 3.14 bug" comment — the SSIM gate was disabled
   with a comment blaming Python 3.14, but the real reason is that
   heavy multi-pass transforms naturally depress SSIM. Updated comment.
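
The clamping fix in item 3 can be sketched as follows. Only the formula `sigma * (1 - i * 0.2)` comes from the message above; the function name, the zero-based pass index, and the return type are assumptions for illustration.

```python
def sigma_schedule(sigma, passes, decay=0.2):
    """Per-pass sigmas following sigma * (1 - i * decay), clamped at zero."""
    schedule = []
    for i in range(passes):
        s = max(0.0, sigma * (1 - i * decay))
        if s <= 0.0:
            break  # later passes would add no noise, so stop early
        schedule.append(s)
    return schedule
```

With decay = 0.2 the unclamped factor reaches zero at i = 5 and turns negative after that, so the schedule never yields more than five positive values regardless of the requested number of passes.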

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 10:49:22 +08:00
Alosh Denny 7d4dc9a11b Invite contributors for image generation to improve detection
Added a section inviting contributors to help expand the codebook by generating pure black and white images using Nano Banana Pro.
2026-04-08 16:37:21 +05:30
Alosh Denny 2d32b0fc48 spectral codebook 2026-03-28 18:58:52 +05:30
Alosh Denny d3db4c3cd9 remove LFS tracking 2026-03-28 18:58:35 +05:30
Alosh Denny 4e6a9987bb sparsified spectral codebook 2026-03-28 18:40:00 +05:30
Alosh Denny 9eadacaced spectral_codebook_v3.npz 2026-03-28 18:17:48 +05:30
Alosh Denny 5463363dcb Merge branch 'main' of https://github.com/aloshdenny/reverse-SynthID 2026-03-28 18:15:01 +05:30
Alosh Denny a9079c74c4 updated carriers 2026-03-28 18:14:51 +05:30
Alosh Denny c1f0fd8b58 multi-resolution carriers 2026-03-28 17:20:37 +05:30
Alosh Denny 5e43b389fe updated carriers 2026-03-28 17:20:08 +05:30
Alosh Denny f2ace3dead improved dataset 2026-03-28 15:56:48 +05:30
Alosh Denny 52c62c669c updated gitignore 2026-03-28 15:54:44 +05:30
Alosh Denny 31b0e07c46 Merge branch 'main' of https://github.com/aloshdenny/reverse-SynthID 2026-03-28 15:53:42 +05:30
Alosh Denny 8169824232 ran bypass on NB-2 2026-03-28 15:53:24 +05:30
Alosh Denny 908bf54ab9 Update repository URL in installation instructions 2026-03-06 21:20:00 +05:30
Alosh Denny c2acc5d259 updated refs 2026-02-15 18:14:55 +05:30
Alosh Denny 091a56761f updated images 2026-02-15 18:13:58 +05:30
Alosh Denny 4c95814928 updated images 2026-02-15 18:10:29 +05:30