11 Commits

Author SHA1 Message Date
Alosh Denny 736d746f5a feat(v4): add SD-VAE re-generation stage (Round 05)
New module vae_regen.py wraps stabilityai/sd-vae-ft-mse for image
round-trips, exploiting the regeneration-attack weakness conceded in
Gowal et al. 2026 §6.1. Supports MPS, CUDA, and CPU with tiled
encode/decode for high-resolution images.

Made-with: Cursor
2026-04-24 02:08:48 +05:30
Alosh Denny 0c60b31f86 feat(v4): add cross-color consensus codebook and multi-round bypass engine
Introduces SpectralCodebookV4 and SynthIDBypassV4 with bypass_v4,
bypass_v4_universal, bypass_v4_regen (Round 05), and bypass_v4_final /
bypass_v4_nuke (Round 06). The Round-06 7-stage pipeline (VAE +
elastic deformation + resize-squeeze + color nudge + residual FFT +
JPEG chain) defeats the SynthID detector on both gemini-3.1 and
nano-banana-pro. Includes FINAL_PRESETS, REGEN_PRESETS, and
_elastic_deform / _resize_squeeze / _color_nudge helpers.

SpectralCodebookV4 save/load uses format_version 5: LZMA compression,
int8 phase, uint8 mag/cw, and entries with cons < 0.55 zeroed for
sparsity. This reduces a 221 MB codebook to ~24 MB with no
bypass-relevant information loss.
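
The storage scheme described above (8-bit quantization, zeroing of low-consensus entries, then LZMA) can be illustrated with a stdlib-only sketch. This is not the project's save/load code. The names `pack` and `unpack`, the flat-list layout, the assumption that magnitudes are normalized to [0, 1], and the omission of the `cw` array (which the message does not define) are all hypothetical choices made for illustration.

```python
import lzma
import math
from array import array

CONSENSUS_FLOOR = 0.55  # threshold quoted in the commit message


def quantize_phase(phase):
    """Map a phase in [-pi, pi] to a signed 8-bit integer in [-127, 127]."""
    return max(-127, min(127, round(phase / math.pi * 127)))


def quantize_unit(value):
    """Map a value in [0, 1] to an unsigned 8-bit integer in [0, 255]."""
    return max(0, min(255, round(value * 255)))


def pack(phases, mags, consensus):
    """Quantize, zero low-consensus entries, and LZMA-compress the result."""
    q_phase, q_mag = array("b"), array("B")
    for p, m, c in zip(phases, mags, consensus):
        if c < CONSENSUS_FLOOR:
            p, m = 0.0, 0.0  # zeroed runs compress very well
        q_phase.append(quantize_phase(p))
        q_mag.append(quantize_unit(m))
    return lzma.compress(q_phase.tobytes() + q_mag.tobytes())


def unpack(blob, count):
    """Invert pack(): return (phases, mags) as lists of floats."""
    raw = lzma.decompress(blob)
    q_phase = array("b")
    q_phase.frombytes(raw[:count])
    q_mag = array("B")
    q_mag.frombytes(raw[count:2 * count])
    phases = [q / 127 * math.pi for q in q_phase]
    mags = [q / 255 for q in q_mag]
    return phases, mags
```

With this layout the round-trip error is bounded by one quantization step: pi/127 for phase and 1/255 for magnitude.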

Also updates RobustSynthIDExtractor with a detect_from_v4_codebook hook.

Made-with: Cursor
2026-04-24 02:08:42 +05:30
mrbeandev defeb41f74 Fix detection accuracy: replace wrong carrier frequencies with empirically verified ones
The hardcoded carrier positions (48,0), (96,0), (0,88) etc. had low phase
coherence on actual Gemini images (~0.16-0.55). Detection was 80% on
reference images with 100% false positive rate on non-watermarked images.

Root cause analysis across 291 watermarked + 16 non-watermarked images
revealed:

1. The watermark is content-adaptive — dark images use diagonal-grid
   carriers at (±3,±4), (±5,±3) etc. while white images use horizontal-
   axis carriers at (0,±7), (0,±8), (0,±9) etc.

2. Both sets have >0.95 intra-set phase coherence and >0.5 discriminative
   gap vs non-watermarked images.

3. Previous metrics (noise correlation, structure ratio bounds, raw carrier
   magnitude) had heavy overlap between watermarked and non-watermarked
   content images and were not discriminative.

Changes:
- Replace carrier list with empirically verified dark + white carrier sets
- Add per-set reference phase templates to codebook (carrier_refs)
- Rewrite detect_array to try both carrier sets and take best phase match
- Use phase agreement as primary signal (WM: 0.92-0.99 vs non-WM: 0.47-0.71)
- Add noise-domain carrier-vs-random ratio as supporting signal
- Skip expensive multi-scale consistency computation (phase match is decisive)
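
The "try both carrier sets and take best phase match" step can be sketched as follows. The commit does not state the exact agreement formula, so this sketch assumes a mean cosine of the phase error, which ranges over [-1, 1] and is therefore not directly comparable to the 0.92-0.99 and 0.47-0.71 figures quoted above. The names `phase_agreement` and `best_match`, and the representation of the spectrum as a dict keyed by `(fy, fx)`, are hypothetical.

```python
import cmath
import math


def phase_agreement(spectrum, carriers, ref_phases):
    """Mean cosine of the phase error over the carriers; 1.0 is a perfect match."""
    total = 0.0
    for (fy, fx), ref in zip(carriers, ref_phases):
        observed = cmath.phase(spectrum[(fy, fx)])
        total += math.cos(observed - ref)
    return total / len(carriers)


def best_match(spectrum, carrier_sets):
    """Score every carrier set and return (name, score) for the best one.

    carrier_sets maps a set name to (carrier positions, reference phases).
    """
    scores = {
        name: phase_agreement(spectrum, carriers, refs)
        for name, (carriers, refs) in carrier_sets.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]
```

A caller would compare the returned score against a threshold chosen from measured watermarked and non-watermarked distributions. That threshold is not given here and would have to be calibrated.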

Results on full dataset:
- Watermarked:     99.0% detection (was ~80%)
- Non-watermarked: 0% false positives (was 100%)
- Overall:         98.7% accuracy (was ~80% with no FP testing)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 14:43:04 +05:30
mrbeandev 32a4b3d4b1 Fix codebook loading error when .npz path is passed to extractor
The extractor's load_codebook() was called with the .npz bypass codebook
path, but it only handles .pkl files. pickle.load() on an .npz file raises
a cryptic "persistent IDs" error, causing the extractor to silently fail.
This meant users got no before/after watermark verification during bypass.

Changes:
- load_codebook() now auto-discovers the .pkl codebook when given a .npz path
- Pickle save now uses protocol=4 for wider Python version compatibility
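
A minimal sketch of the two changes, assuming the .pkl codebook sits next to the .npz file and shares its stem. The commit does not spell out the discovery rule, so `resolve_pickle_path` and that sibling-file convention are hypothetical. Pickle protocol 4 has been available since Python 3.4.

```python
import pickle
from pathlib import Path


def resolve_pickle_path(path):
    """Return the .pkl path to load, mapping a .npz path to its sibling .pkl."""
    path = Path(path)
    if path.suffix == ".npz":
        candidate = path.with_suffix(".pkl")
        if not candidate.exists():
            raise FileNotFoundError(f"no pickle codebook found next to {path}")
        return candidate
    return path


def save_codebook(obj, path):
    """Write the codebook with protocol 4 so older interpreters can read it."""
    with open(path, "wb") as fh:
        pickle.dump(obj, fh, protocol=4)


def load_codebook(path):
    """Load a pickled codebook; only use this on files from a trusted source."""
    with open(resolve_pickle_path(path), "rb") as fh:
        return pickle.load(fh)
```

Raising `FileNotFoundError` when no sibling exists gives the caller a clear message instead of the unpickling error described above.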

Fixes #10, #9, #11
2026-04-10 16:03:38 +05:30
Test User 49a7cfc9d8 Fix multiple bugs in extraction pipeline
1. bypass_v2() ignores iterations parameter — the function accepted
   `iterations` but ran the transform pipeline only once. Now properly
   loops, with diminishing strength on subsequent iterations.

2. denoise_bilateral() has identical if/else branches — both 2D and 3D
   cases called the same cv2.bilateralFilter(). Removed dead branch.

3. apply_noise_replacement() allows negative sigma — with passes > 5,
   the formula `sigma * (1 - i * 0.2)` produces zero or negative values.
   Added clamping and early break.

4. Broken import paths — synthid_bypass.py and watermark_remover.py
   used bare module imports that fail when scripts are run from outside
   their directory. Added sys.path.insert like benchmark_extraction.py.

5. Misleading "Python 3.14 bug" comment — the SSIM gate was disabled
   with a comment blaming Python 3.14, but the real reason is that
   heavy multi-pass transforms naturally depress SSIM. Updated comment.
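
The sigma fix in item 3 can be sketched as below. It assumes `i` is a zero-based pass index, which the message does not state. `sigma_schedule` and its `decay` and `floor` parameters are hypothetical names used only for illustration.

```python
def sigma_schedule(sigma, passes, decay=0.2, floor=0.0):
    """Return per-pass sigmas, stopping once the decayed value reaches the floor."""
    schedule = []
    for i in range(passes):
        current = sigma * (1 - i * decay)
        if current <= floor:
            break  # later passes would only get smaller, so stop early
        schedule.append(current)
    return schedule
```

With the default decay of 0.2 the schedule never exceeds five entries, however many passes are requested, so no pass ever runs with a zero or negative sigma.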

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 10:49:22 +08:00
Alosh Denny 4e6a9987bb sparsified spectral codebook 2026-03-28 18:40:00 +05:30
Alosh Denny a9079c74c4 updated carriers 2026-03-28 18:14:51 +05:30
Alosh Denny 8169824232 ran bypass on NB-2 2026-03-28 15:53:24 +05:30
Alosh Denny 25483c159a v3 2026-02-15 18:04:20 +05:30
Alosh Denny ad79ba532f fix cv2.resize scale bug 2026-02-15 17:54:52 +05:30
Alosh Denny 01d2b45dd4 codebook analysis 2025-12-15 22:11:23 +05:30