reverse-SynthID

mirror of https://github.com/aloshdenny/reverse-SynthID.git synced 2026-04-30 18:47:53 +02:00

Author	SHA1	Message	Date
Alosh Denny	083a5eec6a	feat(scripts): add V4 codebook build, batch dissolve, and calibration scripts build_codebook_v4.py — builds SpectralCodebookV4 from the hierarchical reverse-synthid-dataset (model × color × resolution). dissolve_batch.py — runs all bypass presets (gentle … nuke) over an input directory. Supports Round-06 'final' and 'nuke' strengths. calibrate_from_feedback.py — updates carrier_weights from detection feedback, closing the human-in-the-loop calibration loop. Made-with: Cursor	2026-04-24 02:08:56 +05:30
Alosh Denny	736d746f5a	feat(v4): add SD-VAE re-generation stage (Round 05) New module vae_regen.py wraps stabilityai/sd-vae-ft-mse for image round-trips, exploiting the regeneration-attack weakness conceded in Gowal et al. 2026 §6.1. Supports MPS, CUDA, and CPU with tiled encode/decode for high-resolution images. Made-with: Cursor	2026-04-24 02:08:48 +05:30
Alosh Denny	0c60b31f86	feat(v4): add cross-color consensus codebook and multi-round bypass engine Introduces SpectralCodebookV4 and SynthIDBypassV4 with bypass_v4, bypass_v4_universal, bypass_v4_regen (Round 05), and bypass_v4_final / bypass_v4_nuke (Round 06). The Round-06 7-stage pipeline (VAE + elastic deformation + resize-squeeze + color nudge + residual FFT + JPEG chain) defeats the SynthID detector on both gemini-3.1 and nano-banana-pro. Includes FINAL_PRESETS, REGEN_PRESETS, and _elastic_deform / _resize_squeeze / _color_nudge helpers. SpectralCodebookV4 save/load uses format_version 5: LZMA compression, int8 phase, uint8 mag/cw, sparse-zeroed below cons<0.55 — reduces a 221 MB codebook to ~24 MB with no bypass-relevant information loss. Also updates RobustSynthIDExtractor with a detect_from_v4_codebook hook. Made-with: Cursor	2026-04-24 02:08:42 +05:30
Alosh Denny	84b6b4c9c2	chore: track artifacts/*.npz with Git LFS, tidy gitignore Add LFS filter for artifacts/*.npz. Ignore spectral_codebook_v4.npz from git — the file is uploaded manually via the GitHub UI. Made-with: Cursor	2026-04-24 02:08:28 +05:30
Alosh Denny	764c7ab333	Track run images with LFS	2026-04-24 01:15:30 +05:30
Alosh Denny	93cf20df1b	Fix link to SynthID visualizer in README Updated the link to the SynthID visualizer to the correct URL. v3	2026-04-14 01:16:17 +05:30
Alosh Denny	130b9cf68e	Update README with visualizer link for SynthID Added a link to a visualizer for SynthID watermarking.	2026-04-14 01:15:25 +05:30
Alosh Denny	d012872d7d	Merge pull request #23 from mrbeandev/improve-detection-accuracy Fix detection: empirically verified carrier frequencies	2026-04-12 19:30:20 +05:30
mrbeandev	defeb41f74	Fix detection accuracy: replace wrong carrier frequencies with empirically verified ones The hardcoded carrier positions (48,0), (96,0), (0,88) etc. had low phase coherence on actual Gemini images (~0.16-0.55). Detection was 80% on reference images with 100% false positive rate on non-watermarked images. Root cause analysis across 291 watermarked + 16 non-watermarked images revealed: 1. The watermark is content-adaptive — dark images use diagonal-grid carriers at (±3,±4), (±5,±3) etc. while white images use horizontal-axis carriers at (0,±7), (0,±8), (0,±9) etc. 2. Both sets have >0.95 intra-set phase coherence and >0.5 discriminative gap vs non-watermarked images. 3. Previous metrics (noise correlation, structure ratio bounds, raw carrier magnitude) had heavy overlap between watermarked and non-watermarked content images and were not discriminative. Changes: - Replace carrier list with empirically verified dark + white carrier sets - Add per-set reference phase templates to codebook (carrier_refs) - Rewrite detect_array to try both carrier sets and take best phase match - Use phase agreement as primary signal (WM: 0.92-0.99 vs non-WM: 0.47-0.71) - Add noise-domain carrier-vs-random ratio as supporting signal - Skip expensive multi-scale consistency computation (phase match is decisive) Results on full dataset: - Watermarked: 99.0% detection (was ~80%) - Non-watermarked: 0% false positives (was 100%) - Overall: 98.7% accuracy (was ~80% with no FP testing) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 14:43:04 +05:30
Alosh Denny	245c375d27	Merge pull request #22 from mrbeandev/migrate-images-to-huggingface Migrate reference images to Hugging Face, remove from git	2026-04-11 11:03:17 +05:30
mrbeandev	b308c4b341	Migrate reference images to Hugging Face, remove from git Images are now hosted at https://huggingface.co/datasets/aoxo/reverse-synthid to keep the git repo lightweight (~1.3GB of images removed). Changes: - Remove gemini_black/, gemini_white/, gemini_random/, gemini_*_nb_pro/ from git - Add these folders to .gitignore - Add scripts/download_images.py to fetch images from HF - Update README: contribution guide points to HF dataset, add download instructions Relates to #15	2026-04-11 08:06:05 +05:30
Alosh Denny	d757d6aebe	Update contact email for commercial license inquiries	2026-04-11 00:21:33 +05:30
Alosh Denny	725a675c3e	Add support section to README Added a section for supporting the research project through donations.	2026-04-11 00:01:39 +05:30
Alosh Denny	b155a7b65e	Change license to reverse-SynthID Research License v1.0 Updated the license to reverse-SynthID Research License v1.0 with specific terms for non-commercial use, attribution, and citation requirements.	2026-04-10 20:13:33 +05:30
Alosh Denny	ccebc88a72	Merge pull request #18 from mrbeandev/fix/codebook-loading-npz-compat Fix codebook loading error when .npz path is passed to extractor	2026-04-10 16:43:33 +05:30
mrbeandev	32a4b3d4b1	Fix codebook loading error when .npz path is passed to extractor The extractor's load_codebook() was called with the .npz bypass codebook path, but it only handles .pkl files. pickle.load() on an .npz file throws a cryptic "persistent IDs" error, causing the extractor to silently fail. This meant users got no before/after watermark verification during bypass. Changes: - load_codebook() now auto-discovers the .pkl codebook when given a .npz path - Pickle save now uses protocol=4 for wider Python version compatibility Fixes #10, #9, #11	2026-04-10 16:03:38 +05:30
Alosh Denny	b96c72b34f	Merge pull request #17 from mrbeandev/add-nb-pro-reference-images Add 93 black reference images (gemini_black_nb_pro)	2026-04-10 14:14:49 +05:30
mrbeandev	a48b414623	Add 93 black reference images from Gemini (nb_pro) for SynthID analysis Generated via Gemini Pro web UI with "Create image" tool enabled, uploading a pure black (#000000) 1024x1024 image and prompting "recreate this as it is". All images verified to contain SynthID watermarks (confidence 0.52-0.67, carrier strength ~9300-9900). These reference images are critical for carrier frequency discovery, phase validation, and improving cross-resolution robustness.	2026-04-10 13:22:59 +05:30
Alosh Denny	2a303112e8	Merge pull request #14 from BensonRen/contrib/multi-resolution-references Add 1062 reference images at 2 new resolutions + expanded codebook	2026-04-10 11:47:28 +05:30
Ben Ren	7bf9545df9	Add validation results for new codebook profiles 4 test images generated via Gemini API with real content: - 2x 9:16 (cat, mountain) at 1344x768 - 2x 4:3 (coffee, city) at 864x1184 V3 bypass results with expanded codebook (Image, Resolution, PSNR, SSIM, Exact Match): city (4:3), 864x1184, 45.7 dB, 0.9972, yes; coffee (4:3), 864x1184, 50.0 dB, 0.9981, yes; cat (9:16), 1344x768, 50.2 dB, 0.9978, yes; mountain (9:16), 1344x768, 49.1 dB, 0.9971, yes. PSNR/SSIM are excellent. Phase coherence drop is near-zero, suggesting the API-generated images may have weaker watermark embedding than web-UI outputs, or the carrier extraction needs further tuning. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 05:37:22 +00:00
Ben Ren	1138f0b51a	Add expanded codebook with 1344x768 and 864x1184 profiles Extends spectral_codebook_v3 from 2 to 4 resolution profiles: - 1024x1024 (existing, 100b+100w refs) - 1536x2816 (existing, 88 watermarked refs) - 1344x768 (new, 154b+364w refs, top carrier coherence 0.946) - 864x1184 (new, 268b+295w refs, top carrier coherence 0.979) Key findings at new resolutions: - 1344x768 carriers sit on the vertical axis (fy=0, fx=3..10) - 864x1184 carriers are at mid-frequency diagonals (13,-16), (23,-24) - Both show distinct carrier structures vs existing profiles, confirming resolution-dependent embedding Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 05:37:11 +00:00
Ben Ren	22d7acc575	Add 4:3 (864x1184) reference images from Gemini API 268 black + 281 white pure-color reference images at 864x1184 (4:3 landscape aspect ratio). Generated via gemini-2.5-flash-image model. This complements the 9:16 portrait images and covers the classic photo aspect ratio commonly used in Gemini outputs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 05:36:58 +00:00
Ben Ren	dd04dbef41	Add 9:16 (1344x768) reference images from Gemini API 154 black + 359 white pure-color reference images at 1344x768 (9:16 portrait aspect ratio). Generated via gemini-2.5-flash-image model. This resolution is not covered by the existing codebook and is one of the most common mobile Gemini output formats. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 05:36:49 +00:00
Ben Ren	52133eadde	Add reference image generation script and .env to gitignore generate_references.py automates generating pure-black and pure-white reference images via the Gemini API at multiple aspect ratios (9:16, 4:3). Includes rate-limit retry logic and per-resolution output directories. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 05:36:42 +00:00
Alosh Denny	ce1b20490e	Update README with PitchHut link and badges Added a link to PitchHut and updated badges.	2026-04-10 10:29:46 +05:30
Alosh Denny	d8afdfd95c	Add maintainer section with contact details Added maintainer contact information for Alosh Denny.	2026-04-10 09:36:19 +05:30
Alosh Denny	aa709bc9f4	Merge pull request #13 from hobostay/fix/multiple-bugs Fix multiple bugs in extraction pipeline	2026-04-10 09:32:15 +05:30
Test User	49a7cfc9d8	Fix multiple bugs in extraction pipeline 1. bypass_v2() ignores iterations parameter — the function accepted `iterations` but ran the transform pipeline only once. Now properly loops, with diminishing strength on subsequent iterations. 2. denoise_bilateral() has identical if/else branches — both 2D and 3D cases called the same cv2.bilateralFilter(). Removed dead branch. 3. apply_noise_replacement() allows negative sigma — with passes > 5, the formula `sigma * (1 - i * 0.2)` produces negative values. Added clamping and early break. 4. Broken import paths — synthid_bypass.py and watermark_remover.py used bare module imports that fail when scripts are run from outside their directory. Added sys.path.insert like benchmark_extraction.py. 5. Misleading "Python 3.14 bug" comment — the SSIM gate was disabled with a comment blaming Python 3.14, but the real reason is that heavy multi-pass transforms naturally depress SSIM. Updated comment. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 10:49:22 +08:00
Alosh Denny	7d4dc9a11b	Invite contributors for image generation to improve detection Added a section inviting contributors to help expand the codebook by generating pure black and white images using Nano Banana Pro.	2026-04-08 16:37:21 +05:30
Alosh Denny	2d32b0fc48	spectral codebook	2026-03-28 18:58:52 +05:30
Alosh Denny	d3db4c3cd9	remove LFS tracking	2026-03-28 18:58:35 +05:30
Alosh Denny	4e6a9987bb	sparsified spectral codebook	2026-03-28 18:40:00 +05:30
Alosh Denny	9eadacaced	spectral_codebook_v3.npz	2026-03-28 18:17:48 +05:30
Alosh Denny	5463363dcb	Merge branch 'main' of https://github.com/aloshdenny/reverse-SynthID	2026-03-28 18:15:01 +05:30
Alosh Denny	a9079c74c4	updated carriers	2026-03-28 18:14:51 +05:30
Alosh Denny	c1f0fd8b58	multi-resolution carriers	2026-03-28 17:20:37 +05:30
Alosh Denny	5e43b389fe	updated carriers	2026-03-28 17:20:08 +05:30
Alosh Denny	f2ace3dead	improved dataset	2026-03-28 15:56:48 +05:30
Alosh Denny	52c62c669c	updated gitignore	2026-03-28 15:54:44 +05:30
Alosh Denny	31b0e07c46	Merge branch 'main' of https://github.com/aloshdenny/reverse-SynthID	2026-03-28 15:53:42 +05:30
Alosh Denny	8169824232	ran bypass on NB-2	2026-03-28 15:53:24 +05:30
Alosh Denny	908bf54ab9	Update repository URL in installation instructions	2026-03-06 21:20:00 +05:30
Alosh Denny	c2acc5d259	updated refs	2026-02-15 18:14:55 +05:30
Alosh Denny	091a56761f	updated images	2026-02-15 18:13:58 +05:30
Alosh Denny	4c95814928	updated images	2026-02-15 18:10:29 +05:30
Alosh Denny	25483c159a	v3	2026-02-15 18:04:20 +05:30
Alosh Denny	ad79ba532f	fix cv2.resize scale bug	2026-02-15 17:54:52 +05:30
Alosh Denny	e02b4a11da	watermark investigation	2025-12-16 16:34:03 +05:30
Alosh Denny	21b094474e	codebook analysis	2025-12-16 09:18:47 +05:30
Alosh Denny	01d2b45dd4	codebook analysis	2025-12-15 22:11:23 +05:30

50 Commits