remove-ai-watermarks

mirror of https://github.com/wiltodelta/remove-ai-watermarks.git synced 2026-07-28 18:28:49 +02:00

Author	SHA1	Message	Date
test-user and Claude Opus 4.7	fa104bcade	feat(identify): provenance command (platform + watermark inventory) New 'identify' command and identify.py module: upload an image, get one ProvenanceReport answering where it was made and what watermarks it carries. Aggregates every locally-readable signal: - C2PA Content Credentials -> generating platform (issuer + generator). - IPTC digitalSourceType 'Made with AI' (Meta and others). - Embedded SD/ComfyUI generation parameters (local pipelines). - SynthID metadata proxy (Google / OpenAI C2PA companion). - Visible Gemini sparkle (cv2 fallback for the stripped-metadata case), promoted only at confidence >= 0.5 (corpus-tuned: Gemini sparkles score >= 0.56, non-sparkle <= 0.49). is_ai_generated is True or None, never asserted False -- stripped metadata leaves no local proof of a clean origin, so absence of signals is reported as 'unknown' with an explicit caveat. The SynthID pixel watermark remains locally undecodable; the report says so. Non-PNG containers (JPEG/WebP/AVIF/HEIF/JXL) get the same issuer + generator attribution via a binary scan (the caBX parser is PNG-only). The cv2 dependency is isolated in gemini_engine.detect_sparkle_confidence so identify.py stays type-clean. CLI supports --json and --no-visible. Validated against the 109-image corpus: 14/14 positives flagged AI, 93/94 negatives clean (the one 'neg' flagged is a Meta image that genuinely carries the IPTC tag -- correct), zero true errors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 16:19:26 -07:00
test-user and Claude Opus 4.7	f36320ff39	fix(metadata): guard get_ai_metadata PIL open against non-OSError get_ai_metadata opened the file with PIL unguarded, so a HEIC (or any format PIL can't open without optional plugins) raised UnidentifiedImageError instead of falling through to the binary scan -- unlike has_ai_metadata, which already guards. Wrap the open in except Exception and continue to the C2PA/IPTC path. Regression test feeds an unopenable .heic shell and asserts no raise. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 16:19:15 -07:00
test-user and Claude Opus 4.7	6cef1d59f0	fix(c2pa): drop non-printable claim_generator garbage On some manifests (observed: Microsoft Designer) the first CBOR "name" key precedes a binary hash field, not the generator string, so _cbor_text_after returns control-char garbage. Guard with isprintable() to drop it; issuer detection (byte-search) and the SynthID verdict are unaffected. Adds TestParseChunkGuards covering kept-vs-dropped cases. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 15:55:07 -07:00
test-user and Claude Opus 4.7	f07ce10c72	feat(metadata): SynthID-source detection, C2PA parser consolidation, corpus + tests Detect SynthID-bearing images via their C2PA companion: a manifest signed by a SynthID-using vendor (Google/OpenAI) on AI-generated content implies an invisible SynthID pixel watermark. Verified end-to-end against the vendor oracles (openai.com/verify, Gemini "Verify with SynthID"). - metadata: synthid_source() + synthid_watermark verdict in get_ai_metadata, surfaced as a `metadata --check` callout. Format-agnostic (PNG caBX parser + JPEG/WebP/AVIF/HEIF/JXL binary scan). - constants: SYNTHID_C2PA_ISSUERS {Google, OpenAI}; +opened/placed actions. - c2pa: single CBOR-aware parser (_cbor_text_after) replaces glitchy regex (fixes fGPT-4o claim_generator); removed duplicate _scan_png_c2pa_chunk from metadata; shared synthid_verdict / synthid_vendors_in helpers. - corpus: scripts/synthid_corpus.py ingest tool + data/synthid_corpus/ (manifest tracked, images gitignored) for a labeled reference set. - tests: +38 across C2PA parser internals, extract/inject round-trip, ISOBMFF container stripping, all IPTC AI markers, and invisible watermark strength tiers (SynthID/StableSignature/TreeRing/StegaStamp/RingID/RivaGAN/...). Pixel-level SynthID detection remains out of reach locally (Google's decoder is proprietary); a from-scratch spectral pilot confirmed it does not separate real content. See CLAUDE.md for the full evaluation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 11:32:46 -07:00
test-user and Claude Opus 4.7	578e229713	style(cli): fix closing paren indentation in cmd_batch Whitespace-only ruff format alignment, no functional change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 13:58:35 -07:00
test-user and Claude Opus 4.7	f2fc5e09ab	feat: SDXL default; AVIF/HEIF/JPEG-XL C2PA stripping SD-1.5 dreamshaper at 768 px did not defeat SynthID v2 on Gemini 3 Pro outputs (verified May 2026 via Gemini app's "Verify with SynthID"). Switch the default invisible engine to SDXL at 1024 px, matching the raiw-app production config (strength 0.05, steps 50). Drop the SD-1.5 pipeline. Metadata layer: add C2PA UUID and IPTC AI marker byte-scan detection across all formats, plus an ISOBMFF box walker (noai/isobmff.py) that strips top-level C2PA uuid and JUMBF jumb boxes from AVIF/HEIF/JPEG-XL containers without re-encoding. README gets a Legal table and a Threat-model section about SynthID v2's 136-bit payload. CLAUDE.md tracks the SD-1.5 regression as historical context. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 12:54:37 -07:00
test-user and Claude Opus 4.7	eb1f65ae45	fix(gemini): no-op remove_watermark when nothing detected Reverse alpha blending applied at the assumed default position painted a visible inverse-sparkle artifact onto clean or edited images. The function now returns an unmodified copy when detection fails, instead of falling back to the hardcoded Gemini corner. Bump to 0.3.5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 12:17:25 -07:00
test-user and Claude Sonnet 4.6	87d02126e3	feat(metadata): parse C2PA JUMBF manifest fields, add Images 2.0 sample, bump to 0.3.4 - metadata --check now shows claim_generator, c2pa_spec, digital_source_type, c2pa_actions, signer instead of empty table for C2PA-only files - reuses existing extract_c2pa_chunk() from noai/c2pa.py -- no more duplicate PNG chunk parsing or full-file reads - adds data/samples/openai-images-2/amur-leopard.png: real gpt-image-2 output with C2PA manifest signed by OpenAI OpCo LLC / Trufo CA (spec 2.2.0) - removes stale data/samples/nano-banana-1/2.png (no longer referenced) - updates README: new Images 2.0 row in supported models table - documents known text-degradation limitation in CLAUDE.md Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-22 17:21:51 -07:00
test-user	4a7950e054	chore: bump version to 0.3.3	2026-04-01 12:07:47 -07:00
test-user and Claude Opus 4.6	7eb32fedee	refactor: enforce strict linting and type checking across codebase - Expand ruff rules (B, S, SIM, RET, COM, C4, G, PT, PIE, T20, DTZ, ICN, TCH, RUF, ANN) - Switch pyright to strict mode with relaxed test environment - Replace try-except-pass with contextlib.suppress throughout - Move type-only imports into TYPE_CHECKING blocks - Replace ambiguous Unicode chars (en dash, multiplication sign, Greek alpha) with ASCII - Move color-matcher from base deps to [gpu], remove unused requests dep - Add pyright to dev deps, update dependabot to uv ecosystem - Fix hardcoded version in test_version, unused unpacked vars in tests - Update maintain.sh, CLAUDE.md, .gitignore, .claude/settings.json - Remove obsolete .agents/rules/project.md - Upgrade all dependencies (Pygments vulnerability fix) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 11:42:42 -07:00
test-user	47caacf5dc	chore: bump version to 0.3.2	2026-03-26 20:31:54 -07:00
test-user	aa3b09071d	refactor: improve code quality from review - cli: log metadata strip failures to verbose instead of swallowing - cli: extract _process_batch_image() from cmd_batch for readability - cli: reuse module-level SUPPORTED_FORMATS constant in batch command - metadata: limit C2PA binary scan to first 512KB to prevent OOM	2026-03-26 20:31:27 -07:00
test-user	b41c8e5aba	v0.3.1: Fix opencv conflict, graceful GPU fallback, correct docs - Remove opencv-python from [gpu] extra (conflicts with headless in base deps) - Add graceful fallback in 'invisible' and 'all' commands when GPU deps missing - Cache InvisibleEngine in batch mode (avoid reloading model per image) - Fix --humanize help text (was '0.0-1.0', actual range is 0-6.0+) - Fix stale docstring referencing non-existent [invisible] extra - Add [gpu] extra install instructions to README - Fix broken NeuralBleach placeholder URL in Credits	2026-03-26 10:50:26 -07:00
test-user	2bdc4bceff	Bump version to 0.3.0	2026-03-25 17:27:39 -07:00
test-user	f574929cd9	Fix downscale message: mention model training resolution	2026-03-25 12:33:58 -07:00
test-user	507757738e	v0.2.2: Unify quality defaults, improve README - Unify 'all' defaults to match 'invisible' (strength=0.02, steps=100) - Reorder CLI docs: 'all' command first, individual commands second - HuggingFace token is now documented as optional - Remove 'additional setup' label from invisible section	2026-03-25 12:28:02 -07:00
test-user	2152ebcd32	v0.2.1: Code review fixes, platform-neutral docs - Fix f-string logging → %-style (face_protector, invisible_engine) - Fix logger name: hardcoded string → __name__ - Add module docstrings to humanizer.py, face_protector.py - Break long warning string into multiple lines (PEP 8) - Make docs platform-neutral (macOS/Linux/Windows) - Rename 'optional' → 'additional setup' in README	2026-03-25 12:19:29 -07:00
test-user	cace97b04e	Bump version to 0.2.0 Changes since 0.1.0: - Fix phantom model param bug in invisible/all commands - Fix macOS SSL certificate issue for YOLO downloads - Use temp file in 'all' pipeline to hide intermediate output - Add legal disclaimer and fix license attribution - Add troubleshooting and upgrade docs to README - Expand test suite to 137 tests covering all CLI modes - Clean up dependencies and pyright config	2026-03-25 12:03:44 -07:00
test-user	9c65206806	Use temp file for 'all' pipeline to hide intermediate output The output file now only appears when the full pipeline completes, preventing user confusion during long model downloads.	2026-03-25 11:59:42 -07:00
test-user	1a3d2a448e	Fix macOS SSL cert issue, add troubleshooting and upgrade docs - Add SSL certificate auto-fix in FaceProtector (certifi) - Add Troubleshooting section to README (SSL, first-run downloads) - Add upgrade instructions for pipx/uv tool users	2026-03-25 11:53:02 -07:00
test-user	d7614a7b45	Add legal disclaimer, fix attribution, expand credits - Add disclaimer section to README (research/education purposes) - Remove incorrect Apache-2.0 license claim from ctrlregen docstrings - Expand Credits with CtrlRegen and NeuralBleach attribution - Add license info (MIT) for GeminiWatermarkTool and NeuralBleach	2026-03-25 11:23:28 -07:00
test-user	e5d8970add	Add project files, tests, and documentation for GitHub release - CLI with visible, invisible, all, metadata, and batch commands - Gemini watermark removal via reverse alpha blending - Invisible watermark removal via diffusion regeneration (SynthID, TreeRing) - AI metadata stripping (EXIF, PNG text, C2PA) - Face protection (YOLO/Haar) and analog humanizer - 137 tests covering all CLI modes and core engines - Ruff and Pyright clean	2026-03-25 11:15:05 -07:00
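
The tri-state verdict described in commit fa104bcade (is_ai_generated is True or None, never False, with the visible sparkle promoted only at confidence >= 0.5) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the project's identify.py: the names ProvenanceSketch, summarize, and SPARKLE_THRESHOLD are hypothetical, and only the 0.5 threshold and the True-or-None rule are taken from the commit message.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Threshold quoted in commit fa104bcade; the constant name is hypothetical.
SPARKLE_THRESHOLD = 0.5


@dataclass
class ProvenanceSketch:
    """Hypothetical stand-in for the report described in the commit."""

    is_ai_generated: bool | None  # True or None, never False
    signals: list[str] = field(default_factory=list)
    caveat: str | None = None


def summarize(
    metadata_signals: list[str],
    sparkle_confidence: float | None = None,
) -> ProvenanceSketch:
    """Combine metadata signals with an optional visible-sparkle score."""
    signals = list(metadata_signals)
    # The visible sparkle counts as a signal only at or above the threshold.
    if sparkle_confidence is not None and sparkle_confidence >= SPARKLE_THRESHOLD:
        signals.append("visible_gemini_sparkle")
    if signals:
        return ProvenanceSketch(is_ai_generated=True, signals=signals)
    # No signals: stripped metadata cannot prove a clean origin, so the
    # verdict stays unknown (None) rather than False.
    return ProvenanceSketch(
        is_ai_generated=None,
        caveat="no local signals found; metadata may have been stripped",
    )
```

Under this sketch, an image with no metadata and a sparkle score of 0.49 yields an unknown verdict with a caveat, while a score of 0.56 or any metadata signal yields True.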