China-served generators embed an XMP <TC260:AIGC>{"Label":"1",...} block
(China's mandatory AI-content labeling, TC260 standard). Doubao (ByteDance)
uses it -- verified on the real #13 sample. It's none of C2PA / SynthID /
imwatermark / IPTC, so identify() previously returned unknown.
- metadata: AIGC_MARKERS + aigc_label() (json-decodes the HTML-entity-encoded
block); has_ai_metadata + get_ai_metadata now surface it.
- identify: new 'aigc' signal -> is_ai True, platform 'China AIGC-labeled
generator (TC260; e.g. Doubao)', carries the ContentProducer code.
- Container-agnostic raw-byte scan, so it covers the whole China-AIGC ecosystem
(Jimeng/Kling/Qwen/Ernie share the standard).
- Tests: synthetic TC260 block (metadata + identify). Docs updated.
Addresses #13.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Remove-AI-Watermarks
Remove visible and invisible AI watermarks from images generated by Google Gemini (Nano Banana), ChatGPT / DALL-E, Stable Diffusion, Adobe Firefly, Midjourney, and other AI models.
Strips SynthID, C2PA Content Credentials, EXIF/XMP "Made with AI" labels, and visible sparkle overlays — all in one command.
Features
- Visible watermark removal — Gemini / Nano Banana sparkle logo via reverse alpha blending (fast, offline, deterministic)
- Invisible watermark removal — SynthID, StableSignature, TreeRing via diffusion-based regeneration
- AI metadata stripping — EXIF, PNG text chunks, C2PA provenance manifests (PNG / JPEG / AVIF / HEIF / JPEG-XL), XMP DigitalSourceType
- "Made with AI" label removal — removes the metadata that triggers AI labels on Instagram, Facebook, X (Twitter)
- Analog Humanizer — film grain and chromatic aberration to bypass AI image classifiers
- Smart Face Protection — automatic extraction and blending of human faces to prevent AI distortion
- Batch processing — process entire directories
- Detection — three-stage NCC watermark detection with confidence scoring
- Provenance detection (identify) — aggregates C2PA issuer, IPTC "Made with AI", embedded SD/ComfyUI params, EXIF/XMP generator tags, the SynthID metadata proxy, the visible sparkle, and the open SD/SDXL/FLUX invisible watermark into one origin-platform + watermark-inventory verdict (--json for machine output)
Try it online — don't want to install anything? Use raiw.cc, a free web service powered by this library.
Examples
[Images: before (watermarked) and after (cleaned)]
Supported models
| AI model | Visible watermark | Invisible watermark | Metadata | Our approach |
|---|---|---|---|---|
| Google Gemini / Nano Banana / Gemini 3 Pro | ✅ Sparkle logo | ✅ SynthID v1 + v2 (default SDXL pipeline, native resolution) | ✅ C2PA + EXIF | Alpha reversal + diffusion + metadata strip |
| OpenAI DALL-E 3 / ChatGPT | — | — | ✅ C2PA manifest | Metadata strip |
| OpenAI ChatGPT Images 2.0 (gpt-image-2) | — | ✅ SynthID + content-specific pixel watermark (since May 2026; no local decoder, openai.com/verify oracle) | ✅ C2PA manifest (verified) | Diffusion regeneration + metadata strip |
| Stable Diffusion / SDXL (AUTOMATIC1111, ComfyUI) | — | ✅ DWT-DCT (imwatermark — locally detectable) | ✅ PNG text chunks | Diffusion regeneration + metadata strip |
| Black Forest Labs FLUX | — | ✅ DWT-DCT (imwatermark — locally detectable) | ✅ C2PA (FLUX.2 Pro) | Diffusion regeneration + metadata strip |
| Adobe Firefly | — | — | ✅ Content Credentials (C2PA) | Metadata strip |
| Stability AI (DreamStudio / Stable Image) | — | — | ✅ C2PA ("Stability AI Ltd") | Metadata strip |
| Microsoft Designer / Bing Image Creator | — | ✅ SynthID via DALL-E backend (Designer) | ✅ C2PA (Bing runs MAI-Image, signed "Microsoft") | Metadata strip |
| Midjourney | — | — | ✅ EXIF + XMP (prompt, model, seed) | Metadata strip |
| Meta AI | — | — | ✅ IPTC "Made with AI" (digitalSourceType) | Metadata strip (removes the label) |
| Doubao (ByteDance) / China AIGC generators | — | — | ✅ TC260 <TC260:AIGC> XMP label (China's mandatory AI labeling) | Metadata strip |
| StableSignature (Meta) | — | ✅ In-model watermark | — | Diffusion regeneration |
| TreeRing | — | ✅ Latent space watermark | — | Diffusion regeneration |
Visible watermarks (logo overlays) are currently used only by Google Gemini / Nano Banana. Other services rely on invisible watermarks and/or metadata. Our diffusion-based regeneration works against any invisible watermark in pixel or frequency domain.
Detection: remove-ai-watermarks identify <image> reports the origin platform and watermark inventory for all the signals above — C2PA issuer, IPTC "Made with AI", the China TC260 AIGC label, embedded generation params, EXIF/XMP generator tags, the SynthID metadata proxy, the visible sparkle, and (with the [detect] extra) the open SD/SDXL/FLUX invisible watermark. The SynthID pixel watermark has no local decoder, so it is reported as a metadata proxy only.
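The China TC260 AIGC label listed above lends itself to a container-agnostic byte scan. The sketch below is a minimal illustration, not the project's aigc_label() implementation: the function name, the 4096-byte window, the closing tag in the sample, and the sample field values are all assumptions. It looks for the `<TC260:AIGC>` marker in raw bytes, undoes HTML-entity encoding, and parses the JSON object that follows.

```python
import html
import json

AIGC_MARKER = b"<TC260:AIGC>"

def find_aigc_label(raw):
    """Return the decoded TC260 AIGC dict from raw file bytes, or None."""
    start = raw.find(AIGC_MARKER)
    if start == -1:
        return None
    # Take a bounded window after the marker; the 4096-byte limit is arbitrary.
    begin = start + len(AIGC_MARKER)
    window = raw[begin:begin + 4096]
    text = html.unescape(window.decode("utf-8", errors="replace"))
    brace = text.find("{")
    if brace == -1:
        return None
    try:
        # raw_decode parses one JSON value and ignores whatever follows it.
        label, _ = json.JSONDecoder().raw_decode(text[brace:])
    except json.JSONDecodeError:
        return None
    return label if isinstance(label, dict) else None

# Hypothetical sample: entity-encoded JSON embedded in otherwise opaque bytes.
sample = (
    b"\xff\xd8 binary noise "
    b"<TC260:AIGC>{&quot;Label&quot;:&quot;1&quot;,"
    b"&quot;ContentProducer&quot;:&quot;example-code&quot;}</TC260:AIGC>"
    b" more bytes"
)
label = find_aigc_label(sample)
```

Because the scan works on raw bytes rather than a parsed container, the same function applies to PNG, JPEG, or any other file that carries the XMP packet uncompressed.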
How it works
Removing the Gemini / Nano Banana sparkle watermark
Google Gemini (internally codenamed Nano Banana) adds a visible sparkle logo to generated images using alpha blending:
watermarked = α × logo + (1 − α) × original
We reverse this with a known alpha map (extracted from Gemini / Nano Banana output on a pure-black background):
original = (watermarked − α × logo) / (1 − α)
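A minimal NumPy sketch of the two formulas above. The function names, the float [0, 1] value range, and the clamp on α are assumptions made for illustration; this is not the project's GeminiEngine code.

```python
import numpy as np

def blend_logo(original, logo, alpha):
    """Forward model: watermarked = alpha * logo + (1 - alpha) * original."""
    return alpha * logo + (1.0 - alpha) * original

def reverse_alpha_blend(watermarked, logo, alpha, max_alpha=0.99):
    """Invert the blend: original = (watermarked - alpha * logo) / (1 - alpha)."""
    # Clamp alpha so the division stays finite where the logo is nearly opaque
    # (the 0.99 ceiling is an illustrative choice).
    a = np.clip(alpha, 0.0, max_alpha)
    restored = (watermarked - a * logo) / (1.0 - a)
    return np.clip(restored, 0.0, 1.0)

rng = np.random.default_rng(0)
original = rng.random((8, 8, 3))     # stand-in image, floats in [0, 1]
logo = np.ones((8, 8, 3))            # white logo
alpha = np.zeros((8, 8, 1))
alpha[2:6, 2:6] = 0.4                # logo covers the centre at 40% opacity

watermarked = blend_logo(original, logo, alpha)
restored = reverse_alpha_blend(watermarked, logo, alpha)
max_error = float(np.abs(restored - original).max())
```

With floating-point inputs the round trip is exact up to rounding error. On real 8-bit images the quantization of the watermarked pixels leaves a small residual, largest where α is high.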
A three-stage NCC (Normalized Cross-Correlation) detector finds the watermark position and scale dynamically, so it works even if the image was resized or cropped. After removal, residual sparkle-edge artifacts are cleaned via gradient-masked inpainting.
Speed: ~0.05s per image. No GPU needed.
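The normalized cross-correlation score behind the detector can be sketched as follows. This shows a single NCC comparison plus a brute-force position search only; the three-stage structure and the scale search of the actual detector are not reproduced, and all names are hypothetical.

```python
import numpy as np

def ncc(patch, template):
    """Normalized cross-correlation of two equally sized arrays, in [-1, 1]."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    if denom == 0.0:          # flat patch or template: correlation undefined
        return 0.0
    return float((p * t).sum() / denom)

def best_match(image, template):
    """Slide the template over every position and return (score, row, col)."""
    th, tw = template.shape
    best = (-1.0, 0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            score = ncc(image[r:r + th, c:c + tw], template)
            if score > best[0]:
                best = (score, r, c)
    return best

rng = np.random.default_rng(1)
template = rng.random((5, 5))
image = rng.random((32, 32)) * 0.05
# Embed the template with a different gain and brightness offset.
image[20:25, 7:12] += 0.5 * template + 0.1
score, row, col = best_match(image, template)
```

Subtracting the means and dividing by the norms makes the score insensitive to brightness and contrast changes, which is why the embedded copy is still found after being scaled and offset.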
Removing SynthID and other invisible watermarks
Google embeds SynthID into every image generated by Gemini / Nano Banana. Other AI services use StableSignature, TreeRing, and similar schemes. These imperceptible frequency-domain patterns survive cropping, resizing, and JPEG compression.
The removal pipeline (default profile, SDXL):
image → encode to latent space (VAE) at native resolution
→ add controlled noise (forward diffusion)
→ denoise (reverse diffusion, ~50 steps at strength 0.05)
→ decode back to pixels (VAE)
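How much denoising "~50 steps at strength 0.05" amounts to depends on the scheduler convention. The sketch below assumes the convention used by Hugging Face diffusers img2img pipelines, where only the last int(steps × strength) scheduler steps are executed; whether this project's pipeline follows that convention exactly is an assumption, and the function name is hypothetical.

```python
def effective_denoise_steps(num_inference_steps: int, strength: float) -> int:
    """Steps actually executed by a diffusers-style img2img schedule."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    # Only the tail of the schedule is run: the input is noised to an
    # intermediate timestep rather than to pure noise.
    return min(int(num_inference_steps * strength), num_inference_steps)

steps_default = effective_denoise_steps(50, 0.05)   # low strength: light touch
steps_full = effective_denoise_steps(50, 1.0)       # full regeneration
```

Under that assumption the default profile runs only a couple of denoising steps, which is consistent with the goal of disturbing the watermark while leaving visible content nearly unchanged.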
By default the image is processed at its native resolution with no pre-downscale, matching the hosted raiw.cc backend (fal fast-sdxl, which is stabilityai/stable-diffusion-xl-base-1.0 — the same checkpoint the CLI defaults to). At strength ~0.05 SDXL img2img does not need the input shrunk, and the old forced downscale-to-1024 then upscale-back round-trip was the main quality loss. Pass --max-resolution N to cap the long side only when a very large image runs out of GPU/MPS memory (it reintroduces that lossy round-trip).
SDXL has been the default since May 2026: it empirically defeats SynthID v2 on Gemini 3 Pro outputs, where the older SD-1.5 pipeline at 768 px did not. The SD-1.5 path was removed once it was verified not to handle v2.
Face Protection: before diffusion, YOLO detects people in the image and extracts them. After diffusion, the original faces are blended back with a soft elliptical mask to prevent AI distortion of facial features.
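The blend-back step can be illustrated with a soft elliptical mask. This is a sketch under stated assumptions: the box coordinates would come from the person detector (not included here), the linear falloff and the `softness` parameter are illustrative choices, and the function names are hypothetical.

```python
import numpy as np

def soft_ellipse_mask(height, width, softness=0.3):
    """Mask that is 1 near the box centre and fades to 0 at the ellipse edge."""
    ys = np.linspace(-1.0, 1.0, height)[:, None]
    xs = np.linspace(-1.0, 1.0, width)[None, :]
    radius = np.sqrt(xs**2 + ys**2)      # equals 1.0 on the inscribed ellipse
    # Linear falloff over the outer `softness` fraction of the radius.
    return np.clip((1.0 - radius) / softness, 0.0, 1.0)

def blend_face_back(processed, original, box, softness=0.3):
    """Paste the original region over the processed image with a soft edge."""
    top, left, bottom, right = box
    mask = soft_ellipse_mask(bottom - top, right - left, softness)[..., None]
    out = processed.astype(np.float64).copy()
    region = out[top:bottom, left:right]
    source = original[top:bottom, left:right]
    out[top:bottom, left:right] = mask * source + (1.0 - mask) * region
    return out

original = np.full((64, 64, 3), 0.8)     # stand-in for the pre-diffusion image
processed = np.full((64, 64, 3), 0.2)    # stand-in for the diffusion output
result = blend_face_back(processed, original, box=(16, 16, 48, 48))
```

The centre of the box keeps the original pixels, the corners keep the diffusion output, and the ring in between is a weighted mix, so no hard seam appears around the face.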
Analog Humanizer: optional film grain and chromatic aberration injection that makes the output indistinguishable from a photo of a screen, defeating AI-generated image classifiers.
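Film grain and chromatic aberration can each be approximated in a few lines. The sketch below is not the project's humanizer: it assumes float RGB images in [0, 1], Gaussian grain, and a whole-pixel horizontal channel shift, and its parameters are unrelated to the CLI's --humanize value.

```python
import numpy as np

def add_film_grain(image, strength=0.02, seed=None):
    """Add zero-mean Gaussian luminance noise (same value on all channels)."""
    rng = np.random.default_rng(seed)
    grain = rng.normal(0.0, strength, size=image.shape[:2])[..., None]
    return np.clip(image + grain, 0.0, 1.0)

def add_chromatic_aberration(image, shift=1):
    """Shift red and blue channels horizontally in opposite directions."""
    out = image.copy()
    # np.roll wraps around at the borders; a real implementation would
    # more likely pad or crop the edges instead.
    out[..., 0] = np.roll(image[..., 0], shift, axis=1)    # red to the right
    out[..., 2] = np.roll(image[..., 2], -shift, axis=1)   # blue to the left
    return out

image = np.zeros((4, 6, 3))
image[:, 3, :] = 1.0                     # a single white vertical line
fringed = add_chromatic_aberration(image, shift=1)
grainy = add_film_grain(image, strength=0.02, seed=0)
```

After the shift the white line splits into a red fringe on one side and a blue fringe on the other, with the green channel left in place.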
Stripping C2PA, EXIF, and "Made with AI" metadata
AI tools embed generation metadata that social platforms use to show "Made with AI" labels:
- EXIF tags — prompt, seed, model hash, sampler settings (Stable Diffusion, Midjourney)
- XMP DigitalSourceType — trainedAlgorithmicMedia tag used by Instagram, Facebook, and X (Twitter) to show "Made with AI"
- PNG text chunks — ComfyUI workflows, AUTOMATIC1111 parameters
- C2PA Content Credentials — cryptographic provenance manifests from Google Imagen, OpenAI DALL-E, Adobe Firefly
The cleaner parses each layer, removes AI-related fields, and preserves standard metadata (Author, Copyright, Title).
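For the PNG text-chunk layer, selective removal can be sketched with the standard library alone. This is an illustration of the idea, not the project's cleaner: the keyword list is an assumption ("parameters" is the key AUTOMATIC1111 commonly writes, "prompt" and "workflow" the ones ComfyUI writes), and the function names are hypothetical.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
TEXT_CHUNK_TYPES = {b"tEXt", b"zTXt", b"iTXt"}
AI_KEYWORDS = {b"parameters", b"prompt", b"workflow"}   # illustrative list

def iter_chunks(data):
    """Yield (type, payload) for each chunk of a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos < len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length        # length + type + payload + CRC

def make_chunk(ctype, payload):
    """Serialize one chunk, recomputing its CRC over type + payload."""
    crc = zlib.crc32(ctype + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + ctype + payload + struct.pack(">I", crc)

def strip_ai_text_chunks(data):
    """Drop text chunks whose keyword is in AI_KEYWORDS; keep everything else."""
    kept = [PNG_SIGNATURE]
    for ctype, payload in iter_chunks(data):
        keyword = payload.split(b"\x00", 1)[0]
        if ctype in TEXT_CHUNK_TYPES and keyword in AI_KEYWORDS:
            continue
        kept.append(make_chunk(ctype, payload))
    return b"".join(kept)

# Build a tiny 1x1 grayscale PNG with one AI chunk and one ordinary chunk.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
sample = PNG_SIGNATURE + b"".join([
    make_chunk(b"IHDR", ihdr),
    make_chunk(b"tEXt", b"parameters\x00a cat, Steps: 20"),
    make_chunk(b"tEXt", b"Author\x00Jane"),
    make_chunk(b"IDAT", zlib.compress(b"\x00\x00")),
    make_chunk(b"IEND", b""),
])
cleaned = strip_ai_text_chunks(sample)
remaining = [ctype for ctype, _ in iter_chunks(cleaned)]
```

Filtering by keyword rather than by chunk type is what lets standard fields such as Author survive while generation parameters are dropped.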
Installation
Recommended
Install as an isolated CLI tool — no need to manage virtual environments:
# Using pipx (https://pipx.pypa.io)
pipx install git+https://github.com/wiltodelta/remove-ai-watermarks.git
# Or using uv (https://docs.astral.sh/uv)
uv tool install git+https://github.com/wiltodelta/remove-ai-watermarks.git
To update to the latest version:
pipx upgrade remove-ai-watermarks
# or
uv tool upgrade remove-ai-watermarks
Install from repository
Prerequisites: Python 3.10+ and pip (or uv).
# 1. Clone the repository
git clone https://github.com/wiltodelta/remove-ai-watermarks.git
cd remove-ai-watermarks
# 2. Install the package in editable mode
pip install -e .
# Or, if you use uv:
uv pip install -e .
After installation the remove-ai-watermarks command is available system-wide.
Note: The base install covers visible watermark removal and metadata stripping. For invisible watermark removal (SynthID etc.), install GPU dependencies:
pip install -e ".[gpu]" # or: uv pip install -e ".[gpu]"
To let identify decode the open Stable Diffusion / SDXL / FLUX invisible watermarks, install the detect extra (adds the invisible-watermark decoder):
pip install -e ".[detect]" # or: uv pip install -e ".[detect]"
Invisible watermark removal
Invisible removal runs a diffusion model and needs a GPU for reasonable speed.
# On first run, the model (~2 GB) will be downloaded automatically.
# Device is auto-detected: CUDA (Linux/Windows) > MPS (macOS) > CPU.
# To force a device: --device cuda / --device mps / --device cpu
# Optional: set a HuggingFace token for gated/private models
cp .env.example .env
# Edit .env and set HF_TOKEN=hf_your_token_here
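The CUDA > MPS > CPU preference described above can be expressed in a few lines of PyTorch. This is a sketch of the ordering only; the function name is hypothetical and the tool's own detection code may differ.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu", preferring them in that order."""
    try:
        import torch
    except ImportError:
        return "cpu"          # without PyTorch only the CPU path is possible
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend module, hence the getattr.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

device = pick_device()
```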
Developer setup
# Install with dev dependencies (pytest, ruff, pyright)
pip install -e ".[dev]"
# Or with uv:
uv pip install -e ".[dev]"
# Run tests
pytest
# Run linters
./maintain.sh
Usage
CLI
# Remove all watermarks from a single image (visible + invisible + metadata)
remove-ai-watermarks all image.png -o clean.png
# Process an entire directory
remove-ai-watermarks batch ./images/ --mode all
Individual commands
# Identify provenance: where an image was made + its watermark inventory.
# Aggregates C2PA, IPTC "Made with AI", embedded SD/ComfyUI params, EXIF/XMP
# generator tags (incl. inside AVIF/HEIF), the SynthID proxy, the visible Gemini
# sparkle, and (with the [detect] extra) the open SD/SDXL/FLUX invisible
# watermark into one verdict. Reports "unknown"
# (never "clean") when no signal is found, since stripped metadata is not proof
# of a clean origin. Add --json for machine-readable output.
remove-ai-watermarks identify image.png
# Visible watermark only (Gemini / Nano Banana sparkle) — fast, offline
remove-ai-watermarks visible image.png -o clean.png
# Invisible watermark only (SynthID etc.) — requires GPU
remove-ai-watermarks invisible image.png -o clean.png --humanize 4.0
# Runs at native resolution by default. On a very large image that OOMs the
# GPU/MPS, cap the long side: --max-resolution 2048
# Check / strip AI metadata (C2PA, EXIF, "Made with AI" labels)
# --check also flags SynthID-bearing sources: a C2PA manifest signed by
# Google or OpenAI implies an invisible SynthID watermark in the pixels
# (both vendors pair the two). Adobe Firefly / Microsoft sign C2PA without
# SynthID, so they are reported as C2PA only.
remove-ai-watermarks metadata image.png --check
remove-ai-watermarks metadata image.png --remove
# Batch with a specific mode
remove-ai-watermarks batch ./images/ --mode visible
Python API
from remove_ai_watermarks.gemini_engine import GeminiEngine
import cv2
engine = GeminiEngine()
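image = cv2.imread("watermarked.png")
if image is None:  # cv2.imread returns None instead of raising on failure
    raise FileNotFoundError("watermarked.png")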
# Detect
result = engine.detect_watermark(image)
print(f"Detected: {result.detected} (confidence: {result.confidence:.1%})")
# Remove
clean = engine.remove_watermark(image)
cv2.imwrite("clean.png", clean)
Metadata stripping
from remove_ai_watermarks.metadata import has_ai_metadata, remove_ai_metadata
from pathlib import Path
if has_ai_metadata(Path("image.png")):
    remove_ai_metadata(Path("image.png"), Path("clean.png"))
Requirements
- Python ≥ 3.10
- Visible removal / metadata: CPU only, no GPU required
- Invisible removal: GPU recommended (CUDA or MPS), works on CPU (slow)
Troubleshooting
SSL certificate error (CERTIFICATE_VERIFY_FAILED):
# Install certifi (the tool auto-detects it)
pip install certifi
# macOS only: run the Python certificate installer
/Applications/Python\ 3.*/Install\ Certificates.command
First run is slow — this is expected. The tool downloads model weights (~2 GB) on first launch. Subsequent runs use cached models.
Credits
- noai-watermark by mertizci — invisible watermark removal engine
- GeminiWatermarkTool by Allen Kuo (MIT) — visible watermark removal algorithm
- CtrlRegen by Liu et al. (ICLR 2025) — controllable regeneration pipeline
- NeuralBleach (MIT) — analog humanizer technique
Roadmap
Tracked but not yet implemented:
- SynthID-Image v2 automated regression test. The default SDXL profile defeats v2 per manual checks against the Gemini app's "Verify with SynthID" feature on a Gemini 3 Pro output (May 2026). An automated end-to-end test would need either programmatic access to the SynthID Detector portal (waitlist for media professionals and researchers) or an offline surrogate detector. The spectral phase-coherence surrogate from reverse-SynthID was evaluated and does not separate watermarked from cleaned real-content images (it only fires on controlled solid-color references at exact resolution), so it is not a usable oracle. Open.
- Local SynthID pixel detector. Not feasible today: Google's decoder is proprietary, and magnitude/carrier spectral methods do not separate real content (confirmed by three independent evaluations, including a from-scratch gpt-image pilot; see CLAUDE.md). Blocked on either (a) a programmatic generation path (OpenAI / Gemini API) to build a per-(model, resolution) labeled corpus at scale, or (b) a raw watermarked-output dataset. If data arrives, the next approach to try is a learned classifier on diverse content rather than a fixed carrier codebook.
- Grow the SynthID reference corpus (data/synthid_corpus/) with oracle-labeled samples per model and resolution (Gemini app for Google, openai.com/verify for OpenAI). Prerequisite for any pixel-detector attempt and for an automated removal-regression set.
- Real non-PNG C2PA fixtures. SynthID-source detection for JPEG / WebP / AVIF is currently covered only by synthetic byte blobs; replace with real vendor-emitted files to ground the binary-scan path.
- Maintenance debt. Clear strict-pyright debt in remove_ai_metadata/cli.py (untyped piexif / PIL / click / rich) so maintain.sh can finish green. (uv-secure is already clean since idna was bumped to 3.16.)
- AVIF / HEIF / JPEG-XL detection limits. Removal strips top-level C2PA uuid and JUMBF jumb boxes. EXIF/XMP boxes inside these containers are not yet scrubbed (PNG and JPEG are fully covered).
- Video pipeline (noai-video): per-frame inpainting and tracking for Sora 2 dynamic logo, Veo 3.1 badge, Kling, Runway. Separate package, not folded into this repo.
Won't fix:
- Nightshade / Glaze / PhotoGuard removal. These are defensive perturbations used by artists to protect their work from being scraped into AI training sets. Removing them attacks artists, not AI provenance. Out of scope.
Legal
Watermarking and provenance for AI-generated content is now regulated in several jurisdictions. The table below summarises the May 2026 status. None of this is legal advice.
| Jurisdiction | Instrument | Status (May 2026) | Relevance |
|---|---|---|---|
| EU | AI Act, Article 50 | Transparency duties apply from 2 August 2026. Legacy generative systems (placed on the market before that date) get a grandfathering period to 2 December 2026 for the Article 50(2) marking duty, under the Digital Omnibus (Commission proposal Nov 2025; co-legislator political agreement 7 May 2026). Article 50 guidelines and a marking Code of Practice are being finalised through 2026. | Removing mandated provenance markers with intent to deceive may be sanctioned under national implementations. |
| US (federal) | COPIED Act | Reintroduced April 2025; not enacted (pending in the Senate). | If passed, would set NIST provenance standards and prohibit tampering with / removing provenance information. The tool itself is lawful; usage may not be. |
| US (state) | CA AB 2655, TX SB 751, similar | TX SB 751 in force; CA AB 2655 struck down by a federal court (Aug 2025, Section 230 / First Amendment). | Content-specific (election deepfakes, sexual deepfakes). Not tool-specific. |
| China | Measures for Labeling AI-Generated Content (+ GB 45438-2025) | In force since 1 September 2025. | Mandatory explicit (visible) + implicit (metadata) labels for AI content; tampering with or removing labels is prohibited. |
| UK | Online Safety Act 2023 / Ofcom guidance | In force, but no statutory AI-provenance or watermarking obligation. | Ofcom encourages watermarking / provenance metadata as voluntary "attribution measures"; platform duties, not user obligations. |
Threat model
This tool defends already-distributed AI imagery against automatic detection systems (social-platform "Made with AI" labels, third-party classifiers, content-policy filters). It does not retroactively anonymise generation.
In particular, SynthID (Google DeepMind) is embedded across Google's generative media stack — Imagen (images), Veo (video), Lyria (audio) — and Gemini app image outputs (Nano Banana / Gemini 3 Pro, which we verified positive via the Gemini app's SynthID oracle); Google reported over 10 billion items watermarked by December 2025. It carries a multi-bit payload — the research paper's SynthID-O variant encodes 136-bit payloads in 512x512 images (arxiv 2510.09263). The payload is believed to encode a user / session identifier. If the original watermarked file ever passed through a system controlled by the prompt originator (a saved Gemini account history, a screenshot uploaded to a Google product, a backup), Google retains the ability to link that original to the generating account. Stripping the watermark from a copy you possess does not erase Google's server-side record.
Use cases where the threat model fits:
- You generated the image yourself, want to publish it as your own work, and accept the consequences if Google ever publishes their detector logs.
- You are running a security / robustness evaluation.
- You are preserving art or historical record against false-positive "AI-generated" labels.
Use cases where the threat model does not fit:
- Generating an image, expecting that removing the watermark anonymises you to Google. It doesn't.
- Distributing AI-generated content while claiming human authorship. The watermark is one of several traceability layers.
This tool is intended for legitimate purposes such as:
- Privacy protection (removing metadata that leaks user account identifiers).
- Art preservation and fair-use research.
- Removing false-positive "Made with AI" labels from human-edited photographs.
- Security research and watermark robustness study.
Removing AI provenance markers to misrepresent AI-generated content as human-created may violate the laws above, the DMCA, and platform terms of service. Users are solely responsible for ensuring their use complies with all applicable laws. The authors do not condone use of this tool for deception, fraud, or any activity that violates applicable laws or regulations.
License
MIT

