JGoyd/docs/PHASE-3_EMAIL_HEADER_PROOF_SYSTEM.md
2026-05-18 22:58:05 -07:00

# Phase 3 — Email Header Proof System
This phase converts a confirmation email you hold privately into a public artifact that any third party can verify cryptographically, without trusting you.

The goal is to preserve the cryptographic chain (DKIM, ARC, Received) and the institutional metadata (sender domain, case ID, acknowledgement language) while redacting anything weaponizable or personally identifying.

---
## 1 · Source the raw message
Always start from the *raw* `.eml` source, never a forwarded copy. A forward strips DKIM signatures and rewrites the Received chain.
- **Gmail / Google Workspace:** Open message → ⋮ → "Download message" → produces `.eml`.
- **Proton Mail (web):** Open message → ⋯ → "Export" → `.eml`. Verify "Show all headers" first so you can confirm DKIM/ARC are present.
- **Apple Mail:** Select the message → File → "Save As" → choose the format "Raw Message Source".
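- **Outlook desktop:** File → Save As → `.msg`, then convert to `.eml` with `msgconvert` (from the Perl module Email::Outlook::Message). A `.msg` export may not preserve the original bytes exactly, so confirm that DKIM still verifies on the converted file before relying on it.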
- **Outlook on the web:** Three-dot menu → "View" → "View message source" → copy raw text to `.eml`.
The raw file MUST contain headers including at minimum: `Received:` (multiple), `DKIM-Signature:`, `Authentication-Results:`, `Message-ID:`, `From:`, `To:`, `Date:`, `Subject:`. ARC chain (`ARC-Seal`, `ARC-Message-Signature`, `ARC-Authentication-Results`) if present must be preserved.
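The header requirements above can be checked mechanically before any further work. The following is a minimal sketch using only the Python standard library; the function name `audit_headers` and the sample message are hypothetical, and the check only confirms that fields are present, not that any signature is valid.

```python
from email import message_from_bytes

# Header names copied from the list above; the ARC fields are optional.
REQUIRED = ["Received", "DKIM-Signature", "Authentication-Results",
            "Message-ID", "From", "To", "Date", "Subject"]
ARC = ["ARC-Seal", "ARC-Message-Signature", "ARC-Authentication-Results"]


def audit_headers(raw: bytes) -> dict:
    """Report missing required headers, the Received count, and ARC fields found."""
    msg = message_from_bytes(raw)
    return {
        "missing": [name for name in REQUIRED if msg.get(name) is None],
        "received_count": len(msg.get_all("Received", [])),
        "arc_present": [name for name in ARC if msg.get(name) is not None],
    }


# Hypothetical two-hop message that lacks DKIM-Signature and Authentication-Results.
sample = (
    b"Received: from mx.agency.example by mail.provider.example;"
    b" Mon, 1 Jan 2024 10:00:05 +0000\r\n"
    b"Received: from app.agency.example by mx.agency.example;"
    b" Mon, 1 Jan 2024 10:00:02 +0000\r\n"
    b"Message-ID: <1@agency.example>\r\n"
    b"From: intake@agency.example\r\n"
    b"To: me@provider.example\r\n"
    b"Date: Mon, 1 Jan 2024 10:00:00 +0000\r\n"
    b"Subject: We have received your submission\r\n"
    b"\r\n"
    b"Tracking ID: ...\r\n"
)
report = audit_headers(sample)
print(report)  # a non-empty "missing" list means the export is not usable as proof
```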
---
## 2 · Redaction rules (what to keep / what to remove)
Use these rules per file. They are deliberately conservative.
### Always KEEP (verbatim, unmodified bytes)
- All `Received:` headers, in order
- `DKIM-Signature:` (every one)
- `ARC-Seal`, `ARC-Message-Signature`, `ARC-Authentication-Results`
- `Authentication-Results:`
- `Message-ID:`
- `Date:`
- `From:` (the sending domain is the entire point)
- `To:` *(see redaction note below for the local part of your own address)*
- `Subject:`
- `Reply-To:`, `Return-Path:`
- `MIME-Version:`, `Content-Type:`, `Content-Transfer-Encoding:`, `Content-Language:`
- Body language that confirms acceptance/acknowledgement: "We have received your submission", "Your case has been assigned", "Tracking ID: …", agency reference numbers, case numbers, claim numbers, file numbers
- The agency's signature block
### REDACT (replace with `[REDACTED-<reason>]` of the same byte length, or simply remove from the *body*; any edit inside the DKIM-signed scope breaks the signature, see "NEVER touch" below)
- Exploit payloads, PoC code, byte-level reproducers
- Unpatched technical details that could enable in-the-wild exploitation
- Your personal phone number, home address, date of birth, SSN
- Names of third-party individuals not yet public (victims, witnesses, ongoing subjects)
- Portal links containing authentication tokens (e.g. `https://portal.example.gov/c/AB12CD…`)
- File attachments (publish those separately as their own artifacts if needed)
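- The local part of your own email address may be redacted in the human-readable copy, leaving the domain visible. `To:` is normally listed in the signature's `h=` tag, so altering it in any file meant for DKIM verification breaks the signature. DKIM alignment is checked against the `From:` domain, not the recipient.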
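Equal-length replacement can be sketched as follows. This is an illustration only: the helper name `redact_same_length` and the `X` padding character are assumptions, not part of any standard. Keeping the byte length constant preserves line wrapping and offsets in the readable copy; it does not keep the DKIM body hash valid, because any changed byte alters the hash.

```python
def redact_same_length(data: bytes, secret: bytes, reason: str = "PII") -> bytes:
    """Replace every occurrence of `secret` with a marker of identical byte length."""
    if not secret:
        raise ValueError("secret must not be empty")
    marker = f"[REDACTED-{reason}]".encode("ascii")
    if len(secret) < len(marker):
        # The secret is too short to hold the full marker: use filler only.
        replacement = b"X" * len(secret)
    else:
        # Pad the marker with filler so the total length matches the secret.
        replacement = marker + b"X" * (len(secret) - len(marker))
    return data.replace(secret, replacement)


# Hypothetical body line containing a phone number.
line = b"Call +1-555-0100-0000-0000 now"
redacted = redact_same_length(line, b"+1-555-0100-0000-0000", "PHONE")
print(redacted)
```

Run it over the body of the redacted copy only, once per sensitive string, and review the result by eye before signing.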
### NEVER touch
- Header field order, line wrapping, whitespace, or capitalization inside the signed scope. Under `simple` canonicalization even one added trailing space breaks the signature; `relaxed` tolerates only whitespace folding and header-name case changes, so treat every signed byte as immutable. **If you must redact inside the signed body**, you have two options:
1. Publish two files: `proof-<case>.original.sha256` (just the hash of the original, signed and timestamp-anchored before any redaction) plus `proof-<case>.redacted.eml` (the human-readable redacted version, where DKIM will deliberately fail). Verifiers cannot re-run DKIM on the redacted file; the hash commits you to the original bytes, so anyone later given the original can recompute the hash and verify its DKIM signature.
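2. Use **headers-only DKIM verification**: produce `proof-<case>.headers.eml` containing all headers plus a stub `Content-Type: text/plain` body, leaving the original `bh=` value untouched. The `b=` signature covers the signed headers and the `DKIM-Signature` field itself, including `bh=`, so header integrity can in principle be confirmed even when the body is withheld. Stock verifiers also recompute the body hash and will report a failure on a stubbed body, so this check needs a verifier that skips the body-hash comparison. The DKIM guide below covers this.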
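Producing the headers-only file comes down to cutting the raw message at the first empty line without touching a single header byte. The sketch below assumes that approach; the helper names (`split_headers`, `dkim_tags`, `selector_record`) and the sample signature are hypothetical. It also reads the `DKIM-Signature` tags so the `bh=` value and the DNS name of the public key can be recorded next to the stub.

```python
import re


def split_headers(raw: bytes) -> tuple[bytes, bytes]:
    """Split at the first empty line; the header block is returned byte-for-byte."""
    match = re.search(rb"\r?\n\r?\n", raw)
    if match is None:
        return raw, b""
    return raw[:match.end()], raw[match.end():]


def dkim_tags(header_value: str) -> dict[str, str]:
    """Parse 'v=1; a=rsa-sha256; ...' into a dict, dropping folding whitespace."""
    tags = {}
    for part in header_value.split(";"):
        if "=" in part:
            name, value = part.split("=", 1)
            tags[name.strip()] = "".join(value.split())
    return tags


def selector_record(tags: dict[str, str]) -> str:
    """DNS name whose TXT record holds the signer's public key (RFC 6376)."""
    return f"{tags['s']}._domainkey.{tags['d']}"


# Hypothetical message; the signature values are placeholders, not real data.
raw = (
    b"DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=agency.example;\r\n"
    b" s=mail2024; h=from:to:subject:date; bh=2jUS=;\r\n"
    b" b=abc\r\n"
    b"From: intake@agency.example\r\n"
    b"\r\n"
    b"body text\r\n"
)
headers, body = split_headers(raw)
tags = dkim_tags(
    "v=1; a=rsa-sha256; c=relaxed/relaxed; d=agency.example;\r\n"
    " s=mail2024; h=from:to:subject:date; bh=2jUS=;\r\n b=abc"
)
print(selector_record(tags), tags["bh"])
```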
---
## 3 · Two-file publication pattern (recommended)
For each case, publish:
| File | Purpose |
|---|---|
| `proof-<case>.original.sha256` | SHA-256 of the unmodified raw `.eml` (computed before any redaction). Bind it via PGP signature + OpenTimestamps. |
| `proof-<case>.headers.eml` | All headers, empty body, original `bh=` value untouched. DKIM verifies header authenticity. |
| `proof-<case>.redacted.eml` | Human-readable redacted version. DKIM will not verify here; this file is for reading, not for cryptographic proof. |
| `proof-<case>.headers.eml.asc` | PGP detached signature over `proof-<case>.headers.eml`. |
| `proof-<case>.headers.eml.ots` | OpenTimestamps attestation. |
| `dkim-verification-guide.md` | Step-by-step verifier instructions (see §5). |
| `README.md` | Case framing — role, anchors, timeline, evidence index. |
This pattern gives a skeptic three independent ways to verify:
1. **DKIM** on `proof-<case>.headers.eml` confirms the agency's mail server signed the message.
2. **PGP signature** confirms you assert the same headers.
3. **OpenTimestamps** confirms the file existed at the timestamp you claim, so an after-the-fact fabrication is excluded.
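The hash commitment in the first table row can be produced with a few lines of Python. This is a minimal sketch: `sha256_line` and `write_commitment` are hypothetical helper names, and the output uses the two-space `sha256sum` line format so that `sha256sum -c` can check it later. Signing and timestamping the resulting file are separate steps.

```python
import hashlib
from pathlib import Path


def sha256_line(data: bytes, filename: str) -> str:
    """Return one line in sha256sum format: '<hex digest>  <filename>'."""
    return f"{hashlib.sha256(data).hexdigest()}  {filename}\n"


def write_commitment(raw_eml: Path, case: str) -> Path:
    """Hash the unmodified raw .eml and write proof-<case>.original.sha256 beside it."""
    out = raw_eml.with_name(f"proof-{case}.original.sha256")
    out.write_text(sha256_line(raw_eml.read_bytes(), raw_eml.name), encoding="utf-8")
    return out


# Known SHA-256 test vector, to show the line format.
print(sha256_line(b"abc", "example.eml"), end="")
```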
---
## 4 · The proof-folder template
```
/evidence/<TRACK>-<CASE-ID>/
  README.md
  proof-<case>.original.sha256
  proof-<case>.original.sha256.asc   # PGP detached signature of the hash file
  proof-<case>.original.sha256.ots   # OpenTimestamps attestation
  proof-<case>.headers.eml           # headers + stubbed body
  proof-<case>.headers.eml.asc       # PGP signature
  proof-<case>.headers.eml.ots       # OpenTimestamps attestation
  proof-<case>.redacted.eml          # human-readable; not cryptographically valid
  proof-<case>.redacted.eml.asc      # PGP signature of the redacted version (binds your redactions to your key)
  dkim-verification-guide.md
  attachments/                       # optional, each attachment hashed + signed + OTS-anchored separately
    receipt.pdf
    receipt.pdf.sha256
    receipt.pdf.asc
    receipt.pdf.ots
```
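A folder can be compared against this template before it is committed. The sketch below takes the set of file names present and reports what is missing; the function names are hypothetical and the expected names are copied from the template above. It checks presence only, not the validity of any signature or timestamp.

```python
# Suffixes copied from the template above; <case> is supplied by the caller.
REQUIRED_SUFFIXES = [
    ".original.sha256", ".original.sha256.asc", ".original.sha256.ots",
    ".headers.eml", ".headers.eml.asc", ".headers.eml.ots",
    ".redacted.eml", ".redacted.eml.asc",
]
FIXED_FILES = ["README.md", "dkim-verification-guide.md"]
SIDECARS = (".sha256", ".asc", ".ots")


def missing_files(present: set[str], case: str) -> list[str]:
    """Names required by the template that are absent from the case folder."""
    expected = [f"proof-{case}{suffix}" for suffix in REQUIRED_SUFFIXES] + FIXED_FILES
    return [name for name in expected if name not in present]


def missing_sidecars(attachment_names: set[str]) -> list[str]:
    """Hash, signature, or timestamp files missing for any attachment."""
    payloads = sorted(n for n in attachment_names if not n.endswith(SIDECARS))
    return [p + s for p in payloads for s in SIDECARS if p + s not in attachment_names]


# Typical use: missing_files({p.name for p in folder.iterdir()}, case),
# where `folder` is a pathlib.Path pointing at the case directory.
print(missing_files({"README.md", "proof-x.headers.eml"}, "x"))
print(missing_sidecars({"receipt.pdf", "receipt.pdf.sha256", "receipt.pdf.asc"}))
```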
---
## 5 · DKIM verification guide (drop-in for each case folder)
Save the following as `dkim-verification-guide.md` inside every case folder. Update only the `From:` domain and selector references.
```markdown
# Verifying this email evidence
You do not need to trust me. This guide walks any third party through confirming:
(1) the email is unmodified since it left the sending organization's mail server,
(2) it came from that organization,
(3) it arrived at the stated time.
## Prerequisites
- Linux/macOS shell, or WSL.
- `gpg` (any version ≥ 2.2).
- One of: `dkimpy` (Python, `pip install dkimpy`) or `opendkim-tools` (Debian/Ubuntu: `sudo apt install opendkim-tools`).
- `opentimestamps-client` (Python, `pip install opentimestamps-client`).
## Step 1 — Confirm the file is what I committed
```bash
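sha256sum proof-<case>.headers.eml
# Note this digest: it is what my PGP signature (Step 2) and the OpenTimestamps
# proof (Step 5) must cover. It will not equal the value in
# proof-<case>.original.sha256, which commits to the full, unredacted message.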
```
## Step 2 — Verify my PGP signature over the headers file
```bash
gpg --keyserver hkps://keys.openpgp.org --recv-keys <FINGERPRINT>
gpg --verify proof-<case>.headers.eml.asc proof-<case>.headers.eml
```
Expect: `Good signature from "Joseph R. Goydish II …"`. The fingerprint shown must match the one published on this canonical page and on at least one external keyserver.
## Step 3 — Verify the sender organization's DKIM signature
### Option A — dkimpy (most portable)
```bash
pip install dkimpy
dkimverify < proof-<case>.headers.eml
```
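A successful verification prints `signature ok`. The verifier performs a live DNS TXT lookup at `<s>._domainkey.<d>`, using the `s=` (selector) and `d=` (domain) tags inside the `DKIM-Signature:` header (e.g. `d=sec.gov; s=mail-2024` resolves to `mail-2024._domainkey.sec.gov`). Stock verifiers also recompute the body hash and compare it with `bh=`, so they report a failure on a file whose body has been stubbed; in that case see "Interpreting the result" below.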
### Option B — opendkim-testmsg
```bash
opendkim-testmsg < proof-<case>.headers.eml
# Empty output = signature OK. Non-empty = failure details.
```
### Interpreting the result
- If DKIM verifies, the headers were signed by a key the sending domain publishes in DNS. Substitution is cryptographically excluded.
- If DKIM fails on the headers-only file but verifies on the original (which you cannot publish in full), you should still find that:
- The `d=` value matches the From: domain or a subdomain controlled by it.
- The selector exists in DNS today and matches `s=`.
- `Authentication-Results:` at the receiving server records `dkim=pass`.
If DKIM keys have since rotated and DNS no longer publishes the old selector, fall back on passive-DNS archives (e.g. Farsight DNSDB, Cisco Talos passive DNS) to confirm that the selector's TXT record existed at the email's `Date:`.
## Step 4 — Verify the receiving timestamp chain
```bash
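# Print every Received header together with all of its folded continuation lines
awk '/^Received:/{p=1; print; next} /^[ \t]/{if (p) print; next} {p=0}' proof-<case>.headers.eml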
```
Read top-to-bottom (latest hop first). Confirm:
- The earliest (bottom-most) Received line is from a host inside the sender organization's mail infrastructure (e.g. `mx1.sec.gov`, `genpro.gov.sk`, `outbound.prokuraturos.lt`, etc.). Cross-reference with the organization's published SPF record (`dig +short txt sec.gov`).
- Timestamps never decrease from the bottom hop to the top hop, allowing a few seconds of clock skew between servers.
- The final hop matches my receiving provider (Proton Mail) and the `Date:` is consistent.
## Step 5 — Verify the OpenTimestamps attestation
```bash
ots verify proof-<case>.headers.eml.ots
```
A successful verification prints the Bitcoin block height and timestamp at which the file's hash was anchored. This confirms the file existed no later than that block — excluding after-the-fact fabrication.
## Step 6 — Cross-check the institutional anchor
The acknowledgement language inside the email references one or more of:
- Case/file/tracking number — e.g. `01-1-03450-26`
- Submission ID — e.g. SEC TCR `17780-976-067-126`
- Reference number — e.g. FCA `212278528`
A journalist may contact the agency directly (using the public contact details on the agency's site, *not* details from this email) and ask whether the reference number is on file. The agency's yes/no is the final external anchor.
## What this proof does NOT establish
- It does **not** prove the underlying allegations.
- It does **not** prove agency action, prosecution, or adjudication.
- It proves only: a submission with the cited content reached the cited agency at the cited time, and the agency's mail system acknowledged it.
```
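The timestamp check in Step 4 of the guide can also be scripted. The following is a minimal sketch using the Python standard library; the function names, the 60-second skew allowance, and the sample hops are assumptions for illustration. It relies on the date-time following the last `;` in each `Received:` field, and it treats a timestamp without a usable zone as UTC.

```python
from datetime import datetime, timedelta, timezone
from email import message_from_bytes
from email.utils import parsedate_to_datetime


def received_times(raw: bytes) -> list[datetime]:
    """Timestamps of the Received fields in file order (latest hop first)."""
    msg = message_from_bytes(raw)
    times = []
    for value in msg.get_all("Received", []):
        stamp = str(value).rsplit(";", 1)[-1].strip()  # date-time follows the last ';'
        parsed = parsedate_to_datetime(stamp)
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: treat as UTC
        times.append(parsed)
    return times


def is_monotonic(times: list[datetime], skew: timedelta = timedelta(seconds=60)) -> bool:
    """True if no hop is earlier than the hop below it, allowing `skew` of clock drift."""
    return all(later + skew >= earlier for later, earlier in zip(times, times[1:]))


# Hypothetical two-hop chain; the lower hop reports its time in a +01:00 zone.
sample = (
    b"Received: from mx.agency.example by mail.provider.example;"
    b" Mon, 1 Jan 2024 10:00:05 +0000\r\n"
    b"Received: from app.agency.example by mx.agency.example;\r\n"
    b" Mon, 1 Jan 2024 11:00:02 +0100 (CET)\r\n"
    b"From: intake@agency.example\r\n"
    b"\r\n"
)
times = received_times(sample)
print(is_monotonic(times))
```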
---
## 6 · Operational checklist before publishing any `.eml`
- [ ] Source is the raw downloaded message, not a forward
- [ ] I have computed `sha256sum` of the raw file **before** any edit
- [ ] I have PGP-signed the raw-file hash
- [ ] I have OpenTimestamps-anchored the raw-file hash (`.ots`)
- [ ] I have a headers-only stub for DKIM verification
- [ ] I have a redacted human-readable copy, separately signed
- [ ] DKIM/ARC headers in the headers file are byte-identical to the source
- [ ] No exploit payload, witness PII, or auth-token URL remains in the redacted copy
- [ ] `dkim-verification-guide.md` references the correct sender domain and selector
- [ ] `README.md` states the role precisely and includes the disclaimer for the relevant track
If any checkbox is unchecked, do not commit the folder.
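The byte-identity item in the checklist is easy to get wrong by eye, so it may be worth scripting. The sketch below extracts the raw bytes of every DKIM and ARC field, including folded continuation lines, from two files and compares them; the function name `signed_fields` and the sample data are hypothetical.

```python
import re

SIGNED_NAMES = {b"dkim-signature", b"arc-seal",
                b"arc-message-signature", b"arc-authentication-results"}


def signed_fields(raw: bytes, names: set[bytes] = SIGNED_NAMES) -> list[bytes]:
    """Raw bytes of each matching header field, continuation lines included."""
    match = re.search(rb"\r?\n\r?\n", raw)
    block = raw if match is None else raw[:match.start()]
    fields, current = [], None
    for line in block.splitlines(keepends=True):
        if line[:1] in (b" ", b"\t") and current is not None:
            current += line  # folded continuation of the previous field
        else:
            if current is not None:
                fields.append(current)
            current = line
    if current is not None:
        fields.append(current)
    return [f for f in fields if f.split(b":", 1)[0].strip().lower() in names]


# Hypothetical source message and its headers-only stub.
source = (b"DKIM-Signature: v=1; d=agency.example;\r\n s=mail2024; b=abc\r\n"
          b"From: intake@agency.example\r\n\r\nfull body\r\n")
stub = (b"DKIM-Signature: v=1; d=agency.example;\r\n s=mail2024; b=abc\r\n"
        b"From: intake@agency.example\r\n\r\n")
tampered = stub.replace(b"; b=abc", b";  b=abc")  # one extra space

print(signed_fields(source) == signed_fields(stub))      # fields match
print(signed_fields(source) == signed_fields(tampered))  # fields differ
```

A mismatch means the headers file was altered somewhere between export and publication and should be regenerated from the raw source.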