feat: zero-noise /cso security audits with FP filtering (v0.11.0.0)

Absorb Anthropic's security-review false positive filtering into /cso:
- 17 hard exclusions (DOS, test files, log spoofing, SSRF path-only,
  regex injection, race conditions unless concrete, etc.)
- 9 precedents (React XSS-safe, env vars trusted, client-side code
  doesn't need auth, shell scripts need concrete untrusted input path)
- 8/10 confidence gate — below threshold = don't report
- Independent sub-agent verification for each finding
- Exploit scenario requirement per finding
- Framework-aware analysis (Rails CSRF, React escaping, Angular sanitization)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author: Garry Tan
Date: 2026-03-22 10:48:11 -07:00
Parent: 7ee261b026
Commit: 65ca7adfd4
5 changed files with 343 additions and 43 deletions
@@ -420,24 +420,113 @@ PUBLIC:
- Marketing content, documentation, public APIs
```
### Phase 5: False Positive Filtering

Before producing findings, run every candidate through this filter. The goal is
**zero noise** — better to miss a theoretical issue than flood the report with
false positives that erode trust.
**Hard exclusions — automatically discard findings matching these:**
1. Denial of Service (DOS), resource exhaustion, or rate limiting issues
2. Secrets or credentials stored on disk if otherwise secured (encrypted, permissioned)
3. Memory consumption, CPU exhaustion, or file descriptor leaks
4. Input validation concerns on non-security-critical fields without proven impact
5. GitHub Action workflow issues unless clearly triggerable via untrusted input
6. Missing hardening measures — flag concrete vulnerabilities, not absent best practices
7. Race conditions or timing attacks unless concretely exploitable with a specific path
8. Vulnerabilities in outdated third-party libraries (handled by A06, not individual findings)
9. Memory safety issues in memory-safe languages (Rust, Go, Java, C#)
10. Files that are only unit tests or test fixtures
11. Log spoofing — outputting unsanitized input to logs is not a vulnerability
12. SSRF where attacker only controls the path, not the host or protocol
13. User-controlled content in AI system prompts is not injection (it's the feature)
14. Regex injection or regex DOS — not a real vulnerability class
15. Security concerns in documentation files (*.md)
16. Missing audit logs — absence of logging is not a vulnerability
17. Insecure randomness in non-security contexts (e.g., UI element IDs)
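The exclusion pass above can be sketched as a simple predicate. This is a minimal illustration, not part of the /cso spec — the `Finding` shape and the category labels are hypothetical stand-ins for however the scanner tags candidates:

```python
from dataclasses import dataclass

# Hypothetical category labels covering several hard-exclusion rules.
HARD_EXCLUDED_CATEGORIES = {
    "dos", "rate-limiting", "resource-exhaustion",       # rules 1, 3
    "log-spoofing", "regex-injection",                   # rules 11, 14
    "missing-hardening", "missing-audit-logs",           # rules 6, 16
    "outdated-dependency",                               # rule 8
}

@dataclass
class Finding:
    category: str
    file: str
    is_test_file: bool = False

def passes_hard_exclusions(finding: Finding) -> bool:
    """Return False for findings matching a hard-exclusion rule."""
    if finding.category in HARD_EXCLUDED_CATEGORIES:
        return False
    if finding.is_test_file:              # rule 10: test files / fixtures
        return False
    if finding.file.endswith(".md"):      # rule 15: documentation files
        return False
    return True

candidates = [
    Finding("sql-injection", "app/search.rb"),
    Finding("dos", "server.ts"),
    Finding("xss", "docs/guide.md"),
]
survivors = [f for f in candidates if passes_hard_exclusions(f)]
# only the sql-injection finding survives
```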
**Precedents — established rulings that prevent recurring false positives:**
1. Logging secrets in plaintext IS a vulnerability. Logging URLs is safe.
2. UUIDs are unguessable — don't flag missing UUID validation.
3. Environment variables and CLI flags are trusted input. Attacks requiring
attacker-controlled env vars are invalid.
4. React and Angular are XSS-safe by default. Only flag `dangerouslySetInnerHTML`,
`bypassSecurityTrustHtml`, or equivalent escape hatches.
5. Client-side JS/TS does not need permission checks or auth — that's the server's job.
Don't flag frontend code for missing authorization.
6. Shell script command injection needs a concrete untrusted input path.
Shell scripts generally don't receive untrusted user input.
7. Subtle web vulnerabilities (tabnabbing, XS-Leaks, prototype pollution, open redirects)
only if extremely high confidence with concrete exploit.
8. IPython/Jupyter notebooks (*.ipynb) — only flag if untrusted input can trigger the vulnerability.
9. Logging non-PII data is not a vulnerability even if the data is somewhat sensitive.
Only flag logging of secrets, passwords, or PII.
**Confidence gate:** Every finding must score **≥ 8/10 confidence** to appear in the
final report. Score calibration:
- **9-10:** Certain exploit path identified. Could write a PoC.
- **8:** Clear vulnerability pattern with known exploitation methods.
- **7:** Suspicious pattern requiring specific conditions. Include only if HIGH+ severity — the sole exception to the 8/10 gate.
- **Below 7:** Do not report. Too speculative.
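The gate, including the HIGH+ carve-out for 7/10 scores, can be expressed as a small predicate. The function name and severity ranking are illustrative; the severity labels match the report table:

```python
# Rank severities so "HIGH or above" is a simple comparison.
SEVERITY_RANK = {"INFO": 0, "LOW": 1, "MED": 2, "HIGH": 3, "CRIT": 4}

def passes_confidence_gate(confidence: int, severity: str) -> bool:
    """Apply the >= 8/10 gate, with the HIGH+ carve-out at 7/10."""
    if confidence >= 8:
        return True
    # 7/10 findings survive only when severity is HIGH or above
    if confidence == 7 and SEVERITY_RANK[severity] >= SEVERITY_RANK["HIGH"]:
        return True
    return False
```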
### Phase 5.5: Parallel Finding Verification
For each candidate finding that survives the hard exclusion filter, launch an
independent verification sub-task using the Agent tool. The verifier has fresh
context and cannot see the initial scan's reasoning — only the finding itself
and the false positive filtering rules.
Prompt each verifier sub-task with:
- The specific finding (file, line, category, description)
- The full false positive filtering rules (hard exclusions + precedents)
- Instruction: "Read the code at this location. Is this a real, exploitable
vulnerability? Assign a confidence score 1-10. If below 8, explain why
it's likely a false positive."
Launch all verifier sub-tasks in parallel. Discard any finding where the
verifier scores confidence below 8.
If the Agent tool is unavailable, perform the verification pass yourself
by re-reading the code for each finding with a skeptic's eye. Note: "Self-verified
— independent sub-task unavailable."
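The fan-out-and-discard logic above can be sketched with a thread pool. The real verifier is an Agent-tool sub-task with fresh context; `verify` here is a placeholder that just echoes a pre-assigned score, and the finding shape is assumed:

```python
from concurrent.futures import ThreadPoolExecutor

def verify(finding: dict) -> int:
    """Stand-in for an independent verifier returning a 1-10 confidence."""
    return finding.get("confidence", 0)

def run_verification(findings: list[dict], threshold: int = 8) -> list[dict]:
    # Launch all verifiers in parallel, then drop sub-threshold findings.
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(verify, findings))
    return [f for f, s in zip(findings, scores) if s >= threshold]
```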
### Phase 6: Findings Report
**Exploit scenario requirement:** Every finding MUST include a concrete exploit
scenario — a step-by-step attack path an attacker would follow. "This pattern
is insecure" is not a finding. "Attacker sends GET /api/users/OTHER_USER_ID
and receives the other user's data because the controller uses params[:id]
without scoping to current_user" is a finding.
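The spec's example is Rails; the same missing-scope bug can be sketched in Python against a hypothetical in-memory store (all names here are illustrative):

```python
USERS = {1: {"email": "alice@example.com"}, 2: {"email": "bob@example.com"}}
OWNED = {1: {1}, 2: {2}}  # current_user_id -> record ids they may read

def get_user_vulnerable(requested_id: int, current_user_id: int) -> dict:
    # BUG: returns whatever record the caller names -- no ownership check
    return USERS[requested_id]

def get_user_scoped(requested_id: int, current_user_id: int) -> dict:
    # FIX: scope the lookup to records the caller actually owns
    if requested_id not in OWNED[current_user_id]:
        raise PermissionError("record not owned by current user")
    return USERS[requested_id]
```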
Rate each finding:
```
SECURITY FINDINGS
═════════════════
#  Sev   Conf  Category        Finding                         OWASP  File:Line
─  ────  ────  ────────        ───────                         ─────  ─────────
1  CRIT  9/10  Injection       Raw SQL in search controller    A03    app/search.rb:47
2  HIGH  8/10  Access Control  Missing auth on admin endpoint  A01    api/admin.ts:12
3  HIGH  9/10  Crypto          API keys in plaintext config    A02    config/app.yml:8
4  MED   8/10  Config          CORS allows * in production     A05    server.ts:34
```
For each finding, include:
```
## Finding 1: [Title] — [File:Line]
* **Severity:** CRITICAL | HIGH | MEDIUM
* **Confidence:** N/10
* **OWASP:** A01-A10
* **Description:** [What's wrong — one paragraph]
* **Exploit scenario:** [Step-by-step attack path — be specific]
* **Impact:** [What an attacker gains — data breach, RCE, privilege escalation]
* **Recommendation:** [Specific code change with example]
```
### Phase 7: Remediation Roadmap
For the top 5 findings, present via AskUserQuestion:
@@ -450,25 +539,33 @@ For the top 5 findings, present via AskUserQuestion:
- C) Accept risk — [document why, set review date]
- D) Defer to TODOS.md with security label
### Phase 8: Save Report
```bash
mkdir -p .gstack/security-reports
```
Write findings to `.gstack/security-reports/{date}.json`. Include:
- Each finding with severity, confidence, category, file, line, description
- Verification status (independently verified or self-verified)
- Total findings by severity tier
- False positives filtered count (so you can track filter effectiveness)
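One possible shape for the saved report — the field names are assumptions; the spec fixes only which information must be present:

```python
import datetime
import json
import pathlib

# Illustrative report structure. Values echo the sample findings table.
report = {
    "date": datetime.date(2026, 3, 22).isoformat(),
    "findings": [
        {"severity": "CRIT", "confidence": 9, "category": "Injection",
         "file": "app/search.rb", "line": 47,
         "description": "Raw SQL in search controller",
         "verification": "independent"},
    ],
    "totals_by_severity": {"CRIT": 1, "HIGH": 0, "MED": 0, "LOW": 0},
    "filtered_false_positives": 12,  # for tracking filter effectiveness
}

out_dir = pathlib.Path(".gstack/security-reports")
out_dir.mkdir(parents=True, exist_ok=True)
path = out_dir / f"{report['date']}.json"
path.write_text(json.dumps(report, indent=2))
```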
If prior reports exist, show:
- **Resolved:** Findings fixed since last audit
- **Persistent:** Findings still open
- **New:** Findings discovered this audit
- **Trend:** Security posture improving or degrading?
- **Filter stats:** N candidates scanned, M filtered as FP, K reported
## Important Rules
- **Think like an attacker, report like a defender.** Show the exploit path, then the fix.
- **Zero noise is more important than zero misses.** A report with 3 real findings is worth more than one with 3 real + 12 theoretical. Users stop reading noisy reports.
- **No security theater.** Don't flag theoretical risks with no realistic exploit path. Focus on doors that are actually unlocked.
- **Severity calibration matters.** A CRITICAL finding needs a realistic exploitation scenario. If you can't describe how an attacker would exploit it, it's not CRITICAL.
- **Confidence gate is absolute.** Below 8/10 confidence = do not report (lone exception: 7/10 at HIGH+ severity). Period.
- **Read-only.** Never modify code. Produce findings and recommendations only.
- **Assume competent attackers.** Don't assume security through obscurity works.
- **Check the obvious first.** Hardcoded credentials, missing auth checks, and SQL injection are still the top real-world vectors.
- **Framework-aware.** Know your framework's built-in protections. Rails has CSRF tokens by default. React escapes by default. Don't flag what the framework already handles.
@@ -1,5 +1,14 @@
# Changelog
## [0.11.0.0] - 2026-03-22 — Zero-Noise Security Audits
### Changed
- **`/cso` now filters false positives like Anthropic's security review.** 17 hard exclusions (DOS, test files, log spoofing, SSRF path-only, regex injection, and 12 more) plus 9 established precedents (React is XSS-safe by default, env vars are trusted, client-side code doesn't need auth checks). Every finding must score 8/10+ confidence with a concrete exploit scenario — "this pattern is bad" doesn't make the cut. The result: reports with 3 real findings instead of 3 real + 12 theoretical.
- **Independent finding verification.** Each candidate finding is verified by a fresh sub-agent that only sees the finding and the FP rules — no anchoring bias from the initial scan. Findings that fail independent verification are silently dropped.
- **Exploit scenario requirement.** Every finding now requires a step-by-step attack path: who sends what, to where, and what they get. No more "insecure pattern detected" without a walkable attack.
- **Framework-aware analysis.** /cso now knows that Rails has CSRF tokens, React escapes HTML, Angular sanitizes by default. It won't flag what the framework already handles.
## [0.10.1.0] - 2026-03-22 — Community Security Wave
### Added
@@ -1 +1 @@
-0.10.1.0
+0.11.0.0