feat: zero-noise /cso security audits with FP filtering (v0.11.0.0)

Absorb Anthropic's security-review false positive filtering into /cso:
- 17 hard exclusions (DOS, test files, log spoofing, SSRF path-only,
  regex injection, race conditions unless concrete, etc.)
- 9 precedents (React XSS-safe, env vars trusted, client-side code
  doesn't need auth, shell scripts need concrete untrusted input path)
- 8/10 confidence gate — below threshold = don't report
- Independent sub-agent verification for each finding
- Exploit scenario requirement per finding
- Framework-aware analysis (Rails CSRF, React escaping, Angular sanitization)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author: Garry Tan
Date: 2026-03-22 10:48:11 -07:00
Parent: 7ee261b026
Commit: 65ca7adfd4
5 changed files with 343 additions and 43 deletions
@@ -420,24 +420,113 @@ PUBLIC:
- Marketing content, documentation, public APIs
```
### Phase 5: False Positive Filtering

Before producing findings, run every candidate through this filter. The goal is
**zero noise** — better to miss a theoretical issue than flood the report with
false positives that erode trust.
**Hard exclusions — automatically discard findings matching these:**
1. Denial of Service (DOS), resource exhaustion, or rate limiting issues
2. Secrets or credentials stored on disk if otherwise secured (encrypted, permissioned)
3. Memory consumption, CPU exhaustion, or file descriptor leaks
4. Input validation concerns on non-security-critical fields without proven impact
5. GitHub Action workflow issues unless clearly triggerable via untrusted input
6. Missing hardening measures — flag concrete vulnerabilities, not absent best practices
7. Race conditions or timing attacks unless concretely exploitable with a specific path
8. Vulnerabilities in outdated third-party libraries (handled by A06, not individual findings)
9. Memory safety issues in memory-safe languages (Rust, Go, Java, C#)
10. Files that are only unit tests or test fixtures
11. Log spoofing — outputting unsanitized input to logs is not a vulnerability
12. SSRF where attacker only controls the path, not the host or protocol
13. User-controlled content in AI system prompts is not injection (it's the feature)
14. Regex injection or regex DOS — not a real vulnerability class
15. Security concerns in documentation files (*.md)
16. Missing audit logs — absence of logging is not a vulnerability
17. Insecure randomness in non-security contexts (e.g., UI element IDs)
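The exclusion pass above can be sketched as a simple predicate. This is a minimal illustration, not part of the /cso spec — the `Finding` shape and the category labels are hypothetical stand-ins for however the scanner tags candidates:

```python
from dataclasses import dataclass

# Hypothetical category labels covering several hard-exclusion rules.
HARD_EXCLUDED_CATEGORIES = {
    "dos", "rate-limiting", "resource-exhaustion",       # rules 1, 3
    "log-spoofing", "regex-injection",                   # rules 11, 14
    "missing-hardening", "missing-audit-logs",           # rules 6, 16
    "outdated-dependency",                               # rule 8
}

@dataclass
class Finding:
    category: str
    file: str
    is_test_file: bool = False

def passes_hard_exclusions(finding: Finding) -> bool:
    """Return False for findings matching a hard-exclusion rule."""
    if finding.category in HARD_EXCLUDED_CATEGORIES:
        return False
    if finding.is_test_file:              # rule 10: test files / fixtures
        return False
    if finding.file.endswith(".md"):      # rule 15: documentation files
        return False
    return True

candidates = [
    Finding("sql-injection", "app/search.rb"),
    Finding("dos", "server.ts"),
    Finding("xss", "docs/guide.md"),
]
survivors = [f for f in candidates if passes_hard_exclusions(f)]
# only the sql-injection finding survives
```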
**Precedents — established rulings that prevent recurring false positives:**
1. Logging secrets in plaintext IS a vulnerability. Logging URLs is safe.
2. UUIDs are unguessable — don't flag missing UUID validation.
3. Environment variables and CLI flags are trusted input. Attacks requiring
attacker-controlled env vars are invalid.
4. React and Angular are XSS-safe by default. Only flag `dangerouslySetInnerHTML`,
`bypassSecurityTrustHtml`, or equivalent escape hatches.
5. Client-side JS/TS does not need permission checks or auth — that's the server's job.
Don't flag frontend code for missing authorization.
6. Shell script command injection needs a concrete untrusted input path.
Shell scripts generally don't receive untrusted user input.
7. Subtle web vulnerabilities (tabnabbing, XS-Leaks, prototype pollution, open redirects)
only if extremely high confidence with concrete exploit.
8. IPython/Jupyter notebooks (*.ipynb) — only flag if untrusted input can trigger the vulnerability.
9. Logging non-PII data is not a vulnerability even if the data is somewhat sensitive.
Only flag logging of secrets, passwords, or PII.
**Confidence gate:** Every finding must score **≥ 8/10 confidence** to appear in the
final report. Score calibration:
- **9-10:** Certain exploit path identified. Could write a PoC.
- **8:** Clear vulnerability pattern with known exploitation methods.
- **7:** Suspicious pattern requiring specific conditions. Include only if HIGH+ severity — the sole exception to the 8/10 gate.
- **Below 7:** Do not report. Too speculative.
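The gate, including the HIGH+ carve-out for 7/10 scores, can be expressed as a small predicate. The function name and severity ranking are illustrative; the severity labels match the report table:

```python
# Rank severities so "HIGH or above" is a simple comparison.
SEVERITY_RANK = {"INFO": 0, "LOW": 1, "MED": 2, "HIGH": 3, "CRIT": 4}

def passes_confidence_gate(confidence: int, severity: str) -> bool:
    """Apply the >= 8/10 gate, with the HIGH+ carve-out at 7/10."""
    if confidence >= 8:
        return True
    # 7/10 findings survive only when severity is HIGH or above
    if confidence == 7 and SEVERITY_RANK[severity] >= SEVERITY_RANK["HIGH"]:
        return True
    return False
```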
### Phase 5.5: Parallel Finding Verification
For each candidate finding that survives the hard exclusion filter, launch an
independent verification sub-task using the Agent tool. The verifier has fresh
context and cannot see the initial scan's reasoning — only the finding itself
and the false positive filtering rules.
Prompt each verifier sub-task with:
- The specific finding (file, line, category, description)
- The full false positive filtering rules (hard exclusions + precedents)
- Instruction: "Read the code at this location. Is this a real, exploitable
vulnerability? Assign a confidence score 1-10. If below 8, explain why
it's likely a false positive."
Launch all verifier sub-tasks in parallel. Discard any finding where the
verifier scores confidence below 8.
If the Agent tool is unavailable, perform the verification pass yourself
by re-reading the code for each finding with a skeptic's eye. Note: "Self-verified
— independent sub-task unavailable."
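The fan-out-and-discard logic above can be sketched with a thread pool. The real verifier is an Agent-tool sub-task with fresh context; `verify` here is a placeholder that just echoes a pre-assigned score, and the finding shape is assumed:

```python
from concurrent.futures import ThreadPoolExecutor

def verify(finding: dict) -> int:
    """Stand-in for an independent verifier returning a 1-10 confidence."""
    return finding.get("confidence", 0)

def run_verification(findings: list[dict], threshold: int = 8) -> list[dict]:
    # Launch all verifiers in parallel, then drop sub-threshold findings.
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(verify, findings))
    return [f for f, s in zip(findings, scores) if s >= threshold]
```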
### Phase 6: Findings Report
**Exploit scenario requirement:** Every finding MUST include a concrete exploit
scenario — a step-by-step attack path an attacker would follow. "This pattern
is insecure" is not a finding. "Attacker sends GET /api/users/OTHER_USER_ID
and receives the other user's data because the controller uses params[:id]
without scoping to current_user" is a finding.
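The spec's example is Rails; the same missing-scope bug can be sketched in Python against a hypothetical in-memory store (all names here are illustrative):

```python
USERS = {1: {"email": "alice@example.com"}, 2: {"email": "bob@example.com"}}
OWNED = {1: {1}, 2: {2}}  # current_user_id -> record ids they may read

def get_user_vulnerable(requested_id: int, current_user_id: int) -> dict:
    # BUG: returns whatever record the caller names -- no ownership check
    return USERS[requested_id]

def get_user_scoped(requested_id: int, current_user_id: int) -> dict:
    # FIX: scope the lookup to records the caller actually owns
    if requested_id not in OWNED[current_user_id]:
        raise PermissionError("record not owned by current user")
    return USERS[requested_id]
```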
Rate each finding:
```
SECURITY FINDINGS
═════════════════
#  Sev   Conf  Category        Finding                         OWASP  File:Line
─  ────  ────  ────────        ───────                         ─────  ─────────
1  CRIT  9/10  Injection       Raw SQL in search controller    A03    app/search.rb:47
2  HIGH  8/10  Access Control  Missing auth on admin endpoint  A01    api/admin.ts:12
3  HIGH  9/10  Crypto          API keys in plaintext config    A02    config/app.yml:8
4  MED   8/10  Config          CORS allows * in production     A05    server.ts:34
```
For each finding, include:
```
## Finding 1: [Title] — [File:Line]
* **Severity:** CRITICAL | HIGH | MEDIUM
* **Confidence:** N/10
* **OWASP:** A01-A10
* **Description:** [What's wrong — one paragraph]
* **Exploit scenario:** [Step-by-step attack path — be specific]
* **Impact:** [What an attacker gains — data breach, RCE, privilege escalation]
* **Recommendation:** [Specific code change with example]
```
### Phase 7: Remediation Roadmap
For the top 5 findings, present via AskUserQuestion:
@@ -450,25 +539,33 @@ For the top 5 findings, present via AskUserQuestion:
- C) Accept risk — [document why, set review date]
- D) Defer to TODOS.md with security label
### Phase 8: Save Report
```bash
mkdir -p .gstack/security-reports
```
Write findings to `.gstack/security-reports/{date}.json`. Include:
- Each finding with severity, confidence, category, file, line, description
- Verification status (independently verified or self-verified)
- Total findings by severity tier
- False positives filtered count (so you can track filter effectiveness)
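One possible shape for the saved report — the field names are assumptions; the spec fixes only which information must be present:

```python
import datetime
import json
import pathlib

# Illustrative report structure. Values echo the sample findings table.
report = {
    "date": datetime.date(2026, 3, 22).isoformat(),
    "findings": [
        {"severity": "CRIT", "confidence": 9, "category": "Injection",
         "file": "app/search.rb", "line": 47,
         "description": "Raw SQL in search controller",
         "verification": "independent"},
    ],
    "totals_by_severity": {"CRIT": 1, "HIGH": 0, "MED": 0, "LOW": 0},
    "filtered_false_positives": 12,  # for tracking filter effectiveness
}

out_dir = pathlib.Path(".gstack/security-reports")
out_dir.mkdir(parents=True, exist_ok=True)
path = out_dir / f"{report['date']}.json"
path.write_text(json.dumps(report, indent=2))
```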
If prior reports exist, show:
- **Resolved:** Findings fixed since last audit
- **Persistent:** Findings still open
- **New:** Findings discovered this audit
- **Trend:** Security posture improving or degrading?
- **Filter stats:** N candidates scanned, M filtered as FP, K reported
## Important Rules
- **Think like an attacker, report like a defender.** Show the exploit path, then the fix.
- **Zero noise is more important than zero misses.** A report with 3 real findings is worth more than one with 3 real + 12 theoretical. Users stop reading noisy reports.
- **No security theater.** Don't flag theoretical risks with no realistic exploit path. Focus on doors that are actually unlocked.
- **Severity calibration matters.** A CRITICAL finding needs a realistic exploitation scenario. If you can't describe how an attacker would exploit it, it's not CRITICAL.
- **Confidence gate is absolute.** Below 8/10 confidence = do not report (lone exception: 7/10 at HIGH+ severity). Period.
- **Read-only.** Never modify code. Produce findings and recommendations only.
- **Assume competent attackers.** Don't assume security through obscurity works.
- **Check the obvious first.** Hardcoded credentials, missing auth checks, and SQL injection are still the top real-world vectors.
- **Framework-aware.** Know your framework's built-in protections. Rails has CSRF tokens by default. React escapes by default. Don't flag what the framework already handles.
@@ -1,5 +1,14 @@
# Changelog
## [0.11.0.0] - 2026-03-22 — Zero-Noise Security Audits
### Changed
- **`/cso` now filters false positives like Anthropic's security review.** 17 hard exclusions (DOS, test files, log spoofing, SSRF path-only, regex injection, and 12 more) plus 9 established precedents (React is XSS-safe by default, env vars are trusted, client-side code doesn't need auth checks). Every finding must score 8/10+ confidence with a concrete exploit scenario — "this pattern is bad" doesn't make the cut. The result: reports with 3 real findings instead of 3 real + 12 theoretical.
- **Independent finding verification.** Each candidate finding is verified by a fresh sub-agent that only sees the finding and the FP rules — no anchoring bias from the initial scan. Findings that fail independent verification are silently dropped.
- **Exploit scenario requirement.** Every finding now requires a step-by-step attack path: who sends what, to where, and what they get. No more "insecure pattern detected" without a walkable attack.
- **Framework-aware analysis.** /cso now knows that Rails has CSRF tokens, React escapes HTML, Angular sanitizes by default. It won't flag what the framework already handles.
## [0.10.1.0] - 2026-03-22 — Community Security Wave
### Added
@@ -1 +1 @@
-0.10.1.0
+0.11.0.0