* feat: /cso v2 — infrastructure-first security audit
Rewrite /cso from code-centric OWASP scanning to infrastructure-first
attack surface analysis. 15 phases covering secrets archaeology, dependency
supply chain, CI/CD pipeline security, webhook verification, LLM/AI
security, skill supply chain scanning, plus OWASP Top 10, STRIDE, and
data classification.
Key design decisions from eng review + Codex adversarial review:
- Soft gate stack detection (prioritize, don't skip)
- Error on conflicting scope flags (never silently ignore)
- Permission gate before scanning ~/.claude/skills/
- Graceful degradation when audit tools aren't installed
- Finding fingerprints for cross-run trend tracking
- Variant analysis: one verified vuln triggers codebase-wide search
- Dual confidence modes: daily (8/10 gate) vs comprehensive (2/10)
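
A minimal sketch of how finding fingerprints could support cross-run trend tracking. The `Finding` shape, the `fingerprint` and `diffRuns` helpers, and the choice of hashed fields are illustrative assumptions, not the actual /cso implementation:

```typescript
import { createHash } from "node:crypto";

// Hypothetical finding shape; the real /cso report fields may differ.
interface Finding {
  category: string; // e.g. "secrets", "ci-cd"
  file: string;     // repo-relative path
  title: string;    // short description of the issue
  line?: number;    // deliberately left out of the fingerprint
}

// Hash only fields that stay stable when unrelated code shifts line numbers,
// so the same issue keeps the same ID from one run to the next.
function fingerprint(f: Finding): string {
  const normalized = [f.category, f.file, f.title]
    .map((s) => s.trim().toLowerCase())
    .join("\u0000");
  return createHash("sha256").update(normalized).digest("hex").slice(0, 16);
}

// Compare two runs: which findings are new, which disappeared, which persist.
function diffRuns(previous: Finding[], current: Finding[]) {
  const prev = new Set(previous.map(fingerprint));
  const curr = new Set(current.map(fingerprint));
  return {
    introduced: Array.from(curr).filter((id) => !prev.has(id)),
    resolved: Array.from(prev).filter((id) => !curr.has(id)),
    persistent: Array.from(curr).filter((id) => prev.has(id)),
  };
}
```

Excluding the line number is a design choice in this sketch: a finding that merely moves within a file is then reported as persistent rather than as one resolved plus one introduced.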
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: /cso v2 acknowledgements — 10 projects that informed the design
Credits: Sentry (confidence gating), Trail of Bits (mental model + variant
analysis), Shannon/Keygraph (active verification validation), afiqiqmal
(framework detection + LLM security), Snyk ToxicSkills (skill supply chain),
Miessler PAI (incident playbooks), McGo (report format), Claude Code
Security Pack (modular validation), Anthropic CCS (500+ zero-days), and
@gus_argon (v1 blind spot identification).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: /cso v2 E2E tests — full audit, diff mode, infra scope
Three E2E test cases with planted vulnerabilities:
- cso-full-audit: hardcoded API key + .env tracked by git
- cso-diff-mode: webhook without signature verification on feature branch
- cso-infra-scope: unpinned GitHub Action + Dockerfile without USER
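
For illustration, the two issues planted in cso-infra-scope can be detected with simple line-based checks. This is a sketch only: `findUnpinnedActions` and `dockerfileLacksUser` are hypothetical helpers, not part of /cso, and they ignore YAML edge cases and multi-stage Dockerfiles:

```typescript
// Flag `uses:` references that are not pinned to a full 40-character commit SHA.
// Local (./path) and docker:// references are skipped.
function findUnpinnedActions(workflowYaml: string): string[] {
  const unpinned: string[] = [];
  for (const line of workflowYaml.split("\n")) {
    const match = line.match(/^\s*(?:-\s*)?uses:\s*["']?([^"'\s#]+)/);
    if (!match) continue;
    const ref = match[1];
    if (ref.startsWith("./") || ref.startsWith("docker://")) continue;
    const version = ref.split("@")[1] ?? "";
    if (!/^[0-9a-f]{40}$/.test(version)) unpinned.push(ref);
  }
  return unpinned;
}

// True when the Dockerfile contains no USER instruction, in which case the
// container runs as whatever user the base image sets (commonly root).
function dockerfileLacksUser(dockerfile: string): boolean {
  return !dockerfile.split("\n").some((line) => /^\s*USER\s+\S+/i.test(line));
}
```

A tag such as `@v4` is mutable and can be repointed by whoever controls the action's repository, which is why only a full commit SHA counts as pinned here.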
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: /cso E2E tests — correct logCost and recordE2E signatures
logCost takes (label, result); recordE2E takes (collector, name,
suite, result). Fixed the calls in all 3 test cases.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: /cso infra E2E test — increase timeout to 360s
The infra scope test runs Agent sub-tasks for parallel finding
verification, which can take longer than 240s. Increased maxTurns
from 25 to 60 and timeout from 240s to 360s.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: /cso infra E2E test — sharper prompt to prevent exploration waste
The agent was burning 30+ turns exploring a 3-file repo (18 Glob calls,
Explore subagent, 4 SKILL.md reads) before starting the audit. Two Agent
verification subagents then ate ~100s, causing the 240s timeout.
Fix: tell the agent the repo is tiny, list the exact files, skip the
preamble, remove Agent from allowed tools, reduce maxTurns 60→30.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.11.6.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address Codex adversarial findings in /cso v2
Six fixes from Codex adversarial review:
1. Phase 2: Use `git log -G` (regex) instead of `-S` (literal) for
   patterns with alternation (ghp_|gho_|github_pat_, etc.)
2. Phase 12 exclusion #5: Add exception so CI/CD pipeline findings
   from Phase 4 are never auto-discarded when --infra is active
3. Phase 12 exclusion #6: Add exception that unpinned actions and
   missing CODEOWNERS are concrete risks, not "missing hardening"
4. Phase 12 exclusion #15: Add exception that SKILL.md files are
   executable prompt code, not documentation — Phase 8 findings
   in SKILL.md must not be excluded
5. Phase 12 exclusion #1: Add exception that LLM cost/spend
   amplification from Phase 7 is financial risk, not DoS
6. E2E tests: Add exitReason === 'success' assertion to all 3 tests;
   move finalizeEvalCollector to file-level afterAll
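
To illustrate fix 1, the sketch below mimics the two matching modes on a single added line: a literal substring test (how `-S` treats its argument by default) versus a regex test (how `-G` treats it). The sample line and the `secretSearchArgs` helper are made up for this example, and the exact flags /cso passes to git are an assumption:

```typescript
const pattern = "ghp_|gho_|github_pat_";
const addedLine = 'const token = "ghp_EXAMPLE";';

// -S style: the pattern is one literal string, pipes included, so it only
// matches text that contains "ghp_|gho_|github_pat_" verbatim.
const literalHit = addedLine.includes(pattern);

// -G style: the pattern is compiled as a regex, so `|` means alternation.
const regexHit = new RegExp(pattern).test(addedLine);

// Hypothetical argv builder for a history-wide regex search over diffs.
function secretSearchArgs(regex: string): string[] {
  return ["log", "--all", "-p", `-G${regex}`];
}

console.log({ literalHit, regexHit, args: secretSearchArgs(pattern) });
```

With the literal interpretation the planted token is missed entirely, which is the failure mode the switch to `-G` addresses.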
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>