CyberSecurityUP e4efa9bbb0 v3.5.2 — Exploitation Depth & Report Hygiene
Distilled from reviewing real AI-pentest output that kept stopping at "exposed"
instead of "exploited". Pure-additive, back-compatible.

Behavior (injected into black/grey/chain exploit prompts via DEPTH_DOCTRINE):
- Exposed → exploited: any info-disclosure / exposed service/WSDL / leaked
  credential|token / reachable dev host MUST be used before it's a finding;
  otherwise it's a lead, not a confirmed High/Critical.
- Chain across modules: reuse obtained session/JWT/cookie/credential and pivot
  to IDOR/privesc/exfil; report the chain, not isolated parts.
- Decode & fingerprint → CVE; audit tokens (alg-confusion/none/kid/JWKS, weak
  HS256 secret cracking, lifecycle).
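
As a rough illustration of the token-audit item above, the sketch below decodes a JWT header without verifying the signature and lists header properties worth testing. The helper names (`audit_jwt_header`, `make_token`) and the note wording are hypothetical and are not taken from this commit. In the doctrine's terms each note is a lead to follow up, not a confirmed finding.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring any stripped padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def audit_jwt_header(token: str) -> list[str]:
    """Return audit leads for a JWT header (hypothetical helper, no signature check)."""
    header = json.loads(b64url_decode(token.split(".")[0]))
    alg = str(header.get("alg", "")).lower()
    notes = []
    if alg == "none":
        notes.append("alg=none: check whether unsigned tokens are accepted")
    if alg.startswith("hs"):
        notes.append("HMAC alg: check secret strength and RS/HS confusion")
    if "kid" in header:
        notes.append("kid present: check how the key id is resolved")
    if "jku" in header or "jwk" in header:
        notes.append("jku/jwk present: check whether supplied keys are trusted")
    return notes


def make_token(header: dict) -> str:
    """Build an unsigned demo token so the example runs offline."""
    def enc(obj: dict) -> str:
        raw = base64.urlsafe_b64encode(json.dumps(obj).encode())
        return raw.rstrip(b"=").decode()
    return enc(header) + "." + enc({"sub": "demo"}) + "."


for lead in audit_jwt_header(make_token({"alg": "HS256", "kid": "1"})):
    print(lead)
```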

Deterministic post-pass (new crates/harness/src/hygiene.rs, wired into finish()):
- calibrate severity to PROVEN impact — unproven High/Critical (hedged, no
  payload, thin evidence) capped to Medium and re-titled "(potential)";
- depth_audit — flag exposures on a host with no real exploit;
- hygiene_summary — advise consolidating hygiene classes repeated across assets.
Unit tests cover calibration + depth audit.
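
The calibration rule above can be sketched roughly as follows. This is a Python illustration only, not the Rust code in crates/harness/src/hygiene.rs. The `Finding` fields, the hedge-word list, and the reading of "thin evidence" as "no evidence recorded" are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, replace

# Assumed hedge words; the real post-pass may use a different list.
HEDGE = re.compile(r"\b(could|may|might|potential(ly)?)\b", re.IGNORECASE)


@dataclass(frozen=True)
class Finding:
    title: str
    severity: str
    description: str
    payload: str = ""   # concrete PoC, empty if none was recorded
    evidence: str = ""  # request/response receipt, empty if none


def calibrate(finding: Finding) -> Finding:
    """Cap unproven High/Critical findings to Medium and re-title them."""
    if finding.severity not in {"High", "Critical"}:
        return finding
    unproven = (
        bool(HEDGE.search(finding.description))
        or not finding.payload
        or not finding.evidence
    )
    if not unproven:
        return finding
    title = finding.title
    if not title.endswith("(potential)"):
        title = f"{title} (potential)"
    return replace(finding, severity="Medium", title=title)


hedged = Finding("SQL injection", "High", "Parameter id may be injectable")
print(calibrate(hedged))
```

A finding that carries a payload, a receipt, and no hedged wording passes through unchanged, so the cap only touches claims that lack proof.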

5 new doctrine meta-agents (scripts/build_methodology_v352.py → agents_md/meta/):
exploit_depth_doctrine, finding_chainer, artifact_decoder, token_auditor,
report_calibrator (meta 17→22, total 343→348).

Version bumped 3.5.1 → 3.5.2 across crates/app/installers/docs; RELEASE/README
updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 11:31:11 -03:00


Report Calibrator Agent

Meta-agent (v3.5.2 doctrine). Deduplicates findings by class, calibrates severity to proven impact, and demands evidence for every claim.

User Prompt

Before the final report for {target}, clean and calibrate the findings:

  1. Consolidate hygiene by class. Merge repeated hygiene findings (missing security headers, clickjacking, cookie flags, weak TLS, HSTS, version/banner disclosure) into ONE finding per class with an affected-asset TABLE — do not inflate the count one-per-host.
  2. Calibrate severity to PROVEN impact. High/Critical requires demonstrated impact with evidence. Unproven DoS/abuse, "could/may/potential" language, or a finding with no concrete payload/PoC → cap to Low/Medium or mark "(potential)". Recompute the CVSS vector to match the proven impact.
  3. Evidence per claim. Every finding — and every item in the "tests performed" log — must carry a concrete request/response receipt; flag any claim that has none, and any contradiction between the test log and the findings.

Output JSON: {merged:[{class, severity, assets:[...]}], recalibrated:[{id, old_severity, new_severity, reason}], unevidenced:[{id_or_test, missing}]}.
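
To make the output shape concrete, here is a minimal sketch that merges per-host hygiene findings into one entry per class and lists items with no receipt. The input field names (`asset`, `receipt`), the severity ordering, and the choice to keep the highest severity within a class are assumptions for illustration. The prompt itself only fixes the three top-level keys.

```python
import json
from collections import defaultdict

SEVERITY_ORDER = ["Info", "Low", "Medium", "High", "Critical"]


def merge_hygiene(findings: list[dict]) -> list[dict]:
    """Group findings by class with a sorted asset list (highest severity kept)."""
    grouped: dict[str, dict] = defaultdict(
        lambda: {"severity": "Info", "assets": set()}
    )
    for f in findings:
        entry = grouped[f["class"]]
        entry["assets"].add(f["asset"])
        if SEVERITY_ORDER.index(f["severity"]) > SEVERITY_ORDER.index(entry["severity"]):
            entry["severity"] = f["severity"]
    return [
        {"class": cls, "severity": e["severity"], "assets": sorted(e["assets"])}
        for cls, e in sorted(grouped.items())
    ]


def unevidenced(items: list[dict]) -> list[dict]:
    """List findings or test-log entries that lack a request/response receipt."""
    return [
        {"id_or_test": i["id"], "missing": "request/response receipt"}
        for i in items
        if not i.get("receipt")
    ]


findings = [
    {"id": "F1", "class": "missing security headers", "severity": "Low",
     "asset": "a.example.com", "receipt": "GET / -> 200"},
    {"id": "F2", "class": "missing security headers", "severity": "Low",
     "asset": "b.example.com", "receipt": ""},
    {"id": "F3", "class": "weak TLS", "severity": "Medium",
     "asset": "a.example.com", "receipt": "TLS 1.0 handshake accepted"},
]
report = {
    "merged": merge_hygiene(findings),
    "recalibrated": [],  # filled by the severity calibration step
    "unevidenced": unevidenced(findings),
}
print(json.dumps(report, indent=2))
```

Three per-host findings collapse into two class-level entries, and F2 is flagged because its receipt is empty.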

System Prompt

You are a meticulous report editor. You group hygiene by class with an asset table, calibrate every severity to demonstrated impact (no inflated High/Critical, no padding the count with duplicates), and require a real receipt behind every claim — including each line of the tests-performed log. Honest, deduplicated, evidence-backed reporting only. Credits: Joas A Santos and Red Team Leaders.