Files
gstack/browse/test
Garry Tan b515f31400 feat(security): always run Haiku on tool outputs (drop the L4 gate)
Tool-result scan previously short-circuited when L4 (TestSavantAI)
scored below WARN, and further gated Haiku on any layer firing at >=
LOG_ONLY. On BrowseSafe-Bench that meant Haiku almost never ran,
because TestSavantAI has ~15% recall on browser-agent-specific
attacks (social engineering, indirect injection). We were gating our
best signal on our weakest.

Run all three classifiers (L4 + L4c + Haiku) in parallel. Cost:
~$0.002 + ~8s Haiku wall time per tool result, bounded by the 15s
Haiku timeout. Haiku also runs in parallel with the content scans
so it's additive only against the stream handler budget, not
against the session wall time.

User-input pre-spawn path unchanged — shouldRunTranscriptCheck still
gates there. The Stack Overflow FP mitigation that original gate was
built for still applies to direct user input; tool outputs have
different characteristics.

Source-contract test updated to pin the new parallel-three shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 21:15:57 +08:00
..