mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-06 05:35:46 +02:00
5d968c43ec
Two bugs that made checkTranscript return degraded on every call:
1. --model 'haiku-4-5' returns 404 from the Claude CLI. The accepted
shorthand is 'haiku' (resolves to claude-haiku-4-5-20251001
today, stays on the latest Haiku as models roll). Symptom: every
call exited non-zero with api_error_status=404.
2. 2000ms timeout is below the floor. Fresh `claude -p` spawn has
~2-3s CLI cold-start + 5-12s inference on ~1KB prompts. With the
wrong model gone, every successful call still timed out before it
returned. Measured: 0% firing rate.
Fix: model alias + 15s timeout. Sanity check against DAN-style
injection now returns confidence 0.99 with reasoning ("Tool output
contains multiple injection patterns: instruction override, jailbreak
attempt (DAN), system prompt exfil request, and malicious curl
command to attacker domain") in 8.7s.
This was the silent cause of the 15.3% detection rate on
BrowseSafe-Bench — the ensemble numbers matched L4-alone because
Haiku never actually voted.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>