gstack

mirror of https://github.com/garrytan/gstack.git synced 2026-07-07 16:48:01 +02:00

Garry Tan 5d968c43ec fix(security): unbreak Haiku transcript classifier — wrong model + too-tight timeout

Two bugs that made checkTranscript return degraded on every call:

1. --model 'haiku-4-5' returns 404 from the Claude CLI. The accepted
   shorthand is 'haiku' (resolves to claude-haiku-4-5-20251001
   today, stays on the latest Haiku as models roll). Symptom: every
   call exited non-zero with api_error_status=404.

2. 2000ms timeout is below the floor. Fresh `claude -p` spawn has
   ~2-3s CLI cold-start + 5-12s inference on ~1KB prompts. With the
   wrong model gone, every successful call still timed out before it
   returned. Measured: 0% firing rate.

Fix: model alias + 15s timeout. Sanity check against DAN-style
injection now returns confidence 0.99 with reasoning ("Tool output
contains multiple injection patterns: instruction override, jailbreak
attempt (DAN), system prompt exfil request, and malicious curl
command to attacker domain") in 8.7s.
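The timeout arithmetic behind the fix can be sketched as follows. This is a minimal illustration, not the code from the commit: the latency figures, the 'haiku' alias, `claude -p`, `--model`, and the 15s timeout come from the message above, while every constant name, type, and helper function here (and the argument order) is a hypothetical chosen for the example.

```typescript
// Hypothetical names throughout; only the numbers, `claude -p`, `--model`,
// and the 'haiku' alias are taken from the commit message.
const CLASSIFIER_MODEL = "haiku"; // alias, not 'haiku-4-5' (which returned 404)
const CLASSIFIER_TIMEOUT_MS = 15_000; // was 2000

interface LatencyRange {
  minMs: number;
  maxMs: number;
}

// Figures quoted in the commit message for a fresh `claude -p` spawn.
const COLD_START: LatencyRange = { minMs: 2_000, maxMs: 3_000 };
const INFERENCE: LatencyRange = { minMs: 5_000, maxMs: 12_000 };

// Sum the lower and upper bounds of each phase to get the end-to-end range.
function totalLatency(...parts: LatencyRange[]): LatencyRange {
  return parts.reduce(
    (acc, p) => ({ minMs: acc.minMs + p.minMs, maxMs: acc.maxMs + p.maxMs }),
    { minMs: 0, maxMs: 0 },
  );
}

type Verdict = "always-times-out" | "sometimes-times-out" | "covers-range";

// Compare a timeout against the expected end-to-end latency range.
function assessTimeout(timeoutMs: number, expected: LatencyRange): Verdict {
  if (timeoutMs < expected.minMs) return "always-times-out";
  if (timeoutMs < expected.maxMs) return "sometimes-times-out";
  return "covers-range";
}

// Argument list for the CLI call (argument order is an assumption).
function buildClassifierArgs(prompt: string): string[] {
  return ["-p", prompt, "--model", CLASSIFIER_MODEL];
}

const expected = totalLatency(COLD_START, INFERENCE);
console.log(expected); // { minMs: 7000, maxMs: 15000 }
console.log(assessTimeout(2_000, expected)); // always-times-out
console.log(assessTimeout(CLASSIFIER_TIMEOUT_MS, expected)); // covers-range
```

Under these figures the old 2000ms limit sits below even the best-case 7s round trip, which is consistent with the 0% firing rate, while 15s only just reaches the quoted worst case and leaves no headroom beyond it.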

This was the silent cause of the 15.3% detection rate on
BrowseSafe-Bench — the ensemble numbers matched L4-alone because
Haiku never actually voted.
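Why a permanently degraded voter leaves the ensemble score unchanged can be shown with a small sketch. The combination rule below (flag if any layer that actually answered flags the input) is an assumption made for illustration; the commit message does not say how gstack combines votes, and the `Vote` type and `ensembleFlags` helper are hypothetical.

```typescript
// Hypothetical vote shape: a layer either answers ("ok") or abstains ("degraded").
type Vote = { layer: string; status: "ok" | "degraded"; flagged: boolean };

// ASSUMPTION: the ensemble flags an input when any non-degraded layer flags it.
function ensembleFlags(votes: Vote[]): boolean {
  return votes.some((v) => v.status === "ok" && v.flagged);
}

// A classifier that fails every call (404 or timeout) always abstains.
const brokenHaiku: Vote = { layer: "haiku", status: "degraded", flagged: false };

// Made-up L4 verdicts for four inputs, purely to exercise the rule.
const l4Results = [true, false, false, true];

const ensembleResults = l4Results.map((flagged) =>
  ensembleFlags([{ layer: "L4", status: "ok", flagged }, brokenHaiku]),
);

console.log(ensembleResults); // [ true, false, false, true ]
```

With the second voter always abstaining, the ensemble verdicts are identical to the L4 verdicts, so any detection rate computed from them would match L4 alone.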

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-20 21:15:44 +08:00

bin

feat: multi-agent support — gstack works on Codex, Gemini CLI, and Cursor (v0.9.0) (#226)

2026-03-19 18:20:50 -07:00

scripts

fix: ngrok Windows build + close CI error-swallowing gap (v0.18.0.1) (#1024)

2026-04-16 13:49:04 -07:00

src

fix(security): unbreak Haiku transcript classifier — wrong model + too-tight timeout

2026-04-20 21:15:44 +08:00

test

test(security): full-stack review E2E — real classifier + mock-claude