mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-02 03:35:09 +02:00
4a56b882ab
- Accept error_max_turns as valid exit for planted-bug evals (agent may have written partial report before running out of turns)
- Browse snapshot: log browseErrors as warnings instead of hard assertions (agent sometimes hallucinates paths like "baltimore" vs "bangalore")
- Fall back to result.output when no report file exists
- What matters is detection rate (outcome judge), not turn completion

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
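
A minimal sketch of the relaxed checks this commit describes, for illustration only. The `EvalResult` shape, the `exitReason` field, the `"success"` value, and the helper names `isAcceptableExit` and `collectReport` are assumptions, not the repository's actual code; only `error_max_turns`, `browseErrors`, and `result.output` come from the message itself.

```typescript
import { existsSync, readFileSync } from "node:fs";

// Hypothetical result shape; the real eval harness may differ.
interface EvalResult {
  exitReason: string;     // assumed field name, e.g. "success" or "error_max_turns"
  output: string;         // final text output of the agent run
  browseErrors: string[]; // errors recorded while browsing
}

// Exits treated as valid: the agent may have written a partial report
// before running out of turns, so error_max_turns is accepted.
const ACCEPTED_EXITS = new Set(["success", "error_max_turns"]);

function isAcceptableExit(result: EvalResult): boolean {
  return ACCEPTED_EXITS.has(result.exitReason);
}

// Returns the text the outcome judge should score.
function collectReport(result: EvalResult, reportPath: string): string {
  // Browse errors are logged as warnings rather than failing the eval.
  for (const err of result.browseErrors) {
    console.warn(`browse error (non-fatal): ${err}`);
  }
  // Prefer the report file; fall back to the raw output when it is missing.
  if (existsSync(reportPath)) {
    return readFileSync(reportPath, "utf8");
  }
  return result.output;
}
```

Under these assumptions, a run that stops on `error_max_turns` without writing a report file still yields text for the outcome judge, so the detection rate can be measured independently of turn completion.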