Files
gstack/test
Garry Tan f1581e6ff7 chore: upgrade eval judge to Sonnet 4.6, update changelog
Switch LLM-as-judge evals from Haiku to Sonnet 4.6 for more stable,
nuanced scoring. Add changelog entry for all eval improvements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 00:12:48 -05:00
..