CalvinBackup/gstack
mirror of https://github.com/garrytan/gstack.git synced 2026-05-02 11:45:20 +02:00
garrytan/check-evals
gstack/test
Garry Tan f1581e6ff7 chore: upgrade eval judge to Sonnet 4.6, update changelog
Switch LLM-as-judge evals from Haiku to Sonnet 4.6 for more stable,
nuanced scoring. Add changelog entry for all eval improvements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 00:12:48 -05:00
..
helpers                   feat: SKILL.md template system, 3-tier testing, DX tools (v0.3.3) (#41)  2026-03-13 21:08:12 -07:00
gen-skill-docs.test.ts    test: add usage consistency and pipe guard tests                         2026-03-14 00:12:42 -05:00
skill-e2e.test.ts         feat: SKILL.md template system, 3-tier testing, DX tools (v0.3.3) (#41)  2026-03-13 21:08:12 -07:00
skill-llm-eval.test.ts    chore: upgrade eval judge to Sonnet 4.6, update changelog                2026-03-14 00:12:48 -05:00
skill-parser.test.ts      feat: SKILL.md template system, 3-tier testing, DX tools (v0.3.3) (#41)  2026-03-13 21:08:12 -07:00
skill-validation.test.ts  test: add usage consistency and pipe guard tests                         2026-03-14 00:12:42 -05:00