feat: add eval:trend CLI for per-test pass rate tracking

computeTrends() classifies each test as stable-pass, stable-fail, flaky,
improving, or degrading based on pass rate, flip count, and recent streak.
gstack eval trend prints a sparkline table and accepts --limit, --tier,
and --test filters. Guard the CLI main block with import.meta.main so it
does not execute on import.
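
To make the classification above concrete, here is a minimal sketch of how pass rate, flip count, and recent streak could map to the five labels. It is not the computeTrends() from this commit: the function name classifyTrend, the TrendSummary shape, the streak threshold of 3, the rule order, and the sparkline glyphs are all assumptions chosen for illustration.

```typescript
// Hypothetical sketch. Names, thresholds, and rule order are assumptions,
// not the actual computeTrends() implementation in this commit.
type TrendStatus =
  | "stable-pass"
  | "stable-fail"
  | "flaky"
  | "improving"
  | "degrading";

interface TrendSummary {
  passRate: number; // fraction of runs that passed, 0..1
  flipCount: number; // pass<->fail transitions between consecutive runs
  recentStreak: number; // length of the run of identical results at the end
  status: TrendStatus;
  sparkline: string; // one glyph per run, oldest first
}

// Assumed threshold: this many identical trailing results count as a trend.
const STREAK_THRESHOLD = 3;

function classifyTrend(results: boolean[]): TrendSummary {
  if (results.length === 0) {
    throw new Error("classifyTrend needs at least one result");
  }

  const passRate = results.filter((r) => r).length / results.length;

  // Count how often the outcome changed from one run to the next.
  let flipCount = 0;
  for (let i = 1; i < results.length; i++) {
    if (results[i] !== results[i - 1]) flipCount++;
  }

  // Measure the streak of identical outcomes ending at the latest run.
  const last = results[results.length - 1] === true;
  let recentStreak = 0;
  for (let i = results.length - 1; i >= 0 && results[i] === last; i--) {
    recentStreak++;
  }

  let status: TrendStatus;
  if (passRate === 1) {
    status = "stable-pass";
  } else if (passRate === 0) {
    status = "stable-fail";
  } else if (recentStreak >= STREAK_THRESHOLD) {
    // Mixed history that has settled recently: direction follows the latest run.
    status = last ? "improving" : "degrading";
  } else {
    status = "flaky";
  }

  // Tall bar for a pass, low bar for a fail.
  const sparkline = results.map((r) => (r ? "█" : "▁")).join("");

  return { passRate, flipCount, recentStreak, status, sparkline };
}

// Usage: two failures followed by three passes.
const example = classifyTrend([false, false, true, true, true]);
console.log(example.status, example.sparkline); // improving ▁▁███
```

Checking the all-pass and all-fail cases first keeps the stable labels unambiguous; only mixed histories fall through to the streak rule, and anything without a settled recent streak is treated as flaky.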

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Garry Tan
2026-03-15 16:47:41 -05:00
parent 59752fc510
commit daea165333
3 changed files with 385 additions and 1 deletions
@@ -21,6 +21,7 @@
"eval:list": "bun run lib/cli-eval.ts list",
"eval:compare": "bun run lib/cli-eval.ts compare",
"eval:summary": "bun run lib/cli-eval.ts summary",
"eval:trend": "bun run lib/cli-eval.ts trend",
"eval:watch": "bun run lib/cli-eval.ts watch"
},
"dependencies": {