gstack

mirror of https://github.com/garrytan/gstack.git synced 2026-06-26 11:39:58 +02:00

Files

Garry Tan a468374272 fix: enrich SKILL.md docs to pass LLM evals, upgrade judge to Sonnet 4.6 (#43)

* fix: enrich command descriptions and snapshot flags for LLM eval quality

14 command descriptions enriched with specific arg formats, valid values,
error behavior, and return types. Fixed header usage from <name> <value>
to <name>:<value>. Added cookie usage syntax. Snapshot flags now show
long names, ref numbering, and output format examples.

* refactor: auto-generate server.ts help text from COMMAND_DESCRIPTIONS

Replace hand-maintained help block with generateHelpText() that reads
from COMMAND_DESCRIPTIONS and SNAPSHOT_FLAGS. Eliminates help text
drift from source of truth.
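
To illustrate the refactor described above, here is a minimal sketch of a help generator that builds its output from the same tables the documentation uses. The shapes of `COMMAND_DESCRIPTIONS` and `SNAPSHOT_FLAGS`, the `goto` entry, and the `-x` / `--example-flag` flag are assumptions made up for this example; the actual definitions in gstack may look different.

```typescript
// Hypothetical entry shapes; the real tables in gstack may differ.
interface CommandDescription {
  usage: string;
  description: string;
}

interface SnapshotFlag {
  short: string;
  long: string;
  description: string;
}

// Illustrative data only. The header usage format comes from the commit
// message; the other entries are placeholders.
const COMMAND_DESCRIPTIONS: Record<string, CommandDescription> = {
  header: { usage: "header <name>:<value>", description: "Set a request header" },
  goto: { usage: "goto <url>", description: "Navigate to a URL" },
};

const SNAPSHOT_FLAGS: SnapshotFlag[] = [
  { short: "-x", long: "--example-flag", description: "Placeholder flag" },
];

// Build the help text from the tables so it cannot drift from them.
function generateHelpText(
  commands: Record<string, CommandDescription>,
  flags: SnapshotFlag[],
): string {
  const entries = Object.values(commands);
  // Pad usage strings to a common width so descriptions line up.
  const width = Math.max(...entries.map((c) => c.usage.length));
  const commandLines = entries.map(
    (c) => `  ${c.usage.padEnd(width)}  ${c.description}`,
  );
  const flagLines = flags.map(
    (f) => `  ${f.short}, ${f.long}  ${f.description}`,
  );
  return ["Commands:", ...commandLines, "", "Snapshot flags:", ...flagLines].join("\n");
}

console.log(generateHelpText(COMMAND_DESCRIPTIONS, SNAPSHOT_FLAGS));
```

Because the help text is derived rather than hand-written, changing a usage string in the table changes the help output in the same step.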

* test: add usage consistency and pipe guard tests

Usage consistency test cross-checks Usage: patterns in implementation
against COMMAND_DESCRIPTIONS using structural skeleton comparison.
Pipe guard test ensures descriptions don't contain | which would break
markdown table rendering.
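
The two checks described above could look roughly like the sketch below. The helper names (`usageSkeleton`, `findUsageMismatches`, `findPipeViolations`) and the exact definition of a "structural skeleton" are assumptions for illustration, not the code in the repository's test suite.

```typescript
// Hypothetical skeleton: replace every <placeholder> with "<>" and collapse
// whitespace, so only the structure of the usage string is compared.
function usageSkeleton(usage: string): string {
  return usage
    .replace(/<[^>]*>/g, "<>")
    .replace(/\s+/g, " ")
    .trim();
}

// Return the command names whose implementation usage and documented usage
// differ in structure (placeholder names are ignored).
function findUsageMismatches(
  implementationUsages: Record<string, string>,
  documentedUsages: Record<string, string>,
): string[] {
  return Object.keys(implementationUsages).filter(
    (name) =>
      usageSkeleton(implementationUsages[name]) !==
      usageSkeleton(documentedUsages[name] ?? ""),
  );
}

// Return the command names whose description contains a pipe character,
// which would be read as a column separator in a markdown table.
function findPipeViolations(descriptions: Record<string, string>): string[] {
  return Object.keys(descriptions).filter((name) =>
    descriptions[name].includes("|"),
  );
}

// Example: the old header usage differs structurally from the new one,
// while a renamed placeholder alone does not count as a mismatch.
const mismatches = findUsageMismatches(
  { header: "header <name> <value>", goto: "goto <url>" },
  { header: "header <name>:<value>", goto: "goto <target>" },
);
console.log(mismatches);
```

Comparing skeletons rather than raw strings lets the documentation rename a placeholder without failing the test, while still catching a changed separator or argument count.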

* chore: upgrade eval judge to Sonnet 4.6, update changelog

Switch LLM-as-judge evals from Haiku to Sonnet 4.6 for more stable,
nuanced scoring. Add changelog entry for all eval improvements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-13 22:14:14 -07:00

bin

feat: v0.3.2 — project-local state, diff-aware QA, Greptile integration (#36)

2026-03-13 18:10:56 -07:00

src

fix: enrich SKILL.md docs to pass LLM evals, upgrade judge to Sonnet 4.6 (#43)

2026-03-13 22:14:14 -07:00

test

feat: v0.3.2 — project-local state, diff-aware QA, Greptile integration (#36)

2026-03-13 18:10:56 -07:00

SKILL.md

fix: enrich SKILL.md docs to pass LLM evals, upgrade judge to Sonnet 4.6 (#43)

2026-03-13 22:14:14 -07:00

SKILL.md.tmpl

feat: SKILL.md template system, 3-tier testing, DX tools (v0.3.3) (#41)

2026-03-13 21:08:12 -07:00