mirror of https://github.com/garrytan/gstack.git synced 2026-05-01 19:25:10 +02:00


Garry Tan a67dae5f84 fix: update check preamble exits 1 when up to date — convert all skills to .tmpl

The `[ -n "$_UPD" ] && echo "$_UPD"` line in 5 skills was missing `|| true`,
causing exit code 1 when the update check finds no update (empty $_UPD).

Fix: convert ship/, review/, plan-ceo-review/, plan-eng-review/, retro/ to
.tmpl templates using {{UPDATE_CHECK}} placeholder (same as browse/qa/etc).
All 9 skills now generated from templates — preamble changes propagate everywhere.

Also: regenerates qa/SKILL.md which had drifted from its template, adds 12 tests
validating the update check preamble exits 0 in all skills, removes completed TODO.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-14 04:40:46 -05:00


TODOS

Auto-upgrade mode (zero-prompt)

What: Add a GSTACK_AUTO_UPGRADE=1 env var or ~/.gstack/config option that skips the AskUserQuestion prompt and upgrades automatically when a new version is detected.

Why: Power users and CI environments may want zero-friction upgrades without being asked every time.

Context: The current upgrade system (v0.3.4) always prompts via AskUserQuestion. This TODO adds an opt-in bypass. Implementation is ~10 lines in the preamble instructions: check for the env var/config before calling AskUserQuestion, and if set, go straight to the upgrade flow. Depends on the full upgrade system being stable first — wait for user feedback on the prompt-based flow before adding this.

Effort: S (small)

Priority: P3 (nice-to-have, revisit after adoption data)
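The opt-in check described above could look roughly like the sketch below. It is only an illustration: the helper name `shouldAutoUpgrade`, the `auto_upgrade` config key, and the `key=value` config format are all assumptions, since the TODO names only the `GSTACK_AUTO_UPGRADE=1` env var and a `~/.gstack/config` option without specifying its syntax.

```typescript
// Hypothetical helper: decides whether the AskUserQuestion prompt can be skipped.
// ASSUMPTIONS: the config key `auto_upgrade` and the key=value file format are
// invented for illustration; only GSTACK_AUTO_UPGRADE=1 comes from the TODO.
type Env = Record<string, string | undefined>;

function shouldAutoUpgrade(env: Env, configText: string = ""): boolean {
  // The env var wins: only the literal "1" opts in.
  if (env["GSTACK_AUTO_UPGRADE"] === "1") return true;

  // Otherwise look for a line such as `auto_upgrade=true` in the config text.
  for (const rawLine of configText.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a key=value line
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim().toLowerCase();
    if (key === "auto_upgrade") return value === "true" || value === "1";
  }
  return false; // default: keep prompting
}
```

Defaulting to `false` keeps the prompt-based flow unchanged for anyone who has not opted in, which matches the TODO's intent that this is a bypass rather than a new default.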

GitHub Actions eval upload

What: Run eval suite in CI, upload result JSON as artifact, post summary comment on PR.

Why: Currently evals only run locally. CI integration would catch quality regressions before merge and provide a persistent record of eval results per PR.

Context: Requires ANTHROPIC_API_KEY in CI secrets. Cost is ~$4/run. The eval persistence system (v0.3.6) writes JSON to ~/.gstack-dev/evals/ — CI would upload these as GitHub Actions artifacts and use eval:compare to post a delta comment on the PR.

Depends on: Eval persistence shipping (v0.3.6).

Effort: M (medium)

Priority: P2
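As a rough sketch of the delta comment this item describes, the function below formats the difference between two eval runs as text suitable for a PR comment. The `EvalRun` shape (`passed`, `total`, `costUsd`) and the name `formatDelta` are hypothetical; the actual JSON schema written to ~/.gstack-dev/evals/ and the output of eval:compare are not specified here.

```typescript
// ASSUMPTION: this record shape is invented for illustration and is not the
// real schema of the eval JSON files.
interface EvalRun {
  passed: number;
  total: number;
  costUsd: number;
}

// Builds a two-line summary comparing a PR's run (head) to a baseline (base).
function formatDelta(base: EvalRun, head: EvalRun): string {
  // Pass rate as a percentage; an empty run counts as 0%.
  const rate = (r: EvalRun): number => (r.total === 0 ? 0 : (r.passed / r.total) * 100);
  // Prefix non-negative numbers with "+" so the direction of change is explicit.
  const signed = (n: number, digits: number): string => (n >= 0 ? "+" : "") + n.toFixed(digits);

  const rateDelta = rate(head) - rate(base);
  const costDelta = head.costUsd - base.costUsd;
  return [
    `Pass rate: ${rate(head).toFixed(1)}% (${signed(rateDelta, 1)} pts)`,
    `Cost: $${head.costUsd.toFixed(2)} (${signed(costDelta, 2)})`,
  ].join("\n");
}
```

In CI, the returned string would be the body of the summary comment, while the raw JSON files are uploaded separately as artifacts.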

Eval web dashboard

What: bun run eval:dashboard serves local HTML with charts: cost trending, detection rate over time, pass/fail history.

Why: The CLI tools (eval:list, eval:compare, eval:summary) are good for quick checks but visual charts are better for spotting trends over many runs.

Context: Reads the same ~/.gstack-dev/evals/*.json files. ~200 lines HTML + chart.js code served via a simple Bun HTTP server. No external dependencies beyond what's already installed.

Depends on: Eval persistence + eval:list shipping (v0.3.6).

Effort: M (medium)

Priority: P3 (nice-to-have, revisit after eval system sees regular use)
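Before any charting, the dashboard would need to turn the stored runs into time-ordered series. The sketch below shows one way that step might look. The `EvalRecord` fields and the name `toChartSeries` are assumptions made for illustration, and timestamps are assumed to be ISO 8601 strings in a single consistent format so that plain string comparison orders them chronologically.

```typescript
// ASSUMPTION: field names are hypothetical, not the real eval JSON schema.
interface EvalRecord {
  timestamp: string; // assumed ISO 8601, e.g. "2026-03-14T09:40:46Z"
  passed: number;
  total: number;
  costUsd: number;
}

interface ChartSeries {
  labels: string[];   // x-axis: run timestamps, oldest first
  cost: number[];     // cost per run in USD
  passRate: number[]; // fraction of passing evals per run, 0..1
}

function toChartSeries(records: EvalRecord[]): ChartSeries {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...records].sort((a, b) =>
    a.timestamp < b.timestamp ? -1 : a.timestamp > b.timestamp ? 1 : 0
  );
  return {
    labels: sorted.map((r) => r.timestamp),
    cost: sorted.map((r) => r.costUsd),
    passRate: sorted.map((r) => (r.total === 0 ? 0 : r.passed / r.total)),
  };
}
```

The resulting arrays map directly onto the labels and dataset values a line chart expects, so the HTML page only has to embed them as JSON.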
