Mirror of https://github.com/garrytan/gstack.git (synced 2026-05-02 03:35:09 +02:00)
docs: add user-facing benefit descriptions to v0.4.0 changelog
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
+10
-10
@@ -3,17 +3,17 @@
 ## 0.4.0 — 2026-03-16

 ### Added
-- **QA-only skill** (`/qa-only`) — report-only QA mode that finds and documents bugs without making fixes. Uses `allowedTools` to block `Edit` tool entirely.
-- **QA fix loop** — `/qa` now runs a find-fix-verify cycle: discover bugs, fix them, commit, re-navigate to confirm the fix took.
-- **Plan-to-QA artifact flow** — `/plan-eng-review` writes test-plan artifacts to `~/.gstack/projects/<slug>/` that `/qa` picks up for targeted testing.
-- **`{{QA_METHODOLOGY}}` DRY placeholder** — shared QA methodology block injected into both `/qa` and `/qa-only` SKILL.md templates via gen-skill-docs.
-- **Eval efficiency metrics** — turns, duration, and cost now displayed across all eval surfaces (summary, comparison, list, watch). Comparison output includes natural-language **Takeaway** commentary interpreting deltas.
-- **`generateCommentary()` engine** — pure function that interprets comparison deltas: flags regressions, notes improvements, reports per-test efficiency changes, and produces overall summary.
-- **Eval list columns** — `bun run eval:list` now shows Turns and Duration per run.
-- **Eval summary per-test efficiency** — `bun run eval:summary` shows average turns/duration/cost per test across runs.
+- **QA-only skill** (`/qa-only`) — report-only QA mode that finds and documents bugs without making fixes. Hand off a clean bug report to your team without the agent touching your code.
+- **QA fix loop** — `/qa` now runs a find-fix-verify cycle: discover bugs, fix them, commit, re-navigate to confirm the fix took. One command to go from broken to shipped.
+- **Plan-to-QA artifact flow** — `/plan-eng-review` writes test-plan artifacts that `/qa` picks up automatically. Your engineering review now feeds directly into QA testing with no manual copy-paste.
+- **`{{QA_METHODOLOGY}}` DRY placeholder** — shared QA methodology block injected into both `/qa` and `/qa-only` templates. Keeps both skills in sync when you update testing standards.
+- **Eval efficiency metrics** — turns, duration, and cost now displayed across all eval surfaces with natural-language **Takeaway** commentary. See at a glance whether your prompt changes made the agent faster or slower.
+- **`generateCommentary()` engine** — interprets comparison deltas so you don't have to: flags regressions, notes improvements, and produces an overall efficiency summary.
+- **Eval list columns** — `bun run eval:list` now shows Turns and Duration per run. Spot expensive or slow runs instantly.
+- **Eval summary per-test efficiency** — `bun run eval:summary` shows average turns/duration/cost per test across runs. Identify which tests are costing you the most over time.
 - **`judgePassed()` unit tests** — extracted and tested the pass/fail judgment logic.
 - **3 new E2E tests** — qa-only no-fix guardrail, qa fix loop with commit verification, plan-eng-review test-plan artifact.
-- **Browser ref staleness detection** — `resolveRef()` now checks element count to detect stale refs after page mutations.
+- **Browser ref staleness detection** — `resolveRef()` now checks element count to detect stale refs after page mutations. SPA navigation no longer causes 30-second timeouts on missing elements.
 - 3 new snapshot tests for ref staleness.

 ### Changed
@@ -23,7 +23,7 @@
 - `eval-store.test.ts` fixed pre-existing `_partial` file assertion bug.

 ### Fixed
-- Browser ref staleness — refs collected before page mutation (e.g. SPA navigation) are now detected and re-collected.
+- Browser ref staleness — refs collected before page mutation (e.g. SPA navigation) are now detected and re-collected. Eliminates a class of flaky QA failures on dynamic sites.

 ## 0.3.9 — 2026-03-15

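The `generateCommentary()` entry in the diff describes a pure function that reads comparison deltas, flags regressions, notes improvements, and ends with an overall summary. The sketch below shows one way such a function could look. It is not gstack's implementation: `TestRun`, `commentarySketch`, the message wording, and the restriction to turn counts are all assumptions made for illustration. Duration and cost could be compared the same way.

```typescript
// Hypothetical shape for one test's result in an eval run (not from gstack).
interface TestRun {
  passed: boolean;
  turns: number;
}

// Pure function: same inputs always produce the same notes, no I/O.
function commentarySketch(
  before: Record<string, TestRun>,
  after: Record<string, TestRun>,
): string[] {
  const notes: string[] = [];
  let sharedTests = 0;
  let turnsBefore = 0;
  let turnsAfter = 0;

  for (const [name, a] of Object.entries(after)) {
    const b = before[name];
    if (b === undefined) continue; // test is new, so there is nothing to compare

    sharedTests += 1;
    turnsBefore += b.turns;
    turnsAfter += a.turns;

    // Pass/fail changes come first because they matter most.
    if (b.passed && !a.passed) notes.push(`REGRESSION: ${name} passed before and fails now`);
    if (!b.passed && a.passed) notes.push(`IMPROVED: ${name} now passes`);

    // Per-test efficiency change, reported only when the turn count moved.
    const delta = a.turns - b.turns;
    if (delta !== 0) {
      const sign = delta > 0 ? "+" : "";
      notes.push(`${name}: turns ${b.turns} -> ${a.turns} (${sign}${delta})`);
    }
  }

  // Overall summary across the tests present in both runs.
  notes.push(`Overall: turns ${turnsBefore} -> ${turnsAfter} across ${sharedTests} shared test(s)`);
  return notes;
}
```

Keeping the function pure means it can be unit-tested with plain objects, with no eval run needed to exercise it.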
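The staleness entries in the diff say `resolveRef()` now checks element count to detect refs collected before a page mutation. A minimal sketch of that idea follows, assuming each ref remembers its selector and how many elements matched when it was collected. `RefEntry`, `CountFn`, and `resolveRefSketch` are hypothetical names, and the page is reduced to a counting callback so the example runs without a browser. The real function's signature and re-collection step are not shown in the diff.

```typescript
// Hypothetical record stored for each ref at snapshot time (not from gstack).
interface RefEntry {
  selector: string;      // how the element was located
  expectedCount: number; // how many elements matched when the ref was collected
}

// Stand-in for the page: returns how many elements match a selector right now.
type CountFn = (selector: string) => number;

type Resolution =
  | { ok: true; selector: string }
  | { ok: false; reason: string };

function resolveRefSketch(
  refs: Map<string, RefEntry>,
  ref: string,
  count: CountFn,
): Resolution {
  const entry = refs.get(ref);
  if (entry === undefined) {
    return { ok: false, reason: `unknown ref ${ref}` };
  }

  // Compare the live match count with the count seen at snapshot time.
  // A mismatch means the page changed underneath the ref, so fail fast
  // instead of waiting on an element that may no longer exist.
  const actual = count(entry.selector);
  if (actual !== entry.expectedCount) {
    return {
      ok: false,
      reason: `stale ref ${ref}: expected ${entry.expectedCount} match(es), found ${actual}`,
    };
  }
  return { ok: true, selector: entry.selector };
}
```

On a stale result, a caller could re-collect refs from a fresh snapshot and retry, which matches the "detected and re-collected" wording in the Fixed entry.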