gstack

mirror of https://github.com/garrytan/gstack.git synced 2026-07-21 15:01:00 +02:00

Author	SHA1	Message	Date
Garry Tan	c90feac2ca	fix: improve contributor mode + qa-quick E2E reliability Contributor mode: - Add "do not truncate" directive to template — agent was stopping after "My rating" without completing Steps/Raw output/What would make this a 10 sections - Restore assertions for Steps to reproduce and Date footer QA quick: - Make test server URL prominent: top of prompt, explicit "already running" and "do NOT discover ports" instructions - Bump session timeout 180s→240s and test timeout 240s→300s - Set B= at top of prompt (was buried in prose) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 11:18:27 -05:00
Garry Tan	685be01b00	feat: redesign contributor mode — periodic reflection with 0-10 rating Replace passive "report when things break" with active reflection: - Rate gstack experience 0-10 at workflow step boundaries - Historical calibration example (await bug) anchors the reporting bar - "What would make this a 10" field focuses on actionable improvements - Removed category lists in favor of judgment-based assessment	2026-03-16 11:17:47 -05:00
Garry Tan	3e3843c4a9	feat: contributor mode, session awareness, recommendation format (#90 ) * feat: contributor mode, session awareness, universal RECOMMENDATION format - Rename {{UPDATE_CHECK}} → {{PREAMBLE}} across all 10 skill templates - Add session tracking (touch ~/.gstack/sessions/$PPID, count active sessions) - ELI16 mode when 3+ concurrent sessions detected (re-ground user on context) - Contributor mode: auto-file field reports to ~/.gstack/contributor-logs/ - Universal AskUserQuestion format: context → question → RECOMMENDATION → options - Update plan-ceo-review and plan-eng-review to reference preamble baseline - Add vendored symlink awareness section to CLAUDE.md - Rewrite CONTRIBUTING.md with contributor workflow and cross-project testing - Add tests for contributor mode and session awareness in generated output - Add E2E eval for contributor mode report filing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add Enum & Value Completeness to /review critical checklist New CRITICAL review category that traces new enum values, status strings, and type constants through every consumer outside the diff. Catches the class of bugs where a new value is added but not handled in all switch/case chains, allowlists, or frontend-backend contracts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: bump v0.4.1, user-facing changelog, update qa-only template and architecture docs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add CHANGELOG style guide — user-facing, sell the feature Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: rewrite v0.4.1 changelog to be user-facing and sell the features Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add evals for RECOMMENDATION format, session awareness, and enum completeness Free tests (Tier 1): RECOMMENDATION format + session awareness in all preamble SKILL.md files, enum completeness checklist structure and CRITICAL classification. E2E eval: /review catches missed enum handlers when a new status value is added but not handled in case/switch and notify methods. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add E2E eval for session awareness ELI16 mode Stubs _SESSIONS=4, gives agent a decision point on feature/add-payments branch, verifies the output re-grounds the user with project, branch, context, and RECOMMENDATION — the ELI16 mode behavior for 3+ sessions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: contributor mode eval marked FAIL due to expected browse error The test intentionally runs a nonexistent binary to trigger contributor mode. The session runner's browse error detection catches "no such file or directory...browse" and sets browseErrors, causing recordE2E to mark passed=false. Override passed to check only exitReason since the browse error is the expected scenario. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 01:45:50 -05:00
Garry Tan	f3ee0ee28a	feat: QA restructure, browser ref staleness, eval efficiency metrics (v0.4.0) (#83 ) * feat: browser ref staleness detection via async count() validation resolveRef() now checks element count to detect stale refs after page mutations (e.g. SPA navigation). RefEntry stores role+name metadata for better diagnostics. 3 new snapshot tests for staleness detection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: qa-only skill, qa fix loop, plan-to-QA artifact flow Add /qa-only (report-only, Edit tool blocked), restructure /qa with find-fix-verify cycle, add {{QA_METHODOLOGY}} DRY placeholder for shared methodology. /plan-eng-review now writes test-plan artifacts to ~/.gstack/projects/<slug>/ for QA consumption. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: eval efficiency metrics — turns, duration, commentary across all surfaces Add generateCommentary() for natural-language delta interpretation, per-test turns/duration in comparison and summary output, judgePassed unit tests, 3 new E2E tests (qa-only, qa fix loop, plan artifact). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: bump version and changelog (v0.4.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: update ARCHITECTURE, BROWSER, CONTRIBUTING, README for v0.4.0 - ARCHITECTURE: add ref staleness detection section, update RefEntry type - BROWSER: add ref staleness paragraph to snapshot system docs - CONTRIBUTING: update eval tool descriptions with commentary feature - README: fix missing qa-only in project-local uninstall command Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add user-facing benefit descriptions to v0.4.0 changelog Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 23:55:39 -05:00
Garry Tan	bb46ca6b21	feat: smart update check with auto-upgrade, snooze backoff, config CLI (v0.3.9) (#62 ) * feat: add bin/gstack-config CLI for reading/writing ~/.gstack/config.yaml Simple get/set/list interface for persistent gstack configuration. Used by update-check and upgrade skill for auto_upgrade and update_check settings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: smart update check with 12h cache, snooze backoff, config disable - Reduce cache TTL from 24h to 12h for faster update detection - Add exponential snooze backoff: 24h → 48h → 1 week (resets on new version) - Add update_check: false config option to disable checks entirely - Clear snooze file on just-upgraded - 14 new tests covering snooze levels, expiry, corruption, and config paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: upgrade skill with auto-upgrade, 4-option prompt, vendored sync - Auto-upgrade mode via config or GSTACK_AUTO_UPGRADE=1 env var - 4-option AskUserQuestion: upgrade once, always, not now, never - Step 4.5: sync local vendored copy after upgrading primary install - Snooze write with escalating backoff on "Not now" - Update preamble text in gen-skill-docs for new upgrade flow - Regenerate all SKILL.md files Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: simplify upgrade instructions, move auto-upgrade to completed README now points to /gstack-upgrade instead of long paste commands. Auto-upgrade TODO moved to Completed section (v0.3.8). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: bump version and changelog (v0.3.9) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-14 23:28:02 -07:00
Garry Tan	41141007c1	feat: TODOS-aware skills, 2-tier Greptile replies, gitignore fix (#61 ) * fix: log non-ENOENT errors in ensureStateDir() instead of silently swallowing Replace bare catch {} with ENOENT-only silence. Non-ENOENT errors (EACCES, ENOSPC) are now logged to .gstack/browse-server.log. Includes test for permission-denied scenario with chmod 444. * feat: merge TODO.md + TODOS.md into unified backlog with shared format reference Merge TODO.md (roadmap) and TODOS.md (near-term) into one file organized by skill/component with P0-P4 priority ordering and Completed section. Add shared review/TODOS-format.md for canonical format. Add static validation tests. * feat: add 2-tier Greptile reply system with escalation detection Add reply templates (Tier 1 friendly, Tier 2 firm), explicit escalation detection algorithm, and severity re-ranking guidance to greptile-triage.md. * feat: cross-skill TODOS awareness + Greptile template refs in all skills /ship Step 5.5: auto-detect completed TODOs, offer reorganization. /review Step 5.5: cross-reference PR against open TODOs. /plan-ceo-review, /plan-eng-review: TODOS context in planning. /retro: Backlog Health metric. /qa: bug TODO context in diff-aware mode. All Greptile-aware skills now reference reply templates and escalation detection. * chore: bump version and changelog (v0.3.8) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: update CONTRIBUTING.md for v0.3.8 changes Clarify test tier cost table (Tier 3 standalone vs combined), add TODOS.md to "Things to know", mention Greptile triage in ship workflow description. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-14 20:15:11 -07:00
Garry Tan	a67dae5f84	fix: update check preamble exits 1 when up to date — convert all skills to .tmpl The `[ -n "$_UPD" ] && echo "$_UPD"` line in 5 skills was missing `\|\| true`, causing exit code 1 when the update check finds no update (empty $_UPD). Fix: convert ship/, review/, plan-ceo-review/, plan-eng-review/, retro/ to .tmpl templates using {{UPDATE_CHECK}} placeholder (same as browse/qa/etc). All 9 skills now generated from templates — preamble changes propagate everywhere. Also: regenerates qa/SKILL.md which had drifted from its template, adds 12 tests validating the update check preamble exits 0 in all skills, removes completed TODO. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-14 04:40:46 -05:00
Garry Tan	6b69c46a27	feat: daily update check + /gstack-upgrade skill (v0.3.4) (#42 ) * feat: add daily update check script + /gstack-upgrade skill bin/gstack-update-check: pure bash, checks VERSION against remote once/day, outputs UPGRADE_AVAILABLE or JUST_UPGRADED. Uses ~/.gstack/ for state. gstack-upgrade/SKILL.md: new skill with inline upgrade flow for all preambles. Detects global-git, local-git, vendored installs. Shows What's New from CHANGELOG. browse/test/gstack-update-check.test.ts: 10 test cases covering all branch paths. * refactor: remove version check from find-browse, simplify to binary locator Delete checkVersion(), readCache(), writeCache(), fetchRemoteSHA(), resolveSkillDir(), CacheEntry interface, REPO_URL/CACHE_PATH/CACHE_TTL constants, and META output from find-browse.ts. Version checking is now handled by bin/gstack-update-check (previous commit). * feat: add update check preamble to all 9 skills Every skill now runs bin/gstack-update-check on invocation. If an upgrade is available, reads gstack-upgrade/SKILL.md inline upgrade flow. Also adds AskUserQuestion to 5 skills that lacked it (gstack root, browse, qa, retro, setup-browser-cookies) and Bash to plan-eng-review. Simplifies qa and setup-browser-cookies setup blocks (removes META parsing). * chore: bump version and changelog (v0.3.4) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove unused import + add corrupt cache test Address pre-landing review findings: - Remove unused mkdirSync import from gstack-update-check.test.ts - Add Path I test: corrupt cache file falls through to remote fetch Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-13 22:17:25 -07:00
Garry Tan	3d901066cd	Initial release — gstack v0.0.1 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-12 01:32:16 -07:00

9 Commits