gstack/AGENTS.md
Garry Tan 87ce4c696f docs+test: AGENTS.md/docs/skills.md inventory sync + private-path leak detector
Inventory sync (codex-flagged drift):
- /debug → /investigate (skill renamed in v1.0.1.0)
- AGENTS.md grows from 21 to 40+ skills, organized by category (plan reviews,
  implementation, release, operational, browser, safety)
- docs/skills.md gains 11 missing entries: /plan-devex-review, /devex-review,
  /plan-tune, /context-save, /context-restore, /health, /landing-report,
  /benchmark-models, /pair-agent, /setup-gbrain, /make-pdf
- Stale "<5s bun test" claim dropped: with the slim-preamble harness and new
  tests, no universal timing claim is realistic
- Adds explicit "Mac + Linux full, curated Windows lane" platform statement +
  "Git Bash / MSYS today, native PowerShell future" install note

New invariants in test/skill-validation.test.ts (~80 LOC):
- Private-path leak detector scans every SKILL.md / SKILL.md.tmpl for known
  maintainer-only filenames (coordination-board.md, SEEKING_LOG.md,
  RATIONAL_SUBJECT.md, VALUE_SIGNAL_LOOP.md, C:\LLM Playground\go).
  Adapted from the McGluut fork's skill-contract-audit.ts; we don't take
  the script wholesale because most of its checks are already covered by
  test/gen-skill-docs.test.ts:1668-2074 and test/skill-validation.test.ts:1419
  — only the private-path scan and doc-inventory cross-check are new.
- Doc-inventory cross-check: every skill directory with a SKILL.md.tmpl must
  appear in both AGENTS.md and docs/skills.md. Catches the inventory drift
  this commit is fixing — without this test it would just drift again.
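
The two invariants above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in test/skill-validation.test.ts: the function names `findPrivatePathLeaks`, `findMissingInventoryEntries`, and `mentionsSkill` are hypothetical, file contents are passed in as strings rather than read from disk, and treating a skill as "listed" when a doc contains `/<name>` is an assumed matching rule.

```typescript
// Hypothetical sketch; names and matching rules are assumptions, not the real test code.

// Maintainer-only names taken from the commit message.
const PRIVATE_PATHS: string[] = [
  "coordination-board.md",
  "SEEKING_LOG.md",
  "RATIONAL_SUBJECT.md",
  "VALUE_SIGNAL_LOOP.md",
  "C:\\LLM Playground\\go",
];

// Return every private path that appears verbatim in one file's content.
function findPrivatePathLeaks(
  content: string,
  patterns: string[] = PRIVATE_PATHS,
): string[] {
  return patterns.filter((p) => content.includes(p));
}

// ASSUMED rule: a doc lists a skill if it contains "/<name>" not followed by
// another name character, so "/qa-only" does not count as a mention of "/qa".
function mentionsSkill(text: string, name: string): boolean {
  const escaped = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`/${escaped}(?![\\w-])`).test(text);
}

// For each doc, return the skill names it fails to mention.
function findMissingInventoryEntries(
  skillNames: string[],
  docs: Record<string, string>,
): Record<string, string[]> {
  const missing: Record<string, string[]> = {};
  for (const [docName, text] of Object.entries(docs)) {
    missing[docName] = skillNames.filter((name) => !mentionsSkill(text, name));
  }
  return missing;
}
```

In a test, one would presumably call `findPrivatePathLeaks` once per SKILL.md / SKILL.md.tmpl and assert an empty result, then assert that every list returned by `findMissingInventoryEntries` is empty for AGENTS.md and docs/skills.md.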

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 23:01:49 -07:00


gstack — AI Engineering Workflow

gstack is a collection of SKILL.md files that give AI agents structured roles for software development. Each skill is a specialist: CEO reviewer, eng manager, designer, QA lead, release engineer, debugger, and more.

Available skills

Skills live in .agents/skills/ (or ~/.claude/skills/gstack/ on Claude Code). Invoke them by name (e.g., /office-hours).

Plan-mode reviews

Skill What it does
/office-hours Start here. Reframes your product idea before you write code.
/plan-ceo-review CEO-level review: find the 10-star product in the request.
/plan-eng-review Lock architecture, data flow, edge cases, and tests.
/plan-design-review Rate each design dimension 0-10, explain what a 10 looks like.
/plan-devex-review DX-mode review: TTHW, magical moments, friction points, persona traces.
/plan-tune Self-tune AskUserQuestion sensitivity per question.
/autoplan One command runs CEO → design → eng → DX review.
/design-consultation Build a complete design system from scratch.

Implementation + review

Skill What it does
/review Pre-landing PR review. Finds bugs that pass CI but break in prod.
/codex Second opinion via OpenAI Codex. Review, challenge, or consult modes.
/investigate Systematic root-cause debugging. No fixes without investigation.
/design-review Live-site visual audit + fix loop with atomic commits.
/design-shotgun Generate multiple AI design variants, comparison board, iterate.
/design-html Generate production-quality Pretext-native HTML/CSS.
/devex-review Live developer experience audit (TTHW measured against the real flow).
/qa Open a real browser, find bugs, fix them, re-verify.
/qa-only Same methodology as /qa but report only — no code changes.

Release + deploy

Skill What it does
/ship Run tests, review, push, open PR. Workspace-aware version queue.
/land-and-deploy Merge the PR, wait for CI and deploy, verify production health.
/canary Post-deploy monitoring loop using the browse daemon.
/landing-report Read-only dashboard for the workspace-aware ship queue.
/document-release Update all docs to match what you just shipped.
/setup-deploy One-time deploy config detection (Fly.io, Render, Vercel, etc.).
/gstack-upgrade Update gstack to the latest version.

Operational + memory

Skill What it does
/context-save Save working context (git state, decisions, remaining work).
/context-restore Resume from a saved context, even across Conductor workspaces.
/learn Manage what gstack learned across sessions.
/retro Weekly retro with per-person breakdowns and shipping streaks.
/health Code quality dashboard (type checker, linter, tests, dead code).
/benchmark Performance regression detection (page load, Core Web Vitals).
/benchmark-models Cross-model benchmark for skills (Claude, GPT, Gemini side-by-side).
/cso OWASP Top 10 + STRIDE security audit.
/setup-gbrain Set up gbrain for cross-machine session memory sync.

Browser + agent integration

Skill What it does
/browse Headless browser — real Chromium, real clicks, ~100ms/command.
/open-gstack-browser Launch the visible GStack Browser with sidebar + stealth.
/setup-browser-cookies Import cookies from your real browser for authenticated testing.
/pair-agent Pair a remote AI agent (OpenClaw, Codex, etc.) with your browser.

Safety + scoping

Skill What it does
/careful Warn before destructive commands (rm -rf, DROP TABLE, force-push).
/freeze Lock edits to one directory. Hard block, not just a warning.
/guard Activate both careful + freeze at once.
/unfreeze Remove directory edit restrictions.
/make-pdf Turn any markdown file into a publication-quality PDF.

Build commands

bun install              # install dependencies
bun test                 # run free tests (no API spend)
bun run test:windows     # curated Windows-safe subset (runs on windows-latest)
bun run build            # generate docs + compile binaries
bun run gen:skill-docs   # regenerate SKILL.md files from templates
bun run skill:check      # health dashboard for all skills

Platform support

  • macOS + Linux: full test suite supported.
  • Windows: a curated Windows-safe subset runs on windows-latest via the windows-free-tests CI job. The setup script (./setup) requires Git Bash or MSYS today; native PowerShell support is a future expansion. The bin/gstack-paths helper resolves state roots through CLAUDE_PLUGIN_DATA / GSTACK_HOME, so plugin installs work on every platform.
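
As a rough illustration of what resolving a state root through environment variables can look like, here is a minimal sketch. It is not the bin/gstack-paths implementation: the function name `resolveStateRoot`, the precedence order (GSTACK_HOME before CLAUDE_PLUGIN_DATA), and the `<home>/.gstack` fallback are all assumptions made for the example.

```typescript
// Hypothetical sketch of state-root resolution; the real logic lives in bin/gstack-paths.
type EnvVars = Record<string, string | undefined>;

// ASSUMPTIONS: GSTACK_HOME takes precedence over CLAUDE_PLUGIN_DATA, and the
// fallback directory is <homeDir>/.gstack. Neither is confirmed by the docs.
function resolveStateRoot(env: EnvVars, homeDir: string): string {
  const explicit = env.GSTACK_HOME?.trim();
  if (explicit) return explicit;

  const pluginData = env.CLAUDE_PLUGIN_DATA?.trim();
  if (pluginData) return pluginData;

  return `${homeDir}/.gstack`;
}
```

Passing the environment and home directory in as arguments keeps the sketch free of platform-specific lookups, which is the property the platform note cares about.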

Key conventions

  • SKILL.md files are generated from .tmpl templates. Edit the template, not the output.
  • Run bun run gen:skill-docs --host codex to regenerate Codex-specific output.
  • The browse binary provides headless browser access. Use $B <command> in skills.
  • Safety skills (careful, freeze, guard) use inline advisory prose — always confirm before destructive operations.
  • State paths resolve via bin/gstack-paths (sourced via eval "$(...)"), which honors GSTACK_HOME, CLAUDE_PLUGIN_DATA, and CLAUDE_PLANS_DIR.
  • The claude CLI binary resolves via browse/src/claude-bin.ts (Bun.which() + GSTACK_CLAUDE_BIN override). Set GSTACK_CLAUDE_BIN=wsl plus GSTACK_CLAUDE_BIN_ARGS='["claude"]' to run Claude through WSL on Windows.
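
The override behavior in the last convention can be sketched as below. This is a hedged illustration, not the contents of browse/src/claude-bin.ts: `resolveClaudeCommand`, `parseBinArgs`, and the `ClaudeCommand` shape are hypothetical names, the `which` parameter stands in for a PATH lookup such as Bun.which(), and both using the override verbatim and ignoring malformed GSTACK_CLAUDE_BIN_ARGS values are assumptions.

```typescript
// Hypothetical sketch; the actual resolver is browse/src/claude-bin.ts.
interface ClaudeCommand {
  bin: string;
  args: string[];
}

// GSTACK_CLAUDE_BIN_ARGS holds a JSON array of strings, e.g. '["claude"]'.
// ASSUMPTION: malformed or non-array values are treated as "no extra args".
function parseBinArgs(raw: string | undefined): string[] {
  if (!raw) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) && parsed.every((a) => typeof a === "string")
      ? parsed
      : [];
  } catch {
    return [];
  }
}

// `which` is injected so the sketch runs without Bun; in Bun it could be Bun.which.
function resolveClaudeCommand(
  env: Record<string, string | undefined>,
  which: (name: string) => string | null,
): ClaudeCommand | null {
  // ASSUMPTION: the override is used verbatim, without a PATH lookup.
  const override = env.GSTACK_CLAUDE_BIN?.trim();
  if (override) {
    return { bin: override, args: parseBinArgs(env.GSTACK_CLAUDE_BIN_ARGS) };
  }
  const found = which("claude");
  return found ? { bin: found, args: [] } : null;
}
```

With GSTACK_CLAUDE_BIN=wsl and GSTACK_CLAUDE_BIN_ARGS='["claude"]', this sketch yields `{ bin: "wsl", args: ["claude"] }`, which matches the WSL setup described above.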