mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-30 23:09:32 +02:00
f8bb59094d
* feat(issue): add /issue skill for backlog-ready GitHub issue authoring
Interrogates an ambiguous request through five strict phases (why, scope,
technical, draft, final) and produces a GitHub issue precise enough that an
unfamiliar engineer or AI agent can execute it without follow-up. Slots in
after /office-hours (when the idea has passed the "worth building" bar) and
before /plan-eng-review (which assumes a plan already exists).
- issue/SKILL.md.tmpl + generated SKILL.md
- routing entry in root SKILL.md.tmpl
- llms.txt regenerated to include the new skill
* chore(spec): rename /issue → /spec + fix duplicate analytics block
Foundation commit for the /spec skill (extends PR #1698 by @jayzalowitz).
- Renames issue/ → spec/ (template + generated)
- Removes the hand-rolled analytics block in spec/SKILL.md.tmpl (lines 46-49 of the original); {{PREAMBLE}} already emits the analytics write with the telemetry opt-out guard, so the duplicate would have bypassed gstack-config set telemetry off
- Updates frontmatter (name: spec, expanded description with magical-moment preview, triggers reordered to lead with "spec this out")
- Updates root SKILL.md.tmpl routing entry → /spec
- Regenerates spec/SKILL.md and gstack/llms.txt via bun run gen:skill-docs
Co-Authored-By: Jay Zalowitz <jayzalowitz@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(spec): expansions — flags, archive, quality gate, plan-mode-aware Phase 5, /ship integration, tests
Builds on the @jayzalowitz foundation (commit a4e6ee38) with the full
expansion set from CEO + Eng + DX review (24 user decisions + 23 of 28
codex adversarial findings).
spec/SKILL.md.tmpl additions:
- Flag reference table (--dedupe / --no-gate / --audit / --execute /
--no-execute / --file-only / --plan-file / --sync-archive).
- Phase 1b --dedupe (default ON): gh issue list --search with graceful
skip on gh-not-installed / unauthed / rate-limited / other errors.
AskUserQuestion when matches found (merge / file-new / cancel).
- Phase 3 HARD requirement: agent MUST grep/read at least one piece of
evidence before asking. Project-level fallback prose for prompts with
no concrete file mapping. Greenfield escape clause.
- Phase 4.5 quality gate (default ON): codex adversarial dispatch with
fail-closed redaction (AWS/GitHub/Anthropic/OpenAI/private-key regex),
hard <<<USER_SPEC>>> delimiters + instruction boundary (prompt-injection
defense), score 0-10 with <7 block, up to 3 iterations, AskUserQuestion
escape on persistent <7 (ship anyway / save draft / one more try).
- Phase 5 plan-mode-aware dispatch: reads GSTACK_PLAN_MODE env. Active
→ file-only + load into plan file. Inactive → file + --execute spawn
by default. CLI overrides for explicit control.
- Archive block via eval $(gstack-paths) → $GSTACK_STATE_ROOT/projects/
$SLUG/specs/<datetime>-<pid>-<slug>.md. Atomic .tmp/mv write. Sync
excluded by default; --sync-archive to opt in.
- --execute path: dirty-worktree gate (porcelain check + 3-option AUQ
continue/stash/cancel), TOCTOU re-check after AUQ answer, SHA pin
via git rev-parse HEAD, unique branch spec/<slug>-$$ + PID-suffixed
worktree, mandatory final-confirm gate, stash policy with restore
safety (preserve ref, never auto-drop).
- TTHW timestamps captured at Phase 1 / first citation / file-or-spawn,
emitted as ttfc_ms + tthw_ms in preamble telemetry envelope.
Cross-system plumbing:
- scripts/resolvers/preamble/generate-preamble-bash.ts: emit
GSTACK_PLAN_MODE=active|inactive based on CLAUDE_PLAN_FILE presence.
- scripts/resolvers/preamble/generate-routing-injection.ts: add /spec
to the routing block injected into project CLAUDE.md.
- ship/SKILL.md.tmpl: new "Linked Spec" PR-body section. Reads archive
frontmatter spec_issue_number and adds Closes #N when full delivery
confirmed by existing plan-completion gate (codex F4 — conditional).
Branch-name inference NOT used (codex F3 — fragile under rebase).
Tests (W7):
- test/spec-template-invariants.test.ts: 35 deterministic assertions
covering Phase 1 hard gate, Phase 3 hard-grep mandate, --dedupe
graceful-skip paths, --execute race + security hardening (TOCTOU,
SHA pin, unique branch), quality-gate redaction + BLOCKED path,
archive atomic write + sync exclusion, plan-mode-aware Phase 5.
- test/spec-template-sync.test.ts: regen + byte-identical check.
- test/skill-e2e-spec-execute.test.ts (periodic-tier scaffold).
- test/skill-llm-eval-spec.test.ts (periodic-tier scaffold).
- test/helpers/touchfiles.ts: register both periodics in E2E_TIERS +
LLM_JUDGE_TOUCHFILES.
37/37 /spec tests pass. Full bun test exit 0 (pre-existing
url-validation timeout unrelated to /spec).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: v1.45.0.0 — regen all SKILL.md, bump VERSION, CHANGELOG entry
Mechanical regen pulling in two template-side changes:
- /spec expansion (spec/SKILL.md picks up ~1100 new lines)
- {{PREAMBLE}} now echoes GSTACK_PLAN_MODE env (every skill picks up
the new echo line in the preamble bash block)
VERSION 1.44.0.0 → 1.45.0.0 (MINOR per scale-aware rules: substantial
new capability — /spec skill with 5 CLI flags + race/security
hardening + plan-mode-aware Phase 5 + /ship integration).
CHANGELOG entry frames /spec as agent feedstock with the two-line
headline, "numbers that matter" table, and "what this means for
builders" close. Credits @jayzalowitz for the foundation contribution.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(spec): register /spec in scripts/proactive-suggestions.json
Auto-generated by bun run gen:skill-docs after the v1.46 catalog-trim
contract picked up /spec's frontmatter. lead + routing extracted from
spec/SKILL.md.tmpl description: block.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(spec): TODOS deferrals + package.json sync for v1.47.0.0
- TODOS.md: add P2 entry for /spec --epic mode (deferred from CEO SCOPE
EXPANSION review), P3 entry for --dedupe semantic matching upgrade.
Both have full context blocks so future picker can resume cold.
- package.json: bump 1.46.0.0 → 1.47.0.0 to match VERSION (was stale
from the main merge; /ship Step 12 idempotency caught it).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: register /spec skill in README, AGENTS, CLAUDE.md project tree
Adds /spec to the three discoverability surfaces it was missing:
- README.md sprint skills table (between /autoplan and /learn)
- AGENTS.md plan-mode reviews table
- CLAUDE.md project structure tree (between /investigate and /retro)
/spec shipped in v1.47.0.0 with CHANGELOG coverage but the entry-point
docs hadn't been updated; a user landing on README or AGENTS would not
discover the skill exists without reading CHANGELOG.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Jay Zalowitz <jayzalowitz@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
136 lines
7.3 KiB
Markdown
136 lines
7.3 KiB
Markdown
# gstack — AI Engineering Workflow
|
|
|
|
gstack is a collection of SKILL.md files that give AI agents structured roles for
|
|
software development. Each skill is a specialist: CEO reviewer, eng manager,
|
|
designer, QA lead, release engineer, debugger, and more.
|
|
|
|
## Available skills
|
|
|
|
Skills live in `.agents/skills/` (or `~/.claude/skills/gstack/` on Claude Code).
|
|
Invoke them by name (e.g., `/office-hours`).
|
|
|
|
### Plan-mode reviews
|
|
|
|
| Skill | What it does |
|
|
|-------|-------------|
|
|
| `/office-hours` | Start here. Reframes your product idea before you write code. |
|
|
| `/plan-ceo-review` | CEO-level review: find the 10-star product in the request. |
|
|
| `/plan-eng-review` | Lock architecture, data flow, edge cases, and tests. |
|
|
| `/plan-design-review` | Rate each design dimension 0-10, explain what a 10 looks like. |
|
|
| `/plan-devex-review` | DX-mode review: TTHW, magical moments, friction points, persona traces. |
|
|
| `/plan-tune` | Self-tune AskUserQuestion sensitivity per question. |
|
|
| `/autoplan` | One command runs CEO → design → eng → DX review. |
|
|
| `/design-consultation` | Build a complete design system from scratch. |
|
|
| `/spec` | Turn vague intent into a precise, executable spec in five phases. Files a GitHub issue, optionally spawns a Claude Code agent in a fresh worktree, and lets `/ship` close the source issue on merge. |
|
|
|
|
### Implementation + review
|
|
|
|
| Skill | What it does |
|
|
|-------|-------------|
|
|
| `/review` | Pre-landing PR review. Finds bugs that pass CI but break in prod. |
|
|
| `/codex` | Second opinion via OpenAI Codex. Review, challenge, or consult modes. |
|
|
| `/investigate` | Systematic root-cause debugging. No fixes without investigation. |
|
|
| `/design-review` | Live-site visual audit + fix loop with atomic commits. |
|
|
| `/design-shotgun` | Generate multiple AI design variants, comparison board, iterate. |
|
|
| `/design-html` | Generate production-quality Pretext-native HTML/CSS. |
|
|
| `/devex-review` | Live developer experience audit (TTHW measured against the real flow). |
|
|
| `/qa` | Open a real browser, find bugs, fix them, re-verify. |
|
|
| `/qa-only` | Same methodology as /qa but report only — no code changes. |
|
|
| `/scrape` | Pull data from a web page. First call prototypes; codified call runs in ~200ms. |
|
|
| `/skillify` | Codify the most recent successful `/scrape` flow into a permanent browser-skill. |
|
|
|
|
### Release + deploy
|
|
|
|
| Skill | What it does |
|
|
|-------|-------------|
|
|
| `/ship` | Run tests, review, push, open PR. Workspace-aware version queue. |
|
|
| `/land-and-deploy` | Merge the PR, wait for CI and deploy, verify production health. |
|
|
| `/canary` | Post-deploy monitoring loop using the browse daemon. |
|
|
| `/landing-report` | Read-only dashboard for the workspace-aware ship queue. |
|
|
| `/document-release` | Update all docs to match what you just shipped. |
|
|
| `/document-generate` | Generate Diataxis docs (tutorial / how-to / reference / explanation) from code. |
|
|
| `/setup-deploy` | One-time deploy config detection (Fly.io, Render, Vercel, etc.). |
|
|
| `/gstack-upgrade` | Update gstack to the latest version. |
|
|
|
|
### Operational + memory
|
|
|
|
| Skill | What it does |
|
|
|-------|-------------|
|
|
| `/context-save` | Save working context (git state, decisions, remaining work). |
|
|
| `/context-restore` | Resume from a saved context, even across Conductor workspaces. |
|
|
| `/learn` | Manage what gstack learned across sessions. |
|
|
| `/retro` | Weekly retro with per-person breakdowns and shipping streaks. |
|
|
| `/health` | Code quality dashboard (type checker, linter, tests, dead code). |
|
|
| `/benchmark` | Performance regression detection (page load, Core Web Vitals). |
|
|
| `/benchmark-models` | Cross-model benchmark for skills (Claude, GPT, Gemini side-by-side). |
|
|
| `/cso` | OWASP Top 10 + STRIDE security audit. |
|
|
| `/setup-gbrain` | Set up gbrain for cross-machine session memory sync. |
|
|
| `/sync-gbrain` | Keep gbrain current with this repo's code; refresh agent search guidance in CLAUDE.md. |
|
|
|
|
### Browser + agent integration
|
|
|
|
| Skill | What it does |
|
|
|-------|-------------|
|
|
| `/browse` | Headless browser — real Chromium, real clicks, ~100ms/command. |
|
|
| `/open-gstack-browser` | Launch the visible GStack Browser with sidebar + stealth. |
|
|
| `/setup-browser-cookies` | Import cookies from your real browser for authenticated testing. |
|
|
| `/pair-agent` | Pair a remote AI agent (OpenClaw, Codex, etc.) with your browser. |
|
|
|
|
### iOS QA — drive real iPhones over USB or Tailscale (v1.43.0.0+)
|
|
|
|
| Skill | What it does |
|
|
|-------|-------------|
|
|
| `/ios-qa` | Live-device iOS QA via USB CoreDevice tunnel + embedded StateServer. Optionally exposes the device over Tailscale so remote agents can drive it. |
|
|
| `/ios-fix` | Autonomous iOS bug fixer with regression snapshot capture. |
|
|
| `/ios-design-review` | Designer's-eye QA on a real iPhone — 10-dimension Apple HIG rubric. |
|
|
| `/ios-clean` | Convenience: strip DebugBridge + #if DEBUG wiring before a Release build. |
|
|
| `/ios-sync` | Regenerate the iOS debug bridge against the latest upstream templates. |
|
|
|
|
Companion CLIs (run on the Mac that's plugged into the device):
|
|
|
|
| Command | What it does |
|
|
|---------|-------------|
|
|
| `gstack-ios-qa-daemon` | Mac-side broker. Loopback by default; `--tailnet` adds a Tailscale-facing listener with capability tiers and audit logging. |
|
|
| `gstack-ios-qa-mint` | Owner-grant CLI for the tailnet allowlist (`grant`/`revoke`/`list`). |
|
|
|
|
End-to-end walkthrough: [docs/howto-ios-testing-with-gstack.md](docs/howto-ios-testing-with-gstack.md).
|
|
|
|
### Safety + scoping
|
|
|
|
| Skill | What it does |
|
|
|-------|-------------|
|
|
| `/careful` | Warn before destructive commands (rm -rf, DROP TABLE, force-push). |
|
|
| `/freeze` | Lock edits to one directory. Hard block, not just a warning. |
|
|
| `/guard` | Activate both careful + freeze at once. |
|
|
| `/unfreeze` | Remove directory edit restrictions. |
|
|
| `/make-pdf` | Turn any markdown file into a publication-quality PDF. |
|
|
|
|
## Build commands
|
|
|
|
```bash
|
|
bun install # install dependencies
|
|
bun test # run free tests (no API spend)
|
|
bun run test:windows # curated Windows-safe subset (runs on windows-latest)
|
|
bun run build # generate docs + compile binaries
|
|
bun run gen:skill-docs # regenerate SKILL.md files from templates
|
|
bun run skill:check # health dashboard for all skills
|
|
```
|
|
|
|
## Platform support
|
|
|
|
- **macOS** + **Linux**: full test suite supported.
|
|
- **Windows**: curated Windows-safe subset runs on `windows-latest` via the
|
|
`windows-free-tests` CI job. Setup script (`./setup`) requires Git Bash or
|
|
MSYS today; native PowerShell support is a future expansion. The `bin/gstack-paths`
|
|
helper resolves state roots through `CLAUDE_PLUGIN_DATA` / `GSTACK_HOME` so plugin
|
|
installs work on every platform.
|
|
|
|
## Key conventions
|
|
|
|
- SKILL.md files are **generated** from `.tmpl` templates. Edit the template, not the output.
|
|
- Run `bun run gen:skill-docs --host codex` to regenerate Codex-specific output.
|
|
- The browse binary provides headless browser access. Use `$B <command>` in skills.
|
|
- Safety skills (careful, freeze, guard) use inline advisory prose — always confirm before destructive operations.
|
|
- State paths resolve via `bin/gstack-paths` (sourced via `eval "$(...)"`). Honors `GSTACK_HOME`, `CLAUDE_PLUGIN_DATA`, `CLAUDE_PLANS_DIR`.
|
|
- The `claude` CLI binary resolves via `browse/src/claude-bin.ts` (`Bun.which()` + `GSTACK_CLAUDE_BIN` override). Set `GSTACK_CLAUDE_BIN=wsl` plus `GSTACK_CLAUDE_BIN_ARGS='["claude"]'` to run Claude through WSL on Windows.
|