mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-02 11:45:20 +02:00
00bc482fe1
* feat: add /canary, /benchmark, /land-and-deploy skills (v0.7.0) Three new skills that close the deploy loop: - /canary: standalone post-deploy monitoring with browse daemon - /benchmark: performance regression detection with Web Vitals - /land-and-deploy: merge PR, wait for deploy, canary verify production Incorporates patterns from community PR #151. Co-Authored-By: HMAKT99 <HMAKT99@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add Performance & Bundle Impact category to review checklist New Pass 2 (INFORMATIONAL) category catching heavy dependencies (moment.js, lodash full), missing lazy loading, synchronous scripts, CSS @import blocking, fetch waterfalls, and tree-shaking breaks. Both /review and /ship automatically pick this up via checklist.md. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add {{DEPLOY_BOOTSTRAP}} resolver + deployed row in dashboard - New generateDeployBootstrap() resolver auto-detects deploy platform (Vercel, Netlify, Fly.io, GH Actions, etc.), production URL, and merge method. Persists to CLAUDE.md like test bootstrap. - Review Readiness Dashboard now shows a "Deployed" row from /land-and-deploy JSONL entries (informational, never gates shipping). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: mark 3 TODOs completed, bump v0.7.0, update CHANGELOG Superseded by /land-and-deploy: - /merge skill — review-gated PR merge - Deploy-verify skill - Post-deploy verification (ship + browse) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: /setup-deploy skill + platform-specific deploy verification - New /setup-deploy skill: interactive guided setup for deploy configuration. Detects Fly.io, Render, Vercel, Netlify, Heroku, Railway, GitHub Actions, and custom deploy scripts. Writes config to CLAUDE.md with custom hooks section for non-standard setups. - Enhanced deploy bootstrap: platform-specific URL resolution (fly.toml app → {app}.fly.dev, render.yaml → {service}.onrender.com, etc.), deploy status commands (fly status, heroku releases), and custom deploy hooks section in CLAUDE.md for manual/scripted deploys. - Platform-specific deploy verification in /land-and-deploy Step 6: Strategy A (GitHub Actions polling), Strategy B (platform CLI: fly/render/heroku), Strategy C (auto-deploy: vercel/netlify), Strategy D (custom hooks from CLAUDE.md). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: E2E + LLM-judge evals for deploy skills - 4 E2E tests: land-and-deploy (Fly.io detection + deploy report), canary (monitoring report structure), benchmark (perf report schema), setup-deploy (platform detection → CLAUDE.md config) - 4 LLM-judge evals: workflow quality for all 4 new skills - Touchfile entries for diff-based test selection (E2E + LLM-judge) - 460 free tests pass, 0 fail Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: harden E2E tests — server lifecycle, timeouts, preamble budget, skip flaky Cross-cutting fixes: - Pre-seed ~/.gstack/.completeness-intro-seen and ~/.gstack/.telemetry-prompted so preamble doesn't burn 3-7 turns on lake intro + telemetry in every test - Each describe block creates its own test server instance instead of sharing a global that dies between suites Test fixes (5 tests): - /qa quick: own server instance + preamble skip - /review SQL injection: timeout 90→180s, maxTurns 15→20, added assertion that review output actually mentions SQL injection - /review design-lite: maxTurns 25→35 + preamble skip (now detects 7/7) - ship-base-branch: both timeouts 90→150/180s + preamble skip - plan-eng artifact: clean stale state in beforeAll, maxTurns 20→25 Skipped (4 flaky/redundant tests): - contributor-mode: tests prompt compliance, not skill functionality - design-consultation-research: WebSearch-dependent, redundant with core - design-consultation-preview: redundant with core test - /qa bootstrap: too ambitious (65 turns, installs vitest) Also: preamble skip added to qa-only, qa-fix-loop, design-consultation-core, and design-consultation-existing prompts. Updated touchfiles entries and touchfiles.test.ts. Added honest comment to codex-review-findings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: redesign 6 skipped/todo E2E tests + add test.concurrent support Redesigned tests (previously skipped/todo): - contributor-mode: pre-fail approach, 5 turns/30s (was 10 turns/90s) - design-consultation-research: WebSearch-only, 8 turns/90s (was 45/480s) - design-consultation-preview: preview HTML only, 8 turns/90s (was 30/480s) - qa-bootstrap: bootstrap-only, 12 turns/90s (was 65/420s) - /ship workflow: local bare remote, 15 turns/120s (was test.todo) - /setup-browser-cookies: browser detection smoke, 5 turns/45s (was test.todo) Added testConcurrentIfSelected() helper for future parallelization. Updated touchfiles entries for all 6 re-enabled tests. Target: 0 skip, 0 todo, 0 fail across all E2E tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: relax contributor-mode assertions — test structure not exact phrasing * perf: enable test.concurrent for 31 independent E2E tests Convert 18 skill-e2e, 11 routing, and 2 codex tests from sequential to test.concurrent. Only design-consultation tests (4) remain sequential due to shared designDir state. Expected ~6x speedup on Teams high-burst. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add --concurrent flag to bun test + convert remaining 4 sequential tests bun's test.concurrent only works within a describe block, not across describe blocks. Adding --concurrent to the CLI command makes ALL tests concurrent regardless of describe boundaries. Also converted the 4 design-consultation tests to concurrent (each already independent). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * perf: split monolithic E2E test into 8 parallel files Split test/skill-e2e.test.ts (3442 lines) into 8 category files: - skill-e2e-browse.test.ts (7 tests) - skill-e2e-review.test.ts (7 tests) - skill-e2e-qa-bugs.test.ts (3 tests) - skill-e2e-qa-workflow.test.ts (4 tests) - skill-e2e-plan.test.ts (6 tests) - skill-e2e-design.test.ts (7 tests) - skill-e2e-workflow.test.ts (6 tests) - skill-e2e-deploy.test.ts (4 tests) Bun runs each file in its own worker = 10 parallel workers (8 split + routing + codex). Expected: 78 min → ~12 min. Extracted shared helpers to test/helpers/e2e-helpers.ts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * perf: bump default E2E concurrency to 15 * perf: add model pinning infrastructure + rate-limit telemetry to E2E runner Default E2E model changed from Opus to Sonnet (5x faster, 5x cheaper). Session runner now accepts `model` option with EVALS_MODEL env var override. Added timing telemetry (first_response_ms, max_inter_turn_ms) and wall_clock_ms to eval-store for diagnosing rate-limit impact. Added EVALS_FAST test filtering. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve 3 E2E test failures — tmpdir race, wasted turns, brittle assertions plan-design-review-plan-mode: give each test its own tmpdir to eliminate race condition where concurrent tests pollute each other's working directory. ship-local-workflow: inline ship workflow steps in prompt instead of having agent read 700+ line SKILL.md (was wasting 6 of 15 turns on file I/O). design-consultation-core: replace exact section name matching with fuzzy synonym-based matching (e.g. "Colors" matches "Color", "Type System" matches "Typography"). All 7 sections still required, LLM judge still hard fail. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * perf: pin quality tests to Opus, add --retry 2 and test:e2e:fast tier ~10 quality-sensitive tests (planted-bug detection, design quality judge, strategic review, retro analysis) explicitly pinned to Opus. ~30 structure tests default to Sonnet for 5x speed improvement. Added --retry 2 to all E2E scripts for flaky test resilience. Added test:e2e:fast script that excludes 8 slowest tests for quick feedback. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: mark E2E model pinning TODO as shipped Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add SKILL.md merge conflict directive to CLAUDE.md When resolving merge conflicts on generated SKILL.md files, always merge the .tmpl templates first, then regenerate — never accept either side's generated output directly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add DEPLOY_BOOTSTRAP resolver to gen-skill-docs The land-and-deploy template referenced {{DEPLOY_BOOTSTRAP}} but no resolver existed, causing gen-skill-docs to fail. Added generateDeployBootstrap() that generates the deploy config detection bash block (check CLAUDE.md for persisted config, auto-detect platform from config files, detect deploy workflows). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate SKILL.md files after DEPLOY_BOOTSTRAP fix Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: move prompt temp file outside workingDirectory to prevent race condition The .prompt-tmp file was written inside workingDirectory, which gets deleted by afterAll cleanup. With --concurrent --retry, afterAll can interleave with retries, causing "No such file or directory" crashes at 0s (seen in review-design-lite and office-hours-spec-review). Fix: write prompt file to os.tmpdir() with a unique suffix so it survives directory cleanup. Also convert review-design-lite from describeE2E to describeIfSelected for proper diff-based test selection. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add --retry 2 --concurrent flags to test:evals scripts for consistency test:evals and test:evals:all were missing the retry and concurrency flags that test:e2e already had, causing inconsistent behavior between the two script families. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: HMAKT99 <HMAKT99@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
258 lines
13 KiB
Markdown
258 lines
13 KiB
Markdown
# gstack development
|
|
|
|
## Commands
|
|
|
|
```bash
|
|
bun install # install dependencies
|
|
bun test # run free tests (browse + snapshot + skill validation)
|
|
bun run test:evals # run paid evals: LLM judge + E2E (diff-based, ~$4/run max)
|
|
bun run test:evals:all # run ALL paid evals regardless of diff
|
|
bun run test:e2e # run E2E tests only (diff-based, ~$3.85/run max)
|
|
bun run test:e2e:all # run ALL E2E tests regardless of diff
|
|
bun run eval:select # show which tests would run based on current diff
|
|
bun run dev <cmd> # run CLI in dev mode, e.g. bun run dev goto https://example.com
|
|
bun run build # gen docs + compile binaries
|
|
bun run gen:skill-docs # regenerate SKILL.md files from templates
|
|
bun run skill:check # health dashboard for all skills
|
|
bun run dev:skill # watch mode: auto-regen + validate on change
|
|
bun run eval:list # list all eval runs from ~/.gstack-dev/evals/
|
|
bun run eval:compare # compare two eval runs (auto-picks most recent)
|
|
bun run eval:summary # aggregate stats across all eval runs
|
|
```
|
|
|
|
`test:evals` requires `ANTHROPIC_API_KEY`. Codex E2E tests (`test/codex-e2e.test.ts`)
|
|
use Codex's own auth from `~/.codex/` config — no `OPENAI_API_KEY` env var needed.
|
|
E2E tests stream progress in real-time (tool-by-tool via `--output-format stream-json
|
|
--verbose`). Results are persisted to `~/.gstack-dev/evals/` with auto-comparison
|
|
against the previous run.
|
|
|
|
**Diff-based test selection:** `test:evals` and `test:e2e` auto-select tests based
|
|
on `git diff` against the base branch. Each test declares its file dependencies in
|
|
`test/helpers/touchfiles.ts`. Changes to global touchfiles (session-runner, eval-store,
|
|
llm-judge, gen-skill-docs) trigger all tests. Use `EVALS_ALL=1` or the `:all` script
|
|
variants to force all tests. Run `eval:select` to preview which tests would run.
|
|
|
|
## Testing
|
|
|
|
```bash
|
|
bun test # run before every commit — free, <2s
|
|
bun run test:evals # run before shipping — paid, diff-based (~$4/run max)
|
|
```
|
|
|
|
`bun test` runs skill validation, gen-skill-docs quality checks, and browse
|
|
integration tests. `bun run test:evals` runs LLM-judge quality evals and E2E
|
|
tests via `claude -p`. Both must pass before creating a PR.
|
|
|
|
## Project structure
|
|
|
|
```
|
|
gstack/
|
|
├── browse/ # Headless browser CLI (Playwright)
|
|
│ ├── src/ # CLI + server + commands
|
|
│ │ ├── commands.ts # Command registry (single source of truth)
|
|
│ │ └── snapshot.ts # SNAPSHOT_FLAGS metadata array
|
|
│ ├── test/ # Integration tests + fixtures
|
|
│ └── dist/ # Compiled binary
|
|
├── scripts/ # Build + DX tooling
|
|
│ ├── gen-skill-docs.ts # Template → SKILL.md generator
|
|
│ ├── skill-check.ts # Health dashboard
|
|
│ └── dev-skill.ts # Watch mode
|
|
├── test/ # Skill validation + eval tests
|
|
│ ├── helpers/ # skill-parser.ts, session-runner.ts, llm-judge.ts, eval-store.ts
|
|
│ ├── fixtures/ # Ground truth JSON, planted-bug fixtures, eval baselines
|
|
│ ├── skill-validation.test.ts # Tier 1: static validation (free, <1s)
|
|
│ ├── gen-skill-docs.test.ts # Tier 1: generator quality (free, <1s)
|
|
│ ├── skill-llm-eval.test.ts # Tier 3: LLM-as-judge (~$0.15/run)
|
|
│ └── skill-e2e-*.test.ts # Tier 2: E2E via claude -p (~$3.85/run, split by category)
|
|
├── qa-only/ # /qa-only skill (report-only QA, no fixes)
|
|
├── plan-design-review/ # /plan-design-review skill (report-only design audit)
|
|
├── design-review/ # /design-review skill (design audit + fix loop)
|
|
├── ship/ # Ship workflow skill
|
|
├── review/ # PR review skill
|
|
├── plan-ceo-review/ # /plan-ceo-review skill
|
|
├── plan-eng-review/ # /plan-eng-review skill
|
|
├── office-hours/ # /office-hours skill (YC Office Hours — startup diagnostic + builder brainstorm)
|
|
├── investigate/ # /investigate skill (systematic root-cause debugging)
|
|
├── retro/ # Retrospective skill
|
|
├── document-release/ # /document-release skill (post-ship doc updates)
|
|
├── setup # One-time setup: build binary + symlink skills
|
|
├── SKILL.md # Generated from SKILL.md.tmpl (don't edit directly)
|
|
├── SKILL.md.tmpl # Template: edit this, run gen:skill-docs
|
|
├── ETHOS.md # Builder philosophy (Boil the Lake, Search Before Building)
|
|
└── package.json # Build scripts for browse
|
|
```
|
|
|
|
## SKILL.md workflow
|
|
|
|
SKILL.md files are **generated** from `.tmpl` templates. To update docs:
|
|
|
|
1. Edit the `.tmpl` file (e.g. `SKILL.md.tmpl` or `browse/SKILL.md.tmpl`)
|
|
2. Run `bun run gen:skill-docs` (or `bun run build` which does it automatically)
|
|
3. Commit both the `.tmpl` and generated `.md` files
|
|
|
|
To add a new browse command: add it to `browse/src/commands.ts` and rebuild.
|
|
To add a snapshot flag: add it to `SNAPSHOT_FLAGS` in `browse/src/snapshot.ts` and rebuild.
|
|
|
|
**Merge conflicts on SKILL.md files:** NEVER resolve conflicts on generated SKILL.md
|
|
files by accepting either side. Instead: (1) resolve conflicts on the `.tmpl` templates
|
|
and `scripts/gen-skill-docs.ts` (the sources of truth), (2) run `bun run gen:skill-docs`
|
|
to regenerate all SKILL.md files, (3) stage the regenerated files. Accepting one side's
|
|
generated output silently drops the other side's template changes.
|
|
|
|
## Platform-agnostic design
|
|
|
|
Skills must NEVER hardcode framework-specific commands, file patterns, or directory
|
|
structures. Instead:
|
|
|
|
1. **Read CLAUDE.md** for project-specific config (test commands, eval commands, etc.)
|
|
2. **If missing, AskUserQuestion** — let the user tell you or let gstack search the repo
|
|
3. **Persist the answer to CLAUDE.md** so we never have to ask again
|
|
|
|
This applies to test commands, eval commands, deploy commands, and any other
|
|
project-specific behavior. The project owns its config; gstack reads it.
|
|
|
|
## Writing SKILL templates
|
|
|
|
SKILL.md.tmpl files are **prompt templates read by Claude**, not bash scripts.
|
|
Each bash code block runs in a separate shell — variables do not persist between blocks.
|
|
|
|
Rules:
|
|
- **Use natural language for logic and state.** Don't use shell variables to pass
|
|
state between code blocks. Instead, tell Claude what to remember and reference
|
|
it in prose (e.g., "the base branch detected in Step 0").
|
|
- **Don't hardcode branch names.** Detect `main`/`master`/etc dynamically via
|
|
`gh pr view` or `gh repo view`. Use `{{BASE_BRANCH_DETECT}}` for PR-targeting
|
|
skills. Use "the base branch" in prose, `<base>` in code block placeholders.
|
|
- **Keep bash blocks self-contained.** Each code block should work independently.
|
|
If a block needs context from a previous step, restate it in the prose above.
|
|
- **Express conditionals as English.** Instead of nested `if/elif/else` in bash,
|
|
write numbered decision steps: "1. If X, do Y. 2. Otherwise, do Z."
|
|
|
|
## Browser interaction
|
|
|
|
When you need to interact with a browser (QA, dogfooding, cookie setup), use the
|
|
`/browse` skill or run the browse binary directly via `$B <command>`. NEVER use
|
|
`mcp__claude-in-chrome__*` tools — they are slow, unreliable, and not what this
|
|
project uses.
|
|
|
|
## Vendored symlink awareness
|
|
|
|
When developing gstack, `.claude/skills/gstack` may be a symlink back to this
|
|
working directory (gitignored). This means skill changes are **live immediately** —
|
|
great for rapid iteration, risky during big refactors where half-written skills
|
|
could break other Claude Code sessions using gstack concurrently.
|
|
|
|
**Check once per session:** Run `ls -la .claude/skills/gstack` to see if it's a
|
|
symlink or a real copy. If it's a symlink to your working directory, be aware that:
|
|
- Template changes + `bun run gen:skill-docs` immediately affect all gstack invocations
|
|
- Breaking changes to SKILL.md.tmpl files can break concurrent gstack sessions
|
|
- During large refactors, remove the symlink (`rm .claude/skills/gstack`) so the
|
|
global install at `~/.claude/skills/gstack/` is used instead
|
|
|
|
**For plan reviews:** When reviewing plans that modify skill templates or the
|
|
gen-skill-docs pipeline, consider whether the changes should be tested in isolation
|
|
before going live (especially if the user is actively using gstack in other windows).
|
|
|
|
## Commit style
|
|
|
|
**Always bisect commits.** Every commit should be a single logical change. When
|
|
you've made multiple changes (e.g., a rename + a rewrite + new tests), split them
|
|
into separate commits before pushing. Each commit should be independently
|
|
understandable and revertable.
|
|
|
|
Examples of good bisection:
|
|
- Rename/move separate from behavior changes
|
|
- Test infrastructure (touchfiles, helpers) separate from test implementations
|
|
- Template changes separate from generated file regeneration
|
|
- Mechanical refactors separate from new features
|
|
|
|
When the user says "bisect commit" or "bisect and push," split staged/unstaged
|
|
changes into logical commits and push.
|
|
|
|
## CHANGELOG style
|
|
|
|
CHANGELOG.md is **for users**, not contributors. Write it like product release notes:
|
|
|
|
- Lead with what the user can now **do** that they couldn't before. Sell the feature.
|
|
- Use plain language, not implementation details. "You can now..." not "Refactored the..."
|
|
- **Never mention TODOS.md, internal tracking, eval infrastructure, or contributor-facing
|
|
details.** These are invisible to users and meaningless to them.
|
|
- Put contributor/internal changes in a separate "For contributors" section at the bottom.
|
|
- Every entry should make someone think "oh nice, I want to try that."
|
|
- No jargon: say "every question now tells you which project and branch you're in" not
|
|
"AskUserQuestion format standardized across skill templates via preamble resolver."
|
|
|
|
## AI effort compression
|
|
|
|
When estimating or discussing effort, always show both human-team and CC+gstack time:
|
|
|
|
| Task type | Human team | CC+gstack | Compression |
|
|
|-----------|-----------|-----------|-------------|
|
|
| Boilerplate / scaffolding | 2 days | 15 min | ~100x |
|
|
| Test writing | 1 day | 15 min | ~50x |
|
|
| Feature implementation | 1 week | 30 min | ~30x |
|
|
| Bug fix + regression test | 4 hours | 15 min | ~20x |
|
|
| Architecture / design | 2 days | 4 hours | ~5x |
|
|
| Research / exploration | 1 day | 3 hours | ~3x |
|
|
|
|
Completeness is cheap. Don't recommend shortcuts when the complete implementation
|
|
is a "lake" (achievable) not an "ocean" (multi-quarter migration). See the
|
|
Completeness Principle in the skill preamble for the full philosophy.
|
|
|
|
## Search before building
|
|
|
|
Before designing any solution that involves concurrency, unfamiliar patterns,
|
|
infrastructure, or anything where the runtime/framework might have a built-in:
|
|
|
|
1. Search for "{runtime} {thing} built-in"
|
|
2. Search for "{thing} best practice {current year}"
|
|
3. Check official runtime/framework docs
|
|
|
|
Three layers of knowledge: tried-and-true (Layer 1), new-and-popular (Layer 2),
|
|
first-principles (Layer 3). Prize Layer 3 above all. See ETHOS.md for the full
|
|
builder philosophy.
|
|
|
|
## Local plans
|
|
|
|
Contributors can store long-range vision docs and design documents in `~/.gstack-dev/plans/`.
|
|
These are local-only (not checked in). When reviewing TODOS.md, check `plans/` for candidates
|
|
that may be ready to promote to TODOs or implement.
|
|
|
|
## E2E eval failure blame protocol
|
|
|
|
When an E2E eval fails during `/ship` or any other workflow, **never claim "not
|
|
related to our changes" without proving it.** These systems have invisible couplings —
|
|
a preamble text change affects agent behavior, a new helper changes timing, a
|
|
regenerated SKILL.md shifts prompt context.
|
|
|
|
**Required before attributing a failure to "pre-existing":**
|
|
1. Run the same eval on main (or base branch) and show it fails there too
|
|
2. If it passes on main but fails on the branch — it IS your change. Trace the blame.
|
|
3. If you can't run on main, say "unverified — may or may not be related" and flag it
|
|
as a risk in the PR body
|
|
|
|
"Pre-existing" without receipts is a lazy claim. Prove it or don't say it.
|
|
|
|
## Long-running tasks: don't give up
|
|
|
|
When running evals, E2E tests, or any long-running background task, **poll until
|
|
completion**. Use `sleep 180 && echo "ready"` + `TaskOutput` in a loop every 3
|
|
minutes. Never switch to blocking mode and give up when the poll times out. Never
|
|
say "I'll be notified when it completes" and stop checking — keep the loop going
|
|
until the task finishes or the user tells you to stop.
|
|
|
|
The full E2E suite can take 30-45 minutes. That's 10-15 polling cycles. Do all of
|
|
them. Report progress at each check (which tests passed, which are running, any
|
|
failures so far). The user wants to see the run complete, not a promise that
|
|
you'll check later.
|
|
|
|
## Deploying to the active skill
|
|
|
|
The active skill lives at `~/.claude/skills/gstack/`. After making changes:
|
|
|
|
1. Push your branch
|
|
2. Fetch and reset in the skill directory: `cd ~/.claude/skills/gstack && git fetch origin && git reset --hard origin/main`
|
|
3. Rebuild: `cd ~/.claude/skills/gstack && bun run build`
|
|
|
|
Or copy the binary directly: `cp browse/dist/browse ~/.claude/skills/gstack/browse/dist/browse`
|