diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md index 5311c2cd..45768d07 100644 --- a/ARCHITECTURE.md +++ b/ARCHITECTURE.md @@ -200,6 +200,8 @@ Templates contain the workflows, tips, and examples that require human judgment. | `{{SNAPSHOT_FLAGS}}` | `snapshot.ts` | Flag reference with examples | | `{{PREAMBLE}}` | `gen-skill-docs.ts` | Startup block: update check, session tracking, contributor mode, AskUserQuestion format | | `{{BROWSE_SETUP}}` | `gen-skill-docs.ts` | Binary discovery + setup instructions | +| `{{BASE_BRANCH_DETECT}}` | `gen-skill-docs.ts` | Dynamic base branch detection for PR-targeting skills (ship, review, qa, plan-ceo-review) | +| `{{QA_METHODOLOGY}}` | `gen-skill-docs.ts` | Shared QA methodology block for /qa and /qa-only | This is structurally sound — if a command exists in code, it appears in docs. If it doesn't exist, it can't appear. diff --git a/BROWSER.md b/BROWSER.md index 2d828ebe..df4a6d1d 100644 --- a/BROWSER.md +++ b/BROWSER.md @@ -127,6 +127,18 @@ The `console`, `network`, and `dialog` commands read from the in-memory buffers, Dialogs (alert, confirm, prompt) are auto-accepted by default to prevent browser lockup. The `dialog-accept` and `dialog-dismiss` commands control this behavior. For prompts, `dialog-accept ` provides the response text. All dialogs are logged to the dialog buffer with type, message, and action taken. +### JavaScript execution (`js` and `eval`) + +`js` runs a single expression, `eval` runs a JS file. Both support `await` — expressions containing `await` are automatically wrapped in an async context: + +```bash +$B js "await fetch('/api/data').then(r => r.json())" # works +$B js "document.title" # also works (no wrapping needed) +$B eval my-script.js # file with await works too +``` + +For `eval` files, single-line files return the expression value directly. Multi-line files need explicit `return` when using `await`. Comments containing "await" don't trigger wrapping. + ### Multi-workspace support Each workspace gets its own isolated browser instance with its own Chromium process, tabs, cookies, and logs. State is stored in `.gstack/` inside the project root (detected via `git rev-parse --show-toplevel`). diff --git a/CHANGELOG.md b/CHANGELOG.md index ded48ada..85499795 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -21,6 +21,26 @@ - Escalation protocol assertions added to all preamble skills (DONE_WITH_CONCERNS, BLOCKED, NEEDS_CONTEXT). - Two new TODOs: design docs → Supabase team store sync (P2), /plan-design-review skill (P2). +## 0.4.2 — 2026-03-16 + +- **`$B js "await fetch(...)"` now just works.** Any `await` expression in `$B js` or `$B eval` is automatically wrapped in an async context. No more `SyntaxError: await is only valid in async functions`. Single-line eval files return values directly; multi-line files use explicit `return`. +- **Contributor mode now reflects, not just reacts.** Instead of only filing reports when something breaks, contributor mode now prompts periodic reflection: "Rate your gstack experience 0-10. Not a 10? Think about why." Catches quality-of-life issues and friction that passive detection misses. Reports now include a 0-10 rating and "What would make this a 10" to focus on actionable improvements. +- **Skills now respect your branch target.** `/ship`, `/review`, `/qa`, and `/plan-ceo-review` detect which branch your PR actually targets instead of assuming `main`. Stacked branches, Conductor workspaces targeting feature branches, and repos using `master` all just work now. +- **`/retro` works on any default branch.** Repos using `master`, `develop`, or other default branch names are detected automatically — no more empty retros because the branch name was wrong. +- **New `{{BASE_BRANCH_DETECT}}` placeholder** for skill authors — drop it into any template and get 3-step branch detection (PR base → repo default → fallback) for free. +- **3 new E2E smoke tests** validate base branch detection works end-to-end across ship, review, and retro skills. + +### For contributors + +- Added `hasAwait()` helper with comment-stripping to avoid false positives on `// await` in eval files. +- Smart eval wrapping: single-line → expression `(...)`, multi-line → block `{...}` with explicit `return`. +- 6 new async wrapping unit tests, 40 new contributor mode preamble validation tests. +- Calibration example framed as historical ("used to fail") to avoid implying a live bug post-fix. +- Added "Writing SKILL templates" section to CLAUDE.md — rules for natural language over bash-isms, dynamic branch detection, self-contained code blocks. +- Hardcoded-main regression test scans all `.tmpl` files for git commands with hardcoded `main`. +- QA template cleaned up: removed `REPORT_DIR` shell variable, simplified port detection to prose. +- gstack-upgrade template: explicit cross-step prose for variable references between bash blocks. + ## 0.4.1 — 2026-03-16 - **gstack now notices when it screws up.** Turn on contributor mode (`gstack-config set gstack_contributor true`) and gstack automatically writes up what went wrong — what you were doing, what broke, repro steps. Next time something annoys you, the bug report is already written. Fork gstack and fix it yourself. diff --git a/CLAUDE.md b/CLAUDE.md index d0bddfbf..4ebee4d8 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -67,6 +67,23 @@ SKILL.md files are **generated** from `.tmpl` templates. To update docs: To add a new browse command: add it to `browse/src/commands.ts` and rebuild. To add a snapshot flag: add it to `SNAPSHOT_FLAGS` in `browse/src/snapshot.ts` and rebuild. +## Writing SKILL templates + +SKILL.md.tmpl files are **prompt templates read by Claude**, not bash scripts. +Each bash code block runs in a separate shell — variables do not persist between blocks. + +Rules: +- **Use natural language for logic and state.** Don't use shell variables to pass + state between code blocks. Instead, tell Claude what to remember and reference + it in prose (e.g., "the base branch detected in Step 0"). +- **Don't hardcode branch names.** Detect `main`/`master`/etc dynamically via + `gh pr view` or `gh repo view`. Use `{{BASE_BRANCH_DETECT}}` for PR-targeting + skills. Use "the base branch" in prose, `` in code block placeholders. +- **Keep bash blocks self-contained.** Each code block should work independently. + If a block needs context from a previous step, restate it in the prose above. +- **Express conditionals as English.** Instead of nested `if/elif/else` in bash, + write numbered decision steps: "1. If X, do Y. 2. Otherwise, do Z." + ## Browser interaction When you need to interact with a browser (QA, dogfooding, cookie setup), use the @@ -103,6 +120,21 @@ CHANGELOG.md is **for users**, not contributors. Write it like product release n - No jargon: say "every question now tells you which project and branch you're in" not "AskUserQuestion format standardized across skill templates via preamble resolver." +## E2E eval failure blame protocol + +When an E2E eval fails during `/ship` or any other workflow, **never claim "not +related to our changes" without proving it.** These systems have invisible couplings — +a preamble text change affects agent behavior, a new helper changes timing, a +regenerated SKILL.md shifts prompt context. + +**Required before attributing a failure to "pre-existing":** +1. Run the same eval on main (or base branch) and show it fails there too +2. If it passes on main but fails on the branch — it IS your change. Trace the blame. +3. If you can't run on main, say "unverified — may or may not be related" and flag it + as a risk in the PR body + +"Pre-existing" without receipts is a lazy claim. Prove it or don't say it. + ## Deploying to the active skill The active skill lives at `~/.claude/skills/gstack/`. After making changes: diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index abe1cf16..4af2e889 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -22,9 +22,11 @@ bin/dev-teardown # deactivate — back to your global install ## Contributor mode -Contributor mode is for people who want to fix gstack when it annoys them. Enable it -and Claude Code will automatically log issues to `~/.gstack/contributor-logs/` as you -work — what you were doing, what went wrong, repro steps, raw output. +Contributor mode turns gstack into a self-improving tool. Enable it and Claude Code +will periodically reflect on its gstack experience — rating it 0-10 at the end of +each major workflow step. When something isn't a 10, it thinks about why and files +a report to `~/.gstack/contributor-logs/` with what happened, repro steps, and what +would make it better. ```bash ~/.claude/skills/gstack/bin/gstack-config set gstack_contributor true @@ -36,7 +38,7 @@ the issue, fix it, and open a PR. ### The contributor workflow -1. **Hit friction while using gstack** — contributor mode logs it automatically +1. **Use gstack normally** — contributor mode reflects and logs issues automatically 2. **Check your logs:** `ls ~/.gstack/contributor-logs/` 3. **Fork and clone gstack** (if you haven't already) 4. **Symlink your fork into the project where you hit the bug:** @@ -217,6 +219,8 @@ bun run skill:check bun run dev:skill ``` +For template authoring best practices (natural language over bash-isms, dynamic branch detection, `{{BASE_BRANCH_DETECT}}` usage), see CLAUDE.md's "Writing SKILL templates" section. + To add a browse command, add it to `browse/src/commands.ts`. To add a snapshot flag, add it to `SNAPSHOT_FLAGS` in `browse/src/snapshot.ts`. Then rebuild. ## Conductor workspaces diff --git a/SKILL.md b/SKILL.md index 67bb6a41..9ac6858b 100644 --- a/SKILL.md +++ b/SKILL.md @@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Completion Status Protocol diff --git a/brainstorm/SKILL.md b/brainstorm/SKILL.md index 02992bba..70056efa 100644 --- a/brainstorm/SKILL.md +++ b/brainstorm/SKILL.md @@ -45,12 +45,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -59,20 +62,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Completion Status Protocol diff --git a/browse/SKILL.md b/browse/SKILL.md index 22e464bb..55c24aff 100644 --- a/browse/SKILL.md +++ b/browse/SKILL.md @@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Completion Status Protocol diff --git a/browse/src/read-commands.ts b/browse/src/read-commands.ts index 53efec8a..a7d76352 100644 --- a/browse/src/read-commands.ts +++ b/browse/src/read-commands.ts @@ -11,6 +11,12 @@ import type { Page } from 'playwright'; import * as fs from 'fs'; import * as path from 'path'; +/** Detect await keyword, ignoring comments. Accepted risk: await in string literals triggers wrapping (harmless). */ +function hasAwait(code: string): boolean { + const stripped = code.replace(/\/\/.*$/gm, '').replace(/\/\*[\s\S]*?\*\//g, ''); + return /\bawait\b/.test(stripped); +} + // Security: Path validation to prevent path traversal attacks const SAFE_DIRECTORIES = ['/tmp', process.cwd()]; @@ -118,7 +124,8 @@ export async function handleReadCommand( case 'js': { const expr = args[0]; if (!expr) throw new Error('Usage: browse js '); - const result = await page.evaluate(expr); + const wrapped = hasAwait(expr) ? `(async()=>(${expr}))()` : expr; + const result = await page.evaluate(wrapped); return typeof result === 'object' ? JSON.stringify(result, null, 2) : String(result ?? ''); } @@ -128,6 +135,13 @@ export async function handleReadCommand( validateReadPath(filePath); if (!fs.existsSync(filePath)) throw new Error(`File not found: ${filePath}`); const code = fs.readFileSync(filePath, 'utf-8'); + if (hasAwait(code)) { + const trimmed = code.trim(); + const isSingleExpr = trimmed.split('\n').length === 1; + const wrapped = isSingleExpr ? `(async()=>(${trimmed}))()` : `(async()=>{\n${code}\n})()`; + const result = await page.evaluate(wrapped); + return typeof result === 'object' ? JSON.stringify(result, null, 2) : String(result ?? ''); + } const result = await page.evaluate(code); return typeof result === 'object' ? JSON.stringify(result, null, 2) : String(result ?? ''); } diff --git a/browse/test/commands.test.ts b/browse/test/commands.test.ts index a3e201d9..d8aaeab6 100644 --- a/browse/test/commands.test.ts +++ b/browse/test/commands.test.ts @@ -144,6 +144,60 @@ describe('Inspection', () => { expect(obj.b).toBe(2); }); + test('js supports await expressions', async () => { + const result = await handleReadCommand('js', ['await Promise.resolve(42)'], bm); + expect(result).toBe('42'); + }); + + test('js does not false-positive on await substring', async () => { + const result = await handleReadCommand('js', ['(() => { const awaitable = 5; return awaitable })()'], bm); + expect(result).toBe('5'); + }); + + test('eval supports await in single-line file', async () => { + const tmp = '/tmp/eval-await-test.js'; + fs.writeFileSync(tmp, 'await Promise.resolve("hello from eval")'); + try { + const result = await handleReadCommand('eval', [tmp], bm); + expect(result).toBe('hello from eval'); + } finally { + fs.unlinkSync(tmp); + } + }); + + test('eval does not wrap when await is only in a comment', async () => { + const tmp = '/tmp/eval-comment-test.js'; + fs.writeFileSync(tmp, '// no need to await this\ndocument.title'); + try { + const result = await handleReadCommand('eval', [tmp], bm); + expect(result).toBe('Test Page - Basic'); + } finally { + fs.unlinkSync(tmp); + } + }); + + test('eval multi-line with await and explicit return', async () => { + const tmp = '/tmp/eval-multiline-await.js'; + fs.writeFileSync(tmp, 'const data = await Promise.resolve("multi");\nreturn data;'); + try { + const result = await handleReadCommand('eval', [tmp], bm); + expect(result).toBe('multi'); + } finally { + fs.unlinkSync(tmp); + } + }); + + test('eval multi-line with await but no return gives empty string', async () => { + const tmp = '/tmp/eval-multiline-no-return.js'; + fs.writeFileSync(tmp, 'const data = await Promise.resolve("lost");\ndata;'); + try { + const result = await handleReadCommand('eval', [tmp], bm); + expect(result).toBe(''); + } finally { + fs.unlinkSync(tmp); + } + }); + test('css returns computed property', async () => { const result = await handleReadCommand('css', ['h1', 'color'], bm); // Navy color diff --git a/debug/SKILL.md b/debug/SKILL.md index 23a6ba5a..ce4147d0 100644 --- a/debug/SKILL.md +++ b/debug/SKILL.md @@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Completion Status Protocol diff --git a/gstack-upgrade/SKILL.md b/gstack-upgrade/SKILL.md index 1cf7d548..42f13f69 100644 --- a/gstack-upgrade/SKILL.md +++ b/gstack-upgrade/SKILL.md @@ -94,14 +94,20 @@ fi echo "Install type: $INSTALL_TYPE at $INSTALL_DIR" ``` +The install type and directory path printed above will be used in all subsequent steps. + ### Step 3: Save old version +Use the install directory from Step 2's output below: + ```bash OLD_VERSION=$(cat "$INSTALL_DIR/VERSION" 2>/dev/null || echo "unknown") ``` ### Step 4: Upgrade +Use the install type and directory detected in Step 2: + **For git installs** (global-git, local-git): ```bash cd "$INSTALL_DIR" @@ -125,7 +131,7 @@ rm -rf "$INSTALL_DIR.bak" "$TMP_DIR" ### Step 4.5: Sync local vendored copy -After upgrading the primary install, check if there's also a local copy in the current project that needs updating: +Use the install directory from Step 2. Check if there's also a local vendored copy that needs updating: ```bash _ROOT=$(git rev-parse --show-toplevel 2>/dev/null) diff --git a/gstack-upgrade/SKILL.md.tmpl b/gstack-upgrade/SKILL.md.tmpl index 4a124be1..a199db6c 100644 --- a/gstack-upgrade/SKILL.md.tmpl +++ b/gstack-upgrade/SKILL.md.tmpl @@ -92,14 +92,20 @@ fi echo "Install type: $INSTALL_TYPE at $INSTALL_DIR" ``` +The install type and directory path printed above will be used in all subsequent steps. + ### Step 3: Save old version +Use the install directory from Step 2's output below: + ```bash OLD_VERSION=$(cat "$INSTALL_DIR/VERSION" 2>/dev/null || echo "unknown") ``` ### Step 4: Upgrade +Use the install type and directory detected in Step 2: + **For git installs** (global-git, local-git): ```bash cd "$INSTALL_DIR" @@ -123,7 +129,7 @@ rm -rf "$INSTALL_DIR.bak" "$TMP_DIR" ### Step 4.5: Sync local vendored copy -After upgrading the primary install, check if there's also a local copy in the current project that needs updating: +Use the install directory from Step 2. Check if there's also a local vendored copy that needs updating: ```bash _ROOT=$(git rev-parse --show-toplevel 2>/dev/null) diff --git a/plan-ceo-review/SKILL.md b/plan-ceo-review/SKILL.md index c3c9428e..e782099f 100644 --- a/plan-ceo-review/SKILL.md +++ b/plan-ceo-review/SKILL.md @@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Completion Status Protocol @@ -98,6 +104,25 @@ ATTEMPTED: [what you tried] RECOMMENDATION: [what the user should do next] ``` +## Step 0: Detect base branch + +Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps. + +1. Check if a PR already exists for this branch: + `gh pr view --json baseRefName -q .baseRefName` + If this succeeds, use the printed branch name as the base branch. + +2. If no PR exists (command fails), detect the repo's default branch: + `gh repo view --json defaultBranchRef -q .defaultBranchRef.name` + +3. If both commands fail, fall back to `main`. + +Print the detected base branch name. In every subsequent `git diff`, `git log`, +`git fetch`, `git merge`, and `gh pr create` command, substitute the detected +branch name wherever the instructions say "the base branch." + +--- + # Mega Plan Review Mode ## Philosophy @@ -142,7 +167,7 @@ Before doing anything else, run a system audit. This is not the plan review — Run the following commands: ``` git log --oneline -30 # Recent history -git diff main --stat # What's already changed +git diff --stat # What's already changed git stash list # Any stashed work grep -r "TODO\|FIXME\|HACK\|XXX" --include="*.rb" --include="*.js" -l find . -name "*.rb" -newer Gemfile.lock | head -20 # Recently touched files diff --git a/plan-ceo-review/SKILL.md.tmpl b/plan-ceo-review/SKILL.md.tmpl index 400fb579..5ac18876 100644 --- a/plan-ceo-review/SKILL.md.tmpl +++ b/plan-ceo-review/SKILL.md.tmpl @@ -16,6 +16,8 @@ allowed-tools: {{PREAMBLE}} +{{BASE_BRANCH_DETECT}} + # Mega Plan Review Mode ## Philosophy @@ -60,7 +62,7 @@ Before doing anything else, run a system audit. This is not the plan review — Run the following commands: ``` git log --oneline -30 # Recent history -git diff main --stat # What's already changed +git diff --stat # What's already changed git stash list # Any stashed work grep -r "TODO\|FIXME\|HACK\|XXX" --include="*.rb" --include="*.js" -l find . -name "*.rb" -newer Gemfile.lock | head -20 # Recently touched files diff --git a/plan-eng-review/SKILL.md b/plan-eng-review/SKILL.md index b9934838..24400337 100644 --- a/plan-eng-review/SKILL.md +++ b/plan-eng-review/SKILL.md @@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Completion Status Protocol diff --git a/qa-only/SKILL.md b/qa-only/SKILL.md index fb4d89e8..5a99a148 100644 --- a/qa-only/SKILL.md +++ b/qa-only/SKILL.md @@ -43,12 +43,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -57,20 +60,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Completion Status Protocol diff --git a/qa/SKILL.md b/qa/SKILL.md index 7a982353..2f211f73 100644 --- a/qa/SKILL.md +++ b/qa/SKILL.md @@ -48,12 +48,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -62,20 +65,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Completion Status Protocol @@ -102,6 +108,25 @@ ATTEMPTED: [what you tried] RECOMMENDATION: [what the user should do next] ``` +## Step 0: Detect base branch + +Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps. + +1. Check if a PR already exists for this branch: + `gh pr view --json baseRefName -q .baseRefName` + If this succeeds, use the printed branch name as the base branch. + +2. If no PR exists (command fails), detect the repo's default branch: + `gh repo view --json defaultBranchRef -q .defaultBranchRef.name` + +3. If both commands fail, fall back to `main`. + +Print the detected base branch name. In every subsequent `git diff`, `git log`, +`git fetch`, `git merge`, and `gh pr create` command, substitute the detected +branch name wherever the instructions say "the base branch." + +--- + # /qa: Test → Fix → Verify You are a QA engineer AND a bug-fix engineer. Test web applications like a real user — click everything, fill every form, check every state. When you find bugs, fix them in source code with atomic commits, then re-verify. Produce a structured report with before/after evidence. @@ -158,8 +183,7 @@ If `NEEDS_SETUP`: **Create output directories:** ```bash -REPORT_DIR=".gstack/qa-reports" -mkdir -p "$REPORT_DIR/screenshots" +mkdir -p .gstack/qa-reports/screenshots ``` --- diff --git a/qa/SKILL.md.tmpl b/qa/SKILL.md.tmpl index a3e5a9f0..f491999e 100644 --- a/qa/SKILL.md.tmpl +++ b/qa/SKILL.md.tmpl @@ -20,6 +20,8 @@ allowed-tools: {{PREAMBLE}} +{{BASE_BRANCH_DETECT}} + # /qa: Test → Fix → Verify You are a QA engineer AND a bug-fix engineer. Test web applications like a real user — click everything, fill every form, check every state. When you find bugs, fix them in source code with atomic commits, then re-verify. Produce a structured report with before/after evidence. @@ -59,8 +61,7 @@ fi **Create output directories:** ```bash -REPORT_DIR=".gstack/qa-reports" -mkdir -p "$REPORT_DIR/screenshots" +mkdir -p .gstack/qa-reports/screenshots ``` --- diff --git a/retro/SKILL.md b/retro/SKILL.md index 1ff5cb18..5f07b680 100644 --- a/retro/SKILL.md +++ b/retro/SKILL.md @@ -43,12 +43,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -57,20 +60,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Completion Status Protocol @@ -97,6 +103,16 @@ ATTEMPTED: [what you tried] RECOMMENDATION: [what the user should do next] ``` +## Detect default branch + +Before gathering data, detect the repo's default branch name: +`gh repo view --json defaultBranchRef -q .defaultBranchRef.name` + +If this fails, fall back to `main`. Use the detected name wherever the instructions +say `origin/` below. + +--- + # /retro — Weekly Engineering Retrospective Generates a comprehensive engineering retrospective analyzing commit history, work patterns, and code quality metrics. Team-aware: identifies the user running the command, then analyzes every contributor with per-person praise and growth opportunities. Designed for a senior IC/CTO-level builder using Claude Code as a force multiplier. @@ -131,7 +147,7 @@ Usage: /retro [window] First, fetch origin and identify the current user: ```bash -git fetch origin main --quiet +git fetch origin --quiet # Identify who is running the retro git config user.name git config user.email @@ -143,28 +159,28 @@ Run ALL of these git commands in parallel (they are independent): ```bash # 1. All commits in window with timestamps, subject, hash, AUTHOR, files changed, insertions, deletions -git log origin/main --since="" --format="%H|%aN|%ae|%ai|%s" --shortstat +git log origin/ --since="" --format="%H|%aN|%ae|%ai|%s" --shortstat # 2. Per-commit test vs total LOC breakdown with author # Each commit block starts with COMMIT:|, followed by numstat lines. # Separate test files (matching test/|spec/|__tests__/) from production files. -git log origin/main --since="" --format="COMMIT:%H|%aN" --numstat +git log origin/ --since="" --format="COMMIT:%H|%aN" --numstat # 3. Commit timestamps for session detection and hourly distribution (with author) # Use TZ=America/Los_Angeles for Pacific time conversion -TZ=America/Los_Angeles git log origin/main --since="" --format="%at|%aN|%ai|%s" | sort -n +TZ=America/Los_Angeles git log origin/ --since="" --format="%at|%aN|%ai|%s" | sort -n # 4. Files most frequently changed (hotspot analysis) -git log origin/main --since="" --format="" --name-only | grep -v '^$' | sort | uniq -c | sort -rn +git log origin/ --since="" --format="" --name-only | grep -v '^$' | sort | uniq -c | sort -rn # 5. PR numbers from commit messages (extract #NNN patterns) -git log origin/main --since="" --format="%s" | grep -oE '#[0-9]+' | sed 's/^#//' | sort -n | uniq | sed 's/^/#/' +git log origin/ --since="" --format="%s" | grep -oE '#[0-9]+' | sed 's/^#//' | sort -n | uniq | sed 's/^/#/' # 6. Per-author file hotspots (who touches what) -git log origin/main --since="" --format="AUTHOR:%aN" --name-only +git log origin/ --since="" --format="AUTHOR:%aN" --name-only # 7. Per-author commit counts (quick summary) -git shortlog origin/main --since="" -sn --no-merges +git shortlog origin/ --since="" -sn --no-merges # 8. Greptile triage history (if available) cat ~/.gstack/greptile-history.md 2>/dev/null || true @@ -323,14 +339,14 @@ If the time window is 14 days or more, split into weekly buckets and show trends ### Step 11: Streak Tracking -Count consecutive days with at least 1 commit to origin/main, going back from today. Track both team streak and personal streak: +Count consecutive days with at least 1 commit to origin/, going back from today. Track both team streak and personal streak: ```bash # Team streak: all unique commit dates (Pacific time) — no hard cutoff -TZ=America/Los_Angeles git log origin/main --format="%ad" --date=format:"%Y-%m-%d" | sort -u +TZ=America/Los_Angeles git log origin/ --format="%ad" --date=format:"%Y-%m-%d" | sort -u # Personal streak: only the current user's commits -TZ=America/Los_Angeles git log origin/main --author="" --format="%ad" --date=format:"%Y-%m-%d" | sort -u +TZ=America/Los_Angeles git log origin/ --author="" --format="%ad" --date=format:"%Y-%m-%d" | sort -u ``` Count backward from today — how many consecutive days have at least one commit? This queries the full history so streaks of any length are reported accurately. Display both: @@ -548,7 +564,7 @@ When the user runs `/retro compare` (or `/retro compare 14d`): ## Important Rules - ALL narrative output goes directly to the user in the conversation. The ONLY file written is the `.context/retros/` JSON snapshot. -- Use `origin/main` for all git queries (not local main which may be stale) +- Use `origin/` for all git queries (not local main which may be stale) - Convert all timestamps to Pacific time for display (use `TZ=America/Los_Angeles`) - If the window has zero commits, say so and suggest a different window - Round LOC/hour to nearest 50 diff --git a/retro/SKILL.md.tmpl b/retro/SKILL.md.tmpl index 07e08885..2f39fb5c 100644 --- a/retro/SKILL.md.tmpl +++ b/retro/SKILL.md.tmpl @@ -15,6 +15,16 @@ allowed-tools: {{PREAMBLE}} +## Detect default branch + +Before gathering data, detect the repo's default branch name: +`gh repo view --json defaultBranchRef -q .defaultBranchRef.name` + +If this fails, fall back to `main`. Use the detected name wherever the instructions +say `origin/` below. + +--- + # /retro — Weekly Engineering Retrospective Generates a comprehensive engineering retrospective analyzing commit history, work patterns, and code quality metrics. Team-aware: identifies the user running the command, then analyzes every contributor with per-person praise and growth opportunities. Designed for a senior IC/CTO-level builder using Claude Code as a force multiplier. @@ -49,7 +59,7 @@ Usage: /retro [window] First, fetch origin and identify the current user: ```bash -git fetch origin main --quiet +git fetch origin --quiet # Identify who is running the retro git config user.name git config user.email @@ -61,28 +71,28 @@ Run ALL of these git commands in parallel (they are independent): ```bash # 1. All commits in window with timestamps, subject, hash, AUTHOR, files changed, insertions, deletions -git log origin/main --since="" --format="%H|%aN|%ae|%ai|%s" --shortstat +git log origin/ --since="" --format="%H|%aN|%ae|%ai|%s" --shortstat # 2. Per-commit test vs total LOC breakdown with author # Each commit block starts with COMMIT:|, followed by numstat lines. # Separate test files (matching test/|spec/|__tests__/) from production files. -git log origin/main --since="" --format="COMMIT:%H|%aN" --numstat +git log origin/ --since="" --format="COMMIT:%H|%aN" --numstat # 3. Commit timestamps for session detection and hourly distribution (with author) # Use TZ=America/Los_Angeles for Pacific time conversion -TZ=America/Los_Angeles git log origin/main --since="" --format="%at|%aN|%ai|%s" | sort -n +TZ=America/Los_Angeles git log origin/ --since="" --format="%at|%aN|%ai|%s" | sort -n # 4. Files most frequently changed (hotspot analysis) -git log origin/main --since="" --format="" --name-only | grep -v '^$' | sort | uniq -c | sort -rn +git log origin/ --since="" --format="" --name-only | grep -v '^$' | sort | uniq -c | sort -rn # 5. PR numbers from commit messages (extract #NNN patterns) -git log origin/main --since="" --format="%s" | grep -oE '#[0-9]+' | sed 's/^#//' | sort -n | uniq | sed 's/^/#/' +git log origin/ --since="" --format="%s" | grep -oE '#[0-9]+' | sed 's/^#//' | sort -n | uniq | sed 's/^/#/' # 6. Per-author file hotspots (who touches what) -git log origin/main --since="" --format="AUTHOR:%aN" --name-only +git log origin/ --since="" --format="AUTHOR:%aN" --name-only # 7. Per-author commit counts (quick summary) -git shortlog origin/main --since="" -sn --no-merges +git shortlog origin/ --since="" -sn --no-merges # 8. Greptile triage history (if available) cat ~/.gstack/greptile-history.md 2>/dev/null || true @@ -241,14 +251,14 @@ If the time window is 14 days or more, split into weekly buckets and show trends ### Step 11: Streak Tracking -Count consecutive days with at least 1 commit to origin/main, going back from today. Track both team streak and personal streak: +Count consecutive days with at least 1 commit to origin/, going back from today. Track both team streak and personal streak: ```bash # Team streak: all unique commit dates (Pacific time) — no hard cutoff -TZ=America/Los_Angeles git log origin/main --format="%ad" --date=format:"%Y-%m-%d" | sort -u +TZ=America/Los_Angeles git log origin/ --format="%ad" --date=format:"%Y-%m-%d" | sort -u # Personal streak: only the current user's commits -TZ=America/Los_Angeles git log origin/main --author="" --format="%ad" --date=format:"%Y-%m-%d" | sort -u +TZ=America/Los_Angeles git log origin/ --author="" --format="%ad" --date=format:"%Y-%m-%d" | sort -u ``` Count backward from today — how many consecutive days have at least one commit? This queries the full history so streaks of any length are reported accurately. Display both: @@ -466,7 +476,7 @@ When the user runs `/retro compare` (or `/retro compare 14d`): ## Important Rules - ALL narrative output goes directly to the user in the conversation. The ONLY file written is the `.context/retros/` JSON snapshot. -- Use `origin/main` for all git queries (not local main which may be stale) +- Use `origin/` for all git queries (not local main which may be stale) - Convert all timestamps to Pacific time for display (use `TZ=America/Los_Angeles`) - If the window has zero commits, say so and suggest a different window - Round LOC/hour to nearest 50 diff --git a/review/SKILL.md b/review/SKILL.md index c05e68aa..2e53f372 100644 --- a/review/SKILL.md +++ b/review/SKILL.md @@ -2,7 +2,7 @@ name: review version: 1.0.0 description: | - Pre-landing PR review. Analyzes diff against main for SQL safety, LLM trust + Pre-landing PR review. Analyzes diff against the base branch for SQL safety, LLM trust boundary violations, conditional side effects, and other structural issues. allowed-tools: - Bash @@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Completion Status Protocol @@ -98,17 +104,36 @@ ATTEMPTED: [what you tried] RECOMMENDATION: [what the user should do next] ``` +## Step 0: Detect base branch + +Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps. + +1. Check if a PR already exists for this branch: + `gh pr view --json baseRefName -q .baseRefName` + If this succeeds, use the printed branch name as the base branch. + +2. If no PR exists (command fails), detect the repo's default branch: + `gh repo view --json defaultBranchRef -q .defaultBranchRef.name` + +3. If both commands fail, fall back to `main`. + +Print the detected base branch name. In every subsequent `git diff`, `git log`, +`git fetch`, `git merge`, and `gh pr create` command, substitute the detected +branch name wherever the instructions say "the base branch." + +--- + # Pre-Landing PR Review -You are running the `/review` workflow. Analyze the current branch's diff against main for structural issues that tests don't catch. +You are running the `/review` workflow. Analyze the current branch's diff against the base branch for structural issues that tests don't catch. --- ## Step 1: Check branch 1. Run `git branch --show-current` to get the current branch. -2. If on `main`, output: **"Nothing to review — you're on main or have no changes against main."** and stop. -3. Run `git fetch origin main --quiet && git diff origin/main --stat` to check if there's a diff. If no diff, output the same message and stop. +2. If on the base branch, output: **"Nothing to review — you're on the base branch or have no changes against it."** and stop. +3. Run `git fetch origin --quiet && git diff origin/ --stat` to check if there's a diff. If no diff, output the same message and stop. --- @@ -117,10 +142,10 @@ You are running the `/review` workflow. Analyze the current branch's diff agains Before reviewing code quality, check: **did they build what was requested — nothing more, nothing less?** 1. Read `TODOS.md` (if it exists). Read PR description (`gh pr view --json body --jq .body 2>/dev/null || true`). - Read commit messages (`git log origin/main..HEAD --oneline`). + Read commit messages (`git log origin/..HEAD --oneline`). **If no PR exists:** rely on commit messages and TODOS.md for stated intent — this is the common case since /review runs before /ship creates the PR. 2. Identify the **stated intent** — what was this branch supposed to accomplish? -3. Run `git diff origin/main --stat` and compare the files changed against the stated intent. +3. Run `git diff origin/ --stat` and compare the files changed against the stated intent. 4. Evaluate with skepticism: **SCOPE CREEP detection:** @@ -166,13 +191,13 @@ Read `.claude/skills/review/greptile-triage.md` and follow the fetch, filter, cl ## Step 3: Get the diff -Fetch the latest main to avoid false positives from a stale local main: +Fetch the latest base branch to avoid false positives from stale local state: ```bash -git fetch origin main --quiet +git fetch origin --quiet ``` -Run `git diff origin/main` to get the full diff. This includes both committed and uncommitted changes against the latest main. +Run `git diff origin/` to get the full diff. This includes both committed and uncommitted changes against the latest base branch. --- diff --git a/review/SKILL.md.tmpl b/review/SKILL.md.tmpl index 14afe6f7..44046c60 100644 --- a/review/SKILL.md.tmpl +++ b/review/SKILL.md.tmpl @@ -2,7 +2,7 @@ name: review version: 1.0.0 description: | - Pre-landing PR review. Analyzes diff against main for SQL safety, LLM trust + Pre-landing PR review. Analyzes diff against the base branch for SQL safety, LLM trust boundary violations, conditional side effects, and other structural issues. allowed-tools: - Bash @@ -16,17 +16,19 @@ allowed-tools: {{PREAMBLE}} +{{BASE_BRANCH_DETECT}} + # Pre-Landing PR Review -You are running the `/review` workflow. Analyze the current branch's diff against main for structural issues that tests don't catch. +You are running the `/review` workflow. Analyze the current branch's diff against the base branch for structural issues that tests don't catch. --- ## Step 1: Check branch 1. Run `git branch --show-current` to get the current branch. -2. If on `main`, output: **"Nothing to review — you're on main or have no changes against main."** and stop. -3. Run `git fetch origin main --quiet && git diff origin/main --stat` to check if there's a diff. If no diff, output the same message and stop. +2. If on the base branch, output: **"Nothing to review — you're on the base branch or have no changes against it."** and stop. +3. Run `git fetch origin --quiet && git diff origin/ --stat` to check if there's a diff. If no diff, output the same message and stop. --- @@ -35,10 +37,10 @@ You are running the `/review` workflow. Analyze the current branch's diff agains Before reviewing code quality, check: **did they build what was requested — nothing more, nothing less?** 1. Read `TODOS.md` (if it exists). Read PR description (`gh pr view --json body --jq .body 2>/dev/null || true`). - Read commit messages (`git log origin/main..HEAD --oneline`). + Read commit messages (`git log origin/..HEAD --oneline`). **If no PR exists:** rely on commit messages and TODOS.md for stated intent — this is the common case since /review runs before /ship creates the PR. 2. Identify the **stated intent** — what was this branch supposed to accomplish? -3. Run `git diff origin/main --stat` and compare the files changed against the stated intent. +3. Run `git diff origin/ --stat` and compare the files changed against the stated intent. 4. Evaluate with skepticism: **SCOPE CREEP detection:** @@ -84,13 +86,13 @@ Read `.claude/skills/review/greptile-triage.md` and follow the fetch, filter, cl ## Step 3: Get the diff -Fetch the latest main to avoid false positives from a stale local main: +Fetch the latest base branch to avoid false positives from stale local state: ```bash -git fetch origin main --quiet +git fetch origin --quiet ``` -Run `git diff origin/main` to get the full diff. This includes both committed and uncommitted changes against the latest main. +Run `git diff origin/` to get the full diff. This includes both committed and uncommitted changes against the latest base branch. --- diff --git a/scripts/gen-skill-docs.ts b/scripts/gen-skill-docs.ts index 45ca4fbb..7ec00e25 100644 --- a/scripts/gen-skill-docs.ts +++ b/scripts/gen-skill-docs.ts @@ -123,12 +123,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If \`_CONTRIB\` is \`true\`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If \`_CONTRIB\` is \`true\`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write \`~/.gstack/contributor-logs/{slug}.md\` with this structure: +**Calibration — this is the bar:** For example, \`$B js "await fetch(...)"\` used to fail with \`SyntaxError: await is only valid in async functions\` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write \`~/.gstack/contributor-logs/{slug}.md\` with **all sections below** (do not truncate — include every section through the Date/Version footer): \`\`\` # {Title} @@ -137,20 +140,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +\`\`\` +{paste the actual error or unexpected output here} +\`\`\` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} \`\`\` -Then run: \`mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md\` - -Slug: lowercase, hyphens, max 60 chars (e.g. \`browse-snapshot-ref-gap\`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. \`browse-js-no-await\`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Completion Status Protocol @@ -199,6 +205,27 @@ If \`NEEDS_SETUP\`: 3. If \`bun\` is not installed: \`curl -fsSL https://bun.sh/install | bash\``; } +function generateBaseBranchDetect(): string { + return `## Step 0: Detect base branch + +Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps. + +1. Check if a PR already exists for this branch: + \`gh pr view --json baseRefName -q .baseRefName\` + If this succeeds, use the printed branch name as the base branch. + +2. If no PR exists (command fails), detect the repo's default branch: + \`gh repo view --json defaultBranchRef -q .defaultBranchRef.name\` + +3. If both commands fail, fall back to \`main\`. + +Print the detected base branch name. In every subsequent \`git diff\`, \`git log\`, +\`git fetch\`, \`git merge\`, and \`gh pr create\` command, substitute the detected +branch name wherever the instructions say "the base branch." + +---`; +} + function generateQAMethodology(): string { return `## Modes @@ -480,6 +507,7 @@ const RESOLVERS: Record string> = { SNAPSHOT_FLAGS: generateSnapshotFlags, PREAMBLE: generatePreamble, BROWSE_SETUP: generateBrowseSetup, + BASE_BRANCH_DETECT: generateBaseBranchDetect, QA_METHODOLOGY: generateQAMethodology, }; diff --git a/setup-browser-cookies/SKILL.md b/setup-browser-cookies/SKILL.md index 3d9cfcd1..22d05b65 100644 --- a/setup-browser-cookies/SKILL.md +++ b/setup-browser-cookies/SKILL.md @@ -41,12 +41,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -55,20 +58,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Completion Status Protocol diff --git a/ship/SKILL.md b/ship/SKILL.md index 14d3aa1b..fda16de9 100644 --- a/ship/SKILL.md +++ b/ship/SKILL.md @@ -2,7 +2,7 @@ name: ship version: 1.0.0 description: | - Ship workflow: merge main, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. + Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. allowed-tools: - Bash - Read @@ -43,12 +43,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -57,20 +60,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Completion Status Protocol @@ -97,12 +103,31 @@ ATTEMPTED: [what you tried] RECOMMENDATION: [what the user should do next] ``` +## Step 0: Detect base branch + +Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps. + +1. Check if a PR already exists for this branch: + `gh pr view --json baseRefName -q .baseRefName` + If this succeeds, use the printed branch name as the base branch. + +2. If no PR exists (command fails), detect the repo's default branch: + `gh repo view --json defaultBranchRef -q .defaultBranchRef.name` + +3. If both commands fail, fall back to `main`. + +Print the detected base branch name. In every subsequent `git diff`, `git log`, +`git fetch`, `git merge`, and `gh pr create` command, substitute the detected +branch name wherever the instructions say "the base branch." + +--- + # Ship: Fully Automated Ship Workflow You are running the `/ship` workflow. This is a **non-interactive, fully automated** workflow. Do NOT ask for confirmation at any step. The user said `/ship` which means DO IT. Run straight through and output the PR URL at the end. **Only stop for:** -- On `main` branch (abort) +- On the base branch (abort) - Merge conflicts that can't be auto-resolved (stop, show conflicts) - Test failures (stop, show failures) - Pre-landing review finds CRITICAL issues and user chooses to fix (not acknowledge or skip) @@ -123,20 +148,20 @@ You are running the `/ship` workflow. This is a **non-interactive, fully automat ## Step 1: Pre-flight -1. Check the current branch. If on `main`, **abort**: "You're on main. Ship from a feature branch." +1. Check the current branch. If on the base branch or the repo's default branch, **abort**: "You're on the base branch. Ship from a feature branch." 2. Run `git status` (never use `-uall`). Uncommitted changes are always included — no need to ask. -3. Run `git diff main...HEAD --stat` and `git log main..HEAD --oneline` to understand what's being shipped. +3. Run `git diff ...HEAD --stat` and `git log ..HEAD --oneline` to understand what's being shipped. --- -## Step 2: Merge origin/main (BEFORE tests) +## Step 2: Merge the base branch (BEFORE tests) -Fetch and merge `origin/main` into the feature branch so tests run against the merged state: +Fetch and merge the base branch into the feature branch so tests run against the merged state: ```bash -git fetch origin main && git merge origin/main --no-edit +git fetch origin && git merge origin/ --no-edit ``` **If there are merge conflicts:** Try to auto-resolve if they are simple (VERSION, schema.rb, CHANGELOG ordering). If conflicts are complex or ambiguous, **STOP** and show them. @@ -174,7 +199,7 @@ Evals are mandatory when prompt-related files change. Skip this step entirely if **1. Check if the diff touches prompt-related files:** ```bash -git diff origin/main --name-only +git diff origin/ --name-only ``` Match against these patterns (from CLAUDE.md): @@ -235,7 +260,7 @@ Review the diff for structural issues that tests don't catch. 1. Read `.claude/skills/review/checklist.md`. If the file cannot be read, **STOP** and report the error. -2. Run `git diff origin/main` to get the full diff (scoped to feature changes against the freshly-fetched remote main). +2. Run `git diff origin/` to get the full diff (scoped to feature changes against the freshly-fetched base branch). 3. Apply the review checklist in two passes: - **Pass 1 (CRITICAL):** SQL & Data Safety, LLM Output Trust Boundary @@ -303,7 +328,7 @@ For each classified comment: 1. Read the current `VERSION` file (4-digit format: `MAJOR.MINOR.PATCH.MICRO`) 2. **Auto-decide the bump level based on the diff:** - - Count lines changed (`git diff origin/main...HEAD --stat | tail -1`) + - Count lines changed (`git diff origin/...HEAD --stat | tail -1`) - **MICRO** (4th digit): < 50 lines changed, trivial tweaks, typos, config - **PATCH** (3rd digit): 50+ lines changed, bug fixes, small-medium features - **MINOR** (2nd digit): **ASK the user** — only for major features or significant architectural changes @@ -322,8 +347,8 @@ For each classified comment: 1. Read `CHANGELOG.md` header to know the format. 2. Auto-generate the entry from **ALL commits on the branch** (not just recent ones): - - Use `git log main..HEAD --oneline` to see every commit being shipped - - Use `git diff main...HEAD` to see the full diff against main + - Use `git log ..HEAD --oneline` to see every commit being shipped + - Use `git diff ...HEAD` to see the full diff against the base branch - The CHANGELOG entry must be comprehensive of ALL changes going into the PR - If existing CHANGELOG entries on the branch already cover some commits, replace them with one unified entry for the new version - Categorize changes into applicable sections: @@ -371,8 +396,8 @@ Read TODOS.md and verify it follows the recommended structure: This step is fully automatic — no user interaction. Use the diff and commit history already gathered in earlier steps: -- `git diff main...HEAD` (full diff against main) -- `git log main..HEAD --oneline` (all commits being shipped) +- `git diff ...HEAD` (full diff against the base branch) +- `git log ..HEAD --oneline` (all commits being shipped) For each TODO item, check if the changes in this PR complete it by: - Matching commit messages against the TODO title and description @@ -469,7 +494,7 @@ git push -u origin Create a pull request using `gh`: ```bash -gh pr create --title ": " --body "$(cat <<'EOF' +gh pr create --base --title ": " --body "$(cat <<'EOF' ## Summary diff --git a/ship/SKILL.md.tmpl b/ship/SKILL.md.tmpl index 9c1b1649..e8e329dc 100644 --- a/ship/SKILL.md.tmpl +++ b/ship/SKILL.md.tmpl @@ -2,7 +2,7 @@ name: ship version: 1.0.0 description: | - Ship workflow: merge main, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. + Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. allowed-tools: - Bash - Read @@ -15,12 +15,14 @@ allowed-tools: {{PREAMBLE}} +{{BASE_BRANCH_DETECT}} + # Ship: Fully Automated Ship Workflow You are running the `/ship` workflow. This is a **non-interactive, fully automated** workflow. Do NOT ask for confirmation at any step. The user said `/ship` which means DO IT. Run straight through and output the PR URL at the end. **Only stop for:** -- On `main` branch (abort) +- On the base branch (abort) - Merge conflicts that can't be auto-resolved (stop, show conflicts) - Test failures (stop, show failures) - Pre-landing review finds CRITICAL issues and user chooses to fix (not acknowledge or skip) @@ -41,20 +43,20 @@ You are running the `/ship` workflow. This is a **non-interactive, fully automat ## Step 1: Pre-flight -1. Check the current branch. If on `main`, **abort**: "You're on main. Ship from a feature branch." +1. Check the current branch. If on the base branch or the repo's default branch, **abort**: "You're on the base branch. Ship from a feature branch." 2. Run `git status` (never use `-uall`). Uncommitted changes are always included — no need to ask. -3. Run `git diff main...HEAD --stat` and `git log main..HEAD --oneline` to understand what's being shipped. +3. Run `git diff ...HEAD --stat` and `git log ..HEAD --oneline` to understand what's being shipped. --- -## Step 2: Merge origin/main (BEFORE tests) +## Step 2: Merge the base branch (BEFORE tests) -Fetch and merge `origin/main` into the feature branch so tests run against the merged state: +Fetch and merge the base branch into the feature branch so tests run against the merged state: ```bash -git fetch origin main && git merge origin/main --no-edit +git fetch origin && git merge origin/ --no-edit ``` **If there are merge conflicts:** Try to auto-resolve if they are simple (VERSION, schema.rb, CHANGELOG ordering). If conflicts are complex or ambiguous, **STOP** and show them. @@ -92,7 +94,7 @@ Evals are mandatory when prompt-related files change. Skip this step entirely if **1. Check if the diff touches prompt-related files:** ```bash -git diff origin/main --name-only +git diff origin/ --name-only ``` Match against these patterns (from CLAUDE.md): @@ -153,7 +155,7 @@ Review the diff for structural issues that tests don't catch. 1. Read `.claude/skills/review/checklist.md`. If the file cannot be read, **STOP** and report the error. -2. Run `git diff origin/main` to get the full diff (scoped to feature changes against the freshly-fetched remote main). +2. Run `git diff origin/` to get the full diff (scoped to feature changes against the freshly-fetched base branch). 3. Apply the review checklist in two passes: - **Pass 1 (CRITICAL):** SQL & Data Safety, LLM Output Trust Boundary @@ -221,7 +223,7 @@ For each classified comment: 1. Read the current `VERSION` file (4-digit format: `MAJOR.MINOR.PATCH.MICRO`) 2. **Auto-decide the bump level based on the diff:** - - Count lines changed (`git diff origin/main...HEAD --stat | tail -1`) + - Count lines changed (`git diff origin/...HEAD --stat | tail -1`) - **MICRO** (4th digit): < 50 lines changed, trivial tweaks, typos, config - **PATCH** (3rd digit): 50+ lines changed, bug fixes, small-medium features - **MINOR** (2nd digit): **ASK the user** — only for major features or significant architectural changes @@ -240,8 +242,8 @@ For each classified comment: 1. Read `CHANGELOG.md` header to know the format. 2. Auto-generate the entry from **ALL commits on the branch** (not just recent ones): - - Use `git log main..HEAD --oneline` to see every commit being shipped - - Use `git diff main...HEAD` to see the full diff against main + - Use `git log ..HEAD --oneline` to see every commit being shipped + - Use `git diff ...HEAD` to see the full diff against the base branch - The CHANGELOG entry must be comprehensive of ALL changes going into the PR - If existing CHANGELOG entries on the branch already cover some commits, replace them with one unified entry for the new version - Categorize changes into applicable sections: @@ -289,8 +291,8 @@ Read TODOS.md and verify it follows the recommended structure: This step is fully automatic — no user interaction. Use the diff and commit history already gathered in earlier steps: -- `git diff main...HEAD` (full diff against main) -- `git log main..HEAD --oneline` (all commits being shipped) +- `git diff ...HEAD` (full diff against the base branch) +- `git log ..HEAD --oneline` (all commits being shipped) For each TODO item, check if the changes in this PR complete it by: - Matching commit messages against the TODO title and description @@ -387,7 +389,7 @@ git push -u origin Create a pull request using `gh`: ```bash -gh pr create --title ": " --body "$(cat <<'EOF' +gh pr create --base --title ": " --body "$(cat <<'EOF' ## Summary diff --git a/test/gen-skill-docs.test.ts b/test/gen-skill-docs.test.ts index a2499af6..e77989f0 100644 --- a/test/gen-skill-docs.test.ts +++ b/test/gen-skill-docs.test.ts @@ -203,6 +203,27 @@ describe('gen-skill-docs', () => { }); }); +describe('BASE_BRANCH_DETECT resolver', () => { + // Find a generated SKILL.md that uses the placeholder (ship is guaranteed to) + const shipContent = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8'); + + test('resolver output contains PR base detection command', () => { + expect(shipContent).toContain('gh pr view --json baseRefName'); + }); + + test('resolver output contains repo default branch detection command', () => { + expect(shipContent).toContain('gh repo view --json defaultBranchRef'); + }); + + test('resolver output contains fallback to main', () => { + expect(shipContent).toMatch(/fall\s*back\s+to\s+`main`/i); + }); + + test('resolver output uses "the base branch" phrasing', () => { + expect(shipContent).toContain('the base branch'); + }); +}); + /** * Quality evals — catch description regressions. * diff --git a/test/skill-e2e.test.ts b/test/skill-e2e.test.ts index 8ce1f40f..aa50a976 100644 --- a/test/skill-e2e.test.ts +++ b/test/skill-e2e.test.ts @@ -13,6 +13,11 @@ import * as os from 'os'; const ROOT = path.resolve(import.meta.dir, '..'); // Skip unless EVALS=1. Session runner strips CLAUDE* env vars to avoid nested session issues. +// +// BLAME PROTOCOL: When an eval fails, do NOT claim "pre-existing" or "not related +// to our changes" without proof. Run the same eval on main to verify. These tests +// have invisible couplings — preamble text, SKILL.md content, and timing all affect +// agent behavior. See CLAUDE.md "E2E eval failure blame protocol" for details. const evalsEnabled = !!process.env.EVALS; const describeE2E = evalsEnabled ? describe : describe.skip; @@ -322,10 +327,16 @@ File a contributor report about this issue. Then tell me what you filed.`, const logFiles = fs.readdirSync(logsDir).filter(f => f.endsWith('.md')); expect(logFiles.length).toBeGreaterThan(0); + // Verify new reflection-based format const logContent = fs.readFileSync(path.join(logsDir, logFiles[0]), 'utf-8'); expect(logContent).toContain('Hey gstack team'); expect(logContent).toContain('What I was trying to do'); expect(logContent).toContain('What happened instead'); + expect(logContent).toMatch(/rating/i); + // Verify report has repro steps (agent may use "Steps to reproduce", "Repro Steps", etc.) + expect(logContent).toMatch(/repro|steps to reproduce|how to reproduce/i); + // Verify report has date/version footer (agent may format differently) + expect(logContent).toMatch(/date.*2026|2026.*date/i); // Clean up try { fs.rmSync(contribDir, { recursive: true, force: true }); } catch {} @@ -424,16 +435,20 @@ describeE2E('QA skill E2E', () => { test('/qa quick completes without browse errors', async () => { const result = await runSkillTest({ - prompt: `You have a browse binary at ${browseBin}. Assign it to B variable like: B="${browseBin}" + prompt: `B="${browseBin}" + +The test server is already running at: ${testServer.url} +Target page: ${testServer.url}/basic.html Read the file qa/SKILL.md for the QA workflow instructions. Run a Quick-depth QA test on ${testServer.url}/basic.html Do NOT use AskUserQuestion — run Quick tier directly. +Do NOT try to start a server or discover ports — the URL above is ready. Write your report to ${qaDir}/qa-reports/qa-report.md`, workingDirectory: qaDir, maxTurns: 35, - timeout: 180_000, + timeout: 240_000, testName: 'qa-quick', runId, }); @@ -448,7 +463,7 @@ Write your report to ${qaDir}/qa-reports/qa-report.md`, } // Accept error_max_turns — the agent doing thorough QA work is not a failure expect(['success', 'error_max_turns']).toContain(result.exitReason); - }, 240_000); + }, 300_000); }); // --- B5: Review skill E2E --- @@ -1344,6 +1359,193 @@ Write your review to ${planDir}/review-output.md`, }, 420_000); }); +// --- Base branch detection smoke tests --- + +describeE2E('Base branch detection', () => { + let baseBranchDir: string; + const run = (cmd: string, args: string[], cwd: string) => + spawnSync(cmd, args, { cwd, stdio: 'pipe', timeout: 5000 }); + + beforeAll(() => { + baseBranchDir = fs.mkdtempSync(path.join(os.tmpdir(), 'skill-e2e-basebranch-')); + }); + + afterAll(() => { + try { fs.rmSync(baseBranchDir, { recursive: true, force: true }); } catch {} + }); + + test('/review detects base branch and diffs against it', async () => { + const dir = path.join(baseBranchDir, 'review-base'); + fs.mkdirSync(dir, { recursive: true }); + + // Create git repo with a feature branch off main + run('git', ['init'], dir); + run('git', ['config', 'user.email', 'test@test.com'], dir); + run('git', ['config', 'user.name', 'Test'], dir); + + fs.writeFileSync(path.join(dir, 'app.rb'), '# clean base\nclass App\nend\n'); + run('git', ['add', 'app.rb'], dir); + run('git', ['commit', '-m', 'initial commit'], dir); + + // Create feature branch with a change + run('git', ['checkout', '-b', 'feature/test-review'], dir); + fs.writeFileSync(path.join(dir, 'app.rb'), '# clean base\nclass App\n def hello; "world"; end\nend\n'); + run('git', ['add', 'app.rb'], dir); + run('git', ['commit', '-m', 'feat: add hello method'], dir); + + // Copy review skill files + fs.copyFileSync(path.join(ROOT, 'review', 'SKILL.md'), path.join(dir, 'review-SKILL.md')); + fs.copyFileSync(path.join(ROOT, 'review', 'checklist.md'), path.join(dir, 'review-checklist.md')); + fs.copyFileSync(path.join(ROOT, 'review', 'greptile-triage.md'), path.join(dir, 'review-greptile-triage.md')); + + const result = await runSkillTest({ + prompt: `You are in a git repo on a feature branch with changes. +Read review-SKILL.md for the review workflow instructions. +Also read review-checklist.md and apply it. + +IMPORTANT: Follow Step 0 to detect the base branch. Since there is no remote, gh commands will fail — fall back to main. +Then run the review against the detected base branch. +Write your findings to ${dir}/review-output.md`, + workingDirectory: dir, + maxTurns: 15, + timeout: 90_000, + testName: 'review-base-branch', + runId, + }); + + logCost('/review base-branch', result); + recordE2E('/review base branch detection', 'Base branch detection', result); + expect(result.exitReason).toBe('success'); + + // Verify the review used "base branch" language (from Step 0) + const toolOutputs = result.toolCalls.map(tc => tc.output || '').join('\n'); + const allOutput = (result.output || '') + toolOutputs; + // The agent should have run git diff against main (the fallback) + const usedGitDiff = result.toolCalls.some(tc => + tc.tool === 'Bash' && typeof tc.input === 'string' && tc.input.includes('git diff') + ); + expect(usedGitDiff).toBe(true); + }, 120_000); + + test('/ship Step 0-1 detects base branch without destructive actions', async () => { + const dir = path.join(baseBranchDir, 'ship-base'); + fs.mkdirSync(dir, { recursive: true }); + + // Create git repo with feature branch + run('git', ['init'], dir); + run('git', ['config', 'user.email', 'test@test.com'], dir); + run('git', ['config', 'user.name', 'Test'], dir); + + fs.writeFileSync(path.join(dir, 'app.ts'), 'console.log("v1");\n'); + run('git', ['add', 'app.ts'], dir); + run('git', ['commit', '-m', 'initial'], dir); + + run('git', ['checkout', '-b', 'feature/ship-test'], dir); + fs.writeFileSync(path.join(dir, 'app.ts'), 'console.log("v2");\n'); + run('git', ['add', 'app.ts'], dir); + run('git', ['commit', '-m', 'feat: update to v2'], dir); + + // Copy ship skill + fs.copyFileSync(path.join(ROOT, 'ship', 'SKILL.md'), path.join(dir, 'ship-SKILL.md')); + + const result = await runSkillTest({ + prompt: `Read ship-SKILL.md for the ship workflow. + +Run ONLY Step 0 (Detect base branch) and Step 1 (Pre-flight) from the ship workflow. +Since there is no remote, gh commands will fail — fall back to main. + +After completing Step 0 and Step 1, STOP. Do NOT proceed to Step 2 or beyond. +Do NOT push, create PRs, or modify VERSION/CHANGELOG. + +Write a summary of what you detected to ${dir}/ship-preflight.md including: +- The detected base branch name +- The current branch name +- The diff stat against the base branch`, + workingDirectory: dir, + maxTurns: 10, + timeout: 60_000, + testName: 'ship-base-branch', + runId, + }); + + logCost('/ship base-branch', result); + recordE2E('/ship base branch detection', 'Base branch detection', result); + expect(result.exitReason).toBe('success'); + + // Verify preflight output was written + const preflightPath = path.join(dir, 'ship-preflight.md'); + if (fs.existsSync(preflightPath)) { + const content = fs.readFileSync(preflightPath, 'utf-8'); + expect(content.length).toBeGreaterThan(20); + // Should mention the branch name + expect(content.toLowerCase()).toMatch(/main|base/); + } + + // Verify no destructive actions — no push, no PR creation + const destructiveTools = result.toolCalls.filter(tc => + tc.tool === 'Bash' && typeof tc.input === 'string' && + (tc.input.includes('git push') || tc.input.includes('gh pr create')) + ); + expect(destructiveTools).toHaveLength(0); + }, 90_000); + + test('/retro detects default branch for git queries', async () => { + const dir = path.join(baseBranchDir, 'retro-base'); + fs.mkdirSync(dir, { recursive: true }); + + // Create git repo with commit history + run('git', ['init'], dir); + run('git', ['config', 'user.email', 'dev@example.com'], dir); + run('git', ['config', 'user.name', 'Dev'], dir); + + fs.writeFileSync(path.join(dir, 'app.ts'), 'console.log("hello");\n'); + run('git', ['add', 'app.ts'], dir); + run('git', ['commit', '-m', 'feat: initial app', '--date', '2026-03-14T09:00:00'], dir); + + fs.writeFileSync(path.join(dir, 'auth.ts'), 'export function login() {}\n'); + run('git', ['add', 'auth.ts'], dir); + run('git', ['commit', '-m', 'feat: add auth', '--date', '2026-03-15T10:00:00'], dir); + + fs.writeFileSync(path.join(dir, 'test.ts'), 'test("it works", () => {});\n'); + run('git', ['add', 'test.ts'], dir); + run('git', ['commit', '-m', 'test: add tests', '--date', '2026-03-16T11:00:00'], dir); + + // Copy retro skill + fs.mkdirSync(path.join(dir, 'retro'), { recursive: true }); + fs.copyFileSync(path.join(ROOT, 'retro', 'SKILL.md'), path.join(dir, 'retro', 'SKILL.md')); + + const result = await runSkillTest({ + prompt: `Read retro/SKILL.md for instructions on how to run a retrospective. + +IMPORTANT: Follow the "Detect default branch" step first. Since there is no remote, gh will fail — fall back to main. +Then use the detected branch name for all git queries. + +Run /retro for the last 7 days of this git repo. Skip any AskUserQuestion calls — this is non-interactive. +This is a local-only repo so use the local branch (main) instead of origin/main for all git log commands. + +Write your retrospective to ${dir}/retro-output.md`, + workingDirectory: dir, + maxTurns: 25, + timeout: 240_000, + testName: 'retro-base-branch', + runId, + }); + + logCost('/retro base-branch', result); + recordE2E('/retro default branch detection', 'Base branch detection', result, { + passed: ['success', 'error_max_turns'].includes(result.exitReason), + }); + expect(['success', 'error_max_turns']).toContain(result.exitReason); + + // Verify retro output was produced + const retroPath = path.join(dir, 'retro-output.md'); + if (fs.existsSync(retroPath)) { + const content = fs.readFileSync(retroPath, 'utf-8'); + expect(content.length).toBeGreaterThan(100); + } + }, 300_000); +}); + // --- Deferred skill E2E tests (destructive or require interactive UI) --- describeE2E('Deferred skill E2E', () => { diff --git a/test/skill-validation.test.ts b/test/skill-validation.test.ts index bdec781b..1713c725 100644 --- a/test/skill-validation.test.ts +++ b/test/skill-validation.test.ts @@ -389,6 +389,64 @@ describe('Greptile history format consistency', () => { }); }); +// --- Hardcoded branch name detection in templates --- + +describe('No hardcoded branch names in SKILL templates', () => { + const tmplFiles = [ + 'ship/SKILL.md.tmpl', + 'review/SKILL.md.tmpl', + 'qa/SKILL.md.tmpl', + 'plan-ceo-review/SKILL.md.tmpl', + 'retro/SKILL.md.tmpl', + ]; + + // Patterns that indicate hardcoded 'main' in git commands + const gitMainPatterns = [ + /\bgit\s+diff\s+(?:origin\/)?main\b/, + /\bgit\s+log\s+(?:origin\/)?main\b/, + /\bgit\s+fetch\s+origin\s+main\b/, + /\bgit\s+merge\s+origin\/main\b/, + /\borigin\/main\b/, + ]; + + // Lines that are allowed to mention 'main' (fallback logic, prose) + const allowlist = [ + /fall\s*back\s+to\s+`main`/i, + /fall\s*back\s+to\s+`?main`?/i, + /typically\s+`?main`?/i, + /If\s+on\s+`main`/i, // old pattern — should not exist + ]; + + for (const tmplFile of tmplFiles) { + test(`${tmplFile} has no hardcoded 'main' in git commands`, () => { + const filePath = path.join(ROOT, tmplFile); + if (!fs.existsSync(filePath)) return; + const lines = fs.readFileSync(filePath, 'utf-8').split('\n'); + const violations: string[] = []; + + for (let i = 0; i < lines.length; i++) { + const line = lines[i]; + const isAllowlisted = allowlist.some(p => p.test(line)); + if (isAllowlisted) continue; + + for (const pattern of gitMainPatterns) { + if (pattern.test(line)) { + violations.push(`Line ${i + 1}: ${line.trim()}`); + break; + } + } + } + + if (violations.length > 0) { + throw new Error( + `${tmplFile} has hardcoded 'main' in git commands:\n` + + violations.map(v => ` ${v}`).join('\n') + ); + } + }); + } +}); + // --- Part 7b: TODOS-format.md reference consistency --- describe('TODOS-format.md reference consistency', () => { @@ -468,6 +526,44 @@ describe('debug skill structure', () => { } }); +// --- Contributor mode preamble structure validation --- + +describe('Contributor mode preamble structure', () => { + const skillsWithPreamble = [ + 'SKILL.md', 'browse/SKILL.md', 'qa/SKILL.md', + 'qa-only/SKILL.md', + 'setup-browser-cookies/SKILL.md', + 'ship/SKILL.md', 'review/SKILL.md', + 'plan-ceo-review/SKILL.md', 'plan-eng-review/SKILL.md', + 'retro/SKILL.md', + ]; + + for (const skill of skillsWithPreamble) { + test(`${skill} has 0-10 rating in contributor mode`, () => { + const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8'); + expect(content).toContain('0 to 10'); + expect(content).toContain('My rating'); + }); + + test(`${skill} has calibration example`, () => { + const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8'); + expect(content).toContain('Calibration'); + expect(content).toContain('the bar'); + }); + + test(`${skill} has "what would make this a 10" field`, () => { + const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8'); + expect(content).toContain('What would make this a 10'); + }); + + test(`${skill} uses periodic reflection (not per-command)`, () => { + const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8'); + expect(content).toContain('workflow step'); + expect(content).not.toContain('After you use gstack-provided CLIs'); + }); + } +}); + describe('Enum & Value Completeness in review checklist', () => { const checklist = fs.readFileSync(path.join(ROOT, 'review', 'checklist.md'), 'utf-8');