mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-02 11:45:20 +02:00
484cf1fb3b
* refactor: extract processExternalHost() shared helper for multi-host generation

Refactor the Codex-specific output routing block in gen-skill-docs.ts into a shared processExternalHost() function. Both Codex and future external hosts (Factory Droid) will use this helper for output routing, symlink loop detection, frontmatter transformation, path rewrites, and metadata generation.

- Rename codexSkillName() to externalSkillName() everywhere
- Extract ExternalHostConfig interface with per-host settings
- Codex output is byte-identical (verified via --dry-run)
- Skip /codex skill for all non-Claude hosts (not just codex)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Factory Droid host type, preamble, and co-author trailer

- Add 'factory' to Host union type with .factory/skills/gstack paths
- Extend preamble runtime root detection for Factory ($HOME/.factory/)
- Add GSTACK_DESIGN env var to preamble (was missing for Codex too)
- Add Factory Droid co-author trailer for git commits

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Factory Droid generation, --host all, and host-aware frontmatter

- Add --host factory (alias: --host droid) to gen-skill-docs
- Add --host all: generates for claude, codex, and factory in one invocation with fault-tolerant per-host error handling (only fails if claude fails)
- Factory frontmatter: name + description + user-invocable: true
- Factory sensitive skills: disable-model-invocation: true (from sensitive: field)
- Claude: strips sensitive: field from output (only Factory uses it)
- Factory tool name translation: Claude tool names → generic phrasing
- Replace chained gen:skill-docs calls with --host all in package.json build

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: sensitive frontmatter for Factory Droid auto-invocation safety

Add sensitive: true to 6 skill templates with side effects that Factory Droids shouldn't auto-invoke (ship, land-and-deploy, guard, careful, freeze, unfreeze). The field is:

- Factory: emitted as disable-model-invocation: true
- Claude/Codex: stripped from output by transformFrontmatter()

Also fix Claude host path: call transformFrontmatter() for Claude to strip the sensitive: field from Claude output.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: gstack-platform-detect binary for multi-host debugging

Bash script that prints a table of installed AI coding agents (Claude, Codex, Factory Droid, Kiro) with versions, skill paths, and gstack installation status. Useful for debugging multi-host setups.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Factory Droid support in setup script

- Add factory to --host values (auto-detected via command -v droid)
- Add .factory/ skill doc generation step alongside .agents/
- Add create_factory_runtime_root() and link_factory_skill_dirs() helpers mirroring the Codex equivalents
- Factory install section creates ~/.factory/skills/ with symlinks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Factory Droid awareness in skill-check and uninstall

- skill-check.ts: add Factory skills validation and freshness check
- gstack-uninstall: add Factory artifact cleanup (~/.factory/skills/gstack* and per-project .factory/ sidecar)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: Factory Droid generation + --host all test suites

Add 13 new tests:

- Factory output paths, frontmatter (user-invocable, disable-model-invocation)
- Sensitive vs non-sensitive skill classification
- Path rewrites (no .claude/skills/ in Factory output)
- /codex skill exclusion, openai.yaml absence
- Factory keeps Codex integration blocks (for second opinions)
- --host droid alias, --dry-run freshness, preamble paths
- --host all generates for all 3 hosts
- Setup script host validation updated for factory

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: Factory Droid install instructions + CI freshness check

- README: add Factory Droid section with install instructions and restart note (Factory requires restart to rescan skills)
- CI: add Factory skill doc freshness verification to skill-docs.yml

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: generated Factory Droid skill output (.factory/skills/)

29 skills generated for Factory Droid with:

- user-invocable: true on all skills
- disable-model-invocation: true on 6 sensitive skills
- .factory/skills/ paths (no .claude/skills/ references)
- $GSTACK_ROOT env vars for runtime root detection
- Tool name translation (Claude tool names → generic phrasing)

Committed to git for CI freshness checks and direct consumption.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: add Factory Droid P1 TODO for browse MCP server

Add 3 TODOs under new ## Factory Droid section:

- P1: Browse MCP server (Option B, deeper Factory integration)
- P3: .agent/skills/ dual output for cross-agent compatibility
- P3: Custom Droid definitions alongside skills

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.13.5.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
378 lines
16 KiB
TypeScript
import type { TemplateContext } from './types';

export function generateSlugEval(ctx: TemplateContext): string {
  return `eval "$(${ctx.paths.binDir}/gstack-slug 2>/dev/null)"`;
}

export function generateSlugSetup(ctx: TemplateContext): string {
  return `eval "$(${ctx.paths.binDir}/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG`;
}
export function generateBaseBranchDetect(_ctx: TemplateContext): string {
  return `## Step 0: Detect platform and base branch

First, detect the git hosting platform from the remote URL:

\`\`\`bash
git remote get-url origin 2>/dev/null
\`\`\`

- If the URL contains "github.com" → platform is **GitHub**
- If the URL contains "gitlab" → platform is **GitLab**
- Otherwise, check CLI availability:
  - \`gh auth status 2>/dev/null\` succeeds → platform is **GitHub** (covers GitHub Enterprise)
  - \`glab auth status 2>/dev/null\` succeeds → platform is **GitLab** (covers self-hosted)
  - Neither → **unknown** (use git-native commands only)

Determine which branch this PR/MR targets, or the repo's default branch if no
PR/MR exists. Use the result as "the base branch" in all subsequent steps.

**If GitHub:**
1. \`gh pr view --json baseRefName -q .baseRefName\` — if succeeds, use it
2. \`gh repo view --json defaultBranchRef -q .defaultBranchRef.name\` — if succeeds, use it

**If GitLab:**
1. \`glab mr view -F json 2>/dev/null\` and extract the \`target_branch\` field — if succeeds, use it
2. \`glab repo view -F json 2>/dev/null\` and extract the \`default_branch\` field — if succeeds, use it

**Git-native fallback (if unknown platform, or CLI commands fail):**
1. \`git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's|refs/remotes/origin/||'\`
2. If that fails: \`git rev-parse --verify origin/main 2>/dev/null\` → use \`main\`
3. If that fails: \`git rev-parse --verify origin/master 2>/dev/null\` → use \`master\`

If all fail, fall back to \`main\`.

Print the detected base branch name. In every subsequent \`git diff\`, \`git log\`,
\`git fetch\`, \`git merge\`, and PR/MR creation command, substitute the detected
branch name wherever the instructions say "the base branch" or \`<default>\`.

---`;
}
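/*
The Step 0 rules emitted by generateBaseBranchDetect() are prose for an agent to follow, but the same
decision order can be mirrored in plain TypeScript, which may help when unit-testing the intended
behavior. This is a minimal sketch and not part of gstack: detectPlatformFromRemote() and
pickBaseBranch() are hypothetical names, the URL match is lowercased as an added assumption, and the
CLI probes (gh, glab, git) are assumed to have been run elsewhere, with their trimmed output or null
on failure passed in.

```typescript
type GitPlatform = 'github' | 'gitlab' | 'unknown';

// Hypothetical helper: classify a remote URL with the same substring rules as Step 0.
// Lowercasing is an assumption of this sketch; the template only says "contains".
function detectPlatformFromRemote(remoteUrl: string): GitPlatform {
  const url = remoteUrl.toLowerCase();
  if (url.includes('github.com')) return 'github';
  if (url.includes('gitlab')) return 'gitlab';
  return 'unknown'; // caller would then try `gh auth status` and `glab auth status`
}

// Hypothetical helper: walk the candidates in priority order (PR/MR target,
// repo default branch, git-native fallbacks) and return the first usable one.
// Each entry is the output of one command, or null if that command failed.
function pickBaseBranch(candidates: Array<string | null>): string {
  for (const candidate of candidates) {
    if (candidate && candidate.trim() !== '') return candidate.trim();
  }
  return 'main'; // "If all fail, fall back to main"
}
```

For example, pickBaseBranch([null, 'develop']) models a repository with no open PR whose default
branch is develop.
*/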
export function generateDeployBootstrap(_ctx: TemplateContext): string {
  return `\`\`\`bash
# Check for persisted deploy config in CLAUDE.md
DEPLOY_CONFIG=$(grep -A 20 "## Deploy Configuration" CLAUDE.md 2>/dev/null || echo "NO_CONFIG")
echo "$DEPLOY_CONFIG"

# If config exists, parse it
if [ "$DEPLOY_CONFIG" != "NO_CONFIG" ]; then
  PROD_URL=$(echo "$DEPLOY_CONFIG" | grep -i "production.*url" | head -1 | sed 's/.*: *//')
  PLATFORM=$(echo "$DEPLOY_CONFIG" | grep -i "platform" | head -1 | sed 's/.*: *//')
  echo "PERSISTED_PLATFORM:$PLATFORM"
  echo "PERSISTED_URL:$PROD_URL"
fi

# Auto-detect platform from config files
[ -f fly.toml ] && echo "PLATFORM:fly"
[ -f render.yaml ] && echo "PLATFORM:render"
([ -f vercel.json ] || [ -d .vercel ]) && echo "PLATFORM:vercel"
[ -f netlify.toml ] && echo "PLATFORM:netlify"
[ -f Procfile ] && echo "PLATFORM:heroku"
([ -f railway.json ] || [ -f railway.toml ]) && echo "PLATFORM:railway"

# Detect deploy workflows
for f in $(find .github/workflows -maxdepth 1 \\( -name '*.yml' -o -name '*.yaml' \\) 2>/dev/null); do
  [ -f "$f" ] && grep -qiE "deploy|release|production|cd" "$f" 2>/dev/null && echo "DEPLOY_WORKFLOW:$f"
  [ -f "$f" ] && grep -qiE "staging" "$f" 2>/dev/null && echo "STAGING_WORKFLOW:$f"
done
\`\`\`

If \`PERSISTED_PLATFORM\` and \`PERSISTED_URL\` were found in CLAUDE.md, use them directly
and skip manual detection. If no persisted config exists, use the auto-detected platform
to guide deploy verification. If nothing is detected, ask the user via AskUserQuestion
in the decision tree below.

To persist deploy settings for future runs, suggest that the user run \`/setup-deploy\`.`;
}
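/*
The auto-detect lines in the bash block emitted by generateDeployBootstrap() map marker files to a
platform name. The same mapping can be written as a small lookup table, which may be easier to extend
or test. This is only a sketch under stated assumptions: PLATFORM_MARKERS and detectDeployPlatforms()
are hypothetical names, and the function takes a list of entry names from the project root instead of
touching the filesystem, so it does not distinguish files from directories the way `[ -f ]` and
`[ -d ]` do.

```typescript
// Hypothetical table mirroring the marker checks in the generated bash block.
const PLATFORM_MARKERS: Array<{ platform: string; markers: string[] }> = [
  { platform: 'fly', markers: ['fly.toml'] },
  { platform: 'render', markers: ['render.yaml'] },
  { platform: 'vercel', markers: ['vercel.json', '.vercel'] },
  { platform: 'netlify', markers: ['netlify.toml'] },
  { platform: 'heroku', markers: ['Procfile'] },
  { platform: 'railway', markers: ['railway.json', 'railway.toml'] },
];

// Returns every platform whose marker is present, in table order.
// Like the bash block, it can report more than one platform.
function detectDeployPlatforms(rootEntries: Iterable<string>): string[] {
  const present = new Set(rootEntries);
  return PLATFORM_MARKERS
    .filter(({ markers }) => markers.some((marker) => present.has(marker)))
    .map(({ platform }) => platform);
}
```

A caller could feed it the result of a directory listing and emit one PLATFORM:<name> line per match,
matching the output format of the bash version.
*/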
export function generateQAMethodology(_ctx: TemplateContext): string {
  return `## Modes

### Diff-aware (automatic when on a feature branch with no URL)

This is the **primary mode** for developers verifying their work. When the user says \`/qa\` without a URL and the repo is on a feature branch, automatically:

1. **Analyze the branch diff** to understand what changed:
   \`\`\`bash
   git diff main...HEAD --name-only
   git log main..HEAD --oneline
   \`\`\`

2. **Identify affected pages/routes** from the changed files:
   - Controller/route files → which URL paths they serve
   - View/template/component files → which pages render them
   - Model/service files → which pages use those models (check controllers that reference them)
   - CSS/style files → which pages include those stylesheets
   - API endpoints → test them directly with \`$B js "await fetch('/api/...')"\`
   - Static pages (markdown, HTML) → navigate to them directly

   **If no obvious pages/routes are identified from the diff:** Do not skip browser testing. The user invoked /qa because they want browser-based verification. Fall back to Quick mode — navigate to the homepage, follow the top 5 navigation targets, check console for errors, and test any interactive elements found. Backend, config, and infrastructure changes affect app behavior — always verify the app still works.

3. **Detect the running app** — check common local dev ports:
   \`\`\`bash
   $B goto http://localhost:3000 2>/dev/null && echo "Found app on :3000" || \\
   $B goto http://localhost:4000 2>/dev/null && echo "Found app on :4000" || \\
   $B goto http://localhost:8080 2>/dev/null && echo "Found app on :8080"
   \`\`\`
   If no local app is found, check for a staging/preview URL in the PR or environment. If nothing works, ask the user for the URL.

4. **Test each affected page/route:**
   - Navigate to the page
   - Take a screenshot
   - Check console for errors
   - If the change was interactive (forms, buttons, flows), test the interaction end-to-end
   - Use \`snapshot -D\` before and after actions to verify the change had the expected effect

5. **Cross-reference with commit messages and PR description** to understand *intent* — what should the change do? Verify it actually does that.

6. **Check TODOS.md** (if it exists) for known bugs or issues related to the changed files. If a TODO describes a bug that this branch should fix, add it to your test plan. If you find a new bug during QA that isn't in TODOS.md, note it in the report.

7. **Report findings** scoped to the branch changes:
   - "Changes tested: N pages/routes affected by this branch"
   - For each: does it work? Screenshot evidence.
   - Any regressions on adjacent pages?

**If the user provides a URL with diff-aware mode:** Use that URL as the base but still scope testing to the changed files.

### Full (default when URL is provided)
Systematic exploration. Visit every reachable page. Document 5-10 well-evidenced issues. Produce health score. Takes 5-15 minutes depending on app size.

### Quick (\`--quick\`)
30-second smoke test. Visit homepage + top 5 navigation targets. Check: page loads? Console errors? Broken links? Produce health score. No detailed issue documentation.

### Regression (\`--regression <baseline>\`)
Run full mode, then load \`baseline.json\` from a previous run. Diff: which issues are fixed? Which are new? What's the score delta? Append regression section to report.

---

## Workflow

### Phase 1: Initialize

1. Find browse binary (see Setup above)
2. Create output directories
3. Copy report template from \`qa/templates/qa-report-template.md\` to output dir
4. Start timer for duration tracking

### Phase 2: Authenticate (if needed)

**If the user specified auth credentials:**

\`\`\`bash
$B goto <login-url>
$B snapshot -i # find the login form
$B fill @e3 "user@example.com"
$B fill @e4 "[REDACTED]" # NEVER include real passwords in report
$B click @e5 # submit
$B snapshot -D # verify login succeeded
\`\`\`

**If the user provided a cookie file:**

\`\`\`bash
$B cookie-import cookies.json
$B goto <target-url>
\`\`\`

**If 2FA/OTP is required:** Ask the user for the code and wait.

**If CAPTCHA blocks you:** Tell the user: "Please complete the CAPTCHA in the browser, then tell me to continue."

### Phase 3: Orient

Get a map of the application:

\`\`\`bash
$B goto <target-url>
$B snapshot -i -a -o "$REPORT_DIR/screenshots/initial.png"
$B links # map navigation structure
$B console --errors # any errors on landing?
\`\`\`

**Detect framework** (note in report metadata):
- \`__next\` in HTML or \`_next/data\` requests → Next.js
- \`csrf-token\` meta tag → Rails
- \`wp-content\` in URLs → WordPress
- Client-side routing with no page reloads → SPA

**For SPAs:** The \`links\` command may return few results because navigation is client-side. Use \`snapshot -i\` to find nav elements (buttons, menu items) instead.

### Phase 4: Explore

Visit pages systematically. At each page:

\`\`\`bash
$B goto <page-url>
$B snapshot -i -a -o "$REPORT_DIR/screenshots/page-name.png"
$B console --errors
\`\`\`

Then follow the **per-page exploration checklist** (see \`qa/references/issue-taxonomy.md\`):

1. **Visual scan** — Look at the annotated screenshot for layout issues
2. **Interactive elements** — Click buttons, links, controls. Do they work?
3. **Forms** — Fill and submit. Test empty, invalid, edge cases
4. **Navigation** — Check all paths in and out
5. **States** — Empty state, loading, error, overflow
6. **Console** — Any new JS errors after interactions?
7. **Responsiveness** — Check mobile viewport if relevant:
   \`\`\`bash
   $B viewport 375x812
   $B screenshot "$REPORT_DIR/screenshots/page-mobile.png"
   $B viewport 1280x720
   \`\`\`

**Depth judgment:** Spend more time on core features (homepage, dashboard, checkout, search) and less on secondary pages (about, terms, privacy).

**Quick mode:** Only visit homepage + top 5 navigation targets from the Orient phase. Skip the per-page checklist — just check: loads? Console errors? Broken links visible?

### Phase 5: Document

Document each issue **immediately when found** — don't batch them.

**Two evidence tiers:**

**Interactive bugs** (broken flows, dead buttons, form failures):
1. Take a screenshot before the action
2. Perform the action
3. Take a screenshot showing the result
4. Use \`snapshot -D\` to show what changed
5. Write repro steps referencing screenshots

\`\`\`bash
$B screenshot "$REPORT_DIR/screenshots/issue-001-step-1.png"
$B click @e5
$B screenshot "$REPORT_DIR/screenshots/issue-001-result.png"
$B snapshot -D
\`\`\`

**Static bugs** (typos, layout issues, missing images):
1. Take a single annotated screenshot showing the problem
2. Describe what's wrong

\`\`\`bash
$B snapshot -i -a -o "$REPORT_DIR/screenshots/issue-002.png"
\`\`\`

**Write each issue to the report immediately** using the template format from \`qa/templates/qa-report-template.md\`.

### Phase 6: Wrap Up

1. **Compute health score** using the rubric below
2. **Write "Top 3 Things to Fix"** — the 3 highest-severity issues
3. **Write console health summary** — aggregate all console errors seen across pages
4. **Update severity counts** in the summary table
5. **Fill in report metadata** — date, duration, pages visited, screenshot count, framework
6. **Save baseline** — write \`baseline.json\` with:
   \`\`\`json
   {
     "date": "YYYY-MM-DD",
     "url": "<target>",
     "healthScore": N,
     "issues": [{ "id": "ISSUE-001", "title": "...", "severity": "...", "category": "..." }],
     "categoryScores": { "console": N, "links": N, ... }
   }
   \`\`\`

**Regression mode:** After writing the report, load the baseline file. Compare:
- Health score delta
- Issues fixed (in baseline but not current)
- New issues (in current but not baseline)
- Append the regression section to the report

---

## Health Score Rubric

Compute each category score (0-100), then take the weighted average.

### Console (weight: 15%)
- 0 errors → 100
- 1-3 errors → 70
- 4-10 errors → 40
- 11+ errors → 10

### Links (weight: 10%)
- 0 broken → 100
- Each broken link → -15 (minimum 0)

### Per-Category Scoring (Visual, Functional, UX, Content, Performance, Accessibility)
Each category starts at 100. Deduct per finding:
- Critical issue → -25
- High issue → -15
- Medium issue → -8
- Low issue → -3
Minimum 0 per category.

### Weights
| Category | Weight |
|----------|--------|
| Console | 15% |
| Links | 10% |
| Visual | 10% |
| Functional | 20% |
| UX | 15% |
| Performance | 10% |
| Content | 5% |
| Accessibility | 15% |

### Final Score
\`score = Σ (category_score × weight)\`

---

## Framework-Specific Guidance

### Next.js
- Check console for hydration errors (\`Hydration failed\`, \`Text content did not match\`)
- Monitor \`_next/data\` requests in network — 404s indicate broken data fetching
- Test client-side navigation (click links, don't just \`goto\`) — catches routing issues
- Check for CLS (Cumulative Layout Shift) on pages with dynamic content

### Rails
- Check for N+1 query warnings in console (if development mode)
- Verify CSRF token presence in forms
- Test Turbo/Stimulus integration — do page transitions work smoothly?
- Check for flash messages appearing and dismissing correctly

### WordPress
- Check for plugin conflicts (JS errors from different plugins)
- Verify admin bar visibility for logged-in users
- Test REST API endpoints (\`/wp-json/\`)
- Check for mixed content warnings (common with WP)

### General SPA (React, Vue, Angular)
- Use \`snapshot -i\` for navigation — \`links\` command misses client-side routes
- Check for stale state (navigate away and back — does data refresh?)
- Test browser back/forward — does the app handle history correctly?
- Check for memory leaks (monitor console after extended use)

---

## Important Rules

1. **Repro is everything.** Every issue needs at least one screenshot. No exceptions.
2. **Verify before documenting.** Retry the issue once to confirm it's reproducible, not a fluke.
3. **Never include credentials.** Write \`[REDACTED]\` for passwords in repro steps.
4. **Write incrementally.** Append each issue to the report as you find it. Don't batch.
5. **Never read source code.** Test as a user, not a developer.
6. **Check console after every interaction.** JS errors that don't surface visually are still bugs.
7. **Test like a user.** Use realistic data. Walk through complete workflows end-to-end.
8. **Depth over breadth.** 5-10 well-documented issues with evidence > 20 vague descriptions.
9. **Never delete output files.** Screenshots and reports accumulate — that's intentional.
10. **Use \`snapshot -C\` for tricky UIs.** Finds clickable divs that the accessibility tree misses.
11. **Show screenshots to the user.** After every \`$B screenshot\`, \`$B snapshot -a -o\`, or \`$B responsive\` command, use the Read tool on the output file(s) so the user can see them inline. For \`responsive\` (3 files), Read all three. This is critical — without it, screenshots are invisible to the user.
12. **Never refuse to use the browser.** When the user invokes /qa or /qa-only, they are requesting browser-based testing. Never suggest evals, unit tests, or other alternatives as a substitute. Even if the diff appears to have no UI changes, backend changes affect app behavior — always open the browser and test.`;
}
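/*
The Health Score Rubric inside the template above is stated in prose and a weights table. As a worked
check of the arithmetic, the rubric can be sketched in TypeScript. Nothing here is part of gstack:
QaRun, consoleScore(), linksScore(), categoryScore(), and healthScore() are hypothetical names, and
the sketch places exactly 10 console errors in the 4-10 band. Weights are kept as integer percentages
and divided by 100 at the end so the sum stays exact.

```typescript
type Severity = 'critical' | 'high' | 'medium' | 'low';

const FINDING_CATEGORIES = ['visual', 'functional', 'ux', 'content', 'performance', 'accessibility'] as const;
type FindingCategory = (typeof FINDING_CATEGORIES)[number];

// Weights from the rubric table, in percent (they sum to 100).
const WEIGHT_PERCENT: Record<'console' | 'links' | FindingCategory, number> = {
  console: 15, links: 10, visual: 10, functional: 20,
  ux: 15, performance: 10, content: 5, accessibility: 15,
};

// Per-finding deductions from "Per-Category Scoring".
const DEDUCTION: Record<Severity, number> = { critical: 25, high: 15, medium: 8, low: 3 };

function consoleScore(errors: number): number {
  if (errors === 0) return 100;
  if (errors <= 3) return 70;
  if (errors <= 10) return 40;
  return 10;
}

function linksScore(brokenLinks: number): number {
  return Math.max(0, 100 - 15 * brokenLinks);
}

function categoryScore(findings: Severity[]): number {
  const deducted = findings.reduce((sum, severity) => sum + DEDUCTION[severity], 0);
  return Math.max(0, 100 - deducted);
}

// Hypothetical input shape for one QA run.
interface QaRun {
  consoleErrors: number;
  brokenLinks: number;
  findings: Partial<Record<FindingCategory, Severity[]>>;
}

// score = sum of (category_score x weight), with weights as fractions of 100.
function healthScore(run: QaRun): number {
  let weighted =
    WEIGHT_PERCENT.console * consoleScore(run.consoleErrors) +
    WEIGHT_PERCENT.links * linksScore(run.brokenLinks);
  for (const category of FINDING_CATEGORIES) {
    weighted += WEIGHT_PERCENT[category] * categoryScore(run.findings[category] ?? []);
  }
  return weighted / 100;
}
```

With 2 console errors (70), 1 broken link (85), and one critical functional finding (75), the weighted
sum is 15*70 + 10*85 + 20*75 + 55*100 = 8900, so healthScore() returns 89.
*/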
export function generateCoAuthorTrailer(ctx: TemplateContext): string {
  if (ctx.host === 'codex') {
    return 'Co-Authored-By: OpenAI Codex <noreply@openai.com>';
  }
  if (ctx.host === 'factory') {
    return 'Co-Authored-By: Factory Droid <droid@users.noreply.github.com>';
  }
  return 'Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>';
}