mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-01 19:25:10 +02:00
dc5e0538e5
* refactor: extract gen-skill-docs into modular resolver architecture Break the 3000-line monolith into 10 domain modules under scripts/resolvers/: types, constants, preamble, utility, browse, design, testing, review, codex-helpers, and index. Each module owns one domain of template generation. The preamble module introduces a 4-tier composition system (T1-T4) so skills only pay for the preamble sections they actually need, reducing token usage for lightweight skills by ~40%. Adds a token budget dashboard that prints after every generation run showing per-skill and total token counts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: tiered preamble — skills only pay for what they use Tag all 23 templates with preamble-tier (T1-T4). Lightweight skills like /browse and /benchmark get a minimal preamble (~40% fewer tokens), while review skills get the full stack. Regenerate all SKILL.md files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: migrate eval storage to project-scoped paths Move eval results and E2E run artifacts from ~/.gstack-dev/evals/ to ~/.gstack/projects/$SLUG/evals/ so each project's eval history lives alongside its other gstack data. Falls back to legacy path if slug detection fails. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: sync package.json version with VERSION after merge Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add WorktreeManager for isolated test environments Reusable platform module (lib/worktree.ts) that creates git worktrees for test isolation and harvests useful changes as patches. Includes SHA-256 dedup, original SHA tracking for committed change detection, and automatic gitignored artifact copying (.agents/, browse/dist/). 12 unit tests covering lifecycle, harvest, dedup, and error handling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: integrate worktree isolation into E2E test infrastructure Add createTestWorktree(), harvestAndCleanup(), and describeWithWorktree() helpers to e2e-helpers.ts. Add harvest field to EvalTestEntry for eval-store integration. Register lib/worktree.ts as a global touchfile. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: run Gemini and Codex E2E tests in worktrees Switch both test suites from cwd: ROOT to worktree isolation. Gemini (--yolo) no longer pollutes the working tree. Codex (read-only) gets worktree for consistency. Useful changes are harvested as patches for cherry-picking. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: skip symlinks in copyDirSync to prevent infinite recursion Adversarial review caught that .claude/skills/gstack may be a symlink back to the repo root, causing copyDirSync to recurse infinitely when copying gitignored artifacts into worktrees. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: bump version and changelog (v0.11.12.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: relax session-awareness assertion to accept structured options The LLM consistently presents well-formatted A/B choices with pros/cons but doesn't always use the exact string "RECOMMENDATION". Accept case-insensitive "recommend", "option a", "which do you want", or "which approach" as equivalent signals of a structured recommendation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
257 lines
7.4 KiB
Cheetah
257 lines
7.4 KiB
Cheetah
---
|
|
name: gstack
|
|
preamble-tier: 1
|
|
version: 1.1.0
|
|
description: |
|
|
Fast headless browser for QA testing and site dogfooding. Navigate pages, interact with
|
|
elements, verify state, diff before/after, take annotated screenshots, test responsive
|
|
layouts, forms, uploads, dialogs, and capture bug evidence. Use when asked to open or
|
|
test a site, verify a deployment, dogfood a user flow, or file a bug with screenshots.
|
|
Also suggest adjacent gstack skills by stage: brainstorm /office-hours; strategy
|
|
/plan-ceo-review; architecture /plan-eng-review; design /plan-design-review or
|
|
/design-consultation; auto-review /autoplan; debugging /investigate; QA /qa; code review
|
|
/review; visual audit /design-review; shipping /ship; docs /document-release; retro
|
|
/retro; second opinion /codex; prod safety /careful or /guard; scoped edits /freeze or
|
|
/unfreeze; gstack upgrades /gstack-upgrade. If the user opts out of suggestions, stop
|
|
and run gstack-config set proactive false; if they opt back in, run gstack-config set
|
|
proactive true.
|
|
allowed-tools:
|
|
- Bash
|
|
- Read
|
|
- AskUserQuestion
|
|
|
|
---
|
|
|
|
{{PREAMBLE}}
|
|
|
|
If `PROACTIVE` is `false`: do NOT proactively suggest other gstack skills during this session.
|
|
Only run skills the user explicitly invokes. This preference persists across sessions via
|
|
`gstack-config`.
|
|
|
|
# gstack browse: QA Testing & Dogfooding
|
|
|
|
Persistent headless Chromium. First call auto-starts (~3s), then ~100-200ms per command.
|
|
Auto-shuts down after 30 min idle. State persists between calls (cookies, tabs, sessions).
|
|
|
|
{{BROWSE_SETUP}}
|
|
|
|
## IMPORTANT
|
|
|
|
- Use the compiled binary via Bash: `$B <command>`
|
|
- NEVER use `mcp__claude-in-chrome__*` tools. They are slow and unreliable.
|
|
- Browser persists between calls — cookies, login sessions, and tabs carry over.
|
|
- Dialogs (alert/confirm/prompt) are auto-accepted by default — no browser lockup.
|
|
- **Show screenshots:** After `$B screenshot`, `$B snapshot -a -o`, or `$B responsive`, always use the Read tool on the output PNG(s) so the user can see them. Without this, screenshots are invisible.
|
|
|
|
## QA Workflows
|
|
|
|
### Test a user flow (login, signup, checkout, etc.)
|
|
|
|
```bash
|
|
# 1. Go to the page
|
|
$B goto https://app.example.com/login
|
|
|
|
# 2. See what's interactive
|
|
$B snapshot -i
|
|
|
|
# 3. Fill the form using refs
|
|
$B fill @e3 "test@example.com"
|
|
$B fill @e4 "password123"
|
|
$B click @e5
|
|
|
|
# 4. Verify it worked
|
|
$B snapshot -D # diff shows what changed after clicking
|
|
$B is visible ".dashboard" # assert the dashboard appeared
|
|
$B screenshot /tmp/after-login.png
|
|
```
|
|
|
|
### Verify a deployment / check prod
|
|
|
|
```bash
|
|
$B goto https://yourapp.com
|
|
$B text # read the page — does it load?
|
|
$B console # any JS errors?
|
|
$B network # any failed requests?
|
|
$B js "document.title" # correct title?
|
|
$B is visible ".hero-section" # key elements present?
|
|
$B screenshot /tmp/prod-check.png
|
|
```
|
|
|
|
### Dogfood a feature end-to-end
|
|
|
|
```bash
|
|
# Navigate to the feature
|
|
$B goto https://app.example.com/new-feature
|
|
|
|
# Take annotated screenshot — shows every interactive element with labels
|
|
$B snapshot -i -a -o /tmp/feature-annotated.png
|
|
|
|
# Find ALL clickable things (including divs with cursor:pointer)
|
|
$B snapshot -C
|
|
|
|
# Walk through the flow
|
|
$B snapshot -i # baseline
|
|
$B click @e3 # interact
|
|
$B snapshot -D # what changed? (unified diff)
|
|
|
|
# Check element states
|
|
$B is visible ".success-toast"
|
|
$B is enabled "#next-step-btn"
|
|
$B is checked "#agree-checkbox"
|
|
|
|
# Check console for errors after interactions
|
|
$B console
|
|
```
|
|
|
|
### Test responsive layouts
|
|
|
|
```bash
|
|
# Quick: 3 screenshots at mobile/tablet/desktop
|
|
$B goto https://yourapp.com
|
|
$B responsive /tmp/layout
|
|
|
|
# Manual: specific viewport
|
|
$B viewport 375x812 # iPhone
|
|
$B screenshot /tmp/mobile.png
|
|
$B viewport 1440x900 # Desktop
|
|
$B screenshot /tmp/desktop.png
|
|
|
|
# Element screenshot (crop to specific element)
|
|
$B screenshot "#hero-banner" /tmp/hero.png
|
|
$B snapshot -i
|
|
$B screenshot @e3 /tmp/button.png
|
|
|
|
# Region crop
|
|
$B screenshot --clip 0,0,800,600 /tmp/above-fold.png
|
|
|
|
# Viewport only (no scroll)
|
|
$B screenshot --viewport /tmp/viewport.png
|
|
```
|
|
|
|
### Test file upload
|
|
|
|
```bash
|
|
$B goto https://app.example.com/upload
|
|
$B snapshot -i
|
|
$B upload @e3 /path/to/test-file.pdf
|
|
$B is visible ".upload-success"
|
|
$B screenshot /tmp/upload-result.png
|
|
```
|
|
|
|
### Test forms with validation
|
|
|
|
```bash
|
|
$B goto https://app.example.com/form
|
|
$B snapshot -i
|
|
|
|
# Submit empty — check validation errors appear
|
|
$B click @e10 # submit button
|
|
$B snapshot -D # diff shows error messages appeared
|
|
$B is visible ".error-message"
|
|
|
|
# Fill and resubmit
|
|
$B fill @e3 "valid input"
|
|
$B click @e10
|
|
$B snapshot -D # diff shows errors gone, success state
|
|
```
|
|
|
|
### Test dialogs (delete confirmations, prompts)
|
|
|
|
```bash
|
|
# Set up dialog handling BEFORE triggering
|
|
$B dialog-accept # will auto-accept next alert/confirm
|
|
$B click "#delete-button" # triggers confirmation dialog
|
|
$B dialog # see what dialog appeared
|
|
$B snapshot -D # verify the item was deleted
|
|
|
|
# For prompts that need input
|
|
$B dialog-accept "my answer" # accept with text
|
|
$B click "#rename-button" # triggers prompt
|
|
```
|
|
|
|
### Test authenticated pages (import real browser cookies)
|
|
|
|
```bash
|
|
# Import cookies from your real browser (opens interactive picker)
|
|
$B cookie-import-browser
|
|
|
|
# Or import a specific domain directly
|
|
$B cookie-import-browser comet --domain .github.com
|
|
|
|
# Now test authenticated pages
|
|
$B goto https://github.com/settings/profile
|
|
$B snapshot -i
|
|
$B screenshot /tmp/github-profile.png
|
|
```
|
|
|
|
### Compare two pages / environments
|
|
|
|
```bash
|
|
$B diff https://staging.app.com https://prod.app.com
|
|
```
|
|
|
|
### Multi-step chain (efficient for long flows)
|
|
|
|
```bash
|
|
echo '[
|
|
["goto","https://app.example.com"],
|
|
["snapshot","-i"],
|
|
["fill","@e3","test@test.com"],
|
|
["fill","@e4","password"],
|
|
["click","@e5"],
|
|
["snapshot","-D"],
|
|
["screenshot","/tmp/result.png"]
|
|
]' | $B chain
|
|
```
|
|
|
|
## Quick Assertion Patterns
|
|
|
|
```bash
|
|
# Element exists and is visible
|
|
$B is visible ".modal"
|
|
|
|
# Button is enabled/disabled
|
|
$B is enabled "#submit-btn"
|
|
$B is disabled "#submit-btn"
|
|
|
|
# Checkbox state
|
|
$B is checked "#agree"
|
|
|
|
# Input is editable
|
|
$B is editable "#name-field"
|
|
|
|
# Element has focus
|
|
$B is focused "#search-input"
|
|
|
|
# Page contains text
|
|
$B js "document.body.textContent.includes('Success')"
|
|
|
|
# Element count
|
|
$B js "document.querySelectorAll('.list-item').length"
|
|
|
|
# Specific attribute value
|
|
$B attrs "#logo" # returns all attributes as JSON
|
|
|
|
# CSS property
|
|
$B css ".button" "background-color"
|
|
```
|
|
|
|
## Snapshot System
|
|
|
|
{{SNAPSHOT_FLAGS}}
|
|
|
|
## Command Reference
|
|
|
|
{{COMMAND_REFERENCE}}
|
|
|
|
## Tips
|
|
|
|
1. **Navigate once, query many times.** `goto` loads the page; then `text`, `js`, `screenshot` all hit the loaded page instantly.
|
|
2. **Use `snapshot -i` first.** See all interactive elements, then click/fill by ref. No CSS selector guessing.
|
|
3. **Use `snapshot -D` to verify.** Baseline → action → diff. See exactly what changed.
|
|
4. **Use `is` for assertions.** `is visible .modal` is faster and more reliable than parsing page text.
|
|
5. **Use `snapshot -a` for evidence.** Annotated screenshots are great for bug reports.
|
|
6. **Use `snapshot -C` for tricky UIs.** Finds clickable divs that the accessibility tree misses.
|
|
7. **Check `console` after actions.** Catch JS errors that don't surface visually.
|
|
8. **Use `chain` for long flows.** Single command, no per-step CLI overhead.
|