mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-01 19:25:10 +02:00
Merge remote-tracking branch 'origin/main' into v0.3.6-qa-upgrades
# Conflicts: # test/skill-e2e.test.ts
This commit is contained in:
+36
-1
@@ -1,11 +1,46 @@
|
||||
# Changelog
|
||||
|
||||
## Unreleased — 2026-03-14
|
||||
## 0.3.4 — 2026-03-13
|
||||
|
||||
### Added
|
||||
- **Daily update check** — all 9 skills now check for new versions once per day via `bin/gstack-update-check` (pure bash, <5ms cached). Prompts user via AskUserQuestion with option to upgrade or defer 24h.
|
||||
- **`/gstack-upgrade` skill** — standalone upgrade command that detects install type (global-git, local-git, vendored), upgrades, and shows a "What's New" summary from CHANGELOG
|
||||
- **"Just upgraded" confirmation** — after upgrading, the next skill invocation shows "Running gstack v{new} (just updated!)" via `~/.gstack/just-upgraded-from` marker
|
||||
- **`AskUserQuestion` added to 5 skills** — gstack (root), browse, qa, retro, setup-browser-cookies now have AskUserQuestion in allowed-tools for upgrade prompts
|
||||
- **`Bash` added to plan-eng-review** — enables the update check preamble to run in plan review sessions
|
||||
- `browse/test/gstack-update-check.test.ts` — 10 test cases covering all script branch paths with `GSTACK_REMOTE_URL` env var for test isolation
|
||||
- `TODOS.md` for tracking deferred work
|
||||
|
||||
### Changed
|
||||
- **Version check is now one system** — removed SHA-based `checkVersion()` from `browse/src/find-browse.ts` (~120 lines deleted) and `browse/test/find-browse.test.ts` (~100 lines deleted). Replaced by `bin/gstack-update-check` bash script using semver VERSION comparison with 24h cache.
|
||||
- Simplified `qa/SKILL.md` and `setup-browser-cookies/SKILL.md` setup blocks — removed old `BROWSE_OUTPUT`/`META` parsing, now use simple `find-browse` call
|
||||
- Updated `browse/bin/find-browse` shim comments to reflect simplified role (binary locator only)
|
||||
|
||||
### Removed
|
||||
- `checkVersion()`, `readCache()`, `writeCache()`, `fetchRemoteSHA()`, `resolveSkillDir()`, `CacheEntry` interface from `browse/src/find-browse.ts`
|
||||
- `META:UPDATE_AVAILABLE` protocol from find-browse output
|
||||
- Old META-based upgrade instructions from qa and setup-browser-cookies SKILL.md files
|
||||
- Legacy `/tmp/gstack-latest-version` cache file (cleaned up by `setup` script)
|
||||
|
||||
## 0.3.5 — 2026-03-14
|
||||
|
||||
### Fixed
|
||||
- **Browse binary discovery broken for agents** — replaced `find-browse` indirection with explicit `browse/dist/browse` path in SKILL.md setup blocks. Agents were guessing `bin/browse` (wrong) instead of running `find-browse` to discover `browse/dist/browse` (correct).
|
||||
- **Update check exit code 1 misleading agents** — `[ -n "$_UPD" ] && echo "$_UPD"` returned exit code 1 when no update available, causing agents to think gstack was broken. Added `|| true`.
|
||||
- **browse/SKILL.md missing setup block** — `/browse` used `$B` in every example but never defined it. Added `{{BROWSE_SETUP}}` placeholder.
|
||||
|
||||
### Changed
|
||||
- Enriched 14 command descriptions with specific arg formats, valid values, error behavior, and return types
|
||||
- Fixed `header` usage from `<name> <value>` to `<name>:<value>` (matching actual implementation)
|
||||
- Added `cookie` usage syntax: `cookie <name>=<value>`
|
||||
- **Template system expanded** — added `{{UPDATE_CHECK}}` and `{{BROWSE_SETUP}}` placeholders to `gen-skill-docs.ts`. Converted `qa/SKILL.md` and `setup-browser-cookies/SKILL.md` to `.tmpl` templates. All 4 browse-using skills now generate from a single source of truth.
|
||||
- Setup block now checks workspace-local path first (for development), then falls back to global `~/.claude/skills/gstack/browse/dist/browse`
|
||||
|
||||
### Added
|
||||
- 3 new e2e test cases for SKILL.md setup flow: happy path, NEEDS_SETUP, non-git-repo
|
||||
- LLM eval for setup block clarity (actionability + clarity >= 4)
|
||||
- `no such file or directory.*browse` error pattern in session-runner
|
||||
- TODO: convert remaining 5 non-browse skills to .tmpl files
|
||||
- Enriched 4 snapshot flag descriptions with defaults, output paths, and behavior details
|
||||
- Snapshot flags section now shows long flag names (`-i / --interactive`) alongside short
|
||||
- Added ref numbering explanation and output format example to snapshot docs
|
||||
|
||||
@@ -10,11 +10,21 @@ description: |
|
||||
allowed-tools:
|
||||
- Bash
|
||||
- Read
|
||||
- AskUserQuestion
|
||||
|
||||
---
|
||||
<!-- AUTO-GENERATED from SKILL.md.tmpl — do not edit directly -->
|
||||
<!-- Regenerate: bun run gen:skill-docs -->
|
||||
|
||||
## Update Check (run first)
|
||||
|
||||
```bash
|
||||
_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
|
||||
[ -n "$_UPD" ] && echo "$_UPD" || true
|
||||
```
|
||||
|
||||
If output shows `UPGRADE_AVAILABLE <old> <new>`: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (AskUserQuestion → upgrade if yes, `touch ~/.gstack/last-update-check` if no). If `JUST_UPGRADED <from> <to>`: tell user "Running gstack v{to} (just updated!)" and continue.
|
||||
|
||||
# gstack browse: QA Testing & Dogfooding
|
||||
|
||||
Persistent headless Chromium. First call auto-starts (~3s), then ~100-200ms per command.
|
||||
@@ -23,8 +33,11 @@ Auto-shuts down after 30 min idle. State persists between calls (cookies, tabs,
|
||||
## SETUP (run this check BEFORE any browse command)
|
||||
|
||||
```bash
|
||||
B=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null)
|
||||
if [ -n "$B" ]; then
|
||||
_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
|
||||
B=""
|
||||
[ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/gstack/browse/dist/browse" ] && B="$_ROOT/.claude/skills/gstack/browse/dist/browse"
|
||||
[ -z "$B" ] && B=~/.claude/skills/gstack/browse/dist/browse
|
||||
if [ -x "$B" ]; then
|
||||
echo "READY: $B"
|
||||
else
|
||||
echo "NEEDS_SETUP"
|
||||
@@ -48,8 +61,6 @@ If `NEEDS_SETUP`:
|
||||
### Test a user flow (login, signup, checkout, etc.)
|
||||
|
||||
```bash
|
||||
B=~/.claude/skills/gstack/browse/dist/browse
|
||||
|
||||
# 1. Go to the page
|
||||
$B goto https://app.example.com/login
|
||||
|
||||
|
||||
+4
-17
@@ -10,29 +10,18 @@ description: |
|
||||
allowed-tools:
|
||||
- Bash
|
||||
- Read
|
||||
- AskUserQuestion
|
||||
|
||||
---
|
||||
|
||||
{{UPDATE_CHECK}}
|
||||
|
||||
# gstack browse: QA Testing & Dogfooding
|
||||
|
||||
Persistent headless Chromium. First call auto-starts (~3s), then ~100-200ms per command.
|
||||
Auto-shuts down after 30 min idle. State persists between calls (cookies, tabs, sessions).
|
||||
|
||||
## SETUP (run this check BEFORE any browse command)
|
||||
|
||||
```bash
|
||||
B=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null)
|
||||
if [ -n "$B" ]; then
|
||||
echo "READY: $B"
|
||||
else
|
||||
echo "NEEDS_SETUP"
|
||||
fi
|
||||
```
|
||||
|
||||
If `NEEDS_SETUP`:
|
||||
1. Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait.
|
||||
2. Run: `cd <SKILL_DIR> && ./setup`
|
||||
3. If `bun` is not installed: `curl -fsSL https://bun.sh/install | bash`
|
||||
{{BROWSE_SETUP}}
|
||||
|
||||
## IMPORTANT
|
||||
|
||||
@@ -46,8 +35,6 @@ If `NEEDS_SETUP`:
|
||||
### Test a user flow (login, signup, checkout, etc.)
|
||||
|
||||
```bash
|
||||
B=~/.claude/skills/gstack/browse/dist/browse
|
||||
|
||||
# 1. Go to the page
|
||||
$B goto https://app.example.com/login
|
||||
|
||||
|
||||
@@ -0,0 +1,24 @@
|
||||
# TODOS
|
||||
|
||||
## Auto-upgrade mode (zero-prompt)
|
||||
|
||||
**What:** Add a `GSTACK_AUTO_UPGRADE=1` env var or `~/.gstack/config` option that skips the AskUserQuestion prompt and upgrades automatically when a new version is detected.
|
||||
|
||||
**Why:** Power users and CI environments may want zero-friction upgrades without being asked every time.
|
||||
|
||||
**Context:** The current upgrade system (v0.3.4) always prompts via AskUserQuestion. This TODO adds an opt-in bypass. Implementation is ~10 lines in the preamble instructions: check for the env var/config before calling AskUserQuestion, and if set, go straight to the upgrade flow. Depends on the full upgrade system being stable first — wait for user feedback on the prompt-based flow before adding this.
|
||||
|
||||
**Effort:** S (small)
|
||||
**Priority:** P3 (nice-to-have, revisit after adoption data)
|
||||
|
||||
## Convert remaining skills to .tmpl files
|
||||
|
||||
**What:** Convert ship/, review/, plan-ceo-review/, plan-eng-review/, retro/ SKILL.md files to .tmpl templates using the `{{UPDATE_CHECK}}` placeholder.
|
||||
|
||||
**Why:** These 5 skills still have the update check preamble copy-pasted. When the preamble changes (like the `|| true` fix in v0.3.5), all 5 need manual updates. The `{{UPDATE_CHECK}}` resolver already exists in `scripts/gen-skill-docs.ts` — these skills just need to be converted.
|
||||
|
||||
**Context:** The browse-using skills (SKILL.md, browse/, qa/, setup-browser-cookies/) were converted to .tmpl in v0.3.5. The remaining 5 skills only use `{{UPDATE_CHECK}}` (no `{{BROWSE_SETUP}}`), so the conversion is mechanical: replace the preamble with `{{UPDATE_CHECK}}`, add the path to `findTemplates()` in `scripts/gen-skill-docs.ts`, and commit both .tmpl + generated .md.
|
||||
|
||||
**Depends on:** v0.3.5 shipping first (the `{{UPDATE_CHECK}}` resolver).
|
||||
**Effort:** S (small, ~20 min)
|
||||
**Priority:** P2 (prevents drift on next preamble change)
|
||||
Executable
+88
@@ -0,0 +1,88 @@
|
||||
#!/usr/bin/env bash
|
||||
# gstack-update-check — daily version check for all skills.
|
||||
#
|
||||
# Output (one line, or nothing):
|
||||
# JUST_UPGRADED <old> <new> — marker found from recent upgrade
|
||||
# UPGRADE_AVAILABLE <old> <new> — remote VERSION differs from local
|
||||
# (nothing) — up to date or check skipped
|
||||
#
|
||||
# Env overrides (for testing):
|
||||
# GSTACK_DIR — override auto-detected gstack root
|
||||
# GSTACK_REMOTE_URL — override remote VERSION URL
|
||||
# GSTACK_STATE_DIR — override ~/.gstack state directory
|
||||
set -euo pipefail
|
||||
|
||||
GSTACK_DIR="${GSTACK_DIR:-$(cd "$(dirname "$0")/.." && pwd)}"
|
||||
STATE_DIR="${GSTACK_STATE_DIR:-$HOME/.gstack}"
|
||||
CACHE_FILE="$STATE_DIR/last-update-check"
|
||||
MARKER_FILE="$STATE_DIR/just-upgraded-from"
|
||||
VERSION_FILE="$GSTACK_DIR/VERSION"
|
||||
REMOTE_URL="${GSTACK_REMOTE_URL:-https://raw.githubusercontent.com/garrytan/gstack/main/VERSION}"
|
||||
|
||||
# ─── Step 1: Read local version ──────────────────────────────
|
||||
LOCAL=""
|
||||
if [ -f "$VERSION_FILE" ]; then
|
||||
LOCAL="$(cat "$VERSION_FILE" 2>/dev/null | tr -d '[:space:]')"
|
||||
fi
|
||||
if [ -z "$LOCAL" ]; then
|
||||
exit 0 # No VERSION file → skip check
|
||||
fi
|
||||
|
||||
# ─── Step 2: Check "just upgraded" marker ─────────────────────
|
||||
if [ -f "$MARKER_FILE" ]; then
|
||||
OLD="$(cat "$MARKER_FILE" 2>/dev/null | tr -d '[:space:]')"
|
||||
rm -f "$MARKER_FILE"
|
||||
mkdir -p "$STATE_DIR"
|
||||
echo "UP_TO_DATE $LOCAL" > "$CACHE_FILE"
|
||||
if [ -n "$OLD" ]; then
|
||||
echo "JUST_UPGRADED $OLD $LOCAL"
|
||||
fi
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# ─── Step 3: Check cache freshness (24h = 1440 min) ──────────
|
||||
if [ -f "$CACHE_FILE" ]; then
|
||||
# Cache is fresh if modified within 1440 minutes
|
||||
STALE=$(find "$CACHE_FILE" -mmin +1440 2>/dev/null || true)
|
||||
if [ -z "$STALE" ]; then
|
||||
# Cache is fresh — read it
|
||||
CACHED="$(cat "$CACHE_FILE" 2>/dev/null || true)"
|
||||
case "$CACHED" in
|
||||
UP_TO_DATE*)
|
||||
exit 0
|
||||
;;
|
||||
UPGRADE_AVAILABLE*)
|
||||
# Verify local version still matches cached old version
|
||||
CACHED_OLD="$(echo "$CACHED" | awk '{print $2}')"
|
||||
if [ "$CACHED_OLD" = "$LOCAL" ]; then
|
||||
echo "$CACHED"
|
||||
exit 0
|
||||
fi
|
||||
# Local version changed (manual upgrade?) — fall through to re-check
|
||||
;;
|
||||
esac
|
||||
fi
|
||||
fi
|
||||
|
||||
# ─── Step 4: Slow path — fetch remote version ────────────────
|
||||
mkdir -p "$STATE_DIR"
|
||||
|
||||
REMOTE=""
|
||||
REMOTE="$(curl -sf --max-time 5 "$REMOTE_URL" 2>/dev/null || true)"
|
||||
REMOTE="$(echo "$REMOTE" | tr -d '[:space:]')"
|
||||
|
||||
# Validate: must look like a version number (reject HTML error pages)
|
||||
if ! echo "$REMOTE" | grep -qE '^[0-9]+\.[0-9.]+$'; then
|
||||
# Invalid or empty response — assume up to date
|
||||
echo "UP_TO_DATE $LOCAL" > "$CACHE_FILE"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
if [ "$LOCAL" = "$REMOTE" ]; then
|
||||
echo "UP_TO_DATE $LOCAL" > "$CACHE_FILE"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# Versions differ — upgrade available
|
||||
echo "UPGRADE_AVAILABLE $LOCAL $REMOTE" > "$CACHE_FILE"
|
||||
echo "UPGRADE_AVAILABLE $LOCAL $REMOTE"
|
||||
@@ -10,16 +10,45 @@ description: |
|
||||
allowed-tools:
|
||||
- Bash
|
||||
- Read
|
||||
- AskUserQuestion
|
||||
|
||||
---
|
||||
<!-- AUTO-GENERATED from SKILL.md.tmpl — do not edit directly -->
|
||||
<!-- Regenerate: bun run gen:skill-docs -->
|
||||
|
||||
## Update Check (run first)
|
||||
|
||||
```bash
|
||||
_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
|
||||
[ -n "$_UPD" ] && echo "$_UPD" || true
|
||||
```
|
||||
|
||||
If output shows `UPGRADE_AVAILABLE <old> <new>`: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (AskUserQuestion → upgrade if yes, `touch ~/.gstack/last-update-check` if no). If `JUST_UPGRADED <from> <to>`: tell user "Running gstack v{to} (just updated!)" and continue.
|
||||
|
||||
# browse: QA Testing & Dogfooding
|
||||
|
||||
Persistent headless Chromium. First call auto-starts (~3s), then ~100ms per command.
|
||||
State persists between calls (cookies, tabs, login sessions).
|
||||
|
||||
## SETUP (run this check BEFORE any browse command)
|
||||
|
||||
```bash
|
||||
_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
|
||||
B=""
|
||||
[ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/gstack/browse/dist/browse" ] && B="$_ROOT/.claude/skills/gstack/browse/dist/browse"
|
||||
[ -z "$B" ] && B=~/.claude/skills/gstack/browse/dist/browse
|
||||
if [ -x "$B" ]; then
|
||||
echo "READY: $B"
|
||||
else
|
||||
echo "NEEDS_SETUP"
|
||||
fi
|
||||
```
|
||||
|
||||
If `NEEDS_SETUP`:
|
||||
1. Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait.
|
||||
2. Run: `cd <SKILL_DIR> && ./setup`
|
||||
3. If `bun` is not installed: `curl -fsSL https://bun.sh/install | bash`
|
||||
|
||||
## Core QA Patterns
|
||||
|
||||
### 1. Verify a page loads correctly
|
||||
|
||||
@@ -10,14 +10,19 @@ description: |
|
||||
allowed-tools:
|
||||
- Bash
|
||||
- Read
|
||||
- AskUserQuestion
|
||||
|
||||
---
|
||||
|
||||
{{UPDATE_CHECK}}
|
||||
|
||||
# browse: QA Testing & Dogfooding
|
||||
|
||||
Persistent headless Chromium. First call auto-starts (~3s), then ~100ms per command.
|
||||
State persists between calls (cookies, tabs, login sessions).
|
||||
|
||||
{{BROWSE_SETUP}}
|
||||
|
||||
## Core QA Patterns
|
||||
|
||||
### 1. Verify a page loads correctly
|
||||
|
||||
@@ -1,11 +1,11 @@
|
||||
#!/bin/bash
|
||||
# Shim: delegates to compiled find-browse binary, falls back to basic discovery.
|
||||
# The compiled binary adds version checking and META signal support.
|
||||
# The compiled binary handles git root detection for workspace-local installs.
|
||||
DIR="$(cd "$(dirname "$0")/.." && pwd)/dist"
|
||||
if test -x "$DIR/find-browse"; then
|
||||
exec "$DIR/find-browse" "$@"
|
||||
fi
|
||||
# Fallback: basic discovery (no version check)
|
||||
# Fallback: basic discovery
|
||||
ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
|
||||
if [ -n "$ROOT" ] && test -x "$ROOT/.claude/skills/gstack/browse/dist/browse"; then
|
||||
echo "$ROOT/.claude/skills/gstack/browse/dist/browse"
|
||||
|
||||
+3
-128
@@ -1,28 +1,14 @@
|
||||
/**
|
||||
* find-browse — locate the gstack browse binary + check for updates.
|
||||
* find-browse — locate the gstack browse binary.
|
||||
*
|
||||
* Compiled to browse/dist/find-browse (standalone binary, no bun runtime needed).
|
||||
*
|
||||
* Output protocol:
|
||||
* Line 1: /path/to/binary (always present)
|
||||
* Line 2+: META:<TYPE> <json-payload> (optional, 0 or more)
|
||||
*
|
||||
* META types:
|
||||
* META:UPDATE_AVAILABLE — local binary is behind origin/main
|
||||
*
|
||||
* All version checks are best-effort: network failures, missing files, and
|
||||
* cache errors degrade gracefully to outputting only the binary path.
|
||||
* Outputs the absolute path to the browse binary on stdout, or exits 1 if not found.
|
||||
*/
|
||||
|
||||
import { existsSync } from 'fs';
|
||||
import { readFileSync, writeFileSync } from 'fs';
|
||||
import { join, dirname } from 'path';
|
||||
import { join } from 'path';
|
||||
import { homedir } from 'os';
|
||||
|
||||
const REPO_URL = 'https://github.com/garrytan/gstack.git';
|
||||
const CACHE_PATH = '/tmp/gstack-latest-version';
|
||||
const CACHE_TTL = 14400; // 4 hours in seconds
|
||||
|
||||
// ─── Binary Discovery ───────────────────────────────────────────
|
||||
|
||||
function getGitRoot(): string | null {
|
||||
@@ -55,114 +41,6 @@ export function locateBinary(): string | null {
|
||||
return null;
|
||||
}
|
||||
|
||||
// ─── Version Check ──────────────────────────────────────────────
|
||||
|
||||
interface CacheEntry {
|
||||
sha: string;
|
||||
timestamp: number;
|
||||
}
|
||||
|
||||
function readCache(): CacheEntry | null {
|
||||
try {
|
||||
const content = readFileSync(CACHE_PATH, 'utf-8').trim();
|
||||
const parts = content.split(/\s+/);
|
||||
if (parts.length < 2) return null;
|
||||
const sha = parts[0];
|
||||
const timestamp = parseInt(parts[1], 10);
|
||||
if (!sha || isNaN(timestamp)) return null;
|
||||
// Validate SHA is hex
|
||||
if (!/^[0-9a-f]{40}$/i.test(sha)) return null;
|
||||
return { sha, timestamp };
|
||||
} catch {
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
function writeCache(sha: string, timestamp: number): void {
|
||||
try {
|
||||
writeFileSync(CACHE_PATH, `${sha} ${timestamp}\n`);
|
||||
} catch {
|
||||
// Cache write failure is non-fatal
|
||||
}
|
||||
}
|
||||
|
||||
function fetchRemoteSHA(): string | null {
|
||||
try {
|
||||
const proc = Bun.spawnSync(['git', 'ls-remote', REPO_URL, 'refs/heads/main'], {
|
||||
stdout: 'pipe',
|
||||
stderr: 'pipe',
|
||||
timeout: 10_000, // 10s timeout
|
||||
});
|
||||
if (proc.exitCode !== 0) return null;
|
||||
const output = proc.stdout.toString().trim();
|
||||
const sha = output.split(/\s+/)[0];
|
||||
if (!sha || !/^[0-9a-f]{40}$/i.test(sha)) return null;
|
||||
return sha;
|
||||
} catch {
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
function resolveSkillDir(binaryPath: string): string | null {
|
||||
const home = homedir();
|
||||
const globalPrefix = join(home, '.claude', 'skills', 'gstack');
|
||||
if (binaryPath.startsWith(globalPrefix)) return globalPrefix;
|
||||
|
||||
// Workspace-local: binary is at $ROOT/.claude/skills/gstack/browse/dist/browse
|
||||
// Skill dir is $ROOT/.claude/skills/gstack
|
||||
const parts = binaryPath.split('/.claude/skills/gstack/');
|
||||
if (parts.length === 2) return parts[0] + '/.claude/skills/gstack';
|
||||
|
||||
return null;
|
||||
}
|
||||
|
||||
export function checkVersion(binaryDir: string): string | null {
|
||||
// Read local version
|
||||
const versionFile = join(binaryDir, '.version');
|
||||
let localSHA: string;
|
||||
try {
|
||||
localSHA = readFileSync(versionFile, 'utf-8').trim();
|
||||
} catch {
|
||||
return null; // No .version file → skip check
|
||||
}
|
||||
if (!localSHA) return null;
|
||||
|
||||
const now = Math.floor(Date.now() / 1000);
|
||||
|
||||
// Check cache
|
||||
let remoteSHA: string | null = null;
|
||||
const cache = readCache();
|
||||
if (cache && (now - cache.timestamp) < CACHE_TTL) {
|
||||
remoteSHA = cache.sha;
|
||||
}
|
||||
|
||||
// Fetch from remote if cache miss
|
||||
if (!remoteSHA) {
|
||||
remoteSHA = fetchRemoteSHA();
|
||||
if (remoteSHA) {
|
||||
writeCache(remoteSHA, now);
|
||||
}
|
||||
}
|
||||
|
||||
if (!remoteSHA) return null; // Offline or error → skip check
|
||||
|
||||
// Compare
|
||||
if (localSHA === remoteSHA) return null; // Up to date
|
||||
|
||||
// Determine skill directory for update command
|
||||
const binaryPath = join(binaryDir, 'browse');
|
||||
const skillDir = resolveSkillDir(binaryPath);
|
||||
if (!skillDir) return null;
|
||||
|
||||
const payload = JSON.stringify({
|
||||
current: localSHA.slice(0, 8),
|
||||
latest: remoteSHA.slice(0, 8),
|
||||
command: `cd ${skillDir} && git stash && git fetch origin && git reset --hard origin/main && ./setup`,
|
||||
});
|
||||
|
||||
return `META:UPDATE_AVAILABLE ${payload}`;
|
||||
}
|
||||
|
||||
// ─── Main ───────────────────────────────────────────────────────
|
||||
|
||||
function main() {
|
||||
@@ -173,9 +51,6 @@ function main() {
|
||||
}
|
||||
|
||||
console.log(bin);
|
||||
|
||||
const meta = checkVersion(dirname(bin));
|
||||
if (meta) console.log(meta);
|
||||
}
|
||||
|
||||
main();
|
||||
|
||||
@@ -1,130 +1,10 @@
|
||||
/**
|
||||
* Tests for find-browse version check logic
|
||||
*
|
||||
* Tests the checkVersion() and locateBinary() functions directly.
|
||||
* Uses temp directories with mock .version files and cache files.
|
||||
* Tests for find-browse binary locator.
|
||||
*/
|
||||
|
||||
import { describe, test, expect, beforeEach, afterEach } from 'bun:test';
|
||||
import { checkVersion, locateBinary } from '../src/find-browse';
|
||||
import { mkdtempSync, writeFileSync, rmSync, existsSync, mkdirSync } from 'fs';
|
||||
import { join } from 'path';
|
||||
import { tmpdir } from 'os';
|
||||
|
||||
let tempDir: string;
|
||||
|
||||
beforeEach(() => {
|
||||
tempDir = mkdtempSync(join(tmpdir(), 'find-browse-test-'));
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
rmSync(tempDir, { recursive: true, force: true });
|
||||
// Clean up test cache
|
||||
try { rmSync('/tmp/gstack-latest-version'); } catch {}
|
||||
});
|
||||
|
||||
describe('checkVersion', () => {
|
||||
test('returns null when .version file is missing', () => {
|
||||
const result = checkVersion(tempDir);
|
||||
expect(result).toBeNull();
|
||||
});
|
||||
|
||||
test('returns null when .version file is empty', () => {
|
||||
writeFileSync(join(tempDir, '.version'), '');
|
||||
const result = checkVersion(tempDir);
|
||||
expect(result).toBeNull();
|
||||
});
|
||||
|
||||
test('returns null when .version has only whitespace', () => {
|
||||
writeFileSync(join(tempDir, '.version'), ' \n');
|
||||
const result = checkVersion(tempDir);
|
||||
expect(result).toBeNull();
|
||||
});
|
||||
|
||||
test('returns null when local SHA matches remote (cache hit)', () => {
|
||||
const sha = 'a'.repeat(40);
|
||||
writeFileSync(join(tempDir, '.version'), sha);
|
||||
// Write cache with same SHA, recent timestamp
|
||||
const now = Math.floor(Date.now() / 1000);
|
||||
writeFileSync('/tmp/gstack-latest-version', `${sha} ${now}\n`);
|
||||
|
||||
const result = checkVersion(tempDir);
|
||||
expect(result).toBeNull();
|
||||
});
|
||||
|
||||
test('returns META:UPDATE_AVAILABLE when SHAs differ (cache hit)', () => {
|
||||
const localSha = 'a'.repeat(40);
|
||||
const remoteSha = 'b'.repeat(40);
|
||||
writeFileSync(join(tempDir, '.version'), localSha);
|
||||
// Create a fake browse binary path so resolveSkillDir works
|
||||
const browsePath = join(tempDir, 'browse');
|
||||
writeFileSync(browsePath, '');
|
||||
// Write cache with different SHA, recent timestamp
|
||||
const now = Math.floor(Date.now() / 1000);
|
||||
writeFileSync('/tmp/gstack-latest-version', `${remoteSha} ${now}\n`);
|
||||
|
||||
const result = checkVersion(tempDir);
|
||||
// Result may be null if resolveSkillDir can't determine skill dir from temp path
|
||||
// That's expected — the META signal requires a known skill dir path
|
||||
if (result !== null) {
|
||||
expect(result).toStartWith('META:UPDATE_AVAILABLE');
|
||||
const jsonStr = result.replace('META:UPDATE_AVAILABLE ', '');
|
||||
const payload = JSON.parse(jsonStr);
|
||||
expect(payload.current).toBe('a'.repeat(8));
|
||||
expect(payload.latest).toBe('b'.repeat(8));
|
||||
expect(payload.command).toContain('git stash');
|
||||
expect(payload.command).toContain('git reset --hard origin/main');
|
||||
expect(payload.command).toContain('./setup');
|
||||
}
|
||||
});
|
||||
|
||||
test('uses cached SHA when cache is fresh (< 4hr)', () => {
|
||||
const localSha = 'a'.repeat(40);
|
||||
const remoteSha = 'a'.repeat(40);
|
||||
writeFileSync(join(tempDir, '.version'), localSha);
|
||||
// Cache is 1 hour old — should still be valid
|
||||
const oneHourAgo = Math.floor(Date.now() / 1000) - 3600;
|
||||
writeFileSync('/tmp/gstack-latest-version', `${remoteSha} ${oneHourAgo}\n`);
|
||||
|
||||
const result = checkVersion(tempDir);
|
||||
expect(result).toBeNull(); // SHAs match
|
||||
});
|
||||
|
||||
test('treats expired cache as stale', () => {
|
||||
const localSha = 'a'.repeat(40);
|
||||
writeFileSync(join(tempDir, '.version'), localSha);
|
||||
// Cache is 5 hours old — should be stale
|
||||
const fiveHoursAgo = Math.floor(Date.now() / 1000) - 18000;
|
||||
writeFileSync('/tmp/gstack-latest-version', `${'b'.repeat(40)} ${fiveHoursAgo}\n`);
|
||||
|
||||
// This will try git ls-remote which may fail in test env — that's OK
|
||||
// The important thing is it doesn't use the stale cache value
|
||||
const result = checkVersion(tempDir);
|
||||
// Result depends on whether git ls-remote succeeds in test environment
|
||||
// If offline, returns null (graceful degradation)
|
||||
expect(result === null || typeof result === 'string').toBe(true);
|
||||
});
|
||||
|
||||
test('handles corrupt cache file gracefully', () => {
|
||||
const localSha = 'a'.repeat(40);
|
||||
writeFileSync(join(tempDir, '.version'), localSha);
|
||||
writeFileSync('/tmp/gstack-latest-version', 'garbage data here');
|
||||
|
||||
// Should not throw, should treat as stale
|
||||
const result = checkVersion(tempDir);
|
||||
expect(result === null || typeof result === 'string').toBe(true);
|
||||
});
|
||||
|
||||
test('handles cache with invalid SHA gracefully', () => {
|
||||
const localSha = 'a'.repeat(40);
|
||||
writeFileSync(join(tempDir, '.version'), localSha);
|
||||
writeFileSync('/tmp/gstack-latest-version', `not-a-sha ${Math.floor(Date.now() / 1000)}\n`);
|
||||
|
||||
// Invalid SHA should be treated as no cache
|
||||
const result = checkVersion(tempDir);
|
||||
expect(result === null || typeof result === 'string').toBe(true);
|
||||
});
|
||||
});
|
||||
import { describe, test, expect } from 'bun:test';
|
||||
import { locateBinary } from '../src/find-browse';
|
||||
import { existsSync } from 'fs';
|
||||
|
||||
describe('locateBinary', () => {
|
||||
test('returns null when no binary exists at known paths', () => {
|
||||
|
||||
@@ -0,0 +1,188 @@
|
||||
/**
|
||||
* Tests for bin/gstack-update-check bash script.
|
||||
*
|
||||
* Uses Bun.spawnSync to invoke the script with temp dirs and
|
||||
* GSTACK_DIR / GSTACK_STATE_DIR / GSTACK_REMOTE_URL env overrides
|
||||
* for full isolation.
|
||||
*/
|
||||
|
||||
import { describe, test, expect, beforeEach, afterEach } from 'bun:test';
|
||||
import { mkdtempSync, writeFileSync, rmSync, existsSync, readFileSync } from 'fs';
|
||||
import { join } from 'path';
|
||||
import { tmpdir } from 'os';
|
||||
|
||||
const SCRIPT = join(import.meta.dir, '..', '..', 'bin', 'gstack-update-check');
|
||||
|
||||
let gstackDir: string;
|
||||
let stateDir: string;
|
||||
|
||||
function run(extraEnv: Record<string, string> = {}) {
|
||||
const result = Bun.spawnSync(['bash', SCRIPT], {
|
||||
env: {
|
||||
...process.env,
|
||||
GSTACK_DIR: gstackDir,
|
||||
GSTACK_STATE_DIR: stateDir,
|
||||
GSTACK_REMOTE_URL: `file://${join(gstackDir, 'REMOTE_VERSION')}`,
|
||||
...extraEnv,
|
||||
},
|
||||
stdout: 'pipe',
|
||||
stderr: 'pipe',
|
||||
});
|
||||
return {
|
||||
exitCode: result.exitCode,
|
||||
stdout: result.stdout.toString().trim(),
|
||||
stderr: result.stderr.toString().trim(),
|
||||
};
|
||||
}
|
||||
|
||||
beforeEach(() => {
|
||||
gstackDir = mkdtempSync(join(tmpdir(), 'gstack-upd-test-'));
|
||||
stateDir = mkdtempSync(join(tmpdir(), 'gstack-state-test-'));
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
rmSync(gstackDir, { recursive: true, force: true });
|
||||
rmSync(stateDir, { recursive: true, force: true });
|
||||
});
|
||||
|
||||
describe('gstack-update-check', () => {
|
||||
// ─── Path A: No VERSION file ────────────────────────────────
|
||||
test('exits 0 with no output when VERSION file is missing', () => {
|
||||
const { exitCode, stdout } = run();
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe('');
|
||||
});
|
||||
|
||||
// ─── Path B: Empty VERSION file ─────────────────────────────
|
||||
test('exits 0 with no output when VERSION file is empty', () => {
|
||||
writeFileSync(join(gstackDir, 'VERSION'), '');
|
||||
const { exitCode, stdout } = run();
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe('');
|
||||
});
|
||||
|
||||
// ─── Path C: Just-upgraded marker ───────────────────────────
|
||||
test('outputs JUST_UPGRADED and deletes marker', () => {
|
||||
writeFileSync(join(gstackDir, 'VERSION'), '0.4.0\n');
|
||||
writeFileSync(join(stateDir, 'just-upgraded-from'), '0.3.3\n');
|
||||
|
||||
const { exitCode, stdout } = run();
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe('JUST_UPGRADED 0.3.3 0.4.0');
|
||||
// Marker should be deleted
|
||||
expect(existsSync(join(stateDir, 'just-upgraded-from'))).toBe(false);
|
||||
// Cache should be written
|
||||
const cache = readFileSync(join(stateDir, 'last-update-check'), 'utf-8');
|
||||
expect(cache).toContain('UP_TO_DATE');
|
||||
});
|
||||
|
||||
// ─── Path D1: Fresh cache, UP_TO_DATE ───────────────────────
|
||||
test('exits silently when cache says UP_TO_DATE and is fresh', () => {
|
||||
writeFileSync(join(gstackDir, 'VERSION'), '0.3.3\n');
|
||||
writeFileSync(join(stateDir, 'last-update-check'), 'UP_TO_DATE 0.3.3');
|
||||
|
||||
const { exitCode, stdout } = run();
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe('');
|
||||
});
|
||||
|
||||
// ─── Path D2: Fresh cache, UPGRADE_AVAILABLE ────────────────
|
||||
test('echoes cached UPGRADE_AVAILABLE when cache is fresh', () => {
|
||||
writeFileSync(join(gstackDir, 'VERSION'), '0.3.3\n');
|
||||
writeFileSync(join(stateDir, 'last-update-check'), 'UPGRADE_AVAILABLE 0.3.3 0.4.0');
|
||||
|
||||
const { exitCode, stdout } = run();
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe('UPGRADE_AVAILABLE 0.3.3 0.4.0');
|
||||
});
|
||||
|
||||
// ─── Path D3: Fresh cache, but local version changed ────────
|
||||
test('re-checks when local version does not match cached old version', () => {
|
||||
writeFileSync(join(gstackDir, 'VERSION'), '0.4.0\n');
|
||||
// Cache says 0.3.3 → 0.4.0 but we're already on 0.4.0
|
||||
writeFileSync(join(stateDir, 'last-update-check'), 'UPGRADE_AVAILABLE 0.3.3 0.4.0');
|
||||
// Remote also says 0.4.0 — should be up to date
|
||||
writeFileSync(join(gstackDir, 'REMOTE_VERSION'), '0.4.0\n');
|
||||
|
||||
const { exitCode, stdout } = run();
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe(''); // Up to date after re-check
|
||||
const cache = readFileSync(join(stateDir, 'last-update-check'), 'utf-8');
|
||||
expect(cache).toContain('UP_TO_DATE');
|
||||
});
|
||||
|
||||
// ─── Path E: Versions match (remote fetch) ─────────────────
|
||||
test('writes UP_TO_DATE cache when versions match', () => {
|
||||
writeFileSync(join(gstackDir, 'VERSION'), '0.3.3\n');
|
||||
writeFileSync(join(gstackDir, 'REMOTE_VERSION'), '0.3.3\n');
|
||||
|
||||
const { exitCode, stdout } = run();
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe('');
|
||||
const cache = readFileSync(join(stateDir, 'last-update-check'), 'utf-8');
|
||||
expect(cache).toContain('UP_TO_DATE');
|
||||
});
|
||||
|
||||
// ─── Path F: Versions differ (remote fetch) ─────────────────
|
||||
test('outputs UPGRADE_AVAILABLE when versions differ', () => {
|
||||
writeFileSync(join(gstackDir, 'VERSION'), '0.3.3\n');
|
||||
writeFileSync(join(gstackDir, 'REMOTE_VERSION'), '0.4.0\n');
|
||||
|
||||
const { exitCode, stdout } = run();
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe('UPGRADE_AVAILABLE 0.3.3 0.4.0');
|
||||
const cache = readFileSync(join(stateDir, 'last-update-check'), 'utf-8');
|
||||
expect(cache).toContain('UPGRADE_AVAILABLE 0.3.3 0.4.0');
|
||||
});
|
||||
|
||||
// ─── Path G: Invalid remote response ────────────────────────
|
||||
test('treats invalid remote response as up to date', () => {
|
||||
writeFileSync(join(gstackDir, 'VERSION'), '0.3.3\n');
|
||||
writeFileSync(join(gstackDir, 'REMOTE_VERSION'), '<html>404 Not Found</html>\n');
|
||||
|
||||
const { exitCode, stdout } = run();
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe('');
|
||||
const cache = readFileSync(join(stateDir, 'last-update-check'), 'utf-8');
|
||||
expect(cache).toContain('UP_TO_DATE');
|
||||
});
|
||||
|
||||
// ─── Path H: Curl fails (bad URL) ──────────────────────────
|
||||
test('exits silently when remote URL is unreachable', () => {
|
||||
writeFileSync(join(gstackDir, 'VERSION'), '0.3.3\n');
|
||||
|
||||
const { exitCode, stdout } = run({
|
||||
GSTACK_REMOTE_URL: 'file:///nonexistent/path/VERSION',
|
||||
});
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe('');
|
||||
const cache = readFileSync(join(stateDir, 'last-update-check'), 'utf-8');
|
||||
expect(cache).toContain('UP_TO_DATE');
|
||||
});
|
||||
|
||||
// ─── Path I: Corrupt cache file ─────────────────────────────
|
||||
test('falls through to remote fetch when cache is corrupt', () => {
|
||||
writeFileSync(join(gstackDir, 'VERSION'), '0.3.3\n');
|
||||
writeFileSync(join(stateDir, 'last-update-check'), 'garbage data here');
|
||||
// Remote says same version — should end up UP_TO_DATE
|
||||
writeFileSync(join(gstackDir, 'REMOTE_VERSION'), '0.3.3\n');
|
||||
|
||||
const { exitCode, stdout } = run();
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe('');
|
||||
// Cache should be overwritten with valid content
|
||||
const cache = readFileSync(join(stateDir, 'last-update-check'), 'utf-8');
|
||||
expect(cache).toContain('UP_TO_DATE');
|
||||
});
|
||||
|
||||
// ─── State dir creation ─────────────────────────────────────
|
||||
test('creates state dir if it does not exist', () => {
|
||||
const newStateDir = join(stateDir, 'nested', 'dir');
|
||||
writeFileSync(join(gstackDir, 'VERSION'), '0.3.3\n');
|
||||
writeFileSync(join(gstackDir, 'REMOTE_VERSION'), '0.3.3\n');
|
||||
|
||||
const { exitCode } = run({ GSTACK_STATE_DIR: newStateDir });
|
||||
expect(exitCode).toBe(0);
|
||||
expect(existsSync(join(newStateDir, 'last-update-check'))).toBe(true);
|
||||
});
|
||||
});
|
||||
@@ -0,0 +1,112 @@
|
||||
---
|
||||
name: gstack-upgrade
|
||||
version: 1.0.0
|
||||
description: |
|
||||
Upgrade gstack to the latest version. Detects global vs vendored install,
|
||||
runs the upgrade, and shows what's new.
|
||||
allowed-tools:
|
||||
- Bash
|
||||
- Read
|
||||
- AskUserQuestion
|
||||
---
|
||||
|
||||
# /gstack-upgrade
|
||||
|
||||
Upgrade gstack to the latest version and show what's new.
|
||||
|
||||
## Inline upgrade flow
|
||||
|
||||
This section is referenced by all skill preambles when they detect `UPGRADE_AVAILABLE`.
|
||||
|
||||
### Step 1: Ask the user
|
||||
|
||||
Use AskUserQuestion:
|
||||
- Question: "gstack **v{new}** is available (you're on v{old}). Upgrade now? Takes ~10 seconds."
|
||||
- Options: ["Yes, upgrade now", "Later (ask again tomorrow)"]
|
||||
|
||||
**If "Later":** Run `touch ~/.gstack/last-update-check` to reset the 24h timer and continue with the current skill. Do not mention the upgrade again.
|
||||
|
||||
### Step 2: Detect install type
|
||||
|
||||
```bash
|
||||
if [ -d "$HOME/.claude/skills/gstack/.git" ]; then
|
||||
INSTALL_TYPE="global-git"
|
||||
INSTALL_DIR="$HOME/.claude/skills/gstack"
|
||||
elif [ -d ".claude/skills/gstack/.git" ]; then
|
||||
INSTALL_TYPE="local-git"
|
||||
INSTALL_DIR=".claude/skills/gstack"
|
||||
elif [ -d ".claude/skills/gstack" ]; then
|
||||
INSTALL_TYPE="vendored"
|
||||
INSTALL_DIR=".claude/skills/gstack"
|
||||
elif [ -d "$HOME/.claude/skills/gstack" ]; then
|
||||
INSTALL_TYPE="vendored-global"
|
||||
INSTALL_DIR="$HOME/.claude/skills/gstack"
|
||||
else
|
||||
echo "ERROR: gstack not found"
|
||||
exit 1
|
||||
fi
|
||||
echo "Install type: $INSTALL_TYPE at $INSTALL_DIR"
|
||||
```
|
||||
|
||||
### Step 3: Save old version
|
||||
|
||||
```bash
|
||||
OLD_VERSION=$(cat "$INSTALL_DIR/VERSION" 2>/dev/null || echo "unknown")
|
||||
```
|
||||
|
||||
### Step 4: Upgrade
|
||||
|
||||
**For git installs** (global-git, local-git):
|
||||
```bash
|
||||
cd "$INSTALL_DIR"
|
||||
STASH_OUTPUT=$(git stash 2>&1)
|
||||
git fetch origin
|
||||
git reset --hard origin/main
|
||||
./setup
|
||||
```
|
||||
If `$STASH_OUTPUT` contains "Saved working directory", warn the user: "Note: local changes were stashed. Run `git stash pop` in the skill directory to restore them."
|
||||
|
||||
**For vendored installs** (vendored, vendored-global):
|
||||
```bash
|
||||
PARENT=$(dirname "$INSTALL_DIR")
|
||||
TMP_DIR=$(mktemp -d)
|
||||
git clone --depth 1 https://github.com/garrytan/gstack.git "$TMP_DIR/gstack"
|
||||
mv "$INSTALL_DIR" "$INSTALL_DIR.bak"
|
||||
mv "$TMP_DIR/gstack" "$INSTALL_DIR"
|
||||
cd "$INSTALL_DIR" && ./setup
|
||||
rm -rf "$INSTALL_DIR.bak" "$TMP_DIR"
|
||||
```
|
||||
|
||||
### Step 5: Write marker + clear cache
|
||||
|
||||
```bash
|
||||
mkdir -p ~/.gstack
|
||||
echo "$OLD_VERSION" > ~/.gstack/just-upgraded-from
|
||||
rm -f ~/.gstack/last-update-check
|
||||
```
|
||||
|
||||
### Step 6: Show What's New
|
||||
|
||||
Read `$INSTALL_DIR/CHANGELOG.md`. Find all version entries between the old version and the new version. Summarize as 5-7 bullets grouped by theme. Don't overwhelm — focus on user-facing changes. Skip internal refactors unless they're significant.
|
||||
|
||||
Format:
|
||||
```
|
||||
gstack v{new} — upgraded from v{old}!
|
||||
|
||||
What's new:
|
||||
- [bullet 1]
|
||||
- [bullet 2]
|
||||
- ...
|
||||
|
||||
Happy shipping!
|
||||
```
|
||||
|
||||
### Step 7: Continue
|
||||
|
||||
After showing What's New, continue with whatever skill the user originally invoked. The upgrade is done — no further action needed.
|
||||
|
||||
---
|
||||
|
||||
## Standalone usage
|
||||
|
||||
When invoked directly as `/gstack-upgrade` (not from a preamble), follow Steps 2-6 above. If already on the latest version, tell the user: "You're already on the latest version (v{version})."
|
||||
@@ -14,6 +14,15 @@ allowed-tools:
|
||||
- AskUserQuestion
|
||||
---
|
||||
|
||||
## Update Check (run first)
|
||||
|
||||
```bash
|
||||
_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
|
||||
[ -n "$_UPD" ] && echo "$_UPD"
|
||||
```
|
||||
|
||||
If output shows `UPGRADE_AVAILABLE <old> <new>`: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (AskUserQuestion → upgrade if yes, `touch ~/.gstack/last-update-check` if no). If `JUST_UPGRADED <from> <to>`: tell user "Running gstack v{to} (just updated!)" and continue.
|
||||
|
||||
# Mega Plan Review Mode
|
||||
|
||||
## Philosophy
|
||||
|
||||
@@ -10,8 +10,18 @@ allowed-tools:
|
||||
- Grep
|
||||
- Glob
|
||||
- AskUserQuestion
|
||||
- Bash
|
||||
---
|
||||
|
||||
## Update Check (run first)
|
||||
|
||||
```bash
|
||||
_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
|
||||
[ -n "$_UPD" ] && echo "$_UPD"
|
||||
```
|
||||
|
||||
If output shows `UPGRADE_AVAILABLE <old> <new>`: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (AskUserQuestion → upgrade if yes, `touch ~/.gstack/last-update-check` if no). If `JUST_UPGRADED <from> <to>`: tell user "Running gstack v{to} (just updated!)" and continue.
|
||||
|
||||
# Plan Review Mode
|
||||
|
||||
Review this plan thoroughly before making any code changes. For every issue or recommendation, explain the concrete tradeoffs, give me an opinionated recommendation, and ask for my input before assuming a direction.
|
||||
|
||||
+26
-9
@@ -10,7 +10,19 @@ allowed-tools:
|
||||
- Bash
|
||||
- Read
|
||||
- Write
|
||||
- AskUserQuestion
|
||||
---
|
||||
<!-- AUTO-GENERATED from SKILL.md.tmpl — do not edit directly -->
|
||||
<!-- Regenerate: bun run gen:skill-docs -->
|
||||
|
||||
## Update Check (run first)
|
||||
|
||||
```bash
|
||||
_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
|
||||
[ -n "$_UPD" ] && echo "$_UPD" || true
|
||||
```
|
||||
|
||||
If output shows `UPGRADE_AVAILABLE <old> <new>`: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (AskUserQuestion → upgrade if yes, `touch ~/.gstack/last-update-check` if no). If `JUST_UPGRADED <from> <to>`: tell user "Running gstack v{to} (just updated!)" and continue.
|
||||
|
||||
# /qa: Systematic QA Testing
|
||||
|
||||
@@ -32,19 +44,24 @@ You are a QA engineer. Test web applications like a real user — click everythi
|
||||
|
||||
**Find the browse binary:**
|
||||
|
||||
## SETUP (run this check BEFORE any browse command)
|
||||
|
||||
```bash
|
||||
BROWSE_OUTPUT=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null)
|
||||
B=$(echo "$BROWSE_OUTPUT" | head -1)
|
||||
META=$(echo "$BROWSE_OUTPUT" | grep "^META:" || true)
|
||||
if [ -z "$B" ]; then
|
||||
echo "ERROR: browse binary not found"
|
||||
exit 1
|
||||
_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
|
||||
B=""
|
||||
[ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/gstack/browse/dist/browse" ] && B="$_ROOT/.claude/skills/gstack/browse/dist/browse"
|
||||
[ -z "$B" ] && B=~/.claude/skills/gstack/browse/dist/browse
|
||||
if [ -x "$B" ]; then
|
||||
echo "READY: $B"
|
||||
else
|
||||
echo "NEEDS_SETUP"
|
||||
fi
|
||||
echo "READY: $B"
|
||||
[ -n "$META" ] && echo "$META"
|
||||
```
|
||||
|
||||
If you see `META:UPDATE_AVAILABLE`: tell the user an update is available, STOP and wait for approval, then run the command from the META payload and re-run the setup check.
|
||||
If `NEEDS_SETUP`:
|
||||
1. Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait.
|
||||
2. Run: `cd <SKILL_DIR> && ./setup`
|
||||
3. If `bun` is not installed: `curl -fsSL https://bun.sh/install | bash`
|
||||
|
||||
**Set up report directory (persistent, global):**
|
||||
|
||||
|
||||
@@ -0,0 +1,337 @@
|
||||
---
|
||||
name: qa
|
||||
version: 1.0.0
|
||||
description: |
|
||||
Systematically QA test a web application. Use when asked to "qa", "QA", "test this site",
|
||||
"find bugs", "dogfood", or review quality. Four modes: diff-aware (automatic on feature
|
||||
branches — analyzes git diff, identifies affected pages, tests them), full (systematic
|
||||
exploration), quick (30-second smoke test), regression (compare against baseline). Produces
|
||||
structured report with health score, screenshots, and repro steps.
|
||||
allowed-tools:
|
||||
- Bash
|
||||
- Read
|
||||
- Write
|
||||
- AskUserQuestion
|
||||
---
|
||||
|
||||
{{UPDATE_CHECK}}
|
||||
|
||||
# /qa: Systematic QA Testing
|
||||
|
||||
You are a QA engineer. Test web applications like a real user — click everything, fill every form, check every state. Produce a structured report with evidence.
|
||||
|
||||
## Setup
|
||||
|
||||
**Parse the user's request for these parameters:**
|
||||
|
||||
| Parameter | Default | Override example |
|
||||
|-----------|---------|-----------------|
|
||||
| Target URL | (auto-detect or required) | `https://myapp.com`, `http://localhost:3000` |
|
||||
| Mode | full | `--quick`, `--regression .gstack/qa-reports/baseline.json` |
|
||||
| Output dir | `.gstack/qa-reports/` | `Output to /tmp/qa` |
|
||||
| Scope | Full app (or diff-scoped) | `Focus on the billing page` |
|
||||
| Auth | None | `Sign in to user@example.com`, `Import cookies from cookies.json` |
|
||||
|
||||
**If no URL is given and you're on a feature branch:** Automatically enter **diff-aware mode** (see Modes below). This is the most common case — the user just shipped code on a branch and wants to verify it works.
|
||||
|
||||
**Find the browse binary:**
|
||||
|
||||
{{BROWSE_SETUP}}
|
||||
|
||||
**Create output directories:**
|
||||
|
||||
```bash
|
||||
REPORT_DIR=".gstack/qa-reports"
|
||||
mkdir -p "$REPORT_DIR/screenshots"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Modes
|
||||
|
||||
### Diff-aware (automatic when on a feature branch with no URL)
|
||||
|
||||
This is the **primary mode** for developers verifying their work. When the user says `/qa` without a URL and the repo is on a feature branch, automatically:
|
||||
|
||||
1. **Analyze the branch diff** to understand what changed:
|
||||
```bash
|
||||
git diff main...HEAD --name-only
|
||||
git log main..HEAD --oneline
|
||||
```
|
||||
|
||||
2. **Identify affected pages/routes** from the changed files:
|
||||
- Controller/route files → which URL paths they serve
|
||||
- View/template/component files → which pages render them
|
||||
- Model/service files → which pages use those models (check controllers that reference them)
|
||||
- CSS/style files → which pages include those stylesheets
|
||||
- API endpoints → test them directly with `$B js "await fetch('/api/...')"`
|
||||
- Static pages (markdown, HTML) → navigate to them directly
|
||||
|
||||
3. **Detect the running app** — check common local dev ports:
|
||||
```bash
|
||||
$B goto http://localhost:3000 2>/dev/null && echo "Found app on :3000" || \
|
||||
$B goto http://localhost:4000 2>/dev/null && echo "Found app on :4000" || \
|
||||
$B goto http://localhost:8080 2>/dev/null && echo "Found app on :8080"
|
||||
```
|
||||
If no local app is found, check for a staging/preview URL in the PR or environment. If nothing works, ask the user for the URL.
|
||||
|
||||
4. **Test each affected page/route:**
|
||||
- Navigate to the page
|
||||
- Take a screenshot
|
||||
- Check console for errors
|
||||
- If the change was interactive (forms, buttons, flows), test the interaction end-to-end
|
||||
- Use `snapshot -D` before and after actions to verify the change had the expected effect
|
||||
|
||||
5. **Cross-reference with commit messages and PR description** to understand *intent* — what should the change do? Verify it actually does that.
|
||||
|
||||
6. **Report findings** scoped to the branch changes:
|
||||
- "Changes tested: N pages/routes affected by this branch"
|
||||
- For each: does it work? Screenshot evidence.
|
||||
- Any regressions on adjacent pages?
|
||||
|
||||
**If the user provides a URL with diff-aware mode:** Use that URL as the base but still scope testing to the changed files.
|
||||
|
||||
### Full (default when URL is provided)
|
||||
Systematic exploration. Visit every reachable page. Document 5-10 well-evidenced issues. Produce health score. Takes 5-15 minutes depending on app size.
|
||||
|
||||
### Quick (`--quick`)
|
||||
30-second smoke test. Visit homepage + top 5 navigation targets. Check: page loads? Console errors? Broken links? Produce health score. No detailed issue documentation.
|
||||
|
||||
### Regression (`--regression <baseline>`)
|
||||
Run full mode, then load `baseline.json` from a previous run. Diff: which issues are fixed? Which are new? What's the score delta? Append regression section to report.
|
||||
|
||||
---
|
||||
|
||||
## Workflow
|
||||
|
||||
### Phase 1: Initialize
|
||||
|
||||
1. Find browse binary (see Setup above)
|
||||
2. Create output directories
|
||||
3. Copy report template from `qa/templates/qa-report-template.md` to output dir
|
||||
4. Start timer for duration tracking
|
||||
|
||||
### Phase 2: Authenticate (if needed)
|
||||
|
||||
**If the user specified auth credentials:**
|
||||
|
||||
```bash
|
||||
$B goto <login-url>
|
||||
$B snapshot -i # find the login form
|
||||
$B fill @e3 "user@example.com"
|
||||
$B fill @e4 "[REDACTED]" # NEVER include real passwords in report
|
||||
$B click @e5 # submit
|
||||
$B snapshot -D # verify login succeeded
|
||||
```
|
||||
|
||||
**If the user provided a cookie file:**
|
||||
|
||||
```bash
|
||||
$B cookie-import cookies.json
|
||||
$B goto <target-url>
|
||||
```
|
||||
|
||||
**If 2FA/OTP is required:** Ask the user for the code and wait.
|
||||
|
||||
**If CAPTCHA blocks you:** Tell the user: "Please complete the CAPTCHA in the browser, then tell me to continue."
|
||||
|
||||
### Phase 3: Orient
|
||||
|
||||
Get a map of the application:
|
||||
|
||||
```bash
|
||||
$B goto <target-url>
|
||||
$B snapshot -i -a -o "$REPORT_DIR/screenshots/initial.png"
|
||||
$B links # map navigation structure
|
||||
$B console --errors # any errors on landing?
|
||||
```
|
||||
|
||||
**Detect framework** (note in report metadata):
|
||||
- `__next` in HTML or `_next/data` requests → Next.js
|
||||
- `csrf-token` meta tag → Rails
|
||||
- `wp-content` in URLs → WordPress
|
||||
- Client-side routing with no page reloads → SPA
|
||||
|
||||
**For SPAs:** The `links` command may return few results because navigation is client-side. Use `snapshot -i` to find nav elements (buttons, menu items) instead.
|
||||
|
||||
### Phase 4: Explore
|
||||
|
||||
Visit pages systematically. At each page:
|
||||
|
||||
```bash
|
||||
$B goto <page-url>
|
||||
$B snapshot -i -a -o "$REPORT_DIR/screenshots/page-name.png"
|
||||
$B console --errors
|
||||
```
|
||||
|
||||
Then follow the **per-page exploration checklist** (see `qa/references/issue-taxonomy.md`):
|
||||
|
||||
1. **Visual scan** — Look at the annotated screenshot for layout issues
|
||||
2. **Interactive elements** — Click buttons, links, controls. Do they work?
|
||||
3. **Forms** — Fill and submit. Test empty, invalid, edge cases
|
||||
4. **Navigation** — Check all paths in and out
|
||||
5. **States** — Empty state, loading, error, overflow
|
||||
6. **Console** — Any new JS errors after interactions?
|
||||
7. **Responsiveness** — Check mobile viewport if relevant:
|
||||
```bash
|
||||
$B viewport 375x812
|
||||
$B screenshot "$REPORT_DIR/screenshots/page-mobile.png"
|
||||
$B viewport 1280x720
|
||||
```
|
||||
|
||||
**Depth judgment:** Spend more time on core features (homepage, dashboard, checkout, search) and less on secondary pages (about, terms, privacy).
|
||||
|
||||
**Quick mode:** Only visit homepage + top 5 navigation targets from the Orient phase. Skip the per-page checklist — just check: loads? Console errors? Broken links visible?
|
||||
|
||||
### Phase 5: Document
|
||||
|
||||
Document each issue **immediately when found** — don't batch them.
|
||||
|
||||
**Two evidence tiers:**
|
||||
|
||||
**Interactive bugs** (broken flows, dead buttons, form failures):
|
||||
1. Take a screenshot before the action
|
||||
2. Perform the action
|
||||
3. Take a screenshot showing the result
|
||||
4. Use `snapshot -D` to show what changed
|
||||
5. Write repro steps referencing screenshots
|
||||
|
||||
```bash
|
||||
$B screenshot "$REPORT_DIR/screenshots/issue-001-step-1.png"
|
||||
$B click @e5
|
||||
$B screenshot "$REPORT_DIR/screenshots/issue-001-result.png"
|
||||
$B snapshot -D
|
||||
```
|
||||
|
||||
**Static bugs** (typos, layout issues, missing images):
|
||||
1. Take a single annotated screenshot showing the problem
|
||||
2. Describe what's wrong
|
||||
|
||||
```bash
|
||||
$B snapshot -i -a -o "$REPORT_DIR/screenshots/issue-002.png"
|
||||
```
|
||||
|
||||
**Write each issue to the report immediately** using the template format from `qa/templates/qa-report-template.md`.
|
||||
|
||||
### Phase 6: Wrap Up
|
||||
|
||||
1. **Compute health score** using the rubric below
|
||||
2. **Write "Top 3 Things to Fix"** — the 3 highest-severity issues
|
||||
3. **Write console health summary** — aggregate all console errors seen across pages
|
||||
4. **Update severity counts** in the summary table
|
||||
5. **Fill in report metadata** — date, duration, pages visited, screenshot count, framework
|
||||
6. **Save baseline** — write `baseline.json` with:
|
||||
```json
|
||||
{
|
||||
"date": "YYYY-MM-DD",
|
||||
"url": "<target>",
|
||||
"healthScore": N,
|
||||
"issues": [{ "id": "ISSUE-001", "title": "...", "severity": "...", "category": "..." }],
|
||||
"categoryScores": { "console": N, "links": N, ... }
|
||||
}
|
||||
```
|
||||
|
||||
**Regression mode:** After writing the report, load the baseline file. Compare:
|
||||
- Health score delta
|
||||
- Issues fixed (in baseline but not current)
|
||||
- New issues (in current but not baseline)
|
||||
- Append the regression section to the report
|
||||
|
||||
---
|
||||
|
||||
## Health Score Rubric
|
||||
|
||||
Compute each category score (0-100), then take the weighted average.
|
||||
|
||||
### Console (weight: 15%)
|
||||
- 0 errors → 100
|
||||
- 1-3 errors → 70
|
||||
- 4-10 errors → 40
|
||||
- 10+ errors → 10
|
||||
|
||||
### Links (weight: 10%)
|
||||
- 0 broken → 100
|
||||
- Each broken link → -15 (minimum 0)
|
||||
|
||||
### Per-Category Scoring (Visual, Functional, UX, Content, Performance, Accessibility)
|
||||
Each category starts at 100. Deduct per finding:
|
||||
- Critical issue → -25
|
||||
- High issue → -15
|
||||
- Medium issue → -8
|
||||
- Low issue → -3
|
||||
Minimum 0 per category.
|
||||
|
||||
### Weights
|
||||
| Category | Weight |
|
||||
|----------|--------|
|
||||
| Console | 15% |
|
||||
| Links | 10% |
|
||||
| Visual | 10% |
|
||||
| Functional | 20% |
|
||||
| UX | 15% |
|
||||
| Performance | 10% |
|
||||
| Content | 5% |
|
||||
| Accessibility | 15% |
|
||||
|
||||
### Final Score
|
||||
`score = Σ (category_score × weight)`
|
||||
|
||||
---
|
||||
|
||||
## Framework-Specific Guidance
|
||||
|
||||
### Next.js
|
||||
- Check console for hydration errors (`Hydration failed`, `Text content did not match`)
|
||||
- Monitor `_next/data` requests in network — 404s indicate broken data fetching
|
||||
- Test client-side navigation (click links, don't just `goto`) — catches routing issues
|
||||
- Check for CLS (Cumulative Layout Shift) on pages with dynamic content
|
||||
|
||||
### Rails
|
||||
- Check for N+1 query warnings in console (if development mode)
|
||||
- Verify CSRF token presence in forms
|
||||
- Test Turbo/Stimulus integration — do page transitions work smoothly?
|
||||
- Check for flash messages appearing and dismissing correctly
|
||||
|
||||
### WordPress
|
||||
- Check for plugin conflicts (JS errors from different plugins)
|
||||
- Verify admin bar visibility for logged-in users
|
||||
- Test REST API endpoints (`/wp-json/`)
|
||||
- Check for mixed content warnings (common with WP)
|
||||
|
||||
### General SPA (React, Vue, Angular)
|
||||
- Use `snapshot -i` for navigation — `links` command misses client-side routes
|
||||
- Check for stale state (navigate away and back — does data refresh?)
|
||||
- Test browser back/forward — does the app handle history correctly?
|
||||
- Check for memory leaks (monitor console after extended use)
|
||||
|
||||
---
|
||||
|
||||
## Important Rules
|
||||
|
||||
1. **Repro is everything.** Every issue needs at least one screenshot. No exceptions.
|
||||
2. **Verify before documenting.** Retry the issue once to confirm it's reproducible, not a fluke.
|
||||
3. **Never include credentials.** Write `[REDACTED]` for passwords in repro steps.
|
||||
4. **Write incrementally.** Append each issue to the report as you find it. Don't batch.
|
||||
5. **Never read source code.** Test as a user, not a developer.
|
||||
6. **Check console after every interaction.** JS errors that don't surface visually are still bugs.
|
||||
7. **Test like a user.** Use realistic data. Walk through complete workflows end-to-end.
|
||||
8. **Depth over breadth.** 5-10 well-documented issues with evidence > 20 vague descriptions.
|
||||
9. **Never delete output files.** Screenshots and reports accumulate — that's intentional.
|
||||
10. **Use `snapshot -C` for tricky UIs.** Finds clickable divs that the accessibility tree misses.
|
||||
|
||||
---
|
||||
|
||||
## Output Structure
|
||||
|
||||
```
|
||||
.gstack/qa-reports/
|
||||
├── qa-report-{domain}-{YYYY-MM-DD}.md # Structured report
|
||||
├── screenshots/
|
||||
│ ├── initial.png # Landing page annotated screenshot
|
||||
│ ├── issue-001-step-1.png # Per-issue evidence
|
||||
│ ├── issue-001-result.png
|
||||
│ └── ...
|
||||
└── baseline.json # For regression mode
|
||||
```
|
||||
|
||||
Report filenames use the domain and date: `qa-report-myapp-com-2026-03-12.md`
|
||||
@@ -10,8 +10,18 @@ allowed-tools:
|
||||
- Read
|
||||
- Write
|
||||
- Glob
|
||||
- AskUserQuestion
|
||||
---
|
||||
|
||||
## Update Check (run first)
|
||||
|
||||
```bash
|
||||
_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
|
||||
[ -n "$_UPD" ] && echo "$_UPD"
|
||||
```
|
||||
|
||||
If output shows `UPGRADE_AVAILABLE <old> <new>`: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (AskUserQuestion → upgrade if yes, `touch ~/.gstack/last-update-check` if no). If `JUST_UPGRADED <from> <to>`: tell user "Running gstack v{to} (just updated!)" and continue.
|
||||
|
||||
# /retro — Weekly Engineering Retrospective
|
||||
|
||||
Generates a comprehensive engineering retrospective analyzing commit history, work patterns, and code quality metrics. Team-aware: identifies the user running the command, then analyzes every contributor with per-person praise and growth opportunities. Designed for a senior IC/CTO-level builder using Claude Code as a force multiplier.
|
||||
|
||||
@@ -14,6 +14,15 @@ allowed-tools:
|
||||
- AskUserQuestion
|
||||
---
|
||||
|
||||
## Update Check (run first)
|
||||
|
||||
```bash
|
||||
_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
|
||||
[ -n "$_UPD" ] && echo "$_UPD"
|
||||
```
|
||||
|
||||
If output shows `UPGRADE_AVAILABLE <old> <new>`: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (AskUserQuestion → upgrade if yes, `touch ~/.gstack/last-update-check` if no). If `JUST_UPGRADED <from> <to>`: tell user "Running gstack v{to} (just updated!)" and continue.
|
||||
|
||||
# Pre-Landing PR Review
|
||||
|
||||
You are running the `/review` workflow. Analyze the current branch's diff against main for structural issues that tests don't catch.
|
||||
|
||||
@@ -94,9 +94,43 @@ function generateSnapshotFlags(): string {
|
||||
return lines.join('\n');
|
||||
}
|
||||
|
||||
function generateUpdateCheck(): string {
|
||||
return `## Update Check (run first)
|
||||
|
||||
\`\`\`bash
|
||||
_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
|
||||
[ -n "$_UPD" ] && echo "$_UPD" || true
|
||||
\`\`\`
|
||||
|
||||
If output shows \`UPGRADE_AVAILABLE <old> <new>\`: read \`~/.claude/skills/gstack/gstack-upgrade/SKILL.md\` and follow the "Inline upgrade flow" (AskUserQuestion → upgrade if yes, \`touch ~/.gstack/last-update-check\` if no). If \`JUST_UPGRADED <from> <to>\`: tell user "Running gstack v{to} (just updated!)" and continue.`;
|
||||
}
|
||||
|
||||
function generateBrowseSetup(): string {
|
||||
return `## SETUP (run this check BEFORE any browse command)
|
||||
|
||||
\`\`\`bash
|
||||
_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
|
||||
B=""
|
||||
[ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/gstack/browse/dist/browse" ] && B="$_ROOT/.claude/skills/gstack/browse/dist/browse"
|
||||
[ -z "$B" ] && B=~/.claude/skills/gstack/browse/dist/browse
|
||||
if [ -x "$B" ]; then
|
||||
echo "READY: $B"
|
||||
else
|
||||
echo "NEEDS_SETUP"
|
||||
fi
|
||||
\`\`\`
|
||||
|
||||
If \`NEEDS_SETUP\`:
|
||||
1. Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait.
|
||||
2. Run: \`cd <SKILL_DIR> && ./setup\`
|
||||
3. If \`bun\` is not installed: \`curl -fsSL https://bun.sh/install | bash\``;
|
||||
}
|
||||
|
||||
const RESOLVERS: Record<string, () => string> = {
|
||||
COMMAND_REFERENCE: generateCommandReference,
|
||||
SNAPSHOT_FLAGS: generateSnapshotFlags,
|
||||
UPDATE_CHECK: generateUpdateCheck,
|
||||
BROWSE_SETUP: generateBrowseSetup,
|
||||
};
|
||||
|
||||
// ─── Template Processing ────────────────────────────────────
|
||||
@@ -141,6 +175,8 @@ function findTemplates(): string[] {
|
||||
const candidates = [
|
||||
path.join(ROOT, 'SKILL.md.tmpl'),
|
||||
path.join(ROOT, 'browse', 'SKILL.md.tmpl'),
|
||||
path.join(ROOT, 'qa', 'SKILL.md.tmpl'),
|
||||
path.join(ROOT, 'setup-browser-cookies', 'SKILL.md.tmpl'),
|
||||
];
|
||||
for (const p of candidates) {
|
||||
if (fs.existsSync(p)) templates.push(p);
|
||||
|
||||
@@ -88,3 +88,10 @@ else
|
||||
echo " browse: $BROWSE_BIN"
|
||||
echo " (skipped skill symlinks — not inside .claude/skills/)"
|
||||
fi
|
||||
|
||||
# 4. First-time welcome + legacy cleanup
|
||||
if [ ! -d "$HOME/.gstack" ]; then
|
||||
mkdir -p "$HOME/.gstack"
|
||||
echo " Welcome! Run /gstack-upgrade anytime to stay current."
|
||||
fi
|
||||
rm -f /tmp/gstack-latest-version
|
||||
|
||||
@@ -8,7 +8,19 @@ description: |
|
||||
allowed-tools:
|
||||
- Bash
|
||||
- Read
|
||||
- AskUserQuestion
|
||||
---
|
||||
<!-- AUTO-GENERATED from SKILL.md.tmpl — do not edit directly -->
|
||||
<!-- Regenerate: bun run gen:skill-docs -->
|
||||
|
||||
## Update Check (run first)
|
||||
|
||||
```bash
|
||||
_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
|
||||
[ -n "$_UPD" ] && echo "$_UPD" || true
|
||||
```
|
||||
|
||||
If output shows `UPGRADE_AVAILABLE <old> <new>`: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (AskUserQuestion → upgrade if yes, `touch ~/.gstack/last-update-check` if no). If `JUST_UPGRADED <from> <to>`: tell user "Running gstack v{to} (just updated!)" and continue.
|
||||
|
||||
# Setup Browser Cookies
|
||||
|
||||
@@ -25,13 +37,15 @@ Import logged-in sessions from your real Chromium browser into the headless brow
|
||||
|
||||
### 1. Find the browse binary
|
||||
|
||||
## SETUP (run this check BEFORE any browse command)
|
||||
|
||||
```bash
|
||||
BROWSE_OUTPUT=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null)
|
||||
B=$(echo "$BROWSE_OUTPUT" | head -1)
|
||||
META=$(echo "$BROWSE_OUTPUT" | grep "^META:" || true)
|
||||
if [ -n "$B" ]; then
|
||||
_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
|
||||
B=""
|
||||
[ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/gstack/browse/dist/browse" ] && B="$_ROOT/.claude/skills/gstack/browse/dist/browse"
|
||||
[ -z "$B" ] && B=~/.claude/skills/gstack/browse/dist/browse
|
||||
if [ -x "$B" ]; then
|
||||
echo "READY: $B"
|
||||
[ -n "$META" ] && echo "$META"
|
||||
else
|
||||
echo "NEEDS_SETUP"
|
||||
fi
|
||||
@@ -42,8 +56,6 @@ If `NEEDS_SETUP`:
|
||||
2. Run: `cd <SKILL_DIR> && ./setup`
|
||||
3. If `bun` is not installed: `curl -fsSL https://bun.sh/install | bash`
|
||||
|
||||
If you see `META:UPDATE_AVAILABLE`: tell the user an update is available, STOP and wait for approval, then run the command from the META payload and re-run the setup check.
|
||||
|
||||
### 2. Open the cookie picker
|
||||
|
||||
```bash
|
||||
|
||||
@@ -0,0 +1,73 @@
|
||||
---
|
||||
name: setup-browser-cookies
|
||||
version: 1.0.0
|
||||
description: |
|
||||
Import cookies from your real browser (Comet, Chrome, Arc, Brave, Edge) into the
|
||||
headless browse session. Opens an interactive picker UI where you select which
|
||||
cookie domains to import. Use before QA testing authenticated pages.
|
||||
allowed-tools:
|
||||
- Bash
|
||||
- Read
|
||||
- AskUserQuestion
|
||||
---
|
||||
|
||||
{{UPDATE_CHECK}}
|
||||
|
||||
# Setup Browser Cookies
|
||||
|
||||
Import logged-in sessions from your real Chromium browser into the headless browse session.
|
||||
|
||||
## How it works
|
||||
|
||||
1. Find the browse binary
|
||||
2. Run `cookie-import-browser` to detect installed browsers and open the picker UI
|
||||
3. User selects which cookie domains to import in their browser
|
||||
4. Cookies are decrypted and loaded into the Playwright session
|
||||
|
||||
## Steps
|
||||
|
||||
### 1. Find the browse binary
|
||||
|
||||
{{BROWSE_SETUP}}
|
||||
|
||||
### 2. Open the cookie picker
|
||||
|
||||
```bash
|
||||
$B cookie-import-browser
|
||||
```
|
||||
|
||||
This auto-detects installed Chromium browsers (Comet, Chrome, Arc, Brave, Edge) and opens
|
||||
an interactive picker UI in your default browser where you can:
|
||||
- Switch between installed browsers
|
||||
- Search domains
|
||||
- Click "+" to import a domain's cookies
|
||||
- Click trash to remove imported cookies
|
||||
|
||||
Tell the user: **"Cookie picker opened — select the domains you want to import in your browser, then tell me when you're done."**
|
||||
|
||||
### 3. Direct import (alternative)
|
||||
|
||||
If the user specifies a domain directly (e.g., `/setup-browser-cookies github.com`), skip the UI:
|
||||
|
||||
```bash
|
||||
$B cookie-import-browser comet --domain github.com
|
||||
```
|
||||
|
||||
Replace `comet` with the appropriate browser if specified.
|
||||
|
||||
### 4. Verify
|
||||
|
||||
After the user confirms they're done:
|
||||
|
||||
```bash
|
||||
$B cookies
|
||||
```
|
||||
|
||||
Show the user a summary of imported cookies (domain counts).
|
||||
|
||||
## Notes
|
||||
|
||||
- First import per browser may trigger a macOS Keychain dialog — click "Allow" / "Always Allow"
|
||||
- Cookie picker is served on the same port as the browse server (no extra process)
|
||||
- Only domain names and cookie counts are shown in the UI — no cookie values are exposed
|
||||
- The browse session persists cookies between commands, so imported cookies work immediately
|
||||
@@ -13,6 +13,15 @@ allowed-tools:
|
||||
- AskUserQuestion
|
||||
---
|
||||
|
||||
## Update Check (run first)
|
||||
|
||||
```bash
|
||||
_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
|
||||
[ -n "$_UPD" ] && echo "$_UPD"
|
||||
```
|
||||
|
||||
If output shows `UPGRADE_AVAILABLE <old> <new>`: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (AskUserQuestion → upgrade if yes, `touch ~/.gstack/last-update-check` if no). If `JUST_UPGRADED <from> <to>`: tell user "Running gstack v{to} (just updated!)" and continue.
|
||||
|
||||
# Ship: Fully Automated Ship Workflow
|
||||
|
||||
You are running the `/ship` workflow. This is a **non-interactive, fully automated** workflow. Do NOT ask for confirmation at any step. The user said `/ship` which means DO IT. Run straight through and output the PR URL at the end.
|
||||
|
||||
@@ -33,6 +33,7 @@ const BROWSE_ERROR_PATTERNS = [
|
||||
/Exit code 1/,
|
||||
/ERROR: browse binary not found/,
|
||||
/Server failed to start/,
|
||||
/no such file or directory.*browse/i,
|
||||
];
|
||||
|
||||
export async function runSkillTest(options: {
|
||||
|
||||
+84
-1
@@ -132,6 +132,89 @@ Report what each command returned.`,
|
||||
expect(result.browseErrors).toHaveLength(0);
|
||||
expect(result.exitReason).toBe('success');
|
||||
}, 90_000);
|
||||
|
||||
|
||||
test('agent discovers browse binary via SKILL.md setup block', async () => {
|
||||
const skillMd = fs.readFileSync(path.join(ROOT, 'SKILL.md'), 'utf-8');
|
||||
const setupStart = skillMd.indexOf('## SETUP');
|
||||
const setupEnd = skillMd.indexOf('## IMPORTANT');
|
||||
const setupBlock = skillMd.slice(setupStart, setupEnd);
|
||||
|
||||
// Guard: verify we extracted a valid setup block
|
||||
expect(setupBlock).toContain('browse/dist/browse');
|
||||
|
||||
const result = await runSkillTest({
|
||||
prompt: `Follow these instructions to find the browse binary and run a basic command.
|
||||
|
||||
${setupBlock}
|
||||
|
||||
After finding the binary, run: $B goto ${testServer.url}
|
||||
Then run: $B text
|
||||
Report whether it worked.`,
|
||||
workingDirectory: tmpDir,
|
||||
maxTurns: 10,
|
||||
timeout: 60_000,
|
||||
});
|
||||
|
||||
expect(result.browseErrors).toHaveLength(0);
|
||||
expect(result.exitReason).toBe('success');
|
||||
}, 90_000);
|
||||
|
||||
test('SKILL.md setup block shows NEEDS_SETUP when binary missing', async () => {
|
||||
// Create a tmpdir with no browse binary
|
||||
const emptyDir = fs.mkdtempSync(path.join(os.tmpdir(), 'skill-e2e-empty-'));
|
||||
|
||||
const skillMd = fs.readFileSync(path.join(ROOT, 'SKILL.md'), 'utf-8');
|
||||
const setupStart = skillMd.indexOf('## SETUP');
|
||||
const setupEnd = skillMd.indexOf('## IMPORTANT');
|
||||
const setupBlock = skillMd.slice(setupStart, setupEnd);
|
||||
|
||||
const result = await runSkillTest({
|
||||
prompt: `Follow these instructions exactly. Run the bash code block below and report what it outputs.
|
||||
|
||||
${setupBlock}
|
||||
|
||||
Report the exact output. Do NOT try to fix or install anything — just report what you see.`,
|
||||
workingDirectory: emptyDir,
|
||||
maxTurns: 5,
|
||||
timeout: 30_000,
|
||||
});
|
||||
|
||||
// Agent should see NEEDS_SETUP (not crash or guess wrong paths)
|
||||
const allText = result.output || '';
|
||||
expect(allText).toContain('NEEDS_SETUP');
|
||||
|
||||
// Clean up
|
||||
try { fs.rmSync(emptyDir, { recursive: true, force: true }); } catch {}
|
||||
}, 60_000);
|
||||
|
||||
test('SKILL.md setup block works outside git repo', async () => {
|
||||
// Create a tmpdir outside any git repo
|
||||
const nonGitDir = fs.mkdtempSync(path.join(os.tmpdir(), 'skill-e2e-nogit-'));
|
||||
|
||||
const skillMd = fs.readFileSync(path.join(ROOT, 'SKILL.md'), 'utf-8');
|
||||
const setupStart = skillMd.indexOf('## SETUP');
|
||||
const setupEnd = skillMd.indexOf('## IMPORTANT');
|
||||
const setupBlock = skillMd.slice(setupStart, setupEnd);
|
||||
|
||||
const result = await runSkillTest({
|
||||
prompt: `Follow these instructions exactly. Run the bash code block below and report what it outputs.
|
||||
|
||||
${setupBlock}
|
||||
|
||||
Report the exact output — either "READY: <path>" or "NEEDS_SETUP".`,
|
||||
workingDirectory: nonGitDir,
|
||||
maxTurns: 5,
|
||||
timeout: 30_000,
|
||||
});
|
||||
|
||||
// Should either find global binary (READY) or show NEEDS_SETUP — not crash
|
||||
const allText = result.output || '';
|
||||
expect(allText).toMatch(/READY|NEEDS_SETUP/);
|
||||
|
||||
// Clean up
|
||||
try { fs.rmSync(nonGitDir, { recursive: true, force: true }); } catch {}
|
||||
}, 60_000);
|
||||
});
|
||||
|
||||
// --- B4: QA skill E2E ---
|
||||
@@ -264,7 +347,7 @@ describeOutcome('Planted-bug outcome evals', () => {
|
||||
fs.mkdirSync(path.join(reportDir, 'screenshots'), { recursive: true });
|
||||
const reportPath = path.join(reportDir, 'qa-report.md');
|
||||
|
||||
// Phase 1: Agent SDK runs /qa Standard
|
||||
// Phase 1: runs /qa Standard
|
||||
const result = await runSkillTest({
|
||||
prompt: `You have a browse binary at ${browseBin}. Assign it to B variable like: B="${browseBin}"
|
||||
|
||||
|
||||
@@ -65,6 +65,19 @@ describeEval('LLM-as-judge quality evals', () => {
|
||||
expect(scores.actionability).toBeGreaterThanOrEqual(4);
|
||||
}, 30_000);
|
||||
|
||||
test('setup block scores >= 4 on actionability and clarity', async () => {
|
||||
const content = fs.readFileSync(path.join(ROOT, 'SKILL.md'), 'utf-8');
|
||||
const setupStart = content.indexOf('## SETUP');
|
||||
const setupEnd = content.indexOf('## IMPORTANT');
|
||||
const section = content.slice(setupStart, setupEnd);
|
||||
|
||||
const scores = await judge('setup/binary discovery instructions', section);
|
||||
console.log('Setup block scores:', JSON.stringify(scores, null, 2));
|
||||
|
||||
expect(scores.actionability).toBeGreaterThanOrEqual(4);
|
||||
expect(scores.clarity).toBeGreaterThanOrEqual(4);
|
||||
}, 30_000);
|
||||
|
||||
test('regression check: compare branch vs baseline quality', async () => {
|
||||
// This test compares the generated output against the hand-maintained
|
||||
// baseline from main. The generated version should score equal or higher.
|
||||
|
||||
Reference in New Issue
Block a user