gstack/scripts/resolvers/browse.ts
Garry Tan dc5e0538e5 feat: worktree isolation for E2E tests + infrastructure elegance (v0.11.12.0) (#425)
* refactor: extract gen-skill-docs into modular resolver architecture

Break the 3000-line monolith into 10 domain modules under scripts/resolvers/:
types, constants, preamble, utility, browse, design, testing, review,
codex-helpers, and index. Each module owns one domain of template generation.

The preamble module introduces a 4-tier composition system (T1-T4) so skills
only pay for the preamble sections they actually need, reducing token usage
for lightweight skills by ~40%.

Adds a token budget dashboard that prints after every generation run showing
per-skill and total token counts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: tiered preamble — skills only pay for what they use

Tag all 23 templates with preamble-tier (T1-T4). Lightweight skills
like /browse and /benchmark get a minimal preamble (~40% fewer tokens),
while review skills get the full stack. Regenerate all SKILL.md files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
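
The tier idea above can be sketched roughly as follows. This is a minimal illustration, not the preamble module itself: the section names, the `minTier` field, and `composePreamble` are all hypothetical, and the sketch assumes a higher tier simply includes everything from the lower tiers.

```typescript
// Hypothetical model of a tiered preamble: every name below is illustrative.
type Tier = 1 | 2 | 3 | 4;

interface PreambleSection {
  name: string;
  minTier: Tier; // lowest tier that receives this section (assumption)
  body: string;
}

const SECTIONS: PreambleSection[] = [
  { name: "core", minTier: 1, body: "Core instructions." },
  { name: "conventions", minTier: 2, body: "Project conventions." },
  { name: "review-checklist", minTier: 4, body: "Full review checklist." },
];

// Keep only the sections a skill's tier is entitled to, in declaration order.
export function composePreamble(
  tier: Tier,
  sections: PreambleSection[] = SECTIONS,
): string {
  return sections
    .filter((section) => section.minTier <= tier)
    .map((section) => section.body)
    .join("\n\n");
}
```

Under these assumptions a T1 skill receives only the core section, while a T4 skill receives all three.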

* feat: migrate eval storage to project-scoped paths

Move eval results and E2E run artifacts from ~/.gstack-dev/evals/ to
~/.gstack/projects/$SLUG/evals/ so each project's eval history lives
alongside its other gstack data. Falls back to legacy path if slug
detection fails.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
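
A minimal sketch of the path fallback described above. `resolveEvalDir` and its parameters are hypothetical names; the commit does not show how slug detection works, so the slug is simply passed in and `null` stands for a failed detection.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: returns the directory where eval results would be stored.
export function resolveEvalDir(
  slug: string | null,
  home: string = os.homedir(),
): string {
  if (slug && slug.trim() !== "") {
    // Project-scoped location: ~/.gstack/projects/$SLUG/evals/
    return path.join(home, ".gstack", "projects", slug, "evals");
  }
  // Legacy location used when slug detection fails: ~/.gstack-dev/evals/
  return path.join(home, ".gstack-dev", "evals");
}
```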

* fix: sync package.json version with VERSION after merge

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add WorktreeManager for isolated test environments

Reusable platform module (lib/worktree.ts) that creates git worktrees
for test isolation and harvests useful changes as patches. Includes
SHA-256 dedup, original SHA tracking for committed change detection,
and automatic gitignored artifact copying (.agents/, browse/dist/).

12 unit tests covering lifecycle, harvest, dedup, and error handling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
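
The SHA-256 dedup mentioned above could look something like the sketch below. `PatchDeduper` is a hypothetical class, not the API of lib/worktree.ts; it only illustrates hashing a patch body and skipping content that has already been seen.

```typescript
import { createHash } from "node:crypto";

// Hypothetical dedup helper: remembers the SHA-256 of every patch it has accepted.
export class PatchDeduper {
  private readonly seen = new Set<string>();

  // Returns true the first time a patch body is offered, false for repeats.
  accept(patch: string): boolean {
    const digest = createHash("sha256").update(patch, "utf8").digest("hex");
    if (this.seen.has(digest)) return false;
    this.seen.add(digest);
    return true;
  }
}
```

Keying on the content hash rather than a filename means two test runs that produce the same diff yield a single patch.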

* feat: integrate worktree isolation into E2E test infrastructure

Add createTestWorktree(), harvestAndCleanup(), and describeWithWorktree()
helpers to e2e-helpers.ts. Add harvest field to EvalTestEntry for
eval-store integration. Register lib/worktree.ts as a global touchfile.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: run Gemini and Codex E2E tests in worktrees

Switch both test suites from cwd: ROOT to worktree isolation.
Gemini (--yolo) no longer pollutes the working tree. Codex
(read-only) gets worktree for consistency. Useful changes are
harvested as patches for cherry-picking.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: skip symlinks in copyDirSync to prevent infinite recursion

Adversarial review caught that .claude/skills/gstack may be a symlink
back to the repo root, causing copyDirSync to recurse infinitely
when copying gitignored artifacts into worktrees.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
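
The fix above amounts to checking for symlinks before recursing. The sketch below is an assumed shape of such a function, not the actual `copyDirSync` from lib/worktree.ts, whose signature is not shown here.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Recursively copy src into dest, skipping every symlink so that a link
// pointing back at an ancestor directory cannot cause unbounded recursion.
export function copyDirSync(src: string, dest: string): void {
  fs.mkdirSync(dest, { recursive: true });
  for (const entry of fs.readdirSync(src, { withFileTypes: true })) {
    // Dirent describes the link itself, so a symlink to a directory
    // reports isSymbolicLink() rather than isDirectory().
    if (entry.isSymbolicLink()) continue;
    const from = path.join(src, entry.name);
    const to = path.join(dest, entry.name);
    if (entry.isDirectory()) {
      copyDirSync(from, to);
    } else if (entry.isFile()) {
      fs.copyFileSync(from, to);
    }
  }
}
```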

* chore: bump version and changelog (v0.11.12.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: relax session-awareness assertion to accept structured options

The LLM consistently presents well-formatted A/B choices with pros/cons
but doesn't always use the exact string "RECOMMENDATION". Accept
case-insensitive "recommend", "option a", "which do you want", or
"which approach" as equivalent signals of a structured recommendation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
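
One way to express the relaxed check is a single case-insensitive pattern over the four phrases listed above. `hasStructuredRecommendation` is a hypothetical name; the actual assertion in the test suite may be written differently.

```typescript
// Any of these phrases counts as a structured recommendation (case-insensitive).
// No global flag, so RegExp.test() stays stateless between calls.
const RECOMMENDATION_SIGNALS =
  /recommend|option a|which do you want|which approach/i;

export function hasStructuredRecommendation(output: string): boolean {
  return RECOMMENDATION_SIGNALS.test(output);
}
```

The exact string "RECOMMENDATION" still passes, because it contains "recommend" once case is ignored.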

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 23:05:22 -07:00

import type { TemplateContext } from './types';
import { COMMAND_DESCRIPTIONS } from '../../browse/src/commands';
import { SNAPSHOT_FLAGS } from '../../browse/src/snapshot';

export function generateCommandReference(_ctx: TemplateContext): string {
  // Group commands by category
  const groups = new Map<string, Array<{ command: string; description: string; usage?: string }>>();
  for (const [cmd, meta] of Object.entries(COMMAND_DESCRIPTIONS)) {
    const list = groups.get(meta.category) || [];
    list.push({ command: cmd, description: meta.description, usage: meta.usage });
    groups.set(meta.category, list);
  }

  // Category display order
  const categoryOrder = [
    'Navigation', 'Reading', 'Interaction', 'Inspection',
    'Visual', 'Snapshot', 'Meta', 'Tabs', 'Server',
  ];

  const sections: string[] = [];
  for (const category of categoryOrder) {
    const commands = groups.get(category);
    if (!commands || commands.length === 0) continue;

    // Sort alphabetically within category
    commands.sort((a, b) => a.command.localeCompare(b.command));

    sections.push(`### ${category}`);
    sections.push('| Command | Description |');
    sections.push('|---------|-------------|');
    for (const cmd of commands) {
      const display = cmd.usage ? `\`${cmd.usage}\`` : `\`${cmd.command}\``;
      sections.push(`| ${display} | ${cmd.description} |`);
    }
    sections.push('');
  }

  return sections.join('\n').trimEnd();
}

export function generateSnapshotFlags(_ctx: TemplateContext): string {
  const lines: string[] = [
    'The snapshot is your primary tool for understanding and interacting with pages.',
    '',
    '```',
  ];

  for (const flag of SNAPSHOT_FLAGS) {
    const label = flag.valueHint ? `${flag.short} ${flag.valueHint}` : flag.short;
    lines.push(`${label.padEnd(10)}${flag.long.padEnd(24)}${flag.description}`);
  }

  lines.push('```');
  lines.push('');
  lines.push('All flags can be combined freely. `-o` only applies when `-a` is also used.');
  lines.push('Example: `$B snapshot -i -a -C -o /tmp/annotated.png`');
  lines.push('');
  lines.push('**Ref numbering:** @e refs are assigned sequentially (@e1, @e2, ...) in tree order.');
  lines.push('@c refs from `-C` are numbered separately (@c1, @c2, ...).');
  lines.push('');
  lines.push('After snapshot, use @refs as selectors in any command:');
  lines.push('```bash');
  lines.push('$B click @e3 $B fill @e4 "value" $B hover @e1');
  lines.push('$B html @e2 $B css @e5 "color" $B attrs @e6');
  lines.push('$B click @c1 # cursor-interactive ref (from -C)');
  lines.push('```');
  lines.push('');
  lines.push('**Output format:** indented accessibility tree with @ref IDs, one element per line.');
  lines.push('```');
  lines.push(' @e1 [heading] "Welcome" [level=1]');
  lines.push(' @e2 [textbox] "Email"');
  lines.push(' @e3 [button] "Submit"');
  lines.push('```');
  lines.push('');
  lines.push('Refs are invalidated on navigation — run `snapshot` again after `goto`.');

  return lines.join('\n');
}

export function generateBrowseSetup(ctx: TemplateContext): string {
  return `## SETUP (run this check BEFORE any browse command)
\`\`\`bash
_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
B=""
[ -n "$_ROOT" ] && [ -x "$_ROOT/${ctx.paths.localSkillRoot}/browse/dist/browse" ] && B="$_ROOT/${ctx.paths.localSkillRoot}/browse/dist/browse"
[ -z "$B" ] && B=${ctx.paths.browseDir}/browse
if [ -x "$B" ]; then
echo "READY: $B"
else
echo "NEEDS_SETUP"
fi
\`\`\`
If \`NEEDS_SETUP\`:
1. Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait.
2. Run: \`cd <SKILL_DIR> && ./setup\`
3. If \`bun\` is not installed: \`curl -fsSL https://bun.sh/install | bash\``;
}