mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-02 03:35:09 +02:00
8ca950f6f1
* feat: token registry for multi-agent browser access Per-agent scoped tokens with read/write/admin/meta command categories, domain glob restrictions, rate limiting, expiry, and revocation. Setup key exchange for the /pair-agent ceremony (5-min one-time key → 24h session token). Idempotent exchange handles tunnel drops. 39 tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: integrate token registry + scoped auth into browse server Server changes for multi-agent browser access: - /connect endpoint: setup key exchange for /pair-agent ceremony - /token endpoint: root-only minting of scoped sub-tokens - /token/:clientId DELETE: revoke agent tokens - /agents endpoint: list connected agents (root-only) - /health: strips root token when tunnel is active (P0 security fix) - /command: scope/rate/domain checks via token registry before dispatch - Idle timer skips shutdown when tunnel is active Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: ngrok tunnel integration + @ngrok/ngrok dependency BROWSE_TUNNEL=1 env var starts an ngrok tunnel after Bun.serve(). Reads NGROK_AUTHTOKEN from env or ~/.gstack/ngrok.env. Reads NGROK_DOMAIN for dedicated domain (stable URL). Updates state file with tunnel URL. Feasibility spike confirmed: SDK works in compiled Bun binary. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: tab isolation for multi-agent browser access Add per-tab ownership tracking to BrowserManager. Scoped agents must create their own tab via newtab before writing. Unowned tabs (pre-existing, user-opened) are root-only for writes. Read access always allowed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: tab enforcement + POST /pair endpoint + activity attribution Server-side tab ownership check blocks scoped agents from writing to unowned tabs. Special-case newtab records ownership for scoped tokens. 
POST /pair endpoint creates setup keys for the pairing ceremony. Activity events now include clientId for attribution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: pair-agent CLI command + instruction block generator One command to pair a remote agent: $B pair-agent. Creates a setup key via POST /pair, prints a copy-pasteable instruction block with curl commands. Smart tunnel fallback (tunnel URL > auto-start > localhost). Flags: --for HOST, --local HOST, --admin, --client NAME. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: tab isolation + instruction block generator tests 14 tests covering tab ownership lifecycle (access checks, unowned tabs, transferTab) and instruction block generator (scopes, URLs, admin flag, troubleshooting section). Fix server-auth test that used fragile sliceBetween boundaries broken by new endpoints. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.15.9.0) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: CSO security fixes — token leak, domain bypass, input validation 1. Remove root token from /health endpoint entirely (CSO #1 CRITICAL). Origin header is spoofable. Extension reads from ~/.gstack/.auth.json. 2. Add domain check for newtab URL (CSO #5). Previously only goto was checked, allowing domain-restricted agents to bypass via newtab. 3. Validate scope values, rateLimit, expiresSeconds in createToken() (CSO #4). Rejects invalid scopes and negative values. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: /pair-agent skill — syntactic sugar for browser sharing Users remember /pair-agent, not $B pair-agent. The skill walks through agent selection (OpenClaw, Hermes, Codex, Cursor, generic), local vs remote setup, tunnel configuration, and includes platform-specific notes for each agent type. Wraps the CLI command with context. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: remote browser access reference for paired agents Full API reference, snapshot→@ref pattern, scopes, tab isolation, error codes, ngrok setup, and same-machine shortcuts. The instruction block points here for deeper reading. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: improved instruction block with snapshot→@ref pattern The paste-into-agent instruction block now teaches the snapshot→@ref workflow (the most powerful browsing pattern), shows the server URL prominently, and uses clearer formatting. Tests updated to match. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: smart ngrok detection + auto-tunnel in pair-agent The pair-agent command now checks ngrok's native config (not just ~/.gstack/ngrok.env) and auto-starts the tunnel when ngrok is available. The skill template walks users through ngrok install and auth if not set up, instead of just printing a dead localhost URL. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: on-demand tunnel start via POST /tunnel/start pair-agent now auto-starts the ngrok tunnel without restarting the server. New POST /tunnel/start endpoint reads authtoken from env, ~/.gstack/ngrok.env, or ngrok's native config. CLI detects ngrok availability and calls the endpoint automatically. Zero manual steps when ngrok is installed and authed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: pair-agent skill must output the instruction block verbatim Added CRITICAL instruction: the agent MUST output the full instruction block so the user can copy it. Previously the agent could summarize over it, leaving the user with nothing to paste. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: scoped tokens rejected on /command — auth gate ordering bug The blanket validateAuth() gate (root-only) sat above the /command endpoint, rejecting all scoped tokens with 401 before they reached getTokenInfo(). Moved /command above the gate so both root and scoped tokens are accepted. This was the bug Wintermute hit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: pair-agent auto-launches headed mode before pairing When pair-agent detects headless mode, it auto-switches to headed (visible Chromium window) so the user can watch what the remote agent does. Use --headless to skip this. Fixed compiled binary path resolution (process.execPath, not process.argv[1] which is virtual /$bunfs/ in Bun compiled binaries). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: comprehensive tests for auth ordering, tunnel, ngrok, headed mode 16 new tests covering: - /command sits above blanket auth gate (Wintermute bug) - /command uses getTokenInfo not validateAuth - /tunnel/start requires root, checks native ngrok config, returns already_active - /pair creates setup keys not session tokens - Tab ownership checked before command dispatch - Activity events include clientId - Instruction block teaches snapshot→@ref pattern - pair-agent auto-headed mode, process.execPath, --headless skip - isNgrokAvailable checks all 3 sources (gstack env, env var, native config) - handlePairAgent calls /tunnel/start not server restart Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: chain scope bypass + /health info leak when tunneled 1. Chain command now pre-validates ALL subcommand scopes before executing any. A read+meta token can no longer escalate to admin via chain (eval, js, cookies were dispatched without scope checks). tokenInfo flows through handleMetaCommand into the chain handler. Rejects entire chain if any subcommand fails. 2. 
/health strips sensitive fields (currentUrl, agent.currentMessage, session) when tunnel is active. Only operational metadata (status, mode, uptime, tabs) exposed to the internet. Previously anyone reaching the ngrok URL could surveil browsing activity. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: tout /pair-agent as headline feature in CHANGELOG + README Lead with what it does for the user: type /pair-agent, paste into your other agent, done. First time AI agents from different companies can coordinate through a shared browser with real security boundaries. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: expand /pair-agent, /design-shotgun, /design-html in README Each skill gets a real narrative paragraph explaining the workflow, not just a table cell. design-shotgun: visual exploration with taste memory. design-html: production HTML with Pretext computed layout. pair-agent: cross-vendor AI agent coordination through shared browser. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: split handleCommand into handleCommandInternal + HTTP wrapper Chain subcommands now route through handleCommandInternal for full security enforcement (scope, domain, tab ownership, rate limiting, content wrapping). Adds recursion guard for nested chains, rate-limit exemption for chain subcommands, and activity event suppression (1 event per chain, not per sub). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add content-security.ts with datamarking, envelope, and filter hooks Four-layer prompt injection defense for pair-agent browser sharing: - Datamarking: session-scoped watermark for text exfiltration detection - Content envelope: trust boundary wrapping with ZWSP marker escaping - Content filter hooks: extensible filter pipeline with warn/block modes - Built-in URL blocklist: requestbin, pipedream, webhook.site, etc. 
BROWSE_CONTENT_FILTER env var controls mode: off|warn|block (default: warn) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: centralize content wrapping in handleCommandInternal response path Single wrapping location replaces fragmented per-handler wrapping: - Scoped tokens: content filters + datamarking + enhanced envelope - Root tokens: existing basic wrapping (backward compat) - Chain subcommands exempt from top-level wrapping (wrapped individually) - Adds 'attrs' to PAGE_CONTENT_COMMANDS (ARIA value exposure defense) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: hidden element stripping for scoped token text extraction Detects CSS-hidden elements (opacity, font-size, off-screen, same-color, clip-path) and ARIA label injection patterns. Marks elements with data-gstack-hidden, extracts text from a clean clone (no DOM mutation), then removes markers. Only active for scoped tokens on text command. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: snapshot split output format for scoped tokens Scoped tokens get a split snapshot: trusted @refs section (for click/fill) separated from untrusted web content in an envelope. Ref names truncated to 50 chars in trusted section. Root tokens unchanged (backward compat). Resume command also uses split format for scoped tokens. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add SECURITY section to pair-agent instruction block Instructs remote agents to treat content inside untrusted envelopes as potentially malicious. Lists common injection phrases to watch for. Directs agents to only use @refs from the trusted INTERACTIVE ELEMENTS section, not from page content. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add 4 prompt injection test fixtures - injection-visible.html: visible injection in product review text - injection-hidden.html: 7 CSS hiding techniques + ARIA injection + false positive - injection-social.html: social engineering in legitimate-looking content - injection-combined.html: all attack types + envelope escape attempt Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: comprehensive content security tests (47 tests) Covers all 4 defense layers: - Datamarking: marker format, session consistency, text-only application - Content envelope: wrapping, ZWSP marker escaping, filter warnings - Content filter hooks: URL blocklist, custom filters, warn/block modes - Instruction block: SECURITY section content, ordering, generation - Centralized wrapping: source-level verification of integration - Chain security: recursion guard, rate-limit exemption, activity suppression - Hidden element stripping: 7 CSS techniques, ARIA injection, false positives - Snapshot split format: scoped vs root output, resume integration Also fixes: visibility:hidden detection, case-insensitive ARIA pattern matching. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: pair-agent skill compliance + fix all 16 pre-existing test failures Root cause: pair-agent was added without completing the gen-skill-docs compliance checklist. All 16 failures traced back to this. Fixes: - Sync package.json version to VERSION (0.15.9.0) - Add "(gstack)" to pair-agent description for discoverability - Add pair-agent to Codex path exception (legitimately documents ~/.codex/) - Add CLI_COMMANDS (status, pair-agent, tunnel) to skill parser allowlist - Regenerate SKILL.md for all hosts (claude, codex, factory, kiro, etc.) 
- Update golden file baselines for ship skill - Fix relink tests: pass GSTACK_INSTALL_DIR to auto-relink calls so they use the fast mock install instead of scanning real ~/.claude/skills/gstack Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.15.12.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: E2E exit reason precedence + worktree prune race condition Two fixes for E2E test reliability: 1. session-runner.ts: error_max_turns was misclassified as error_api because is_error flag was checked before subtype. Now known subtypes like error_max_turns are preserved even when is_error is set. The is_error override only applies when subtype=success (API failure). 2. worktree.ts: pruneStale() now skips worktrees < 1 hour old to avoid deleting worktrees from concurrent test runs still in progress. Previously any second test execution would kill the first's worktrees. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: restore token in /health for localhost extension auth The CSO security fix stripped the token from /health to prevent leaking when tunneled. But the extension needs it to authenticate on localhost. Now returns token only when not tunneled (safe: localhost-only path). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: verify /health token is localhost-only, never served through tunnel Updated tests to match the restored token behavior: - Test 1: token assignment exists AND is inside the !tunnelActive guard - Test 1b: tunnel branch (else block) does not contain AUTH_TOKEN Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add security rationale for token in /health on localhost Explains why this is an accepted risk (no escalation over file-based token access), CORS protection, and tunnel guard. Prevents future CSO scans from stripping it without providing an alternative auth path. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: verify tunnel is alive before returning URL to pair-agent Root cause: when ngrok dies externally (pkill, crash, timeout), the server still reports tunnelActive=true with a dead URL. pair-agent prints an instruction block pointing at a dead tunnel. The remote agent gets "endpoint offline" and the user has to manually restart everything. Three-layer fix: - Server /pair endpoint: probes tunnel URL before returning it. If dead, resets tunnelActive/tunnelUrl and returns null (triggers CLI restart). - Server /tunnel/start: probes cached tunnel before returning already_active. If dead, falls through to restart ngrok automatically. - CLI pair-agent: double-checks tunnel URL from server before printing instruction block. Falls through to auto-start on failure. 4 regression tests verify all three probe points + CLI verification. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add POST /batch endpoint for multi-command batching Remote agents controlling GStack Browser through a tunnel pay 2-5s of latency per HTTP round-trip. A typical "navigate and read" takes 4 sequential commands = 10-20 seconds. The /batch endpoint collapses N commands into a single HTTP round-trip, cutting a 20-tab crawl from ~60s to ~5s. Sequential execution through the full security pipeline (scope, domain, tab ownership, content wrapping). Rate limiting counts the batch as 1 request. Activity events emitted at batch level, not per-command. Max 50 commands per batch. Nested batches rejected. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add source-level security tests for /batch endpoint 8 tests verifying: auth gate placement, scoped token support, max command limit, nested batch rejection, rate limiting bypass, batch-level activity events, command field validation, and tabId passthrough. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: correct CHANGELOG date from 2026-04-06 to 2026-04-05 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: consolidate Hermes into generic HTTP option in pair-agent Hermes doesn't have a host-specific config — it uses the same generic curl instructions as any other agent. Removing the dedicated option simplifies the menu and eliminates a misleading distinction. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump VERSION to 0.15.14.0, add CHANGELOG entry for batch endpoint Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate pair-agent/SKILL.md after main merge Vendoring deprecation section from main's template wasn't reflected in the generated file. Fixes check-freshness CI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: checkTabAccess uses options object, add own-only tab policy Refactors checkTabAccess(tabId, clientId, isWrite) to use an options object { isWrite?, ownOnly? }. Adds tabPolicy === 'own-only' support in the server command dispatch — scoped tokens with this policy are restricted to their own tabs for all commands, not just writes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add --domain flag to pair-agent CLI for domain restrictions Allows passing --domain to pair-agent to restrict the remote agent's navigation to specific domains (comma-separated). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * revert: remove batch commands CHANGELOG entry and VERSION bump The batch endpoint work belongs on the browser-batch-multitab branch (port-louis), not this branch. Reverting VERSION to 0.15.14.0. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: adopt main's headed-mode /health token serving Our merge kept the old !tunnelActive guard which conflicted with main's security-audit-r2 tests that require no currentUrl/currentMessage in /health. Adopts main's approach: serve token conditionally based on headed mode or chrome-extension origin. Updates server-auth tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: improve snapshot flags docs completeness for LLM judge Adds $B placeholder explanation, explicit syntax line, and detailed flag behavior (-d depth values, -s CSS selector syntax, -D unified diff format and baseline persistence, -a screenshot vs text output relationship). Fixes snapshot flags reference LLM eval scoring completeness < 4. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
516 lines
22 KiB
TypeScript
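The tests in this file check one layout invariant: every skill in the skills directory is a real directory containing a `SKILL.md` symlink, never a directory symlink, and skills whose real name already starts with `gstack-` keep that name in both modes. The `gstack-relink` script itself is not shown here, so the following is only a minimal sketch of that layout under stated assumptions. The helper name `exposeSkill` and its parameters are hypothetical and are not part of gstack.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical helper (an assumption, not gstack's implementation):
// expose one skill as a real directory holding a SKILL.md symlink.
function exposeSkill(
  installDir: string,
  skillsDir: string,
  skill: string,
  prefix: boolean,
): string {
  // A skill whose real name already starts with "gstack-" keeps that name.
  const name = prefix && !skill.startsWith('gstack-') ? `gstack-${skill}` : skill;
  const dest = path.join(skillsDir, name);

  // Remove any previous entry. For an old-style directory symlink this
  // removes the link itself, not the directory it points to.
  fs.rmSync(dest, { recursive: true, force: true });

  // Create a real directory, then link only the SKILL.md file into it.
  fs.mkdirSync(dest, { recursive: true });
  fs.symlinkSync(path.join(installDir, skill, 'SKILL.md'), path.join(dest, 'SKILL.md'));
  return dest;
}
```

Calling `exposeSkill(installDir, skillsDir, 'qa', false)` in this sketch would yield `skillsDir/qa/SKILL.md` as a symlink inside a real `qa` directory, which is the shape the `lstatSync` assertions below look for.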
import { describe, test, expect, beforeEach, afterEach } from 'bun:test';
import { execSync } from 'child_process';
import * as fs from 'fs';
import * as path from 'path';
import * as os from 'os';

const ROOT = path.resolve(import.meta.dir, '..');
const BIN = path.join(ROOT, 'bin');

let tmpDir: string;
let skillsDir: string;
let installDir: string;

function run(cmd: string, env: Record<string, string> = {}, expectFail = false): string {
  try {
    return execSync(cmd, {
      cwd: ROOT,
      env: { ...process.env, GSTACK_STATE_DIR: tmpDir, ...env },
      encoding: 'utf-8',
      timeout: 10000,
      stdio: ['pipe', 'pipe', 'pipe'],
    }).trim();
  } catch (e: any) {
    if (expectFail) return (e.stderr || e.stdout || '').toString().trim();
    throw e;
  }
}

// Create a mock gstack install directory with skill subdirs
function setupMockInstall(skills: string[]): void {
  installDir = path.join(tmpDir, 'gstack-install');
  skillsDir = path.join(tmpDir, 'skills');
  fs.mkdirSync(installDir, { recursive: true });
  fs.mkdirSync(skillsDir, { recursive: true });

  // Copy the real gstack-config and gstack-relink to the mock install
  const mockBin = path.join(installDir, 'bin');
  fs.mkdirSync(mockBin, { recursive: true });
  fs.copyFileSync(path.join(BIN, 'gstack-config'), path.join(mockBin, 'gstack-config'));
  fs.chmodSync(path.join(mockBin, 'gstack-config'), 0o755);
  if (fs.existsSync(path.join(BIN, 'gstack-relink'))) {
    fs.copyFileSync(path.join(BIN, 'gstack-relink'), path.join(mockBin, 'gstack-relink'));
    fs.chmodSync(path.join(mockBin, 'gstack-relink'), 0o755);
  }
  if (fs.existsSync(path.join(BIN, 'gstack-patch-names'))) {
    fs.copyFileSync(path.join(BIN, 'gstack-patch-names'), path.join(mockBin, 'gstack-patch-names'));
    fs.chmodSync(path.join(mockBin, 'gstack-patch-names'), 0o755);
  }

  // Create mock skill directories with proper frontmatter
  for (const skill of skills) {
    fs.mkdirSync(path.join(installDir, skill), { recursive: true });
    fs.writeFileSync(
      path.join(installDir, skill, 'SKILL.md'),
      `---\nname: ${skill}\ndescription: test\n---\n# ${skill}`
    );
  }
}

beforeEach(() => {
  tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'gstack-relink-test-'));
});

afterEach(() => {
  fs.rmSync(tmpDir, { recursive: true, force: true });
});

describe('gstack-relink (#578)', () => {
  // Test 11: prefixed symlinks when skill_prefix=true
  test('creates gstack-* symlinks when skill_prefix=true', () => {
    setupMockInstall(['qa', 'ship', 'review']);
    // Set config to prefix mode (pass install/skills env so auto-relink uses mock install)
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix true`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    // Run relink with env pointing to the mock install
    const output = run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    // Verify gstack-* symlinks exist
    expect(fs.existsSync(path.join(skillsDir, 'gstack-qa'))).toBe(true);
    expect(fs.existsSync(path.join(skillsDir, 'gstack-ship'))).toBe(true);
    expect(fs.existsSync(path.join(skillsDir, 'gstack-review'))).toBe(true);
    expect(output).toContain('gstack-');
  });

  // Test 12: flat symlinks when skill_prefix=false
  test('creates flat symlinks when skill_prefix=false', () => {
    setupMockInstall(['qa', 'ship', 'review']);
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix false`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    const output = run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    expect(fs.existsSync(path.join(skillsDir, 'qa'))).toBe(true);
    expect(fs.existsSync(path.join(skillsDir, 'ship'))).toBe(true);
    expect(fs.existsSync(path.join(skillsDir, 'review'))).toBe(true);
    expect(output).toContain('flat');
  });

  // REGRESSION: unprefixed skills must be real directories, not symlinks (#761)
  // Claude Code auto-prefixes skills nested under a parent dir symlink.
  // e.g., `qa -> gstack/qa` gets discovered as "gstack-qa", not "qa".
  // The fix: create real directories with SKILL.md symlinks inside.
  test('unprefixed skills are real directories with SKILL.md symlinks, not dir symlinks', () => {
    setupMockInstall(['qa', 'ship', 'review', 'plan-ceo-review']);
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix false`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    for (const skill of ['qa', 'ship', 'review', 'plan-ceo-review']) {
      const skillPath = path.join(skillsDir, skill);
      const skillMdPath = path.join(skillPath, 'SKILL.md');
      // Must be a real directory, NOT a symlink
      expect(fs.lstatSync(skillPath).isDirectory()).toBe(true);
      expect(fs.lstatSync(skillPath).isSymbolicLink()).toBe(false);
      // Must contain a SKILL.md that IS a symlink
      expect(fs.existsSync(skillMdPath)).toBe(true);
      expect(fs.lstatSync(skillMdPath).isSymbolicLink()).toBe(true);
      // The SKILL.md symlink must point to the source skill's SKILL.md
      const target = fs.readlinkSync(skillMdPath);
      expect(target).toContain(skill);
      expect(target).toEndWith('/SKILL.md');
    }
  });

  // Same invariant for prefixed mode
  test('prefixed skills are real directories with SKILL.md symlinks, not dir symlinks', () => {
    setupMockInstall(['qa', 'ship']);
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix true`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    for (const skill of ['gstack-qa', 'gstack-ship']) {
      const skillPath = path.join(skillsDir, skill);
      const skillMdPath = path.join(skillPath, 'SKILL.md');
      expect(fs.lstatSync(skillPath).isDirectory()).toBe(true);
      expect(fs.lstatSync(skillPath).isSymbolicLink()).toBe(false);
      expect(fs.lstatSync(skillMdPath).isSymbolicLink()).toBe(true);
    }
  });

  // Upgrade: old directory symlinks get replaced with real directories
  test('upgrades old directory symlinks to real directories', () => {
    setupMockInstall(['qa', 'ship']);
    // Simulate old behavior: create directory symlinks (the old pattern)
    fs.symlinkSync(path.join(installDir, 'qa'), path.join(skillsDir, 'qa'));
    fs.symlinkSync(path.join(installDir, 'ship'), path.join(skillsDir, 'ship'));
    // Verify they start as symlinks
    expect(fs.lstatSync(path.join(skillsDir, 'qa')).isSymbolicLink()).toBe(true);

    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix false`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });

    // After relink: must be real directories, not symlinks
    expect(fs.lstatSync(path.join(skillsDir, 'qa')).isSymbolicLink()).toBe(false);
    expect(fs.lstatSync(path.join(skillsDir, 'qa')).isDirectory()).toBe(true);
    expect(fs.lstatSync(path.join(skillsDir, 'qa', 'SKILL.md')).isSymbolicLink()).toBe(true);
  });

  // FIRST INSTALL: --no-prefix must create ONLY flat names, zero gstack-* pollution
  test('first install --no-prefix: only flat names exist, zero gstack-* entries', () => {
    setupMockInstall(['qa', 'ship', 'review', 'plan-ceo-review', 'gstack-upgrade']);
    // Simulate first install: no saved config, pass --no-prefix equivalent
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix false`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    // Enumerate everything in skills dir
    const entries = fs.readdirSync(skillsDir);
    // Expected: qa, ship, review, plan-ceo-review, gstack-upgrade (its real name)
    expect(entries.sort()).toEqual(['gstack-upgrade', 'plan-ceo-review', 'qa', 'review', 'ship']);
    // No gstack-qa, gstack-ship, gstack-review, gstack-plan-ceo-review
    const leaked = entries.filter(e => e.startsWith('gstack-') && e !== 'gstack-upgrade');
    expect(leaked).toEqual([]);
  });

  // FIRST INSTALL: --prefix must create ONLY gstack-* names, zero flat-name pollution
  test('first install --prefix: only gstack-* entries exist, zero flat names', () => {
    setupMockInstall(['qa', 'ship', 'review', 'plan-ceo-review', 'gstack-upgrade']);
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix true`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    const entries = fs.readdirSync(skillsDir);
    // Expected: gstack-qa, gstack-ship, gstack-review, gstack-plan-ceo-review, gstack-upgrade
    expect(entries.sort()).toEqual([
      'gstack-plan-ceo-review', 'gstack-qa', 'gstack-review', 'gstack-ship', 'gstack-upgrade',
    ]);
    // No unprefixed qa, ship, review, plan-ceo-review
    const leaked = entries.filter(e => !e.startsWith('gstack-'));
    expect(leaked).toEqual([]);
  });

  // FIRST INSTALL: non-TTY (no saved config, piped stdin) defaults to flat names
  test('non-TTY first install defaults to flat names via relink', () => {
    setupMockInstall(['qa', 'ship']);
    // Don't set any config — simulate fresh install
    // gstack-relink reads config; on fresh install config returns empty → defaults to false
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    const entries = fs.readdirSync(skillsDir);
    // Should be flat names (relink defaults to false when config returns empty)
    expect(entries.sort()).toEqual(['qa', 'ship']);
  });

// SWITCH: prefix → no-prefix must clean up ALL gstack-* entries
|
|
test('switching prefix to no-prefix removes all gstack-* entries completely', () => {
|
|
setupMockInstall(['qa', 'ship', 'review', 'plan-ceo-review', 'gstack-upgrade']);
|
|
// Start in prefix mode
|
|
run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix true`, {
|
|
GSTACK_INSTALL_DIR: installDir,
|
|
GSTACK_SKILLS_DIR: skillsDir,
|
|
});
|
|
run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
|
|
GSTACK_INSTALL_DIR: installDir,
|
|
GSTACK_SKILLS_DIR: skillsDir,
|
|
});
|
|
let entries = fs.readdirSync(skillsDir);
|
|
expect(entries.filter(e => !e.startsWith('gstack-'))).toEqual([]);
|
|
|
|
// Switch to no-prefix
|
|
run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix false`, {
|
|
GSTACK_INSTALL_DIR: installDir,
|
|
GSTACK_SKILLS_DIR: skillsDir,
|
|
});
|
|
run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
|
|
GSTACK_INSTALL_DIR: installDir,
|
|
GSTACK_SKILLS_DIR: skillsDir,
|
|
});
|
|
entries = fs.readdirSync(skillsDir);
|
|
// Only flat names + gstack-upgrade (its real name)
|
|
expect(entries.sort()).toEqual(['gstack-upgrade', 'plan-ceo-review', 'qa', 'review', 'ship']);
|
|
const leaked = entries.filter(e => e.startsWith('gstack-') && e !== 'gstack-upgrade');
|
|
expect(leaked).toEqual([]);
|
|
});
|
|
|
|
  // SWITCH: no-prefix → prefix must clean up ALL flat entries
  test('switching no-prefix to prefix removes all flat entries completely', () => {
    setupMockInstall(['qa', 'ship', 'review', 'gstack-upgrade']);
    // Start in no-prefix mode
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix false`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    let entries = fs.readdirSync(skillsDir);
    expect(entries.filter(e => e.startsWith('gstack-') && e !== 'gstack-upgrade')).toEqual([]);

    // Switch to prefix
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix true`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    entries = fs.readdirSync(skillsDir);
    // Only gstack-* names
    expect(entries.sort()).toEqual([
      'gstack-qa', 'gstack-review', 'gstack-ship', 'gstack-upgrade',
    ]);
    const leaked = entries.filter(e => !e.startsWith('gstack-'));
    expect(leaked).toEqual([]);
  });

  // Test 13: cleans stale symlinks from opposite mode
  test('cleans up stale symlinks from opposite mode', () => {
    setupMockInstall(['qa', 'ship']);
    // Create prefixed symlinks first
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix true`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    expect(fs.existsSync(path.join(skillsDir, 'gstack-qa'))).toBe(true);

    // Switch to flat mode
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix false`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });

    // Flat symlinks should exist, prefixed should be gone
    expect(fs.existsSync(path.join(skillsDir, 'qa'))).toBe(true);
    expect(fs.existsSync(path.join(skillsDir, 'gstack-qa'))).toBe(false);
  });

  // Test 14: error when install dir missing
  test('prints error when install dir missing', () => {
    const output = run(`${BIN}/gstack-relink`, {
      GSTACK_INSTALL_DIR: '/nonexistent/path/gstack',
      GSTACK_SKILLS_DIR: '/nonexistent/path/skills',
    }, true);
    expect(output).toContain('setup');
  });

  // Test: gstack-upgrade does NOT get double-prefixed
  test('does not double-prefix gstack-upgrade directory', () => {
    setupMockInstall(['qa', 'ship', 'gstack-upgrade']);
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix true`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    // gstack-upgrade should keep its name, NOT become gstack-gstack-upgrade
    expect(fs.existsSync(path.join(skillsDir, 'gstack-upgrade'))).toBe(true);
    expect(fs.existsSync(path.join(skillsDir, 'gstack-gstack-upgrade'))).toBe(false);
    // Regular skills still get prefixed
    expect(fs.existsSync(path.join(skillsDir, 'gstack-qa'))).toBe(true);
  });

  // Test 15: gstack-config set skill_prefix triggers relink
  test('gstack-config set skill_prefix triggers relink', () => {
    setupMockInstall(['qa', 'ship']);
    // Run gstack-config set which should auto-trigger relink
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix true`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    // If relink was triggered, symlinks should exist
    expect(fs.existsSync(path.join(skillsDir, 'gstack-qa'))).toBe(true);
    expect(fs.existsSync(path.join(skillsDir, 'gstack-ship'))).toBe(true);
  });
});

describe('upgrade migrations', () => {
  const MIGRATIONS_DIR = path.join(ROOT, 'gstack-upgrade', 'migrations');

  test('migrations directory exists', () => {
    expect(fs.existsSync(MIGRATIONS_DIR)).toBe(true);
  });

  test('all migration scripts are executable and parse without syntax errors', () => {
    const scripts = fs.readdirSync(MIGRATIONS_DIR).filter(f => f.endsWith('.sh'));
    expect(scripts.length).toBeGreaterThan(0);
    for (const script of scripts) {
      const fullPath = path.join(MIGRATIONS_DIR, script);
      // Must be executable
      const stat = fs.statSync(fullPath);
      expect(stat.mode & 0o111).toBeGreaterThan(0);
      // Must parse without syntax errors (bash -n is a syntax check, doesn't execute).
      // execSync throws on a non-zero exit, so a syntax error fails the test here.
      execSync(`bash -n "${fullPath}" 2>&1`, { encoding: 'utf-8', timeout: 5000 });
    }
  });

  test('migration filenames follow v{VERSION}.sh pattern', () => {
    const scripts = fs.readdirSync(MIGRATIONS_DIR).filter(f => f.endsWith('.sh'));
    for (const script of scripts) {
      expect(script).toMatch(/^v\d+\.\d+\.\d+\.\d+\.sh$/);
    }
  });

  test('v0.15.2.0 migration runs gstack-relink', () => {
    const content = fs.readFileSync(path.join(MIGRATIONS_DIR, 'v0.15.2.0.sh'), 'utf-8');
    expect(content).toContain('gstack-relink');
  });

  test('v0.15.2.0 migration fixes stale directory symlinks', () => {
    setupMockInstall(['qa', 'ship', 'review']);
    // Simulate old state: directory symlinks (pre-v0.15.2.0 pattern)
    fs.symlinkSync(path.join(installDir, 'qa'), path.join(skillsDir, 'qa'));
    fs.symlinkSync(path.join(installDir, 'ship'), path.join(skillsDir, 'ship'));
    fs.symlinkSync(path.join(installDir, 'review'), path.join(skillsDir, 'review'));
    // Set no-prefix mode (suppress auto-relink so symlinks stay intact for the test)
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix false`, {
      GSTACK_SETUP_RUNNING: '1',
    });
    // Verify old state: symlinks
    expect(fs.lstatSync(path.join(skillsDir, 'qa')).isSymbolicLink()).toBe(true);

    // Run the migration (it calls gstack-relink internally)
    run(`bash ${path.join(MIGRATIONS_DIR, 'v0.15.2.0.sh')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });

    // After migration: real directories with SKILL.md symlinks
    for (const skill of ['qa', 'ship', 'review']) {
      const skillPath = path.join(skillsDir, skill);
      expect(fs.lstatSync(skillPath).isSymbolicLink()).toBe(false);
      expect(fs.lstatSync(skillPath).isDirectory()).toBe(true);
      expect(fs.lstatSync(path.join(skillPath, 'SKILL.md')).isSymbolicLink()).toBe(true);
    }
  });
});

describe('gstack-patch-names (#620/#578)', () => {
  // Helper to read name: from SKILL.md frontmatter
  function readSkillName(skillDir: string): string | null {
    const content = fs.readFileSync(path.join(skillDir, 'SKILL.md'), 'utf-8');
    const match = content.match(/^name:\s*(.+)$/m);
    return match ? match[1].trim() : null;
  }

  test('prefix=true patches name: field in SKILL.md', () => {
    setupMockInstall(['qa', 'ship', 'review']);
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix true`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    // Verify name: field is patched with gstack- prefix
    expect(readSkillName(path.join(installDir, 'qa'))).toBe('gstack-qa');
    expect(readSkillName(path.join(installDir, 'ship'))).toBe('gstack-ship');
    expect(readSkillName(path.join(installDir, 'review'))).toBe('gstack-review');
  });

  test('prefix=false restores name: field in SKILL.md', () => {
    setupMockInstall(['qa', 'ship']);
    // First, prefix them
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix true`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    expect(readSkillName(path.join(installDir, 'qa'))).toBe('gstack-qa');
    // Now switch to flat mode
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix false`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    // Verify name: field is restored to unprefixed
    expect(readSkillName(path.join(installDir, 'qa'))).toBe('qa');
    expect(readSkillName(path.join(installDir, 'ship'))).toBe('ship');
  });

  test('gstack-upgrade name: not double-prefixed', () => {
    setupMockInstall(['qa', 'gstack-upgrade']);
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix true`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    // gstack-upgrade should keep its name, NOT become gstack-gstack-upgrade
    expect(readSkillName(path.join(installDir, 'gstack-upgrade'))).toBe('gstack-upgrade');
    // Regular skill should be prefixed
    expect(readSkillName(path.join(installDir, 'qa'))).toBe('gstack-qa');
  });

  test('SKILL.md without frontmatter is a no-op', () => {
    setupMockInstall(['qa']);
    // Overwrite qa SKILL.md with no frontmatter
    fs.writeFileSync(path.join(installDir, 'qa', 'SKILL.md'), '# qa\nSome content.');
    run(`${path.join(installDir, 'bin', 'gstack-config')} set skill_prefix true`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    // Should not crash
    run(`${path.join(installDir, 'bin', 'gstack-relink')}`, {
      GSTACK_INSTALL_DIR: installDir,
      GSTACK_SKILLS_DIR: skillsDir,
    });
    // Content should be unchanged (no name: to patch)
    const content = fs.readFileSync(path.join(installDir, 'qa', 'SKILL.md'), 'utf-8');
    expect(content).toBe('# qa\nSome content.');
  });
});