gstack/browse/test/gstack-config.test.ts
Garry Tan 66c09644a7 feat: composable skills — INVOKE_SKILL resolver + factoring infrastructure (v0.13.7.0) (#644)
* feat: add parameterized resolver support to gen-skill-docs

Extend the placeholder regex from {{WORD}} to {{WORD:arg1:arg2}},
enabling parameterized resolvers like {{INVOKE_SKILL:plan-ceo-review}}.

- Widen ResolverFn type to accept optional args?: string[]
- Update RESOLVERS record to use ResolverFn type
- Both replacement and unresolved-check regexes updated
- Fully backward compatible: existing {{WORD}} patterns unchanged
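
A minimal sketch of how a widened placeholder regex could dispatch to parameterized resolvers. This is not the gen-skill-docs code: the `GREETING` resolver, the `render` and `findUnresolved` helpers, and the exact regex are assumptions made for illustration. Only the `ResolverFn` and `RESOLVERS` names and the `{{WORD:arg1:arg2}}` syntax come from the commit message.

```typescript
// Hypothetical sketch: resolver entries and helper names are illustrative only.
type ResolverFn = (args?: string[]) => string;

const RESOLVERS: Record<string, ResolverFn> = {
  GREETING: () => 'hello',
  INVOKE_SKILL: (args) => `Read the ${args?.[0] ?? 'unknown'} skill`,
};

// Matches {{WORD}} as before, plus {{WORD:arg1:arg2}} (assumed pattern).
const PLACEHOLDER = /\{\{([A-Z_]+)((?::[^{}:]+)*)\}\}/g;

function render(template: string): string {
  return template.replace(PLACEHOLDER, (match: string, name: string, rawArgs: string) => {
    const resolver = RESOLVERS[name];
    // Unknown names are left in place so a later check can flag them.
    if (!resolver) return match;
    // rawArgs looks like ":a:b"; drop the leading colon before splitting.
    const args = rawArgs ? rawArgs.slice(1).split(':') : undefined;
    return resolver(args);
  });
}

// The same regex doubles as the unresolved-placeholder check.
function findUnresolved(text: string): string[] {
  return text.match(PLACEHOLDER) ?? [];
}
```

Because the argument group is optional, a plain `{{GREETING}}` reaches its resolver with `args` undefined, which is one way to keep existing placeholders backward compatible.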

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add INVOKE_SKILL resolver for composable skill loading

New composition.ts resolver module that emits prose instructing Claude
to read another skill's SKILL.md and follow it, skipping preamble
sections. Supports optional skip= parameter for additional sections.

Usage: {{INVOKE_SKILL:plan-ceo-review}} or
       {{INVOKE_SKILL:plan-ceo-review:skip=Outside Voice}}
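
As a rough illustration of the two usage forms above, a resolver of this kind might turn its colon-separated arguments into loading prose. The function name `invokeSkillProse`, the `Preamble` default, the `<skill>/SKILL.md` path layout, and the emitted wording are all assumptions; the actual output of composition.ts is not shown in this commit message.

```typescript
// Hypothetical sketch: default skip list, path layout, and wording are assumed.
const DEFAULT_SKIP: string[] = ['Preamble'];

function invokeSkillProse(args: string[] = []): string {
  const [skill, ...options] = args;
  if (!skill) {
    throw new Error('INVOKE_SKILL needs a skill name, e.g. {{INVOKE_SKILL:office-hours}}');
  }
  // Each "skip=Section Name" option adds one section to the skip list.
  const extraSkips = options
    .map((opt) => /^skip=(.+)$/.exec(opt)?.[1]?.trim() ?? '')
    .filter((section) => section.length > 0);
  const skipped = [...DEFAULT_SKIP, ...extraSkips];
  return [
    `Read ${skill}/SKILL.md and follow its instructions.`,
    `Skip these sections: ${skipped.join(', ')}.`,
  ].join('\n');
}
```

Under these assumptions, `invokeSkillProse(['plan-ceo-review', 'skip=Outside Voice'])` yields prose that names `plan-ceo-review/SKILL.md` and skips both `Preamble` and `Outside Voice`.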

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: use frontmatter name: for skill symlinks and Codex paths

Patch all 3 name-derivation paths to read name: from SKILL.md
frontmatter instead of relying solely on directory basenames.
This enables directory names that differ from invocation names
(e.g., run-tests/ directory with name: test).

- setup: link_claude_skill_dirs reads name: via grep, falls back to basename
- gen-skill-docs.ts: codexSkillName uses frontmatter name for Codex output paths
- gen-skill-docs.ts: moved frontmatter extraction before Codex path logic
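
The name-derivation rule described above (frontmatter `name:` first, directory basename as fallback) can be sketched as follows. The helper name `skillName` and both regexes are assumptions; the real setup script does this with grep, and gen-skill-docs.ts has its own frontmatter extraction.

```typescript
// Hypothetical sketch: prefer frontmatter `name:`, fall back to the directory basename.
function skillName(skillDir: string, skillMd: string): string {
  // Frontmatter is assumed to be the block between the leading pair of `---` lines.
  const frontmatter = /^---\r?\n([\s\S]*?)\r?\n---/.exec(skillMd)?.[1] ?? '';
  const declared = /^name:\s*(.+)$/m.exec(frontmatter)?.[1]?.trim();
  const basename = skillDir.split('/').filter((part) => part.length > 0).pop() ?? '';
  return declared || basename;
}
```

With the example from the commit message, a `run-tests/` directory whose SKILL.md declares `name: test` resolves to `test`, while a SKILL.md without frontmatter still resolves to `run-tests`.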

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: extract CHANGELOG_WORKFLOW resolver from /ship

Move changelog generation logic into a reusable resolver. The resolver
is changelog-only (no version bump per Codex review recommendation).
Adds voice rules inline. /ship Step 5 now uses {{CHANGELOG_WORKFLOW}}.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: use INVOKE_SKILL resolver for plan-ceo-review office-hours fallback

Replace inline skill loading prose (read file, skip sections) with
{{INVOKE_SKILL:office-hours}} in the mid-session detection path.
The BENEFITS_FROM prerequisite offer is unchanged (separate use case).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: BENEFITS_FROM resolver delegates to INVOKE_SKILL

Eliminate duplicated skip-list logic by having generateBenefitsFrom
call generateInvokeSkill internally. The wrapper (AskUserQuestion,
design doc re-check) stays in BENEFITS_FROM. The loading instructions
(read file, skip sections, error handling) come from INVOKE_SKILL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add resolver tests for INVOKE_SKILL, CHANGELOG_WORKFLOW, parameterized args

12 new tests covering:
- INVOKE_SKILL: template placeholder, default skip list, error handling,
  BENEFITS_FROM delegation
- CHANGELOG_WORKFLOW: content, cross-check, voice guidance, format
- Parameterized resolver infra: colon-separated args processing,
  no unresolved placeholders across all generated SKILL.md files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.13.7.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: journey routing tests — CLAUDE.md routing rules + stronger descriptions

Three journey E2E tests (ideation, ship, debug) were failing because
Claude answered directly instead of invoking the Skill tool. Root cause:
skill descriptions in system-reminder are too weak to override Claude's
default behavior for tasks it can handle natively.

Fix has two parts:
1. CLAUDE.md routing rules in test workdir — Claude weighs project-level
   instructions higher than skill description metadata
2. "Proactively invoke" (not "suggest") in office-hours, investigate,
   ship descriptions — reinforces the routing signal

10/10 journey tests now pass (was 7/10).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: one-time CLAUDE.md routing injection prompt

Add a preamble section that checks if the project's CLAUDE.md has
skill routing rules. If not (and user hasn't declined), asks once
via AskUserQuestion to inject a "## Skill routing" section.

Root cause: skill descriptions in system-reminder metadata are too
weak to reliably trigger proactive Skill tool invocation. CLAUDE.md
project instructions carry higher weight in Claude's decision making.

- Preamble bash checks for "## Skill routing" in CLAUDE.md
- Stores decline in gstack-config (routing_declined=true)
- Only asks once per project (HAS_ROUTING check + config check)
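
The "ask once" decision above combines two checks, which could be expressed as a small predicate. The real preamble does this in bash; the function name `shouldOfferRouting` and its parameters are hypothetical, and `routingDeclined` stands for whatever `gstack-config get routing_declined` prints.

```typescript
// Hypothetical sketch of the HAS_ROUTING check combined with the config check.
function shouldOfferRouting(claudeMd: string | null, routingDeclined: string): boolean {
  // claudeMd is null when the project has no CLAUDE.md at all.
  const hasRouting = claudeMd !== null && /^## Skill routing\s*$/m.test(claudeMd);
  const declined = routingDeclined === 'true';
  return !hasRouting && !declined;
}
```

The prompt is offered only when the heading is absent and no decline has been recorded, so either injecting the section or declining silences it for later sessions.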

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: annotated config file + routing injection tests

gstack-config now writes a documented header on first config creation
with every supported key explained (proactive, telemetry, auto_upgrade,
skill_prefix, routing_declined, codex_reviews, skip_eng_review, etc.).
Users can edit ~/.gstack/config.yaml directly, anytime.

Also fixes grep to use ^KEY: anchoring so commented header lines don't
shadow real config values.
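
To illustrate why the `^KEY:` anchor matters, here is a TypeScript rendition of an anchored lookup. It is not the bash script itself: the function name `getConfigValue` is hypothetical, and the key validation rule is inferred from the tests (which expect an error mentioning "alphanumeric").

```typescript
// Hypothetical sketch: anchored key lookup that ignores commented header lines.
function getConfigValue(content: string, key: string): string {
  // Rejecting regex metacharacters keeps the key safe to interpolate below.
  if (!/^[A-Za-z0-9_]+$/.test(key)) {
    throw new Error('key must be alphanumeric (underscores allowed)');
  }
  // The ^ anchor means "# telemetry: anonymous" in the header never matches.
  const anchored = new RegExp(`^${key}:\\s*(.*)$`);
  let value = '';
  for (const line of content.split('\n')) {
    const match = anchored.exec(line);
    if (match) value = (match[1] ?? '').trim(); // last occurrence wins
  }
  return value;
}
```

Without the anchor, a documented example such as `# telemetry: anonymous` would match a search for `telemetry:` and could be returned in place of the real setting.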

Tests added:
- 7 new gstack-config tests (annotated header, no duplication, comment
  safety, routing_declined get/set/reset)
- 6 new gen-skill-docs tests (preamble routing injection: bash checks,
  config reads, AskUserQuestion, decline persistence, routing rules)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump to v0.13.9.0, separate CHANGELOG from main's releases

Split our branch's changes into a new 0.13.9.0 entry instead of
jamming them into 0.13.7.0 which already landed on main as
"Community Wave."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: clarify branch-scoped VERSION/CHANGELOG after merging main

Add explicit rules: merging main doesn't mean adopting main's version.
Branch always gets its own entry on top with a higher version number.
Three-point checklist after every merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: put our 0.13.9.0 entry on top of CHANGELOG

Newest version goes on top. Our branch lands next, so our entry
must be above main's 0.13.8.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: restore missing 0.13.7.0 Community Wave entry

Accidentally dropped the 0.13.7.0 entry when reordering.
All entries now present: 0.13.9.0 > 0.13.8.0 > 0.13.7.0 > 0.13.6.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add CHANGELOG integrity check rule

After any edit that moves/adds/removes entries, grep for version
headers and verify no gaps or duplicates before committing.
Prevents accidentally dropping entries during reordering.
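
A check along these lines could be automated. The sketch below assumes version headers of the form `## [0.13.9.0]`, which is a guess about the CHANGELOG format, and the names `compareVersions` and `checkChangelog` are hypothetical. It flags duplicates and ordering mistakes; it cannot detect a dropped entry on its own, since the file does not say which versions ought to exist.

```typescript
// Hypothetical sketch: assumes headers like "## [0.13.9.0]", newest first.
function compareVersions(a: string, b: string): number {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

function checkChangelog(text: string): string[] {
  const versions: string[] = [];
  const header = /^## \[(\d+(?:\.\d+)+)\]/gm;
  let m: RegExpExecArray | null;
  while ((m = header.exec(text)) !== null) versions.push(m[1] ?? '');

  const problems: string[] = [];
  versions.forEach((version, i) => {
    if (versions.indexOf(version) !== i) problems.push(`duplicate entry: ${version}`);
    const previous = versions[i - 1];
    // Each entry should be strictly newer than the one below it.
    if (previous !== undefined && compareVersions(previous, version) < 0) {
      problems.push(`out of order: ${previous} above ${version}`);
    }
  });
  return problems;
}
```

An empty result means the headers are unique and in descending order; comparing the list against the previous commit's headers would still be needed to catch a missing entry.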

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 23:35:17 -06:00


/**
 * Tests for bin/gstack-config bash script.
 *
 * Uses Bun.spawnSync to invoke the script with temp dirs and
 * GSTACK_STATE_DIR env override for full isolation.
 */
import { describe, test, expect, beforeEach, afterEach } from 'bun:test';
import { mkdtempSync, writeFileSync, rmSync, readFileSync, existsSync } from 'fs';
import { join } from 'path';
import { tmpdir } from 'os';

const SCRIPT = join(import.meta.dir, '..', '..', 'bin', 'gstack-config');

let stateDir: string;

// Runs the script against the per-test state dir. Entries in extraEnv are
// spread last, so they override both process.env and GSTACK_STATE_DIR.
function run(args: string[] = [], extraEnv: Record<string, string> = {}) {
  const result = Bun.spawnSync(['bash', SCRIPT, ...args], {
    env: {
      ...process.env,
      GSTACK_STATE_DIR: stateDir,
      ...extraEnv,
    },
    stdout: 'pipe',
    stderr: 'pipe',
  });
  return {
    exitCode: result.exitCode,
    stdout: result.stdout.toString().trim(),
    stderr: result.stderr.toString().trim(),
  };
}

beforeEach(() => {
  stateDir = mkdtempSync(join(tmpdir(), 'gstack-config-test-'));
});

afterEach(() => {
  rmSync(stateDir, { recursive: true, force: true });
});

describe('gstack-config', () => {
  // ─── get ──────────────────────────────────────────────────
  test('get on missing file returns empty, exit 0', () => {
    const { exitCode, stdout } = run(['get', 'auto_upgrade']);
    expect(exitCode).toBe(0);
    expect(stdout).toBe('');
  });

  test('get existing key returns value', () => {
    writeFileSync(join(stateDir, 'config.yaml'), 'auto_upgrade: true\n');
    const { exitCode, stdout } = run(['get', 'auto_upgrade']);
    expect(exitCode).toBe(0);
    expect(stdout).toBe('true');
  });

  test('get missing key returns empty', () => {
    writeFileSync(join(stateDir, 'config.yaml'), 'auto_upgrade: true\n');
    const { exitCode, stdout } = run(['get', 'nonexistent']);
    expect(exitCode).toBe(0);
    expect(stdout).toBe('');
  });

  test('get returns last value when key appears multiple times', () => {
    writeFileSync(join(stateDir, 'config.yaml'), 'foo: bar\nfoo: baz\n');
    const { exitCode, stdout } = run(['get', 'foo']);
    expect(exitCode).toBe(0);
    expect(stdout).toBe('baz');
  });

  // ─── set ──────────────────────────────────────────────────
  test('set creates file and writes key on missing file', () => {
    const { exitCode } = run(['set', 'auto_upgrade', 'true']);
    expect(exitCode).toBe(0);
    const content = readFileSync(join(stateDir, 'config.yaml'), 'utf-8');
    expect(content).toContain('auto_upgrade: true');
  });

  test('set appends new key to existing file', () => {
    writeFileSync(join(stateDir, 'config.yaml'), 'foo: bar\n');
    const { exitCode } = run(['set', 'auto_upgrade', 'true']);
    expect(exitCode).toBe(0);
    const content = readFileSync(join(stateDir, 'config.yaml'), 'utf-8');
    expect(content).toContain('foo: bar');
    expect(content).toContain('auto_upgrade: true');
  });

  test('set replaces existing key in-place', () => {
    writeFileSync(join(stateDir, 'config.yaml'), 'auto_upgrade: false\n');
    const { exitCode } = run(['set', 'auto_upgrade', 'true']);
    expect(exitCode).toBe(0);
    const content = readFileSync(join(stateDir, 'config.yaml'), 'utf-8');
    expect(content).toContain('auto_upgrade: true');
    expect(content).not.toContain('auto_upgrade: false');
  });

  test('set creates state dir if missing', () => {
    const nestedDir = join(stateDir, 'nested', 'dir');
    const { exitCode } = run(['set', 'foo', 'bar'], { GSTACK_STATE_DIR: nestedDir });
    expect(exitCode).toBe(0);
    expect(existsSync(join(nestedDir, 'config.yaml'))).toBe(true);
  });

  // ─── list ─────────────────────────────────────────────────
  test('list shows all keys', () => {
    writeFileSync(join(stateDir, 'config.yaml'), 'auto_upgrade: true\nupdate_check: false\n');
    const { exitCode, stdout } = run(['list']);
    expect(exitCode).toBe(0);
    expect(stdout).toContain('auto_upgrade: true');
    expect(stdout).toContain('update_check: false');
  });

  test('list on missing file returns empty, exit 0', () => {
    const { exitCode, stdout } = run(['list']);
    expect(exitCode).toBe(0);
    expect(stdout).toBe('');
  });

  // ─── usage ────────────────────────────────────────────────
  test('no args shows usage and exits 1', () => {
    const { exitCode, stdout } = run([]);
    expect(exitCode).toBe(1);
    expect(stdout).toContain('Usage');
  });

  // ─── security: input validation ─────────────────────────
  test('set rejects key with regex metacharacters', () => {
    const { exitCode, stderr } = run(['set', '.*', 'value']);
    expect(exitCode).toBe(1);
    expect(stderr).toContain('alphanumeric');
  });

  test('set preserves value with sed special chars', () => {
    run(['set', 'test_special', 'a/b&c\\d']);
    const { stdout } = run(['get', 'test_special']);
    expect(stdout).toBe('a/b&c\\d');
  });

  // ─── annotated header ──────────────────────────────────────
  test('first set writes annotated header with docs', () => {
    run(['set', 'telemetry', 'off']);
    const content = readFileSync(join(stateDir, 'config.yaml'), 'utf-8');
    expect(content).toContain('# gstack configuration');
    expect(content).toContain('edit freely');
    expect(content).toContain('proactive:');
    expect(content).toContain('telemetry:');
    expect(content).toContain('auto_upgrade:');
    expect(content).toContain('skill_prefix:');
    expect(content).toContain('routing_declined:');
    expect(content).toContain('codex_reviews:');
    expect(content).toContain('skip_eng_review:');
  });

  test('header written only once, not duplicated on second set', () => {
    run(['set', 'foo', 'bar']);
    run(['set', 'baz', 'qux']);
    const content = readFileSync(join(stateDir, 'config.yaml'), 'utf-8');
    const headerCount = (content.match(/# gstack configuration/g) || []).length;
    expect(headerCount).toBe(1);
  });

  test('header does not break get on commented-out keys', () => {
    run(['set', 'telemetry', 'community']);
    // Header contains "# telemetry: anonymous" as a comment example.
    // get should return the real value, not the comment.
    const { stdout } = run(['get', 'telemetry']);
    expect(stdout).toBe('community');
  });

  test('existing config file is not overwritten with header', () => {
    writeFileSync(join(stateDir, 'config.yaml'), 'existing: value\n');
    run(['set', 'new_key', 'new_value']);
    const content = readFileSync(join(stateDir, 'config.yaml'), 'utf-8');
    expect(content).toContain('existing: value');
    expect(content).not.toContain('# gstack configuration');
  });

  // ─── routing_declined ──────────────────────────────────────
  test('routing_declined defaults to empty (not set)', () => {
    const { stdout } = run(['get', 'routing_declined']);
    expect(stdout).toBe('');
  });

  test('routing_declined can be set and read', () => {
    run(['set', 'routing_declined', 'true']);
    const { stdout } = run(['get', 'routing_declined']);
    expect(stdout).toBe('true');
  });

  test('routing_declined can be reset to false', () => {
    run(['set', 'routing_declined', 'true']);
    run(['set', 'routing_declined', 'false']);
    const { stdout } = run(['get', 'routing_declined']);
    expect(stdout).toBe('false');
  });
});