feat: composable skills — INVOKE_SKILL resolver + factoring infrastructure (v0.13.7.0) (#644)

* feat: add parameterized resolver support to gen-skill-docs

Extend the placeholder regex from {{WORD}} to {{WORD:arg1:arg2}},
enabling parameterized resolvers like {{INVOKE_SKILL:plan-ceo-review}}.

- Widen ResolverFn type to accept optional args?: string[]
- Update RESOLVERS record to use ResolverFn type
- Both replacement and unresolved-check regexes updated
- Fully backward compatible: existing {{WORD}} patterns unchanged

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add INVOKE_SKILL resolver for composable skill loading

New composition.ts resolver module that emits prose instructing Claude
to read another skill's SKILL.md and follow it, skipping preamble
sections. Supports optional skip= parameter for additional sections.

Usage: {{INVOKE_SKILL:plan-ceo-review}} or
       {{INVOKE_SKILL:plan-ceo-review:skip=Outside Voice}}

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: use frontmatter name: for skill symlinks and Codex paths

Patch all 3 name-derivation paths to read name: from SKILL.md
frontmatter instead of relying solely on directory basenames.
This enables directory names that differ from invocation names
(e.g., run-tests/ directory with name: test).

- setup: link_claude_skill_dirs reads name: via grep, falls back to basename
- gen-skill-docs.ts: codexSkillName uses frontmatter name for Codex output paths
- gen-skill-docs.ts: moved frontmatter extraction before Codex path logic

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: extract CHANGELOG_WORKFLOW resolver from /ship

Move changelog generation logic into a reusable resolver. The resolver
is changelog-only (no version bump per Codex review recommendation).
Adds voice rules inline. /ship Step 5 now uses {{CHANGELOG_WORKFLOW}}.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: use INVOKE_SKILL resolver for plan-ceo-review office-hours fallback

Replace inline skill loading prose (read file, skip sections) with
{{INVOKE_SKILL:office-hours}} in the mid-session detection path.
The BENEFITS_FROM prerequisite offer is unchanged (separate use case).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: BENEFITS_FROM resolver delegates to INVOKE_SKILL

Eliminate duplicated skip-list logic by having generateBenefitsFrom
call generateInvokeSkill internally. The wrapper (AskUserQuestion,
design doc re-check) stays in BENEFITS_FROM. The loading instructions
(read file, skip sections, error handling) come from INVOKE_SKILL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add resolver tests for INVOKE_SKILL, CHANGELOG_WORKFLOW, parameterized args

12 new tests covering:
- INVOKE_SKILL: template placeholder, default skip list, error handling,
  BENEFITS_FROM delegation
- CHANGELOG_WORKFLOW: content, cross-check, voice guidance, format
- Parameterized resolver infra: colon-separated args processing,
  no unresolved placeholders across all generated SKILL.md files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.13.7.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: journey routing tests — CLAUDE.md routing rules + stronger descriptions

Three journey E2E tests (ideation, ship, debug) were failing because
Claude answered directly instead of invoking the Skill tool. Root cause:
skill descriptions in system-reminder are too weak to override Claude's
default behavior for tasks it can handle natively.

Fix has two parts:
1. CLAUDE.md routing rules in test workdir — Claude weighs project-level
   instructions higher than skill description metadata
2. "Proactively invoke" (not "suggest") in office-hours, investigate,
   ship descriptions — reinforces the routing signal

10/10 journey tests now pass (was 7/10).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: one-time CLAUDE.md routing injection prompt

Add a preamble section that checks if the project's CLAUDE.md has
skill routing rules. If not (and user hasn't declined), asks once
via AskUserQuestion to inject a "## Skill routing" section.

Root cause: skill descriptions in system-reminder metadata are too
weak to reliably trigger proactive Skill tool invocation. CLAUDE.md
project instructions carry higher weight in Claude's decision making.

- Preamble bash checks for "## Skill routing" in CLAUDE.md
- Stores decline in gstack-config (routing_declined=true)
- Only asks once per project (HAS_ROUTING check + config check)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: annotated config file + routing injection tests

gstack-config now writes a documented header on first config creation
with every supported key explained (proactive, telemetry, auto_upgrade,
skill_prefix, routing_declined, codex_reviews, skip_eng_review, etc.).
Users can edit ~/.gstack/config.yaml directly, anytime.

Also fixes grep to use ^KEY: anchoring so commented header lines don't
shadow real config values.

Tests added:
- 7 new gstack-config tests (annotated header, no duplication, comment
  safety, routing_declined get/set/reset)
- 6 new gen-skill-docs tests (preamble routing injection: bash checks,
  config reads, AskUserQuestion, decline persistence, routing rules)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump to v0.13.9.0, separate CHANGELOG from main's releases

Split our branch's changes into a new 0.13.9.0 entry instead of
jamming them into 0.13.7.0 which already landed on main as
"Community Wave."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: clarify branch-scoped VERSION/CHANGELOG after merging main

Add explicit rules: merging main doesn't mean adopting main's version.
Branch always gets its own entry on top with a higher version number.
Three-point checklist after every merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: put our 0.13.9.0 entry on top of CHANGELOG

Newest version goes on top. Our branch lands next, so our entry
must be above main's 0.13.8.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: restore missing 0.13.7.0 Community Wave entry

Accidentally dropped the 0.13.7.0 entry when reordering.
All entries now present: 0.13.9.0 > 0.13.8.0 > 0.13.7.0 > 0.13.6.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add CHANGELOG integrity check rule

After any edit that moves/adds/removes entries, grep for version
headers and verify no gaps or duplicates before committing.
Prevents accidentally dropping entries during reordering.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-03-29 23:35:17 -06:00
committed by GitHub
parent 3cda8deec9
commit 66c09644a7
48 changed files with 1950 additions and 166 deletions
+135 -2
View File
@@ -1153,6 +1153,138 @@ describe('BENEFITS_FROM resolver', () => {
expect(ceoContent).toContain('office-hours/SKILL.md');
expect(engContent).toContain('office-hours/SKILL.md');
});
test('BENEFITS_FROM delegates to INVOKE_SKILL pattern', () => {
// Should contain the INVOKE_SKILL-style loading prose (not the old manual skip list)
expect(engContent).toContain('Follow its instructions from top to bottom');
expect(engContent).toContain('skipping these sections');
expect(ceoContent).toContain('Follow its instructions from top to bottom');
});
});
// --- {{INVOKE_SKILL}} resolver tests ---
describe('INVOKE_SKILL resolver', () => {
const ceoContent = fs.readFileSync(path.join(ROOT, 'plan-ceo-review', 'SKILL.md'), 'utf-8');
test('plan-ceo-review uses INVOKE_SKILL for mid-session office-hours fallback', () => {
// The mid-session detection path should use INVOKE_SKILL-generated prose
expect(ceoContent).toContain('office-hours/SKILL.md');
expect(ceoContent).toContain('Follow its instructions from top to bottom');
});
test('INVOKE_SKILL output includes default skip list', () => {
expect(ceoContent).toContain('Preamble (run first)');
expect(ceoContent).toContain('Telemetry (run last)');
expect(ceoContent).toContain('AskUserQuestion Format');
});
test('INVOKE_SKILL output includes error handling', () => {
expect(ceoContent).toContain('If unreadable');
expect(ceoContent).toContain('Could not load');
});
test('template uses {{INVOKE_SKILL:office-hours}} placeholder', () => {
const tmpl = fs.readFileSync(path.join(ROOT, 'plan-ceo-review', 'SKILL.md.tmpl'), 'utf-8');
expect(tmpl).toContain('{{INVOKE_SKILL:office-hours}}');
});
});
// --- {{CHANGELOG_WORKFLOW}} resolver tests ---
describe('CHANGELOG_WORKFLOW resolver', () => {
const shipContent = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
test('ship SKILL.md contains changelog workflow', () => {
expect(shipContent).toContain('CHANGELOG (auto-generate)');
expect(shipContent).toContain('git log <base>..HEAD --oneline');
});
test('changelog workflow includes cross-check step', () => {
expect(shipContent).toContain('Cross-check');
expect(shipContent).toContain('Every commit must map to at least one bullet point');
});
test('changelog workflow includes voice guidance', () => {
expect(shipContent).toContain('Lead with what the user can now **do**');
});
test('template uses {{CHANGELOG_WORKFLOW}} placeholder', () => {
const tmpl = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md.tmpl'), 'utf-8');
expect(tmpl).toContain('{{CHANGELOG_WORKFLOW}}');
// Should NOT contain the old inline changelog content
expect(tmpl).not.toContain('Group commits by theme');
});
test('changelog workflow includes keep-changelog format', () => {
expect(shipContent).toContain('### Added');
expect(shipContent).toContain('### Fixed');
});
});
// --- Parameterized resolver infrastructure tests ---
describe('parameterized resolver support', () => {
test('gen-skill-docs regex handles colon-separated args', () => {
// Verify the template containing {{INVOKE_SKILL:office-hours}} was processed
// without leaving unresolved placeholders
const ceoContent = fs.readFileSync(path.join(ROOT, 'plan-ceo-review', 'SKILL.md'), 'utf-8');
expect(ceoContent).not.toMatch(/\{\{INVOKE_SKILL:[^}]+\}\}/);
});
test('templates with parameterized resolvers pass unresolved check', () => {
// All generated SKILL.md files should have no unresolved {{...}} placeholders
const skillDirs = fs.readdirSync(ROOT).filter(d =>
fs.existsSync(path.join(ROOT, d, 'SKILL.md'))
);
for (const dir of skillDirs) {
const content = fs.readFileSync(path.join(ROOT, dir, 'SKILL.md'), 'utf-8');
const unresolved = content.match(/\{\{[A-Z_]+(?::[^}]*)?\}\}/g);
if (unresolved) {
throw new Error(`${dir}/SKILL.md has unresolved placeholders: ${unresolved.join(', ')}`);
}
}
});
});
// --- Preamble routing injection tests ---
describe('preamble routing injection', () => {
const shipContent = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
test('preamble bash checks for routing section in CLAUDE.md', () => {
expect(shipContent).toContain('grep -q "## Skill routing" CLAUDE.md');
expect(shipContent).toContain('HAS_ROUTING');
});
test('preamble bash reads routing_declined config', () => {
expect(shipContent).toContain('routing_declined');
expect(shipContent).toContain('ROUTING_DECLINED');
});
test('preamble includes routing injection AskUserQuestion', () => {
expect(shipContent).toContain('Add routing rules to CLAUDE.md');
expect(shipContent).toContain("I'll invoke skills manually");
});
test('routing injection respects prior decline', () => {
expect(shipContent).toContain('ROUTING_DECLINED');
expect(shipContent).toMatch(/routing_declined.*true/);
});
test('routing injection only fires when all conditions met', () => {
// Must be: HAS_ROUTING=no AND ROUTING_DECLINED=false AND PROACTIVE_PROMPTED=yes
expect(shipContent).toContain('HAS_ROUTING');
expect(shipContent).toContain('ROUTING_DECLINED');
expect(shipContent).toContain('PROACTIVE_PROMPTED');
});
test('routing section content includes key routing rules', () => {
expect(shipContent).toContain('invoke office-hours');
expect(shipContent).toContain('invoke investigate');
expect(shipContent).toContain('invoke ship');
expect(shipContent).toContain('invoke qa');
});
});
// --- {{DESIGN_OUTSIDE_VOICES}} resolver tests ---
@@ -1793,11 +1925,12 @@ describe('setup script validation', () => {
});
test('link_claude_skill_dirs creates relative symlinks', () => {
// Claude links should be relative: ln -snf "gstack/skill_name"
// Claude links should be relative: ln -snf "gstack/$dir_name"
// Uses dir_name (not skill_name) because symlink target must point to the physical directory
const fnStart = setupContent.indexOf('link_claude_skill_dirs()');
const fnEnd = setupContent.indexOf('}', setupContent.indexOf('linked[@]}', fnStart));
const fnBody = setupContent.slice(fnStart, fnEnd);
expect(fnBody).toContain('ln -snf "gstack/$skill_name"');
expect(fnBody).toContain('ln -snf "gstack/$dir_name"');
});
test('setup supports --host auto|claude|codex|kiro', () => {
+24 -5
View File
@@ -93,11 +93,30 @@ function installSkills(tmpDir: string) {
}
}
// Copy CLAUDE.md so Claude has project context for skill routing.
const claudeMdSrc = path.join(ROOT, 'CLAUDE.md');
if (fs.existsSync(claudeMdSrc)) {
fs.copyFileSync(claudeMdSrc, path.join(tmpDir, 'CLAUDE.md'));
}
// Write a CLAUDE.md with explicit routing instructions.
// The skill descriptions in system-reminder aren't strong enough to override
// Claude's default behavior of answering directly. A CLAUDE.md instruction
// puts routing rules in project context which Claude weighs more heavily.
fs.writeFileSync(path.join(tmpDir, 'CLAUDE.md'), `# Project Instructions
## Skill routing
When the user's request matches an available skill, ALWAYS invoke it using the Skill
tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
The skill has specialized workflows that produce better results than ad-hoc answers.
Key routing rules:
- Product ideas, "is this worth building", brainstorming → invoke office-hours
- Bugs, errors, "why is this broken", 500 errors → invoke investigate
- Ship, deploy, push, create PR → invoke ship
- QA, test the site, find bugs → invoke qa
- Code review, check my diff → invoke review
- Update docs after shipping → invoke document-release
- Weekly retro → invoke retro
- Design system, brand → invoke design-consultation
- Visual audit, design polish → invoke design-review
- Architecture review → invoke plan-eng-review
`);
}
/** Init a git repo with config */
+2 -2
View File
@@ -1409,13 +1409,13 @@ describe('Skill trigger phrases', () => {
];
for (const skill of SKILLS_REQUIRING_PROACTIVE) {
test(`${skill}/SKILL.md has "Proactively suggest" phrase`, () => {
test(`${skill}/SKILL.md has proactive routing phrase`, () => {
const skillPath = path.join(ROOT, skill, 'SKILL.md');
if (!fs.existsSync(skillPath)) return;
const content = fs.readFileSync(skillPath, 'utf-8');
const frontmatterEnd = content.indexOf('---', 4);
const frontmatter = content.slice(0, frontmatterEnd);
expect(frontmatter).toMatch(/Proactively suggest/i);
expect(frontmatter).toMatch(/Proactively (suggest|invoke)/i);
});
}
});