feat: composable skills — INVOKE_SKILL resolver + factoring infrastructure (v0.13.7.0) (#644)

* feat: add parameterized resolver support to gen-skill-docs

Extend the placeholder regex from {{WORD}} to {{WORD:arg1:arg2}},
enabling parameterized resolvers like {{INVOKE_SKILL:plan-ceo-review}}.

- Widen ResolverFn type to accept optional args?: string[]
- Update RESOLVERS record to use ResolverFn type
- Both replacement and unresolved-check regexes updated
- Fully backward compatible: existing {{WORD}} patterns unchanged

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add INVOKE_SKILL resolver for composable skill loading

New composition.ts resolver module that emits prose instructing Claude
to read another skill's SKILL.md and follow it, skipping preamble
sections. Supports optional skip= parameter for additional sections.

Usage: {{INVOKE_SKILL:plan-ceo-review}} or
       {{INVOKE_SKILL:plan-ceo-review:skip=Outside Voice}}

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: use frontmatter name: for skill symlinks and Codex paths

Patch all 3 name-derivation paths to read name: from SKILL.md
frontmatter instead of relying solely on directory basenames.
This enables directory names that differ from invocation names
(e.g., run-tests/ directory with name: test).

- setup: link_claude_skill_dirs reads name: via grep, falls back to basename
- gen-skill-docs.ts: codexSkillName uses frontmatter name for Codex output paths
- gen-skill-docs.ts: moved frontmatter extraction before Codex path logic

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: extract CHANGELOG_WORKFLOW resolver from /ship

Move changelog generation logic into a reusable resolver. The resolver
is changelog-only (no version bump per Codex review recommendation).
Adds voice rules inline. /ship Step 5 now uses {{CHANGELOG_WORKFLOW}}.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: use INVOKE_SKILL resolver for plan-ceo-review office-hours fallback

Replace inline skill loading prose (read file, skip sections) with
{{INVOKE_SKILL:office-hours}} in the mid-session detection path.
The BENEFITS_FROM prerequisite offer is unchanged (separate use case).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: BENEFITS_FROM resolver delegates to INVOKE_SKILL

Eliminate duplicated skip-list logic by having generateBenefitsFrom
call generateInvokeSkill internally. The wrapper (AskUserQuestion,
design doc re-check) stays in BENEFITS_FROM. The loading instructions
(read file, skip sections, error handling) come from INVOKE_SKILL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add resolver tests for INVOKE_SKILL, CHANGELOG_WORKFLOW, parameterized args

12 new tests covering:
- INVOKE_SKILL: template placeholder, default skip list, error handling,
  BENEFITS_FROM delegation
- CHANGELOG_WORKFLOW: content, cross-check, voice guidance, format
- Parameterized resolver infra: colon-separated args processing,
  no unresolved placeholders across all generated SKILL.md files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.13.7.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: journey routing tests — CLAUDE.md routing rules + stronger descriptions

Three journey E2E tests (ideation, ship, debug) were failing because
Claude answered directly instead of invoking the Skill tool. Root cause:
skill descriptions in system-reminder are too weak to override Claude's
default behavior for tasks it can handle natively.

Fix has two parts:
1. CLAUDE.md routing rules in test workdir — Claude weighs project-level
   instructions higher than skill description metadata
2. "Proactively invoke" (not "suggest") in office-hours, investigate,
   ship descriptions — reinforces the routing signal

10/10 journey tests now pass (was 7/10).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: one-time CLAUDE.md routing injection prompt

Add a preamble section that checks if the project's CLAUDE.md has
skill routing rules. If not (and user hasn't declined), asks once
via AskUserQuestion to inject a "## Skill routing" section.

Root cause: skill descriptions in system-reminder metadata are too
weak to reliably trigger proactive Skill tool invocation. CLAUDE.md
project instructions carry higher weight in Claude's decision making.

- Preamble bash checks for "## Skill routing" in CLAUDE.md
- Stores decline in gstack-config (routing_declined=true)
- Only asks once per project (HAS_ROUTING check + config check)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: annotated config file + routing injection tests

gstack-config now writes a documented header on first config creation
with every supported key explained (proactive, telemetry, auto_upgrade,
skill_prefix, routing_declined, codex_reviews, skip_eng_review, etc.).
Users can edit ~/.gstack/config.yaml directly, anytime.

Also fixes grep to use ^KEY: anchoring so commented header lines don't
shadow real config values.

Tests added:
- 7 new gstack-config tests (annotated header, no duplication, comment
  safety, routing_declined get/set/reset)
- 6 new gen-skill-docs tests (preamble routing injection: bash checks,
  config reads, AskUserQuestion, decline persistence, routing rules)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump to v0.13.9.0, separate CHANGELOG from main's releases

Split our branch's changes into a new 0.13.9.0 entry instead of
jamming them into 0.13.7.0 which already landed on main as
"Community Wave."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: clarify branch-scoped VERSION/CHANGELOG after merging main

Add explicit rules: merging main doesn't mean adopting main's version.
Branch always gets its own entry on top with a higher version number.
Three-point checklist after every merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: put our 0.13.9.0 entry on top of CHANGELOG

Newest version goes on top. Our branch lands next, so our entry
must be above main's 0.13.8.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: restore missing 0.13.7.0 Community Wave entry

Accidentally dropped the 0.13.7.0 entry when reordering.
All entries now present: 0.13.9.0 > 0.13.8.0 > 0.13.7.0 > 0.13.6.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add CHANGELOG integrity check rule

After any edit that moves/adds/removes entries, grep for version
headers and verify no gaps or duplicates before committing.
Prevents accidentally dropping entries during reordering.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-03-29 23:35:17 -06:00
committed by GitHub
parent 3cda8deec9
commit 66c09644a7
48 changed files with 1950 additions and 166 deletions
+23 -12
View File
@@ -83,11 +83,15 @@ const OPENAI_LITMUS_CHECKS = [
// ─── External Host Helpers ───────────────────────────────────
// Re-export local copy for use in this file (matches codex-helpers.ts)
function externalSkillName(skillDir: string): string {
// Accepts optional frontmatter name to support directory/invocation name divergence
function externalSkillName(skillDir: string, frontmatterName?: string): string {
// Root skill (skillDir === '' or '.') always maps to 'gstack' regardless of frontmatter
if (skillDir === '.' || skillDir === '') return 'gstack';
// Use frontmatter name when it differs from directory name (e.g., run-tests/ with name: test)
const baseName = frontmatterName && frontmatterName !== skillDir ? frontmatterName : skillDir;
// Don't double-prefix: gstack-upgrade → gstack-upgrade (not gstack-gstack-upgrade)
if (skillDir.startsWith('gstack-')) return skillDir;
return `gstack-${skillDir}`;
if (baseName.startsWith('gstack-')) return baseName;
return `gstack-${baseName}`;
}
function extractNameAndDescription(content: string): { name: string; description: string } {
@@ -255,11 +259,12 @@ function processExternalHost(
skillDir: string,
extractedDescription: string,
ctx: TemplateContext,
frontmatterName?: string,
): { content: string; outputPath: string; outputDir: string; symlinkLoop: boolean } {
const config = EXTERNAL_HOST_CONFIG[host];
if (!config) throw new Error(`No external host config for: ${host}`);
const name = externalSkillName(skillDir === '.' ? '' : skillDir);
const name = externalSkillName(skillDir === '.' ? '' : skillDir, frontmatterName);
const outputDir = path.join(ROOT, config.hostSubdir, 'skills', name);
fs.mkdirSync(outputDir, { recursive: true });
const outputPath = path.join(outputDir, 'SKILL.md');
@@ -324,10 +329,13 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath:
// Determine skill directory relative to ROOT
const skillDir = path.relative(ROOT, path.dirname(tmplPath));
// Extract skill name from frontmatter for TemplateContext
// Extract skill name from frontmatter early — needed for both TemplateContext and external host output paths.
// When frontmatter name: differs from directory name (e.g., run-tests/ with name: test),
// the frontmatter name is used for external skill naming and setup script symlinks.
const { name: extractedName, description: extractedDescription } = extractNameAndDescription(tmplContent);
const skillName = extractedName || path.basename(path.dirname(tmplPath));
// Extract benefits-from list from frontmatter (inline YAML: benefits-from: [a, b])
const benefitsMatch = tmplContent.match(/^benefits-from:\s*\[([^\]]*)\]/m);
const benefitsFrom = benefitsMatch
@@ -340,15 +348,18 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath:
const ctx: TemplateContext = { skillName, tmplPath, benefitsFrom, host, paths: HOST_PATHS[host], preambleTier };
// Replace placeholders
let content = tmplContent.replace(/\{\{(\w+)\}\}/g, (match, name) => {
const resolver = RESOLVERS[name];
if (!resolver) throw new Error(`Unknown placeholder {{${name}}} in ${relTmplPath}`);
return resolver(ctx);
// Replace placeholders (supports parameterized: {{NAME:arg1:arg2}})
let content = tmplContent.replace(/\{\{(\w+(?::[^}]+)?)\}\}/g, (match, fullKey) => {
const parts = fullKey.split(':');
const resolverName = parts[0];
const args = parts.slice(1);
const resolver = RESOLVERS[resolverName];
if (!resolver) throw new Error(`Unknown placeholder {{${resolverName}}} in ${relTmplPath}`);
return args.length > 0 ? resolver(ctx, args) : resolver(ctx);
});
// Check for any remaining unresolved placeholders
const remaining = content.match(/\{\{(\w+)\}\}/g);
const remaining = content.match(/\{\{(\w+(?::[^}]+)?)\}\}/g);
if (remaining) {
throw new Error(`Unresolved placeholders in ${relTmplPath}: ${remaining.join(', ')}`);
}
@@ -359,7 +370,7 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath:
if (host === 'claude') {
content = transformFrontmatter(content, host);
} else {
const result = processExternalHost(content, tmplContent, host, skillDir, extractedDescription, ctx);
const result = processExternalHost(content, tmplContent, host, skillDir, extractedDescription, ctx, extractedName || undefined);
content = result.content;
outputPath = result.outputPath;
symlinkLoop = result.symlinkLoop;
+48
View File
@@ -0,0 +1,48 @@
import type { TemplateContext } from './types';
/**
* {{INVOKE_SKILL:skill-name}} — emits prose instructing Claude to read
* another skill's SKILL.md and follow it, skipping preamble sections.
*
* Supports optional skip= parameter for additional sections to skip:
* {{INVOKE_SKILL:plan-ceo-review:skip=Outside Voice,Design Outside Voices}}
*/
export function generateInvokeSkill(ctx: TemplateContext, args?: string[]): string {
const skillName = args?.[0];
if (!skillName || skillName === '') {
throw new Error('{{INVOKE_SKILL}} requires a skill name, e.g. {{INVOKE_SKILL:plan-ceo-review}}');
}
// Parse optional skip= parameter from args[1+]
const extraSkips = (args?.slice(1) || [])
.filter(a => a.startsWith('skip='))
.flatMap(a => a.slice(5).split(','))
.map(s => s.trim())
.filter(Boolean);
const DEFAULT_SKIPS = [
'Preamble (run first)',
'AskUserQuestion Format',
'Completeness Principle — Boil the Lake',
'Search Before Building',
'Contributor Mode',
'Completion Status Protocol',
'Telemetry (run last)',
'Step 0: Detect platform and base branch',
'Review Readiness Dashboard',
'Plan File Review Report',
'Prerequisite Skill Offer',
'Plan Status Footer',
];
const allSkips = [...DEFAULT_SKIPS, ...extraSkips];
return `Read the \`/${skillName}\` skill file at \`${ctx.paths.skillRoot}/${skillName}/SKILL.md\` using the Read tool.
**If unreadable:** Skip with "Could not load /${skillName} — skipping." and continue.
Follow its instructions from top to bottom, **skipping these sections** (already handled by the parent skill):
${allSkips.map(s => `- ${s}`).join('\n')}
Execute every other section at full depth. When the loaded skill's instructions are complete, continue with the next step below.`;
}
+6 -3
View File
@@ -3,7 +3,7 @@
* Each resolver takes a TemplateContext and returns the replacement string.
*/
import type { TemplateContext } from './types';
import type { TemplateContext, ResolverFn } from './types';
// Domain modules
import { generatePreamble } from './preamble';
@@ -12,11 +12,12 @@ import { generateCommandReference, generateSnapshotFlags, generateBrowseSetup }
import { generateDesignMethodology, generateDesignHardRules, generateDesignOutsideVoices, generateDesignReviewLite, generateDesignSketch, generateDesignSetup, generateDesignMockup, generateDesignShotgunLoop } from './design';
import { generateTestBootstrap, generateTestCoverageAuditPlan, generateTestCoverageAuditShip, generateTestCoverageAuditReview } from './testing';
import { generateReviewDashboard, generatePlanFileReviewReport, generateSpecReviewLoop, generateBenefitsFrom, generateCodexSecondOpinion, generateAdversarialStep, generateCodexPlanReview, generatePlanCompletionAuditShip, generatePlanCompletionAuditReview, generatePlanVerificationExec } from './review';
import { generateSlugEval, generateSlugSetup, generateBaseBranchDetect, generateDeployBootstrap, generateQAMethodology, generateCoAuthorTrailer } from './utility';
import { generateSlugEval, generateSlugSetup, generateBaseBranchDetect, generateDeployBootstrap, generateQAMethodology, generateCoAuthorTrailer, generateChangelogWorkflow } from './utility';
import { generateLearningsSearch, generateLearningsLog } from './learnings';
import { generateConfidenceCalibration } from './confidence';
import { generateInvokeSkill } from './composition';
export const RESOLVERS: Record<string, (ctx: TemplateContext) => string> = {
export const RESOLVERS: Record<string, ResolverFn> = {
SLUG_EVAL: generateSlugEval,
SLUG_SETUP: generateSlugSetup,
COMMAND_REFERENCE: generateCommandReference,
@@ -53,4 +54,6 @@ export const RESOLVERS: Record<string, (ctx: TemplateContext) => string> = {
LEARNINGS_SEARCH: generateLearningsSearch,
LEARNINGS_LOG: generateLearningsLog,
CONFIDENCE_CALIBRATION: generateConfidenceCalibration,
INVOKE_SKILL: generateInvokeSkill,
CHANGELOG_WORKFLOW: generateChangelogWorkflow,
};
+54
View File
@@ -77,6 +77,14 @@ if [ -f "$_LEARN_FILE" ]; then
else
echo "LEARNINGS: 0"
fi
# Check if CLAUDE.md has routing rules
_HAS_ROUTING="no"
if [ -f CLAUDE.md ] && grep -q "## Skill routing" CLAUDE.md 2>/dev/null; then
_HAS_ROUTING="yes"
fi
_ROUTING_DECLINED=$(${ctx.paths.binDir}/gstack-config get routing_declined 2>/dev/null || echo "false")
echo "HAS_ROUTING: $_HAS_ROUTING"
echo "ROUTING_DECLINED: $_ROUTING_DECLINED"
\`\`\``;
}
@@ -167,6 +175,51 @@ touch ~/.gstack/.proactive-prompted
This only happens once. If \`PROACTIVE_PROMPTED\` is \`yes\`, skip this entirely.`;
}
function generateRoutingInjection(ctx: TemplateContext): string {
return `If \`HAS_ROUTING\` is \`no\` AND \`ROUTING_DECLINED\` is \`false\` AND \`PROACTIVE_PROMPTED\` is \`yes\`:
Check if a CLAUDE.md file exists in the project root. If it does not exist, create it.
Use AskUserQuestion:
> gstack works best when your project's CLAUDE.md includes skill routing rules.
> This tells Claude to use specialized workflows (like /ship, /investigate, /qa)
> instead of answering directly. It's a one-time addition, about 15 lines.
Options:
- A) Add routing rules to CLAUDE.md (recommended)
- B) No thanks, I'll invoke skills manually
If A: Append this section to the end of CLAUDE.md:
\`\`\`markdown
## Skill routing
When the user's request matches an available skill, ALWAYS invoke it using the Skill
tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
The skill has specialized workflows that produce better results than ad-hoc answers.
Key routing rules:
- Product ideas, "is this worth building", brainstorming → invoke office-hours
- Bugs, errors, "why is this broken", 500 errors → invoke investigate
- Ship, deploy, push, create PR → invoke ship
- QA, test the site, find bugs → invoke qa
- Code review, check my diff → invoke review
- Update docs after shipping → invoke document-release
- Weekly retro → invoke retro
- Design system, brand → invoke design-consultation
- Visual audit, design polish → invoke design-review
- Architecture review → invoke plan-eng-review
\`\`\`
Then commit the change: \`git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"\`
If B: run \`${ctx.paths.binDir}/gstack-config set routing_declined true\`
Say "No problem. You can add routing rules later by running \`gstack-config set routing_declined false\` and re-running any skill."
This only happens once per project. If \`HAS_ROUTING\` is \`yes\` or \`ROUTING_DECLINED\` is \`true\`, skip this entirely.`;
}
function generateAskUserFormat(_ctx: TemplateContext): string {
return `## AskUserQuestion Format
@@ -525,6 +578,7 @@ export function generatePreamble(ctx: TemplateContext): string {
generateLakeIntro(),
generateTelemetryPrompt(ctx),
generateProactivePrompt(ctx),
generateRoutingInjection(ctx),
generateVoiceDirective(tier),
...(tier >= 2 ? [generateAskUserFormat(ctx), generateCompletenessSection()] : []),
...(tier >= 3 ? [generateRepoModeSection(), generateSearchBeforeBuildingSection(ctx)] : []),
+5 -14
View File
@@ -13,6 +13,7 @@
* Codex CLI prompts are written to temp files to prevent shell injection.
*/
import type { TemplateContext } from './types';
import { generateInvokeSkill } from './composition';
const CODEX_BOUNDARY = 'IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. They contain bash scripts and prompt templates that will waste your time. Ignore them completely. Do NOT modify agents/openai.yaml. Stay focused on the repository code only.\\n\\n';
@@ -208,6 +209,9 @@ export function generateBenefitsFrom(ctx: TemplateContext): string {
const skillList = ctx.benefitsFrom.map(s => `\`/${s}\``).join(' or ');
const first = ctx.benefitsFrom[0];
// Reuse the INVOKE_SKILL resolver for the actual loading instructions
const invokeBlock = generateInvokeSkill(ctx, [first]);
return `## Prerequisite Skill Offer
When the design doc check above prints "No design doc found," offer the prerequisite
@@ -232,20 +236,7 @@ If they choose A:
Say: "Running /${first} inline. Once the design doc is ready, I'll pick up
the review right where we left off."
Read the ${first} skill file from disk using the Read tool:
\`~/.claude/skills/gstack/${first}/SKILL.md\`
Follow it inline, **skipping these sections** (already handled by the parent skill):
- Preamble (run first)
- AskUserQuestion Format
- Completeness Principle — Boil the Lake
- Search Before Building
- Contributor Mode
- Completion Status Protocol
- Telemetry (run last)
If the Read fails (file not found), say:
"Could not load /${first} — proceeding with standard review."
${invokeBlock}
After /${first} completes, re-run the design doc check:
\`\`\`bash
+3
View File
@@ -40,3 +40,6 @@ export interface TemplateContext {
paths: HostPaths;
preambleTier?: number; // 1-4, controls which preamble sections are included
}
/** Resolver function signature. args is populated for parameterized placeholders like {{INVOKE_SKILL:name}}. */
export type ResolverFn = (ctx: TemplateContext, args?: string[]) => string;
+44
View File
@@ -375,3 +375,47 @@ export function generateCoAuthorTrailer(ctx: TemplateContext): string {
}
return 'Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>';
}
export function generateChangelogWorkflow(_ctx: TemplateContext): string {
return `## CHANGELOG (auto-generate)
1. Read \`CHANGELOG.md\` header to know the format.
2. **First, enumerate every commit on the branch:**
\`\`\`bash
git log <base>..HEAD --oneline
\`\`\`
Copy the full list. Count the commits. You will use this as a checklist.
3. **Read the full diff** to understand what each commit actually changed:
\`\`\`bash
git diff <base>...HEAD
\`\`\`
4. **Group commits by theme** before writing anything. Common themes:
- New features / capabilities
- Performance improvements
- Bug fixes
- Dead code removal / cleanup
- Infrastructure / tooling / tests
- Refactoring
5. **Write the CHANGELOG entry** covering ALL groups:
- If existing CHANGELOG entries on the branch already cover some commits, replace them with one unified entry for the new version
- Categorize changes into applicable sections:
- \`### Added\` — new features
- \`### Changed\` — changes to existing functionality
- \`### Fixed\` — bug fixes
- \`### Removed\` — removed features
- Write concise, descriptive bullet points
- Insert after the file header (line 5), dated today
- Format: \`## [X.Y.Z.W] - YYYY-MM-DD\`
- **Voice:** Lead with what the user can now **do** that they couldn't before. Use plain language, not implementation details. Never mention TODOS.md, internal tracking, or contributor-facing details.
6. **Cross-check:** Compare your CHANGELOG entry against the commit list from step 2.
Every commit must map to at least one bullet point. If any commit is unrepresented,
add it now. If the branch has N commits spanning K themes, the CHANGELOG must
reflect all K themes.
**Do NOT ask the user to describe changes.** Infer from the diff and commit history.`;
}