feat: composable skills — INVOKE_SKILL resolver + factoring infrastructure (v0.13.7.0) (#644)

* feat: add parameterized resolver support to gen-skill-docs

Extend the placeholder regex from {{WORD}} to {{WORD:arg1:arg2}}, enabling parameterized resolvers like {{INVOKE_SKILL:plan-ceo-review}}.

- Widen ResolverFn type to accept optional args?: string[]
- Update RESOLVERS record to use ResolverFn type
- Both replacement and unresolved-check regexes updated
- Fully backward compatible: existing {{WORD}} patterns unchanged

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add INVOKE_SKILL resolver for composable skill loading

New composition.ts resolver module that emits prose instructing Claude to read another skill's SKILL.md and follow it, skipping preamble sections. Supports optional skip= parameter for additional sections.

Usage: {{INVOKE_SKILL:plan-ceo-review}} or {{INVOKE_SKILL:plan-ceo-review:skip=Outside Voice}}

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: use frontmatter name: for skill symlinks and Codex paths

Patch all 3 name-derivation paths to read name: from SKILL.md frontmatter instead of relying solely on directory basenames. This enables directory names that differ from invocation names (e.g., run-tests/ directory with name: test).

- setup: link_claude_skill_dirs reads name: via grep, falls back to basename
- gen-skill-docs.ts: codexSkillName uses frontmatter name for Codex output paths
- gen-skill-docs.ts: moved frontmatter extraction before Codex path logic

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: extract CHANGELOG_WORKFLOW resolver from /ship

Move changelog generation logic into a reusable resolver. The resolver is changelog-only (no version bump per Codex review recommendation). Adds voice rules inline. /ship Step 5 now uses {{CHANGELOG_WORKFLOW}}.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: use INVOKE_SKILL resolver for plan-ceo-review office-hours fallback

Replace inline skill loading prose (read file, skip sections) with {{INVOKE_SKILL:office-hours}} in the mid-session detection path. The BENEFITS_FROM prerequisite offer is unchanged (separate use case).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: BENEFITS_FROM resolver delegates to INVOKE_SKILL

Eliminate duplicated skip-list logic by having generateBenefitsFrom call generateInvokeSkill internally. The wrapper (AskUserQuestion, design doc re-check) stays in BENEFITS_FROM. The loading instructions (read file, skip sections, error handling) come from INVOKE_SKILL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add resolver tests for INVOKE_SKILL, CHANGELOG_WORKFLOW, parameterized args

12 new tests covering:

- INVOKE_SKILL: template placeholder, default skip list, error handling, BENEFITS_FROM delegation
- CHANGELOG_WORKFLOW: content, cross-check, voice guidance, format
- Parameterized resolver infra: colon-separated args processing, no unresolved placeholders across all generated SKILL.md files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.13.7.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: journey routing tests — CLAUDE.md routing rules + stronger descriptions

Three journey E2E tests (ideation, ship, debug) were failing because Claude answered directly instead of invoking the Skill tool. Root cause: skill descriptions in system-reminder are too weak to override Claude's default behavior for tasks it can handle natively.

Fix has two parts:

1. CLAUDE.md routing rules in test workdir: Claude weighs project-level instructions higher than skill description metadata
2. "Proactively invoke" (not "suggest") in office-hours, investigate, ship descriptions: reinforces the routing signal

10/10 journey tests now pass (was 7/10).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: one-time CLAUDE.md routing injection prompt

Add a preamble section that checks if the project's CLAUDE.md has skill routing rules. If not (and user hasn't declined), asks once via AskUserQuestion to inject a "## Skill routing" section.

Root cause: skill descriptions in system-reminder metadata are too weak to reliably trigger proactive Skill tool invocation. CLAUDE.md project instructions carry higher weight in Claude's decision making.

- Preamble bash checks for "## Skill routing" in CLAUDE.md
- Stores decline in gstack-config (routing_declined=true)
- Only asks once per project (HAS_ROUTING check + config check)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: annotated config file + routing injection tests

gstack-config now writes a documented header on first config creation with every supported key explained (proactive, telemetry, auto_upgrade, skill_prefix, routing_declined, codex_reviews, skip_eng_review, etc.). Users can edit ~/.gstack/config.yaml directly, anytime.

Also fixes grep to use ^KEY: anchoring so commented header lines don't shadow real config values.

Tests added:

- 7 new gstack-config tests (annotated header, no duplication, comment safety, routing_declined get/set/reset)
- 6 new gen-skill-docs tests (preamble routing injection: bash checks, config reads, AskUserQuestion, decline persistence, routing rules)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump to v0.13.9.0, separate CHANGELOG from main's releases

Split our branch's changes into a new 0.13.9.0 entry instead of jamming them into 0.13.7.0 which already landed on main as "Community Wave."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: clarify branch-scoped VERSION/CHANGELOG after merging main

Add explicit rules: merging main doesn't mean adopting main's version. Branch always gets its own entry on top with a higher version number. Three-point checklist after every merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: put our 0.13.9.0 entry on top of CHANGELOG

Newest version goes on top. Our branch lands next, so our entry must be above main's 0.13.8.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: restore missing 0.13.7.0 Community Wave entry

Accidentally dropped the 0.13.7.0 entry when reordering. All entries now present: 0.13.9.0 > 0.13.8.0 > 0.13.7.0 > 0.13.6.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add CHANGELOG integrity check rule

After any edit that moves/adds/removes entries, grep for version headers and verify no gaps or duplicates before committing. Prevents accidentally dropping entries during reordering.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 19:25:10 +02:00 · 2026-03-29 23:35:17 -06:00
parent 3cda8deec9
commit 66c09644a7
48 changed files with 1950 additions and 166 deletions
@@ -83,11 +83,15 @@ const OPENAI_LITMUS_CHECKS = [
 // ─── External Host Helpers ───────────────────────────────────

 // Re-export local copy for use in this file (matches codex-helpers.ts)
-function externalSkillName(skillDir: string): string {
+// Accepts optional frontmatter name to support directory/invocation name divergence
+function externalSkillName(skillDir: string, frontmatterName?: string): string {
+  // Root skill (skillDir === '' or '.') always maps to 'gstack' regardless of frontmatter
  if (skillDir === '.' || skillDir === '') return 'gstack';
+  // Use frontmatter name when it differs from directory name (e.g., run-tests/ with name: test)
+  const baseName = frontmatterName && frontmatterName !== skillDir ? frontmatterName : skillDir;
  // Don't double-prefix: gstack-upgrade → gstack-upgrade (not gstack-gstack-upgrade)
-  if (skillDir.startsWith('gstack-')) return skillDir;
-  return `gstack-${skillDir}`;
+  if (baseName.startsWith('gstack-')) return baseName;
+  return `gstack-${baseName}`;
 }

 function extractNameAndDescription(content: string): { name: string; description: string } {
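
To make the naming rules in the hunk above concrete, here is a standalone copy of the patched `externalSkillName` with a few sample inputs. The sample directory and frontmatter names are illustrative only (the `run-tests`/`test` pair comes from the commit message); the `examples` and `names` variables are not part of the commit.

```typescript
// Standalone copy of the patched function, reproduced for illustration only.
function externalSkillName(skillDir: string, frontmatterName?: string): string {
  // Root skill always maps to 'gstack', regardless of frontmatter.
  if (skillDir === '.' || skillDir === '') return 'gstack';
  // Prefer the frontmatter name when it differs from the directory name.
  const baseName = frontmatterName && frontmatterName !== skillDir ? frontmatterName : skillDir;
  // Avoid double-prefixing names that already start with 'gstack-'.
  if (baseName.startsWith('gstack-')) return baseName;
  return `gstack-${baseName}`;
}

// Hypothetical sample inputs: [directory, frontmatter name].
const examples: Array<[string, string | undefined]> = [
  ['', 'anything'],              // root skill: frontmatter is ignored
  ['run-tests', 'test'],         // directory and invocation name diverge
  ['gstack-upgrade', undefined], // already prefixed
  ['qa', 'qa'],                  // frontmatter matches directory
];

const names = examples.map(([dir, fm]) => externalSkillName(dir, fm));
console.log(names); // ['gstack', 'gstack-test', 'gstack-upgrade', 'gstack-qa']
```

Note that the root-skill check runs before the frontmatter lookup, so a `name:` in the root SKILL.md cannot rename the `gstack` output directory.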
@@ -255,11 +259,12 @@ function processExternalHost(
  skillDir: string,
  extractedDescription: string,
  ctx: TemplateContext,
+  frontmatterName?: string,
 ): { content: string; outputPath: string; outputDir: string; symlinkLoop: boolean } {
  const config = EXTERNAL_HOST_CONFIG[host];
  if (!config) throw new Error(`No external host config for: ${host}`);

-  const name = externalSkillName(skillDir === '.' ? '' : skillDir);
+  const name = externalSkillName(skillDir === '.' ? '' : skillDir, frontmatterName);
  const outputDir = path.join(ROOT, config.hostSubdir, 'skills', name);
  fs.mkdirSync(outputDir, { recursive: true });
  const outputPath = path.join(outputDir, 'SKILL.md');
@@ -324,10 +329,13 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath:
  // Determine skill directory relative to ROOT
  const skillDir = path.relative(ROOT, path.dirname(tmplPath));

-  // Extract skill name from frontmatter for TemplateContext
+  // Extract skill name from frontmatter early — needed for both TemplateContext and external host output paths.
+  // When frontmatter name: differs from directory name (e.g., run-tests/ with name: test),
+  // the frontmatter name is used for external skill naming and setup script symlinks.
  const { name: extractedName, description: extractedDescription } = extractNameAndDescription(tmplContent);
  const skillName = extractedName || path.basename(path.dirname(tmplPath));

+
  // Extract benefits-from list from frontmatter (inline YAML: benefits-from: [a, b])
  const benefitsMatch = tmplContent.match(/^benefits-from:\s*\[([^\]]*)\]/m);
  const benefitsFrom = benefitsMatch
@@ -340,15 +348,18 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath:

  const ctx: TemplateContext = { skillName, tmplPath, benefitsFrom, host, paths: HOST_PATHS[host], preambleTier };

-  // Replace placeholders
-  let content = tmplContent.replace(/\{\{(\w+)\}\}/g, (match, name) => {
-    const resolver = RESOLVERS[name];
-    if (!resolver) throw new Error(`Unknown placeholder {{${name}}} in ${relTmplPath}`);
-    return resolver(ctx);
+  // Replace placeholders (supports parameterized: {{NAME:arg1:arg2}})
+  let content = tmplContent.replace(/\{\{(\w+(?::[^}]+)?)\}\}/g, (match, fullKey) => {
+    const parts = fullKey.split(':');
+    const resolverName = parts[0];
+    const args = parts.slice(1);
+    const resolver = RESOLVERS[resolverName];
+    if (!resolver) throw new Error(`Unknown placeholder {{${resolverName}}} in ${relTmplPath}`);
+    return args.length > 0 ? resolver(ctx, args) : resolver(ctx);
  });

  // Check for any remaining unresolved placeholders
-  const remaining = content.match(/\{\{(\w+)\}\}/g);
+  const remaining = content.match(/\{\{(\w+(?::[^}]+)?)\}\}/g);
  if (remaining) {
    throw new Error(`Unresolved placeholders in ${relTmplPath}: ${remaining.join(', ')}`);
  }
@@ -359,7 +370,7 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath:
  if (host === 'claude') {
    content = transformFrontmatter(content, host);
  } else {
-    const result = processExternalHost(content, tmplContent, host, skillDir, extractedDescription, ctx);
+    const result = processExternalHost(content, tmplContent, host, skillDir, extractedDescription, ctx, extractedName || undefined);
    content = result.content;
    outputPath = result.outputPath;
    symlinkLoop = result.symlinkLoop;
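
The placeholder handling in the hunks above can be exercised in isolation. The sketch below reuses the regex and the colon split from the diff, but everything else is assumed for illustration: `Ctx` is a stripped-down stand-in for `TemplateContext`, and the `GREETING` and `ECHO` resolvers and the `render` helper are hypothetical.

```typescript
// Simplified stand-in for TemplateContext (assumption, not the real type).
type Ctx = { skillName: string };
type ResolverFn = (ctx: Ctx, args?: string[]) => string;

// Hypothetical resolvers used only to show both call shapes.
const RESOLVERS: Record<string, ResolverFn> = {
  GREETING: (ctx) => `hello from ${ctx.skillName}`,
  ECHO: (_ctx, args) => (args ?? []).join('|'),
};

// Same regex and dispatch logic as the patched processTemplate.
function render(template: string, ctx: Ctx): string {
  return template.replace(/\{\{(\w+(?::[^}]+)?)\}\}/g, (_match: string, fullKey: string) => {
    const [resolverName, ...args] = fullKey.split(':');
    const resolver = RESOLVERS[resolverName];
    if (!resolver) throw new Error(`Unknown placeholder {{${resolverName}}}`);
    // Plain {{WORD}} placeholders still call resolver(ctx) with no args.
    return args.length > 0 ? resolver(ctx, args) : resolver(ctx);
  });
}

const out = render('{{GREETING}} / {{ECHO:a:skip=x,y}}', { skillName: 'demo' });
console.log(out); // hello from demo / a|skip=x,y
```

Because the arguments are obtained with `split(':')`, commas survive inside a single argument while colons always start a new one.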
@@ -0,0 +1,48 @@
+import type { TemplateContext } from './types';
+
+/**
+ * {{INVOKE_SKILL:skill-name}} — emits prose instructing Claude to read
+ * another skill's SKILL.md and follow it, skipping preamble sections.
+ *
+ * Supports optional skip= parameter for additional sections to skip:
+ *   {{INVOKE_SKILL:plan-ceo-review:skip=Outside Voice,Design Outside Voices}}
+ */
+export function generateInvokeSkill(ctx: TemplateContext, args?: string[]): string {
+  const skillName = args?.[0];
+  if (!skillName || skillName === '') {
+    throw new Error('{{INVOKE_SKILL}} requires a skill name, e.g. {{INVOKE_SKILL:plan-ceo-review}}');
+  }
+
+  // Parse optional skip= parameter from args[1+]
+  const extraSkips = (args?.slice(1) || [])
+    .filter(a => a.startsWith('skip='))
+    .flatMap(a => a.slice(5).split(','))
+    .map(s => s.trim())
+    .filter(Boolean);
+
+  const DEFAULT_SKIPS = [
+    'Preamble (run first)',
+    'AskUserQuestion Format',
+    'Completeness Principle — Boil the Lake',
+    'Search Before Building',
+    'Contributor Mode',
+    'Completion Status Protocol',
+    'Telemetry (run last)',
+    'Step 0: Detect platform and base branch',
+    'Review Readiness Dashboard',
+    'Plan File Review Report',
+    'Prerequisite Skill Offer',
+    'Plan Status Footer',
+  ];
+
+  const allSkips = [...DEFAULT_SKIPS, ...extraSkips];
+
+  return `Read the \`/${skillName}\` skill file at \`${ctx.paths.skillRoot}/${skillName}/SKILL.md\` using the Read tool.
+
+**If unreadable:** Skip with "Could not load /${skillName} — skipping." and continue.
+
+Follow its instructions from top to bottom, **skipping these sections** (already handled by the parent skill):
+${allSkips.map(s => `- ${s}`).join('\n')}
+
+Execute every other section at full depth. When the loaded skill's instructions are complete, continue with the next step below.`;
+}
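
A small sketch of how a full `INVOKE_SKILL` placeholder key is turned into a skill name and extra skip sections, assuming the key is split on `:` as in the patched `processTemplate`. `parseInvokeArgs` is a hypothetical helper written for this illustration; the filter chain inside it mirrors the one in `generateInvokeSkill`.

```typescript
// Hypothetical helper: combines the colon split from processTemplate with
// the skip= parsing from generateInvokeSkill.
function parseInvokeArgs(fullKey: string): { skillName: string; extraSkips: string[] } {
  const [, ...args] = fullKey.split(':'); // drop the resolver name itself
  const skillName = args[0] ?? '';
  const extraSkips = args
    .slice(1)
    .filter(a => a.startsWith('skip='))
    .flatMap(a => a.slice(5).split(','))
    .map(s => s.trim())
    .filter(Boolean);
  return { skillName, extraSkips };
}

const parsed = parseInvokeArgs(
  'INVOKE_SKILL:plan-ceo-review:skip=Outside Voice, Design Outside Voices',
);
console.log(parsed.skillName);  // plan-ceo-review
console.log(parsed.extraSkips); // ['Outside Voice', 'Design Outside Voices']

// A section title containing a colon is cut at the colon by the split,
// and the remainder is dropped because it does not start with 'skip='.
const withColon = parseInvokeArgs('INVOKE_SKILL:ship:skip=Step 0: Detect platform');
console.log(withColon.extraSkips); // ['Step 0']
```

Under this reading, section titles that contain a colon (several of the default skips do) can only appear in the built-in list, not in a `skip=` argument.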
@@ -3,7 +3,7 @@
 * Each resolver takes a TemplateContext and returns the replacement string.
 */

-import type { TemplateContext } from './types';
+import type { TemplateContext, ResolverFn } from './types';

 // Domain modules
 import { generatePreamble } from './preamble';
@@ -12,11 +12,12 @@ import { generateCommandReference, generateSnapshotFlags, generateBrowseSetup }
 import { generateDesignMethodology, generateDesignHardRules, generateDesignOutsideVoices, generateDesignReviewLite, generateDesignSketch, generateDesignSetup, generateDesignMockup, generateDesignShotgunLoop } from './design';
 import { generateTestBootstrap, generateTestCoverageAuditPlan, generateTestCoverageAuditShip, generateTestCoverageAuditReview } from './testing';
 import { generateReviewDashboard, generatePlanFileReviewReport, generateSpecReviewLoop, generateBenefitsFrom, generateCodexSecondOpinion, generateAdversarialStep, generateCodexPlanReview, generatePlanCompletionAuditShip, generatePlanCompletionAuditReview, generatePlanVerificationExec } from './review';
-import { generateSlugEval, generateSlugSetup, generateBaseBranchDetect, generateDeployBootstrap, generateQAMethodology, generateCoAuthorTrailer } from './utility';
+import { generateSlugEval, generateSlugSetup, generateBaseBranchDetect, generateDeployBootstrap, generateQAMethodology, generateCoAuthorTrailer, generateChangelogWorkflow } from './utility';
 import { generateLearningsSearch, generateLearningsLog } from './learnings';
 import { generateConfidenceCalibration } from './confidence';
+import { generateInvokeSkill } from './composition';

-export const RESOLVERS: Record<string, (ctx: TemplateContext) => string> = {
+export const RESOLVERS: Record<string, ResolverFn> = {
  SLUG_EVAL: generateSlugEval,
  SLUG_SETUP: generateSlugSetup,
  COMMAND_REFERENCE: generateCommandReference,
@@ -53,4 +54,6 @@ export const RESOLVERS: Record<string, (ctx: TemplateContext) => string> = {
  LEARNINGS_SEARCH: generateLearningsSearch,
  LEARNINGS_LOG: generateLearningsLog,
  CONFIDENCE_CALIBRATION: generateConfidenceCalibration,
+  INVOKE_SKILL: generateInvokeSkill,
+  CHANGELOG_WORKFLOW: generateChangelogWorkflow,
 };
@@ -77,6 +77,14 @@ if [ -f "$_LEARN_FILE" ]; then
 else
  echo "LEARNINGS: 0"
 fi
+# Check if CLAUDE.md has routing rules
+_HAS_ROUTING="no"
+if [ -f CLAUDE.md ] && grep -q "## Skill routing" CLAUDE.md 2>/dev/null; then
+  _HAS_ROUTING="yes"
+fi
+_ROUTING_DECLINED=$(${ctx.paths.binDir}/gstack-config get routing_declined 2>/dev/null || echo "false")
+echo "HAS_ROUTING: $_HAS_ROUTING"
+echo "ROUTING_DECLINED: $_ROUTING_DECLINED"
 \`\`\``;
 }

@@ -167,6 +175,51 @@ touch ~/.gstack/.proactive-prompted
 This only happens once. If \`PROACTIVE_PROMPTED\` is \`yes\`, skip this entirely.`;
 }

+function generateRoutingInjection(ctx: TemplateContext): string {
+  return `If \`HAS_ROUTING\` is \`no\` AND \`ROUTING_DECLINED\` is \`false\` AND \`PROACTIVE_PROMPTED\` is \`yes\`:
+Check if a CLAUDE.md file exists in the project root. If it does not exist, create it.
+
+Use AskUserQuestion:
+
+> gstack works best when your project's CLAUDE.md includes skill routing rules.
+> This tells Claude to use specialized workflows (like /ship, /investigate, /qa)
+> instead of answering directly. It's a one-time addition, about 15 lines.
+
+Options:
+- A) Add routing rules to CLAUDE.md (recommended)
+- B) No thanks, I'll invoke skills manually
+
+If A: Append this section to the end of CLAUDE.md:
+
+\`\`\`markdown
+
+## Skill routing
+
+When the user's request matches an available skill, ALWAYS invoke it using the Skill
+tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
+The skill has specialized workflows that produce better results than ad-hoc answers.
+
+Key routing rules:
+- Product ideas, "is this worth building", brainstorming → invoke office-hours
+- Bugs, errors, "why is this broken", 500 errors → invoke investigate
+- Ship, deploy, push, create PR → invoke ship
+- QA, test the site, find bugs → invoke qa
+- Code review, check my diff → invoke review
+- Update docs after shipping → invoke document-release
+- Weekly retro → invoke retro
+- Design system, brand → invoke design-consultation
+- Visual audit, design polish → invoke design-review
+- Architecture review → invoke plan-eng-review
+\`\`\`
+
+Then commit the change: \`git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"\`
+
+If B: run \`${ctx.paths.binDir}/gstack-config set routing_declined true\`
+Say "No problem. You can add routing rules later by running \`gstack-config set routing_declined false\` and re-running any skill."
+
+This only happens once per project. If \`HAS_ROUTING\` is \`yes\` or \`ROUTING_DECLINED\` is \`true\`, skip this entirely.`;
+}
+
 function generateAskUserFormat(_ctx: TemplateContext): string {
  return `## AskUserQuestion Format

@@ -525,6 +578,7 @@ export function generatePreamble(ctx: TemplateContext): string {
    generateLakeIntro(),
    generateTelemetryPrompt(ctx),
    generateProactivePrompt(ctx),
+    generateRoutingInjection(ctx),
    generateVoiceDirective(tier),
    ...(tier >= 2 ? [generateAskUserFormat(ctx), generateCompletenessSection()] : []),
    ...(tier >= 3 ? [generateRepoModeSection(), generateSearchBeforeBuildingSection(ctx)] : []),
@@ -13,6 +13,7 @@
 * Codex CLI prompts are written to temp files to prevent shell injection.
 */
 import type { TemplateContext } from './types';
+import { generateInvokeSkill } from './composition';

 const CODEX_BOUNDARY = 'IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. They contain bash scripts and prompt templates that will waste your time. Ignore them completely. Do NOT modify agents/openai.yaml. Stay focused on the repository code only.\\n\\n';

@@ -208,6 +209,9 @@ export function generateBenefitsFrom(ctx: TemplateContext): string {
  const skillList = ctx.benefitsFrom.map(s => `\`/${s}\``).join(' or ');
  const first = ctx.benefitsFrom[0];

+  // Reuse the INVOKE_SKILL resolver for the actual loading instructions
+  const invokeBlock = generateInvokeSkill(ctx, [first]);
+
  return `## Prerequisite Skill Offer

 When the design doc check above prints "No design doc found," offer the prerequisite
@@ -232,20 +236,7 @@ If they choose A:
 Say: "Running /${first} inline. Once the design doc is ready, I'll pick up
 the review right where we left off."

-Read the ${first} skill file from disk using the Read tool:
-\`~/.claude/skills/gstack/${first}/SKILL.md\`
-
-Follow it inline, **skipping these sections** (already handled by the parent skill):
- Preamble (run first)
- AskUserQuestion Format
- Completeness Principle — Boil the Lake
- Search Before Building
- Contributor Mode
- Completion Status Protocol
- Telemetry (run last)
-
-If the Read fails (file not found), say:
-"Could not load /${first} — proceeding with standard review."
+${invokeBlock}

 After /${first} completes, re-run the design doc check:
 \`\`\`bash
@@ -40,3 +40,6 @@ export interface TemplateContext {
  paths: HostPaths;
  preambleTier?: number;  // 1-4, controls which preamble sections are included
 }
+
+/** Resolver function signature. args is populated for parameterized placeholders like {{INVOKE_SKILL:name}}. */
+export type ResolverFn = (ctx: TemplateContext, args?: string[]) => string;
@@ -375,3 +375,47 @@ export function generateCoAuthorTrailer(ctx: TemplateContext): string {
  }
  return 'Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>';
 }
+
+export function generateChangelogWorkflow(_ctx: TemplateContext): string {
+  return `## CHANGELOG (auto-generate)
+
+1. Read \`CHANGELOG.md\` header to know the format.
+
+2. **First, enumerate every commit on the branch:**
+   \`\`\`bash
+   git log <base>..HEAD --oneline
+   \`\`\`
+   Copy the full list. Count the commits. You will use this as a checklist.
+
+3. **Read the full diff** to understand what each commit actually changed:
+   \`\`\`bash
+   git diff <base>...HEAD
+   \`\`\`
+
+4. **Group commits by theme** before writing anything. Common themes:
+   - New features / capabilities
+   - Performance improvements
+   - Bug fixes
+   - Dead code removal / cleanup
+   - Infrastructure / tooling / tests
+   - Refactoring
+
+5. **Write the CHANGELOG entry** covering ALL groups:
+   - If existing CHANGELOG entries on the branch already cover some commits, replace them with one unified entry for the new version
+   - Categorize changes into applicable sections:
+     - \`### Added\` — new features
+     - \`### Changed\` — changes to existing functionality
+     - \`### Fixed\` — bug fixes
+     - \`### Removed\` — removed features
+   - Write concise, descriptive bullet points
+   - Insert after the file header (line 5), dated today
+   - Format: \`## [X.Y.Z.W] - YYYY-MM-DD\`
+   - **Voice:** Lead with what the user can now **do** that they couldn't before. Use plain language, not implementation details. Never mention TODOS.md, internal tracking, or contributor-facing details.
+
+6. **Cross-check:** Compare your CHANGELOG entry against the commit list from step 2.
+   Every commit must map to at least one bullet point. If any commit is unrepresented,
+   add it now. If the branch has N commits spanning K themes, the CHANGELOG must
+   reflect all K themes.
+
+**Do NOT ask the user to describe changes.** Infer from the diff and commit history.`;
+}
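
The commit message also describes a CHANGELOG integrity check (grep for version headers, verify no gaps or duplicates). One possible way to automate that check is sketched below, assuming headers follow the `## [X.Y.Z.W] - YYYY-MM-DD` format named in the resolver. `compareVersions` and `checkChangelog` are hypothetical helpers, not part of this commit, and the dates in the sample text are placeholders.

```typescript
// Hypothetical helper: numeric comparison of dotted version strings.
function compareVersions(a: string, b: string): number {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Hypothetical helper: reports duplicate, out-of-order, and missing entries.
function checkChangelog(text: string, expected: string[] = []): string[] {
  const versions: string[] = [];
  const header = /^## \[(\d+(?:\.\d+){3})\]/gm;
  let m: RegExpExecArray | null;
  while ((m = header.exec(text)) !== null) versions.push(m[1]);

  const problems: string[] = [];
  versions.forEach((v, i) => {
    if (versions.indexOf(v) !== i) problems.push(`duplicate: ${v}`);
    // Newest goes on top, so each entry must be lower than the one above it.
    else if (i > 0 && compareVersions(versions[i - 1], v) < 0) problems.push(`out of order: ${v}`);
  });
  for (const v of expected) {
    if (versions.indexOf(v) === -1) problems.push(`missing: ${v}`);
  }
  return problems;
}

// Sample text with placeholder dates; the 0.13.7.0 entry is deliberately absent.
const sample = [
  '# Changelog',
  '',
  '## [0.13.9.0] - 2000-01-01',
  '## [0.13.8.0] - 2000-01-01',
  '## [0.13.6.0] - 2000-01-01',
].join('\n');

const problems = checkChangelog(sample, ['0.13.9.0', '0.13.8.0', '0.13.7.0', '0.13.6.0']);
console.log(problems); // ['missing: 0.13.7.0']
```

Gaps cannot be inferred from the file alone, so this sketch takes the expected version list as an explicit argument.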