feat: composable skills — INVOKE_SKILL resolver + factoring infrastructure (v0.13.7.0) (#644)

* feat: add parameterized resolver support to gen-skill-docs

Extend the placeholder regex from {{WORD}} to {{WORD:arg1:arg2}}, enabling parameterized resolvers like {{INVOKE_SKILL:plan-ceo-review}}.

- Widen ResolverFn type to accept optional args?: string[]
- Update RESOLVERS record to use ResolverFn type
- Both replacement and unresolved-check regexes updated
- Fully backward compatible: existing {{WORD}} patterns unchanged

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add INVOKE_SKILL resolver for composable skill loading

New composition.ts resolver module that emits prose instructing Claude to read another skill's SKILL.md and follow it, skipping preamble sections. Supports optional skip= parameter for additional sections.

Usage: {{INVOKE_SKILL:plan-ceo-review}} or {{INVOKE_SKILL:plan-ceo-review:skip=Outside Voice}}

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: use frontmatter name: for skill symlinks and Codex paths

Patch all 3 name-derivation paths to read name: from SKILL.md frontmatter instead of relying solely on directory basenames. This enables directory names that differ from invocation names (e.g., run-tests/ directory with name: test).

- setup: link_claude_skill_dirs reads name: via grep, falls back to basename
- gen-skill-docs.ts: codexSkillName uses frontmatter name for Codex output paths
- gen-skill-docs.ts: moved frontmatter extraction before Codex path logic

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: extract CHANGELOG_WORKFLOW resolver from /ship

Move changelog generation logic into a reusable resolver. The resolver is changelog-only (no version bump per Codex review recommendation). Adds voice rules inline. /ship Step 5 now uses {{CHANGELOG_WORKFLOW}}.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: use INVOKE_SKILL resolver for plan-ceo-review office-hours fallback

Replace inline skill loading prose (read file, skip sections) with {{INVOKE_SKILL:office-hours}} in the mid-session detection path. The BENEFITS_FROM prerequisite offer is unchanged (separate use case).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: BENEFITS_FROM resolver delegates to INVOKE_SKILL

Eliminate duplicated skip-list logic by having generateBenefitsFrom call generateInvokeSkill internally. The wrapper (AskUserQuestion, design doc re-check) stays in BENEFITS_FROM. The loading instructions (read file, skip sections, error handling) come from INVOKE_SKILL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add resolver tests for INVOKE_SKILL, CHANGELOG_WORKFLOW, parameterized args

12 new tests covering:

- INVOKE_SKILL: template placeholder, default skip list, error handling, BENEFITS_FROM delegation
- CHANGELOG_WORKFLOW: content, cross-check, voice guidance, format
- Parameterized resolver infra: colon-separated args processing, no unresolved placeholders across all generated SKILL.md files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.13.7.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: journey routing tests — CLAUDE.md routing rules + stronger descriptions

Three journey E2E tests (ideation, ship, debug) were failing because Claude answered directly instead of invoking the Skill tool. Root cause: skill descriptions in system-reminder are too weak to override Claude's default behavior for tasks it can handle natively.

Fix has two parts:

1. CLAUDE.md routing rules in test workdir — Claude weighs project-level instructions higher than skill description metadata
2. "Proactively invoke" (not "suggest") in office-hours, investigate, ship descriptions — reinforces the routing signal

10/10 journey tests now pass (was 7/10).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: one-time CLAUDE.md routing injection prompt

Add a preamble section that checks if the project's CLAUDE.md has skill routing rules. If not (and user hasn't declined), asks once via AskUserQuestion to inject a "## Skill routing" section.

Root cause: skill descriptions in system-reminder metadata are too weak to reliably trigger proactive Skill tool invocation. CLAUDE.md project instructions carry higher weight in Claude's decision making.

- Preamble bash checks for "## Skill routing" in CLAUDE.md
- Stores decline in gstack-config (routing_declined=true)
- Only asks once per project (HAS_ROUTING check + config check)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: annotated config file + routing injection tests

gstack-config now writes a documented header on first config creation with every supported key explained (proactive, telemetry, auto_upgrade, skill_prefix, routing_declined, codex_reviews, skip_eng_review, etc.). Users can edit ~/.gstack/config.yaml directly, anytime. Also fixes grep to use ^KEY: anchoring so commented header lines don't shadow real config values.

Tests added:

- 7 new gstack-config tests (annotated header, no duplication, comment safety, routing_declined get/set/reset)
- 6 new gen-skill-docs tests (preamble routing injection: bash checks, config reads, AskUserQuestion, decline persistence, routing rules)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump to v0.13.9.0, separate CHANGELOG from main's releases

Split our branch's changes into a new 0.13.9.0 entry instead of jamming them into 0.13.7.0 which already landed on main as "Community Wave."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: clarify branch-scoped VERSION/CHANGELOG after merging main

Add explicit rules: merging main doesn't mean adopting main's version. Branch always gets its own entry on top with a higher version number. Three-point checklist after every merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: put our 0.13.9.0 entry on top of CHANGELOG

Newest version goes on top. Our branch lands next, so our entry must be above main's 0.13.8.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: restore missing 0.13.7.0 Community Wave entry

Accidentally dropped the 0.13.7.0 entry when reordering. All entries now present: 0.13.9.0 > 0.13.8.0 > 0.13.7.0 > 0.13.6.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add CHANGELOG integrity check rule

After any edit that moves/adds/removes entries, grep for version headers and verify no gaps or duplicates before committing. Prevents accidentally dropping entries during reordering.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
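
Reviewer note: the commit message describes widening the placeholder syntax from {{WORD}} to {{WORD:arg1:arg2}} and passing the colon-separated arguments to a resolver. A minimal sketch of how that substitution could work is below. Only the names `ResolverFn`, `RESOLVERS`, `INVOKE_SKILL`, and `CHANGELOG_WORKFLOW` come from the commit message; the resolver bodies, the `PLACEHOLDER` regex, and the helpers `resolveTemplate` and `findUnresolved` are assumptions made for illustration, not the code in gen-skill-docs.ts.

```typescript
// Sketch only: resolver bodies and helper names are hypothetical.
type ResolverFn = (args?: string[]) => string;

const RESOLVERS: Record<string, ResolverFn> = {
  // Stand-in for the real resolver, which emits much longer prose.
  INVOKE_SKILL: (args = []) => {
    const [skill, ...rest] = args;
    // Collect any skip=Section Name arguments after the skill name.
    const skip = rest
      .filter((a) => a.startsWith('skip='))
      .map((a) => a.slice('skip='.length));
    const extra = skip.length ? ` (also skipping: ${skip.join(', ')})` : '';
    return `Read ${skill}/SKILL.md and follow it${extra}.`;
  },
  // Unparameterized resolvers keep working because args is optional.
  CHANGELOG_WORKFLOW: () => 'CHANGELOG (auto-generate)',
};

// Matches {{WORD}} as well as {{WORD:arg1:arg2}} (assumed pattern).
const PLACEHOLDER = /\{\{([A-Z_]+)((?::[^:}]+)*)\}\}/g;

function resolveTemplate(tmpl: string): string {
  return tmpl.replace(PLACEHOLDER, (match: string, name: string, rawArgs: string) => {
    const fn = RESOLVERS[name];
    if (!fn) return match; // leave unknown placeholders for the unresolved check
    // rawArgs looks like ":a:b"; drop the leading colon before splitting.
    const args = rawArgs ? rawArgs.slice(1).split(':') : undefined;
    return fn(args);
  });
}

// Same shape as the unresolved-placeholder regex used in the tests in this diff.
function findUnresolved(text: string): string[] {
  return text.match(/\{\{[A-Z_]+(?::[^}]*)?\}\}/g) ?? [];
}
```

Under these assumptions, `resolveTemplate('{{INVOKE_SKILL:plan-ceo-review:skip=Outside Voice}}')` passes `['plan-ceo-review', 'skip=Outside Voice']` to the resolver, while a plain `{{CHANGELOG_WORKFLOW}}` calls its resolver with no arguments, which is what keeps existing templates backward compatible.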
2026-05-01 19:25:10 +02:00 · 2026-03-29 23:35:17 -06:00
parent 3cda8deec9
commit 66c09644a7
48 changed files with 1950 additions and 166 deletions
@@ -1153,6 +1153,138 @@ describe('BENEFITS_FROM resolver', () => {
    expect(ceoContent).toContain('office-hours/SKILL.md');
    expect(engContent).toContain('office-hours/SKILL.md');
  });
+
+  test('BENEFITS_FROM delegates to INVOKE_SKILL pattern', () => {
+    // Should contain the INVOKE_SKILL-style loading prose (not the old manual skip list)
+    expect(engContent).toContain('Follow its instructions from top to bottom');
+    expect(engContent).toContain('skipping these sections');
+    expect(ceoContent).toContain('Follow its instructions from top to bottom');
+  });
+});
+
+// --- {{INVOKE_SKILL}} resolver tests ---
+
+describe('INVOKE_SKILL resolver', () => {
+  const ceoContent = fs.readFileSync(path.join(ROOT, 'plan-ceo-review', 'SKILL.md'), 'utf-8');
+
+  test('plan-ceo-review uses INVOKE_SKILL for mid-session office-hours fallback', () => {
+    // The mid-session detection path should use INVOKE_SKILL-generated prose
+    expect(ceoContent).toContain('office-hours/SKILL.md');
+    expect(ceoContent).toContain('Follow its instructions from top to bottom');
+  });
+
+  test('INVOKE_SKILL output includes default skip list', () => {
+    expect(ceoContent).toContain('Preamble (run first)');
+    expect(ceoContent).toContain('Telemetry (run last)');
+    expect(ceoContent).toContain('AskUserQuestion Format');
+  });
+
+  test('INVOKE_SKILL output includes error handling', () => {
+    expect(ceoContent).toContain('If unreadable');
+    expect(ceoContent).toContain('Could not load');
+  });
+
+  test('template uses {{INVOKE_SKILL:office-hours}} placeholder', () => {
+    const tmpl = fs.readFileSync(path.join(ROOT, 'plan-ceo-review', 'SKILL.md.tmpl'), 'utf-8');
+    expect(tmpl).toContain('{{INVOKE_SKILL:office-hours}}');
+  });
+});
+
+// --- {{CHANGELOG_WORKFLOW}} resolver tests ---
+
+describe('CHANGELOG_WORKFLOW resolver', () => {
+  const shipContent = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
+
+  test('ship SKILL.md contains changelog workflow', () => {
+    expect(shipContent).toContain('CHANGELOG (auto-generate)');
+    expect(shipContent).toContain('git log <base>..HEAD --oneline');
+  });
+
+  test('changelog workflow includes cross-check step', () => {
+    expect(shipContent).toContain('Cross-check');
+    expect(shipContent).toContain('Every commit must map to at least one bullet point');
+  });
+
+  test('changelog workflow includes voice guidance', () => {
+    expect(shipContent).toContain('Lead with what the user can now **do**');
+  });
+
+  test('template uses {{CHANGELOG_WORKFLOW}} placeholder', () => {
+    const tmpl = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md.tmpl'), 'utf-8');
+    expect(tmpl).toContain('{{CHANGELOG_WORKFLOW}}');
+    // Should NOT contain the old inline changelog content
+    expect(tmpl).not.toContain('Group commits by theme');
+  });
+
+  test('changelog workflow includes keep-changelog format', () => {
+    expect(shipContent).toContain('### Added');
+    expect(shipContent).toContain('### Fixed');
+  });
+});
+
+// --- Parameterized resolver infrastructure tests ---
+
+describe('parameterized resolver support', () => {
+  test('gen-skill-docs regex handles colon-separated args', () => {
+    // Verify the template containing {{INVOKE_SKILL:office-hours}} was processed
+    // without leaving unresolved placeholders
+    const ceoContent = fs.readFileSync(path.join(ROOT, 'plan-ceo-review', 'SKILL.md'), 'utf-8');
+    expect(ceoContent).not.toMatch(/\{\{INVOKE_SKILL:[^}]+\}\}/);
+  });
+
+  test('templates with parameterized resolvers pass unresolved check', () => {
+    // All generated SKILL.md files should have no unresolved {{...}} placeholders
+    const skillDirs = fs.readdirSync(ROOT).filter(d =>
+      fs.existsSync(path.join(ROOT, d, 'SKILL.md'))
+    );
+    for (const dir of skillDirs) {
+      const content = fs.readFileSync(path.join(ROOT, dir, 'SKILL.md'), 'utf-8');
+      const unresolved = content.match(/\{\{[A-Z_]+(?::[^}]*)?\}\}/g);
+      if (unresolved) {
+        throw new Error(`${dir}/SKILL.md has unresolved placeholders: ${unresolved.join(', ')}`);
+      }
+    }
+  });
+});
+
+// --- Preamble routing injection tests ---
+
+describe('preamble routing injection', () => {
+  const shipContent = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
+
+  test('preamble bash checks for routing section in CLAUDE.md', () => {
+    expect(shipContent).toContain('grep -q "## Skill routing" CLAUDE.md');
+    expect(shipContent).toContain('HAS_ROUTING');
+  });
+
+  test('preamble bash reads routing_declined config', () => {
+    expect(shipContent).toContain('routing_declined');
+    expect(shipContent).toContain('ROUTING_DECLINED');
+  });
+
+  test('preamble includes routing injection AskUserQuestion', () => {
+    expect(shipContent).toContain('Add routing rules to CLAUDE.md');
+    expect(shipContent).toContain("I'll invoke skills manually");
+  });
+
+  test('routing injection respects prior decline', () => {
+    expect(shipContent).toContain('ROUTING_DECLINED');
+    expect(shipContent).toMatch(/routing_declined.*true/);
+  });
+
+  test('routing injection only fires when all conditions met', () => {
+    // Must be: HAS_ROUTING=no AND ROUTING_DECLINED=false AND PROACTIVE_PROMPTED=yes
+    expect(shipContent).toContain('HAS_ROUTING');
+    expect(shipContent).toContain('ROUTING_DECLINED');
+    expect(shipContent).toContain('PROACTIVE_PROMPTED');
+  });
+
+  test('routing section content includes key routing rules', () => {
+    expect(shipContent).toContain('invoke office-hours');
+    expect(shipContent).toContain('invoke investigate');
+    expect(shipContent).toContain('invoke ship');
+    expect(shipContent).toContain('invoke qa');
+  });
 });

 // --- {{DESIGN_OUTSIDE_VOICES}} resolver tests ---
@@ -1793,11 +1925,12 @@ describe('setup script validation', () => {
  });

  test('link_claude_skill_dirs creates relative symlinks', () => {
-    // Claude links should be relative: ln -snf "gstack/skill_name"
+    // Claude links should be relative: ln -snf "gstack/$dir_name"
+    // Uses dir_name (not skill_name) because symlink target must point to the physical directory
    const fnStart = setupContent.indexOf('link_claude_skill_dirs()');
    const fnEnd = setupContent.indexOf('}', setupContent.indexOf('linked[@]}', fnStart));
    const fnBody = setupContent.slice(fnStart, fnEnd);
-    expect(fnBody).toContain('ln -snf "gstack/$skill_name"');
+    expect(fnBody).toContain('ln -snf "gstack/$dir_name"');
  });

  test('setup supports --host auto|claude|codex|kiro', () => {
@@ -93,11 +93,30 @@ function installSkills(tmpDir: string) {
    }
  }

-  // Copy CLAUDE.md so Claude has project context for skill routing.
-  const claudeMdSrc = path.join(ROOT, 'CLAUDE.md');
-  if (fs.existsSync(claudeMdSrc)) {
-    fs.copyFileSync(claudeMdSrc, path.join(tmpDir, 'CLAUDE.md'));
-  }
+  // Write a CLAUDE.md with explicit routing instructions.
+  // The skill descriptions in system-reminder aren't strong enough to override
+  // Claude's default behavior of answering directly. A CLAUDE.md instruction
+  // puts routing rules in project context which Claude weighs more heavily.
+  fs.writeFileSync(path.join(tmpDir, 'CLAUDE.md'), `# Project Instructions
+
+## Skill routing
+
+When the user's request matches an available skill, ALWAYS invoke it using the Skill
+tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
+The skill has specialized workflows that produce better results than ad-hoc answers.
+
+Key routing rules:
+- Product ideas, "is this worth building", brainstorming → invoke office-hours
+- Bugs, errors, "why is this broken", 500 errors → invoke investigate
+- Ship, deploy, push, create PR → invoke ship
+- QA, test the site, find bugs → invoke qa
+- Code review, check my diff → invoke review
+- Update docs after shipping → invoke document-release
+- Weekly retro → invoke retro
+- Design system, brand → invoke design-consultation
+- Visual audit, design polish → invoke design-review
+- Architecture review → invoke plan-eng-review
+`);
 }

 /** Init a git repo with config */
@@ -1409,13 +1409,13 @@ describe('Skill trigger phrases', () => {
  ];

  for (const skill of SKILLS_REQUIRING_PROACTIVE) {
-    test(`${skill}/SKILL.md has "Proactively suggest" phrase`, () => {
+    test(`${skill}/SKILL.md has proactive routing phrase`, () => {
      const skillPath = path.join(ROOT, skill, 'SKILL.md');
      if (!fs.existsSync(skillPath)) return;
      const content = fs.readFileSync(skillPath, 'utf-8');
      const frontmatterEnd = content.indexOf('---', 4);
      const frontmatter = content.slice(0, frontmatterEnd);
-      expect(frontmatter).toMatch(/Proactively suggest/i);
+      expect(frontmatter).toMatch(/Proactively (suggest|invoke)/i);
    });
  }
 });
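
Reviewer note: the trigger-phrase test above slices the frontmatter with `content.indexOf('---', 4)`, and the commit message says skill names are now read from the frontmatter `name:` field. Both can be sketched with one small helper. The functions `extractFrontmatter`, `frontmatterField`, and `hasProactivePhrase` are hypothetical names for illustration and do not appear in the repository; unlike the test, this sketch requires the closing `---` to start a line and returns `null` when no closing delimiter is found (in the test, a missing delimiter makes `indexOf` return -1, so `slice(0, -1)` covers nearly the whole file).

```typescript
// Sketch only: these helpers are hypothetical and not part of the repository.

// Return the text between the opening and closing '---' lines, or null if absent.
function extractFrontmatter(content: string): string | null {
  if (!content.startsWith('---')) return null;
  const end = content.indexOf('\n---', 3);
  if (end === -1) return null;
  return content.slice(3, end).trim();
}

// Read a simple single-line "key: value" field from the frontmatter.
function frontmatterField(content: string, key: string): string | null {
  const fm = extractFrontmatter(content);
  if (fm === null) return null;
  for (const line of fm.split('\n')) {
    const m = line.match(/^([A-Za-z_-]+):\s*(.*)$/);
    if (m && m[1] === key) return (m[2] ?? '').trim();
  }
  return null;
}

// Mirrors the test's check: the phrase must appear in the frontmatter, not the body.
function hasProactivePhrase(content: string): boolean {
  const fm = extractFrontmatter(content);
  return fm !== null && /Proactively (suggest|invoke)/i.test(fm);
}
```

A caller that wants the commit's "fall back to basename" behavior could use `frontmatterField(content, 'name') ?? dirName`, where `dirName` is the directory basename.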