refactor(opus-4.7): split overlay, align routing, fix trailer fallback

Follow-up to wintermute's initial Opus 4.7 migration commit (addresses
ship-quality review findings before v1.6.1.0 release).

Overlay split (model-overlays/):
- Move 4 Opus-4.7-specific nudges (Fan out, Effort-match, Batch your
  questions, Literal interpretation) from claude.md into new opus-4-7.md
  with {{INHERIT:claude}}
- claude.md now holds only model-agnostic nudges (Todo discipline, Think
  before heavy, Dedicated tools over Bash)
- Prevents Opus-4.7-specific guidance leaking onto Sonnet/Haiku
- Uses existing {{INHERIT:claude}} mechanism at
  scripts/resolvers/model-overlay.ts:28-43

scripts/models.ts:
- Add opus-4-7 to ALL_MODEL_NAMES
- resolveModel: claude-opus-4-7-* variants route to opus-4-7, all other
  claude-* variants continue to route to claude

scripts/resolvers/utility.ts:
- Update coAuthor trailer fallback: Opus 4.6 -> Opus 4.7 (fallback was
  missed in the initial migration commit)

scripts/resolvers/preamble/generate-routing-injection.ts:
- Align policy with new SKILL.md.tmpl: soft "when in doubt, invoke"
  instead of hard "ALWAYS invoke... Do NOT answer directly"
- Replace stale /checkpoint reference with /context-save +
  /context-restore (skills were renamed in v1.0.1.0)
- Expand route coverage to match full skill inventory:
  /plan-devex-review, /qa-only, /devex-review, /land-and-deploy,
  /setup-deploy, /canary, /open-gstack-browser, /setup-browser-cookies,
  /benchmark, /learn, /plan-tune, /health

scripts/resolvers/preamble/generate-voice-directive.ts:
- Voice example closing: "Want me to ship it?" -> "Want me to fix it?"
- Preserves directness while routing through review gates

SKILL.md.tmpl:
- Add routing triggers for skills that were missing from the list:
  /plan-devex-review, /qa-only, /devex-review, /land-and-deploy,
  /setup-deploy, /canary, /open-gstack-browser, /setup-browser-cookies,
  /benchmark, /learn, /plan-tune, /health
- Within Opus 4.7 overlay, added scope boundary to "Literal
  interpretation" nudge ("fix tests that this branch introduced or is
  responsible for")
- Added pacing exception to "Batch your questions" nudge so skills that
  require one-question-at-a-time pacing still win

Follow-up commit will regenerate SKILL.md files + update goldens.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-21 09:10:11 +02:00 · 2026-04-21 23:39:42 -07:00
parent 4d260db78a
commit da75ebaaa0
7 changed files with 81 additions and 42 deletions
@@ -13,6 +13,7 @@

 export const ALL_MODEL_NAMES = [
  'claude',
+  'opus-4-7',
  'gpt',
  'gpt-5.4',
  'gemini',
@@ -51,6 +52,7 @@ export function resolveModel(input: string): Model | null {
  if (/^gpt-5\.4(-|$)/.test(s)) return 'gpt-5.4';
  if (/^gpt(-|$)/.test(s)) return 'gpt';
  if (/^o[0-9]+(-|$)/.test(s)) return 'o-series';
+  if (/^claude-opus-4-7(-|$)/.test(s)) return 'opus-4-7';
  if (/^claude(-|$)/.test(s)) return 'claude';
  if (/^gemini(-|$)/.test(s)) return 'gemini';
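
The hunk above only routes correctly because the specific `claude-opus-4-7` pattern is tested before the generic `claude` one. A minimal sketch of that ordering follows. It assumes a trimmed-down `Model` union and a simple trim-and-lowercase normalization; `resolveModelSketch` and `ModelSketch` are hypothetical names, and the real function may normalize input differently and handle more models.

```typescript
// Hypothetical, reduced model set for illustration only.
type ModelSketch = 'claude' | 'opus-4-7' | 'gpt' | 'gpt-5.4' | 'o-series' | 'gemini';

function resolveModelSketch(input: string): ModelSketch | null {
  // Assumption: normalization here is just trim + lowercase.
  const s = input.trim().toLowerCase();

  if (/^gpt-5\.4(-|$)/.test(s)) return 'gpt-5.4';
  if (/^gpt(-|$)/.test(s)) return 'gpt';
  if (/^o[0-9]+(-|$)/.test(s)) return 'o-series';
  // Specific before general: if the generic claude check ran first,
  // every claude-opus-4-7-* id would be swallowed by it.
  if (/^claude-opus-4-7(-|$)/.test(s)) return 'opus-4-7';
  if (/^claude(-|$)/.test(s)) return 'claude';
  if (/^gemini(-|$)/.test(s)) return 'gemini';

  return null; // unknown model family
}
```

The `(-|$)` suffix matters as much as the order: it requires the match to end at a hyphen or at the end of the string, so an id such as `claude-opus-4-70` would not be treated as Opus 4.7 and would fall through to the generic `claude` overlay.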

@@ -20,23 +20,44 @@ If A: Append this section to the end of CLAUDE.md:

 ## Skill routing

-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.

 Key routing rules:
- Product ideas, "is this worth building", brainstorming → invoke office-hours
- Bugs, errors, "why is this broken", 500 errors → invoke investigate
- Ship, deploy, push, create PR → invoke ship
- QA, test the site, find bugs → invoke qa
- Code review, check my diff → invoke review
- Update docs after shipping → invoke document-release
- Weekly retro → invoke retro
- Design system, brand → invoke design-consultation
- Visual audit, design polish → invoke design-review
- Architecture review → invoke plan-eng-review
- Save progress, checkpoint, resume → invoke checkpoint
- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 \`\`\`

 Then commit the change: \`git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"\`
@@ -46,4 +67,3 @@ Say "No problem. You can add routing rules later by running \`gstack-config set

 This only happens once per project. If \`HAS_ROUTING\` is \`yes\` or \`ROUTING_DECLINED\` is \`true\`, skip this entirely.`;
 }
-
@@ -56,7 +56,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.

 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."

 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?`;
@@ -369,7 +369,7 @@ Minimum 0 per category.
 export function generateCoAuthorTrailer(ctx: TemplateContext): string {
  const { getHostConfig } = require('../../hosts/index');
  const hostConfig = getHostConfig(ctx.host);
-  return hostConfig.coAuthorTrailer || 'Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>';
+  return hostConfig.coAuthorTrailer || 'Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>';
 }

 export function generateChangelogWorkflow(_ctx: TemplateContext): string {
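
The trailer fallback in `generateCoAuthorTrailer` relies on `||`, so a host config that omits the trailer or sets it to an empty string both get the default. A minimal sketch of that behavior, assuming a host config with a single optional `coAuthorTrailer` field; `HostConfigSketch`, `DEFAULT_TRAILER`, and `trailerFor` are hypothetical names, not the project's API.

```typescript
// Hypothetical shape: only the field used by the fallback is modeled.
interface HostConfigSketch {
  coAuthorTrailer?: string;
}

const DEFAULT_TRAILER = 'Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>';

function trailerFor(cfg: HostConfigSketch): string {
  // `||` falls back on any falsy value (undefined, empty string),
  // whereas `??` would keep an empty string as-is.
  return cfg.coAuthorTrailer || DEFAULT_TRAILER;
}
```

Keeping the default in one place means the next model bump touches a single string, which is the kind of spot the initial migration commit missed.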