chore(opus-4.7): regenerate SKILL.md files + update golden fixtures

Mechanical consequence of the preceding source changes (overlay split, routing alignment, voice example, routing expansion). No behavior change beyond what that commit introduced.

- 36 SKILL.md files regenerated via bun run gen:skill-docs
- 3 golden fixtures updated (claude, codex, factory ship skill)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 05:56:41 +02:00 · 2026-04-21 23:40:07 -07:00
parent da75ebaaa0
commit 6701205118
39 changed files with 1459 additions and 1480 deletions
@@ -266,23 +266,44 @@ If A: Append this section to the end of CLAUDE.md:

 ## Skill routing

-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.

 Key routing rules:
- Product ideas, "is this worth building", brainstorming → invoke office-hours
- Bugs, errors, "why is this broken", 500 errors → invoke investigate
- Ship, deploy, push, create PR → invoke ship
- QA, test the site, find bugs → invoke qa
- Code review, check my diff → invoke review
- Update docs after shipping → invoke document-release
- Weekly retro → invoke retro
- Design system, brand → invoke design-consultation
- Visual audit, design polish → invoke design-review
- Architecture review → invoke plan-eng-review
- Save progress, checkpoint, resume → invoke checkpoint
- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```

 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -349,30 +370,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell
 equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.

-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice

 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -418,7 +415,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.

 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."

 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?