diff --git a/SKILL.md b/SKILL.md
index 854ddd4e..95f22604 100644
--- a/SKILL.md
+++ b/SKILL.md
@@ -263,23 +263,44 @@ If A: Append this section to the end of CLAUDE.md:

 ## Skill routing

-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.

 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```

 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -346,30 +367,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.

-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice

 **Tone:** direct, concrete, sharp, never corporate, never academic. Sound like a builder, not a consultant. Name the file, the function, the command. No filler, no throat-clearing.
@@ -499,12 +496,18 @@ quality gates that produce better results than answering inline.
 - User asks to review architecture, lock in the plan, "does this design make sense" → invoke `/plan-eng-review`
 - User asks about design system, brand, visual identity, "how should this look" → invoke `/design-consultation`
 - User asks to review design of a plan → invoke `/plan-design-review`
+- User asks about developer experience of a plan, API/CLI/SDK design → invoke `/plan-devex-review`
 - User wants all reviews done automatically, "review everything" → invoke `/autoplan`
 - User reports a bug, error, broken behavior, "why is this broken", "this doesn't work", "wtf", "something's wrong" → invoke `/investigate`
 - User asks to test the site, find bugs, QA, "does this work", "check the deploy" → invoke `/qa`
+- User asks to just report bugs without fixing → invoke `/qa-only`
 - User asks to review code, check the diff, pre-landing review, "look at my changes" → invoke `/review`
 - User asks about visual polish, design audit of a live site, "this looks off" → invoke `/design-review`
+- User asks to audit the live developer experience, time-to-hello-world → invoke `/devex-review`
 - User asks to ship, deploy, push, create a PR, "let's land this", "send it" → invoke `/ship`
+- User asks to merge + deploy + verify as one flow → invoke `/land-and-deploy`
+- User asks to configure deployment for the project → invoke `/setup-deploy`
+- User asks to monitor prod after shipping, post-deploy checks → invoke `/canary`
 - User asks to update docs after shipping → invoke `/document-release`
 - User asks for a weekly retro, what did we ship, "how'd we do" → invoke `/retro`
 - User asks for a second opinion, codex review → invoke `/codex`
@@ -515,6 +518,12 @@ quality gates that produce better results than answering inline.
 - User asks to resume, restore, "where was I" → invoke `/context-restore`
 - User asks about security, OWASP, vulnerabilities, "is this secure" → invoke `/cso`
 - User asks to make a PDF, document, publication → invoke `/make-pdf`
+- User asks to launch a real browser for QA, "open the browser" → invoke `/open-gstack-browser`
+- User asks to import cookies for authenticated testing → invoke `/setup-browser-cookies`
+- User asks about page speed, performance regression, benchmarks → invoke `/benchmark`
+- User asks what gstack has learned, "show learnings" → invoke `/learn`
+- User asks to tune question sensitivity, "stop asking me that" → invoke `/plan-tune`
+- User asks for code quality dashboard, "health check" → invoke `/health`

 **When in doubt, invoke the skill.** A false positive (invoking a skill that wasn't needed) is cheaper than a false negative (answering ad-hoc when a structured workflow
diff --git a/autoplan/SKILL.md b/autoplan/SKILL.md
index 7e8467fe..9b7c7f32 100644
--- a/autoplan/SKILL.md
+++ b/autoplan/SKILL.md
@@ -272,23 +272,44 @@ If A: Append this section to the end of CLAUDE.md:

 ## Skill routing

-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```

 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -355,30 +376,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.

-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice

 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -424,7 +421,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.

 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."

 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/benchmark-models/SKILL.md b/benchmark-models/SKILL.md
index 114f211d..078c5c92 100644
--- a/benchmark-models/SKILL.md
+++ b/benchmark-models/SKILL.md
@@ -265,23 +265,44 @@ If A: Append this section to the end of CLAUDE.md:

 ## Skill routing

-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.

 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```

 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -348,30 +369,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.

-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice

 **Tone:** direct, concrete, sharp, never corporate, never academic. Sound like a builder, not a consultant. Name the file, the function, the command. No filler, no throat-clearing.
diff --git a/benchmark/SKILL.md b/benchmark/SKILL.md
index 61576c4b..ae22b509 100644
--- a/benchmark/SKILL.md
+++ b/benchmark/SKILL.md
@@ -265,23 +265,44 @@ If A: Append this section to the end of CLAUDE.md:

 ## Skill routing

-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.

 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```

 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -348,30 +369,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.

-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice

 **Tone:** direct, concrete, sharp, never corporate, never academic. Sound like a builder, not a consultant. Name the file, the function, the command. No filler, no throat-clearing.
diff --git a/browse/SKILL.md b/browse/SKILL.md
index 6f3a4a46..864644a0 100644
--- a/browse/SKILL.md
+++ b/browse/SKILL.md
@@ -264,23 +264,44 @@ If A: Append this section to the end of CLAUDE.md:

 ## Skill routing

-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.

 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```

 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -347,30 +368,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.

-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice

 **Tone:** direct, concrete, sharp, never corporate, never academic. Sound like a builder, not a consultant. Name the file, the function, the command. No filler, no throat-clearing.
diff --git a/canary/SKILL.md b/canary/SKILL.md
index 07c58304..af8c7dd4 100644
--- a/canary/SKILL.md
+++ b/canary/SKILL.md
@@ -264,23 +264,44 @@ If A: Append this section to the end of CLAUDE.md:
 
 ## Skill routing
 
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -347,30 +368,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
 
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -416,7 +413,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
 
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/codex/SKILL.md b/codex/SKILL.md
index 3e358705..4eda87fd 100644
--- a/codex/SKILL.md
+++ b/codex/SKILL.md
@@ -266,23 +266,44 @@ If A: Append this section to the end of CLAUDE.md:
 
 ## Skill routing
 
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -349,30 +370,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
 
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -418,7 +415,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
 
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/context-restore/SKILL.md b/context-restore/SKILL.md
index a2f0c89d..8e3bf814 100644
--- a/context-restore/SKILL.md
+++ b/context-restore/SKILL.md
@@ -268,23 +268,44 @@ If A: Append this section to the end of CLAUDE.md:
 
 ## Skill routing
 
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -351,30 +372,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
 
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -420,7 +417,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
 
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/context-save/SKILL.md b/context-save/SKILL.md
index 98160c81..04370e7e 100644
--- a/context-save/SKILL.md
+++ b/context-save/SKILL.md
@@ -268,23 +268,44 @@ If A: Append this section to the end of CLAUDE.md:
 
 ## Skill routing
 
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -351,30 +372,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
 
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -420,7 +417,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
 
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/cso/SKILL.md b/cso/SKILL.md index a4280779..b020255f 100644 --- a/cso/SKILL.md +++ b/cso/SKILL.md @@ -269,23 +269,44 @@ If A: Append this section to the end of CLAUDE.md: ## Skill routing -When the user's request matches an available skill, ALWAYS invoke it using the Skill -tool as your FIRST action. Do NOT answer directly, do NOT use other tools first. -The skill has specialized workflows that produce better results than ad-hoc answers. +When the user's request matches an available skill, invoke it via the Skill tool. The +skill has multi-step workflows, checklists, and quality gates that produce better +results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is +cheaper than a false negative. Key routing rules: -- Product ideas, "is this worth building", brainstorming → invoke office-hours -- Bugs, errors, "why is this broken", 500 errors → invoke investigate -- Ship, deploy, push, create PR → invoke ship -- QA, test the site, find bugs → invoke qa -- Code review, check my diff → invoke review -- Update docs after shipping → invoke document-release -- Weekly retro → invoke retro -- Design system, brand → invoke design-consultation -- Visual audit, design polish → invoke design-review -- Architecture review → invoke plan-eng-review -- Save progress, checkpoint, resume → invoke checkpoint -- Code quality, health check → invoke health +- Product ideas, "is this worth building", brainstorming → invoke /office-hours +- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review +- Architecture, "does this design make sense" → invoke /plan-eng-review +- Design system, brand, "how should this look" → invoke /design-consultation +- Design review of a plan → invoke /plan-design-review +- Developer experience of a plan → invoke /plan-devex-review +- "Review everything", full review pipeline → invoke /autoplan +- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate +- Test the site, 
find bugs, "does this work" → invoke /qa (or /qa-only for report only) +- Code review, check the diff, "look at my changes" → invoke /review +- Visual polish, design audit, "this looks off" → invoke /design-review +- Developer experience audit, try onboarding → invoke /devex-review +- Ship, deploy, create a PR, "send it" → invoke /ship +- Merge + deploy + verify → invoke /land-and-deploy +- Configure deployment → invoke /setup-deploy +- Post-deploy monitoring → invoke /canary +- Update docs after shipping → invoke /document-release +- Weekly retro, "how'd we do" → invoke /retro +- Second opinion, codex review → invoke /codex +- Safety mode, careful mode, lock it down → invoke /careful or /guard +- Restrict edits to a directory → invoke /freeze or /unfreeze +- Upgrade gstack → invoke /gstack-upgrade +- Save progress, "save my work" → invoke /context-save +- Resume, restore, "where was I" → invoke /context-restore +- Security audit, OWASP, "is this secure" → invoke /cso +- Make a PDF, document, publication → invoke /make-pdf +- Launch real browser for QA → invoke /open-gstack-browser +- Import cookies for authenticated testing → invoke /setup-browser-cookies +- Performance regression, page speed, benchmarks → invoke /benchmark +- Review what gstack has learned → invoke /learn +- Tune question sensitivity → invoke /plan-tune +- Code quality dashboard → invoke /health ``` Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` @@ -352,30 +373,6 @@ the user course-correct cheaply instead of mid-flight. **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer. -**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer -subagents than 4.6. 
When a task has independent sub-problems (investigating multiple -files, testing multiple endpoints, auditing multiple components), explicitly parallelize: -spawn subagents in the same turn, run independent checks concurrently, don't serialize -work that has no dependencies. If you catch yourself doing A then B then C where none -depend on each other, stop and do all three at once. - -**Effort-match the step.** Simple file reads, config checks, command lookups, and -mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve -extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs, -security implications, design decisions with competing constraints. Over-thinking -simple steps wastes tokens and time. - -**Batch your questions.** If you need to clarify multiple things before proceeding, -ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per -turn. Three questions in one message beats three back-and-forth exchanges. - -**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and -will not silently generalize. When the user says "fix the tests," fix ALL failing tests, -not just the first one. When the user says "update the docs," update every relevant doc, -not just the most obvious one. Read the full scope of what was asked and deliver the -full scope. If the request is ambiguous, ask once (batched with any other questions), -then execute completely. - ## Voice You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography. @@ -421,7 +418,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte - End with what to do. Give the action. **Example of the right voice:** -"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?" 
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?" Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..." **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work? diff --git a/design-consultation/SKILL.md b/design-consultation/SKILL.md index f7e1c5a2..c05f240c 100644 --- a/design-consultation/SKILL.md +++ b/design-consultation/SKILL.md @@ -269,23 +269,44 @@ If A: Append this section to the end of CLAUDE.md: ## Skill routing -When the user's request matches an available skill, ALWAYS invoke it using the Skill -tool as your FIRST action. Do NOT answer directly, do NOT use other tools first. -The skill has specialized workflows that produce better results than ad-hoc answers. +When the user's request matches an available skill, invoke it via the Skill tool. The +skill has multi-step workflows, checklists, and quality gates that produce better +results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is +cheaper than a false negative. 
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -352,30 +373,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -421,7 +418,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/design-html/SKILL.md b/design-html/SKILL.md
index 7f584d4b..44e9b788 100644
--- a/design-html/SKILL.md
+++ b/design-html/SKILL.md
@@ -271,23 +271,44 @@ If A: Append this section to the end of CLAUDE.md:
 ## Skill routing
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -354,30 +375,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -423,7 +420,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/design-review/SKILL.md b/design-review/SKILL.md
index 91add250..6cbe3c45 100644
--- a/design-review/SKILL.md
+++ b/design-review/SKILL.md
@@ -269,23 +269,44 @@ If A: Append this section to the end of CLAUDE.md:
 ## Skill routing
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -352,30 +373,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -421,7 +418,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/design-shotgun/SKILL.md b/design-shotgun/SKILL.md
index d7d1bc60..e078d683 100644
--- a/design-shotgun/SKILL.md
+++ b/design-shotgun/SKILL.md
@@ -266,23 +266,44 @@ If A: Append this section to the end of CLAUDE.md:
 ## Skill routing
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -349,30 +370,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -418,7 +415,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/devex-review/SKILL.md b/devex-review/SKILL.md
index 6a8d8c7f..790c97b7 100644
--- a/devex-review/SKILL.md
+++ b/devex-review/SKILL.md
@@ -269,23 +269,44 @@ If A: Append this section to the end of CLAUDE.md:
 ## Skill routing
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -352,30 +373,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -421,7 +418,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/document-release/SKILL.md b/document-release/SKILL.md
index 84e8ad3f..999e4ffe 100644
--- a/document-release/SKILL.md
+++ b/document-release/SKILL.md
@@ -266,23 +266,44 @@ If A: Append this section to the end of CLAUDE.md:
 ## Skill routing
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -349,30 +370,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -418,7 +415,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?" Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..." **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work? diff --git a/health/SKILL.md b/health/SKILL.md index 5da61817..ac9bc4b2 100644 --- a/health/SKILL.md +++ b/health/SKILL.md @@ -266,23 +266,44 @@ If A: Append this section to the end of CLAUDE.md: ## Skill routing -When the user's request matches an available skill, ALWAYS invoke it using the Skill -tool as your FIRST action. Do NOT answer directly, do NOT use other tools first. -The skill has specialized workflows that produce better results than ad-hoc answers. +When the user's request matches an available skill, invoke it via the Skill tool. The +skill has multi-step workflows, checklists, and quality gates that produce better +results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is +cheaper than a false negative. 
Key routing rules: -- Product ideas, "is this worth building", brainstorming → invoke office-hours -- Bugs, errors, "why is this broken", 500 errors → invoke investigate -- Ship, deploy, push, create PR → invoke ship -- QA, test the site, find bugs → invoke qa -- Code review, check my diff → invoke review -- Update docs after shipping → invoke document-release -- Weekly retro → invoke retro -- Design system, brand → invoke design-consultation -- Visual audit, design polish → invoke design-review -- Architecture review → invoke plan-eng-review -- Save progress, checkpoint, resume → invoke checkpoint -- Code quality, health check → invoke health +- Product ideas, "is this worth building", brainstorming → invoke /office-hours +- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review +- Architecture, "does this design make sense" → invoke /plan-eng-review +- Design system, brand, "how should this look" → invoke /design-consultation +- Design review of a plan → invoke /plan-design-review +- Developer experience of a plan → invoke /plan-devex-review +- "Review everything", full review pipeline → invoke /autoplan +- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate +- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only) +- Code review, check the diff, "look at my changes" → invoke /review +- Visual polish, design audit, "this looks off" → invoke /design-review +- Developer experience audit, try onboarding → invoke /devex-review +- Ship, deploy, create a PR, "send it" → invoke /ship +- Merge + deploy + verify → invoke /land-and-deploy +- Configure deployment → invoke /setup-deploy +- Post-deploy monitoring → invoke /canary +- Update docs after shipping → invoke /document-release +- Weekly retro, "how'd we do" → invoke /retro +- Second opinion, codex review → invoke /codex +- Safety mode, careful mode, lock it down → invoke /careful or /guard +- Restrict edits to a directory → 
invoke /freeze or /unfreeze +- Upgrade gstack → invoke /gstack-upgrade +- Save progress, "save my work" → invoke /context-save +- Resume, restore, "where was I" → invoke /context-restore +- Security audit, OWASP, "is this secure" → invoke /cso +- Make a PDF, document, publication → invoke /make-pdf +- Launch real browser for QA → invoke /open-gstack-browser +- Import cookies for authenticated testing → invoke /setup-browser-cookies +- Performance regression, page speed, benchmarks → invoke /benchmark +- Review what gstack has learned → invoke /learn +- Tune question sensitivity → invoke /plan-tune +- Code quality dashboard → invoke /health ``` Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` @@ -349,30 +370,6 @@ the user course-correct cheaply instead of mid-flight. **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer. -**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer -subagents than 4.6. When a task has independent sub-problems (investigating multiple -files, testing multiple endpoints, auditing multiple components), explicitly parallelize: -spawn subagents in the same turn, run independent checks concurrently, don't serialize -work that has no dependencies. If you catch yourself doing A then B then C where none -depend on each other, stop and do all three at once. - -**Effort-match the step.** Simple file reads, config checks, command lookups, and -mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve -extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs, -security implications, design decisions with competing constraints. Over-thinking -simple steps wastes tokens and time. 
- -**Batch your questions.** If you need to clarify multiple things before proceeding, -ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per -turn. Three questions in one message beats three back-and-forth exchanges. - -**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and -will not silently generalize. When the user says "fix the tests," fix ALL failing tests, -not just the first one. When the user says "update the docs," update every relevant doc, -not just the most obvious one. Read the full scope of what was asked and deliver the -full scope. If the request is ambiguous, ask once (batched with any other questions), -then execute completely. - ## Voice You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography. @@ -418,7 +415,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte - End with what to do. Give the action. **Example of the right voice:** -"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?" +"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?" Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..." **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work? 
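Reviewer note on the routing hunk repeated in each SKILL.md above: the added list reads like a lookup from trigger phrases to slash commands. A minimal sketch of that idea in Python follows. The `ROUTES` table copies a few entries from the added lines; the `route_request` helper and the substring-matching strategy are hypothetical illustrations, not how the Skill tool actually resolves a request.

```python
from typing import Optional

# A few trigger phrases copied from the routing rules added in this diff.
# The table layout and the matching logic are illustrative assumptions.
ROUTES: list[tuple[tuple[str, ...], str]] = [
    (("is this worth building", "brainstorm"), "/office-hours"),
    (("why is this broken", "this doesn't work"), "/investigate"),
    (("look at my changes", "code review"), "/review"),
    (("send it", "create a pr"), "/ship"),
    (("where was i", "resume"), "/context-restore"),
    (("is this secure", "owasp"), "/cso"),
]


def route_request(request: str) -> Optional[str]:
    """Return the first skill whose trigger phrase appears in the request."""
    text = request.lower()
    for triggers, skill in ROUTES:
        if any(trigger in text for trigger in triggers):
            return skill
    return None  # no trigger matched: no skill is suggested


if __name__ == "__main__":
    for sample in ("Why is this broken? I get a 500", "Looks good, send it"):
        print(sample, "->", route_request(sample))
```

Under this sketch the first matching row wins, so ordering matters when two rules share vocabulary. The new wording in the diff ("When in doubt, invoke the skill") suggests leaning toward a match over no match, which a real matcher would need to weigh far more carefully than plain substring tests do.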
diff --git a/investigate/SKILL.md b/investigate/SKILL.md index eeeb0eda..31c66149 100644 --- a/investigate/SKILL.md +++ b/investigate/SKILL.md @@ -283,23 +283,44 @@ If A: Append this section to the end of CLAUDE.md: ## Skill routing -When the user's request matches an available skill, ALWAYS invoke it using the Skill -tool as your FIRST action. Do NOT answer directly, do NOT use other tools first. -The skill has specialized workflows that produce better results than ad-hoc answers. +When the user's request matches an available skill, invoke it via the Skill tool. The +skill has multi-step workflows, checklists, and quality gates that produce better +results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is +cheaper than a false negative. Key routing rules: -- Product ideas, "is this worth building", brainstorming → invoke office-hours -- Bugs, errors, "why is this broken", 500 errors → invoke investigate -- Ship, deploy, push, create PR → invoke ship -- QA, test the site, find bugs → invoke qa -- Code review, check my diff → invoke review -- Update docs after shipping → invoke document-release -- Weekly retro → invoke retro -- Design system, brand → invoke design-consultation -- Visual audit, design polish → invoke design-review -- Architecture review → invoke plan-eng-review -- Save progress, checkpoint, resume → invoke checkpoint -- Code quality, health check → invoke health +- Product ideas, "is this worth building", brainstorming → invoke /office-hours +- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review +- Architecture, "does this design make sense" → invoke /plan-eng-review +- Design system, brand, "how should this look" → invoke /design-consultation +- Design review of a plan → invoke /plan-design-review +- Developer experience of a plan → invoke /plan-devex-review +- "Review everything", full review pipeline → invoke /autoplan +- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke 
/investigate +- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only) +- Code review, check the diff, "look at my changes" → invoke /review +- Visual polish, design audit, "this looks off" → invoke /design-review +- Developer experience audit, try onboarding → invoke /devex-review +- Ship, deploy, create a PR, "send it" → invoke /ship +- Merge + deploy + verify → invoke /land-and-deploy +- Configure deployment → invoke /setup-deploy +- Post-deploy monitoring → invoke /canary +- Update docs after shipping → invoke /document-release +- Weekly retro, "how'd we do" → invoke /retro +- Second opinion, codex review → invoke /codex +- Safety mode, careful mode, lock it down → invoke /careful or /guard +- Restrict edits to a directory → invoke /freeze or /unfreeze +- Upgrade gstack → invoke /gstack-upgrade +- Save progress, "save my work" → invoke /context-save +- Resume, restore, "where was I" → invoke /context-restore +- Security audit, OWASP, "is this secure" → invoke /cso +- Make a PDF, document, publication → invoke /make-pdf +- Launch real browser for QA → invoke /open-gstack-browser +- Import cookies for authenticated testing → invoke /setup-browser-cookies +- Performance regression, page speed, benchmarks → invoke /benchmark +- Review what gstack has learned → invoke /learn +- Tune question sensitivity → invoke /plan-tune +- Code quality dashboard → invoke /health ``` Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` @@ -366,30 +387,6 @@ the user course-correct cheaply instead of mid-flight. **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer. -**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer -subagents than 4.6. 
When a task has independent sub-problems (investigating multiple -files, testing multiple endpoints, auditing multiple components), explicitly parallelize: -spawn subagents in the same turn, run independent checks concurrently, don't serialize -work that has no dependencies. If you catch yourself doing A then B then C where none -depend on each other, stop and do all three at once. - -**Effort-match the step.** Simple file reads, config checks, command lookups, and -mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve -extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs, -security implications, design decisions with competing constraints. Over-thinking -simple steps wastes tokens and time. - -**Batch your questions.** If you need to clarify multiple things before proceeding, -ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per -turn. Three questions in one message beats three back-and-forth exchanges. - -**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and -will not silently generalize. When the user says "fix the tests," fix ALL failing tests, -not just the first one. When the user says "update the docs," update every relevant doc, -not just the most obvious one. Read the full scope of what was asked and deliver the -full scope. If the request is ambiguous, ask once (batched with any other questions), -then execute completely. - ## Voice You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography. @@ -435,7 +432,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte - End with what to do. Give the action. **Example of the right voice:** -"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?" 
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?" Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..." **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work? diff --git a/land-and-deploy/SKILL.md b/land-and-deploy/SKILL.md index 6ca05f44..2fd7a7d9 100644 --- a/land-and-deploy/SKILL.md +++ b/land-and-deploy/SKILL.md @@ -263,23 +263,44 @@ If A: Append this section to the end of CLAUDE.md: ## Skill routing -When the user's request matches an available skill, ALWAYS invoke it using the Skill -tool as your FIRST action. Do NOT answer directly, do NOT use other tools first. -The skill has specialized workflows that produce better results than ad-hoc answers. +When the user's request matches an available skill, invoke it via the Skill tool. The +skill has multi-step workflows, checklists, and quality gates that produce better +results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is +cheaper than a false negative. 
Key routing rules: -- Product ideas, "is this worth building", brainstorming → invoke office-hours -- Bugs, errors, "why is this broken", 500 errors → invoke investigate -- Ship, deploy, push, create PR → invoke ship -- QA, test the site, find bugs → invoke qa -- Code review, check my diff → invoke review -- Update docs after shipping → invoke document-release -- Weekly retro → invoke retro -- Design system, brand → invoke design-consultation -- Visual audit, design polish → invoke design-review -- Architecture review → invoke plan-eng-review -- Save progress, checkpoint, resume → invoke checkpoint -- Code quality, health check → invoke health +- Product ideas, "is this worth building", brainstorming → invoke /office-hours +- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review +- Architecture, "does this design make sense" → invoke /plan-eng-review +- Design system, brand, "how should this look" → invoke /design-consultation +- Design review of a plan → invoke /plan-design-review +- Developer experience of a plan → invoke /plan-devex-review +- "Review everything", full review pipeline → invoke /autoplan +- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate +- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only) +- Code review, check the diff, "look at my changes" → invoke /review +- Visual polish, design audit, "this looks off" → invoke /design-review +- Developer experience audit, try onboarding → invoke /devex-review +- Ship, deploy, create a PR, "send it" → invoke /ship +- Merge + deploy + verify → invoke /land-and-deploy +- Configure deployment → invoke /setup-deploy +- Post-deploy monitoring → invoke /canary +- Update docs after shipping → invoke /document-release +- Weekly retro, "how'd we do" → invoke /retro +- Second opinion, codex review → invoke /codex +- Safety mode, careful mode, lock it down → invoke /careful or /guard +- Restrict edits to a directory → 
invoke /freeze or /unfreeze +- Upgrade gstack → invoke /gstack-upgrade +- Save progress, "save my work" → invoke /context-save +- Resume, restore, "where was I" → invoke /context-restore +- Security audit, OWASP, "is this secure" → invoke /cso +- Make a PDF, document, publication → invoke /make-pdf +- Launch real browser for QA → invoke /open-gstack-browser +- Import cookies for authenticated testing → invoke /setup-browser-cookies +- Performance regression, page speed, benchmarks → invoke /benchmark +- Review what gstack has learned → invoke /learn +- Tune question sensitivity → invoke /plan-tune +- Code quality dashboard → invoke /health ``` Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` @@ -346,30 +367,6 @@ the user course-correct cheaply instead of mid-flight. **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer. -**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer -subagents than 4.6. When a task has independent sub-problems (investigating multiple -files, testing multiple endpoints, auditing multiple components), explicitly parallelize: -spawn subagents in the same turn, run independent checks concurrently, don't serialize -work that has no dependencies. If you catch yourself doing A then B then C where none -depend on each other, stop and do all three at once. - -**Effort-match the step.** Simple file reads, config checks, command lookups, and -mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve -extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs, -security implications, design decisions with competing constraints. Over-thinking -simple steps wastes tokens and time. 
- -**Batch your questions.** If you need to clarify multiple things before proceeding, -ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per -turn. Three questions in one message beats three back-and-forth exchanges. - -**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and -will not silently generalize. When the user says "fix the tests," fix ALL failing tests, -not just the first one. When the user says "update the docs," update every relevant doc, -not just the most obvious one. Read the full scope of what was asked and deliver the -full scope. If the request is ambiguous, ask once (batched with any other questions), -then execute completely. - ## Voice You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography. @@ -415,7 +412,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte - End with what to do. Give the action. **Example of the right voice:** -"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?" +"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?" Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..." **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work? 
diff --git a/learn/SKILL.md b/learn/SKILL.md index e041c7ae..bac6abd6 100644 --- a/learn/SKILL.md +++ b/learn/SKILL.md @@ -266,23 +266,44 @@ If A: Append this section to the end of CLAUDE.md: ## Skill routing -When the user's request matches an available skill, ALWAYS invoke it using the Skill -tool as your FIRST action. Do NOT answer directly, do NOT use other tools first. -The skill has specialized workflows that produce better results than ad-hoc answers. +When the user's request matches an available skill, invoke it via the Skill tool. The +skill has multi-step workflows, checklists, and quality gates that produce better +results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is +cheaper than a false negative. Key routing rules: -- Product ideas, "is this worth building", brainstorming → invoke office-hours -- Bugs, errors, "why is this broken", 500 errors → invoke investigate -- Ship, deploy, push, create PR → invoke ship -- QA, test the site, find bugs → invoke qa -- Code review, check my diff → invoke review -- Update docs after shipping → invoke document-release -- Weekly retro → invoke retro -- Design system, brand → invoke design-consultation -- Visual audit, design polish → invoke design-review -- Architecture review → invoke plan-eng-review -- Save progress, checkpoint, resume → invoke checkpoint -- Code quality, health check → invoke health +- Product ideas, "is this worth building", brainstorming → invoke /office-hours +- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review +- Architecture, "does this design make sense" → invoke /plan-eng-review +- Design system, brand, "how should this look" → invoke /design-consultation +- Design review of a plan → invoke /plan-design-review +- Developer experience of a plan → invoke /plan-devex-review +- "Review everything", full review pipeline → invoke /autoplan +- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate +- Test 
the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only) +- Code review, check the diff, "look at my changes" → invoke /review +- Visual polish, design audit, "this looks off" → invoke /design-review +- Developer experience audit, try onboarding → invoke /devex-review +- Ship, deploy, create a PR, "send it" → invoke /ship +- Merge + deploy + verify → invoke /land-and-deploy +- Configure deployment → invoke /setup-deploy +- Post-deploy monitoring → invoke /canary +- Update docs after shipping → invoke /document-release +- Weekly retro, "how'd we do" → invoke /retro +- Second opinion, codex review → invoke /codex +- Safety mode, careful mode, lock it down → invoke /careful or /guard +- Restrict edits to a directory → invoke /freeze or /unfreeze +- Upgrade gstack → invoke /gstack-upgrade +- Save progress, "save my work" → invoke /context-save +- Resume, restore, "where was I" → invoke /context-restore +- Security audit, OWASP, "is this secure" → invoke /cso +- Make a PDF, document, publication → invoke /make-pdf +- Launch real browser for QA → invoke /open-gstack-browser +- Import cookies for authenticated testing → invoke /setup-browser-cookies +- Performance regression, page speed, benchmarks → invoke /benchmark +- Review what gstack has learned → invoke /learn +- Tune question sensitivity → invoke /plan-tune +- Code quality dashboard → invoke /health ``` Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` @@ -349,30 +370,6 @@ the user course-correct cheaply instead of mid-flight. **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer. -**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer -subagents than 4.6. 
When a task has independent sub-problems (investigating multiple -files, testing multiple endpoints, auditing multiple components), explicitly parallelize: -spawn subagents in the same turn, run independent checks concurrently, don't serialize -work that has no dependencies. If you catch yourself doing A then B then C where none -depend on each other, stop and do all three at once. - -**Effort-match the step.** Simple file reads, config checks, command lookups, and -mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve -extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs, -security implications, design decisions with competing constraints. Over-thinking -simple steps wastes tokens and time. - -**Batch your questions.** If you need to clarify multiple things before proceeding, -ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per -turn. Three questions in one message beats three back-and-forth exchanges. - -**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and -will not silently generalize. When the user says "fix the tests," fix ALL failing tests, -not just the first one. When the user says "update the docs," update every relevant doc, -not just the most obvious one. Read the full scope of what was asked and deliver the -full scope. If the request is ambiguous, ask once (batched with any other questions), -then execute completely. - ## Voice You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography. @@ -418,7 +415,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte - End with what to do. Give the action. **Example of the right voice:** -"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?" 
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?" Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..." **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work? diff --git a/make-pdf/SKILL.md b/make-pdf/SKILL.md index 581647bc..8414a346 100644 --- a/make-pdf/SKILL.md +++ b/make-pdf/SKILL.md @@ -264,23 +264,44 @@ If A: Append this section to the end of CLAUDE.md: ## Skill routing -When the user's request matches an available skill, ALWAYS invoke it using the Skill -tool as your FIRST action. Do NOT answer directly, do NOT use other tools first. -The skill has specialized workflows that produce better results than ad-hoc answers. +When the user's request matches an available skill, invoke it via the Skill tool. The +skill has multi-step workflows, checklists, and quality gates that produce better +results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is +cheaper than a false negative. 
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```

 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -347,30 +368,6 @@ the user course-correct cheaply instead of mid-flight.

 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.

-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice

 **Tone:** direct, concrete, sharp, never corporate, never academic. Sound like a builder, not a consultant. Name the file, the function, the command. No filler, no throat-clearing.
diff --git a/office-hours/SKILL.md b/office-hours/SKILL.md
index 821dedcd..7d3be39d 100644
--- a/office-hours/SKILL.md
+++ b/office-hours/SKILL.md
@@ -274,23 +274,44 @@ If A: Append this section to the end of CLAUDE.md:

 ## Skill routing

-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```

 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -357,30 +378,6 @@ the user course-correct cheaply instead of mid-flight.

 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.

-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice

 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -426,7 +423,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.

 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."

 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/open-gstack-browser/SKILL.md b/open-gstack-browser/SKILL.md
index 934def8d..9867e8a4 100644
--- a/open-gstack-browser/SKILL.md
+++ b/open-gstack-browser/SKILL.md
@@ -263,23 +263,44 @@ If A: Append this section to the end of CLAUDE.md:

 ## Skill routing

-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.

 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```

 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -346,30 +367,6 @@ the user course-correct cheaply instead of mid-flight.

 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.

-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice

 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -415,7 +412,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.

 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."

 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/pair-agent/SKILL.md b/pair-agent/SKILL.md
index 1ce0e741..e8e4e941 100644
--- a/pair-agent/SKILL.md
+++ b/pair-agent/SKILL.md
@@ -264,23 +264,44 @@ If A: Append this section to the end of CLAUDE.md:

 ## Skill routing

-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```

 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -347,30 +368,6 @@ the user course-correct cheaply instead of mid-flight.

 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.

-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice

 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -416,7 +413,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.

 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."

 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/plan-ceo-review/SKILL.md b/plan-ceo-review/SKILL.md
index 476f1e6e..f01e8404 100644
--- a/plan-ceo-review/SKILL.md
+++ b/plan-ceo-review/SKILL.md
@@ -270,23 +270,44 @@ If A: Append this section to the end of CLAUDE.md:

 ## Skill routing

-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.

 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```

 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -353,30 +374,6 @@ the user course-correct cheaply instead of mid-flight.

 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.

-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice

 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -422,7 +419,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.

 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."

 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/plan-design-review/SKILL.md b/plan-design-review/SKILL.md
index 1f9649b0..6a7303ff 100644
--- a/plan-design-review/SKILL.md
+++ b/plan-design-review/SKILL.md
@@ -267,23 +267,44 @@ If A: Append this section to the end of CLAUDE.md:

 ## Skill routing

-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```

 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -350,30 +371,6 @@ the user course-correct cheaply instead of mid-flight.

 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.

-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice

 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -419,7 +416,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.

 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."

 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/plan-devex-review/SKILL.md b/plan-devex-review/SKILL.md
index 751ef0a5..b66ed978 100644
--- a/plan-devex-review/SKILL.md
+++ b/plan-devex-review/SKILL.md
@@ -271,23 +271,44 @@ If A: Append this section to the end of CLAUDE.md:
 ## Skill routing
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -354,30 +375,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -423,7 +420,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/plan-eng-review/SKILL.md b/plan-eng-review/SKILL.md
index 9975e0dc..4fba0494 100644
--- a/plan-eng-review/SKILL.md
+++ b/plan-eng-review/SKILL.md
@@ -269,23 +269,44 @@ If A: Append this section to the end of CLAUDE.md:
 ## Skill routing
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -352,30 +373,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -421,7 +418,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/plan-tune/SKILL.md b/plan-tune/SKILL.md
index 33f35d75..9d69bf3b 100644
--- a/plan-tune/SKILL.md
+++ b/plan-tune/SKILL.md
@@ -277,23 +277,44 @@ If A: Append this section to the end of CLAUDE.md:
 ## Skill routing
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -360,30 +381,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -429,7 +426,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/qa-only/SKILL.md b/qa-only/SKILL.md
index a62ad63d..9f2f5e88 100644
--- a/qa-only/SKILL.md
+++ b/qa-only/SKILL.md
@@ -265,23 +265,44 @@ If A: Append this section to the end of CLAUDE.md:
 ## Skill routing
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -348,30 +369,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -417,7 +414,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/qa/SKILL.md b/qa/SKILL.md
index 6559cf03..a64c074c 100644
--- a/qa/SKILL.md
+++ b/qa/SKILL.md
@@ -271,23 +271,44 @@ If A: Append this section to the end of CLAUDE.md:
 ## Skill routing
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -354,30 +375,6 @@ the user course-correct cheaply instead of mid-flight.
 **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.
-**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer
-subagents than 4.6. When a task has independent sub-problems (investigating multiple
-files, testing multiple endpoints, auditing multiple components), explicitly parallelize:
-spawn subagents in the same turn, run independent checks concurrently, don't serialize
-work that has no dependencies. If you catch yourself doing A then B then C where none
-depend on each other, stop and do all three at once.
-
-**Effort-match the step.** Simple file reads, config checks, command lookups, and
-mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve
-extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs,
-security implications, design decisions with competing constraints. Over-thinking
-simple steps wastes tokens and time.
-
-**Batch your questions.** If you need to clarify multiple things before proceeding,
-ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per
-turn. Three questions in one message beats three back-and-forth exchanges.
-
-**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and
-will not silently generalize. When the user says "fix the tests," fix ALL failing tests,
-not just the first one. When the user says "update the docs," update every relevant doc,
-not just the most obvious one. Read the full scope of what was asked and deliver the
-full scope. If the request is ambiguous, ask once (batched with any other questions),
-then execute completely.
-
 ## Voice
 You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -423,7 +420,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - End with what to do. Give the action.
 **Example of the right voice:**
-"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?"
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
 Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
diff --git a/retro/SKILL.md b/retro/SKILL.md
index b86547f0..92f7a962 100644
--- a/retro/SKILL.md
+++ b/retro/SKILL.md
@@ -264,23 +264,44 @@ If A: Append this section to the end of CLAUDE.md:
 ## Skill routing
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
Key routing rules: -- Product ideas, "is this worth building", brainstorming → invoke office-hours -- Bugs, errors, "why is this broken", 500 errors → invoke investigate -- Ship, deploy, push, create PR → invoke ship -- QA, test the site, find bugs → invoke qa -- Code review, check my diff → invoke review -- Update docs after shipping → invoke document-release -- Weekly retro → invoke retro -- Design system, brand → invoke design-consultation -- Visual audit, design polish → invoke design-review -- Architecture review → invoke plan-eng-review -- Save progress, checkpoint, resume → invoke checkpoint -- Code quality, health check → invoke health +- Product ideas, "is this worth building", brainstorming → invoke /office-hours +- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review +- Architecture, "does this design make sense" → invoke /plan-eng-review +- Design system, brand, "how should this look" → invoke /design-consultation +- Design review of a plan → invoke /plan-design-review +- Developer experience of a plan → invoke /plan-devex-review +- "Review everything", full review pipeline → invoke /autoplan +- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate +- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only) +- Code review, check the diff, "look at my changes" → invoke /review +- Visual polish, design audit, "this looks off" → invoke /design-review +- Developer experience audit, try onboarding → invoke /devex-review +- Ship, deploy, create a PR, "send it" → invoke /ship +- Merge + deploy + verify → invoke /land-and-deploy +- Configure deployment → invoke /setup-deploy +- Post-deploy monitoring → invoke /canary +- Update docs after shipping → invoke /document-release +- Weekly retro, "how'd we do" → invoke /retro +- Second opinion, codex review → invoke /codex +- Safety mode, careful mode, lock it down → invoke /careful or /guard +- Restrict edits to a directory → 
invoke /freeze or /unfreeze +- Upgrade gstack → invoke /gstack-upgrade +- Save progress, "save my work" → invoke /context-save +- Resume, restore, "where was I" → invoke /context-restore +- Security audit, OWASP, "is this secure" → invoke /cso +- Make a PDF, document, publication → invoke /make-pdf +- Launch real browser for QA → invoke /open-gstack-browser +- Import cookies for authenticated testing → invoke /setup-browser-cookies +- Performance regression, page speed, benchmarks → invoke /benchmark +- Review what gstack has learned → invoke /learn +- Tune question sensitivity → invoke /plan-tune +- Code quality dashboard → invoke /health ``` Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` @@ -347,30 +368,6 @@ the user course-correct cheaply instead of mid-flight. **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer. -**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer -subagents than 4.6. When a task has independent sub-problems (investigating multiple -files, testing multiple endpoints, auditing multiple components), explicitly parallelize: -spawn subagents in the same turn, run independent checks concurrently, don't serialize -work that has no dependencies. If you catch yourself doing A then B then C where none -depend on each other, stop and do all three at once. - -**Effort-match the step.** Simple file reads, config checks, command lookups, and -mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve -extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs, -security implications, design decisions with competing constraints. Over-thinking -simple steps wastes tokens and time. 
- -**Batch your questions.** If you need to clarify multiple things before proceeding, -ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per -turn. Three questions in one message beats three back-and-forth exchanges. - -**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and -will not silently generalize. When the user says "fix the tests," fix ALL failing tests, -not just the first one. When the user says "update the docs," update every relevant doc, -not just the most obvious one. Read the full scope of what was asked and deliver the -full scope. If the request is ambiguous, ask once (batched with any other questions), -then execute completely. - ## Voice You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography. @@ -416,7 +413,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte - End with what to do. Give the action. **Example of the right voice:** -"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?" +"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?" Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..." **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work? 
diff --git a/review/SKILL.md b/review/SKILL.md index 999786f9..df1bcf70 100644 --- a/review/SKILL.md +++ b/review/SKILL.md @@ -268,23 +268,44 @@ If A: Append this section to the end of CLAUDE.md: ## Skill routing -When the user's request matches an available skill, ALWAYS invoke it using the Skill -tool as your FIRST action. Do NOT answer directly, do NOT use other tools first. -The skill has specialized workflows that produce better results than ad-hoc answers. +When the user's request matches an available skill, invoke it via the Skill tool. The +skill has multi-step workflows, checklists, and quality gates that produce better +results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is +cheaper than a false negative. Key routing rules: -- Product ideas, "is this worth building", brainstorming → invoke office-hours -- Bugs, errors, "why is this broken", 500 errors → invoke investigate -- Ship, deploy, push, create PR → invoke ship -- QA, test the site, find bugs → invoke qa -- Code review, check my diff → invoke review -- Update docs after shipping → invoke document-release -- Weekly retro → invoke retro -- Design system, brand → invoke design-consultation -- Visual audit, design polish → invoke design-review -- Architecture review → invoke plan-eng-review -- Save progress, checkpoint, resume → invoke checkpoint -- Code quality, health check → invoke health +- Product ideas, "is this worth building", brainstorming → invoke /office-hours +- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review +- Architecture, "does this design make sense" → invoke /plan-eng-review +- Design system, brand, "how should this look" → invoke /design-consultation +- Design review of a plan → invoke /plan-design-review +- Developer experience of a plan → invoke /plan-devex-review +- "Review everything", full review pipeline → invoke /autoplan +- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate +- 
Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only) +- Code review, check the diff, "look at my changes" → invoke /review +- Visual polish, design audit, "this looks off" → invoke /design-review +- Developer experience audit, try onboarding → invoke /devex-review +- Ship, deploy, create a PR, "send it" → invoke /ship +- Merge + deploy + verify → invoke /land-and-deploy +- Configure deployment → invoke /setup-deploy +- Post-deploy monitoring → invoke /canary +- Update docs after shipping → invoke /document-release +- Weekly retro, "how'd we do" → invoke /retro +- Second opinion, codex review → invoke /codex +- Safety mode, careful mode, lock it down → invoke /careful or /guard +- Restrict edits to a directory → invoke /freeze or /unfreeze +- Upgrade gstack → invoke /gstack-upgrade +- Save progress, "save my work" → invoke /context-save +- Resume, restore, "where was I" → invoke /context-restore +- Security audit, OWASP, "is this secure" → invoke /cso +- Make a PDF, document, publication → invoke /make-pdf +- Launch real browser for QA → invoke /open-gstack-browser +- Import cookies for authenticated testing → invoke /setup-browser-cookies +- Performance regression, page speed, benchmarks → invoke /benchmark +- Review what gstack has learned → invoke /learn +- Tune question sensitivity → invoke /plan-tune +- Code quality dashboard → invoke /health ``` Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` @@ -351,30 +372,6 @@ the user course-correct cheaply instead of mid-flight. **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer. -**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer -subagents than 4.6. 
When a task has independent sub-problems (investigating multiple -files, testing multiple endpoints, auditing multiple components), explicitly parallelize: -spawn subagents in the same turn, run independent checks concurrently, don't serialize -work that has no dependencies. If you catch yourself doing A then B then C where none -depend on each other, stop and do all three at once. - -**Effort-match the step.** Simple file reads, config checks, command lookups, and -mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve -extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs, -security implications, design decisions with competing constraints. Over-thinking -simple steps wastes tokens and time. - -**Batch your questions.** If you need to clarify multiple things before proceeding, -ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per -turn. Three questions in one message beats three back-and-forth exchanges. - -**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and -will not silently generalize. When the user says "fix the tests," fix ALL failing tests, -not just the first one. When the user says "update the docs," update every relevant doc, -not just the most obvious one. Read the full scope of what was asked and deliver the -full scope. If the request is ambiguous, ask once (batched with any other questions), -then execute completely. - ## Voice You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography. @@ -420,7 +417,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte - End with what to do. Give the action. **Example of the right voice:** -"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?" 
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?" Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..." **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work? diff --git a/setup-browser-cookies/SKILL.md b/setup-browser-cookies/SKILL.md index a82b1e1d..3b0160e0 100644 --- a/setup-browser-cookies/SKILL.md +++ b/setup-browser-cookies/SKILL.md @@ -261,23 +261,44 @@ If A: Append this section to the end of CLAUDE.md: ## Skill routing -When the user's request matches an available skill, ALWAYS invoke it using the Skill -tool as your FIRST action. Do NOT answer directly, do NOT use other tools first. -The skill has specialized workflows that produce better results than ad-hoc answers. +When the user's request matches an available skill, invoke it via the Skill tool. The +skill has multi-step workflows, checklists, and quality gates that produce better +results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is +cheaper than a false negative. 
Key routing rules: -- Product ideas, "is this worth building", brainstorming → invoke office-hours -- Bugs, errors, "why is this broken", 500 errors → invoke investigate -- Ship, deploy, push, create PR → invoke ship -- QA, test the site, find bugs → invoke qa -- Code review, check my diff → invoke review -- Update docs after shipping → invoke document-release -- Weekly retro → invoke retro -- Design system, brand → invoke design-consultation -- Visual audit, design polish → invoke design-review -- Architecture review → invoke plan-eng-review -- Save progress, checkpoint, resume → invoke checkpoint -- Code quality, health check → invoke health +- Product ideas, "is this worth building", brainstorming → invoke /office-hours +- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review +- Architecture, "does this design make sense" → invoke /plan-eng-review +- Design system, brand, "how should this look" → invoke /design-consultation +- Design review of a plan → invoke /plan-design-review +- Developer experience of a plan → invoke /plan-devex-review +- "Review everything", full review pipeline → invoke /autoplan +- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate +- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only) +- Code review, check the diff, "look at my changes" → invoke /review +- Visual polish, design audit, "this looks off" → invoke /design-review +- Developer experience audit, try onboarding → invoke /devex-review +- Ship, deploy, create a PR, "send it" → invoke /ship +- Merge + deploy + verify → invoke /land-and-deploy +- Configure deployment → invoke /setup-deploy +- Post-deploy monitoring → invoke /canary +- Update docs after shipping → invoke /document-release +- Weekly retro, "how'd we do" → invoke /retro +- Second opinion, codex review → invoke /codex +- Safety mode, careful mode, lock it down → invoke /careful or /guard +- Restrict edits to a directory → 
invoke /freeze or /unfreeze +- Upgrade gstack → invoke /gstack-upgrade +- Save progress, "save my work" → invoke /context-save +- Resume, restore, "where was I" → invoke /context-restore +- Security audit, OWASP, "is this secure" → invoke /cso +- Make a PDF, document, publication → invoke /make-pdf +- Launch real browser for QA → invoke /open-gstack-browser +- Import cookies for authenticated testing → invoke /setup-browser-cookies +- Performance regression, page speed, benchmarks → invoke /benchmark +- Review what gstack has learned → invoke /learn +- Tune question sensitivity → invoke /plan-tune +- Code quality dashboard → invoke /health ``` Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` @@ -344,30 +365,6 @@ the user course-correct cheaply instead of mid-flight. **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer. -**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer -subagents than 4.6. When a task has independent sub-problems (investigating multiple -files, testing multiple endpoints, auditing multiple components), explicitly parallelize: -spawn subagents in the same turn, run independent checks concurrently, don't serialize -work that has no dependencies. If you catch yourself doing A then B then C where none -depend on each other, stop and do all three at once. - -**Effort-match the step.** Simple file reads, config checks, command lookups, and -mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve -extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs, -security implications, design decisions with competing constraints. Over-thinking -simple steps wastes tokens and time. 
- -**Batch your questions.** If you need to clarify multiple things before proceeding, -ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per -turn. Three questions in one message beats three back-and-forth exchanges. - -**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and -will not silently generalize. When the user says "fix the tests," fix ALL failing tests, -not just the first one. When the user says "update the docs," update every relevant doc, -not just the most obvious one. Read the full scope of what was asked and deliver the -full scope. If the request is ambiguous, ask once (batched with any other questions), -then execute completely. - ## Voice **Tone:** direct, concrete, sharp, never corporate, never academic. Sound like a builder, not a consultant. Name the file, the function, the command. No filler, no throat-clearing. diff --git a/setup-deploy/SKILL.md b/setup-deploy/SKILL.md index d897f427..6411bde9 100644 --- a/setup-deploy/SKILL.md +++ b/setup-deploy/SKILL.md @@ -267,23 +267,44 @@ If A: Append this section to the end of CLAUDE.md: ## Skill routing -When the user's request matches an available skill, ALWAYS invoke it using the Skill -tool as your FIRST action. Do NOT answer directly, do NOT use other tools first. -The skill has specialized workflows that produce better results than ad-hoc answers. +When the user's request matches an available skill, invoke it via the Skill tool. The +skill has multi-step workflows, checklists, and quality gates that produce better +results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is +cheaper than a false negative. 
Key routing rules: -- Product ideas, "is this worth building", brainstorming → invoke office-hours -- Bugs, errors, "why is this broken", 500 errors → invoke investigate -- Ship, deploy, push, create PR → invoke ship -- QA, test the site, find bugs → invoke qa -- Code review, check my diff → invoke review -- Update docs after shipping → invoke document-release -- Weekly retro → invoke retro -- Design system, brand → invoke design-consultation -- Visual audit, design polish → invoke design-review -- Architecture review → invoke plan-eng-review -- Save progress, checkpoint, resume → invoke checkpoint -- Code quality, health check → invoke health +- Product ideas, "is this worth building", brainstorming → invoke /office-hours +- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review +- Architecture, "does this design make sense" → invoke /plan-eng-review +- Design system, brand, "how should this look" → invoke /design-consultation +- Design review of a plan → invoke /plan-design-review +- Developer experience of a plan → invoke /plan-devex-review +- "Review everything", full review pipeline → invoke /autoplan +- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate +- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only) +- Code review, check the diff, "look at my changes" → invoke /review +- Visual polish, design audit, "this looks off" → invoke /design-review +- Developer experience audit, try onboarding → invoke /devex-review +- Ship, deploy, create a PR, "send it" → invoke /ship +- Merge + deploy + verify → invoke /land-and-deploy +- Configure deployment → invoke /setup-deploy +- Post-deploy monitoring → invoke /canary +- Update docs after shipping → invoke /document-release +- Weekly retro, "how'd we do" → invoke /retro +- Second opinion, codex review → invoke /codex +- Safety mode, careful mode, lock it down → invoke /careful or /guard +- Restrict edits to a directory → 
invoke /freeze or /unfreeze +- Upgrade gstack → invoke /gstack-upgrade +- Save progress, "save my work" → invoke /context-save +- Resume, restore, "where was I" → invoke /context-restore +- Security audit, OWASP, "is this secure" → invoke /cso +- Make a PDF, document, publication → invoke /make-pdf +- Launch real browser for QA → invoke /open-gstack-browser +- Import cookies for authenticated testing → invoke /setup-browser-cookies +- Performance regression, page speed, benchmarks → invoke /benchmark +- Review what gstack has learned → invoke /learn +- Tune question sensitivity → invoke /plan-tune +- Code quality dashboard → invoke /health ``` Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` @@ -350,30 +371,6 @@ the user course-correct cheaply instead of mid-flight. **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer. -**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer -subagents than 4.6. When a task has independent sub-problems (investigating multiple -files, testing multiple endpoints, auditing multiple components), explicitly parallelize: -spawn subagents in the same turn, run independent checks concurrently, don't serialize -work that has no dependencies. If you catch yourself doing A then B then C where none -depend on each other, stop and do all three at once. - -**Effort-match the step.** Simple file reads, config checks, command lookups, and -mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve -extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs, -security implications, design decisions with competing constraints. Over-thinking -simple steps wastes tokens and time. 
- -**Batch your questions.** If you need to clarify multiple things before proceeding, -ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per -turn. Three questions in one message beats three back-and-forth exchanges. - -**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and -will not silently generalize. When the user says "fix the tests," fix ALL failing tests, -not just the first one. When the user says "update the docs," update every relevant doc, -not just the most obvious one. Read the full scope of what was asked and deliver the -full scope. If the request is ambiguous, ask once (batched with any other questions), -then execute completely. - ## Voice You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography. @@ -419,7 +416,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte - End with what to do. Give the action. **Example of the right voice:** -"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?" +"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?" Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..." **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work? 
diff --git a/ship/SKILL.md b/ship/SKILL.md index ba1b3b54..540a62a1 100644 --- a/ship/SKILL.md +++ b/ship/SKILL.md @@ -269,23 +269,44 @@ If A: Append this section to the end of CLAUDE.md: ## Skill routing -When the user's request matches an available skill, ALWAYS invoke it using the Skill -tool as your FIRST action. Do NOT answer directly, do NOT use other tools first. -The skill has specialized workflows that produce better results than ad-hoc answers. +When the user's request matches an available skill, invoke it via the Skill tool. The +skill has multi-step workflows, checklists, and quality gates that produce better +results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is +cheaper than a false negative. Key routing rules: -- Product ideas, "is this worth building", brainstorming → invoke office-hours -- Bugs, errors, "why is this broken", 500 errors → invoke investigate -- Ship, deploy, push, create PR → invoke ship -- QA, test the site, find bugs → invoke qa -- Code review, check my diff → invoke review -- Update docs after shipping → invoke document-release -- Weekly retro → invoke retro -- Design system, brand → invoke design-consultation -- Visual audit, design polish → invoke design-review -- Architecture review → invoke plan-eng-review -- Save progress, checkpoint, resume → invoke checkpoint -- Code quality, health check → invoke health +- Product ideas, "is this worth building", brainstorming → invoke /office-hours +- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review +- Architecture, "does this design make sense" → invoke /plan-eng-review +- Design system, brand, "how should this look" → invoke /design-consultation +- Design review of a plan → invoke /plan-design-review +- Developer experience of a plan → invoke /plan-devex-review +- "Review everything", full review pipeline → invoke /autoplan +- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate +- Test the 
site, find bugs, "does this work" → invoke /qa (or /qa-only for report only) +- Code review, check the diff, "look at my changes" → invoke /review +- Visual polish, design audit, "this looks off" → invoke /design-review +- Developer experience audit, try onboarding → invoke /devex-review +- Ship, deploy, create a PR, "send it" → invoke /ship +- Merge + deploy + verify → invoke /land-and-deploy +- Configure deployment → invoke /setup-deploy +- Post-deploy monitoring → invoke /canary +- Update docs after shipping → invoke /document-release +- Weekly retro, "how'd we do" → invoke /retro +- Second opinion, codex review → invoke /codex +- Safety mode, careful mode, lock it down → invoke /careful or /guard +- Restrict edits to a directory → invoke /freeze or /unfreeze +- Upgrade gstack → invoke /gstack-upgrade +- Save progress, "save my work" → invoke /context-save +- Resume, restore, "where was I" → invoke /context-restore +- Security audit, OWASP, "is this secure" → invoke /cso +- Make a PDF, document, publication → invoke /make-pdf +- Launch real browser for QA → invoke /open-gstack-browser +- Import cookies for authenticated testing → invoke /setup-browser-cookies +- Performance regression, page speed, benchmarks → invoke /benchmark +- Review what gstack has learned → invoke /learn +- Tune question sensitivity → invoke /plan-tune +- Code quality dashboard → invoke /health ``` Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` @@ -352,30 +373,6 @@ the user course-correct cheaply instead of mid-flight. **Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer. -**Fan out explicitly.** Opus 4.7 defaults to sequential work and spawns fewer -subagents than 4.6. 
When a task has independent sub-problems (investigating multiple -files, testing multiple endpoints, auditing multiple components), explicitly parallelize: -spawn subagents in the same turn, run independent checks concurrently, don't serialize -work that has no dependencies. If you catch yourself doing A then B then C where none -depend on each other, stop and do all three at once. - -**Effort-match the step.** Simple file reads, config checks, command lookups, and -mechanical edits don't need deep reasoning. Complete them quickly and move on. Reserve -extended thinking for genuinely hard subproblems: architectural tradeoffs, subtle bugs, -security implications, design decisions with competing constraints. Over-thinking -simple steps wastes tokens and time. - -**Batch your questions.** If you need to clarify multiple things before proceeding, -ask all of them in a single AskUserQuestion turn. Do not drip-feed one question per -turn. Three questions in one message beats three back-and-forth exchanges. - -**Literal interpretation awareness.** Opus 4.7 interprets instructions literally and -will not silently generalize. When the user says "fix the tests," fix ALL failing tests, -not just the first one. When the user says "update the docs," update every relevant doc, -not just the most obvious one. Read the full scope of what was asked and deliver the -full scope. If the request is ambiguous, ask once (batched with any other questions), -then execute completely. - ## Voice You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography. @@ -421,7 +418,7 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte - End with what to do. Give the action. **Example of the right voice:** -"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to ship it?" 
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?" Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..." **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work? diff --git a/test/fixtures/golden/claude-ship-SKILL.md b/test/fixtures/golden/claude-ship-SKILL.md index 8e2fa0c0..540a62a1 100644 --- a/test/fixtures/golden/claude-ship-SKILL.md +++ b/test/fixtures/golden/claude-ship-SKILL.md @@ -269,23 +269,44 @@ If A: Append this section to the end of CLAUDE.md: ## Skill routing -When the user's request matches an available skill, ALWAYS invoke it using the Skill -tool as your FIRST action. Do NOT answer directly, do NOT use other tools first. -The skill has specialized workflows that produce better results than ad-hoc answers. +When the user's request matches an available skill, invoke it via the Skill tool. The +skill has multi-step workflows, checklists, and quality gates that produce better +results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is +cheaper than a false negative. 
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -396,6 +417,10 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - Stay curious, not lecturing. "What's interesting here is..." beats "It is important to understand..."
 - End with what to do. Give the action.
+**Example of the right voice:**
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
+Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
+
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
 ## Context Recovery
@@ -2760,7 +2785,7 @@ user via AskUserQuestion rather than destroying non-WIP commits.
 git commit -m "$(cat <<'EOF'
 chore: bump version and changelog (vX.Y.Z.W)
-Co-Authored-By: Claude Opus 4.6
+Co-Authored-By: Claude Opus 4.7
 EOF
 )"
 ```
diff --git a/test/fixtures/golden/codex-ship-SKILL.md b/test/fixtures/golden/codex-ship-SKILL.md
index cd5c7c0e..2200b4f4 100644
--- a/test/fixtures/golden/codex-ship-SKILL.md
+++ b/test/fixtures/golden/codex-ship-SKILL.md
@@ -258,23 +258,44 @@ If A: Append this section to the end of CLAUDE.md:
 ## Skill routing
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -385,6 +406,10 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - Stay curious, not lecturing. "What's interesting here is..." beats "It is important to understand..."
 - End with what to do. Give the action.
+**Example of the right voice:**
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
+Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
+
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
 ## Context Recovery
diff --git a/test/fixtures/golden/factory-ship-SKILL.md b/test/fixtures/golden/factory-ship-SKILL.md
index 5c38f080..3427afb3 100644
--- a/test/fixtures/golden/factory-ship-SKILL.md
+++ b/test/fixtures/golden/factory-ship-SKILL.md
@@ -260,23 +260,44 @@ If A: Append this section to the end of CLAUDE.md:
 ## Skill routing
-When the user's request matches an available skill, ALWAYS invoke it using the Skill
-tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
-The skill has specialized workflows that produce better results than ad-hoc answers.
+When the user's request matches an available skill, invoke it via the Skill tool. The
+skill has multi-step workflows, checklists, and quality gates that produce better
+results than an ad-hoc answer. When in doubt, invoke the skill. A false positive is
+cheaper than a false negative.
 Key routing rules:
-- Product ideas, "is this worth building", brainstorming → invoke office-hours
-- Bugs, errors, "why is this broken", 500 errors → invoke investigate
-- Ship, deploy, push, create PR → invoke ship
-- QA, test the site, find bugs → invoke qa
-- Code review, check my diff → invoke review
-- Update docs after shipping → invoke document-release
-- Weekly retro → invoke retro
-- Design system, brand → invoke design-consultation
-- Visual audit, design polish → invoke design-review
-- Architecture review → invoke plan-eng-review
-- Save progress, checkpoint, resume → invoke checkpoint
-- Code quality, health check → invoke health
+- Product ideas, "is this worth building", brainstorming → invoke /office-hours
+- Strategy, scope, "think bigger", "what should we build" → invoke /plan-ceo-review
+- Architecture, "does this design make sense" → invoke /plan-eng-review
+- Design system, brand, "how should this look" → invoke /design-consultation
+- Design review of a plan → invoke /plan-design-review
+- Developer experience of a plan → invoke /plan-devex-review
+- "Review everything", full review pipeline → invoke /autoplan
+- Bugs, errors, "why is this broken", "wtf", "this doesn't work" → invoke /investigate
+- Test the site, find bugs, "does this work" → invoke /qa (or /qa-only for report only)
+- Code review, check the diff, "look at my changes" → invoke /review
+- Visual polish, design audit, "this looks off" → invoke /design-review
+- Developer experience audit, try onboarding → invoke /devex-review
+- Ship, deploy, create a PR, "send it" → invoke /ship
+- Merge + deploy + verify → invoke /land-and-deploy
+- Configure deployment → invoke /setup-deploy
+- Post-deploy monitoring → invoke /canary
+- Update docs after shipping → invoke /document-release
+- Weekly retro, "how'd we do" → invoke /retro
+- Second opinion, codex review → invoke /codex
+- Safety mode, careful mode, lock it down → invoke /careful or /guard
+- Restrict edits to a directory → invoke /freeze or /unfreeze
+- Upgrade gstack → invoke /gstack-upgrade
+- Save progress, "save my work" → invoke /context-save
+- Resume, restore, "where was I" → invoke /context-restore
+- Security audit, OWASP, "is this secure" → invoke /cso
+- Make a PDF, document, publication → invoke /make-pdf
+- Launch real browser for QA → invoke /open-gstack-browser
+- Import cookies for authenticated testing → invoke /setup-browser-cookies
+- Performance regression, page speed, benchmarks → invoke /benchmark
+- Review what gstack has learned → invoke /learn
+- Tune question sensitivity → invoke /plan-tune
+- Code quality dashboard → invoke /health
 ```
 Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
@@ -387,6 +408,10 @@ Avoid filler, throat-clearing, generic optimism, founder cosplay, and unsupporte
 - Stay curious, not lecturing. "What's interesting here is..." beats "It is important to understand..."
 - End with what to do. Give the action.
+**Example of the right voice:**
+"auth.ts:47 returns undefined when the session cookie expires. Your users hit a white screen. Fix: add a null check and redirect to /login. Two lines. Want me to fix it?"
+Not: "I've identified a potential issue in the authentication flow that may cause problems for some users under certain conditions. Let me explain the approach I'd recommend..."
+
 **Final test:** does this sound like a real cross-functional builder who wants to help someone make something people want, ship it, and make it actually work?
 ## Context Recovery