Merge branch 'main' into garrytan/enable-plan-tune (catch up to v1.48.0.0)

Brings in v1.48.0.0 (AskUserQuestion 5+-option split rule + runtime
AUTO_DECIDE carve-out).

Conflict resolutions:
- VERSION + package.json: keep 1.49.0.0 (this branch stays queued ahead of main's 1.48.0.0)
- CHANGELOG.md: insert main's 1.48.0.0 entry below ours, keep reverse-chronological order
- Regenerated all SKILL.md files via gen:skill-docs to inherit the new
  'Handling 5+ options — split, never drop' preamble subsection
- Refreshed claude/codex/factory ship golden fixtures

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Garry Tan
2026-05-26 23:44:08 -07:00
56 changed files with 2693 additions and 58 deletions
+56
@@ -120,3 +120,59 @@ export const FORCING_BATCHING_ENG = [
'iterate the payload to recompute the dependency graph. Could cache the',
'graph on the first attempt; not planned.',
].join('\n');
/**
* Split-overflow regression seed (periodic tier).
*
* Catches the original failure mode the user complained about: when the
* agent has 5+ options for ONE conceptual decision, it must split into N
* sequential AskUserQuestion calls (or batch into compatible ≤4-groups),
* NOT drop an option arbitrarily to fit Conductor's 4-option cap.
*
* Fixture shape: 5 independent platform-integration candidates for ONE
* scope decision. Each is independent (no dependencies between them) so
* the natural compliant shape is a per-option split chain at parent D<N>.
*
* Used by test/skill-e2e-plan-ceo-split-overflow.test.ts to assert the
* agent fires >= 4 review-phase AUQs (floor uses the standard [N-1]
* tolerance band, accounting for one expected scope-reduction-or-merge
* call before the per-option chain begins).
*
* Pre-fix behavior: agent fires 1 AUQ with 4 options, "trims" the 5th
* via prose ("E5 is the largest lift and a natural follow-up; moving to
* TODOs without asking"). That's the bug. Floor of 4 detects it.
*/
export const FORCING_SPLIT_OVERFLOW_CEO = [
'Please review this plan and help me decide scope. Write your plan-mode plan to /tmp/gstack-test-plan-ceo-split-overflow.md (use Edit/Write to that exact path).',
'',
'# Plan: Pick which chat-platform integrations to ship this quarter',
'',
'We have engineering bandwidth for at most 2-3 integrations this quarter.',
'I need your help deciding which to prioritize. Below are 5 candidates,',
'each fully independent of the others (no shared infrastructure, no',
'dependencies between them). For each, the user can independently decide:',
'include in this scope, defer to next quarter, or cut entirely.',
'',
'## E1) Slack — DM bot for incident alerts',
'Build cost: ~2 weeks. Existing Slack auth flow we can reuse. High user',
'demand (top customer request in Q2 survey, ~40% of asks).',
'',
'## E2) Discord — guild bot for community channels',
'Build cost: ~3 weeks. Greenfield integration, no existing auth. Medium',
'demand (~15% of asks, but loud community).',
'',
'## E3) Microsoft Teams — webhook + bot framework',
'Build cost: ~4 weeks. Enterprise customers specifically asked for this.',
'Highest revenue impact per user but smallest user count (~5% of asks).',
'',
'## E4) Telegram — bot API integration',
'Build cost: ~1 week. Simplest API surface. Low strategic value but',
'cheap win (~8% of asks, mostly from international users).',
'',
'## E5) Mattermost — REST plugin',
'Build cost: ~2 weeks. Self-hosted enterprise users. Niche but locked-in',
'segment (~3% of asks but all from high-ARR accounts).',
'',
'Please walk me through each candidate and help me decide include/defer/cut',
'per option. I want individual decisions per candidate, not a bundled pick.',
].join('\n');
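The doc comment above describes a floor of N-1 review-phase AskUserQuestion calls for N independent options. A minimal sketch of that check follows, assuming the test already has a count of observed calls. `splitOverflowFloor` and `meetsSplitFloor` are hypothetical names for illustration, not helpers taken from test/skill-e2e-plan-ceo-split-overflow.test.ts.

```typescript
// Hypothetical helper (not from the repo): the minimum number of
// review-phase AskUserQuestion calls expected for a fixture with
// `optionCount` independent options, using the N-1 tolerance band
// described in the fixture's doc comment.
export function splitOverflowFloor(optionCount: number): number {
  if (!Number.isInteger(optionCount) || optionCount < 1) {
    throw new RangeError('optionCount must be a positive integer');
  }
  return Math.max(1, optionCount - 1);
}

// True when the observed call count clears the floor. The pre-fix
// behavior (one call with 4 options, the 5th trimmed in prose)
// yields a count of 1 and fails for a 5-option fixture.
export function meetsSplitFloor(
  observedAuqCount: number,
  optionCount: number,
): boolean {
  return observedAuqCount >= splitOverflowFloor(optionCount);
}
```

With the five candidates E1..E5 in this fixture, the floor is 4, so a single bundled call is flagged while a full per-option chain passes.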
+33 -1
@@ -338,7 +338,36 @@ Effort both-scales: when an option involves effort, label both human-team and CC
Net line closes the tradeoff. Per-skill instructions may add stricter rules.
12. **Non-ASCII characters — write directly, never \u-escape.** When any
### Handling 5+ options — split, never drop
AskUserQuestion caps every call at **4 options**. With 5+ real options, NEVER
drop, merge, or silently defer one to fit. Pick a compliant shape:
- **Batch into ≤4-groups** — for coherent alternatives (e.g. version bumps,
layout variants). One call, 5th surfaced only if first 4 don't fit.
- **Split per-option** — for independent scope items (e.g. "ship E1..E6?").
Fire N sequential calls, one per option. Default to this when unsure.
Per-option call shape: `D<N>.k` header (e.g. D3.1..D3.5), ELI10 per option,
Recommendation, kind-note (no completeness score — Include/Defer/Cut/Hold are
decision actions), and 4 buckets:
**A) Include**, **B) Defer**, **C) Cut**, **D) Hold** (stop chain, discuss).
After the chain, fire `D<N>.final` to validate the assembled set (reprompt
dependency conflicts) and confirm shipping it. Use `D<N>.revise-<k>` to
revise one option without re-running the chain.
For N>6, fire a `D<N>.0` meta-AskUserQuestion first (proceed / narrow / batch).
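The header sequence described above can be sketched as a small planner. This is an illustration under assumptions: `planSplitChain` is a hypothetical name, and the sketch only lists the headers that would be fired if no option is put on Hold (a Hold stops the chain early, which this function does not model).

```typescript
// Hypothetical planner (not from the repo): lists the AskUserQuestion
// headers for a per-option split chain at parent decision D<decision>.
export function planSplitChain(decision: number, optionCount: number): string[] {
  if (!Number.isInteger(decision) || decision < 1) {
    throw new RangeError('decision must be a positive integer');
  }
  if (!Number.isInteger(optionCount) || optionCount < 1) {
    throw new RangeError('optionCount must be a positive integer');
  }
  const parent = `D${decision}`;
  const headers: string[] = [];
  // For N > 6 the rule asks for a meta question first (proceed / narrow / batch).
  if (optionCount > 6) {
    headers.push(`${parent}.0`);
  }
  // One call per option: D<N>.1 .. D<N>.k
  for (let k = 1; k <= optionCount; k++) {
    headers.push(`${parent}.${k}`);
  }
  // Closing call that validates the assembled set.
  headers.push(`${parent}.final`);
  return headers;
}
```

For example, `planSplitChain(3, 5)` lists D3.1 through D3.5 followed by D3.final, and `planSplitChain(3, 7)` starts with D3.0.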
question_ids for split chains: `<skill>-split-<option-slug>` (kebab-case ASCII,
≤64 chars, `-2`/`-3` suffix on collision). The runtime checker
(`bin/gstack-question-preference`) refuses `never-ask` on any `*-split-*` id,
so split chains are never AUTO_DECIDE-eligible — the user's option set is sacred.
**Full rule + worked examples + Hold/dependency semantics:** see
`docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4.
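The question_id convention above (kebab-case ASCII, at most 64 characters, `-2`/`-3` suffix on collision) can be sketched as follows. This is an assumption-laden illustration, not the code in `bin/gstack-question-preference`, which this diff does not show; `slugify`, `splitQuestionId`, and `isSplitId` are hypothetical names, and the `option` fallback for empty slugs is an invented detail.

```typescript
const MAX_ID_LENGTH = 64;

// Hypothetical slug helper: lowercase kebab-case, ASCII only.
export function slugify(label: string): string {
  const slug = label
    .normalize('NFKD')
    .replace(/[^\x00-\x7F]/g, '') // drop non-ASCII, including combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
  return slug.length > 0 ? slug : 'option'; // assumed fallback, not from the source
}

// Builds `<skill>-split-<option-slug>`, truncates to 64 characters, and
// appends -2, -3, ... when the id is already present in `taken`.
export function splitQuestionId(
  skill: string,
  optionLabel: string,
  taken: Set<string>,
): string {
  const base = `${skill}-split-${slugify(optionLabel)}`
    .slice(0, MAX_ID_LENGTH)
    .replace(/-+$/, '');
  let id = base;
  for (let n = 2; taken.has(id); n++) {
    const suffix = `-${n}`;
    id = base.slice(0, MAX_ID_LENGTH - suffix.length).replace(/-+$/, '') + suffix;
  }
  taken.add(id);
  return id;
}

// Mirrors the `*-split-*` pattern: any id containing "-split-" is treated
// as part of a split chain, so a `never-ask` preference would be refused.
export function isSplitId(id: string): boolean {
  return id.includes('-split-');
}
```

Tracking ids in a caller-owned `Set` keeps the collision rule explicit: two options that slugify to the same text receive distinct ids within one chain.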
**Non-ASCII characters — write directly, never \u-escape.** When any
string field (question, option label, option description) contains
Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit
the literal UTF-8 characters in the JSON string. **Never escape them
@@ -371,6 +400,9 @@ Before calling AskUserQuestion, verify:
- [ ] Net line closes the decision
- [ ] You are calling the tool, not writing prose
- [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped
- [ ] If you had 5+ options, you split (or batched into ≤4-groups) — did NOT drop any
- [ ] If you split, you checked dependencies between options before firing the chain
- [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue)
## Artifacts Sync (skill start)
+633
@@ -0,0 +1,633 @@
{
"tag": "v1.47.0.0",
"capturedAt": "2026-05-27T05:50:57.656Z",
"capturedFromCommit": "e08e5fa8",
"capturedFromBranch": "garrytan/askuserquestion-split-on-overflow",
"totalSkills": 52,
"totalCorpusBytes": 3090887,
"estTotalCatalogTokens": 4116,
"topHeaviest": [
{
"skill": "ship",
"skillMdBytes": 166782,
"skillMdLines": 3099,
"estTokens": 41696,
"tmplBytes": 50495,
"descriptionLen": 291,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "plan-ceo-review",
"skillMdBytes": 132488,
"skillMdLines": 2197,
"estTokens": 33122,
"tmplBytes": 63393,
"descriptionLen": 794,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "office-hours",
"skillMdBytes": 112842,
"skillMdLines": 2066,
"estTokens": 28211,
"tmplBytes": 55466,
"descriptionLen": 860,
"hasGateEval": true,
"hasPeriodicEval": false
},
{
"skill": "plan-design-review",
"skillMdBytes": 107855,
"skillMdLines": 1928,
"estTokens": 26964,
"tmplBytes": 28624,
"descriptionLen": 218,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "plan-devex-review",
"skillMdBytes": 106167,
"skillMdLines": 2119,
"estTokens": 26542,
"tmplBytes": 35680,
"descriptionLen": 250,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "plan-eng-review",
"skillMdBytes": 103009,
"skillMdLines": 1762,
"estTokens": 25752,
"tmplBytes": 26234,
"descriptionLen": 231,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "spec",
"skillMdBytes": 102629,
"skillMdLines": 2141,
"estTokens": 25657,
"tmplBytes": 28429,
"descriptionLen": 282,
"hasGateEval": true,
"hasPeriodicEval": false
},
{
"skill": "design-review",
"skillMdBytes": 95654,
"skillMdLines": 1932,
"estTokens": 23914,
"tmplBytes": 11674,
"descriptionLen": 304,
"hasGateEval": true,
"hasPeriodicEval": false
},
{
"skill": "review",
"skillMdBytes": 94048,
"skillMdLines": 1762,
"estTokens": 23512,
"tmplBytes": 14099,
"descriptionLen": 205,
"hasGateEval": true,
"hasPeriodicEval": false
},
{
"skill": "land-and-deploy",
"skillMdBytes": 91886,
"skillMdLines": 1856,
"estTokens": 22972,
"tmplBytes": 48624,
"descriptionLen": 160,
"hasGateEval": true,
"hasPeriodicEval": false
}
],
"skills": {
"autoplan": {
"skill": "autoplan",
"skillMdBytes": 90870,
"skillMdLines": 1784,
"estTokens": 22718,
"tmplBytes": 45271,
"descriptionLen": 366,
"hasGateEval": true,
"hasPeriodicEval": true
},
"benchmark": {
"skill": "benchmark",
"skillMdBytes": 33266,
"skillMdLines": 747,
"estTokens": 8317,
"tmplBytes": 9378,
"descriptionLen": 213,
"hasGateEval": true,
"hasPeriodicEval": false
},
"benchmark-models": {
"skill": "benchmark-models",
"skillMdBytes": 29333,
"skillMdLines": 622,
"estTokens": 7333,
"tmplBytes": 6631,
"descriptionLen": 217,
"hasGateEval": false,
"hasPeriodicEval": false
},
"browse": {
"skill": "browse",
"skillMdBytes": 48018,
"skillMdLines": 929,
"estTokens": 12005,
"tmplBytes": 10805,
"descriptionLen": 181,
"hasGateEval": true,
"hasPeriodicEval": false
},
"canary": {
"skill": "canary",
"skillMdBytes": 47105,
"skillMdLines": 990,
"estTokens": 11776,
"tmplBytes": 8033,
"descriptionLen": 180,
"hasGateEval": true,
"hasPeriodicEval": false
},
"careful": {
"skill": "careful",
"skillMdBytes": 2551,
"skillMdLines": 68,
"estTokens": 638,
"tmplBytes": 2435,
"descriptionLen": 315,
"hasGateEval": false,
"hasPeriodicEval": false
},
"codex": {
"skill": "codex",
"skillMdBytes": 79620,
"skillMdLines": 1519,
"estTokens": 19905,
"tmplBytes": 34143,
"descriptionLen": 187,
"hasGateEval": true,
"hasPeriodicEval": false
},
"context-restore": {
"skill": "context-restore",
"skillMdBytes": 41493,
"skillMdLines": 848,
"estTokens": 10373,
"tmplBytes": 5255,
"descriptionLen": 238,
"hasGateEval": true,
"hasPeriodicEval": false
},
"context-save": {
"skill": "context-save",
"skillMdBytes": 45690,
"skillMdLines": 966,
"estTokens": 11423,
"tmplBytes": 9293,
"descriptionLen": 168,
"hasGateEval": true,
"hasPeriodicEval": false
},
"cso": {
"skill": "cso",
"skillMdBytes": 77397,
"skillMdLines": 1451,
"estTokens": 19349,
"tmplBytes": 35158,
"descriptionLen": 196,
"hasGateEval": true,
"hasPeriodicEval": false
},
"design-consultation": {
"skill": "design-consultation",
"skillMdBytes": 79222,
"skillMdLines": 1561,
"estTokens": 19806,
"tmplBytes": 25899,
"descriptionLen": 888,
"hasGateEval": true,
"hasPeriodicEval": false
},
"design-html": {
"skill": "design-html",
"skillMdBytes": 66547,
"skillMdLines": 1449,
"estTokens": 16637,
"tmplBytes": 22567,
"descriptionLen": 233,
"hasGateEval": false,
"hasPeriodicEval": false
},
"design-review": {
"skill": "design-review",
"skillMdBytes": 95654,
"skillMdLines": 1932,
"estTokens": 23914,
"tmplBytes": 11674,
"descriptionLen": 304,
"hasGateEval": true,
"hasPeriodicEval": false
},
"design-shotgun": {
"skill": "design-shotgun",
"skillMdBytes": 62836,
"skillMdLines": 1311,
"estTokens": 15709,
"tmplBytes": 13331,
"descriptionLen": 786,
"hasGateEval": false,
"hasPeriodicEval": false
},
"devex-review": {
"skill": "devex-review",
"skillMdBytes": 64413,
"skillMdLines": 1233,
"estTokens": 16103,
"tmplBytes": 7984,
"descriptionLen": 201,
"hasGateEval": false,
"hasPeriodicEval": false
},
"document-generate": {
"skill": "document-generate",
"skillMdBytes": 52987,
"skillMdLines": 1176,
"estTokens": 13247,
"tmplBytes": 15093,
"descriptionLen": 334,
"hasGateEval": false,
"hasPeriodicEval": false
},
"document-release": {
"skill": "document-release",
"skillMdBytes": 58251,
"skillMdLines": 1235,
"estTokens": 14563,
"tmplBytes": 20362,
"descriptionLen": 192,
"hasGateEval": true,
"hasPeriodicEval": false
},
"freeze": {
"skill": "freeze",
"skillMdBytes": 3154,
"skillMdLines": 92,
"estTokens": 789,
"tmplBytes": 3038,
"descriptionLen": 503,
"hasGateEval": false,
"hasPeriodicEval": false
},
"gstack-upgrade": {
"skill": "gstack-upgrade",
"skillMdBytes": 10817,
"skillMdLines": 285,
"estTokens": 2704,
"tmplBytes": 10667,
"descriptionLen": 163,
"hasGateEval": true,
"hasPeriodicEval": false
},
"guard": {
"skill": "guard",
"skillMdBytes": 3297,
"skillMdLines": 91,
"estTokens": 824,
"tmplBytes": 3181,
"descriptionLen": 686,
"hasGateEval": false,
"hasPeriodicEval": false
},
"health": {
"skill": "health",
"skillMdBytes": 47916,
"skillMdLines": 1014,
"estTokens": 11979,
"tmplBytes": 11617,
"descriptionLen": 184,
"hasGateEval": true,
"hasPeriodicEval": false
},
"investigate": {
"skill": "investigate",
"skillMdBytes": 50409,
"skillMdLines": 1012,
"estTokens": 12602,
"tmplBytes": 11561,
"descriptionLen": 1379,
"hasGateEval": true,
"hasPeriodicEval": false
},
"ios-clean": {
"skill": "ios-clean",
"skillMdBytes": 41045,
"skillMdLines": 813,
"estTokens": 10261,
"tmplBytes": 3851,
"descriptionLen": 252,
"hasGateEval": false,
"hasPeriodicEval": false
},
"ios-design-review": {
"skill": "ios-design-review",
"skillMdBytes": 41631,
"skillMdLines": 815,
"estTokens": 10408,
"tmplBytes": 4417,
"descriptionLen": 209,
"hasGateEval": false,
"hasPeriodicEval": false
},
"ios-fix": {
"skill": "ios-fix",
"skillMdBytes": 40760,
"skillMdLines": 811,
"estTokens": 10190,
"tmplBytes": 3574,
"descriptionLen": 187,
"hasGateEval": false,
"hasPeriodicEval": false
},
"ios-qa": {
"skill": "ios-qa",
"skillMdBytes": 47271,
"skillMdLines": 931,
"estTokens": 11818,
"tmplBytes": 10090,
"descriptionLen": 223,
"hasGateEval": true,
"hasPeriodicEval": false
},
"ios-sync": {
"skill": "ios-sync",
"skillMdBytes": 40737,
"skillMdLines": 804,
"estTokens": 10184,
"tmplBytes": 3544,
"descriptionLen": 269,
"hasGateEval": false,
"hasPeriodicEval": false
},
"land-and-deploy": {
"skill": "land-and-deploy",
"skillMdBytes": 91886,
"skillMdLines": 1856,
"estTokens": 22972,
"tmplBytes": 48624,
"descriptionLen": 160,
"hasGateEval": true,
"hasPeriodicEval": false
},
"landing-report": {
"skill": "landing-report",
"skillMdBytes": 43985,
"skillMdLines": 874,
"estTokens": 10996,
"tmplBytes": 6806,
"descriptionLen": 195,
"hasGateEval": false,
"hasPeriodicEval": false
},
"learn": {
"skill": "learn",
"skillMdBytes": 41722,
"skillMdLines": 891,
"estTokens": 10431,
"tmplBytes": 5594,
"descriptionLen": 178,
"hasGateEval": true,
"hasPeriodicEval": false
},
"make-pdf": {
"skill": "make-pdf",
"skillMdBytes": 29450,
"skillMdLines": 663,
"estTokens": 7363,
"tmplBytes": 5106,
"descriptionLen": 177,
"hasGateEval": false,
"hasPeriodicEval": false
},
"office-hours": {
"skill": "office-hours",
"skillMdBytes": 112842,
"skillMdLines": 2066,
"estTokens": 28211,
"tmplBytes": 55466,
"descriptionLen": 860,
"hasGateEval": true,
"hasPeriodicEval": false
},
"open-gstack-browser": {
"skill": "open-gstack-browser",
"skillMdBytes": 46131,
"skillMdLines": 954,
"estTokens": 11533,
"tmplBytes": 7702,
"descriptionLen": 204,
"hasGateEval": false,
"hasPeriodicEval": false
},
"pair-agent": {
"skill": "pair-agent",
"skillMdBytes": 46939,
"skillMdLines": 1010,
"estTokens": 11735,
"tmplBytes": 8548,
"descriptionLen": 167,
"hasGateEval": false,
"hasPeriodicEval": false
},
"plan-ceo-review": {
"skill": "plan-ceo-review",
"skillMdBytes": 132488,
"skillMdLines": 2197,
"estTokens": 33122,
"tmplBytes": 63393,
"descriptionLen": 794,
"hasGateEval": true,
"hasPeriodicEval": true
},
"plan-design-review": {
"skill": "plan-design-review",
"skillMdBytes": 107855,
"skillMdLines": 1928,
"estTokens": 26964,
"tmplBytes": 28624,
"descriptionLen": 218,
"hasGateEval": true,
"hasPeriodicEval": true
},
"plan-devex-review": {
"skill": "plan-devex-review",
"skillMdBytes": 106167,
"skillMdLines": 2119,
"estTokens": 26542,
"tmplBytes": 35680,
"descriptionLen": 250,
"hasGateEval": true,
"hasPeriodicEval": true
},
"plan-eng-review": {
"skill": "plan-eng-review",
"skillMdBytes": 103009,
"skillMdLines": 1762,
"estTokens": 25752,
"tmplBytes": 26234,
"descriptionLen": 231,
"hasGateEval": true,
"hasPeriodicEval": true
},
"plan-tune": {
"skill": "plan-tune",
"skillMdBytes": 51717,
"skillMdLines": 1077,
"estTokens": 12929,
"tmplBytes": 15586,
"descriptionLen": 325,
"hasGateEval": true,
"hasPeriodicEval": false
},
"qa": {
"skill": "qa",
"skillMdBytes": 73863,
"skillMdLines": 1622,
"estTokens": 18466,
"tmplBytes": 12701,
"descriptionLen": 218,
"hasGateEval": true,
"hasPeriodicEval": false
},
"qa-only": {
"skill": "qa-only",
"skillMdBytes": 56421,
"skillMdLines": 1194,
"estTokens": 14105,
"tmplBytes": 3851,
"descriptionLen": 165,
"hasGateEval": true,
"hasPeriodicEval": false
},
"retro": {
"skill": "retro",
"skillMdBytes": 82889,
"skillMdLines": 1750,
"estTokens": 20722,
"tmplBytes": 42427,
"descriptionLen": 648,
"hasGateEval": true,
"hasPeriodicEval": false
},
"review": {
"skill": "review",
"skillMdBytes": 94048,
"skillMdLines": 1762,
"estTokens": 23512,
"tmplBytes": 14099,
"descriptionLen": 205,
"hasGateEval": true,
"hasPeriodicEval": false
},
"scrape": {
"skill": "scrape",
"skillMdBytes": 43641,
"skillMdLines": 887,
"estTokens": 10910,
"tmplBytes": 5220,
"descriptionLen": 167,
"hasGateEval": true,
"hasPeriodicEval": false
},
"setup-browser-cookies": {
"skill": "setup-browser-cookies",
"skillMdBytes": 26618,
"skillMdLines": 594,
"estTokens": 6655,
"tmplBytes": 2724,
"descriptionLen": 222,
"hasGateEval": false,
"hasPeriodicEval": false
},
"setup-deploy": {
"skill": "setup-deploy",
"skillMdBytes": 43927,
"skillMdLines": 919,
"estTokens": 10982,
"tmplBytes": 7780,
"descriptionLen": 197,
"hasGateEval": true,
"hasPeriodicEval": false
},
"setup-gbrain": {
"skill": "setup-gbrain",
"skillMdBytes": 78394,
"skillMdLines": 1704,
"estTokens": 19599,
"tmplBytes": 42245,
"descriptionLen": 323,
"hasGateEval": true,
"hasPeriodicEval": false
},
"ship": {
"skill": "ship",
"skillMdBytes": 166782,
"skillMdLines": 3099,
"estTokens": 41696,
"tmplBytes": 50495,
"descriptionLen": 291,
"hasGateEval": true,
"hasPeriodicEval": true
},
"skillify": {
"skill": "skillify",
"skillMdBytes": 53534,
"skillMdLines": 1168,
"estTokens": 13384,
"tmplBytes": 15107,
"descriptionLen": 233,
"hasGateEval": true,
"hasPeriodicEval": false
},
"spec": {
"skill": "spec",
"skillMdBytes": 102629,
"skillMdLines": 2141,
"estTokens": 25657,
"tmplBytes": 28429,
"descriptionLen": 282,
"hasGateEval": true,
"hasPeriodicEval": false
},
"sync-gbrain": {
"skill": "sync-gbrain",
"skillMdBytes": 50156,
"skillMdLines": 1028,
"estTokens": 12539,
"tmplBytes": 13996,
"descriptionLen": 299,
"hasGateEval": false,
"hasPeriodicEval": false
},
"unfreeze": {
"skill": "unfreeze",
"skillMdBytes": 1504,
"skillMdLines": 49,
"estTokens": 376,
"tmplBytes": 1386,
"descriptionLen": 199,
"hasGateEval": false,
"hasPeriodicEval": false
}
}
}
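A short sketch of how a snapshot of this shape could be sanity-checked. Two observations drive it: every `estTokens` value shown here is consistent with `skillMdBytes / 4` rounded to the nearest integer, and `topHeaviest` matches the ten largest entries of `skills` by `skillMdBytes`. The generator's actual formula and code are not part of this diff, so `estimateTokens` and `topHeaviest` below are hypothetical reconstructions.

```typescript
// Subset of the per-skill fields used by this sketch.
interface SkillSize {
  skill: string;
  skillMdBytes: number;
  estTokens: number;
}

// Assumed heuristic: roughly 4 bytes per token, rounded to nearest.
export function estimateTokens(skillMdBytes: number): number {
  return Math.round(skillMdBytes / 4);
}

// Returns the `limit` largest skills by SKILL.md size, largest first.
export function topHeaviest<T extends SkillSize>(
  skills: Record<string, T>,
  limit = 10,
): T[] {
  return Object.values(skills)
    .sort((a, b) => b.skillMdBytes - a.skillMdBytes)
    .slice(0, limit);
}

// Lists skills whose recorded estTokens disagrees with the heuristic.
export function findTokenMismatches<T extends SkillSize>(
  skills: Record<string, T>,
): string[] {
  return Object.values(skills)
    .filter((s) => estimateTokens(s.skillMdBytes) !== s.estTokens)
    .map((s) => s.skill);
}
```

Run against the parsed JSON, `findTokenMismatches(snapshot.skills)` would be expected to return an empty list if the heuristic holds for every entry.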