Mirror of https://github.com/garrytan/gstack.git (synced 2026-05-02 11:45:20 +02:00)
feat(preamble): upgrade AskUserQuestion format to Pros/Cons decision brief
Part 4 of 4 (plan: ~/.claude/plans/system-instruction-you-are-working-polymorphic-twilight.md).

Every AskUserQuestion now renders as a decision brief, not a bullet list: D-numbered header, ELI10, Stakes-if-we-pick-wrong, Recommendation, Pros/Cons with ✅/❌ markers per option, closing Net: tradeoff synthesis.

scripts/resolvers/preamble/generate-ask-user-format.ts
- Full rewrite. Preserves prior rules (Re-ground, ELI10, Recommend, Completeness, Options) and adds:
  - D-numbering per skill invocation (model-level, not runtime state)
  - Stakes line (pain avoided / capability unlocked / consequence named)
  - Pros/Cons block with min 2 ✅ + 1 ❌ per option, min 40 chars/bullet
  - Hard-stop escape: "✅ No cons — this is a hard-stop choice" for genuine one-sided choices (destructive-action confirmations)
  - Neutral-posture handling (CT1-compliant): (recommended) label STAYS on default option to preserve AUTO_DECIDE contract; neutrality expressed as prose in Recommendation line only
  - Net line closes the decision with a one-sentence tradeoff frame
  - Rule 11: tool_use mandate (prose "Question:" blocks don't count)
  - Self-check list before emitting

test/skill-validation.test.ts
- Update format assertions to check for new Pros/Cons tokens (Pros / cons:, Recommendation: <choice>, Net:, ELI10, Stakes if we pick wrong:, ✅, ❌) across all tier-2+ skills
- Old "RECOMMENDATION: Choose" expectation removed (the new format uses mixed-case "Recommendation:" with no literal "Choose")

test/skill-e2e-plan-format.test.ts
- Add v1.7.0.0 format token regexes (PROS_CONS_HEADER_RE, PRO_BULLET_RE, CON_BULLET_RE, NET_LINE_RE, D_NUMBER_RE, STAKES_RE)
- Existing RECOMMENDATION_RE loosened to accept mixed-case "Recommendation:" (canonical v1.7.0.0 form) alongside all-caps (legacy). Tests are additive: the strict new-format gate is the upcoming cadence eval.

Regenerated 30 SKILL.md files via bun run gen:skill-docs.

Verified:
- bun test: 319 pass (1 pre-existing security-bench fixture oversize failure on main, unrelated; confirmed via git stash test on main HEAD)
- New format tokens render in all tier-2+ skills (plan-ceo-review, plan-eng-review, ship, office-hours verified)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
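
The Pros/Cons rules listed in the commit message (at least two ✅ and one ❌ per option, at least 40 characters per bullet, and the hard-stop escape) could also be checked mechanically. The following is a minimal sketch, not code from the repository: `OptionBlock`, `validateOption`, and `HARD_STOP_RE` are hypothetical names, and the sketch assumes that the hard-stop line waives only the ❌ minimum and that the 40-character minimum applies to the text after the marker.

```typescript
// Hypothetical shapes and names, invented for this sketch (not from the repo).
interface OptionBlock {
  label: string;
  bullets: string[]; // one rendered line per bullet, e.g. "✅ Keeps ..."
}

const MIN_PROS = 2;
const MIN_CONS = 1;
const MIN_BULLET_CHARS = 40;

// Assumption: the hard-stop escape is recognised by its leading words only.
const HARD_STOP_RE = /^✅\s+No cons\b/;

function validateOption(option: OptionBlock): string[] {
  const errors: string[] = [];
  const bullets = option.bullets.map((b) => b.trim());
  const hardStop = bullets.some((b) => HARD_STOP_RE.test(b));
  // The hard-stop line itself is not counted as a pro.
  const pros = bullets.filter((b) => b.startsWith("✅") && !HARD_STOP_RE.test(b));
  const cons = bullets.filter((b) => b.startsWith("❌"));

  if (pros.length < MIN_PROS) {
    errors.push(`${option.label}: needs at least ${MIN_PROS} ✅ bullets, found ${pros.length}`);
  }
  // Assumption: the hard-stop line waives only the ❌ minimum.
  if (cons.length < MIN_CONS && !hardStop) {
    errors.push(`${option.label}: needs at least ${MIN_CONS} ❌ bullet or the hard-stop line`);
  }
  for (const bullet of [...pros, ...cons]) {
    // Assumption: the length minimum applies to the text after the marker.
    const text = bullet.replace(/^[✅❌]\s*/, "");
    if (text.length < MIN_BULLET_CHARS) {
      errors.push(`${option.label}: bullet shorter than ${MIN_BULLET_CHARS} chars: "${bullet}"`);
    }
  }
  return errors;
}
```

Under these assumptions, an option whose only bullets are "✅ Fast" and "❌ Risky" would yield three errors: one for the missing second ✅ and one for each short bullet.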
@@ -35,10 +35,25 @@ const evalCollector = createEvalCollector('e2e-plan-format');
 // Regex predicates applied to captured AskUserQuestion content.
 // RECOMMENDATION regex is lenient on intervening markdown markers (e.g.
 // agent writes `**RECOMMENDATION:** Choose` — the `**` closers are benign).
-const RECOMMENDATION_RE = /RECOMMENDATION:[*\s]*Choose/;
+// Post v1.7.0.0: "Recommendation:" (mixed-case) is the canonical form per
+// the Pros/Cons format; accept both cases for backward compatibility.
+const RECOMMENDATION_RE = /[Rr]ecommendation:[*\s]*Choose/;
 const COMPLETENESS_RE = /Completeness:\s*\d{1,2}\/10/;
 const KIND_NOTE_RE = /options differ in kind/i;
 
+// v1.7.0.0 Pros/Cons format tokens. Tests are additive: existing
+// RECOMMENDATION / Completeness / kind-note assertions still hold; new
+// format tokens are asserted ONLY when the capture is from a v1.7+
+// skill rendering. Presence is optional for backward compatibility during
+// rollout; the periodic-tier cadence+format eval (see skill-e2e-plan-cadence)
+// is the strict gate for the new format.
+const PROS_CONS_HEADER_RE = /Pros\s*\/\s*cons:/i;
+const PRO_BULLET_RE = /^\s*✅\s+\S/m;
+const CON_BULLET_RE = /^\s*❌\s+\S/m;
+const NET_LINE_RE = /^Net:\s+\S/m;
+const D_NUMBER_RE = /^D\d+\s+—/m;
+const STAKES_RE = /Stakes if we pick wrong:/i;
+
 const SAMPLE_PLAN = `# Plan: Add User Dashboard
 
 ## Context
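
To see how the predicates from the hunk above behave, the following sketch runs them against a small capture. The regex literals are copied from the diff; everything else is an assumption made for illustration: `sampleCapture` is an invented decision brief, and `formatTokenReport` is a hypothetical helper that does not appear in the test file.

```typescript
// Regex literals copied from the hunk above.
const RECOMMENDATION_RE = /[Rr]ecommendation:[*\s]*Choose/;
const PROS_CONS_HEADER_RE = /Pros\s*\/\s*cons:/i;
const PRO_BULLET_RE = /^\s*✅\s+\S/m;
const CON_BULLET_RE = /^\s*❌\s+\S/m;
const NET_LINE_RE = /^Net:\s+\S/m;
const D_NUMBER_RE = /^D\d+\s+—/m;
const STAKES_RE = /Stakes if we pick wrong:/i;

// Hypothetical capture, invented for this sketch (not real skill output).
const sampleCapture = [
  "D1 — Where should dashboard data be cached?",
  "ELI10: We need a place to keep answers so the page loads quickly.",
  "Stakes if we pick wrong: slow pages or stale numbers for every user.",
  "Recommendation: A, because it is the smallest change that works.",
  "Pros / cons:",
  "A) In-process cache (recommended)",
  "  ✅ No new infrastructure has to be deployed or monitored",
  "  ✅ Reads stay fast because nothing leaves the process",
  "  ❌ Each server instance keeps its own copy of the data",
  "Net: A trades cross-instance consistency for simplicity.",
].join("\n");

// Hypothetical helper: reports which new-format tokens a capture contains.
function formatTokenReport(capture: string): Record<string, boolean> {
  return {
    prosConsHeader: PROS_CONS_HEADER_RE.test(capture),
    proBullet: PRO_BULLET_RE.test(capture),
    conBullet: CON_BULLET_RE.test(capture),
    netLine: NET_LINE_RE.test(capture),
    dNumber: D_NUMBER_RE.test(capture),
    stakes: STAKES_RE.test(capture),
  };
}

console.log(formatTokenReport(sampleCapture));

// Case behaviour of RECOMMENDATION_RE as written in the hunk.
console.log(RECOMMENDATION_RE.test("Recommendation: Choose A"));     // true
console.log(RECOMMENDATION_RE.test("**Recommendation:** Choose A")); // true
console.log(RECOMMENDATION_RE.test("RECOMMENDATION: Choose A"));     // false
```

Because `NET_LINE_RE` and `D_NUMBER_RE` are anchored with `^` under the `m` flag, an indented `Net:` or `D1` line would not match, whereas the bullet patterns tolerate leading whitespace. The last check shows that the character class varies only the first letter, so an all-caps `RECOMMENDATION: Choose` capture does not match the pattern as written; whether legacy captures still need to pass is not something the hunk settles.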