feat(preamble): upgrade AskUserQuestion format to Pros/Cons decision brief

Part 4 of 4 (plan: ~/.claude/plans/system-instruction-you-are-working-polymorphic-twilight.md).

Every AskUserQuestion now renders as a decision brief, not a bullet list: D-numbered header, ELI10, Stakes-if-we-pick-wrong, Recommendation, Pros/Cons with ✅/❌ markers per option, closing Net: tradeoff synthesis.

scripts/resolvers/preamble/generate-ask-user-format.ts
  - Full rewrite. Preserves prior rules (Re-ground, ELI10, Recommend, Completeness, Options) and adds:
    - D-numbering per skill invocation (model-level, not runtime state)
    - Stakes line (pain avoided / capability unlocked / consequence named)
    - Pros/Cons block with min 2 ✅ + 1 ❌ per option, min 40 chars/bullet
    - Hard-stop escape: "✅ No cons — this is a hard-stop choice" for genuine one-sided choices (destructive-action confirmations)
    - Neutral-posture handling (CT1-compliant): (recommended) label STAYS on default option to preserve AUTO_DECIDE contract; neutrality expressed as prose in Recommendation line only
    - Net line closes the decision with a one-sentence tradeoff frame
    - Rule 11: tool_use mandate (prose "Question:" blocks don't count)
    - Self-check list before emitting

test/skill-validation.test.ts
  - Update format assertions to check for new Pros/Cons tokens (Pros / cons:, Recommendation: <choice>, Net:, ELI10, Stakes if we pick wrong:, ✅, ❌) across all tier-2+ skills
  - Old "RECOMMENDATION: Choose" expectation removed (the new format uses mixed-case "Recommendation:" with no literal "Choose")

test/skill-e2e-plan-format.test.ts
  - Add v1.7.0.0 format token regexes (PROS_CONS_HEADER_RE, PRO_BULLET_RE, CON_BULLET_RE, NET_LINE_RE, D_NUMBER_RE, STAKES_RE)
  - Existing RECOMMENDATION_RE loosened to accept mixed-case "Recommendation:" (canonical v1.7.0.0 form) alongside all-caps (legacy). Tests are additive — the strict new-format gate is the upcoming cadence eval.

Regenerated 30 SKILL.md files via bun run gen:skill-docs.

Verified:
  - bun test: 319 pass (1 pre-existing security-bench fixture oversize failure on main, unrelated — confirmed via git stash test on main HEAD)
  - New format tokens render in all tier-2+ skills (plan-ceo-review, plan-eng-review, ship, office-hours verified)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 11:45:20 +02:00 · 2026-04-23 16:41:36 -07:00
parent d63b4cd0e0
commit 6b99df9df7
33 changed files with 3842 additions and 252 deletions
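As a rough illustration of the format tokens this commit introduces, the sketch below runs the six v1.7.0.0 regexes from test/skill-e2e-plan-format.test.ts against a made-up decision brief. The regexes are copied from the diff; the `sampleBrief` text, the `FORMAT_TOKENS` table, and the `missingFormatTokens` helper are hypothetical names invented for this example, and the exact layout of a real brief may differ from what is assumed here.

```typescript
// Format-token regexes as they appear in the diff for skill-e2e-plan-format.test.ts.
const PROS_CONS_HEADER_RE = /Pros\s*\/\s*cons:/i;
const PRO_BULLET_RE = /^\s*✅\s+\S/m;
const CON_BULLET_RE = /^\s*❌\s+\S/m;
const NET_LINE_RE = /^Net:\s+\S/m;
const D_NUMBER_RE = /^D\d+\s+—/m;
const STAKES_RE = /Stakes if we pick wrong:/i;

// Hypothetical lookup table (not in the commit): token name -> regex.
const FORMAT_TOKENS: Array<[string, RegExp]> = [
  ["pros-cons-header", PROS_CONS_HEADER_RE],
  ["pro-bullet", PRO_BULLET_RE],
  ["con-bullet", CON_BULLET_RE],
  ["net-line", NET_LINE_RE],
  ["d-number", D_NUMBER_RE],
  ["stakes", STAKES_RE],
];

// Hypothetical helper: returns the names of the tokens that `text` lacks.
function missingFormatTokens(text: string): string[] {
  return FORMAT_TOKENS
    .filter(([, re]) => !re.test(text))
    .map(([name]) => name);
}

// Invented sample brief; wording and layout are assumptions for illustration only.
const sampleBrief = [
  "D1 — Where should dashboard session data live?",
  "ELI10: We need to pick one cupboard to keep our notes in.",
  "Stakes if we pick wrong: a data migration later that delays the dashboard.",
  "Recommendation: A, because it adds no new infrastructure.",
  "Pros / cons:",
  "A) Existing Postgres database (recommended)",
  "  ✅ Reuses the database the team already operates in production",
  "  ✅ Keeps session rows joinable with the existing user tables",
  "  ❌ Adds write load to a database that is already heavily used",
  "B) New Redis instance",
  "  ✅ Handles short-lived session reads and writes with low latency",
  "  ✅ Expires stale sessions automatically through key time-to-live",
  "  ❌ Introduces a second datastore that someone has to monitor",
  "Net: A trades some extra write load for zero new infrastructure.",
].join("\n");

const missingInSample = missingFormatTokens(sampleBrief);
console.log(missingInSample.length === 0 ? "all tokens present" : missingInSample);
```

A legacy capture such as `RECOMMENDATION: Choose A` contains none of these tokens, which is consistent with the commit's note that the new assertions are additive and optional during rollout.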
@@ -35,10 +35,25 @@ const evalCollector = createEvalCollector('e2e-plan-format');
 // Regex predicates applied to captured AskUserQuestion content.
 // RECOMMENDATION regex is lenient on intervening markdown markers (e.g.
 // agent writes `**RECOMMENDATION:** Choose` — the `**` closers are benign).
-const RECOMMENDATION_RE = /RECOMMENDATION:[*\s]*Choose/;
+// Post v1.7.0.0: "Recommendation:" (mixed-case) is the canonical form per
+// the Pros/Cons format; accept both cases for backward compatibility.
+const RECOMMENDATION_RE = /(?:RECOMMENDATION|[Rr]ecommendation):[*\s]*Choose/;
 const COMPLETENESS_RE = /Completeness:\s*\d{1,2}\/10/;
 const KIND_NOTE_RE = /options differ in kind/i;

+// v1.7.0.0 Pros/Cons format tokens. Tests are additive: existing
+// RECOMMENDATION / Completeness / kind-note assertions still hold; new
+// format tokens are asserted ONLY when the capture is from a v1.7+
+// skill rendering. Presence is optional for backward compatibility during
+// rollout; the periodic-tier cadence+format eval (see skill-e2e-plan-cadence)
+// is the strict gate for the new format.
+const PROS_CONS_HEADER_RE = /Pros\s*\/\s*cons:/i;
+const PRO_BULLET_RE = /^\s*✅\s+\S/m;
+const CON_BULLET_RE = /^\s*❌\s+\S/m;
+const NET_LINE_RE = /^Net:\s+\S/m;
+const D_NUMBER_RE = /^D\d+\s+—/m;
+const STAKES_RE = /Stakes if we pick wrong:/i;
+
 const SAMPLE_PLAN = `# Plan: Add User Dashboard

 ## Context