merge: integrate origin/main (v1.1.2.0) — mode-posture energy

Main shipped v1.1.2.0, restoring mode-posture energy to /plan-ceo-review
EXPANSION and /office-hours forcing + builder modes. V1's writing-style
rules 2-4 collapsed every outcome into diagnostic-pain framing; models
follow concrete examples over abstract taxonomies, so cathedral-mode
output was flattening even when the template said "dream big."

Conflicts:
- VERSION / package.json: kept 1.2.0.0 (branch higher than main's 1.1.2.0)
- CHANGELOG: preserved 1.2.0.0 at top, inserted main's 1.1.2.0 below it,
  and added a short note under 1.2.0.0's Changed section documenting
  that the mode-posture examples are included here too (via the port)
- scripts/resolvers/preamble.ts: main edited inline writing-style
  examples in the old monolithic preamble file; my submodule refactor
  landed the same file as an 80-line composition root. Resolution:
  kept my submodule structure (dropped main's 800 lines of inline code)
  and ported main's new rule 2/3/4 examples into
  scripts/resolvers/preamble/generate-writing-style.ts — same behavior,
  just in the right place for the submodule shape.

Ship SKILL.md, golden fixtures, office-hours/plan-ceo-review templates,
new test/fixtures/mode-posture/** fixtures, new judgePosture helper,
and touchfiles entries for three new gate-tier E2E tests (plan-ceo-
review-expansion-energy, office-hours-forcing-energy, office-hours-
builder-wildness) all auto-merged cleanly.

Regenerated all SKILL.md files and ship goldens. 423 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-04-19 05:57:27 +08:00
44 changed files with 745 additions and 105 deletions
@@ -26,9 +26,15 @@ export function generateWritingStyle(_ctx: TemplateContext): string {
These rules apply to every AskUserQuestion, every response you write to the user, and every review finding. They compose with the AskUserQuestion Format section above: Format = *how* a question is structured; Writing Style = *the prose quality of the content inside it*.
1. **Jargon gets a one-sentence gloss on first use per skill invocation.** Even if the user's own prompt already contained the term — users often paste jargon from someone else's plan. Gloss unconditionally on first use. No cross-invocation memory: a new skill fire is a new first-use opportunity. Example: "race condition (two things happen at the same time and step on each other)".
2. **Frame questions in outcome terms, not implementation terms.** Bad: "Is this endpoint idempotent?" Good: "If someone double-clicks the button, is it OK for the action to run twice?" Ask the question the user would actually want to answer.
3. **Short sentences. Concrete nouns. Active voice.** Standard advice from any good writing guide. Prefer "the cache stores the result for 60s" over "results will have been cached for a period of 60s."
4. **Close every decision with user impact.** Connect the technical call back to who's affected. "If we skip this, your users will see a 3-second spinner on every page load." Make the user's user real.
2. **Frame questions in outcome terms, not implementation terms.** Ask the question the user would actually want to answer. Outcome framing covers three families — match the framing to the mode:
- **Pain reduction** (default for diagnostic / HOLD SCOPE / rigor review): "If someone double-clicks the button, is it OK for the action to run twice?" (instead of "Is this endpoint idempotent?")
- **Upside / delight** (for expansion / builder / vision contexts): "When the workflow finishes, does the user see the result instantly, or are they still refreshing a dashboard?" (instead of "Should we add webhook notifications?")
- **Interrogative pressure** (for forcing-question / founder-challenge contexts): "Can you name the actual person whose career gets better if this ships and whose career gets worse if it doesn't?" (instead of "Who's the target user?")
3. **Short sentences. Concrete nouns. Active voice.** Standard advice from any good writing guide. Prefer "the cache stores the result for 60s" over "results will have been cached for a period of 60s." *Exception:* stacked, multi-part questions are a legitimate forcing device — "Title? Gets them promoted? Gets them fired? Keeps them up at night?" is longer than one short sentence, and it should be, because the pressure IS in the stacking. Don't collapse a stack into a single neutral ask when the skill's posture is forcing.
4. **Close every decision with user impact.** Connect the technical call back to who's affected. Make the user's user real. Impact has three shapes — again, match the mode:
- **Pain avoided:** "If we skip this, your users will see a 3-second spinner on every page load."
- **Capability unlocked:** "If we ship this, users get instant feedback the moment a workflow finishes — no tabs to refresh, no polling."
- **Consequence named** (for forcing questions): "If you can't name the person whose career this helps, you don't know who you're building for — and 'users' isn't an answer."
5. **User-turn override.** If the user's current message says "be terse" / "no explanations" / "brutally honest, just the answer" / similar, skip this entire Writing Style block for your next response, regardless of config. User's in-turn request wins.
6. **Glossary boundary is the curated list.** Terms below get glossed. Terms not on the list are assumed plain-English enough. If you see a term that genuinely needs glossing but isn't listed, note it (once) in your response so it can be added via PR.