fix: resolve merge conflicts — bump to v0.14.2.0 on top of main's v0.14.1.0

Main claimed v0.14.1.0 for the "Comparison Board is the Chooser" release.
Our branch bumps to v0.14.2.0 for the sidebar CSS inspector + per-tab agents.
Both CHANGELOG entries preserved in correct order.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed by Garry Tan on 2026-03-30 09:14:17 -07:00 · 10 changed files with 503 additions and 138 deletions
+15 -1
@@ -1,6 +1,6 @@
# Changelog
-## [0.14.1.0] - 2026-03-30 — Sidebar CSS Inspector + Per-Tab Agents
+## [0.14.2.0] - 2026-03-30 — Sidebar CSS Inspector + Per-Tab Agents
The sidebar is now a visual design tool. Pick any element on the page and see the full CSS rule cascade, box model, and computed styles right in the Side Panel. Edit styles live and see changes instantly. Each browser tab gets its own independent agent, so you can work on multiple pages simultaneously without cross-talk. Cleanup is LLM-powered... the agent snapshots the page, understands it semantically, and removes the junk while keeping the site's identity.
@@ -30,6 +30,20 @@ The sidebar is now a visual design tool. Pick any element on the page and see th
- **Input placeholder** is "Ask about this page..." (more inviting than the old placeholder).
- **System prompt** includes prompt injection defense and allowed-commands whitelist from the security audit.
## [0.14.1.0] - 2026-03-30 — Comparison Board is the Chooser
The design comparison board now always opens automatically when reviewing variants. No more inline image + "which do you prefer?" — the board has rating controls, comments, remix/regenerate buttons, and structured feedback output. That's the experience. All 3 design skills (/plan-design-review, /design-shotgun, /design-consultation) get this fix.
### Changed
- **Comparison board is now mandatory.** After generating design variants, the agent creates a comparison board with `$D compare --serve` and sends you the URL via AskUserQuestion. You interact with the board, click Submit, and the agent reads your structured feedback from `feedback.json`. No more polling loops as the primary wait mechanism.
- **AskUserQuestion is the wait, not the chooser.** The agent uses AskUserQuestion to tell you the board is open and wait for you to finish, not to present variants inline and ask for preferences. The board URL is always included so you can click through if you lost the tab.
- **Serve-failure fallback improved.** If the comparison board server can't start, variants are shown inline via Read tool before asking for preferences — you're no longer choosing blind.
### Fixed
- **Board URL corrected.** The recovery URL now points to `http://127.0.0.1:<PORT>/` (where the server actually serves) instead of `/design-board.html` (which would 404).
## [0.14.0.0] - 2026-03-30 — Design to Code
You can now go from an approved design mockup to production-quality HTML with one command. `/design-html` takes the winning design from `/design-shotgun` and generates Pretext-native HTML where text actually reflows on resize, heights adjust to content, and layouts are dynamic. No more hardcoded CSS heights or broken text overflow.
+11 -9
@@ -18,7 +18,7 @@ I'm [Garry Tan](https://x.com/garrytan), President & CEO of [Y Combinator](https
Same person. Different era. The difference is the tooling.
-**gstack is how I do it.** It turns Claude Code into a virtual engineering team — a CEO who rethinks the product, an eng manager who locks architecture, a designer who catches AI slop, a reviewer who finds production bugs, a QA lead who opens a real browser, a security officer who runs OWASP + STRIDE audits, and a release engineer who ships the PR. Twenty specialists and eight power tools, all slash commands, all Markdown, all free, MIT license.
+**gstack is how I do it.** It turns Claude Code into a virtual engineering team — a CEO who rethinks the product, an eng manager who locks architecture, a designer who catches AI slop, a reviewer who finds production bugs, a QA lead who opens a real browser, a security officer who runs OWASP + STRIDE audits, and a release engineer who ships the PR. Twenty-three specialists and eight power tools, all slash commands, all Markdown, all free, MIT license.
This is my open source software factory. I use it every day. I'm sharing it because these tools should be available to everyone.
@@ -46,11 +46,11 @@ Fork it. Improve it. Make it yours. And if you want to hate on free open source
Open Claude Code and paste this. Claude does the rest.
-> Install gstack: run **`git clone --single-branch --depth 1 https://github.com/garrytan/gstack.git ~/.claude/skills/gstack && cd ~/.claude/skills/gstack && ./setup`** then add a "gstack" section to CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, and lists the available skills: /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /design-shotgun, /review, /ship, /land-and-deploy, /canary, /benchmark, /browse, /connect-chrome, /qa, /qa-only, /design-review, /setup-browser-cookies, /setup-deploy, /retro, /investigate, /document-release, /codex, /cso, /autoplan, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade. Then ask the user if they also want to add gstack to the current project so teammates get it.
+> Install gstack: run **`git clone --single-branch --depth 1 https://github.com/garrytan/gstack.git ~/.claude/skills/gstack && cd ~/.claude/skills/gstack && ./setup`** then add a "gstack" section to CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, and lists the available skills: /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /design-shotgun, /design-html, /review, /ship, /land-and-deploy, /canary, /benchmark, /browse, /connect-chrome, /qa, /qa-only, /design-review, /setup-browser-cookies, /setup-deploy, /retro, /investigate, /document-release, /codex, /cso, /autoplan, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade, /learn. Then ask the user if they also want to add gstack to the current project so teammates get it.
### Step 2: Add to your repo so teammates get it (optional)
-> Add gstack to this project: run **`cp -Rf ~/.claude/skills/gstack .claude/skills/gstack && rm -rf .claude/skills/gstack/.git && cd .claude/skills/gstack && ./setup`** then add a "gstack" section to this project's CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, lists the available skills: /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /review, /ship, /land-and-deploy, /canary, /benchmark, /browse, /qa, /qa-only, /design-review, /setup-browser-cookies, /setup-deploy, /retro, /investigate, /document-release, /codex, /cso, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade, and tells Claude that if gstack skills aren't working, run `cd .claude/skills/gstack && ./setup` to build the binary and register skills.
+> Add gstack to this project: run **`cp -Rf ~/.claude/skills/gstack .claude/skills/gstack && rm -rf .claude/skills/gstack/.git && cd .claude/skills/gstack && ./setup`** then add a "gstack" section to this project's CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, lists the available skills: /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /design-shotgun, /design-html, /review, /ship, /land-and-deploy, /canary, /benchmark, /browse, /connect-chrome, /qa, /qa-only, /design-review, /setup-browser-cookies, /setup-deploy, /retro, /investigate, /document-release, /codex, /cso, /autoplan, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade, /learn, and tells Claude that if gstack skills aren't working, run `cd .claude/skills/gstack && ./setup` to build the binary and register skills.
Real files get committed to your repo (not a submodule), so `git clone` just works. Everything lives inside `.claude/`. Nothing touches your PATH or runs in the background.
@@ -90,7 +90,7 @@ git clone --single-branch --depth 1 https://github.com/garrytan/gstack.git ~/gst
cd ~/gstack && ./setup --host auto
```
-For Codex-compatible hosts, setup now supports both repo-local installs from `.agents/skills/gstack` and user-global installs from `~/.codex/skills/gstack`. All 29 skills work across all supported agents. Hook-based safety skills (careful, freeze, guard) use inline safety advisory prose on non-Claude hosts.
+For Codex-compatible hosts, setup now supports both repo-local installs from `.agents/skills/gstack` and user-global installs from `~/.codex/skills/gstack`. All 31 skills work across all supported agents. Hook-based safety skills (careful, freeze, guard) use inline safety advisory prose on non-Claude hosts.
### Factory Droid
@@ -165,6 +165,7 @@ Each skill feeds into the next. `/office-hours` writes a design doc that `/plan-
| `/investigate` | **Debugger** | Systematic root-cause debugging. Iron Law: no fixes without investigation. Traces data flow, tests hypotheses, stops after 3 failed fixes. |
| `/design-review` | **Designer Who Codes** | Same audit as /plan-design-review, then fixes what it finds. Atomic commits, before/after screenshots. |
| `/design-shotgun` | **Design Explorer** | Generate multiple AI design variants, open a comparison board in your browser, and iterate until you approve a direction. Taste memory biases toward your preferences. |
| `/design-html` | **Design Engineer** | Takes an approved mockup from `/design-shotgun` and generates production-quality HTML with Pretext for computed text layout. Text reflows on resize, heights adjust to content. Smart API routing picks the right Pretext patterns per design type. Framework detection for React/Svelte/Vue. |
| `/qa` | **QA Lead** | Test your app, find bugs, fix them with atomic commits, re-verify. Auto-generates regression tests for every fix. |
| `/qa-only` | **QA Reporter** | Same methodology as /qa but report only. Pure bug report without code changes. |
| `/cso` | **Chief Security Officer** | OWASP Top 10 + STRIDE threat model. Zero-noise: 17 false positive exclusions, 8/10+ confidence gate, independent finding verification. Each finding includes a concrete exploit scenario. |
@@ -177,6 +178,7 @@ Each skill feeds into the next. `/office-hours` writes a design doc that `/plan-
| `/browse` | **QA Engineer** | Give the agent eyes. Real Chromium browser, real clicks, real screenshots. ~100ms per command. `$B connect` launches your real Chrome as a headed window — watch every action live. |
| `/setup-browser-cookies` | **Session Manager** | Import cookies from your real browser (Chrome, Arc, Brave, Edge) into the headless session. Test authenticated pages. |
| `/autoplan` | **Review Pipeline** | One command, fully reviewed plan. Runs CEO → design → eng review automatically with encoded decision principles. Surfaces only taste decisions for your approval. |
| `/learn` | **Memory** | Manage what gstack learned across sessions. Review, search, prune, and export project-specific patterns, pitfalls, and preferences. Learnings compound across sessions so gstack gets smarter on your codebase over time. |
### Power tools
@@ -197,7 +199,7 @@ Each skill feeds into the next. `/office-hours` writes a design doc that `/plan-
gstack works well with one sprint. It gets interesting with ten running at once.
-**Design is at the heart.** `/design-consultation` doesn't just pick fonts. It researches what's out there in your space, proposes safe choices AND creative risks, generates realistic mockups of your actual product, and writes `DESIGN.md` — and then `/design-review` and `/plan-eng-review` read what you chose. Design decisions flow through the whole system.
+**Design is at the heart.** `/design-consultation` builds your design system from scratch, researches the space, proposes creative risks, and writes `DESIGN.md`. `/design-shotgun` generates multiple visual variants and opens a comparison board so you can pick a direction. `/design-html` takes that approved mockup and generates production-quality HTML with Pretext, where text actually reflows on resize instead of breaking with hardcoded heights. Then `/design-review` and `/plan-eng-review` read what you chose. Design decisions flow through the whole system.
**`/qa` was a massive unlock.** It let me go from 6 to 12 parallel workers. Claude Code saying *"I SEE THE ISSUE"* and then actually fixing it, generating a regression test, and verifying the fix — that changed how I work. The agent has eyes now.
@@ -286,10 +288,10 @@ Data is stored in [Supabase](https://supabase.com) (open source Firebase alterna
## gstack
Use /browse from gstack for all web browsing. Never use mcp__claude-in-chrome__* tools.
Available skills: /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review,
-/design-consultation, /review, /ship, /land-and-deploy, /canary, /benchmark, /browse,
-/qa, /qa-only, /design-review, /setup-browser-cookies, /setup-deploy, /retro,
-/investigate, /document-release, /codex, /cso, /autoplan, /careful, /freeze, /guard,
-/unfreeze, /gstack-upgrade.
+/design-consultation, /design-shotgun, /design-html, /review, /ship, /land-and-deploy,
+/canary, /benchmark, /browse, /connect-chrome, /qa, /qa-only, /design-review,
+/setup-browser-cookies, /setup-deploy, /retro, /investigate, /document-release, /codex,
+/cso, /autoplan, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade, /learn.
```
## License
+1 -1
@@ -1 +1 @@
-0.14.1.0
+0.14.2.0
+46 -29
@@ -814,31 +814,42 @@ $D compare --images "$_DESIGN_DIR/variant-A.png,$_DESIGN_DIR/variant-B.png,$_DES
This command generates the board HTML, starts an HTTP server on a random port,
and opens it in the user's default browser. **Run it in the background** with `&`
-because the agent needs to keep running while the user interacts with the board.
+because the server needs to stay running while the user interacts with the board.
-**IMPORTANT: Reading feedback via file polling (not stdout):**
Parse the port from stderr output: `SERVE_STARTED: port=XXXXX`. You need this
for the board URL and for reloading during regeneration cycles.
-The server writes feedback to files next to the board HTML. The agent polls for these:
**PRIMARY WAIT: AskUserQuestion with board URL**
After the board is serving, use AskUserQuestion to wait for the user. Include the
board URL so they can click it if they lost the browser tab:
"I've opened a comparison board with the design variants:
http://127.0.0.1:<PORT>/ — Rate them, leave comments, remix
elements you like, and click Submit when you're done. Let me know when you've
submitted your feedback (or paste your preferences here). If you clicked
Regenerate or Remix on the board, tell me and I'll generate new variants."
**Do NOT use AskUserQuestion to ask which variant the user prefers.** The comparison
board IS the chooser. AskUserQuestion is just the blocking wait mechanism.
**After the user responds to AskUserQuestion:**
Check for feedback files next to the board HTML:
- `$_DESIGN_DIR/feedback.json` — written when user clicks Submit (final choice)
- `$_DESIGN_DIR/feedback-pending.json` — written when user clicks Regenerate/Remix/More Like This
-**Polling loop** (run after launching `$D serve` in background):
```diff
-# Poll for feedback files every 5 seconds (up to 10 minutes)
-for i in $(seq 1 120); do
-  if [ -f "$_DESIGN_DIR/feedback.json" ]; then
-    echo "SUBMIT_RECEIVED"
-    cat "$_DESIGN_DIR/feedback.json"
-    break
-  elif [ -f "$_DESIGN_DIR/feedback-pending.json" ]; then
-    echo "REGENERATE_RECEIVED"
-    cat "$_DESIGN_DIR/feedback-pending.json"
-    rm "$_DESIGN_DIR/feedback-pending.json"
-    break
-  fi
-  sleep 5
-done
+if [ -f "$_DESIGN_DIR/feedback.json" ]; then
+  echo "SUBMIT_RECEIVED"
+  cat "$_DESIGN_DIR/feedback.json"
+elif [ -f "$_DESIGN_DIR/feedback-pending.json" ]; then
+  echo "REGENERATE_RECEIVED"
+  cat "$_DESIGN_DIR/feedback-pending.json"
+  rm "$_DESIGN_DIR/feedback-pending.json"
+else
+  echo "NO_FEEDBACK_FILE"
+fi
```
The feedback JSON has this shape:
@@ -852,24 +863,30 @@ The feedback JSON has this shape:
}
```
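The JSON body is truncated in this hunk; only the closing brace survives. Based on the field names used in the surrounding prose (`preferred`, `ratings`, `comments`, `overall`), a hypothetical parse might look like the sketch below — the sample file contents and the sed extraction are illustrative only, not the real schema (real code would use `jq`):

```shell
# Hypothetical feedback.json — field names come from the prose here, not from a documented schema
cat > /tmp/feedback.json <<'EOF'
{"preferred": "A", "overall": "go bolder with the typography"}
EOF

# Crude single-line extraction with sed; jq would be the robust choice
preferred=$(sed -n 's/.*"preferred": *"\([^"]*\)".*/\1/p' /tmp/feedback.json)
overall=$(sed -n 's/.*"overall": *"\([^"]*\)".*/\1/p' /tmp/feedback.json)
echo "preferred=$preferred"
```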
-**If `feedback-pending.json` found (`"regenerated": true`):**
+**If `feedback.json` found:** The user clicked Submit on the board.
+Read `preferred`, `ratings`, `comments`, `overall` from the JSON. Proceed with
+the approved variant.
+**If `feedback-pending.json` found:** The user clicked Regenerate/Remix on the board.
1. Read `regenerateAction` from the JSON (`"different"`, `"match"`, `"more_like_B"`,
`"remix"`, or custom text)
2. If `regenerateAction` is `"remix"`, read `remixSpec` (e.g. `{"layout":"A","colors":"B"}`)
3. Generate new variants with `$D iterate` or `$D variants` using updated brief
4. Create new board: `$D compare --images "..." --output "$_DESIGN_DIR/design-board.html"`
-5. Parse the port from the `$D serve` stderr output (`SERVE_STARTED: port=XXXXX`),
-   then reload the board in the user's browser (same tab):
+5. Reload the board in the user's browser (same tab):
`curl -s -X POST http://127.0.0.1:PORT/api/reload -H 'Content-Type: application/json' -d '{"html":"$_DESIGN_DIR/design-board.html"}'`
-6. The board auto-refreshes. **Poll again** for the next feedback file.
-7. Repeat until `feedback.json` appears (user clicked Submit).
+6. The board auto-refreshes. **AskUserQuestion again** with the same board URL to
+   wait for the next round of feedback. Repeat until `feedback.json` appears.
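The reload step relies on the port parsed earlier from the `SERVE_STARTED: port=XXXXX` line on stderr. A sketch of that parse, with the log line faked and the reload call left commented out since it needs a live board server:

```shell
# Stand-in for the $D serve stderr (format taken from the docs above)
echo "SERVE_STARTED: port=54321" > /tmp/serve.log

# Pull the port number out of the log line
PORT=$(sed -n 's/.*SERVE_STARTED: port=\([0-9]*\).*/\1/p' /tmp/serve.log)
echo "board URL: http://127.0.0.1:$PORT/"

# Hypothetical reload call — only works while the board server is running:
# curl -s -X POST "http://127.0.0.1:$PORT/api/reload" \
#   -H 'Content-Type: application/json' \
#   -d "{\"html\":\"$_DESIGN_DIR/design-board.html\"}"
```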
-**If `feedback.json` found (`"regenerated": false`):**
-1. Read `preferred`, `ratings`, `comments`, `overall` from the JSON
-2. Proceed with the approved variant
+**If `NO_FEEDBACK_FILE`:** The user typed their preferences directly in the
+AskUserQuestion response instead of using the board. Use their text response
+as the feedback.
-**If `$D serve` fails or no feedback within 10 minutes:** Fall back to AskUserQuestion:
-"I've opened the design board. Which variant do you prefer? Any feedback?"
+**POLLING FALLBACK:** Only use polling if `$D serve` fails (no port available).
+In that case, show each variant inline using the Read tool (so the user can see them),
+then use AskUserQuestion:
+"The comparison board server failed to start. I've shown the variants above.
+Which do you prefer? Any feedback?"
**After receiving feedback (any path):** Output a clear summary confirming
what was understood:
+281
@@ -12,14 +12,21 @@ Detailed guides for every gstack skill — philosophy, workflow, and examples.
| [`/review`](#review) | **Staff Engineer** | Find the bugs that pass CI but blow up in production. Auto-fixes the obvious ones. Flags completeness gaps. |
| [`/investigate`](#investigate) | **Debugger** | Systematic root-cause debugging. Iron Law: no fixes without investigation. Traces data flow, tests hypotheses, stops after 3 failed fixes. |
| [`/design-review`](#design-review) | **Designer Who Codes** | Live-site visual audit + fix loop. 80-item audit, then fixes what it finds. Atomic commits, before/after screenshots. |
| [`/design-shotgun`](#design-shotgun) | **Design Explorer** | Generate multiple AI design variants, open a comparison board in your browser, and iterate until you approve a direction. Taste memory biases toward your preferences. |
| [`/design-html`](#design-html) | **Design Engineer** | Takes an approved mockup from `/design-shotgun` and generates production-quality Pretext-native HTML. Text reflows on resize, heights adjust to content. Smart API routing per design type. Framework detection for React/Svelte/Vue. |
| [`/qa`](#qa) | **QA Lead** | Test your app, find bugs, fix them with atomic commits, re-verify. Auto-generates regression tests for every fix. |
| [`/qa-only`](#qa) | **QA Reporter** | Same methodology as /qa but report only. Use when you want a pure bug report without code changes. |
| [`/ship`](#ship) | **Release Engineer** | Sync main, run tests, audit coverage, push, open PR. Bootstraps test frameworks if you don't have one. One command. |
| [`/land-and-deploy`](#land-and-deploy) | **Release Engineer** | Merge the PR, wait for CI and deploy, verify production health. One command from "approved" to "verified in production." |
| [`/canary`](#canary) | **SRE** | Post-deploy monitoring loop. Watches for console errors, performance regressions, and page failures using the browse daemon. |
| [`/benchmark`](#benchmark) | **Performance Engineer** | Baseline page load times, Core Web Vitals, and resource sizes. Compare before/after on every PR. Track trends over time. |
| [`/cso`](#cso) | **Chief Security Officer** | OWASP Top 10 + STRIDE threat modeling security audit. Scans for injection, auth, crypto, and access control issues. |
| [`/document-release`](#document-release) | **Technical Writer** | Update all project docs to match what you just shipped. Catches stale READMEs automatically. |
| [`/retro`](#retro) | **Eng Manager** | Team-aware weekly retro. Per-person breakdowns, shipping streaks, test health trends, growth opportunities. |
| [`/browse`](#browse) | **QA Engineer** | Give the agent eyes. Real Chromium browser, real clicks, real screenshots. ~100ms per command. |
| [`/setup-browser-cookies`](#setup-browser-cookies) | **Session Manager** | Import cookies from your real browser (Chrome, Arc, Brave, Edge) into the headless session. Test authenticated pages. |
| [`/autoplan`](#autoplan) | **Review Pipeline** | One command, fully reviewed plan. Runs CEO → design → eng review automatically with encoded decision principles. Surfaces only taste decisions for your approval. |
| [`/learn`](#learn) | **Memory** | Manage what gstack learned across sessions. Review, search, prune, and export project-specific patterns and preferences. |
| | | |
| **Multi-AI** | | |
| [`/codex`](#codex) | **Second Opinion** | Independent review from OpenAI Codex CLI. Three modes: code review (pass/fail gate), adversarial challenge, and open consultation with session continuity. Cross-model analysis when both `/review` and `/codex` have run. |
@@ -29,6 +36,8 @@ Detailed guides for every gstack skill — philosophy, workflow, and examples.
| [`/freeze`](#safety--guardrails) | **Edit Lock** | Restrict all file edits to a single directory. Blocks Edit and Write outside the boundary. Accident prevention for debugging. |
| [`/guard`](#safety--guardrails) | **Full Safety** | Combines /careful + /freeze in one command. Maximum safety for prod work. |
| [`/unfreeze`](#safety--guardrails) | **Unlock** | Remove the /freeze boundary, allowing edits everywhere again. |
| [`/connect-chrome`](#connect-chrome) | **Chrome Controller** | Launch your real Chrome controlled by gstack with the Side Panel extension. Watch every action live. |
| [`/setup-deploy`](#setup-deploy) | **Deploy Configurator** | One-time setup for `/land-and-deploy`. Detects your platform, production URL, and deploy commands. |
| [`/gstack-upgrade`](#gstack-upgrade) | **Self-Updater** | Upgrade gstack to the latest version. Detects global vs vendored install, syncs both, shows what changed. |
---
@@ -399,6 +408,108 @@ Nine commits, each touching one concern. The AI Slop score went from D to A beca
---
## `/design-shotgun`
This is my **design exploration mode**.
You know the feeling. You have a feature, a page, a landing screen... and you're not sure what it should look like. You could describe it to Claude and get one answer. But one answer means one perspective, and design is a taste game. You need to see options.
`/design-shotgun` generates 3 visual design variants using the GPT Image API, opens a comparison board in your browser, and waits for your feedback. You pick a direction, request changes, or ask for entirely new variants. The board supports remix, regenerate, and approval actions.
### The loop
1. You describe what you want (or point at an existing page)
2. The skill reads your `DESIGN.md` for brand constraints (if it exists)
3. It generates 3 distinct design variants as PNGs
4. A comparison board opens in your browser with all 3 side-by-side
5. You click "Approve" on the one you like, or give feedback for another round
6. The approved variant saves to `~/.gstack/projects/$SLUG/designs/` with an `approved.json`
That `approved.json` is what `/design-html` reads. The design pipeline chains: shotgun picks the direction, design-html renders it as working code.
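The actual schema of `approved.json` isn't documented here; a hypothetical shape, purely to illustrate the handoff (every field name below is a guess — only the filename comes from the docs):

```shell
# Hypothetical approved.json written by /design-shotgun — fields are illustrative guesses
cat > /tmp/approved.json <<'EOF'
{"variant": "A", "image": "variant-A.png"}
EOF

# /design-html would read the winning mockup's filename from it
image=$(sed -n 's/.*"image": *"\([^"]*\)".*/\1/p' /tmp/approved.json)
echo "design-html would read mockup: $image"
```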
### Taste memory
The skill remembers your preferences across sessions. If you consistently prefer minimal designs over busy ones, it biases future generations. This isn't a setting you configure... it emerges from your approvals.
### Example
```
You: /design-shotgun — hero section for a developer tools landing page
Claude: [Generates 3 variants]
Variant A: Bold typography, dark background, code snippet hero
Variant B: Split layout, product screenshot left, copy right
Variant C: Minimal, centered headline, gradient accent
[Opens comparison board at localhost:PORT]
You: [Clicks "Approve" on Variant A in the browser]
Claude: Approved Variant A. Saved to ~/.gstack/projects/myapp/designs/
Next: run /design-html to generate production HTML from this mockup.
```
---
## `/design-html`
This is my **design-to-code mode**.
Every AI code generation tool produces static CSS. Hardcoded heights. Text that overflows on resize. Breakpoints that snap instead of flowing. The output looks right at exactly one viewport size and breaks at every other.
`/design-html` fixes this. It takes the approved mockup from `/design-shotgun` and generates HTML using [Pretext](https://github.com/chenglou/pretext) by Cheng Lou (ex-React core, Midjourney frontend). Pretext is a 15KB library that computes text layout without DOM measurement. Text reflows. Heights adjust to content. Cards size themselves. Chat bubbles shrinkwrap. All sub-millisecond, all dynamic.
### Smart API routing
Not every page needs the full Pretext engine. The skill reads the design and picks the right tools:
- **Simple layouts** (landing, marketing): `prepare()` + `layout()` for resize-aware heights
- **Card grids** (dashboard, listing): `prepare()` + `layout()` for self-sizing cards
- **Chat UIs**: `walkLineRanges()` for tight-fit bubbles with zero wasted pixels
- **Editorial layouts**: `layoutNextLine()` for text flowing around obstacles
- **Complex editorial**: Full engine with `layoutWithLines()` for manual line rendering
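The routing above amounts to a lookup table from design type to Pretext entry point. A shell restatement — the type labels and function name are shorthand invented for this sketch, not the skill's real identifiers:

```shell
# Hypothetical routing table mirroring the list above; labels and function name are illustrative
route_pretext_api() {
  case "$1" in
    landing|marketing|dashboard|listing) echo "prepare() + layout()" ;;
    chat)              echo "walkLineRanges()" ;;
    editorial)         echo "layoutNextLine()" ;;
    complex-editorial) echo "layoutWithLines()" ;;
    *)                 echo "prepare() + layout()" ;;  # simplest path as the default
  esac
}

route_pretext_api chat  # → walkLineRanges()
```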
### The refinement loop
1. Reads the approved mockup from `approved.json`
2. Uses GPT-4o vision to extract implementation spec (colors, typography, layout)
3. Generates self-contained HTML with Pretext inlined (15KB, zero network dependency)
4. Spins up a live-reload server so you see changes instantly
5. Screenshots at 3 viewports (mobile, tablet, desktop) to verify layout
6. AskUserQuestion: what needs to change?
7. Surgical edits via the Edit tool (not full regeneration)
8. Repeat until you say "done"
### Framework detection
If your project uses React, Svelte, or Vue (detected from `package.json`), the skill offers to generate a framework component instead of vanilla HTML. Framework output uses `npm install @chenglou/pretext` instead of inline vendoring.
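A minimal sketch of what detection from `package.json` could look like — the skill's real logic isn't shown, and this grep-based approach is deliberately crude:

```shell
# Hypothetical detection — checks for quoted dependency names only; the real skill may do more
cat > /tmp/package.json <<'EOF'
{"dependencies": {"react": "^18.0.0", "react-dom": "^18.0.0"}}
EOF

detect_framework() {
  for fw in react svelte vue; do
    # The closing quote keeps "react" from matching inside "react-dom"
    if grep -q "\"$fw\"" "$1"; then echo "$fw"; return; fi
  done
  echo "vanilla"
}

detect_framework /tmp/package.json  # → react
```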
### Example
```
You: /design-html
Claude: Found approved mockup: variant-A.png (hero section)
Design type: simple layout → using prepare() + layout()
[Generates HTML with Pretext, opens live-reload server]
[Screenshots at 375px, 768px, 1440px — all verified]
The HTML is live at localhost:3456. Resize your browser.
Text reflows dynamically. What needs to change?
You: Make the headline bigger and add more spacing below the CTA
Claude: [Edits font-size and margin, browser auto-reloads]
Updated. Two changes applied.
You: Done
Claude: Saved to ~/.gstack/projects/myapp/designs/hero-20260330/finalized.html
```
---
## `/review`
This is my **paranoid staff engineer mode**.
@@ -525,6 +636,82 @@ A lot of branches die when the interesting work is done and only the boring rele
---
## `/land-and-deploy`
This is my **deploy pipeline mode**.
`/ship` creates the PR. `/land-and-deploy` finishes the job: merge, deploy, verify.
It merges the PR, waits for CI, waits for the deploy to finish, then runs canary checks against production. One command from "approved" to "verified in production." If the deploy breaks, it tells you what failed and whether to roll back.
First run on a new project triggers a dry-run walk-through so you can verify the pipeline before it does anything irreversible. After that, it trusts the config and runs straight through.
### Setup
Run `/setup-deploy` first. It detects your platform (Fly.io, Render, Vercel, Netlify, Heroku, GitHub Actions, or custom), discovers your production URL and health check endpoints, and writes the config to CLAUDE.md. One-time, 60 seconds.
### Example
```
You: /land-and-deploy
Claude: Merging PR #42...
CI: 3/3 checks passed
Deploy: Fly.io — deploying v2.1.0...
Health check: https://myapp.fly.dev/health → 200 OK
Canary: 5 pages checked, 0 console errors, p95 < 800ms
Production verified. v2.1.0 is live.
```
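Under the hood the command is a sequence of gates, each of which must pass before the next runs. A minimal shell sketch of that control flow; every function name and output below is a hypothetical stub standing in for a real platform command (`gh pr merge`, `fly deploy`, a health-check `curl`), not gstack's actual internals:

```shell
# Stubbed gates: each stands in for a real command that exits nonzero on failure.
merge_pr()     { echo "merged PR #42"; }
wait_for_ci()  { echo "3/3 checks passed"; }
deploy_app()   { echo "deployed v2.1.0"; }
health_check() { echo "200"; }   # would be: curl -s -o /dev/null -w '%{http_code}' URL

run_pipeline() {
  merge_pr     || { echo "FAIL: merge"; return 1; }
  wait_for_ci  || { echo "FAIL: ci"; return 1; }
  deploy_app   || { echo "FAIL: deploy (consider rollback)"; return 1; }
  # The verify gate checks output, not just exit status.
  [ "$(health_check)" = "200" ] || { echo "FAIL: health (rollback)"; return 1; }
  echo "VERIFIED"
}

status=$(run_pipeline | tail -n 1)
echo "$status"
```

The point of the sketch is the short-circuit: any failed gate stops the run and names the stage, which is what makes "one command from approved to verified" safe.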
---
## `/canary`
This is my **post-deploy monitoring mode**.
After deploy, `/canary` watches the live site for trouble. It loops through your key pages using the browse daemon, checking for console errors, performance regressions, page failures, and visual anomalies. Takes periodic screenshots and compares against pre-deploy baselines.
Use it right after `/land-and-deploy`, or schedule it to run periodically after a risky deploy.
```
You: /canary https://myapp.com
Claude: Monitoring 8 pages every 2 minutes...
Cycle 1: ✓ All pages healthy. p95: 340ms. 0 console errors.
Cycle 2: ✓ All pages healthy. p95: 380ms. 0 console errors.
Cycle 3: ⚠ /dashboard — new console error: "TypeError: Cannot read
property 'map' of undefined" at dashboard.js:142
Screenshot saved.
Alert: 1 new console error after 3 monitoring cycles.
```
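The alert rule in the transcript above reduces to a count comparison: errors this cycle versus the pre-deploy baseline. A minimal sketch; the log file and its contents are illustrative stand-ins for real browse-daemon console output:

```shell
# Create a fake cycle-3 console log (illustrative, not real daemon output).
log=$(mktemp)
cat > "$log" <<'EOF'
TypeError: Cannot read property 'map' of undefined at dashboard.js:142
EOF

baseline_errors=0                        # captured before the deploy
current_errors=$(grep -c "Error" "$log") # lines mentioning an error this cycle

if [ "$current_errors" -gt "$baseline_errors" ]; then
  alert="Alert: $current_errors new console error(s) after this cycle"
else
  alert="All pages healthy"
fi
echo "$alert"
```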
---
## `/benchmark`
This is my **performance engineer mode**.
`/benchmark` establishes performance baselines for your pages: load time, Core Web Vitals (LCP, CLS, INP), resource counts, and total transfer size. Run it before and after a PR to catch regressions.
It uses the browse daemon for real Chromium measurements, not synthetic estimates. Multiple runs averaged. Results persist so you can track trends across PRs.
```
You: /benchmark https://myapp.com
Claude: Benchmarking 5 pages (3 runs each)...
/ load: 1.2s LCP: 0.9s CLS: 0.01 resources: 24 (890KB)
/dashboard load: 2.1s LCP: 1.8s CLS: 0.03 resources: 31 (1.4MB)
/settings load: 0.8s LCP: 0.6s CLS: 0.00 resources: 18 (420KB)
Baseline saved. Run again after changes to compare.
```
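"Multiple runs averaged" is the part that makes the numbers trustworthy. A one-liner sketch of that reduction step, with illustrative sample values:

```shell
# Average three load-time samples (seconds) into a single baseline number.
avg=$(printf '1.3\n1.1\n1.2\n' | awk '{ s += $1 } END { printf "%.1f", s / NR }')
echo "load (avg of 3 runs): ${avg}s"
```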
---
## `/cso`
This is my **Chief Security Officer**.
@@ -711,6 +898,100 @@ Claude: Imported 12 cookies for github.com from Comet.
---
## `/autoplan`
This is my **review autopilot mode**.
Running `/plan-ceo-review`, then `/plan-design-review`, then `/plan-eng-review` individually means answering 15-30 intermediate questions. Each question is valuable, but sometimes you want the gauntlet to run without stopping for every decision.
`/autoplan` reads all three review skills from disk and runs them sequentially: CEO → Design → Eng. It makes decisions automatically using six encoded principles (prefer completeness, match existing patterns, choose reversible options, prefer the option the user chose for similar past decisions, defer ambiguous items, and escalate security). Taste decisions (close approaches, borderline scope expansions, cross-model disagreements) get saved and presented at a final approval gate.
One command, fully reviewed plan out.
```
You: /autoplan
Claude: Running CEO review... [4 scope decisions auto-resolved]
Running design review... [3 design dimensions auto-scored]
Running eng review... [2 architecture decisions auto-resolved]
TASTE DECISIONS (need your input):
1. Scope: Codex suggested adding search — borderline expansion. Add?
2. Design: Two approaches scored within 1 point. Which feels right?
[Shows both options with context]
You: 1) Yes, add search. 2) Option A.
Claude: Plan complete. 9 decisions auto-resolved, 2 taste decisions approved.
```
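The routing logic behind "auto-resolved vs. taste decision" can be sketched as a dispatch table. The category names below are illustrative, not gstack's real decision schema:

```shell
# Hypothetical sketch of the autoplan routing: security always escalates,
# ambiguity defers, taste calls queue for the final gate, the rest auto-resolve.
decide() {
  case "$1" in
    security)  echo "escalate" ;;
    ambiguous) echo "defer" ;;
    taste)     echo "queue-for-approval-gate" ;;
    *)         echo "auto-resolve" ;;
  esac
}

decide security   # prints "escalate"
decide scope      # prints "auto-resolve"
```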
---
## `/learn`
This is my **institutional memory mode**.
gstack learns from every session. Patterns, pitfalls, preferences, architectural decisions... they accumulate in `~/.gstack/projects/$SLUG/learnings.jsonl`. Each learning has a confidence score, source attribution, and the files it references.
`/learn` lets you see what gstack has absorbed, search for specific patterns, prune stale entries (when referenced files no longer exist), and export learnings for team sharing. The real magic is in other skills... they automatically search learnings before making recommendations, and display "Prior learning applied" when a past insight is relevant.
```
You: /learn
Claude: 23 learnings for this project (14 high confidence, 6 medium, 3 low)
Top patterns:
- [9/10] API responses always wrapped in { data, error } envelope
- [8/10] Tests use factory helpers in test/support/factories.ts
- [8/10] All DB queries go through repository pattern, never direct
3 potentially stale (referenced files deleted):
- "auth middleware uses JWT" — auth/middleware.ts was deleted
[Prune these? Y/N]
```
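The staleness check is simple: a learning is stale when a file it references no longer exists. A minimal sketch; the pipe-separated lines below are an illustrative stand-in for the real `learnings.jsonl` entries:

```shell
dir=$(mktemp -d)
touch "$dir/factories.ts"   # this referenced file still exists

# note|referenced-file (illustrative format, not the real JSONL schema)
cat > "$dir/learnings.txt" <<EOF
Tests use factory helpers|$dir/factories.ts
auth middleware uses JWT|$dir/middleware.ts
EOF

stale=""
while IFS='|' read -r note file; do
  [ -e "$file" ] || stale="$stale$note; "   # flag entries whose file is gone
done < "$dir/learnings.txt"
echo "Potentially stale: $stale"
```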
---
## `/connect-chrome`
This is my **co-presence mode**.
`/browse` runs headless by default. You don't see what the agent sees. `/connect-chrome` changes that. It launches your actual Chrome browser controlled by Playwright, with the gstack Side Panel extension auto-loaded. You watch every action in real time... same screen, same window.
A subtle green shimmer at the top edge tells you which Chrome window gstack controls. All existing browse commands work unchanged. The Side Panel shows a live activity feed of every command and a chat sidebar where you can direct Claude with natural language instructions.
```
You: /connect-chrome
Claude: Launched Chrome with Side Panel extension.
Green shimmer indicates the controlled window.
All $B commands now run in headed mode.
Type in the Side Panel to direct the browser agent.
```
---
## `/setup-deploy`
One-time deploy configuration. Run this before your first `/land-and-deploy`.
It auto-detects your deploy platform (Fly.io, Render, Vercel, Netlify, Heroku, GitHub Actions, or custom), discovers your production URL, health check endpoints, and deploy status commands. Writes everything to CLAUDE.md so all future deploys are automatic.
```
You: /setup-deploy
Claude: Detected: Fly.io (fly.toml found)
Production URL: https://myapp.fly.dev
Health check: /health → expects 200
Deploy command: fly deploy
Status command: fly status
Written to CLAUDE.md. Run /land-and-deploy when ready.
```
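Platform detection boils down to probing for well-known config files in the project root. A minimal sketch, with an illustrative (not exhaustive) detection list:

```shell
detect_platform() {
  dir=$1
  if   [ -f "$dir/fly.toml" ];     then echo "Fly.io"
  elif [ -f "$dir/vercel.json" ];  then echo "Vercel"
  elif [ -f "$dir/netlify.toml" ]; then echo "Netlify"
  elif [ -f "$dir/render.yaml" ];  then echo "Render"
  else echo "custom"
  fi
}

proj=$(mktemp -d)
touch "$proj/fly.toml"
detect_platform "$proj"   # prints "Fly.io"
```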
---
## `/codex`
This is my **second opinion mode**.
+1 -1
@@ -1,6 +1,6 @@
{
"name": "gstack",
-"version": "0.14.1.0",
+"version": "0.14.2.0",
"description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
"license": "MIT",
"type": "module",
+51 -34
@@ -675,11 +675,10 @@ $D check --image "$_DESIGN_DIR/variant-A.png" --brief "<the original brief>"
Flag any variants that fail the quality check. Offer to regenerate failures.
-Show each variant inline (Read tool on each PNG) so the user sees them immediately.
-Tell the user: "I've generated design directions. Take a look at the variants above,
-then use the comparison board that just opened in your browser to pick your favorite,
-rate the others, remix elements, and click Submit when you're done."
+**Do NOT show variants inline via Read tool and ask for preferences.** Proceed
+directly to the Comparison Board + Feedback Loop section below. The comparison board
+IS the chooser — it has rating controls, comments, remix/regenerate, and structured
+feedback output. Showing mockups inline is a degraded experience.
### Comparison Board + Feedback Loop
@@ -691,31 +690,42 @@ $D compare --images "$_DESIGN_DIR/variant-A.png,$_DESIGN_DIR/variant-B.png,$_DES
This command generates the board HTML, starts an HTTP server on a random port,
and opens it in the user's default browser. **Run it in the background** with `&`
-because the agent needs to keep running while the user interacts with the board.
+because the server needs to stay running while the user interacts with the board.
-**IMPORTANT: Reading feedback via file polling (not stdout):**
Parse the port from stderr output: `SERVE_STARTED: port=XXXXX`. You need this
for the board URL and for reloading during regeneration cycles.
-The server writes feedback to files next to the board HTML. The agent polls for these:
+**PRIMARY WAIT: AskUserQuestion with board URL**
+After the board is serving, use AskUserQuestion to wait for the user. Include the
+board URL so they can click it if they lost the browser tab:
+"I've opened a comparison board with the design variants:
+http://127.0.0.1:<PORT>/ — Rate them, leave comments, remix
+elements you like, and click Submit when you're done. Let me know when you've
+submitted your feedback (or paste your preferences here). If you clicked
+Regenerate or Remix on the board, tell me and I'll generate new variants."
+**Do NOT use AskUserQuestion to ask which variant the user prefers.** The comparison
+board IS the chooser. AskUserQuestion is just the blocking wait mechanism.
+**After the user responds to AskUserQuestion:**
+Check for feedback files next to the board HTML:
- `$_DESIGN_DIR/feedback.json` — written when user clicks Submit (final choice)
- `$_DESIGN_DIR/feedback-pending.json` — written when user clicks Regenerate/Remix/More Like This
-**Polling loop** (run after launching `$D serve` in background):
```bash
-# Poll for feedback files every 5 seconds (up to 10 minutes)
-for i in $(seq 1 120); do
-  if [ -f "$_DESIGN_DIR/feedback.json" ]; then
-    echo "SUBMIT_RECEIVED"
-    cat "$_DESIGN_DIR/feedback.json"
-    break
-  elif [ -f "$_DESIGN_DIR/feedback-pending.json" ]; then
-    echo "REGENERATE_RECEIVED"
-    cat "$_DESIGN_DIR/feedback-pending.json"
-    rm "$_DESIGN_DIR/feedback-pending.json"
-    break
-  fi
-  sleep 5
-done
+if [ -f "$_DESIGN_DIR/feedback.json" ]; then
+  echo "SUBMIT_RECEIVED"
+  cat "$_DESIGN_DIR/feedback.json"
+elif [ -f "$_DESIGN_DIR/feedback-pending.json" ]; then
+  echo "REGENERATE_RECEIVED"
+  cat "$_DESIGN_DIR/feedback-pending.json"
+  rm "$_DESIGN_DIR/feedback-pending.json"
+else
+  echo "NO_FEEDBACK_FILE"
+fi
```
The feedback JSON has this shape:
@@ -729,24 +739,30 @@ The feedback JSON has this shape:
}
```
-**If `feedback-pending.json` found (`"regenerated": true`):**
+**If `feedback.json` found:** The user clicked Submit on the board.
+Read `preferred`, `ratings`, `comments`, `overall` from the JSON. Proceed with
+the approved variant.
+**If `feedback-pending.json` found:** The user clicked Regenerate/Remix on the board.
1. Read `regenerateAction` from the JSON (`"different"`, `"match"`, `"more_like_B"`,
`"remix"`, or custom text)
2. If `regenerateAction` is `"remix"`, read `remixSpec` (e.g. `{"layout":"A","colors":"B"}`)
3. Generate new variants with `$D iterate` or `$D variants` using updated brief
4. Create new board: `$D compare --images "..." --output "$_DESIGN_DIR/design-board.html"`
-5. Parse the port from the `$D serve` stderr output (`SERVE_STARTED: port=XXXXX`),
-   then reload the board in the user's browser (same tab):
+5. Reload the board in the user's browser (same tab):
`curl -s -X POST http://127.0.0.1:PORT/api/reload -H 'Content-Type: application/json' -d '{"html":"$_DESIGN_DIR/design-board.html"}'`
-6. The board auto-refreshes. **Poll again** for the next feedback file.
-7. Repeat until `feedback.json` appears (user clicked Submit).
+6. The board auto-refreshes. **AskUserQuestion again** with the same board URL to
+   wait for the next round of feedback. Repeat until `feedback.json` appears.
-**If `feedback.json` found (`"regenerated": false`):**
-1. Read `preferred`, `ratings`, `comments`, `overall` from the JSON
-2. Proceed with the approved variant
+**If `NO_FEEDBACK_FILE`:** The user typed their preferences directly in the
+AskUserQuestion response instead of using the board. Use their text response
+as the feedback.
-**If `$D serve` fails or no feedback within 10 minutes:** Fall back to AskUserQuestion:
-"I've opened the design board. Which variant do you prefer? Any feedback?"
+**POLLING FALLBACK:** Only use polling if `$D serve` fails (no port available).
+In that case, show each variant inline using the Read tool (so the user can see them),
+then use AskUserQuestion:
+"The comparison board server failed to start. I've shown the variants above.
+Which do you prefer? Any feedback?"
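When the board path does produce a `feedback.json`, reading the Submit payload needs no special tooling. A minimal sketch using `sed` on an illustrative payload; with `jq` installed, `jq -r .preferred "$_DESIGN_DIR/feedback.json"` is the sturdier choice (the `preferred` key comes from the feedback shape documented here, the sample file contents are illustrative):

```shell
dir=$(mktemp -d)
# Illustrative Submit payload (real files carry more fields: ratings, comments, ...).
cat > "$dir/feedback.json" <<'EOF'
{ "preferred": "variant-B", "overall": "B's layout, A's palette" }
EOF

# Flat extraction of the "preferred" field; assumes the key/value sit on one line.
preferred=$(sed -n 's/.*"preferred": *"\([^"]*\)".*/\1/p' "$dir/feedback.json")
echo "User chose: $preferred"
```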
**After receiving feedback (any path):** Output a clear summary confirming
what was understood:
@@ -1060,6 +1076,7 @@ Follow the AskUserQuestion format from the Preamble above. Additional rules for
* **Map to Design Principles above.** One sentence connecting your recommendation to a specific principle.
* Label with issue NUMBER + option LETTER (e.g., "3A", "3B").
* **Escape hatch:** If a section has no issues, say so and move on. If a gap has an obvious fix, state what you'll add and move on — don't waste a question on it. Only use AskUserQuestion when there is a genuine design choice with meaningful tradeoffs.
* **NEVER use AskUserQuestion to ask which variant the user prefers.** Always create a comparison board first (`$D compare --serve`) and open it in the browser. The board has rating controls, comments, remix/regenerate buttons, and structured feedback output. Use AskUserQuestion ONLY to notify the user the board is open and wait for them to finish — not to present variants inline and ask "which do you prefer?" That is a degraded experience.
## Required Outputs
+5 -5
@@ -208,11 +208,10 @@ $D check --image "$_DESIGN_DIR/variant-A.png" --brief "<the original brief>"
Flag any variants that fail the quality check. Offer to regenerate failures.
-Show each variant inline (Read tool on each PNG) so the user sees them immediately.
-Tell the user: "I've generated design directions. Take a look at the variants above,
-then use the comparison board that just opened in your browser to pick your favorite,
-rate the others, remix elements, and click Submit when you're done."
+**Do NOT show variants inline via Read tool and ask for preferences.** Proceed
+directly to the Comparison Board + Feedback Loop section below. The comparison board
+IS the chooser — it has rating controls, comments, remix/regenerate, and structured
+feedback output. Showing mockups inline is a degraded experience.
{{DESIGN_SHOTGUN_LOOP}}
@@ -337,6 +336,7 @@ Follow the AskUserQuestion format from the Preamble above. Additional rules for
* **Map to Design Principles above.** One sentence connecting your recommendation to a specific principle.
* Label with issue NUMBER + option LETTER (e.g., "3A", "3B").
* **Escape hatch:** If a section has no issues, say so and move on. If a gap has an obvious fix, state what you'll add and move on — don't waste a question on it. Only use AskUserQuestion when there is a genuine design choice with meaningful tradeoffs.
* **NEVER use AskUserQuestion to ask which variant the user prefers.** Always create a comparison board first (`$D compare --serve`) and open it in the browser. The board has rating controls, comments, remix/regenerate buttons, and structured feedback output. Use AskUserQuestion ONLY to notify the user the board is open and wait for them to finish — not to present variants inline and ask "which do you prefer?" That is a degraded experience.
## Required Outputs
+46 -29
@@ -855,31 +855,42 @@ $D compare --images "$_DESIGN_DIR/variant-A.png,$_DESIGN_DIR/variant-B.png,$_DES
This command generates the board HTML, starts an HTTP server on a random port,
and opens it in the user's default browser. **Run it in the background** with \`&\`
-because the agent needs to keep running while the user interacts with the board.
+because the server needs to stay running while the user interacts with the board.
-**IMPORTANT: Reading feedback via file polling (not stdout):**
Parse the port from stderr output: \`SERVE_STARTED: port=XXXXX\`. You need this
for the board URL and for reloading during regeneration cycles.
-The server writes feedback to files next to the board HTML. The agent polls for these:
+**PRIMARY WAIT: AskUserQuestion with board URL**
+After the board is serving, use AskUserQuestion to wait for the user. Include the
+board URL so they can click it if they lost the browser tab:
+"I've opened a comparison board with the design variants:
+http://127.0.0.1:<PORT>/ — Rate them, leave comments, remix
+elements you like, and click Submit when you're done. Let me know when you've
+submitted your feedback (or paste your preferences here). If you clicked
+Regenerate or Remix on the board, tell me and I'll generate new variants."
+**Do NOT use AskUserQuestion to ask which variant the user prefers.** The comparison
+board IS the chooser. AskUserQuestion is just the blocking wait mechanism.
+**After the user responds to AskUserQuestion:**
+Check for feedback files next to the board HTML:
- \`$_DESIGN_DIR/feedback.json\` — written when user clicks Submit (final choice)
- \`$_DESIGN_DIR/feedback-pending.json\` — written when user clicks Regenerate/Remix/More Like This
-**Polling loop** (run after launching \`$D serve\` in background):
\`\`\`bash
-# Poll for feedback files every 5 seconds (up to 10 minutes)
-for i in $(seq 1 120); do
-  if [ -f "$_DESIGN_DIR/feedback.json" ]; then
-    echo "SUBMIT_RECEIVED"
-    cat "$_DESIGN_DIR/feedback.json"
-    break
-  elif [ -f "$_DESIGN_DIR/feedback-pending.json" ]; then
-    echo "REGENERATE_RECEIVED"
-    cat "$_DESIGN_DIR/feedback-pending.json"
-    rm "$_DESIGN_DIR/feedback-pending.json"
-    break
-  fi
-  sleep 5
-done
+if [ -f "$_DESIGN_DIR/feedback.json" ]; then
+  echo "SUBMIT_RECEIVED"
+  cat "$_DESIGN_DIR/feedback.json"
+elif [ -f "$_DESIGN_DIR/feedback-pending.json" ]; then
+  echo "REGENERATE_RECEIVED"
+  cat "$_DESIGN_DIR/feedback-pending.json"
+  rm "$_DESIGN_DIR/feedback-pending.json"
+else
+  echo "NO_FEEDBACK_FILE"
+fi
\`\`\`
The feedback JSON has this shape:
@@ -893,24 +904,30 @@ The feedback JSON has this shape:
}
\`\`\`
-**If \`feedback-pending.json\` found (\`"regenerated": true\`):**
+**If \`feedback.json\` found:** The user clicked Submit on the board.
+Read \`preferred\`, \`ratings\`, \`comments\`, \`overall\` from the JSON. Proceed with
+the approved variant.
+**If \`feedback-pending.json\` found:** The user clicked Regenerate/Remix on the board.
1. Read \`regenerateAction\` from the JSON (\`"different"\`, \`"match"\`, \`"more_like_B"\`,
\`"remix"\`, or custom text)
2. If \`regenerateAction\` is \`"remix"\`, read \`remixSpec\` (e.g. \`{"layout":"A","colors":"B"}\`)
3. Generate new variants with \`$D iterate\` or \`$D variants\` using updated brief
4. Create new board: \`$D compare --images "..." --output "$_DESIGN_DIR/design-board.html"\`
-5. Parse the port from the \`$D serve\` stderr output (\`SERVE_STARTED: port=XXXXX\`),
-   then reload the board in the user's browser (same tab):
+5. Reload the board in the user's browser (same tab):
\`curl -s -X POST http://127.0.0.1:PORT/api/reload -H 'Content-Type: application/json' -d '{"html":"$_DESIGN_DIR/design-board.html"}'\`
-6. The board auto-refreshes. **Poll again** for the next feedback file.
-7. Repeat until \`feedback.json\` appears (user clicked Submit).
+6. The board auto-refreshes. **AskUserQuestion again** with the same board URL to
+   wait for the next round of feedback. Repeat until \`feedback.json\` appears.
-**If \`feedback.json\` found (\`"regenerated": false\`):**
-1. Read \`preferred\`, \`ratings\`, \`comments\`, \`overall\` from the JSON
-2. Proceed with the approved variant
+**If \`NO_FEEDBACK_FILE\`:** The user typed their preferences directly in the
+AskUserQuestion response instead of using the board. Use their text response
+as the feedback.
-**If \`$D serve\` fails or no feedback within 10 minutes:** Fall back to AskUserQuestion:
-"I've opened the design board. Which variant do you prefer? Any feedback?"
+**POLLING FALLBACK:** Only use polling if \`$D serve\` fails (no port available).
+In that case, show each variant inline using the Read tool (so the user can see them),
+then use AskUserQuestion:
+"The comparison board server failed to start. I've shown the variants above.
+Which do you prefer? Any feedback?"
**After receiving feedback (any path):** Output a clear summary confirming
what was understood: