diff --git a/CHANGELOG.md b/CHANGELOG.md
index d6bedcde..409e3d4b 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,46 +2,62 @@
 ## [1.3.0.0] - 2026-04-19

-## **Every new CLI wired to a slash command.**
-## **Zero orphan binaries ship in v1.3.**
+## **Your design skills learn your taste.**
+## **Your session state survives a laptop close.**

-v1.3 ships two new binaries (`gstack-model-benchmark`, `gstack-taste-update`) and one new skill (`/benchmark-models`). The delta from v1.2 isn't just "more features." It's that every new primitive is discoverable from a `/command`, not buried in a CHANGELOG bullet nobody reads. Before this cut, `gstack-model-benchmark` existed but no skill called it, so most users would never find it. Now `/benchmark-models` walks you through a cross-model comparison, and `gstack-taste-update` is already wired into `/design-shotgun`'s approval flow. First multi-provider benchmark in any agent framework, and it's one slash command away.
+v1.3 is about the things you do every day. `/design-shotgun` now remembers which fonts, colors, and layouts you approve across sessions, so the next round of variants leans toward your actual taste instead of resetting to Inter every time. `/design-consultation` asks a "what's the one thing someone will remember?" forcing question in Phase 1 and runs a "would a human designer be embarrassed by this?" self-gate in Phase 5, so AI-slop output gets discarded before it reaches you. Continuous checkpoint mode (flip it on with `gstack-config set checkpoint_mode continuous`) auto-commits work with `WIP:` prefixes, so closing your laptop mid-refactor doesn't vaporize your decisions and `/context-restore` picks you right back up.

 ### The numbers that matter

-Headline from this branch's review audit against `origin/main` (33 commits). Reproducible: `git log origin/main..HEAD --oneline` for the commit set, `bun test test/taste-engine.test.ts test/benchmark-cli.test.ts test/skill-e2e-benchmark-providers.test.ts` for the test count, `grep -rn "gstack-model-benchmark\|gstack-taste-update" --include="*.tmpl"` for wiring status.
+Setup: these numbers come from the v1.3 feature surface. Reproducible via `grep "Generate a different" design-shotgun/SKILL.md.tmpl` for the anti-convergence gate, `ls model-overlays/` for the overlay set, `cat bin/gstack-taste-update` for the taste schema, and `gstack-config get checkpoint_mode` for the runtime wiring.

-| Metric | BEFORE (initial v1.2 scope) | AFTER (v1.3) | Δ |
-|--------------------------------------------------|------------------------------|----------------------|-------------|
-| **New CLIs wired to a /skill** | 1 of 2 (50%) | **2 of 2 (100%)** | **+1** |
-| **Deterministic tests for v1.3 CLIs** | 0 | **32** | **+32** |
-| **Live-API adapter E2E (gated on `EVALS=1`)** | 0 | **8** | **+8** |
-| **Real adapter bugs caught by new tests** | 0 | **1** (codex `--skip-git-repo-check`) | **+1** |
-| **Preamble composition root** | 740 lines | **~100 lines** | **-86%** |
-| **Models benchmarkable in one command** | 1 (Claude only) | **3** (Claude, GPT, Gemini) | **+2** |
+| Metric | BEFORE v1.3 | AFTER v1.3 | Δ |
+|--------------------------------------------------|------------------------------|-----------------------------------------|-------------|
+| **Design-variant convergence gate** | no requirement | **3 axes required** (font + palette + layout must differ) | **+3** |
+| **AI-slop font blacklist** | ~8 fonts | **10+** (added Space Grotesk, system-ui as primary) | **+2** |
+| **Taste memory across `/design-shotgun` rounds** | none | **per-project JSON, 5%/wk decay** | **new** |
+| **Session state after mid-refactor crash** | lost since last manual commit | **auto-WIP commits with structured body** (opt-in) | **new** |
+| **`/context-restore` sources** | markdown files only | **markdown + `[gstack-context]` from WIP commits** | **+1** |
+| **Models with behavioral overlays** | 1 (Claude implicit) | **5** (claude, gpt, gpt-5.4, gemini, o-series) | **+4** |

-The single most striking number: the new E2E suite caught a real codex adapter bug (`--skip-git-repo-check` missing) on its first run. That bug would have shipped silently, then surfaced later as a cryptic "Not inside a trusted directory" error to anyone running `gstack-model-benchmark` from a temp dir. One test, one regression caught, before a user ever hit it.
+The single most striking row: closing your laptop mid-session used to cost you every decision since the last manual commit. Now, with continuous mode on, `WIP:` commits land at every meaningful step with a structured `[gstack-context]` body (decisions made, remaining work, failed approaches). `/context-restore` reads those commits and hands your next session the exact state you left.

 ### What this means for gstack users

-If you're a YC founder or solo builder shipping methodology skills from one laptop, `/benchmark-models` answers "is my skill better on Opus, GPT-5.4, or Gemini" with a real benchmark table, not vibes. The design taste engine remembers which fonts, colors, and aesthetics you approve in `/design-shotgun`, so next round's variants lean toward your actual taste instead of resetting to Inter every time. Continuous checkpoint mode (opt-in, local by default) means you can close your laptop mid-refactor and `/context-restore` picks you up from a WIP commit with decisions and remaining work intact, not a stale notes file. Run `/gstack-upgrade` and try `/benchmark-models` on the skill you use most this week.
+If you're a solo builder or founder shipping a product one sprint at a time, `/design-shotgun` stops handing you the same four variants every time and starts learning which ones you pick. `/design-consultation` stops defaulting to Inter + gray + rounded-corners and forces itself to answer "what's memorable?" before it finishes.
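To make the taste memory concrete, here is a rough sketch of how Laplace-smoothed approval counts and the 5%-per-week decay could combine into a single confidence score. The interface and field names are illustrative only, not the actual `taste-profile.json` schema:

```typescript
// Illustrative sketch — hypothetical field names, not the real
// taste-profile.json schema. Shows how add-one (Laplace) smoothing plus
// a 5%-per-week decay could turn approve/reject history into confidence.
interface TastePref {
  approvals: number;
  rejections: number;
  weeksSinceLastSignal: number;
}

function confidence(p: TastePref): number {
  // Laplace smoothing: one approval never reads as 100% certainty.
  const smoothed = (p.approvals + 1) / (p.approvals + p.rejections + 2);
  // 5% decay per week, so stale preferences fade instead of pinning forever.
  return smoothed * Math.pow(0.95, p.weeksSinceLastSignal);
}

// A font approved 3 times and rejected once, last signal 2 weeks ago:
console.log(confidence({ approvals: 3, rejections: 1, weeksSinceLastSignal: 2 }).toFixed(3)); // "0.602"
```

Under this sketch, a preference with no signal at all sits at the neutral prior of 0.5, which is why approving even one variant gives the engine a starting signal to lean on.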
+Continuous checkpoint mode means your session state survives crashes, context switches, and forgotten laptops. Run `/gstack-upgrade`, try `/design-shotgun` on your next landing page, and approve a variant so the taste engine has a starting signal.

 ### Itemized changes

 ### Added

-- **Per-model behavioral overlays via `--model` flag.** Different LLMs need different nudges. Run `bun run gen:skill-docs --model gpt-5.4` and every generated skill picks up GPT-tuned behavioral patches. Five overlays ship in `model-overlays/`: claude (todo discipline), gpt (anti-termination), gpt-5.4 (anti-verbosity, inherits gpt), gemini (conciseness), o-series (structured output). Overlay files are plain markdown — edit in place, no code changes. `MODEL_OVERLAY: {model}` line in the preamble output tells you which one is active. Defaults to claude. Missing overlay file → empty string (graceful), no error.
-- **Continuous checkpoint mode (opt-in, local by default).** Set `gstack-config set checkpoint_mode continuous` and skills will auto-commit your work as you go with `WIP: ` prefix and a structured `[gstack-context]` body block (decisions, remaining work, tried-and-failed approaches, current skill). Survives Claude Code crashes. Push is opt-in via `checkpoint_push=true` — defaults to local-only so you don't accidentally trigger CI on every WIP commit. `/context-restore` (formerly `/checkpoint resume` pre-v1.1.3) now reads both the markdown saved-context files AND the `[gstack-context]` blocks from WIP commits to reconstruct session state.
-- **`/ship` non-destructively squashes WIP commits** before creating the PR (when continuous mode is active). Uses `git rebase --autosquash` scoped to WIP commits only — preserves any non-WIP commits on the branch. Refuses to blind soft-reset when non-WIP work is mixed in (would have caused non-fast-forward push). Aborts with BLOCKED status on conflict instead of destroying real work.
-- **Feature discovery prompt after upgrade.** When `JUST_UPGRADED` fires, gstack now offers to enable new features once per user (per-feature marker files at `~/.gstack/.feature-prompted-{name}`). Skipped entirely in spawned sessions (OpenClaw orchestrator). No more silent features that never get discovered.
-- **Context health soft directive (T2+ skills).** During long-running skills (/qa, /investigate, /cso), gstack now nudges you to write periodic `[PROGRESS]` summaries. Self-monitoring during 50+ tool-call sessions. No fake thresholds — soft directive that the model self-applies. Progress reporting never mutates git state.
-- **`gstack-model-benchmark` CLI.** Run the same prompt across Claude, GPT, and Gemini, compare latency/tokens/cost/quality. Per-provider auth detection, pricing tables, tool-compatibility map, parallel execution, per-provider error isolation. Quality scoring via Anthropic SDK as the stable judge (`--judge` flag). Output as table, JSON, or markdown. The first multi-provider benchmark in any agent framework — every other tool guesses which model is best, gstack measures it.
-- **Design taste engine.** Persistent cross-session taste profile at `~/.gstack/projects/$SLUG/taste-profile.json`. Tracks fonts, colors, layouts, and aesthetic directions you approve and reject across sessions. Confidence decays 5% per week. Design-consultation and design-shotgun now factor in your demonstrated preferences. Schema migration handles legacy approved.json. New `gstack-taste-update` CLI updates the profile after design-shotgun decisions.
-- **Anti-slop design constraints.** Design-consultation now asks "What's the one thing someone will remember?" as a forcing question. Phase 5 self-gate: "Would a human designer be embarrassed by this?" — discards and regenerates if yes. Anti-convergence directive in design-shotgun: each variant must use a different font, palette, and layout, or one of them failed. Space Grotesk added to the overused fonts list (it's the new "safe alternative to Inter" trap). system-ui-as-primary-font added to the AI slop blacklist.
-- **`gstack-config list` and `gstack-config defaults`** subcommands. `list` shows all config keys with their current value AND source (set/default). `defaults` shows just the defaults table. Fixes the prior gap where `get` returned empty for missing keys instead of falling back to the documented defaults. Telemetry default aligned: header and runtime both say `off` now (previously mismatched).
-- **`gstack-config checkpoint_mode` and `checkpoint_push` keys.** New config knobs for continuous checkpoint mode. Both default to safe values (`explicit` mode, no auto-push).
-- **New `/benchmark-models` skill.** Wraps `gstack-model-benchmark` in an interactive flow: pick a prompt (an existing SKILL.md, inline text, or file path), confirm providers (dry-run shows auth status per provider), decide on `--judge` (adds ~$0.05 for quality scoring), run, interpret. Trigger phrases: "compare models", "model shootout", "which model is best". Separate from `/benchmark` (which measures web page performance) — different surface, different domain.
-- **`gstack-model-benchmark --dry-run`.** Offline validation mode. Validates the provider list, resolves per-adapter auth, echoes the resolved flag values, and exits without invoking any provider CLI. Zero-cost pre-flight for CI pipelines and for catching auth drift before starting a paid benchmark run.
+#### Design skills that stop looking like AI
+
+- **Anti-slop design constraints.** `/design-consultation` now asks "What's the one thing someone will remember?" as a forcing question in Phase 1, and runs a "Would a human designer be embarrassed by this?" self-gate in Phase 5 — output that fails the gate gets discarded and regenerated. `/design-shotgun` gets an anti-convergence directive: each variant must use a different font, palette, and layout, or one of them failed. Space Grotesk (the new "safe alternative to Inter") was added to the overused-fonts list, and `system-ui` as a primary font was added to the AI-slop blacklist.
+- **Design taste engine.** Your approvals and rejections in `/design-shotgun` get written to a persistent per-project taste profile at `~/.gstack/projects/$SLUG/taste-profile.json`. Tracks fonts, colors, layouts, and aesthetic directions with Laplace-smoothed confidence. Confidence decays 5% per week, so stale preferences fade. `/design-consultation` and `/design-shotgun` both factor in your demonstrated preferences on future runs, so variant #3 this month remembers what you liked in variant #1 last month.
+
+#### Session state that survives a crash
+
+- **Continuous checkpoint mode (opt-in, local by default).** Flip it on with `gstack-config set checkpoint_mode continuous` and skills auto-commit your work with a `WIP: ` prefix and a structured `[gstack-context]` body (decisions made, remaining work, failed approaches). Close your laptop mid-refactor and your state survives. Push is opt-in via `checkpoint_push=true`; the default is local-only, so you don't accidentally trigger CI on every WIP commit.
+- **`/context-restore` reads WIP commits.** In addition to the markdown saved-context files, `/context-restore` now parses `[gstack-context]` blocks from WIP commits on the current branch. Your next session starts with the exact state you left, not a stale notes file.
+- **`/ship` non-destructively squashes WIP commits** before creating the PR. Uses `git rebase --autosquash` scoped to WIP commits only. Non-WIP commits on the branch are preserved. Aborts on conflict with a `BLOCKED` status instead of destroying real work. So you can go wild with `WIP:` commits all week and still ship a clean, bisectable PR.
+
+#### Quality-of-life
+
+- **Feature discovery prompt after upgrade.** When `JUST_UPGRADED` fires, gstack offers to enable new features once per user (per-feature marker files at `~/.gstack/.feature-prompted-{name}`). Skipped entirely in spawned sessions. No more silent features that never get discovered.
+- **Context health soft directive (T2+ skills).** During long-running skills (`/qa`, `/investigate`, `/cso`), gstack now nudges you to write periodic `[PROGRESS]` summaries and to stop and reassess if you notice you're going in circles. Self-monitoring for 50+ tool-call sessions. No fake thresholds, no enforcement. Progress reports never mutate git state.
+
+#### Cross-host support
+
+- **Per-model behavioral overlays via `--model` flag.** Different LLMs need different nudges. Run `bun run gen:skill-docs --model gpt-5.4` and every generated skill picks up GPT-tuned behavioral patches. Five overlays ship in `model-overlays/`: claude (todo-list discipline), gpt (anti-termination + completeness), gpt-5.4 (anti-verbosity, inherits gpt), gemini (conciseness), o-series (structured output). Overlay files are plain markdown — edit in place, no code changes. `MODEL_OVERLAY: {model}` prints in the preamble output so you know which one is active.
+
+#### Config
+
+- **`gstack-config list` and `defaults`** subcommands. `list` shows all config keys with current value AND source (user-set vs default). `defaults` shows the defaults table. Fixes the prior gap where `get` returned empty for missing keys instead of falling back to the documented defaults.
+- **`checkpoint_mode` and `checkpoint_push` config keys.** New knobs for continuous checkpoint mode. Both default to safe values (`explicit` mode, no auto-push).
+
+#### Power-user / internal
+
+- **`gstack-model-benchmark` CLI + `/benchmark-models` skill.** Run the same prompt across Claude, GPT (via Codex CLI), and Gemini side by side. Compares latency, tokens, cost, and optionally output quality via an Anthropic SDK judge (`--judge`, ~$0.05/run). Per-provider auth detection, pricing tables, tool-compatibility map, parallel execution, per-provider error isolation. Output as table, JSON, or markdown. `--dry-run` validates flags and auth without spending API calls. `/benchmark-models` wraps the CLI in an interactive flow (pick prompt → confirm providers → decide on judge → run → interpret) for when you want to know "which model is actually best for my `/qa` skill" with data instead of vibes.

 ### Changed