refactor: rename qa-design-review → design-review

The "qa-" prefix was confusing — this is the live-site design audit with fix loop, not a QA-only report. Rename directory and update all references across docs, tests, scripts, and skill templates. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-05 05:05:08 +02:00 · 2026-03-17 20:23:05 -07:00
parent 28becb3b39
commit 56fa6d9cc2
15 changed files with 51 additions and 51 deletions
@@ -202,9 +202,9 @@ Templates contain the workflows, tips, and examples that require human judgment.
 | `{{BROWSE_SETUP}}` | `gen-skill-docs.ts` | Binary discovery + setup instructions |
 | `{{BASE_BRANCH_DETECT}}` | `gen-skill-docs.ts` | Dynamic base branch detection for PR-targeting skills (ship, review, qa, plan-ceo-review) |
 | `{{QA_METHODOLOGY}}` | `gen-skill-docs.ts` | Shared QA methodology block for /qa and /qa-only |
-| `{{DESIGN_METHODOLOGY}}` | `gen-skill-docs.ts` | Shared design audit methodology for /plan-design-review and /qa-design-review |
+| `{{DESIGN_METHODOLOGY}}` | `gen-skill-docs.ts` | Shared design audit methodology for /plan-design-review and /design-review |
 | `{{REVIEW_DASHBOARD}}` | `gen-skill-docs.ts` | Review Readiness Dashboard for /ship pre-flight |
-| `{{TEST_BOOTSTRAP}}` | `gen-skill-docs.ts` | Test framework detection, bootstrap, CI/CD setup for /qa, /ship, /qa-design-review |
+| `{{TEST_BOOTSTRAP}}` | `gen-skill-docs.ts` | Test framework detection, bootstrap, CI/CD setup for /qa, /ship, /design-review |

 This is structurally sound — if a command exists in code, it appears in docs. If it doesn't exist, it can't appear.

@@ -53,7 +53,7 @@ gstack/
 │   └── skill-e2e.test.ts         # Tier 2: E2E via claude -p (~$3.85/run)
 ├── qa-only/         # /qa-only skill (report-only QA, no fixes)
 ├── plan-design-review/  # /plan-design-review skill (report-only design audit)
-├── qa-design-review/    # /qa-design-review skill (design audit + fix loop)
+├── design-review/    # /design-review skill (design audit + fix loop)
 ├── ship/            # Ship workflow skill
 ├── review/          # PR review skill
 ├── plan-ceo-review/ # /plan-ceo-review skill
@@ -91,7 +91,7 @@ One feature. Seven commands. The agent reframed the product, ran an 80-item desi
 | `/browse` | **QA Engineer** | Give the agent eyes. Real Chromium browser, real clicks, real screenshots. ~100ms per command. |
 | `/qa` | **QA Lead** | Test your app, find bugs, fix them with atomic commits, re-verify. Auto-generates regression tests for every fix. |
 | `/qa-only` | **QA Reporter** | Same methodology as /qa but report only. Use when you want a pure bug report without code changes. |
-| `/qa-design-review` | **Designer Who Codes** | Same audit as /plan-design-review, then fixes what it finds. Atomic commits, before/after screenshots. |
+| `/design-review` | **Designer Who Codes** | Same audit as /plan-design-review, then fixes what it finds. Atomic commits, before/after screenshots. |
 | `/setup-browser-cookies` | **Session Manager** | Import cookies from your real browser (Chrome, Arc, Brave, Edge) into the headless session. Test authenticated pages. |
 | `/retro` | **Eng Manager** | Team-aware weekly retro. Per-person breakdowns, shipping streaks, test health trends, growth opportunities. |
 | `/document-release` | **Technical Writer** | Update all project docs to match what you just shipped. Catches stale READMEs automatically. |
@@ -102,7 +102,7 @@ One feature. Seven commands. The agent reframed the product, ran an 80-item desi

 ## What's new and why it matters

-**Design is at the heart.** `/design-consultation` doesn't just pick fonts. It researches what's out there in your space, proposes safe choices AND creative risks, generates realistic mockups of your actual product, and writes `DESIGN.md` — and then `/qa-design-review` and `/plan-eng-review` read what you chose. Design decisions flow through the whole system.
+**Design is at the heart.** `/design-consultation` doesn't just pick fonts. It researches what's out there in your space, proposes safe choices AND creative risks, generates realistic mockups of your actual product, and writes `DESIGN.md` — and then `/design-review` and `/plan-eng-review` read what you chose. Design decisions flow through the whole system.

 **`/qa` was a massive unlock.** It let me go from 6 to 12 parallel workers. Claude Code saying *"I SEE THE ISSUE"* and then actually fixing it, generating a regression test, and verifying the fix — that changed how I work. The agent has eyes now.

@@ -132,11 +132,11 @@ One person, ten parallel agents, each with the right cognitive mode. That is a d

 Open Claude Code and paste this. Claude does the rest.

-> Install gstack: run `git clone https://github.com/garrytan/gstack.git ~/.claude/skills/gstack && cd ~/.claude/skills/gstack && ./setup` then add a "gstack" section to CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, and lists the available skills: /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /review, /ship, /browse, /qa, /qa-only, /qa-design-review, /setup-browser-cookies, /retro, /document-release. Then ask the user if they also want to add gstack to the current project so teammates get it.
+> Install gstack: run `git clone https://github.com/garrytan/gstack.git ~/.claude/skills/gstack && cd ~/.claude/skills/gstack && ./setup` then add a "gstack" section to CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, and lists the available skills: /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /review, /ship, /browse, /qa, /qa-only, /design-review, /setup-browser-cookies, /retro, /document-release. Then ask the user if they also want to add gstack to the current project so teammates get it.

 ### Step 2: Add to your repo so teammates get it (optional)

-> Add gstack to this project: run `cp -Rf ~/.claude/skills/gstack .claude/skills/gstack && rm -rf .claude/skills/gstack/.git && cd .claude/skills/gstack && ./setup` then add a "gstack" section to this project's CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, lists the available skills: /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /review, /ship, /browse, /qa, /qa-only, /qa-design-review, /setup-browser-cookies, /retro, /document-release, and tells Claude that if gstack skills aren't working, run `cd .claude/skills/gstack && ./setup` to build the binary and register skills.
+> Add gstack to this project: run `cp -Rf ~/.claude/skills/gstack .claude/skills/gstack && rm -rf .claude/skills/gstack/.git && cd .claude/skills/gstack && ./setup` then add a "gstack" section to this project's CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, lists the available skills: /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /review, /ship, /browse, /qa, /qa-only, /design-review, /setup-browser-cookies, /retro, /document-release, and tells Claude that if gstack skills aren't working, run `cd .claude/skills/gstack && ./setup` to build the binary and register skills.

 Real files get committed to your repo (not a submodule), so `git clone` just works. Everything lives inside `.claude/`. Nothing touches your PATH or runs in the background.

@@ -1,11 +1,11 @@
 ---
-name: qa-design-review
-version: 1.0.0
+name: design-review
+version: 2.0.0
 description: |
  Designer's eye QA: finds visual inconsistency, spacing issues, hierarchy problems,
  AI slop patterns, and slow interactions — then fixes them. Iteratively fixes issues
  in source code, committing each fix atomically and re-verifying with before/after
-  screenshots. For report-only mode, use /plan-design-review instead.
+  screenshots. For plan-mode design review (before implementation), use /plan-design-review.
 allowed-tools:
  - Bash
  - Read
@@ -123,7 +123,7 @@ Hey gstack team — ran into this while using /{skill-name}:

 Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"

-# /qa-design-review: Design Audit → Fix → Verify
+# /design-review: Design Audit → Fix → Verify

 You are a senior product designer AND a frontend engineer. Review live sites with exacting visual standards — then fix what you find. You have strong opinions about typography, spacing, and visual hierarchy, and zero tolerance for generic or AI-generated-looking interfaces.

@@ -150,7 +150,7 @@ Look for `DESIGN.md`, `design-system.md`, or similar in the repo root. If found,

 ```bash
 if [ -n "$(git status --porcelain)" ]; then
-  echo "ERROR: Working tree is dirty. Commit or stash changes before running /qa-design-review."
+  echo "ERROR: Working tree is dirty. Commit or stash changes before running /design-review."
  exit 1
 fi
 ```
@@ -766,7 +766,7 @@ Design fixes are typically CSS-only. Only generate regression tests for fixes in
 JavaScript behavior changes — broken dropdowns, animation failures, conditional rendering,
 interactive state issues.

-For CSS-only fixes: skip entirely. CSS regressions are caught by re-running /qa-design-review.
+For CSS-only fixes: skip entirely. CSS regressions are caught by re-running /design-review.

 If the fix involved JS behavior: follow the same procedure as /qa Phase 8e.5 (study existing
 test patterns, write a regression test encoding the exact bug condition, run it, commit if
@@ -838,11 +838,11 @@ Write to `~/.gstack/projects/{slug}/{user}-{branch}-design-audit-{datetime}.md`
 If the repo has a `TODOS.md`:

 1. **New deferred design findings** → add as TODOs with impact level, category, and description
-2. **Fixed findings that were in TODOS.md** → annotate with "Fixed by /qa-design-review on {branch}, {date}"
+2. **Fixed findings that were in TODOS.md** → annotate with "Fixed by /design-review on {branch}, {date}"

 ---

-## Additional Rules (qa-design-review specific)
+## Additional Rules (design-review specific)

 11. **Clean working tree required.** Refuse to start if `git status --porcelain` is non-empty.
 12. **One commit per fix.** Never bundle multiple design fixes into one commit.
@@ -1,11 +1,11 @@
 ---
-name: qa-design-review
-version: 1.0.0
+name: design-review
+version: 2.0.0
 description: |
  Designer's eye QA: finds visual inconsistency, spacing issues, hierarchy problems,
  AI slop patterns, and slow interactions — then fixes them. Iteratively fixes issues
  in source code, committing each fix atomically and re-verifying with before/after
-  screenshots. For report-only mode, use /plan-design-review instead.
+  screenshots. For plan-mode design review (before implementation), use /plan-design-review.
 allowed-tools:
  - Bash
  - Read
@@ -19,7 +19,7 @@ allowed-tools:

 {{PREAMBLE}}

-# /qa-design-review: Design Audit → Fix → Verify
+# /design-review: Design Audit → Fix → Verify

 You are a senior product designer AND a frontend engineer. Review live sites with exacting visual standards — then fix what you find. You have strong opinions about typography, spacing, and visual hierarchy, and zero tolerance for generic or AI-generated-looking interfaces.

@@ -46,7 +46,7 @@ Look for `DESIGN.md`, `design-system.md`, or similar in the repo root. If found,

 ```bash
 if [ -n "$(git status --porcelain)" ]; then
-  echo "ERROR: Working tree is dirty. Commit or stash changes before running /qa-design-review."
+  echo "ERROR: Working tree is dirty. Commit or stash changes before running /design-review."
  exit 1
 fi
 ```
@@ -164,7 +164,7 @@ Design fixes are typically CSS-only. Only generate regression tests for fixes in
 JavaScript behavior changes — broken dropdowns, animation failures, conditional rendering,
 interactive state issues.

-For CSS-only fixes: skip entirely. CSS regressions are caught by re-running /qa-design-review.
+For CSS-only fixes: skip entirely. CSS regressions are caught by re-running /design-review.

 If the fix involved JS behavior: follow the same procedure as /qa Phase 8e.5 (study existing
 test patterns, write a regression test encoding the exact bug condition, run it, commit if
@@ -236,11 +236,11 @@ Write to `~/.gstack/projects/{slug}/{user}-{branch}-design-audit-{datetime}.md`
 If the repo has a `TODOS.md`:

 1. **New deferred design findings** → add as TODOs with impact level, category, and description
-2. **Fixed findings that were in TODOS.md** → annotate with "Fixed by /qa-design-review on {branch}, {date}"
+2. **Fixed findings that were in TODOS.md** → annotate with "Fixed by /design-review on {branch}, {date}"

 ---

-## Additional Rules (qa-design-review specific)
+## Additional Rules (design-review specific)

 11. **Clean working tree required.** Refuse to start if `git status --porcelain` is non-empty.
 12. **One commit per fix.** Never bundle multiple design fixes into one commit.
@@ -219,7 +219,7 @@ eval $(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)
 4. **Apply the design checklist** against the changed files. For each item:
   - **[HIGH] mechanical CSS fix** (`outline: none`, `!important`, `font-size < 16px`): classify as AUTO-FIX
   - **[HIGH/MEDIUM] design judgment needed**: classify as ASK
-   - **[LOW] intent-based detection**: present as "Possible — verify visually or run /qa-design-review"
+   - **[LOW] intent-based detection**: present as "Possible — verify visually or run /design-review"

 5. **Include findings** in the review output under a "Design Review" header, following the output format in the checklist. Design findings merge with code review findings into the same Fix-First flow.

@@ -24,7 +24,7 @@ Each item is tagged with a detection confidence level:

 - **[HIGH]** — Reliably detectable via grep/pattern match. Definitive findings.
 - **[MEDIUM]** — Detectable via pattern aggregation or heuristic. Flag as findings but expect some noise.
- **[LOW]** — Requires understanding visual intent. Present as: "Possible issue — verify visually or run /qa-design-review."
+- **[LOW]** — Requires understanding visual intent. Present as: "Possible issue — verify visually or run /design-review."

 ---

@@ -38,7 +38,7 @@ Each item is tagged with a detection confidence level:
 **ASK** (everything else — requires design judgment):
 - All AI slop findings, typography structure, spacing choices, interaction state gaps, DESIGN.md violations

-**LOW confidence items** → present as "Possible: [description]. Verify visually or run /qa-design-review." Never AUTO-FIX.
+**LOW confidence items** → present as "Possible: [description]. Verify visually or run /design-review." Never AUTO-FIX.

 ---

@@ -55,7 +55,7 @@ Design Review: N issues (X auto-fixable, Y need input, Z possible)
  Recommended fix: suggested fix

 **POSSIBLE (verify visually):**
- [file:line] Possible issue — verify with /qa-design-review
+- [file:line] Possible issue — verify with /design-review
 ```

 If no issues found: `Design Review: No issues found.`
@@ -541,7 +541,7 @@ eval $(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)
 4. **Apply the design checklist** against the changed files. For each item:
   - **[HIGH] mechanical CSS fix** (\`outline: none\`, \`!important\`, \`font-size < 16px\`): classify as AUTO-FIX
   - **[HIGH/MEDIUM] design judgment needed**: classify as ASK
-   - **[LOW] intent-based detection**: present as "Possible — verify visually or run /qa-design-review"
+   - **[LOW] intent-based detection**: present as "Possible — verify visually or run /design-review"

 5. **Include findings** in the review output under a "Design Review" header, following the output format in the checklist. Design findings merge with code review findings into the same Fix-First flow.

@@ -1152,7 +1152,7 @@ function findTemplates(): string[] {
    path.join(ROOT, 'retro', 'SKILL.md.tmpl'),
    path.join(ROOT, 'gstack-upgrade', 'SKILL.md.tmpl'),
    path.join(ROOT, 'plan-design-review', 'SKILL.md.tmpl'),
-    path.join(ROOT, 'qa-design-review', 'SKILL.md.tmpl'),
+    path.join(ROOT, 'design-review', 'SKILL.md.tmpl'),
    path.join(ROOT, 'design-consultation', 'SKILL.md.tmpl'),
    path.join(ROOT, 'document-release', 'SKILL.md.tmpl'),
  ];
@@ -28,7 +28,7 @@ const SKILL_FILES = [
  'plan-eng-review/SKILL.md',
  'setup-browser-cookies/SKILL.md',
  'plan-design-review/SKILL.md',
-  'qa-design-review/SKILL.md',
+  'design-review/SKILL.md',
  'gstack-upgrade/SKILL.md',
  'document-release/SKILL.md',
 ].filter(f => fs.existsSync(path.join(ROOT, f)));
@@ -227,7 +227,7 @@ If the Eng Review is NOT "CLEAR":
   - RECOMMENDATION: Choose C if the change is obviously trivial (< 20 lines, typo fix, config-only); Choose B for larger changes
   - Options: A) Ship anyway  B) Abort — run /plan-eng-review first  C) Change is too small to need eng review
   - If CEO Review is missing, mention as informational ("CEO Review not run — recommended for product changes") but do NOT block
-   - For Design Review: run `eval $(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 3.5, but consider running /plan-design-review for a full visual audit." Still never block.
+   - For Design Review: run `eval $(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 3.5, but consider running /design-review for a full visual audit post-implementation." Still never block.

 3. **If the user chooses A or C,** persist the decision so future `/ship` runs on this branch skip the gate:
   ```bash
@@ -664,7 +664,7 @@ eval $(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)
 4. **Apply the design checklist** against the changed files. For each item:
   - **[HIGH] mechanical CSS fix** (`outline: none`, `!important`, `font-size < 16px`): classify as AUTO-FIX
   - **[HIGH/MEDIUM] design judgment needed**: classify as ASK
-   - **[LOW] intent-based detection**: present as "Possible — verify visually or run /qa-design-review"
+   - **[LOW] intent-based detection**: present as "Possible — verify visually or run /design-review"

 5. **Include findings** in the review output under a "Design Review" header, following the output format in the checklist. Design findings merge with code review findings into the same Fix-First flow.

@@ -70,7 +70,7 @@ If the Eng Review is NOT "CLEAR":
   - RECOMMENDATION: Choose C if the change is obviously trivial (< 20 lines, typo fix, config-only); Choose B for larger changes
   - Options: A) Ship anyway  B) Abort — run /plan-eng-review first  C) Change is too small to need eng review
   - If CEO Review is missing, mention as informational ("CEO Review not run — recommended for product changes") but do NOT block
-   - For Design Review: run `eval $(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 3.5, but consider running /plan-design-review for a full visual audit." Still never block.
+   - For Design Review: run `eval $(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 3.5, but consider running /design-review for a full visual audit post-implementation." Still never block.

 3. **If the user chooses A or C,** persist the decision so future `/ship` runs on this branch skip the gate:
   ```bash
@@ -70,7 +70,7 @@ describe('gen-skill-docs', () => {
    { dir: 'setup-browser-cookies', name: 'setup-browser-cookies' },
    { dir: 'gstack-upgrade', name: 'gstack-upgrade' },
    { dir: 'plan-design-review', name: 'plan-design-review' },
-    { dir: 'qa-design-review', name: 'qa-design-review' },
+    { dir: 'design-review', name: 'design-review' },
    { dir: 'design-consultation', name: 'design-consultation' },
  ];

@@ -84,9 +84,9 @@ export const E2E_TOUCHFILES: Record<string, string[]> = {
  'design-consultation-research': ['design-consultation/**'],
  'design-consultation-existing': ['design-consultation/**'],
  'design-consultation-preview':  ['design-consultation/**'],
-  'plan-design-review-audit':     ['plan-design-review/**'],
-  'plan-design-review-export':    ['plan-design-review/**'],
-  'qa-design-review-fix':         ['qa-design-review/**', 'browse/src/**'],
+  'plan-design-review-plan-mode':   ['plan-design-review/**'],
+  'plan-design-review-no-ui-scope': ['plan-design-review/**'],
+  'design-review-fix':              ['design-review/**', 'browse/src/**'],
 };

 /**
@@ -72,15 +72,15 @@ describe('SKILL.md command validation', () => {
    expect(result.snapshotFlagErrors).toHaveLength(0);
  });

-  test('all $B commands in qa-design-review/SKILL.md are valid browse commands', () => {
-    const skill = path.join(ROOT, 'qa-design-review', 'SKILL.md');
+  test('all $B commands in design-review/SKILL.md are valid browse commands', () => {
+    const skill = path.join(ROOT, 'design-review', 'SKILL.md');
    if (!fs.existsSync(skill)) return;
    const result = validateSkill(skill);
    expect(result.invalid).toHaveLength(0);
  });

-  test('all snapshot flags in qa-design-review/SKILL.md are valid', () => {
-    const skill = path.join(ROOT, 'qa-design-review', 'SKILL.md');
+  test('all snapshot flags in design-review/SKILL.md are valid', () => {
+    const skill = path.join(ROOT, 'design-review', 'SKILL.md');
    if (!fs.existsSync(skill)) return;
    const result = validateSkill(skill);
    expect(result.snapshotFlagErrors).toHaveLength(0);
@@ -205,7 +205,7 @@ describe('Update check preamble', () => {
    'plan-ceo-review/SKILL.md', 'plan-eng-review/SKILL.md',
    'retro/SKILL.md',
    'plan-design-review/SKILL.md',
-    'qa-design-review/SKILL.md',
+    'design-review/SKILL.md',
    'design-consultation/SKILL.md',
    'document-release/SKILL.md',
  ];
@@ -513,7 +513,7 @@ describe('v0.4.1 preamble features', () => {
    'plan-ceo-review/SKILL.md', 'plan-eng-review/SKILL.md',
    'retro/SKILL.md',
    'plan-design-review/SKILL.md',
-    'qa-design-review/SKILL.md',
+    'design-review/SKILL.md',
    'design-consultation/SKILL.md',
    'document-release/SKILL.md',
  ];
@@ -628,7 +628,7 @@ describe('Completeness Principle in generated SKILL.md files', () => {
    'plan-ceo-review/SKILL.md', 'plan-eng-review/SKILL.md',
    'retro/SKILL.md',
    'plan-design-review/SKILL.md',
-    'qa-design-review/SKILL.md',
+    'design-review/SKILL.md',
    'design-consultation/SKILL.md',
    'document-release/SKILL.md',
  ];
@@ -801,8 +801,8 @@ describe('Test Bootstrap ({{TEST_BOOTSTRAP}}) integration', () => {
    expect(content).toContain('Step 2.5');
  });

-  test('TEST_BOOTSTRAP appears in qa-design-review/SKILL.md', () => {
-    const content = fs.readFileSync(path.join(ROOT, 'qa-design-review', 'SKILL.md'), 'utf-8');
+  test('TEST_BOOTSTRAP appears in design-review/SKILL.md', () => {
+    const content = fs.readFileSync(path.join(ROOT, 'design-review', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Test Framework Bootstrap');
  });

@@ -843,10 +843,10 @@ describe('Test Bootstrap ({{TEST_BOOTSTRAP}}) integration', () => {
    expect(content).toContain('100% test coverage');
  });

-  test('WebSearch is in allowed-tools for qa, ship, qa-design-review', () => {
+  test('WebSearch is in allowed-tools for qa, ship, design-review', () => {
    const qa = fs.readFileSync(path.join(ROOT, 'qa', 'SKILL.md'), 'utf-8');
    const ship = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
-    const qaDesign = fs.readFileSync(path.join(ROOT, 'qa-design-review', 'SKILL.md'), 'utf-8');
+    const qaDesign = fs.readFileSync(path.join(ROOT, 'design-review', 'SKILL.md'), 'utf-8');
    expect(qa).toContain('WebSearch');
    expect(ship).toContain('WebSearch');
    expect(qaDesign).toContain('WebSearch');
@@ -869,8 +869,8 @@ describe('Phase 8e.5 regression test generation', () => {
    expect(content).not.toContain('Never modify tests or CI configuration');
  });

-  test('qa-design-review has CSS-aware Phase 8e.5 variant', () => {
-    const content = fs.readFileSync(path.join(ROOT, 'qa-design-review', 'SKILL.md'), 'utf-8');
+  test('design-review has CSS-aware Phase 8e.5 variant', () => {
+    const content = fs.readFileSync(path.join(ROOT, 'design-review', 'SKILL.md'), 'utf-8');
    expect(content).toContain('8e.5. Regression Test (design-review variant)');
    expect(content).toContain('CSS-only');
    expect(content).toContain('test(design): regression test');
@@ -66,7 +66,7 @@ describe('selectTests', () => {
    expect(result.selected).toContain('browse-snapshot');
    expect(result.selected).toContain('qa-quick');
    expect(result.selected).toContain('qa-fix-loop');
-    expect(result.selected).toContain('qa-design-review-fix');
+    expect(result.selected).toContain('design-review-fix');
    expect(result.reason).toBe('diff');
    // Should NOT include unrelated tests
    expect(result.selected).not.toContain('plan-ceo-review');