feat: design outside voices — cross-model design critique (v0.11.3.0) (#347)

* feat(gen-skill-docs): add design outside voices + hard rules resolvers

Add generateDesignOutsideVoices() — parallel Codex + Claude subagent
dispatch for cross-model design critique with litmus scorecard synthesis.
Branches per skillName (plan-design-review, design-review, design-consultation)
with task-specific reasoning effort (high for analytical, medium for creative).

Add generateDesignHardRules() — OpenAI Frontend Skill hard rules + gstack
AI slop blacklist unified into one shared block with classifier step
(landing page vs app UI vs hybrid).

Extract AI_SLOP_BLACKLIST constant from inline prose in generateDesignMethodology()
for DRY. Extend generateDesignReviewLite() with lightweight Codex block.
Extend generateDesignSketch() with outside voices opt-in after wireframe.

Source: OpenAI "Designing Delightful Frontends with GPT-5.4" (Mar 2026)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(design skills): add outside voices + hard rules to all design templates

Insert {{DESIGN_OUTSIDE_VOICES}} in plan-design-review (between Step 0D
and Pass 1), design-review (between Phase 6 and Phase 7), and
design-consultation (between Phase 2 and Phase 3).

Insert {{DESIGN_HARD_RULES}} in plan-design-review Pass 4 and design-review
Phase 3 checklist.

DESIGN_REVIEW_LITE in /ship and /review now includes a Codex design voice
block with litmus checks.

DESIGN_SKETCH in /office-hours now includes outside voices opt-in after
wireframe approval.

Regenerated all SKILL.md files (both Claude and Codex hosts).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add resolver tests + touchfiles for design outside voices

Add 18 test cases across 4 new describe blocks:
- DESIGN_OUTSIDE_VOICES: host guard, skillName branching, reasoning effort
- DESIGN_HARD_RULES: classifier, 3 rule sets, slop blacklist, OpenAI criteria
- DESIGN_SKETCH extended: outside voices step, original wireframe preserved
- DESIGN_REVIEW_LITE extended: Codex block, codex host exclusion

Update touchfiles: add scripts/gen-skill-docs.ts to design skill E2E
test dependencies for accurate diff-based test selection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.11.3.0)

Design outside voices — parallel Codex + Claude subagent for cross-model
design critique with litmus scorecard synthesis. OpenAI hard rules + gstack
slop blacklist unified. Classifier for landing page vs app UI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: generate .agents/ on demand in tests (not checked in since v0.11.2.0)

.agents/ is gitignored since v0.11.2.0 — tests that read Codex-host
SKILL.md files now generate them on demand via `bun run gen-skill-docs.ts
--host codex` before reading. Fixes test failures on fresh clones.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-03-22 20:22:23 -07:00
committed by GitHub
parent b7a3bf108d
commit 4cd4d11cb0
16 changed files with 999 additions and 23 deletions
+65
View File
@@ -423,6 +423,71 @@ If the user said no research, skip entirely and proceed to Phase 3 using your bu
---
## Design Outside Voices (parallel)
Use AskUserQuestion:
> "Want outside design voices? Codex evaluates against OpenAI's design hard rules + litmus checks; Claude subagent does an independent design direction proposal."
>
> A) Yes — run outside design voices
> B) No — proceed without
If user chooses B, skip this step and continue.
**Check Codex availability:**
```bash
which codex 2>/dev/null && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
```
**If Codex is available**, launch both voices simultaneously:
1. **Codex design voice** (via Bash):
```bash
TMPERR_DESIGN=$(mktemp /tmp/codex-design-XXXXXXXX)
codex exec "Given this product context, propose a complete design direction:
- Visual thesis: one sentence describing mood, material, and energy
- Typography: specific font names (not defaults — no Inter/Roboto/Arial/system) + hex colors
- Color system: CSS variables for background, surface, primary text, muted text, accent
- Layout: composition-first, not component-first. First viewport as poster, not document
- Differentiation: 2 deliberate departures from category norms
- Anti-slop: no purple gradients, no 3-column icon grids, no centered everything, no decorative blobs
Be opinionated. Be specific. Do not hedge. This is YOUR design direction — own it." -s read-only -c 'model_reasoning_effort="medium"' --enable web_search_cached 2>"$TMPERR_DESIGN"
```
Use a 5-minute timeout (`timeout: 300000`). After the command completes, read stderr:
```bash
cat "$TMPERR_DESIGN" && rm -f "$TMPERR_DESIGN"
```
2. **Claude design subagent** (via Agent tool):
Dispatch a subagent with this prompt:
"Given this product context, propose a design direction that would SURPRISE. What would the cool indie studio do that the enterprise UI team wouldn't?
- Propose an aesthetic direction, typography stack (specific font names), color palette (hex values)
- 2 deliberate departures from category norms
- What emotional reaction should the user have in the first 3 seconds?
Be bold. Be specific. No hedging."
**Error handling (all non-blocking):**
- **Auth failure:** If stderr contains "auth", "login", "unauthorized", or "API key": "Codex authentication failed. Run `codex login` to authenticate."
- **Timeout:** "Codex timed out after 5 minutes."
- **Empty response:** "Codex returned no response."
- On any Codex error: proceed with Claude subagent output only, tagged `[single-model]`.
- If Claude subagent also fails: "Outside voices unavailable — continuing with primary review."
Present Codex output under a `CODEX SAYS (design direction):` header.
Present subagent output under a `CLAUDE SUBAGENT (design direction):` header.
**Synthesis:** Claude main references both Codex and subagent proposals in the Phase 3 proposal. Present:
- Areas of agreement between all three voices (Claude main + Codex + subagent)
- Genuine divergences as creative alternatives for the user to choose from
- "Codex and I agree on X. Codex suggested Y where I'm proposing Z — here's why..."
**Log the result:**
```bash
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"design-outside-voices","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","source":"SOURCE","commit":"'"$(git rev-parse --short HEAD)"'"}'
```
Replace STATUS with "clean" or "issues_found", SOURCE with "codex+subagent", "codex-only", "subagent-only", or "unavailable".
## Phase 3: The Complete Proposal
This is the soul of the skill. Propose EVERYTHING as one coherent package.
+2
View File
@@ -131,6 +131,8 @@ If the user said no research, skip entirely and proceed to Phase 3 using your bu
---
{{DESIGN_OUTSIDE_VOICES}}
## Phase 3: The Complete Proposal
This is the soul of the skill. Propose EVERYTHING as one coherent package.