feat: /plan-devex-review + /devex-review — DX review skills (v0.15.3.0) (#784)

* feat: add DX framework resolver for shared principles and scoring rubric

New {{DX_FRAMEWORK}} resolver provides compact (~150 lines) shared content
for /plan-devex-review and /devex-review: Addy Osmani's 8 DX principles,
7 characteristics table, 10 cognitive patterns, scoring rubric, and TTHW
benchmarks. Hall of Fame examples loaded on-demand per pass to avoid bloat.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add DX Review row to review dashboard

Adds plan-devex-review and devex-review schema entries to the review
dashboard resolver and placeholder table in the preamble. All existing
SKILL.md files regenerated to include the new DX Review row.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: /plan-devex-review skill — DX plan review with Osmani framework

Plan-stage developer experience review. Rates 8 DX dimensions 0-10:
getting started, API/CLI/SDK design, error messages, docs, upgrade path,
dev environment, community, and DX measurement. Includes developer empathy
simulation, auto-detect product type with applicability gate, DX scorecard
with trend tracking, and a conditional Claude Code Skill DX checklist.
Hall of Fame examples loaded on-demand per pass from dx-hall-of-fame.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: /devex-review skill — live DX audit with browse

Live-system developer experience audit using browse tool. Tests all 8
dimensions aligned with /plan-devex-review for boomerang comparison
(plan said 3 min TTHW, reality says 8). Each dimension marked TESTED,
INFERRED, or N/A with evidence. Scope-aware: declares what browse can
and cannot test, falls back to file artifacts for untestable dimensions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.15.3.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-04-03 16:22:57 -07:00
committed by GitHub
parent 103a1b35dc
commit be96ff5ce7
41 changed files with 3420 additions and 2 deletions
+85
View File
@@ -0,0 +1,85 @@
/**
* DX Framework resolver
*
* Shared principles, characteristics, cognitive patterns, and scoring rubric
* for /plan-devex-review and /devex-review. Compact (~150 lines).
*
* Hall of Fame examples are NOT included here. They live in
* plan-devex-review/dx-hall-of-fame.md and are loaded on-demand per pass
* to avoid prompt bloat.
*/
import type { TemplateContext } from './types';
export function generateDxFramework(ctx: TemplateContext): string {
const hallOfFamePath = `${ctx.paths.skillRoot}/plan-devex-review/dx-hall-of-fame.md`;
return `## DX First Principles
These are the laws. Every recommendation traces back to one of these.
1. **Zero friction at T0.** First five minutes decide everything. One click to start. Hello world without reading docs. No credit card. No demo call.
2. **Incremental steps.** Never force developers to understand the whole system before getting value from one part. Gentle ramp, not cliff.
3. **Learn by doing.** Playgrounds, sandboxes, copy-paste code that works in context. Reference docs are necessary but never sufficient.
4. **Decide for me, let me override.** Opinionated defaults are features. Escape hatches are requirements. Strong opinions, loosely held.
5. **Fight uncertainty.** Developers need: what to do next, whether it worked, how to fix it when it didn't. Every error = problem + cause + fix.
6. **Show code in context.** Hello world is a lie. Show real auth, real error handling, real deployment. Solve 100% of the problem.
7. **Speed is a feature.** Iteration speed is everything. Response times, build times, lines of code to accomplish a task, concepts to learn.
8. **Create magical moments.** What would feel like magic? Stripe's instant API response. Vercel's push-to-deploy. Find yours and make it the first thing developers experience.
## The Seven DX Characteristics
| # | Characteristic | What It Means | Gold Standard |
|---|---------------|---------------|---------------|
| 1 | **Usable** | Simple to install, set up, use. Intuitive APIs. Fast feedback. | Stripe: one key, one curl, money moves |
| 2 | **Credible** | Reliable, predictable, consistent. Clear deprecation. Secure. | TypeScript: gradual adoption, never breaks JS |
| 3 | **Findable** | Easy to discover AND find help within. Strong community. Good search. | React: every question answered on SO |
| 4 | **Useful** | Solves real problems. Features match actual use cases. Scales. | Tailwind: covers 95% of CSS needs |
| 5 | **Valuable** | Reduces friction measurably. Saves time. Worth the dependency. | Next.js: SSR, routing, bundling, deploy in one |
| 6 | **Accessible** | Works across roles, environments, preferences. CLI + GUI. | VS Code: works for junior to principal |
| 7 | **Desirable** | Best-in-class tech. Reasonable pricing. Community momentum. | Vercel: devs WANT to use it, not tolerate it |
## Cognitive Patterns — How Great DX Leaders Think
Internalize these; don't enumerate them.
1. **Chef-for-chefs** — Your users build products for a living. The bar is higher because they notice everything.
2. **First five minutes obsession** — New dev arrives. Clock starts. Can they hello-world without docs, sales, or credit card?
3. **Error message empathy** — Every error is pain. Does it identify the problem, explain the cause, show the fix, link to docs?
4. **Escape hatch awareness** — Every default needs an override. No escape hatch = no trust = no adoption at scale.
5. **Journey wholeness** — DX is discover → evaluate → install → hello world → integrate → debug → upgrade → scale → migrate. Every gap = a lost dev.
6. **Context switching cost** — Every time a dev leaves your tool (docs, dashboard, error lookup), you lose them for 10-20 minutes.
7. **Upgrade fear** — Will this break my production app? Clear changelogs, migration guides, codemods, deprecation warnings. Upgrades should be boring.
8. **SDK completeness** — If devs write their own HTTP wrapper, you failed. If the SDK works in 4 of 5 languages, the fifth community hates you.
9. **Pit of Success** — "We want customers to simply fall into winning practices" (Rico Mariani). Make the right thing easy, the wrong thing hard.
10. **Progressive disclosure** — Simple case is production-ready, not a toy. Complex case uses the same API. SwiftUI: \\\`Button("Save") { save() }\\\` → full customization, same API.
## DX Scoring Rubric (0-10 calibration)
| Score | Meaning |
|-------|---------|
| 9-10 | Best-in-class. Stripe/Vercel tier. Developers rave about it. |
| 7-8 | Good. Developers can use it without frustration. Minor gaps. |
| 5-6 | Acceptable. Works but with friction. Developers tolerate it. |
| 3-4 | Poor. Developers complain. Adoption suffers. |
| 1-2 | Broken. Developers abandon after first attempt. |
| 0 | Not addressed. No thought given to this dimension. |
**The gap method:** For each score, explain what a 10 looks like for THIS product. Then fix toward 10.
## TTHW Benchmarks (Time to Hello World)
| Tier | Time | Adoption Impact |
|------|------|-----------------|
| Champion | < 2 min | 3-4x higher adoption |
| Competitive | 2-5 min | Baseline |
| Needs Work | 5-10 min | Significant drop-off |
| Red Flag | > 10 min | 50-70% abandon |
## Hall of Fame Reference
During each review pass, load the relevant section from:
\\\`${hallOfFamePath}\\\`
Read ONLY the section for the current pass (e.g., "## Pass 1" for Getting Started).
Do NOT read the entire file at once. This keeps context focused.`;
}
+2
View File
@@ -17,6 +17,7 @@ import { generateLearningsSearch, generateLearningsLog } from './learnings';
import { generateConfidenceCalibration } from './confidence';
import { generateInvokeSkill } from './composition';
import { generateReviewArmy } from './review-army';
import { generateDxFramework } from './dx';
export const RESOLVERS: Record<string, ResolverFn> = {
SLUG_EVAL: generateSlugEval,
@@ -59,4 +60,5 @@ export const RESOLVERS: Record<string, ResolverFn> = {
INVOKE_SKILL: generateInvokeSkill,
CHANGELOG_WORKFLOW: generateChangelogWorkflow,
REVIEW_ARMY: generateReviewArmy,
DX_FRAMEWORK: generateDxFramework,
};
+1
View File
@@ -508,6 +508,7 @@ Then write a \`## GSTACK REVIEW REPORT\` section to the end of the plan file:
| Codex Review | \\\`/codex review\\\` | Independent 2nd opinion | 0 | — | — |
| Eng Review | \\\`/plan-eng-review\\\` | Architecture & tests (required) | 0 | — | — |
| Design Review | \\\`/plan-design-review\\\` | UI/UX gaps | 0 | — | — |
| DX Review | \\\`/plan-devex-review\\\` | Developer experience gaps | 0 | — | — |
**VERDICT:** NO REVIEWS YET — run \\\`/autoplan\\\` for full review pipeline, or individual reviews above.
\\\`\\\`\\\`
+5
View File
@@ -94,6 +94,10 @@ Parse each JSONL entry. Each skill logs different fields:
→ Findings: "{issues_found} issues, {critical_gaps} critical gaps"
- **plan-design-review**: \\\`status\\\`, \\\`initial_score\\\`, \\\`overall_score\\\`, \\\`unresolved\\\`, \\\`decisions_made\\\`, \\\`commit\\\`
→ Findings: "score: {initial_score}/10 → {overall_score}/10, {decisions_made} decisions"
- **plan-devex-review**: \\\`status\\\`, \\\`initial_score\\\`, \\\`overall_score\\\`, \\\`product_type\\\`, \\\`tthw_current\\\`, \\\`tthw_target\\\`, \\\`unresolved\\\`, \\\`commit\\\`
→ Findings: "score: {initial_score}/10 → {overall_score}/10, TTHW: {tthw_current} → {tthw_target}"
- **devex-review**: \\\`status\\\`, \\\`overall_score\\\`, \\\`product_type\\\`, \\\`tthw_measured\\\`, \\\`dimensions_tested\\\`, \\\`dimensions_inferred\\\`, \\\`boomerang\\\`, \\\`commit\\\`
→ Findings: "score: {overall_score}/10, TTHW: {tthw_measured}, {dimensions_tested} tested/{dimensions_inferred} inferred"
- **codex-review**: \\\`status\\\`, \\\`gate\\\`, \\\`findings\\\`, \\\`findings_fixed\\\`
→ Findings: "{findings} findings, {findings_fixed}/{findings} fixed"
@@ -112,6 +116,7 @@ Produce this markdown table:
| Codex Review | \\\`/codex review\\\` | Independent 2nd opinion | {runs} | {status} | {findings} |
| Eng Review | \\\`/plan-eng-review\\\` | Architecture & tests (required) | {runs} | {status} | {findings} |
| Design Review | \\\`/plan-design-review\\\` | UI/UX gaps | {runs} | {status} | {findings} |
| DX Review | \\\`/plan-devex-review\\\` | Developer experience gaps | {runs} | {status} | {findings} |
\\\`\\\`\\\`
Below the table, add these lines (omit any that are empty/not applicable):