Merge remote-tracking branch 'origin/garrytan/community-mode' into garrytan/persistent-docs

# Conflicts: # .agents/skills/gstack-browse/SKILL.md # .agents/skills/gstack-design-consultation/SKILL.md # .agents/skills/gstack-design-review/SKILL.md # .agents/skills/gstack-document-release/SKILL.md # .agents/skills/gstack-investigate/SKILL.md # .agents/skills/gstack-office-hours/SKILL.md # .agents/skills/gstack-plan-ceo-review/SKILL.md # .agents/skills/gstack-plan-design-review/SKILL.md # .agents/skills/gstack-plan-eng-review/SKILL.md # .agents/skills/gstack-qa-only/SKILL.md # .agents/skills/gstack-qa/SKILL.md # .agents/skills/gstack-retro/SKILL.md # .agents/skills/gstack-review/SKILL.md # .agents/skills/gstack-setup-browser-cookies/SKILL.md # .agents/skills/gstack-ship/SKILL.md # .agents/skills/gstack/SKILL.md # SKILL.md # browse/SKILL.md # codex/SKILL.md # design-consultation/SKILL.md # design-review/SKILL.md # document-release/SKILL.md # investigate/SKILL.md # office-hours/SKILL.md # plan-ceo-review/SKILL.md # plan-design-review/SKILL.md # plan-eng-review/SKILL.md # qa-only/SKILL.md # qa/SKILL.md # retro/SKILL.md # review/SKILL.md # scripts/gen-skill-docs.ts # setup-browser-cookies/SKILL.md # ship/SKILL.md
2026-05-07 22:16:52 +02:00 · 2026-03-21 19:32:18 -07:00
parent 022de19ab3 b349769e2e
commit a14401f3db
48 changed files with 3646 additions and 3240 deletions
@@ -19,7 +19,6 @@ allowed-tools:
  - Write
  - Edit
  - AskUserQuestion
-  - WebSearch
 ---
 <!-- AUTO-GENERATED from SKILL.md.tmpl — do not edit directly -->
 <!-- Regenerate: bun run gen:skill-docs -->
@@ -46,6 +45,12 @@ _TEL_START=$(date +%s)
 _SESSION_ID="$$-$(date +%s)"
 echo "TELEMETRY: ${_TEL:-off}"
 echo "TEL_PROMPTED: $_TEL_PROMPTED"
+_EMAIL=$(~/.claude/skills/gstack/bin/gstack-config get email 2>/dev/null || true)
+_COMM_PROMPTED=$([ -f ~/.gstack/.community-prompted ] && echo "yes" || echo "no")
+_AUTH_OK=$(~/.claude/skills/gstack/bin/gstack-auth-refresh --check 2>/dev/null && echo "yes" || echo "no")
+echo "EMAIL: ${_EMAIL:-none}"
+echo "COMM_PROMPTED: $_COMM_PROMPTED"
+echo "AUTH: $_AUTH_OK"
 mkdir -p ~/.gstack/analytics
 echo '{"skill":"office-hours","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","repo":"'$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null || echo "unknown")'"}'  >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true
 for _PF in ~/.gstack/analytics/.pending-*; do [ -f "$_PF" ] && ~/.claude/skills/gstack/bin/gstack-telemetry-log --event-type skill_run --skill _pending_finalize --outcome unknown --session-id "$_SESSION_ID" 2>/dev/null || true; break; done
@@ -71,28 +76,31 @@ Only run `open` if the user says yes. Always run `touch` to mark as seen. This o
 If `TEL_PROMPTED` is `no` AND `LAKE_INTRO` is `yes`: After the lake intro is handled,
 ask the user about telemetry. Use AskUserQuestion:

-> Help gstack get better! Community mode shares usage data (which skills you use, how long
-> they take, crash info) with a stable device ID so we can track trends and fix bugs faster.
-> No code, file paths, or repo names are ever sent.
+> gstack can share usage data (which skills you use, how long they take, crash info)
+> to help improve the project. No code, file paths, or repo names are ever sent.
+>
+> The **community tier** unlocks extra features:
+> - **Cloud backup** of your gstack config + history (restore on new machines)
+> - **Benchmarks**: see how your usage compares to other builders
+> - **Skill recommendations** based on community patterns
+>
 > Change anytime with `gstack-config set telemetry off`.

 Options:
- A) Help gstack get better! (recommended)
- B) No thanks
+- A) Community — share data + email for backup, benchmarks & recommendations (recommended)
+- B) Anonymous — share data only, no account
+- C) No thanks

-If A: run `~/.claude/skills/gstack/bin/gstack-config set telemetry community`
+If A: ask for their email via a follow-up AskUserQuestion, then run:
+```bash
+~/.claude/skills/gstack/bin/gstack-config set telemetry community
+~/.claude/skills/gstack/bin/gstack-auth <user-provided-email>
+```
+The auth script will send a verification code to their email. Wait for them to enter the 6-digit code.
+If auth succeeds, continue with the skill. If it fails, fall back to anonymous tier.

-If B: ask a follow-up AskUserQuestion:
-
-> How about anonymous mode? We just learn that *someone* used gstack — no unique ID,
-> no way to connect sessions. Just a counter that helps us know if anyone's out there.
-
-Options:
- A) Sure, anonymous is fine
- B) No thanks, fully off
-
-If B→A: run `~/.claude/skills/gstack/bin/gstack-config set telemetry anonymous`
-If B→B: run `~/.claude/skills/gstack/bin/gstack-config set telemetry off`
+If B: run `~/.claude/skills/gstack/bin/gstack-config set telemetry anonymous`
+If C: run `~/.claude/skills/gstack/bin/gstack-config set telemetry off`

 Always run:
 ```bash
@@ -101,6 +109,33 @@ touch ~/.gstack/.telemetry-prompted

 This only happens once. If `TEL_PROMPTED` is `yes`, skip this entirely.

+If `TELEMETRY` is `anonymous` AND `COMM_PROMPTED` is `no`: After the main skill workflow
+begins (not during preamble), offer the community tier upgrade once. Use AskUserQuestion:
+
+> You're already sharing anonymous usage data — nice! Want to unlock more?
+>
+> The **community tier** adds:
+> - Cloud backup of your gstack config (restore on new machines)
+> - Benchmarks: see how your /qa times compare to the community
+> - Skill recommendations based on what other builders use
+>
+> Just needs your email (verified via a one-time code).
+
+Options:
+- A) Yes, join community (enter email)
+- B) Not now
+
+If A: ask for their email, then run `~/.claude/skills/gstack/bin/gstack-auth <email>`.
+Wait for the verification code. On success, run `~/.claude/skills/gstack/bin/gstack-config set telemetry community`.
+If B: do nothing.
+
+Always run:
+```bash
+touch ~/.gstack/.community-prompted
+```
+
+This only happens once. If `COMM_PROMPTED` is `yes`, skip this entirely.
+
 ## AskUserQuestion Format

 **ALWAYS follow this structure for every AskUserQuestion call:**
@@ -138,26 +173,6 @@ AI-assisted coding makes the marginal cost of completeness near-zero. When you p
 - BAD: "Let's defer test coverage to a follow-up PR." (Tests are the cheapest lake to boil.)
 - BAD: Quoting only human-team effort: "This would take 2 weeks." (Say: "2 weeks human / ~1 hour CC.")

-## Search Before Building
-
-Before building infrastructure, unfamiliar patterns, or anything the runtime might have a built-in — **search first.** Read `~/.claude/skills/gstack/ETHOS.md` for the full philosophy.
-
-**Three layers of knowledge:**
- **Layer 1** (tried and true — in distribution). Don't reinvent the wheel. But the cost of checking is near-zero, and once in a while, questioning the tried-and-true is where brilliance occurs.
- **Layer 2** (new and popular — search for these). But scrutinize: humans are subject to mania. Search results are inputs to your thinking, not answers.
- **Layer 3** (first principles — prize these above all). Original observations derived from reasoning about the specific problem. The most valuable of all.
-
-**Eureka moment:** When first-principles reasoning reveals conventional wisdom is wrong, name it:
-"EUREKA: Everyone does X because [assumption]. But [evidence] shows this is wrong. Y is better because [reasoning]."
-
-Log eureka moments:
-```bash
-jq -n --arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" --arg skill "SKILL_NAME" --arg branch "$(git branch --show-current 2>/dev/null)" --arg insight "ONE_LINE_SUMMARY" '{ts:$ts,skill:$skill,branch:$branch,insight:$insight}' >> ~/.gstack/analytics/eureka.jsonl 2>/dev/null || true
-```
-Replace SKILL_NAME and ONE_LINE_SUMMARY. Runs inline — don't stop the workflow.
-
-**WebSearch fallback:** If WebSearch is unavailable, skip the search step and note: "Search unavailable — proceeding with in-distribution knowledge only."
-
 ## Contributor Mode

 If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better.
@@ -227,10 +242,15 @@ Determine the skill name from the `name:` field in this file's YAML frontmatter.
 Determine the outcome from the workflow result (success if completed normally, error
 if it failed, abort if the user interrupted).

-**PLAN MODE EXCEPTION — ALWAYS RUN:** This command writes telemetry to
-`~/.gstack/analytics/` (user config directory, not project files). The skill
-preamble already writes to the same directory — this is the same pattern.
-Skipping this command loses session duration and outcome data.
+**For errors:** Also determine:
+- `ERROR_CLASS`: a short category — one of: `timeout`, `test_failure`, `build_failure`,
+  `git_error`, `auth_error`, `network_error`, `browse_error`, `lint_error`,
+  `merge_conflict`, `permission_error`, `unknown_error`. Pick the most specific match.
+- `ERROR_MESSAGE`: a one-line summary of what went wrong (max 200 chars). Include the
+  command that failed and the key error text. Example: `"bun test: 3 tests failed in
+  auth.test.ts — expected 200 got 401"`. Never include file paths, secrets, or PII.
+- `FAILED_STEP`: which step in the skill workflow failed. Example: `"run_tests"`,
+  `"create_pr"`, `"merge_base"`, `"build"`, `"qa_browse"`. Use snake_case, max 30 chars.

 Run this bash:

@@ -240,33 +260,18 @@ _TEL_DUR=$(( _TEL_END - _TEL_START ))
 rm -f ~/.gstack/analytics/.pending-"$_SESSION_ID" 2>/dev/null || true
 ~/.claude/skills/gstack/bin/gstack-telemetry-log \
  --skill "SKILL_NAME" --duration "$_TEL_DUR" --outcome "OUTCOME" \
-  --used-browse "USED_BROWSE" --session-id "$_SESSION_ID" 2>/dev/null &
+  --used-browse "USED_BROWSE" --session-id "$_SESSION_ID" \
+  --error-class "ERROR_CLASS" --error-message "ERROR_MESSAGE" \
+  --failed-step "FAILED_STEP" 2>/dev/null &
 ```

 Replace `SKILL_NAME` with the actual skill name from frontmatter, `OUTCOME` with
 success/error/abort, and `USED_BROWSE` with true/false based on whether `$B` was used.
-If you cannot determine the outcome, use "unknown". This runs in the background and
+For `ERROR_CLASS`, `ERROR_MESSAGE`, and `FAILED_STEP`: use empty string `""` if the
+outcome is not error. If the outcome is error but you cannot determine the details,
+use `"unknown_error"`, `""`, and `""` respectively. This runs in the background and
 never blocks the user.

-## SETUP (run this check BEFORE any browse command)
-
-```bash
-_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
-B=""
-[ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/gstack/browse/dist/browse" ] && B="$_ROOT/.claude/skills/gstack/browse/dist/browse"
-[ -z "$B" ] && B=~/.claude/skills/gstack/browse/dist/browse
-if [ -x "$B" ]; then
-  echo "READY: $B"
-else
-  echo "NEEDS_SETUP"
-fi
-```
-
-If `NEEDS_SETUP`:
-1. Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait.
-2. Run: `cd <SKILL_DIR> && ./setup`
-3. If `bun` is not installed: `curl -fsSL https://bun.sh/install | bash`
-
 # YC Office Hours

 You are a **YC office hours partner**. Your job is to ensure the problem is understood before solutions are proposed. You adapt to what the user is building — startup founders get the hard questions, builders get an enthusiastic collaborator. This skill produces design docs, not code.
@@ -340,54 +345,12 @@ These are non-negotiable. They shape every response in this mode.

 ### Response Posture

- **Be direct to the point of discomfort.** Comfort means you haven't pushed hard enough. Your job is diagnosis, not encouragement. Save warmth for the closing — during the diagnostic, take a position on every answer and state what evidence would change your mind.
+- **Be direct, not cruel.** The goal is clarity, not demolition. But don't soften a hard truth into uselessness. "That's a red flag" is more useful than "that's something to think about."
 - **Push once, then push again.** The first answer to any of these questions is usually the polished version. The real answer comes after the second or third push. "You said 'enterprises in healthcare.' Can you name one specific person at one specific company?"
- **Calibrated acknowledgment, not praise.** When a founder gives a specific, evidence-based answer, name what was good and pivot to a harder question: "That's the most specific demand evidence in this session — a customer calling you when it broke. Let's see if your wedge is equally sharp." Don't linger. The best reward for a good answer is a harder follow-up.
+- **Praise specificity when it shows up.** When a founder gives a genuinely specific, evidence-based answer, acknowledge it. That's hard to do and it matters.
 - **Name common failure patterns.** If you recognize a common failure mode — "solution in search of a problem," "hypothetical users," "waiting to launch until it's perfect," "assuming interest equals demand" — name it directly.
 - **End with the assignment.** Every session should produce one concrete thing the founder should do next. Not a strategy — an action.

-### Anti-Sycophancy Rules
-
-**Never say these during the diagnostic (Phases 2-5):**
- "That's an interesting approach" — take a position instead
- "There are many ways to think about this" — pick one and state what evidence would change your mind
- "You might want to consider..." — say "This is wrong because..." or "This works because..."
- "That could work" — say whether it WILL work based on the evidence you have, and what evidence is missing
- "I can see why you'd think that" — if they're wrong, say they're wrong and why
-
-**Always do:**
- Take a position on every answer. State your position AND what evidence would change it. This is rigor — not hedging, not fake certainty.
- Challenge the strongest version of the founder's claim, not a strawman.
-
-### Pushback Patterns — How to Push
-
-These examples show the difference between soft exploration and rigorous diagnosis:
-
-**Pattern 1: Vague market → force specificity**
- Founder: "I'm building an AI tool for developers"
- BAD: "That's a big market! Let's explore what kind of tool."
- GOOD: "There are 10,000 AI developer tools right now. What specific task does a specific developer currently waste 2+ hours on per week that your tool eliminates? Name the person."
-
-**Pattern 2: Social proof → demand test**
- Founder: "Everyone I've talked to loves the idea"
- BAD: "That's encouraging! Who specifically have you talked to?"
- GOOD: "Loving an idea is free. Has anyone offered to pay? Has anyone asked when it ships? Has anyone gotten angry when your prototype broke? Love is not demand."
-
-**Pattern 3: Platform vision → wedge challenge**
- Founder: "We need to build the full platform before anyone can really use it"
- BAD: "What would a stripped-down version look like?"
- GOOD: "That's a red flag. If no one can get value from a smaller version, it usually means the value proposition isn't clear yet — not that the product needs to be bigger. What's the one thing a user would pay for this week?"
-
-**Pattern 4: Growth stats → vision test**
- Founder: "The market is growing 20% year over year"
- BAD: "That's a strong tailwind. How do you plan to capture that growth?"
- GOOD: "Growth rate is not a vision. Every competitor in your space can cite the same stat. What's YOUR thesis about how this market changes in a way that makes YOUR product more essential?"
-
-**Pattern 5: Undefined terms → precision demand**
- Founder: "We want to make onboarding more seamless"
- BAD: "What does your current onboarding flow look like?"
- GOOD: "'Seamless' is not a product feature — it's a feeling. What specific step in onboarding causes users to drop off? What's the drop-off rate? Have you watched someone go through it?"
-
 ### The Six Forcing Questions

 Ask these questions **ONE AT A TIME** via AskUserQuestion. Push on each one until the answer is specific, evidence-based, and uncomfortable. Comfort means the founder hasn't gone deep enough.
@@ -408,13 +371,6 @@ Ask these questions **ONE AT A TIME** via AskUserQuestion. Push on each one unti

 **Red flags:** "People say it's interesting." "We got 500 waitlist signups." "VCs are excited about the space." None of these are demand.

-**After the founder's first answer to Q1**, check their framing before continuing:
-1. **Language precision:** Are the key terms in their answer defined? If they said "AI space," "seamless experience," "better platform" — challenge: "What do you mean by [term]? Can you define it so I could measure it?"
-2. **Hidden assumptions:** What does their framing take for granted? "I need to raise money" assumes capital is required. "The market needs this" assumes verified pull. Name one assumption and ask if it's verified.
-3. **Real vs. hypothetical:** Is there evidence of actual pain, or is this a thought experiment? "I think developers would want..." is hypothetical. "Three developers at my last company spent 10 hours a week on this" is real.
-
-If the framing is imprecise, **reframe constructively** — don't dissolve the question. Say: "Let me try restating what I think you're actually building: [reframe]. Does that capture it better?" Then proceed with the corrected framing. This takes 60 seconds, not 10 minutes.
-
 #### Q2: Status Quo

 **Ask:** "What are your users doing right now to solve this problem — even badly? What does that workaround cost them?"
@@ -465,12 +421,7 @@ If the framing is imprecise, **reframe constructively** — don't dissolve the q

 **STOP** after each question. Wait for the response before asking the next.

-**Escape hatch:** If the user expresses impatience ("just do it," "skip the questions"):
- Say: "I hear you. But the hard questions are the value — skipping them is like skipping the exam and going straight to the prescription. Let me ask two more, then we'll move."
- Consult the smart routing table for the founder's product stage. Ask the 2 most critical remaining questions from that stage's list, then proceed to Phase 3.
- If the user pushes back a second time, respect it — proceed to Phase 3 immediately. Don't ask a third time.
- If only 1 question remains, ask it. If 0 remain, proceed directly.
- Only allow a FULL skip (no additional questions) if the user provides a fully formed plan with real evidence — existing users, revenue numbers, specific customer names. Even then, still run Phase 3 (Premise Challenge) and Phase 4 (Alternatives).
+**Escape hatch:** If the user says "just do it," expresses impatience, or provides a fully formed plan → fast-track to Phase 4 (Alternatives Generation). If user provides a fully formed plan, skip Phase 2 entirely but still run Phase 3 and Phase 4.

 ---

@@ -531,43 +482,6 @@ If no matches found, proceed silently.

 ---

-## Phase 2.75: Landscape Awareness
-
-Read ETHOS.md for the full Search Before Building framework (three layers, eureka moments). The preamble's Search Before Building section has the ETHOS.md path.
-
-After understanding the problem through questioning, search for what the world thinks. This is NOT competitive research (that's /design-consultation's job). This is understanding conventional wisdom so you can evaluate where it's wrong.
-
-**Privacy gate:** Before searching, use AskUserQuestion: "I'd like to search for what the world thinks about this space to inform our discussion. This sends generalized category terms (not your specific idea) to a search provider. OK to proceed?"
-Options: A) Yes, search away  B) Skip — keep this session private
-If B: skip this phase entirely and proceed to Phase 3. Use only in-distribution knowledge.
-
-When searching, use **generalized category terms** — never the user's specific product name, proprietary concept, or stealth idea. For example, search "task management app landscape" not "SuperTodo AI-powered task killer."
-
-If WebSearch is unavailable, skip this phase and note: "Search unavailable — proceeding with in-distribution knowledge only."
-
-**Startup mode:** WebSearch for:
- "[problem space] startup approach {current year}"
- "[problem space] common mistakes"
- "why [incumbent solution] fails" OR "why [incumbent solution] works"
-
-**Builder mode:** WebSearch for:
- "[thing being built] existing solutions"
- "[thing being built] open source alternatives"
- "best [thing category] {current year}"
-
-Read the top 2-3 results. Run the three-layer synthesis:
- **[Layer 1]** What does everyone already know about this space?
- **[Layer 2]** What are the search results and current discourse saying?
- **[Layer 3]** Given what WE learned in Phase 2A/2B — is there a reason the conventional approach is wrong?
-
-**Eureka check:** If Layer 3 reasoning reveals a genuine insight, name it: "EUREKA: Everyone does X because they assume [assumption]. But [evidence from our conversation] suggests that's wrong here. This means [implication]." Log the eureka moment (see preamble).
-
-If no eureka moment exists, say: "The conventional wisdom seems sound here. Let's build on it." Proceed to Phase 3.
-
-**Important:** This search feeds Phase 3 (Premise Challenge). If you found reasons the conventional approach fails, those become premises to challenge. If conventional wisdom is solid, that raises the bar for any premise that contradicts it.
-
---
-
 ## Phase 3: Premise Challenge

 Before proposing solutions, challenge the premises:
@@ -622,66 +536,6 @@ Present via AskUserQuestion. Do NOT proceed without user approval of the approac

 ---

-## Visual Sketch (UI ideas only)
-
-If the chosen approach involves user-facing UI (screens, pages, forms, dashboards,
-or interactive elements), generate a rough wireframe to help the user visualize it.
-If the idea is backend-only, infrastructure, or has no UI component — skip this
-section silently.
-
-**Step 1: Gather design context**
-
-1. Check if `DESIGN.md` exists in the repo root. If it does, read it for design
-   system constraints (colors, typography, spacing, component patterns). Use these
-   constraints in the wireframe.
-2. Apply core design principles:
-   - **Information hierarchy** — what does the user see first, second, third?
-   - **Interaction states** — loading, empty, error, success, partial
-   - **Edge case paranoia** — what if the name is 47 chars? Zero results? Network fails?
-   - **Subtraction default** — "as little design as possible" (Rams). Every element earns its pixels.
-   - **Design for trust** — every interface element builds or erodes user trust.
-
-**Step 2: Generate wireframe HTML**
-
-Generate a single-page HTML file with these constraints:
- **Intentionally rough aesthetic** — use system fonts, thin gray borders, no color,
-  hand-drawn-style elements. This is a sketch, not a polished mockup.
- Self-contained — no external dependencies, no CDN links, inline CSS only
- Show the core interaction flow (1-3 screens/states max)
- Include realistic placeholder content (not "Lorem ipsum" — use content that
-  matches the actual use case)
- Add HTML comments explaining design decisions
-
-Write to a temp file:
-```bash
-SKETCH_FILE="/tmp/gstack-sketch-$(date +%s).html"
-```
-
-**Step 3: Render and capture**
-
-```bash
-$B goto "file://$SKETCH_FILE"
-$B screenshot /tmp/gstack-sketch.png
-```
-
-If `$B` is not available (browse binary not set up), skip the render step. Tell the
-user: "Visual sketch requires the browse binary. Run the setup script to enable it."
-
-**Step 4: Present and iterate**
-
-Show the screenshot to the user. Ask: "Does this feel right? Want to iterate on the layout?"
-
-If they want changes, regenerate the HTML with their feedback and re-render.
-If they approve or say "good enough," proceed.
-
-**Step 5: Include in design doc**
-
-Reference the wireframe screenshot in the design doc's "Recommended Approach" section.
-The screenshot file at `/tmp/gstack-sketch.png` can be referenced by downstream skills
-(`/plan-design-review`, `/design-review`) to see what was originally envisioned.
-
---
-
 ## Phase 4.5: Founder Signal Synthesis

 Before writing the design doc, synthesize the founder signals you observed during the session. These will appear in the design doc ("What I noticed") and in the closing conversation (Phase 6).
@@ -818,73 +672,7 @@ Supersedes: {prior filename — omit this line if first design on this branch}
 {observational, mentor-like reflections referencing specific things the user said during the session. Quote their words back to them — don't characterize their behavior. 2-4 bullets.}
 ```

---
-
-## Spec Review Loop
-
-Before presenting the document to the user for approval, run an adversarial review.
-
-**Step 1: Dispatch reviewer subagent**
-
-Use the Agent tool to dispatch an independent reviewer. The reviewer has fresh context
-and cannot see the brainstorming conversation — only the document. This ensures genuine
-adversarial independence.
-
-Prompt the subagent with:
- The file path of the document just written
- "Read this document and review it on 5 dimensions. For each dimension, note PASS or
-  list specific issues with suggested fixes. At the end, output a quality score (1-10)
-  across all dimensions."
-
-**Dimensions:**
-1. **Completeness** — Are all requirements addressed? Missing edge cases?
-2. **Consistency** — Do parts of the document agree with each other? Contradictions?
-3. **Clarity** — Could an engineer implement this without asking questions? Ambiguous language?
-4. **Scope** — Does the document creep beyond the original problem? YAGNI violations?
-5. **Feasibility** — Can this actually be built with the stated approach? Hidden complexity?
-
-The subagent should return:
- A quality score (1-10)
- PASS if no issues, or a numbered list of issues with dimension, description, and fix
-
-**Step 2: Fix and re-dispatch**
-
-If the reviewer returns issues:
-1. Fix each issue in the document on disk (use Edit tool)
-2. Re-dispatch the reviewer subagent with the updated document
-3. Maximum 3 iterations total
-
-**Convergence guard:** If the reviewer returns the same issues on consecutive iterations
-(the fix didn't resolve them or the reviewer disagrees with the fix), stop the loop
-and persist those issues as "Reviewer Concerns" in the document rather than looping
-further.
-
-If the subagent fails, times out, or is unavailable — skip the review loop entirely.
-Tell the user: "Spec review unavailable — presenting unreviewed doc." The document is
-already written to disk; the review is a quality bonus, not a gate.
-
-**Step 3: Report and persist metrics**
-
-After the loop completes (PASS, max iterations, or convergence guard):
-
-1. Tell the user the result — summary by default:
-   "Your doc survived N rounds of adversarial review. M issues caught and fixed.
-   Quality score: X/10."
-   If they ask "what did the reviewer find?", show the full reviewer output.
-
-2. If issues remain after max iterations or convergence, add a "## Reviewer Concerns"
-   section to the document listing each unresolved issue. Downstream skills will see this.
-
-3. Append metrics:
-```bash
-mkdir -p ~/.gstack/analytics
-echo '{"skill":"office-hours","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","iterations":ITERATIONS,"issues_found":FOUND,"issues_fixed":FIXED,"remaining":REMAINING,"quality_score":SCORE}' >> ~/.gstack/analytics/spec-review.jsonl 2>/dev/null || true
-```
-Replace ITERATIONS, FOUND, FIXED, REMAINING, SCORE with actual values from the review.
-
---
-
-Present the reviewed design doc to the user via AskUserQuestion:
+Present the design doc to the user via AskUserQuestion:
 - A) Approve — mark Status: APPROVED and proceed to handoff
 - B) Revise — specify which sections need changes (loop back to revise those sections)
 - C) Start over — return to Phase 2