feat: skill chaining — plan reviews offer /office-hours

- plan-ceo-review: benefits-from office-hours, offers /office-hours when
  no design doc found, mid-session detection when user seems lost,
  spec review loop on CEO plan documents
- plan-eng-review: benefits-from office-hours, offers /office-hours when
  no design doc found
- One-hop-max chaining: never blocks, max one offer per session

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-03-19 21:03:08 -07:00
parent 7dc9af0999
commit ab82e21957
4 changed files with 138 additions and 0 deletions
+96
View File
@@ -10,6 +10,7 @@ description: |
or "is this ambitious enough".
Proactively suggest when the user is questioning scope or ambition of a plan,
or when the plan feels like it could be thinking bigger.
benefits-from: [office-hours]
allowed-tools:
- Read
- Grep
@@ -263,6 +264,37 @@ DESIGN=$(ls -t ~/.gstack/projects/$SLUG/*-$BRANCH-design-*.md 2>/dev/null | head
```
If a design doc exists (from `/office-hours`), read it. Use it as the source of truth for the problem statement, constraints, and chosen approach. If it has a `Supersedes:` field, note that this is a revised design.
## Prerequisite Skill Offer
When the design doc check above prints "No design doc found," offer the prerequisite
skill before proceeding.
Say to the user via AskUserQuestion:
> "No design doc found for this branch. `/office-hours` produces a structured problem
> statement, premise challenge, and explored alternatives — it gives this review much
> sharper input to work with. Takes about 10 minutes. The design doc is per-feature,
> not per-product — it captures the thinking behind this specific change."
Options:
- A) Run /office-hours first (in another window, then come back)
- B) Skip — proceed with standard review
If they skip: "No worries — standard review. If you ever want sharper input, try
/office-hours first next time." Then proceed normally. Do not re-offer later in the session.
**Mid-session detection:** During Step 0A (Premise Challenge), if the user can't
articulate the problem, keeps changing the problem statement, answers with "I'm not
sure," or is clearly exploring rather than reviewing — offer `/office-hours`:
> "It sounds like you're still figuring out what to build — that's totally fine, but
> that's what /office-hours is designed for. Want to pause this review and run
> /office-hours first? It'll help you nail down the problem and approach, then come
> back here for the strategic review."
Options: A) Yes, run /office-hours first. B) No, keep going.
If they keep going, proceed normally — no guilt, no re-asking.
When reading TODOS.md, specifically:
* Note any TODOs this plan touches, blocks, or unlocks
* Check if deferred work from prior reviews relates to this plan
@@ -406,6 +438,70 @@ Repo: {owner/repo}
Derive the feature slug from the plan being reviewed (e.g., "user-dashboard", "auth-refactor"). Use the date in YYYY-MM-DD format.
After writing the CEO plan, run the spec review loop on it:
## Spec Review Loop
Before presenting the document to the user for approval, run an adversarial review.
**Step 1: Dispatch reviewer subagent**
Use the Agent tool to dispatch an independent reviewer. The reviewer has fresh context
and cannot see the brainstorming conversation — only the document. This ensures genuine
adversarial independence.
Prompt the subagent with:
- The file path of the document just written
- "Read this document and review it on 5 dimensions. For each dimension, note PASS or
list specific issues with suggested fixes. At the end, output a quality score (1-10)
across all dimensions."
**Dimensions:**
1. **Completeness** — Are all requirements addressed? Missing edge cases?
2. **Consistency** — Do parts of the document agree with each other? Contradictions?
3. **Clarity** — Could an engineer implement this without asking questions? Ambiguous language?
4. **Scope** — Does the document creep beyond the original problem? YAGNI violations?
5. **Feasibility** — Can this actually be built with the stated approach? Hidden complexity?
The subagent should return:
- A quality score (1-10)
- PASS if no issues, or a numbered list of issues with dimension, description, and fix
**Step 2: Fix and re-dispatch**
If the reviewer returns issues:
1. Fix each issue in the document on disk (use Edit tool)
2. Re-dispatch the reviewer subagent with the updated document
3. Maximum 3 iterations total
**Convergence guard:** If the reviewer returns the same issues on consecutive iterations
(the fix didn't resolve them or the reviewer disagrees with the fix), stop the loop
and persist those issues as "Reviewer Concerns" in the document rather than looping
further.
If the subagent fails, times out, or is unavailable — skip the review loop entirely.
Tell the user: "Spec review unavailable — presenting unreviewed doc." The document is
already written to disk; the review is a quality bonus, not a gate.
**Step 3: Report and persist metrics**
After the loop completes (PASS, max iterations, or convergence guard):
1. Tell the user the result — summary by default:
"Your doc survived N rounds of adversarial review. M issues caught and fixed.
Quality score: X/10."
If they ask "what did the reviewer find?", show the full reviewer output.
2. If issues remain after max iterations or convergence, add a "## Reviewer Concerns"
section to the document listing each unresolved issue. Downstream skills will see this.
3. Append metrics:
```bash
mkdir -p ~/.gstack/analytics
echo '{"skill":"plan-ceo-review","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","iterations":ITERATIONS,"issues_found":FOUND,"issues_fixed":FIXED,"remaining":REMAINING,"quality_score":SCORE}' >> ~/.gstack/analytics/spec-review.jsonl 2>/dev/null || true
```
Replace ITERATIONS, FOUND, FIXED, REMAINING, SCORE with actual values from the review.
### 0E. Temporal Interrogation (EXPANSION, SELECTIVE EXPANSION, and HOLD modes)
Think ahead to implementation: What decisions will need to be made during implementation that should be resolved NOW in the plan?
```
+19
View File
@@ -10,6 +10,7 @@ description: |
or "is this ambitious enough".
Proactively suggest when the user is questioning scope or ambition of a plan,
or when the plan feels like it could be thinking bigger.
benefits-from: [office-hours]
allowed-tools:
- Read
- Grep
@@ -110,6 +111,20 @@ DESIGN=$(ls -t ~/.gstack/projects/$SLUG/*-$BRANCH-design-*.md 2>/dev/null | head
```
If a design doc exists (from `/office-hours`), read it. Use it as the source of truth for the problem statement, constraints, and chosen approach. If it has a `Supersedes:` field, note that this is a revised design.
{{BENEFITS_FROM}}
**Mid-session detection:** During Step 0A (Premise Challenge), if the user can't
articulate the problem, keeps changing the problem statement, answers with "I'm not
sure," or is clearly exploring rather than reviewing — offer `/office-hours`:
> "It sounds like you're still figuring out what to build — that's totally fine, but
> that's what /office-hours is designed for. Want to pause this review and run
> /office-hours first? It'll help you nail down the problem and approach, then come
> back here for the strategic review."
Options: A) Yes, run /office-hours first. B) No, keep going.
If they keep going, proceed normally — no guilt, no re-asking.
When reading TODOS.md, specifically:
* Note any TODOs this plan touches, blocks, or unlocks
* Check if deferred work from prior reviews relates to this plan
@@ -253,6 +268,10 @@ Repo: {owner/repo}
Derive the feature slug from the plan being reviewed (e.g., "user-dashboard", "auth-refactor"). Use the date in YYYY-MM-DD format.
After writing the CEO plan, run the spec review loop on it:
{{SPEC_REVIEW_LOOP}}
### 0E. Temporal Interrogation (EXPANSION, SELECTIVE EXPANSION, and HOLD modes)
Think ahead to implementation: What decisions will need to be made during implementation that should be resolved NOW in the plan?
```
+20
View File
@@ -8,6 +8,7 @@ description: |
"review the architecture", "engineering review", or "lock in the plan".
Proactively suggest when the user has a plan or design doc and is about to
start coding — to catch architecture issues before implementation.
benefits-from: [office-hours]
allowed-tools:
- Read
- Write
@@ -209,6 +210,25 @@ DESIGN=$(ls -t ~/.gstack/projects/$SLUG/*-$BRANCH-design-*.md 2>/dev/null | head
```
If a design doc exists, read it. Use it as the source of truth for the problem statement, constraints, and chosen approach. If it has a `Supersedes:` field, note that this is a revised design — check the prior version for context on what changed and why.
## Prerequisite Skill Offer
When the design doc check above prints "No design doc found," offer the prerequisite
skill before proceeding.
Say to the user via AskUserQuestion:
> "No design doc found for this branch. `/office-hours` produces a structured problem
> statement, premise challenge, and explored alternatives — it gives this review much
> sharper input to work with. Takes about 10 minutes. The design doc is per-feature,
> not per-product — it captures the thinking behind this specific change."
Options:
- A) Run /office-hours first (in another window, then come back)
- B) Skip — proceed with standard review
If they skip: "No worries — standard review. If you ever want sharper input, try
/office-hours first next time." Then proceed normally. Do not re-offer later in the session.
### Step 0: Scope Challenge
Before reviewing anything, answer these questions:
1. **What existing code already partially or fully solves each sub-problem?** Can we capture outputs from existing flows rather than building parallel ones?
+3
View File
@@ -8,6 +8,7 @@ description: |
"review the architecture", "engineering review", or "lock in the plan".
Proactively suggest when the user has a plan or design doc and is about to
start coding — to catch architecture issues before implementation.
benefits-from: [office-hours]
allowed-tools:
- Read
- Write
@@ -73,6 +74,8 @@ DESIGN=$(ls -t ~/.gstack/projects/$SLUG/*-$BRANCH-design-*.md 2>/dev/null | head
```
If a design doc exists, read it. Use it as the source of truth for the problem statement, constraints, and chosen approach. If it has a `Supersedes:` field, note that this is a revised design — check the prior version for context on what changed and why.
{{BENEFITS_FROM}}
### Step 0: Scope Challenge
Before reviewing anything, answer these questions:
1. **What existing code already partially or fully solves each sub-problem?** Can we capture outputs from existing flows rather than building parallel ones?