Merge remote-tracking branch 'origin/main' into garrytan/codex-in-ceo-review

# Conflicts: # CHANGELOG.md # VERSION
2026-05-02 03:35:09 +02:00 · 2026-03-22 21:16:22 -07:00
parent 62285414d5 9eb74debd5
commit f0041b6d44
31 changed files with 494 additions and 134 deletions
@@ -1,6 +1,6 @@
 # Changelog

-## [0.11.3.1] - 2026-03-22 — Outside Voice
+## [0.11.5.2] - 2026-03-22 — Outside Voice

 ### Added

@@ -12,6 +12,29 @@

 - **`/plan-eng-review` Codex integration upgraded.** The old hardcoded Step 0.5 is replaced with a richer resolver that adds Claude subagent fallback, review log persistence, dashboard visibility, and higher reasoning effort (`xhigh`).

+## [0.11.5.1] - 2026-03-23 — Inline Office Hours
+
+### Changed
+
+- **No more "open another window" for /office-hours.** When `/plan-ceo-review` or `/plan-eng-review` offer to run `/office-hours` first, it now runs inline in the same conversation. The review picks up right where it left off after the design doc is ready. Same for mid-session detection when you're still figuring out what to build.
+- **Handoff note infrastructure removed.** The handoff notes that bridged the old "go to another window" flow are no longer written. Existing notes from prior sessions are still read for backward compatibility.
+
+## [0.11.5.0] - 2026-03-23 — Bash Compatibility Fix
+
+### Fixed
+
+- **`gstack-review-read` and `gstack-review-log` no longer crash under bash.** These scripts used `source <(gstack-slug)` which silently fails to set variables under bash with `set -euo pipefail`, causing `SLUG: unbound variable` errors. Replaced with `eval "$(gstack-slug)"` which works correctly in both bash and zsh.
+- **All SKILL.md templates updated.** Every template that instructed agents to run `source <(gstack-slug)` now uses `eval "$(gstack-slug)"` for cross-shell compatibility. Regenerated all SKILL.md files from templates.
+- **Regression tests added.** New tests verify `eval "$(gstack-slug)"` works under bash strict mode, and guard against `source <(.*gstack-slug` patterns reappearing in templates or bin scripts.
+
+## [0.11.4.0] - 2026-03-22 — Codex in Office Hours
+
+### Added
+
+- **Your brainstorming now gets a second opinion.** After premise challenge in `/office-hours`, you can opt in to a Codex cold read — a completely independent AI that hasn't seen the conversation reviews your problem, answers, and premises. It steelmans your idea, identifies the most revealing thing you said, challenges one premise, and proposes a 48-hour prototype. Two different AI models seeing different things catches blind spots neither would find alone.
+- **Cross-Model Perspective in design docs.** When you use the second opinion, the design doc automatically includes a `## Cross-Model Perspective` section capturing what Codex said — so the independent view is preserved for downstream reviews.
+- **New founder signal: defended premise with reasoning.** When Codex challenges one of your premises and you keep it with articulated reasoning (not just dismissal), that's tracked as a positive signal of conviction.
+
 ## [0.11.3.0] - 2026-03-23 — Design Outside Voices

 ### Added
@@ -1 +1 @@
-0.11.3.1
+0.11.5.2
@@ -331,12 +331,44 @@ Say to the user via AskUserQuestion:
 > not per-product — it captures the thinking behind this specific change."

 Options:
- A) Run /office-hours first (in another window, then come back)
+- A) Run /office-hours now (we'll pick up the review right after)
 - B) Skip — proceed with standard review

 If they skip: "No worries — standard review. If you ever want sharper input, try
 /office-hours first next time." Then proceed normally. Do not re-offer later in the session.

+If they choose A:
+
+Say: "Running /office-hours inline. Once the design doc is ready, I'll pick up
+the review right where we left off."
+
+Read the office-hours skill file from disk using the Read tool:
+`~/.claude/skills/gstack/office-hours/SKILL.md`
+
+Follow it inline, **skipping these sections** (already handled by the parent skill):
+- Preamble (run first)
+- AskUserQuestion Format
+- Completeness Principle — Boil the Lake
+- Search Before Building
+- Contributor Mode
+- Completion Status Protocol
+- Telemetry (run last)
+
+If the Read fails (file not found), say:
+"Could not load /office-hours — proceeding with standard review."
+
+After /office-hours completes, re-run the design doc check:
+```bash
+SLUG=$(~/.claude/skills/gstack/browse/bin/remote-slug 2>/dev/null || basename "$(git rev-parse --show-toplevel 2>/dev/null || pwd)")
+BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-' || echo 'no-branch')
+DESIGN=$(ls -t ~/.gstack/projects/$SLUG/*-$BRANCH-design-*.md 2>/dev/null | head -1)
+[ -z "$DESIGN" ] && DESIGN=$(ls -t ~/.gstack/projects/$SLUG/*-design-*.md 2>/dev/null | head -1)
+[ -n "$DESIGN" ] && echo "Design doc found: $DESIGN" || echo "No design doc found"
+```
+
+If a design doc is now found, read it and continue the review.
+If none was produced (user may have cancelled), proceed with standard review.
+
 # /autoplan — Auto-Review Pipeline

 One command. Rough plan in, fully reviewed plan out.
@@ -416,7 +448,7 @@ State what you examined and why nothing was flagged (1-2 sentences minimum).
 Before doing anything, save the plan file's current state to an external file:

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
 BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-')
 DATETIME=$(date +%Y%m%d-%H%M%S)
 echo "RESTORE_PATH=$HOME/.gstack/projects/$SLUG/${BRANCH}-autoplan-restore-${DATETIME}.md"
@@ -107,7 +107,7 @@ State what you examined and why nothing was flagged (1-2 sentences minimum).
 Before doing anything, save the plan file's current state to an external file:

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+{{SLUG_SETUP}}
 BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-')
 DATETIME=$(date +%Y%m%d-%H%M%S)
 echo "RESTORE_PATH=$HOME/.gstack/projects/$SLUG/${BRANCH}-autoplan-restore-${DATETIME}.md"
@@ -333,7 +333,7 @@ When the user types `/benchmark`, run this skill.
 ### Phase 1: Setup

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null || echo "SLUG=unknown")
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null || echo "SLUG=unknown")"
 mkdir -p .gstack/benchmark-reports
 mkdir -p .gstack/benchmark-reports/baselines
 ```
@@ -41,7 +41,7 @@ When the user types `/benchmark`, run this skill.
 ### Phase 1: Setup

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null || echo "SLUG=unknown")
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null || echo "SLUG=unknown")"
 mkdir -p .gstack/benchmark-reports
 mkdir -p .gstack/benchmark-reports/baselines
 ```
@@ -3,7 +3,7 @@
 # Usage: gstack-review-log '{"skill":"...","timestamp":"...","status":"..."}'
 set -euo pipefail
 SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
-source <("$SCRIPT_DIR/gstack-slug" 2>/dev/null)
+eval "$("$SCRIPT_DIR/gstack-slug" 2>/dev/null)"
 GSTACK_HOME="${GSTACK_HOME:-$HOME/.gstack}"
 mkdir -p "$GSTACK_HOME/projects/$SLUG"
 echo "$1" >> "$GSTACK_HOME/projects/$SLUG/$BRANCH-reviews.jsonl"
@@ -3,7 +3,7 @@
 # Usage: gstack-review-read
 set -euo pipefail
 SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
-source <("$SCRIPT_DIR/gstack-slug" 2>/dev/null)
+eval "$("$SCRIPT_DIR/gstack-slug" 2>/dev/null)"
 GSTACK_HOME="${GSTACK_HOME:-$HOME/.gstack}"
 cat "$GSTACK_HOME/projects/$SLUG/$BRANCH-reviews.jsonl" 2>/dev/null || echo "NO_REVIEWS"
 echo "---CONFIG---"
@@ -1,6 +1,6 @@
 #!/usr/bin/env bash
 # gstack-slug — output project slug and sanitized branch name
-# Usage: source <(gstack-slug)  → sets SLUG and BRANCH variables
+# Usage: eval "$(gstack-slug)"  → sets SLUG and BRANCH variables
 # Or:    gstack-slug            → prints SLUG=... and BRANCH=... lines
 #
 # Security: output is sanitized to [a-zA-Z0-9._-] only, preventing
@@ -351,7 +351,7 @@ When the user types `/canary`, run this skill.
 ### Phase 1: Setup

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null || echo "SLUG=unknown")
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null || echo "SLUG=unknown")"
 mkdir -p .gstack/canary-reports
 mkdir -p .gstack/canary-reports/baselines
 mkdir -p .gstack/canary-reports/screenshots
@@ -501,7 +501,7 @@ Save report to `.gstack/canary-reports/{date}-canary.md` and `.gstack/canary-rep
 Log the result for the review dashboard:

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)"
 mkdir -p ~/.gstack/projects/$SLUG
 ```

@@ -42,7 +42,7 @@ When the user types `/canary`, run this skill.
 ### Phase 1: Setup

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null || echo "SLUG=unknown")
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null || echo "SLUG=unknown")"
 mkdir -p .gstack/canary-reports
 mkdir -p .gstack/canary-reports/baselines
 mkdir -p .gstack/canary-reports/screenshots
@@ -192,7 +192,7 @@ Save report to `.gstack/canary-reports/{date}-canary.md` and `.gstack/canary-rep
 Log the result for the review dashboard:

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+{{SLUG_EVAL}}
 mkdir -p ~/.gstack/projects/$SLUG
 ```

@@ -327,7 +327,7 @@ ls src/ app/ pages/ components/ 2>/dev/null | head -30
 Look for office-hours output:

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)"
 ls ~/.gstack/projects/$SLUG/*office-hours* 2>/dev/null | head -5
 ls .context/*office-hours* .context/attachments/*office-hours* 2>/dev/null | head -5
 ```
@@ -52,7 +52,7 @@ ls src/ app/ pages/ components/ 2>/dev/null | head -30
 Look for office-hours output:

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+{{SLUG_EVAL}}
 ls ~/.gstack/projects/$SLUG/*office-hours* 2>/dev/null | head -5
 ls .context/*office-hours* .context/attachments/*office-hours* 2>/dev/null | head -5
 ```
@@ -774,7 +774,7 @@ Compare screenshots and observations across pages for:

 **Project-scoped:**
 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
 ```
 Write to: `~/.gstack/projects/{slug}/{user}-{branch}-design-audit-{datetime}.md`

@@ -1142,7 +1142,7 @@ Write the report to both local and project-scoped locations:

 **Project-scoped:**
 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
 ```
 Write to `~/.gstack/projects/{slug}/{user}-{branch}-design-audit-{datetime}.md`

@@ -224,7 +224,7 @@ Write the report to both local and project-scoped locations:

 **Project-scoped:**
 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+{{SLUG_SETUP}}
 ```
 Write to `~/.gstack/projects/{slug}/{user}-{branch}-design-audit-{datetime}.md`

@@ -883,7 +883,7 @@ Save report to `.gstack/deploy-reports/{date}-pr{number}-deploy.md`.
 Log to the review dashboard:

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)"
 mkdir -p ~/.gstack/projects/$SLUG
 ```

@@ -542,7 +542,7 @@ Save report to `.gstack/deploy-reports/{date}-pr{number}-deploy.md`.
 Log to the review dashboard:

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+{{SLUG_EVAL}}
 mkdir -p ~/.gstack/projects/$SLUG
 ```

@@ -331,7 +331,7 @@ You are a **YC office hours partner**. Your job is to ensure the problem is unde
 Understand the project and the area the user wants to change.

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)"
 ```

 1. Read `CLAUDE.md`, `TODOS.md` (if they exist).
@@ -640,6 +640,90 @@ Use AskUserQuestion to confirm. If the user disagrees with a premise, revise und

 ---

+## Phase 3.5: Cross-Model Second Opinion (optional)
+
+**Binary check first — no question if unavailable:**
+
+```bash
+which codex 2>/dev/null && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
+```
+
+If `CODEX_NOT_AVAILABLE`: skip Phase 3.5 entirely — no message, no AskUserQuestion. Proceed directly to Phase 4.
+
+If `CODEX_AVAILABLE`: use AskUserQuestion:
+
+> Want a second opinion from a different AI model? Codex will independently review your problem statement, key answers, premises, and any landscape findings from this session. It hasn't seen this conversation — it gets a structured summary. Usually takes 2-5 minutes.
+> A) Yes, get a second opinion
+> B) No, proceed to alternatives
+
+If B: skip Phase 3.5 entirely. Remember that Codex did NOT run (affects design doc, founder signals, and Phase 4 below).
+
+**If A: Run the Codex cold read.**
+
+1. Assemble a structured context block from Phases 1-3:
+   - Mode (Startup or Builder)
+   - Problem statement (from Phase 1)
+   - Key answers from Phase 2A/2B (summarize each Q&A in 1-2 sentences, include verbatim user quotes)
+   - Landscape findings (from Phase 2.75, if search was run)
+   - Agreed premises (from Phase 3)
+   - Codebase context (project name, languages, recent activity)
+
+2. **Write the assembled prompt to a temp file** (prevents shell injection from user-derived content):
+
+```bash
+CODEX_PROMPT_FILE=$(mktemp /tmp/gstack-codex-oh-XXXXXXXX.txt)
+```
+
+Write the full prompt (context block + instructions) to this file. Use the mode-appropriate variant:
+
+**Startup mode instructions:** "You are an independent technical advisor reading a transcript of a startup brainstorming session. [CONTEXT BLOCK HERE]. Your job: 1) What is the STRONGEST version of what this person is trying to build? Steelman it in 2-3 sentences. 2) What is the ONE thing from their answers that reveals the most about what they should actually build? Quote it and explain why. 3) Name ONE agreed premise you think is wrong, and what evidence would prove you right. 4) If you had 48 hours and one engineer to build a prototype, what would you build? Be specific — tech stack, features, what you'd skip. Be direct. Be terse. No preamble."
+
+**Builder mode instructions:** "You are an independent technical advisor reading a transcript of a builder brainstorming session. [CONTEXT BLOCK HERE]. Your job: 1) What is the COOLEST version of this they haven't considered? 2) What's the ONE thing from their answers that reveals what excites them most? Quote it. 3) What existing open source project or tool gets them 50% of the way there — and what's the 50% they'd need to build? 4) If you had a weekend to build this, what would you build first? Be specific. Be direct. No preamble."
+
+3. Run Codex:
+
+```bash
+TMPERR_OH=$(mktemp /tmp/codex-oh-err-XXXXXXXX)
+codex exec "$(cat "$CODEX_PROMPT_FILE")" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached 2>"$TMPERR_OH"
+```
+
+Use a 5-minute timeout (`timeout: 300000`). After the command completes, read stderr:
+```bash
+cat "$TMPERR_OH"
+rm -f "$TMPERR_OH" "$CODEX_PROMPT_FILE"
+```
+
+**Error handling:** All errors are non-blocking — Codex second opinion is a quality enhancement, not a prerequisite.
+- **Auth failure:** If stderr contains "auth", "login", "unauthorized", or "API key": "Codex authentication failed. Run \`codex login\` to authenticate. Skipping second opinion."
+- **Timeout:** "Codex timed out after 5 minutes. Skipping second opinion."
+- **Empty response:** "Codex returned no response. Stderr: <paste relevant error>. Skipping second opinion."
+
+On any error, proceed to Phase 4 — do NOT fall back to a Claude subagent (this is brainstorming, not adversarial review).
+
+4. **Presentation:**
+
+```
+SECOND OPINION (Codex):
+════════════════════════════════════════════════════════════
+<full codex output, verbatim — do not truncate or summarize>
+════════════════════════════════════════════════════════════
+```
+
+5. **Cross-model synthesis:** After presenting Codex output, provide 3-5 bullet synthesis:
+   - Where Claude agrees with Codex
+   - Where Claude disagrees and why
+   - Whether Codex's challenged premise changes Claude's recommendation
+
+6. **Premise revision check:** If Codex challenged an agreed premise, use AskUserQuestion:
+
+> Codex challenged premise #{N}: "{premise text}". Their argument: "{reasoning}".
+> A) Revise this premise based on Codex's input
+> B) Keep the original premise — proceed to alternatives
+
+If A: revise the premise and note the revision. If B: proceed (and note that the user defended this premise with reasoning — this is a founder signal if they articulate WHY they disagree, not just dismiss).
+
+---
+
 ## Phase 4: Alternatives Generation (MANDATORY)

 Produce 2-3 distinct implementation approaches. This is NOT optional.
@@ -666,6 +750,7 @@ Rules:
 - One must be the **"minimal viable"** (fewest files, smallest diff, ships fastest).
 - One must be the **"ideal architecture"** (best long-term trajectory, most elegant).
 - One can be **creative/lateral** (unexpected approach, different framing of the problem).
+- If Codex proposed a prototype in Phase 3.5, consider using it as a starting point for the creative/lateral approach.

 **RECOMMENDATION:** Choose [X] because [one-line reason].

@@ -774,6 +859,7 @@ Track which of these signals appeared during the session:
 - Has **domain expertise** — knows this space from the inside
 - Showed **taste** — cared about getting the details right
 - Showed **agency** — actually building, not just planning
+- **Defended premise with reasoning** against cross-model challenge (kept original premise when Codex disagreed AND articulated specific reasoning for why — dismissal without reasoning does not count)

 Count the signals. You'll use this count in Phase 6 to determine which tier of closing message to use.

@@ -784,7 +870,7 @@ Count the signals. You'll use this count in Phase 6 to determine which tier of c
 Write the design document to the project directory.

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
 USER=$(whoami)
 DATETIME=$(date +%Y%m%d-%H%M%S)
 ```
@@ -827,6 +913,9 @@ Supersedes: {prior filename — omit this line if first design on this branch}
 ## Premises
 {from Phase 3}

+## Cross-Model Perspective
+{If Codex ran in Phase 3.5: Codex's independent cold read — steelman, key insight, challenged premise, prototype suggestion. Verbatim or close paraphrase of what Codex said. If Codex did NOT run (skipped or unavailable): omit this section entirely — do not include it.}
+
 ## Approaches Considered
 ### Approach A: {name}
 {from Phase 4}
@@ -876,6 +965,9 @@ Supersedes: {prior filename — omit this line if first design on this branch}
 ## Premises
 {from Phase 3}

+## Cross-Model Perspective
+{If Codex ran in Phase 3.5: Codex's independent cold read — coolest version, key insight, existing tools, prototype suggestion. Verbatim or close paraphrase of what Codex said. If Codex did NOT run (skipped or unavailable): omit this section entirely — do not include it.}
+
 ## Approaches Considered
 ### Approach A: {name}
 {from Phase 4}
@@ -39,7 +39,7 @@ You are a **YC office hours partner**. Your job is to ensure the problem is unde
 Understand the project and the area the user wants to change.

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+{{SLUG_EVAL}}
 ```

 1. Read `CLAUDE.md`, `TODOS.md` (if they exist).
@@ -348,6 +348,10 @@ Use AskUserQuestion to confirm. If the user disagrees with a premise, revise und

 ---

+{{CODEX_SECOND_OPINION}}
+
+---
+
 ## Phase 4: Alternatives Generation (MANDATORY)

 Produce 2-3 distinct implementation approaches. This is NOT optional.
@@ -374,6 +378,7 @@ Rules:
 - One must be the **"minimal viable"** (fewest files, smallest diff, ships fastest).
 - One must be the **"ideal architecture"** (best long-term trajectory, most elegant).
 - One can be **creative/lateral** (unexpected approach, different framing of the problem).
+- If Codex proposed a prototype in Phase 3.5, consider using it as a starting point for the creative/lateral approach.

 **RECOMMENDATION:** Choose [X] because [one-line reason].

@@ -397,6 +402,7 @@ Track which of these signals appeared during the session:
 - Has **domain expertise** — knows this space from the inside
 - Showed **taste** — cared about getting the details right
 - Showed **agency** — actually building, not just planning
+- **Defended premise with reasoning** against cross-model challenge (kept original premise when Codex disagreed AND articulated specific reasoning for why — dismissal without reasoning does not count)

 Count the signals. You'll use this count in Phase 6 to determine which tier of closing message to use.

@@ -407,7 +413,7 @@ Count the signals. You'll use this count in Phase 6 to determine which tier of c
 Write the design document to the project directory.

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+{{SLUG_SETUP}}
 USER=$(whoami)
 DATETIME=$(date +%Y%m%d-%H%M%S)
 ```
@@ -450,6 +456,9 @@ Supersedes: {prior filename — omit this line if first design on this branch}
 ## Premises
 {from Phase 3}

+## Cross-Model Perspective
+{If Codex ran in Phase 3.5: Codex's independent cold read — steelman, key insight, challenged premise, prototype suggestion. Verbatim or close paraphrase of what Codex said. If Codex did NOT run (skipped or unavailable): omit this section entirely — do not include it.}
+
 ## Approaches Considered
 ### Approach A: {name}
 {from Phase 4}
@@ -499,6 +508,9 @@ Supersedes: {prior filename — omit this line if first design on this branch}
 ## Premises
 {from Phase 3}

+## Cross-Model Perspective
+{If Codex ran in Phase 3.5: Codex's independent cold read — coolest version, key insight, existing tools, prototype suggestion. Verbatim or close paraphrase of what Codex said. If Codex did NOT run (skipped or unavailable): omit this section entirely — do not include it.}
+
 ## Approaches Considered
 ### Approach A: {name}
 {from Phase 4}
@@ -432,65 +432,64 @@ Say to the user via AskUserQuestion:
 > not per-product — it captures the thinking behind this specific change."

 Options:
- A) Run /office-hours first (in another window, then come back)
+- A) Run /office-hours now (we'll pick up the review right after)
 - B) Skip — proceed with standard review

 If they skip: "No worries — standard review. If you ever want sharper input, try
 /office-hours first next time." Then proceed normally. Do not re-offer later in the session.

-**Handoff note save (BENEFITS_FROM):** If the user chose A (run /office-hours first),
-save a handoff context note before they leave. Reuse $SLUG and $BRANCH from the
-design doc check block above (they use the same `remote-slug || basename` fallback
-that handles repos without an origin remote). Then run:
+If they choose A:
+
+Say: "Running /office-hours inline. Once the design doc is ready, I'll pick up
+the review right where we left off."
+
+Read the office-hours skill file from disk using the Read tool:
+`~/.claude/skills/gstack/office-hours/SKILL.md`
+
+Follow it inline, **skipping these sections** (already handled by the parent skill):
+- Preamble (run first)
+- AskUserQuestion Format
+- Completeness Principle — Boil the Lake
+- Search Before Building
+- Contributor Mode
+- Completion Status Protocol
+- Telemetry (run last)
+
+If the Read fails (file not found), say:
+"Could not load /office-hours — proceeding with standard review."
+
+After /office-hours completes, re-run the design doc check:
 ```bash
-mkdir -p ~/.gstack/projects/$SLUG
-USER=$(whoami)
-DATETIME=$(date +%Y%m%d-%H%M%S)
-```
-Write to `~/.gstack/projects/$SLUG/$USER-$BRANCH-ceo-handoff-$DATETIME.md`:
-```markdown
-# CEO Review Handoff Note
-
-Generated by /plan-ceo-review on {date}
-Branch: {branch}
-Repo: {owner/repo}
-
-## Why I paused
-User chose to run /office-hours first (no design doc found).
-
-## System Audit Summary
-{Summarize what the system audit found — recent git history, diff scope,
-CLAUDE.md key points, TODOS.md relevant items, known pain points}
-
-## Discussion So Far
-{Empty — handoff happened before Step 0. Frontend/UI scope detection has not
-run yet — it will be assessed when the review resumes.}
+SLUG=$(~/.claude/skills/gstack/browse/bin/remote-slug 2>/dev/null || basename "$(git rev-parse --show-toplevel 2>/dev/null || pwd)")
+BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-' || echo 'no-branch')
+DESIGN=$(ls -t ~/.gstack/projects/$SLUG/*-$BRANCH-design-*.md 2>/dev/null | head -1)
+[ -z "$DESIGN" ] && DESIGN=$(ls -t ~/.gstack/projects/$SLUG/*-design-*.md 2>/dev/null | head -1)
+[ -n "$DESIGN" ] && echo "Design doc found: $DESIGN" || echo "No design doc found"
 ```

-Tell the user: "Context saved. Run /office-hours in another window. When you come back
-and invoke /plan-ceo-review, I'll pick up the context automatically — including the
-design doc /office-hours produces."
+If a design doc is now found, read it and continue the review.
+If none was produced (user may have cancelled), proceed with standard review.

 **Mid-session detection:** During Step 0A (Premise Challenge), if the user can't
 articulate the problem, keeps changing the problem statement, answers with "I'm not
 sure," or is clearly exploring rather than reviewing — offer `/office-hours`:

 > "It sounds like you're still figuring out what to build — that's totally fine, but
-> that's what /office-hours is designed for. Want to pause this review and run
-> /office-hours first? It'll help you nail down the problem and approach, then come
-> back here for the strategic review."
+> that's what /office-hours is designed for. Want to run /office-hours right now?
+> We'll pick up right where we left off."

-Options: A) Yes, run /office-hours first. B) No, keep going.
+Options: A) Yes, run /office-hours now. B) No, keep going.
 If they keep going, proceed normally — no guilt, no re-asking.

-**Handoff note save (mid-session):** If the user chose A (run /office-hours first from
-mid-session detection), save a handoff context note with the same format above, but
-include any Step 0A progress in the "Discussion So Far" section — premises discussed,
-problem framing attempts, user answers so far. Use the same bash block to generate the
-file path.
+If they choose A: Read the office-hours skill file from disk:
+`~/.claude/skills/gstack/office-hours/SKILL.md`

-Tell the user: "Context saved with your discussion so far. Run /office-hours, then
-come back to /plan-ceo-review."
+Follow it inline, skipping these sections (already handled by parent skill):
+Preamble, AskUserQuestion Format, Completeness Principle, Search Before Building,
+Contributor Mode, Completion Status Protocol, Telemetry.
+
+Note current Step 0A progress so you don't re-ask questions already answered.
+After completion, re-run the design doc check and resume the review.

 When reading TODOS.md, specifically:
 * Note any TODOs this plan touches, blocks, or unlocks
@@ -607,7 +606,7 @@ Rules:
 After the opt-in/cherry-pick ceremony, write the plan to disk so the vision and decisions survive beyond this conversation. Only run this step for EXPANSION and SELECTIVE EXPANSION modes.

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG/ceo-plans
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG/ceo-plans
 ```

 Before writing, check for existing CEO plans in the ceo-plans/ directory. If any are >30 days old or their branch has been merged/deleted, offer to archive them:
@@ -1221,7 +1220,7 @@ After producing the Completion Summary, clean up any handoff notes for this bran
 the review is complete and the context is no longer needed.

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)"
 rm -f ~/.gstack/projects/$SLUG/*-$BRANCH-ceo-handoff-*.md 2>/dev/null || true
 ```

@@ -129,59 +129,26 @@ context to pick up where we left off."

 {{BENEFITS_FROM}}

-**Handoff note save (BENEFITS_FROM):** If the user chose A (run /office-hours first),
-save a handoff context note before they leave. Reuse $SLUG and $BRANCH from the
-design doc check block above (they use the same `remote-slug || basename` fallback
-that handles repos without an origin remote). Then run:
-```bash
-mkdir -p ~/.gstack/projects/$SLUG
-USER=$(whoami)
-DATETIME=$(date +%Y%m%d-%H%M%S)
-```
-Write to `~/.gstack/projects/$SLUG/$USER-$BRANCH-ceo-handoff-$DATETIME.md`:
-```markdown
-# CEO Review Handoff Note
-
-Generated by /plan-ceo-review on {date}
-Branch: {branch}
-Repo: {owner/repo}
-
-## Why I paused
-User chose to run /office-hours first (no design doc found).
-
-## System Audit Summary
-{Summarize what the system audit found — recent git history, diff scope,
-CLAUDE.md key points, TODOS.md relevant items, known pain points}
-
-## Discussion So Far
-{Empty — handoff happened before Step 0. Frontend/UI scope detection has not
-run yet — it will be assessed when the review resumes.}
-```
-
-Tell the user: "Context saved. Run /office-hours in another window. When you come back
-and invoke /plan-ceo-review, I'll pick up the context automatically — including the
-design doc /office-hours produces."
-
 **Mid-session detection:** During Step 0A (Premise Challenge), if the user can't
 articulate the problem, keeps changing the problem statement, answers with "I'm not
 sure," or is clearly exploring rather than reviewing — offer `/office-hours`:

 > "It sounds like you're still figuring out what to build — that's totally fine, but
-> that's what /office-hours is designed for. Want to pause this review and run
-> /office-hours first? It'll help you nail down the problem and approach, then come
-> back here for the strategic review."
+> that's what /office-hours is designed for. Want to run /office-hours right now?
+> We'll pick up right where we left off."

-Options: A) Yes, run /office-hours first. B) No, keep going.
+Options: A) Yes, run /office-hours now. B) No, keep going.
 If they keep going, proceed normally — no guilt, no re-asking.

-**Handoff note save (mid-session):** If the user chose A (run /office-hours first from
-mid-session detection), save a handoff context note with the same format above, but
-include any Step 0A progress in the "Discussion So Far" section — premises discussed,
-problem framing attempts, user answers so far. Use the same bash block to generate the
-file path.
+If they choose A: Read the office-hours skill file from disk:
+`~/.claude/skills/gstack/office-hours/SKILL.md`

-Tell the user: "Context saved with your discussion so far. Run /office-hours, then
-come back to /plan-ceo-review."
+Follow it inline, skipping these sections (already handled by parent skill):
+Preamble, AskUserQuestion Format, Completeness Principle, Search Before Building,
+Contributor Mode, Completion Status Protocol, Telemetry.
+
+Note current Step 0A progress so you don't re-ask questions already answered.
+After completion, re-run the design doc check and resume the review.

 When reading TODOS.md, specifically:
 * Note any TODOs this plan touches, blocks, or unlocks
@@ -298,7 +265,7 @@ Rules:
 After the opt-in/cherry-pick ceremony, write the plan to disk so the vision and decisions survive beyond this conversation. Only run this step for EXPANSION and SELECTIVE EXPANSION modes.

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG/ceo-plans
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG/ceo-plans
 ```

 Before writing, check for existing CEO plans in the ceo-plans/ directory. If any are >30 days old or their branch has been merged/deleted, offer to archive them:
@@ -735,7 +702,7 @@ After producing the Completion Summary, clean up any handoff notes for this bran
 the review is complete and the context is no longer needed.

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+{{SLUG_EVAL}}
 rm -f ~/.gstack/projects/$SLUG/*-$BRANCH-ceo-handoff-*.md 2>/dev/null || true
 ```

@@ -363,12 +363,44 @@ Say to the user via AskUserQuestion:
 > not per-product — it captures the thinking behind this specific change."

 Options:
- A) Run /office-hours first (in another window, then come back)
+- A) Run /office-hours now (we'll pick up the review right after)
 - B) Skip — proceed with standard review

 If they skip: "No worries — standard review. If you ever want sharper input, try
 /office-hours first next time." Then proceed normally. Do not re-offer later in the session.

+If they choose A:
+
+Say: "Running /office-hours inline. Once the design doc is ready, I'll pick up
+the review right where we left off."
+
+Read the office-hours skill file from disk using the Read tool:
+`~/.claude/skills/gstack/office-hours/SKILL.md`
+
+Follow it inline, **skipping these sections** (already handled by the parent skill):
+- Preamble (run first)
+- AskUserQuestion Format
+- Completeness Principle — Boil the Lake
+- Search Before Building
+- Contributor Mode
+- Completion Status Protocol
+- Telemetry (run last)
+
+If the Read fails (file not found), say:
+"Could not load /office-hours — proceeding with standard review."
+
+After /office-hours completes, re-run the design doc check:
+```bash
+SLUG=$(~/.claude/skills/gstack/browse/bin/remote-slug 2>/dev/null || basename "$(git rev-parse --show-toplevel 2>/dev/null || pwd)")
+BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-' || echo 'no-branch')
+DESIGN=$(ls -t ~/.gstack/projects/$SLUG/*-$BRANCH-design-*.md 2>/dev/null | head -1)
+[ -z "$DESIGN" ] && DESIGN=$(ls -t ~/.gstack/projects/$SLUG/*-design-*.md 2>/dev/null | head -1)
+[ -n "$DESIGN" ] && echo "Design doc found: $DESIGN" || echo "No design doc found"
+```
+
+If a design doc is now found, read it and continue the review.
+If none was produced (user may have cancelled), proceed with standard review.
+
 ### Step 0: Scope Challenge
 Before reviewing anything, answer these questions:
 1. **What existing code already partially or fully solves each sub-problem?** Can we capture outputs from existing flows rather than building parallel ones?
@@ -588,7 +620,7 @@ The plan should be complete enough that when implementation begins, every test i
 After producing the coverage diagram, write a test plan artifact to the project directory so `/qa` and `/qa-only` can consume it as primary test input:

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
 USER=$(whoami)
 DATETIME=$(date +%Y%m%d-%H%M%S)
 ```
@@ -346,7 +346,7 @@ Before falling back to git diff heuristics, check for richer test plan sources:

 1. **Project-scoped test plans:** Check `~/.gstack/projects/` for recent `*-test-plan-*.md` files for this repo
   ```bash
-   source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+   eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)"
   ls -t ~/.gstack/projects/$SLUG/*-test-plan-*.md 2>/dev/null | head -1
   ```
 2. **Conversation context:** Check if a prior `/plan-eng-review` or `/plan-ceo-review` produced test plan output in this conversation
@@ -642,7 +642,7 @@ Write the report to both local and project-scoped locations:

 **Project-scoped:** Write test outcome artifact for cross-session context:
 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
 ```
 Write to `~/.gstack/projects/{slug}/{user}-{branch}-test-outcome-{datetime}.md`

@@ -54,7 +54,7 @@ Before falling back to git diff heuristics, check for richer test plan sources:

 1. **Project-scoped test plans:** Check `~/.gstack/projects/` for recent `*-test-plan-*.md` files for this repo
   ```bash
-   source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+   {{SLUG_EVAL}}
   ls -t ~/.gstack/projects/$SLUG/*-test-plan-*.md 2>/dev/null | head -1
   ```
 2. **Conversation context:** Check if a prior `/plan-eng-review` or `/plan-ceo-review` produced test plan output in this conversation
@@ -74,7 +74,7 @@ Write the report to both local and project-scoped locations:

 **Project-scoped:** Write test outcome artifact for cross-session context:
 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+{{SLUG_SETUP}}
 ```
 Write to `~/.gstack/projects/{slug}/{user}-{branch}-test-outcome-{datetime}.md`

@@ -549,7 +549,7 @@ Before falling back to git diff heuristics, check for richer test plan sources:

 1. **Project-scoped test plans:** Check `~/.gstack/projects/` for recent `*-test-plan-*.md` files for this repo
   ```bash
-   source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+   eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)"
   ls -t ~/.gstack/projects/$SLUG/*-test-plan-*.md 2>/dev/null | head -1
   ```
 2. **Conversation context:** Check if a prior `/plan-eng-review` or `/plan-ceo-review` produced test plan output in this conversation
@@ -1013,7 +1013,7 @@ Write the report to both local and project-scoped locations:

 **Project-scoped:** Write test outcome artifact for cross-session context:
 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
 ```
 Write to `~/.gstack/projects/{slug}/{user}-{branch}-test-outcome-{datetime}.md`

@@ -89,7 +89,7 @@ Before falling back to git diff heuristics, check for richer test plan sources:

 1. **Project-scoped test plans:** Check `~/.gstack/projects/` for recent `*-test-plan-*.md` files for this repo
   ```bash
-   source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+   {{SLUG_EVAL}}
   ls -t ~/.gstack/projects/$SLUG/*-test-plan-*.md 2>/dev/null | head -1
   ```
 2. **Conversation context:** Check if a prior `/plan-eng-review` or `/plan-ceo-review` produced test plan output in this conversation
@@ -277,7 +277,7 @@ Write the report to both local and project-scoped locations:

 **Project-scoped:** Write test outcome artifact for cross-session context:
 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+{{SLUG_SETUP}}
 ```
 Write to `~/.gstack/projects/{slug}/{user}-{branch}-test-outcome-{datetime}.md`

@@ -1219,7 +1219,7 @@ Compare screenshots and observations across pages for:

 **Project-scoped:**
 \`\`\`bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
 \`\`\`
 Write to: \`~/.gstack/projects/{slug}/{user}-{branch}-design-audit-{datetime}.md\`

@@ -1817,7 +1817,7 @@ The plan should be complete enough that when implementation begins, every test i
 After producing the coverage diagram, write a test plan artifact to the project directory so \`/qa\` and \`/qa-only\` can consume it as primary test input:

 \`\`\`bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
 USER=$(whoami)
 DATETIME=$(date +%Y%m%d-%H%M%S)
 \`\`\`
@@ -1881,7 +1881,7 @@ Coverage line: \`Test Coverage Audit: N new code paths. M covered (X%). K tests
 After producing the coverage diagram, write a test plan artifact so \`/qa\` and \`/qa-only\` can consume it:

 \`\`\`bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
 USER=$(whoami)
 DATETIME=$(date +%Y%m%d-%H%M%S)
 \`\`\`
@@ -2023,11 +2023,43 @@ Say to the user via AskUserQuestion:
 > not per-product — it captures the thinking behind this specific change."

 Options:
- A) Run /${first} first (in another window, then come back)
+- A) Run /${first} now (we'll pick up the review right after)
 - B) Skip — proceed with standard review

 If they skip: "No worries — standard review. If you ever want sharper input, try
-/${first} first next time." Then proceed normally. Do not re-offer later in the session.`;
+/${first} first next time." Then proceed normally. Do not re-offer later in the session.
+
+If they choose A:
+
+Say: "Running /${first} inline. Once the design doc is ready, I'll pick up
+the review right where we left off."
+
+Read the ${first} skill file from disk using the Read tool:
+\`~/.claude/skills/gstack/${first}/SKILL.md\`
+
+Follow it inline, **skipping these sections** (already handled by the parent skill):
+- Preamble (run first)
+- AskUserQuestion Format
+- Completeness Principle — Boil the Lake
+- Search Before Building
+- Contributor Mode
+- Completion Status Protocol
+- Telemetry (run last)
+
+If the Read fails (file not found), say:
+"Could not load /${first} — proceeding with standard review."
+
+After /${first} completes, re-run the design doc check:
+\`\`\`bash
+SLUG=$(~/.claude/skills/gstack/browse/bin/remote-slug 2>/dev/null || basename "$(git rev-parse --show-toplevel 2>/dev/null || pwd)")
+BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-' || echo 'no-branch')
+DESIGN=$(ls -t ~/.gstack/projects/$SLUG/*-$BRANCH-design-*.md 2>/dev/null | head -1)
+[ -z "$DESIGN" ] && DESIGN=$(ls -t ~/.gstack/projects/$SLUG/*-design-*.md 2>/dev/null | head -1)
+[ -n "$DESIGN" ] && echo "Design doc found: $DESIGN" || echo "No design doc found"
+\`\`\`
+
+If a design doc is now found, read it and continue the review.
+If none was produced (user may have cancelled), proceed with standard review.`;
 }

 function generateDesignSketch(_ctx: TemplateContext): string {
@@ -2119,6 +2151,93 @@ Present Codex output under \`CODEX SAYS (design sketch):\` and subagent output u
 Error handling: all non-blocking. On failure, skip and continue.`;
 }

+function generateCodexSecondOpinion(ctx: TemplateContext): string {
+  // Codex host: strip entirely — Codex should never invoke itself
+  if (ctx.host === 'codex') return '';
+
+  return `## Phase 3.5: Cross-Model Second Opinion (optional)
+
+**Binary check first — no question if unavailable:**
+
+\`\`\`bash
+which codex 2>/dev/null && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
+\`\`\`
+
+If \`CODEX_NOT_AVAILABLE\`: skip Phase 3.5 entirely — no message, no AskUserQuestion. Proceed directly to Phase 4.
+
+If \`CODEX_AVAILABLE\`: use AskUserQuestion:
+
+> Want a second opinion from a different AI model? Codex will independently review your problem statement, key answers, premises, and any landscape findings from this session. It hasn't seen this conversation — it gets a structured summary. Usually takes 2-5 minutes.
+> A) Yes, get a second opinion
+> B) No, proceed to alternatives
+
+If B: skip Phase 3.5 entirely. Remember that Codex did NOT run (affects design doc, founder signals, and Phase 4 below).
+
+**If A: Run the Codex cold read.**
+
+1. Assemble a structured context block from Phases 1-3:
+   - Mode (Startup or Builder)
+   - Problem statement (from Phase 1)
+   - Key answers from Phase 2A/2B (summarize each Q&A in 1-2 sentences, include verbatim user quotes)
+   - Landscape findings (from Phase 2.75, if search was run)
+   - Agreed premises (from Phase 3)
+   - Codebase context (project name, languages, recent activity)
+
+2. **Write the assembled prompt to a temp file** (prevents shell injection from user-derived content):
+
+\`\`\`bash
+CODEX_PROMPT_FILE=$(mktemp /tmp/gstack-codex-oh-XXXXXXXX.txt)
+\`\`\`
+
+Write the full prompt (context block + instructions) to this file. Use the mode-appropriate variant:
+
+**Startup mode instructions:** "You are an independent technical advisor reading a transcript of a startup brainstorming session. [CONTEXT BLOCK HERE]. Your job: 1) What is the STRONGEST version of what this person is trying to build? Steelman it in 2-3 sentences. 2) What is the ONE thing from their answers that reveals the most about what they should actually build? Quote it and explain why. 3) Name ONE agreed premise you think is wrong, and what evidence would prove you right. 4) If you had 48 hours and one engineer to build a prototype, what would you build? Be specific — tech stack, features, what you'd skip. Be direct. Be terse. No preamble."
+
+**Builder mode instructions:** "You are an independent technical advisor reading a transcript of a builder brainstorming session. [CONTEXT BLOCK HERE]. Your job: 1) What is the COOLEST version of this they haven't considered? 2) What's the ONE thing from their answers that reveals what excites them most? Quote it. 3) What existing open source project or tool gets them 50% of the way there — and what's the 50% they'd need to build? 4) If you had a weekend to build this, what would you build first? Be specific. Be direct. No preamble."
+
+3. Run Codex:
+
+\`\`\`bash
+TMPERR_OH=$(mktemp /tmp/codex-oh-err-XXXXXXXX)
+codex exec "$(cat "$CODEX_PROMPT_FILE")" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached 2>"$TMPERR_OH"
+\`\`\`
+
+Use a 5-minute timeout (\`timeout: 300000\`). After the command completes, read stderr:
+\`\`\`bash
+cat "$TMPERR_OH"
+rm -f "$TMPERR_OH" "$CODEX_PROMPT_FILE"
+\`\`\`
+
+**Error handling:** All errors are non-blocking — Codex second opinion is a quality enhancement, not a prerequisite.
+- **Auth failure:** If stderr contains "auth", "login", "unauthorized", or "API key": "Codex authentication failed. Run \\\`codex login\\\` to authenticate. Skipping second opinion."
+- **Timeout:** "Codex timed out after 5 minutes. Skipping second opinion."
+- **Empty response:** "Codex returned no response. Stderr: <paste relevant error>. Skipping second opinion."
+
+On any error, proceed to Phase 4 — do NOT fall back to a Claude subagent (this is brainstorming, not adversarial review).
+
+4. **Presentation:**
+
+\`\`\`
+SECOND OPINION (Codex):
+════════════════════════════════════════════════════════════
+<full codex output, verbatim — do not truncate or summarize>
+════════════════════════════════════════════════════════════
+\`\`\`
+
+5. **Cross-model synthesis:** After presenting Codex output, provide 3-5 bullet synthesis:
+   - Where Claude agrees with Codex
+   - Where Claude disagrees and why
+   - Whether Codex's challenged premise changes Claude's recommendation
+
+6. **Premise revision check:** If Codex challenged an agreed premise, use AskUserQuestion:
+
+> Codex challenged premise #{N}: "{premise text}". Their argument: "{reasoning}".
+> A) Revise this premise based on Codex's input
+> B) Keep the original premise — proceed to alternatives
+
+If A: revise the premise and note the revision. If B: proceed (and note that the user defended this premise with reasoning — this is a founder signal if they articulate WHY they disagree, not just dismiss).`;
+}
+
 function generateAdversarialStep(ctx: TemplateContext): string {
  // Codex host: strip entirely — Codex should never invoke itself
  if (ctx.host === 'codex') return '';
@@ -2668,7 +2787,17 @@ ${slopItems}
 Source: [OpenAI "Designing Delightful Frontends with GPT-5.4"](https://developers.openai.com/blog/designing-delightful-frontends-with-gpt-5-4) (Mar 2026) + gstack design methodology.`;
 }

+function generateSlugEval(ctx: TemplateContext): string {
+  return `eval "$(${ctx.paths.binDir}/gstack-slug 2>/dev/null)"`;
+}
+
+function generateSlugSetup(ctx: TemplateContext): string {
+  return `eval "$(${ctx.paths.binDir}/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG`;
+}
+
 const RESOLVERS: Record<string, (ctx: TemplateContext) => string> = {
+  SLUG_EVAL: generateSlugEval,
+  SLUG_SETUP: generateSlugSetup,
  COMMAND_REFERENCE: generateCommandReference,
  SNAPSHOT_FLAGS: generateSnapshotFlags,
  PREAMBLE: generatePreamble,
@@ -2689,6 +2818,7 @@ const RESOLVERS: Record<string, (ctx: TemplateContext) => string> = {
  SPEC_REVIEW_LOOP: generateSpecReviewLoop,
  DESIGN_SKETCH: generateDesignSketch,
  BENEFITS_FROM: generateBenefitsFrom,
+  CODEX_SECOND_OPINION: generateCodexSecondOpinion,
  CODEX_REVIEW_STEP: generateAdversarialStep,
  ADVERSARIAL_STEP: generateAdversarialStep,
  DEPLOY_BOOTSTRAP: generateDeployBootstrap,
@@ -397,7 +397,7 @@ If the Eng Review is NOT "CLEAR":

 1. **Check for a prior override on this branch:**
   ```bash
-   source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+   eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)"
   grep '"skill":"ship-review-override"' ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_OVERRIDE"
   ```
   If an override exists, display the dashboard and note "Review gate previously accepted — continuing." Do NOT ask again.
@@ -411,7 +411,7 @@ If the Eng Review is NOT "CLEAR":

 3. **If the user chooses A or C,** persist the decision so future `/ship` runs on this branch skip the gate:
   ```bash
-   source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+   eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)"
   echo '{"skill":"ship-review-override","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","decision":"USER_CHOICE"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
   ```
   Substitute USER_CHOICE with "ship_anyway" or "not_relevant".
@@ -971,7 +971,7 @@ Coverage line: `Test Coverage Audit: N new code paths. M covered (X%). K tests g
 After producing the coverage diagram, write a test plan artifact so `/qa` and `/qa-only` can consume it:

 ```bash
-source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null) && mkdir -p ~/.gstack/projects/$SLUG
+eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
 USER=$(whoami)
 DATETIME=$(date +%Y%m%d-%H%M%S)
 ```
@@ -62,7 +62,7 @@ If the Eng Review is NOT "CLEAR":

 1. **Check for a prior override on this branch:**
   ```bash
-   source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+   {{SLUG_EVAL}}
   grep '"skill":"ship-review-override"' ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_OVERRIDE"
   ```
   If an override exists, display the dashboard and note "Review gate previously accepted — continuing." Do NOT ask again.
@@ -76,7 +76,7 @@ If the Eng Review is NOT "CLEAR":

 3. **If the user chooses A or C,** persist the decision so future `/ship` runs on this branch skip the gate:
   ```bash
-   source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+   {{SLUG_EVAL}}
   echo '{"skill":"ship-review-override","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","decision":"USER_CHOICE"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
   ```
   Substitute USER_CHOICE with "ship_anyway" or "not_relevant".
@@ -705,6 +705,50 @@ describe('DESIGN_SKETCH resolver', () => {
  });
 });

+// --- {{CODEX_SECOND_OPINION}} resolver tests ---
+
+describe('CODEX_SECOND_OPINION resolver', () => {
+  const content = fs.readFileSync(path.join(ROOT, 'office-hours', 'SKILL.md'), 'utf-8');
+  const codexContent = fs.readFileSync(path.join(ROOT, '.agents', 'skills', 'gstack-office-hours', 'SKILL.md'), 'utf-8');
+
+  test('Phase 3.5 section appears in office-hours SKILL.md', () => {
+    expect(content).toContain('Phase 3.5: Cross-Model Second Opinion');
+  });
+
+  test('contains codex exec invocation', () => {
+    expect(content).toContain('codex exec');
+  });
+
+  test('contains opt-in AskUserQuestion text', () => {
+    expect(content).toContain('second opinion from a different AI model');
+  });
+
+  test('contains cross-model synthesis instructions', () => {
+    expect(content).toMatch(/[Ss]ynthesis/);
+    expect(content).toContain('Where Claude agrees with Codex');
+  });
+
+  test('contains premise revision check', () => {
+    expect(content).toContain('Codex challenged premise');
+  });
+
+  test('contains error handling for auth, timeout, and empty', () => {
+    expect(content).toMatch(/[Aa]uth.*fail/);
+    expect(content).toMatch(/[Tt]imeout/);
+    expect(content).toMatch(/[Ee]mpty response/);
+  });
+
+  test('Codex host variant does NOT contain the Phase 3.5 resolver output', () => {
+    // The resolver returns '' for codex host, so the interactive section is stripped.
+    // Static template references to "Phase 3.5" in prose/conditionals are fine.
+    // Other resolvers (design review lite) may contain CODEX_NOT_AVAILABLE, so we
+    // check for Phase 3.5-specific markers only.
+    expect(codexContent).not.toContain('Phase 3.5: Cross-Model Second Opinion');
+    expect(codexContent).not.toContain('TMPERR_OH');
+    expect(codexContent).not.toContain('gstack-codex-oh-');
+  });
+});
+
 // --- {{BENEFITS_FROM}} resolver tests ---

 describe('BENEFITS_FROM resolver', () => {
@@ -729,6 +773,16 @@ describe('BENEFITS_FROM resolver', () => {
    const qaContent = fs.readFileSync(path.join(ROOT, 'qa', 'SKILL.md'), 'utf-8');
    expect(qaContent).not.toContain('Prerequisite Skill Offer');
  });
+
+  test('inline invocation — no "another window" language', () => {
+    expect(ceoContent).not.toContain('another window');
+    expect(engContent).not.toContain('another window');
+  });
+
+  test('inline invocation — read-and-follow path present', () => {
+    expect(ceoContent).toContain('office-hours/SKILL.md');
+    expect(engContent).toContain('office-hours/SKILL.md');
+  });
 });

 // --- {{DESIGN_OUTSIDE_VOICES}} resolver tests ---
@@ -1004,6 +1004,25 @@ describe('gstack-slug', () => {
    expect(slug).toMatch(/^[a-zA-Z0-9._-]+$/);
    expect(branch).toMatch(/^[a-zA-Z0-9._-]+$/);
  });
+  test('eval sets variables under bash with set -euo pipefail', () => {
+    const result = Bun.spawnSync(
+      ['bash', '-c', 'set -euo pipefail; eval "$(./bin/gstack-slug 2>/dev/null)"; echo "SLUG=$SLUG"; echo "BRANCH=$BRANCH"'],
+      { cwd: ROOT, stdout: 'pipe', stderr: 'pipe' }
+    );
+    expect(result.exitCode).toBe(0);
+    const output = result.stdout.toString();
+    expect(output).toMatch(/^SLUG=.+/m);
+    expect(output).toMatch(/^BRANCH=.+/m);
+  });
+
+  test('no templates or bin scripts use source process substitution for gstack-slug', () => {
+    const result = Bun.spawnSync(
+      ['grep', '-r', 'source <(.*gstack-slug', '--include=*.tmpl', '--include=gstack-review-*', '.'],
+      { cwd: ROOT, stdout: 'pipe', stderr: 'pipe' }
+    );
+    // grep returns exit code 1 when no matches found — that's what we want
+    expect(result.stdout.toString().trim()).toBe('');
+  });
 });

 // --- Test Bootstrap validation ---
@@ -1 +1 @@
 .11.3.1
 .11.5.2