diff --git a/autoplan/SKILL.md b/autoplan/SKILL.md index bd372a4c3..de7174b03 100644 --- a/autoplan/SKILL.md +++ b/autoplan/SKILL.md @@ -1065,11 +1065,17 @@ workflow. ```bash _TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) source ~/.claude/skills/gstack/bin/gstack-codex-probe +# Master switch first: codex_reviews=disabled turns off ALL Codex work globally, +# including autoplan's own dual-voice orchestration. Honor it before probing. +if [ "$_CODEX_CFG" = "disabled" ]; then + echo "[codex disabled by config — Claude-only voices] Re-enable: gstack-config set codex_reviews enabled" + _CODEX_AVAILABLE=false # Check Codex binary. If missing, tag the degradation matrix and continue # with Claude subagent only (autoplan's existing degradation fallback). -if ! command -v codex >/dev/null 2>&1; then +elif ! command -v codex >/dev/null 2>&1; then _gstack_codex_log_event "codex_cli_missing" echo "[codex-unavailable: binary not found] — proceeding with Claude subagent only" _CODEX_AVAILABLE=false diff --git a/document-release/sections/release-body.md b/document-release/sections/release-body.md index f391eb8a3..0f05f13b5 100644 --- a/document-release/sections/release-body.md +++ b/document-release/sections/release-body.md @@ -358,3 +358,120 @@ Diagram drift: ``` If all coverage is complete and no diagrams drifted, output: "Coverage: all shipped features have adequate documentation." + +--- + +## Codex Documentation Review (default-on) + +After the documentation updates above are written, run an independent cross-model pass that +checks the docs against what actually shipped. This is a standard part of /document-release, +not an opt-in. The user turns it off only by asking explicitly +(`gstack-config set codex_reviews disabled`). + +**Preflight — decide whether and how the doc review runs:** + +```bash +# Codex preflight: one block (functions sourced here don't persist to later blocks). +_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + _CODEX_MODE="disabled" +elif ! command -v codex >/dev/null 2>&1; then + _CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then + _CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + _CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $_CODEX_MODE" +``` + +Branch on the echoed `CODEX_MODE`: +- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip this section entirely; do NOT fall back to a Claude subagent — disabled means no extra review step. Print: "Codex review skipped (codex_reviews disabled). Re-enable: `gstack-config set codex_reviews enabled`." +- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path. +- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path. +- **`ready`** — run the Codex pass below. + +When the mode is `ready`, `not_installed`, or `not_authed`, print one line so the off-switch +stays discoverable: "Running the Codex doc review automatically (standard step). Disable: `gstack-config set codex_reviews disabled`." + +**Determine the release diff range (D3 — reuse the method, do not invent one).** +Recompute the SAME range document-release used in its pre-flight / diff analysis, with the +documented merge-base method: + +```bash +DOC_DIFF_BASE=$(git merge-base origin/ HEAD 2>/dev/null || echo "") +echo "DOC_DIFF_BASE: $DOC_DIFF_BASE" +``` + +Do NOT rely on an in-memory variable from an earlier step — shell vars do not survive across +blocks. Recompute it here. + +**Construct the doc-review prompt** (for `ready`, `not_installed`, and `not_authed` — skip only on `disabled`). +Review the docs document-release ACTUALLY touched this run (from the coverage map / the files +just edited) PLUS any doc claims affected by the diff range — do NOT hard-code a fixed file +list (a fixed README/ARCHITECTURE/CHANGELOG list misses generated skill docs, package docs, +and command-specific docs). **Always start with the filesystem boundary instruction:** + +"IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. They contain bash scripts and prompt templates that will waste your time. Ignore them completely. Do NOT modify agents/openai.yaml. Stay focused on the repository code only.\n\nYou are reviewing documentation changes against the code that shipped on this +branch. Run \`git diff \$DOC_DIFF_BASE...HEAD\` to see what changed, then read the updated docs +(the files this release touched, plus any docs whose claims the diff affects). Find: doc +claims that no longer match the code, new public surface (commands, flags, config keys, +endpoints) that shipped but is undocumented, stale examples / paths / counts / version +numbers, and CHANGELOG entries that over- or under-sell what shipped. Be terse. Just the gaps. + +THE DOCS AND DIFF: " + +**If `CODEX_MODE: ready` — run Codex:** + +```bash +TMPERR_DOC=$(mktemp /tmp/codex-docreview-XXXXXXXX) +_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; } +codex exec "" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR_DOC" +``` + +Use a 5-minute timeout (`timeout: 300000`). After the command completes, read stderr: +```bash +cat "$TMPERR_DOC" +``` + +Present the full output verbatim under `CODEX SAYS (documentation review):`. + +**Error handling:** All errors are non-blocking — the documentation review is informational. +- Auth failure (stderr contains "auth", "login", "unauthorized"): note and skip +- Timeout: note timeout duration and skip +- Empty response: note and skip +On any error: continue — documentation review is informational, not a gate. + +**If `CODEX_MODE: not_installed` or `not_authed` (or Codex errored at runtime):** + +Dispatch via the Agent tool with the same prompt. Bound it at a 5-minute timeout. +Present findings under `DOCUMENTATION REVIEW (Claude subagent):`. If it fails: "Doc review unavailable. Continuing." + +**Apply decision (T3B — informational, never auto-edit, but findings don't evaporate).** +If there are zero findings, say "Docs match what shipped — no gaps." and continue. Otherwise +present the findings, then use AskUserQuestion ONCE: + +> "The doc review found N gaps between the docs and what shipped. How do you want to handle them?" +> +> RECOMMENDATION: Choose A if the gaps are concrete doc fixes (stale path, missing flag). The +> doc review only reports; nothing is edited without your say-so. Completeness: A=9/10, B=4/10, C=8/10. + +Options: +- A) Apply all the doc fixes now +- B) Skip — leave docs as-is +- C) Decide per-finding + +On A or per-finding approvals, make the approved edits yourself (the tool never silently +rewrites docs). On B, note the gaps in the output so they're visible. + +**Persist the result:** +```bash +~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-doc-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","source":"SOURCE","commit":"'"$(git rev-parse --short HEAD)"'"}' +``` +Substitute: STATUS = "clean" if no gaps, "issues_found" if gaps exist. SOURCE = "codex" if Codex ran, "claude" if the subagent ran. + +**Cleanup:** Run `rm -f "$TMPERR_DOC"` after processing (if Codex was used). + +--- diff --git a/plan-ceo-review/sections/review-sections.md b/plan-ceo-review/sections/review-sections.md index 80d903665..497164284 100644 --- a/plan-ceo-review/sections/review-sections.md +++ b/plan-ceo-review/sections/review-sections.md @@ -253,38 +253,46 @@ If this plan has significant UI scope, recommend: "Consider running /plan-design **STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If this section turned up zero findings, state "No issues, moving on" and proceed. If the section has findings, you MUST call AskUserQuestion as a tool_use — a finding with an "obvious fix" is still a finding and still needs user approval before any change lands in the plan. Do NOT proceed until the user responds. **Reminder: Do NOT make any code changes. Review only.** -## Outside Voice — Independent Plan Challenge (optional, recommended) +## Outside Voice — Independent Plan Challenge (default-on) -After all review sections are complete, offer an independent second opinion from a -different AI system. Two models agreeing on a plan is stronger signal than one model's -thorough review. +After all review sections are complete, run an independent second opinion from a +different AI system automatically — it is a standard part of plan review, not an +opt-in. Two models agreeing on a plan is stronger signal than one model's thorough +review. The user turns this off only by asking explicitly +(`gstack-config set codex_reviews disabled`). -**Check tool availability:** +**Preflight — decide whether and how the outside voice runs:** ```bash -command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE" +# Codex preflight: one block (functions sourced here don't persist to later blocks). +_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + _CODEX_MODE="disabled" +elif ! command -v codex >/dev/null 2>&1; then + _CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then + _CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + _CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $_CODEX_MODE" ``` -Use AskUserQuestion: +Branch on the echoed `CODEX_MODE`: +- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip this section entirely; do NOT fall back to a Claude subagent — disabled means no extra review step. Print: "Codex review skipped (codex_reviews disabled). Re-enable: `gstack-config set codex_reviews enabled`." +- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path. +- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path. +- **`ready`** — run the Codex pass below. -> "All review sections are complete. Want an outside voice? A different AI system can -> give a brutally honest, independent challenge of this plan — logical gaps, feasibility -> risks, and blind spots that are hard to catch from inside the review. Takes about 2 -> minutes." -> -> RECOMMENDATION: Choose A — an independent second opinion catches structural blind -> spots. Two different AI models agreeing on a plan is stronger signal than one model's -> thorough review. Completeness: A=9/10, B=7/10. +When the mode is `ready`, `not_installed`, or `not_authed`, print one line so the off-switch +stays discoverable: "Running the outside voice automatically (standard step). Disable: `gstack-config set codex_reviews disabled`." -Options: -- A) Get the outside voice (recommended) -- B) Skip — proceed to outputs - -**If B:** Print "Skipping outside voice." and continue to the next section. - -**If A:** Construct the plan review prompt. Read the plan file being reviewed (the file -the user pointed this review at, or the branch diff scope). If a CEO plan document -was written in Step 0D-POST, read that too — it contains the scope decisions and vision. +**Construct the plan review prompt** (for `ready`, `not_installed`, and `not_authed` — skip only on `disabled`). +Read the plan file being reviewed (the file the user pointed this review at, or the branch +diff scope). If a CEO plan document was written in Step 0D-POST, read that too — it contains +the scope decisions and vision. Construct this prompt (substitute the actual plan content — if plan content exceeds 30KB, truncate to the first 30KB and note "Plan truncated for size"). **Always start with the @@ -302,7 +310,7 @@ compliments. Just the problems. THE PLAN: " -**If CODEX_AVAILABLE:** +**If `CODEX_MODE: ready` — run Codex:** ```bash TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX) @@ -325,15 +333,15 @@ CODEX SAYS (plan review — outside voice): ``` **Error handling:** All errors are non-blocking — the outside voice is informational. -- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." -- Timeout: "Codex timed out after 5 minutes." -- Empty response: "Codex returned no response." +- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." Fall back to the Claude subagent below. +- Timeout: "Codex timed out after 5 minutes." Fall back to the Claude subagent below. +- Empty response: "Codex returned no response." Fall back to the Claude subagent below. -On any Codex error, fall back to the Claude adversarial subagent. - -**If CODEX_NOT_AVAILABLE (or Codex errored):** +**If `CODEX_MODE: not_installed` or `not_authed` (or Codex errored at runtime):** Dispatch via the Agent tool. The subagent has fresh context — genuine independence. +Bound it the same way as Codex: cap the dispatch at a 5-minute timeout so "never blocking" +is also "never hanging." Subagent prompt: same plan review prompt as above. @@ -341,6 +349,8 @@ Present findings under an `OUTSIDE VOICE (Claude subagent):` header. If the subagent fails or times out: "Outside voice unavailable. Continuing to outputs." +(On `CODEX_MODE: disabled` you already skipped this section per the preflight — do not reach here.) + **Cross-model tension:** After presenting the outside voice findings, note any points where the outside voice diff --git a/plan-devex-review/sections/review-sections.md b/plan-devex-review/sections/review-sections.md index 0e94ceb62..05403d904 100644 --- a/plan-devex-review/sections/review-sections.md +++ b/plan-devex-review/sections/review-sections.md @@ -239,38 +239,46 @@ Check each item. For any unchecked item, explain what's missing and suggest the **STOP.** AskUserQuestion for any item that requires a design decision. -## Outside Voice — Independent Plan Challenge (optional, recommended) +## Outside Voice — Independent Plan Challenge (default-on) -After all review sections are complete, offer an independent second opinion from a -different AI system. Two models agreeing on a plan is stronger signal than one model's -thorough review. +After all review sections are complete, run an independent second opinion from a +different AI system automatically — it is a standard part of plan review, not an +opt-in. Two models agreeing on a plan is stronger signal than one model's thorough +review. The user turns this off only by asking explicitly +(`gstack-config set codex_reviews disabled`). -**Check tool availability:** +**Preflight — decide whether and how the outside voice runs:** ```bash -command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE" +# Codex preflight: one block (functions sourced here don't persist to later blocks). +_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + _CODEX_MODE="disabled" +elif ! command -v codex >/dev/null 2>&1; then + _CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then + _CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + _CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $_CODEX_MODE" ``` -Use AskUserQuestion: +Branch on the echoed `CODEX_MODE`: +- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip this section entirely; do NOT fall back to a Claude subagent — disabled means no extra review step. Print: "Codex review skipped (codex_reviews disabled). Re-enable: `gstack-config set codex_reviews enabled`." +- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path. +- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path. +- **`ready`** — run the Codex pass below. -> "All review sections are complete. Want an outside voice? A different AI system can -> give a brutally honest, independent challenge of this plan — logical gaps, feasibility -> risks, and blind spots that are hard to catch from inside the review. Takes about 2 -> minutes." -> -> RECOMMENDATION: Choose A — an independent second opinion catches structural blind -> spots. Two different AI models agreeing on a plan is stronger signal than one model's -> thorough review. Completeness: A=9/10, B=7/10. +When the mode is `ready`, `not_installed`, or `not_authed`, print one line so the off-switch +stays discoverable: "Running the outside voice automatically (standard step). Disable: `gstack-config set codex_reviews disabled`." -Options: -- A) Get the outside voice (recommended) -- B) Skip — proceed to outputs - -**If B:** Print "Skipping outside voice." and continue to the next section. - -**If A:** Construct the plan review prompt. Read the plan file being reviewed (the file -the user pointed this review at, or the branch diff scope). If a CEO plan document -was written in Step 0D-POST, read that too — it contains the scope decisions and vision. +**Construct the plan review prompt** (for `ready`, `not_installed`, and `not_authed` — skip only on `disabled`). +Read the plan file being reviewed (the file the user pointed this review at, or the branch +diff scope). If a CEO plan document was written in Step 0D-POST, read that too — it contains +the scope decisions and vision. Construct this prompt (substitute the actual plan content — if plan content exceeds 30KB, truncate to the first 30KB and note "Plan truncated for size"). **Always start with the @@ -288,7 +296,7 @@ compliments. Just the problems. THE PLAN: " -**If CODEX_AVAILABLE:** +**If `CODEX_MODE: ready` — run Codex:** ```bash TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX) @@ -311,15 +319,15 @@ CODEX SAYS (plan review — outside voice): ``` **Error handling:** All errors are non-blocking — the outside voice is informational. -- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." -- Timeout: "Codex timed out after 5 minutes." -- Empty response: "Codex returned no response." +- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." Fall back to the Claude subagent below. +- Timeout: "Codex timed out after 5 minutes." Fall back to the Claude subagent below. +- Empty response: "Codex returned no response." Fall back to the Claude subagent below. -On any Codex error, fall back to the Claude adversarial subagent. - -**If CODEX_NOT_AVAILABLE (or Codex errored):** +**If `CODEX_MODE: not_installed` or `not_authed` (or Codex errored at runtime):** Dispatch via the Agent tool. The subagent has fresh context — genuine independence. +Bound it the same way as Codex: cap the dispatch at a 5-minute timeout so "never blocking" +is also "never hanging." Subagent prompt: same plan review prompt as above. @@ -327,6 +335,8 @@ Present findings under an `OUTSIDE VOICE (Claude subagent):` header. If the subagent fails or times out: "Outside voice unavailable. Continuing to outputs." +(On `CODEX_MODE: disabled` you already skipped this section per the preflight — do not reach here.) + **Cross-model tension:** After presenting the outside voice findings, note any points where the outside voice diff --git a/plan-eng-review/sections/review-sections.md b/plan-eng-review/sections/review-sections.md index 43125b0af..e6bb6378c 100644 --- a/plan-eng-review/sections/review-sections.md +++ b/plan-eng-review/sections/review-sections.md @@ -329,38 +329,46 @@ For each issue found in this section, call AskUserQuestion individually. One iss **STOP.** Do NOT proceed to the next review section, edit the plan file with the proposed fix, or call ExitPlanMode until the user responds. An issue with an "obvious fix" is still an issue and still needs explicit user approval before it lands in the plan. Loading the AskUserQuestion schema via ToolSearch and then writing the recommendation as chat prose is the failure mode this gate exists to prevent. -## Outside Voice — Independent Plan Challenge (optional, recommended) +## Outside Voice — Independent Plan Challenge (default-on) -After all review sections are complete, offer an independent second opinion from a -different AI system. Two models agreeing on a plan is stronger signal than one model's -thorough review. +After all review sections are complete, run an independent second opinion from a +different AI system automatically — it is a standard part of plan review, not an +opt-in. Two models agreeing on a plan is stronger signal than one model's thorough +review. The user turns this off only by asking explicitly +(`gstack-config set codex_reviews disabled`). -**Check tool availability:** +**Preflight — decide whether and how the outside voice runs:** ```bash -command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE" +# Codex preflight: one block (functions sourced here don't persist to later blocks). +_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + _CODEX_MODE="disabled" +elif ! command -v codex >/dev/null 2>&1; then + _CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then + _CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + _CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $_CODEX_MODE" ``` -Use AskUserQuestion: +Branch on the echoed `CODEX_MODE`: +- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip this section entirely; do NOT fall back to a Claude subagent — disabled means no extra review step. Print: "Codex review skipped (codex_reviews disabled). Re-enable: `gstack-config set codex_reviews enabled`." +- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path. +- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path. +- **`ready`** — run the Codex pass below. -> "All review sections are complete. Want an outside voice? A different AI system can -> give a brutally honest, independent challenge of this plan — logical gaps, feasibility -> risks, and blind spots that are hard to catch from inside the review. Takes about 2 -> minutes." -> -> RECOMMENDATION: Choose A — an independent second opinion catches structural blind -> spots. Two different AI models agreeing on a plan is stronger signal than one model's -> thorough review. Completeness: A=9/10, B=7/10. +When the mode is `ready`, `not_installed`, or `not_authed`, print one line so the off-switch +stays discoverable: "Running the outside voice automatically (standard step). Disable: `gstack-config set codex_reviews disabled`." -Options: -- A) Get the outside voice (recommended) -- B) Skip — proceed to outputs - -**If B:** Print "Skipping outside voice." and continue to the next section. - -**If A:** Construct the plan review prompt. Read the plan file being reviewed (the file -the user pointed this review at, or the branch diff scope). If a CEO plan document -was written in Step 0D-POST, read that too — it contains the scope decisions and vision. +**Construct the plan review prompt** (for `ready`, `not_installed`, and `not_authed` — skip only on `disabled`). +Read the plan file being reviewed (the file the user pointed this review at, or the branch +diff scope). If a CEO plan document was written in Step 0D-POST, read that too — it contains +the scope decisions and vision. Construct this prompt (substitute the actual plan content — if plan content exceeds 30KB, truncate to the first 30KB and note "Plan truncated for size"). **Always start with the @@ -378,7 +386,7 @@ compliments. Just the problems. THE PLAN: " -**If CODEX_AVAILABLE:** +**If `CODEX_MODE: ready` — run Codex:** ```bash TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX) @@ -401,15 +409,15 @@ CODEX SAYS (plan review — outside voice): ``` **Error handling:** All errors are non-blocking — the outside voice is informational. -- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." -- Timeout: "Codex timed out after 5 minutes." -- Empty response: "Codex returned no response." +- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." Fall back to the Claude subagent below. +- Timeout: "Codex timed out after 5 minutes." Fall back to the Claude subagent below. +- Empty response: "Codex returned no response." Fall back to the Claude subagent below. -On any Codex error, fall back to the Claude adversarial subagent. - -**If CODEX_NOT_AVAILABLE (or Codex errored):** +**If `CODEX_MODE: not_installed` or `not_authed` (or Codex errored at runtime):** Dispatch via the Agent tool. The subagent has fresh context — genuine independence. +Bound it the same way as Codex: cap the dispatch at a 5-minute timeout so "never blocking" +is also "never hanging." Subagent prompt: same plan review prompt as above. @@ -417,6 +425,8 @@ Present findings under an `OUTSIDE VOICE (Claude subagent):` header. If the subagent fails or times out: "Outside voice unavailable. Continuing to outputs." +(On `CODEX_MODE: disabled` you already skipped this section per the preflight — do not reach here.) + **Cross-model tension:** After presenting the outside voice findings, note any points where the outside voice diff --git a/review/SKILL.md b/review/SKILL.md index e7a2fa4f2..16e90389f 100644 --- a/review/SKILL.md +++ b/review/SKILL.md @@ -1602,23 +1602,47 @@ If no documentation files exist, skip this step silently. Every diff gets adversarial review from both Claude and Codex. LOC is not a proxy for risk — a 5-line auth change can be critical. -**Detect diff size and tool availability:** +**Detect diff size:** ```bash DIFF_BASE=$(git merge-base origin/ HEAD) DIFF_INS=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0") DIFF_DEL=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || echo "0") DIFF_TOTAL=$((DIFF_INS + DIFF_DEL)) -command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE" -# Legacy opt-out — only gates Codex passes, Claude always runs -OLD_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || true) echo "DIFF_SIZE: $DIFF_TOTAL" -echo "OLD_CFG: ${OLD_CFG:-not_set}" ``` -If `OLD_CFG` is `disabled`: skip Codex passes only. Claude adversarial subagent still runs (it's free and fast). Jump to the "Claude adversarial subagent" section. +**Detect the Codex master switch + tool availability:** -**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size. +```bash +# Codex preflight: one block (functions sourced here don't persist to later blocks). +_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + _CODEX_MODE="disabled" +elif ! command -v codex >/dev/null 2>&1; then + _CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then + _CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + _CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $_CODEX_MODE" +``` + +Branch on the echoed `CODEX_MODE`: +- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip the Codex passes only; the Claude adversarial subagent below STILL runs (it is free and fast). Print: "Codex passes skipped (codex_reviews disabled) — running Claude adversarial only." +- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path. +- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path. +- **`ready`** — run the Codex pass below. + +For this diff-review path, `CODEX_MODE: disabled` means skip the Codex passes ONLY — the +Claude adversarial subagent below still runs (it's free and fast). `ready` runs the Codex +passes; `not_installed` / `not_authed` skip them with the printed note and continue with +Claude only. + +**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size (still requires `CODEX_MODE: ready`). --- @@ -1639,9 +1663,9 @@ If the subagent fails or times out: "Claude adversarial subagent unavailable. Co --- -### Codex adversarial challenge (always runs when available) +### Codex adversarial challenge (runs whenever `CODEX_MODE: ready`) -If Codex is available AND `OLD_CFG` is NOT `disabled`: +If `CODEX_MODE` is `ready`: ```bash TMPERR_ADV=$(mktemp /tmp/codex-adv-XXXXXXXX) @@ -1663,13 +1687,13 @@ Present the full output verbatim. This is informational — it never blocks ship **Cleanup:** Run `rm -f "$TMPERR_ADV"` after processing. -If Codex is NOT available: "Codex CLI not found — running Claude adversarial only. Install Codex for cross-model coverage: `npm install -g @openai/codex`" +If `CODEX_MODE` is `not_installed` / `not_authed` / `disabled`: the preflight already printed the reason; run Claude adversarial only. --- ### Codex structured review (large diffs only, 200+ lines) -If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`: +If `DIFF_TOTAL >= 200` AND `CODEX_MODE` is `ready`: ```bash TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX) diff --git a/ship/sections/adversarial.md b/ship/sections/adversarial.md index bbc1eb80d..c7b2321d1 100644 --- a/ship/sections/adversarial.md +++ b/ship/sections/adversarial.md @@ -4,23 +4,47 @@ Every diff gets adversarial review from both Claude and Codex. LOC is not a proxy for risk — a 5-line auth change can be critical. -**Detect diff size and tool availability:** +**Detect diff size:** ```bash DIFF_BASE=$(git merge-base origin/ HEAD) DIFF_INS=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0") DIFF_DEL=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || echo "0") DIFF_TOTAL=$((DIFF_INS + DIFF_DEL)) -command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE" -# Legacy opt-out — only gates Codex passes, Claude always runs -OLD_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || true) echo "DIFF_SIZE: $DIFF_TOTAL" -echo "OLD_CFG: ${OLD_CFG:-not_set}" ``` -If `OLD_CFG` is `disabled`: skip Codex passes only. Claude adversarial subagent still runs (it's free and fast). Jump to the "Claude adversarial subagent" section. +**Detect the Codex master switch + tool availability:** -**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size. +```bash +# Codex preflight: one block (functions sourced here don't persist to later blocks). +_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + _CODEX_MODE="disabled" +elif ! command -v codex >/dev/null 2>&1; then + _CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then + _CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + _CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $_CODEX_MODE" +``` + +Branch on the echoed `CODEX_MODE`: +- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip the Codex passes only; the Claude adversarial subagent below STILL runs (it is free and fast). Print: "Codex passes skipped (codex_reviews disabled) — running Claude adversarial only." +- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path. +- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path. +- **`ready`** — run the Codex pass below. + +For this diff-review path, `CODEX_MODE: disabled` means skip the Codex passes ONLY — the +Claude adversarial subagent below still runs (it's free and fast). `ready` runs the Codex +passes; `not_installed` / `not_authed` skip them with the printed note and continue with +Claude only. + +**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size (still requires `CODEX_MODE: ready`). --- @@ -41,9 +65,9 @@ If the subagent fails or times out: "Claude adversarial subagent unavailable. Co --- -### Codex adversarial challenge (always runs when available) +### Codex adversarial challenge (runs whenever `CODEX_MODE: ready`) -If Codex is available AND `OLD_CFG` is NOT `disabled`: +If `CODEX_MODE` is `ready`: ```bash TMPERR_ADV=$(mktemp /tmp/codex-adv-XXXXXXXX) @@ -65,13 +89,13 @@ Present the full output verbatim. This is informational — it never blocks ship **Cleanup:** Run `rm -f "$TMPERR_ADV"` after processing. -If Codex is NOT available: "Codex CLI not found — running Claude adversarial only. Install Codex for cross-model coverage: `npm install -g @openai/codex`" +If `CODEX_MODE` is `not_installed` / `not_authed` / `disabled`: the preflight already printed the reason; run Claude adversarial only. --- ### Codex structured review (large diffs only, 200+ lines) -If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`: +If `DIFF_TOTAL >= 200` AND `CODEX_MODE` is `ready`: ```bash TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX) diff --git a/test/fixtures/golden/factory-ship-SKILL.md b/test/fixtures/golden/factory-ship-SKILL.md index f5f48abaf..dd8d0e01d 100644 --- a/test/fixtures/golden/factory-ship-SKILL.md +++ b/test/fixtures/golden/factory-ship-SKILL.md @@ -2332,23 +2332,47 @@ For each comment in `comments`: Every diff gets adversarial review from both Claude and Codex. LOC is not a proxy for risk — a 5-line auth change can be critical. -**Detect diff size and tool availability:** +**Detect diff size:** ```bash DIFF_BASE=$(git merge-base origin/ HEAD) DIFF_INS=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0") DIFF_DEL=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || echo "0") DIFF_TOTAL=$((DIFF_INS + DIFF_DEL)) -command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE" -# Legacy opt-out — only gates Codex passes, Claude always runs -OLD_CFG=$($GSTACK_ROOT/bin/gstack-config get codex_reviews 2>/dev/null || true) echo "DIFF_SIZE: $DIFF_TOTAL" -echo "OLD_CFG: ${OLD_CFG:-not_set}" ``` -If `OLD_CFG` is `disabled`: skip Codex passes only. Claude adversarial subagent still runs (it's free and fast). Jump to the "Claude adversarial subagent" section. +**Detect the Codex master switch + tool availability:** -**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size. +```bash +# Codex preflight: one block (functions sourced here don't persist to later blocks). +_TEL=$($GSTACK_ROOT/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$($GSTACK_ROOT/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source $GSTACK_ROOT/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + _CODEX_MODE="disabled" +elif ! command -v codex >/dev/null 2>&1; then + _CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then + _CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + _CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $_CODEX_MODE" +``` + +Branch on the echoed `CODEX_MODE`: +- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip the Codex passes only; the Claude adversarial subagent below STILL runs (it is free and fast). Print: "Codex passes skipped (codex_reviews disabled) — running Claude adversarial only." +- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path. +- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path. +- **`ready`** — run the Codex pass below. + +For this diff-review path, `CODEX_MODE: disabled` means skip the Codex passes ONLY — the +Claude adversarial subagent below still runs (it's free and fast). `ready` runs the Codex +passes; `not_installed` / `not_authed` skip them with the printed note and continue with +Claude only. + +**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size (still requires `CODEX_MODE: ready`). --- @@ -2369,9 +2393,9 @@ If the subagent fails or times out: "Claude adversarial subagent unavailable. Co --- -### Codex adversarial challenge (always runs when available) +### Codex adversarial challenge (runs whenever `CODEX_MODE: ready`) -If Codex is available AND `OLD_CFG` is NOT `disabled`: +If `CODEX_MODE` is `ready`: ```bash TMPERR_ADV=$(mktemp /tmp/codex-adv-XXXXXXXX) @@ -2393,13 +2417,13 @@ Present the full output verbatim. This is informational — it never blocks ship **Cleanup:** Run `rm -f "$TMPERR_ADV"` after processing. -If Codex is NOT available: "Codex CLI not found — running Claude adversarial only. Install Codex for cross-model coverage: `npm install -g @openai/codex`" +If `CODEX_MODE` is `not_installed` / `not_authed` / `disabled`: the preflight already printed the reason; run Claude adversarial only. --- ### Codex structured review (large diffs only, 200+ lines) -If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`: +If `DIFF_TOTAL >= 200` AND `CODEX_MODE` is `ready`: ```bash TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX)