fix: autoplan dual-voice — sequential foreground execution instead of broken parallel

Background subagents don't inherit tool permissions in Claude Code, so the
Claude subagent in dual-voice mode was silently failing on every invocation.
Every autoplan run was degrading to single-reviewer mode without warning.

Change all three phases (CEO, Design, Eng) from "simultaneously" to
sequential foreground execution: Claude subagent first (Agent tool,
foreground), then Codex (Bash). Both complete before the consensus table.

Fixes #497

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-03-28 08:11:39 -07:00
parent e4d7c86870
commit 3b62fec470
+12 -10
View File
@@ -207,7 +207,9 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
Duplicates → reject (P4). Borderline (3-5 files) → mark TASTE DECISION.
- All 10 review sections: run fully, auto-decide each issue, log every decision.
- Dual voices: always run BOTH Claude subagent AND Codex if available (P6).
Run them simultaneously (Agent tool for subagent, Bash for Codex).
Run them sequentially in foreground. First the Claude subagent (Agent tool,
foreground — do NOT use run_in_background), then Codex (Bash). Both must
complete before building the consensus table.
**Codex CEO voice** (via Bash):
```bash
@@ -234,7 +236,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
5. What's the competitive risk — could someone else solve this first/better?
For each finding: what's wrong, severity (critical/high/medium), and the fix."
**Error handling:** All non-blocking. Codex auth/timeout/empty → proceed with
**Error handling:** Both calls block in foreground. Codex auth/timeout/empty → proceed with
Claude subagent only, tagged `[single-model]`. If Claude subagent also fails →
"Outside voices unavailable — continuing with primary review."
@@ -255,10 +257,10 @@ Step 0 (0A-0F) — run each sub-step and produce:
- 0E: Temporal interrogation (HOUR 1 → HOUR 6+)
- 0F: Mode selection confirmation
Step 0.5 (Dual Voices): Run Claude subagent AND Codex simultaneously. Present
Codex output under CODEX SAYS (CEO — strategy challenge) header. Present subagent
output under CLAUDE SUBAGENT (CEO — strategic independence) header. Produce CEO
consensus table:
Step 0.5 (Dual Voices): Run Claude subagent (foreground Agent tool) first, then
Codex (Bash). Present Codex output under CODEX SAYS (CEO — strategy challenge)
header. Present subagent output under CLAUDE SUBAGENT (CEO — strategic independence)
header. Produce CEO consensus table:
```
CEO DUAL VOICES — CONSENSUS TABLE:
@@ -351,7 +353,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
For each finding: what's wrong, severity (critical/high/medium), and the fix."
NO prior-phase context — subagent must be truly independent.
Error handling: same as Phase 1 (non-blocking, degradation matrix applies).
Error handling: same as Phase 1 (both foreground/blocking, degradation matrix applies).
- Design choices: if codex disagrees with a design decision with valid UX reasoning
→ TASTE DECISION.
@@ -360,7 +362,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
1. Step 0 (Design Scope): Rate completeness 0-10. Check DESIGN.md. Map existing patterns.
2. Step 0.5 (Dual Voices): Run Claude subagent AND Codex simultaneously. Present under
2. Step 0.5 (Dual Voices): Run Claude subagent (foreground) first, then Codex. Present under
CODEX SAYS (design — UX challenge) and CLAUDE SUBAGENT (design — independent review)
headers. Produce design litmus scorecard (consensus table). Use the litmus scorecard
format from plan-design-review. Include CEO phase findings in Codex prompt ONLY
@@ -421,7 +423,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
For each finding: what's wrong, severity, and the fix."
NO prior-phase context — subagent must be truly independent.
Error handling: same as Phase 1 (non-blocking, degradation matrix applies).
Error handling: same as Phase 1 (both foreground/blocking, degradation matrix applies).
- Architecture choices: explicit over clever (P5). If codex disagrees with valid reason → TASTE DECISION.
- Evals: always include all relevant suites (P1)
@@ -433,7 +435,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
1. Step 0 (Scope Challenge): Read actual code referenced by the plan. Map each
sub-problem to existing code. Run the complexity check. Produce concrete findings.
2. Step 0.5 (Dual Voices): Run Claude subagent AND Codex simultaneously. Present
2. Step 0.5 (Dual Voices): Run Claude subagent (foreground) first, then Codex. Present
Codex output under CODEX SAYS (eng — architecture challenge) header. Present subagent
output under CLAUDE SUBAGENT (eng — independent review) header. Produce eng consensus
table: