Hardens /codex and /autoplan against silent failures surfaced by the #972
stdin fix and #1003 Apple Silicon codesign. Six-layer defense:
1. **Multi-signal auth probe** (new Step 0.5 / Phase 0.5): env-based auth
($CODEX_API_KEY, $OPENAI_API_KEY) OR file-based auth
(${CODEX_HOME:-~/.codex}/auth.json). Rejects false negatives that the
old file-only check produced for CI / platform-engineer users.
2. **Timeout wrapper** around every codex exec / codex review invocation:
gtimeout → timeout → unwrapped fallback chain. On exit 124, surfaces
common causes + actionable next step. Guards against model-API stalls
not covered by the #972 stdin fix.
3. **Stderr capture in Challenge mode** (codex/SKILL.md.tmpl:208):
2>/dev/null → 2>$TMPERR. Post-invocation grep for auth/login/unauthorized
surfaces errors that were previously dropped silently.
4. **Completeness check** in the Python JSON parser: tracks turn.completed
events and warns on zero (possible mid-stream disconnect).
5. **Version warning** for known-bad Codex CLI (0.120.0-0.120.2, the range
that introduced the stdin deadlock #972 fixes). Anchored regex
`(^|[^0-9.])0\.120\.(0|1|2)([^0-9.]|$)` prevents 0.120.10 / 0.120.20
false positives.
6. **Failure telemetry + operational learnings**: codex_timeout,
codex_auth_failed, codex_cli_missing, codex_version_warning events
land in ~/.gstack/analytics/skill-usage.jsonl behind the existing
telemetry opt-in. On timeout (exit 124), auto-logs an operational
learning via gstack-learnings-log so future /investigate sessions
surface prior hang patterns automatically.
**Shared helper** (bin/gstack-codex-probe): consolidates all four pieces
(auth probe, version check, timeout wrapper, telemetry logger) into one
bash file that /codex and /autoplan source. Namespace-prefixed
(_gstack_codex_*) with a unit test that verifies sourcing does not leak
shell options into the caller. pathRewrites in host configs rewrite
~/.claude/skills/gstack → $GSTACK_ROOT for Codex, $GSTACK_BIN for
Factory/Cursor/etc.
**Apple Silicon coreutils auto-install** (setup:264): macOS lacks GNU
timeout by default; Homebrew's coreutils installs it as gtimeout to
avoid shadowing BSD utilities. ./setup now auto-installs coreutils on
Darwin (arch-agnostic — applies to Intel + Apple Silicon) when neither
gtimeout nor timeout is present. Opt-out via GSTACK_SKIP_COREUTILS=1
for CI, managed machines, or offline envs.
**25 deterministic unit tests** (test/codex-hardening.test.ts):
- 8 auth probe combinations (env precedence, whitespace, alternate
$CODEX_HOME, corrupt file paths)
- 10 version regex cases including 0.120.10 false-positive guards
and v-prefixed / multiline output
- 4 timeout wrapper + namespace hygiene (bash -n, gtimeout
preference, set-option leak check)
- 3 telemetry payload schema checks (confirms env values + auth
tokens never leak into emitted events)
**1 periodic-tier E2E** (test/skill-e2e-autoplan-dual-voice.test.ts):
gates the /autoplan dual-voice path — asserts both Claude subagent
and Codex voices produce output in Phase 1, OR that [codex-unavailable]
is logged when Codex is absent. ~\$1/run, not a CI gate.
Golden baseline + gen-skill-docs exclusion list updated for the new
codex path references and the 16 < /dev/null redirects from #972.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>