mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-07 05:56:41 +02:00
feat: codex/autoplan hardening + Apple Silicon coreutils auto-install
Hardens /codex and /autoplan against silent failures surfaced by the #972 stdin fix and #1003 Apple Silicon codesign.

Six-layer defense:

1. **Multi-signal auth probe** (new Step 0.5 / Phase 0.5): env-based auth ($CODEX_API_KEY, $OPENAI_API_KEY) OR file-based auth (${CODEX_HOME:-~/.codex}/auth.json). Rejects false negatives that the old file-only check produced for CI / platform-engineer users.

2. **Timeout wrapper** around every codex exec / codex review invocation: gtimeout → timeout → unwrapped fallback chain. On exit 124, surfaces common causes + actionable next step. Guards against model-API stalls not covered by the #972 stdin fix.

3. **Stderr capture in Challenge mode** (codex/SKILL.md.tmpl:208): 2>/dev/null → 2>$TMPERR. Post-invocation grep for auth/login/unauthorized surfaces errors that were previously dropped silently.

4. **Completeness check** in the Python JSON parser: tracks turn.completed events and warns on zero (possible mid-stream disconnect).

5. **Version warning** for known-bad Codex CLI (0.120.0-0.120.2, the range that introduced the stdin deadlock #972 fixes). Anchored regex `(^|[^0-9.])0\.120\.(0|1|2)([^0-9.]|$)` prevents 0.120.10 / 0.120.20 false positives.

6. **Failure telemetry + operational learnings**: codex_timeout, codex_auth_failed, codex_cli_missing, codex_version_warning events land in ~/.gstack/analytics/skill-usage.jsonl behind the existing telemetry opt-in. On timeout (exit 124), auto-logs an operational learning via gstack-learnings-log so future /investigate sessions surface prior hang patterns automatically.

**Shared helper** (bin/gstack-codex-probe): consolidates all four pieces (auth probe, version check, timeout wrapper, telemetry logger) into one bash file that /codex and /autoplan source. Namespace-prefixed (_gstack_codex_*) with a unit test that verifies sourcing does not leak shell options into the caller. pathRewrites in host configs rewrite ~/.claude/skills/gstack → $GSTACK_ROOT for Codex, $GSTACK_BIN for Factory/Cursor/etc.

**Apple Silicon coreutils auto-install** (setup:264): macOS lacks GNU timeout by default; Homebrew's coreutils installs it as gtimeout to avoid shadowing BSD utilities. ./setup now auto-installs coreutils on Darwin (arch-agnostic: applies to Intel + Apple Silicon) when neither gtimeout nor timeout is present. Opt-out via GSTACK_SKIP_COREUTILS=1 for CI, managed machines, or offline envs.

**25 deterministic unit tests** (test/codex-hardening.test.ts):
- 8 auth probe combinations (env precedence, whitespace, alternate $CODEX_HOME, corrupt file paths)
- 10 version regex cases including 0.120.10 false-positive guards and v-prefixed / multiline output
- 4 timeout wrapper + namespace hygiene (bash -n, gtimeout preference, set-option leak check)
- 3 telemetry payload schema checks (confirms env values + auth tokens never leak into emitted events)

**1 periodic-tier E2E** (test/skill-e2e-autoplan-dual-voice.test.ts): gates the /autoplan dual-voice path: asserts both Claude subagent and Codex voices produce output in Phase 1, OR that [codex-unavailable] is logged when Codex is absent. ~$1/run, not a CI gate.

Golden baseline + gen-skill-docs exclusion list updated for the new codex path references and the 16 < /dev/null redirects from #972.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
+91
-8
@@ -630,6 +630,45 @@ CODEX_BIN=$(which codex 2>/dev/null || echo "")
|
||||
If `NOT_FOUND`: stop and tell the user:
|
||||
"Codex CLI not found. Install it: `npm install -g @openai/codex` or see https://github.com/openai/codex"
|
||||
|
||||
If `NOT_FOUND`, also log the event:
|
||||
```bash
|
||||
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
|
||||
source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null && _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 0.5: Auth probe + version check
|
||||
|
||||
Before building expensive prompts, verify Codex has valid auth AND the installed
|
||||
CLI version isn't in the known-bad list. Sourcing `gstack-codex-probe` loads the
|
||||
shared helpers that both `/codex` and `/autoplan` use.
|
||||
|
||||
```bash
|
||||
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
|
||||
source ~/.claude/skills/gstack/bin/gstack-codex-probe
|
||||
|
||||
if ! _gstack_codex_auth_probe >/dev/null; then
|
||||
_gstack_codex_log_event "codex_auth_failed"
|
||||
echo "AUTH_FAILED"
|
||||
fi
|
||||
_gstack_codex_version_check # warns if known-bad, non-blocking
|
||||
```
|
||||
|
||||
If the output contains `AUTH_FAILED`, stop and tell the user:
|
||||
"No Codex authentication found. Run `codex login` or set `$CODEX_API_KEY` / `$OPENAI_API_KEY`, then re-run this skill."
|
||||
|
||||
If the version check printed a `WARN:` line, pass it through to the user verbatim
|
||||
(non-blocking — Codex may still work, but the user should upgrade).
|
||||
|
||||
The probe's multi-signal auth logic accepts any of: `$CODEX_API_KEY` set, `$OPENAI_API_KEY`
set, or `${CODEX_HOME:-~/.codex}/auth.json` exists. This avoids false negatives for
env-auth users (CI, platform engineers) whom a file-only check would reject.
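
The multi-signal check described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `bin/gstack-codex-probe`: the function name `demo_auth_probe` and the `env` / `file` output strings are hypothetical, and the real `_gstack_codex_auth_probe` may treat edge cases (whitespace-only values, corrupt files) differently.

```bash
# Hypothetical sketch: succeed if any one auth signal is present.
demo_auth_probe() {
  # Signal 1 and 2: env-based auth, either key set to a non-empty value.
  if [ -n "${CODEX_API_KEY:-}" ] || [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "env"
    return 0
  fi
  # Signal 3: file-based auth. $HOME is used instead of ~ because a tilde
  # inside a double-quoted expansion is not expanded by the shell.
  if [ -f "${CODEX_HOME:-$HOME/.codex}/auth.json" ]; then
    echo "file"
    return 0
  fi
  # No signal found: the caller would report AUTH_FAILED.
  return 1
}
```

Checking the environment first means a CI job that only exports a key never touches the filesystem, which matches the env-auth case the paragraph above calls out.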
|
||||
|
||||
**Update the known-bad list** in `bin/gstack-codex-probe` when a new Codex CLI version
|
||||
regresses. Current entries (`0.120.0`, `0.120.1`, `0.120.2`) trace to the stdin
|
||||
deadlock fixed in #972.
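
For reference, the commit message gives the anchored regex used for the known-bad check. A minimal sketch of how such a check could be applied is shown here; the function name `demo_is_known_bad_version` and the sample version string are assumptions for illustration, and the real `_gstack_codex_version_check` may be structured differently.

```bash
# Hypothetical sketch: return 0 if the text names a known-bad Codex CLI release.
demo_is_known_bad_version() {
  # The leading and trailing groups reject neighbouring digits and dots,
  # so 0.120.10 or 10.120.1 do not count as 0.120.1.
  printf '%s\n' "$1" | grep -Eq '(^|[^0-9.])0\.120\.(0|1|2)([^0-9.]|$)'
}

# Sample input; the exact format the CLI prints is an assumption here.
demo_is_known_bad_version "codex-cli 0.120.1" && echo "WARN: known-bad Codex CLI version"
```

Adding a new entry to the list would mean extending the alternation (or adding a second pattern) while keeping both anchors in place.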
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Detect mode
|
||||
@@ -692,7 +731,15 @@ instructions, append them after the boundary separated by a newline:
|
||||
```bash
|
||||
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
|
||||
cd "$_REPO_ROOT"
|
||||
codex review "IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. Do NOT modify agents/openai.yaml. Stay focused on repository code only." --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
|
||||
# Fix 1: wrap with timeout. 330s (5.5min) is slightly longer than the Bash 300s
|
||||
# so the shell wrapper only fires if Bash's own timeout doesn't.
|
||||
_gstack_codex_timeout_wrapper 330 codex review "IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. Do NOT modify agents/openai.yaml. Stay focused on repository code only." --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR"
|
||||
_CODEX_EXIT=$?
|
||||
if [ "$_CODEX_EXIT" = "124" ]; then
|
||||
_gstack_codex_log_event "codex_timeout" "330"
|
||||
_gstack_codex_log_hang "review" "$(wc -c < "$TMPERR" 2>/dev/null || echo 0)"
|
||||
echo "Codex stalled past 5.5 minutes. Common causes: model API stall, long prompt, network issue. Try re-running. If persistent, split the prompt or check ~/.codex/logs/."
|
||||
fi
|
||||
```
|
||||
|
||||
If the user passed `--xhigh`, use `"xhigh"` instead of `"high"`.
|
||||
@@ -704,7 +751,7 @@ _REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo"
|
||||
cd "$_REPO_ROOT"
|
||||
codex review "IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. Do NOT modify agents/openai.yaml. Stay focused on repository code only.
|
||||
|
||||
focus on security" --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
|
||||
focus on security" --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR"
|
||||
```
|
||||
|
||||
3. Capture the output. Then parse cost from stderr:
|
||||
@@ -856,8 +903,12 @@ If the user passed `--xhigh`, use `"xhigh"` instead of `"high"`.
|
||||
|
||||
```bash
|
||||
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
|
||||
codex exec "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached --json 2>/dev/null | PYTHONUNBUFFERED=1 python3 -u -c "
|
||||
# Fix 1+2: wrap with timeout (gtimeout/timeout fallback chain via probe helper),
|
||||
# capture stderr to $TMPERR for auth error detection (was: 2>/dev/null).
|
||||
TMPERR=${TMPERR:-$(mktemp /tmp/codex-err-XXXXXX.txt)}
|
||||
_gstack_codex_timeout_wrapper 600 codex exec "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached --json < /dev/null 2>"$TMPERR" | PYTHONUNBUFFERED=1 python3 -u -c "
|
||||
import sys, json
|
||||
turn_completed_count = 0
|
||||
for line in sys.stdin:
|
||||
line = line.strip()
|
||||
if not line: continue
|
||||
@@ -877,11 +928,27 @@ for line in sys.stdin:
|
||||
cmd = item.get('command','')
|
||||
if cmd: print(f'[codex ran] {cmd}', flush=True)
|
||||
elif t == 'turn.completed':
|
||||
turn_completed_count += 1
|
||||
usage = obj.get('usage',{})
|
||||
tokens = usage.get('input_tokens',0) + usage.get('output_tokens',0)
|
||||
if tokens: print(f'\ntokens used: {tokens}', flush=True)
|
||||
except: pass
|
||||
# Fix 2: completeness check — warn if no turn.completed received
|
||||
if turn_completed_count == 0:
|
||||
print('[codex warning] No turn.completed event received — possible mid-stream disconnect.', flush=True, file=sys.stderr)
|
||||
"
|
||||
_CODEX_EXIT=${PIPESTATUS[0]}
|
||||
# Fix 1: hang detection — log + surface actionable message
|
||||
if [ "$_CODEX_EXIT" = "124" ]; then
|
||||
_gstack_codex_log_event "codex_timeout" "600"
|
||||
_gstack_codex_log_hang "challenge" "$(wc -c < "$TMPERR" 2>/dev/null || echo 0)"
|
||||
echo "Codex stalled past 10 minutes. Common causes: model API stall, long prompt, network issue. Try re-running. If persistent, split the prompt or check ~/.codex/logs/."
|
||||
fi
|
||||
# Fix 2: surface auth errors from captured stderr instead of dropping them
|
||||
if grep -qiE "auth|login|unauthorized" "$TMPERR" 2>/dev/null; then
|
||||
echo "[codex auth error] $(head -1 "$TMPERR")"
|
||||
_gstack_codex_log_event "codex_auth_failed"
|
||||
fi
|
||||
```
|
||||
|
||||
This parses codex's JSONL events to extract reasoning traces, tool calls, and the final
|
||||
@@ -968,7 +1035,8 @@ If the user passed `--xhigh`, use `"xhigh"` instead of `"medium"`.
|
||||
For a **new session:**
|
||||
```bash
|
||||
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
|
||||
codex exec "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="medium"' --enable web_search_cached --json 2>"$TMPERR" | PYTHONUNBUFFERED=1 python3 -u -c "
|
||||
# Fix 1: wrap with timeout (gtimeout/timeout fallback chain via probe helper)
|
||||
_gstack_codex_timeout_wrapper 600 codex exec "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="medium"' --enable web_search_cached --json < /dev/null 2>"$TMPERR" | PYTHONUNBUFFERED=1 python3 -u -c "
|
||||
import sys, json
|
||||
for line in sys.stdin:
|
||||
line = line.strip()
|
||||
@@ -997,15 +1065,29 @@ for line in sys.stdin:
|
||||
if tokens: print(f'\ntokens used: {tokens}', flush=True)
|
||||
except: pass
|
||||
"
|
||||
# Fix 1: hang detection for Consult new-session (mirrors Challenge + resume)
|
||||
_CODEX_EXIT=${PIPESTATUS[0]}
|
||||
if [ "$_CODEX_EXIT" = "124" ]; then
|
||||
_gstack_codex_log_event "codex_timeout" "600"
|
||||
_gstack_codex_log_hang "consult" "$(wc -c < "$TMPERR" 2>/dev/null || echo 0)"
|
||||
echo "Codex stalled past 10 minutes. Common causes: model API stall, long prompt, network issue. Try re-running. If persistent, split the prompt or check ~/.codex/logs/."
|
||||
fi
|
||||
```
|
||||
|
||||
For a **resumed session** (user chose "Continue"):
|
||||
```bash
|
||||
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
|
||||
codex exec resume <session-id> "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="medium"' --enable web_search_cached --json 2>"$TMPERR" | PYTHONUNBUFFERED=1 python3 -u -c "
|
||||
# Fix 1: wrap with timeout (gtimeout/timeout fallback chain via probe helper)
|
||||
_gstack_codex_timeout_wrapper 600 codex exec resume <session-id> "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="medium"' --enable web_search_cached --json < /dev/null 2>"$TMPERR" | PYTHONUNBUFFERED=1 python3 -u -c "
|
||||
<same python streaming parser as above, with flush=True on all print() calls>
|
||||
"
|
||||
# Fix 1: same hang detection pattern as new-session block
_CODEX_EXIT=${PIPESTATUS[0]}
if [ "$_CODEX_EXIT" = "124" ]; then
  _gstack_codex_log_event "codex_timeout" "600"
  _gstack_codex_log_hang "consult-resume" "$(wc -c < "$TMPERR" 2>/dev/null || echo 0)"
  echo "Codex stalled past 10 minutes. Common causes: model API stall, long prompt, network issue. Try re-running. If persistent, split the prompt or check ~/.codex/logs/."
fi
```
|
||||
|
||||
5. Capture session ID from the streamed output. The parser prints `SESSION_ID:<id>`
|
||||
from the `thread.started` event. Save it for follow-ups:
|
||||
@@ -1070,8 +1152,9 @@ If token count is not available, display: `Tokens: unknown`
|
||||
- **Binary not found:** Detected in Step 0. Stop with install instructions.
|
||||
- **Auth error:** Codex prints an auth error to stderr. Surface the error:
|
||||
"Codex authentication failed. Run `codex login` in your terminal to authenticate via ChatGPT."
|
||||
- **Timeout:** If the Bash call times out (5 min), tell the user:
|
||||
"Codex timed out after 5 minutes. The diff may be too large or the API may be slow. Try again or use a smaller scope."
|
||||
- **Timeout (Bash outer gate):** If the Bash call times out (5 min for Review/Challenge, 10 min for Consult), tell the user:
|
||||
"Codex timed out. The prompt may be too large or the API may be slow. Try again or use a smaller scope."
|
||||
- **Timeout (inner `timeout` wrapper, exit 124):** If the shell timeout wrapper fires first (330s for Review, 600s for Challenge and Consult), the skill's hang-detection block auto-logs a telemetry event and an operational learning, then prints "Codex stalled past 5.5 minutes." (Review) or "Codex stalled past 10 minutes." (Challenge and Consult), followed by "Common causes: model API stall, long prompt, network issue. Try re-running. If persistent, split the prompt or check `~/.codex/logs/`." No extra action needed.
|
||||
- **Empty response:** If `$TMPRESP` is empty or doesn't exist, tell the user:
|
||||
"Codex returned no response. Check stderr for errors."
|
||||
- **Session resume failure:** If resume fails, delete the session file and start fresh.
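
The inner wrapper mentioned in the timeout bullets above follows the gtimeout → timeout → unwrapped fallback chain named in the commit message. A minimal sketch of that chain, assuming nothing about the real `_gstack_codex_timeout_wrapper` beyond its `<seconds> <command...>` calling convention (the name `demo_timeout_wrapper` is hypothetical):

```bash
# Hypothetical sketch of the gtimeout -> timeout -> unwrapped fallback chain.
# Usage: demo_timeout_wrapper <seconds> <command> [args...]
demo_timeout_wrapper() {
  _demo_secs="$1"
  shift
  if command -v gtimeout >/dev/null 2>&1; then
    gtimeout "$_demo_secs" "$@"   # Homebrew coreutils name on macOS
  elif command -v timeout >/dev/null 2>&1; then
    timeout "$_demo_secs" "$@"    # GNU coreutils name on most Linux systems
  else
    "$@"                          # no timeout binary available: run unwrapped
  fi
}

# The wrapped command's own exit status passes through when it finishes in time.
demo_timeout_wrapper 5 true && echo "completed within the limit"
```

GNU `timeout` exits with status 124 when the limit expires, which is the value the hang-detection blocks compare against. In the unwrapped branch there is no limit at all, so only the outer Bash gate applies.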
|
||||
|
||||
+90
-7
@@ -49,6 +49,45 @@ CODEX_BIN=$(which codex 2>/dev/null || echo "")
|
||||
If `NOT_FOUND`: stop and tell the user:
|
||||
"Codex CLI not found. Install it: `npm install -g @openai/codex` or see https://github.com/openai/codex"
|
||||
|
||||
If `NOT_FOUND`, also log the event:
|
||||
```bash
|
||||
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
|
||||
source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null && _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 0.5: Auth probe + version check
|
||||
|
||||
Before building expensive prompts, verify Codex has valid auth AND the installed
|
||||
CLI version isn't in the known-bad list. Sourcing `gstack-codex-probe` loads the
|
||||
shared helpers that both `/codex` and `/autoplan` use.
|
||||
|
||||
```bash
|
||||
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
|
||||
source ~/.claude/skills/gstack/bin/gstack-codex-probe
|
||||
|
||||
if ! _gstack_codex_auth_probe >/dev/null; then
|
||||
_gstack_codex_log_event "codex_auth_failed"
|
||||
echo "AUTH_FAILED"
|
||||
fi
|
||||
_gstack_codex_version_check # warns if known-bad, non-blocking
|
||||
```
|
||||
|
||||
If the output contains `AUTH_FAILED`, stop and tell the user:
|
||||
"No Codex authentication found. Run `codex login` or set `$CODEX_API_KEY` / `$OPENAI_API_KEY`, then re-run this skill."
|
||||
|
||||
If the version check printed a `WARN:` line, pass it through to the user verbatim
|
||||
(non-blocking — Codex may still work, but the user should upgrade).
|
||||
|
||||
The probe's multi-signal auth logic accepts any of: `$CODEX_API_KEY` set, `$OPENAI_API_KEY`
set, or `${CODEX_HOME:-~/.codex}/auth.json` exists. This avoids false negatives for
env-auth users (CI, platform engineers) whom a file-only check would reject.
|
||||
|
||||
**Update the known-bad list** in `bin/gstack-codex-probe` when a new Codex CLI version
|
||||
regresses. Current entries (`0.120.0`, `0.120.1`, `0.120.2`) trace to the stdin
|
||||
deadlock fixed in #972.
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Detect mode
|
||||
@@ -111,7 +150,15 @@ instructions, append them after the boundary separated by a newline:
|
||||
```bash
|
||||
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
|
||||
cd "$_REPO_ROOT"
|
||||
codex review "IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. Do NOT modify agents/openai.yaml. Stay focused on repository code only." --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR"
|
||||
# Fix 1: wrap with timeout. 330s (5.5min) is slightly longer than the Bash 300s
|
||||
# so the shell wrapper only fires if Bash's own timeout doesn't.
|
||||
_gstack_codex_timeout_wrapper 330 codex review "IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. Do NOT modify agents/openai.yaml. Stay focused on repository code only." --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR"
|
||||
_CODEX_EXIT=$?
|
||||
if [ "$_CODEX_EXIT" = "124" ]; then
|
||||
_gstack_codex_log_event "codex_timeout" "330"
|
||||
_gstack_codex_log_hang "review" "$(wc -c < "$TMPERR" 2>/dev/null || echo 0)"
|
||||
echo "Codex stalled past 5.5 minutes. Common causes: model API stall, long prompt, network issue. Try re-running. If persistent, split the prompt or check ~/.codex/logs/."
|
||||
fi
|
||||
```
|
||||
|
||||
If the user passed `--xhigh`, use `"xhigh"` instead of `"high"`.
|
||||
@@ -205,8 +252,12 @@ If the user passed `--xhigh`, use `"xhigh"` instead of `"high"`.
|
||||
|
||||
```bash
|
||||
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
|
||||
codex exec "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached --json < /dev/null 2>/dev/null | PYTHONUNBUFFERED=1 python3 -u -c "
|
||||
# Fix 1+2: wrap with timeout (gtimeout/timeout fallback chain via probe helper),
|
||||
# capture stderr to $TMPERR for auth error detection (was: 2>/dev/null).
|
||||
TMPERR=${TMPERR:-$(mktemp /tmp/codex-err-XXXXXX.txt)}
|
||||
_gstack_codex_timeout_wrapper 600 codex exec "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached --json < /dev/null 2>"$TMPERR" | PYTHONUNBUFFERED=1 python3 -u -c "
|
||||
import sys, json
|
||||
turn_completed_count = 0
|
||||
for line in sys.stdin:
|
||||
line = line.strip()
|
||||
if not line: continue
|
||||
@@ -226,11 +277,27 @@ for line in sys.stdin:
|
||||
cmd = item.get('command','')
|
||||
if cmd: print(f'[codex ran] {cmd}', flush=True)
|
||||
elif t == 'turn.completed':
|
||||
turn_completed_count += 1
|
||||
usage = obj.get('usage',{})
|
||||
tokens = usage.get('input_tokens',0) + usage.get('output_tokens',0)
|
||||
if tokens: print(f'\ntokens used: {tokens}', flush=True)
|
||||
except: pass
|
||||
# Fix 2: completeness check — warn if no turn.completed received
|
||||
if turn_completed_count == 0:
|
||||
print('[codex warning] No turn.completed event received — possible mid-stream disconnect.', flush=True, file=sys.stderr)
|
||||
"
|
||||
_CODEX_EXIT=${PIPESTATUS[0]}
|
||||
# Fix 1: hang detection — log + surface actionable message
|
||||
if [ "$_CODEX_EXIT" = "124" ]; then
|
||||
_gstack_codex_log_event "codex_timeout" "600"
|
||||
_gstack_codex_log_hang "challenge" "$(wc -c < "$TMPERR" 2>/dev/null || echo 0)"
|
||||
echo "Codex stalled past 10 minutes. Common causes: model API stall, long prompt, network issue. Try re-running. If persistent, split the prompt or check ~/.codex/logs/."
|
||||
fi
|
||||
# Fix 2: surface auth errors from captured stderr instead of dropping them
|
||||
if grep -qiE "auth|login|unauthorized" "$TMPERR" 2>/dev/null; then
|
||||
echo "[codex auth error] $(head -1 "$TMPERR")"
|
||||
_gstack_codex_log_event "codex_auth_failed"
|
||||
fi
|
||||
```
|
||||
|
||||
This parses codex's JSONL events to extract reasoning traces, tool calls, and the final
|
||||
@@ -317,7 +384,8 @@ If the user passed `--xhigh`, use `"xhigh"` instead of `"medium"`.
|
||||
For a **new session:**
|
||||
```bash
|
||||
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
|
||||
codex exec "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="medium"' --enable web_search_cached --json < /dev/null 2>"$TMPERR" | PYTHONUNBUFFERED=1 python3 -u -c "
|
||||
# Fix 1: wrap with timeout (gtimeout/timeout fallback chain via probe helper)
|
||||
_gstack_codex_timeout_wrapper 600 codex exec "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="medium"' --enable web_search_cached --json < /dev/null 2>"$TMPERR" | PYTHONUNBUFFERED=1 python3 -u -c "
|
||||
import sys, json
|
||||
for line in sys.stdin:
|
||||
line = line.strip()
|
||||
@@ -346,15 +414,29 @@ for line in sys.stdin:
|
||||
if tokens: print(f'\ntokens used: {tokens}', flush=True)
|
||||
except: pass
|
||||
"
|
||||
# Fix 1: hang detection for Consult new-session (mirrors Challenge + resume)
|
||||
_CODEX_EXIT=${PIPESTATUS[0]}
|
||||
if [ "$_CODEX_EXIT" = "124" ]; then
|
||||
_gstack_codex_log_event "codex_timeout" "600"
|
||||
_gstack_codex_log_hang "consult" "$(wc -c < "$TMPERR" 2>/dev/null || echo 0)"
|
||||
echo "Codex stalled past 10 minutes. Common causes: model API stall, long prompt, network issue. Try re-running. If persistent, split the prompt or check ~/.codex/logs/."
|
||||
fi
|
||||
```
|
||||
|
||||
For a **resumed session** (user chose "Continue"):
|
||||
```bash
|
||||
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
|
||||
codex exec resume <session-id> "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="medium"' --enable web_search_cached --json < /dev/null 2>"$TMPERR" | PYTHONUNBUFFERED=1 python3 -u -c "
|
||||
# Fix 1: wrap with timeout (gtimeout/timeout fallback chain via probe helper)
|
||||
_gstack_codex_timeout_wrapper 600 codex exec resume <session-id> "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="medium"' --enable web_search_cached --json < /dev/null 2>"$TMPERR" | PYTHONUNBUFFERED=1 python3 -u -c "
|
||||
<same python streaming parser as above, with flush=True on all print() calls>
|
||||
"
|
||||
# Fix 1: same hang detection pattern as new-session block
_CODEX_EXIT=${PIPESTATUS[0]}
if [ "$_CODEX_EXIT" = "124" ]; then
  _gstack_codex_log_event "codex_timeout" "600"
  _gstack_codex_log_hang "consult-resume" "$(wc -c < "$TMPERR" 2>/dev/null || echo 0)"
  echo "Codex stalled past 10 minutes. Common causes: model API stall, long prompt, network issue. Try re-running. If persistent, split the prompt or check ~/.codex/logs/."
fi
```
|
||||
|
||||
5. Capture session ID from the streamed output. The parser prints `SESSION_ID:<id>`
|
||||
from the `thread.started` event. Save it for follow-ups:
|
||||
@@ -419,8 +501,9 @@ If token count is not available, display: `Tokens: unknown`
|
||||
- **Binary not found:** Detected in Step 0. Stop with install instructions.
|
||||
- **Auth error:** Codex prints an auth error to stderr. Surface the error:
|
||||
"Codex authentication failed. Run `codex login` in your terminal to authenticate via ChatGPT."
|
||||
- **Timeout:** If the Bash call times out (5 min), tell the user:
|
||||
"Codex timed out after 5 minutes. The diff may be too large or the API may be slow. Try again or use a smaller scope."
|
||||
- **Timeout (Bash outer gate):** If the Bash call times out (5 min for Review/Challenge, 10 min for Consult), tell the user:
|
||||
"Codex timed out. The prompt may be too large or the API may be slow. Try again or use a smaller scope."
|
||||
- **Timeout (inner `timeout` wrapper, exit 124):** If the shell timeout wrapper fires first (330s for Review, 600s for Challenge and Consult), the skill's hang-detection block auto-logs a telemetry event and an operational learning, then prints "Codex stalled past 5.5 minutes." (Review) or "Codex stalled past 10 minutes." (Challenge and Consult), followed by "Common causes: model API stall, long prompt, network issue. Try re-running. If persistent, split the prompt or check `~/.codex/logs/`." No extra action needed.
|
||||
- **Empty response:** If `$TMPRESP` is empty or doesn't exist, tell the user:
|
||||
"Codex returned no response. Check stderr for errors."
|
||||
- **Session resume failure:** If resume fails, delete the session file and start fresh.
|
||||
|
||||