mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-02 03:35:09 +02:00
security: harden migration + context-save after adversarial review
Adversarial review (Claude + Codex, both high confidence) identified 6
critical production-harm findings in the /ship pre-landing pass.
All folded in.
Migration v1.0.1.0.sh hardening:
- Add explicit `[ -z "${HOME:-}" ]` guard. HOME="" survives set -u and
expands paths to /.claude/skills/... which could hit absolute paths
under root/containers/sudo-without-H.
- Add python3 fallback inside resolve_real() (was missing; broken
symlinks silently defeated ownership check).
- Ownership-guard Shape 2 (~/.claude/skills/gstack/checkpoint/). Was
unconditional rm -rf. Now: if symlink, check target resolves inside
gstack; if regular dir, check realpath resolves inside gstack. A
user's hand-edited customization or a symlink pointing outside gstack
is preserved with a notice.
- Use `rm --` and `rm -r --` consistently to resist hostile basenames.
- Use `find -type f -not -name .DS_Store -not -name ._*` instead of
`ls -A | grep`. macOS sidecars no longer mask a legit prefix-mode
install. Strip sidecars explicitly before removing the dir.
context-save/SKILL.md.tmpl:
- Sanitize title in bash, not LLM prose. Allowlist [a-z0-9.-], cap 60
chars, default to "untitled". Closes a prompt-injection surface where
`/context-save $(rm -rf ~)` could propagate into subsequent commands.
- Collision-safe filename. If ${TIMESTAMP}-${SLUG}.md already exists
(same-second double-save with same title), append a 4-char random
suffix. The skill contract says "saved files are append-only" — this
enforces it. Silent overwrite was a data-loss bug.
context-restore/SKILL.md.tmpl:
- Cap `find ... | sort -r` at 20 entries via `| head -20`. A user with
10k+ saved files no longer blows the context window just to pick one.
/context-save list still handles the full-history listing path.
test/skill-e2e-autoplan-dual-voice.test.ts:
- Filter transcript to tool_use / tool_result / assistant entries
before matching, so prompt-text mentions of "plan-ceo-review" don't
force the reachedPhase1 assertion to pass. Phase-1 assertion now
requires completion markers ("Phase 1 complete", "Phase 2 started"),
not mere name occurrence.
- claudeVoiceFired now requires JSON evidence of an Agent tool_use
(name:"Agent" or subagent_type field), not the literal string
"Agent(" which could appear anywhere.
- codexVoiceFired now requires a Bash tool_use with a `codex exec/review`
command string, not prompt-text mentions.
All SKILL.md files regenerated. Golden fixtures updated. bun test: 0
failures across 80+ targeted tests and the full suite.
Review source: /ship Step 11 adversarial pass (claude subagent + codex
exec). Same findings independently surfaced by both reviewers — this is
cross-model high confidence.
This commit is contained in:
@@ -764,7 +764,9 @@ else
|
|||||||
# copies/rsync). Filesystem mtime drifts and is not authoritative.
|
# copies/rsync). Filesystem mtime drifts and is not authoritative.
|
||||||
# 2. On macOS, `find ... | xargs ls -1t` with zero results falls back to
|
# 2. On macOS, `find ... | xargs ls -1t` with zero results falls back to
|
||||||
# listing cwd. `sort -r` on empty input cleanly returns nothing.
|
# listing cwd. `sort -r` on empty input cleanly returns nothing.
|
||||||
FILES=$(find "$CHECKPOINT_DIR" -maxdepth 1 -name "*.md" -type f 2>/dev/null | sort -r)
|
# Cap at 20 most recent: a user with 10k saved files shouldn't blow the
|
||||||
|
# context window just listing them. /context-save list handles pagination.
|
||||||
|
FILES=$(find "$CHECKPOINT_DIR" -maxdepth 1 -name "*.md" -type f 2>/dev/null | sort -r | head -20)
|
||||||
if [ -z "$FILES" ]; then
|
if [ -z "$FILES" ]; then
|
||||||
echo "NO_CHECKPOINTS"
|
echo "NO_CHECKPOINTS"
|
||||||
else
|
else
|
||||||
|
|||||||
@@ -71,7 +71,9 @@ else
|
|||||||
# copies/rsync). Filesystem mtime drifts and is not authoritative.
|
# copies/rsync). Filesystem mtime drifts and is not authoritative.
|
||||||
# 2. On macOS, `find ... | xargs ls -1t` with zero results falls back to
|
# 2. On macOS, `find ... | xargs ls -1t` with zero results falls back to
|
||||||
# listing cwd. `sort -r` on empty input cleanly returns nothing.
|
# listing cwd. `sort -r` on empty input cleanly returns nothing.
|
||||||
FILES=$(find "$CHECKPOINT_DIR" -maxdepth 1 -name "*.md" -type f 2>/dev/null | sort -r)
|
# Cap at 20 most recent: a user with 10k saved files shouldn't blow the
|
||||||
|
# context window just listing them. /context-save list handles pagination.
|
||||||
|
FILES=$(find "$CHECKPOINT_DIR" -maxdepth 1 -name "*.md" -type f 2>/dev/null | sort -r | head -20)
|
||||||
if [ -z "$FILES" ]; then
|
if [ -z "$FILES" ]; then
|
||||||
echo "NO_CHECKPOINTS"
|
echo "NO_CHECKPOINTS"
|
||||||
else
|
else
|
||||||
|
|||||||
+21
-3
@@ -805,21 +805,39 @@ saved file.
|
|||||||
|
|
||||||
### Step 4: Write saved-context file
|
### Step 4: Write saved-context file
|
||||||
|
|
||||||
|
Compute the path in bash (NOT in the LLM prompt) so user-supplied titles can't
|
||||||
|
inject shell metacharacters into any subsequent command. The sanitizer is an
|
||||||
|
allowlist: only `a-z 0-9 - .` survive.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
|
eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
|
||||||
CHECKPOINT_DIR="$HOME/.gstack/projects/$SLUG/checkpoints"
|
CHECKPOINT_DIR="$HOME/.gstack/projects/$SLUG/checkpoints"
|
||||||
mkdir -p "$CHECKPOINT_DIR"
|
mkdir -p "$CHECKPOINT_DIR"
|
||||||
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
|
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
|
||||||
|
# Bash-side title sanitize. Pass the raw title as $1 when running this block.
|
||||||
|
# Example: TITLE_RAW="wintermute progress" bash -c '...'
|
||||||
|
RAW="${TITLE_RAW:-untitled}"
|
||||||
|
# Lowercase, collapse whitespace to hyphens, strip to allowlist, cap length.
|
||||||
|
TITLE_SLUG=$(printf '%s' "$RAW" | tr '[:upper:]' '[:lower:]' | tr -s ' \t' '-' | tr -cd 'a-z0-9.-' | cut -c1-60)
|
||||||
|
TITLE_SLUG="${TITLE_SLUG:-untitled}"
|
||||||
|
# Collision-safe filename: if ${TIMESTAMP}-${SLUG}.md already exists (same-second
|
||||||
|
# double save with same title), append a short random suffix. Filenames are
|
||||||
|
# append-only — never overwrite.
|
||||||
|
FILE="${CHECKPOINT_DIR}/${TIMESTAMP}-${TITLE_SLUG}.md"
|
||||||
|
if [ -e "$FILE" ]; then
|
||||||
|
SUFFIX=$(LC_ALL=C tr -dc 'a-z0-9' < /dev/urandom 2>/dev/null | head -c 4 || printf '%04x' "$$")
|
||||||
|
FILE="${CHECKPOINT_DIR}/${TIMESTAMP}-${TITLE_SLUG}-${SUFFIX}.md"
|
||||||
|
fi
|
||||||
echo "CHECKPOINT_DIR=$CHECKPOINT_DIR"
|
echo "CHECKPOINT_DIR=$CHECKPOINT_DIR"
|
||||||
echo "TIMESTAMP=$TIMESTAMP"
|
echo "TIMESTAMP=$TIMESTAMP"
|
||||||
|
echo "FILE=$FILE"
|
||||||
```
|
```
|
||||||
|
|
||||||
The on-disk directory name is `checkpoints/` (not `contexts/`) — this is a legacy
|
The on-disk directory name is `checkpoints/` (not `contexts/`) — this is a legacy
|
||||||
path kept so existing saved files remain loadable. Users never see it.
|
path kept so existing saved files remain loadable. Users never see it.
|
||||||
|
|
||||||
Write the file to `{CHECKPOINT_DIR}/{TIMESTAMP}-{title-slug}.md` where
|
Write the file to the `$FILE` path printed above (use the exact string — do not
|
||||||
`title-slug` is the title in kebab-case (lowercase, spaces replaced with hyphens,
|
reconstruct it in the LLM layer).
|
||||||
special characters removed).
|
|
||||||
|
|
||||||
The file format:
|
The file format:
|
||||||
|
|
||||||
|
|||||||
@@ -112,21 +112,39 @@ saved file.
|
|||||||
|
|
||||||
### Step 4: Write saved-context file
|
### Step 4: Write saved-context file
|
||||||
|
|
||||||
|
Compute the path in bash (NOT in the LLM prompt) so user-supplied titles can't
|
||||||
|
inject shell metacharacters into any subsequent command. The sanitizer is an
|
||||||
|
allowlist: only `a-z 0-9 - .` survive.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
{{SLUG_SETUP}}
|
{{SLUG_SETUP}}
|
||||||
CHECKPOINT_DIR="$HOME/.gstack/projects/$SLUG/checkpoints"
|
CHECKPOINT_DIR="$HOME/.gstack/projects/$SLUG/checkpoints"
|
||||||
mkdir -p "$CHECKPOINT_DIR"
|
mkdir -p "$CHECKPOINT_DIR"
|
||||||
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
|
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
|
||||||
|
# Bash-side title sanitize. Pass the raw title as $1 when running this block.
|
||||||
|
# Example: TITLE_RAW="wintermute progress" bash -c '...'
|
||||||
|
RAW="${TITLE_RAW:-untitled}"
|
||||||
|
# Lowercase, collapse whitespace to hyphens, strip to allowlist, cap length.
|
||||||
|
TITLE_SLUG=$(printf '%s' "$RAW" | tr '[:upper:]' '[:lower:]' | tr -s ' \t' '-' | tr -cd 'a-z0-9.-' | cut -c1-60)
|
||||||
|
TITLE_SLUG="${TITLE_SLUG:-untitled}"
|
||||||
|
# Collision-safe filename: if ${TIMESTAMP}-${SLUG}.md already exists (same-second
|
||||||
|
# double save with same title), append a short random suffix. Filenames are
|
||||||
|
# append-only — never overwrite.
|
||||||
|
FILE="${CHECKPOINT_DIR}/${TIMESTAMP}-${TITLE_SLUG}.md"
|
||||||
|
if [ -e "$FILE" ]; then
|
||||||
|
SUFFIX=$(LC_ALL=C tr -dc 'a-z0-9' < /dev/urandom 2>/dev/null | head -c 4 || printf '%04x' "$$")
|
||||||
|
FILE="${CHECKPOINT_DIR}/${TIMESTAMP}-${TITLE_SLUG}-${SUFFIX}.md"
|
||||||
|
fi
|
||||||
echo "CHECKPOINT_DIR=$CHECKPOINT_DIR"
|
echo "CHECKPOINT_DIR=$CHECKPOINT_DIR"
|
||||||
echo "TIMESTAMP=$TIMESTAMP"
|
echo "TIMESTAMP=$TIMESTAMP"
|
||||||
|
echo "FILE=$FILE"
|
||||||
```
|
```
|
||||||
|
|
||||||
The on-disk directory name is `checkpoints/` (not `contexts/`) — this is a legacy
|
The on-disk directory name is `checkpoints/` (not `contexts/`) — this is a legacy
|
||||||
path kept so existing saved files remain loadable. Users never see it.
|
path kept so existing saved files remain loadable. Users never see it.
|
||||||
|
|
||||||
Write the file to `{CHECKPOINT_DIR}/{TIMESTAMP}-{title-slug}.md` where
|
Write the file to the `$FILE` path printed above (use the exact string — do not
|
||||||
`title-slug` is the title in kebab-case (lowercase, spaces replaced with hyphens,
|
reconstruct it in the LLM layer).
|
||||||
special characters removed).
|
|
||||||
|
|
||||||
The file format:
|
The file format:
|
||||||
|
|
||||||
|
|||||||
@@ -20,33 +20,42 @@
|
|||||||
# Idempotent: missing paths are no-ops.
|
# Idempotent: missing paths are no-ops.
|
||||||
set -euo pipefail
|
set -euo pipefail
|
||||||
|
|
||||||
|
# Guard: refuse to run if HOME is unset or empty. With `set -u`, unset HOME
|
||||||
|
# errors out, but HOME="" (possible under sudo-without-H, systemd units, some
|
||||||
|
# CI runners) survives and produces dangerous absolute paths like
|
||||||
|
# "/.claude/skills/...". Abort cleanly.
|
||||||
|
if [ -z "${HOME:-}" ]; then
|
||||||
|
echo " [v1.0.1.0] HOME is unset or empty — skipping migration." >&2
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
|
|
||||||
SKILLS_DIR="${HOME}/.claude/skills"
|
SKILLS_DIR="${HOME}/.claude/skills"
|
||||||
OLD_TOPLEVEL="${SKILLS_DIR}/checkpoint"
|
OLD_TOPLEVEL="${SKILLS_DIR}/checkpoint"
|
||||||
OLD_NAMESPACED="${SKILLS_DIR}/gstack/checkpoint"
|
OLD_NAMESPACED="${SKILLS_DIR}/gstack/checkpoint"
|
||||||
GSTACK_ROOT_REAL=""
|
GSTACK_ROOT_REAL=""
|
||||||
|
|
||||||
|
# Helper: canonical-path a target (symlink-safe). Prints the resolved path, or
|
||||||
|
# empty on failure (broken symlink, ENOENT, ELOOP). Both realpath AND the python3
|
||||||
|
# fallback are tried — a single tool failure shouldn't defeat the ownership
|
||||||
|
# check. Returns empty string if both fail.
|
||||||
|
resolve_real() {
|
||||||
|
local target="$1"
|
||||||
|
local out=""
|
||||||
|
if command -v realpath >/dev/null 2>&1; then
|
||||||
|
out=$(realpath "$target" 2>/dev/null || true)
|
||||||
|
fi
|
||||||
|
if [ -z "$out" ] && command -v python3 >/dev/null 2>&1; then
|
||||||
|
out=$(python3 -c 'import os,sys;print(os.path.realpath(sys.argv[1]))' "$target" 2>/dev/null || true)
|
||||||
|
fi
|
||||||
|
printf '%s' "$out"
|
||||||
|
}
|
||||||
|
|
||||||
# Resolve the canonical path of the gstack skills root. If gstack isn't
|
# Resolve the canonical path of the gstack skills root. If gstack isn't
|
||||||
# installed here, there's nothing to migrate.
|
# installed here, there's nothing to migrate.
|
||||||
if [ -d "${SKILLS_DIR}/gstack" ]; then
|
if [ -d "${SKILLS_DIR}/gstack" ]; then
|
||||||
# Portable realpath: macOS BSD `readlink` lacks -f. Fall back to python3.
|
GSTACK_ROOT_REAL=$(resolve_real "${SKILLS_DIR}/gstack")
|
||||||
if command -v realpath >/dev/null 2>&1; then
|
|
||||||
GSTACK_ROOT_REAL=$(realpath "${SKILLS_DIR}/gstack" 2>/dev/null || true)
|
|
||||||
fi
|
|
||||||
if [ -z "$GSTACK_ROOT_REAL" ]; then
|
|
||||||
GSTACK_ROOT_REAL=$(python3 -c 'import os,sys;print(os.path.realpath(sys.argv[1]))' "${SKILLS_DIR}/gstack" 2>/dev/null || true)
|
|
||||||
fi
|
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Helper: canonical-path a target (symlink-safe). Prints the resolved path.
|
|
||||||
resolve_real() {
|
|
||||||
local target="$1"
|
|
||||||
if command -v realpath >/dev/null 2>&1; then
|
|
||||||
realpath "$target" 2>/dev/null || true
|
|
||||||
return
|
|
||||||
fi
|
|
||||||
python3 -c 'import os,sys;print(os.path.realpath(sys.argv[1]))' "$target" 2>/dev/null || true
|
|
||||||
}
|
|
||||||
|
|
||||||
# Helper: does $1 (canonical path) live inside $2 (canonical path)?
|
# Helper: does $1 (canonical path) live inside $2 (canonical path)?
|
||||||
path_inside() {
|
path_inside() {
|
||||||
local inner="$1"
|
local inner="$1"
|
||||||
@@ -65,20 +74,24 @@ if [ -L "$OLD_TOPLEVEL" ]; then
|
|||||||
# Directory symlink (or file symlink). Canonicalize and check ownership.
|
# Directory symlink (or file symlink). Canonicalize and check ownership.
|
||||||
target_real=$(resolve_real "$OLD_TOPLEVEL")
|
target_real=$(resolve_real "$OLD_TOPLEVEL")
|
||||||
if [ -n "$GSTACK_ROOT_REAL" ] && path_inside "$target_real" "$GSTACK_ROOT_REAL"; then
|
if [ -n "$GSTACK_ROOT_REAL" ] && path_inside "$target_real" "$GSTACK_ROOT_REAL"; then
|
||||||
rm "$OLD_TOPLEVEL"
|
rm -- "$OLD_TOPLEVEL"
|
||||||
echo " [v1.0.1.0] Removed stale /checkpoint symlink (was shadowing Claude Code's /rewind alias)."
|
echo " [v1.0.1.0] Removed stale /checkpoint symlink (was shadowing Claude Code's /rewind alias)."
|
||||||
removed_any=1
|
removed_any=1
|
||||||
else
|
else
|
||||||
echo " [v1.0.1.0] Leaving $OLD_TOPLEVEL alone — symlink target is outside gstack."
|
echo " [v1.0.1.0] Leaving $OLD_TOPLEVEL alone — symlink target is outside gstack (or unresolvable)."
|
||||||
fi
|
fi
|
||||||
elif [ -d "$OLD_TOPLEVEL" ]; then
|
elif [ -d "$OLD_TOPLEVEL" ]; then
|
||||||
# Regular directory. Only remove if it contains exactly one file named
|
# Regular directory. Only remove if it contains exactly one file named
|
||||||
# SKILL.md that's a symlink into gstack (gstack's prefix-install shape).
|
# SKILL.md that's a symlink into gstack (gstack's prefix-install shape).
|
||||||
entries=$(ls -A "$OLD_TOPLEVEL" 2>/dev/null)
|
# Use find to count real files, ignoring .DS_Store (macOS sidecars).
|
||||||
if [ "$entries" = "SKILL.md" ] && [ -L "$OLD_TOPLEVEL/SKILL.md" ]; then
|
file_count=$(find "$OLD_TOPLEVEL" -maxdepth 1 -type f -not -name '.DS_Store' -not -name '._*' 2>/dev/null | wc -l | tr -d ' ')
|
||||||
|
symlink_count=$(find "$OLD_TOPLEVEL" -maxdepth 1 -type l 2>/dev/null | wc -l | tr -d ' ')
|
||||||
|
if [ "$file_count" = "0" ] && [ "$symlink_count" = "1" ] && [ -L "$OLD_TOPLEVEL/SKILL.md" ]; then
|
||||||
target_real=$(resolve_real "$OLD_TOPLEVEL/SKILL.md")
|
target_real=$(resolve_real "$OLD_TOPLEVEL/SKILL.md")
|
||||||
if [ -n "$GSTACK_ROOT_REAL" ] && path_inside "$target_real" "$GSTACK_ROOT_REAL"; then
|
if [ -n "$GSTACK_ROOT_REAL" ] && path_inside "$target_real" "$GSTACK_ROOT_REAL"; then
|
||||||
rm -r "$OLD_TOPLEVEL"
|
# Strip macOS sidecars first (not user content), then remove the dir.
|
||||||
|
find "$OLD_TOPLEVEL" -maxdepth 1 \( -name '.DS_Store' -o -name '._*' \) -type f -delete 2>/dev/null || true
|
||||||
|
rm -r -- "$OLD_TOPLEVEL"
|
||||||
echo " [v1.0.1.0] Removed stale /checkpoint install directory (gstack prefix-mode)."
|
echo " [v1.0.1.0] Removed stale /checkpoint install directory (gstack prefix-mode)."
|
||||||
removed_any=1
|
removed_any=1
|
||||||
else
|
else
|
||||||
@@ -90,11 +103,31 @@ elif [ -d "$OLD_TOPLEVEL" ]; then
|
|||||||
fi
|
fi
|
||||||
# Missing → no-op (idempotency).
|
# Missing → no-op (idempotency).
|
||||||
|
|
||||||
# --- Shape 2: ~/.claude/skills/gstack/checkpoint/ (gstack owns this dir unconditionally)
|
# --- Shape 2: ~/.claude/skills/gstack/checkpoint/
|
||||||
if [ -d "$OLD_NAMESPACED" ] || [ -L "$OLD_NAMESPACED" ]; then
|
# Ownership guard applies here too: only remove if this path resolves inside the
|
||||||
rm -rf "$OLD_NAMESPACED"
|
# gstack skills root. If a user replaced the directory with a symlink pointing
|
||||||
|
# elsewhere (e.g., at their own fork), respect it.
|
||||||
|
if [ -L "$OLD_NAMESPACED" ]; then
|
||||||
|
target_real=$(resolve_real "$OLD_NAMESPACED")
|
||||||
|
if [ -n "$GSTACK_ROOT_REAL" ] && path_inside "$target_real" "$GSTACK_ROOT_REAL"; then
|
||||||
|
rm -- "$OLD_NAMESPACED"
|
||||||
|
echo " [v1.0.1.0] Removed stale ~/.claude/skills/gstack/checkpoint symlink."
|
||||||
|
removed_any=1
|
||||||
|
else
|
||||||
|
echo " [v1.0.1.0] Leaving $OLD_NAMESPACED alone — symlink target is outside gstack."
|
||||||
|
fi
|
||||||
|
elif [ -d "$OLD_NAMESPACED" ]; then
|
||||||
|
# Regular directory. This is the gstack-prefix install location. Check that
|
||||||
|
# it resolves to a path inside the gstack root (it should, unless someone
|
||||||
|
# hand-edited the tree).
|
||||||
|
target_real=$(resolve_real "$OLD_NAMESPACED")
|
||||||
|
if [ -n "$GSTACK_ROOT_REAL" ] && path_inside "$target_real" "$GSTACK_ROOT_REAL"; then
|
||||||
|
rm -rf -- "$OLD_NAMESPACED"
|
||||||
echo " [v1.0.1.0] Removed stale ~/.claude/skills/gstack/checkpoint/ (replaced by context-save + context-restore)."
|
echo " [v1.0.1.0] Removed stale ~/.claude/skills/gstack/checkpoint/ (replaced by context-save + context-restore)."
|
||||||
removed_any=1
|
removed_any=1
|
||||||
|
else
|
||||||
|
echo " [v1.0.1.0] Leaving $OLD_NAMESPACED alone — resolves outside gstack."
|
||||||
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if [ "$removed_any" = "1" ]; then
|
if [ "$removed_any" = "1" ]; then
|
||||||
|
|||||||
@@ -85,20 +85,37 @@ Add a new /greet skill that prints a welcome message.
|
|||||||
// Accept EITHER outcome as success:
|
// Accept EITHER outcome as success:
|
||||||
// (a) Both voices produced output (ideal case)
|
// (a) Both voices produced output (ideal case)
|
||||||
// (b) Codex unavailable + Claude voice produced output (graceful degrade)
|
// (b) Codex unavailable + Claude voice produced output (graceful degrade)
|
||||||
// Session runner returns `output` (final assistant message). Search
|
// Search ONLY the tool-call structure — NOT the prompt string that went in.
|
||||||
// transcript tool-call inputs/outputs as a broader net for voice fingerprints.
|
// Matching against full transcript is risky because the prompt itself
|
||||||
const transcriptText = JSON.stringify(result.transcript || []);
|
// contains "plan-ceo-review" and other marker strings that would produce
|
||||||
const out = (result.output ?? '') + '\n' + transcriptText;
|
// false positives regardless of skill behavior. Filter to tool_result
|
||||||
const claudeVoiceFired = /Claude\s+(CEO|subagent)|claude-subagent|Agent\s*\(|subagent_type/i.test(out);
|
// content + assistant messages emitted DURING execution.
|
||||||
const codexVoiceFired = /codex\s+(exec|review)|CODEX SAYS|\[via:codex\]|codex-plan-review/i.test(out);
|
const transcript = Array.isArray(result.transcript) ? result.transcript : [];
|
||||||
const codexUnavailable = /\[codex-unavailable\]|AUTH_FAILED|CODEX_NOT_AVAILABLE|codex_cli_missing|Codex CLI not found/i.test(out);
|
const executionContent = transcript
|
||||||
|
.filter((entry: any) => entry && (entry.type === 'tool_use' || entry.type === 'tool_result' || entry.role === 'assistant'))
|
||||||
|
.map((entry: any) => JSON.stringify(entry))
|
||||||
|
.join('\n');
|
||||||
|
const out = (result.output ?? '') + '\n' + executionContent;
|
||||||
|
|
||||||
|
// Claude voice: require evidence of a dispatched Agent subagent, not
|
||||||
|
// merely the literal string "Agent(" (which could appear in any text).
|
||||||
|
// Task/Agent tool_use entries have name:"Agent" or subagent_type:"..."
|
||||||
|
const claudeVoiceFired = /"name":\s*"Agent"|"subagent_type":\s*"[^"]/.test(out) ||
|
||||||
|
/Claude\s+(CEO|subagent)\s+(review|complete|finished)|claude-subagent\s/i.test(out);
|
||||||
|
// Codex voice: require evidence of codex CLI invocation (command string in
|
||||||
|
// a Bash tool_use), not prompt-text mentions.
|
||||||
|
const codexVoiceFired = /"command":\s*"[^"]*codex\s+(exec|review)/.test(out) ||
|
||||||
|
/CODEX SAYS\s*\(/i.test(out);
|
||||||
|
// Unavailable markers: explicit probe-failure strings emitted by the skill.
|
||||||
|
const codexUnavailable = /\[codex-unavailable\]|AUTH_FAILED\b|CODEX_NOT_AVAILABLE\b|codex_cli_missing|Codex CLI not found/i.test(out);
|
||||||
|
|
||||||
expect(claudeVoiceFired).toBe(true);
|
expect(claudeVoiceFired).toBe(true);
|
||||||
expect(codexVoiceFired || codexUnavailable).toBe(true);
|
expect(codexVoiceFired || codexUnavailable).toBe(true);
|
||||||
|
|
||||||
// Hang protection: if the skill reached Phase 1 at all, our hardening worked.
|
// Hang protection: require phase completion evidence, not name mentions.
|
||||||
// If it didn't, this is a regression from the pre-wave stdin-deadlock era.
|
// "Phase 1 complete" or a phase-transition marker, not "plan-ceo-review"
|
||||||
const reachedPhase1 = /Phase 1|CEO\s+Review|Strategy\s*&\s*Scope|plan-ceo-review/i.test(out);
|
// as a bare string (which appears in the prompt itself).
|
||||||
|
const reachedPhase1 = /Phase\s+1\s+(complete|done|finished)|CEO\s+Review\s+(complete|done|approved)|Strategy\s*&\s*Scope\s+(complete|done)|Phase\s+2\s+(started|begin)/i.test(out);
|
||||||
expect(reachedPhase1).toBe(true);
|
expect(reachedPhase1).toBe(true);
|
||||||
|
|
||||||
logCost('autoplan-dual-voice', result);
|
logCost('autoplan-dual-voice', result);
|
||||||
|
|||||||
Reference in New Issue
Block a user