Files
gstack/bin/gstack-telemetry-log
T
Garry Tan b343ba2797 fix: community PRs + security hardening + E2E stability (v0.12.7.0) (#552)
* fix(security): skip hidden directories in skill template discovery

discoverTemplates() scans subdirectories for SKILL.md.tmpl files but
only skips node_modules, .git, and dist. Hidden directories like
.claude/, .agents/, and .codex/ (which contain symlinked skill
installs) were being scanned, allowing a malicious .tmpl in a
symlinked skill to inject into the generation pipeline.

Fix: add !d.name.startsWith('.') to the subdirs() filter. This skips
all dot-prefixed directories, matching the standard convention that
hidden dirs are not source code.

* fix(security): sanitize telemetry JSONL inputs against injection

SKILL, OUTCOME, SESSION_ID, SOURCE, and EVENT_TYPE values go directly
into printf %s for JSONL output. If any contain double quotes,
backslashes, or newlines, the JSON breaks — or worse, injects
arbitrary fields.

Fix: strip quotes, backslashes, and control characters from all
string fields before JSONL construction via json_safe() helper.

* fix(security): validate JSON input in gstack-review-log

gstack-review-log appends its argument directly to a JSONL file with
no validation. Malformed or crafted input could corrupt the review log
or inject arbitrary content.

Fix: validate input is parseable JSON via python3 before appending.
Reject with exit 1 and stderr message if invalid.

* fix: treat relative dot-paths as file paths in screenshot command

Closes #495

* fix: use host-specific co-author trailer in /ship and /document-release

Codex-generated skills hardcoded a Claude co-author trailer in commit
messages. Users running gstack under Codex pushed commits attributed
to the wrong AI assistant.

Add {{CO_AUTHOR_TRAILER}} resolver that emits the correct trailer
based on ctx.host:
  - claude: Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  - codex:  Co-Authored-By: OpenAI Codex <noreply@openai.com>

Replace hardcoded trailers in ship/SKILL.md.tmpl and
document-release/SKILL.md.tmpl with the resolver placeholder.

Fixes #282. Fixes #383.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: auto-upgrade marker no longer masks newer remote versions

When a just-upgraded-from marker persists across sessions, the update
check would write UP_TO_DATE to cache and exit immediately — never
fetching the remote VERSION. Users silently miss updates that landed
after their last upgrade.

Remove the early exit and premature cache write so the script falls
through to the remote check after consuming the marker. This ensures
JUST_UPGRADED is still emitted for the preamble, while also detecting
any newer versions available upstream.

Fixes #515

* fix: decouple doc generation from binary compilation in build script

The build script chains gen:skill-docs and bun build --compile with &&,
so a doc generation failure (e.g. missing Codex host config, template
error) prevents the browse binary from being compiled. Users end up
with a broken install where setup reports the binary is missing.

Replace && with ; for the two gen:skill-docs steps so they run
independently of the compilation chain. Doc generation errors are still
visible in stderr, but no longer block binary compilation.

Fixes #482

* fix: extend security sanitization + add 10 tests for merged community PRs

- Extend json_safe() to ERROR_CLASS and FAILED_STEP fields
- Improve ERROR_MESSAGE escaping to handle backslashes and newlines
- Replace python3 with bun for JSON validation in gstack-review-log
- Add 7 telemetry injection prevention tests
- Add 2 review-log JSON validation tests
- Add 1 discover-skills hidden directory filtering test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: stabilize flaky E2E tests (browse-basic, ship-base-branch, dashboard-via)

browse-basic: bump maxTurns 5→7 (agent reads PNG per SKILL.md instruction)
ship-base-branch: extract Step 0 only instead of full 1900-line ship/SKILL.md
dashboard-via: extract dashboard section only + increase timeout 90s→180s

Root cause: copying full SKILL.md files into test fixtures caused context bloat,
leading to timeouts and flaky turn limits. Extracting only the relevant section
cut dashboard-via from timing out at 240s to finishing in 38s.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add E2E fixture extraction rule to CLAUDE.md

Never copy full SKILL.md files into E2E test fixtures. Extract only
the section the test needs. Also: run targeted evals in foreground,
never pkill and restart mid-run.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: stabilize journey-think-bigger routing test

Use exact trigger phrases from plan-ceo-review skill description
("think bigger", "expand scope", "ambitious enough") instead of
the ambiguous "thinking too small". Reduce maxTurns 5→3 to cut
cost per attempt ($0.12 vs $0.25). Test remains periodic tier
since LLM routing is inherently non-deterministic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* remove: delete journey-think-bigger routing test

Never passed reliably. Tests ambiguous routing ("think bigger" →
plan-ceo-review) but Claude legitimately answers directly instead
of invoking a skill. The other 10 journey tests cover routing
with clear, actionable signals.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.12.7.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Arun Kumar Thiagarajan <arunkt.bm14@gmail.com>
Co-authored-by: bluzername <bluzer@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Greg Jackson <gregario@users.noreply.github.com>
2026-03-26 23:21:27 -06:00

202 lines
8.2 KiB
Bash
Executable File

#!/usr/bin/env bash
# gstack-telemetry-log — append a telemetry event to local JSONL
#
# Data flow:
# preamble (start) ──▶ .pending marker
# preamble (epilogue) ──▶ gstack-telemetry-log ──▶ skill-usage.jsonl
# └──▶ gstack-telemetry-sync (bg)
#
# Usage:
# gstack-telemetry-log --skill qa --duration 142 --outcome success \
# --used-browse true --session-id "12345-1710756600"
#
# Env overrides (for testing):
# GSTACK_STATE_DIR — override ~/.gstack state directory
# GSTACK_DIR — override auto-detected gstack root
#
# NOTE: Uses set -uo pipefail (no -e) — telemetry must never exit non-zero
set -uo pipefail
GSTACK_DIR="${GSTACK_DIR:-$(cd "$(dirname "$0")/.." && pwd)}"
STATE_DIR="${GSTACK_STATE_DIR:-$HOME/.gstack}"
ANALYTICS_DIR="$STATE_DIR/analytics"
JSONL_FILE="$ANALYTICS_DIR/skill-usage.jsonl"
PENDING_DIR="$ANALYTICS_DIR" # .pending-* files live here
CONFIG_CMD="$GSTACK_DIR/bin/gstack-config"
VERSION_FILE="$GSTACK_DIR/VERSION"
# ─── Parse flags ─────────────────────────────────────────────
SKILL=""
DURATION=""
OUTCOME="unknown"
USED_BROWSE="false"
SESSION_ID=""
ERROR_CLASS=""
ERROR_MESSAGE=""
FAILED_STEP=""
EVENT_TYPE="skill_run"
SOURCE=""
while [ $# -gt 0 ]; do
case "$1" in
--skill) SKILL="$2"; shift 2 ;;
--duration) DURATION="$2"; shift 2 ;;
--outcome) OUTCOME="$2"; shift 2 ;;
--used-browse) USED_BROWSE="$2"; shift 2 ;;
--session-id) SESSION_ID="$2"; shift 2 ;;
--error-class) ERROR_CLASS="$2"; shift 2 ;;
--error-message) ERROR_MESSAGE="$2"; shift 2 ;;
--failed-step) FAILED_STEP="$2"; shift 2 ;;
--event-type) EVENT_TYPE="$2"; shift 2 ;;
--source) SOURCE="$2"; shift 2 ;;
*) shift ;;
esac
done
# Source: flag > env > default 'live'
SOURCE="${SOURCE:-${GSTACK_TELEMETRY_SOURCE:-live}}"
# ─── Read telemetry tier ─────────────────────────────────────
TIER="$("$CONFIG_CMD" get telemetry 2>/dev/null || true)"
TIER="${TIER:-off}"
# Validate tier
case "$TIER" in
off|anonymous|community) ;;
*) TIER="off" ;; # invalid value → default to off
esac
if [ "$TIER" = "off" ]; then
# Still clear pending markers for this session even if telemetry is off
[ -n "$SESSION_ID" ] && rm -f "$PENDING_DIR/.pending-$SESSION_ID" 2>/dev/null || true
exit 0
fi
# ─── Finalize stale .pending markers ────────────────────────
# Each session gets its own .pending-$SESSION_ID file to avoid races
# between concurrent sessions. Finalize any that don't match our session.
for PFILE in "$PENDING_DIR"/.pending-*; do
[ -f "$PFILE" ] || continue
# Skip our own session's marker (it's still in-flight)
PFILE_BASE="$(basename "$PFILE")"
PFILE_SID="${PFILE_BASE#.pending-}"
[ "$PFILE_SID" = "$SESSION_ID" ] && continue
PENDING_DATA="$(cat "$PFILE" 2>/dev/null || true)"
rm -f "$PFILE" 2>/dev/null || true
if [ -n "$PENDING_DATA" ]; then
# Extract fields from pending marker using grep -o + awk
P_SKILL="$(echo "$PENDING_DATA" | grep -o '"skill":"[^"]*"' | head -1 | awk -F'"' '{print $4}')"
P_TS="$(echo "$PENDING_DATA" | grep -o '"ts":"[^"]*"' | head -1 | awk -F'"' '{print $4}')"
P_SID="$(echo "$PENDING_DATA" | grep -o '"session_id":"[^"]*"' | head -1 | awk -F'"' '{print $4}')"
P_VER="$(echo "$PENDING_DATA" | grep -o '"gstack_version":"[^"]*"' | head -1 | awk -F'"' '{print $4}')"
P_OS="$(uname -s | tr '[:upper:]' '[:lower:]')"
P_ARCH="$(uname -m)"
# Write the stale event as outcome: unknown
mkdir -p "$ANALYTICS_DIR"
printf '{"v":1,"ts":"%s","event_type":"skill_run","skill":"%s","session_id":"%s","gstack_version":"%s","os":"%s","arch":"%s","duration_s":null,"outcome":"unknown","error_class":null,"used_browse":false,"sessions":1}\n' \
"$P_TS" "$P_SKILL" "$P_SID" "$P_VER" "$P_OS" "$P_ARCH" >> "$JSONL_FILE" 2>/dev/null || true
fi
done
# Clear our own session's pending marker (we're about to log the real event)
[ -n "$SESSION_ID" ] && rm -f "$PENDING_DIR/.pending-$SESSION_ID" 2>/dev/null || true
# ─── Collect metadata ────────────────────────────────────────
TS="$(date -u +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || date -u +%Y-%m-%dT%H:%M:%S 2>/dev/null || echo "")"
GSTACK_VERSION="$(cat "$VERSION_FILE" 2>/dev/null | tr -d '[:space:]' || echo "unknown")"
OS="$(uname -s | tr '[:upper:]' '[:lower:]')"
ARCH="$(uname -m)"
SESSIONS="1"
if [ -d "$STATE_DIR/sessions" ]; then
_SC="$(find "$STATE_DIR/sessions" -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' \n\r\t')"
[ -n "$_SC" ] && [ "$_SC" -gt 0 ] 2>/dev/null && SESSIONS="$_SC"
fi
# Generate installation_id for community tier
# Uses a random UUID stored locally — not derived from hostname/user so it
# can't be guessed or correlated by someone who knows your machine identity.
INSTALL_ID=""
if [ "$TIER" = "community" ]; then
ID_FILE="$HOME/.gstack/installation-id"
if [ -f "$ID_FILE" ]; then
INSTALL_ID="$(cat "$ID_FILE" 2>/dev/null)"
fi
if [ -z "$INSTALL_ID" ]; then
# Generate a random UUID v4
if command -v uuidgen >/dev/null 2>&1; then
INSTALL_ID="$(uuidgen | tr '[:upper:]' '[:lower:]')"
elif [ -r /proc/sys/kernel/random/uuid ]; then
INSTALL_ID="$(cat /proc/sys/kernel/random/uuid)"
else
# Fallback: random hex from /dev/urandom
INSTALL_ID="$(od -An -tx1 -N16 /dev/urandom 2>/dev/null | tr -d ' \n')"
fi
if [ -n "$INSTALL_ID" ]; then
mkdir -p "$(dirname "$ID_FILE")" 2>/dev/null
printf '%s' "$INSTALL_ID" > "$ID_FILE" 2>/dev/null
fi
fi
fi
# Local-only fields (never sent remotely)
REPO_SLUG=""
BRANCH=""
if command -v git >/dev/null 2>&1; then
REPO_SLUG="$(git remote get-url origin 2>/dev/null | sed 's|.*[:/]\([^/]*/[^/]*\)\.git$|\1|;s|.*[:/]\([^/]*/[^/]*\)$|\1|' | tr '/' '-' 2>/dev/null || true)"
BRANCH="$(git rev-parse --abbrev-ref HEAD 2>/dev/null || true)"
fi
# ─── Construct and append JSON ───────────────────────────────
mkdir -p "$ANALYTICS_DIR"
# Sanitize string fields for JSON safety (strip quotes, backslashes, control chars)
json_safe() { printf '%s' "$1" | tr -d '"\\\n\r\t' | head -c 200; }
SKILL="$(json_safe "$SKILL")"
OUTCOME="$(json_safe "$OUTCOME")"
SESSION_ID="$(json_safe "$SESSION_ID")"
SOURCE="$(json_safe "$SOURCE")"
EVENT_TYPE="$(json_safe "$EVENT_TYPE")"
# Escape null fields — sanitize ERROR_CLASS and FAILED_STEP via json_safe()
ERR_FIELD="null"
[ -n "$ERROR_CLASS" ] && ERR_FIELD="\"$(json_safe "$ERROR_CLASS")\""
ERR_MSG_FIELD="null"
[ -n "$ERROR_MESSAGE" ] && ERR_MSG_FIELD="\"$(printf '%s' "$ERROR_MESSAGE" | head -c 200 | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' -e 's/ /\\t/g' | tr '\n\r' ' ')\""
STEP_FIELD="null"
[ -n "$FAILED_STEP" ] && STEP_FIELD="\"$(json_safe "$FAILED_STEP")\""
# Cap unreasonable durations
if [ -n "$DURATION" ] && [ "$DURATION" -gt 86400 ] 2>/dev/null; then
DURATION="" # null if > 24h
fi
if [ -n "$DURATION" ] && [ "$DURATION" -lt 0 ] 2>/dev/null; then
DURATION="" # null if negative
fi
DUR_FIELD="null"
[ -n "$DURATION" ] && DUR_FIELD="$DURATION"
INSTALL_FIELD="null"
[ -n "$INSTALL_ID" ] && INSTALL_FIELD="\"$INSTALL_ID\""
BROWSE_BOOL="false"
[ "$USED_BROWSE" = "true" ] && BROWSE_BOOL="true"
printf '{"v":1,"ts":"%s","event_type":"%s","skill":"%s","session_id":"%s","gstack_version":"%s","os":"%s","arch":"%s","duration_s":%s,"outcome":"%s","error_class":%s,"error_message":%s,"failed_step":%s,"used_browse":%s,"sessions":%s,"installation_id":%s,"source":"%s","_repo_slug":"%s","_branch":"%s"}\n' \
"$TS" "$EVENT_TYPE" "$SKILL" "$SESSION_ID" "$GSTACK_VERSION" "$OS" "$ARCH" \
"$DUR_FIELD" "$OUTCOME" "$ERR_FIELD" "$ERR_MSG_FIELD" "$STEP_FIELD" \
"$BROWSE_BOOL" "${SESSIONS:-1}" \
"$INSTALL_FIELD" "$SOURCE" "$REPO_SLUG" "$BRANCH" >> "$JSONL_FILE" 2>/dev/null || true
# ─── Trigger sync if tier is not off ─────────────────────────
SYNC_CMD="$GSTACK_DIR/bin/gstack-telemetry-sync"
if [ -x "$SYNC_CMD" ]; then
"$SYNC_CMD" 2>/dev/null &
fi
exit 0