mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-01 19:25:10 +02:00
7ff0f84b1e
* refactor: extract {{TEST_COVERAGE_AUDIT}} shared resolver
DRY extraction of the test coverage audit methodology into a shared
generator function with three explicit placeholders:
- TEST_COVERAGE_AUDIT_PLAN (plan-eng-review)
- TEST_COVERAGE_AUDIT_SHIP (ship)
- TEST_COVERAGE_AUDIT_REVIEW (review)
Shared across all modes: codepath tracing, ASCII diagram format,
quality scoring rubric, E2E test decision matrix, regression rule,
and test framework detection via CLAUDE.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: plan-eng-review uses shared test coverage audit
Replace the thin 6-line Section 3 test review with the full shared
methodology via {{TEST_COVERAGE_AUDIT_PLAN}}. Plan mode now:
- Traces every codepath with full ASCII diagrams
- Adds missing tests to the plan (not just "check for tests")
- Writes test plan artifact for /qa consumption
- Includes E2E/eval recommendations and regression detection
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: ship uses shared test coverage audit
Replace 135 lines of inline Step 3.4 methodology with
{{TEST_COVERAGE_AUDIT_SHIP}}. Functionally identical output plus:
- E2E test decision matrix (marks paths needing E2E vs unit)
- Eval recommendations for LLM prompt changes
- Regression detection iron rule
- Test framework detection via CLAUDE.md first
- Test plan artifact for /qa consumption
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: /review Step 4.75 test coverage diagram
Add codepath tracing to the pre-landing review via
{{TEST_COVERAGE_AUDIT_REVIEW}}. Review mode:
- Produces ASCII coverage diagram (same methodology as plan/ship)
- Generates tests for gaps via Fix-First (ASK user)
- Subsumes Pass 2 "Test Gaps" checklist category
- Gaps are INFORMATIONAL findings
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: mode differentiation + regression guard for coverage audit
10 new tests verifying the three TEST_COVERAGE_AUDIT placeholders:
- All modes share: codepath tracing, E2E matrix, regression rule
- Plan mode: adds to plan + artifact, no ship-specific content
- Ship mode: auto-generates + before/after count + coverage summary
- Review mode: Fix-First ASK + INFORMATIONAL, no artifact
- Regression guard: ship SKILL.md preserves all key phrases
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: extract shared coverage audit fixture + review E2E
- Extract billing.ts fixture into coverage-audit-fixture.ts (DRY)
- Refactor ship-coverage-audit E2E to use shared fixture
- Add review-coverage-audit E2E for Step 4.75
- Update touchfiles: both E2Es depend on shared fixture
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: strengthen E2E assertions for coverage audit tests
The coverage audit E2E tests (ship + review) were only asserting
exitReason === 'success' and readCalls > 0 — they passed even
if the agent produced no coverage diagram. Add assertion that
the output contains either GAP or TESTED markers.
Found during /review.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: plan mode traces the plan, not the git diff
Codex adversarial review caught that plan-eng-review was inheriting
"git diff origin/<base>...HEAD" from the shared resolver, but plan mode
reviews a plan document, not a code diff. Plan mode now says:
"Trace every codepath in the plan" and "Read the plan document."
Ship and review modes keep the git diff instruction.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.9.5.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: test coverage catalog + failure triage (merged branches) (#285)
* feat: add bin/gstack-repo-mode — solo vs collaborative detection with caching
Detects whether a repo is solo-dev (one person does 80%+ of recent commits)
or collaborative. Uses 90-day git shortlog window with 7-day cache in
~/.gstack/projects/{SLUG}/repo-mode.json. Config override via
`gstack-config set repo_mode solo|collaborative` takes precedence over
the heuristic. Minimum 5 commits required to classify (otherwise unknown).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: test failure ownership triage — see something say something
Adds two new preamble sections to all gstack skills:
- Repo Ownership Mode: explains solo vs collaborative behavior
- See Something, Say Something: proactive issue flagging principle
Adds {{TEST_FAILURE_TRIAGE}} template variable (opt-in, used by /ship):
- Classifies test failures as in-branch vs pre-existing
- Solo mode defaults to "investigate and fix now"
- Collaborative mode offers "blame + assign GitHub issue" option
- Also offers P0 TODO and skip options
/ship Step 3 now triages test failures instead of hard-stopping on all
failures. In-branch failures still block shipping. Pre-existing failures
get user-directed triage based on repo mode.
Adds P2 TODO for gstack notes system (deferred lightweight reminder).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate SKILL.md files for Claude and Codex hosts
All 22 Claude skills and 21 Codex skills regenerated with new preamble
sections (Repo Ownership Mode, See Something Say Something) and
{{TEST_FAILURE_TRIAGE}} resolved in ship/SKILL.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: validate repo mode values to prevent shell injection
Codex adversarial review found that unvalidated config/cache values
could be injected into shell via source <(gstack-repo-mode). Added
validate_mode() that only allows solo|collaborative|unknown — anything
else becomes "unknown". Prevents persistent code execution through
malicious config.yaml or tampered cache JSON.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: shell injection via branch names + feature-branch sampling bias
Codex code review found two issues:
P1: eval $(gstack-slug) in gstack-repo-mode executes branch names as
shell. Branch names like foo$(touch${IFS}pwned) are valid git refs and
would execute arbitrary commands. Fix: compute SLUG directly with sed
instead of eval'ing gstack-slug output.
P2: git shortlog HEAD only sees current branch history. On feature
branches that haven't merged main recently, other contributors disappear
from the sample. Fix: use git shortlog on the default branch
(origin/main) instead of HEAD.
Also improved blame lookup in collaborative triage to check both the
test file and the production code it covers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: broaden codex-host stripping test to accommodate triage section
"Investigate and fix" now appears in TEST_FAILURE_TRIAGE (not just the
Codex review step). Use CODEX_REVIEWS config string as a more specific
marker for detecting the Codex review step in Codex-hosted skills.
* fix: replace template placeholder in TODOS.md with readable text
{{TEST_FAILURE_TRIAGE}} is template syntax but TODOS.md is not processed
by gen-skill-docs — replaced with human-readable reference.
* chore: bump version and changelog (v0.9.5.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: add bin/ directory to project structure in CLAUDE.md
* test: add triage resolver unit tests, plan-eng coverage audit E2E, and triage E2E
- TEST_FAILURE_TRIAGE resolver: 6 unit tests verifying all triage steps (T1-T4),
REPO_MODE branching, and safety default for ambiguous failures
- plan-eng-coverage-audit E2E: tests /plan-eng-review coverage audit codepath
(gap identified during eng review — existed on neither branch)
- ship-triage E2E: planted-bug fixture with in-branch (truncate null) and
pre-existing (divide-by-zero) failures; verifies correct classification
- Touchfile entries for diff-based test selection
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate stale Codex SKILL.md for retro
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: gstack-repo-mode handles repos without origin remote
Split `git remote get-url origin` into a separate variable with `|| true`
so the script doesn't crash under `set -euo pipefail` in local-only repos.
Falls back to REPO_MODE=unknown gracefully.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: REPO_MODE defaults to unknown when helper emits nothing
Changed preamble from `source <(...) || REPO_MODE=unknown` (which doesn't
catch empty output) to `source <(...) || true` followed by
`REPO_MODE=${REPO_MODE:-unknown}`. Regenerated all SKILL.md files.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: triage E2E runs both test files in subprocesses
math.test.js called process.exit(1) which killed the runner before
string.test.js could execute. Changed test runner to use child_process
so each test runs independently and both failure classes are exercised.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: gstack-repo-mode handles repos without origin remote
Fall back through origin/main → origin/master → HEAD when
git symbolic-ref refs/remotes/origin/HEAD is not set. Prevents
shortlog crash in repos where origin/HEAD isn't configured.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: triage E2E runs both test files in subprocesses
Add assertions verifying both math.test.js (pre-existing failure) and
string.test.js (in-branch failure) actually executed during triage.
Prevents false passes where only one failure class is exercised.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: REPO_MODE defaults to unknown when helper emits nothing
- Remove head -20 truncation that biased solo classification by
dropping low-volume contributors from the denominator
- Use atomic write (mktemp + mv) for cache to prevent concurrent
preamble reads from seeing partial JSON
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add test coverage catalog to CHANGELOG + update project structure
- CHANGELOG: add 6 entries for coverage audit, review Step 4.75, E2E
recommendations, regression iron rule, failure triage, repo-mode fix
- CLAUDE.md: add missing skill directories (autoplan, benchmark, canary,
codex, land-and-deploy, setup-deploy) to project structure
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.10.1.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: CHANGELOG rules — branch-scoped versions, never fold into old entries
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
94 lines
3.5 KiB
Bash
Executable File
94 lines
3.5 KiB
Bash
Executable File
#!/usr/bin/env bash
|
|
# gstack-repo-mode — detect solo vs collaborative repo mode
|
|
# Usage: source <(gstack-repo-mode) → sets REPO_MODE variable
|
|
# Or: gstack-repo-mode → prints REPO_MODE=... line
|
|
#
|
|
# Detection heuristic (90-day window):
|
|
# Solo: top author >= 80% of commits
|
|
# Collaborative: top author < 80%
|
|
#
|
|
# Override: gstack-config set repo_mode solo|collaborative
|
|
# Cache: ~/.gstack/projects/$SLUG/repo-mode.json (7-day TTL)
|
|
set -euo pipefail
|
|
|
|
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
|
|
# Compute SLUG directly (avoid eval of gstack-slug — branch names can contain shell metacharacters)
|
|
REMOTE_URL=$(git remote get-url origin 2>/dev/null || true)
|
|
if [ -z "$REMOTE_URL" ]; then
|
|
echo "REPO_MODE=unknown"
|
|
exit 0
|
|
fi
|
|
SLUG=$(echo "$REMOTE_URL" | sed 's|.*[:/]\([^/]*/[^/]*\)\.git$|\1|;s|.*[:/]\([^/]*/[^/]*\)$|\1|' | tr '/' '-')
|
|
[ -z "${SLUG:-}" ] && { echo "REPO_MODE=unknown"; exit 0; }
|
|
|
|
# Validate: only allow known values (prevent shell injection via source <(...))
|
|
validate_mode() {
|
|
case "$1" in solo|collaborative|unknown) echo "$1" ;; *) echo "unknown" ;; esac
|
|
}
|
|
|
|
# Config override takes precedence
|
|
OVERRIDE=$("$SCRIPT_DIR/gstack-config" get repo_mode 2>/dev/null || true)
|
|
if [ -n "$OVERRIDE" ] && [ "$OVERRIDE" != "null" ]; then
|
|
echo "REPO_MODE=$(validate_mode "$OVERRIDE")"
|
|
exit 0
|
|
fi
|
|
|
|
# Check cache (7-day TTL)
|
|
CACHE_DIR="$HOME/.gstack/projects/$SLUG"
|
|
CACHE_FILE="$CACHE_DIR/repo-mode.json"
|
|
if [ -f "$CACHE_FILE" ]; then
|
|
CACHE_AGE=$(( $(date +%s) - $(stat -f %m "$CACHE_FILE" 2>/dev/null || stat -c %Y "$CACHE_FILE" 2>/dev/null || echo 0) ))
|
|
if [ "$CACHE_AGE" -lt 604800 ]; then # 7 days in seconds
|
|
MODE=$(grep -o '"mode":"[^"]*"' "$CACHE_FILE" | head -1 | cut -d'"' -f4)
|
|
[ -n "$MODE" ] && echo "REPO_MODE=$(validate_mode "$MODE")" && exit 0
|
|
fi
|
|
fi
|
|
|
|
# Compute from git history (90-day window)
|
|
# Use default branch (not HEAD) to avoid feature-branch sampling bias
|
|
DEFAULT_BRANCH=$(git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's|refs/remotes/||' || true)
|
|
# Fallback: try origin/main, then origin/master, then HEAD
|
|
if [ -z "$DEFAULT_BRANCH" ]; then
|
|
if git rev-parse --verify origin/main &>/dev/null; then
|
|
DEFAULT_BRANCH="origin/main"
|
|
elif git rev-parse --verify origin/master &>/dev/null; then
|
|
DEFAULT_BRANCH="origin/master"
|
|
else
|
|
DEFAULT_BRANCH="HEAD"
|
|
fi
|
|
fi
|
|
SHORTLOG=$(git shortlog -sn --since="90 days ago" --no-merges "$DEFAULT_BRANCH" 2>/dev/null)
|
|
if [ -z "$SHORTLOG" ]; then
|
|
echo "REPO_MODE=unknown"
|
|
exit 0
|
|
fi
|
|
|
|
# Compute TOTAL from ALL authors (not truncated) to avoid solo bias
|
|
TOTAL=$(echo "$SHORTLOG" | awk '{s+=$1} END {print s}')
|
|
TOP=$(echo "$SHORTLOG" | head -1 | awk '{print $1}')
|
|
AUTHORS=$(echo "$SHORTLOG" | wc -l | tr -d ' ')
|
|
|
|
# Minimum sample: need at least 5 commits to classify
|
|
if [ "$TOTAL" -lt 5 ]; then
|
|
echo "REPO_MODE=unknown"
|
|
exit 0
|
|
fi
|
|
|
|
TOP_PCT=$(( TOP * 100 / TOTAL ))
|
|
|
|
# Solo: top author >= 80% of commits (occasional outside PRs don't change mode)
|
|
if [ "$TOP_PCT" -ge 80 ]; then
|
|
MODE=solo
|
|
else
|
|
MODE=collaborative
|
|
fi
|
|
|
|
# Cache result atomically (fail silently if ~/.gstack is unwritable)
|
|
mkdir -p "$CACHE_DIR" 2>/dev/null || true
|
|
CACHE_TMP=$(mktemp "$CACHE_DIR/.repo-mode-XXXXXX" 2>/dev/null || true)
|
|
if [ -n "$CACHE_TMP" ]; then
|
|
echo "{\"mode\":\"$MODE\",\"top_pct\":$TOP_PCT,\"authors\":$AUTHORS,\"total\":$TOTAL,\"computed\":\"$(date -u +%Y-%m-%dT%H:%M:%SZ)\"}" > "$CACHE_TMP" 2>/dev/null && mv "$CACHE_TMP" "$CACHE_FILE" 2>/dev/null || rm -f "$CACHE_TMP" 2>/dev/null
|
|
fi
|
|
|
|
echo "REPO_MODE=$MODE"
|