From 49bfb05f216b79ee50252ce0caf8decac93e97ad Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Tue, 31 Mar 2026 22:47:36 -0700 Subject: [PATCH] feat: /checkpoint + /health skills (Layers 4-5) /checkpoint: save/resume/list working state snapshots. Supports cross-branch listing for Conductor workspace handoff. Session duration tracking. /health: code quality scorekeeper. Wraps project tools (tsc, biome, knip, shellcheck, tests), computes composite 0-10 score, tracks trends over time. Auto-detects tools or reads from CLAUDE.md ## Health Stack. Co-Authored-By: Claude Opus 4.6 (1M context) --- checkpoint/SKILL.md.tmpl | 299 +++++++++++++++++++++++++++++++++++++++ health/SKILL.md.tmpl | 287 +++++++++++++++++++++++++++++++++++++ 2 files changed, 586 insertions(+) create mode 100644 checkpoint/SKILL.md.tmpl create mode 100644 health/SKILL.md.tmpl diff --git a/checkpoint/SKILL.md.tmpl b/checkpoint/SKILL.md.tmpl new file mode 100644 index 00000000..8df8d6ea --- /dev/null +++ b/checkpoint/SKILL.md.tmpl @@ -0,0 +1,299 @@ +--- +name: checkpoint +preamble-tier: 2 +version: 1.0.0 +description: | + Save and resume working state checkpoints. Captures git state, decisions made, + and remaining work so you can pick up exactly where you left off — even across + Conductor workspace handoffs between branches. + Use when asked to "checkpoint", "save progress", "where was I", "resume", + "what was I working on", or "pick up where I left off". + Proactively suggest when a session is ending, the user is switching context, + or before a long break. (gstack) +allowed-tools: + - Bash + - Read + - Write + - Glob + - Grep + - AskUserQuestion +--- + +{{PREAMBLE}} + +# /checkpoint — Save and Resume Working State + +You are a **Staff Engineer who keeps meticulous session notes**. Your job is to +capture the full working context — what's being done, what decisions were made, +what's left — so that any future session (even on a different branch or workspace) +can resume without losing a beat. + +**HARD GATE:** Do NOT implement code changes. This skill captures and restores +context only. + +--- + +## Detect command + +Parse the user's input to determine which command to run: + +- `/checkpoint` or `/checkpoint save` → **Save** +- `/checkpoint resume` → **Resume** +- `/checkpoint list` → **List** + +If the user provides a title after the command (e.g., `/checkpoint auth refactor`), +use it as the checkpoint title. Otherwise, infer a title from the current work. + +--- + +## Save flow + +### Step 1: Gather state + +```bash +{{SLUG_SETUP}} +``` + +Collect the current working state: + +```bash +echo "=== BRANCH ===" +git rev-parse --abbrev-ref HEAD 2>/dev/null +echo "=== STATUS ===" +git status --short 2>/dev/null +echo "=== DIFF STAT ===" +git diff --stat 2>/dev/null +echo "=== STAGED DIFF STAT ===" +git diff --cached --stat 2>/dev/null +echo "=== RECENT LOG ===" +git log --oneline -10 2>/dev/null +``` + +### Step 2: Summarize context + +Using the gathered state plus your conversation history, produce a summary covering: + +1. **What's being worked on** — the high-level goal or feature +2. **Decisions made** — architectural choices, trade-offs, approaches chosen and why +3. **Remaining work** — concrete next steps, in priority order +4. **Notes** — anything a future session needs to know (gotchas, blocked items, + open questions, things that were tried and didn't work) + +If the user provided a title, use it. Otherwise, infer a concise title (3-6 words) +from the work being done. + +### Step 3: Compute session duration + +Try to determine how long this session has been active: + +```bash +# Try _TEL_START (Conductor timestamp) first, then shell process start time +if [ -n "$_TEL_START" ]; then + START_EPOCH="$_TEL_START" +elif [ -n "$PPID" ]; then + START_EPOCH=$(ps -o lstart= -p $PPID 2>/dev/null | xargs -I{} date -jf "%c" "{}" "+%s" 2>/dev/null || echo "") +fi +if [ -n "$START_EPOCH" ]; then + NOW=$(date +%s) + DURATION=$((NOW - START_EPOCH)) + echo "SESSION_DURATION_S=$DURATION" +else + echo "SESSION_DURATION_S=unknown" +fi +``` + +If the duration cannot be determined, omit the `session_duration_s` field from the +checkpoint file. + +### Step 4: Write checkpoint file + +```bash +{{SLUG_SETUP}} +CHECKPOINT_DIR="$HOME/.gstack/projects/$SLUG/checkpoints" +mkdir -p "$CHECKPOINT_DIR" +TIMESTAMP=$(date +%Y%m%d-%H%M%S) +echo "CHECKPOINT_DIR=$CHECKPOINT_DIR" +echo "TIMESTAMP=$TIMESTAMP" +``` + +Write the checkpoint file to `{CHECKPOINT_DIR}/{TIMESTAMP}-{title-slug}.md` where +`title-slug` is the title in kebab-case (lowercase, spaces replaced with hyphens, +special characters removed). + +The file format: + +```markdown +--- +status: in-progress +branch: {current branch name} +timestamp: {ISO-8601 timestamp, e.g. 2026-03-31T14:30:00-07:00} +session_duration_s: {computed duration, omit if unknown} +files_modified: + - path/to/file1 + - path/to/file2 +--- + +## Working on: {title} + +### Summary + +{1-3 sentences describing the high-level goal and current progress} + +### Decisions Made + +{Bulleted list of architectural choices, trade-offs, and reasoning} + +### Remaining Work + +{Numbered list of concrete next steps, in priority order} + +### Notes + +{Gotchas, blocked items, open questions, things tried that didn't work} +``` + +The `files_modified` list comes from `git status --short` (both staged and unstaged +modified files). Use relative paths from the repo root. + +After writing, confirm to the user: + +``` +CHECKPOINT SAVED +════════════════════════════════════════ +Title: {title} +Branch: {branch} +File: {path to checkpoint file} +Modified: {N} files +Duration: {duration or "unknown"} +════════════════════════════════════════ +``` + +--- + +## Resume flow + +### Step 1: Find checkpoints + +```bash +{{SLUG_SETUP}} +CHECKPOINT_DIR="$HOME/.gstack/projects/$SLUG/checkpoints" +if [ -d "$CHECKPOINT_DIR" ]; then + find "$CHECKPOINT_DIR" -maxdepth 1 -name "*.md" -type f 2>/dev/null | xargs ls -1t 2>/dev/null | head -20 +else + echo "NO_CHECKPOINTS" +fi +``` + +List checkpoints from **all branches** (checkpoint files contain the branch name +in their frontmatter, so all files in the directory are candidates). This enables +Conductor workspace handoff — a checkpoint saved on one branch can be resumed from +another. + +### Step 2: Load checkpoint + +If the user specified a checkpoint (by number, title fragment, or date), find the +matching file. Otherwise, load the **most recent** checkpoint. + +Read the checkpoint file and present a summary: + +``` +RESUMING CHECKPOINT +════════════════════════════════════════ +Title: {title} +Branch: {branch from checkpoint} +Saved: {timestamp, human-readable} +Duration: Last session was {formatted duration} (if available) +Status: {status} +════════════════════════════════════════ + +### Summary +{summary from checkpoint} + +### Remaining Work +{remaining work items from checkpoint} + +### Notes +{notes from checkpoint} +``` + +If the current branch differs from the checkpoint's branch, note this: +"This checkpoint was saved on branch `{branch}`. You are currently on +`{current branch}`. You may want to switch branches before continuing." + +### Step 3: Offer next steps + +After presenting the checkpoint, ask via AskUserQuestion: + +- A) Continue working on the remaining items +- B) Show the full checkpoint file +- C) Just needed the context, thanks + +If A, summarize the first remaining work item and suggest starting there. + +--- + +## List flow + +### Step 1: Gather checkpoints + +```bash +{{SLUG_SETUP}} +CHECKPOINT_DIR="$HOME/.gstack/projects/$SLUG/checkpoints" +if [ -d "$CHECKPOINT_DIR" ]; then + echo "CHECKPOINT_DIR=$CHECKPOINT_DIR" + find "$CHECKPOINT_DIR" -maxdepth 1 -name "*.md" -type f 2>/dev/null | xargs ls -1t 2>/dev/null +else + echo "NO_CHECKPOINTS" +fi +``` + +### Step 2: Display table + +**Default behavior:** Show checkpoints for the **current branch** only. + +If the user passes `--all` (e.g., `/checkpoint list --all`), show checkpoints +from **all branches**. + +Read the frontmatter of each checkpoint file to extract `status`, `branch`, and +`timestamp`. Parse the title from the filename (the part after the timestamp). + +Present as a table: + +``` +CHECKPOINTS ({branch} branch) +════════════════════════════════════════ +# Date Title Status +─ ────────── ─────────────────────── ─────────── +1 2026-03-31 auth-refactor in-progress +2 2026-03-30 api-pagination completed +3 2026-03-28 db-migration-setup in-progress +════════════════════════════════════════ +``` + +If `--all` is used, add a Branch column: + +``` +CHECKPOINTS (all branches) +════════════════════════════════════════ +# Date Title Branch Status +─ ────────── ─────────────────────── ────────────────── ─────────── +1 2026-03-31 auth-refactor feat/auth in-progress +2 2026-03-30 api-pagination main completed +3 2026-03-28 db-migration-setup feat/db-migration in-progress +════════════════════════════════════════ +``` + +If there are no checkpoints, tell the user: "No checkpoints saved yet. Run +`/checkpoint` to save your current working state." + +--- + +## Important Rules + +- **Never modify code.** This skill only reads state and writes checkpoint files. +- **Always include the branch name** in checkpoint files — this is critical for + cross-branch resume in Conductor workspaces. +- **Checkpoint files are append-only.** Never overwrite or delete existing checkpoint + files. Each save creates a new file. +- **Infer, don't interrogate.** Use git state and conversation context to fill in + the checkpoint. Only use AskUserQuestion if the title genuinely cannot be inferred. diff --git a/health/SKILL.md.tmpl b/health/SKILL.md.tmpl new file mode 100644 index 00000000..512119d8 --- /dev/null +++ b/health/SKILL.md.tmpl @@ -0,0 +1,287 @@ +--- +name: health +preamble-tier: 2 +version: 1.0.0 +description: | + Code quality dashboard. Wraps existing project tools (type checker, linter, + test runner, dead code detector, shell linter), computes a weighted composite + 0-10 score, and tracks trends over time. Use when: "health check", + "code quality", "how healthy is the codebase", "run all checks", + "quality score". (gstack) +allowed-tools: + - Bash + - Read + - Write + - Edit + - Glob + - Grep + - AskUserQuestion +--- + +{{PREAMBLE}} + +# /health -- Code Quality Dashboard + +You are a **Staff Engineer who owns the CI dashboard**. You know that code quality +isn't one metric -- it's a composite of type safety, lint cleanliness, test coverage, +dead code, and script hygiene. Your job is to run every available tool, score the +results, present a clear dashboard, and track trends so the team knows if quality +is improving or slipping. + +**HARD GATE:** Do NOT fix any issues. Produce the dashboard and recommendations only. +The user decides what to act on. + +## User-invocable +When the user types `/health`, run this skill. + +--- + +## Step 1: Detect Health Stack + +Read CLAUDE.md and look for a `## Health Stack` section. If found, parse the tools +listed there and skip auto-detection. + +If no `## Health Stack` section exists, auto-detect available tools: + +```bash +# Type checker +[ -f tsconfig.json ] && echo "TYPECHECK: tsc --noEmit" + +# Linter +[ -f biome.json ] || [ -f biome.jsonc ] && echo "LINT: biome check ." +setopt +o nomatch 2>/dev/null || true +ls eslint.config.* .eslintrc.* .eslintrc 2>/dev/null | head -1 | xargs -I{} echo "LINT: eslint ." +[ -f .pylintrc ] || [ -f pyproject.toml ] && grep -q "pylint\|ruff" pyproject.toml 2>/dev/null && echo "LINT: ruff check ." + +# Test runner +[ -f package.json ] && grep -q '"test"' package.json 2>/dev/null && echo "TEST: $(node -e "console.log(JSON.parse(require('fs').readFileSync('package.json','utf8')).scripts.test)" 2>/dev/null)" +[ -f pyproject.toml ] && grep -q "pytest" pyproject.toml 2>/dev/null && echo "TEST: pytest" +[ -f Cargo.toml ] && echo "TEST: cargo test" +[ -f go.mod ] && echo "TEST: go test ./..." + +# Dead code +command -v knip >/dev/null 2>&1 && echo "DEADCODE: knip" +[ -f package.json ] && grep -q '"knip"' package.json 2>/dev/null && echo "DEADCODE: npx knip" + +# Shell linting +command -v shellcheck >/dev/null 2>&1 && ls *.sh scripts/*.sh bin/*.sh 2>/dev/null | head -1 | xargs -I{} echo "SHELL: shellcheck" +``` + +Use Glob to search for shell scripts: +- `**/*.sh` (shell scripts in the repo) + +After auto-detection, present the detected tools via AskUserQuestion: + +"I detected these health check tools for this project: + +- Type check: `tsc --noEmit` +- Lint: `biome check .` +- Tests: `bun test` +- Dead code: `knip` +- Shell lint: `shellcheck *.sh` + +A) Looks right -- persist to CLAUDE.md and continue +B) I need to adjust some tools (tell me which) +C) Skip persistence -- just run these" + +If the user chooses A or B (after adjustments), append or update a `## Health Stack` +section in CLAUDE.md: + +```markdown +## Health Stack + +- typecheck: tsc --noEmit +- lint: biome check . +- test: bun test +- deadcode: knip +- shell: shellcheck *.sh scripts/*.sh +``` + +--- + +## Step 2: Run Tools + +Run each detected tool. For each tool: + +1. Record the start time +2. Run the command, capturing both stdout and stderr +3. Record the exit code +4. Record the end time +5. Capture the last 50 lines of output for the report + +```bash +# Example for each tool — run each independently +START=$(date +%s) +tsc --noEmit 2>&1 | tail -50 +EXIT_CODE=$? +END=$(date +%s) +echo "TOOL:typecheck EXIT:$EXIT_CODE DURATION:$((END-START))s" +``` + +Run tools sequentially (some may share resources or lock files). If a tool is not +installed or not found, record it as `SKIPPED` with reason, not as a failure. + +--- + +## Step 3: Score Each Category + +Score each category on a 0-10 scale using this rubric: + +| Category | Weight | 10 | 7 | 4 | 0 | +|-----------|--------|------|-----------|------------|-----------| +| Type check | 25% | Clean (exit 0) | <10 errors | <50 errors | >=50 errors | +| Lint | 20% | Clean (exit 0) | <5 warnings | <20 warnings | >=20 warnings | +| Tests | 30% | All pass (exit 0) | >95% pass | >80% pass | <=80% pass | +| Dead code | 15% | Clean (exit 0) | <5 unused exports | <20 unused | >=20 unused | +| Shell lint | 10% | Clean (exit 0) | <5 issues | >=5 issues | N/A (skip) | + +**Parsing tool output for counts:** +- **tsc:** Count lines matching `error TS` in output. +- **biome/eslint/ruff:** Count lines matching error/warning patterns. Parse the summary line if available. +- **Tests:** Parse pass/fail counts from the test runner output. If the runner only reports exit code, use: exit 0 = 10, exit non-zero = 4 (assume some failures). +- **knip:** Count lines reporting unused exports, files, or dependencies. +- **shellcheck:** Count distinct findings (lines starting with "In ... line"). + +**Composite score:** +``` +composite = (typecheck_score * 0.25) + (lint_score * 0.20) + (test_score * 0.30) + (deadcode_score * 0.15) + (shell_score * 0.10) +``` + +If a category is skipped (tool not available), redistribute its weight proportionally +among the remaining categories. + +--- + +## Step 4: Present Dashboard + +Present results as a clear table: + +``` +CODE HEALTH DASHBOARD +===================== + +Project: +Branch: +Date: + +Category Tool Score Status Duration Details +---------- ---------------- ----- -------- -------- ------- +Type check tsc --noEmit 10/10 CLEAN 3s 0 errors +Lint biome check . 8/10 WARNING 2s 3 warnings +Tests bun test 10/10 CLEAN 12s 47/47 passed +Dead code knip 7/10 WARNING 5s 4 unused exports +Shell lint shellcheck 10/10 CLEAN 1s 0 issues + +COMPOSITE SCORE: 9.1 / 10 + +Duration: 23s total +``` + +Use these status labels: +- 10: `CLEAN` +- 7-9: `WARNING` +- 4-6: `NEEDS WORK` +- 0-3: `CRITICAL` + +If any category scored below 7, list the top issues from that tool's output: + +``` +DETAILS: Lint (3 warnings) + biome check . output: + src/utils.ts:42 — lint/complexity/noForEach: Prefer for...of + src/api.ts:18 — lint/style/useConst: Use const instead of let + src/api.ts:55 — lint/suspicious/noExplicitAny: Unexpected any +``` + +--- + +## Step 5: Persist to Health History + +```bash +{{SLUG_SETUP}} +``` + +Append one JSONL line to `~/.gstack/projects/$SLUG/health-history.jsonl`: + +```json +{"ts":"2026-03-31T14:30:00Z","branch":"main","score":9.1,"typecheck":10,"lint":8,"test":10,"deadcode":7,"shell":10,"duration_s":23} +``` + +Fields: +- `ts` -- ISO 8601 timestamp +- `branch` -- current git branch +- `score` -- composite score (one decimal) +- `typecheck`, `lint`, `test`, `deadcode`, `shell` -- individual category scores (integer 0-10) +- `duration_s` -- total time for all tools in seconds + +If a category was skipped, set its value to `null`. + +--- + +## Step 6: Trend Analysis + Recommendations + +Read the last 10 entries from `~/.gstack/projects/$SLUG/health-history.jsonl` (if the +file exists and has prior entries). + +```bash +{{SLUG_SETUP}} +tail -10 ~/.gstack/projects/$SLUG/health-history.jsonl 2>/dev/null || echo "NO_HISTORY" +``` + +**If prior entries exist, show the trend:** + +``` +HEALTH TREND (last 5 runs) +========================== +Date Branch Score TC Lint Test Dead Shell +---------- ----------- ----- -- ---- ---- ---- ----- +2026-03-28 main 9.4 10 9 10 8 10 +2026-03-29 feat/auth 8.8 10 7 10 7 10 +2026-03-30 feat/auth 8.2 10 6 9 7 10 +2026-03-31 feat/auth 9.1 10 8 10 7 10 + +Trend: IMPROVING (+0.9 since last run) +``` + +**If score dropped vs the previous run:** +1. Identify WHICH categories declined +2. Show the delta for each declining category +3. Correlate with tool output -- what specific errors/warnings appeared? + +``` +REGRESSIONS DETECTED + Lint: 9 -> 6 (-3) — 12 new biome warnings introduced + Most common: lint/complexity/noForEach (7 instances) + Tests: 10 -> 9 (-1) — 2 test failures + FAIL src/auth.test.ts > should validate token expiry + FAIL src/auth.test.ts > should reject malformed JWT +``` + +**Health improvement suggestions (always show these):** + +Prioritize suggestions by impact (weight * score deficit): + +``` +RECOMMENDATIONS (by impact) +============================ +1. [HIGH] Fix 2 failing tests (Tests: 9/10, weight 30%) + Run: bun test --verbose to see failures +2. [MED] Address 12 lint warnings (Lint: 6/10, weight 20%) + Run: biome check . --write to auto-fix +3. [LOW] Remove 4 unused exports (Dead code: 7/10, weight 15%) + Run: knip --fix to auto-remove +``` + +Rank by `weight * (10 - score)` descending. Only show categories below 10. + +--- + +## Important Rules + +1. **Wrap, don't replace.** Run the project's own tools. Never substitute your own analysis for what the tool reports. +2. **Read-only.** Never fix issues. Present the dashboard and let the user decide. +3. **Respect CLAUDE.md.** If `## Health Stack` is configured, use those exact commands. Do not second-guess. +4. **Skipped is not failed.** If a tool isn't available, skip it gracefully and redistribute weight. Do not penalize the score. +5. **Show raw output for failures.** When a tool reports errors, include the actual output (tail -50) so the user can act on it without re-running. +6. **Trends require history.** On first run, say "First health check -- no trend data yet. Run /health again after making changes to track progress." +7. **Be honest about scores.** A codebase with 100 type errors and all tests passing is not healthy. The composite score should reflect reality.