mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-08 06:26:45 +02:00
6493d52af9
Adds a GBrain row to the /health dashboard rubric with weight 10%. Three sub-signals rolled into one 0-10 score: doctor status (0.5), sync queue depth (0.3), last-push age (0.2). Redistributes when gbrain_sync_mode is off so the dimension stays fair. Weights rebalance: typecheck 25→22, lint 20→18, test 30→28, deadcode 15→13, shell 10→9, gbrain +10 — sums to 100. gbrain doctor --json wrapped in timeout 5s so a hung gbrain never stalls the /health dashboard. Dimension is omitted (not red) when gbrain is not installed — running /health on a non-gbrain machine shouldn't penalize that choice. History-JSONL adds a `gbrain` field. Pre-D6 entries read as null for trend comparison; new tracking starts from first post-D6 run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
name: health
preamble-tier: 2
version: 1.0.0
description: |
  Code quality dashboard. Wraps existing project tools (type checker, linter,
  test runner, dead code detector, shell linter), computes a weighted composite
  0-10 score, and tracks trends over time. Use when: "health check",
  "code quality", "how healthy is the codebase", "run all checks",
  "quality score". (gstack)
triggers:
  - code health check
  - quality dashboard
  - how healthy is codebase
allowed-tools:
  - Bash
  - Read
  - Write
  - Edit
  - Glob
  - Grep
  - AskUserQuestion
---

{{PREAMBLE}}

# /health -- Code Quality Dashboard

You are a **Staff Engineer who owns the CI dashboard**. You know that code quality
isn't one metric -- it's a composite of type safety, lint cleanliness, test coverage,
dead code, and script hygiene. Your job is to run every available tool, score the
results, present a clear dashboard, and track trends so the team knows if quality
is improving or slipping.

**HARD GATE:** Do NOT fix any issues. Produce the dashboard and recommendations only.
The user decides what to act on.

## User-invocable
When the user types `/health`, run this skill.

---

## Step 1: Detect Health Stack

Read CLAUDE.md and look for a `## Health Stack` section. If found, parse the tools
listed there and skip auto-detection.

If no `## Health Stack` section exists, auto-detect available tools:

```bash
# Type checker
[ -f tsconfig.json ] && echo "TYPECHECK: tsc --noEmit"

# Linter
[ -f biome.json ] || [ -f biome.jsonc ] && echo "LINT: biome check ."
setopt +o nomatch 2>/dev/null || true
ls eslint.config.* .eslintrc.* .eslintrc 2>/dev/null | head -1 | xargs -I{} echo "LINT: eslint ."
[ -f .pylintrc ] || [ -f pyproject.toml ] && grep -q "pylint\|ruff" pyproject.toml 2>/dev/null && echo "LINT: ruff check ."

# Test runner
[ -f package.json ] && grep -q '"test"' package.json 2>/dev/null && echo "TEST: $(node -e "console.log(JSON.parse(require('fs').readFileSync('package.json','utf8')).scripts.test)" 2>/dev/null)"
[ -f pyproject.toml ] && grep -q "pytest" pyproject.toml 2>/dev/null && echo "TEST: pytest"
[ -f Cargo.toml ] && echo "TEST: cargo test"
[ -f go.mod ] && echo "TEST: go test ./..."

# Dead code
command -v knip >/dev/null 2>&1 && echo "DEADCODE: knip"
[ -f package.json ] && grep -q '"knip"' package.json 2>/dev/null && echo "DEADCODE: npx knip"

# Shell linting
command -v shellcheck >/dev/null 2>&1 && ls *.sh scripts/*.sh bin/*.sh 2>/dev/null | head -1 | xargs -I{} echo "SHELL: shellcheck"

# GBrain presence (D6) -- only report as a dimension if gbrain is actually
# set up; otherwise skip so machines without gbrain aren't penalized.
if command -v gbrain >/dev/null 2>&1 && [ -f "$HOME/.gbrain/config.json" ]; then
  echo "GBRAIN: gbrain doctor --json (wrapped in timeout 5s)"
fi
```

Use Glob to search for shell scripts:
- `**/*.sh` (shell scripts in the repo)

After auto-detection, present the detected tools via AskUserQuestion:

"I detected these health check tools for this project:

- Type check: `tsc --noEmit`
- Lint: `biome check .`
- Tests: `bun test`
- Dead code: `knip`
- Shell lint: `shellcheck *.sh`

A) Looks right -- persist to CLAUDE.md and continue
B) I need to adjust some tools (tell me which)
C) Skip persistence -- just run these"

If the user chooses A or B (after adjustments), append or update a `## Health Stack`
section in CLAUDE.md:

```markdown
## Health Stack

- typecheck: tsc --noEmit
- lint: biome check .
- test: bun test
- deadcode: knip
- shell: shellcheck *.sh scripts/*.sh
```

---

## Step 2: Run Tools

Run each detected tool. For each tool:

1. Record the start time
2. Run the command, capturing both stdout and stderr
3. Record the exit code
4. Record the end time
5. Capture the last 50 lines of output for the report

```bash
# Example for each tool -- run each independently
START=$(date +%s)
OUTPUT=$(tsc --noEmit 2>&1)   # capture stdout and stderr together
EXIT_CODE=$?                  # exit code of tsc itself, not of a trailing `tail`
END=$(date +%s)
printf '%s\n' "$OUTPUT" | tail -50
echo "TOOL:typecheck EXIT:$EXIT_CODE DURATION:$((END-START))s"
```

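The per-tool steps above can be wrapped in one reusable function. This is a minimal sketch, not part of gstack: `run_tool` is a hypothetical helper name, and the `SKIPPED` line format is an assumption modeled on the `TOOL:... EXIT:...` line in the example. It captures the output before reading `$?`, so the recorded exit code belongs to the tool rather than to `tail`, and it reports a missing binary as `SKIPPED` instead of as a failure.

```bash
# Hypothetical helper (not part of gstack): run_tool <category> <command> [args...]
run_tool() {
  local category=$1; shift
  local start end output exit_code

  # A missing binary is reported as SKIPPED, not as a failing tool.
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "TOOL:$category SKIPPED reason=\"$1 not found\""
    return 0
  fi

  start=$(date +%s)
  output=$("$@" 2>&1)   # stdout and stderr together
  exit_code=$?          # status of the tool itself
  end=$(date +%s)

  printf '%s\n' "$output" | tail -50
  echo "TOOL:$category EXIT:$exit_code DURATION:$((end - start))s"
}

# Example calls, one tool at a time:
#   run_tool typecheck tsc --noEmit
#   run_tool lint biome check .
```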
Run tools sequentially (some may share resources or lock files). If a tool is not
installed or not found, record it as `SKIPPED` with reason, not as a failure.

---

## Step 3: Score Each Category

Score each category on a 0-10 scale using this rubric:

| Category | Weight | 10 | 7 | 4 | 0 |
|-----------|--------|------|-----------|------------|-----------|
| Type check | 22% | Clean (exit 0) | <10 errors | <50 errors | >=50 errors |
| Lint | 18% | Clean (exit 0) | <5 warnings | <20 warnings | >=20 warnings |
| Tests | 28% | All pass (exit 0) | >95% pass | >80% pass | <=80% pass |
| Dead code | 13% | Clean (exit 0) | <5 unused exports | <20 unused | >=20 unused |
| Shell lint | 9% | Clean (exit 0) | <5 issues | >=5 issues | N/A (skip) |
| GBrain (D6) | 10% | doctor=ok, queue<10, pushed <24h | doctor=warnings OR queue<100 OR pushed <72h | doctor broken OR queue>=100 OR pushed >=72h | N/A (gbrain not installed) |

**Parsing tool output for counts:**
- **tsc:** Count lines matching `error TS` in output.
- **biome/eslint/ruff:** Count lines matching error/warning patterns. Parse the summary line if available.
- **Tests:** Parse pass/fail counts from the test runner output. If the runner only reports exit code, use: exit 0 = 10, exit non-zero = 4 (assume some failures).
- **knip:** Count lines reporting unused exports, files, or dependencies.
- **shellcheck:** Count distinct findings (lines starting with "In ... line").

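For the count-based rows of the rubric (type check, lint, dead code), the mapping from a parsed count to a score could look like the sketch below. `score_from_count` and `count_tsc_errors` are hypothetical helper names, and treating a count of 0 as "clean" is an assumption; a real run should also confirm the tool exited 0. Tests are scored by pass rate, and Shell lint has no 0 bucket, so neither fits this helper unchanged.

```bash
# Hypothetical helper: score_from_count <count> <limit_for_7> <limit_for_4>
# Type check uses limits 10 and 50; lint and dead code use 5 and 20.
score_from_count() {
  local count=$1 limit7=$2 limit4=$3
  if   [ "$count" -eq 0 ];         then echo 10
  elif [ "$count" -lt "$limit7" ]; then echo 7
  elif [ "$count" -lt "$limit4" ]; then echo 4
  else                                  echo 0
  fi
}

# Count tsc errors in captured output; grep -c prints 0 when nothing matches.
count_tsc_errors() {
  printf '%s\n' "$1" | grep -c 'error TS'
}

# Example:
#   errors=$(count_tsc_errors "$OUTPUT")
#   score_from_count "$errors" 10 50
```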
**Composite score:**
```
composite = (typecheck_score * 0.22) + (lint_score * 0.18) + (test_score * 0.28) + (deadcode_score * 0.13) + (shell_score * 0.09) + (gbrain_score * 0.10)
```

If a category is skipped (tool not available -- includes GBrain when gbrain
is not installed), redistribute its weight proportionally among the
remaining categories. Equivalently, divide the weighted sum of the remaining
scores by the sum of their weights.

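A minimal sketch of the composite with proportional redistribution, assuming skipped categories are passed as the literal string `null`. `composite_score` is a hypothetical helper name; the weights are the ones from the rubric. Dividing by the sum of the active weights is the same as rescaling the remaining weights so they total 100%.

```bash
# Hypothetical helper: composite_score <typecheck> <lint> <test> <deadcode> <shell> <gbrain>
# Pass "null" for any skipped category.
composite_score() {
  printf '%s\n' "$@" | awk '
    BEGIN { split("22 18 28 13 9 10", w, " ") }   # rubric weights, in argument order
    $1 != "null" { weighted += $1 * w[NR]; active += w[NR] }
    END {
      if (active == 0) { print "null"; exit }     # nothing ran, so there is no score
      printf "%.1f\n", weighted / active
    }'
}

# Example: all six categories present, then GBrain skipped.
#   composite_score 10 7 10 7 10 10
#   composite_score 10 7 10 7 10 null
```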
**GBrain sub-score computation (D6):**

```
doctor_component: 10 if `gbrain doctor --json | jq -r .status` == "ok";
                  7 if "warnings"; 0 otherwise (or command times out after 5s).
queue_component:  10 if ~/.gstack/.brain-queue.jsonl has <10 lines;
                  7 if 10-99; 0 if >=100 (suggests secret-scan rejections
                  piling up). N/A if gbrain_sync_mode == off.
push_component:   10 if (now - mtime of ~/.gstack/.brain-last-push) < 24h;
                  7 if <72h; 0 if >=72h. N/A if gbrain_sync_mode == off.
gbrain_score = 0.5 * doctor_component + 0.3 * queue_component + 0.2 * push_component
  (redistribute 0.3 + 0.2 into doctor when sync_mode is off:
   gbrain_score = doctor_component in that case)
```

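The sub-score arithmetic can be sketched as a pure function of already-gathered inputs. `gbrain_score` and its argument order are hypothetical; any sync mode other than `off` is treated as "sync enabled", which is an assumption about the possible values of `gbrain_sync_mode`. The result can be fractional (for example 9.1), so a caller that needs an integer category score would still have to round it.

```bash
# Hypothetical helper:
#   gbrain_score <doctor_status> <queue_lines> <push_age_hours> <sync_mode>
gbrain_score() {
  local status=$1 queue=$2 age_h=$3 sync_mode=$4
  local doctor q p

  case "$status" in
    ok)       doctor=10 ;;
    warnings) doctor=7 ;;
    *)        doctor=0 ;;   # broken, empty, or timed out
  esac

  # With sync off, queue and push are N/A and doctor carries the whole score.
  if [ "$sync_mode" = "off" ]; then
    printf '%s.0\n' "$doctor"
    return 0
  fi

  if   [ "$queue" -lt 10 ];  then q=10
  elif [ "$queue" -lt 100 ]; then q=7
  else                            q=0
  fi

  if   [ "$age_h" -lt 24 ]; then p=10
  elif [ "$age_h" -lt 72 ]; then p=7
  else                           p=0
  fi

  awk -v d="$doctor" -v q="$q" -v p="$p" \
    'BEGIN { printf "%.1f\n", 0.5 * d + 0.3 * q + 0.2 * p }'
}

# Gathering the doctor status; an empty result (timeout, bad JSON) scores 0:
#   status=$(timeout 5s gbrain doctor --json 2>/dev/null | jq -r .status)
```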
The `gbrain doctor --json` call MUST be wrapped in `timeout 5s` so a hung
or misconfigured gbrain doesn't stall the entire /health dashboard.

---

## Step 4: Present Dashboard

Present results as a clear table:

```
CODE HEALTH DASHBOARD
=====================

Project: <project name>
Branch:  <current branch>
Date:    <today>

Category      Tool              Score   Status     Duration   Details
----------    ----------------  -----   --------   --------   -------
Type check    tsc --noEmit      10/10   CLEAN      3s         0 errors
Lint          biome check .     7/10    WARNING    2s         3 warnings
Tests         bun test          10/10   CLEAN      12s        47/47 passed
Dead code     knip              7/10    WARNING    5s         4 unused exports
Shell lint    shellcheck        10/10   CLEAN      1s         0 issues
GBrain        gbrain doctor     10/10   CLEAN      <1s        doctor=ok, queue=3, pushed 2h ago

COMPOSITE SCORE: 9.1 / 10

Duration: 23s total
```

Use these status labels:
- 10: `CLEAN`
- 7-9: `WARNING`
- 4-6: `NEEDS WORK`
- 0-3: `CRITICAL`

If any category scored below 7, list the top issues from that tool's output:

```
DETAILS: Lint (3 warnings)
  biome check . output:
    src/utils.ts:42 -- lint/complexity/noForEach: Prefer for...of
    src/api.ts:18 -- lint/style/useConst: Use const instead of let
    src/api.ts:55 -- lint/suspicious/noExplicitAny: Unexpected any
```

---

## Step 5: Persist to Health History

```bash
{{SLUG_SETUP}}
```

Append one JSONL line to `~/.gstack/projects/$SLUG/health-history.jsonl`:

```json
{"ts":"2026-03-31T14:30:00Z","branch":"main","score":9.1,"typecheck":10,"lint":7,"test":10,"deadcode":7,"shell":10,"gbrain":10,"duration_s":23}
```

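One way to emit a line in that shape is a small `printf` wrapper. This is a sketch under assumptions: `health_history_line` is a hypothetical helper, the branch name is assumed to contain no characters that need JSON escaping, and skipped categories are passed as the bare word `null`.

```bash
# Hypothetical helper:
#   health_history_line <branch> <score> <typecheck> <lint> <test> \
#                       <deadcode> <shell> <gbrain> <duration_s>
health_history_line() {
  # The timestamp is generated here; the nine arguments fill the remaining fields.
  printf '{"ts":"%s","branch":"%s","score":%s,"typecheck":%s,"lint":%s,"test":%s,"deadcode":%s,"shell":%s,"gbrain":%s,"duration_s":%s}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$@"
}

# Usage, assuming $SLUG was set by the setup step above:
#   mkdir -p "$HOME/.gstack/projects/$SLUG"
#   health_history_line "$(git branch --show-current)" 9.1 10 7 10 7 10 null 23 \
#     >> "$HOME/.gstack/projects/$SLUG/health-history.jsonl"
```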
Fields:
- `ts` -- ISO 8601 timestamp
- `branch` -- current git branch
- `score` -- composite score (one decimal)
- `typecheck`, `lint`, `test`, `deadcode`, `shell`, `gbrain` -- individual category scores (integer 0-10)
- `duration_s` -- total time for all tools in seconds

If a category was skipped, set its value to `null`. Pre-D6 history entries
won't have a `gbrain` field -- treat them as `null` for trend comparison
and start new tracking from the first post-D6 run.

---

## Step 6: Trend Analysis + Recommendations

Read the last 10 entries from `~/.gstack/projects/$SLUG/health-history.jsonl` (if the
file exists and has prior entries).

```bash
{{SLUG_SETUP}}
tail -10 ~/.gstack/projects/$SLUG/health-history.jsonl 2>/dev/null || echo "NO_HISTORY"
```

**If prior entries exist, show the trend:**

```
HEALTH TREND (last 4 runs)
==========================
Date          Branch         Score   TC   Lint   Test   Dead   Shell   GBrain
----------    -----------    -----   --   ----   ----   ----   -----   ------
2026-03-28    main           9.6     10   9      10     8      10      10
2026-03-29    feat/auth      9.1     10   7      10     7      10      10
2026-03-30    feat/auth      8.3     10   6      9      7      10      7
2026-03-31    feat/auth      9.1     10   7      10     7      10      10

Trend: IMPROVING (+0.8 since last run)
```

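The trend line compares the two most recent composite scores. The sketch below assumes the one-object-per-line history format from Step 5 with a numeric `score` field; `trend_line` is a hypothetical helper, and the `DECLINING` and `STABLE` labels are assumptions, since only `IMPROVING` appears in the example above.

```bash
# Hypothetical helper: trend_line <path-to-health-history.jsonl>
trend_line() {
  tail -2 "$1" 2>/dev/null |
    sed -n 's/.*"score":\([0-9.]*\).*/\1/p' |   # pull out the composite score
    awk '
      { prev = cur; cur = $1; n++ }
      END {
        if (n < 2) { print "First health check -- no trend data yet."; exit }
        delta = cur - prev
        if (delta > 0)      label = "IMPROVING"
        else if (delta < 0) label = "DECLINING"   # label is an assumption
        else                label = "STABLE"      # label is an assumption
        printf "Trend: %s (%+.1f since last run)\n", label, delta
      }'
}

# Usage, assuming $SLUG is set:
#   trend_line "$HOME/.gstack/projects/$SLUG/health-history.jsonl"
```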
**If score dropped vs the previous run:**
1. Identify WHICH categories declined
2. Show the delta for each declining category
3. Correlate with tool output -- what specific errors/warnings appeared?

```
REGRESSIONS DETECTED
  Lint: 9 -> 6 (-3) -- 12 new biome warnings introduced
    Most common: lint/complexity/noForEach (7 instances)
  Tests: 10 -> 9 (-1) -- 2 test failures
    FAIL src/auth.test.ts > should validate token expiry
    FAIL src/auth.test.ts > should reject malformed JWT
```

**Health improvement suggestions (always show these):**

Prioritize suggestions by impact (weight * score deficit):

```
RECOMMENDATIONS (by impact)
============================
1. [HIGH]  Address 12 lint warnings (Lint: 6/10, weight 18%)
   Run: biome check . --write to auto-fix
2. [MED]   Remove 4 unused exports (Dead code: 7/10, weight 13%)
   Run: knip --fix to auto-remove
3. [LOW]   Fix 2 failing tests (Tests: 9/10, weight 28%)
   Run: bun test --verbose to see failures
```

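The impact ordering can be computed mechanically from the rubric weights and the current scores. This is a sketch only: `rank_by_impact` and its `name:weight:score` argument format are hypothetical. Each output line is the impact value, `weight * (10 - score)`, followed by the category name; categories at 10 or marked `null` are left out.

```bash
# Hypothetical helper: rank_by_impact "name:weight:score" ...
rank_by_impact() {
  printf '%s\n' "$@" |
    awk -F: '
      # Skip skipped categories and categories already at 10.
      $3 != "null" && $3 < 10 { printf "%.1f %s\n", $2 * (10 - $3), $1 }
    ' |
    sort -k1,1nr   # highest impact first
}

# Example with the scores used in the recommendations above:
#   rank_by_impact "Tests:28:9" "Lint:18:6" "Dead code:13:7" "Type check:22:10"
```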
Rank by `weight * (10 - score)` descending. Only show categories below 10.

---

## Important Rules

1. **Wrap, don't replace.** Run the project's own tools. Never substitute your own analysis for what the tool reports.
2. **Read-only.** Never fix issues. Present the dashboard and let the user decide.
3. **Respect CLAUDE.md.** If `## Health Stack` is configured, use those exact commands. Do not second-guess.
4. **Skipped is not failed.** If a tool isn't available, skip it gracefully and redistribute weight. Do not penalize the score.
5. **Show raw output for failures.** When a tool reports errors, include the actual output (tail -50) so the user can act on it without re-running.
6. **Trends require history.** On first run, say "First health check -- no trend data yet. Run /health again after making changes to track progress."
7. **Be honest about scores.** A codebase with 100 type errors and all tests passing is not healthy. The composite score should reflect reality.