mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-01 19:25:10 +02:00
66c09644a7
* feat: add parameterized resolver support to gen-skill-docs
Extend the placeholder regex from {{WORD}} to {{WORD:arg1:arg2}},
enabling parameterized resolvers like {{INVOKE_SKILL:plan-ceo-review}}.
- Widen ResolverFn type to accept optional args?: string[]
- Update RESOLVERS record to use ResolverFn type
- Both replacement and unresolved-check regexes updated
- Fully backward compatible: existing {{WORD}} patterns unchanged
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add INVOKE_SKILL resolver for composable skill loading
New composition.ts resolver module that emits prose instructing Claude
to read another skill's SKILL.md and follow it, skipping preamble
sections. Supports optional skip= parameter for additional sections.
Usage: {{INVOKE_SKILL:plan-ceo-review}} or
{{INVOKE_SKILL:plan-ceo-review:skip=Outside Voice}}
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: use frontmatter name: for skill symlinks and Codex paths
Patch all 3 name-derivation paths to read name: from SKILL.md
frontmatter instead of relying solely on directory basenames.
This enables directory names that differ from invocation names
(e.g., run-tests/ directory with name: test).
- setup: link_claude_skill_dirs reads name: via grep, falls back to basename
- gen-skill-docs.ts: codexSkillName uses frontmatter name for Codex output paths
- gen-skill-docs.ts: moved frontmatter extraction before Codex path logic
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: extract CHANGELOG_WORKFLOW resolver from /ship
Move changelog generation logic into a reusable resolver. The resolver
is changelog-only (no version bump per Codex review recommendation).
Adds voice rules inline. /ship Step 5 now uses {{CHANGELOG_WORKFLOW}}.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: use INVOKE_SKILL resolver for plan-ceo-review office-hours fallback
Replace inline skill loading prose (read file, skip sections) with
{{INVOKE_SKILL:office-hours}} in the mid-session detection path.
The BENEFITS_FROM prerequisite offer is unchanged (separate use case).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: BENEFITS_FROM resolver delegates to INVOKE_SKILL
Eliminate duplicated skip-list logic by having generateBenefitsFrom
call generateInvokeSkill internally. The wrapper (AskUserQuestion,
design doc re-check) stays in BENEFITS_FROM. The loading instructions
(read file, skip sections, error handling) come from INVOKE_SKILL.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add resolver tests for INVOKE_SKILL, CHANGELOG_WORKFLOW, parameterized args
12 new tests covering:
- INVOKE_SKILL: template placeholder, default skip list, error handling,
BENEFITS_FROM delegation
- CHANGELOG_WORKFLOW: content, cross-check, voice guidance, format
- Parameterized resolver infra: colon-separated args processing,
no unresolved placeholders across all generated SKILL.md files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.13.7.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: journey routing tests — CLAUDE.md routing rules + stronger descriptions
Three journey E2E tests (ideation, ship, debug) were failing because
Claude answered directly instead of invoking the Skill tool. Root cause:
skill descriptions in system-reminder are too weak to override Claude's
default behavior for tasks it can handle natively.
Fix has two parts:
1. CLAUDE.md routing rules in test workdir — Claude weighs project-level
instructions higher than skill description metadata
2. "Proactively invoke" (not "suggest") in office-hours, investigate,
ship descriptions — reinforces the routing signal
10/10 journey tests now pass (was 7/10).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: one-time CLAUDE.md routing injection prompt
Add a preamble section that checks if the project's CLAUDE.md has
skill routing rules. If not (and user hasn't declined), asks once
via AskUserQuestion to inject a "## Skill routing" section.
Root cause: skill descriptions in system-reminder metadata are too
weak to reliably trigger proactive Skill tool invocation. CLAUDE.md
project instructions carry higher weight in Claude's decision making.
- Preamble bash checks for "## Skill routing" in CLAUDE.md
- Stores decline in gstack-config (routing_declined=true)
- Only asks once per project (HAS_ROUTING check + config check)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: annotated config file + routing injection tests
gstack-config now writes a documented header on first config creation
with every supported key explained (proactive, telemetry, auto_upgrade,
skill_prefix, routing_declined, codex_reviews, skip_eng_review, etc.).
Users can edit ~/.gstack/config.yaml directly, anytime.
Also fixes grep to use ^KEY: anchoring so commented header lines don't
shadow real config values.
Tests added:
- 7 new gstack-config tests (annotated header, no duplication, comment
safety, routing_declined get/set/reset)
- 6 new gen-skill-docs tests (preamble routing injection: bash checks,
config reads, AskUserQuestion, decline persistence, routing rules)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump to v0.13.9.0, separate CHANGELOG from main's releases
Split our branch's changes into a new 0.13.9.0 entry instead of
jamming them into 0.13.7.0 which already landed on main as
"Community Wave."
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: clarify branch-scoped VERSION/CHANGELOG after merging main
Add explicit rules: merging main doesn't mean adopting main's version.
Branch always gets its own entry on top with a higher version number.
Three-point checklist after every merge.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: put our 0.13.9.0 entry on top of CHANGELOG
Newest version goes on top. Our branch lands next, so our entry
must be above main's 0.13.8.0.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: restore missing 0.13.7.0 Community Wave entry
Accidentally dropped the 0.13.7.0 entry when reordering.
All entries now present: 0.13.9.0 > 0.13.8.0 > 0.13.7.0 > 0.13.6.0.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add CHANGELOG integrity check rule
After any edit that moves/adds/removes entries, grep for version
headers and verify no gaps or duplicates before committing.
Prevents accidentally dropping entries during reordering.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
203 lines
8.3 KiB
Cheetah
203 lines
8.3 KiB
Cheetah
---
|
|
name: investigate
|
|
preamble-tier: 2
|
|
version: 1.0.0
|
|
description: |
|
|
Systematic debugging with root cause investigation. Four phases: investigate,
|
|
analyze, hypothesize, implement. Iron Law: no fixes without root cause.
|
|
Use when asked to "debug this", "fix this bug", "why is this broken",
|
|
"investigate this error", or "root cause analysis".
|
|
Proactively invoke this skill (do NOT debug directly) when the user reports
|
|
errors, 500 errors, stack traces, unexpected behavior, "it was working
|
|
yesterday", or is troubleshooting why something stopped working. (gstack)
|
|
allowed-tools:
|
|
- Bash
|
|
- Read
|
|
- Write
|
|
- Edit
|
|
- Grep
|
|
- Glob
|
|
- AskUserQuestion
|
|
- WebSearch
|
|
hooks:
|
|
PreToolUse:
|
|
- matcher: "Edit"
|
|
hooks:
|
|
- type: command
|
|
command: "bash ${CLAUDE_SKILL_DIR}/../freeze/bin/check-freeze.sh"
|
|
statusMessage: "Checking debug scope boundary..."
|
|
- matcher: "Write"
|
|
hooks:
|
|
- type: command
|
|
command: "bash ${CLAUDE_SKILL_DIR}/../freeze/bin/check-freeze.sh"
|
|
statusMessage: "Checking debug scope boundary..."
|
|
---
|
|
|
|
{{PREAMBLE}}
|
|
|
|
# Systematic Debugging
|
|
|
|
## Iron Law
|
|
|
|
**NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST.**
|
|
|
|
Fixing symptoms creates whack-a-mole debugging. Every fix that doesn't address root cause makes the next bug harder to find. Find the root cause, then fix it.
|
|
|
|
---
|
|
|
|
## Phase 1: Root Cause Investigation
|
|
|
|
Gather context before forming any hypothesis.
|
|
|
|
1. **Collect symptoms:** Read the error messages, stack traces, and reproduction steps. If the user hasn't provided enough context, ask ONE question at a time via AskUserQuestion.
|
|
|
|
2. **Read the code:** Trace the code path from the symptom back to potential causes. Use Grep to find all references, Read to understand the logic.
|
|
|
|
3. **Check recent changes:**
|
|
```bash
|
|
git log --oneline -20 -- <affected-files>
|
|
```
|
|
Was this working before? What changed? A regression means the root cause is in the diff.
|
|
|
|
4. **Reproduce:** Can you trigger the bug deterministically? If not, gather more evidence before proceeding.
|
|
|
|
{{LEARNINGS_SEARCH}}
|
|
|
|
Output: **"Root cause hypothesis: ..."** — a specific, testable claim about what is wrong and why.
|
|
|
|
---
|
|
|
|
## Scope Lock
|
|
|
|
After forming your root cause hypothesis, lock edits to the affected module to prevent scope creep.
|
|
|
|
```bash
|
|
[ -x "${CLAUDE_SKILL_DIR}/../freeze/bin/check-freeze.sh" ] && echo "FREEZE_AVAILABLE" || echo "FREEZE_UNAVAILABLE"
|
|
```
|
|
|
|
**If FREEZE_AVAILABLE:** Identify the narrowest directory containing the affected files. Write it to the freeze state file:
|
|
|
|
```bash
|
|
STATE_DIR="${CLAUDE_PLUGIN_DATA:-$HOME/.gstack}"
|
|
mkdir -p "$STATE_DIR"
|
|
echo "<detected-directory>/" > "$STATE_DIR/freeze-dir.txt"
|
|
echo "Debug scope locked to: <detected-directory>/"
|
|
```
|
|
|
|
Substitute `<detected-directory>` with the actual directory path (e.g., `src/auth/`). Tell the user: "Edits restricted to `<dir>/` for this debug session. This prevents changes to unrelated code. Run `/unfreeze` to remove the restriction."
|
|
|
|
If the bug spans the entire repo or the scope is genuinely unclear, skip the lock and note why.
|
|
|
|
**If FREEZE_UNAVAILABLE:** Skip scope lock. Edits are unrestricted.
|
|
|
|
---
|
|
|
|
## Phase 2: Pattern Analysis
|
|
|
|
Check if this bug matches a known pattern:
|
|
|
|
| Pattern | Signature | Where to look |
|
|
|---------|-----------|---------------|
|
|
| Race condition | Intermittent, timing-dependent | Concurrent access to shared state |
|
|
| Nil/null propagation | NoMethodError, TypeError | Missing guards on optional values |
|
|
| State corruption | Inconsistent data, partial updates | Transactions, callbacks, hooks |
|
|
| Integration failure | Timeout, unexpected response | External API calls, service boundaries |
|
|
| Configuration drift | Works locally, fails in staging/prod | Env vars, feature flags, DB state |
|
|
| Stale cache | Shows old data, fixes on cache clear | Redis, CDN, browser cache, Turbo |
|
|
|
|
Also check:
|
|
- `TODOS.md` for related known issues
|
|
- `git log` for prior fixes in the same area — **recurring bugs in the same files are an architectural smell**, not a coincidence
|
|
|
|
**External pattern search:** If the bug doesn't match a known pattern above, WebSearch for:
|
|
- "{framework} {generic error type}" — **sanitize first:** strip hostnames, IPs, file paths, SQL, customer data. Search the error category, not the raw message.
|
|
- "{library} {component} known issues"
|
|
|
|
If WebSearch is unavailable, skip this search and proceed with hypothesis testing. If a documented solution or known dependency bug surfaces, present it as a candidate hypothesis in Phase 3.
|
|
|
|
---
|
|
|
|
## Phase 3: Hypothesis Testing
|
|
|
|
Before writing ANY fix, verify your hypothesis.
|
|
|
|
1. **Confirm the hypothesis:** Add a temporary log statement, assertion, or debug output at the suspected root cause. Run the reproduction. Does the evidence match?
|
|
|
|
2. **If the hypothesis is wrong:** Before forming the next hypothesis, consider searching for the error. **Sanitize first** — strip hostnames, IPs, file paths, SQL fragments, customer identifiers, and any internal/proprietary data from the error message. Search only the generic error type and framework context: "{component} {sanitized error type} {framework version}". If the error message is too specific to sanitize safely, skip the search. If WebSearch is unavailable, skip and proceed. Then return to Phase 1. Gather more evidence. Do not guess.
|
|
|
|
3. **3-strike rule:** If 3 hypotheses fail, **STOP**. Use AskUserQuestion:
|
|
```
|
|
3 hypotheses tested, none match. This may be an architectural issue
|
|
rather than a simple bug.
|
|
|
|
A) Continue investigating — I have a new hypothesis: [describe]
|
|
B) Escalate for human review — this needs someone who knows the system
|
|
C) Add logging and wait — instrument the area and catch it next time
|
|
```
|
|
|
|
**Red flags** — if you see any of these, slow down:
|
|
- "Quick fix for now" — there is no "for now." Fix it right or escalate.
|
|
- Proposing a fix before tracing data flow — you're guessing.
|
|
- Each fix reveals a new problem elsewhere — wrong layer, not wrong code.
|
|
|
|
---
|
|
|
|
## Phase 4: Implementation
|
|
|
|
Once root cause is confirmed:
|
|
|
|
1. **Fix the root cause, not the symptom.** The smallest change that eliminates the actual problem.
|
|
|
|
2. **Minimal diff:** Fewest files touched, fewest lines changed. Resist the urge to refactor adjacent code.
|
|
|
|
3. **Write a regression test** that:
|
|
- **Fails** without the fix (proves the test is meaningful)
|
|
- **Passes** with the fix (proves the fix works)
|
|
|
|
4. **Run the full test suite.** Paste the output. No regressions allowed.
|
|
|
|
5. **If the fix touches >5 files:** Use AskUserQuestion to flag the blast radius:
|
|
```
|
|
This fix touches N files. That's a large blast radius for a bug fix.
|
|
A) Proceed — the root cause genuinely spans these files
|
|
B) Split — fix the critical path now, defer the rest
|
|
C) Rethink — maybe there's a more targeted approach
|
|
```
|
|
|
|
---
|
|
|
|
## Phase 5: Verification & Report
|
|
|
|
**Fresh verification:** Reproduce the original bug scenario and confirm it's fixed. This is not optional.
|
|
|
|
Run the test suite and paste the output.
|
|
|
|
Output a structured debug report:
|
|
```
|
|
DEBUG REPORT
|
|
════════════════════════════════════════
|
|
Symptom: [what the user observed]
|
|
Root cause: [what was actually wrong]
|
|
Fix: [what was changed, with file:line references]
|
|
Evidence: [test output, reproduction attempt showing fix works]
|
|
Regression test: [file:line of the new test]
|
|
Related: [TODOS.md items, prior bugs in same area, architectural notes]
|
|
Status: DONE | DONE_WITH_CONCERNS | BLOCKED
|
|
════════════════════════════════════════
|
|
```
|
|
|
|
{{LEARNINGS_LOG}}
|
|
|
|
---
|
|
|
|
## Important Rules
|
|
|
|
- **3+ failed fix attempts → STOP and question the architecture.** Wrong architecture, not failed hypothesis.
|
|
- **Never apply a fix you cannot verify.** If you can't reproduce and confirm, don't ship it.
|
|
- **Never say "this should fix it."** Verify and prove it. Run the tests.
|
|
- **If fix touches >5 files → AskUserQuestion** about blast radius before proceeding.
|
|
- **Completion status:**
|
|
- DONE — root cause found, fix applied, regression test written, all tests pass
|
|
- DONE_WITH_CONCERNS — fixed but cannot fully verify (e.g., intermittent bug, requires staging)
|
|
- BLOCKED — root cause unclear after investigation, escalated
|