feat: test bootstrap, regression tests, coverage audit, retro test health

- Add {{TEST_BOOTSTRAP}} resolver to gen-skill-docs.ts
- Add Phase 8e.5 regression test generation to /qa and /qa-design-review
- Add Step 3.4 test coverage audit with quality scoring to /ship
- Add test health tracking to /retro
- Add 2 E2E evals (bootstrap + coverage audit)
- Add 26 validation tests
- Update ARCHITECTURE.md placeholder table
- Add 2 P3 TODOs (CI/CD non-GitHub, auto-upgrade weak tests)
This commit is contained in:
Garry Tan
2026-03-17 10:41:40 -07:00
parent 73b00b4e29
commit 04eb7824bd
16 changed files with 1512 additions and 6 deletions
+156
View File
@@ -846,6 +846,161 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
- Informational only — does NOT block.`;
}
function generateTestBootstrap(): string {
return `## Test Framework Bootstrap
**Detect existing test framework and project runtime:**
\`\`\`bash
# Detect project runtime
[ -f Gemfile ] && echo "RUNTIME:ruby"
[ -f package.json ] && echo "RUNTIME:node"
[ -f requirements.txt ] || [ -f pyproject.toml ] && echo "RUNTIME:python"
[ -f go.mod ] && echo "RUNTIME:go"
[ -f Cargo.toml ] && echo "RUNTIME:rust"
[ -f composer.json ] && echo "RUNTIME:php"
[ -f mix.exs ] && echo "RUNTIME:elixir"
# Detect sub-frameworks
[ -f Gemfile ] && grep -q "rails" Gemfile 2>/dev/null && echo "FRAMEWORK:rails"
[ -f package.json ] && grep -q '"next"' package.json 2>/dev/null && echo "FRAMEWORK:nextjs"
# Check for existing test infrastructure
ls jest.config.* vitest.config.* playwright.config.* .rspec pytest.ini pyproject.toml phpunit.xml 2>/dev/null
ls -d test/ tests/ spec/ __tests__/ cypress/ e2e/ 2>/dev/null
# Check opt-out marker
[ -f .gstack/no-test-bootstrap ] && echo "BOOTSTRAP_DECLINED"
\`\`\`
**If test framework detected** (config files or test directories found):
Print "Test framework detected: {name} ({N} existing tests). Skipping bootstrap."
Read 2-3 existing test files to learn conventions (naming, imports, assertion style, setup patterns).
Store conventions as prose context for use in Phase 8e.5 or Step 3.4. **Skip the rest of bootstrap.**
**If BOOTSTRAP_DECLINED** appears: Print "Test bootstrap previously declined — skipping." **Skip the rest of bootstrap.**
**If NO runtime detected** (no config files found): Use AskUserQuestion:
"I couldn't detect your project's language. What runtime are you using?"
Options: A) Node.js/TypeScript B) Ruby/Rails C) Python D) Go E) Rust F) PHP G) Elixir H) This project doesn't need tests.
If user picks H → write \`.gstack/no-test-bootstrap\` and continue without tests.
**If runtime detected but no test framework — bootstrap:**
### B2. Research best practices
Use WebSearch to find current best practices for the detected runtime:
- \`"[runtime] best test framework 2025 2026"\`
- \`"[framework A] vs [framework B] comparison"\`
If WebSearch is unavailable, use this built-in knowledge table:
| Runtime | Primary recommendation | Alternative |
|---------|----------------------|-------------|
| Ruby/Rails | minitest + fixtures + capybara | rspec + factory_bot + shoulda-matchers |
| Node.js | vitest + @testing-library | jest + @testing-library |
| Next.js | vitest + @testing-library/react + playwright | jest + cypress |
| Python | pytest + pytest-cov | unittest |
| Go | stdlib testing + testify | stdlib only |
| Rust | cargo test (built-in) + mockall | — |
| PHP | phpunit + mockery | pest |
| Elixir | ExUnit (built-in) + ex_machina | — |
### B3. Framework selection
Use AskUserQuestion:
"I detected this is a [Runtime/Framework] project with no test framework. I researched current best practices. Here are the options:
A) [Primary] — [rationale]. Includes: [packages]. Supports: unit, integration, smoke, e2e
B) [Alternative] — [rationale]. Includes: [packages]
C) Skip — don't set up testing right now
RECOMMENDATION: Choose A because [reason based on project context]"
If user picks C → write \`.gstack/no-test-bootstrap\`. Tell user: "If you change your mind later, delete \`.gstack/no-test-bootstrap\` and re-run." Continue without tests.
If multiple runtimes detected (monorepo) → ask which runtime to set up first, with option to do both sequentially.
### B4. Install and configure
1. Install the chosen packages (npm/bun/gem/pip/etc.)
2. Create minimal config file
3. Create directory structure (test/, spec/, etc.)
4. Create one example test matching the project's code to verify setup works
If package installation fails → debug once. If still failing → revert with \`git checkout -- package.json package-lock.json\` (or equivalent for the runtime). Warn user and continue without tests.
### B4.5. First real tests
Generate 3-5 real tests for existing code:
1. **Find recently changed files:** \`git log --since=30.days --name-only --format="" | sort | uniq -c | sort -rn | head -10\`
2. **Prioritize by risk:** Error handlers > business logic with conditionals > API endpoints > pure functions
3. **For each file:** Write one test that tests real behavior with meaningful assertions. Never \`expect(x).toBeDefined()\` — test what the code DOES.
4. Run each test. Passes → keep. Fails → fix once. Still fails → delete silently.
5. Generate at least 1 test, cap at 5.
Never import secrets, API keys, or credentials in test files. Use environment variables or test fixtures.
### B5. Verify
\`\`\`bash
# Run the full test suite to confirm everything works
{detected test command}
\`\`\`
If tests fail → debug once. If still failing → revert all bootstrap changes and warn user.
### B5.5. CI/CD pipeline
\`\`\`bash
# Check CI provider
ls -d .github/ 2>/dev/null && echo "CI:github"
ls .gitlab-ci.yml .circleci/ bitrise.yml 2>/dev/null
\`\`\`
If \`.github/\` exists (or no CI detected — default to GitHub Actions):
Create \`.github/workflows/test.yml\` with:
- \`runs-on: ubuntu-latest\`
- Appropriate setup action for the runtime (setup-node, setup-ruby, setup-python, etc.)
- The same test command verified in B5
- Trigger: push + pull_request
If non-GitHub CI detected → skip CI generation with note: "Detected {provider} — CI pipeline generation supports GitHub Actions only. Add test step to your existing pipeline manually."
### B6. Create TESTING.md
First check: If TESTING.md already exists → read it and update/append rather than overwriting. Never destroy existing content.
Write TESTING.md with:
- Philosophy: "100% test coverage is the key to great vibe coding. Tests let you move fast, trust your instincts, and ship with confidence — without them, vibe coding is just yolo coding. With tests, it's a superpower."
- Framework name and version
- How to run tests (the verified command from B5)
- Test layers: Unit tests (what, where, when), Integration tests, Smoke tests, E2E tests
- Conventions: file naming, assertion style, setup/teardown patterns
### B7. Update CLAUDE.md
First check: If CLAUDE.md already has a \`## Testing\` section → skip. Don't duplicate.
Append a \`## Testing\` section:
- Run command and test directory
- Reference to TESTING.md
- Test expectations:
- 100% test coverage is the goal — tests make vibe coding safe
- When writing new functions, write a corresponding test
- When fixing a bug, write a regression test
- When adding error handling, write a test that triggers the error
- When adding a conditional (if/else, switch), write tests for BOTH paths
- Never commit code that makes existing tests fail
### B8. Commit
\`\`\`bash
git status --porcelain
\`\`\`
Only commit if there are changes. Stage all bootstrap files (config, test directory, TESTING.md, CLAUDE.md, .github/workflows/test.yml if created):
\`git commit -m "chore: bootstrap test framework ({framework name})"\`
---`;
}
const RESOLVERS: Record<string, () => string> = {
COMMAND_REFERENCE: generateCommandReference,
SNAPSHOT_FLAGS: generateSnapshotFlags,
@@ -855,6 +1010,7 @@ const RESOLVERS: Record<string, () => string> = {
QA_METHODOLOGY: generateQAMethodology,
DESIGN_METHODOLOGY: generateDesignMethodology,
REVIEW_DASHBOARD: generateReviewDashboard,
TEST_BOOTSTRAP: generateTestBootstrap,
};
// ─── Template Processing ────────────────────────────────────