gstack/qa/templates/qa-report-template.md
Garry Tan a2d756f945 feat: Test Bootstrap + Regression Tests + Coverage Audit (v0.6.0) (#136)
* feat: test bootstrap, regression tests, coverage audit, retro test health

- Add {{TEST_BOOTSTRAP}} resolver to gen-skill-docs.ts
- Add Phase 8e.5 regression test generation to /qa and /qa-design-review
- Add Step 3.4 test coverage audit with quality scoring to /ship
- Add test health tracking to /retro
- Add 2 E2E evals (bootstrap + coverage audit)
- Add 26 validation tests
- Update ARCHITECTURE.md placeholder table
- Add 2 P3 TODOs (CI/CD non-GitHub, auto-upgrade weak tests)

* chore: bump version and changelog (v0.6.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: make coverage audit trace actual codepaths, not just syntax patterns

Step 3.4 now instructs Claude to read full files, trace data flow through
every branch, diagram the execution, and check each branch against tests.
Phase 8e.5 regression tests now trace the bug's codepath before writing
the test, catching adjacent edge cases.

* feat: coverage audit now maps user flows, interactions, and error states

Step 3.4 now covers the full picture: code branches AND user-facing behavior.
Maps user flows (complete journey through the feature), interaction edge cases
(double-click, back button, stale state, slow connection), error states
(what does the user actually see?), and boundary states (zero results,
10k results, max-length input). Coverage diagram splits into Code Path
Coverage and User Flow Coverage sections with separate percentages.

* fix: raise test gen cap to 20, add validation tests for user flow coverage

- Raise Step 3.4 test generation cap from 10 to 20 (code + user flow combined)
- Add 3 validation tests: codepath tracing, user flow mapping, diagram sections

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 13:05:18 -05:00


QA Report: {APP_NAME}

| Field | Value |
|-------|-------|
| Date | {DATE} |
| URL | {URL} |
| Branch | {BRANCH} |
| Commit | {COMMIT_SHA} ({COMMIT_DATE}) |
| PR | {PR_NUMBER} ({PR_URL}) or "—" |
| Tier | Quick / Standard / Exhaustive |
| Scope | {SCOPE or "Full app"} |
| Duration | {DURATION} |
| Pages visited | {COUNT} |
| Screenshots | {COUNT} |
| Framework | {DETECTED or "Unknown"} |
| Index | All QA runs |
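
The `{NAME}` placeholders in the header could be filled programmatically. The following is a minimal sketch, not the repository's actual generator: `fillTemplate` is a hypothetical helper, and it only handles simple upper-case tokens such as `{DATE}`. Tokens with alternatives, such as `{SCOPE or "Full app"}`, are left untouched.

```typescript
// Hypothetical helper (not part of the template or the gstack codebase):
// replace simple {UPPER_CASE} tokens with supplied values.
// Tokens without a supplied value stay in place so gaps remain visible.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{([A-Z_]+)\}/g, (token: string, key: string) => {
    const value = values[key];
    return value !== undefined ? value : token;
  });
}

// Example values are illustrative only.
const header = fillTemplate("QA Report: {APP_NAME} ({DATE}) on {BRANCH}", {
  APP_NAME: "demo-app",
  DATE: "2026-03-17",
});
console.log(header);
```

Leaving unknown tokens intact, rather than substituting an empty string, makes an incompletely filled report easy to spot in review.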

Health Score: {SCORE}/100

| Category | Score |
|----------|-------|
| Console | {0-100} |
| Links | {0-100} |
| Visual | {0-100} |
| Functional | {0-100} |
| UX | {0-100} |
| Performance | {0-100} |
| Accessibility | {0-100} |
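
The template does not say how the seven category scores combine into the overall `{SCORE}/100`. As an illustration only, the sketch below assumes an unweighted mean rounded to the nearest integer; the real weighting may differ, and `overallHealthScore` is a hypothetical name.

```typescript
// The seven categories listed in the table above.
type Category =
  | "console" | "links" | "visual" | "functional"
  | "ux" | "performance" | "accessibility";

const CATEGORIES: Category[] = [
  "console", "links", "visual", "functional",
  "ux", "performance", "accessibility",
];

// ASSUMPTION: equal weights. The template does not define a weighting scheme.
function overallHealthScore(scores: Record<Category, number>): number {
  let sum = 0;
  for (const category of CATEGORIES) {
    // Clamp each score to the 0-100 range the template expects.
    sum += Math.min(100, Math.max(0, scores[category]));
  }
  return Math.round(sum / CATEGORIES.length);
}

// Illustrative input, not real QA data.
const exampleScore = overallHealthScore({
  console: 100, links: 90, visual: 80, functional: 70,
  ux: 60, performance: 50, accessibility: 40,
});
console.log(`Health Score: ${exampleScore}/100`);
```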

Top 3 Things to Fix

  1. {ISSUE-NNN}: {title} — {one-line description}
  2. {ISSUE-NNN}: {title} — {one-line description}
  3. {ISSUE-NNN}: {title} — {one-line description}

Console Health

| Error | Count | First seen |
|-------|-------|------------|
| {error message} | {N} | {URL} |

Summary

| Severity | Count |
|----------|-------|
| Critical | 0 |
| High | 0 |
| Medium | 0 |
| Low | 0 |
| Total | 0 |
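
The Summary counts can be derived from the issue list rather than filled in by hand. A minimal sketch, assuming each issue carries one of the four severities used in this template; the `QaIssue` shape and `summarizeBySeverity` name are hypothetical.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

// Hypothetical issue shape: only the fields needed for the tally.
interface QaIssue {
  id: string;
  severity: Severity;
}

// Count issues per severity and keep a running total.
function summarizeBySeverity(issues: QaIssue[]): Record<Severity | "total", number> {
  const counts: Record<Severity | "total", number> = {
    critical: 0, high: 0, medium: 0, low: 0, total: 0,
  };
  for (const issue of issues) {
    counts[issue.severity] += 1;
    counts.total += 1;
  }
  return counts;
}

// Illustrative data only.
const exampleCounts = summarizeBySeverity([
  { id: "ISSUE-001", severity: "high" },
  { id: "ISSUE-002", severity: "low" },
  { id: "ISSUE-003", severity: "high" },
]);
console.log(exampleCounts);
```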

Issues

ISSUE-001: {Short title}

| Field | Value |
|-------|-------|
| Severity | critical / high / medium / low |
| Category | visual / functional / ux / content / performance / console / accessibility |
| URL | {page URL} |

Description: {What is wrong, expected vs actual.}

Repro Steps:

  1. Navigate to {URL} (screenshot: Step 1)
  2. {Action} (screenshot: Step 2)
  3. Observe: {what goes wrong} (screenshot: Result)

Fixes Applied (if applicable)

| Issue | Fix Status | Commit | Files Changed |
|-------|------------|--------|---------------|
| ISSUE-NNN | verified / best-effort / reverted / deferred | {SHA} | {files} |

Before/After Evidence

ISSUE-NNN: {title}

Before: (screenshot: Before)
After: (screenshot: After)


Regression Tests

| Issue | Test File | Status | Description |
|-------|-----------|--------|-------------|
| ISSUE-NNN | path/to/test | committed / deferred / skipped | description |

Deferred Tests

ISSUE-NNN: {title}

Precondition: {setup state that triggers the bug}
Action: {what the user does}
Expected: {correct behavior}
Why deferred: {reason}


Ship Readiness

| Metric | Value |
|--------|-------|
| Health score | {before} → {after} ({delta}) |
| Issues found | N |
| Fixes applied | N (verified: X, best-effort: Y, reverted: Z) |
| Deferred | N |

PR Summary: "QA found N issues, fixed M, health score X → Y."
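
The health score row and the PR summary line follow fixed patterns, so they could be generated from the run's numbers. A minimal sketch under that assumption; `QaOutcome`, `formatDelta`, `healthScoreRow`, and `prSummary` are hypothetical names, and the sample numbers are illustrative.

```typescript
// Hypothetical container for the numbers the Ship Readiness section needs.
interface QaOutcome {
  issuesFound: number;
  fixesApplied: number;
  scoreBefore: number;
  scoreAfter: number;
}

// Render a signed delta: "+19", "-4", or "0" when nothing changed.
function formatDelta(before: number, after: number): string {
  const delta = after - before;
  return delta > 0 ? `+${delta}` : `${delta}`;
}

// Matches the "{before} → {after} ({delta})" pattern of the Health score row.
function healthScoreRow(o: QaOutcome): string {
  return `${o.scoreBefore} → ${o.scoreAfter} (${formatDelta(o.scoreBefore, o.scoreAfter)})`;
}

// Matches the PR Summary sentence from the template.
function prSummary(o: QaOutcome): string {
  return `QA found ${o.issuesFound} issues, fixed ${o.fixesApplied}, ` +
    `health score ${o.scoreBefore} → ${o.scoreAfter}.`;
}

const outcome: QaOutcome = { issuesFound: 5, fixesApplied: 3, scoreBefore: 62, scoreAfter: 81 };
console.log(healthScoreRow(outcome));
console.log(prSummary(outcome));
```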


Regression (if applicable)

| Metric | Baseline | Current | Delta |
|--------|----------|---------|-------|
| Health score | {N} | {N} | {+/-N} |
| Issues | {N} | {N} | {+/-N} |

Fixed since baseline: {list}
New since baseline: {list}
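
The two lists can be computed as set differences between the baseline run and the current run. The sketch below assumes each issue has a key that stays stable across runs (for example a normalized title); the template does not define such a key, and `diffAgainstBaseline` is a hypothetical name.

```typescript
// ASSUMPTION: issues are identified by a key that is stable between runs.
// The template itself does not say how issues are matched across runs.
function diffAgainstBaseline(
  baseline: string[],
  current: string[],
): { fixed: string[]; added: string[] } {
  const baselineKeys = new Set(baseline);
  const currentKeys = new Set(current);
  return {
    // In the baseline but no longer present: fixed since baseline.
    fixed: baseline.filter((key) => !currentKeys.has(key)),
    // Present now but absent from the baseline: new since baseline.
    added: current.filter((key) => !baselineKeys.has(key)),
  };
}

// Illustrative keys only.
const regression = diffAgainstBaseline(
  ["broken-nav-link", "console-404", "low-contrast-button"],
  ["console-404", "low-contrast-button", "form-double-submit"],
);
console.log(`Fixed since baseline: ${regression.fixed.join(", ")}`);
console.log(`New since baseline: ${regression.added.join(", ")}`);
```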