QA Report: {APP_NAME}
| Field | Value |
|---|---|
| Date | {DATE} |
| URL | {URL} |
| Branch | {BRANCH} |
| Commit | {COMMIT_SHA} ({COMMIT_DATE}) |
| PR | {PR_NUMBER} ({PR_URL}) or "—" |
| Tier | Quick / Standard / Exhaustive |
| Scope | {SCOPE or "Full app"} |
| Duration | {DURATION} |
| Pages visited | {COUNT} |
| Screenshots | {COUNT} |
| Framework | {DETECTED or "Unknown"} |
| Index | All QA runs |
Health Score: {SCORE}/100
| Category | Score |
|---|---|
| Console | {0-100} |
| Links | {0-100} |
| Visual | {0-100} |
| Functional | {0-100} |
| UX | {0-100} |
| Performance | {0-100} |
| Accessibility | {0-100} |
Top 3 Things to Fix
- {ISSUE-NNN}: {title} — {one-line description}
- {ISSUE-NNN}: {title} — {one-line description}
- {ISSUE-NNN}: {title} — {one-line description}
Console Health
| Error | Count | First seen |
|---|---|---|
| {error message} | {N} | {URL} |
Summary
| Severity | Count |
|---|---|
| Critical | 0 |
| High | 0 |
| Medium | 0 |
| Low | 0 |
| Total | 0 |
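
The counts in the Summary table can be derived from the issue list rather than filled in by hand. The following is a minimal sketch, assuming each issue is a dict with a `severity` key holding one of the four levels used in this template; the function name `severity_summary` is hypothetical and not part of gstack.

```python
from collections import Counter

# Severity levels in the order the Summary table lists them.
SEVERITIES = ["critical", "high", "medium", "low"]


def severity_summary(issues):
    """Return Markdown rows for the Summary table.

    `issues` is assumed to be a list of dicts, each with a "severity" key
    (hypothetical shape; adapt to however issues are actually stored).
    """
    counts = Counter(issue["severity"].lower() for issue in issues)
    rows = ["| Severity | Count |", "|---|---|"]
    for sev in SEVERITIES:
        # Counter returns 0 for severities that never occur.
        rows.append(f"| {sev.capitalize()} | {counts[sev]} |")
    rows.append(f"| Total | {sum(counts[s] for s in SEVERITIES)} |")
    return "\n".join(rows)
```

Calling `severity_summary(issues)` returns the whole table as one string, so it can be substituted for the placeholder rows above.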
Issues
ISSUE-001: {Short title}
| Field | Value |
|---|---|
| Severity | critical / high / medium / low |
| Category | visual / functional / ux / content / performance / console / accessibility |
| URL | {page URL} |
Description: {What is wrong, expected vs actual.}
Repro Steps:
Fixes Applied (if applicable)
| Issue | Fix Status | Commit | Files Changed |
|---|---|---|---|
| ISSUE-NNN | verified / best-effort / reverted / deferred | {SHA} | {files} |
Before/After Evidence
ISSUE-NNN: {title}
Ship Readiness
| Metric | Value |
|---|---|
| Health score | {before} → {after} ({delta}) |
| Issues found | N |
| Fixes applied | N (verified: X, best-effort: Y, reverted: Z) |
| Deferred | N |
PR Summary: "QA found N issues, fixed M, health score X → Y."
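
The health score row and the PR summary line follow a fixed pattern, so they can be generated from the run's numbers. A small sketch, assuming integer scores and counts; the helper names `health_delta` and `pr_summary` are illustrative only.

```python
def health_delta(score_before, score_after):
    """Format the 'Health score' cell: before, after, and a signed delta."""
    delta = score_after - score_before
    # The "+d" format spec always prints the sign, e.g. +19 or -5.
    return f"{score_before} → {score_after} ({delta:+d})"


def pr_summary(found, fixed, score_before, score_after):
    """Format the one-line PR summary using the wording from this template."""
    return (
        f"QA found {found} issues, fixed {fixed}, "
        f"health score {score_before} → {score_after}."
    )
```

For example, `pr_summary(5, 3, 62, 81)` fills in N, M, X, and Y in the sentence above.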
Regression (if applicable)
| Metric | Baseline | Current | Delta |
|---|---|---|---|
| Health score | {N} | {N} | {+/-N} |
| Issues | {N} | {N} | {+/-N} |
Fixed since baseline: {list}

New since baseline: {list}
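
The regression figures amount to two subtractions and two set differences. A minimal sketch, assuming issues can be matched across runs by a stable identifier (for instance a title or fingerprint; how a baseline is stored is not specified here, and the function name `regression` is hypothetical).

```python
def regression(baseline_issues, current_issues, baseline_score, current_score):
    """Compare a current QA run against a baseline run.

    Issues are assumed to be hashable identifiers that stay stable
    between runs (an assumption, not something this template defines).
    """
    baseline, current = set(baseline_issues), set(current_issues)
    return {
        "score_delta": current_score - baseline_score,
        "issue_delta": len(current) - len(baseline),
        # In the baseline but no longer present: fixed since baseline.
        "fixed": sorted(baseline - current),
        # Present now but not in the baseline: new since baseline.
        "new": sorted(current - baseline),
    }
```

The `fixed` and `new` lists map directly onto the two `{list}` fields, and the two deltas onto the Delta column of the table.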




