gstack/review/specialists/testing.md
Garry Tan 9ca8f1d7a9 feat: adaptive gating + cross-review dedup for review army (v0.15.2.0) (#760)
* feat: add test_stub optional field to specialist finding schema

All specialist prompts now document test_stub as an optional output field,
enabling specialists to suggest test code alongside findings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: adaptive gating + test framework detection for review army

Adds gstack-specialist-stats binary for tracking specialist hit rates.
Resolver now detects test framework for test_stub generation, applies
adaptive gating to skip silent specialists, and compiles per-specialist
stats for the review-log entry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: cross-review finding dedup + test stub override + enriched review-log

Step 5.0 suppresses findings previously skipped by the user when the
relevant code hasn't changed. Test stub findings force ASK classification
so users approve test creation. Review-log now includes quality_score,
per-specialist stats, and per-finding action records.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.15.2.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: bash operator precedence in test framework detection

[ -f a ] || [ -f b ] && X="y" evaluates as A || (B && C), so the
assignment only runs when the second test passes. Wrap the OR group
in braces: { [ -f a ] || [ -f b ]; } && X="y".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 22:46:21 -07:00

Testing Specialist Review Checklist

Scope: Always-on (every review)
Output: JSON objects, one finding per line.
Schema: {"severity":"CRITICAL|INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"testing","summary":"...","fix":"...","fingerprint":"path:line:testing","specialist":"testing"}
Optional: line, fix, fingerprint, evidence, test_stub.
If no findings: output NO FINDINGS and nothing else.
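
For illustration only, a minimal Python sketch of the output format: it builds one finding and serializes it as a single JSON line. The helper names `make_finding` and `emit` and the sample values are hypothetical; only the field names, the severity values, and the `path:line:testing` fingerprint shape come from the schema above. Treating `test_stub` as a plain string is an assumption, since its shape is not specified here.

```python
import json

SEVERITIES = {"CRITICAL", "INFORMATIONAL"}

def make_finding(severity, confidence, path, summary,
                 line=None, fix=None, test_stub=None):
    """Build one finding dict; optional fields are left out when not provided."""
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity}")
    finding = {
        "severity": severity,
        "confidence": confidence,
        "path": path,
        "category": "testing",
        "summary": summary,
        "specialist": "testing",
    }
    if line is not None:
        finding["line"] = line
        finding["fingerprint"] = f"{path}:{line}:testing"
    if fix is not None:
        finding["fix"] = fix
    if test_stub is not None:
        finding["test_stub"] = test_stub  # assumed to be a string of test code
    return finding

def emit(findings):
    """One JSON object per line, or the literal NO FINDINGS when the list is empty."""
    if not findings:
        return "NO FINDINGS"
    return "\n".join(json.dumps(f, separators=(",", ":")) for f in findings)
```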


Categories

Missing Negative-Path Tests

  • New code paths that handle errors, rejections, or invalid input with NO corresponding test
  • Guard clauses and early returns that are untested
  • Error branches in try/catch, rescue, or error boundaries with no failure-path test
  • Permission/auth checks that are enforced in code but never tested for the "denied" case
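
As a sketch of what a negative-path test for a guard clause might look like (the `delete_post` function, `PermissionDenied` exception, and data shapes are hypothetical, not taken from any reviewed codebase):

```python
class PermissionDenied(Exception):
    """Hypothetical error raised when a user may not perform an action."""

def delete_post(user, post):
    # Guard clause: only the author or an admin may delete.
    if user["role"] != "admin" and user["id"] != post["author_id"]:
        raise PermissionDenied("not allowed")
    post["deleted"] = True
    return post

def test_delete_post_denied_for_other_user():
    post = {"author_id": 1, "deleted": False}
    try:
        delete_post({"id": 2, "role": "member"}, post)
    except PermissionDenied:
        pass
    else:
        raise AssertionError("expected PermissionDenied")
    # The failure path must also leave state untouched.
    assert post["deleted"] is False
```

A happy-path test alone would pass even if the guard clause were deleted; the "denied" test is what pins it down.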

Missing Edge-Case Coverage

  • Boundary values: zero, negative, max-int, empty string, empty array, nil/null/undefined
  • Single-element collections (off-by-one on loops)
  • Unicode and special characters in user-facing inputs
  • Concurrent access patterns with no race-condition test
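
A minimal sketch of table-driven boundary coverage, assuming a hypothetical `chunk` helper. The cases cover the empty collection, a single element, an exact fit, one element over, non-ASCII input, and invalid sizes:

```python
def chunk(items, size):
    """Hypothetical function under test: split items into lists of at most `size`."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]

BOUNDARY_CASES = [
    ([], 3, []),                                           # empty collection
    (["a"], 3, [["a"]]),                                   # single element
    (["a", "b", "c"], 3, [["a", "b", "c"]]),               # exactly one full chunk
    (["a", "b", "c", "d"], 3, [["a", "b", "c"], ["d"]]),   # one past the boundary
    (["é", "日本"], 1, [["é"], ["日本"]]),                   # non-ASCII input
]

def test_chunk_boundaries():
    for items, size, expected in BOUNDARY_CASES:
        assert chunk(items, size) == expected, (items, size)

def test_chunk_rejects_non_positive_size():
    for bad_size in (0, -1):
        try:
            chunk(["a"], bad_size)
        except ValueError:
            pass
        else:
            raise AssertionError(f"expected ValueError for size={bad_size}")
```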

Test Isolation Violations

  • Tests sharing mutable state (class variables, global singletons, DB records not cleaned up)
  • Order-dependent tests (pass in sequence, fail when randomized)
  • Tests that depend on the system clock, timezone, or locale
  • Tests that make real network calls instead of using stubs/mocks
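
One common way to avoid the clock dependency is to make the current time an injectable parameter. A minimal sketch, with a hypothetical `is_expired` function:

```python
from datetime import datetime, timezone

def is_expired(expires_at, now=None):
    """Hypothetical function under test; `now` is injectable so tests never read the real clock."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expires_at

def test_is_expired_uses_injected_clock():
    expires = datetime(2030, 1, 1, tzinfo=timezone.utc)
    just_before = datetime(2029, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
    just_after = datetime(2030, 1, 1, 0, 0, 1, tzinfo=timezone.utc)
    assert not is_expired(expires, now=just_before)
    assert is_expired(expires, now=expires)      # boundary: exactly at expiry
    assert is_expired(expires, now=just_after)
```

The same injection idea applies to network clients and shared state: pass the dependency in, and hand the test a stub.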

Flaky Test Patterns

  • Timing-dependent assertions (sleep, setTimeout, waitFor with tight timeouts)
  • Assertions on ordering of unordered results (hash keys, Set iteration, async resolution order)
  • Tests that depend on external services (APIs, databases) without a stub, fake, or other fallback
  • Randomized test data without seed control
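
A sketch of two of the fixes implied above: comparing unordered results as sets, and generating random test data from a seeded, local generator. The functions under test (`unique_tags`, `top_n`) are hypothetical.

```python
import random

def unique_tags(posts):
    """Hypothetical function under test: returns a set, so iteration order is unspecified."""
    return {tag for post in posts for tag in post["tags"]}

def top_n(values, n):
    """Hypothetical function under test: the n largest values, descending."""
    return sorted(values, reverse=True)[:n]

def test_unique_tags_does_not_assume_order():
    posts = [{"tags": ["b", "a"]}, {"tags": ["a", "c"]}]
    # Compare as a set rather than asserting on list(...) order.
    assert unique_tags(posts) == {"a", "b", "c"}

def test_top_n_with_seeded_random_data(seed=1234):
    rng = random.Random(seed)  # local seeded generator, so a failure can be replayed
    values = [rng.randint(-1000, 1000) for _ in range(50)]
    result = top_n(values, 5)
    # Include the seed in failure messages to make reproduction trivial.
    assert len(result) == 5, f"seed={seed}"
    assert result[0] == max(values), f"seed={seed}"
    assert result == sorted(result, reverse=True), f"seed={seed}"
```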

Security Enforcement Tests Missing

  • Auth/authz checks in controllers with no test for the "unauthorized" case
  • Rate limiting logic with no test proving it actually blocks
  • Input sanitization with no test for malicious input
  • CSRF/CORS configuration with no integration test
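
As an illustration of "a test proving it actually blocks", here is a minimal sketch against a hypothetical fixed-window `RateLimiter`. The class is an assumption made up for the example; the point is the shape of the test, which asserts the rejected call explicitly.

```python
class RateLimiter:
    """Hypothetical fixed-window limiter; `now` is passed in so tests control time."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.hits = {}  # key -> (window_start, count)

    def allow(self, key, now):
        start, count = self.hits.get(key, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # window elapsed: start a fresh one
        if count >= self.limit:
            self.hits[key] = (start, count)
            return False
        self.hits[key] = (start, count + 1)
        return True

def test_blocks_after_limit_and_resets():
    limiter = RateLimiter(limit=3, window_seconds=60)
    assert all(limiter.allow("1.2.3.4", now=0) for _ in range(3))
    assert limiter.allow("1.2.3.4", now=1) is False   # the call that must be blocked
    assert limiter.allow("5.6.7.8", now=1) is True    # other clients are unaffected
    assert limiter.allow("1.2.3.4", now=60) is True   # allowed again in the next window
```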

Coverage Gaps

  • New public methods/functions with zero test coverage
  • Changed methods where existing tests only cover the old behavior, not the new branch
  • Utility functions called from multiple places but tested only indirectly
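
A sketch of the "changed method, old tests only" gap, using a hypothetical `shipping_cost` function whose `express` branch is assumed to be the new change. The first test is the kind that typically already exists; the second is the one a review should ask for.

```python
def shipping_cost(subtotal, express=False):
    """Hypothetical function under test; `express` is assumed to be the newly added branch."""
    base = 0.0 if subtotal >= 50 else 5.0
    if express:
        return base + 10.0
    return base

def test_standard_shipping():
    # Pre-existing coverage: exercises only the old behavior.
    assert shipping_cost(20) == 5.0
    assert shipping_cost(50) == 0.0

def test_express_shipping():
    # Coverage for the new branch, including its interaction with the free-shipping threshold.
    assert shipping_cost(20, express=True) == 15.0
    assert shipping_cost(50, express=True) == 10.0
```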