From 9b3f1facb5b364a53fc3cf43cb8c8461f8879581 Mon Sep 17 00:00:00 2001
From: Garry Tan
Date: Tue, 17 Mar 2026 10:57:02 -0700
Subject: [PATCH] feat: make coverage audit trace actual codepaths, not just syntax patterns

Step 3.4 now instructs Claude to read full files, trace data flow through
every branch, diagram the execution, and check each branch against tests.
Phase 8e.5 regression tests now trace the bug's codepath before writing the
test, catching adjacent edge cases.
---
 qa/SKILL.md        |  9 ++++++++-
 qa/SKILL.md.tmpl   |  9 ++++++++-
 ship/SKILL.md      | 36 +++++++++++++++++++++++-------------
 ship/SKILL.md.tmpl | 36 +++++++++++++++++++++++-------------
 4 files changed, 62 insertions(+), 28 deletions(-)

diff --git a/qa/SKILL.md b/qa/SKILL.md
index 44167be7..c01514cf 100644
--- a/qa/SKILL.md
+++ b/qa/SKILL.md
@@ -707,12 +707,19 @@ Read 2-3 test files closest to the fix (same directory, same code type). Match e
 - File naming, imports, assertion style, describe/it nesting, setup/teardown patterns
 
 The regression test must look like it was written by the same developer.
-**2. Write a regression test encoding the exact bug condition:**
+**2. Trace the bug's codepath, then write a regression test:**
+
+Before writing the test, trace the data flow through the code you just fixed:
+- What input/state triggered the bug? (the exact precondition)
+- What codepath did it follow? (which branches, which function calls)
+- Where did it break? (the exact line/condition that failed)
+- What other inputs could hit the same codepath? (edge cases around the fix)
 
 The test MUST:
 - Set up the precondition that triggered the bug (the exact state that made it break)
 - Perform the action that exposed the bug
 - Assert the correct behavior (NOT "it renders" or "it doesn't throw")
+- If you found adjacent edge cases while tracing, test those too (e.g., null input, empty array, boundary value)
 - Include full attribution comment:
   ```
   // Regression: ISSUE-NNN — {what broke}
diff --git a/qa/SKILL.md.tmpl b/qa/SKILL.md.tmpl
index 6a7513c3..bd94debe 100644
--- a/qa/SKILL.md.tmpl
+++ b/qa/SKILL.md.tmpl
@@ -184,12 +184,19 @@ Read 2-3 test files closest to the fix (same directory, same code type). Match e
 - File naming, imports, assertion style, describe/it nesting, setup/teardown patterns
 
 The regression test must look like it was written by the same developer.
-**2. Write a regression test encoding the exact bug condition:**
+**2. Trace the bug's codepath, then write a regression test:**
+
+Before writing the test, trace the data flow through the code you just fixed:
+- What input/state triggered the bug? (the exact precondition)
+- What codepath did it follow? (which branches, which function calls)
+- Where did it break? (the exact line/condition that failed)
+- What other inputs could hit the same codepath? (edge cases around the fix)
 
 The test MUST:
 - Set up the precondition that triggered the bug (the exact state that made it break)
 - Perform the action that exposed the bug
 - Assert the correct behavior (NOT "it renders" or "it doesn't throw")
+- If you found adjacent edge cases while tracing, test those too (e.g., null input, empty array, boundary value)
 - Include full attribution comment:
   ```
   // Regression: ISSUE-NNN — {what broke}
diff --git a/ship/SKILL.md b/ship/SKILL.md
index 84389bda..6f1057c4 100644
--- a/ship/SKILL.md
+++ b/ship/SKILL.md
@@ -450,22 +450,32 @@ find . -name '*.test.*' -o -name '*.spec.*' -o -name '*_test.*' -o -name '*_spec
 
 Store this number for the PR body.
 
-**1. Build the code path map** from `git diff origin/...HEAD`:
+**1. Trace every codepath changed** using `git diff origin/...HEAD`:
 
-Extract all new or modified:
-- Functions/methods (def, function, const, class methods)
-- Conditional branches (if/else, switch/case, ternary, guard clauses, early returns)
-- API routes/endpoints (route definitions, controller actions)
-- Components (new files or new exports)
-- Error handlers (try/catch, rescue, error boundaries, fallback paths)
+Read every changed file. For each one, trace how data flows through the code — don't just list functions, actually follow the execution:
 
-**2. Search for corresponding tests and score quality:**
+1. **Read the diff.** For each changed file, read the full file (not just the diff hunk) to understand context.
+2. **Trace data flow.** Starting from each entry point (route handler, exported function, event listener, component render), follow the data through every branch:
+   - Where does input come from? (request params, props, database, API call)
+   - What transforms it? (validation, mapping, computation)
+   - Where does it go? (database write, API response, rendered output, side effect)
+   - What can go wrong at each step? (null/undefined, invalid input, network failure, empty collection)
+3. **Diagram the execution.** For each changed file, draw an ASCII diagram showing:
+   - Every function/method that was added or modified
+   - Every conditional branch (if/else, switch, ternary, guard clause, early return)
+   - Every error path (try/catch, rescue, error boundary, fallback)
+   - Every call to another function (trace into it — does IT have untested branches?)
+   - Every edge: what happens with null input? Empty array? Invalid type?
 
-For each code path, search for a test exercising it:
-- `src/services/billing.ts:processPayment` → `billing.test.ts`, `billing.spec.ts`
-- `app/controllers/payments_controller.rb#create` → `test/controllers/payments_controller_test.rb`
-- New if/else → tests for BOTH paths (not just happy path)
-- New error handler → test triggering the error condition
+This is the critical step — you're building a map of every line of code that can execute differently based on input. Every branch in this diagram needs a test.
+
+**2. Check each branch against existing tests:**
+
+Go through your diagram branch by branch. For each one, search for a test that exercises it:
+- Function `processPayment()` → look for `billing.test.ts`, `billing.spec.ts`, `test/billing_test.rb`
+- An if/else → look for tests covering BOTH the true AND false path
+- An error handler → look for a test that triggers that specific error condition
+- A call to `helperFn()` that has its own branches → those branches need tests too
 
 Quality scoring rubric:
 - ★★★ Tests behavior with edge cases AND error paths
diff --git a/ship/SKILL.md.tmpl b/ship/SKILL.md.tmpl
index b3b8c7bb..9b2fcbc0 100644
--- a/ship/SKILL.md.tmpl
+++ b/ship/SKILL.md.tmpl
@@ -181,22 +181,32 @@ find . -name '*.test.*' -o -name '*.spec.*' -o -name '*_test.*' -o -name '*_spec
 
 Store this number for the PR body.
 
-**1. Build the code path map** from `git diff origin/...HEAD`:
+**1. Trace every codepath changed** using `git diff origin/...HEAD`:
 
-Extract all new or modified:
-- Functions/methods (def, function, const, class methods)
-- Conditional branches (if/else, switch/case, ternary, guard clauses, early returns)
-- API routes/endpoints (route definitions, controller actions)
-- Components (new files or new exports)
-- Error handlers (try/catch, rescue, error boundaries, fallback paths)
+Read every changed file. For each one, trace how data flows through the code — don't just list functions, actually follow the execution:
 
-**2. Search for corresponding tests and score quality:**
+1. **Read the diff.** For each changed file, read the full file (not just the diff hunk) to understand context.
+2. **Trace data flow.** Starting from each entry point (route handler, exported function, event listener, component render), follow the data through every branch:
+   - Where does input come from? (request params, props, database, API call)
+   - What transforms it? (validation, mapping, computation)
+   - Where does it go? (database write, API response, rendered output, side effect)
+   - What can go wrong at each step? (null/undefined, invalid input, network failure, empty collection)
+3. **Diagram the execution.** For each changed file, draw an ASCII diagram showing:
+   - Every function/method that was added or modified
+   - Every conditional branch (if/else, switch, ternary, guard clause, early return)
+   - Every error path (try/catch, rescue, error boundary, fallback)
+   - Every call to another function (trace into it — does IT have untested branches?)
+   - Every edge: what happens with null input? Empty array? Invalid type?
 
-For each code path, search for a test exercising it:
-- `src/services/billing.ts:processPayment` → `billing.test.ts`, `billing.spec.ts`
-- `app/controllers/payments_controller.rb#create` → `test/controllers/payments_controller_test.rb`
-- New if/else → tests for BOTH paths (not just happy path)
-- New error handler → test triggering the error condition
+This is the critical step — you're building a map of every line of code that can execute differently based on input. Every branch in this diagram needs a test.
+
+**2. Check each branch against existing tests:**
+
+Go through your diagram branch by branch. For each one, search for a test that exercises it:
+- Function `processPayment()` → look for `billing.test.ts`, `billing.spec.ts`, `test/billing_test.rb`
+- An if/else → look for tests covering BOTH the true AND false path
+- An error handler → look for a test that triggers that specific error condition
+- A call to `helperFn()` that has its own branches → those branches need tests too
 
 Quality scoring rubric:
 - ★★★ Tests behavior with edge cases AND error paths
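
To make the Phase 8e.5 instructions concrete, here is a minimal sketch of a regression test written the way the patch describes: trace the codepath, encode the exact precondition, assert real behavior, and cover an adjacent edge case found while tracing. Everything in it is hypothetical illustration, not part of the patch: the issue number, `processPayment()`, the `./billing` module, and the zero-amount bug are stand-ins for whatever the real fix touched, and it assumes a Jest/Vitest-style test runner.

```ts
// billing.test.ts (illustrative only; ISSUE-142, processPayment(), and the
// zero-amount bug are hypothetical stand-ins for the real fix)
import { processPayment } from "./billing";

describe("processPayment", () => {
  // Regression: ISSUE-142 — zero-amount invoices threw instead of being marked paid
  // Codepath traced: createInvoice → processPayment → chargeCard (skipped when amount === 0)
  it("marks a zero-amount invoice as paid without charging the card", () => {
    // Precondition that triggered the bug: an invoice with amount === 0
    const invoice = { id: "inv_1", amount: 0, status: "open" as const };

    // Action that exposed the bug
    const result = processPayment(invoice);

    // Assert the correct behavior, not just "it doesn't throw"
    expect(result.status).toBe("paid");
    expect(result.chargedAmount).toBe(0);
  });

  // Adjacent edge case found while tracing the same codepath
  it("rejects a negative amount instead of charging it", () => {
    const invoice = { id: "inv_2", amount: -5, status: "open" as const };

    expect(() => processPayment(invoice)).toThrow(/negative amount/);
  });
});
```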
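Similarly, a sketch of the kind of ASCII codepath diagram step 3 of the coverage audit asks for, again with invented names (`POST /payments`, `processPayment`, `chargeCard`): each branch and error path becomes a line that can be checked off against an existing test.

```
POST /payments → processPayment(invoice)
├── invoice.amount < 0   → throw InvalidAmountError        [test? negative amount]
├── invoice.amount === 0 → markPaid(invoice), skip charge  [test? zero amount]
└── invoice.amount > 0   → chargeCard(invoice)
    ├── charge succeeds  → markPaid(invoice)               [test? happy path]
    └── charge fails     → retry once, then markFailed     [test? error path + retry]
```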