From dd94598adb26b511ffbb77fc4a2a57d3cbb50087 Mon Sep 17 00:00:00 2001
From: Garry Tan
Date: Thu, 26 Mar 2026 17:31:11 -0600
Subject: [PATCH] chore: bump version and changelog (v0.12.3.0)

Co-Authored-By: Claude Opus 4.6
---
 CHANGELOG.md |  9 +++++++++
 TODOS.md     | 12 ++++++++++++
 VERSION      |  2 +-
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index b228078a..a31ca57f 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,14 @@
 # Changelog
 
+## [0.12.3.0] - 2026-03-26 — Full Commit Coverage in /ship
+
+When you ship a branch with 12 commits spanning performance work, dead code removal, and test infra, the PR should mention all three. It wasn't. The CHANGELOG and PR summary biased toward whatever happened most recently, silently dropping earlier work.
+
+### Fixed
+
+- **/ship Step 5 (CHANGELOG):** Now forces explicit commit enumeration before writing. You list every commit, group by theme, write the entry, then cross-check that every commit maps to a bullet. No more recency bias.
+- **/ship Step 8 (PR body):** Changed from "bullet points from CHANGELOG" to explicit commit-by-commit coverage. Groups commits into logical sections. Excludes the VERSION/CHANGELOG metadata commit (bookkeeping, not a change). Every substantive commit must appear somewhere.
+
 ## [0.12.2.0] - 2026-03-26 — Deploy with Confidence: First-Run Dry Run
 
 The first time you run `/land-and-deploy` on a project, it does a dry run. It detects your deploy infrastructure, tests that every command works, and shows you exactly what will happen... before it touches anything. You confirm, and from then on it just works.
diff --git a/TODOS.md b/TODOS.md
index 8458a98a..819ff02d 100644
--- a/TODOS.md
+++ b/TODOS.md
@@ -221,6 +221,18 @@ Linux cookie import shipped in v0.11.11.0 (Wave 3). Supports Chrome, Chromium, B
 **Priority:** P2
 **Depends on:** None (BASE_BRANCH_DETECT multi-platform resolver is already done)
 
+### Multi-commit CHANGELOG completeness eval
+
+**What:** Add a periodic E2E eval that creates a branch with 5+ commits spanning 3+ themes (features, cleanup, infra), runs /ship's Step 5 CHANGELOG generation, and verifies the CHANGELOG mentions all themes.
+
+**Why:** The bug fixed in v0.11.22 (garrytan/ship-full-commit-coverage) showed that /ship's CHANGELOG generation biased toward recent commits on long branches. The prompt fix adds a cross-check, but no test exercises the multi-commit failure mode. The existing `ship-local-workflow` E2E only uses a single-commit branch.
+
+**Context:** Would be a `periodic` tier test (~$4/run, non-deterministic since it tests LLM instruction-following). Setup: create bare remote, clone, add 5+ commits across different themes on a feature branch, run Step 5 via `claude -p`, verify CHANGELOG output covers all themes. Pattern: `ship-local-workflow` in `test/skill-e2e-workflow.test.ts`.
+
+**Effort:** M
+**Priority:** P3
+**Depends on:** None
+
 ### Ship log — persistent record of /ship runs
 
 **What:** Append structured JSON entry to `.gstack/ship-log.json` at end of every /ship run (version, date, branch, PR URL, review findings, Greptile stats, todos completed, test results).
diff --git a/VERSION b/VERSION
index 26ff4d6c..47516518 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-0.12.2.0
+0.12.3.0