docs: surface v0.19 binaries and continuous checkpoint in README

The /review doc-staleness check flagged that v0.19.0.0 ships three new CLIs (gstack-model-benchmark, gstack-publish, gstack-taste-update) and an opt-in continuous checkpoint mode, none of which were visible in README's Power tools section. New users couldn't find them without reading CHANGELOG. Added: - "New binaries (v0.19)" subsection with one-row descriptions for each CLI - "Continuous checkpoint mode (opt-in, local by default)" subsection explaining WIP auto-commit + [gstack-context] body + /ship squash + /checkpoint resume CHANGELOG entry already has good voice from /ship; no polish needed. VERSION already at 0.19.0.0. Other docs (ARCHITECTURE/CONTRIBUTING/BROWSER) don't reference this surface — scoped intentionally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:15:24 +02:00 · 2026-04-18 07:57:53 +08:00
parent 620f5dbaea
commit 6a07bdb499
1 changed files with 14 additions and 0 deletions
@@ -236,6 +236,20 @@ Each skill feeds into the next. `/office-hours` writes a design doc that `/plan-
 | `/setup-deploy` | **Deploy Configurator** — one-time setup for `/land-and-deploy`. Detects your platform, production URL, and deploy commands. |
 | `/gstack-upgrade` | **Self-Updater** — upgrade gstack to latest. Detects global vs vendored install, syncs both, shows what changed. |

+### New binaries (v0.19)
+
+Beyond the slash-command skills, gstack ships standalone CLIs for workflows that don't belong inside a session:
+
+| Command | What it does |
+|---------|-------------|
+| `gstack-model-benchmark` | **Cross-model benchmark** — run the same prompt through Claude, GPT (via Codex CLI), and Gemini; compare latency, tokens, cost, and (optionally) LLM-judge quality score. Auth detected per provider, unavailable providers skip cleanly. Output as table, JSON, or markdown. `--dry-run` validates flags + auth without spending API calls. |
+| `gstack-publish` | **Marketplace distribution** — publishes standalone methodology skills (office-hours, ceo-review, investigate, retro) to ClawHub, SkillsMP, and Vercel Skills.sh. Manifest at `skills.json`. `--dry-run` validates everything without actually publishing. Per-skill, per-marketplace error isolation — one failure never aborts the batch. |
+| `gstack-taste-update` | **Design taste learning** — writes approvals and rejections from `/design-shotgun` into a persistent per-project taste profile. Decays 5%/week. Feeds back into future variant generation so the system learns what you actually pick. |
+
+### Continuous checkpoint mode (opt-in, local by default)
+
+Set `gstack-config set checkpoint_mode continuous` and skills auto-commit your work as you go with a `WIP:` prefix plus a structured `[gstack-context]` body (decisions, remaining work, failed approaches). Survives crashes and context switches. `/checkpoint resume` reads those commits to reconstruct session state. `/ship` filter-squashes WIP commits before the PR (preserving non-WIP commits) so bisect stays clean. Push is opt-in via `checkpoint_push=true` — default is local-only so you don't trigger CI on every WIP commit.
+
 **[Deep dives with examples and philosophy for every skill →](docs/skills.md)**

 ### Karpathy's four failure modes? Already covered.