chore: bump version and changelog (v0.19.0.0)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 03:35:09 +02:00 · 2026-04-17 14:17:30 +08:00
parent 7529dbb276
commit c3fd12bab5
2 changed files with 29 additions and 1 deletions
@@ -1,5 +1,33 @@
 # Changelog

+## [0.19.0.0] - 2026-04-17
+
+### Added
+
+- **Per-model behavioral overlays via `--model` flag.** Different LLMs need different nudges. Run `bun run gen:skill-docs --model gpt-5.4` and every generated skill picks up GPT-tuned behavioral patches. Five overlays ship in `model-overlays/`: claude (todo discipline), gpt (anti-termination), gpt-5.4 (anti-verbosity, inherits gpt), gemini (conciseness), o-series (structured output). Overlay files are plain markdown: edit in place, no code changes. A `MODEL_OVERLAY: {model}` line in the preamble output tells you which one is active. Defaults to claude. A missing overlay file resolves to an empty string (graceful), not an error.
+- **Continuous checkpoint mode (opt-in, local by default).** Run `gstack-config set checkpoint_mode continuous` and skills will auto-commit your work as you go, with a `WIP: <description>` prefix and a structured `[gstack-context]` body block (decisions, remaining work, tried-and-failed approaches, current skill). Survives Claude Code crashes. Push is opt-in via `checkpoint_push=true`; it defaults to local-only so you don't accidentally trigger CI on every WIP commit. `/checkpoint resume` now reads both the markdown checkpoint files AND the `[gstack-context]` blocks from WIP commits to reconstruct session state.
+- **`/ship` non-destructively squashes WIP commits** before creating the PR (when continuous mode is active). Uses `git rebase --autosquash` scoped to WIP commits only, preserving any non-WIP commits on the branch. Refuses to blindly soft-reset when non-WIP work is mixed in (which would have caused a non-fast-forward push). Aborts with BLOCKED status on conflict instead of destroying real work.
+- **Feature discovery prompt after upgrade.** When `JUST_UPGRADED` fires, gstack now offers to enable new features once per user (per-feature marker files at `~/.gstack/.feature-prompted-{name}`). Skipped entirely in spawned sessions (OpenClaw orchestrator). No more silent features that never get discovered.
+- **Context health soft directive (T2+ skills).** During long-running skills (/qa, /investigate, /cso), gstack now nudges you to write periodic `[PROGRESS]` summaries. Self-monitoring during 50+ tool-call sessions. No fake thresholds — soft directive that the model self-applies. Progress reporting never mutates git state.
+- **`gstack-model-benchmark` CLI.** Run the same prompt across Claude, GPT, and Gemini, compare latency/tokens/cost/quality. Per-provider auth detection, pricing tables, tool-compatibility map, parallel execution, per-provider error isolation. Quality scoring via Anthropic SDK as the stable judge (`--judge` flag). Output as table, JSON, or markdown. The first multi-provider benchmark in any agent framework — every other tool guesses which model is best, gstack measures it.
+- **`gstack-publish` CLI for marketplace distribution.** Publishes gstack standalone methodology skills (gstack-office-hours, gstack-ceo-review, gstack-investigate, gstack-retro) to ClawHub, SkillsMP, and Vercel Skills.sh. Supports `--dry-run` (validate manifest + auth without publishing), per-skill error isolation (one failure doesn't abort the batch), idempotent re-runs. New `skills.json` manifest at the repo root declares per-skill marketplace targets and metadata.
+- **Design taste engine.** Persistent cross-session taste profile at `~/.gstack/projects/$SLUG/taste-profile.json`. Tracks fonts, colors, layouts, and aesthetic directions you approve and reject across sessions. Confidence decays 5% per week. Design-consultation and design-shotgun now factor in your demonstrated preferences. Schema migration handles legacy approved.json. New `gstack-taste-update` CLI updates the profile after design-shotgun decisions.
+- **Anti-slop design constraints.** Design-consultation now asks "What's the one thing someone will remember?" as a forcing question. Phase 5 self-gate: "Would a human designer be embarrassed by this?" — discards and regenerates if yes. Anti-convergence directive in design-shotgun: each variant must use a different font, palette, and layout, or one of them failed. Space Grotesk added to the overused fonts list (it's the new "safe alternative to Inter" trap). system-ui-as-primary-font added to the AI slop blacklist.
+- **`gstack-config list` and `gstack-config defaults`** subcommands. `list` shows all config keys with their current value AND source (set/default). `defaults` shows just the defaults table. Fixes the prior gap where `get` returned empty for missing keys instead of falling back to the documented defaults. Telemetry default aligned: header and runtime both say `off` now (previously mismatched).
+- **`gstack-config checkpoint_mode` and `checkpoint_push` keys.** New config knobs for continuous checkpoint mode. Both default to safe values (`explicit` mode, no auto-push).
+
+### Changed
+
+- **Preamble split into submodules.** `scripts/resolvers/preamble.ts` was 740 lines with 18 generators inline. Now it's an 80-line composition root that imports each generator from `scripts/resolvers/preamble/*.ts`. Output is byte-identical (verified via `diff -r` on all 135 generated SKILL.md files across all hosts before and after the refactor). Maintenance gets easier: adding a new preamble section is now "create one file, add one import line" instead of "find a spot in the god-file."
+- **Anti-slop dead code removed.** `scripts/gen-skill-docs.ts` had a duplicate copy of `AI_SLOP_BLACKLIST`, `OPENAI_HARD_REJECTIONS`, and `OPENAI_LITMUS_CHECKS`. Deleted — `scripts/resolvers/constants.ts` is now the single source. No more drift risk.
+
+### For contributors
+
+- **Test infrastructure for multi-provider benchmarking.** `test/helpers/providers/{types,claude,gpt,gemini}.ts` defines a uniform `ProviderAdapter` interface and three adapters wrapping the existing CLI runners. `test/helpers/pricing.ts` has per-model cost tables (update quarterly). `test/helpers/tool-map.ts` declares which tools each provider's CLI exposes — benchmarks that need Edit/Glob/Grep correctly skip Gemini and report `unsupported_tool`.
+- **Model taxonomy in neutral `scripts/models.ts`.** Avoids an import cycle through `hosts/index.ts` that would have happened if `Model` lived in `scripts/resolvers/types.ts`. `resolveModel()` handles family heuristics: `gpt-5.4-mini` → `gpt-5.4`, `o3` → `o-series`, `claude-opus-4-7` → `claude`.
+- **`scripts/resolvers/preamble/`** — 18 single-purpose generators, 16-160 lines each. The composition root in `scripts/resolvers/preamble.ts` imports them and wires them into the tier-gated section list.
+- **Plan and reviews persisted.** Implementation followed `~/.claude/plans/declarative-riding-cook.md` which went through CEO review (SCOPE EXPANSION, 6 expansions accepted), DX review (POLISH, 5 gaps fixed), Eng review (4 architecture issues), and Codex review (11 brutal findings, all integrated and 2 prior decisions reversed).
+
 ## [0.18.1.0] - 2026-04-16

 ### Fixed
@@ -1 +1 @@
-0.18.1.0
+0.19.0.0
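
The family heuristics that the "For contributors" notes attribute to `resolveModel()` can be illustrated with a minimal sketch. This is not the code in `scripts/models.ts`: the `Model` union is inferred from the five overlay names in the changelog, the prefix rules are a guess that happens to reproduce the three documented mappings, and the fallback to `claude` for unrecognized names is an assumption.

```typescript
// Hypothetical sketch only; the real scripts/models.ts may differ.
// The union mirrors the five overlay names listed in the changelog.
type Model = "claude" | "gpt" | "gpt-5.4" | "gemini" | "o-series";

function resolveModel(input?: string): Model {
  const name = (input ?? "").trim().toLowerCase();

  // The changelog states that the default is claude.
  if (name === "") return "claude";

  // Most specific family first, so gpt-5.4-mini maps to gpt-5.4, not gpt.
  if (name.startsWith("gpt-5.4")) return "gpt-5.4";
  if (name.startsWith("gpt")) return "gpt";

  // o1, o3, o4-mini and similar: the letter "o" followed by a digit.
  if (name === "o-series" || /^o\d/.test(name)) return "o-series";

  if (name.startsWith("gemini")) return "gemini";
  if (name.startsWith("claude")) return "claude";

  // Assumption: unknown names fall back to the default overlay.
  return "claude";
}
```

Checking the more specific `gpt-5.4` prefix before the generic `gpt` prefix is what keeps variants such as `gpt-5.4-mini` in the narrower family.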
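
The taste-profile entry says confidence "decays 5% per week" without giving the formula. One plausible reading is compounding decay, sketched below. The function name, the timestamp parameters, and the choice of compounding over linear decay are all assumptions for illustration, not the implementation behind `gstack-taste-update`.

```typescript
// Hypothetical sketch: compounding 5% weekly decay of a taste-profile confidence score.
// Assumes compounding (multiply by 0.95 per elapsed week); the real rule may be linear.
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;
const WEEKLY_RETENTION = 0.95; // 5% lost per week, per the changelog

function decayedConfidence(
  confidence: number,
  lastUpdatedMs: number,
  nowMs: number,
): number {
  // Clamp at zero so a timestamp in the future never inflates the score.
  const weeks = Math.max(0, (nowMs - lastUpdatedMs) / WEEK_MS);
  return confidence * Math.pow(WEEKLY_RETENTION, weeks);
}
```

Under this reading, a preference recorded with confidence 1.0 would sit at 0.95 after one week and 0.9025 after two, so older approvals fade gradually instead of expiring at a hard cutoff.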