test: refresh ship golden baselines for the memory-loop preamble change

Context Recovery now emits the cross-session-decisions block, so ship's preamble (all hosts) changed. Golden baselines are hand-maintained copies (gen does not write them); refresh them from the fresh gen so golden-file regression passes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 07:10:12 +02:00 · 2026-06-07 12:33:39 -07:00
parent c375788e6c
commit 906fffef26
3 changed files with 55 additions and 0 deletions
@@ -22,6 +22,7 @@ triggers:
 <!-- AUTO-GENERATED from SKILL.md.tmpl — do not edit directly -->
 <!-- Regenerate: bun run gen:skill-docs -->

+
 ## When to invoke this skill

 Use when asked to "ship", "deploy",
@@ -403,6 +404,7 @@ Before calling AskUserQuestion, verify:
 - [ ] If you split, you checked dependencies between options before firing the chain
 - [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue)

+
 ## Artifacts Sync (skill start)

 ```bash
@@ -500,6 +502,8 @@ else
 fi
 ```

+
+
 Privacy stop-gate: if output shows `ARTIFACTS_SYNC: off`, `artifacts_sync_mode_prompted` is `false`, and gbrain is on PATH or `gbrain doctor --fast --json` works, ask once:

 > gstack can publish your artifacts (CEO plans, designs, reports) to a private GitHub repo that GBrain indexes across machines. How much should sync?
@@ -526,6 +530,7 @@ At skill END before telemetry:
 "~/.claude/skills/gstack/bin/gstack-brain-sync" --once 2>/dev/null || true
 ```

+
 ## Model-Specific Behavioral Patch (claude)

 The following nudges are tuned for the claude model family. They are
@@ -580,12 +585,19 @@ if [ -d "$_PROJ" ]; then
  fi
  _LATEST_CP=$(find "$_PROJ/checkpoints" -name "*.md" -type f 2>/dev/null | xargs ls -t 2>/dev/null | head -1)
  [ -n "$_LATEST_CP" ] && echo "LATEST_CHECKPOINT: $_LATEST_CP"
+  if [ -f "$_PROJ/decisions.active.json" ]; then
+    echo "--- ACTIVE DECISIONS (recent, scope-relevant) ---"
+    ~/.claude/skills/gstack/bin/gstack-decision-search --recent 5 2>/dev/null
+    echo "--- END DECISIONS ---"
+  fi
  echo "--- END ARTIFACTS ---"
 fi
 ```

 If artifacts are listed, read the newest useful one. If `LAST_SESSION` or `LATEST_CHECKPOINT` appears, give a 2-sentence welcome back summary. If `RECENT_PATTERN` clearly implies a next skill, suggest it once.

+**Cross-session decisions.** If `ACTIVE DECISIONS` are listed, treat them as prior settled calls with their rationale — do not silently re-litigate them; if you're about to reverse one, say so explicitly. Reach for `~/.claude/skills/gstack/bin/gstack-decision-search` whenever a question touches a past decision ("what did we decide / why / did we try"). When you or the user make a DURABLE decision (architecture, scope, tool/vendor choice, or a reversal) — NOT a turn-level or trivial choice — log it with `~/.claude/skills/gstack/bin/gstack-decision-log` (`--supersede <id>` for a reversal). Reliable and local; gbrain not required.
+
 ## Writing Style (skip entirely if `EXPLAIN_LEVEL: terse` appears in the preamble echo OR the user's current message explicitly requests terse / no-explanations output)

 Applies to AskUserQuestion, user replies, and findings. AskUserQuestion Format is structure; this is prose quality.
@@ -599,6 +611,7 @@ Applies to AskUserQuestion, user replies, and findings. AskUserQuestion Format i

 Curated jargon list lives at `~/.claude/skills/gstack/scripts/jargon-list.json` (80+ terms). On the first jargon term you encounter this session, Read that file once; treat the `terms` array as the canonical list. The list is repo-owned and may grow between releases.

+
 ## Completeness Principle — Boil the Lake

 AI makes completeness cheap. Recommend complete lakes (tests, edge cases, error paths); flag oceans (rewrites, multi-quarter migrations).
@@ -774,6 +787,8 @@ branch name wherever the instructions say "the base branch" or `<default>`.

 ---

+
+
 # Ship: Fully Automated Ship Workflow

 You are running the `/ship` workflow. This is a **non-interactive, fully automated** workflow. Do NOT ask for confirmation at any step. The user said `/ship` which means DO IT. Run straight through and output the PR URL at the end.
@@ -390,6 +390,7 @@ Before calling AskUserQuestion, verify:
 - [ ] If you split, you checked dependencies between options before firing the chain
 - [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue)

+
 ## Artifacts Sync (skill start)

 ```bash
@@ -487,6 +488,8 @@ else
 fi
 ```

+
+
 Privacy stop-gate: if output shows `ARTIFACTS_SYNC: off`, `artifacts_sync_mode_prompted` is `false`, and gbrain is on PATH or `gbrain doctor --fast --json` works, ask once:

 > gstack can publish your artifacts (CEO plans, designs, reports) to a private GitHub repo that GBrain indexes across machines. How much should sync?
@@ -513,6 +516,7 @@ At skill END before telemetry:
 "$GSTACK_BIN/gstack-brain-sync" --once 2>/dev/null || true
 ```

+
 ## Model-Specific Behavioral Patch (claude)

 The following nudges are tuned for the claude model family. They are
@@ -567,12 +571,19 @@ if [ -d "$_PROJ" ]; then
  fi
  _LATEST_CP=$(find "$_PROJ/checkpoints" -name "*.md" -type f 2>/dev/null | xargs ls -t 2>/dev/null | head -1)
  [ -n "$_LATEST_CP" ] && echo "LATEST_CHECKPOINT: $_LATEST_CP"
+  if [ -f "$_PROJ/decisions.active.json" ]; then
+    echo "--- ACTIVE DECISIONS (recent, scope-relevant) ---"
+    $GSTACK_BIN/gstack-decision-search --recent 5 2>/dev/null
+    echo "--- END DECISIONS ---"
+  fi
  echo "--- END ARTIFACTS ---"
 fi
 ```

 If artifacts are listed, read the newest useful one. If `LAST_SESSION` or `LATEST_CHECKPOINT` appears, give a 2-sentence welcome back summary. If `RECENT_PATTERN` clearly implies a next skill, suggest it once.

+**Cross-session decisions.** If `ACTIVE DECISIONS` are listed, treat them as prior settled calls with their rationale — do not silently re-litigate them; if you're about to reverse one, say so explicitly. Reach for `$GSTACK_BIN/gstack-decision-search` whenever a question touches a past decision ("what did we decide / why / did we try"). When you or the user make a DURABLE decision (architecture, scope, tool/vendor choice, or a reversal) — NOT a turn-level or trivial choice — log it with `$GSTACK_BIN/gstack-decision-log` (`--supersede <id>` for a reversal). Reliable and local; gbrain not required.
+
 ## Writing Style (skip entirely if `EXPLAIN_LEVEL: terse` appears in the preamble echo OR the user's current message explicitly requests terse / no-explanations output)

 Applies to AskUserQuestion, user replies, and findings. AskUserQuestion Format is structure; this is prose quality.
@@ -586,6 +597,7 @@ Applies to AskUserQuestion, user replies, and findings. AskUserQuestion Format i

 Curated jargon list lives at `$GSTACK_ROOT/scripts/jargon-list.json` (80+ terms). On the first jargon term you encounter this session, Read that file once; treat the `terms` array as the canonical list. The list is repo-owned and may grow between releases.

+
 ## Completeness Principle — Boil the Lake

 AI makes completeness cheap. Recommend complete lakes (tests, edge cases, error paths); flag oceans (rewrites, multi-quarter migrations).
@@ -761,6 +773,8 @@ branch name wherever the instructions say "the base branch" or `<default>`.

 ---

+
+
 # Ship: Fully Automated Ship Workflow

 You are running the `/ship` workflow. This is a **non-interactive, fully automated** workflow. Do NOT ask for confirmation at any step. The user said `/ship` which means DO IT. Run straight through and output the PR URL at the end.
@@ -800,6 +814,8 @@ Never skip a verification step because a prior `/ship` run already performed it.

 ---

+
+
 ---

 ## Step 1: Pre-flight
@@ -1928,6 +1944,8 @@ Substitute: TIMESTAMP = ISO 8601 datetime, STATUS = "clean" if 0 findings or "is

   Include any design findings alongside the code review findings. They follow the same Fix-First flow below.

+
+
 ### Step 9.3: Cross-review finding dedup

 Before classifying findings, check if any were previously skipped by the user in a prior review on this branch.
@@ -2046,6 +2064,8 @@ For each comment in `comments`:

 ---

+
+
 ## Capture Learnings

 If you discovered a non-obvious pattern, pitfall, or architectural insight during
@@ -2071,6 +2091,8 @@ staleness detection: if those files are later deleted, the learning can be flagg
 **Only log genuine discoveries.** Don't log obvious things. Don't log things the user
 already knows. A good test: would this insight save time in a future session? If yes, log it.

+
+
 ### Refresh learnings for the headline feature on this branch

 The top-of-skill learnings pull was keyed to "release ship" broadly. Before the VERSION/CHANGELOG step, re-pull learnings keyed to THIS branch's headline feature so any prior version-bump or CHANGELOG pitfalls for similar features surface.
@@ -392,6 +392,7 @@ Before calling AskUserQuestion, verify:
 - [ ] If you split, you checked dependencies between options before firing the chain
 - [ ] If a per-option Hold fires, you stopped the chain immediately (didn't queue)

+
 ## Artifacts Sync (skill start)

 ```bash
@@ -489,6 +490,8 @@ else
 fi
 ```

+
+
 Privacy stop-gate: if output shows `ARTIFACTS_SYNC: off`, `artifacts_sync_mode_prompted` is `false`, and gbrain is on PATH or `gbrain doctor --fast --json` works, ask once:

 > gstack can publish your artifacts (CEO plans, designs, reports) to a private GitHub repo that GBrain indexes across machines. How much should sync?
@@ -515,6 +518,7 @@ At skill END before telemetry:
 "$GSTACK_BIN/gstack-brain-sync" --once 2>/dev/null || true
 ```

+
 ## Model-Specific Behavioral Patch (claude)

 The following nudges are tuned for the claude model family. They are
@@ -569,12 +573,19 @@ if [ -d "$_PROJ" ]; then
  fi
  _LATEST_CP=$(find "$_PROJ/checkpoints" -name "*.md" -type f 2>/dev/null | xargs ls -t 2>/dev/null | head -1)
  [ -n "$_LATEST_CP" ] && echo "LATEST_CHECKPOINT: $_LATEST_CP"
+  if [ -f "$_PROJ/decisions.active.json" ]; then
+    echo "--- ACTIVE DECISIONS (recent, scope-relevant) ---"
+    $GSTACK_BIN/gstack-decision-search --recent 5 2>/dev/null
+    echo "--- END DECISIONS ---"
+  fi
  echo "--- END ARTIFACTS ---"
 fi
 ```

 If artifacts are listed, read the newest useful one. If `LAST_SESSION` or `LATEST_CHECKPOINT` appears, give a 2-sentence welcome back summary. If `RECENT_PATTERN` clearly implies a next skill, suggest it once.

+**Cross-session decisions.** If `ACTIVE DECISIONS` are listed, treat them as prior settled calls with their rationale — do not silently re-litigate them; if you're about to reverse one, say so explicitly. Reach for `$GSTACK_BIN/gstack-decision-search` whenever a question touches a past decision ("what did we decide / why / did we try"). When you or the user make a DURABLE decision (architecture, scope, tool/vendor choice, or a reversal) — NOT a turn-level or trivial choice — log it with `$GSTACK_BIN/gstack-decision-log` (`--supersede <id>` for a reversal). Reliable and local; gbrain not required.
+
 ## Writing Style (skip entirely if `EXPLAIN_LEVEL: terse` appears in the preamble echo OR the user's current message explicitly requests terse / no-explanations output)

 Applies to AskUserQuestion, user replies, and findings. AskUserQuestion Format is structure; this is prose quality.
@@ -588,6 +599,7 @@ Applies to AskUserQuestion, user replies, and findings. AskUserQuestion Format i

 Curated jargon list lives at `$GSTACK_ROOT/scripts/jargon-list.json` (80+ terms). On the first jargon term you encounter this session, Read that file once; treat the `terms` array as the canonical list. The list is repo-owned and may grow between releases.

+
 ## Completeness Principle — Boil the Lake

 AI makes completeness cheap. Recommend complete lakes (tests, edge cases, error paths); flag oceans (rewrites, multi-quarter migrations).
@@ -763,6 +775,8 @@ branch name wherever the instructions say "the base branch" or `<default>`.

 ---

+
+
 # Ship: Fully Automated Ship Workflow

 You are running the `/ship` workflow. This is a **non-interactive, fully automated** workflow. Do NOT ask for confirmation at any step. The user said `/ship` which means DO IT. Run straight through and output the PR URL at the end.
@@ -802,6 +816,8 @@ Never skip a verification step because a prior `/ship` run already performed it.

 ---

+
+
 ---

 ## Step 1: Pre-flight
@@ -2453,6 +2469,8 @@ staleness detection: if those files are later deleted, the learning can be flagg
 **Only log genuine discoveries.** Don't log obvious things. Don't log things the user
 already knows. A good test: would this insight save time in a future session? If yes, log it.

+
+
 ### Refresh learnings for the headline feature on this branch

 The top-of-skill learnings pull was keyed to "release ship" broadly. Before the VERSION/CHANGELOG step, re-pull learnings keyed to THIS branch's headline feature so any prior version-bump or CHANGELOG pitfalls for similar features surface.