mirror of
https://github.com/garrytan/gstack.git
synced 2026-06-30 21:45:38 +02:00
9fd03fae9e
* fix(gbrain): stop forcing GBRAIN_PREPARE on transaction-mode poolers (#1965) buildGbrainEnv auto-set GBRAIN_PREPARE=true whenever DATABASE_URL targeted port 6543, and the /sync-gbrain capability check exported it for the rest of the skill run. Both had the semantics inverted: gbrain auto-disables prepared statements on transaction-mode poolers because they break every write there ("prepared statement does not exist"); GBRAIN_PREPARE=true is gbrain's documented override for SESSION-mode poolers on 6543, not a requirement for transaction mode. The #1435 search symptom the auto-set worked around was fixed gbrain-side. Remove both force-sets. A caller-set GBRAIN_PREPARE (either value) still passes through untouched, preserving the session-mode-on-6543 escape hatch. isTransactionModePooler stays exported. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(gbrain): classify probe timeout as its own status; sync proceeds instead of skipping (#1964) The 5s engine probe misclassified healthy-but-slow engines (cold Supabase pooler connections measured at 6.9-10.7s) as broken-config, so /sync-gbrain silently skipped code+memory and told the user their config was malformed. - New "timeout" status: probe killed at the deadline with no recognized stderr pattern. Default deadline is now 15s, overridable via GSTACK_GBRAIN_PROBE_TIMEOUT_MS (tests set 300ms against a fake that sleeps 2s). - Sync stages PROCEED on timeout with a stderr warning naming the env knob; a genuinely-dead engine surfaces its real error at the first operation instead of a false config diagnosis. - Consistency everywhere "ok" gated behavior: gstack-gbrain-detect --is-ok exits 0 on timeout, and gen-skill-docs' detection gate accepts it, so a slow engine no longer silently suppresses brain-aware features. - Status cache: key now includes the effective probe timeout (raising it invalidates a cached timeout) and GBRAIN_HOME; config detection honors GBRAIN_HOME so relocated-home users stop being misclassified as missing-config. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(bins): cygpath-normalize SCRIPT_DIR for bun imports; surface learnings-log errors (#1950) Under Windows git-bash, pwd yields a POSIX path (/c/Users/...) that Bun on Windows cannot resolve as an ES module specifier. gstack-learnings-log interpolates SCRIPT_DIR into a bun -e import, so every invocation died with "Cannot find module" — and 2>/dev/null swallowed the error, silently dropping every AI-logged learning for Windows users. - 3-line cygpath -m guard in gstack-learnings-log and gstack-question-log (which gains the same import shape in the next commit). Matches the duplicated IS_WINDOWS convention in setup; no shared shell lib exists. - learnings-log adopts question-log's set +e / TMPERR capture pattern wholesale: validation errors now print to stderr. The old `if [ $? -ne 0 ]` check was dead code under set -euo pipefail — the script exited at the failing assignment before reaching it. - New test/bin-windows-bun-import-paths.test.ts: static invariant (any bash bin interpolating $SCRIPT_DIR into a bun -e import must carry the guard) + behavioral end-to-end run invoked via `bash <bin>` — added to the windows-free-tests workflow list so the conversion is proven on the only platform where the bug exists. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(question-log): dedupe INJECTION_PATTERNS via lib/jsonl-store (#1934) bin/gstack-question-log carried a local copy of the injection-pattern list, so pattern fixes to lib/jsonl-store.ts never propagated — including the /override[:\s]/i false-positive fix arriving via community PR #1940. Import the shared hasInjection instead (enabled by the previous commit's cygpath guard). question-log also gets the lib's stricter superset (human:, disregard, from-now-on, approve-all patterns). Tests pin the contract in a #1940-order-independent way: an "Override: ignore all previous instructions" header is rejected, "prose overrides the deterministic table" is accepted, and a static invariant keeps local INJECTION_PATTERNS duplicates out of the bin. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(security): community-pulse + both dashboards never report fake zeros (#1947) The security-signaling surface failed open at three layers — every failure mode read as a reassuring "0 attacks" / "0 installs": - community-pulse edge function: supabase-js returns {data,error} without throwing, and all five queries discarded `error` — a DB outage produced real-looking zeros via the SUCCESS path, and the catch (also returning zeros with HTTP 200) was unreachable for query failures. Every query now destructures and throws; the catch serves the stale cache (marked "stale": true) when one exists, else 503 {"error":"pulse_unavailable"}. Success responses carry "status":"ok" so clients can distinguish authoritative data from legacy backends. NOTE: the edge function deploys out-of-band (supabase functions deploy community-pulse). - gstack-security-dashboard: captures the HTTP status; non-200 / network failure / error body / missing section → "unknown — backend error"; jq missing → "unknown — install jq" (the lossy grep fallback broke on nested arrays and under-reported attacks as zero — removed); a 200 without the new marker shows figures with an "unverified (legacy backend)" note. Also fixes a latent display bug: the TOTAL grep matched the digit 7 inside "attacks_last_7_days" and misreported every count. - gstack-community-dashboard: same class — curl || echo "{}" plus grep || echo "0" printed "Weekly active installs: 0" on any failure. Now "unknown — backend error (HTTP N)". test/security-dashboard-fallback.test.ts pins the matrix (200+marker, 200-legacy, 503, network failure) x (jq present, jq absent) for both bins: "unknown" states never render as 0. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(telemetry): redact error_message spans before they leave the machine (#1947) error_message was uploaded with only quote/newline escaping — stack traces and failed-API errors can embed credentials, private paths, and hostnames, and the sync path strips only _repo_slug/_branch. New lib/redact-engine.ts export redactFindingSpans(): replaces EVERY finding's span with <REDACTED-{id}> regardless of tier (applyRedactions is the interactive PII-only path and exits nonzero on credential findings, so it can't serve machine egress). Returns null when a span can't be located — callers drop the whole payload rather than risk a leak. gstack-telemetry-log pipes error_message through it at LOG time, so the local JSONL at rest is clean too; surrounding text survives for crash triage. FAIL CLOSED: bun missing, engine error, or non-JSON-string output all null the field. Tests pin: embedded ghp_ token → <REDACTED-github.pat> with context intact; redactor unavailable → null; raw bytes on disk never contain the token. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(redact): prepush guard fails closed on git failure; /ship owns hook install (#1946) Two gaps closed: 1. Fail closed. The git() helper returned "" on ANY non-zero exit or maxBuffer overflow (status null), addedLinesFor produced an empty string, and the push sailed through unscanned — fail-open on exactly the oversized-diff case where a large secret-bearing blob is most likely. The diff call now uses a strict variant that throws; main blocks with a clear message naming the GSTACK_REDACT_PREPUSH=skip escape valve. Probe calls (symbolic-ref, rev-parse, merge-base) keep the permissive helper — their failures are normal control flow. 2. Install path. The hook was installed by nothing ("opt-in, installed by nothing" was the issue's words). ./setup runs in the gstack checkout — the wrong repo for a per-project hook — so it gets a one-line hint only. /ship owns per-repo install: config redact_prepush_hook=true + hook missing → silent install (consent already given); config unset + no ~/.gstack/.redact-prepush-prompted marker → one-time machine-wide AskUserQuestion offer, answer persisted. ship/SKILL.md regenerated in this same commit (check-freshness bisect discipline). Tests: unscannable diff (bogus SHAs) → exit 1 + valve named; empty-but- successful diff → exit 0; static asserts pin setup as hint-only and the ship template as the installer surface. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(redact): six new credential patterns — GitLab, HuggingFace, npm, DigitalOcean, Bearer, GCP SA (#1946) Coverage gaps from the #1946 security review, including token types for tooling gstack itself drives (glab): HIGH (block): gitlab.token (glpat-/glptt-/gldt-), huggingface.token (hf_), npm.token (npm_), digitalocean.token (dop_v1_), gcp.service_account (the JSON-escaped "private_key" form that dodges pem.private_key's literal-block match when minified, confirmed by "private_key_id" proximity). MEDIUM (warn): auth.bearer — the most FP-prone shape in the set (docs are full of "Authorization: Bearer <token>"), so it requires header-context proximity and the same entropy>=3.0 + placeholder validator recipe as env.kv. "Bearer YOUR_TOKEN_HERE" never fires; calibration over coverage, per the cries-wolf principle. All shapes are linear-time; test/redact-pattern-lint.test.ts covers them automatically. Engine tests add positive + placeholder-negative cases per pattern. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * test: coverage-audit additions for the fix wave Ship Step 7 gap-fill (all passing, 248 tests across the touched suites): memory + dream stage probe-timeout proceeds, gbrain-detect override paths, stale-flag passthrough, 200-body-missing-.security fail-closed case, telemetry redaction edges, and credential-pattern edge cases. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix: pre-landing review fixes Review army findings (1 critical, auto-fixed with regression tests): - CRITICAL (security specialist, verified live): redactFindingSpans spliced only the regex capture span, and pem.private_key / gcp.service_account capture just the BEGIN-header — the key body survived "redaction" and shipped via telemetry. Marker-only patterns now drop the whole payload (null, fail closed). Overlapping spans (Bearer+JWT on the same bytes) are coalesced before splicing so stale offsets can't leave partial secret bytes behind. - gitStrict: drop the dead `|| r.status === null` disjunct (null !== 0 already covers it); add the signal-kill/null-status regression test the docstring promised. - security-dashboard human mode flags stale snapshots ("figures may be out of date") instead of presenting frozen counts as current. - community-dashboard marker check uses jq when available — the grep-only variant misclassified whitespaced/reserialized bodies as legacy. - telemetry fail-closed test now shadows bun with a failing stub (deterministic on any host layout); stale "five status cases" describe title renamed. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix: adversarial review fixes (Claude + Codex cross-model passes) Both adversarial passes ran against the wave; every FIXABLE finding landed with a regression test: - probeTimeoutMs clamps to >=1ms: a fractional override floored to 0, and execFileSync treats timeout:0 as NO timeout — the probe that exists to bound hangs could hang forever (found by both models independently). - /ship silent hook install now requires the hooks dir to live inside .git: with core.hooksPath (husky's COMMITTED .husky/), the chaining installer would have renamed the team's committed pre-push and written a machine-local wrapper into the working tree (found by both models). - gstack-config gbrain-refresh accepts the "timeout" status — the last consumer still gating on literal "ok" (Codex); gstack-gbrain-detect's config-derived fields honor GBRAIN_HOME so the detection JSON can't report status ok alongside config_exists false (Codex). - prepush: a remote sha absent locally (shallow clone / stale fetch) falls back to the merge-base/empty-tree range — scans MORE, never blocks a legitimate push into training users toward --no-verify. - dashboards: curl's own 000 no longer doubles to "HTTP 000000"; the community dashboard flags stale snapshots like the security one; array sections parse via jq (the sed/grep loops truncated at the first ']'); the no-jq marker grep tolerates whitespace. - telemetry: multi-line redactor output nulls the field instead of corrupting the JSONL record; setup's hint fires only when the config key is genuinely unset (an explicit false is a recorded decline); the /ship prompt marker honors GSTACK_HOME. Kept as designed (cross-model tension noted): Bearer stays MEDIUM in the prepush gate — a HIGH Bearer would block every docs example; the entropy validator can't eliminate that FP class, and MEDIUM warns visibly. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * chore: bump version and changelog (v1.57.11.0) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * docs: P1 TODO — eval harness live progress + incremental persistence Root-caused during this ship: a killed eval run was indistinguishable from a healthy one for hours (per-file output buffering across mega test files, no incremental eval-store writes, no honest liveness signal). Full context and starting points in the entry. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * test: fix operational-learning E2E fixture — copy lib/jsonl-store.ts Pre-existing breakage, proven on main: gstack-learnings-log has imported lib/jsonl-store.ts (shared injection patterns) since v1.57.5.0 / #1910, but the fixture copies only the bin scripts — the bin exits 1 before writing anything, on main silently (stderr swallowed) and on this branch loudly (the #1950 error-surfacing made the four-day-old failure visible). A real install always ships bin/ and lib/ together; the fixture now does too. Verified: the fixture-shaped invocation writes the learning (exit 0) with lib present, exits 1 on both main and this branch without it. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(ios-qa): isolate E2E tests under --concurrent (3 real races) The ios-qa E2E file failed intermittently under `bun test --concurrent` (the eval harness default). Three distinct shared-state races, all fixed: 1. Shared pidfile: a module-level `workDir` reassigned in beforeEach was clobbered by parallel tests, so concurrent daemons collided on the same pidfile and the loser returned `already_running`. Each test now gets its own dir via makeWorkDir(). 2. process.env path globals: tests set GSTACK_IOS_AUDIT_PATH / _ATTEMPTS_PATH / _ALLOWLIST_PATH on the shared process env; concurrent tests stomped each other's audit/attempts destinations. Threaded auditPath/attemptsPath/allowlistPath through DaemonOptions (and mintForCaller) as explicit args — env is no longer load-bearing. 3. afterEach cleanup race: the per-test cleanup drained a shared dir array, so the first test to finish deleted still-running tests' workDirs mid-assertion. Moved to afterAll (cleans once, after all settle). Verified: 5/5 clean full-suite runs at --max-concurrency 15 (was intermittent); daemon unit suite 91/91; daemon source compiles. The paths default to the env-derived locations when options are omitted, so the production CLI path is unchanged. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * test(pty): pin spawned claude to EVALS model chain (default claude-sonnet-4-6) launchClaudePty spawned the interactive `claude` TUI with no --model flag, so the child inherited the operator's ~/.claude/settings.json model. On a slow-thinking model that meant 5+ min of extended thinking on empty plan-mode context, timing out the plan-mode smoke tests regardless of contention. Pin the model via opts.model ?? EVALS_MODEL ?? 'claude-sonnet-4-6' — byte-identical to session-runner.ts:144, so PTY and `claude -p` evals always agree. Pushed before extraArgs (last flag wins, so a per-test --model still overrides). Placement leaves the spawn region byte-stable for a clean merge with the in-flight hermetic-env branch. Plumbed model through the three plan-skill wrappers. Static-grep tripwires guard the pin, its fallback chain, the before-extraArgs ordering, and all three wrapper forwards. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(pty): detect markdown bold-bullet prose AUQs (fixes office-hours smoke) office-hours auto-mode renders its mode question as `- **Building a startup**` markdown bullets (office-hours/SKILL.md.tmpl:102) with no letter/number marker. isProseAUQVisible only matched `A)`-style lettered or `1.`-style numbered options, so the question went undetected: the model surfaced it at ~2m19s (well under the 300s budget) but the harness kept scoring the run "working" off the spinner glyphs and timed out — a false timeout on a question that was already on screen. Add Pattern 3: when an interrogative line ('?') is present AND 3+ bold-bullet markers (`- **`) appear in the 4KB tail, classify as a prose AUQ. Bold is the discriminator vs incidental prose bullets; the line anchor is dropped (stripAnsi can collapse option lines) and the existing `❯ 1.` cursor gate still defers to a live native list. Wires through the existing classifyVisible 'asked' path and the timeout high-water-mark, so office-hours now classifies 'asked' instead of 'timeout'. Five unit cases: the office-hours render passes; no-'?', <3-bullet, plain-bullet, and native-cursor cases stay false. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(pty): detect stripAnsi-collapsed prose AUQs + judge spinner-precedence The plan-eng/plan-design plan-mode + finding-floor smokes timed out even when the skill HAD rendered a complete prose AskUserQuestion and was waiting: the PTY strips cursor-positioning escapes, collapsing the option newlines/spaces so "A) ..." arrives as "A(recommended)" / "-B:" and "Reply with A, B, or C" as "ReplywithA,B,orC". Every line-anchored detector (Patterns 1-3) returns false on those bytes, so proseAUQEverObserved never latched and the run timed out on a question that was already on screen. Add Pattern 4/5: a two-signal collapsed-form detector — a reply/recommendation marker (space-insensitive "reply with [A-D]", "Recommendation:", or "(recommended)") AND 2+ distinct A-D letters each punctuated by ) : or (. The conjunction is what separates a real AUQ from incidental report prose; verified true on the verbatim failing-run buffers where Patterns 1-3 return false. Also fix the Haiku judge spinner bias: of 614 verdicts, 569 were 'working' and 95 of those noted a question was visible — Claude Code keeps the spinner animating at an idle prose decision, so the judge coin-flipped. Add a precedence override: when an option list AND a Recommendation/Reply instruction are both visible, classify WAITING even with spinner glyphs. Kept the strict dual-signal gate (never option-list-alone) so auto-decide-preserved doesn't flip. 5 unit tests pin the two-signal contract (2 true on real collapsed bytes, 3 false guards). 90 -> 95 pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(plan-review): ask-first scope gate for plan-eng + plan-design review On an empty/cold invocation, plan-eng-review and plan-design-review would dive straight into repo exploration (plan-eng) or a 7-pass mockup+audit (plan-design) and only ask the user much later, if at all. plan-ceo-review already asks first via an unconditional Step-0 gate and behaves well; these two did not. Add a hard-STOP scope gate as the FIRST operational instruction in each skill (above the design-doc check / pre-review audit / mockup defaults it explicitly overrides): the first tool call must be AskUserQuestion confirming the review target, before any git/Read/Grep/Glob/Bash or mockup generation. Under --disallowedTools the options render as plain column-0 lettered prose with a Recommendation + "Reply with A, B, or C" line so the answer is detectable. This is correct cold-start UX (confirm what to review before grinding a full review on nothing) and it is the product half of the plan-mode smoke fix; the harness collapsed-form detector is the deterministic half that catches the ask however it renders. Templates + regenerated SKILL.md (default variant). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(tiers): reclassify stochastic plan-eng/plan-design ask-first smokes as periodic plan-eng-review and plan-design-review run a long explore/audit before their first AskUserQuestion, so whether the plan-mode + finding-floor smokes reach a terminal outcome within the 300s/600s budget depends on stochastic ask-first compliance (measured ~50-67%/run even with the hardened gate). Per the "non-deterministic -> periodic" tiering rule, move the four affected smokes (plan-eng/plan-design review-plan-mode + finding-floor) to periodic. The deterministic harness fix (collapsed-form detector + judge precedence) and the ask-first gate lift these from always-failing to mostly-passing and are the real product+harness improvements; periodic monitoring tracks the rate weekly without blocking PRs on an LLM coin-flip. plan-ceo/plan-devex ask-first reliably and stay gate-tier. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * ci(evals): gate the deterministic PTY plan-mode smokes in CI The real-PTY plan-mode smokes never ran in CI — the gate was local-only. Add an e2e-pty-plan-smoke matrix suite running the two deterministically-reliable ones (office-hours-auto-mode, plan-mode-no-op) so a regression there blocks PRs. The stochastic plan-eng/plan-design ask-first smokes stay periodic (touchfiles E2E_TIERS) and are not CI-gated. A fresh CI container has no ~/.claude.json, so the spawned interactive `claude` would wedge on the onboarding + API-key-approval dialog. Add a scoped seed step (hasCompletedOnboarding + key approval, its own ANTHROPIC_API_KEY env) before the run — mirrors what the hermetic E2E child env seeds. Per-suite timeout override (35 min) via matrix.suite.timeout so the PTY suite has headroom for --retry 2 without bumping the other 12 suites. Report runner count 12 -> 13. Validate via workflow_dispatch before relying on the gate (PTY-in-CI is new). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * ci(evals): install gstack skill registry for the PTY smoke suite The first dry-run of e2e-pty-plan-smoke failed: the spawned interactive `claude` printed "Unknown command: /plan-ceo-review". .claude/skills is gitignored, so a fresh CI checkout has no gstack skill registry and the TUI can't resolve /office-hours or /plan-ceo-review. Add a Register step (scoped to the suite, after Seed, before Run) that mirrors setup's --no-prefix user-scoped registry minimally: $HOME/.claude/skills/gstack -> repo (resolves the preambles' absolute ~/.claude/skills/gstack/bin/* and <skill>/sections/* paths) + per-skill SKILL.md/sections symlinks for the two skills these tests invoke. HOME is /github/home in this container and the runner adds no HOME/CLAUDE_CONFIG_DIR override (no hermetic mode), so $HOME is the right anchor — the Seed step already proved claude reads it. No ./setup (binary build + Chromium + fonts + /dev/tty prompt); SKILL.md + bin/ + sections/ are committed. Self-validating: fails the step loudly on a dangling symlink or missing `name:` frontmatter, so a moved target surfaces here instead of as a silent 35-min "Unknown command" timeout. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v1.58.4.0) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
538 lines
25 KiB
Cheetah
538 lines
25 KiB
Cheetah
---
|
|
name: ship
|
|
preamble-tier: 4
|
|
version: 1.0.0
|
|
description: |
|
|
Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION,
|
|
update CHANGELOG, commit, push, create PR. Use when asked to "ship", "deploy",
|
|
"push to main", "create a PR", "merge and push", or "get it deployed".
|
|
Proactively invoke this skill (do NOT push/PR directly) when the user says code
|
|
is ready, asks about deploying, wants to push code up, or asks to create a PR. (gstack)
|
|
allowed-tools:
|
|
- Bash
|
|
- Read
|
|
- Write
|
|
- Edit
|
|
- Grep
|
|
- Glob
|
|
- Agent
|
|
- AskUserQuestion
|
|
- WebSearch
|
|
sensitive: true
|
|
triggers:
|
|
- ship it
|
|
- create a pr
|
|
- push to main
|
|
- deploy this
|
|
---
|
|
|
|
{{PREAMBLE}}
|
|
|
|
{{BASE_BRANCH_DETECT}}
|
|
|
|
{{GBRAIN_CONTEXT_LOAD}}
|
|
|
|
# Ship: Fully Automated Ship Workflow
|
|
|
|
You are running the `/ship` workflow. This is a **non-interactive, fully automated** workflow. Do NOT ask for confirmation at any step. The user said `/ship` which means DO IT. Run straight through and output the PR URL at the end.
|
|
|
|
**Only stop for:**
|
|
- On the base branch (abort)
|
|
- Merge conflicts that can't be auto-resolved (stop, show conflicts)
|
|
- In-branch test failures (pre-existing failures are triaged, not auto-blocking)
|
|
- Pre-landing review finds ASK items that need user judgment
|
|
- MINOR or MAJOR version bump needed (ask — see Step 12)
|
|
- Greptile review comments that need user decision (complex fixes, false positives)
|
|
- AI-assessed coverage below minimum threshold (hard gate with user override — see Step 7)
|
|
- Plan items NOT DONE with no user override (see Step 8)
|
|
- Plan verification failures (see Step 8.1)
|
|
- TODOS.md missing and user wants to create one (ask — see Step 14)
|
|
- TODOS.md disorganized and user wants to reorganize (ask — see Step 14)
|
|
|
|
**Never stop for:**
|
|
- Uncommitted changes (always include them)
|
|
- Version bump choice (auto-pick MICRO or PATCH — see Step 12)
|
|
- CHANGELOG content (auto-generate from diff)
|
|
- Commit message approval (auto-commit)
|
|
- Multi-file changesets (auto-split into bisectable commits)
|
|
- TODOS.md completed-item detection (auto-mark)
|
|
- Auto-fixable review findings (dead code, N+1, stale comments — fixed automatically)
|
|
- Test coverage gaps within target threshold (auto-generate and commit, or flag in PR body)
|
|
|
|
**Re-run behavior (idempotency):**
|
|
Re-running `/ship` means "run the whole checklist again." Every verification step
|
|
(tests, coverage audit, plan completion, pre-landing review, adversarial review,
|
|
VERSION/CHANGELOG check, TODOS, document-release) runs on every invocation.
|
|
Only *actions* are idempotent:
|
|
- Step 12: If VERSION already bumped, skip the bump but still read the version
|
|
- Step 17: If already pushed, skip the push command
|
|
- Step 19: If PR exists, update the body instead of creating a new PR
|
|
Never skip a verification step because a prior `/ship` run already performed it.
|
|
|
|
---
|
|
|
|
{{SECTION_INDEX:ship}}
|
|
|
|
---
|
|
|
|
## Step 1: Pre-flight
|
|
|
|
1. Check the current branch. If on the base branch or the repo's default branch, **abort**: "You're on the base branch. Ship from a feature branch."
|
|
|
|
2. Run `git status` (never use `-uall`). Uncommitted changes are always included — no need to ask.
|
|
|
|
3. Run `git diff <base>...HEAD --stat` and `git log <base>..HEAD --oneline` to understand what's being shipped.
|
|
|
|
4. Check review readiness:
|
|
|
|
{{REVIEW_DASHBOARD}}
|
|
|
|
If the Eng Review is NOT "CLEAR":
|
|
|
|
Print: "No prior eng review found — ship will run its own pre-landing review in Step 9."
|
|
|
|
Check diff size: `git diff <base>...HEAD --stat | tail -1`. If the diff is >200 lines, add: "Note: This is a large diff. Consider running `/plan-eng-review` or `/autoplan` for architecture-level review before shipping."
|
|
|
|
If CEO Review is missing, mention as informational ("CEO Review not run — recommended for product changes") but do NOT block.
|
|
|
|
For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 9, but consider running /design-review for a full visual audit post-implementation." Still never block.
|
|
|
|
Continue to Step 2 — do NOT block or ask. Ship runs its own review in Step 9.
|
|
|
|
---
|
|
|
|
## Step 2: Distribution Pipeline Check
|
|
|
|
If the diff introduces a new standalone artifact (CLI binary, library package, tool) — not a web
|
|
service with existing deployment — verify that a distribution pipeline exists.
|
|
|
|
1. Check if the diff adds a new `cmd/` directory, `main.go`, or `bin/` entry point:
|
|
```bash
|
|
git diff origin/<base> --name-only | grep -E '(cmd/.*/main\.go|bin/|Cargo\.toml|setup\.py|package\.json)' | head -5
|
|
```
|
|
|
|
2. If new artifact detected, check for a release workflow:
|
|
```bash
|
|
ls .github/workflows/ 2>/dev/null | grep -iE 'release|publish|dist'
|
|
grep -qE 'release|publish|deploy' .gitlab-ci.yml 2>/dev/null && echo "GITLAB_CI_RELEASE"
|
|
```
|
|
|
|
3. **If no release pipeline exists and a new artifact was added:** Use AskUserQuestion:
|
|
- "This PR adds a new binary/tool but there's no CI/CD pipeline to build and publish it.
|
|
Users won't be able to download the artifact after merge."
|
|
- A) Add a release workflow now (CI/CD release pipeline — GitHub Actions or GitLab CI depending on platform)
|
|
- B) Defer — add to TODOS.md
|
|
- C) Not needed — this is internal/web-only, existing deployment covers it
|
|
|
|
4. **If release pipeline exists:** Continue silently.
|
|
5. **If no new artifact detected:** Skip silently.
|
|
|
|
---
|
|
|
|
## Step 3: Merge the base branch (BEFORE tests)
|
|
|
|
Fetch and merge the base branch into the feature branch so tests run against the merged state:
|
|
|
|
```bash
|
|
git fetch origin <base> && git merge origin/<base> --no-edit
|
|
```
|
|
|
|
**If there are merge conflicts:** Try to auto-resolve if they are simple (VERSION, schema.rb, CHANGELOG ordering). If conflicts are complex or ambiguous, **STOP** and show them.
|
|
|
|
**If already up to date:** Continue silently.
|
|
|
|
---
|
|
|
|
{{SECTION:tests}}
|
|
|
|
{{SECTION:test-coverage}}
|
|
|
|
{{SECTION:plan-completion}}
|
|
|
|
{{SECTION:review-army}}
|
|
|
|
{{SECTION:greptile}}
|
|
|
|
{{SECTION:adversarial}}
|
|
|
|
## Step 12: Version bump (auto-decide)
|
|
|
|
The deterministic version-state logic is the tested **`gstack-version-bump`** CLI
|
|
(classify / write / repair). The bump-LEVEL decision and queue-collision handling
|
|
stay agent judgment; the slot pick stays `gstack-next-version`.
|
|
|
|
1. **Classify state** — pure reader, never writes:
|
|
```bash
|
|
bun run ~/.claude/skills/gstack/bin/gstack-version-bump classify --base <base>
|
|
```
|
|
Read the JSON `state` and dispatch:
|
|
- **FRESH** → do the bump (steps 2-4).
|
|
- **ALREADY_BUMPED** → skip the bump, but run the queue-drift check (step 3) with the reported `currentVersion`. If the queue moved (next free version differs), **AskUserQuestion**: rebump to the new version (rewrites CHANGELOG header + PR title) or keep current (CI version-gate will reject until resolved).
|
|
- **DRIFT_STALE_PKG** → run `gstack-version-bump repair` (syncs package.json to VERSION). No re-bump; reuse `currentVersion` for CHANGELOG + PR.
|
|
- **DRIFT_UNEXPECTED** → **STOP**. package.json disagrees with VERSION while VERSION matches base — a manual edit bypassed /ship. Reconcile manually, then re-run.
|
|
|
|
2. **Decide the bump level** from the diff (agent judgment):
|
|
- **MICRO**: <50 lines, trivial tweaks/config. **PATCH**: 50+ lines, no feature signals.
|
|
- **MINOR**: **ASK** if any feature signal (new route/page, migration, new module), OR 500+ lines. **MAJOR**: **ASK** — milestones or breaking changes only.
|
|
Save as `BUMP_LEVEL`. The level is the user-intended bump; queue-aware placement may advance the slot without changing the level.
|
|
|
|
3. **Queue-aware pick** (workspace-aware ship):
|
|
```bash
|
|
QUEUE_JSON=$(bun run ~/.claude/skills/gstack/bin/gstack-next-version --base <base> --bump "$BUMP_LEVEL" --current-version "$BASE_VERSION" 2>/dev/null || echo '{"offline":true}')
|
|
NEW_VERSION=$(echo "$QUEUE_JSON" | jq -r '.version // empty')
|
|
```
|
|
If `offline`/util fails: fall back to local `BUMP_LEVEL` arithmetic and print `⚠ workspace-aware ship offline — using local bump only`. If `claimed` is non-empty, render the queue table so the user sees landing order. If an active sibling workspace holds a version `>= NEW_VERSION`, **AskUserQuestion**: advance past (unrelated work) or abort and sync with the sibling.
|
|
|
|
4. **Write the bump** (FRESH, or an approved rebump):
|
|
```bash
|
|
bun run ~/.claude/skills/gstack/bin/gstack-version-bump write --version "$NEW_VERSION"
|
|
```
|
|
The CLI validates the 4-digit `MAJOR.MINOR.PATCH.MICRO` pattern and writes **both** VERSION and package.json. On a half-write (VERSION written, package.json failed) it exits 3 — re-run, and classify will report DRIFT_STALE_PKG for `repair` to fix.
|
|
|
|
5. **Record the release decision** (durable cross-session memory). The bump level is a real decision the next session should not re-derive blind:
|
|
```bash
|
|
~/.claude/skills/gstack/bin/gstack-decision-log '{"decision":"Ship NEW_VERSION (BUMP_LEVEL)","rationale":"WHY","scope":"repo","source":"skill","confidence":9}' 2>/dev/null || true
|
|
```
|
|
Substitute `NEW_VERSION`, `BUMP_LEVEL`, and a one-line `WHY` (the signal that set the level: diff scale, a new feature, a breaking change). Best-effort and non-interactive; never blocks the ship. Skip on the ALREADY_BUMPED path (the decision was logged on the run that did the bump).
|
|
|
|
{{SECTION:changelog}}
|
|
|
|
## Step 14: TODOS.md (auto-update)
|
|
|
|
Cross-reference the project's TODOS.md against the changes being shipped. Mark completed items automatically; prompt only if the file is missing or disorganized.
|
|
|
|
Read `.claude/skills/review/TODOS-format.md` for the canonical format reference.
|
|
|
|
**1. Check if TODOS.md exists** in the repository root.
|
|
|
|
**If TODOS.md does not exist:** Use AskUserQuestion:
|
|
- Message: "GStack recommends maintaining a TODOS.md organized by skill/component, then priority (P0 at top through P4, then Completed at bottom). See TODOS-format.md for the full format. Would you like to create one?"
|
|
- Options: A) Create it now, B) Skip for now
|
|
- If A: Create `TODOS.md` with a skeleton (# TODOS heading + ## Completed section). Continue to step 3.
|
|
- If B: Skip the rest of Step 14. Continue to Step 15.
|
|
|
|
**2. Check structure and organization:**
|
|
|
|
Read TODOS.md and verify it follows the recommended structure:
|
|
- Items grouped under `## <Skill/Component>` headings
|
|
- Each item has `**Priority:**` field with P0-P4 value
|
|
- A `## Completed` section at the bottom
|
|
|
|
**If disorganized** (missing priority fields, no component groupings, no Completed section): Use AskUserQuestion:
|
|
- Message: "TODOS.md doesn't follow the recommended structure (skill/component groupings, P0-P4 priority, Completed section). Would you like to reorganize it?"
|
|
- Options: A) Reorganize now (recommended), B) Leave as-is
|
|
- If A: Reorganize in-place following TODOS-format.md. Preserve all content — only restructure, never delete items.
|
|
- If B: Continue to step 3 without restructuring.
|
|
|
|
**3. Detect completed TODOs:**
|
|
|
|
This step is fully automatic — no user interaction.
|
|
|
|
Use the diff and commit history already gathered in earlier steps:
|
|
- `git diff <base>...HEAD` (full diff against the base branch)
|
|
- `git log <base>..HEAD --oneline` (all commits being shipped)
|
|
|
|
For each TODO item, check if the changes in this PR complete it by:
|
|
- Matching commit messages against the TODO title and description
|
|
- Checking if files referenced in the TODO appear in the diff
|
|
- Checking if the TODO's described work matches the functional changes
|
|
|
|
**Be conservative:** Only mark a TODO as completed if there is clear evidence in the diff. If uncertain, leave it alone.
|
|
|
|
**4. Move completed items** to the `## Completed` section at the bottom. Append: `**Completed:** vX.Y.Z (YYYY-MM-DD)`
|
|
|
|
**5. Output summary:**
|
|
- `TODOS.md: N items marked complete (item1, item2, ...). M items remaining.`
|
|
- Or: `TODOS.md: No completed items detected. M items remaining.`
|
|
- Or: `TODOS.md: Created.` / `TODOS.md: Reorganized.`
|
|
|
|
**6. Defensive:** If TODOS.md cannot be written (permission error, disk full), warn the user and continue. Never stop the ship workflow for a TODOS failure.
|
|
|
|
Save this summary — it goes into the PR body in Step 19.
|
|
|
|
---
|
|
|
|
## Step 15: Commit (bisectable chunks)
|
|
|
|
### Step 15.0: WIP Commit Squash (continuous checkpoint mode only)
|
|
|
|
If `CHECKPOINT_MODE` is `"continuous"`, the branch likely contains `WIP:` commits
|
|
from auto-checkpointing. These must be squashed INTO the corresponding logical
|
|
commits before the bisectable-grouping logic in Step 15.1 runs. Non-WIP commits
|
|
on the branch (earlier landed work) must be preserved.
|
|
|
|
**Detection:**
|
|
```bash
|
|
WIP_COUNT=$(git log <base>..HEAD --oneline --grep="^WIP:" 2>/dev/null | wc -l | tr -d ' ')
|
|
echo "WIP_COMMITS: $WIP_COUNT"
|
|
```
|
|
|
|
If `WIP_COUNT` is 0: skip this sub-step entirely.
|
|
|
|
If `WIP_COUNT` > 0, collect the WIP context first so it survives the squash:
|
|
|
|
```bash
|
|
# Export [gstack-context] blocks from all WIP commits on this branch.
|
|
# This file becomes input to the CHANGELOG entry and may inform PR body context.
|
|
mkdir -p "$(git rev-parse --show-toplevel)/.gstack"
|
|
git log <base>..HEAD --grep="^WIP:" --format="%H%n%B%n---END---" > \
|
|
"$(git rev-parse --show-toplevel)/.gstack/wip-context-before-squash.md" 2>/dev/null || true
|
|
```
|
|
|
|
**Non-destructive squash strategy:**
|
|
|
|
`git reset --soft <merge-base>` WOULD uncommit everything including non-WIP commits.
|
|
DO NOT DO THAT. Instead, use `git rebase` scoped to filter WIP commits only.
|
|
|
|
Option 1 (preferred, if there are non-WIP commits mixed in):
|
|
```bash
|
|
# Interactive rebase with automated WIP squashing.
|
|
# Mark every WIP commit as 'fixup' (drop its message, fold changes into prior commit).
|
|
git rebase -i $(git merge-base HEAD origin/<base>) \
|
|
--exec 'true' \
|
|
-X ours 2>/dev/null || {
|
|
echo "Rebase conflict. Aborting: git rebase --abort"
|
|
git rebase --abort
|
|
echo "STATUS: BLOCKED — manual WIP squash required"
|
|
exit 1
|
|
}
|
|
```
|
|
|
|
Option 2 (simpler, if the branch is ALL WIP commits so far — no landed work):
|
|
```bash
|
|
# Branch contains only WIP commits. Reset-soft is safe here because there's
|
|
# nothing non-WIP to preserve. Verify first.
|
|
NON_WIP=$(git log <base>..HEAD --oneline --invert-grep --grep="^WIP:" 2>/dev/null | wc -l | tr -d ' ')
|
|
if [ "$NON_WIP" -eq 0 ]; then
|
|
git reset --soft $(git merge-base HEAD origin/<base>)
|
|
echo "WIP-only branch, reset-soft to merge base. Step 15.1 will create clean commits."
|
|
fi
|
|
```
|
|
|
|
Decide at runtime which option applies. If unsure, prefer stopping and asking the
|
|
user via AskUserQuestion rather than destroying non-WIP commits.
|
|
|
|
**Anti-footgun rules:**
|
|
- NEVER blind `git reset --soft` if there are non-WIP commits. Codex flagged this
|
|
as destructive — it would uncommit real landed work and turn the push step into
|
|
a non-fast-forward push for anyone who already pushed.
|
|
- Only proceed to Step 15.1 after WIP commits are successfully squashed/absorbed
|
|
or the branch has been verified to contain only WIP work.
|
|
|
|
### Step 15.1: Bisectable Commits
|
|
|
|
**Goal:** Create small, logical commits that work well with `git bisect` and help LLMs understand what changed.
|
|
|
|
1. Analyze the diff and group changes into logical commits. Each commit should represent **one coherent change** — not one file, but one logical unit.
|
|
|
|
2. **Commit ordering** (earlier commits first):
|
|
- **Infrastructure:** migrations, config changes, route additions
|
|
- **Models & services:** new models, services, concerns (with their tests)
|
|
- **Controllers & views:** controllers, views, JS/React components (with their tests)
|
|
- **VERSION + CHANGELOG + TODOS.md:** always in the final commit
|
|
|
|
3. **Rules for splitting:**
|
|
- A model and its test file go in the same commit
|
|
- A service and its test file go in the same commit
|
|
- A controller, its views, and its test go in the same commit
|
|
- Migrations are their own commit (or grouped with the model they support)
|
|
- Config/route changes can group with the feature they enable
|
|
- If the total diff is small (< 50 lines across < 4 files), a single commit is fine
|
|
|
|
4. **Each commit must be independently valid** — no broken imports, no references to code that doesn't exist yet. Order commits so dependencies come first.
|
|
|
|
5. Compose each commit message:
|
|
- First line: `<type>: <summary>` (type = feat/fix/chore/refactor/docs)
|
|
- Body: brief description of what this commit contains
|
|
- Only the **final commit** (VERSION + CHANGELOG) gets the version tag and co-author trailer:
|
|
|
|
```bash
|
|
git commit -m "$(cat <<'EOF'
|
|
chore: bump version and changelog (vX.Y.Z.W)
|
|
|
|
{{CO_AUTHOR_TRAILER}}
|
|
EOF
|
|
)"
|
|
```
|
|
|
|
---
|
|
|
|
## Step 16: Verification Gate
|
|
|
|
**IRON LAW: NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE.**
|
|
|
|
Before pushing, re-verify if code changed during Steps 4-6:
|
|
|
|
1. **Test verification:** If ANY code changed after Step 5's test run (fixes from review findings, CHANGELOG edits don't count), re-run the test suite. Paste fresh output. Stale output from Step 5 is NOT acceptable.
|
|
|
|
2. **Build verification:** If the project has a build step, run it. Paste output.
|
|
|
|
3. **Rationalization prevention:**
|
|
- "Should work now" → RUN IT.
|
|
- "I'm confident" → Confidence is not evidence.
|
|
- "I already tested earlier" → Code changed since then. Test again.
|
|
- "It's a trivial change" → Trivial changes break production.
|
|
|
|
**If tests fail here:** STOP. Do not push. Fix the issue and return to Step 5.
|
|
|
|
Claiming work is complete without verification is dishonesty, not efficiency.
|
|
|
|
---
|
|
|
|
## Step 17: Push
|
|
|
|
**Credential pre-push guard (#1946) — run before the push:**
|
|
|
|
```bash
|
|
_REDACT_PREPUSH=$(~/.claude/skills/gstack/bin/gstack-config get redact_prepush_hook 2>/dev/null || echo "false")
|
|
_HOOK_PATH=$(git rev-parse --git-path hooks/pre-push 2>/dev/null || echo "")
|
|
_HOOK_INSTALLED="no"
|
|
[ -n "$_HOOK_PATH" ] && [ -f "$_HOOK_PATH" ] && grep -q "gstack-redact" "$_HOOK_PATH" 2>/dev/null && _HOOK_INSTALLED="yes"
|
|
# Custom hooks dirs (core.hooksPath — e.g. husky's COMMITTED .husky/) must
|
|
# never get a silent install: the chaining installer would rename the team's
|
|
# committed hook and write a machine-local wrapper into the working tree.
|
|
_HOOKS_DIR=$(git rev-parse --git-path hooks 2>/dev/null || echo "")
|
|
_GIT_DIR=$(git rev-parse --absolute-git-dir 2>/dev/null || echo "")
|
|
_HOOKS_IN_GIT_DIR="no"
|
|
case "$_HOOKS_DIR" in
|
|
"$_GIT_DIR"/*|hooks|.git/hooks) _HOOKS_IN_GIT_DIR="yes" ;;
|
|
esac
|
|
_PREPUSH_PROMPTED=$([ -f "${GSTACK_HOME:-$HOME/.gstack}/.redact-prepush-prompted" ] && echo "yes" || echo "no")
|
|
echo "REDACT_PREPUSH: $_REDACT_PREPUSH"
|
|
echo "HOOK_INSTALLED: $_HOOK_INSTALLED"
|
|
echo "HOOKS_IN_GIT_DIR: $_HOOKS_IN_GIT_DIR"
|
|
echo "PREPUSH_PROMPTED: $_PREPUSH_PROMPTED"
|
|
```
|
|
|
|
Branch on the echoed values:
|
|
|
|
1. **`REDACT_PREPUSH: true` and `HOOK_INSTALLED: no` and `HOOKS_IN_GIT_DIR: yes`** —
|
|
consent already given; install silently (no question) and continue:
|
|
```bash
|
|
~/.claude/skills/gstack/bin/gstack-redact install-prepush-hook
|
|
```
|
|
If `HOOKS_IN_GIT_DIR: no` (husky or another committed hooks dir), do NOT
|
|
install silently — print one line: "redact pre-push guard not installed:
|
|
this repo uses a custom core.hooksPath; run
|
|
`gstack-redact install-prepush-hook` manually if you want it chained."
|
|
2. **`REDACT_PREPUSH` not true AND `PREPUSH_PROMPTED: no`** — one-time
|
|
offer (fires once EVER, machine-wide). AskUserQuestion:
|
|
|
|
> gstack can install a per-repo git pre-push hook that blocks pushes
|
|
> containing credentials (API keys, tokens, private keys). It's a
|
|
> guardrail, not enforcement — `GSTACK_REDACT_PREPUSH=skip` bypasses it.
|
|
> Install it for repos you ship from?
|
|
|
|
Options:
|
|
- A) Yes — install the credential guard (recommended)
|
|
- B) No — never ask again
|
|
|
|
If A: run `~/.claude/skills/gstack/bin/gstack-config set redact_prepush_hook true`
|
|
then `~/.claude/skills/gstack/bin/gstack-redact install-prepush-hook`.
|
|
If B: run `~/.claude/skills/gstack/bin/gstack-config set redact_prepush_hook false`.
|
|
ALWAYS (after either answer, but NOT if the question itself failed to
|
|
render — a failed AskUserQuestion must re-offer next time):
|
|
```bash
|
|
touch "${GSTACK_HOME:-$HOME/.gstack}/.redact-prepush-prompted"
|
|
```
|
|
3. **Anything else** (declined earlier, or already installed) — continue
|
|
without comment.
|
|
|
|
**Idempotency check:** Check if the branch is already pushed and up to date.
|
|
|
|
```bash
|
|
git fetch origin <branch-name> 2>/dev/null
|
|
LOCAL=$(git rev-parse HEAD)
|
|
REMOTE=$(git rev-parse origin/<branch-name> 2>/dev/null || echo "none")
|
|
echo "LOCAL: $LOCAL REMOTE: $REMOTE"
|
|
[ "$LOCAL" = "$REMOTE" ] && echo "ALREADY_PUSHED" || echo "PUSH_NEEDED"
|
|
```
|
|
|
|
If `ALREADY_PUSHED`, skip the push but continue to Step 18. Otherwise push with upstream tracking:
|
|
|
|
```bash
|
|
git push -u origin <branch-name>
|
|
```
|
|
|
|
**You are NOT done.** The code is pushed but documentation sync and PR creation are mandatory final steps. Continue to Step 18.
|
|
|
|
---
|
|
|
|
**PR/MR title invariant (always applies — do not skip even if you don't open the section below):** Any PR or MR you create OR update in the next step MUST have a title that starts with `v$NEW_VERSION` (the version bumped in Step 12), in the format `v<NEW_VERSION> <type>: <summary>`. Never create or edit a PR/MR title without this prefix. Compute the correct title with the single source of truth helper: `~/.claude/skills/gstack/bin/gstack-pr-title-rewrite.sh "$NEW_VERSION" "<current title>"`. The full create/update procedure (idempotency, redaction scan, self-check) is in the section below.
|
|
|
|
{{SECTION:pr-body}}
|
|
|
|
## Step 20: Persist ship metrics
|
|
|
|
Log coverage and plan completion data so `/retro` can track trends:
|
|
|
|
```bash
|
|
eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
|
|
```
|
|
|
|
Append to `~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl`:
|
|
|
|
```bash
|
|
echo '{"skill":"ship","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","coverage_pct":COVERAGE_PCT,"plan_items_total":PLAN_TOTAL,"plan_items_done":PLAN_DONE,"verification_result":"VERIFY_RESULT","version":"VERSION","branch":"BRANCH"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
|
|
```
|
|
|
|
Substitute from earlier steps:
|
|
- **COVERAGE_PCT**: coverage percentage from Step 7 diagram (integer, or -1 if undetermined)
|
|
- **PLAN_TOTAL**: total plan items extracted in Step 8 (0 if no plan file)
|
|
- **PLAN_DONE**: count of DONE + CHANGED items from Step 8 (0 if no plan file)
|
|
- **VERIFY_RESULT**: "pass", "fail", or "skipped" from Step 8.1
|
|
- **VERSION**: from the VERSION file
|
|
- **BRANCH**: current branch name
|
|
|
|
This step is automatic — never skip it, never ask for confirmation.
|
|
|
|
---
|
|
|
|
## Step 21: Plan-tune discoverability nudge (first-successful-ship only)
|
|
|
|
Plan-tune cathedral T15. After a successful ship, surface /plan-tune once
|
|
per machine. Single line, non-blocking, marker-gated so it never re-fires.
|
|
|
|
```bash
|
|
_NUDGE_MARKER="$HOME/.gstack/.plan-tune-nudge-shown"
|
|
_QT=$(~/.claude/skills/gstack/bin/gstack-config get question_tuning 2>/dev/null || echo "false")
|
|
if [ ! -f "$_NUDGE_MARKER" ] && [ "$_QT" = "false" ]; then
|
|
echo ""
|
|
echo "gstack can learn from your AskUserQuestion answers. Run /plan-tune to opt in"
|
|
echo "— it captures which prompts you find valuable vs noisy and (with hooks installed)"
|
|
echo "auto-decides your never-ask preferences."
|
|
touch "$_NUDGE_MARKER"
|
|
fi
|
|
```
|
|
|
|
If the marker exists, OR question_tuning is already on, the nudge is a
|
|
no-op. The marker guarantees at-most-once per machine. To re-enable:
|
|
`rm ~/.gstack/.plan-tune-nudge-shown` before next ship.
|
|
|
|
---
|
|
|
|
## Section self-check (before you finish)
|
|
|
|
You ran a carved skill. For your situation, list every section the Section index
|
|
named as applying, and confirm you issued a Read for each one. If you executed any
|
|
of those steps from memory without reading its section, you skipped the source of
|
|
truth — STOP, Read it now, and redo that step. Deterministic version work goes
|
|
through `gstack-version-bump`; never hand-roll the VERSION/package.json write.
|
|
|
|
---
|
|
|
|
## Important Rules
|
|
|
|
- **Never skip tests.** If tests fail, stop.
|
|
- **Never skip the pre-landing review.** If checklist.md is unreadable, stop.
|
|
- **Never force push.** Use regular `git push` only.
|
|
- **Never ask for trivial confirmations** (e.g., "ready to push?", "create PR?"). DO stop for: version bumps (MINOR/MAJOR), pre-landing review findings (ASK items), and Codex structured review [P1] findings (large diffs only).
|
|
- **Always use the 4-digit version format** from the VERSION file.
|
|
- **Date format in CHANGELOG:** `YYYY-MM-DD`
|
|
- **Split commits for bisectability** — each commit = one logical change.
|
|
- **TODOS.md completion detection must be conservative.** Only mark items as completed when the diff clearly shows the work is done.
|
|
- **Use Greptile reply templates from greptile-triage.md.** Every reply includes evidence (inline diff, code references, re-rank suggestion). Never post vague replies.
|
|
- **Never push without fresh verification evidence.** If code changed after Step 5 tests, re-run before pushing.
|
|
- **Step 7 generates coverage tests.** They must pass before committing. Never commit failing tests.
|
|
- **The goal is: user says `/ship`, next thing they see is the review + PR URL + auto-synced docs.**
|