mirror of
https://github.com/garrytan/gstack.git
synced 2026-06-01 15:51:41 +02:00
46c1fae7f1
* feat(test): transcript-section-logger + ship-action fingerprint (T10) Pure-analysis module over a SkillTestResult/NDJSON transcript: - extractSectionReads(): which sections/*.md a run opened (post-carve check) - extractShipActions(): observable action fingerprint (merge/test/bump/ changelog/commit/push/pr) that works on the MONOLITH too, so a baseline captured before the carve can detect a sectioned-ship regression - baseline read/write + compareShipActions() for baseline-first dogf(T10) Baseline-first answers the Codex outside-voice critique that a logger in the same PR as the carve is post-failure telemetry without a pre-carve reference. 11 unit tests, all green. Paid monolith baseline capture runs separately. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(pipeline): section discovery + generation machinery (T9) - discover-skills.ts: discoverSectionTemplates() scans <skill>/sections/*.md.tmpl - gen-skill-docs.ts: extract resolvePlaceholders + applyHostRewrites + buildContext as shared helpers (processTemplate and the new processSectionTemplate both call them, so a sanitization/rewrite fix can't miss sections) [C1] - processSectionTemplate: body-fragment generation (no frontmatter/catalog/voice), parent-skill TemplateContext (skillName pinned to parent, not 'sections', so appliesTo gating + tier behave identically), per-host output routing - --host all now fails the build on ANY host failure, not just claude, so a stale external-host output can't slip the freshness gate [Codex outside-voice #9] Inert until a skill is carved (no sections/ dirs exist yet). Refactor is output-neutral: gen:skill-docs --dry-run --host all reports 0 STALE. 5 discovery unit tests + 389 gen-skill-docs tests green. 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(setup): install sections/ for cherry-pick targets (claude + kiro) (T9) Two install targets cherry-pick SKILL.md and would leave a carved skill's sections/ behind, 404ing a runtime 'Read sections/<name>.md': - link_claude_skill_dirs: link the sections/ subdir via _link_or_copy (windows gets a fresh copy on every ./setup) - kiro per-skill loop: sed-rewrite + copy each sections/* so paths resolve under ~/.kiro, not ~/.codex/~/.claude codex/factory/opencode link the whole generated dir, so sections ride free. Addresses Codex outside-voice #4/#6 (runtime pathing landmine). Inert until a skill is carved. Static-tripwire test + windows-fallback invariant green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(ship): gstack-version-bump CLI — tested idempotency classify + write (T9) Hybrid CLI extraction (CM1): the deterministic core of ship Step 12 becomes a tested CLI instead of bash prose the agent re-derives each run. - classify: FRESH/ALREADY_BUMPED/DRIFT_STALE_PKG/DRIFT_UNEXPECTED from VERSION vs origin/<base>:VERSION vs package.json.version (pure reader) - write: validated dual-write to VERSION + package.json (FRESH bump) - repair: DRIFT_STALE_PKG sync, no re-bump Bump-LEVEL choice + queue collision stay agent judgment; slot pick stays bin/gstack-next-version. This removes the re-bump-a-shipped-branch footgun from skippable prose into code that can't be skipped or misread. 15 tests (exhaustive state matrix + write/repair fs + real-git classify). 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(parity): sectioned-skill parity capability — guards the carve (T9) Carved skills (skeleton + sections/*.md) need parity checks that see relocated content, or moving a phrase into a section reads as 'lost': - readSkillForParity(): union skeleton + all sections/*.md - checkSkillParity sectioned mode: content checks against the union; minBytes/ maxSizeRatio against union bytes (total behavior preserved); maxSkeletonBytes asserts the always-loaded skeleton actually shrank. Lowering minBytes to fit a small skeleton would otherwise make the size floor toothless [Codex #12]. Built + tested BEFORE the carve so ship's invariant can flip to sectioned in the same commit it lands. Monolith path byte-identical (verified: pre-existing investigate 1.053 ratio drift fails the same with this change stashed). 7 sectioned-parity tests + existing parity tests green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(ship): carve into skeleton + on-demand sections (Claude) (T9) ship/SKILL.md drops 167KB → 68.7KB (~59% of the always-loaded skill) by moving 8 prose-heavy steps into ship/sections/*.md, read on demand: tests, test-coverage, plan-completion, review-army, greptile, adversarial, changelog, pr-body. Step 12's version logic now calls the tested gstack-version-bump CLI instead of inline bash. Claude-first (S2): {{SECTION:id}} emits a STOP-Read pointer on Claude (skeleton + generated section files) and INLINES the content on every other host, so external hosts keep the full monolith — verified factory at 162KB with no sections dir. {{SECTION_INDEX:ship}} renders the situation→section table from the PASSIVE manifest (CM2 / v2_PLAN.md:663); required-reads live only in test fixtures. Multi-pass resolve expands inlined sections' own resolvers. Parity: ship invariant flipped to sectioned (union content checks + maxSkeletonBytes asserts the shrink). 
Carve-fallout fixed across gen-skill-docs/skill-validation/ golden/plan-completion/#1539/size-budget tests via skeleton+sections union reads. Free suite green except the pre-existing investigate parity drift. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(ship): manifest-consistency + context-parity + requiredReads helper (T9) Free deterministic guards for the carve: - required-reads.ts + unit test: assertRequiredReads(run, requiredFiles) — the mechanical layer-5 check that the agent Read the sections its situation needs (required set comes from the fixture, not the passive manifest) - section-manifest-consistency: 3-tier orphan classification (generated orphan + hand-edited generated file → FAIL; manifest orphan → WARN per v2_PLAN.md) and pins the PASSIVE-manifest contract (no applies_when/required_for) - template-context-parity: generated sections have zero unresolved placeholders and gated resolvers (ADVERSARIAL_STEP/CONFIDENCE_CALIBRATION/CHANGELOG_WORKFLOW) rendered — proving sections resolve with the parent skillName, not 'sections' 16 tests, all green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(ship): section-loading E2E + idempotency CLI detection (T9) - skill-e2e-ship-section-loading.test.ts (new, periodic): runs real /ship in plan mode against a fresh version-changing fixture and asserts the agent Read the required sections (review-army + changelog). Runs against the INSTALLED skill (~/.claude/skills/gstack/ship), not repo paths, so install-layout 404s surface [Codex outside-voice #5]. Layer-5 mechanical guard against silent section-skip. - skill-e2e-ship-idempotency.test.ts: detection updated for the carve — Step 12 now runs gstack-version-bump classify (JSON "state":"ALREADY_BUMPED") instead of the inline bash echo (STATE: ALREADY_BUMPED). Accept both; add a gstack-version-bump-write re-bump regression signal. 
- touchfiles: register ship-section-loading (periodic) + extend idempotency deps with bin/gstack-version-bump + scripts/resolvers/sections.ts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(ship): union-read redaction wiring test for the carve (T9) main's PR-body redaction-at-sink lives in sections/pr-body.md.tmpl after the carve, not the skeleton template. Read skeleton + section templates union so the redaction-wiring assertions follow the relocated content. 9/9 green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * v1.54.0.0 feat: carve /ship into skeleton + on-demand sections (-59% always-loaded) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
473 lines
21 KiB
Cheetah
---
name: ship
preamble-tier: 4
version: 1.0.0
description: |
  Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION,
  update CHANGELOG, commit, push, create PR. Use when asked to "ship", "deploy",
  "push to main", "create a PR", "merge and push", or "get it deployed".
  Proactively invoke this skill (do NOT push/PR directly) when the user says code
  is ready, asks about deploying, wants to push code up, or asks to create a PR. (gstack)
allowed-tools:
  - Bash
  - Read
  - Write
  - Edit
  - Grep
  - Glob
  - Agent
  - AskUserQuestion
  - WebSearch
sensitive: true
triggers:
  - ship it
  - create a pr
  - push to main
  - deploy this
---

{{PREAMBLE}}

{{BASE_BRANCH_DETECT}}

{{GBRAIN_CONTEXT_LOAD}}

# Ship: Fully Automated Ship Workflow

You are running the `/ship` workflow. This is a **non-interactive, fully automated** workflow. Do NOT ask for confirmation at any step. The user said `/ship` which means DO IT. Run straight through and output the PR URL at the end.

**Only stop for:**
- On the base branch (abort)
- Merge conflicts that can't be auto-resolved (stop, show conflicts)
- In-branch test failures (pre-existing failures are triaged, not auto-blocking)
- Pre-landing review finds ASK items that need user judgment
- MINOR or MAJOR version bump needed (ask — see Step 12)
- Greptile review comments that need user decision (complex fixes, false positives)
- AI-assessed coverage below minimum threshold (hard gate with user override — see Step 7)
- Plan items NOT DONE with no user override (see Step 8)
- Plan verification failures (see Step 8.1)
- TODOS.md missing and user wants to create one (ask — see Step 14)
- TODOS.md disorganized and user wants to reorganize (ask — see Step 14)

**Never stop for:**
- Uncommitted changes (always include them)
- Version bump choice (auto-pick MICRO or PATCH — see Step 12)
- CHANGELOG content (auto-generate from diff)
- Commit message approval (auto-commit)
- Multi-file changesets (auto-split into bisectable commits)
- TODOS.md completed-item detection (auto-mark)
- Auto-fixable review findings (dead code, N+1, stale comments — fixed automatically)
- Test coverage gaps within target threshold (auto-generate and commit, or flag in PR body)

**Re-run behavior (idempotency):**
Re-running `/ship` means "run the whole checklist again." Every verification step
(tests, coverage audit, plan completion, pre-landing review, adversarial review,
VERSION/CHANGELOG check, TODOS, document-release) runs on every invocation.
Only *actions* are idempotent:
- Step 12: If VERSION already bumped, skip the bump but still read the version
- Step 17: If already pushed, skip the push command
- Step 19: If PR exists, update the body instead of creating a new PR
Never skip a verification step because a prior `/ship` run already performed it.

---

{{SECTION_INDEX:ship}}

---

## Step 1: Pre-flight

1. Check the current branch. If on the base branch or the repo's default branch, **abort**: "You're on the base branch. Ship from a feature branch."

2. Run `git status` (never use `-uall`). Uncommitted changes are always included — no need to ask.

3. Run `git diff <base>...HEAD --stat` and `git log <base>..HEAD --oneline` to understand what's being shipped.

4. Check review readiness:

{{REVIEW_DASHBOARD}}

If the Eng Review is NOT "CLEAR":

Print: "No prior eng review found — ship will run its own pre-landing review in Step 9."

Check diff size: `git diff <base>...HEAD --stat | tail -1`. If the diff is >200 lines, add: "Note: This is a large diff. Consider running `/plan-eng-review` or `/autoplan` for architecture-level review before shipping."

If CEO Review is missing, mention as informational ("CEO Review not run — recommended for product changes") but do NOT block.

For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 9, but consider running /design-review for a full visual audit post-implementation." Still never block.

Continue to Step 2 — do NOT block or ask. Ship runs its own review in Step 9.
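
The large-diff note in Step 1 depends on reading a line count out of the `--stat` summary line. A minimal sketch of that parsing, assuming git's usual `N files changed, X insertions(+), Y deletions(-)` summary format; the helper name `stat_total_lines` is hypothetical and not part of gstack:

```bash
# Hypothetical helper (not part of gstack): sum insertions + deletions
# from the last line of `git diff --stat`.
stat_total_lines() {
  local summary="$1" ins del
  # Either count can be absent (for example an insert-only diff), so default to 0.
  ins=$(printf '%s\n' "$summary" | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || true)
  del=$(printf '%s\n' "$summary" | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || true)
  echo $(( ${ins:-0} + ${del:-0} ))
}

# Usage inside a repo:
#   TOTAL=$(stat_total_lines "$(git diff <base>...HEAD --stat | tail -1)")
#   [ "$TOTAL" -gt 200 ] && echo "Note: This is a large diff."
```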

---

## Step 2: Distribution Pipeline Check

If the diff introduces a new standalone artifact (CLI binary, library package, tool) — not a web
service with existing deployment — verify that a distribution pipeline exists.

1. Check if the diff adds a new `cmd/` directory, `main.go`, or `bin/` entry point:
```bash
git diff origin/<base> --name-only | grep -E '(cmd/.*/main\.go|bin/|Cargo\.toml|setup\.py|package\.json)' | head -5
```

2. If new artifact detected, check for a release workflow:
```bash
ls .github/workflows/ 2>/dev/null | grep -iE 'release|publish|dist'
grep -qE 'release|publish|deploy' .gitlab-ci.yml 2>/dev/null && echo "GITLAB_CI_RELEASE"
```

3. **If no release pipeline exists and a new artifact was added:** Use AskUserQuestion:
- "This PR adds a new binary/tool but there's no CI/CD pipeline to build and publish it.
Users won't be able to download the artifact after merge."
- A) Add a release workflow now (CI/CD release pipeline — GitHub Actions or GitLab CI depending on platform)
- B) Defer — add to TODOS.md
- C) Not needed — this is internal/web-only, existing deployment covers it

4. **If release pipeline exists:** Continue silently.
5. **If no new artifact detected:** Skip silently.

---

## Step 3: Merge the base branch (BEFORE tests)

Fetch and merge the base branch into the feature branch so tests run against the merged state:

```bash
git fetch origin <base> && git merge origin/<base> --no-edit
```

**If there are merge conflicts:** Try to auto-resolve if they are simple (VERSION, schema.rb, CHANGELOG ordering). If conflicts are complex or ambiguous, **STOP** and show them.

**If already up to date:** Continue silently.
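
The simple-versus-complex conflict triage in Step 3 can be sketched as a filter over the conflicted paths. This is an illustrative assumption, not gstack's actual logic: the helper name `classify_conflicts` is hypothetical, the "simple" set is only the three examples named in Step 3, and the `CHANGELOG.md` file name is assumed.

```bash
# Hypothetical helper: read conflicted paths on stdin, print SIMPLE or COMPLEX.
# Empty input (no conflicts) prints SIMPLE.
classify_conflicts() {
  local path verdict="SIMPLE"
  while IFS= read -r path; do
    [ -z "$path" ] && continue
    case "$(basename "$path")" in
      VERSION|schema.rb|CHANGELOG|CHANGELOG.md) ;;  # candidates for auto-resolution
      *) verdict="COMPLEX" ;;                       # anything else: stop and show it
    esac
  done
  echo "$verdict"
}

# Usage after a failed merge (unmerged paths carry diff-filter U):
#   git diff --name-only --diff-filter=U | classify_conflicts
```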

---

{{SECTION:tests}}

{{SECTION:test-coverage}}

{{SECTION:plan-completion}}

{{SECTION:review-army}}

{{SECTION:greptile}}

{{SECTION:adversarial}}

## Step 12: Version bump (auto-decide)

The deterministic version-state logic is the tested **`gstack-version-bump`** CLI
(classify / write / repair). The bump-LEVEL decision and queue-collision handling
stay agent judgment; the slot pick stays `gstack-next-version`.

1. **Classify state** — pure reader, never writes:
```bash
bun run ~/.claude/skills/gstack/bin/gstack-version-bump classify --base <base>
```
Read the JSON `state` and dispatch:
- **FRESH** → do the bump (steps 2-4).
- **ALREADY_BUMPED** → skip the bump, but run the queue-drift check (step 3) with the reported `currentVersion`. If the queue moved (next free version differs), **AskUserQuestion**: rebump to the new version (rewrites CHANGELOG header + PR title) or keep current (CI version-gate will reject until resolved).
- **DRIFT_STALE_PKG** → run `gstack-version-bump repair` (syncs package.json to VERSION). No re-bump; reuse `currentVersion` for CHANGELOG + PR.
- **DRIFT_UNEXPECTED** → **STOP**. package.json disagrees with VERSION while VERSION matches base — a manual edit bypassed /ship. Reconcile manually, then re-run.

2. **Decide the bump level** from the diff (agent judgment):
- **MICRO**: <50 lines, trivial tweaks/config. **PATCH**: 50+ lines, no feature signals.
- **MINOR**: **ASK** if any feature signal (new route/page, migration, new module), OR 500+ lines. **MAJOR**: **ASK** — milestones or breaking changes only.
Save as `BUMP_LEVEL`. The level is the user-intended bump; queue-aware placement may advance the slot without changing the level.

3. **Queue-aware pick** (workspace-aware ship):
```bash
QUEUE_JSON=$(bun run ~/.claude/skills/gstack/bin/gstack-next-version --base <base> --bump "$BUMP_LEVEL" --current-version "$BASE_VERSION" 2>/dev/null || echo '{"offline":true}')
NEW_VERSION=$(echo "$QUEUE_JSON" | jq -r '.version // empty')
```
If `offline`/util fails: fall back to local `BUMP_LEVEL` arithmetic and print `⚠ workspace-aware ship offline — using local bump only`. If `claimed` is non-empty, render the queue table so the user sees landing order. If an active sibling workspace holds a version `>= NEW_VERSION`, **AskUserQuestion**: advance past (unrelated work) or abort and sync with the sibling.

4. **Write the bump** (FRESH, or an approved rebump):
```bash
bun run ~/.claude/skills/gstack/bin/gstack-version-bump write --version "$NEW_VERSION"
```
The CLI validates the 4-digit `MAJOR.MINOR.PATCH.MICRO` pattern and writes **both** VERSION and package.json. On a half-write (VERSION written, package.json failed) it exits 3 — re-run, and classify will report DRIFT_STALE_PKG for `repair` to fix.
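
The bump-level thresholds and the offline "local `BUMP_LEVEL` arithmetic" fallback in Step 12 can be sketched as two small functions. Both names (`suggest_bump_level`, `bump_version`) are hypothetical and not part of gstack. The rule that a bump resets every field to its right to 0 is an assumption, not something the skill states. The real decision stays agent judgment and the real write goes through `gstack-version-bump`.

```bash
# Hypothetical helper: map diff size + feature signal to a suggested level.
# Prints "<LEVEL> <auto|ask>"; MAJOR is never suggested automatically.
suggest_bump_level() {
  local lines="$1" feature_signal="$2"   # feature_signal: "true" or "false"
  if [ "$feature_signal" = "true" ] || [ "$lines" -ge 500 ]; then
    echo "MINOR ask"
  elif [ "$lines" -ge 50 ]; then
    echo "PATCH auto"
  else
    echo "MICRO auto"
  fi
}

# Hypothetical helper: local fallback arithmetic on MAJOR.MINOR.PATCH.MICRO.
# ASSUMPTION: bumping a field resets every field to its right to 0.
bump_version() {
  local version="$1" level="$2" major minor patch micro
  if ! printf '%s\n' "$version" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$'; then
    echo "invalid version: $version" >&2
    return 1
  fi
  # Split the four dot-separated fields into separate variables.
  IFS=. read -r major minor patch micro <<EOF
$version
EOF
  case "$level" in
    MAJOR) echo "$((major + 1)).0.0.0" ;;
    MINOR) echo "$major.$((minor + 1)).0.0" ;;
    PATCH) echo "$major.$minor.$((patch + 1)).0" ;;
    MICRO) echo "$major.$minor.$patch.$((micro + 1))" ;;
    *) echo "unknown bump level: $level" >&2; return 1 ;;
  esac
}
```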

{{SECTION:changelog}}

## Step 14: TODOS.md (auto-update)

Cross-reference the project's TODOS.md against the changes being shipped. Mark completed items automatically; prompt only if the file is missing or disorganized.

Read `.claude/skills/review/TODOS-format.md` for the canonical format reference.

**1. Check if TODOS.md exists** in the repository root.

**If TODOS.md does not exist:** Use AskUserQuestion:
- Message: "GStack recommends maintaining a TODOS.md organized by skill/component, then priority (P0 at top through P4, then Completed at bottom). See TODOS-format.md for the full format. Would you like to create one?"
- Options: A) Create it now, B) Skip for now
- If A: Create `TODOS.md` with a skeleton (# TODOS heading + ## Completed section). Continue to step 3.
- If B: Skip the rest of Step 14. Continue to Step 15.

**2. Check structure and organization:**

Read TODOS.md and verify it follows the recommended structure:
- Items grouped under `## <Skill/Component>` headings
- Each item has `**Priority:**` field with P0-P4 value
- A `## Completed` section at the bottom

**If disorganized** (missing priority fields, no component groupings, no Completed section): Use AskUserQuestion:
- Message: "TODOS.md doesn't follow the recommended structure (skill/component groupings, P0-P4 priority, Completed section). Would you like to reorganize it?"
- Options: A) Reorganize now (recommended), B) Leave as-is
- If A: Reorganize in-place following TODOS-format.md. Preserve all content — only restructure, never delete items.
- If B: Continue to step 3 without restructuring.

**3. Detect completed TODOs:**

This step is fully automatic — no user interaction.

Use the diff and commit history already gathered in earlier steps:
- `git diff <base>...HEAD` (full diff against the base branch)
- `git log <base>..HEAD --oneline` (all commits being shipped)

For each TODO item, check if the changes in this PR complete it by:
- Matching commit messages against the TODO title and description
- Checking if files referenced in the TODO appear in the diff
- Checking if the TODO's described work matches the functional changes

**Be conservative:** Only mark a TODO as completed if there is clear evidence in the diff. If uncertain, leave it alone.

**4. Move completed items** to the `## Completed` section at the bottom. Append: `**Completed:** vX.Y.Z.W (YYYY-MM-DD)`

**5. Output summary:**
- `TODOS.md: N items marked complete (item1, item2, ...). M items remaining.`
- Or: `TODOS.md: No completed items detected. M items remaining.`
- Or: `TODOS.md: Created.` / `TODOS.md: Reorganized.`

**6. Defensive:** If TODOS.md cannot be written (permission error, disk full), warn the user and continue. Never stop the ship workflow for a TODOS failure.

Save this summary — it goes into the PR body in Step 19.
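
The structure check in item 2 of Step 14 can be approximated mechanically. A rough sketch, assuming the three signals listed there are detectable by plain pattern matching; the helper name `todos_structure_problems` is hypothetical, and the canonical rules live in TODOS-format.md, which this sketch does not read. A freshly created skeleton with no items yet would be flagged by this sketch.

```bash
# Hypothetical helper: print the structural problems found in a TODOS.md-style file.
# Prints nothing and returns 0 when all three checks pass.
todos_structure_problems() {
  local file="$1" status=0
  # At least one "## <Skill/Component>" heading besides "## Completed".
  if ! grep -E '^## ' "$file" | grep -qvE '^## Completed[[:space:]]*$'; then
    echo "no component groupings"; status=1
  fi
  # At least one "**Priority:** P0".."P4" field.
  if ! grep -qE '\*\*Priority:\*\* P[0-4]' "$file"; then
    echo "missing priority fields"; status=1
  fi
  # A "## Completed" section exists.
  if ! grep -qE '^## Completed[[:space:]]*$' "$file"; then
    echo "no Completed section"; status=1
  fi
  return "$status"
}

# Usage:
#   todos_structure_problems TODOS.md || echo "TODOS.md looks disorganized"
```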

---

## Step 15: Commit (bisectable chunks)

### Step 15.0: WIP Commit Squash (continuous checkpoint mode only)

If `CHECKPOINT_MODE` is `"continuous"`, the branch likely contains `WIP:` commits
from auto-checkpointing. These must be squashed INTO the corresponding logical
commits before the bisectable-grouping logic in Step 15.1 runs. Non-WIP commits
on the branch (earlier landed work) must be preserved.

**Detection:**
```bash
WIP_COUNT=$(git log <base>..HEAD --oneline --grep="^WIP:" 2>/dev/null | wc -l | tr -d ' ')
echo "WIP_COMMITS: $WIP_COUNT"
```

If `WIP_COUNT` is 0: skip this sub-step entirely.

If `WIP_COUNT` > 0, collect the WIP context first so it survives the squash:

```bash
# Export [gstack-context] blocks from all WIP commits on this branch.
# This file becomes input to the CHANGELOG entry and may inform PR body context.
mkdir -p "$(git rev-parse --show-toplevel)/.gstack"
git log <base>..HEAD --grep="^WIP:" --format="%H%n%B%n---END---" > \
  "$(git rev-parse --show-toplevel)/.gstack/wip-context-before-squash.md" 2>/dev/null || true
```

**Non-destructive squash strategy:**

`git reset --soft <merge-base>` WOULD uncommit everything including non-WIP commits.
DO NOT DO THAT. Instead, use `git rebase` scoped to filter WIP commits only.

Option 1 (preferred, if there are non-WIP commits mixed in):
```bash
# Interactive rebase with automated WIP squashing.
# Mark every WIP commit as 'fixup' (drop its message, fold changes into prior commit).
# GIT_SEQUENCE_EDITOR rewrites the todo list so no editor opens. If the first commit
# in the range is a WIP commit, fixup has nothing to fold into and the rebase fails,
# which lands in the abort branch below.
GIT_SEQUENCE_EDITOR="sed -i.bak -E 's/^pick ([0-9a-f]+) WIP:/fixup \1 WIP:/'" \
  git rebase -i -X ours "$(git merge-base HEAD origin/<base>)" 2>/dev/null || {
  echo "Rebase conflict. Aborting: git rebase --abort"
  git rebase --abort
  echo "STATUS: BLOCKED — manual WIP squash required"
  exit 1
}
```

Option 2 (simpler, if the branch is ALL WIP commits so far — no landed work):
```bash
# Branch contains only WIP commits. Reset-soft is safe here because there's
# nothing non-WIP to preserve. Verify first.
NON_WIP=$(git log <base>..HEAD --oneline --invert-grep --grep="^WIP:" 2>/dev/null | wc -l | tr -d ' ')
if [ "$NON_WIP" -eq 0 ]; then
  git reset --soft "$(git merge-base HEAD origin/<base>)"
  echo "WIP-only branch, reset-soft to merge base. Step 15.1 will create clean commits."
fi
```

Decide at runtime which option applies. If unsure, prefer stopping and asking the
user via AskUserQuestion rather than destroying non-WIP commits.

**Anti-footgun rules:**
- NEVER run a blind `git reset --soft` if there are non-WIP commits. Codex flagged this
as destructive — it would uncommit real landed work and turn the push step into
a non-fast-forward push for anyone who already pushed.
- Only proceed to Step 15.1 after WIP commits are successfully squashed/absorbed
or the branch has been verified to contain only WIP work.

### Step 15.1: Bisectable Commits

**Goal:** Create small, logical commits that work well with `git bisect` and help LLMs understand what changed.

1. Analyze the diff and group changes into logical commits. Each commit should represent **one coherent change** — not one file, but one logical unit.

2. **Commit ordering** (earlier commits first):
- **Infrastructure:** migrations, config changes, route additions
- **Models & services:** new models, services, concerns (with their tests)
- **Controllers & views:** controllers, views, JS/React components (with their tests)
- **VERSION + CHANGELOG + TODOS.md:** always in the final commit

3. **Rules for splitting:**
- A model and its test file go in the same commit
- A service and its test file go in the same commit
- A controller, its views, and its test go in the same commit
- Migrations are their own commit (or grouped with the model they support)
- Config/route changes can group with the feature they enable
- If the total diff is small (< 50 lines across < 4 files), a single commit is fine

4. **Each commit must be independently valid** — no broken imports, no references to code that doesn't exist yet. Order commits so dependencies come first.

5. Compose each commit message:
- First line: `<type>: <summary>` (type = feat/fix/chore/refactor/docs)
- Body: brief description of what this commit contains
- Only the **final commit** (VERSION + CHANGELOG) gets the version tag and co-author trailer:

```bash
git commit -m "$(cat <<'EOF'
chore: bump version and changelog (vX.Y.Z.W)

{{CO_AUTHOR_TRAILER}}
EOF
)"
```
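
The `<type>: <summary>` first-line convention from Step 15.1 can be checked before each commit. A small sketch, assuming only the five types listed there are allowed; the helper name `valid_commit_subject` is hypothetical and not part of gstack:

```bash
# Hypothetical helper: succeed only if the subject is "<type>: <summary>"
# with one of the five types named in Step 15.1.
valid_commit_subject() {
  printf '%s\n' "$1" | grep -qE '^(feat|fix|chore|refactor|docs): .+'
}

# Usage:
#   valid_commit_subject "$SUBJECT" || echo "Commit subject does not match <type>: <summary>"
```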

---

## Step 16: Verification Gate

**IRON LAW: NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE.**

Before pushing, re-verify if code changed during Steps 4-6:

1. **Test verification:** If ANY code changed after Step 5's test run (fixes from review findings, CHANGELOG edits don't count), re-run the test suite. Paste fresh output. Stale output from Step 5 is NOT acceptable.

2. **Build verification:** If the project has a build step, run it. Paste output.

3. **Rationalization prevention:**
- "Should work now" → RUN IT.
- "I'm confident" → Confidence is not evidence.
- "I already tested earlier" → Code changed since then. Test again.
- "It's a trivial change" → Trivial changes break production.

**If tests fail here:** STOP. Do not push. Fix the issue and return to Step 5.

Claiming work is complete without verification is dishonesty, not efficiency.

---

## Step 17: Push

**Idempotency check:** Check if the branch is already pushed and up to date.

```bash
git fetch origin <branch-name> 2>/dev/null
LOCAL=$(git rev-parse HEAD)
REMOTE=$(git rev-parse origin/<branch-name> 2>/dev/null || echo "none")
echo "LOCAL: $LOCAL REMOTE: $REMOTE"
[ "$LOCAL" = "$REMOTE" ] && echo "ALREADY_PUSHED" || echo "PUSH_NEEDED"
```

If `ALREADY_PUSHED`, skip the push but continue to Step 18. Otherwise push with upstream tracking:

```bash
git push -u origin <branch-name>
```

**You are NOT done.** The code is pushed but documentation sync and PR creation are mandatory final steps. Continue to Step 18.

---

{{SECTION:pr-body}}

## Step 20: Persist ship metrics

Log coverage and plan completion data so `/retro` can track trends:

```bash
eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG
```

Append to `~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl`:

```bash
echo '{"skill":"ship","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","coverage_pct":COVERAGE_PCT,"plan_items_total":PLAN_TOTAL,"plan_items_done":PLAN_DONE,"verification_result":"VERIFY_RESULT","version":"VERSION","branch":"BRANCH"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
```

Substitute from earlier steps:
- **COVERAGE_PCT**: coverage percentage from Step 7 diagram (integer, or -1 if undetermined)
- **PLAN_TOTAL**: total plan items extracted in Step 8 (0 if no plan file)
- **PLAN_DONE**: count of DONE + CHANGED items from Step 8 (0 if no plan file)
- **VERIFY_RESULT**: "pass", "fail", or "skipped" from Step 8.1
- **VERSION**: from the VERSION file
- **BRANCH**: current branch name

This step is automatic — never skip it, never ask for confirmation.
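
The placeholder substitution in Step 20 can also be done with `printf`, which keeps the numeric fields unquoted and the string fields quoted in one place. A sketch only: the helper name `ship_metrics_json` is hypothetical, and it assumes every argument is a plain value without quotes or backslashes, since nothing is JSON-escaped here.

```bash
# Hypothetical helper: build one metrics record with the same keys as the echo above.
# ASSUMPTION: arguments contain no quotes or backslashes (no JSON escaping is done).
ship_metrics_json() {
  local ts="$1" coverage="$2" plan_total="$3" plan_done="$4" verify="$5" version="$6" branch="$7"
  printf '{"skill":"ship","timestamp":"%s","coverage_pct":%s,"plan_items_total":%s,"plan_items_done":%s,"verification_result":"%s","version":"%s","branch":"%s"}\n' \
    "$ts" "$coverage" "$plan_total" "$plan_done" "$verify" "$version" "$branch"
}

# Usage:
#   ship_metrics_json "$(date -u +%Y-%m-%dT%H:%M:%SZ)" 82 5 4 pass 1.54.0.0 my-branch \
#     >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
```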

---

## Step 21: Plan-tune discoverability nudge (first-successful-ship only)

Plan-tune cathedral T15. After a successful ship, surface /plan-tune once
per machine. Single line, non-blocking, marker-gated so it never re-fires.

```bash
_NUDGE_MARKER="$HOME/.gstack/.plan-tune-nudge-shown"
_QT=$(~/.claude/skills/gstack/bin/gstack-config get question_tuning 2>/dev/null || echo "false")
if [ ! -f "$_NUDGE_MARKER" ] && [ "$_QT" = "false" ]; then
  echo ""
  echo "gstack can learn from your AskUserQuestion answers. Run /plan-tune to opt in"
  echo "— it captures which prompts you find valuable vs noisy and (with hooks installed)"
  echo "auto-decides your never-ask preferences."
  touch "$_NUDGE_MARKER"
fi
```

If the marker exists, OR question_tuning is already on, the nudge is a
no-op. The marker guarantees at-most-once per machine. To re-enable:
`rm ~/.gstack/.plan-tune-nudge-shown` before next ship.

---

## Section self-check (before you finish)

You ran a carved skill. For your situation, list every section the Section index
named as applying, and confirm you issued a Read for each one. If you executed any
of those steps from memory without reading its section, you skipped the source of
truth — STOP, Read it now, and redo that step. Deterministic version work goes
through `gstack-version-bump`; never hand-roll the VERSION/package.json write.

---

## Important Rules

- **Never skip tests.** If tests fail, stop.
- **Never skip the pre-landing review.** If checklist.md is unreadable, stop.
- **Never force push.** Use regular `git push` only.
- **Never ask for trivial confirmations** (e.g., "ready to push?", "create PR?"). DO stop for: version bumps (MINOR/MAJOR), pre-landing review findings (ASK items), and Codex structured review [P1] findings (large diffs only).
- **Always use the 4-digit version format** from the VERSION file.
- **Date format in CHANGELOG:** `YYYY-MM-DD`
- **Split commits for bisectability** — each commit = one logical change.
- **TODOS.md completion detection must be conservative.** Only mark items as completed when the diff clearly shows the work is done.
- **Use Greptile reply templates from greptile-triage.md.** Every reply includes evidence (inline diff, code references, re-rank suggestion). Never post vague replies.
- **Never push without fresh verification evidence.** If code changed after Step 5 tests, re-run before pushing.
- **Step 7 generates coverage tests.** They must pass before committing. Never commit failing tests.
- **The goal is: user says `/ship`, next thing they see is the review + PR URL + auto-synced docs.**