merge: incorporate origin/main into community-mode branch

Conflicts resolved:
- VERSION: accept main's 0.14.5.0 (higher than our 0.14.0.0)
- package.json: same version resolution
- CHANGELOG.md: drop duplicate 0.14.0.0 entry (already on main),
  keep main's entries for 0.14.1-0.14.5 and 0.13.7-0.13.10
- README.md: merge skill lists — keep main's /design-html + /learn,
  add our /gstack-submit to both install and troubleshooting lists
- docs/skills.md: keep all three entries (/gstack-submit, /autoplan, /learn)

Main brought in 0.14.1-0.14.5: design-to-code (/design-html),
comparison board chooser, sidebar CSS inspector + per-tab agents,
always-on adversarial review + scope drift, review army (7 parallel
specialist reviewers), ship idempotency, skill prefix fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-03-30 21:50:02 -07:00
128 changed files with 14314 additions and 1393 deletions
+205 -70
View File
@@ -3,8 +3,11 @@ name: ship
preamble-tier: 4
version: 1.0.0
description: |
Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. Use when asked to "ship", "deploy", "push to main", "create a PR", or "merge and push".
Proactively suggest when the user says code is ready or asks about deploying.
Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION,
update CHANGELOG, commit, push, create PR. Use when asked to "ship", "deploy",
"push to main", "create a PR", "merge and push", or "get it deployed".
Proactively invoke this skill (do NOT push/PR directly) when the user says code
is ready, asks about deploying, wants to push code up, or asks to create a PR. (gstack)
allowed-tools:
- Bash
- Read
@@ -27,7 +30,7 @@ _UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/sk
mkdir -p ~/.gstack/sessions
touch ~/.gstack/sessions/"$PPID"
_SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ')
find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true
find ~/.gstack/sessions -mmin +120 -type f -exec rm {} + 2>/dev/null || true
_CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true)
_PROACTIVE=$(~/.claude/skills/gstack/bin/gstack-config get proactive 2>/dev/null || echo "true")
_PROACTIVE_PROMPTED=$([ -f ~/.gstack/.proactive-prompted ] && echo "yes" || echo "no")
@@ -49,7 +52,9 @@ _SESSION_ID="$$-$(date +%s)"
echo "TELEMETRY: ${_TEL:-off}"
echo "TEL_PROMPTED: $_TEL_PROMPTED"
mkdir -p ~/.gstack/analytics
echo '{"skill":"ship","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","repo":"'$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null || echo "unknown")'"}' >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true
if [ "${_TEL:-off}" != "off" ]; then
echo '{"skill":"ship","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","repo":"'$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null || echo "unknown")'"}' >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true
fi
# zsh-compatible: use find instead of glob to avoid NOMATCH error
for _PF in $(find ~/.gstack/analytics -maxdepth 1 -name '.pending-*' 2>/dev/null); do
if [ -f "$_PF" ]; then
@@ -69,6 +74,14 @@ if [ -f "$_LEARN_FILE" ]; then
else
echo "LEARNINGS: 0"
fi
# Check if CLAUDE.md has routing rules
_HAS_ROUTING="no"
if [ -f CLAUDE.md ] && grep -q "## Skill routing" CLAUDE.md 2>/dev/null; then
_HAS_ROUTING="yes"
fi
_ROUTING_DECLINED=$(~/.claude/skills/gstack/bin/gstack-config get routing_declined 2>/dev/null || echo "false")
echo "HAS_ROUTING: $_HAS_ROUTING"
echo "ROUTING_DECLINED: $_ROUTING_DECLINED"
```
If `PROACTIVE` is `"false"`, do not proactively suggest gstack skills AND do not
@@ -150,6 +163,49 @@ touch ~/.gstack/.proactive-prompted
This only happens once. If `PROACTIVE_PROMPTED` is `yes`, skip this entirely.
If `HAS_ROUTING` is `no` AND `ROUTING_DECLINED` is `false` AND `PROACTIVE_PROMPTED` is `yes`:
Check if a CLAUDE.md file exists in the project root. If it does not exist, create it.
Use AskUserQuestion:
> gstack works best when your project's CLAUDE.md includes skill routing rules.
> This tells Claude to use specialized workflows (like /ship, /investigate, /qa)
> instead of answering directly. It's a one-time addition, about 15 lines.
Options:
- A) Add routing rules to CLAUDE.md (recommended)
- B) No thanks, I'll invoke skills manually
If A: Append this section to the end of CLAUDE.md:
```markdown
## Skill routing
When the user's request matches an available skill, ALWAYS invoke it using the Skill
tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
The skill has specialized workflows that produce better results than ad-hoc answers.
Key routing rules:
- Product ideas, "is this worth building", brainstorming → invoke office-hours
- Bugs, errors, "why is this broken", 500 errors → invoke investigate
- Ship, deploy, push, create PR → invoke ship
- QA, test the site, find bugs → invoke qa
- Code review, check my diff → invoke review
- Update docs after shipping → invoke document-release
- Weekly retro → invoke retro
- Design system, brand → invoke design-consultation
- Visual audit, design polish → invoke design-review
- Architecture review → invoke plan-eng-review
```
Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`
If B: run `~/.claude/skills/gstack/bin/gstack-config set routing_declined true`
Say "No problem. You can add routing rules later by running `gstack-config set routing_declined false` and re-running any skill."
This only happens once per project. If `HAS_ROUTING` is `yes` or `ROUTING_DECLINED` is `true`, skip this entirely.
## Voice
You are GStack, an open source AI builder framework shaped by Garry Tan's product, startup, and engineering judgment. Encode how he thinks, not his biography.
@@ -302,20 +358,37 @@ Run this bash:
_TEL_END=$(date +%s)
_TEL_DUR=$(( _TEL_END - _TEL_START ))
rm -f ~/.gstack/analytics/.pending-"$_SESSION_ID" 2>/dev/null || true
# Local analytics (always available, no binary needed)
echo '{"skill":"SKILL_NAME","duration_s":"'"$_TEL_DUR"'","outcome":"OUTCOME","browse":"USED_BROWSE","session":"'"$_SESSION_ID"'","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'"}' >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true
# Remote telemetry (opt-in, requires binary)
if [ "$_TEL" != "off" ] && [ -x ~/.claude/skills/gstack/bin/gstack-telemetry-log ]; then
~/.claude/skills/gstack/bin/gstack-telemetry-log \
--skill "SKILL_NAME" --duration "$_TEL_DUR" --outcome "OUTCOME" \
--used-browse "USED_BROWSE" --session-id "$_SESSION_ID" 2>/dev/null &
# Local + remote telemetry (both gated by _TEL setting)
if [ "$_TEL" != "off" ]; then
echo '{"skill":"SKILL_NAME","duration_s":"'"$_TEL_DUR"'","outcome":"OUTCOME","browse":"USED_BROWSE","session":"'"$_SESSION_ID"'","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'"}' >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true
if [ -x ~/.claude/skills/gstack/bin/gstack-telemetry-log ]; then
~/.claude/skills/gstack/bin/gstack-telemetry-log \
--skill "SKILL_NAME" --duration "$_TEL_DUR" --outcome "OUTCOME" \
--used-browse "USED_BROWSE" --session-id "$_SESSION_ID" 2>/dev/null &
fi
fi
```
Replace `SKILL_NAME` with the actual skill name from frontmatter, `OUTCOME` with
success/error/abort, and `USED_BROWSE` with true/false based on whether `$B` was used.
If you cannot determine the outcome, use "unknown". The local JSONL always logs. The
remote binary only runs if telemetry is not off and the binary exists.
If you cannot determine the outcome, use "unknown". Both local JSONL and remote
telemetry only run if telemetry is not off. The remote binary additionally requires
the binary to exist.
## Plan Mode Safe Operations
When in plan mode, these operations are always allowed because they produce
artifacts that inform the plan, not code changes:
- `$B` commands (browse: screenshots, page inspection, navigation, snapshots)
- `$D` commands (design: generate mockups, variants, comparison boards, iterate)
- `codex exec` / `codex review` (outside voice, plan review, adversarial challenge)
- Writing to `~/.gstack/` (config, analytics, review logs, design artifacts, learnings)
- Writing to the plan file (already allowed by plan mode)
- `open` commands for viewing generated artifacts (comparison boards, HTML previews)
These are read-only in spirit — they inspect the live site, generate visual artifacts,
or get independent opinions. They do NOT modify project source files.
## Plan Status Footer
@@ -468,7 +541,7 @@ Display:
- **Eng Review (required by default):** The only review that gates shipping. Covers architecture, code quality, tests, performance. Can be disabled globally with \`gstack-config set skip_eng_review true\` (the "don't bother me" setting).
- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
- **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
- **Adversarial Review (automatic):** Always-on for every review. Every diff gets both Claude adversarial subagent and Codex adversarial challenge. Large diffs (200+ lines) additionally get Codex structured review with P1 gate. No configuration needed.
- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping.
**Verdict logic:**
@@ -1366,6 +1439,41 @@ matches a past learning, display:
This makes the compounding visible. The user should see that gstack is getting
smarter on their codebase over time.
## Step 3.48: Scope Drift Detection
Before reviewing code quality, check: **did they build what was requested — nothing more, nothing less?**
1. Read `TODOS.md` (if it exists). Read PR description (`gh pr view --json body --jq .body 2>/dev/null || true`).
Read commit messages (`git log origin/<base>..HEAD --oneline`).
**If no PR exists:** rely on commit messages and TODOS.md for stated intent — this is the common case since /review runs before /ship creates the PR.
2. Identify the **stated intent** — what was this branch supposed to accomplish?
3. Run `git diff origin/<base>...HEAD --stat` and compare the files changed against the stated intent.
4. Evaluate with skepticism (incorporating plan completion results if available from an earlier step or adjacent section):
**SCOPE CREEP detection:**
- Files changed that are unrelated to the stated intent
- New features or refactors not mentioned in the plan
- "While I was in there..." changes that expand blast radius
**MISSING REQUIREMENTS detection:**
- Requirements from TODOS.md/PR description not addressed in the diff
- Test coverage gaps for stated requirements
- Partial implementations (started but not finished)
5. Output (before the main review begins):
\`\`\`
Scope Check: [CLEAN / DRIFT DETECTED / REQUIREMENTS MISSING]
Intent: <1-line summary of what was requested>
Delivered: <1-line summary of what the diff actually does>
[If drift: list each out-of-scope change]
[If missing: list each unaddressed requirement]
\`\`\`
6. This is **INFORMATIONAL** — does not block the review. Proceed to the next step.
---
---
## Step 3.5: Pre-Landing Review
@@ -1533,9 +1641,9 @@ For each classified comment:
---
## Step 3.8: Adversarial review (auto-scaled)
## Step 3.8: Adversarial review (always-on)
Adversarial review thoroughness scales automatically based on diff size. No configuration needed.
Every diff gets adversarial review from both Claude and Codex. LOC is not a proxy for risk — a 5-line auth change can be critical.
**Detect diff size and tool availability:**
@@ -1544,30 +1652,34 @@ DIFF_INS=$(git diff origin/<base> --stat | tail -1 | grep -oE '[0-9]+ insertion'
DIFF_DEL=$(git diff origin/<base> --stat | tail -1 | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || echo "0")
DIFF_TOTAL=$((DIFF_INS + DIFF_DEL))
which codex 2>/dev/null && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
# Respect old opt-out
# Legacy opt-out — only gates Codex passes, Claude always runs
OLD_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || true)
echo "DIFF_SIZE: $DIFF_TOTAL"
echo "OLD_CFG: ${OLD_CFG:-not_set}"
```
If `OLD_CFG` is `disabled`: skip this step silently. Continue to the next step.
If `OLD_CFG` is `disabled`: skip Codex passes only. Claude adversarial subagent still runs (it's free and fast). Jump to the "Claude adversarial subagent" section.
**User override:** If the user explicitly requested a specific tier (e.g., "run all passes", "paranoid review", "full adversarial", "do all 4 passes", "thorough review"), honor that request regardless of diff size. Jump to the matching tier section.
**Auto-select tier based on diff size:**
- **Small (< 50 lines changed):** Skip adversarial review entirely. Print: "Small diff ($DIFF_TOTAL lines) — adversarial review skipped." Continue to the next step.
- **Medium (50199 lines changed):** Run Codex adversarial challenge (or Claude adversarial subagent if Codex unavailable). Jump to the "Medium tier" section.
- **Large (200+ lines changed):** Run all remaining passes — Codex structured review + Claude adversarial subagent + Codex adversarial. Jump to the "Large tier" section.
**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size.
---
### Medium tier (50199 lines)
### Claude adversarial subagent (always runs)
Claude's structured review already ran. Now add a **cross-model adversarial challenge**.
Dispatch via the Agent tool. The subagent has fresh context — no checklist bias from the structured review. This genuine independence catches things the primary reviewer is blind to.
**If Codex is available:** run the Codex adversarial challenge. **If Codex is NOT available:** fall back to the Claude adversarial subagent instead.
Subagent prompt:
"Read the diff for this branch with `git diff origin/<base>`. Think like an attacker and a chaos engineer. Your job is to find ways this code will fail in production. Look for: edge cases, race conditions, security holes, resource leaks, failure modes, silent data corruption, logic errors that produce wrong results silently, error handling that swallows failures, and trust boundary violations. Be adversarial. Be thorough. No compliments — just the problems. For each finding, classify as FIXABLE (you know how to fix it) or INVESTIGATE (needs human judgment)."
**Codex adversarial:**
Present findings under an `ADVERSARIAL REVIEW (Claude subagent):` header. **FIXABLE findings** flow into the same Fix-First pipeline as the structured review. **INVESTIGATE findings** are presented as informational.
If the subagent fails or times out: "Claude adversarial subagent unavailable. Continuing."
---
### Codex adversarial challenge (always runs when available)
If Codex is available AND `OLD_CFG` is NOT `disabled`:
```bash
TMPERR_ADV=$(mktemp /tmp/codex-adv-XXXXXXXX)
@@ -1587,34 +1699,16 @@ Present the full output verbatim. This is informational — it never blocks ship
- **Timeout:** "Codex timed out after 5 minutes."
- **Empty response:** "Codex returned no response. Stderr: <paste relevant error>."
On any Codex error, fall back to the Claude adversarial subagent automatically.
**Cleanup:** Run `rm -f "$TMPERR_ADV"` after processing.
**Claude adversarial subagent** (fallback when Codex unavailable or errored):
Dispatch via the Agent tool. The subagent has fresh context — no checklist bias from the structured review. This genuine independence catches things the primary reviewer is blind to.
Subagent prompt:
"Read the diff for this branch with `git diff origin/<base>`. Think like an attacker and a chaos engineer. Your job is to find ways this code will fail in production. Look for: edge cases, race conditions, security holes, resource leaks, failure modes, silent data corruption, logic errors that produce wrong results silently, error handling that swallows failures, and trust boundary violations. Be adversarial. Be thorough. No compliments — just the problems. For each finding, classify as FIXABLE (you know how to fix it) or INVESTIGATE (needs human judgment)."
Present findings under an `ADVERSARIAL REVIEW (Claude subagent):` header. **FIXABLE findings** flow into the same Fix-First pipeline as the structured review. **INVESTIGATE findings** are presented as informational.
If the subagent fails or times out: "Claude adversarial subagent unavailable. Continuing without adversarial review."
**Persist the review result:**
```bash
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"adversarial-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","source":"SOURCE","tier":"medium","commit":"'"$(git rev-parse --short HEAD)"'"}'
```
Substitute STATUS: "clean" if no findings, "issues_found" if findings exist. SOURCE: "codex" if Codex ran, "claude" if subagent ran. If both failed, do NOT persist.
**Cleanup:** Run `rm -f "$TMPERR_ADV"` after processing (if Codex was used).
If Codex is NOT available: "Codex CLI not found — running Claude adversarial only. Install Codex for cross-model coverage: `npm install -g @openai/codex`"
---
### Large tier (200+ lines)
### Codex structured review (large diffs only, 200+ lines)
Claude's structured review already ran. Now run **all three remaining passes** for maximum coverage:
If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`:
**1. Codex structured review (if available):**
```bash
TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX)
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
@@ -1635,34 +1729,34 @@ B) Continue — review will still complete
If A: address the findings. After fixing, re-run tests (Step 3) since code has changed. Re-run `codex review` to verify.
Read stderr for errors (same error handling as medium tier).
Read stderr for errors (same error handling as Codex adversarial above).
After stderr: `rm -f "$TMPERR"`
**2. Claude adversarial subagent:** Dispatch a subagent with the adversarial prompt (same prompt as medium tier). This always runs regardless of Codex availability.
**3. Codex adversarial challenge (if available):** Run `codex exec` with the adversarial prompt (same as medium tier).
If Codex is not available for steps 1 and 3, note to the user: "Codex CLI not found — large-diff review ran Claude structured + Claude adversarial (2 of 4 passes). Install Codex for full 4-pass coverage: `npm install -g @openai/codex`"
**Persist the review result AFTER all passes complete** (not after each sub-step):
```bash
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"adversarial-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","source":"SOURCE","tier":"large","gate":"GATE","commit":"'"$(git rev-parse --short HEAD)"'"}'
```
Substitute: STATUS = "clean" if no findings across ALL passes, "issues_found" if any pass found issues. SOURCE = "both" if Codex ran, "claude" if only Claude subagent ran. GATE = the Codex structured review gate result ("pass"/"fail"), or "informational" if Codex was unavailable. If all passes failed, do NOT persist.
If `DIFF_TOTAL < 200`: skip this section silently. The Claude + Codex adversarial passes provide sufficient coverage for smaller diffs.
---
### Cross-model synthesis (medium and large tiers)
### Persist the review result
After all passes complete, persist:
```bash
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"adversarial-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","source":"SOURCE","tier":"always","gate":"GATE","commit":"'"$(git rev-parse --short HEAD)"'"}'
```
Substitute: STATUS = "clean" if no findings across ALL passes, "issues_found" if any pass found issues. SOURCE = "both" if Codex ran, "claude" if only Claude subagent ran. GATE = the Codex structured review gate result ("pass"/"fail"), "skipped" if diff < 200, or "informational" if Codex was unavailable. If all passes failed, do NOT persist.
---
### Cross-model synthesis
After all passes complete, synthesize findings across all sources:
```
ADVERSARIAL REVIEW SYNTHESIS (auto: TIER, N lines):
ADVERSARIAL REVIEW SYNTHESIS (always-on, N lines):
════════════════════════════════════════════════════════════
High confidence (found by multiple sources): [findings agreed on by >1 pass]
Unique to Claude structured review: [from earlier step]
Unique to Claude adversarial: [from subagent, if ran]
Unique to Claude adversarial: [from subagent]
Unique to Codex: [from codex adversarial or code review, if ran]
Models used: Claude structured ✓ Claude adversarial ✓/✗ Codex ✓/✗
════════════════════════════════════════════════════════════
@@ -1698,13 +1792,25 @@ already knows. A good test: would this insight save time in a future session? If
## Step 4: Version bump (auto-decide)
**Idempotency check:** Before bumping, compare VERSION against the base branch.
```bash
BASE_VERSION=$(git show origin/<base>:VERSION 2>/dev/null || echo "0.0.0.0")
CURRENT_VERSION=$(cat VERSION 2>/dev/null || echo "0.0.0.0")
echo "BASE: $BASE_VERSION HEAD: $CURRENT_VERSION"
if [ "$CURRENT_VERSION" != "$BASE_VERSION" ]; then echo "ALREADY_BUMPED"; fi
```
If output shows `ALREADY_BUMPED`, VERSION was already bumped on this branch (prior `/ship` run). Skip the rest of Step 4 and use the current VERSION. Otherwise proceed with the bump.
1. Read the current `VERSION` file (4-digit format: `MAJOR.MINOR.PATCH.MICRO`)
2. **Auto-decide the bump level based on the diff:**
- Count lines changed (`git diff origin/<base>...HEAD --stat | tail -1`)
- Check for feature signals: new route/page files (e.g. `app/*/page.tsx`, `pages/*.ts`), new DB migration/schema files, new test files alongside new source files, or branch name starting with `feat/`
- **MICRO** (4th digit): < 50 lines changed, trivial tweaks, typos, config
- **PATCH** (3rd digit): 50+ lines changed, bug fixes, small-medium features
- **MINOR** (2nd digit): **ASK the user** — only for major features or significant architectural changes
- **PATCH** (3rd digit): 50+ lines changed, no feature signals detected
- **MINOR** (2nd digit): **ASK the user** if ANY feature signal is detected, OR 500+ lines changed, OR new modules/packages added
- **MAJOR** (1st digit): **ASK the user** — only for milestones or breaking changes
3. Compute the new version:
@@ -1715,7 +1821,7 @@ already knows. A good test: would this insight save time in a future session? If
---
## Step 5: CHANGELOG (auto-generate)
## CHANGELOG (auto-generate)
1. Read `CHANGELOG.md` header to know the format.
@@ -1748,6 +1854,7 @@ already knows. A good test: would this insight save time in a future session? If
- Write concise, descriptive bullet points
- Insert after the file header (line 5), dated today
- Format: `## [X.Y.Z.W] - YYYY-MM-DD`
- **Voice:** Lead with what the user can now **do** that they couldn't before. Use plain language, not implementation details. Never mention TODOS.md, internal tracking, or contributor-facing details.
6. **Cross-check:** Compare your CHANGELOG entry against the commit list from step 2.
Every commit must map to at least one bullet point. If any commit is unrepresented,
@@ -1949,7 +2056,17 @@ If `SCOPE_FRONTEND=false` or browse is not available, skip this step silently.
## Step 7: Push
Push to the remote with upstream tracking:
**Idempotency check:** Check if the branch is already pushed and up to date.
```bash
git fetch origin <branch-name> 2>/dev/null
LOCAL=$(git rev-parse HEAD)
REMOTE=$(git rev-parse origin/<branch-name> 2>/dev/null || echo "none")
echo "LOCAL: $LOCAL REMOTE: $REMOTE"
[ "$LOCAL" = "$REMOTE" ] && echo "ALREADY_PUSHED" || echo "PUSH_NEEDED"
```
If `ALREADY_PUSHED`, skip the push. Otherwise push with upstream tracking:
```bash
git push -u origin <branch-name>
@@ -1959,7 +2076,21 @@ git push -u origin <branch-name>
## Step 8: Create PR/MR
Create a pull request (GitHub) or merge request (GitLab) using the platform detected in Step 0.
**Idempotency check:** Check if a PR/MR already exists for this branch.
**If GitHub:**
```bash
gh pr view --json url,number,state -q 'if .state == "OPEN" then "PR #\(.number): \(.url)" else "NO_PR" end' 2>/dev/null || echo "NO_PR"
```
**If GitLab:**
```bash
glab mr view -F json 2>/dev/null | jq -r 'if .state == "opened" then "MR_EXISTS" else "NO_MR" end' 2>/dev/null || echo "NO_MR"
```
If an **open** PR/MR already exists: **update** the PR body with the latest test results, coverage, and review findings using `gh pr edit --body "..."` (GitHub) or `glab mr update -d "..."` (GitLab). Print the existing URL and continue to Step 8.5.
If no PR/MR exists: create a pull request (GitHub) or merge request (GitLab) using the platform detected in Step 0.
The PR/MR body should contain these sections:
@@ -1991,6 +2122,10 @@ you missed it.>
<If no Greptile comments found: "No Greptile comments.">
<If no PR existed during Step 3.75: omit this section entirely>
## Scope Drift
<If scope drift ran: "Scope Check: CLEAN" or list of drift/creep findings>
<If no scope drift: omit this section>
## Plan Completion
<If plan file found: completion checklist summary from Step 3.45>
<If no plan file: "No plan file detected.">
+52 -46
View File
@@ -3,8 +3,11 @@ name: ship
preamble-tier: 4
version: 1.0.0
description: |
Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. Use when asked to "ship", "deploy", "push to main", "create a PR", or "merge and push".
Proactively suggest when the user says code is ready or asks about deploying.
Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION,
update CHANGELOG, commit, push, create PR. Use when asked to "ship", "deploy",
"push to main", "create a PR", "merge and push", or "get it deployed".
Proactively invoke this skill (do NOT push/PR directly) when the user says code
is ready, asks about deploying, wants to push code up, or asks to create a PR. (gstack)
allowed-tools:
- Bash
- Read
@@ -230,6 +233,8 @@ If multiple suites need to run, run them sequentially (each needs a test lane).
{{LEARNINGS_SEARCH}}
{{SCOPE_DRIFT}}
---
## Step 3.5: Pre-Landing Review
@@ -326,13 +331,25 @@ For each classified comment:
## Step 4: Version bump (auto-decide)
**Idempotency check:** Before bumping, compare VERSION against the base branch.
```bash
BASE_VERSION=$(git show origin/<base>:VERSION 2>/dev/null || echo "0.0.0.0")
CURRENT_VERSION=$(cat VERSION 2>/dev/null || echo "0.0.0.0")
echo "BASE: $BASE_VERSION HEAD: $CURRENT_VERSION"
if [ "$CURRENT_VERSION" != "$BASE_VERSION" ]; then echo "ALREADY_BUMPED"; fi
```
If output shows `ALREADY_BUMPED`, VERSION was already bumped on this branch (prior `/ship` run). Skip the rest of Step 4 and use the current VERSION. Otherwise proceed with the bump.
1. Read the current `VERSION` file (4-digit format: `MAJOR.MINOR.PATCH.MICRO`)
2. **Auto-decide the bump level based on the diff:**
- Count lines changed (`git diff origin/<base>...HEAD --stat | tail -1`)
- Check for feature signals: new route/page files (e.g. `app/*/page.tsx`, `pages/*.ts`), new DB migration/schema files, new test files alongside new source files, or branch name starting with `feat/`
- **MICRO** (4th digit): < 50 lines changed, trivial tweaks, typos, config
- **PATCH** (3rd digit): 50+ lines changed, bug fixes, small-medium features
- **MINOR** (2nd digit): **ASK the user** — only for major features or significant architectural changes
- **PATCH** (3rd digit): 50+ lines changed, no feature signals detected
- **MINOR** (2nd digit): **ASK the user** if ANY feature signal is detected, OR 500+ lines changed, OR new modules/packages added
- **MAJOR** (1st digit): **ASK the user** — only for milestones or breaking changes
3. Compute the new version:
@@ -343,46 +360,7 @@ For each classified comment:
---
## Step 5: CHANGELOG (auto-generate)
1. Read `CHANGELOG.md` header to know the format.
2. **First, enumerate every commit on the branch:**
```bash
git log <base>..HEAD --oneline
```
Copy the full list. Count the commits. You will use this as a checklist.
3. **Read the full diff** to understand what each commit actually changed:
```bash
git diff <base>...HEAD
```
4. **Group commits by theme** before writing anything. Common themes:
- New features / capabilities
- Performance improvements
- Bug fixes
- Dead code removal / cleanup
- Infrastructure / tooling / tests
- Refactoring
5. **Write the CHANGELOG entry** covering ALL groups:
- If existing CHANGELOG entries on the branch already cover some commits, replace them with one unified entry for the new version
- Categorize changes into applicable sections:
- `### Added` — new features
- `### Changed` — changes to existing functionality
- `### Fixed` — bug fixes
- `### Removed` — removed features
- Write concise, descriptive bullet points
- Insert after the file header (line 5), dated today
- Format: `## [X.Y.Z.W] - YYYY-MM-DD`
6. **Cross-check:** Compare your CHANGELOG entry against the commit list from step 2.
Every commit must map to at least one bullet point. If any commit is unrepresented,
add it now. If the branch has N commits spanning K themes, the CHANGELOG must
reflect all K themes.
**Do NOT ask the user to describe changes.** Infer from the diff and commit history.
{{CHANGELOG_WORKFLOW}}
---
@@ -577,7 +555,17 @@ If `SCOPE_FRONTEND=false` or browse is not available, skip this step silently.
## Step 7: Push
Push to the remote with upstream tracking:
**Idempotency check:** Check if the branch is already pushed and up to date.
```bash
git fetch origin <branch-name> 2>/dev/null
LOCAL=$(git rev-parse HEAD)
REMOTE=$(git rev-parse origin/<branch-name> 2>/dev/null || echo "none")
echo "LOCAL: $LOCAL REMOTE: $REMOTE"
[ "$LOCAL" = "$REMOTE" ] && echo "ALREADY_PUSHED" || echo "PUSH_NEEDED"
```
If `ALREADY_PUSHED`, skip the push. Otherwise push with upstream tracking:
```bash
git push -u origin <branch-name>
@@ -587,7 +575,21 @@ git push -u origin <branch-name>
## Step 8: Create PR/MR
Create a pull request (GitHub) or merge request (GitLab) using the platform detected in Step 0.
**Idempotency check:** Check if a PR/MR already exists for this branch.
**If GitHub:**
```bash
gh pr view --json url,number,state -q 'if .state == "OPEN" then "PR #\(.number): \(.url)" else "NO_PR" end' 2>/dev/null || echo "NO_PR"
```
**If GitLab:**
```bash
glab mr view -F json 2>/dev/null | jq -r 'if .state == "opened" then "MR_EXISTS" else "NO_MR" end' 2>/dev/null || echo "NO_MR"
```
If an **open** PR/MR already exists: **update** the PR body with the latest test results, coverage, and review findings using `gh pr edit --body "..."` (GitHub) or `glab mr update -d "..."` (GitLab). Print the existing URL and continue to Step 8.5.
If no PR/MR exists: create a pull request (GitHub) or merge request (GitLab) using the platform detected in Step 0.
The PR/MR body should contain these sections:
@@ -619,6 +621,10 @@ you missed it.>
<If no Greptile comments found: "No Greptile comments.">
<If no PR existed during Step 3.75: omit this section entirely>
## Scope Drift
<If scope drift ran: "Scope Check: CLEAN" or list of drift/creep findings>
<If no scope drift: omit this section>
## Plan Completion
<If plan file found: completion checklist summary from Step 3.45>
<If no plan file: "No plan file detected.">