| name | version | description | allowed-tools |
|---|---|---|---|
| ship | 1.0.0 | Ship workflow: merge main, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. | |
Ship: Fully Automated Ship Workflow
You are running the /ship workflow. This is a non-interactive, fully automated workflow. Do NOT ask for confirmation at any step. The user said /ship which means DO IT. Run straight through and output the PR URL at the end.
Only stop for:
- On main branch (abort)
- Merge conflicts that can't be auto-resolved (stop, show conflicts)
- Test failures (stop, show failures)
- Pre-landing review finds CRITICAL issues and user chooses to fix (not acknowledge or skip)
- MINOR or MAJOR version bump needed (ask — see Step 4)
- Greptile review comments that need user decision (complex fixes, false positives)
Never stop for:
- Uncommitted changes (always include them)
- Version bump choice (auto-pick MICRO or PATCH — see Step 4)
- CHANGELOG content (auto-generate from diff)
- Commit message approval (auto-commit)
- Multi-file changesets (auto-split into bisectable commits)
Step 1: Pre-flight
1. Check the current branch. If on main, abort: "You're on main. Ship from a feature branch."
2. Run git status (never use -uall). Uncommitted changes are always included, so there is no need to ask.
3. Run git diff main...HEAD --stat and git log main..HEAD --oneline to understand what's being shipped.
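The branch check in item 1 could be sketched as a small guard. This is a minimal illustration assuming a POSIX shell; the function name `guard_branch` is hypothetical, and the branch name is passed in as an argument so the logic does not depend on a live repository.

```shell
# Hypothetical helper (not part of the workflow's tooling): refuse to ship from main.
# $1 = current branch name, e.g. the output of: git rev-parse --abbrev-ref HEAD
guard_branch() {
  if [ "$1" = "main" ]; then
    echo "You're on main. Ship from a feature branch."
    return 1
  fi
  return 0
}
```

A caller might run `guard_branch "$(git rev-parse --abbrev-ref HEAD)" || exit 1` before anything else.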
Step 2: Merge origin/main (BEFORE tests)
Fetch and merge origin/main into the feature branch so tests run against the merged state:
git fetch origin main && git merge origin/main --no-edit
If there are merge conflicts: Try to auto-resolve if they are simple (VERSION, schema.rb, CHANGELOG ordering). If conflicts are complex or ambiguous, STOP and show them.
If already up to date: Continue silently.
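One way to decide whether conflicts count as "simple" is to compare the conflicted paths against the files named above. The sketch below assumes a POSIX shell; `conflicts_are_simple` is a hypothetical name, and matching schema.rb at any depth is an assumption about the project layout.

```shell
# Hypothetical helper: reads conflicted paths on stdin (one per line) and
# succeeds only if every path is one of the files listed as simple.
# A possible input source: git diff --name-only --diff-filter=U
conflicts_are_simple() {
  while IFS= read -r path; do
    case "$path" in
      VERSION|CHANGELOG.md|schema.rb|*/schema.rb) ;;  # candidates for auto-resolution
      *) echo "complex conflict: $path"; return 1 ;;  # anything else: stop and show
    esac
  done
  return 0
}
```

Passing this check only says the files are candidates for auto-resolution; the resolution itself still has to be done and verified.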
Step 3: Run tests (on merged code)
Do NOT run RAILS_ENV=test bin/rails db:migrate — bin/test-lane already calls
db:test:prepare internally, which loads the schema into the correct lane database.
Running bare test migrations without INSTANCE hits an orphan DB and corrupts structure.sql.
Run both test suites in parallel:
bin/test-lane 2>&1 | tee /tmp/ship_tests.txt &
npm run test 2>&1 | tee /tmp/ship_vitest.txt &
wait
After both complete, read the output files and check pass/fail.
If any test fails: Show the failures and STOP. Do not proceed.
If all pass: Continue silently — just note the counts briefly.
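A bare wait returns 0 once all background jobs finish, so the commands above rely on reading the log files for pass/fail. A variant that also keeps each suite's exit status is sketched here, assuming a POSIX shell; `run_both` is a hypothetical helper, and unlike tee it writes only to the log files.

```shell
# Hypothetical helper: run two commands concurrently, log each one,
# and fail if either command fails.
# $1, $2 = command strings; $3, $4 = log file paths
run_both() {
  sh -c "$1" > "$3" 2>&1 &
  pid_a=$!
  sh -c "$2" > "$4" 2>&1 &
  pid_b=$!
  status_a=0
  status_b=0
  # wait <pid> reports that job's exit status, unlike a bare wait
  wait "$pid_a" || status_a=$?
  wait "$pid_b" || status_b=$?
  [ "$status_a" -eq 0 ] && [ "$status_b" -eq 0 ]
}
```

For this workflow the call would look like `run_both "bin/test-lane" "npm run test" /tmp/ship_tests.txt /tmp/ship_vitest.txt`.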
Step 3.25: Eval Suites (conditional)
Evals are mandatory when prompt-related files change. Skip this step entirely if no prompt files are in the diff.
1. Check if the diff touches prompt-related files:
git diff origin/main --name-only
Match against these patterns (from CLAUDE.md):
- app/services/*_prompt_builder.rb
- app/services/*_generation_service.rb, *_writer_service.rb, *_designer_service.rb
- app/services/*_evaluator.rb, *_scorer.rb, *_classifier_service.rb, *_analyzer.rb
- app/services/concerns/*voice*.rb, *writing*.rb, *prompt*.rb, *token*.rb
- app/services/chat_tools/*.rb, app/services/x_thread_tools/*.rb
- config/system_prompts/*.txt
- test/evals/**/* (eval infrastructure changes affect all suites)
If no matches: Print "No prompt-related files changed — skipping evals." and continue to Step 3.5.
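The pattern match can be expressed as a shell case statement. This is a sketch under two assumptions: the bare patterns in the list (such as *_writer_service.rb) are taken to live in the same directory as the first pattern on their line, and `is_prompt_file` is a hypothetical name.

```shell
# Hypothetical helper: succeed if a changed path matches a prompt-related pattern.
# Note: in a case pattern, * also matches "/", so nested paths match too.
is_prompt_file() {
  case "$1" in
    app/services/concerns/*voice*.rb|app/services/concerns/*writing*.rb) return 0 ;;
    app/services/concerns/*prompt*.rb|app/services/concerns/*token*.rb) return 0 ;;
    app/services/chat_tools/*.rb|app/services/x_thread_tools/*.rb) return 0 ;;
    app/services/*_prompt_builder.rb) return 0 ;;
    app/services/*_generation_service.rb|app/services/*_writer_service.rb) return 0 ;;
    app/services/*_designer_service.rb) return 0 ;;
    app/services/*_evaluator.rb|app/services/*_scorer.rb) return 0 ;;
    app/services/*_classifier_service.rb|app/services/*_analyzer.rb) return 0 ;;
    config/system_prompts/*.txt|test/evals/*) return 0 ;;
  esac
  return 1
}
```

Each path printed by git diff origin/main --name-only can be passed to the function; evals are needed as soon as one path matches.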
2. Identify affected eval suites:
Each eval runner (test/evals/*_eval_runner.rb) declares PROMPT_SOURCE_FILES listing which source files affect it. Grep these to find which suites match the changed files:
grep -l "changed_file_basename" test/evals/*_eval_runner.rb
Map runner → test file: post_generation_eval_runner.rb → post_generation_eval_test.rb.
Special cases:
- Changes to test/evals/judges/*.rb, test/evals/support/*.rb, or test/evals/fixtures/ affect ALL suites that use those judges/support files. Check imports in the eval test files to determine which.
- Changes to config/system_prompts/*.txt: grep eval runners for the prompt filename to find affected suites.
- If unsure which suites are affected, run ALL suites that could plausibly be impacted. Over-testing is better than missing a regression.
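The grep and the runner-to-test mapping can be combined into two small helpers. This is a sketch assuming a POSIX shell; `affected_runners` and `runner_to_test` are hypothetical names, and the runner directory is passed as a parameter so the helper is not tied to test/evals.

```shell
# Hypothetical helper: list runners in directory $2 that mention changed file $1.
# -F treats the basename as a fixed string, so the dot in ".rb" is literal.
affected_runners() {
  grep -lF -- "$(basename "$1")" "$2"/*_eval_runner.rb 2>/dev/null || true
}

# Hypothetical helper: map a runner path to its test file path,
# e.g. post_generation_eval_runner.rb -> post_generation_eval_test.rb
runner_to_test() {
  echo "${1%_runner.rb}_test.rb"
}
```

Matching on the basename alone can over-select when two source files share a name, which is consistent with the guidance to over-test rather than miss a regression.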
3. Run affected suites at EVAL_JUDGE_TIER=full:
/ship is a pre-merge gate, so always use full tier (Sonnet structural + Opus persona judges).
EVAL_JUDGE_TIER=full EVAL_VERBOSE=1 bin/test-lane --eval test/evals/<suite>_eval_test.rb 2>&1 | tee /tmp/ship_evals.txt
If multiple suites need to run, run them sequentially (each needs a test lane). If the first suite fails, stop immediately — don't burn API cost on remaining suites.
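The run-sequentially, stop-on-first-failure rule could be sketched as a loop. `run_evals_sequentially` is a hypothetical name, and the `EVAL_CMD` override is an assumption added only so the loop can be exercised without bin/test-lane; logging through tee is left out to keep the exit status visible.

```shell
# Hypothetical helper: run eval suites one at a time at full tier and
# stop at the first failure so later suites incur no API cost.
# Args: suite names, e.g. post_generation
run_evals_sequentially() {
  for suite in "$@"; do
    echo "running $suite"
    if ! EVAL_JUDGE_TIER=full EVAL_VERBOSE=1 \
         ${EVAL_CMD:-bin/test-lane --eval} "test/evals/${suite}_eval_test.rb"; then
      echo "FAILED $suite (remaining suites skipped)"
      return 1
    fi
  done
  return 0
}
```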
4. Check results:
- If any eval fails: Show the failures, the cost dashboard, and STOP. Do not proceed.
- If all pass: Note pass counts and cost. Continue to Step 3.5.
5. Save eval output — include eval results and cost dashboard in the PR body (Step 8).
Tier reference (for context — /ship always uses full):
| Tier | When | Speed (cached) | Cost |
|---|---|---|---|
| fast (Haiku) | Dev iteration, smoke tests | ~5s (14x faster) | ~$0.07/run |
| standard (Sonnet) | Default dev, bin/test-lane --eval | ~17s (4x faster) | ~$0.37/run |
| full (Opus persona) | /ship and pre-merge | ~72s (baseline) | ~$1.27/run |
Step 3.5: Pre-Landing Review
Review the diff for structural issues that tests don't catch.
1. Read .claude/skills/review/checklist.md. If the file cannot be read, STOP and report the error.
2. Run git diff origin/main to get the full diff (scoped to feature changes against the freshly-fetched remote main).
3. Apply the review checklist in two passes:
   - Pass 1 (CRITICAL): SQL & Data Safety, LLM Output Trust Boundary
   - Pass 2 (INFORMATIONAL): All remaining categories
4. Always output ALL findings, both critical and informational. The user must see every issue found.
5. Output a summary header: Pre-Landing Review: N issues (X critical, Y informational)
6. If CRITICAL issues found: For EACH critical issue, use a separate AskUserQuestion with:
   - The problem (file:line + description)
   - Your recommended fix
   - Options: A) Fix it now (recommended), B) Acknowledge and ship anyway, C) It's a false positive (skip)
   After resolving all critical issues: if the user chose A (fix) on any issue, apply the recommended fixes, then commit only the fixed files by name (git add <fixed-files> && git commit -m "fix: apply pre-landing review fixes"), then STOP and tell the user to run /ship again to re-test with the fixes applied. If the user chose only B (acknowledge) or C (false positive) on all issues, continue with Step 4.
7. If only non-critical issues found: Output them and continue. They will be included in the PR body at Step 8.
8. If no issues found: Output "Pre-Landing Review: No issues found." and continue.
Save the review output — it goes into the PR body in Step 8.
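The summary header and the decision that follows the critical-issue questions could be sketched as below. Both function names are hypothetical, and the action strings are illustrative wording rather than output the workflow defines.

```shell
# Hypothetical helper: $1 = critical count, $2 = informational count
review_summary() {
  total=$(( $1 + $2 ))
  if [ "$total" -eq 0 ]; then
    echo "Pre-Landing Review: No issues found."
  else
    echo "Pre-Landing Review: $total issues ($1 critical, $2 informational)"
  fi
}

# Hypothetical helper: args = the user's choice (A, B, or C) per critical issue.
# Any A means fixes get committed and /ship must be run again.
next_action() {
  for choice in "$@"; do
    if [ "$choice" = "A" ]; then
      echo "stop: commit fixes and rerun /ship"
      return 0
    fi
  done
  echo "continue to Step 4"
}
```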
Step 3.75: Address Greptile review comments (if PR exists)
Read .claude/skills/review/greptile-triage.md and follow the fetch, filter, and classify steps.
If no PR exists, gh fails, API returns an error, or there are zero Greptile comments: Skip this step silently. Continue to Step 4.
If Greptile comments are found:
Include a Greptile summary in your output: + N Greptile comments (X valid, Y fixed, Z FP)
For each classified comment:
VALID & ACTIONABLE: Use AskUserQuestion with:
- The comment (file:line or [top-level] + body summary + permalink URL)
- Your recommended fix
- Options: A) Fix now (recommended), B) Acknowledge and ship anyway, C) It's a false positive
- If user chooses A: apply the fix, commit the fixed files (git add <fixed-files> && git commit -m "fix: address Greptile review — <brief description>"), reply to the comment ("Fixed in <commit-sha>."), and save to ~/.gstack/greptile-history.md (type: fix).
- If user chooses C: reply explaining the false positive, save to history (type: fp).
VALID BUT ALREADY FIXED: Reply acknowledging the catch — no AskUserQuestion needed:
- Post reply: "Good catch — already fixed in <commit-sha>."
- Save to ~/.gstack/greptile-history.md (type: already-fixed)
FALSE POSITIVE: Use AskUserQuestion:
- Show the comment and why you think it's wrong (file:line or [top-level] + body summary + permalink URL)
- Options:
- A) Reply to Greptile explaining the false positive (recommended if clearly wrong)
- B) Fix it anyway (if trivial)
- C) Ignore silently
- If user chooses A: post reply using the appropriate API from the triage doc, save to history (type: fp)
SUPPRESSED: Skip silently — these are known false positives from previous triage.
After all comments are resolved: If any fixes were applied, the tests from Step 3 are now stale. Re-run tests (Step 3) before continuing to Step 4. If no fixes were applied, continue to Step 4.
Step 4: Version bump (auto-decide)
1. Read the current VERSION file (4-digit format: MAJOR.MINOR.PATCH.MICRO)
2. Auto-decide the bump level based on the diff:
   - Count lines changed (git diff origin/main...HEAD --stat | tail -1)
   - MICRO (4th digit): < 50 lines changed, trivial tweaks, typos, config
   - PATCH (3rd digit): 50+ lines changed, bug fixes, small-medium features
   - MINOR (2nd digit): ASK the user. Only for major features or significant architectural changes
   - MAJOR (1st digit): ASK the user. Only for milestones or breaking changes
3. Compute the new version:
   - Bumping a digit resets all digits to its right to 0
   - Example: 0.19.1.0 + PATCH → 0.19.2.0
4. Write the new version to the VERSION file.
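The counting, auto-decision, and bump arithmetic above can be sketched with three small helpers. This assumes a POSIX shell with awk; all three function names are hypothetical, and the line count is taken as insertions plus deletions from the --stat summary line, which is one reasonable reading of "lines changed".

```shell
# Hypothetical helper: $1 = summary line of git diff --stat,
# e.g. " 3 files changed, 40 insertions(+), 12 deletions(-)"
lines_changed() {
  echo "$1" | awk '{
    n = 0
    for (i = 2; i <= NF; i++) if ($i ~ /^(insertion|deletion)/) n += $(i - 1)
    print n
  }'
}

# Hypothetical helper: $1 = lines changed. MINOR and MAJOR are never
# chosen automatically; those always go to the user.
auto_bump_level() {
  if [ "$1" -lt 50 ]; then echo MICRO; else echo PATCH; fi
}

# Hypothetical helper: $1 = MAJOR.MINOR.PATCH.MICRO, $2 = level to bump.
# Increments that digit and resets every digit to its right to 0.
bump_version() {
  echo "$1" | awk -F. -v level="$2" '
    BEGIN { idx["MAJOR"] = 1; idx["MINOR"] = 2; idx["PATCH"] = 3; idx["MICRO"] = 4 }
    {
      i = idx[level]
      if (NF != 4 || i == 0) exit 1   # malformed version or unknown level
      $i = $i + 1
      for (j = i + 1; j <= 4; j++) $j = 0
      printf "%d.%d.%d.%d\n", $1, $2, $3, $4
    }'
}
```

The line count is only one input; the descriptions next to each level (typos versus bug fixes, for example) still call for judgment on the nature of the change.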
Step 5: CHANGELOG (auto-generate)
1. Read CHANGELOG.md header to know the format.
2. Auto-generate the entry from ALL commits on the branch (not just recent ones):
   - Use git log main..HEAD --oneline to see every commit being shipped
   - Use git diff main...HEAD to see the full diff against main
   - The CHANGELOG entry must cover ALL changes going into the PR
   - If existing CHANGELOG entries on the branch already cover some commits, replace them with one unified entry for the new version
   - Categorize changes into applicable sections:
     - ### Added: new features
     - ### Changed: changes to existing functionality
     - ### Fixed: bug fixes
     - ### Removed: removed features
   - Write concise, descriptive bullet points
   - Insert after the file header (line 5), dated today
   - Format: ## [X.Y.Z.W] - YYYY-MM-DD
Do NOT ask the user to describe changes. Infer from the diff and commit history.
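The heading format and the insert-after-header placement could be sketched as follows, assuming a POSIX shell. Both function names are hypothetical, and the header length is a parameter so the "line 5" convention stays explicit at the call site.

```shell
# Hypothetical helper: $1 = version, $2 = date in YYYY-MM-DD
changelog_heading() {
  echo "## [$1] - $2"
}

# Hypothetical helper: print changelog $1 with the contents of entry
# file $2 inserted after line $3. Writes to stdout; the caller decides
# how to replace the original file.
insert_after_line() {
  head -n "$3" "$1"
  cat "$2"
  tail -n +"$(( $3 + 1 ))" "$1"
}
```

A caller might build the heading with `changelog_heading "$(cat VERSION)" "$(date +%Y-%m-%d)"`, write the entry to a temporary file, and redirect `insert_after_line CHANGELOG.md entry.md 5` to a new file before moving it into place.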
Step 6: Commit (bisectable chunks)
Goal: Create small, logical commits that work well with git bisect and help LLMs understand what changed.
1. Analyze the diff and group changes into logical commits. Each commit should represent one coherent change: not one file, but one logical unit.
2. Commit ordering (earlier commits first):
   - Infrastructure: migrations, config changes, route additions
   - Models & services: new models, services, concerns (with their tests)
   - Controllers & views: controllers, views, JS/React components (with their tests)
   - VERSION + CHANGELOG: always in the final commit
3. Rules for splitting:
   - A model and its test file go in the same commit
   - A service and its test file go in the same commit
   - A controller, its views, and its test go in the same commit
   - Migrations are their own commit (or grouped with the model they support)
   - Config/route changes can group with the feature they enable
   - If the total diff is small (< 50 lines across < 4 files), a single commit is fine
4. Each commit must be independently valid: no broken imports, no references to code that doesn't exist yet. Order commits so dependencies come first.
5. Compose each commit message:
   - First line: <type>: <summary> (type = feat/fix/chore/refactor/docs)
   - Body: brief description of what this commit contains
   - Only the final commit (VERSION + CHANGELOG) gets the version tag and co-author trailer:
git commit -m "$(cat <<'EOF'
chore: bump version and changelog (vX.Y.Z.W)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
EOF
)"
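The commit ordering from item 2 could be approximated by tagging each changed path with a group number and sorting. This is a rough sketch: `commit_group` is a hypothetical name, and the directory conventions are assumed Rails defaults rather than paths taken from this document.

```shell
# Hypothetical helper: print the commit group (1-4) for a changed path,
# or "unknown" when the path needs a manual decision.
# Tests are grouped with the code they cover.
commit_group() {
  case "$1" in
    VERSION|CHANGELOG.md) echo 4 ;;                                   # always the final commit
    db/migrate/*|config/*) echo 1 ;;                                  # infrastructure
    app/models/*|app/services/*|test/models/*|test/services/*) echo 2 ;;
    app/controllers/*|app/views/*|app/javascript/*|test/controllers/*) echo 3 ;;
    *) echo unknown ;;
  esac
}
```

Grouping by path only gives a starting order; the splitting rules above (one logical unit per commit, each commit independently valid) still decide the final boundaries.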
Step 7: Push
Push to the remote with upstream tracking:
git push -u origin <branch-name>
Step 8: Create PR
Create a pull request using gh:
gh pr create --title "<type>: <summary>" --body "$(cat <<'EOF'
## Summary
<bullet points from CHANGELOG>
## Pre-Landing Review
<findings from Step 3.5, or "No issues found.">
## Eval Results
<If evals ran: suite names, pass/fail counts, cost dashboard summary. If skipped: "No prompt-related files changed — evals skipped.">
## Greptile Review
<If Greptile comments were found: bullet list with [FIXED] / [FALSE POSITIVE] / [ALREADY FIXED] tag + one-line summary per comment>
<If no Greptile comments found: "No Greptile comments.">
<If no PR existed during Step 3.75: omit this section entirely>
## Test plan
- [x] All Rails tests pass (N runs, 0 failures)
- [x] All Vitest tests pass (N tests)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
EOF
)"
Output the PR URL — this should be the final output the user sees.
Important Rules
- Never skip tests. If tests fail, stop.
- Never skip the pre-landing review. If checklist.md is unreadable, stop.
- Never force push. Use regular git push only.
- Never ask for confirmation except for MINOR/MAJOR version bumps and CRITICAL review findings (one AskUserQuestion per critical issue with fix recommendation).
- Always use the 4-digit version format from the VERSION file.
- Date format in CHANGELOG: YYYY-MM-DD
- Split commits for bisectability: each commit = one logical change.
- The goal is: user says /ship, next thing they see is the review + PR URL.