Merge remote-tracking branch 'origin/main' into garrytan/usage-telemetry

# Conflicts:
#	SKILL.md
#	TODOS.md
#	browse/SKILL.md
#	design-consultation/SKILL.md
#	design-review/SKILL.md
#	document-release/SKILL.md
#	plan-ceo-review/SKILL.md
#	plan-design-review/SKILL.md
#	plan-eng-review/SKILL.md
#	qa-only/SKILL.md
#	qa/SKILL.md
#	retro/SKILL.md
#	retro/SKILL.md.tmpl
#	review/SKILL.md
#	scripts/gen-skill-docs.ts
#	setup-browser-cookies/SKILL.md
#	ship/SKILL.md
This commit is contained in:
Garry Tan
2026-03-19 00:50:11 -07:00
81 changed files with 8178 additions and 609 deletions
+133 -22
View File
@@ -3,6 +3,7 @@ name: ship
version: 1.0.0
description: |
Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. Use when asked to "ship", "deploy", "push to main", "create a PR", or "merge and push".
Proactively suggest when the user says code is ready or asks about deploying.
allowed-tools:
- Bash
- Read
@@ -39,7 +40,8 @@ _SESSION_ID="$$-$(date +%s)"
echo "TELEMETRY: ${_TEL:-off}"
echo "TEL_PROMPTED: $_TEL_PROMPTED"
mkdir -p ~/.gstack/analytics
for _PF in ~/.gstack/analytics/.pending-* 2>/dev/null; do [ -f "$_PF" ] && ~/.claude/skills/gstack/bin/gstack-telemetry-log --event-type skill_run --skill _pending_finalize --outcome unknown --session-id "$_SESSION_ID" 2>/dev/null || true; break; done
echo '{"skill":"ship","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","repo":"'$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null || echo "unknown")'"}' >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true
for _PF in ~/.gstack/analytics/.pending-*; do [ -f "$_PF" ] && ~/.claude/skills/gstack/bin/gstack-telemetry-log --event-type skill_run --skill _pending_finalize --outcome unknown --session-id "$_SESSION_ID" 2>/dev/null || true; break; done
```
If `PROACTIVE` is `"false"`, do not proactively suggest gstack skills — only invoke
@@ -154,13 +156,37 @@ Hey gstack team — ran into this while using /{skill-name}:
Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
## Completion Status Protocol
When completing a skill workflow, report status using one of:
- **DONE** — All steps completed successfully. Evidence provided for each claim.
- **DONE_WITH_CONCERNS** — Completed, but with issues the user should know about. List each concern.
- **BLOCKED** — Cannot proceed. State what is blocking and what was tried.
- **NEEDS_CONTEXT** — Missing information required to continue. State exactly what you need.
### Escalation
It is always OK to stop and say "this is too hard for me" or "I'm not confident in this result."
Bad work is worse than no work. You will not be penalized for escalating.
- If you have attempted a task 3 times without success, STOP and escalate.
- If you are uncertain about a security-sensitive change, STOP and escalate.
- If the scope of work exceeds what you can verify, STOP and escalate.
Escalation format:
```
STATUS: BLOCKED | NEEDS_CONTEXT
REASON: [1-2 sentences]
ATTEMPTED: [what you tried]
RECOMMENDATION: [what the user should do next]
```
## Telemetry (run last)
After the skill workflow completes (success, error, or abort), write the .pending marker
with the actual skill name, then log the telemetry event. Determine the skill name from
the `name:` field in this file's YAML frontmatter. Determine the outcome from the
workflow result (success if completed normally, error if it failed, abort if the user
interrupted). Run this bash:
After the skill workflow completes (success, error, or abort), log the telemetry event.
Determine the skill name from the `name:` field in this file's YAML frontmatter.
Determine the outcome from the workflow result (success if completed normally, error
if it failed, abort if the user interrupted). Run this bash:
```bash
_TEL_END=$(date +%s)
@@ -236,13 +262,10 @@ You are running the `/ship` workflow. This is a **non-interactive, fully automat
After completing the review, read the review log and config to display the dashboard.
```bash
eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
cat ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_REVIEWS"
echo "---CONFIG---"
~/.claude/skills/gstack/bin/gstack-config get skip_eng_review 2>/dev/null || echo "false"
~/.claude/skills/gstack/bin/gstack-review-read
```
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite). Ignore entries with timestamps older than 7 days. For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, codex-review). Ignore entries with timestamps older than 7 days. For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
```
+====================================================================+
@@ -253,6 +276,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
| Eng Review | 1 | 2026-03-16 15:00 | CLEAR | YES |
| CEO Review | 0 | — | — | no |
| Design Review | 0 | — | — | no |
| Codex Review | 0 | — | — | no |
+--------------------------------------------------------------------+
| VERDICT: CLEARED — Eng Review passed |
+====================================================================+
@@ -262,18 +286,25 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
- **Eng Review (required by default):** The only review that gates shipping. Covers architecture, code quality, tests, performance. Can be disabled globally with \`gstack-config set skip_eng_review true\` (the "don't bother me" setting).
- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
- **Codex Review (optional):** Independent second opinion from OpenAI Codex CLI. Shows pass/fail gate. Recommend for critical code changes where a second AI perspective adds value. Skip when Codex CLI is not installed.
**Verdict logic:**
- **CLEARED**: Eng Review has >= 1 entry within 7 days with status "clean" (or \`skip_eng_review\` is \`true\`)
- **NOT CLEARED**: Eng Review missing, stale (>7 days), or has open issues
- CEO and Design reviews are shown for context but never block shipping
- CEO, Design, and Codex reviews are shown for context but never block shipping
- If \`skip_eng_review\` config is \`true\`, Eng Review shows "SKIPPED (global)" and verdict is CLEARED
**Staleness detection:** After displaying the dashboard, check if any existing reviews may be stale:
- Parse the \`---HEAD---\` section from the bash output to get the current HEAD commit hash
- For each review entry that has a \`commit\` field: compare it against the current HEAD. If different, count elapsed commits: \`git rev-list --count STORED_COMMIT..HEAD\`. Display: "Note: {skill} review from {date} may be stale — {N} commits since review"
- For entries without a \`commit\` field (legacy entries): display "Note: {skill} review from {date} has no commit tracking — consider re-running for accurate staleness detection"
- If all reviews match the current HEAD, do not display any staleness notes
If the Eng Review is NOT "CLEAR":
1. **Check for a prior override on this branch:**
```bash
eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
grep '"skill":"ship-review-override"' ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_OVERRIDE"
```
If an override exists, display the dashboard and note "Review gate previously accepted — continuing." Do NOT ask again.
@@ -283,11 +314,11 @@ If the Eng Review is NOT "CLEAR":
- RECOMMENDATION: Choose C if the change is obviously trivial (< 20 lines, typo fix, config-only); Choose B for larger changes
- Options: A) Ship anyway B) Abort — run /plan-eng-review first C) Change is too small to need eng review
- If CEO Review is missing, mention as informational ("CEO Review not run — recommended for product changes") but do NOT block
- For Design Review: run `eval $(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 3.5, but consider running /design-review for a full visual audit post-implementation." Still never block.
- For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 3.5, but consider running /design-review for a full visual audit post-implementation." Still never block.
3. **If the user chooses A or C,** persist the decision so future `/ship` runs on this branch skip the gate:
```bash
eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
echo '{"skill":"ship-review-override","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","decision":"USER_CHOICE"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
```
Substitute USER_CHOICE with "ship_anyway" or "not_relevant".
@@ -704,7 +735,7 @@ Review the diff for structural issues that tests don't catch.
Check if the diff touches frontend files using `gstack-diff-scope`:
```bash
eval $(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)
source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)
```
**If `SCOPE_FRONTEND=false`:** Skip design review silently. No output.
@@ -727,12 +758,10 @@ eval $(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)
6. **Log the result** for the Review Readiness Dashboard:
```bash
eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
mkdir -p ~/.gstack/projects/$SLUG
echo '{"skill":"design-review-lite","timestamp":"TIMESTAMP","status":"STATUS","findings":N,"auto_fixed":M}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"design-review-lite","timestamp":"TIMESTAMP","status":"STATUS","findings":N,"auto_fixed":M,"commit":"COMMIT"}'
```
Substitute: TIMESTAMP = ISO 8601 datetime, STATUS = "clean" if 0 findings or "issues_found", N = total findings, M = auto-fixed count.
Substitute: TIMESTAMP = ISO 8601 datetime, STATUS = "clean" if 0 findings or "issues_found", N = total findings, M = auto-fixed count, COMMIT = output of `git rev-parse --short HEAD`.
Include any design findings alongside the code review findings. They follow the same Fix-First flow below.
@@ -799,6 +828,44 @@ For each classified comment:
---
## Step 3.8: Codex second opinion (optional)
Check if the Codex CLI is available:
```bash
which codex 2>/dev/null && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
```
If Codex is available, use AskUserQuestion:
```
Pre-landing review complete. Want an independent Codex (OpenAI) review before shipping?
A) Run Codex code review — independent diff review with pass/fail gate
B) Run Codex adversarial challenge — try to break this code
C) Skip — ship without Codex review
```
If the user chooses A or B:
**For code review (A):** Run `codex review --base <base>` with a 5-minute timeout.
Present the full output verbatim under a `CODEX SAYS:` header. Check for `[P1]` markers
to determine pass/fail gate. Persist the result:
```bash
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-review","timestamp":"TIMESTAMP","status":"STATUS","gate":"GATE"}'
```
If GATE is FAIL, use AskUserQuestion: "Codex found critical issues. Ship anyway?"
If the user says no, stop. If yes, continue to Step 4.
**For adversarial (B):** Run codex exec with the adversarial prompt (see /codex skill).
Present findings. This is informational — does not block shipping.
If Codex is not available, skip silently. Continue to Step 4.
---
## Step 4: Version bump (auto-decide)
1. Read the current `VERSION` file (4-digit format: `MAJOR.MINOR.PATCH.MICRO`)
@@ -933,6 +1000,28 @@ EOF
---
## Step 6.5: Verification Gate
**IRON LAW: NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE.**
Before pushing, re-verify if code changed during Steps 4-6:
1. **Test verification:** If ANY code changed after Step 3's test run (fixes from review findings, CHANGELOG edits don't count), re-run the test suite. Paste fresh output. Stale output from Step 3 is NOT acceptable.
2. **Build verification:** If the project has a build step, run it. Paste output.
3. **Rationalization prevention:**
- "Should work now" → RUN IT.
- "I'm confident" → Confidence is not evidence.
- "I already tested earlier" → Code changed since then. Test again.
- "It's a trivial change" → Trivial changes break production.
**If tests fail here:** STOP. Do not push. Fix the issue and return to Step 3.
Claiming work is complete without verification is dishonesty, not efficiency.
---
## Step 7: Push
Push to the remote with upstream tracking:
@@ -986,7 +1075,28 @@ EOF
)"
```
**Output the PR URL** — this should be the final output the user sees.
**Output the PR URL** — then proceed to Step 8.5.
---
## Step 8.5: Auto-invoke /document-release
After the PR is created, automatically sync project documentation. Read the
`document-release/SKILL.md` skill file (adjacent to this skill's directory) and
execute its full workflow:
1. Read the `/document-release` skill: `cat ${CLAUDE_SKILL_DIR}/../document-release/SKILL.md`
2. Follow its instructions — it reads all .md files in the project, cross-references
the diff, and updates anything that drifted (README, ARCHITECTURE, CONTRIBUTING,
CLAUDE.md, TODOS, etc.)
3. If any docs were updated, commit the changes and push to the same branch:
```bash
git add -A && git commit -m "docs: sync documentation with shipped changes" && git push
```
4. If no docs needed updating, say "Documentation is current — no updates needed."
This step is automatic. Do not ask the user for confirmation. The goal is zero-friction
doc updates — the user runs `/ship` and documentation stays current without a separate command.
---
@@ -1001,5 +1111,6 @@ EOF
- **Split commits for bisectability** — each commit = one logical change.
- **TODOS.md completion detection must be conservative.** Only mark items as completed when the diff clearly shows the work is done.
- **Use Greptile reply templates from greptile-triage.md.** Every reply includes evidence (inline diff, code references, re-rank suggestion). Never post vague replies.
- **Never push without fresh verification evidence.** If code changed after Step 3 tests, re-run before pushing.
- **Step 3.4 generates coverage tests.** They must pass before committing. Never commit failing tests.
- **The goal is: user says `/ship`, next thing they see is the review + PR URL.**
- **The goal is: user says `/ship`, next thing they see is the review + PR URL + auto-synced docs.**
+88 -5
View File
@@ -3,6 +3,7 @@ name: ship
version: 1.0.0
description: |
Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. Use when asked to "ship", "deploy", "push to main", "create a PR", or "merge and push".
Proactively suggest when the user says code is ready or asks about deploying.
allowed-tools:
- Bash
- Read
@@ -60,7 +61,7 @@ If the Eng Review is NOT "CLEAR":
1. **Check for a prior override on this branch:**
```bash
eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
grep '"skill":"ship-review-override"' ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_OVERRIDE"
```
If an override exists, display the dashboard and note "Review gate previously accepted — continuing." Do NOT ask again.
@@ -70,11 +71,11 @@ If the Eng Review is NOT "CLEAR":
- RECOMMENDATION: Choose C if the change is obviously trivial (< 20 lines, typo fix, config-only); Choose B for larger changes
- Options: A) Ship anyway B) Abort — run /plan-eng-review first C) Change is too small to need eng review
- If CEO Review is missing, mention as informational ("CEO Review not run — recommended for product changes") but do NOT block
- For Design Review: run `eval $(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 3.5, but consider running /design-review for a full visual audit post-implementation." Still never block.
- For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 3.5, but consider running /design-review for a full visual audit post-implementation." Still never block.
3. **If the user chooses A or C,** persist the decision so future `/ship` runs on this branch skip the gate:
```bash
eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
source <(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
echo '{"skill":"ship-review-override","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","decision":"USER_CHOICE"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
```
Substitute USER_CHOICE with "ship_anyway" or "not_relevant".
@@ -402,6 +403,44 @@ For each classified comment:
---
## Step 3.8: Codex second opinion (optional)
Check if the Codex CLI is available:
```bash
which codex 2>/dev/null && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
```
If Codex is available, use AskUserQuestion:
```
Pre-landing review complete. Want an independent Codex (OpenAI) review before shipping?
A) Run Codex code review — independent diff review with pass/fail gate
B) Run Codex adversarial challenge — try to break this code
C) Skip — ship without Codex review
```
If the user chooses A or B:
**For code review (A):** Run `codex review --base <base>` with a 5-minute timeout.
Present the full output verbatim under a `CODEX SAYS:` header. Check for `[P1]` markers
to determine pass/fail gate. Persist the result:
```bash
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-review","timestamp":"TIMESTAMP","status":"STATUS","gate":"GATE"}'
```
If GATE is FAIL, use AskUserQuestion: "Codex found critical issues. Ship anyway?"
If the user says no, stop. If yes, continue to Step 4.
**For adversarial (B):** Run codex exec with the adversarial prompt (see /codex skill).
Present findings. This is informational — does not block shipping.
If Codex is not available, skip silently. Continue to Step 4.
---
## Step 4: Version bump (auto-decide)
1. Read the current `VERSION` file (4-digit format: `MAJOR.MINOR.PATCH.MICRO`)
@@ -536,6 +575,28 @@ EOF
---
## Step 6.5: Verification Gate
**IRON LAW: NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE.**
Before pushing, re-verify if code changed during Steps 4-6:
1. **Test verification:** If ANY code changed after Step 3's test run (fixes from review findings, CHANGELOG edits don't count), re-run the test suite. Paste fresh output. Stale output from Step 3 is NOT acceptable.
2. **Build verification:** If the project has a build step, run it. Paste output.
3. **Rationalization prevention:**
- "Should work now" → RUN IT.
- "I'm confident" → Confidence is not evidence.
- "I already tested earlier" → Code changed since then. Test again.
- "It's a trivial change" → Trivial changes break production.
**If tests fail here:** STOP. Do not push. Fix the issue and return to Step 3.
Claiming work is complete without verification is dishonesty, not efficiency.
---
## Step 7: Push
Push to the remote with upstream tracking:
@@ -589,7 +650,28 @@ EOF
)"
```
**Output the PR URL** — this should be the final output the user sees.
**Output the PR URL** — then proceed to Step 8.5.
---
## Step 8.5: Auto-invoke /document-release
After the PR is created, automatically sync project documentation. Read the
`document-release/SKILL.md` skill file (adjacent to this skill's directory) and
execute its full workflow:
1. Read the `/document-release` skill: `cat ${CLAUDE_SKILL_DIR}/../document-release/SKILL.md`
2. Follow its instructions — it reads all .md files in the project, cross-references
the diff, and updates anything that drifted (README, ARCHITECTURE, CONTRIBUTING,
CLAUDE.md, TODOS, etc.)
3. If any docs were updated, commit the changes and push to the same branch:
```bash
git add -A && git commit -m "docs: sync documentation with shipped changes" && git push
```
4. If no docs needed updating, say "Documentation is current — no updates needed."
This step is automatic. Do not ask the user for confirmation. The goal is zero-friction
doc updates — the user runs `/ship` and documentation stays current without a separate command.
---
@@ -604,5 +686,6 @@ EOF
- **Split commits for bisectability** — each commit = one logical change.
- **TODOS.md completion detection must be conservative.** Only mark items as completed when the diff clearly shows the work is done.
- **Use Greptile reply templates from greptile-triage.md.** Every reply includes evidence (inline diff, code references, re-rank suggestion). Never post vague replies.
- **Never push without fresh verification evidence.** If code changed after Step 3 tests, re-run before pushing.
- **Step 3.4 generates coverage tests.** They must pass before committing. Never commit failing tests.
- **The goal is: user says `/ship`, next thing they see is the review + PR URL.**
- **The goal is: user says `/ship`, next thing they see is the review + PR URL + auto-synced docs.**