Claude reads its own ~/.claude/projects/.../tool-results/ files as
internal plumbing. These showed up as long unreadable paths in the
sidebar. Now: describeToolCall returns empty for tool-result reads,
and the sidebar skips rendering tool_use entries with no description.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fifth example shows the parent agent workflow: navigate, extract CSS,
write to file. The other four are all sidebar-only. This one shows
co-presence — the Claude Code session that launched the browser can
also control it directly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
README.md: updated Real browser mode and sidebar agent sections with
model routing, cookie import button, no idle timeout in headed mode.
Updated skill table entries for /browse and /open-gstack-browser.
docs/skills.md: updated /open-gstack-browser deep dive with model
routing and cookie import details.
GSTACK_BROWSER_V0.md: added 6 new SHIPPED items to implementation
status table (model routing, debug logging, idle timeout, cookie
button, arrow hint, architecture doc).
TODOS.md: marked "Sidebar agent Write tool + error visibility" as
SHIPPED. Added new P2 TODO for direct API calls to eliminate
claude -p startup tax.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
One-click cookie import from the sidebar. Navigates the headed browser
to /cookie-picker where you can select which domains to import from
your real Chrome profile.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 30-minute idle timeout only applies to headless mode now. In headed
mode the user is looking at the Chrome window, so auto-shutdown is wrong.
The browser stays alive until explicit disconnect or window close.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two fixes for the "browser died while chatting" problem:
1. /sidebar-command now calls resetIdleTimer(). Previously only CLI
commands reset it, so the server would shut down after 30 min even
while the user was actively chatting in the sidebar.
2. shutdown() now pkills the sidebar-agent daemon. Previously the agent
survived server shutdown, kept polling a dead server, and spawned
confused claude processes that auto-started headless browsers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SIDEBAR_MESSAGE_FLOW.md documents the full init timeline, message flow
(user types → claude responds), auth token chain, arrow hint signal
chain, model routing, tab concurrency, and known failure modes.
CLAUDE.md now tells you to read it before touching sidebar files.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the user sends a message and the server assigns it to a new tab
(because Chrome's active tab changed), switchChatTab() was blowing away
the optimistic user bubble and thinking dots with a welcome screen.
Now preserves the current DOM if we're mid-send with a thinking indicator.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
8 new tests verify the sidebarOpened → background → content → welcome
signal chain. Updates stale "(session ended)" test that checked for
text removed in a prior commit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously the welcome page arrow hid immediately when the extension's
content script loaded — but extension loaded ≠ sidebar open. Now the
signal flow is: sidepanel connects → tells background.js → relays to
content script → dispatches gstack-extension-ready → arrow hides.
Adds welcome-page.test.ts: 14 tests verifying arrow, branding, feature
cards, dark theme, and auto-hide behavior via real HTTP server.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: document /plan-devex-review and /devex-review in README
Add both skills to the install instructions, skills table, and a new
"Which review?" comparison table showing when to use each review type.
* feat: add /plan-devex-review to /autoplan as conditional Phase 3.5
Auto-detects developer-facing scope (API, CLI, SDK, shell, agent, MCP,
OpenClaw) and runs DX review with dual adversarial voices after Eng review.
* chore: bump version and changelog (v0.15.4.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add DX framework resolver for shared principles and scoring rubric
New {{DX_FRAMEWORK}} resolver provides compact (~150 lines) shared content
for /plan-devex-review and /devex-review: Addy Osmani's 8 DX principles,
7 characteristics table, 10 cognitive patterns, scoring rubric, and TTHW
benchmarks. Hall of Fame examples loaded on-demand per pass to avoid bloat.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add DX Review row to review dashboard
Adds plan-devex-review and devex-review schema entries to the review
dashboard resolver and placeholder table in the preamble. All existing
SKILL.md files regenerated to include the new DX Review row.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: /plan-devex-review skill — DX plan review with Osmani framework
Plan-stage developer experience review. Rates 8 DX dimensions 0-10:
getting started, API/CLI/SDK design, error messages, docs, upgrade path,
dev environment, community, and DX measurement. Includes developer empathy
simulation, auto-detect product type with applicability gate, DX scorecard
with trend tracking, and a conditional Claude Code Skill DX checklist.
Hall of Fame examples loaded on-demand per pass from dx-hall-of-fame.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: /devex-review skill — live DX audit with browse
Live-system developer experience audit using browse tool. Tests all 8
dimensions aligned with /plan-devex-review for boomerang comparison
(plan said 3 min TTHW, reality says 8). Each dimension marked TESTED,
INFERRED, or N/A with evidence. Scope-aware: declares what browse can
and cannot test, falls back to file artifacts for untestable dimensions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.15.3.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comprehensive research on Slate (Random Labs) as a potential first-class
coding agent host. Includes binary analysis findings, skill discovery paths,
env vars, CLI flags, and npm package structure. Blocked on host config
refactor before implementation.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: setup runs pending migrations so git pull + ./setup works
Previously, version migrations only ran during /gstack-upgrade (Step 4.75).
Users who updated via git pull + ./setup never got migrations applied,
leaving stale skill directory structures in place. Now setup tracks the
last-run version in ~/.gstack/.last-setup-version and runs any pending
migrations automatically. Idempotent — safe to run multiple times.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: setup runs pending migrations so git pull + ./setup works
Previously, version migrations only ran during /gstack-upgrade (Step 4.75).
Users who updated via git pull + ./setup never got migrations applied,
leaving stale skill directory structures in place. Now setup tracks the
last-run version in ~/.gstack/.last-setup-version and runs any pending
migrations automatically.
Addresses adversarial review findings: space-safe while-read loop,
fresh install guard, upper-bound version check, missing VERSION guard.
* chore: bump version and changelog (v0.15.2.1)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: voice-friendly skill triggers for speech-to-text input
Add voice-triggers YAML field to 10 SKILL.md.tmpl files with natural-language
aliases (e.g. "see-so" for /cso, "tech review" for /plan-eng-review).
gen-skill-docs preprocesses voice triggers before transformFrontmatter,
folding them into the description and stripping the field from output.
Includes unit tests, README voice input section, and CONTRIBUTING.md update.
* chore: bump version and changelog (v0.14.6.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- stopAgent slice 800→1000 to accommodate added error logging lines
- Replace hardcoded opus assertion with model router assertions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Server-side silent catch blocks (22 instances) now log with [browse] prefix:
chat persistence, session save/load, agent kill, tab pin/restore, welcome
page, buffer flush, worktree cleanup, lock files, SSE streams.
Also adds pickSidebarModel() — routes sidebar messages to sonnet for
navigation/interaction (click, goto, fill, screenshot) and opus for
analysis/comprehension (summarize, describe, find bugs). Sonnet is
~4x faster for action commands with zero quality difference.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Every empty catch {} in sidepanel.js, sidebar-agent.ts now logs with
[gstack sidebar] or [sidebar-agent] prefix. Chat poll 401s, stop agent,
tab poll, clear chat, SSE parse, refs fetch, stream JSON parse, queue
read/parse, process kill — all now visible in console.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
background.js checkHealth() never refreshed authToken from /health responses,
so when the browse server restarted with a new token, all sidebar-command
requests got 401 Unauthorized forever.
Also: error placeholder text was #3f3f46 on #0C0C0C (nearly invisible).
Now shows in red to match the error border.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
5 tests: CLI checks env var before starting server, shows actionable
error, sidebar-agent sets the flag + BROWSE_PORT, guard runs before
lock acquisition to prevent stale lock files.
When set, the browse CLI refuses to start a new server and exits with
a clear error: "Server not available, run /open-gstack-browser to restart."
The sidebar agent sets this so users never get an invisible headless
browser when the headed one is closed.
Ensures the child claude process connects to the existing headed
browse server (port 34567) instead of spawning a new headless one.
Without this, sidebar chat commands run in an invisible browser.
Two issues making the sidebar look broken when it's actually working:
1. "Timed out after 300s" error displayed after agent_done — this is a
cleanup timer, not a real error. Now suppressed when no active session.
2. "(session ended)" text appended on every idle poll — removed entirely.
The thinking spinner is cleaned up silently instead.
Root cause: sidepanel asked background "are you connected?" but background's
health check hadn't succeeded yet (1-10s gap). Sidepanel waited forever.
Fix: when background says not connected, sidepanel hits /health directly
with fetch(). Gets the token from the response. Bypasses background
entirely for initial connection. Shows step-by-step debug info:
"Checking server directly... port: 34567 / Trying GET /health..."
Replace useless "Connecting..." with real-time debug info:
- "Looking for browse server... (attempt N)"
- Shows port, server responding status, token status
- Shows chrome.runtime errors if extension messaging fails
- Tells user to run /open-gstack-browser if server not found
Root cause: extension service worker starts before Bun.serve() is
listening. First checkHealth() fails, next attempt is 10 seconds
later. User stares at "Connecting..." for 10 seconds.
Fix: retry every 1s for up to 15 attempts on startup, then switch
to 10s polling once connected (or after 15s gives up). Sidebar
should connect within 1-2 seconds of server becoming available.
3 new tests verify the fast-retry → slow-poll transition.
- Show attempt count in loading screen ("Connecting... attempt 3")
- After 5 failed attempts, show debug details (port, connected, token)
so stuck users can see exactly what's failing
- Add 4 tests: getPort includes token, tryConnect uses token,
dead state exists with MAX_RECONNECT_ATTEMPTS, reconnectAttempts visible
The sidebar called tryConnect() → getPort → got {port, connected} but
NO token. All subsequent requests (SSE, chat poll) failed with 401.
The token only arrived later via the health broadcast, but by then
the SSE connection was already broken.
Fix: include authToken in the getPort response so the sidebar has
the token from its very first connection attempt.
Replace invisible text fallback with visible amber bubble + animated
right arrow (→) pointing toward where the sidebar opens. Always correct
regardless of window size (unlike the old up arrow at toolbar chrome).
Adds a "reload" button next to "debug" and "clear" in the sidebar
footer. Calls location.reload() to fully refresh the side panel,
re-run connection logic, and clear stale state.
- Replace single-attempt sidePanel.open() with autoOpenSidePanel() that
retries up to 5 times with 500ms-5000ms backoff
- Fire on both onInstalled AND every service worker startup
- Remove misaligned arrow from welcome page, replace with text fallback
- Add 12 tests: welcome page structure, /welcome endpoint, headed launch
navigation timing, sidebar auto-open retry logic, extension-ready event
- Add top-level setTimeout in background.js that fires on every service
worker startup (onInstalled only fires on install/update)
- Remove misaligned arrow from welcome page, replace with text fallback
that hides when extension content script fires gstack-extension-ready
- Add /welcome endpoint to server.ts, serves welcome.html
- Navigate to /welcome after server starts (not during launchHeaded,
which runs before the server is listening)
- welcome.html bundled in browse/src/ for portability
* fix: top-level skill dirs so Claude discovers unprefixed names
Replace directory symlinks (gstack/qa → qa) with real directories
containing a SKILL.md symlink. Claude Code auto-prefixes skills nested
under a parent dir symlink, so /plan-ceo-review became "Unknown skill"
even with skill_prefix=false. Real dirs fix this.
Also syncs package.json version to match VERSION file and updates
test assertions to match the new mkdir + ln approach.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: update symlink references to new top-level directory pattern
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test: regression tests for top-level skill directory structure
Verifies the invariant that setup/relink creates real directories (not
symlinks) at the top level, with SKILL.md symlinks inside. This prevents
Claude Code from auto-prefixing skills with gstack- when using --no-prefix.
Tests added:
- unprefixed skills must be real dirs with SKILL.md symlinks
- prefixed skills must also be real dirs with SKILL.md symlinks
- old directory symlinks get upgraded to real directories
- cleanup functions handle both old symlinks and new dir pattern
- link function removes old directory symlinks before mkdir
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test: namespace isolation tests for first install + mode switching
Verifies the core invariant: when you pick a prefix mode, ONLY that
mode's entries exist. Zero pollution from the other mode.
- first install --no-prefix: only flat names, zero gstack-* leaks
- first install --prefix: only gstack-* names, zero flat leaks
- non-TTY defaults to flat names
- switching prefix→no-prefix removes ALL gstack-* entries
- switching no-prefix→prefix removes ALL flat entries
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: upgrade migration system — versioned fix scripts for broken state
Adds gstack-upgrade/migrations/ directory with version-keyed bash scripts
that run automatically during /gstack-upgrade (Step 4.75, after ./setup).
Each script is idempotent and handles state fixes that setup alone can't
cover. First migration: v0.15.2.0.sh runs gstack-relink to fix stale
directory symlinks from pre-v0.15.2.0 installs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test: migration script validation + v0.15.2.0 end-to-end fix test
Tests that migration scripts are executable, parse without syntax errors,
follow the v{VERSION}.sh naming convention, and that v0.15.2.0 actually
fixes stale directory symlinks by converting them to real directories.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: upgrade migration guide in CONTRIBUTING.md + CLAUDE.md pointer
CONTRIBUTING.md: new "Upgrade migrations" section documenting when and
how to add migration scripts for broken on-disk state.
CLAUDE.md: added note under vendored symlink awareness pointing to
CONTRIBUTING.md's migration section when worried about broken installs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: /design-html works from any starting point — not just design-shotgun
Three routing modes: approved mockup (Case A), CEO plan or design variants
without formal approval (Case B), or clean slate with just a description
(Case C). Each mode asks the right questions via AskUserQuestion instead of
blocking with "no approved design found."
* chore: bump version and changelog (v0.15.1.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: session timeline binaries (gstack-timeline-log + gstack-timeline-read)
New binaries for the Session Intelligence Layer. gstack-timeline-log appends
JSONL events to ~/.gstack/projects/$SLUG/timeline.jsonl. gstack-timeline-read
reads, filters, and formats timeline data for /retro consumption.
Timeline is local-only project intelligence, never sent anywhere. Always-on
regardless of telemetry setting.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: preamble context recovery + timeline events + predictive suggestions
Layers 1-3 of the Session Intelligence Layer:
- Timeline start/complete events injected into every skill via preamble
- Context recovery (tier 2+): lists recent CEO plans, checkpoints, reviews
- Cross-session injection: LAST_SESSION and LATEST_CHECKPOINT for branch
- Predictive skill suggestion from recent timeline patterns
- Welcome back message synthesis
- Routing rules for /checkpoint and /health
Timeline writes are NOT gated by telemetry (local project intelligence).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: /checkpoint + /health skills (Layers 4-5)
/checkpoint: save/resume/list working state snapshots. Supports cross-branch
listing for Conductor workspace handoff. Session duration tracking.
/health: code quality scorekeeper. Wraps project tools (tsc, biome, knip,
shellcheck, tests), computes composite 0-10 score, tracks trends over time.
Auto-detects tools or reads from CLAUDE.md ## Health Stack.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate SKILL.md files + add timeline tests
9 timeline tests (all passing) mirroring learnings.test.ts pattern.
All 34 SKILL.md files regenerated with new preamble (context recovery,
timeline events, routing rules for /checkpoint and /health).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.15.0.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update self-learning roadmap post-Session Intelligence
R1-R3 marked shipped with actual versions. R4 becomes Adaptive Ceremony
(trust as separate policy engine, scope-aware, gradual degradation). R5
becomes /autoship (resumable state machine, not linear chain). R6-R7
unbundled from old R5. Added State Systems reference, Risk Register
(Codex-reviewed), and validation metrics for R4.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: E2E tests for Session Intelligence (timeline, recovery, checkpoint)
3 gate-tier E2E tests:
- timeline-event-flow: binary data flow round-trip (no LLM)
- context-recovery-artifacts: seeded artifacts appear in preamble
- checkpoint-save-resume: checkpoint file created with YAML frontmatter
Also fixes package.json version sync (0.14.6.0 → 0.15.0.0).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: remove dead contributor mode, replace with operational self-improvement slot
Contributor mode never fired in 18 days of heavy use (required manual opt-in
via gstack-config, gated behind _CONTRIB=true, wrote disconnected markdown).
Removes: generateContributorMode(), _CONTRIB bash var, 2 E2E tests, touchfile
entry, doc references. Cleans up skip-lists in plan-ceo-review, autoplan,
review resolver, and document-release templates.
The operational self-improvement system (next commit) replaces this slot with
automatic learning capture that requires no opt-in.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: operational self-improvement — every skill learns from failures
Adds universal operational learning capture to the preamble completion protocol.
At the end of every skill session, the agent reflects on CLI failures, wrong
approaches, and project quirks, logging them as type "operational" to the
learnings JSONL. Future sessions surface these automatically.
- generateCompletionStatus(ctx) now includes operational capture section
- Preamble bash shows top 3 learnings inline when count > 5
- New "operational" type in generateLearningsLog alongside pattern/pitfall/etc
- Updated unit tests + operational seed entry in learnings E2E
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: wire learnings into all insight-producing skills
Adds LEARNINGS_SEARCH and/or LEARNINGS_LOG to 10 skill templates that
produce reusable insights but were previously disconnected from the
learning system:
- office-hours, plan-ceo-review, plan-eng-review: add LOG (had SEARCH)
- plan-design-review: add both SEARCH + LOG (had neither)
- design-review, design-consultation, cso, qa, qa-only: add both
- retro: add SEARCH (had LOG)
13 skills now fully participate in the learning loop (read + write).
Every review, QA, investigation, and design session both consults prior
learnings and contributes new ones.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add operational-learning E2E test (gate-tier)
Validates the write path: agent encounters a CLI failure, logs an
operational learning to JSONL via gstack-learnings-log. Replaces the
removed contributor-mode E2E test.
Setup: temp git repo, copy bin scripts, set GSTACK_HOME.
Prompt: simulated npm test failure needing --experimental-vm-modules.
Assert: learnings.jsonl exists with type=operational entry.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: learnings-show E2E slug mismatch — seed at computed slug, not hardcoded
The test seeded learnings at projects/test-project/ but gstack-slug computes
the slug from basename(workDir) when no git remote exists. The agent's search
looked at the wrong path and found nothing.
Fix: compute slug the same way gstack-slug does (basename + sanitize) and
seed the learnings there.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.13.8.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: Session Intelligence Layer design doc
Frames gstack as the persistent brain that survives Claude's ephemeral
context window. Architecture diagram, 5-layer feature breakdown, and
research sources from claude-mem, Anthropic engineering blog, CodeScene,
and Claude Code Agent Teams.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add context intelligence, health, swarm, and refactoring to roadmap
9 new TODOS across 4 sections based on Claude Code ecosystem research:
- Context Intelligence (P1): preamble artifact recovery, session timeline,
cross-session injection, /checkpoint skill, vision doc
- Health (P1): /health dashboard with CodeScene MCP integration option,
/health as /ship quality gate
- Swarm (P2): extract Review Army into reusable multi-agent primitive
- Refactoring (P2): /refactor-prep for pre-refactor token hygiene
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Left-align all sidebar text (chat welcome, loading, empty states,
notifications, inspector empty, session placeholder)
- Dispatch 'gstack-extension-ready' CustomEvent from content.js so
the welcome page can detect when the sidebar is active
- Generated 1024px icon: dark terminal window with amber prompt cursor
- Converted to .icns with all macOS sizes (16-1024px, 1x and 2x)
- build-app.sh copies icon into both the outer .app and bundled Chromium's
Resources (Chromium's process owns the Dock icon, not the launcher)
- browser-manager.ts patches Chromium's icon at runtime for dev mode too
- Both the Dock and Cmd+Tab now show the GStack icon
Main shipped v0.14.5.0 (Ship Idempotency + Skill Prefix Fix). Our branch's
GStack Browser entry bumps to v0.14.6.0. Both entries preserved in order.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add idempotency guards to /ship Steps 4, 7, 8 (#649)
If git push succeeds but gh pr create fails, re-running /ship would
double-bump VERSION and duplicate CHANGELOG entries. Now:
- Step 4: check if VERSION already differs from base branch
- Step 7: fetch only the specific branch, skip push if already up to date
- Step 8: if PR exists, update body via gh pr edit instead of creating duplicate
No CHANGELOG guard needed — Step 5 is already idempotent by design
("replace existing entries with one unified entry").
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: patch name: in SKILL.md frontmatter for prefix mode (#620, #578)
./setup --prefix creates gstack-* symlinks but SKILL.md still says
name: qa, so Claude Code ignores the prefix. Now:
- New bin/gstack-patch-names shared helper patches name: field via sed
- setup calls it after link_claude_skill_dirs
- gstack-relink calls it after symlink loop
- gen-skill-docs.ts prints warning when skill_prefix is true
Edge cases: gstack-upgrade not double-prefixed, root gstack skill
never prefixed, prefix removal restores original names, SKILL.md
without frontmatter is a safe no-op.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add name patching + ship idempotency tests (#620, #649)
- 4 unit tests for name: patching in relink.test.ts (prefix on/off,
gstack-upgrade not double-prefixed, no-frontmatter no-op)
- 2 tests for gen-skill-docs prefix warning
- 1 E2E test for ship idempotency (periodic tier)
- Updated setupMockInstall to write SKILL.md with proper frontmatter
- Added ship-idempotency touchfiles + tier classification
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.14.3.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: PR idempotency checks open state, dedupe touchfiles, sync package.json
- Step 8 PR guard now checks state==OPEN so closed PRs don't prevent
new PR creation (adversarial review finding)
- Remove duplicate ship-idempotency entry in E2E_TOUCHFILES
- Sync package.json version to 0.14.3.0
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: patch name: before creating symlinks to fix --no-prefix ordering bug
gstack-patch-names must run BEFORE link_claude_skill_dirs so symlink
names reflect the correct (patched) name: values. Previously, switching
from --prefix to --no-prefix would read stale gstack-* names from
SKILL.md and create wrong symlinks. (Codex adversarial finding)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>