Merge origin/main into garrytan/slim-gstack-skills

VERSION → 1.15.0.0 (MINOR bump on top of main's v1.14.0.0). Branch's
v1.13.1.0 work (preamble compression + real-PTY harness + 5 plan-mode
tests passing) consolidated with v1.15.0.0 work (6 new E2E tests on the
harness + parseNumberedOptions + budget regression utils) into a single
release entry — v1.13.1.0 never landed on main, so its content rolls
into the final shippable version per the never-orphan rule in
CLAUDE.md.

Conflicts resolved:
- VERSION: 1.13.1.0 (HEAD) + 1.14.0.0 (main) → 1.15.0.0
- package.json: matching 1.15.0.0
- CHANGELOG.md: replaced HEAD's 1.13.1.0 entry with a consolidated
  1.15.0.0 entry above main's untouched 1.14.0.0 entry. Itemized
  changes split per-version (no shared header).

CLAUDE.md adds "Scale-aware bumps — use common sense" guidance under
CHANGELOG + VERSION style. Big diffs (>2K LOC, new capability) bump
MINOR; PATCH is for fixes/small adds; MAJOR for breaking changes.
Codified after a v1.14.1.0 PATCH attempt got correctly pushed back on
for a release with ~10K lines added and ~24K lines removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Garry Tan
2026-04-26 04:53:23 -07:00
35 changed files with 3049 additions and 5120 deletions
+54 -6
@@ -225,12 +225,35 @@ When you need to interact with a browser (QA, dogfooding, cookie setup), use the
project uses.
**Sidebar architecture:** Before modifying `sidepanel.js`, `background.js`,
`content.js`, `sidebar-agent.ts`, or sidebar-related server endpoints, read
`docs/designs/SIDEBAR_MESSAGE_FLOW.md`. It documents the full initialization
timeline, message flow, auth token chain, tab concurrency model, and known
failure modes. The sidebar spans 5 files across 2 codebases (extension + server)
with non-obvious ordering dependencies. The doc exists to prevent the kind of
silent failures that come from not understanding the cross-component flow.
`content.js`, `terminal-agent.ts`, or sidebar-related server endpoints,
read `docs/designs/SIDEBAR_MESSAGE_FLOW.md`. The sidebar has one primary
surface — the **Terminal** pane (interactive `claude` PTY) — with
Activity / Refs / Inspector as debug overlays behind the footer's
`debug` toggle. The chat queue path was removed once the PTY proved out;
`sidebar-agent.ts` and the `/sidebar-command` / `/sidebar-chat` /
`/sidebar-agent/event` endpoints are gone. The doc covers the WS auth
flow, dual-token model, and threat-model boundary — silent failures
here usually trace to not understanding the cross-component flow.
**WebSocket auth uses Sec-WebSocket-Protocol, not cookies.** Browsers
can't set `Authorization` on a WebSocket upgrade, but they CAN set
`Sec-WebSocket-Protocol` via `new WebSocket(url, [token])`. The agent
reads it, validates against `validTokens`, and MUST echo the protocol
back in the upgrade response — without the echo, Chromium closes the
connection immediately. `Set-Cookie: gstack_pty=...` is kept as a
fallback for non-browser callers (the cross-port `SameSite=Strict`
cookie path doesn't survive from a chrome-extension origin).
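The handshake rule above can be sketched as a small pure function. This is a minimal illustration, not the actual `terminal-agent.ts` code: `decideUpgrade` and `UpgradeDecision` are hypothetical names, and `validTokens` is assumed here to be a `Set<string>`. It shows the two things the paragraph requires: validate the offered subprotocol against the known tokens, and return the exact value that must be echoed in the upgrade response.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not the real agent code.
interface UpgradeDecision {
  accepted: boolean;
  // Value to send back in the Sec-WebSocket-Protocol response header.
  echoProtocol?: string;
}

function decideUpgrade(
  protocolHeader: string | undefined,
  validTokens: Set<string>,
): UpgradeDecision {
  if (!protocolHeader) return { accepted: false };

  // The request header carries the offered subprotocols as a comma-separated list.
  const offered = protocolHeader
    .split(",")
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const match = offered.find((p) => validTokens.has(p));
  if (match === undefined) return { accepted: false };

  // Echo the matched protocol back; the paragraph above notes that Chromium
  // closes the connection if the echo is missing.
  return { accepted: true, echoProtocol: match };
}

// Browser side, for reference (token rides in the subprotocol list):
//   const ws = new WebSocket(url, [token]);
```

Because the token travels as a subprotocol name, it has to consist of characters that are legal there (no spaces or commas, for example); a hex or base64url token satisfies that.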
**Cross-pane PTY injection.** The toolbar's Cleanup button and the
Inspector's "Send to Code" action both pipe text into the live claude
PTY via `window.gstackInjectToTerminal(text)`, exposed by
`sidepanel-terminal.js`. No `/sidebar-command` POST — the live REPL is
the only execution surface in the sidebar now.
**`/health` MUST NOT surface any shell-grant token.** It already leaks
`AUTH_TOKEN` to localhost callers in headed mode (a v1.1+ TODO). Don't
make that worse by adding the PTY session token there. PTY auth flows
through `POST /pty-session` only.
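One way to keep this rule from regressing is a check that scans the `/health` response body for known secret values. The sketch below is an assumption about how such a guard could look; `findLeakedSecrets` is a hypothetical helper, not something in the repo, and the payload shape is made up for illustration.

```typescript
// Hypothetical regression helper: returns every secret whose value appears
// anywhere in the string values of a JSON-like payload (object keys are not scanned).
// Assumes the payload is plain JSON data with no cyclic references.
function findLeakedSecrets(payload: unknown, secrets: readonly string[]): string[] {
  const leaked: string[] = [];

  const visit = (value: unknown): void => {
    if (typeof value === "string") {
      for (const secret of secrets) {
        if (secret.length > 0 && value.includes(secret) && !leaked.includes(secret)) {
          leaked.push(secret);
        }
      }
    } else if (Array.isArray(value)) {
      for (const item of value) visit(item);
    } else if (value !== null && typeof value === "object") {
      for (const item of Object.values(value)) visit(item);
    }
  };

  visit(payload);
  return leaked;
}
```

A test could fetch `/health`, parse the JSON, and assert that `findLeakedSecrets(body, [ptySessionToken])` returns an empty array, where `ptySessionToken` stands for whatever value `POST /pty-session` issued.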
**Transport-layer security** (v1.6.0.0+). When `pair-agent` starts an ngrok tunnel,
the daemon binds two HTTP listeners: a local listener (127.0.0.1, full command
@@ -437,6 +460,31 @@ claims v1.7.0.0 as a MINOR and branch B is also a MINOR, B lands at v1.8.0.0
`bin/gstack-next-version` advances within the chosen bump level rather than
repicking the level when collisions happen.
**Scale-aware bumps — use common sense.** When the diff is big, bump MINOR (or
MAJOR), not PATCH. PATCH is for bug fixes and small additions; MINOR is for
substantial new capability or substantial reduction; MAJOR is for breaking
changes. Rough guideposts (don't treat as rules, treat as smell-checks):
- **PATCH (X.Y.Z+1.0)**: bug fix, doc tweak, small additive change, single
test/file added. Net diff under ~500 lines, no new user-facing capability.
- **MINOR (X.Y+1.0.0)**: new capability shipped (skill, harness, command, big
refactor), substantial code reduction (compression, migration), or coordinated
multi-file change. Net diff over ~2000 lines added/removed, OR a user-visible
feature you'd put in a tweet.
- **MAJOR (X+1.0.0.0)**: breaking change to public surface (CLI flag rename,
skill removed, config format changed), OR a release big enough to be the
headline of a blog post.
If you find yourself debating "is 10K added + 24K removed really a PATCH?" — it
isn't. Bump MINOR. Same for "this adds a whole new test harness with 6 new E2E
tests + helper utilities" — MINOR. The bump level is communication to the user
about what kind of release this is; don't undersell it.
When merging origin/main brings a higher VERSION, re-evaluate the bump level
against the SCALE of your branch's work, not just whether main moved forward.
If main bumped MINOR and your branch is also a substantial change, you bump
MINOR again on top (e.g., main at v1.14.0.0, your branch lands v1.15.0.0).
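The guideposts above can be expressed as a rough smell-check in code. This is a sketch under stated assumptions, not the logic of `bin/gstack-next-version`: `suggestBump`, `bumpVersion`, and `DiffSummary` are hypothetical names, the ~2000-line threshold is read as lines added plus lines removed, and anything between ~500 and ~2000 lines falls through to PATCH unless a new capability is flagged. The four-part version arithmetic follows the `X.Y.Z+1.0` / `X.Y+1.0.0` / `X+1.0.0.0` patterns listed above.

```typescript
type BumpLevel = "patch" | "minor" | "major";

// Hypothetical summary of a branch's diff; the caller decides the two flags.
interface DiffSummary {
  linesAdded: number;
  linesRemoved: number;
  breakingChange: boolean;
  newCapability: boolean;
}

// Smell-check only: mirrors the rough guideposts, not a hard rule.
function suggestBump(d: DiffSummary): BumpLevel {
  if (d.breakingChange) return "major";
  const churn = d.linesAdded + d.linesRemoved; // assumption: added + removed
  if (d.newCapability || churn > 2000) return "minor";
  return "patch";
}

// Applies a bump to a four-part version string such as "1.14.0.0".
function bumpVersion(version: string, level: BumpLevel): string {
  const m = /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (m === null) {
    throw new Error(`expected a four-part version, got "${version}"`);
  }
  const major = Number(m[1]);
  const minor = Number(m[2]);
  const patch = Number(m[3]);

  if (level === "major") return `${major + 1}.0.0.0`;
  if (level === "minor") return `${major}.${minor + 1}.0.0`;
  return `${major}.${minor}.${patch + 1}.0`;
}
```

For the case described in this commit, a branch with roughly 10K lines added and 24K removed yields `"minor"`, and `bumpVersion("1.14.0.0", "minor")` gives `"1.15.0.0"`.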
**VERSION and CHANGELOG are branch-scoped.** Every feature branch that ships gets its
own version bump and CHANGELOG entry. The entry describes what THIS branch adds —
not what was already on main.