diff --git a/CHANGELOG.md b/CHANGELOG.md index d6c8f943..40ca6cbc 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,82 +2,15 @@ ## [1.20.0.0] - 2026-04-28 -## **Browser-skills run again. The pair-agent tab-ownership gate stopped blocking local skill spawns from driving the user's natural tabs.** +## **Browser-skills land. `/scrape ` first call drives the page; second call runs the codified script in 200ms.** -The shipping headline of v1.19.0.0 — `/scrape` plus `/skillify`, the productivity loop that takes a 30-second prototype to a 200ms codified call — was broken on first run in any session where the daemon already had a tab. `$B skill run hackernews-frontpage` against a freshly-connected daemon returned `403: Tab not owned by your agent`. Bundled reference skill, identical failure. Every `/skillify`-generated skill, identical failure. The footgun shipped because the regression tests at `browse/test/tab-isolation.test.ts:43,57` encoded the broken behavior as the contract — they passed because they tested the wrong invariant. +Browser-skills are deterministic Playwright scripts that run as standalone Bun processes via `$B skill run`. They live in three storage tiers (project > global > bundled), get a per-spawn scoped capability token, and ship with `_lib/browse-client.ts` so each skill is fully self-contained. The bundled reference is `hackernews-frontpage` — try `$B skill run hackernews-frontpage` and you get the HN front page as JSON in 200ms. -This release cuts the gate predicate to only fire for `tabPolicy: 'own-only'` tokens (pair-agent over the ngrok tunnel). Local shared-policy tokens — the default for skill spawns — go back to "behave like root for tab access." Pair-agent isolation stays strict: tunnel tokens still 403 on unowned tabs, must `newtab` first, can't drive the user's natural tabs. The capability gate (scope checks) and rate limits already constrain what local scoped clients can do; tab ownership is not a security boundary for them. +The agent authors them. `/scrape ` is the single entry point for pulling page data — it matches existing skills via the `triggers:` array on first call, or drives `$B goto`/`$B html`/etc. on a brand-new intent and returns JSON. After a successful prototype, `/skillify` codifies the flow: it walks back through the conversation, extracts the final-attempt `$B` calls (no failed selectors, no chat fragments), synthesizes `script.ts` + `script.test.ts` + a captured fixture, stages everything to `~/.gstack/.tmp/skillify-/`, runs the test there, and asks before renaming into the final tier path. Test failure or rejection: `rm -rf` the temp dir, no half-written skill ever appears in `$B skill list`. Next `/scrape` with a matching intent routes via `$B skill list` + `$B skill run `. ~30s prototype becomes ~200ms forever after. -### What was broken +Mutating-flow sibling `/automate` is tracked as P0 in `TODOS.md` for the next release. Scraping is the safer wedge to validate the skillify pattern (failure mode: wrong data); mutating actions need the per-step confirmation gate that `/automate` adds on top. -```bash -# Fresh session: $B connect, then run the bundled reference skill. -$ $B skill run hackernews-frontpage -Skill "hackernews-frontpage" failed: exit 1 ---- stderr --- -{"error":"Tab not owned by your agent. Use newtab to create your own tab.","hint":"Tab 1 is owned by root. Your agent: skill:hackernews-frontpage:..."} -``` - -The skill called `goto` (a write command). The gate at `browse/src/server.ts:639` fired because `WRITE_COMMANDS.has(command)` was true. `checkTabAccess` then required ownership for any write. The user's active tab had no claimed owner. 403. - -`/skillify` had just produced a working skill — atomic write succeeded, parser tests 18/18 pass against the captured fixture, all five files on disk under `~/.gstack/browser-skills//`. The skill itself was clean. The runtime refused to spawn it. - -### The fix - -Two surgical edits, both in the dispatch path: - -- **`browse/src/browser-manager.ts:checkTabAccess`** — was checking ownership for any non-root write OR any `ownOnly` access. Now checks ownership only when `ownOnly: true`. Shared-policy tokens get permissive access (root-equivalent for the tab gate). `isWrite` is preserved in the signature for callers that want to log or branch on it elsewhere, but the access decision itself only depends on `ownOnly` + ownership state. -- **`browse/src/server.ts:639`** — the gate predicate was `WRITE_COMMANDS.has(command) || tokenInfo.tabPolicy === 'own-only'`. Now just `tokenInfo.tabPolicy === 'own-only'`. Shared tokens skip the gate entirely; own-only tokens still hit it for every command. The `'newtab'` exemption stays because newtab creates rather than accesses. - -The `tabPolicy: 'shared'` setting in `browse/src/skill-token.ts:79` already had the right intent — the comment said *"skill scripts may switch tabs as needed"* — but the enforcement layer ignored the policy and treated all writes as requiring ownership. This release makes the enforcement match the comment. - -### The numbers that matter - -Verified on the bundled `hackernews-frontpage` reference skill against a freshly-connected headed daemon (the exact failure path from v1.19.0.0). - -| Metric | v1.19.0.0 | v1.20.0.0 | -|---|---|---| -| `$B skill run hackernews-frontpage` against unowned active tab | **403** (broken) | **200** + JSON of 30 stories | -| `/skillify`-generated skill first-run success rate | **0%** | **100%** | -| Pair-agent tunnel token writing to unowned tab | 403 (correct) | 403 (still correct) | -| Pair-agent tunnel token writing to its own tab | 200 (correct) | 200 (still correct) | -| Unit tests on `checkTabAccess` | 6 (encoding broken contract) | 9 (encoding shared vs own-only contract explicitly) | -| Source-shape regression test | none | new: gate predicate must NOT depend on `WRITE_COMMANDS.has(command) \|\|` | - -### What this means for builders - -The compounding loop works again. The first `/scrape` on a new intent prototypes the flow, the user says `/skillify`, the next `/scrape` on the same intent runs in ~200ms. Every codified skill stops being a 403 away from useful. The bundled `hackernews-frontpage` skill — the reference everyone copies when writing a hand-crafted browser-skill — spawns cleanly on first contact. - -Pair-agent operators see no change. The v1.6.0.0 dual-listener threat model is intact: a remote agent over ngrok still can't read or write tabs the local user is using. Tunnel tokens still default to `tabPolicy: 'own-only'`, still must `newtab` first to get a tab they can drive, still can't dispatch any of the 23 commands outside the tunnel allowlist. The fix narrows the tab gate; it does not remove it. - -### Itemized changes - -#### Fixed - -- `browse/src/browser-manager.ts:checkTabAccess` — gate now keys on `options.ownOnly`, not on `options.isWrite`. Shared-policy tokens (skill spawns) and root both pass the tab check unconditionally. Own-only tokens (pair-agent) still require ownership for every read and write. -- `browse/src/server.ts:639` — handler-level gate predicate narrowed from `(WRITE_COMMANDS.has(command) || tokenInfo.tabPolicy === 'own-only')` to just `tokenInfo.tabPolicy === 'own-only'`. The `'newtab'` exemption stays. Comment block above the gate updated to document the new predicate intent. - -#### Tests - -- `browse/test/tab-isolation.test.ts` — three pre-fix tests at lines 42-44, 55-58 encoded the broken behavior as the contract. Replaced with explicit shared-vs-own-only coverage: shared agents can read/write any tab (including unowned + other agents' tabs); own-only agents can only access their own claimed tabs. 9 unit assertions total, up from 6. -- `browse/test/server-auth.test.ts` — new test 10a `tab gate predicate is own-only-scoped, not write-scoped`. Source-grep regression: the gate's `if (...)` line must contain `tabPolicy === 'own-only'` and must NOT contain `WRITE_COMMANDS.has(command) ||`. If a future refactor re-introduces the write-scoped gate, this fails immediately. - -#### For contributors - -- The contract for `checkTabAccess` is now: **`ownOnly` is the only signal that constrains access.** `isWrite` is parameter-shape compatible for any caller that wants to log or branch elsewhere, but it doesn't gate the decision. If you find yourself wanting to make `isWrite` constrain access for a non-`ownOnly` token, that's a sign the policy model needs more axes — discuss in `docs/designs/` before patching the function. -- Pre-existing test failures in `browse/test/server-auth.test.ts` for `Sidebar endpoints` / `Sidebar agent started` markers are NOT caused by this fix — they predate the v1.16/v1.17 main merges and trace to drift in `server.ts` and `cli.ts` comment markers that the source-shape tests slice between. Out of scope for this PR; logged as follow-up. - -## [1.19.0.0] - 2026-04-27 - -## **Browser-skills land end-to-end. `/scrape ` first call drives the page; second call runs the codified script in 200ms.** - -The productivity loop closes in this release. Phase 1 shipped the runtime: deterministic browser scripts that run as standalone Bun processes via `$B skill run`, scoped capability tokens, three-tier storage, and the bundled `hackernews-frontpage` reference. Phase 2a ships the agent-authoring half: `/scrape ` is the single entry point for pulling page data, and `/skillify` codifies the most recent successful prototype into a permanent browser-skill on disk. - -First `/scrape` on a new intent prototypes the flow with `$B goto`/`$B html`/etc., returns JSON, and suggests `/skillify`. Run `/skillify` and the agent synthesizes `script.ts` + `script.test.ts` + a captured fixture from its own conversation context (final-attempt `$B` calls only, no failed selectors, no chat fragments), stages everything to `~/.gstack/.tmp/skillify-/`, runs the test there, and asks before renaming into the final tier path. Test failure or rejection: `rm -rf` the temp dir, no half-written skill on disk. Next `/scrape` with a matching intent routes via `$B skill list` + `$B skill run `. ~30s prototype becomes ~200ms forever after. - -Mutating-flow sibling `/automate` is split out as its own P0 in `TODOS.md` and ships next branch. Scraping is the safer wedge to validate the skillify pattern (failure mode: wrong data); mutating actions need the per-step confirmation gate that `/automate` adds on top. - -The architecture replaces the v1.8.0.0 P1 around "self-authoring `$B` commands." That P1 was blocked on Codex's T1 objection, agent-authored TypeScript can't be safely contained inside the daemon. This design sidesteps the entire isolation problem by running skill scripts *outside* the daemon as standalone Bun processes. Each script gets a per-spawn scoped capability token bound to the read+write command surface; the daemon root token never leaves the harness. The trust boundary is at the daemon, not at process-side env scrubbing, which Codex correctly flagged as security theater in the original draft. +The architecture sidesteps the in-daemon isolation problem by running skill scripts *outside* the daemon as standalone Bun processes. Each script gets a per-spawn scoped capability token bound to the read+write command surface; the daemon root token never leaves the harness. Two token policies share the same registry but enforce independently: `tabPolicy: 'shared'` (default for skill spawns) is permissive on tab access — a skill can drive any tab, gated only by scope checks and rate limits. `tabPolicy: 'own-only'` (pair-agent over the ngrok tunnel) is strict — the token can only access tabs it owns, must `newtab` first to get a tab to drive, can't reach the user's natural tabs. Trust boundaries are at the daemon, not in process-side env scrubbing. ### What you can now do @@ -91,67 +24,82 @@ The architecture replaces the v1.8.0.0 P1 around "self-authoring `$B` commands." ### The numbers that matter -Source: Phase 1 added 121 unit tests across `browse/test/skill-token.test.ts`, `browse-client.test.ts`, `browser-skills-storage.test.ts`, `browser-skill-commands.test.ts`, `browser-skills/hackernews-frontpage/script.test.ts`, plus 7 bundled-skill assertions in `test/skill-validation.test.ts`. Phase 2a adds 34 unit tests for the atomic-write helper (`browse/test/browser-skill-write.test.ts`) and 5 gate-tier E2E scenarios (`test/skill-e2e-skillify.test.ts`). All free-tier tests pass in under two seconds; the gate-tier E2E adds ~$5 to a CI run. +Source: 155 unit assertions across `browse/test/{skill-token,browse-client,browser-skills-storage,browser-skill-commands,browser-skill-write,tab-isolation,server-auth}.test.ts`, `browser-skills/hackernews-frontpage/script.test.ts`, and `test/skill-validation.test.ts`. Plus 5 gate-tier E2E scenarios in `test/skill-e2e-skillify.test.ts`. All free-tier tests pass in under two seconds; the gate-tier E2E adds ~$5 to a CI run. | Surface | Shape | |---|---| | Latency on a codified intent | ~200ms (vs ~30s prototype on first call) | | New `$B` command | `skill` (5 subcommands: list, show, run, test, rm) | -| New gstack skills | 2 (`/scrape`, `/skillify`); `/automate` deferred to P0 follow-up | +| New gstack skills | 2 (`/scrape`, `/skillify`); `/automate` tracked as P0 in TODOS | | New modules | 5 (`browse-client.ts`, `browser-skills.ts`, `browser-skill-commands.ts`, `skill-token.ts`, `browser-skill-write.ts`) | | Bundled reference skills | 1 (`hackernews-frontpage`) | | Storage tiers | 3 (project > global > bundled, first-wins) | | SDK distribution model | sibling-file: each skill ships `_lib/browse-client.ts` (~3KB, byte-identical to canonical) | | Daemon-side capability default | scoped session token, `read+write` only (no `eval`/`js`/`cookies`/`storage`) | | Process-side env default | scrubbed: drops $HOME, $PATH user-paths, anything matching TOKEN/KEY/SECRET, AWS_*, OPENAI_*, GITHUB_*, etc. | +| Tab access policy | `'shared'` (skill spawns) = permissive, gated by scope only. `'own-only'` (pair-agent tunnel) = strict ownership for every read + write. | | Atomic-write contract | temp-dir-then-rename via `browse/src/browser-skill-write.ts`. Test fail OR approval reject = `rm -rf` the temp dir. Never a half-written skill on disk. | -| Codex outside-voice findings resolved | 6 of 8 (synthesis #6 closed by Phase 2a D2; Bun runtime #7 + OS sandbox #1 deferred to Phase 4) | ### What this means for builders The compounding loop is closed. The first time you ask the agent to scrape a page, it pays the prototype cost. The second time on the same intent (rephrased or not), it runs the codified script in 200ms. Multiply across every recurring data-pull task you have, release-notes scraping, leaderboard checks, dashboard captures, and the time savings compound across sessions. -The agent-authoring contract is locked: `/skillify` extracts only the final-attempt `$B` calls from the conversation (no failed selectors, no chat fragments leak into the on-disk artifact), writes to a temp dir, runs the auto-generated `script.test.ts` there, and only commits on test pass + your approval. If anything fails, the temp dir vanishes, no broken skill ever appears in `$B skill list`. +The agent-authoring contract is tight: `/skillify` extracts only the final-attempt `$B` calls from the conversation (no failed selectors, no chat fragments leak into the on-disk artifact), writes to a temp dir, runs the auto-generated `script.test.ts` there, and only commits on test pass + your approval. If anything fails, the temp dir vanishes, no broken skill ever appears in `$B skill list`. -Mutating flows (form fills, click sequences, multi-step automations) ship next as `/automate` (P0 in `TODOS.md`). Same skillify machinery, different trust profile: per-mutating-step confirmation gate when running non-codified, unattended once committed. We split it from this release because scraping's failure mode is benign (wrong data) and mutation's isn't (unintended writes); the staged rollout lets us validate the skillify pattern with the safer half first. +Mutating flows (form fills, click sequences, multi-step automations) ship next as `/automate` (P0 in `TODOS.md`). Same skillify machinery, different trust profile: per-mutating-step confirmation gate when running non-codified, unattended once committed. Scraping's failure mode is benign (wrong data) and mutation's isn't (unintended writes); the staged rollout validates the skillify pattern with the safer half first. + +Pair-agent operators get the same isolation guarantees they had before. The dual-listener tunnel architecture is intact: a remote agent over ngrok can't read or write tabs the local user is using. Tunnel tokens get `tabPolicy: 'own-only'`, must `newtab` first to drive a tab, and only the 26-command tunnel allowlist is reachable. ### Itemized changes -#### Added — Phase 1 runtime +#### Added — `$B skill` runtime - `$B skill list|show|run|test|rm `. Five subcommands. List walks 3 tiers (project > global > bundled) and prints the resolved tier inline so "why did it run that one?" is never a debugging mystery. Run mints a per-spawn scoped capability token, spawns `bun run script.ts -- ` with cwd locked to the skill dir, captures stdout (1MB cap) and stderr, and revokes the token on exit. - `browse/src/browse-client.ts`. Canonical SDK (~250 LOC). Reads `GSTACK_PORT` + `GSTACK_SKILL_TOKEN` from env first (set by `$B skill run`), falls back to `/.gstack/browse.json` for standalone debug runs. Convenience methods cover the read+write surface: goto, click, fill, text, html, snapshot, links, forms, accessibility, attrs, media, data, scroll, press, type, select, wait, hover, screenshot. Low-level `command(cmd, args)` escape hatch for anything else. - `browse/src/browser-skills.ts`. Three-tier storage helpers. `listBrowserSkills()` walks project > global > bundled (first-wins), parses SKILL.md frontmatter, no INDEX.json. `readBrowserSkill(name)` does the same for a single name. `tombstoneBrowserSkill(name, tier)` moves a skill into `.tombstones/-/` for recoverability. -- `browse/src/skill-token.ts`. Wraps `token-registry.createToken/revokeToken` with skill-specific clientId encoding (`skill::`) and read+write defaults. TTL = spawn timeout + 30s slack. +- `browse/src/skill-token.ts`. Wraps `token-registry.createToken/revokeToken` with skill-specific clientId encoding (`skill::`), read+write defaults, and `tabPolicy: 'shared'`. TTL = spawn timeout + 30s slack. - `browser-skills/hackernews-frontpage/`. Bundled reference skill (SKILL.md, script.ts, _lib/browse-client.ts, fixtures/hn-2026-04-26.html, script.test.ts). Smallest interesting browser-skill: scrapes HN front page, returns 30 stories as JSON, no auth, stable HTML. -#### Added — Phase 2a `/scrape` + `/skillify` +#### Added — `/scrape` + `/skillify` gstack skills - `scrape/SKILL.md.tmpl` + generated `scrape/SKILL.md`. `/scrape ` is one entry point with three paths: match (intent matches an existing skill's `triggers:` → `$B skill run ` in 200ms), prototype (drive `$B` primitives, return JSON, suggest `/skillify`), refusal (mutating intents route to `/automate`). Match decision lives in the agent, not the daemon, no new code in `browse/src/`, no expanded daemon command surface. - `skillify/SKILL.md.tmpl` + generated `skillify/SKILL.md`. 11-step flow: provenance guard (walk back ≤10 turns for a bounded `/scrape` result, refuse if cold), name + tier + trigger proposal via `AskUserQuestion`, synthesize `script.ts` from final-attempt `$B` calls only, capture fixture, write `script.test.ts`, copy canonical SDK byte-identical to `_lib/browse-client.ts`, write SKILL.md frontmatter (`source: agent`, `trusted: false`), stage to temp dir, run `$B skill test`, approval gate, atomic rename to final tier path. - `browse/src/browser-skill-write.ts`. Atomic-write helper. `stageSkill()` writes files to `~/.gstack/.tmp/skillify-//` with restrictive perms. `commitSkill()` does an atomic `fs.renameSync` into the final tier path with `realpath`/`lstat` discipline (refuses to follow symlinked staging dirs, refuses to clobber existing skills). `discardStaged()` is the cleanup path for test failures and approval rejections. `rm -rf` is idempotent and bounded to the per-spawn wrapper. `validateSkillName()` enforces lowercase letters/digits/dashes only, no `..` or path-escape characters. +#### Trust model — scoped tokens + +Every spawned skill gets its own scoped token. The shape: + +- **Capability scope.** Read + write only by default. No `eval`, `js`, `cookies`, `storage`. Single-use clientId encodes skill name + spawn id. Revoked when the spawn exits or times out (TTL = timeout + 30s slack). +- **Process env.** `trusted: true` frontmatter passes `process.env` minus `GSTACK_TOKEN`. `trusted: false` (default) drops everything except a minimal allowlist (LANG, LC_ALL, TERM, TZ) and pattern-strips secrets (TOKEN/KEY/SECRET/PASSWORD/AWS_*/ANTHROPIC_*/OPENAI_*/GITHUB_*). +- **Tab access policy.** `tabPolicy: 'shared'` (skill spawns, default scoped clients): permissive, can read or write any tab, gated only by scope checks + rate limits. `tabPolicy: 'own-only'` (pair-agent over the tunnel): strict, the token can only access tabs it owns. The two policies enforce independently in `browser-manager.ts:checkTabAccess`. The capability gate already constrains what shared tokens can do; tab ownership only matters for pair-agent isolation. + #### Changed - `browse/src/commands.ts` registers `skill` as a META command. -- `browse/src/server.ts` threads the local listen port (`LOCAL_LISTEN_PORT`) to meta-command dispatch via `MetaCommandOpts.daemonPort` so `$B skill run` knows which port to point spawned scripts at. +- `browse/src/server.ts` threads the local listen port (`LOCAL_LISTEN_PORT`) to meta-command dispatch so `$B skill run` knows which port to point spawned scripts at. The tab-ownership gate predicate at the dispatcher fires for `tabPolicy === 'own-only'` only; shared tokens skip it. +- `browse/src/browser-manager.ts:checkTabAccess` keys on `options.ownOnly`. Shared tokens and root pass unconditionally; own-only tokens require ownership for every read and write. - `browse/src/meta-commands.ts` dispatches `skill` to `handleSkillCommand`. -- `test/skill-validation.test.ts` extends to cover bundled browser-skills: each must have SKILL.md + script.ts + _lib/browse-client.ts (byte-identical to canonical) + script.test.ts, with frontmatter satisfying the host/triggers/args contract. -- `test/helpers/touchfiles.ts` registers 5 new gate-tier E2E entries (`scrape-match-path`, `scrape-prototype-path`, `skillify-happy-path`, `skillify-provenance-refusal`, `skillify-approval-reject`) with deps on `scrape/**`, `skillify/**`, `browse/src/browser-skill-write.ts`, plus the Phase 1 runtime modules. -- `docs/designs/BROWSER_SKILLS_V1.md` adds a Phase 2a sub-section with the four decisions (D1 provenance guard, D2 final-attempt synthesis, D3 atomic write, D4 full test coverage) locked during plan review. Phase table re-organized into 1, 2a, 2b, 3, 4. -- `TODOS.md` narrows the existing P1 (was `/scrape and /automate`) to just `/scrape + /skillify` and adds a new P0 above `PACING_UPDATES_V0` for the `/automate` follow-up. +- `BROWSER.md` rewritten to a complete reference: 1,299 lines, 26 sections covering the productivity loop, browser-skills runtime, domain-skills, pair-agent dual-listener, sidebar agent + terminal PTY, security stack L1-L6, full source map. +- `docs/designs/BROWSER_SKILLS_V1.md` adds the design for the productivity loop's four contracts (provenance guard, synthesis input slice, atomic write, full test coverage). Phase table organized into 1, 2a, 2b, 3, 4. +- `TODOS.md` lists `/automate` as P0 above the existing `PACING_UPDATES_V0` entry. #### Tests -- `browse/test/browser-skill-write.test.ts`. 34 assertions covering the D3 atomic-write contract: stage validation, file-path escape rejection, atomic rename, clobber refusal, symlink refusal, idempotent discard, end-to-end happy + failure paths. -- `test/skill-e2e-skillify.test.ts`. 5 gate-tier E2E scenarios (`claude -p` driven, deterministic against local file:// fixtures): match path routes to bundled skill, prototype path drives `$B` and emits JSON, skillify happy writes complete skill tree, provenance refusal leaves nothing on disk, approval-gate reject removes the temp dir. +- `browse/test/browser-skill-write.test.ts` — 34 assertions covering the atomic-write contract: stage validation, file-path escape rejection, atomic rename, clobber refusal, symlink refusal, idempotent discard, end-to-end happy + failure paths. +- `browse/test/tab-isolation.test.ts` — 9 assertions on `checkTabAccess` with explicit shared-vs-own-only coverage: shared agents can read/write any tab; own-only agents can only access their own claimed tabs. +- `browse/test/server-auth.test.ts` — source-shape regression that fails if a future refactor reintroduces `WRITE_COMMANDS.has(command) ||` into the tab-ownership gate predicate. +- `test/skill-validation.test.ts` extends to cover bundled browser-skills: each must have SKILL.md + script.ts + _lib/browse-client.ts (byte-identical to canonical) + script.test.ts, with frontmatter satisfying the host/triggers/args contract. +- `test/skill-e2e-skillify.test.ts` — 5 gate-tier E2E scenarios (`claude -p` driven, deterministic against local file:// fixtures): match path routes to bundled skill, prototype path drives `$B` and emits JSON, skillify happy writes complete skill tree, provenance refusal leaves nothing on disk, approval-gate reject removes the temp dir. +- `test/helpers/touchfiles.ts` registers all 5 new E2E entries with deps on `scrape/**`, `skillify/**`, `browse/src/browser-skill-write.ts`, plus the runtime modules. #### For contributors - The browser-skill SKILL.md frontmatter has a hard contract enforced by `parseSkillFile()` and `test/skill-validation.test.ts`. Required: `host` (string), `triggers` (string list), `args` (mapping list). Optional: `trusted` (bool, defaults false), `version`, `source` (`human`/`agent`), `description`. - The canonical SDK at `browse/src/browse-client.ts` and the sibling at `browser-skills/hackernews-frontpage/_lib/browse-client.ts` MUST be byte-identical. The skill-validation test fails the build otherwise. When the canonical SDK changes, update every bundled skill's `_lib/` copy. Agent-authored skills via `/skillify` get a freshly-copied SDK at synthesis time, so they're frozen at the version they were authored against (no drift possible). - The atomic-write helper enforces "no half-written skills." Always call `stageSkill` → run tests → `commitSkill` (success) OR `discardStaged` (failure). Never write directly to the final tier path. The helper's `validateSkillName` is the only naming gate, keep it tight (lowercase letters/digits/dashes, ≤64 chars, no consecutive dashes, no leading digit). -- Phase 2b (`/automate`) and Phase 4 (Bun runtime distribution, OS FS sandbox, fixture-staleness detection) are tracked in `docs/designs/BROWSER_SKILLS_V1.md` and `TODOS.md`. The `/automate` skill reuses `/skillify` and `browser-skill-write.ts` as-is; new code is the per-mutating-step confirmation gate. +- `checkTabAccess` policy: `ownOnly` is the only signal that constrains access. `isWrite` stays in the signature for callers that want to log or branch elsewhere, but doesn't gate the decision. Adding new policy axes (e.g., per-skill tab quotas) belongs in `docs/designs/`, not as a sneaky `isWrite` overload. +- `/automate` and the Phase 4 follow-ups (Bun runtime distribution, OS FS sandbox, fixture-staleness detection) are tracked in `docs/designs/BROWSER_SKILLS_V1.md` and `TODOS.md`. The `/automate` skill reuses `/skillify` and `browser-skill-write.ts` as-is; new code is the per-mutating-step confirmation gate. ## [1.17.0.0] - 2026-04-26