--- name: skillify version: 1.0.0 description: | Codify the most recent successful /scrape flow into a permanent browser-skill on disk. Future /scrape calls with the same intent run the codified script in ~200ms instead of re-driving the page. Walks back through the conversation, synthesizes script.ts + script.test.ts + fixture, runs the test in a temp dir, and asks before committing. Use when asked to "skillify", "codify", "save this scrape", or "make this permanent". (gstack) allowed-tools: - Bash - Read - Write - AskUserQuestion triggers: - skillify - codify this scrape - save this scrape - make this permanent --- {{PREAMBLE}} # /skillify — codify the last scrape into a permanent skill The productivity multiplier. `/scrape` discovered how to pull the data; `/skillify` writes it as deterministic Playwright-via-`browse-client` code so the next `/scrape` call on the same intent runs in ~200ms. Without this command, `/scrape` is a slow wrapper around `$B`. With it, every successful scrape is a one-time cost. ## Iron contract — never write a half-broken skill to disk Skills are user-trust artifacts. A broken skill in `$B skill list` makes agents reach for the wrong tool and erodes confidence. This skill writes to a temp dir, runs the auto-generated test there, and only renames into the final tier path on (a) test pass + (b) explicit user approval. On either failure, the temp dir is removed entirely. There is no "almost shipped" state. --- ## Step 1 — Provenance guard (D1) Walk back through the conversation, **at most 10 agent turns**, looking for the most recent `/scrape` invocation that: - Was bounded (you can identify the user's intent line and the trailing JSON the prototype produced) - Produced a JSON result the user did not subsequently invalidate (e.g., did not say "that's wrong", did not ask you to retry) If you cannot find one, refuse with exactly this message: > "No recent /scrape result found in this conversation. Run /scrape > first, then say /skillify." Stop. Do not synthesize from chat fragments. Do not synthesize from a match-path /scrape result (matched skills are already codified — there's nothing to skillify). If you find a candidate but the user is currently three turns past it discussing something unrelated, ask once before proceeding: > "The last successful /scrape was '' a few turns back. > Skillify that one?" A "yes" lets you continue. Anything else: refuse with the message above. ## Step 2 — Propose name + triggers From the prototype intent, extract: - A short skill name: lowercase letters/digits/dashes, ≤32 chars, starts with a letter, no consecutive dashes. E.g., `lobsters-frontpage`, `gh-issue-list`, `pypi-package-stats`. - 3–5 trigger phrases the agent should match against in future `/scrape` calls. Mix the canonical phrase ("scrape lobsters frontpage") with paraphrases ("top posts on lobste.rs", "lobsters front page"). - The host (just the hostname, e.g. `lobste.rs`). Then **AskUserQuestion** to confirm: ``` D — Skill name + tier Project/branch/task: codifying /scrape "" as a browser-skill. ELI10: Pick a short name we'll use to find this skill next time you say something similar. Pick a tier — global means every project on this machine sees it, project means just this repo. Stakes if we pick wrong: bad name buries the skill in $B skill list; wrong tier means future projects can't find it (or can find it when you didn't want them to). Recommendation: A — at global tier — most scrape skills generalize across projects. Note: options differ in kind, not coverage — no completeness score. A) Keep "" at global tier — ~/.gstack/browser-skills// (recommended) B) Keep "" but at project tier — /.gstack/browser-skills// C) Rename it (free-form — say the new name) ``` **Tier-shadowing check.** Before showing the question, run `$B skill list` and check for an existing skill at the same name. If found, add to the question: > "Note: a skill named '' already exists. Picking the same > name at a higher tier (project > global > bundled) shadows it; picking > the same tier collides and will be refused at write time. Pick a > different name to coexist." ## Step 3 — Synthesize `script.ts` (D2) **Use only the final-attempt `$B` calls** that produced the JSON the user accepted, plus the user's intent string. Drop: - Failed selector attempts (the four selectors you tried before the working one) - Unrelated `$B` commands from earlier turns - All conversation prose, summaries, your own reasoning The script imports the SDK from `./_lib/browse-client` (a sibling copy, written in step 6) and exports a parser function so `script.test.ts` can exercise it against the bundled fixture without spinning up the daemon. Mirror the bundled reference at `browser-skills/hackernews-frontpage/script.ts`: ```ts import { browse } from './_lib/browse-client'; export interface Item { /* one row of the JSON output */ } export interface Output { items: Item[]; count: number; } const TARGET_URL = ''; export function parseFromHtml(html: string): Item[] { // Pure function: HTML in, parsed Item[] out. No $B calls. // Future fixture-replay tests call this directly. } if (import.meta.main) { await main(); } async function main(): Promise { await browse.goto(TARGET_URL); const html = await browse.html(); const items = parseFromHtml(html); const output: Output = { items, count: items.length }; process.stdout.write(JSON.stringify(output) + '\n'); } ``` The parser MUST be a pure function. If your prototype used multiple `$B` calls (e.g., goto + click "Next" + html), keep all of them in `main()` but extract the parsing into pure helpers. The fixture-replay tests in step 5 only exercise the pure parts. ## Step 4 — Capture the fixture ```bash $B goto "" $B html > /tmp/skillify-fixture-$$.html ``` The fixture filename inside the staged dir is `fixtures/-.html`, where the date is today. E.g. `fixtures/lobste-rs-2026-04-27.html`. Read the file you wrote, store its contents in a variable, and use it when staging in step 7. ## Step 5 — Write `script.test.ts` Mirror `browser-skills/hackernews-frontpage/script.test.ts`. The test must include at least one ★★ assertion — parsed output has the expected shape AND non-empty key fields — not a smoke ★ assertion. Smoke tests that only check `parseFromHtml` doesn't throw are insufficient. ```ts import { describe, it, expect } from 'bun:test'; import * as fs from 'fs'; import * as path from 'path'; import { parseFromHtml } from './script'; describe(' parser', () => { const fixturePath = path.join(import.meta.dir, 'fixtures', '-.html'); const html = fs.readFileSync(fixturePath, 'utf-8'); const items = parseFromHtml(html); it('returns at least one item from the bundled fixture', () => { expect(items.length).toBeGreaterThan(0); }); it('every item has the required shape', () => { for (const item of items) { expect(typeof item.).toBe(''); // ... assert on every required field } }); }); ``` ## Step 6 — Resolve the canonical SDK path + read it The canonical SDK lives at `/browse/src/browse-client.ts`. The bundled-skill loader walks the install tree to find it; mirror that. Resolve the gstack install dir. Two reliable signals (in order): 1. The bundled `hackernews-frontpage` skill — look at its tier path from `$B skill list` (the `bundled` row). The skill dir is `/browser-skills/hackernews-frontpage/`, so the install dir is two `dirname` calls above its `_lib/browse-client.ts`. 2. The active gstack skills install at `~/.claude/skills/gstack/`. Read the symlink target if it's a symlink, otherwise use the path directly. Example (run as Bun, not bash, to avoid shell-redirect parsing issues): ```ts import * as fs from 'fs'; import * as os from 'os'; import * as path from 'path'; function resolveSdkPath(): string { const candidates = [ path.join(os.homedir(), '.claude', 'skills', 'gstack', 'browse', 'src', 'browse-client.ts'), // Add other install-dir candidates if your environment differs. ]; for (const c of candidates) { try { const real = fs.realpathSync(c); if (fs.existsSync(real)) return real; } catch {} } throw new Error('Could not resolve canonical browse-client.ts'); } const sdkContents = fs.readFileSync(resolveSdkPath(), 'utf-8'); ``` Read the SDK contents into a variable. The staging step writes it as `_lib/browse-client.ts` byte-identical to the canonical. Phase 1 decision #4 — each skill is fully self-contained, no version drift possible. ## Step 7 — Stage the skill (D3 atomic write) Use the helper at `browse/src/browser-skill-write.ts`. Construct an inline TypeScript snippet (or shell out to a small Bun one-liner) that calls: ```ts import { stageSkill } from '/browse/src/browser-skill-write'; const stagedDir = stageSkill({ name: '', files: new Map([ ['SKILL.md', skillMd], ['script.ts', scriptTs], ['script.test.ts', scriptTestTs], ['_lib/browse-client.ts', sdkContents], ['fixtures/-.html', fixtureHtml], ]), }); console.log(stagedDir); ``` The SKILL.md content for `` follows the Phase 1 frontmatter contract: ```yaml --- name: description: host: trusted: false # agent-authored skills are untrusted by default source: agent version: 1.0.0 args: [] # extend if your script accepts --arg key=value triggers: - - - --- # scraper <2-3 sentences on what the script does, what URL it hits, and what shape of JSON it returns. NO conversation context. NO chat fragments. This is a durable on-disk artifact — keep it tight.> ## Usage \`\`\` $ $B skill run { "items": [...], "count": N } \`\`\` ``` Capture `stagedDir` (the path returned by `stageSkill`). You'll pass it to `$B skill test` next, then to `commitSkill` or `discardStaged`. ## Step 8 — Run `$B skill test` against the staged dir ```bash $B skill test "" --dir "" ``` If `$B skill test` does not yet accept `--dir`, fall back to invoking the test runner directly against the staged path: ```bash ( cd "" && bun test script.test.ts ) ``` If the test fails: 1. Read the test output. If the failure is a fixable parser bug, rewrite `script.ts` and `script.test.ts` (still inside the staged dir) and retry — at most twice. Show the diff to the user before each retry. 2. If still failing after two retries, OR the failure is an environmental issue (SDK import, daemon connection): ```ts import { discardStaged } from '/browse/src/browser-skill-write'; discardStaged(''); ``` Report the failure to the user, show them the staged `script.ts` for reference, and stop. No on-disk artifact. ## Step 9 — Approval gate Tests passed. Now ask the user before committing: ``` D — Commit skill "" at ? Project/branch/task: codified /scrape "" — tests pass against fixture. ELI10: The script ran clean against the snapshot we captured. Saying yes moves the staged folder into ~/.gstack/browser-skills/ where /scrape will find it next time. Saying no removes the staged folder and nothing lands on disk. Stakes if we pick wrong: yes commits an artifact you have to manually rm later if you regret it ($B skill rm --global). No throws away ~30s of synthesis work. Recommendation: A — tests passed, the script is self-contained, this is the productivity payoff for the prototype. Note: options differ in kind, not coverage — no completeness score. A) Commit it (recommended) B) Look at the script first (I'll print SKILL.md + script.ts and re-ask) C) Discard — don't commit ``` If the user picks B, print the staged `SKILL.md` and `script.ts` (NOT the fixture or _lib/), then re-ask the same A/B/C question (without B this time — they already saw it). ## Step 10 — Commit (atomic) or discard If the user approved: ```ts import { commitSkill } from '/browse/src/browser-skill-write'; const dest = commitSkill({ name: '', tier: '', // from step 2 answer stagedDir: '', }); console.log(`Committed: ${dest}`); ``` If `commitSkill` throws "already exists" (tier-shadowing collision the user dismissed in step 2), report and ask whether to: - Pick a different name (back to step 2) - `$B skill rm ` then retry - Discard If the user rejected in step 9: ```ts import { discardStaged } from '/browse/src/browser-skill-write'; discardStaged(''); ``` Report: "Discarded. No skill was written to disk." ## Step 11 — Confirm + verify After a successful commit, run one verification: ```bash $B skill list | grep $B skill run # should match the JSON the prototype produced ``` If the post-commit run does not match the prototype output, something in synthesis drifted. Surface this to the user — they may want to `$B skill rm ` and retry. Do NOT silently roll back; the user deserves to see the discrepancy. End the skill with one line: "Skill '' committed at . Future /scrape calls matching '' will run in ~200ms." --- ## Limits (be honest) - **Bun runtime required.** The codified skill runs as a Bun process (`bun run script.ts`). Phase 1 design carry-over (Codex finding #7). Real fix lands in Phase 4 (self-contained binary or Node fallback). For now: the skill works on any machine that has gstack installed, which means it has Bun. - **Fixture-replay tests are point-in-time.** When the target site rotates HTML, the fixture goes stale and the test passes against an outdated snapshot. Phase 4 will add fixture-staleness detection. - **Synthesis is best-effort.** You're writing a script from your own conversation memory. If the prototype was complex (multi-page, JS hydration, lazy load) the codified script may need a hand-edit before it's reliable. The post-commit verify step catches obvious drift. - **Single-target only.** One `$B goto` URL per skill. Multi-page crawls are out of scope — write a separate skill per target, or parameterize via `args:` if the URL pattern is regular. ## What this skill does NOT do - Codify match-path /scrape results (matched skills are already codified) - Codify mutating flows (those are /automate's job — Phase 2 P0) - Run skills (that's `$B skill run` — codified skills are run via /scrape's match path or directly) - Edit existing skills ($EDITOR + the skill dir is the surface — `$B skill show ` finds the path) - Tombstone or remove ($B skill rm) {{LEARNINGS_LOG}}