gstack/skillify/SKILL.md.tmpl

---
name: skillify
version: 1.0.0
description: |
  Codify the most recent successful /scrape flow into a permanent
  browser-skill on disk. Future /scrape calls with the same intent run
  the codified script in ~200ms instead of re-driving the page. Walks
  back through the conversation, synthesizes script.ts + script.test.ts
  + fixture, runs the test in a temp dir, and asks before committing.
  Use when asked to "skillify", "codify", "save this scrape", or
  "make this permanent". (gstack)
allowed-tools:
  - Bash
  - Read
  - Write
  - AskUserQuestion
triggers:
  - skillify
  - codify this scrape
  - save this scrape
  - make this permanent
---

{{PREAMBLE}}

# /skillify — codify the last scrape into a permanent skill

The productivity multiplier. `/scrape` discovered how to pull the data;
`/skillify` writes it as deterministic Playwright-via-`browse-client`
code so the next `/scrape` call on the same intent runs in ~200ms.

Without this command, `/scrape` is a slow wrapper around `$B`. With it,
every successful scrape is a one-time cost.

## Iron contract — never write a half-broken skill to disk

Skills are user-trust artifacts. A broken skill in `$B skill list` makes
agents reach for the wrong tool and erodes confidence. This skill writes
to a temp dir, runs the auto-generated test there, and only renames into
the final tier path on (a) test pass + (b) explicit user approval. On
either failure, the temp dir is removed entirely. There is no "almost
shipped" state.

---

## Step 1 — Provenance guard (D1)

Walk back through the conversation, **at most 10 agent turns**, looking
for the most recent `/scrape` invocation that:

- Was bounded (you can identify the user's intent line and the trailing
  JSON the prototype produced)
- Produced a JSON result the user did not subsequently invalidate
  (e.g., did not say "that's wrong", did not ask you to retry)

If you cannot find one, refuse with exactly this message:

> "No recent /scrape result found in this conversation. Run /scrape
> <intent> first, then say /skillify."

Stop. Do not synthesize from chat fragments. Do not synthesize from a
match-path /scrape result (matched skills are already codified — there's
nothing to skillify).

If you find a candidate but the conversation has moved three or more
turns past it onto something unrelated, ask once before proceeding:

> "The last successful /scrape was '<intent line>' a few turns back.
> Skillify that one?"

A "yes" lets you continue. Anything else: refuse with the "No recent
/scrape result found" message above.
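
The guard can be pictured as a bounded backwards search. The sketch below is illustrative only: the `Turn` shape and the `findSkillifyCandidate` name are hypothetical (gstack defines neither), and it assumes a match-path or invalidated result is skipped rather than treated as a hard stop.

```ts
// Hypothetical shape for one agent turn; not a gstack type.
export interface Turn {
  scrapeIntent?: string;  // set when the turn was a /scrape invocation
  producedJson?: boolean; // the prototype ended with a JSON result
  matchPath?: boolean;    // served by an already-codified skill
  invalidated?: boolean;  // user later said "that's wrong" or asked to retry
}

export type GuardResult =
  | { kind: 'refuse' }
  | { kind: 'proceed'; intent: string }
  | { kind: 'confirm'; intent: string };

const MAX_TURNS_BACK = 10; // "at most 10 agent turns"
const CONFIRM_AFTER = 3;   // candidate this many turns back needs a "yes"

// `turns` is ordered oldest to newest.
export function findSkillifyCandidate(turns: Turn[]): GuardResult {
  const oldest = Math.max(0, turns.length - MAX_TURNS_BACK);
  for (let i = turns.length - 1; i >= oldest; i--) {
    const turn = turns[i];
    const intent = turn.scrapeIntent;
    if (intent === undefined || !turn.producedJson || turn.matchPath || turn.invalidated) {
      continue; // not a usable prototype; keep walking back
    }
    const turnsPast = turns.length - 1 - i;
    return turnsPast >= CONFIRM_AFTER
      ? { kind: 'confirm', intent }
      : { kind: 'proceed', intent };
  }
  return { kind: 'refuse' };
}
```

A `confirm` result maps to the "Skillify that one?" question, and `refuse` maps to the fixed refusal message.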

## Step 2 — Propose name + triggers

From the prototype intent, extract:

- A short skill name: lowercase letters/digits/dashes, ≤32 chars,
  starts with a letter, no consecutive dashes. E.g.,
  `lobsters-frontpage`, `gh-issue-list`, `pypi-package-stats`.
- 3–5 trigger phrases the agent should match against in future `/scrape`
  calls. Mix the canonical phrase ("scrape lobsters frontpage") with
  paraphrases ("top posts on lobste.rs", "lobsters front page").
- The host (just the hostname, e.g. `lobste.rs`).
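
The naming rules above are mechanical enough to check before asking the user. A minimal sketch, assuming only the constraints listed here (the helper names `isValidSkillName` and `hostFromUrl` are hypothetical, and whatever validation gstack applies at write time may be stricter):

```ts
// Checks the stated rules: lowercase letters/digits/dashes, at most 32
// chars, starts with a letter, no consecutive dashes.
export function isValidSkillName(name: string): boolean {
  if (name.length === 0 || name.length > 32) return false;
  if (!/^[a-z][a-z0-9-]*$/.test(name)) return false;
  return !name.includes('--');
}

// Extracts just the hostname from the URL the prototype used.
export function hostFromUrl(url: string): string {
  return new URL(url).hostname;
}
```

For example, `isValidSkillName('gh-issue-list')` is `true`, while `'GH-issues'`, `'2fast'`, and `'a--b'` are rejected.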

Then **AskUserQuestion** to confirm:

```
D<N> — Skill name + tier
Project/branch/task: codifying /scrape "<intent>" as a browser-skill.
ELI10: Pick a short name we'll use to find this skill next time you say
something similar. Pick a tier — global means every project on this
machine sees it, project means just this repo.
Stakes if we pick wrong: bad name buries the skill in $B skill list;
wrong tier means future projects can't find it (or can find it when you
didn't want them to).
Recommendation: A — <proposed-name> at global tier — most scrape skills
generalize across projects.
Note: options differ in kind, not coverage — no completeness score.
A) Keep "<proposed-name>" at global tier — ~/.gstack/browser-skills/<proposed-name>/  (recommended)
B) Keep "<proposed-name>" but at project tier — <project>/.gstack/browser-skills/<proposed-name>/
C) Rename it (free-form — say the new name)
```
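
For reference, the two destination paths in options A and B can be computed as follows. This is a sketch, not the real helper: `resolveTierPath` is a hypothetical name, and the actual resolution happens inside `commitSkill`, whose internals are not shown in this document.

```ts
import * as os from 'node:os';
import * as path from 'node:path';

export type WritableTier = 'global' | 'project';

// Mirrors the paths shown in options A and B of the question above.
// `home` is injectable so the function can be exercised without touching
// the real home directory.
export function resolveTierPath(
  tier: WritableTier,
  name: string,
  projectRoot: string,
  home: string = os.homedir(),
): string {
  const base = tier === 'global' ? home : projectRoot;
  return path.join(base, '.gstack', 'browser-skills', name);
}
```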

**Tier-shadowing check.** Before showing the question, run `$B skill list`
and check for an existing skill with the same name. If found, add to the
question:

> "Note: a <tier> skill named '<name>' already exists. Picking the same
> name at a higher tier (project > global > bundled) shadows it; picking
> the same tier collides and will be refused at write time. Pick a
> different name to coexist."
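
The precedence rule in that note can be written down as a small classifier. A sketch under assumptions: the `ExistingSkill` rows are hypothetical and would have to be filled in from `$B skill list` output (its exact format is not specified here), and the `shadowed-by-existing` case is inferred from the stated precedence rather than documented.

```ts
export type Tier = 'project' | 'global' | 'bundled';

// Hypothetical row, to be populated from `$B skill list`.
export interface ExistingSkill { name: string; tier: Tier; }

export type NameCheck = 'free' | 'collides' | 'shadows-existing' | 'shadowed-by-existing';

// project > global > bundled
const PRECEDENCE: Record<Tier, number> = { project: 2, global: 1, bundled: 0 };

export function checkName(existing: ExistingSkill[], name: string, tier: Tier): NameCheck {
  const sameName = existing.filter((skill) => skill.name === name);
  if (sameName.length === 0) return 'free';
  // Same name at the same tier is refused at write time.
  if (sameName.some((skill) => skill.tier === tier)) return 'collides';
  const highest = Math.max(...sameName.map((skill) => PRECEDENCE[skill.tier]));
  return PRECEDENCE[tier] > highest ? 'shadows-existing' : 'shadowed-by-existing';
}
```

Anything other than `free` is worth surfacing in the question so the user can pick a different name.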

## Step 3 — Synthesize `script.ts` (D2)

**Use only the final-attempt `$B` calls** that produced the JSON the
user accepted, plus the user's intent string. Drop:

- Failed selector attempts (the four selectors you tried before the
  working one)
- Unrelated `$B` commands from earlier turns
- All conversation prose, summaries, your own reasoning

The script imports the SDK from `./_lib/browse-client` (a sibling copy,
read in step 6 and written during staging in step 7) and exports a parser
function so `script.test.ts` can exercise it against the bundled fixture
without spinning up the daemon.

Mirror the bundled reference at `browser-skills/hackernews-frontpage/script.ts`:

```ts
import { browse } from './_lib/browse-client';

export interface Item { /* one row of the JSON output */ }
export interface Output { items: Item[]; count: number; }

const TARGET_URL = '<the URL the prototype used>';

export function parseFromHtml(html: string): Item[] {
  // Pure function: HTML in, parsed Item[] out. No $B calls.
  // Future fixture-replay tests call this directly.
  const items: Item[] = [];
  // ... extract rows from `html` and push them onto `items`
  return items;
}

if (import.meta.main) { await main(); }

async function main(): Promise<void> {
  await browse.goto(TARGET_URL);
  const html = await browse.html();
  const items = parseFromHtml(html);
  const output: Output = { items, count: items.length };
  process.stdout.write(JSON.stringify(output) + '\n');
}
```

The parser MUST be a pure function. If your prototype used multiple `$B`
calls (e.g., goto + click "Next" + html), keep all of them in `main()`
but extract the parsing into pure helpers. The fixture-replay tests in
step 5 only exercise the pure parts.
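
To make "pure helpers" concrete, here is a minimal sketch of a parser with no I/O and no `$B` calls. The markup it targets (`<a class="story-link" href="...">`) is invented for illustration, as are the `Item` fields; a real skill would match whatever selectors the prototype settled on.

```ts
export interface Item { title: string; url: string; }

// Pure helper: decodes the handful of entities this sketch cares about.
// `&amp;` goes last so `&amp;lt;` decodes to the literal text `&lt;`.
export function decodeEntities(text: string): string {
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, '&');
}

// Pure parser: same HTML in, same Item[] out, every time.
export function parseFromHtml(html: string): Item[] {
  // Hypothetical markup; a fresh RegExp per call avoids shared lastIndex state.
  const linkRe = /<a\s+class="story-link"\s+href="([^"]+)"\s*>([^<]+)<\/a>/g;
  const items: Item[] = [];
  let match: RegExpExecArray | null;
  while ((match = linkRe.exec(html)) !== null) {
    items.push({
      url: decodeEntities(match[1]),
      title: decodeEntities(match[2].trim()),
    });
  }
  return items;
}
```

Because nothing here touches the network or the daemon, the fixture-replay test can call `parseFromHtml` directly on the saved HTML.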

## Step 4 — Capture the fixture

```bash
$B goto "<TARGET_URL>"
$B html > /tmp/skillify-fixture-$$.html
```

The fixture filename inside the staged dir is
`fixtures/<host-with-dashes>-<YYYY-MM-DD>.html`, where the date is today.
E.g. `fixtures/lobste-rs-2026-04-27.html`.
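
The naming convention can be derived mechanically. A small sketch (the `fixtureRelPath` name is hypothetical; note that it formats the date in UTC, which can differ from the local "today" near midnight):

```ts
// Builds fixtures/<host-with-dashes>-<YYYY-MM-DD>.html
export function fixtureRelPath(host: string, today: Date): string {
  const hostWithDashes = host.toLowerCase().replace(/\./g, '-');
  const isoDate = today.toISOString().slice(0, 10); // YYYY-MM-DD, in UTC
  return `fixtures/${hostWithDashes}-${isoDate}.html`;
}
```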

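Read the file you wrote, store its contents in a variable, and use it
when staging in step 7. `$$` expands to the PID of the shell that ran
the redirect, so note the expanded path rather than re-typing `$$` in a
later Bash call, where it may resolve to a different file name.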

## Step 5 — Write `script.test.ts`

Mirror `browser-skills/hackernews-frontpage/script.test.ts`. The test
must include at least one ★★ assertion — parsed output has the expected
shape AND non-empty key fields — not a smoke ★ assertion. Smoke tests
that only check `parseFromHtml` doesn't throw are insufficient.

```ts
import { describe, it, expect } from 'bun:test';
import * as fs from 'fs';
import * as path from 'path';
import { parseFromHtml } from './script';

describe('<name> parser', () => {
  const fixturePath = path.join(import.meta.dir, 'fixtures', '<host-with-dashes>-<YYYY-MM-DD>.html');
  const html = fs.readFileSync(fixturePath, 'utf-8');
  const items = parseFromHtml(html);

  it('returns at least one item from the bundled fixture', () => {
    expect(items.length).toBeGreaterThan(0);
  });

  it('every item has the required shape', () => {
    for (const item of items) {
      expect(typeof item.<keyfield>).toBe('<keytype>');
      // ... assert on every required field
    }
  });
});
```

## Step 6 — Resolve the canonical SDK path + read it

The canonical SDK lives at `<gstack-install>/browse/src/browse-client.ts`.
The bundled-skill loader walks the install tree to find it; mirror that.

Resolve the gstack install dir. Two reliable signals (in order):

1. The bundled `hackernews-frontpage` skill — look at its tier path from
   `$B skill list` (the `bundled` row). The skill dir is
   `<gstack-install>/browser-skills/hackernews-frontpage/`, so the install
   dir is two `dirname` calls above the skill dir (four above its
   `_lib/browse-client.ts`).
2. The active gstack skills install at `~/.claude/skills/gstack/`. Read
   the symlink target if it's a symlink, otherwise use the path directly.

Example (run as Bun, not bash, to avoid shell-redirect parsing issues):

```ts
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

function resolveSdkPath(): string {
  const candidates = [
    path.join(os.homedir(), '.claude', 'skills', 'gstack', 'browse', 'src', 'browse-client.ts'),
    // Add other install-dir candidates if your environment differs.
  ];
  for (const c of candidates) {
    try {
      const real = fs.realpathSync(c);
      if (fs.existsSync(real)) return real;
    } catch { /* candidate missing or unreadable: try the next one */ }
  }
  throw new Error('Could not resolve canonical browse-client.ts');
}

const sdkContents = fs.readFileSync(resolveSdkPath(), 'utf-8');
```

Read the SDK contents into a variable. The staging step writes it as
`_lib/browse-client.ts` byte-identical to the canonical. Phase 1 decision
#4 — each skill is fully self-contained, no version drift possible.
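
The first signal (bundled skill dir to install dir) is not covered by the example above. A sketch of that derivation, counting from the skill directory itself; both function names are hypothetical, and the skill dir is assumed to come from the `bundled` row of `$B skill list`:

```ts
import * as path from 'node:path';

// <gstack-install>/browser-skills/<skill>  ->  <gstack-install>
export function installDirFromSkillDir(skillDir: string): string {
  // path.resolve normalizes a trailing slash before the two dirname calls.
  return path.dirname(path.dirname(path.resolve(skillDir)));
}

// Location of the canonical SDK inside an install dir, as stated above.
export function canonicalSdkPath(installDir: string): string {
  return path.join(installDir, 'browse', 'src', 'browse-client.ts');
}
```

The result of `canonicalSdkPath(installDirFromSkillDir(dir))` could be added to the `candidates` list in `resolveSdkPath()`.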

## Step 7 — Stage the skill (D3 atomic write)

Use the helper at `browse/src/browser-skill-write.ts`. Construct an inline
TypeScript snippet (or shell out to a small Bun one-liner) that calls:

```ts
import { stageSkill } from '<gstack-install>/browse/src/browser-skill-write';

const stagedDir = stageSkill({
  name: '<name>',
  files: new Map([
    ['SKILL.md', skillMd],
    ['script.ts', scriptTs],
    ['script.test.ts', scriptTestTs],
    ['_lib/browse-client.ts', sdkContents],
    ['fixtures/<host-with-dashes>-<YYYY-MM-DD>.html', fixtureHtml],
  ]),
});
console.log(stagedDir);
```

The SKILL.md content for `<name>` follows the Phase 1 frontmatter
contract:

```yaml
---
name: <name>
description: <one-line, what data this returns>
host: <hostname>
trusted: false       # agent-authored skills are untrusted by default
source: agent
version: 1.0.0
args: []             # extend if your script accepts --arg key=value
triggers:
  - <phrase 1>
  - <phrase 2>
  - <phrase 3>
---

# <Name> scraper

<2-3 sentences on what the script does, what URL it hits, and what
shape of JSON it returns. NO conversation context. NO chat fragments.
This is a durable on-disk artifact — keep it tight.>

## Usage

\`\`\`
$ $B skill run <name>
{ "items": [...], "count": N }
\`\`\`
```

Capture `stagedDir` (the path returned by `stageSkill`). You'll pass it
to `$B skill test` next, then to `commitSkill` or `discardStaged`.
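
One way to produce the `skillMd` string is to render it from the values confirmed in step 2. This is a sketch, not part of `browser-skill-write.ts`: `SkillMeta` and `renderSkillMd` are hypothetical names, and the YAML is emitted unquoted, so a description containing `: ` or `#` would need quoting that this sketch does not attempt.

```ts
export interface SkillMeta {
  name: string;
  title: string;       // human-readable heading, e.g. "Lobsters frontpage"
  description: string; // one line: what data this returns
  host: string;
  triggers: string[];  // 3 to 5 phrases, per step 2
  summary: string;     // 2-3 sentences, no conversation context
}

export function renderSkillMd(meta: SkillMeta): string {
  if (meta.description.includes('\n')) throw new Error('description must be one line');
  if (meta.triggers.length < 3 || meta.triggers.length > 5) {
    throw new Error('expected 3 to 5 trigger phrases');
  }
  const fence = '`'.repeat(3); // built at runtime to keep this listing fence-safe
  return [
    '---',
    `name: ${meta.name}`,
    `description: ${meta.description}`,
    `host: ${meta.host}`,
    'trusted: false',
    'source: agent',
    'version: 1.0.0',
    'args: []',
    'triggers:',
    ...meta.triggers.map((phrase) => `  - ${phrase}`),
    '---',
    '',
    `# ${meta.title} scraper`,
    '',
    meta.summary,
    '',
    '## Usage',
    '',
    fence,
    `$ $B skill run ${meta.name}`,
    '{ "items": [...], "count": N }',
    fence,
    '',
  ].join('\n');
}
```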

## Step 8 — Run `$B skill test` against the staged dir

```bash
$B skill test "<name>" --dir "<stagedDir>"
```

If `$B skill test` does not yet accept `--dir`, fall back to invoking the
test runner directly against the staged path:

```bash
( cd "<stagedDir>" && bun test script.test.ts )
```

If the test fails:

1. Read the test output. If the failure is a fixable parser bug,
   rewrite `script.ts` and `script.test.ts` (still inside the staged
   dir) and retry — at most twice. Show the diff to the user before
   each retry.
2. If still failing after two retries, OR the failure is an
   environmental issue (SDK import, daemon connection):

   ```ts
   import { discardStaged } from '<gstack-install>/browse/src/browser-skill-write';
   discardStaged('<stagedDir>');
   ```

   Report the failure to the user, show them the staged `script.ts` for
   reference, and stop. No on-disk artifact.
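
The retry policy above amounts to a small loop. A sketch with hypothetical hooks (`runTest` and `fixParser` stand in for running the staged test and for rewriting the staged files after showing the diff; neither is a gstack API):

```ts
export interface TestOutcome {
  ok: boolean;
  kind?: 'parser' | 'environment'; // only meaningful when ok is false
}

export interface RetryHooks {
  runTest: () => TestOutcome;
  fixParser: (attempt: number) => void;
}

export interface RetryResult {
  verdict: 'passed' | 'discard';
  retriesUsed: number;
}

const MAX_RETRIES = 2; // "at most twice"

export function testWithRetries(hooks: RetryHooks): RetryResult {
  let retriesUsed = 0;
  for (;;) {
    const outcome = hooks.runTest();
    if (outcome.ok) return { verdict: 'passed', retriesUsed };
    // Environmental failures are never retried; parser bugs get two tries.
    if (outcome.kind === 'environment' || retriesUsed === MAX_RETRIES) {
      return { verdict: 'discard', retriesUsed };
    }
    retriesUsed += 1;
    hooks.fixParser(retriesUsed);
  }
}
```

A `discard` verdict is the point at which `discardStaged` is called and the failure is reported.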

## Step 9 — Approval gate

Tests passed. Now ask the user before committing:

```
D<N> — Commit skill "<name>" at <resolved-tier-path>?
Project/branch/task: codified /scrape "<intent>" — tests pass against fixture.
ELI10: The script ran clean against the snapshot we captured. Saying yes
moves the staged folder into <resolved-tier-path> where /scrape
will find it next time. Saying no removes the staged folder and nothing
lands on disk.
Stakes if we pick wrong: yes commits an artifact you have to manually rm
later if you regret it ($B skill rm <name> --global). No throws away
~30s of synthesis work.
Recommendation: A — tests passed, the script is self-contained, this is
the productivity payoff for the prototype.
Note: options differ in kind, not coverage — no completeness score.
A) Commit it (recommended)
B) Look at the script first (I'll print SKILL.md + script.ts and re-ask)
C) Discard — don't commit
```

If the user picks B, print the staged `SKILL.md` and `script.ts` (NOT
the fixture or `_lib/`), then re-ask the question with only options A
and C, since they have already seen the script.

## Step 10 — Commit (atomic) or discard

If the user approved:

```ts
import { commitSkill } from '<gstack-install>/browse/src/browser-skill-write';
const dest = commitSkill({
  name: '<name>',
  tier: '<global|project>',  // from step 2 answer
  stagedDir: '<stagedDir>',
});
console.log(`Committed: ${dest}`);
```

If `commitSkill` throws "already exists" (a same-tier name collision the
user was warned about in step 2), report and ask whether to:

- Pick a different name (back to step 2)
- `$B skill rm <name>` then retry
- Discard

If the user rejected in step 9:

```ts
import { discardStaged } from '<gstack-install>/browse/src/browser-skill-write';
discardStaged('<stagedDir>');
```

Report: "Discarded. No skill was written to disk."

## Step 11 — Confirm + verify

After a successful commit, verify with two commands:

```bash
$B skill list | grep <name>
$B skill run <name>    # should match the JSON the prototype produced
```

If the post-commit run does not match the prototype output, something
in synthesis drifted. Surface this to the user — they may want to
`$B skill rm <name>` and retry. Do NOT silently roll back; the user
deserves to see the discrepancy.
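
What counts as a "match" is a judgment call, since live data can change between the prototype and the post-commit run. One possible interpretation is a shape-level comparison, sketched below. The `describeDrift` name is hypothetical, and the check assumes the `{ items, count }` output shape from step 3.

```ts
interface ScrapeOutput {
  items: Array<Record<string, unknown>>;
  count: number;
}

// Order-insensitive summary of an item's field names.
function keySignature(item: Record<string, unknown>): string {
  return Object.keys(item).sort().join(',');
}

// Returns human-readable problems; an empty array means no drift detected.
export function describeDrift(prototypeJson: string, rerunJson: string): string[] {
  const prototype = JSON.parse(prototypeJson) as ScrapeOutput;
  const rerun = JSON.parse(rerunJson) as ScrapeOutput;
  if (!Array.isArray(rerun.items) || rerun.items.length === 0) {
    return ['rerun returned no items'];
  }
  const problems: string[] = [];
  if (rerun.count !== rerun.items.length) {
    problems.push('count does not equal items.length');
  }
  const expected = prototype.items.length > 0 ? keySignature(prototype.items[0]) : null;
  if (expected !== null) {
    rerun.items.forEach((item, index) => {
      const actual = keySignature(item);
      if (actual !== expected) {
        problems.push(`item ${index}: keys [${actual}], expected [${expected}]`);
      }
    });
  }
  return problems;
}
```

A non-empty result is something to show the user verbatim, not a trigger for an automatic rollback.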

End the skill with one line: "Skill '<name>' committed at <tier>. Future
/scrape calls matching '<canonical-trigger>' will run in ~200ms."

---

## Limits (be honest)

- **Bun runtime required.** The codified skill runs as a Bun process
  (`bun run script.ts`). Phase 1 design carry-over (Codex finding #7).
  Real fix lands in Phase 4 (self-contained binary or Node fallback).
  For now: the skill works on any machine that has gstack installed,
  which means it has Bun.
- **Fixture-replay tests are point-in-time.** When the target site
  changes its HTML, the fixture goes stale: the test keeps passing
  against the outdated snapshot even though the live scrape may be
  broken. Phase 4 will add fixture-staleness detection.
- **Synthesis is best-effort.** You're writing a script from your own
  conversation memory. If the prototype was complex (multi-page, JS
  hydration, lazy load) the codified script may need a hand-edit before
  it's reliable. The post-commit verify step catches obvious drift.
- **Single-target only.** One `$B goto` URL per skill. Multi-page
  crawls are out of scope — write a separate skill per target, or
  parameterize via `args:` if the URL pattern is regular.

## What this skill does NOT do

- Codify match-path /scrape results (matched skills are already codified)
- Codify mutating flows (those are /automate's job — Phase 2 P0)
- Run skills (that's `$B skill run` — codified skills are run via /scrape's
  match path or directly)
- Edit existing skills ($EDITOR + the skill dir is the surface — `$B skill
  show <name>` finds the path)
- Tombstone or remove ($B skill rm)

{{LEARNINGS_LOG}}