mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-07 05:56:41 +02:00
5ae696b6fa
One entry point for pulling page data. Three paths under the hood: 1. Match — agent reads $B skill list, semantically matches the user's intent against each skill's triggers + description + host. Confident match = $B skill run <name> in ~200ms. 2. Prototype — no match, drive the page with $B goto/text/html/links etc. Return JSON, append a one-line "say /skillify" nudge. 3. Mutating refusal — verbs like submit/click/fill route to /automate (Phase 2b P0); /scrape is read-only by contract. Match decision lives in the agent, not the daemon. No new code in browse/src/, no expanded daemon command surface, no new prompt-injection blast radius. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
153 lines
5.1 KiB
Cheetah
153 lines
5.1 KiB
Cheetah
---
|
|
name: scrape
|
|
version: 1.0.0
|
|
description: |
|
|
Pull data from a web page. First call on a new intent prototypes the flow
|
|
via $B primitives and returns JSON. Subsequent calls on a matching intent
|
|
route to a codified browser-skill and return in ~200ms. Read-only — for
|
|
mutating flows (form fills, clicks, submissions), use /automate.
|
|
Use when asked to "scrape", "get data from", "pull", "extract from", or
|
|
"what's on" a page. (gstack)
|
|
allowed-tools:
|
|
- Bash
|
|
- Read
|
|
- AskUserQuestion
|
|
triggers:
|
|
- scrape this page
|
|
- get data from
|
|
- pull from
|
|
- extract from
|
|
- what is on
|
|
---
|
|
|
|
{{PREAMBLE}}
|
|
|
|
# /scrape — pull data from a page
|
|
|
|
One entry point for getting data off the web. Two paths under the hood:
|
|
|
|
1. **Match path** (~200ms) — if the user's intent matches an existing
|
|
browser-skill's triggers, run it via `$B skill run <name>` and emit
|
|
the JSON.
|
|
2. **Prototype path** (~30s) — no matching skill yet, so drive the page
|
|
with `$B` primitives, return the JSON, and suggest `/skillify` so the
|
|
next call lands on the match path.
|
|
|
|
Read-only by contract. If the intent implies writing (submitting forms,
|
|
clicking buttons that mutate state), refuse and route to `/automate`.
|
|
|
|
## Step 1 — Determine intent
|
|
|
|
The user's request after `/scrape` is the intent. If they did not include
|
|
one, ask once:
|
|
|
|
> "What do you want to scrape? Describe it in one line, e.g. 'top stories
|
|
> on Hacker News' or 'product names + prices on example.com/products'."
|
|
|
|
Do not ask multiple clarifying questions up front. Any further questions
|
|
go in the prototype path where they're cheaper.
|
|
|
|
## Step 2 — Refuse mutating intents
|
|
|
|
If the intent implies writes — verbs like *submit*, *post*, *send*, *log
|
|
in*, *click X*, *fill the form*, *delete*, *create*, *order*, *book* —
|
|
respond:
|
|
|
|
> "/scrape is read-only. For mutating flows, use /automate (browser-skills
|
|
> Phase 2 P0 in TODOS.md — not yet shipped). Until then, use $B click /
|
|
> $B fill / $B type directly."
|
|
|
|
Stop. Do not enter the match or prototype path.
|
|
|
|
## Step 3 — Match phase
|
|
|
|
List existing browser-skills:
|
|
|
|
```bash
|
|
$B skill list
|
|
```
|
|
|
|
For each skill, `$B skill show <name>` exposes the full SKILL.md including
|
|
`triggers:`, `description:`, and `host:`. Read these and judge whether the
|
|
user's intent semantically matches one of them.
|
|
|
|
A confident match means **all three** are true:
|
|
|
|
- The intent's domain matches the skill's `host` (or one of its hostnames)
|
|
- A `triggers:` phrase or the `description:` covers the same data the
|
|
intent asks for
|
|
- The intent does not require args the skill does not declare in `args:`
|
|
|
|
If matched, parse any `--arg key=value` from the intent (or pass none for
|
|
zero-arg skills) and run:
|
|
|
|
```bash
|
|
$B skill run <name> [--arg key=value ...]
|
|
```
|
|
|
|
Emit the JSON the skill prints to stdout. Stop.
|
|
|
|
If matching is ambiguous (two skills could plausibly fit), pick the
|
|
narrower-tier one (project > global > bundled — `$B skill list` shows the
|
|
tier). If still ambiguous, fall through to the prototype path rather than
|
|
guess wrong.
|
|
|
|
## Step 4 — Prototype phase
|
|
|
|
No match. Drive the page using `$B` primitives:
|
|
|
|
1. `$B goto <url>` — navigate to the target. The user's intent usually
|
|
names a host or a URL; use it directly.
|
|
2. `$B snapshot --text` (or `$B text`) — get a clean text view of the
|
|
page to find selectors.
|
|
3. `$B html` — pull the raw HTML when you need to parse structured data
|
|
(lists, tables, repeated rows).
|
|
4. `$B links` — when the intent is to gather URLs.
|
|
5. Iterate: try a selector, check the output, refine.
|
|
|
|
Emit the result as JSON on stdout (one document, not pretty-printed).
|
|
Use a stable shape — typically `{ "items": [...], "count": N }` or
|
|
similar — so downstream consumers can treat it as data.
|
|
|
|
## Step 5 — Skillify nudge
|
|
|
|
After a successful prototype, append exactly one line:
|
|
|
|
> "Say /skillify to make this a permanent skill (200ms on next call)."
|
|
|
|
That is the entire nudge. Do not nag, do not list pros, do not push.
|
|
Proactive surfacing is a Phase 3 knob (`gstack-config browser_skillify_prompts`),
|
|
not this skill's job.
|
|
|
|
## When the prototype fails
|
|
|
|
If the page loads but data extraction does not yield a sensible JSON shape
|
|
after 3-4 selector attempts:
|
|
|
|
- Report what you tried, what came back, and what's blocking (lazy-loaded,
|
|
JS-rendered, paywalled, etc.).
|
|
- Do NOT write a partial result and call it done.
|
|
- Do NOT suggest /skillify on a broken prototype.
|
|
- Ask the user whether they want to (a) try a different selector, (b)
|
|
switch to a different page, or (c) stop.
|
|
|
|
## What this skill does NOT do
|
|
|
|
- Mutating actions (use /automate when shipped, or $B primitives directly)
|
|
- Auth flows / cookie import (use /setup-browser-cookies first)
|
|
- Multi-page crawls (this is one-shot per call)
|
|
- Anything that requires the daemon to not be running
|
|
|
|
## Output discipline
|
|
|
|
The match path returns whatever JSON the matched skill emits. The
|
|
prototype path returns whatever JSON you construct. In both cases:
|
|
|
|
- One JSON document, on stdout.
|
|
- Stderr (or chat) is for logs and the skillify nudge.
|
|
- Do not embed prose around the JSON in the chat reply unless the user
|
|
asked for an explanation — many `/scrape` callers pipe the output to
|
|
`jq`.
|
|
|
|
{{LEARNINGS_LOG}}
|