Garry Tan b73f364411 feat: browser data platform for AI agents (v0.16.0.0) (#907)
* refactor: extract path-security.ts shared module

validateOutputPath, validateReadPath, and SAFE_DIRECTORIES were duplicated
across write-commands.ts, meta-commands.ts, and read-commands.ts. Extract
to a single shared module with re-exports for backward compatibility.

Also adds validateTempPath() for the upcoming GET /file endpoint (TEMP_DIR
only, not cwd, to prevent remote agents from reading project files).
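
A minimal sketch of what a TEMP_DIR-only, symlink-aware check could look like. This is not the code from path-security.ts: the isInside helper, the baseDir parameter, and the choice of os.tmpdir() as TEMP_DIR are assumptions made for illustration. The file is required to exist, which fits a GET endpoint.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumption: TEMP_DIR is the OS temp directory. The real module may differ.
const TEMP_DIR = os.tmpdir();

// Hypothetical helper: true if `child` is `parent` itself or lives under it.
function isInside(parent: string, child: string): boolean {
  const rel = path.relative(parent, child);
  if (rel === "") return true;
  return rel !== ".." && !rel.startsWith(".." + path.sep) && !path.isAbsolute(rel);
}

// Resolve symlinks on both sides, then require the target to stay in baseDir.
// realpathSync throws if the file does not exist.
function validateTempPath(filePath: string, baseDir: string = TEMP_DIR): string {
  const realBase = fs.realpathSync(baseDir);
  const realTarget = fs.realpathSync(path.resolve(filePath));
  if (!isInside(realBase, realTarget)) {
    throw new Error(`Path must be within ${realBase}: ${filePath}`);
  }
  return realTarget;
}
```

Comparing real paths rather than the raw strings is what stops a symlink placed inside the temp directory from pointing at a project file.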

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: default paired agents to full access, split SCOPE_CONTROL

The trust boundary for paired agents is the pairing ceremony itself, not
the scope. An agent with write scope can already click anything and navigate
anywhere. Gating js/cookies behind --admin was security theater.

Changes:
- Default pair scopes: read+write+admin+meta (was read+write)
- New SCOPE_CONTROL for browser-wide destructive ops (stop, restart,
  disconnect, state, handoff, resume, connect)
- --admin flag now grants control scope (backward compat)
- New --restrict flag for limited access (e.g., --restrict read)
- Updated hint text: "re-pair with --control" instead of "--admin"
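
A rough sketch of how the flags above could map to scopes. The names resolvePairScopes, PairOptions, and DEFAULT_PAIR_SCOPES are hypothetical, and the assumption that --restrict takes precedence over --admin is not stated in this commit.

```typescript
type Scope = "read" | "write" | "admin" | "meta" | "control";

// Defaults as described in the change list above.
const DEFAULT_PAIR_SCOPES: Scope[] = ["read", "write", "admin", "meta"];

// Browser-wide destructive ops that require the control scope.
const SCOPE_CONTROL = new Set<string>([
  "stop", "restart", "disconnect", "state", "handoff", "resume", "connect",
]);

// Hypothetical option shape for the pair command.
interface PairOptions {
  admin?: boolean;     // --admin: additionally grant control
  restrict?: Scope[];  // --restrict read: limit to the listed scopes
}

// Assumption: --restrict wins when both flags are given.
function resolvePairScopes(opts: PairOptions): Scope[] {
  if (opts.restrict && opts.restrict.length > 0) {
    return opts.restrict.slice();
  }
  const scopes = DEFAULT_PAIR_SCOPES.slice();
  if (opts.admin) scopes.push("control");
  return scopes;
}

// A command in SCOPE_CONTROL is allowed only if control was granted.
function canRunControlCommand(command: string, scopes: Scope[]): boolean {
  return !SCOPE_CONTROL.has(command) || scopes.indexOf("control") !== -1;
}
```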

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add media and data commands for page content extraction

media command: discovers all img/video/audio/background-image elements
on the page. Returns JSON with URLs, dimensions, srcset, loading state,
HLS/DASH detection. Supports --images/--videos/--audio filters and
optional CSS selector scoping.

data command: extracts structured data embedded in pages (JSON-LD,
Open Graph, Twitter Cards, meta tags). One command returns product
prices, article metadata, social share info without DOM scraping.

Both are READ scope with untrusted content wrapping.
Shared media-extract.ts helper for reuse by the upcoming scrape command.
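
To illustrate what the data command returns, here is a sketch that pulls JSON-LD and Open Graph tags out of raw HTML. It uses regexes as a stand-in for DOM access, which is an assumption for the sake of a self-contained example; the function names are hypothetical, and the Open Graph pattern assumes property comes before content.

```typescript
// Collect every parseable <script type="application/ld+json"> block.
function extractJsonLd(html: string): unknown[] {
  const re = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const blocks: unknown[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    try {
      blocks.push(JSON.parse(m[1] ?? ""));
    } catch {
      // Malformed JSON-LD is skipped rather than failing the whole command.
    }
  }
  return blocks;
}

// Collect og:* meta tags into a flat key/value map.
function extractOpenGraph(html: string): Record<string, string> {
  const re = /<meta\s+property=["'](og:[^"']+)["']\s+content=["']([^"']*)["']/gi;
  const tags: Record<string, string> = {};
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    tags[m[1] ?? ""] = m[2] ?? "";
  }
  return tags;
}
```

Run against the media fixture added in this PR, extractJsonLd would yield the single Product object and extractOpenGraph the four og:* entries.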

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add download, scrape, and archive commands

download: fetch any URL or @ref element to disk using browser session
cookies via page.request.fetch(). Supports blob: URLs via in-page
base64 conversion. --base64 flag returns inline data URI (cap 10MB).
Detects HLS/DASH and rejects with yt-dlp hint.

scrape: bulk media download composing media discovery + download loop.
Sequential with 100ms delay, URL deduplication, configurable --limit.
Writes manifest.json with per-file metadata for machine consumption.
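
The scrape loop described above can be sketched roughly as follows. ManifestEntry, planScrape, and the injected download callback are hypothetical names; the real manifest.json fields are not shown in this commit.

```typescript
// Hypothetical per-file metadata; the real manifest schema may differ.
interface ManifestEntry {
  url: string;
  file: string;
  bytes: number;
}

// Deduplicate URLs in discovery order and stop once `limit` is reached.
function planScrape(urls: string[], limit: number): string[] {
  const seen = new Set<string>();
  const plan: string[] = [];
  for (const url of urls) {
    if (plan.length >= limit) break;
    if (seen.has(url)) continue;
    seen.add(url);
    plan.push(url);
  }
  return plan;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Sequential downloads with a fixed pause between them.
// `download` stands in for the real download command.
async function scrape(
  urls: string[],
  limit: number,
  download: (url: string, index: number) => Promise<ManifestEntry>,
  delayMs = 100,
): Promise<ManifestEntry[]> {
  const manifest: ManifestEntry[] = [];
  let index = 0;
  for (const url of planScrape(urls, limit)) {
    if (index > 0) await sleep(delayMs);
    manifest.push(await download(url, index));
    index++;
  }
  return manifest;
}
```

Keeping the loop sequential trades throughput for predictable load on the target site, which matches the 100ms delay described above.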

archive: saves complete page as MHTML via CDP Page.captureSnapshot.
No silent fallback -- errors clearly if CDP unavailable.

All three are WRITE scope (write to disk, blocked in watch mode).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add GET /file endpoint for remote agent file retrieval

Remote paired agents can now retrieve downloaded files over HTTP.
TEMP_DIR only (not cwd) to prevent project file exfiltration.

- Bearer token auth (root or scoped with read scope)
- Path validation via validateTempPath() (symlink-aware)
- 200MB size cap
- Extension-based MIME detection
- Zero-copy streaming via Bun.file()
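
For illustration, extension-based MIME detection and the size cap might look like the sketch below. The table contents, the fallback type, and the reading of 200MB as 200 * 1024 * 1024 bytes are assumptions; mimeForPath and withinSizeCap are hypothetical names.

```typescript
// Hypothetical extension table; the real mapping is not shown in this commit.
const MIME_BY_EXT: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".gif": "image/gif",
  ".webp": "image/webp",
  ".mp4": "video/mp4",
  ".webm": "video/webm",
  ".mp3": "audio/mpeg",
  ".json": "application/json",
  ".html": "text/html",
};

// Assumption: the 200MB cap is measured in binary megabytes.
const MAX_FILE_BYTES = 200 * 1024 * 1024;

// Look up the MIME type from the file extension, case-insensitively.
// Unknown or missing extensions fall back to a generic binary type.
function mimeForPath(filePath: string): string {
  const name = filePath.slice(filePath.lastIndexOf("/") + 1);
  const dot = name.lastIndexOf(".");
  const ext = dot > 0 ? name.slice(dot).toLowerCase() : "";
  return MIME_BY_EXT[ext] ?? "application/octet-stream";
}

// True if a file of this size may be served.
function withinSizeCap(sizeBytes: number): boolean {
  return sizeBytes <= MAX_FILE_BYTES;
}
```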

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add scroll --times N for automated repeated scrolling

Extends the scroll command with --times N flag for infinite feed
scraping. Scrolls N times with configurable --wait delay (default
1000ms) between each scroll for content loading.

Usage: scroll --times 10
       scroll --times 5 --wait 2000
       scroll --times 3 .feed-container

Composable with scrape: scroll to load content, then scrape images.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add network response body capture (--capture/--export/--bodies)

The killer feature for social media scraping. Extends the existing
network command to intercept API response bodies:

  network --capture [--filter graphql]  # start capturing
  network --capture stop                # stop
  network --export /tmp/api.jsonl       # export as JSONL
  network --bodies                      # show summary

Uses page.on('response') listener with URL pattern filtering.
SizeCappedBuffer (50MB total, 5MB per-entry cap) evicts oldest
entries when full. Binary responses stored as base64, text as-is.
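
A simplified sketch of the eviction behaviour described above. This is not the SizeCappedBuffer from this commit: the CapturedResponse shape, the method names, measuring size by string length, and skipping (rather than truncating) oversized entries are all assumptions.

```typescript
// Hypothetical captured-entry shape.
interface CapturedResponse {
  url: string;
  status: number;
  body: string;
  encoding: "text" | "base64";
}

class SizeCappedBufferSketch {
  private entries: { item: CapturedResponse; size: number }[] = [];
  private totalBytes = 0;
  private maxTotal: number;
  private maxEntry: number;

  constructor(maxTotal = 50 * 1024 * 1024, maxEntry = 5 * 1024 * 1024) {
    this.maxTotal = maxTotal;
    this.maxEntry = maxEntry;
  }

  // Returns false when the entry exceeds a cap and is skipped.
  push(item: CapturedResponse): boolean {
    const size = item.body.length; // approximation: characters, not bytes
    if (size > this.maxEntry || size > this.maxTotal) return false;
    // Evict oldest entries until the new one fits.
    while (this.entries.length > 0 && this.totalBytes + size > this.maxTotal) {
      const oldest = this.entries.shift();
      if (oldest) this.totalBytes -= oldest.size;
    }
    this.entries.push({ item, size });
    this.totalBytes += size;
    return true;
  }

  // One JSON object per line, oldest first.
  exportJsonl(): string {
    return this.entries.map((e) => JSON.stringify(e.item)).join("\n");
  }

  get length(): number {
    return this.entries.length;
  }

  get bytes(): number {
    return this.totalBytes;
  }
}
```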

This lets agents tap Instagram's GraphQL API, TikTok's hydration
data, and any SPA's internal API responses instead of fragile DOM
scraping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add screenshot --base64 for inline image return

Returns data:image/png;base64,... instead of writing to disk.
Cap at 10MB. Works with all screenshot modes (element, clip, viewport).

Eliminates the two-step screenshot+file-serve dance for remote agents.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add data platform tests and media fixture

Tests for SizeCappedBuffer (eviction, export, summary), validateTempPath
(TEMP_DIR only, rejects cwd), command registration (all new commands in
correct scope sets), and MIME mapping source checks.

Rich HTML fixture with: standard images, lazy-loaded images, srcset,
video with sources + HLS, audio, CSS background-images, JSON-LD,
Open Graph, Twitter Cards, and meta tags.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: regenerate SKILL.md with Extraction category

Add Extraction category to browse command table ordering. Regenerate
SKILL.md files to include media, data, download, scrape, archive
commands in the generated documentation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.16.0.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 00:41:55 -07:00

<!DOCTYPE html>
<html>
<head>
  <title>Media Test Page</title>
  <!-- Open Graph -->
  <meta property="og:title" content="Test Product">
  <meta property="og:description" content="A test product description">
  <meta property="og:image" content="https://example.com/og-image.jpg">
  <meta property="og:type" content="product">
  <!-- Twitter Cards -->
  <meta name="twitter:card" content="summary_large_image">
  <meta name="twitter:title" content="Test Product Tweet">
  <!-- Standard meta tags -->
  <meta name="description" content="Page description for SEO">
  <meta name="keywords" content="test, product, media">
  <meta name="author" content="Test Author">
  <link rel="canonical" href="https://example.com/test-product">
  <!-- JSON-LD structured data -->
  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Test Widget",
    "description": "A widget for testing",
    "image": "https://example.com/widget.jpg",
    "offers": {
      "@type": "Offer",
      "price": "29.99",
      "priceCurrency": "USD"
    }
  }
  </script>
  <!-- CSS background images -->
  <style>
    .hero { background-image: url('https://example.com/hero-bg.jpg'); width: 100%; height: 300px; }
    .banner { background-image: url('https://example.com/banner.png'); width: 100%; height: 100px; }
  </style>
</head>
<body>
  <div class="hero"></div>
  <div class="banner"></div>
  <!-- Standard images -->
  <img src="https://example.com/photo1.jpg" alt="Photo 1" width="800" height="600">
  <img src="https://example.com/photo2.png" alt="Photo 2" width="400" height="300">
  <!-- Lazy loaded image -->
  <img data-src="https://example.com/lazy.jpg" alt="Lazy Image" loading="lazy" width="600" height="400">
  <!-- Image with srcset -->
  <img src="https://example.com/responsive-sm.jpg"
       srcset="https://example.com/responsive-sm.jpg 480w, https://example.com/responsive-lg.jpg 1200w"
       alt="Responsive Image"
       width="480" height="320">
  <!-- Video with sources -->
  <video width="640" height="480" poster="https://example.com/poster.jpg">
    <source src="https://example.com/video.mp4" type="video/mp4">
    <source src="https://example.com/video.webm" type="video/webm">
  </video>
  <!-- HLS video -->
  <video width="1920" height="1080">
    <source src="https://example.com/stream.m3u8" type="application/x-mpegURL">
  </video>
  <!-- Audio -->
  <audio>
    <source src="https://example.com/podcast.mp3" type="audio/mpeg">
  </audio>
</body>
</html>