security: gate download + scrape through validateNavigationUrl (SSRF)

The `goto` command was correctly wired through validateNavigationUrl,
but `download` and `scrape` called page.request.fetch(url, ...) directly.
A caller with the default write scope could hit the /command endpoint
and ask the daemon to fetch http://169.254.169.254/latest/meta-data/
(AWS IMDSv1) or the GCP/Azure/internal equivalents. The response body
comes back as base64 or lands on disk where GET /file serves it.

Fix: call validateNavigationUrl(url) immediately before each
page.request.fetch() call site in download and in the scrape loop.
Same blocklist that already protects `goto`: file://, javascript:,
data:, chrome://, cloud metadata (IPv4 all encodings, IPv6 ULA,
metadata.*.internal).

Tests: extend browse/test/url-validation.test.ts with a source-level
guard that walks every `await page.request.fetch(` call site and
asserts a validateNavigationUrl call precedes it within the same
branch. Regression trips before code review if a future refactor
drops the gate.
This commit is contained in:
gus
2026-04-16 19:47:19 -03:00
committed by Garry Tan
parent de972b514f
commit b7cf46d6e6
2 changed files with 88 additions and 1 deletions
+14 -1
View File
@@ -1188,7 +1188,16 @@ export async function handleWriteCommand(
contentType = match[1];
buffer = Buffer.from(match[2], 'base64');
} else {
// Strategy 1: Direct URL via page.request.fetch()
// Strategy 1: Direct URL via page.request.fetch().
// Gate the URL through the same validator `goto` uses. Without
// this check, download + scrape bypass the navigation
// blocklist and a caller with write scope can read
// http://169.254.169.254/latest/meta-data/ (AWS IMDSv1), the
// GCP/Azure metadata equivalents, or any internal IPv4/IPv6
// the server happens to route to. The response body is then
// returned to the caller (base64) or written to disk where
// GET /file serves it back.
await validateNavigationUrl(url);
const response = await page.request.fetch(url, { timeout: 30000 });
const status = response.status();
if (status >= 400) {
@@ -1286,6 +1295,10 @@ export async function handleWriteCommand(
for (let i = 0; i < toDownload.length; i++) {
const { url, type } = toDownload[i];
try {
// Same gate as the download command — page.request.fetch
// must not reach cloud metadata, ULA ranges, or the rest of
// the blocklist. See url-validation.ts for the full list.
await validateNavigationUrl(url);
const response = await page.request.fetch(url, { timeout: 30000 });
if (response.status() >= 400) throw new Error(`HTTP ${response.status()}`);
const ct = response.headers()['content-type'] || 'application/octet-stream';