Mirror of https://github.com/garrytan/gstack.git, synced 2026-05-01 19:25:10 +02:00
fix: Supabase telemetry security lockdown (v0.11.16.0) (#460)
* fix: drop all anon RLS policies + revoke view access + add cache table

  Migration 002 locks down the Supabase telemetry backend:
  - Drops all SELECT, INSERT, UPDATE policies for the anon role
  - Explicitly revokes SELECT on crash_clusters and skill_sequences views
  - Drops stale error_message/failed_step columns (exist live but not in migration)
  - Creates community_pulse_cache table for server-side aggregation caching

* feat: extend community-pulse with full dashboard data + server-side cache

  community-pulse now returns top skills, crash clusters, version distribution,
  and weekly active count in a single aggregated response. Results are cached in
  the community_pulse_cache table (1-hour TTL) to prevent DoS via repeated
  expensive queries.

* fix: route all telemetry through edge functions, not PostgREST

  - gstack-telemetry-sync: POST to /functions/v1/telemetry-ingest instead of
    /rest/v1/telemetry_events. Removes sed field-renaming (edge function expects
    raw JSONL names). Parses inserted count — holds cursor if zero inserted.
  - gstack-update-check: POST to /functions/v1/update-check.
  - gstack-community-dashboard: calls community-pulse edge function instead of
    direct PostgREST queries.
  - config.sh: removes GSTACK_TELEMETRY_ENDPOINT, fixes misleading comment.

* test: RLS smoke test + telemetry field name verification

  - verify-rls.sh: 9-check smoke test (5 reads + 3 inserts + 1 update) verifying
    anon key is fully locked out after migration.
  - telemetry.test.ts: verifies JSONL uses raw field names (v, ts, sessions) that
    the edge function expects, not Postgres column names.
  - README.md: fixes privacy claim to match actual RLS policy.

* chore: bump version and changelog (v0.11.16.0)

* fix: pre-landing review fixes — JSONB field order, version filter, RLS verification

  - Dashboard JSON parsing: use per-object grep instead of field-order-dependent
    regex (JSONB doesn't preserve key order)
  - Version distribution: filter to skill_run events only (was counting all types)
  - verify-rls.sh: only 401/403 count as PASS (not empty 200 or 5xx); add
    Authorization header to test as anon role properly
  - Remove dead empty loop in community-pulse

* chore: untrack browse/dist binaries — 116MB of arm64-only Mach-O

  These compiled Bun binaries only work on arm64 macOS, and ./setup already
  rebuilds from source for every platform. They were tracked despite .gitignore
  due to being committed before the ignore rule. Untracking stops them from
  appearing as modified in every diff.

* docs: tone down changelog — security hardening, not incident report

* fix: keep INSERT policies for old client compat, preserve extra columns

  - Keep anon INSERT policies so pre-v0.11.16 clients can still sync telemetry
    via PostgREST while new clients use edge functions
  - Add error_message/failed_step columns to migration (reconcile repo with live
    schema) instead of dropping them
  - Security fix still lands: SELECT and UPDATE policies are dropped

* fix: sync package.json version with VERSION file (0.11.16.0)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
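The client-side half of the rerouting (gstack-telemetry-sync) is not in the hunks
below. As a rough sketch of the flow the message describes, assuming the
telemetry-ingest function replies with a JSON body like {"inserted": 12} (the
response shape, content type, and batch filename are illustrative, not from the repo):

    #!/usr/bin/env bash
    # Sketch only: POST a raw JSONL batch to the ingest edge function and
    # hold the cursor (i.e. retry the batch next run) if nothing was inserted.
    set -euo pipefail
    . "$(dirname "$0")/config.sh"

    BATCH="events.jsonl"  # raw JSONL, field names (v, ts, sessions) unchanged

    RESPONSE="$(curl -s --max-time 10 \
      -H "apikey: ${GSTACK_SUPABASE_ANON_KEY}" \
      -H "Authorization: Bearer ${GSTACK_SUPABASE_ANON_KEY}" \
      -H "Content-Type: application/x-ndjson" \
      --data-binary "@${BATCH}" \
      "${GSTACK_SUPABASE_URL}/functions/v1/telemetry-ingest")" || RESPONSE=""

    # Parse the inserted count out of the response.
    INSERTED="$(printf '%s' "$RESPONSE" | grep -o '"inserted":[0-9]*' | grep -o '[0-9]*$' || true)"
    if [ "${INSERTED:-0}" -gt 0 ]; then
      echo "synced ${INSERTED} events, advancing cursor"
    else
      echo "zero inserted, holding cursor for retry"
    fi

Holding the cursor on a zero-insert response means a rejected batch is retried on
the next sync rather than silently dropped.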
config.sh (+2 -4)
@@ -1,10 +1,8 @@
 #!/usr/bin/env bash
 # Supabase project config for gstack telemetry
 # These are PUBLIC keys — safe to commit (like Firebase public config).
-# RLS policies restrict what the anon/publishable key can do (INSERT only).
+# RLS denies all access to the anon key. All reads and writes go through
+# edge functions (which use SUPABASE_SERVICE_ROLE_KEY server-side).
 
 GSTACK_SUPABASE_URL="https://frugpmstpnojnhfyimgv.supabase.co"
 GSTACK_SUPABASE_ANON_KEY="sb_publishable_tR4i6cyMIrYTE3s6OyHGHw_ppx2p6WK"
-
-# Telemetry ingest endpoint (Data API)
-GSTACK_TELEMETRY_ENDPOINT="${GSTACK_SUPABASE_URL}/rest/v1"
community-pulse edge function

@@ -1,9 +1,12 @@
 // gstack community-pulse edge function
-// Returns weekly active installation count for preamble display.
-// Cached for 1 hour via Cache-Control header.
+// Returns aggregated community stats for the dashboard:
+// weekly active count, top skills, crash clusters, version distribution.
+// Uses server-side cache (community_pulse_cache table) to prevent DoS.
 
 import { createClient } from "https://esm.sh/@supabase/supabase-js@2";
 
+const CACHE_MAX_AGE_MS = 60 * 60 * 1000; // 1 hour
+
 Deno.serve(async () => {
   const supabase = createClient(
     Deno.env.get("SUPABASE_URL") ?? "",
@@ -11,17 +14,37 @@ Deno.serve(async () => {
   );
 
   try {
-    // Count unique update checks in the last 7 days (install base proxy)
+    // Check cache first
+    const { data: cached } = await supabase
+      .from("community_pulse_cache")
+      .select("data, refreshed_at")
+      .eq("id", 1)
+      .single();
+
+    if (cached?.refreshed_at) {
+      const age = Date.now() - new Date(cached.refreshed_at).getTime();
+      if (age < CACHE_MAX_AGE_MS) {
+        return new Response(JSON.stringify(cached.data), {
+          status: 200,
+          headers: {
+            "Content-Type": "application/json",
+            "Cache-Control": "public, max-age=3600",
+          },
+        });
+      }
+    }
+
+    // Cache is stale or missing — recompute
     const weekAgo = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000).toISOString();
     const twoWeeksAgo = new Date(Date.now() - 14 * 24 * 60 * 60 * 1000).toISOString();
 
-    // This week's active
+    // Weekly active (update checks this week)
     const { count: thisWeek } = await supabase
       .from("update_checks")
       .select("*", { count: "exact", head: true })
       .gte("checked_at", weekAgo);
 
-    // Last week's active (for change %)
+    // Last week (for change %)
     const { count: lastWeek } = await supabase
       .from("update_checks")
       .select("*", { count: "exact", head: true })
@@ -34,22 +57,78 @@ Deno.serve(async () => {
       ? Math.round(((current - previous) / previous) * 100)
       : 0;
 
-    return new Response(
-      JSON.stringify({
-        weekly_active: current,
-        change_pct: changePct,
-      }),
-      {
-        status: 200,
-        headers: {
-          "Content-Type": "application/json",
-          "Cache-Control": "public, max-age=3600", // 1 hour cache
-        },
-      }
-    );
+    // Top skills (last 7 days)
+    const { data: skillRows } = await supabase
+      .from("telemetry_events")
+      .select("skill")
+      .eq("event_type", "skill_run")
+      .gte("event_timestamp", weekAgo)
+      .not("skill", "is", null)
+      .limit(1000);
+
+    const skillCounts: Record<string, number> = {};
+    for (const row of skillRows ?? []) {
+      if (row.skill) {
+        skillCounts[row.skill] = (skillCounts[row.skill] ?? 0) + 1;
+      }
+    }
+    const topSkills = Object.entries(skillCounts)
+      .sort(([, a], [, b]) => b - a)
+      .slice(0, 10)
+      .map(([skill, count]) => ({ skill, count }));
+
+    // Crash clusters (top 5)
+    const { data: crashes } = await supabase
+      .from("crash_clusters")
+      .select("error_class, gstack_version, total_occurrences, identified_users")
+      .limit(5);
+
+    // Version distribution (last 7 days)
+    const versionCounts: Record<string, number> = {};
+    const { data: versionRows } = await supabase
+      .from("telemetry_events")
+      .select("gstack_version")
+      .eq("event_type", "skill_run")
+      .gte("event_timestamp", weekAgo)
+      .limit(1000);
+
+    for (const row of versionRows ?? []) {
+      if (row.gstack_version) {
+        versionCounts[row.gstack_version] = (versionCounts[row.gstack_version] ?? 0) + 1;
+      }
+    }
+    const topVersions = Object.entries(versionCounts)
+      .sort(([, a], [, b]) => b - a)
+      .slice(0, 5)
+      .map(([version, count]) => ({ version, count }));
+
+    const result = {
+      weekly_active: current,
+      change_pct: changePct,
+      top_skills: topSkills,
+      crashes: crashes ?? [],
+      versions: topVersions,
+    };
+
+    // Upsert cache
+    await supabase
+      .from("community_pulse_cache")
+      .upsert({
+        id: 1,
+        data: result,
+        refreshed_at: new Date().toISOString(),
+      });
+
+    return new Response(JSON.stringify(result), {
+      status: 200,
+      headers: {
+        "Content-Type": "application/json",
+        "Cache-Control": "public, max-age=3600",
+      },
+    });
   } catch {
     return new Response(
-      JSON.stringify({ weekly_active: 0, change_pct: 0 }),
+      JSON.stringify({ weekly_active: 0, change_pct: 0, top_skills: [], crashes: [], versions: [] }),
       {
         status: 200,
         headers: { "Content-Type": "application/json" },
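For a quick manual check of the deployed function, something along these lines
should work (the /functions/v1/ URL pattern is standard Supabase; the grep-based
extraction mirrors the per-object parsing the commit message describes, since
JSONB does not preserve key order):

    #!/usr/bin/env bash
    # Fetch community-pulse and pull out one field without assuming key order.
    . "$(dirname "$0")/config.sh"

    RESPONSE="$(curl -s --max-time 10 \
      -H "apikey: ${GSTACK_SUPABASE_ANON_KEY}" \
      -H "Authorization: Bearer ${GSTACK_SUPABASE_ANON_KEY}" \
      "${GSTACK_SUPABASE_URL}/functions/v1/community-pulse")"

    # Per-object grep: match the field wherever it sits in the JSON object.
    WEEKLY="$(printf '%s' "$RESPONSE" | grep -o '"weekly_active":[0-9]*' | head -1 | grep -o '[0-9]*$')"
    echo "weekly_active: ${WEEKLY:-unparsed}"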
002_tighten_rls.sql (new file, +36)

@@ -0,0 +1,36 @@
+-- 002_tighten_rls.sql
+-- Lock down read/update access. Keep INSERT policies so old clients can still
+-- write via PostgREST while new clients migrate to edge functions.
+
+-- Drop all SELECT policies (anon key should not read telemetry data)
+DROP POLICY IF EXISTS "anon_select" ON telemetry_events;
+DROP POLICY IF EXISTS "anon_select" ON installations;
+DROP POLICY IF EXISTS "anon_select" ON update_checks;
+
+-- Drop dangerous UPDATE policy (was unrestricted on all columns)
+DROP POLICY IF EXISTS "anon_update_last_seen" ON installations;
+
+-- Keep INSERT policies — old clients (pre-v0.11.16) still POST directly to
+-- PostgREST. These will be dropped in a future migration once adoption of
+-- edge-function-based sync is widespread.
+-- (anon_insert_only ON telemetry_events — kept)
+-- (anon_insert_only ON installations — kept)
+-- (anon_insert_only ON update_checks — kept)
+
+-- Explicitly revoke view access (belt-and-suspenders)
+REVOKE SELECT ON crash_clusters FROM anon;
+REVOKE SELECT ON skill_sequences FROM anon;
+
+-- Keep error_message and failed_step columns (exist on live schema, may be
+-- used in future). Add them to the migration record so repo matches live.
+ALTER TABLE telemetry_events ADD COLUMN IF NOT EXISTS error_message TEXT;
+ALTER TABLE telemetry_events ADD COLUMN IF NOT EXISTS failed_step TEXT;
+
+-- Cache table for community-pulse aggregation (prevents DoS via repeated queries)
+CREATE TABLE IF NOT EXISTS community_pulse_cache (
+  id INTEGER PRIMARY KEY DEFAULT 1,
+  data JSONB NOT NULL DEFAULT '{}'::jsonb,
+  refreshed_at TIMESTAMPTZ DEFAULT now()
+);
+ALTER TABLE community_pulse_cache ENABLE ROW LEVEL SECURITY;
+-- No anon policies — only service_role_key (used by edge functions) can read/write
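To see which policies actually survive the migration, a direct catalog query is
the simplest check (this assumes psql access to the project database via a
DATABASE_URL; pg_policies is the standard Postgres catalog view):

    # After 002_tighten_rls.sql, only the three anon_insert_only policies
    # should remain on the telemetry tables.
    psql "$DATABASE_URL" -c "
      SELECT tablename, policyname, cmd
      FROM pg_policies
      WHERE tablename IN ('telemetry_events', 'installations', 'update_checks')
      ORDER BY tablename;"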
supabase/verify-rls.sh (new file, executable, +103)
@@ -0,0 +1,103 @@
+#!/usr/bin/env bash
+# verify-rls.sh — smoke test that anon key is locked out after 002_tighten_rls.sql
+#
+# Run manually after deploying the migration:
+#   bash supabase/verify-rls.sh
+#
+# Reads and the update should be denied (PASS); the INSERT checks will FAIL while the compat INSERT policies from 002 remain.
+set -uo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+. "$SCRIPT_DIR/config.sh"
+
+URL="$GSTACK_SUPABASE_URL"
+KEY="$GSTACK_SUPABASE_ANON_KEY"
+PASS=0
+FAIL=0
+
+check() {
+  local desc="$1"
+  local method="$2"
+  local path="$3"
+  local data="${4:-}"
+  # No -f flag: with --fail, curl exits non-zero on 4xx and the captured status would be corrupted by a fallback echo. -w prints 000 on connection failure.
+  local args=(-s -o /dev/null -w '%{http_code}' --max-time 10
+    -H "apikey: ${KEY}"
+    -H "Authorization: Bearer ${KEY}"
+    -H "Content-Type: application/json")
+
+  if [ "$method" = "GET" ]; then
+    HTTP="$(curl "${args[@]}" "${URL}/rest/v1/${path}" 2>/dev/null || true)"
+  elif [ "$method" = "POST" ]; then
+    HTTP="$(curl "${args[@]}" -X POST "${URL}/rest/v1/${path}" -H "Prefer: return=minimal" -d "$data" 2>/dev/null || true)"
+  elif [ "$method" = "PATCH" ]; then
+    HTTP="$(curl "${args[@]}" -X PATCH "${URL}/rest/v1/${path}" -d "$data" 2>/dev/null || true)"
+  fi
+
+  # Only 401/403 prove RLS denial. 200 (even empty) means access is granted.
+  # 5xx means something errored but access wasn't denied by policy.
+  case "$HTTP" in
+    401|403)
+      echo "  PASS $desc (HTTP $HTTP, denied by RLS)"
+      PASS=$(( PASS + 1 ))
+      ;;
+    200)
+      # 200 means the request was accepted — check if data was returned
+      if [ "$method" = "GET" ]; then
+        BODY="$(curl -s --max-time 10 "${URL}/rest/v1/${path}" -H "apikey: ${KEY}" -H "Authorization: Bearer ${KEY}" -H "Content-Type: application/json" 2>/dev/null || true)"
+        if [ "$BODY" = "[]" ] || [ -z "$BODY" ]; then
+          echo "  WARN $desc (HTTP $HTTP, empty — may be RLS or empty table, verify manually)"
+          FAIL=$(( FAIL + 1 ))
+        else
+          echo "  FAIL $desc (HTTP $HTTP, got data)"
+          FAIL=$(( FAIL + 1 ))
+        fi
+      else
+        echo "  FAIL $desc (HTTP $HTTP, write accepted)"
+        FAIL=$(( FAIL + 1 ))
+      fi
+      ;;
+    201)
+      echo "  FAIL $desc (HTTP $HTTP, write succeeded!)"
+      FAIL=$(( FAIL + 1 ))
+      ;;
+    000)
+      echo "  WARN $desc (connection failed)"
+      FAIL=$(( FAIL + 1 ))
+      ;;
+    *)
+      # 404, 406, 500, etc. — access not definitively denied by RLS
+      echo "  WARN $desc (HTTP $HTTP — not a clean RLS denial)"
+      FAIL=$(( FAIL + 1 ))
+      ;;
+  esac
+}
+
+echo "RLS Lockdown Verification"
+echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+echo ""
+echo "Read denial checks:"
+check "SELECT telemetry_events" GET "telemetry_events?select=*&limit=1"
+check "SELECT installations" GET "installations?select=*&limit=1"
+check "SELECT update_checks" GET "update_checks?select=*&limit=1"
+check "SELECT crash_clusters" GET "crash_clusters?select=*&limit=1"
+check "SELECT skill_sequences" GET "skill_sequences?select=skill_a&limit=1"
+
+echo ""
+echo "Write denial checks:"
+check "INSERT telemetry_events" POST "telemetry_events" '{"gstack_version":"test","os":"test","event_timestamp":"2026-01-01T00:00:00Z","outcome":"test"}'
+check "INSERT update_checks" POST "update_checks" '{"gstack_version":"test","os":"test"}'
+check "INSERT installations" POST "installations" '{"installation_id":"test_verify_rls"}'
+check "UPDATE installations" PATCH "installations?installation_id=eq.test_verify_rls" '{"gstack_version":"hacked"}'
+
+echo ""
+echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+echo "Results: $PASS passed, $FAIL failed (of 9 checks)"
+
+if [ "$FAIL" -gt 0 ]; then
+  echo "VERDICT: FAIL — anon key still has access"
+  exit 1
+else
+  echo "VERDICT: PASS — anon key fully locked out"
+  exit 0
+fi
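The script exits non-zero when any check fails, so it can gate a release step.
Note that with the compat INSERT policies kept by migration 002, the three insert
checks are expected to FAIL until a follow-up migration drops those policies:

    # Run from the repo root after deploying the migration.
    if bash supabase/verify-rls.sh; then
      echo "anon key fully locked out"
    else
      echo "anon key still has an open path (expected for INSERTs while compat policies remain)"
    fi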