gstack

mirror of https://github.com/garrytan/gstack.git synced 2026-05-02 11:45:20 +02:00

Files

T

Garry Tan 94a83c50cd test(security): adversarial suite for canary + ensemble combiner

23 tests covering realistic attack shapes that a hostile QA engineer would
write to break the security layer. All pure logic — no model download, no
subprocess, no network. Covers two groups:

Canary channel coverage (14 tests)
  * leak via goto URL query, fragment, screenshot path, Write file_path,
    Write content, form fill, curl, deep-nested BatchTool args
  * key-vs-value distinction (canary in value = leak; canary in key = miss,
    which is fine because Claude doesn't build keys from attacker content)
  * benign deeply-nested object stays clean (no false positive)
  * partial-prefix substring does NOT trigger (full-token requirement)
  * canary embedded in base64-looking blob still fires on raw text
  * stream text_delta chunk triggers (matches sidebar-agent detectCanaryLeak)

Verdict combiner (9 tests)
  * ensemble_agreement blocks when both ML layers >= WARN (Haiku rescues
    StackOne-style FPs — e.g. Stack Overflow instruction content)
  * single_layer_high degrades to WARN (the canonical Stack Overflow FP
    mitigation — one classifier's 0.99 does NOT kill the session alone)
  * canary leak trumps all ML safe signals (deterministic > probabilistic)
  * threshold boundary behavior at exactly WARN
  * aria_regex + content co-correlation does NOT count as ensemble
    agreement (addresses Codex review's "correlated signal amplification"
    critique — ensemble needs testsavant + transcript specifically)
  * degraded classifiers (confidence 0, meta.degraded) produce safe verdict
    — fail-open contract preserved

All 23 tests pass in 82ms. Combined with security.test.ts, we now have
48 tests across 90 expectations for the pure-logic security surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-20 04:18:48 +08:00

bin

feat: multi-agent support — gstack works on Codex, Gemini CLI, and Cursor (v0.9.0) (#226 )

2026-03-19 18:20:50 -07:00

scripts

fix: ngrok Windows build + close CI error-swallowing gap (v0.18.0.1) (#1024 )

2026-04-16 13:49:04 -07:00

src

feat(security): wire logAttempt to gstack-telemetry-log (fire-and-forget)

2026-04-19 19:16:26 +08:00

test

test(security): adversarial suite for canary + ensemble combiner