docs(security): document GSTACK_SECURITY_ENSEMBLE env var

Adds the opt-in DeBERTa-v3 ensemble to the Sidebar security stack section of CLAUDE.md. Documents:

* What it does (L4c cross-model classifier, 2-of-3 agreement for BLOCK)
* How to enable (GSTACK_SECURITY_ENSEMBLE=deberta)
* The cost (721MB model download on first run)
* Default behavior (disabled; 2-of-2 testsavant + transcript)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-08-03 12:58:40 +02:00 · 2026-04-20 04:55:23 +08:00
parent 4e0516031b
commit 7a815fa7f6
1 changed file with 6 additions and 0 deletions
@@ -236,7 +236,13 @@ always BLOCKs (deterministic).
 **Env knobs:**
 - `GSTACK_SECURITY_OFF=1` — emergency kill switch. Classifier stays off even if
  warmed. Canary is still injected; just the ML scan is skipped.
+- `GSTACK_SECURITY_ENSEMBLE=deberta` — opt-in DeBERTa-v3 ensemble. Adds
+  ProtectAI DeBERTa-v3-base-injection-onnx as L4c classifier for cross-model
+  agreement. 721MB first-run download. With ensemble enabled, BLOCK requires
+  2-of-3 ML classifiers agreeing at >= WARN (testsavant, deberta, transcript).
+  Without ensemble (default), BLOCK requires testsavant + transcript at >= WARN.
 - Classifier model cache: `~/.gstack/models/testsavant-small/` (112MB, first run only)
+  plus `~/.gstack/models/deberta-v3-injection/` (721MB, only when ensemble enabled)
 - Attack log: `~/.gstack/security/attempts.jsonl` (salted sha256 + domain only,
  rotates at 10MB, 5 generations)
 - Per-device salt: `~/.gstack/security/device-salt` (0600)
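
The agreement rule described in the added lines can be sketched as follows. This is only an illustration of the documented behavior, not the project's implementation: the `Verdict` enum (in particular the `SAFE` level), the function names `ensemble_enabled` and `should_block`, and the dictionary-of-verdicts shape are all assumptions introduced here. The deterministic BLOCK path and the `GSTACK_SECURITY_OFF` kill switch are not modeled.

```python
import os
from enum import IntEnum


class Verdict(IntEnum):
    """Hypothetical verdict levels; only WARN and BLOCK appear in the docs."""
    SAFE = 0
    WARN = 1
    BLOCK = 2


def ensemble_enabled(env=None):
    """Return True when GSTACK_SECURITY_ENSEMBLE is set to 'deberta'."""
    env = os.environ if env is None else env
    return env.get("GSTACK_SECURITY_ENSEMBLE", "") == "deberta"


def should_block(verdicts, ensemble):
    """Apply the documented agreement rule to per-classifier verdicts.

    Default (no ensemble): testsavant and transcript must both be >= WARN.
    Ensemble: any 2 of testsavant, deberta, transcript must be >= WARN.
    A classifier missing from `verdicts` is treated as SAFE (an assumption).
    """
    active = ["testsavant", "transcript"]
    if ensemble:
        active.append("deberta")
    agreeing = sum(
        1 for name in active
        if verdicts.get(name, Verdict.SAFE) >= Verdict.WARN
    )
    # Both modes need two agreeing votes: 2-of-2 by default, 2-of-3 with ensemble.
    return agreeing >= 2
```

In this sketch a DeBERTa verdict is ignored unless the ensemble is enabled, so `should_block({"testsavant": Verdict.BLOCK, "deberta": Verdict.BLOCK}, ensemble=False)` is false, while the same verdicts with `ensemble=True` produce a block.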