Garry Tan
|
de59a5cc3e
|
feat(redact): shared redaction engine + taxonomy (pure lib, no behavior change)
Add the foundation for cross-skill PII/secret/legal redaction:
- lib/redact-patterns.ts — canonical 3-tier taxonomy (HIGH genuinely-secret
credentials, MEDIUM PII/legal/internal + high-FP credential-shaped, LOW
surface-only). Tier-1 calibration: Stripe-publishable, Google AIza, JWT, and
env-KV are MEDIUM not HIGH (context-variable / high-FP). Validators: Luhn,
Shannon-entropy gate, RFC1918 exclusion, wallet sanity. Per-span placeholder
suppression (not line-based).
- lib/redact-engine.ts — pure scan() + applyRedactions(). Normalization pass
(NFKC + zero-width strip + entity decode) with offset map back to original.
Oversize input fails CLOSED. No visibility-based tier promotion (records
repoVisibility for sterner wording only). Tool-attributed-fence WARN-degrade
for obvious doc-examples. Safe preview masking (≤4 leading chars).
- 100 unit tests: per-pattern positives, FP filters, validators, email
allowlist, no-promotion semantics, tool-fence degrade, normalization,
oversize-fail-closed, ReDoS pattern-lint + runtime budget, auto-redact
(idempotent, right-to-left, structural-corruption guard).
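For orientation, a minimal sketch of three of the validators named above (Luhn, Shannon entropy, RFC1918) and of right-to-left application. Every name here (luhnValid, shannonEntropy, isRfc1918, Span, applyRightToLeft, the [REDACTED:…] placeholder format) is hypothetical and not taken from lib/redact-patterns.ts or lib/redact-engine.ts. The sketch assumes "right-to-left" means replacing the last span first so that earlier offsets stay valid, and assumes spans do not overlap.

```typescript
// Illustrative only: hypothetical helpers, not the code in this commit.

/** Luhn mod-10 check, used to filter card-number-shaped false positives. */
function luhnValid(candidate: string): boolean {
  const digits = candidate.replace(/[\s-]/g, "");
  if (!/^\d{12,19}$/.test(digits)) return false;
  let sum = 0;
  let doubleIt = false; // every second digit, counting from the right, is doubled
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48;
    if (doubleIt) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleIt = !doubleIt;
  }
  return sum % 10 === 0;
}

/** Shannon entropy in bits per character; a gate would compare it to a threshold. */
function shannonEntropy(s: string): number {
  const chars = Array.from(s);
  if (chars.length === 0) return 0;
  const counts: Record<string, number> = {};
  chars.forEach((ch) => {
    counts[ch] = (counts[ch] || 0) + 1;
  });
  let bits = 0;
  Object.keys(counts).forEach((ch) => {
    const p = counts[ch] / chars.length;
    bits -= p * Math.log2(p);
  });
  return bits;
}

/** True for 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16 (private ranges). */
function isRfc1918(ip: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(ip);
  if (!m) return false;
  const octets = m.slice(1).map(Number);
  if (octets.some((o) => o > 255)) return false;
  const a = octets[0];
  const b = octets[1];
  return a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168);
}

interface Span {
  start: number; // inclusive offset into the original text
  end: number;   // exclusive offset
  label: string;
}

/** Replace spans from the end of the text backwards so earlier offsets never shift. */
function applyRightToLeft(text: string, spans: Span[]): string {
  const ordered = spans.slice().sort((x, y) => y.start - x.start);
  let out = text;
  for (const s of ordered) {
    out = out.slice(0, s.start) + "[REDACTED:" + s.label + "]" + out.slice(s.end);
  }
  return out;
}
```

Replacing from the right means each replacement only changes text after the offsets still to be processed, so no offset bookkeeping is needed between replacements.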
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
2026-05-29 07:05:17 -07:00 |
|