fix(safety): one-way-door classifier catches "rotate ... password" (#1839)

scripts/one-way-doors.ts is the secondary safety net for ad-hoc AskUserQuestion
ids with no registry entry; a false negative auto-approves a destructive op. The
revoke and reset credential patterns both include `password`, but the rotate
pattern omitted it, so the most common phrasing ("rotate the database password")
classified as a reversible two-way question.

Add `password` to the rotate alternation so all three verbs are parallel. New test
covers rotate+password, the revoke/reset/rotate parallel, and rotate's other nouns.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-06-07 22:50:10 -07:00
parent 549f32a8f9
commit 1f768ad142
2 changed files with 33 additions and 1 deletions
+1 -1
View File
@@ -65,7 +65,7 @@ const DESTRUCTIVE_PATTERNS: RegExp[] = [
// Credentials / auth — allow filler words ("the", "my") between verb and noun
/\brevoke\s+[\w\s]*\b(api key|token|credential|access key|password)\b/i,
/\breset\s+[\w\s]*\b(api key|token|password|credential)\b/i,
/\brotate\s+[\w\s]*\b(api key|token|secret|credential|access key)\b/i,
/\brotate\s+[\w\s]*\b(api key|token|secret|credential|access key|password)\b/i,
// Scope / architecture forks (reversible with effort — still deserve confirmation)
/\barchitectur(e|al)\s+(change|fork|shift|decision)\b/i,