gstack/test/redact-engine.test.ts
Garry Tan 9cc41b7163 v1.57.6.0 fix wave: 8 community bugs (4 security guards failing open) (#1911)
* fix(ship): adversarial subagent no longer trips usage-policy denial on own security fixtures (#1899)

The Claude adversarial subagent in /review and /ship was told to "think like an
attacker" over the full diff. When the diff includes the repo's own security
regression fixtures (real attack payloads, by design), reasoning adversarially
over that material triggered Anthropic's real-time usage-policy safeguards and
the subagent call was denied — blocking the review.

Fix at the prompt's source of truth (scripts/resolvers/review.ts {{ADVERSARIAL_STEP}}):
- Authorized-defensive-testing framing: declares this is the maintainer's own repo
  and that attack-pattern strings inside test/fixture paths are the project's own
  regression corpus to analyze, not material to expand on.
- Fixture summary-mode diff: full content for non-fixture source, --stat/--name-status
  for test/fixture files, so raw exploit bytes aren't fed into adversarial reasoning.
  The subagent must state fixtures were reviewed in summary mode (no silent coverage cut).

Reported by @bmajewski. Regenerated review/SKILL.md + ship/sections/adversarial.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(redact): detect modern sk-proj-/sk-svcacct-/sk-admin- OpenAI keys (#1868)

openai.key (HIGH/block) used /\b(sk-(?:proj-)?[A-Za-z0-9]{32,})\b/, which stops
at the first - or _ in the body. Modern OpenAI project/service-account/admin keys
use base64url bodies containing - and _, so they never reached the 32-char run and
produced ZERO findings — a HIGH credential failing open through /spec, /ship, /cso,
and /document-*.

Replace with explicit alternation, bare vs prefixed (not a globally-optional prefix,
which would match malformed sk--... or separator-less sk-projabc...):
  sk-{proj,svcacct,admin}- + [A-Za-z0-9_-]{20,}  |  sk-[A-Za-z0-9]{32,} (legacy)

Tests: the three previously-missed shapes now block; FP guards pin that hyphenated
prose and malformed sk- strings do NOT match (HIGH tier blocks, so calibration matters).
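
The difference between the two shapes can be sketched as follows. `LEGACY_OPENAI` is the regex quoted above; `SKETCH_OPENAI` is a hypothetical reconstruction from the one-line summary, not the pattern actually shipped in lib/redact-patterns.ts, and `flagsOpenAiKey` is an illustrative helper. The sample strings come from this fix's tests.

```typescript
// Legacy pattern quoted in the commit message: the body class has no - or _,
// so matching stops at the first separator in a base64url body.
const LEGACY_OPENAI = /\b(sk-(?:proj-)?[A-Za-z0-9]{32,})\b/;

// Hypothetical reconstruction of the new shape (an assumption, not the shipped
// regex): prefixed keys take a base64url body of 20+ chars, bare keys keep the
// legacy contiguous body of 32+ chars.
const SKETCH_OPENAI =
  /\bsk-(?:(?:proj|svcacct|admin)-[A-Za-z0-9_-]{20,}|[A-Za-z0-9]{32,})/;

// Illustrative helper; neither regex uses the g flag, so test() is stateless.
function flagsOpenAiKey(text: string, pattern: RegExp): boolean {
  return pattern.test(text);
}

const modernKeys: string[] = [
  "sk-proj-Ab12_Cd34-Ef56Gh78Ij90Kl12Mn34Op56Qr78St90Uv",
  "sk-svcacct-abc_def-ghijklmnopqrstuvwxyz0123456789ABCDEF",
  "sk-admin-AAAA_BBBB-CCCC_DDDD-EEEE_FFFF-GGGG_HHHH1234",
];

const benign: string[] = [
  "the sk-learning-rate-schedule-was-tuned-carefully",
  "sk--double-dash-typo-not-a-real-key",
  "use sk-proj for the project prefix in docs",
  "sk-short",
];

for (const key of modernKeys) {
  // The legacy pattern misses each key; the sketch flags it.
  console.log(flagsOpenAiKey(key, LEGACY_OPENAI), flagsOpenAiKey(key, SKETCH_OPENAI));
}
for (const text of benign) {
  // None of the benign strings should match the sketch.
  console.log(flagsOpenAiKey(text, SKETCH_OPENAI));
}
```

Writing the prefix as a required alternative (rather than an optional group in front of one shared body class) is what keeps `sk--...` from matching: the widened body class is only reachable after a well-formed `proj-`, `svcacct-`, or `admin-` prefix.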

Reported by @jbetala7.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(redact): reject malformed --max-bytes instead of silently disabling the size guard (#1824)

The oversize check is designed to fail CLOSED, but a malformed --max-bytes turned
it fail-OPEN. bin/gstack-redact did parseInt(maxBytes,10) and passed it straight
through; parseInt("foo") is NaN. The engine guarded with `opts.maxBytes ?? DEFAULT`,
and ?? does not catch NaN, so `byteLen > NaN` was always false and the fail-closed
block never fired. A negative value made `byteLen > -5` always true, blocking
everything.

Two layers:
- bin/gstack-redact validates the RAW string (parseInt accepts "123abc"->123,
  "1.5"->1): require /^\d+$/ and > 0, else exit 1 with a clear message.
- lib/redact-engine.ts hardens the fallback to Number.isFinite && > 0 else the
  default cap — a guardrail so the engine never silently runs uncapped even if a
  bad value reaches it directly.

Tests: NaN and negative both fall back to the default cap (oversize still blocks);
CLI rejects garbage/negative with exit 1.
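
A minimal sketch of the two layers, under stated assumptions: `parseMaxBytesArg`, `effectiveCap`, `oldCap`, and `DEFAULT_MAX_BYTES` are hypothetical names, and the 1 MiB default is taken from the test comment rather than from the engine source. The safe-integer check is an extra precaution added here, not something the commit describes.

```typescript
// Assumed default cap; the tests refer to a 1 MiB default.
const DEFAULT_MAX_BYTES = 1024 * 1024;

// CLI layer: validate the RAW string. parseInt would accept "123abc" (-> 123)
// and "1.5" (-> 1), so the digits-only check runs before any numeric parsing.
function parseMaxBytesArg(raw: string): number | null {
  if (!/^\d+$/.test(raw)) return null;
  const n = Number(raw);
  // Reject 0 and digit strings too large to represent exactly.
  return Number.isSafeInteger(n) && n > 0 ? n : null;
}

// Engine layer: any non-finite or non-positive value falls back to the default.
function effectiveCap(maxBytes?: number): number {
  return typeof maxBytes === "number" && Number.isFinite(maxBytes) && maxBytes > 0
    ? maxBytes
    : DEFAULT_MAX_BYTES;
}

// The old fallback, for comparison: ?? only replaces null and undefined.
function oldCap(maxBytes?: number): number {
  return maxBytes ?? DEFAULT_MAX_BYTES;
}

const byteLen = 2 * 1024 * 1024;
console.log(byteLen > oldCap(NaN));       // false: the guard never fires
console.log(byteLen > effectiveCap(NaN)); // true: oversize input still blocks
console.log(parseMaxBytesArg("123abc"));  // null: the CLI would exit 1 here
```

Keeping both layers means a caller that bypasses the CLI and passes a bad number directly still gets a capped scan.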

Reported by @jbetala7.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(learnings): cross-project trust gate is an allowlist, not a denylist (#1745)

gstack-learnings-search --cross-project is documented as an allowlist — foreign
learnings load only when user-stated/trusted, to stop one project's AI-generated
learnings from injecting into another project's reviews. It was implemented as a
denylist: `if (isCrossProject && e.trusted === false) continue`. Any row where
`trusted` is missing/undefined (legacy rows from before the field existed,
hand-edited rows, rows from other tools) passed `undefined === false` → false →
admitted. Those rows leaked across projects.

Flip to `e.trusted !== true`. Test: a foreign row with no `trusted` field is now
excluded (true still included, false still excluded).
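
The two predicates can be compared side by side. This is an illustrative sketch: the `Learning` shape and the `admitDenylist` / `admitAllowlist` helpers are hypothetical, and only the two comparisons against `trusted` come from the fix itself.

```typescript
// Hypothetical row shape; only the optional `trusted` field matters here.
interface Learning {
  project: string;
  text: string;
  trusted?: boolean;
}

// Old behavior: skip a foreign row only when trusted is exactly false.
function admitDenylist(e: Learning, isCrossProject: boolean): boolean {
  return !(isCrossProject && e.trusted === false);
}

// Fixed behavior: skip a foreign row unless trusted is exactly true.
function admitAllowlist(e: Learning, isCrossProject: boolean): boolean {
  return !(isCrossProject && e.trusted !== true);
}

const rows: Learning[] = [
  { project: "other", text: "legacy row, no trusted field" },
  { project: "other", text: "explicitly untrusted", trusted: false },
  { project: "other", text: "user-stated", trusted: true },
];

for (const row of rows) {
  // Only the legacy row differs: admitted by the denylist, excluded by the allowlist.
  console.log(row.text, admitDenylist(row, true), admitAllowlist(row, true));
}
```

In this sketch, rows from the current project (`isCrossProject` false) are admitted by both predicates, so the flip only changes what crosses a project boundary.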

Reported by @jbetala7.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(safety): one-way-door classifier catches "rotate ... password" (#1839)

scripts/one-way-doors.ts is the secondary safety net for ad-hoc AskUserQuestion
ids with no registry entry; a false negative auto-approves a destructive op. The
revoke and reset credential patterns both include `password`, but the rotate
pattern omitted it, so the most common phrasing ("rotate the database password")
classified as a reversible two-way question.

Add `password` to the rotate alternation so all three verbs are parallel. New test
covers rotate+password, the revoke/reset/rotate parallel, and rotate's other nouns.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(review): route .mjs/.cjs/.mts/.cts changes to the backend reviewer (#1810)

gstack-diff-scope backend detection matched only *.ts|*.js. Modern Node ships
backend code as ESM (.mjs) / CommonJS (.cjs) and explicit-module TS (.mts/.cts);
none matched any category, so a PR touching only those files reported no backend
scope and the Review Army skipped the backend reviewer.

Add the four module extensions to the backend case. Test covers all four.

Reported by @jbetala7.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(brain-cache): loadMeta tolerates malformed _meta.json without crashing (#1879)

loadMeta returned the parsed JSON verbatim. A valid JSON file that lacked the
last_refresh map made three consumers (isStale, cmdInvalidate, refreshEntity)
throw a TypeError dereferencing meta.last_refresh — the sibling last_attempt was
already guarded, last_refresh wasn't.

Fix in loadMeta:
- Shape-guard: JSON.parse can return null/array/string/number; non-object → fresh meta.
- Normalize ONLY the dereferenced maps (last_refresh, last_attempt).
- Deliberately do NOT default schema_version/endpoint_hash. Leaving them absent
  makes schemaVersionMismatch()/endpointSwitched() force a rebuild (missing
  identity = mismatch = safe); defaulting them would suppress cache invalidation
  and trust a stale file of unknown provenance.

Tests: missing last_refresh no longer throws; null/array/primitive treated as cold;
missing schema_version forces rebuild instead of a trusted warm hit.
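
A sketch of the shape guard described above, under assumptions: `BrainMeta`, `isPlainObject`, `asMap`, and `normalizeMeta` are hypothetical names, the field types are guesses, and the try/catch around `JSON.parse` is an added precaution that the commit does not describe.

```typescript
// Hypothetical meta shape; field types are assumptions for illustration.
interface BrainMeta {
  schema_version?: number;
  endpoint_hash?: string;
  last_refresh: Record<string, number>;
  last_attempt: Record<string, number>;
}

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Normalize a dereferenced map: anything that is not a plain object becomes {}.
function asMap(v: unknown): Record<string, number> {
  return isPlainObject(v) ? (v as Record<string, number>) : {};
}

function normalizeMeta(raw: string): BrainMeta {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    parsed = null; // assumption: unparseable input is treated as a cold cache
  }
  // JSON.parse can yield null, an array, a string, or a number: start fresh.
  if (!isPlainObject(parsed)) return { last_refresh: {}, last_attempt: {} };

  const meta: BrainMeta = {
    last_refresh: asMap(parsed["last_refresh"]),
    last_attempt: asMap(parsed["last_attempt"]),
  };
  // Identity fields are copied only when present and are never defaulted, so a
  // missing value stays missing and downstream checks can force a rebuild.
  const version = parsed["schema_version"];
  if (typeof version === "number") meta.schema_version = version;
  const hash = parsed["endpoint_hash"];
  if (typeof hash === "string") meta.endpoint_hash = hash;
  return meta;
}

console.log(normalizeMeta("null"));
console.log(normalizeMeta('{"last_attempt":{"a":1}}'));
```

The asymmetry is deliberate: the maps are defaulted because consumers dereference them, while the identity fields are left absent because their absence is itself the signal to rebuild.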

Reported by @jbetala7.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(skills): anchor guard/freeze/careful hook paths so they survive CC 2.1.162 (#1871)

The PreToolUse frontmatter hooks for guard, freeze, and careful invoked
`bash ${CLAUDE_SKILL_DIR}/.../check-*.sh`. Claude Code 2.1.162 no longer populates
${CLAUDE_SKILL_DIR} in the skill-hook execution env, so it expanded to empty and
every Edit/Write/Bash ran `bash /...` and errored — breaking the safety skills
entirely.

Frontmatter hooks run before any skill-body bash, so no runtime-resolved variable
can fix this; the command must be a path that's valid at hook time. Anchor to the
installed checkout: $HOME/.claude/skills/gstack/{careful,freeze}/bin/check-*.sh,
where the scripts actually live. ($HOME is expanded by the hook shell.)

Reported by @omariani-howdy. Regenerated the three SKILL.md from templates.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore: v1.58.0.0 — fix-wave release notes, VERSION bump, #1882 TODO

CHANGELOG entry for the 8-fix safety wave (#1899, #1868, #1824, #1745, #1839,
#1810, #1879, #1871). VERSION + package.json to 1.58.0.0 (MINOR — coordinated
multi-file safety fixes on top of main's 1.57.3.0). #1882 filed as the top
TODOS.md item (scoped out of this wave per decision; host-config change touching
all 52 skills, distinct from the #1871 hook fix).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(learnings): strip backticks from #1745 comment inside the bun -e block

The #1745 trust-gate fix added an explanatory comment containing backticks
(`=== false`) and the JS block is a double-quoted `bun -e "..."` bash string, so
bash command-substituted the backtick contents on every cross-project search —
polluting stderr with "command not found" and leaving a latent shell-injection /
source-corruption surface in a security gate. Caught by the wave's own adversarial
review (#1899 framing working as intended). Reworded the comments to avoid backticks
and dollar-paren entirely; the gate logic is unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(golden): refresh ship golden baselines (#1899 prompt + main's PR-title line)

The three ship golden fixtures were stale: main's v1.57.3.0 added the always-loaded
PR-title invariant to ship/SKILL.md but did not regenerate the goldens (the golden
regression test fails on main too), and the codex golden still carried an unresolved
${ctx.paths.binDir} token. Regenerated from the current generated ship skills, which
also picks up this wave's #1899 adversarial-prompt framing (inlined for codex/factory).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 06:39:38 -07:00

335 lines
13 KiB
TypeScript
/**
 * Unit tests for lib/redact-engine.ts + lib/redact-patterns.ts.
 *
 * One positive test per pattern, plus FP-filters, validators (Luhn/entropy/
 * RFC1918), email allowlist, no-promotion visibility semantics, tool-fence
 * degrade, normalization (zero-width / homoglyph / entity), oversize fail-closed,
 * and pure-function purity.
 */
import { describe, test, expect } from "bun:test";
import {
  scan,
  exitCodeFor,
  maskPreview,
  normalizeWithMap,
  type RepoVisibility,
} from "../lib/redact-engine";
import {
  PATTERNS,
  luhnValid,
  shannonEntropy,
  isPublicIPv4,
  isPlaceholderSpan,
} from "../lib/redact-patterns";

function ids(text: string, vis: RepoVisibility = "private"): string[] {
  return scan(text, { repoVisibility: vis }).findings.map((f) => f.id);
}

describe("HIGH credential patterns", () => {
  const cases: Array<[string, string]> = [
    ["aws.access_key", "key = AKIA1234567890ABCDEF"],
    ["aws.secret_key", "aws_secret_access_key = AbCdEfGhIjKlMnOpQrStUvWxYz0123456789AbCd"],
    ["github.pat", "token ghp_" + "1234567890abcdefghijklmnopqrstuvwxyz"],
    ["github.oauth", "gho_" + "1234567890abcdefghijklmnopqrstuvwxyz"],
    ["github.server", "ghs_1234567890abcdefghijklmnopqrstuvwxyz"],
    ["github.fine_grained", "github_pat_" + "A".repeat(82)],
    ["anthropic.key", "sk-ant-" + "api03-abcdefghij1234567890XYZ"],
    ["openai.key", "sk-proj-" + "a".repeat(40)],
    ["sendgrid.key", "SG." + "a".repeat(22) + "." + "b".repeat(43)],
    ["stripe.secret", "sk_live_" + "a".repeat(30)],
    ["slack.token", "xox" + "b-1234567890-abcdefghijklmnop"],
    ["slack.webhook", "https://hooks.slack.com/services/T00000000/B11111111/" + "a".repeat(24)],
    ["discord.webhook", "https://discord.com/api/webhooks/123456789012345678/" + "a".repeat(60)],
    ["pem.private_key", "-----BEGIN RSA PRIVATE KEY-----"],
  ];
  for (const [id, text] of cases) {
    test(`flags ${id}`, () => {
      expect(ids(text)).toContain(id);
    });
  }

  // #1868 — modern OpenAI keys use base64url bodies (with - and _). The old
  // [A-Za-z0-9]{32,} regex stopped at the first separator and missed them all,
  // failing a HIGH credential OPEN through the redaction gate.
  test("openai.key flags modern sk-proj-/sk-svcacct-/sk-admin- shapes (#1868)", () => {
    const missed = [
      "sk-proj-Ab12_Cd34-Ef56Gh78Ij90Kl12Mn34Op56Qr78St90Uv",
      "sk-svcacct-abc_def-ghijklmnopqrstuvwxyz0123456789ABCDEF",
      "sk-admin-AAAA_BBBB-CCCC_DDDD-EEEE_FFFF-GGGG_HHHH1234",
    ];
    for (const key of missed) {
      expect(ids(`OPENAI_API_KEY=${key}`)).toContain("openai.key");
    }
    // legacy contiguous shape still flags
    expect(ids("sk-proj-" + "a".repeat(40))).toContain("openai.key");
  });

  test("openai.key does not over-match prose / malformed sk- strings (#1868 calibration)", () => {
    // HIGH tier BLOCKS, so false positives on prose are costly. None of these
    // should flag as openai.key.
    const benign = [
      "the sk-learning-rate-schedule-was-tuned-carefully", // hyphenated prose
      "sk--double-dash-typo-not-a-real-key",
      "use sk-proj for the project prefix in docs", // no body
      "sk-short", // too short, no prefix
    ];
    for (const text of benign) {
      expect(ids(text)).not.toContain("openai.key");
    }
  });

  test("twilio.auth_token needs an SID nearby", () => {
    const sid = "AC" + "a".repeat(32);
    const tok = "b".repeat(32);
    expect(ids(`account ${sid} token ${tok}`)).toContain("twilio.auth_token");
    // bare 32-hex with no SID nearby should NOT flag as twilio
    expect(ids(`random ${tok} here`)).not.toContain("twilio.auth_token");
  });

  test("db.url_with_password flags real password, skips placeholder/env-var", () => {
    expect(ids("postgres://user:s3cretP@ss@db.example.com/app")).toContain("db.url_with_password");
    expect(ids("postgres://user:${DB_PASSWORD}@host/app")).not.toContain("db.url_with_password");
  });

  test("all HIGH patterns block (exit 3)", () => {
    const r = scan("AKIA1234567890ABCDEF", { repoVisibility: "private" });
    expect(exitCodeFor(r)).toBe(3);
  });
});

describe("MEDIUM demoted credential-shaped patterns (TENSION-1)", () => {
  test("stripe.publishable is MEDIUM not HIGH", () => {
    const f = scan("pk_live_" + "a".repeat(30), { repoVisibility: "private" }).findings.find(
      (x) => x.id === "stripe.publishable",
    );
    expect(f?.tier).toBe("MEDIUM");
  });

  test("google.api_key is MEDIUM", () => {
    const f = scan("AIza" + "a".repeat(35), { repoVisibility: "private" }).findings.find(
      (x) => x.id === "google.api_key",
    );
    expect(f?.tier).toBe("MEDIUM");
  });

  test("jwt is MEDIUM", () => {
    const jwt = "eyJhbGciOiJ.eyJzdWIiOiI." + "x".repeat(20);
    const f = scan(jwt, { repoVisibility: "private" }).findings.find((x) => x.id === "jwt");
    expect(f?.tier).toBe("MEDIUM");
  });

  test("env.kv fires on high-entropy, skips placeholder", () => {
    expect(ids("API_TOKEN=8Fk2pQ9vXz4wL7mN3rT6yB1cD5eG0hJ")).toContain("env.kv");
    expect(ids("API_KEY=changeme")).not.toContain("env.kv");
    expect(ids("API_KEY=${MY_VAR}")).not.toContain("env.kv");
  });
});

describe("PII patterns", () => {
  test("email flags + is autoRedactable", () => {
    const f = scan("ping alice@corp.io please", { repoVisibility: "private" }).findings.find(
      (x) => x.id === "pii.email",
    );
    expect(f).toBeTruthy();
    expect(f?.autoRedactable).toBe(true);
  });

  test("email allowlist: example.com, noreply, self, repo-public", () => {
    expect(ids("see user@example.com")).not.toContain("pii.email");
    expect(ids("from noreply@github.com")).not.toContain("pii.email");
    expect(
      scan("me@garry.dev", { repoVisibility: "private", selfEmail: "me@garry.dev" }).findings,
    ).toHaveLength(0);
    expect(
      scan("bob@acme.co", { repoVisibility: "private", repoPublicEmails: ["bob@acme.co"] }).findings,
    ).toHaveLength(0);
  });

  test("phone E.164", () => {
    expect(ids("call +14155550123 now")).toContain("pii.phone.e164");
  });

  test("ssn flags valid, skips 000 octet", () => {
    expect(ids("ssn 123-45-6789")).toContain("pii.ssn");
    expect(ids("000-12-3456")).not.toContain("pii.ssn");
  });

  test("credit card needs Luhn", () => {
    expect(ids("card 4111111111111111")).toContain("pii.cc");
    expect(ids("num 4111111111111112")).not.toContain("pii.cc");
  });

  test("public IP flagged, RFC1918 skipped", () => {
    expect(ids("connect 8.8.8.8")).toContain("pii.ip_public");
    expect(ids("local 192.168.1.5")).not.toContain("pii.ip_public");
    expect(ids("local 10.0.0.1")).not.toContain("pii.ip_public");
  });
});

describe("internal + legal patterns", () => {
  test("internal hostname", () => {
    expect(ids("db1.corp internal host")).toContain("internal.hostname");
  });

  test("localhost url with path", () => {
    expect(ids("hit http://localhost:8080/admin/secrets")).toContain("internal.url_private");
  });

  test("NDA marker", () => {
    expect(ids("This is CONFIDENTIAL material")).toContain("legal.nda_marker");
  });

  test("named criticism needs a capitalized full name nearby", () => {
    expect(ids("John Smith is incompetent at this")).toContain("legal.named_criticism");
    expect(ids("the build is incompet019ently configured".replace("019", ""))).not.toContain(
      "legal.named_criticism",
    );
  });
});

describe("LOW patterns surface only", () => {
  test("user path is LOW", () => {
    const f = scan("/Users/bob/secret/config", { repoVisibility: "private" }).findings.find(
      (x) => x.id === "internal.user_path",
    );
    expect(f?.tier).toBe("LOW");
  });

  test("TODO marker is LOW", () => {
    const f = scan("TODO(alice) fix later", { repoVisibility: "private" }).findings.find(
      (x) => x.id === "hygiene.todo",
    );
    expect(f?.tier).toBe("LOW");
  });
});

describe("placeholder suppression (per-span)", () => {
  test("AWS docs EXAMPLE key not flagged", () => {
    expect(ids("AKIAIOSFODNN7EXAMPLE")).not.toContain("aws.access_key");
  });

  test("your_ prefix not flagged", () => {
    expect(isPlaceholderSpan("your_api_key")).toBe(true);
  });

  test("a real secret on a line that ALSO contains EXAMPLE still flags", () => {
    // line-based suppression would wrongly skip this; per-span must catch it.
    expect(ids("# EXAMPLE usage\nkey AKIA1234567890ABCDEF")).toContain("aws.access_key");
  });
});

describe("no visibility-based tier promotion (TENSION-2-followup)", () => {
  test("email stays MEDIUM on both private and public", () => {
    const priv = scan("x@corp.io", { repoVisibility: "private" }).findings[0];
    const pub = scan("x@corp.io", { repoVisibility: "public" }).findings[0];
    expect(priv.tier).toBe("MEDIUM");
    expect(pub.tier).toBe("MEDIUM");
    expect(pub.severity).toBe("MEDIUM"); // NOT promoted to HIGH
    expect(pub.repoVisibility).toBe("public"); // recorded for sterner wording
  });

  test("demoted credential patterns stay MEDIUM on public", () => {
    const pub = scan("pk_live_" + "a".repeat(30), { repoVisibility: "public" }).findings[0];
    expect(pub.severity).toBe("MEDIUM");
  });

  test("unknown visibility treated as public for wording, still no promotion", () => {
    const r = scan("x@corp.io", { repoVisibility: "unknown" });
    expect(r.findings[0].severity).toBe("MEDIUM");
  });
});

describe("tool-attributed fence WARN-degrade (TENSION-3)", () => {
  test("placeholder-shaped credential in tool fence → WARN", () => {
    const text = "```codex-review\nfound your_aws_key AKIAIOSFODNN7EXAMPLE in code\n```";
    const r = scan(text, { repoVisibility: "private" });
    // the EXAMPLE key is suppressed as placeholder; verify a non-credential note doesn't block
    expect(r.counts.HIGH).toBe(0);
  });

  test("live-format credential in tool fence STILL blocks", () => {
    const text = "```codex-review\nleaked AKIA1234567890ABCDEF here\n```";
    const r = scan(text, { repoVisibility: "private" });
    expect(r.counts.HIGH).toBe(1); // not degraded — live format
  });

  test("AKIA outside any fence blocks", () => {
    expect(exitCodeFor(scan("AKIA1234567890ABCDEF", {}))).toBe(3);
  });
});

describe("normalization", () => {
  test("zero-width chars inside a key are stripped before matching", () => {
    // Written as an explicit escape so the zero-width space is visible in source.
    const zwsp = "\u200B";
    const broken = "AKIA1234567890" + zwsp + "ABCDEF";
    expect(ids(broken)).toContain("aws.access_key");
  });

  test("HTML entity decode", () => {
    const { normalized } = normalizeWithMap("a &amp; b");
    expect(normalized).toBe("a & b");
  });

  test("offset map points back into original", () => {
    // A zero-width space sits between 'y' and 'z' in the original input.
    const input = "xy\u200Bz";
    const { normalized, map } = normalizeWithMap(input);
    expect(normalized).toBe("xyz");
    // 'z' is at normalized index 2, original index 3
    expect(map[2]).toBe(3);
  });
});

describe("oversize fails CLOSED", () => {
  test("input over the byte cap returns a single blocking HIGH finding", () => {
    const big = "a".repeat(2000);
    const r = scan(big, { maxBytes: 1000 });
    expect(r.oversize).toBe(true);
    expect(r.counts.HIGH).toBe(1);
    expect(r.findings[0].id).toBe("engine.input_too_large");
    expect(exitCodeFor(r)).toBe(3);
  });

  // #1824: a malformed --max-bytes used to reach the engine as NaN. `byteLen >
  // NaN` is always false, silently disabling the fail-closed guard. The engine
  // guardrail must fall back to the default cap for any non-finite / <= 0 value.
  test("NaN maxBytes falls back to the default cap (does NOT disable the guard)", () => {
    const big = "a".repeat(2 * 1024 * 1024); // > 1 MiB default cap
    const r = scan(big, { maxBytes: NaN });
    expect(r.oversize).toBe(true);
    expect(r.findings[0].id).toBe("engine.input_too_large");
    expect(exitCodeFor(r)).toBe(3);
  });

  test("negative / zero maxBytes falls back to the default cap", () => {
    // negative would make `byteLen > -5` always true (block everything);
    // the guardrail normalizes it to the default instead.
    const small = "ok";
    expect(scan(small, { maxBytes: -5 }).oversize).toBeFalsy();
    expect(scan(small, { maxBytes: 0 }).oversize).toBeFalsy();
    const big = "a".repeat(2 * 1024 * 1024);
    expect(scan(big, { maxBytes: -5 }).oversize).toBe(true);
  });
});

describe("validators", () => {
  test("luhn", () => {
    expect(luhnValid("4111111111111111")).toBe(true);
    expect(luhnValid("4111111111111112")).toBe(false);
  });

  test("entropy", () => {
    expect(shannonEntropy("aaaaaaaa")).toBeLessThan(1);
    expect(shannonEntropy("8Fk2pQ9vXz4wL7mN")).toBeGreaterThan(3);
  });

  test("isPublicIPv4", () => {
    expect(isPublicIPv4("8.8.8.8")).toBe(true);
    expect(isPublicIPv4("10.1.2.3")).toBe(false);
    expect(isPublicIPv4("172.16.5.5")).toBe(false);
    expect(isPublicIPv4("999.1.1.1")).toBe(false);
  });
});

describe("masking + purity", () => {
  test("preview never leaks more than 4 leading chars", () => {
    expect(maskPreview("AKIA1234567890ABCDEF")).toBe("AKIA********…");
    expect(maskPreview("abc")).toBe("abc");
  });

  test("scan is pure — same input twice yields identical findings", () => {
    const a = scan("AKIA1234567890ABCDEF x@corp.io", { repoVisibility: "public" });
    const b = scan("AKIA1234567890ABCDEF x@corp.io", { repoVisibility: "public" });
    expect(a).toEqual(b);
  });
});

describe("taxonomy integrity", () => {
  test("every pattern has a unique id", () => {
    const set = new Set(PATTERNS.map((p) => p.id));
    expect(set.size).toBe(PATTERNS.length);
  });

  test("autoRedactable patterns have a redactToken", () => {
    for (const p of PATTERNS) {
      if (p.autoRedactable) expect(p.redactToken).toBeTruthy();
    }
  });
});