fix: anonymize entries when downloading the full repo as a zip

The streaming zip pipeline constructed AnonymizeTransformer first and only
then assigned opt.filePath. AnonymizeTransformer determines isText in its
constructor from opt.filePath, so every entry was classified as binary and
passed through unchanged. As a result, the downloaded zip leaked the
original (un-anonymized) terms even though the web view scrubbed them.

Pass filePath via the constructor so isText is computed correctly.
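
A minimal sketch of the ordering problem described above, not the project's
actual code: SketchTransformer, its extension check, and the XXXX-n mask format
are hypothetical stand-ins for AnonymizeTransformer, which is a stream
transform with its own text detection. The sketch only assumes what this
message states, namely that isText is computed once in the constructor from
opt.filePath.

```typescript
// Hypothetical stand-in for AnonymizeTransformer (illustration only).
interface SketchOptions {
  terms: string[];
  filePath?: string;
}

class SketchTransformer {
  opt: SketchOptions;
  readonly isText: boolean;

  constructor(opt: SketchOptions) {
    this.opt = opt;
    // Decided once, from whatever filePath is set at construction time.
    // The extension list is an assumption made for this sketch.
    this.isText = /\.(txt|md|ts|js|json)$/i.test(opt.filePath ?? "");
  }

  transform(chunk: string): string {
    if (!this.isText) return chunk; // treated as binary: passthrough
    // Replace each term with a placeholder (mask format is hypothetical).
    return this.opt.terms.reduce(
      (out, term, i) => out.split(term).join(`XXXX-${i + 1}`),
      chunk
    );
  }
}

const options: SketchOptions = { terms: ["acme"] };
const input = "maintained by acme";

// Old order: construct, then assign. isText was already computed as false.
const late = new SketchTransformer({ ...options });
late.opt.filePath = "README.md";
const lateOut = late.transform(input);

// New order: filePath is part of the constructor argument.
const early = new SketchTransformer({ ...options, filePath: "README.md" });
const earlyOut = early.transform(input);

console.log(late.isText, lateOut);
console.log(early.isText, earlyOut);
```

In this sketch the late assignment leaves the text untouched, while the
constructor argument lets the term be masked.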

Fixes #342, #349.
tdurieux
2026-05-03 19:47:10 +02:00
parent 9feeab1055
commit d8b129c670
2 changed files with 54 additions and 5 deletions
+8 -2
@@ -71,8 +71,14 @@ export async function streamAnonymizedZip(
entry.path.substring(entry.path.indexOf("/") + 1),
opt.anonymizerOptions.terms || []
);
-const anonymizer = new AnonymizeTransformer(opt.anonymizerOptions);
-anonymizer.opt.filePath = fileName;
+// Pass filePath via the constructor: AnonymizeTransformer reads it
+// there to decide whether the entry is text (and therefore should be
+// anonymized) vs binary (passthrough). Assigning afterwards leaves
+// isText=false for every file, so the zip ships unanonymized.
+const anonymizer = new AnonymizeTransformer({
+...opt.anonymizerOptions,
+filePath: fileName,
+});
const st = entry.pipe(anonymizer);
archive.append(st, { name: fileName });
} catch (error) {