fix(cache): make Zip-source caches atomic and robust to partial state

Follow-up to the GitHubStream cache fixes. The same poisoned-cache
class existed in the GitHubDownload path and a few related spots:

- GitHubDownload.download: wipe pre-existing state before extracting
  and write a .anon-complete marker only after a successful extract.
  On error, rm the partial cache so a retry starts clean. getFileContent
  and getFiles now gate on the marker instead of "any file/folder
  exists," so a half-extracted tree can never be served as canonical.
- GitHubDownload.getFileContent: validate cached file size against the
  upstream FileModel size (via the new AnonymizedFile.size()), same
  guard as GitHubStream. getFiles filters the marker from the listing.
- FileSystem.listFiles: drop the bogus stats.ino.toString() as sha.
  An inode isn't a content hash; anything comparing it to a Git blob
  sha would silently disagree. Leave undefined.
- S3.write: remove the fire-and-forget data.on("error") -> this.rm(...).
  Multipart Upload doesn't commit partial objects, so there was nothing
  to clean up, and the handler raced retries and could delete a
  previously-good object on a transient source-stream hiccup. The
  size-validated read path recovers from any other undersized objects.
- GitHubStream.resolveLfsPointer: drop the post-decision early-return
  in blobStream.on("error"). It is currently redundant with the inner
  listener, but dropping it removes a future-refactor footgun.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
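
The marker-gated cache described in the first bullet can be sketched roughly as follows. This is a minimal, synchronous illustration and not the actual GitHubDownload code: populateCache, listCached, isCacheComplete, and the extract callback are hypothetical names, and only the .anon-complete marker name comes from the commit message.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Marker name taken from the commit message; everything else is illustrative.
const MARKER = ".anon-complete";

// A cache directory counts as usable only once the marker exists.
function isCacheComplete(cacheDir: string): boolean {
  return fs.existsSync(path.join(cacheDir, MARKER));
}

// Populate the cache through `extract`, a stand-in for the real unzip step.
function populateCache(cacheDir: string, extract: (dest: string) => void): void {
  // Wipe pre-existing state, including any stale marker.
  fs.rmSync(cacheDir, { recursive: true, force: true });
  fs.mkdirSync(cacheDir, { recursive: true });
  try {
    extract(cacheDir);
    // Written last, so its presence implies the extract finished.
    fs.writeFileSync(path.join(cacheDir, MARKER), "");
  } catch (err) {
    // Remove the partial tree so a retry starts clean.
    fs.rmSync(cacheDir, { recursive: true, force: true });
    throw err;
  }
}

// List cached entries with the marker hidden; null means "not cached".
function listCached(cacheDir: string): string[] | null {
  if (!isCacheComplete(cacheDir)) return null;
  return fs.readdirSync(cacheDir).filter((name) => name !== MARKER);
}
```

Because readers check the marker rather than "does anything exist", a crash between extraction and the marker write leaves the directory looking uncached instead of looking complete.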
tdurieux
2026-05-05 08:54:42 +03:00
parent 9adff11e74
commit f413a30313
4 changed files with 74 additions and 13 deletions
+5 -1
@@ -153,13 +153,17 @@ export default class FileSystem extends StorageBase {
           size: stats.size,
         });
       }
+      // Don't synthesise a sha here. The previous value (stats.ino)
+      // wasn't a content hash — just an inode number — and any code
+      // that compared it to an upstream Git blob sha would silently
+      // disagree. Leave it undefined so callers either look up the
+      // real sha from FileModel/GitHub or skip sha-keyed paths.
       output2.push(
         new FileModel({
           name: file,
           path: dir,
           repoId: repoId,
           size: stats.size,
-          sha: stats.ino.toString(),
         })
       );
     }
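
For context on why an inode number can never match a Git blob sha: for repositories using the SHA-1 object format, Git derives a blob id by hashing a short header followed by the file bytes. The helper below is a hypothetical sketch (gitBlobSha1 is not part of this codebase) showing what a caller would have to compute to get a comparable value.

```typescript
import { createHash } from "crypto";

// Git hashes the header "blob <byte length>\0" followed by the raw content.
// Assumes the SHA-1 object format; SHA-256 repositories use a different hash.
function gitBlobSha1(content: Buffer): string {
  const header = Buffer.from(`blob ${content.length}\0`, "utf8");
  return createHash("sha1").update(header).update(content).digest("hex");
}

// The empty blob has a well-known id.
console.log(gitBlobSha1(Buffer.alloc(0)));
// prints "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
```

The result depends only on the bytes of the file, whereas stats.ino depends on where the filesystem happened to place it, so the two values are unrelated.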