anonymous_github

mirror of https://github.com/tdurieux/anonymous_github.git synced 2026-05-15 22:48:00 +02:00

Author	SHA1	Message	Date
tdurieux	c2d43164d0	error logging improvement, regex fix	2026-05-06 11:16:12 +03:00
tdurieux	5b72b630c4	fix: silent-truncation, token-refresh, and content-type bugs across hot paths Follow-up review pass after the cache fixes turned up several bugs in the same family — silent failures that look like success to the client, plus content-correctness issues in the ZIP and per-file delivery paths. - zipStream: stop calling archive.finalize() on upstream/parser errors. That produced a valid-looking ZIP (200 OK, archive opens) silently missing entries — same class as #694, but worse because the user has no signal anything went wrong. Destroy the response on failure instead so the client sees a connection drop. - zipStream: apply per-repo image/pdf gates inside the entry handler. The single-file /file/... endpoint refuses to serve those types via AnonymizedFile.isFileSupported when image=false / pdf=false, but the ZIP shipped them anyway — privacy-relevant for maintainers who toggle image=false to suppress identifying screenshots. Threaded contentOptions through both ZIP entry points (direct and streamer). - GitHubUtils.getToken: validate the OAuth token-refresh response before persisting. On a non-2xx response or a body without a string token, we used to overwrite the stored token with `undefined`, which then propagated as `Authorization: token undefined` to every API call — 401 even on public repos, with the config.GITHUB_TOKEN fallback unreachable because the field was no longer falsy. - AnonymizedFile.send (streamer branch): forward Content-Type from the upstream streamer response. got.stream(...).pipe(res) carries body bytes only, so the parent response had no Content-Type and browsers guessed (text rendered as download, etc.). Also resolve on res.on("finish") in addition to "close" — keep-alive sockets stay open long after the response is delivered, delaying countView(). - Repository.updateIfNeeded: persist a renamed source.repositoryName even when the commit hasn't changed. 
Previously the new value lived in memory only and was overwritten on the next reload, so the rename detection ran every request. - Repository.anonymize: stop materialising a dummy {path:"",name:""} FileModel for empty repos. That row collided with the special case in AnonymizedFile.getFileInfo and surfaced in unfiltered listings. - streamer/route POST /: reject filePath segments containing ".." or empty parts. Defence in depth — the parent server validates against FileModel before calling, but the streamer joins filePath straight into the storage path, so any future caller forwarding an unvalidated path could traverse out of the repo root. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-05 09:19:05 +03:00
tdurieux	9adff11e74	fix(cache): atomic file writes and size-validated cache reads A failed/interrupted GitHub fetch could leave a 0-byte or truncated file in the local cache. Subsequent reads happily streamed the empty content as the file's body — visible to users as an "Empty file" with HTTP 200. Reproduced on artifact-70B6/Lethe/configs.py (#694). - FileSystem.write: stream into a sibling .tmp and rename into place only on finish. Stream errors discard the tmp and leave any prior cached file untouched. Drop the utf-8 encoding that was silently corrupting binary blobs. - GitHubStream.getFileContentCache: accept an expected size and treat cached.size < expected as a poisoned cache (truncated fetch) → rm and re-fetch. cached.size >= expected is accepted, which keeps Git LFS-resolved files (whose FileModel.size is the pointer size) working. - AnonymizedFile: expose size() and pass it through to the streamer alongside sha so the cache check has the upstream size. Existing poisoned entries self-heal on next access. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-05 08:47:41 +03:00
tdurieux	a30ab7fb96	fix: don't declare Accept-Ranges: none for binary files The server set Accept-Ranges: none on every file response. For text we anonymize on the fly so byte ranges aren't meaningful, but binary entries pass through unchanged — and the explicit "none" header makes some browsers refuse to play <video>/<audio> elements that would otherwise fall back to a full download. Newly uploaded MP4s under the inline-preview threshold rendered as a blank progress bar (#538). Only set Accept-Ranges: none for text entries; let binary entries omit it so the standard fallback kicks in. Fixes #538.	2026-05-03 21:23:59 +02:00
tdurieux	a5f66d6844	multiple fixes	2026-05-03 15:30:54 +02:00
Thomas Durieux	f4209110c7	Fix all 93 ESLint issues (3 errors, 90 warnings) (#666)	2026-04-15 09:04:22 +02:00
tdurieux	f93eb8787e	fix: protect archive.finalize	2024-07-22 16:31:52 +02:00
tdurieux	d8dd408a65	fix: avoid cache of list of files	2024-07-22 16:20:18 +02:00
tdurieux	532c094388	fix: improve token management	2024-06-18 12:00:53 +02:00
tdurieux	dcf483ea03	feat: improve download anonymized repository	2024-05-06 11:52:32 +02:00
tdurieux	93606a5c39	fix: catch error when requesting a folder	2024-05-03 10:49:25 +02:00
tdurieux	3a00a27153	feat: improve support for binary & audio files	2024-04-28 10:01:40 +01:00
tdurieux	2a145730b7	Improve log and GH token validation	2024-04-27 16:19:33 +01:00
tdurieux	a86e050f8b	fix: handle empty repository	2024-04-26 13:48:32 +01:00
tdurieux	9048b5c3b1	fix: fix healthcheck	2024-04-05 13:11:43 +01:00
tdurieux	1d4bab7866	fix: fix webview & improve download progress	2024-04-03 18:25:33 +01:00
tdurieux	db67f53b2c	fix: fix GitHubDownload	2024-04-03 13:24:34 +01:00
tdurieux	4d12641c7e	feat: introduce streamers that handle the stream and anonymization from github	2024-04-03 11:13:01 +01:00
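
The cache fix in 9adff11e74 describes two ideas: write into a sibling .tmp file and rename it into place only once the stream has finished, and treat a cached file smaller than the expected size as poisoned. The sketch below illustrates both under stated assumptions. The names `writeAtomically` and `isCacheEntryUsable` are hypothetical and are not taken from the repository; the actual `FileSystem.write` and `GitHubStream.getFileContentCache` implementations may differ.

```typescript
import { createWriteStream } from "node:fs";
import { rename, rm } from "node:fs/promises";
import { Readable } from "node:stream";
import { pipeline } from "node:stream/promises";

// Hypothetical helper (assumption): stream `content` into `<target>.tmp`,
// then rename it over the target only after the write completed.
async function writeAtomically(
  targetPath: string,
  content: Readable
): Promise<void> {
  const tmpPath = `${targetPath}.tmp`;
  try {
    // No encoding is set on the write stream, so binary chunks are written unchanged.
    await pipeline(content, createWriteStream(tmpPath));
    // The rename happens only after pipeline resolved, i.e. the tmp file is complete.
    await rename(tmpPath, targetPath);
  } catch (error) {
    // On any stream error, discard the partial tmp file; a previously
    // cached file at targetPath is left untouched.
    await rm(tmpPath, { force: true });
    throw error;
  }
}

// Hypothetical helper (assumption): decide whether a cached file can be served.
// A smaller-than-expected file is treated as a truncated fetch. A larger file
// is accepted, which matches the commit's note about Git LFS files whose
// recorded size is the pointer size.
function isCacheEntryUsable(cachedSize: number, expectedSize?: number): boolean {
  if (expectedSize === undefined) {
    return true; // nothing to compare against
  }
  return cachedSize >= expectedSize;
}
```

A caller would check `isCacheEntryUsable` before streaming a cached file and, on `false`, remove the entry and fetch again through `writeAtomically`. Note that this sketch uses a fixed `.tmp` suffix, so two concurrent writers to the same target would need additional coordination.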

18 Commits
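
Commit 5b72b630c4 mentions rejecting filePath segments that contain ".." or empty parts before the streamer joins the path into the storage path. A minimal sketch of such a check follows. The function name `isSafeFilePath` is hypothetical, and the choice to reject only segments that are exactly ".." (rather than any segment that includes two dots) is an assumption, since the commit message does not spell out the exact rule.

```typescript
// Hypothetical helper (assumption): returns true only when every path
// segment is non-empty and none of them is a parent-directory reference.
function isSafeFilePath(filePath: string): boolean {
  // Split on both separators so "a\\..\\b" is handled like "a/../b".
  const segments = filePath.split(/[\\/]/);
  // Empty segments come from leading, trailing, or doubled separators
  // (and from an empty input string).
  return segments.every((segment) => segment !== "" && segment !== "..");
}
```

Under these assumptions, `isSafeFilePath("src/index.ts")` passes, while `"../etc/passwd"`, `"src//index.ts"`, and `"/abs/path"` are all rejected. A route handler would run the check first and answer with a client error before touching storage.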