Follow-up review pass after the cache fixes turned up several bugs in the same family: silent failures that look like success to the client, plus content-correctness issues in the ZIP and per-file delivery paths.

- zipStream: stop calling archive.finalize() on upstream/parser errors. That produced a valid-looking ZIP (200 OK, archive opens) silently missing entries, the same class as #694 but worse because the user has no signal anything went wrong. Destroy the response on failure instead so the client sees a connection drop.
- zipStream: apply per-repo image/pdf gates inside the entry handler. The single-file /file/... endpoint refuses to serve those types via AnonymizedFile.isFileSupported when image=false / pdf=false, but the ZIP shipped them anyway, which is privacy-relevant for maintainers who toggle image=false to suppress identifying screenshots. Threaded contentOptions through both ZIP entry points (direct and streamer).
- GitHubUtils.getToken: validate the OAuth token-refresh response before persisting. On a non-2xx response or a body without a string token, we used to overwrite the stored token with `undefined`, which then propagated as `Authorization: token undefined` to every API call: 401 even on public repos, with the config.GITHUB_TOKEN fallback unreachable because the field was no longer falsy.
- AnonymizedFile.send (streamer branch): forward Content-Type from the upstream streamer response. got.stream(...).pipe(res) carries body bytes only, so the parent response had no Content-Type and browsers guessed (text rendered as download, etc.). Also resolve on res.on("finish") in addition to "close", because keep-alive sockets stay open long after the response is delivered, delaying countView().
- Repository.updateIfNeeded: persist a renamed source.repositoryName even when the commit hasn't changed. Previously the new value lived in memory only and was overwritten on the next reload, so the rename detection ran every request.
- Repository.anonymize: stop materialising a dummy {path:"",name:""} FileModel for empty repos. That row collided with the special case in AnonymizedFile.getFileInfo and surfaced in unfiltered listings.
- streamer/route POST /: reject filePath segments containing ".." or empty parts. Defence in depth: the parent server validates against FileModel before calling, but the streamer joins filePath straight into the storage path, so any future caller forwarding an unvalidated path could traverse out of the repo root.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
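Two of the guards described in this commit message can be sketched in TypeScript as follows. This is a minimal illustration, not the project's code: the helper names isSafeFilePath and extractToken are hypothetical, the `token` field name in the response body is an assumption, and ".." is treated here as a whole segment rather than a substring.

```typescript
// Hypothetical helper: accept a relative file path only if no segment
// is empty or equal to "..", so that joining it onto a storage root
// cannot escape that root.
function isSafeFilePath(filePath: string): boolean {
  if (filePath.length === 0) return false;
  return filePath.split("/").every((part) => part !== "" && part !== "..");
}

// Hypothetical helper: return the token from a refresh response only when
// the status is 2xx and the body carries a non-empty string in `token`
// (assumed field name). Returning undefined lets the caller keep the
// stored token or fall back to a configured one instead of persisting
// the value `undefined`.
function extractToken(status: number, body: unknown): string | undefined {
  if (status < 200 || status >= 300) return undefined;
  if (typeof body !== "object" || body === null) return undefined;
  const token = (body as { token?: unknown }).token;
  return typeof token === "string" && token.length > 0 ? token : undefined;
}
```

A caller would check isSafeFilePath before building the storage path, and would overwrite the stored token only when extractToken returns a string.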
Anonymous Github
Anonymous Github is a system that helps anonymize GitHub repositories for double-anonymous paper submissions. A public instance of Anonymous Github is hosted at https://anonymous.4open.science/.
Anonymous Github anonymizes the following:
- GitHub repository owner, organization, and name
- File and directory names
- File contents of all extensions, including markdown, text, Java, etc.
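As a rough illustration of what anonymizing these items can look like, the sketch below replaces a list of identifying terms in a string with numbered placeholders. It is an assumption-laden example, not the tool's actual implementation: the function names, the "XXXX" mask, and the case-insensitive matching are all choices made here for illustration.

```typescript
// Escape regex metacharacters so a term is matched literally.
function escapeRegExp(term: string): string {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace every case-insensitive occurrence of each term with "<mask>-<n>",
// where n is the 1-based position of the term in the list.
// Both the mask format and the matching rules are assumptions.
function anonymizeText(content: string, terms: string[], mask = "XXXX"): string {
  let result = content;
  terms.forEach((term, index) => {
    if (term.trim() === "") return; // an empty term would match everywhere
    const pattern = new RegExp(escapeRegExp(term), "gi");
    result = result.replace(pattern, () => `${mask}-${index + 1}`);
  });
  return result;
}

const sample = anonymizeText(
  "Maintained by Alice at github.com/alice/tool",
  ["alice", "tool"],
);
console.log(sample); // Maintained by XXXX-1 at github.com/XXXX-1/XXXX-2
```

The same function could be applied to file and directory names as well as to file contents, since all three are plain strings.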
Usage
Public instance
https://anonymous.4open.science/
CLI
This CLI tool allows you to anonymize your GitHub repositories locally, generating an anonymized zip file based on your configuration settings.
# Install the Anonymous GitHub CLI tool
npm install -g @tdurieux/anonymous_github
# Run the Anonymous GitHub CLI tool
anonymous_github
Own instance
1. Clone the repository
git clone https://github.com/tdurieux/anonymous_github/
cd anonymous_github
npm i
2. Configure the GitHub token
Create a .env file with the following contents:
GITHUB_TOKEN=<GITHUB_TOKEN>
CLIENT_ID=<CLIENT_ID>
CLIENT_SECRET=<CLIENT_SECRET>
PORT=5000
DB_USERNAME=
DB_PASSWORD=
AUTH_CALLBACK=http://localhost:5000/github/auth
- GITHUB_TOKEN can be generated at https://github.com/settings/tokens/new with the repo scope.
- CLIENT_ID and CLIENT_SECRET are generated when you create a new GitHub app at https://github.com/settings/applications/new.
- The callback of the GitHub app needs to be defined as https://<host>/github/auth (the same as defined in AUTH_CALLBACK).
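A quick way to catch a misconfigured .env before starting the server is to validate the values up front. The sketch below is hypothetical and not part of Anonymous Github: readConfig and ServerConfig are illustrative names, and the rules (required keys, a numeric port, a callback ending in /github/auth) are assumptions derived from the description above.

```typescript
type Env = Record<string, string | undefined>;

interface ServerConfig {
  githubToken: string;
  clientId: string;
  clientSecret: string;
  port: number;
  authCallback: string;
}

// Hypothetical validator: throws on the first missing or malformed value.
function readConfig(env: Env): ServerConfig {
  const required = (key: string): string => {
    const value = env[key];
    if (value === undefined || value.trim() === "") {
      throw new Error(`Missing required variable ${key}`);
    }
    return value.trim();
  };

  // Default to 5000, the port used in the example above.
  const port = Number(env["PORT"] ?? "5000");
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }

  // A stray character after the path (for example a trailing comma)
  // would make the callback differ from the one registered on GitHub.
  const authCallback = required("AUTH_CALLBACK");
  if (!authCallback.endsWith("/github/auth")) {
    throw new Error("AUTH_CALLBACK must end with /github/auth");
  }

  return {
    githubToken: required("GITHUB_TOKEN"),
    clientId: required("CLIENT_ID"),
    clientSecret: required("CLIENT_SECRET"),
    port,
    authCallback,
  };
}
```

In a Node.js process this could be called as readConfig(process.env) once the .env file has been loaded.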
3. Start Anonymous Github server
docker-compose up -d
4. Go to Anonymous Github
Go to http://localhost:5000. By default, Anonymous Github uses port 5000. It can be changed in docker-compose.yml. I recommend putting Anonymous GitHub behind nginx to handle the HTTPS certificates.
What is the scope of anonymization?
In double-anonymous peer review, the boundary of anonymization is the paper plus its online appendix, and only this; it is not the whole world. Googling any part of the paper or the online appendix can be considered a deliberate attempt to break anonymity (explanation).
How does it work?
Anonymous Github either downloads the complete repository and anonymizes the content of its files, or proxies requests to GitHub. In both cases, the original and anonymized versions of each file are cached on the server.
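The caching behaviour described here can be pictured with a small in-memory sketch. Everything in it is an assumption made for illustration: the FileCache class, the synchronous fetchOriginal callback (standing in for either a repository download or a proxied GitHub request), and the use of a Map rather than the server's real storage.

```typescript
interface CachedFile {
  original: string;
  anonymized: string;
}

// Hypothetical cache: stores both versions of a file after the first request.
class FileCache {
  private readonly entries = new Map<string, CachedFile>();
  private readonly fetchOriginal: (path: string) => string;
  private readonly anonymize: (content: string) => string;

  constructor(
    fetchOriginal: (path: string) => string,
    anonymize: (content: string) => string,
  ) {
    this.fetchOriginal = fetchOriginal;
    this.anonymize = anonymize;
  }

  // Return the cached pair, fetching and anonymizing only on a miss.
  get(path: string): CachedFile {
    const hit = this.entries.get(path);
    if (hit !== undefined) return hit;
    const original = this.fetchOriginal(path);
    const entry: CachedFile = { original, anonymized: this.anonymize(original) };
    this.entries.set(path, entry);
    return entry;
  }
}
```

With this shape, a second request for the same path is served from the cache without contacting the upstream source again.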
Related tools
gitmask is a tool to anonymously contribute to a GitHub repository.
blind-reviews is a browser add-on that enables a person reviewing a GitHub pull request to hide identifying information about the person submitting it.
