Anonymous Github
Anonymous Github is a system that helps anonymize GitHub repositories for double-anonymous paper submissions. A public instance of Anonymous Github is hosted at https://anonymous.4open.science/.
Anonymous Github anonymizes the following:
- GitHub repository owner, organization, and name
- File and directory names
- File contents of all extensions, including markdown, text, Java, etc.
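To make the list above concrete, here is a minimal sketch of term-based anonymization applied to both file contents and paths. It is not the project's actual implementation: the function names `escapeRegExp` and `anonymizeText`, the case-insensitive matching, and the `XXXX-n` placeholder format are assumptions chosen for illustration.

```typescript
// Hypothetical sketch: replace each sensitive term with a numbered placeholder.
// The names and the placeholder format are assumptions, not the project's API.

// Escape regex metacharacters so a term such as "secret.lab" is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace every occurrence of every term (case-insensitive) with XXXX-1, XXXX-2, ...
function anonymizeText(content: string, terms: string[]): string {
  let result = content;
  terms.forEach((term, index) => {
    if (term.trim() === "") return; // skip empty terms, they would match everywhere
    const pattern = new RegExp(escapeRegExp(term), "gi");
    result = result.replace(pattern, `XXXX-${index + 1}`);
  });
  return result;
}

const terms = ["alice", "secret-lab"];

// File contents
const readme = "Clone https://github.com/alice/secret-lab and mail Alice.";
console.log(anonymizeText(readme, terms));
// prints "Clone https://github.com/XXXX-1/XXXX-2 and mail XXXX-1."

// File and directory names go through the same function
console.log(anonymizeText("src/alice/Main.java", terms));
// prints "src/XXXX-1/Main.java"
```

Using the same placeholder for every occurrence of a term keeps the anonymized text readable: a reviewer can still tell that two mentions refer to the same hidden name.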
Usage
Public instance
https://anonymous.4open.science/
CLI
This CLI tool allows you to anonymize your GitHub repositories locally, generating an anonymized zip file based on your configuration settings.
# Install the Anonymous GitHub CLI tool
npm install -g @tdurieux/anonymous_github
# Run the Anonymous GitHub CLI tool
anonymous_github
Own instance
1. Clone the repository
git clone https://github.com/tdurieux/anonymous_github/
cd anonymous_github
npm i
2. Configure the GitHub token
Create a .env file with the following contents:
GITHUB_TOKEN=<GITHUB_TOKEN>
CLIENT_ID=<CLIENT_ID>
CLIENT_SECRET=<CLIENT_SECRET>
PORT=5000
DB_USERNAME=
DB_PASSWORD=
AUTH_CALLBACK=http://localhost:5000/github/auth
- GITHUB_TOKEN can be generated here: https://github.com/settings/tokens/new with the repo scope.
- CLIENT_ID and CLIENT_SECRET are the tokens generated when you create a new GitHub app: https://github.com/settings/applications/new.
- The callback of the GitHub app needs to be defined as https://<host>/github/auth (the same as defined in AUTH_CALLBACK).
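A small check like the following can catch configuration mistakes before the server starts. It is only a sketch: `findConfigProblems` is a hypothetical helper that does not exist in Anonymous Github, and treating these four keys as required (with DB_USERNAME and DB_PASSWORD optional) is an assumption based on the example .env file above.

```typescript
// Hypothetical helper, not part of Anonymous Github: list problems in the settings.
// Assumption: these four keys are required; DB_USERNAME and DB_PASSWORD may stay empty.
const REQUIRED_KEYS = ["GITHUB_TOKEN", "CLIENT_ID", "CLIENT_SECRET", "AUTH_CALLBACK"] as const;

function findConfigProblems(env: Record<string, string | undefined>): string[] {
  const problems: string[] = [];

  // Every required key must be present and non-empty.
  for (const key of REQUIRED_KEYS) {
    const value = env[key];
    if (value === undefined || value.trim() === "") {
      problems.push(`${key} is missing or empty`);
    }
  }

  // The callback must be an http(s) URL ending in /github/auth,
  // which also catches stray trailing characters.
  const callback = env["AUTH_CALLBACK"];
  if (callback !== undefined && callback.trim() !== "") {
    if (!/^https?:\/\//.test(callback)) {
      problems.push("AUTH_CALLBACK must be an http(s) URL");
    }
    if (!/\/github\/auth$/.test(callback)) {
      problems.push("AUTH_CALLBACK should end with /github/auth");
    }
  }
  return problems;
}

const example: Record<string, string | undefined> = {
  GITHUB_TOKEN: "ghp_example",
  CLIENT_ID: "abc123",
  CLIENT_SECRET: "",
  AUTH_CALLBACK: "http://localhost:5000/github/auth",
};
console.log(findConfigProblems(example)); // reports that CLIENT_SECRET is missing or empty
```

In a Node.js process, the same function could be called with `process.env` after the .env file has been loaded.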
3. Start Anonymous Github server
docker-compose up -d
4. Go to Anonymous Github
Go to http://localhost:5000. By default, Anonymous Github uses port 5000. It can be changed in docker-compose.yml. I would recommend putting Anonymous Github behind nginx to handle the HTTPS certificates.
What is the scope of anonymization?
In double-anonymous peer review, the boundary of anonymization is the paper plus its online appendix, and only this; it is not the whole world. Googling any part of the paper or the online appendix can be considered a deliberate attempt to break anonymity (explanation).
How does it work?
Anonymous Github either downloads the complete repository and anonymizes the content of the file or proxies the request to GitHub. In both cases, the original and anonymized versions of the file are cached on the server.
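The flow described above can be sketched as a cache that sits in front of a fetch step and an anonymization step. This is an illustration under stated assumptions, not the server's actual code: `createFileCache`, `Fetcher`, and `Anonymizer` are hypothetical names, the fetch is synchronous for brevity, and the cache is an in-memory object rather than whatever storage the server uses.

```typescript
// Hypothetical sketch of the cache-then-fetch flow; all names are illustrative.
type Fetcher = (path: string) => string;       // stands in for a download or a proxied GitHub request
type Anonymizer = (content: string) => string; // stands in for the anonymization step

interface CachedFile {
  original: string;   // the file as retrieved from GitHub
  anonymized: string; // the file after anonymization
}

function createFileCache(fetchOriginal: Fetcher, anonymize: Anonymizer) {
  // A prototype-less object, so file names like "constructor" cannot collide with built-ins.
  const entries: { [path: string]: CachedFile } = Object.create(null);
  let fetchCount = 0;

  return {
    // Return the anonymized file, fetching and anonymizing it only on the first request.
    get(path: string): string {
      const hit = entries[path];
      if (hit !== undefined) return hit.anonymized;

      fetchCount += 1;
      const original = fetchOriginal(path);
      const cached: CachedFile = { original, anonymized: anonymize(original) };
      entries[path] = cached; // keep both versions, as described above
      return cached.anonymized;
    },
    // Number of upstream fetches performed so far.
    fetches(): number {
      return fetchCount;
    },
  };
}

const cache = createFileCache(
  (path) => `// ${path} written by alice`,
  (content) => content.replace(/alice/g, "XXXX-1"),
);
console.log(cache.get("src/main.ts")); // fetched, anonymized, then cached
console.log(cache.get("src/main.ts")); // served from the cache
console.log(cache.fetches());          // prints 1
```

Keeping the original next to the anonymized version means the anonymization step could be rerun without contacting GitHub again, for example if the list of terms changes.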
Related tools
gitmask is a tool to anonymously contribute to a GitHub repository.
blind-reviews is a browser add-on that enables a person reviewing a GitHub pull request to hide identifying information about the person submitting it.
