* fix: anonymize Windows batch scripts (#735)
mime-types maps .bat to application/x-msdownload, the same MIME type as
.exe/.dll, so batch scripts were classified as binary and streamed
through without any anonymization. Special-case .bat/.cmd as text before
the MIME lookup, keeping .exe/.dll binary.
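
A minimal sketch of the ordering described above, assuming a hypothetical `isTextFile` helper and a small hard-coded table standing in for the real mime-types lookup (neither name is taken from the codebase):

```typescript
// Hypothetical stand-in for the MIME lookup; only the entries needed here.
const MIME_BY_EXT: Record<string, string> = {
  ".bat": "application/x-msdownload",
  ".exe": "application/x-msdownload",
  ".dll": "application/x-msdownload",
  ".txt": "text/plain",
  ".md": "text/markdown",
};

// Extensions treated as text before the MIME table is consulted.
const FORCE_TEXT_EXTENSIONS = new Set([".bat", ".cmd"]);

// Returns the lower-cased extension of the last path segment, or "".
function extensionOf(path: string): string {
  const name = path.slice(path.lastIndexOf("/") + 1);
  const dot = name.lastIndexOf(".");
  return dot <= 0 ? "" : name.slice(dot).toLowerCase();
}

function isTextFile(path: string): boolean {
  const ext = extensionOf(path);
  // The special case runs first, so .bat never reaches the binary MIME type.
  if (FORCE_TEXT_EXTENSIONS.has(ext)) return true;
  const mime = MIME_BY_EXT[ext];
  return mime !== undefined && mime.startsWith("text/");
}
```

Checking the extension first keeps the override local: .exe and .dll still resolve through the table and stay binary.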
* fix: recover files missing from truncated tree listings (#738)
GitHub truncates tree listings of very large repositories. Folders whose
listing was truncated are recorded in truncatedFolders, but files that
fell outside the listing never reached the database, so requesting them
returned 404 file_not_found even though they exist on GitHub, and a
force refresh could not help.
When a file lookup misses and its directory is under a truncated folder,
fetch the file metadata directly from GitHub's contents API (object
media type, so it works past the 1MB inline limit), cache it in the
database, and serve it normally.
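
The fallback above could be sketched roughly as follows. This assumes truncatedFolders holds repository-relative folder paths (with "" meaning the root); `FileMeta`, `isUnderTruncatedFolder`, `lookupFile`, and the injected `fetchMeta` callback are hypothetical names, and the real contents API call is deliberately left out:

```typescript
// Hypothetical shape of a cached file record.
type FileMeta = { path: string; sha: string; size: number };

// True when the file's directory is a truncated folder or sits below one.
function isUnderTruncatedFolder(filePath: string, truncatedFolders: string[]): boolean {
  const slash = filePath.lastIndexOf("/");
  const dir = slash === -1 ? "" : filePath.slice(0, slash);
  return truncatedFolders.some((folder) => {
    const f = folder.replace(/\/+$/, ""); // ignore trailing slashes
    if (f === "") return true; // assumption: "" marks a truncated root
    return dir === f || dir.startsWith(f + "/");
  });
}

// On a cache miss, fetch directly only when the listing was truncated.
async function lookupFile(
  path: string,
  cache: Map<string, FileMeta>,
  truncatedFolders: string[],
  fetchMeta: (path: string) => Promise<FileMeta | null>,
): Promise<FileMeta | null> {
  const cached = cache.get(path);
  if (cached) return cached;
  if (!isUnderTruncatedFolder(path, truncatedFolders)) return null; // genuine 404
  const meta = await fetchMeta(path);
  if (meta) cache.set(path, meta); // cache so later lookups hit normally
  return meta;
}
```

Gating the direct fetch on the truncated-folder check means ordinary misses do not trigger an extra GitHub request.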
* feat: warn when a repository uses git submodules (#737)
GitHub archives and tree listings never include submodule contents, so
submodules end up as empty folders in the anonymized repository, which
surprises users. Detect a root .gitmodules file and show a warning
banner in the explorer explaining that submodule contents are not
included.
* feat: allow users to delete their account (#741)
Add DELETE /api/user: removes all anonymized repositories, gists, and
pull requests owned by the user, best-effort revokes the GitHub OAuth
grant, and scrubs personal data (username, emails, tokens, GitHub id,
photo) from the user record. The record itself is kept with a
placeholder username so removed repoIds stay reserved and owner
references remain resolvable.
The settings page gains an Account section with a confirmed delete
button.
* fix: add missing error translations for token_expired and job_is_active
The error-code coverage test failed because both backend codes had no
frontend translation.
* security: harden against XSS, ReDoS, path traversal, and injection
Defensive fixes across the server, storage, and viewer:
- XSS (CWE-79): sanitise rendered notebooks with DOMPurify, escape file
names interpolated into AngularJS expressions (escapeNgString), set
Mermaid securityLevel to 'strict', and stop urlRel2abs from returning
javascript:/vbscript:/data:text/html URLs.
- Path traversal / zip-slip (CWE-22/23/24): validate URL-derived path
components before they reach the storage layer (file/webview routes +
StorageBase.assertSafePath) and sanitise zip entry names on extract for
both the filesystem and S3 backends.
- ReDoS (CWE-1333): escape anonymization terms with catastrophic
backtracking shapes to literals instead of compiling them as regexes.
- Secret hardening (CWE-798): require SESSION_SECRET, OAuth credentials,
and the DB password in production; fall back to a random SESSION_SECRET
in development only.
- Rate-limit spoofing (CWE-290): derive request.ip via trust-proxy hop
count instead of the client-settable cf-connecting-ip header.
- NoSQL injection (CWE-943): allow only plain field paths as admin sort keys.
- Reject malformed streamer requests missing required string fields.
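
As an illustration of the ReDoS item above, here is a minimal sketch of escaping risky terms to literals. The names `looksCatastrophic`, `escapeRegExp`, and `compileTerm` are hypothetical, and the detection pattern is a deliberately coarse heuristic, not the project's actual check:

```typescript
// Heuristic: a parenthesised group that contains a quantifier and is itself
// quantified, e.g. (a+)+ or (a{2,})*. It can over-match; a false positive
// only means the term is treated as a literal.
const NESTED_QUANTIFIER = /\([\s\S]*?[+*}][\s\S]*?\)[+*{]/;

function looksCatastrophic(source: string): boolean {
  return NESTED_QUANTIFIER.test(source);
}

// Escapes every regex metacharacter so the text matches itself literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Compiles a user-supplied term, downgrading risky or invalid ones to literals.
function compileTerm(term: string): RegExp {
  if (looksCatastrophic(term)) return new RegExp(escapeRegExp(term), "gi");
  try {
    return new RegExp(term, "gi");
  } catch {
    return new RegExp(escapeRegExp(term), "gi");
  }
}
```

Falling back to a literal rather than rejecting the term keeps anonymization working for users who typed the text verbatim.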
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(ui): make gists reachable/visible and clarify the ZIP button
- Gist & PR routes now accept a trailing slash (/gist/:id/:path*?), so the
dashboard links (which end in "/") resolve to the gist/PR page instead of
falling through to the 404 route (#725).
- Gist viewer picks the default tab after content loads, defaulting to
"files" when files exist; previously the ng-init ran before the async
load and a files-only gist rendered blank under the hidden comments tab.
- Explorer toolbar: relabel ZIP to "Full repo ZIP" with a tooltip, and add
tooltips to Raw/Download clarifying they apply to the current file (#721).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix: report SAML-enforced orgs clearly instead of "token expired"
When a repo's organization enforces SAML SSO, GitHub returns a 403 whose
message differs from the OAuth-App-restriction case. That 403 fell through
to the generic handler and surfaced as "token_expired", pushing users to
re-login when the real fix is authorizing their token for the org. Detect
the "SAML enforcement" message and raise a dedicated, actionable error
instead (#379, #550).
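
A rough sketch of the classification, assuming the 403 message text is available as a string. The `saml_sso_required` code and the `classifyForbidden` name are hypothetical; only the "SAML enforcement" marker and the "token_expired" fallback come from the description above:

```typescript
// Hypothetical error codes; "token_expired" is the pre-existing generic code.
type ForbiddenErrorCode = "saml_sso_required" | "token_expired";

// Maps a GitHub 403 message to an error code the UI can act on.
function classifyForbidden(message: string): ForbiddenErrorCode {
  // The SAML case is checked before falling through to the generic handler.
  if (/SAML enforcement/i.test(message)) return "saml_sso_required";
  return "token_expired";
}
```

Checking the SAML marker first gives that case a message that points users at authorizing their token for the organization rather than logging in again.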
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* security: catch nested quantified groups in ReDoS guard and backslash path traversal
- hasCatastrophicBacktracking now scans across nested parens ([\s\S]*?)
so shapes like ((a+))+ are detected; comment reframed as a heuristic
backstop rather than a proof.
- file route path-traversal check now rejects backslash separators and a
leading backslash, covering Windows-style "..\" payloads (CWE-22/25).
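
The path check in the second item might look roughly like this. It is a sketch under stated assumptions: `isSafeRoutePath` and `assertSafeRoutePath` are hypothetical names, the input is assumed to be an already URL-decoded repository-relative path, and the "invalid_path" error code is made up for the example:

```typescript
// Returns true only for relative, forward-slash paths with no ".." segment.
function isSafeRoutePath(input: string): boolean {
  if (input.length === 0) return false;
  if (input.includes("\0")) return false; // NUL bytes are never valid
  if (input.includes("\\")) return false; // rejects "..\" and leading "\"
  if (input.startsWith("/")) return false; // absolute paths are not allowed
  return input.split("/").every((segment) => segment !== "..");
}

// Throwing variant for use at the top of a route handler.
function assertSafeRoutePath(input: string): void {
  if (!isSafeRoutePath(input)) throw new Error("invalid_path");
}
```

Rejecting backslashes outright, instead of normalising them, avoids depending on how the host platform interprets the separator.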
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* chore(dev): track dev-proxy script, ignore .DS_Store and .claude/
scripts/dev-proxy.js is referenced by the "dev:ui" npm script but was
never committed, breaking the command on a fresh clone. Add it and
ignore local-only macOS/Claude Code files.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>