Security hardening + gist UI fixes (#731)

* security: harden against XSS, ReDoS, path traversal, and injection

Defensive fixes across the server, storage, and viewer:

- XSS (CWE-79): sanitise rendered notebooks with DOMPurify, escape file
  names interpolated into AngularJS expressions (escapeNgString), set
  Mermaid securityLevel to 'strict', and stop urlRel2abs from returning
  javascript:/vbscript:/data:text/html URLs.
- Path traversal / zip-slip (CWE-22/23/24): validate URL-derived path
  components before they reach the storage layer (file/webview routes +
  StorageBase.assertSafePath) and sanitise zip entry names on extract for
  both the filesystem and S3 backends.
- ReDoS (CWE-1333): escape anonymization terms with catastrophic
  backtracking shapes to literals instead of compiling them as regexes.
- Secret hardening (CWE-798): require SESSION_SECRET / OAuth creds / DB
  password in production; fall back to a random SESSION_SECRET in dev.
- Rate-limit spoofing (CWE-290): derive request.ip via trust-proxy hop
  count instead of the client-settable cf-connecting-ip header.
- NoSQL injection (CWE-943): allow only plain field paths as admin sort keys.
- Reject malformed streamer requests missing required string fields.
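
To illustrate the admin sort-key item above, here is a minimal sketch of an allow-list check. The names `isPlainFieldPath` and `buildSort`, the 64-character cap, and the `_id` fallback are assumptions for illustration; the commit's actual check is not part of the hunks shown here.

```javascript
// Hypothetical helper: accept only dotted identifiers such as "createdAt"
// or "owner.username". Keys containing "$", brackets or whitespace, and
// any non-string input, are rejected.
const PLAIN_FIELD_PATH = /^[A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*$/;

function isPlainFieldPath(key) {
  return typeof key === "string" && key.length <= 64 && PLAIN_FIELD_PATH.test(key);
}

// Build a sort spec from untrusted query input, falling back to a default
// key when the requested one is not a plain field path.
function buildSort(rawKey, rawOrder, fallbackKey = "_id") {
  const key = isPlainFieldPath(rawKey) ? rawKey : fallbackKey;
  const order = rawOrder === "asc" ? 1 : -1;
  return { [key]: order };
}
```

With this shape, an operator-style key such as `$where` never reaches the query; the request is simply sorted by the fallback key.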

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(ui): make gists reachable/visible and clarify the ZIP button

- Gist & PR routes now accept a trailing slash (/gist/:id/:path*?), so the
  dashboard links (which end in "/") resolve to the gist/PR page instead of
  falling through to the 404 route (#725).
- Gist viewer picks the default tab after content loads, defaulting to
  "files" when files exist; previously the ng-init ran before the async
  load and a files-only gist rendered blank under the hidden comments tab.
- Explorer toolbar: relabel ZIP to "Full repo ZIP" with a tooltip, and add
  tooltips to Raw/Download clarifying they apply to the current file (#721).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix: report SAML-enforced orgs clearly instead of "token expired"

When a repo's organization enforces SAML SSO, GitHub returns a 403 whose
message differs from the OAuth-App-restriction case. That 403 fell through
to the generic handler and surfaced as "token_expired", pushing users to
re-login when the real fix is authorizing their token for the org. Detect
the "SAML enforcement" message and raise a dedicated, actionable error
instead (#379, #550).
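
The ordering described above can be sketched as follows. Only the "SAML enforcement" substring and the two error codes come from this commit; the `{ status, message }` error shape, the function name `classifyGitHubError`, and the final fallback code are assumptions.

```javascript
// Hypothetical classifier for a failed GitHub API call. The SAML check
// must run before the generic 401/403 branch, otherwise the SAML case is
// reported as "token_expired".
function classifyGitHubError(error) {
  const status = error && error.status;
  const message = (error && error.message) || "";
  if (message.includes("SAML enforcement")) {
    return "repo_saml_enforcement";
  }
  if (status === 401 || status === 403) {
    return "token_expired";
  }
  return "repo_not_found"; // assumed fallback, for illustration only
}
```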

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* security: catch nested quantified groups in ReDoS guard and backslash path traversal

- hasCatastrophicBacktracking now scans across nested parens ([\s\S]*?)
  so shapes like ((a+))+ are detected; comment reframed as a heuristic
  backstop rather than a proof.
- file route path-traversal check now rejects backslash separators and a
  leading backslash, covering Windows-style "..\" payloads (CWE-22/25).
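
A minimal sketch of a check in the spirit of the second item, assuming a hypothetical `isSafeRelativePath` helper; the real route code is not shown in this diff and may differ.

```javascript
// Hypothetical validator for a URL-derived relative file path.
function isSafeRelativePath(p) {
  if (typeof p !== "string") return false;
  if (p.includes("\0")) return false;   // NUL byte
  if (p.includes("\\")) return false;   // any backslash, including a leading one
  if (p.startsWith("/")) return false;  // absolute path
  return !p.split("/").includes("..");  // parent-directory segment
}
```

Rejecting every backslash is stricter than strictly necessary on POSIX hosts, but it keeps the rule independent of the platform the server runs on.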

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* chore(dev): track dev-proxy script, ignore .DS_Store and .claude/

scripts/dev-proxy.js is referenced by the "dev:ui" npm script but was
never committed, breaking the command on a fresh clone. Add it and
ignore local-only macOS/Claude Code files.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Thomas Durieux
2026-06-18 04:50:55 -07:00
committed by GitHub
parent bdfcc56d81
commit e4ffd74068
21 changed files with 484 additions and 23 deletions
+6
@@ -4,6 +4,12 @@ build
 repo/
 db_backups
 message.txt
+# macOS
+.DS_Store
+# Local Claude Code settings
+.claude/
 # Created by https://www.gitignore.io/api/node
 # Edit at https://www.gitignore.io/?templates=node
+2 -2
@@ -1,6 +1,6 @@
 {
-"core.min.js": "core.3db744fc07.min.js",
-"vendor.min.js": "vendor.09f02f70c0.min.js",
+"core.min.js": "core.6332b3c288.min.js",
+"vendor.min.js": "vendor.d7d972f465.min.js",
 "mermaid.min.js": "mermaid.f848a72d16.min.js",
 "all.min.css": "all.1a9babcb45.min.css"
 }
+2
@@ -10,6 +10,7 @@
 "user_not_found": "The requested user could not be found.",
 "user_banned": "Your account has been banned. Contact the admin for more information.",
 "repo_access_limited": "GitHub blocked access because the repository's organization restricts third-party OAuth apps. Ask an org owner to approve Anonymous GitHub under Settings → Third-party Access → OAuth app policy, or anonymize a personal fork instead.",
+"repo_saml_enforcement": "The repository's organization enforces SAML single sign-on. Authorize your token for that organization (GitHub → Settings → Applications, or re-run the org's SSO sign-in), then retry. Alternatively, anonymize a personal fork.",
 "repo_not_found": "The repository was not found on GitHub. Check the URL and spelling, make sure you are signed in to the account that can see it, and confirm the repo isn't hidden under an org that restricts third-party app access.",
 "repo_empty": "The selected branch has no commits on GitHub. Push at least one commit, or pick a different branch, then retry.",
 "repo_not_accessible": "Anonymous GitHub cannot access this repository. Verify the repository exists and that Anonymous GitHub has been authorized for the owning organization.",
@@ -52,6 +53,7 @@
 "path_not_specified": "A file path must be specified.",
 "path_not_defined": "The file path has not been resolved yet.",
 "invalid_file_path": "The requested file path is not valid.",
+"invalid_request": "The request is missing required fields or is malformed.",
 "no_file_selected": "Please select a file.",
 "file_not_found": "The requested file is not found.",
 "file_not_accessible": "The requested file is not accessible.",
+7 -4
@@ -88,7 +88,8 @@
 ng-href="{{url}}"
 target="__self"
 class="btn btn-sm"
-aria-label="Raw"
+aria-label="View raw current file"
+title="View the raw content of the current file"
 ><i class="fas fa-file-alt"></i><span class="d-none d-md-inline"> Raw</span></a
 >
 <a
@@ -96,7 +97,8 @@
 ng-href="{{url}}&download=true"
 target="__self"
 class="btn btn-sm"
-aria-label="Download"
+aria-label="Download current file"
+title="Download the current file"
 ><i class="fas fa-download"></i><span class="d-none d-md-inline"> Download</span></a
 >
 <a
@@ -104,8 +106,9 @@
 ng-href="/api/repo/{{repoId}}/zip"
 target="__self"
 class="btn btn-sm"
-aria-label="Download ZIP"
-><i class="fas fa-file-archive"></i><span class="d-none d-md-inline"> ZIP</span></a
+aria-label="Download full repository as ZIP"
+title="Download the full repository as a ZIP archive"
+><i class="fas fa-file-archive"></i><span class="d-none d-md-inline"> Full repo ZIP</span></a
 >
 <a
 ng-if="options.hasWebsite"
+35 -5
@@ -88,13 +88,13 @@ angular
 controller: "claimController",
 title: "Claim an anonymization Anonymous GitHub",
 })
-.when("/pr/:pullRequestId", {
+.when("/pr/:pullRequestId/:path*?", {
 templateUrl: "/partials/pullRequest.htm",
 controller: "pullRequestController",
 title: "Anonymous pull request Anonymous GitHub",
 reloadOnUrl: false,
 })
-.when("/gist/:gistId", {
+.when("/gist/:gistId/:path*?", {
 templateUrl: "/partials/gist.htm",
 controller: "gistController",
 title: "Anonymous gist Anonymous GitHub",
@@ -593,6 +593,23 @@ angular
 return str.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;");
 }
+// Escape a value for safe interpolation into a single-quoted
+// AngularJS expression string (e.g. ng-click="openFolder('...')")
+// that itself sits inside a double-quoted HTML attribute which is
+// later $compile()d. Backslash/quote are escaped at the Angular
+// string level; &<>" are HTML-encoded for the attribute. Without
+// this a file name like `');$emit(...)//` would break out of the
+// expression string and execute (DOM XSS, CWE-79).
+function escapeNgString(str) {
+return String(str)
+.replace(/\\/g, "\\\\")
+.replace(/'/g, "\\'")
+.replace(/&/g, "&amp;")
+.replace(/</g, "&lt;")
+.replace(/>/g, "&gt;")
+.replace(/"/g, "&quot;");
+}
 function buildSearchFilter() {
 const results = $scope.searchResults;
 if (!results || !results.length) return null;
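
To make the effect of `escapeNgString` concrete, the sketch below transcribes the function from this hunk and applies it to a break-out style file name. The sample inputs are made up for illustration.

```javascript
// Transcribed from the hunk above; order matters: backslashes first, then
// single quotes, then the HTML-level characters.
function escapeNgString(str) {
  return String(str)
    .replace(/\\/g, "\\\\")
    .replace(/'/g, "\\'")
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Illustrative inputs (not taken from the repository).
const breakout = escapeNgString("');alert(1)//");
const htmlish = escapeNgString('a"b<c&d');
```

Here `breakout` keeps the quote inside the Angular string literal because it is now preceded by a backslash, and `htmlish` can no longer close the surrounding double-quoted attribute.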
@@ -675,11 +692,12 @@ angular
 cssClasses.push("truncated");
 }
+const ngPath = escapeNgString(path);
 output += `<li class="${cssClasses.join(
 " "
-)}" ng-class="{active: isActive('${path}'), open: ${filterSet ? "opens['" + path + "'] !== false" : "opens['" + path + "']"}}" title="${escapeHtml(sizeTitle)}">`;
+)}" ng-class="{active: isActive('${ngPath}'), open: ${filterSet ? "opens['" + ngPath + "'] !== false" : "opens['" + ngPath + "']"}}" title="${escapeHtml(sizeTitle)}">`;
 if (dir) {
-output += `<a ng-click="openFolder('${path}', $event)"><span class="tree-toggle"></span><span class="tree-icon-folder"></span><span class="tree-name">${escapeHtml(name)}</span>`;
+output += `<a ng-click="openFolder('${ngPath}', $event)"><span class="tree-toggle"></span><span class="tree-icon-folder"></span><span class="tree-name">${escapeHtml(name)}</span>`;
 if (truncated) {
 output += `<span class="truncated-warning" title="{{ 'WARNINGS.folder_truncated' | translate }}"><i class="fas fa-exclamation-triangle"></i></span>`;
 }
@@ -911,7 +929,13 @@ angular
 const notebook = nb.parse(json);
 try {
 $element.html("");
-$element.append(notebook.render());
+// notebook.render() turns notebook JSON (markdown cells, cell
+// outputs) into HTML without sanitising it — a malicious
+// notebook could embed <script>/onerror handlers that execute
+// in the viewer's browser (XSS, CWE-79). Run the rendered
+// output through DOMPurify before inserting it.
+const rendered = notebook.render();
+$element.html(DOMPurify.sanitize(rendered));
 Prism.highlightAll();
 } catch (error) {
 $element.html("Unable to render the notebook.");
@@ -3118,6 +3142,12 @@ angular
 $http.get(`/api/gist/${$scope.gistId}/content`).then(
 (res) => {
 $scope.details = res.data;
+// Pick the default tab once the content is loaded. The ng-init in
+// the template runs before this async response arrives (details is
+// still null then), so without this a files-only gist would default
+// to the hidden "comments" tab and render blank.
+const hasFiles = res.data && res.data.files && res.data.files.length;
+$scope.tabState = { active: hasFiles ? "files" : "comments" };
 if (callback) callback(res.data);
 },
 (err) => {
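
The tab choice in this hunk can be isolated as a small pure function, which makes the "decide after load" behaviour easy to check. `pickDefaultTab` is a hypothetical name, and treating `files` as an array is an assumption about the response shape.

```javascript
// Hypothetical helper mirroring the choice made once the gist content has
// loaded: "files" when at least one file exists, otherwise "comments".
function pickDefaultTab(details) {
  const hasFiles = Boolean(
    details && Array.isArray(details.files) && details.files.length > 0
  );
  return hasFiles ? "files" : "comments";
}
```

Calling it with `null` reproduces what the template's ng-init saw before the response arrived, which is why the decision has to be made inside the success callback.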
+1 -1
File diff suppressed because one or more lines are too long
+6 -1
@@ -39,7 +39,12 @@ function markedMermaid(options) {
 mermaid.initialize({
 startOnLoad: false,
 theme: 'default',
-securityLevel: 'loose'
+// 'strict' keeps Mermaid's own HTML/script sanitisation and
+// disables click-binding callbacks. 'loose' (the previous
+// value) lets diagram syntax inject clickable elements with
+// JavaScript handlers that run in the viewer's browser
+// (XSS, CWE-79).
+securityLevel: 'strict'
 });
 window.mermaidInitialized = true;
 }
+11 -2
@@ -54,12 +54,21 @@ function urlRel2abs(
 ) {
 /* Only accept commonly trusted protocols:
 * Only data-image URLs are accepted, Exotic flavours (escaped slash,
-* html-entitied characters) are not supported to keep the function fast */
+* html-entitied characters) are not supported to keep the function fast.
+* "javascript:" is intentionally NOT allowed — returning such a URL
+* unchanged would let it reach an href attribute and execute on click
+* (XSS, CWE-79). */
 if (
-/^(https?|file|ftps?|mailto|javascript|data:image\/[^;]{2,9};):/i.test(url)
+/^(https?|file|ftps?|mailto|data:image\/[^;]{2,9};):/i.test(url)
 ) {
 return url; //Url is already absolute
 }
+// Block any other explicit scheme (javascript:, vbscript:, data:text/html,
+// …) so it can't slip through as an "absolute" URL via the relative-path
+// handling below.
+if (/^\s*[a-z][a-z0-9+.-]*:/i.test(url)) {
+return "";
+}
 if (url.substring(0, 2) == "//") return location.protocol + url;
 else if (url.charAt(0) == "/") return baseUrl + url;
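
The two-stage scheme check in this hunk can be exercised in isolation. In the sketch below the two patterns are copied from the diff; `classifyUrl` is a hypothetical wrapper that only labels which branch a URL would take.

```javascript
// Patterns copied from the hunk above.
const ALLOWED = /^(https?|file|ftps?|mailto|data:image\/[^;]{2,9};):/i;
const ANY_SCHEME = /^\s*[a-z][a-z0-9+.-]*:/i;

// Hypothetical wrapper: reports the branch instead of rewriting the URL.
function classifyUrl(url) {
  if (ALLOWED.test(url)) return "absolute";   // urlRel2abs returns it unchanged
  if (ANY_SCHEME.test(url)) return "blocked"; // urlRel2abs returns ""
  return "relative";                          // resolved against the base URL
}
```

One observation from running these patterns as written: a URL such as `data:image/png;base64,AAAA` is classified as "blocked" too, because the allow-list alternative expects a `:` directly after the `;`.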
+1 -1
File diff suppressed because one or more lines are too long
+186
@@ -0,0 +1,186 @@
/**
* Dev proxy for local UI iteration.
*
* Serves the local `public/` folder for HTML/CSS/JS/partials/images so you
* see your design changes instantly, and proxies everything else (API,
* auth, repo content, …) to the live https://anonymous.4open.science site.
*
* npm run dev:ui # default port 4001
* PORT=5000 npm run dev:ui
*
* Notes
* - Cookies from upstream are rewritten so they stick on localhost:
* • `Secure` flag stripped
* • `Domain=anonymous.4open.science` stripped
* - GitHub OAuth callback points at the production host, so live sign-in
* won't complete against localhost. You can still browse as an anonymous
* visitor (landing page, FAQ, anonymous repo mirrors) with full data.
*/
const path = require("path");
const express = require("express");
const {
createProxyMiddleware,
responseInterceptor,
} = require("http-proxy-middleware");
const fs = require("fs");
const UPSTREAM = process.env.UPSTREAM || "https://anonymous.4open.science";
const PORT = parseInt(process.env.PORT || "4001", 10);
const PUBLIC_DIR = path.resolve(__dirname, "..", "public");
// Re-read manifest on each request so gulp rebuilds are picked up instantly.
const manifestPath = path.join(PUBLIC_DIR, "asset-manifest.json");
function asset(name) {
try {
const manifest = JSON.parse(fs.readFileSync(manifestPath, "utf-8"));
return manifest[name] || name;
} catch {
return name;
}
}
// Paths that should always be served from the local `public/` folder.
// Anything else falls through to the proxy.
const LOCAL_PREFIXES = [
"/css/",
"/script/",
"/partials/",
"/fonts/",
"/imgs/",
"/i18n/",
"/favicon/",
"/favicon.ico",
"/robots.txt",
];
function isLocalPath(urlPath) {
if (urlPath === "/" || urlPath === "/index.html") return true;
return LOCAL_PREFIXES.some((p) => urlPath === p || urlPath.startsWith(p));
}
const app = express();
// 0) Serve hashed asset filenames by stripping the hash.
app.get(/^\/(script|css)\/(.+)\.([a-f0-9]{10})\.(min\.\w+|\w+)$/, (req, res, next) => {
const dir = req.params[0];
const base = req.params[1];
const ext = req.params[3];
const filePath = path.join(PUBLIC_DIR, dir, `${base}.${ext}`);
if (!fs.existsSync(filePath)) return next();
res.sendFile(filePath);
});
// 1) Local static for the UI shell.
app.use((req, res, next) => {
if (req.method === "GET" && isLocalPath(req.path)) {
res.setHeader("Cache-Control", "no-store, max-age=0");
// The SPA entry: serve index.html with asset-hash placeholders filled in.
if (req.path === "/" || req.path === "/index.html") {
let html = fs.readFileSync(path.join(PUBLIC_DIR, "index.html"), "utf-8");
html = html
.replace("__CORE_JS__", asset("core.min.js"))
.replace("__VENDOR_JS__", asset("vendor.min.js"))
.replace("__MERMAID_JS__", asset("mermaid.min.js"))
.replace("__ALL_CSS__", asset("all.min.css"));
res.type("html").send(html);
return;
}
return express.static(PUBLIC_DIR, {
fallthrough: true,
etag: false,
cacheControl: false,
})(req, res, next);
}
next();
});
// 2) SPA catch-all: serve local index.html for HTML page navigations
// so all routes use the local shell (with split bundles).
app.use((req, res, next) => {
const accept = req.headers.accept || "";
if (
req.method === "GET" &&
accept.includes("text/html") &&
!req.path.startsWith("/api/") &&
!req.path.startsWith("/github/") &&
!req.path.startsWith("/w/")
) {
let html = fs.readFileSync(path.join(PUBLIC_DIR, "index.html"), "utf-8");
html = html
.replace("__CORE_JS__", asset("core.min.js"))
.replace("__VENDOR_JS__", asset("vendor.min.js"))
.replace("__MERMAID_JS__", asset("mermaid.min.js"))
.replace("__ALL_CSS__", asset("all.min.css"));
res.type("html").send(html);
return;
}
next();
});
// 3) Proxy everything else to the live site.
app.use(
createProxyMiddleware({
target: UPSTREAM,
changeOrigin: true,
secure: true,
ws: true,
xfwd: false,
followRedirects: false,
selfHandleResponse: true, // so we can rewrite Set-Cookie + HTML
cookieDomainRewrite: "",
cookiePathRewrite: "/",
onProxyReq(proxyReq, req) {
// Make upstream think the request came in over HTTPS at its domain.
proxyReq.setHeader("origin", UPSTREAM);
proxyReq.setHeader("referer", UPSTREAM + req.originalUrl);
},
onProxyRes: responseInterceptor(async (buffer, proxyRes, req, res) => {
// Rewrite Set-Cookie so cookies stick on localhost.
const setCookie = proxyRes.headers["set-cookie"];
if (setCookie) {
const rewritten = setCookie.map((c) =>
c
.replace(/;\s*Secure/gi, "")
.replace(/;\s*Domain=[^;]+/gi, "")
.replace(/;\s*SameSite=None/gi, "; SameSite=Lax"),
);
res.setHeader("set-cookie", rewritten);
}
// Rewrite Location headers on 3xx redirects.
const location = proxyRes.headers["location"];
if (location && typeof location === "string") {
try {
const u = new URL(location, UPSTREAM);
if (u.origin === UPSTREAM) {
res.setHeader("location", u.pathname + u.search + u.hash);
}
} catch {
/* leave as-is */
}
}
const ct = String(proxyRes.headers["content-type"] || "");
if (ct.includes("text/html")) {
// Swap upstream domain references in HTML so relative navigation
// stays on localhost.
const body = buffer
.toString("utf8")
.replace(new RegExp(UPSTREAM.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"), "g"), "");
return body;
}
return buffer;
}),
logLevel: "warn",
}),
);
app.listen(PORT, () => {
console.log(
`\n dev-proxy http://localhost:${PORT}` +
`\n → local: ${PUBLIC_DIR}` +
`\n → upstream ${UPSTREAM}\n`,
);
});
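
The Set-Cookie rewriting in `onProxyRes` is the part of this script most likely to surprise, so here is a small standalone sketch of it. The three replacements are copied from the file above; the function name `rewriteSetCookie` and the sample cookie are assumptions.

```javascript
// Same three replacements as in onProxyRes, pulled into a hypothetical
// standalone function so the result can be inspected.
function rewriteSetCookie(cookie) {
  return cookie
    .replace(/;\s*Secure/gi, "")                       // allow plain-HTTP localhost
    .replace(/;\s*Domain=[^;]+/gi, "")                 // make it a host-only cookie
    .replace(/;\s*SameSite=None/gi, "; SameSite=Lax"); // None requires Secure
}

// Illustrative upstream header value (cookie name is made up).
const upstreamCookie =
  "sid=abc; Path=/; Domain=anonymous.4open.science; Secure; HttpOnly; SameSite=None";
const localCookie = rewriteSetCookie(upstreamCookie);
```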
+43 -1
@@ -1,4 +1,5 @@
 import { resolve } from "path";
+import { randomBytes } from "crypto";
 interface Config {
 SESSION_SECRET: string;
@@ -37,7 +38,10 @@ interface Config {
 RATE_LIMIT: number;
 }
 const config: Config = {
-SESSION_SECRET: "SESSION_SECRET",
+// Predictable defaults are dangerous: a known SESSION_SECRET lets anyone
+// forge session cookies. Default to empty and resolve below — random in
+// dev, required in production. See the post-env block.
+SESSION_SECRET: "",
 CLIENT_ID: "CLIENT_ID",
 CLIENT_SECRET: "CLIENT_SECRET",
 GITHUB_TOKEN: "",
@@ -99,4 +103,42 @@ for (const conf in process.env) {
 }
 }
+// Harden security-sensitive secrets that still hold an unset/predictable
+// value after reading the environment (CWE-798).
+const isProduction = process.env.NODE_ENV === "production";
+// SESSION_SECRET: a known value allows session forgery. Require it in
+// production; in development fall back to a per-process random value so the
+// app still boots without shipping a guessable secret.
+if (!config.SESSION_SECRET || config.SESSION_SECRET === "SESSION_SECRET") {
+if (isProduction) {
+throw new Error(
+"SESSION_SECRET must be set to a strong random value in production"
+);
+}
+config.SESSION_SECRET = randomBytes(32).toString("hex");
+// eslint-disable-next-line no-console
+console.warn(
+"SESSION_SECRET not set — generated a random development secret. " +
+"Sessions will not persist across restarts. Set SESSION_SECRET in production."
+);
+}
+// Refuse to start in production with the placeholder OAuth credentials or the
+// default database password baked into the image.
+if (isProduction) {
+const insecureDefaults: [string, string][] = [
+["CLIENT_ID", "CLIENT_ID"],
+["CLIENT_SECRET", "CLIENT_SECRET"],
+["DB_PASSWORD", "password"],
+];
+for (const [key, badValue] of insecureDefaults) {
+if ((config as unknown as Record<string, unknown>)[key] === badValue) {
+throw new Error(
+`${key} is using its insecure default value; set it via the environment in production`
+);
+}
+}
+}
 export default config;
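
The SESSION_SECRET rule above reads more easily as a pure function. This is a sketch only: `resolveSessionSecret` is a hypothetical name, and `env` stands in for `process.env` so the behaviour can be exercised without touching globals.

```javascript
const { randomBytes } = require("node:crypto");

// Hypothetical pure version of the rule: keep an explicit secret, refuse to
// run in production without one, and generate a throwaway value otherwise.
function resolveSessionSecret(env) {
  const value = env.SESSION_SECRET;
  const unset = !value || value === "SESSION_SECRET"; // empty or old placeholder
  if (!unset) return value;
  if (env.NODE_ENV === "production") {
    throw new Error(
      "SESSION_SECRET must be set to a strong random value in production"
    );
  }
  return randomBytes(32).toString("hex"); // 64 hex characters, new on every start
}
```

Because the development fallback changes on every start, existing sessions stop validating after a restart, which matches the warning logged by the committed code.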
+30
@@ -284,6 +284,29 @@ interface CompiledTermVariant {
 mask: string;
 }
+// Detect exponential-backtracking regex shapes — a quantifier applied to a
+// group that itself contains a quantifier or top-level alternation, e.g.
+// (a+)+, (a*)*, (a|aa)+, and the nested form ((a+))+. Anonymization terms come
+// from the repository owner and are applied as live regexes against file
+// content, so a crafted term could otherwise hang the worker (ReDoS,
+// CWE-1333/624). This is a heuristic, not a proof: the lazy [\s\S]*? body
+// matches across nested parentheses so nested quantified groups are caught,
+// and it errs toward over-escaping benign regexes rather than letting a
+// dangerous one through. It is not exhaustive — exotic backtracking shapes may
+// still slip past — so it backstops, rather than replaces, any execution-time
+// bound on the regex.
+function hasCatastrophicBacktracking(src: string): boolean {
+const quantifiedGroup = /\(([\s\S]*?)\)\s*(?:[*+]|\{\d+(?:,\d*)?\})/g;
+let match: RegExpExecArray | null;
+while ((match = quantifiedGroup.exec(src)) !== null) {
+const inner = match[1];
+if (/[*+]|\{\d+(?:,\d*)?\}/.test(inner) || inner.includes("|")) {
+return true;
+}
+}
+return false;
+}
 function compileTerms(terms: string[] | undefined): CompiledTermVariant[] {
 if (!terms || terms.length === 0) return [];
 const compiled: CompiledTermVariant[] = [];
@@ -298,9 +321,16 @@ function compileTerms(terms: string[] | undefined): CompiledTermVariant[] {
 parsed.replacement !== null
 ? parsed.replacement
 : config.ANONYMIZATION_MASK + "-" + (i + 1);
+// Use the term as a regex only when it both compiles AND is free of
+// catastrophic-backtracking shapes; otherwise escape it to a literal so a
+// malicious term cannot trigger ReDoS during anonymization.
+let useAsRegex = true;
 try {
 new RegExp(term, "gi");
 } catch {
+useAsRegex = false;
+}
+if (!useAsRegex || hasCatastrophicBacktracking(term)) {
 term = term.replace(/[-[\]{}()*+?.,\\^$|#]/g, "\\$&");
 }
 for (const variant of termVariants(term)) {
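
The guard and its fallback can be tried outside the anonymizer. In the sketch below `hasCatastrophicBacktracking` is transcribed from the hunk above with the type annotations removed; `toSafePattern` is a hypothetical wrapper around the "compile or escape" decision made in `compileTerms`.

```javascript
// Transcribed from the diff (TypeScript annotations removed).
function hasCatastrophicBacktracking(src) {
  const quantifiedGroup = /\(([\s\S]*?)\)\s*(?:[*+]|\{\d+(?:,\d*)?\})/g;
  let match;
  while ((match = quantifiedGroup.exec(src)) !== null) {
    const inner = match[1];
    if (/[*+]|\{\d+(?:,\d*)?\}/.test(inner) || inner.includes("|")) {
      return true;
    }
  }
  return false;
}

// Hypothetical wrapper: keep the term as a regex source only if it compiles
// and passes the heuristic; otherwise escape it to a literal.
function toSafePattern(term) {
  let usable = true;
  try {
    new RegExp(term, "gi");
  } catch {
    usable = false;
  }
  if (!usable || hasCatastrophicBacktracking(term)) {
    return term.replace(/[-[\]{}()*+?.,\\^$|#]/g, "\\$&");
  }
  return term;
}
```

As the committed comment says, this is a heuristic: a benign pattern such as `(a|b)+` is also escaped, which trades some regex expressiveness for safety.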
+18
@@ -295,6 +295,24 @@ export async function getRepositoryFromGitHub(opt: {
 cause: error as Error,
 });
 }
+// SAML SSO-protected orgs return a 403 with a distinct message. Without
+// this branch it falls through to the generic 401/403 handler below and
+// is reported as "token_expired", which sends users to re-login instead
+// of authorizing their token for the organization (the real fix). See
+// #379/#550.
+if (
+error instanceof Error &&
+error.message.includes("SAML enforcement")
+) {
+throw new AnonymousError("repo_saml_enforcement", {
+httpStatus: 403,
+object: {
+owner: opt.owner,
+repo: opt.repo,
+},
+cause: error as Error,
+});
+}
 // If the name 404s but we know the GitHub repo id, the repo was
 // probably renamed. Look it up by id and continue with the new name.
 const status = (error as { status?: number }).status;
+15 -1
@@ -24,6 +24,7 @@ export default class FileSystem extends StorageBase {
 /** @override */
 async exists(repoId: string, p: string = ""): Promise<FILE_TYPE> {
+this.assertSafePath(p);
 const fullPath = join(config.FOLDER, this.repoPath(repoId), p);
 try {
 const stat = await fs.promises.stat(fullPath);
@@ -37,17 +38,20 @@ export default class FileSystem extends StorageBase {
 /** @override */
 async send(repoId: string, p: string, res: Response) {
+this.assertSafePath(p);
 const fullPath = join(config.FOLDER, this.repoPath(repoId), p);
 res.sendFile(fullPath, { dotfiles: "allow" });
 }
 /** @override */
 async read(repoId: string, p: string): Promise<Readable> {
+this.assertSafePath(p);
 const fullPath = join(config.FOLDER, this.repoPath(repoId), p);
 return fs.createReadStream(fullPath);
 }
 async fileInfo(repoId: string, path: string) {
+this.assertSafePath(path);
 const fullPath = join(config.FOLDER, this.repoPath(repoId), path);
 const info = await fs.promises.stat(fullPath);
 return {
@@ -67,6 +71,7 @@ export default class FileSystem extends StorageBase {
 _source?: string,
 expectedSize?: number
 ): Promise<void> {
+this.assertSafePath(p);
 const fullPath = join(config.FOLDER, this.repoPath(repoId), p);
 // Atomic write: stream into a sibling .tmp and only rename into place
 // when the source stream finishes successfully. If the source errors
@@ -126,6 +131,7 @@ export default class FileSystem extends StorageBase {
 /** @override */
 async rm(repoId: string, dir: string = ""): Promise<void> {
+this.assertSafePath(dir);
 const fullPath = join(config.FOLDER, this.repoPath(repoId), dir);
 await fs.promises.rm(fullPath, {
 force: true,
@@ -135,6 +141,7 @@ export default class FileSystem extends StorageBase {
 /** @override */
 async mk(repoId: string, dir: string = ""): Promise<void> {
+this.assertSafePath(dir);
 const fullPath = join(config.FOLDER, this.repoPath(repoId), dir);
 try {
 await fs.promises.mkdir(fullPath, {
@@ -155,6 +162,7 @@ export default class FileSystem extends StorageBase {
 onEntry?: (file: { path: string; size: number }) => void;
 } = {}
 ): Promise<IFile[]> {
+this.assertSafePath(dir);
 const fullPath = join(config.FOLDER, this.repoPath(repoId), dir);
 const files = await fs.promises.readdir(fullPath);
 const output2: IFile[] = [];
@@ -197,13 +205,18 @@ export default class FileSystem extends StorageBase {
 /** @override */
 async extractZip(repoId: string, p: string, data: Readable): Promise<void> {
+this.assertSafePath(p);
 const pipe = promisify(pipeline);
 const fullPath = join(config.FOLDER, this.repoPath(repoId), p);
 const extractor = Extract({
 path: fullPath,
 decodeString: (buf) => {
 const name = buf.toString();
-const newName = name.substr(name.indexOf("/") + 1);
+// Strip the top-level directory GitHub wraps every entry in, then
+// drop any "../" / absolute segments so a malicious entry name
+// cannot escape the extraction root (zip-slip, CWE-24).
+const stripped = name.substr(name.indexOf("/") + 1);
+const newName = this.sanitizeZipEntryName(stripped);
 if (newName == "") {
 return "___IGNORE___";
 }
@@ -223,6 +236,7 @@ export default class FileSystem extends StorageBase {
 fileTransformer?: (path: string) => Transform;
 }
 ) {
+this.assertSafePath(dir);
 const archive = archiver(opt?.format || "zip", {});
 const fullPath = join(config.FOLDER, this.repoPath(repoId), dir);
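
`sanitizeZipEntryName` is called in the `extractZip` hunk but its body is not part of this diff. The sketch below shows one plausible shape for such a helper, purely as an assumption; the committed implementation may differ.

```javascript
// Hypothetical sketch, not the committed implementation: normalise
// separators, then drop empty, "." and ".." segments so the result is
// always a relative path below the extraction root.
function sanitizeZipEntryName(name) {
  return name
    .replace(/\\/g, "/") // treat backslashes as separators
    .split("/")
    .filter((seg) => seg !== "" && seg !== "." && seg !== "..")
    .join("/"); // note: a trailing "/" on directory entries is dropped too
}
```

An entry that sanitises to the empty string would then hit the existing `"___IGNORE___"` branch in `decodeString`.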
+16 -1
@@ -54,6 +54,7 @@ export default class S3Storage extends StorageBase {
 /** @override */
 async exists(repoId: string, path: string = ""): Promise<FILE_TYPE> {
 if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
+this.assertSafePath(path);
 try {
 // if we can get the file info, it is a file
 await this.fileInfo(repoId, path);
@@ -79,6 +80,7 @@ export default class S3Storage extends StorageBase {
 /** @override */
 async rm(repoId: string, dir: string = ""): Promise<void> {
 if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
+this.assertSafePath(dir);
 const data = await this.client(200000).listObjectsV2({
 Bucket: config.S3_BUCKET,
 Prefix: join(this.repoPath(repoId), dir),
@@ -110,6 +112,7 @@ export default class S3Storage extends StorageBase {
 /** @override */
 async send(repoId: string, path: string, res: Response) {
 if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
+this.assertSafePath(path);
 try {
 const command = new GetObjectCommand({
 Bucket: config.S3_BUCKET,
@@ -145,6 +148,7 @@ export default class S3Storage extends StorageBase {
 async fileInfo(repoId: string, path: string) {
 if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
+this.assertSafePath(path);
 const info = await this.client(3000).headObject({
 Bucket: config.S3_BUCKET,
 Key: join(this.repoPath(repoId), path),
@@ -161,6 +165,7 @@ export default class S3Storage extends StorageBase {
 /** @override */
 async read(repoId: string, path: string): Promise<Readable> {
 if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
+this.assertSafePath(path);
 const command = new GetObjectCommand({
 Bucket: config.S3_BUCKET,
 Key: join(this.repoPath(repoId), path),
@@ -184,6 +189,7 @@ export default class S3Storage extends StorageBase {
 expectedSize?: number
 ): Promise<void> {
 if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
+this.assertSafePath(path);
 // No fire-and-forget rm on stream error: Upload uses multipart and
 // does not commit a partially-uploaded object, so there's nothing to
@@ -249,6 +255,7 @@ export default class S3Storage extends StorageBase {
 /** @override */
 async listFiles(repoId: string, dir: string = ""): Promise<IFile[]> {
 if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
+this.assertSafePath(dir);
 if (dir && dir[dir.length - 1] != "/") dir = dir + "/";
 const out: IFile[] = [];
 let req: ListObjectsV2CommandOutput;
@@ -287,6 +294,7 @@ export default class S3Storage extends StorageBase {
 data: Readable,
 source?: string
 ): Promise<void> {
+this.assertSafePath(path);
 let toS3: ArchiveStreamToS3;
 return new Promise((resolve, reject) => {
 if (!config.S3_BUCKET) return reject("S3_BUCKET not set");
@@ -296,7 +304,13 @@ export default class S3Storage extends StorageBase {
 s3: this.client(2 * 60 * 60 * 1000), // 2h timeout
 type: "zip",
 onEntry: (header) => {
-header.name = header.name.substring(header.name.indexOf("/") + 1);
+// Strip the wrapping top-level dir, then drop any "../" / absolute
+// segments so a crafted entry name cannot write objects outside
+// the repo key prefix (zip-slip, CWE-23).
+const stripped = header.name.substring(
+header.name.indexOf("/") + 1
+);
+header.name = this.sanitizeZipEntryName(stripped);
 if (source) {
 header.Tagging = `source=${source}`;
 header.Metadata = {
@@ -329,6 +343,7 @@ export default class S3Storage extends StorageBase {
 }
 ) {
 if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
+this.assertSafePath(dir);
 const archive = archiver(opt?.format || "zip", {});
 if (dir && dir[dir.length - 1] != "/") dir = dir + "/";
+53
@@ -6,6 +6,7 @@ import { Response } from "express";
 import S3Storage from "./S3";
 import FileSystem from "./FileSystem";
 import { IFile } from "../model/files/files.types";
+import AnonymousError from "../AnonymousError";
 export type Storage = S3Storage | FileSystem;
@@ -124,4 +125,56 @@ export default abstract class StorageBase {
 join(repoId, "original") + (process.platform === "win32" ? "\\" : "/")
 );
 }
+/**
+ * Reject any path/dir argument that could escape the per-repo base
+ * directory (filesystem) or key prefix (S3) once joined. The storage
+ * methods take a path component that ultimately derives from the request
+ * URL; `path.join`/key concatenation normalises `../` but does NOT stop it
+ * from climbing above the base. Validating the raw component before it is
+ * joined is the load-bearing defence against path traversal / zip-slip
+ * (CWE-22/23/24) for both backends.
+ *
+ * Throws AnonymousError(400) when the path is absolute or contains a `..`
+ * segment. An empty string (the repo root) is allowed.
+ */
+protected assertSafePath(p: string | undefined): void {
+if (p == null || p === "") return;
+if (typeof p !== "string") {
+throw new AnonymousError("invalid_path", {
+httpStatus: 400,
+object: String(p),
+});
+}
+// Absolute paths (POSIX "/x", Windows "C:\x" / "\x") must not be allowed
+// to override the base in a join.
+if (/^([a-zA-Z]:)?[\\/]/.test(p)) {
+throw new AnonymousError("invalid_path", { httpStatus: 400, object: p });
+}
+for (const segment of p.split(/[\\/]/)) {
+if (segment === "..") {
+throw new AnonymousError("invalid_path", {
+httpStatus: 400,
+object: p,
+});
+}
+}
+}
+/**
+ * Sanitise a single zip entry name during extraction. The archive
+ * extractors strip the leading top-level directory of each entry; this
+ * additionally drops any `..` / absolute components so a crafted entry like
+ * `repo/../../../etc/crontab` cannot escape the extraction root
+ * (zip-slip, CWE-23/24). Returns "" when nothing safe remains.
+ */
+protected sanitizeZipEntryName(name: string): string {
+return name
+.split(/[\\/]/)
+.filter(
+(segment) =>
+segment !== "" && segment !== "." && segment !== ".."
+)
+.join("/");
+}
 }
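Reviewer note: the two helpers added above can be exercised in isolation. The following is a minimal standalone sketch, assuming a plain `Error` in place of the project's `AnonymousError`; `entryNameFor` and `isRejected` are hypothetical helper names introduced only for this illustration and are not part of the change.

```typescript
// Standalone sketch of the two checks; a plain Error stands in for AnonymousError.
function assertSafePath(p: string | undefined): void {
  if (p == null || p === "") return; // the repo root is allowed
  // Absolute POSIX ("/x") or Windows ("C:\x", "\x") paths are rejected.
  if (/^([a-zA-Z]:)?[\\/]/.test(p)) throw new Error("invalid_path");
  // Any ".." segment is rejected, whichever separator is used.
  if (p.split(/[\\/]/).some((segment) => segment === "..")) {
    throw new Error("invalid_path");
  }
}

// Drop empty, "." and ".." segments so the name stays below the extraction root.
function sanitizeZipEntryName(name: string): string {
  return name
    .split(/[\\/]/)
    .filter((segment) => segment !== "" && segment !== "." && segment !== "..")
    .join("/");
}

// Hypothetical helper: strip the wrapping top-level directory, then sanitise.
function entryNameFor(rawName: string): string {
  return sanitizeZipEntryName(rawName.substring(rawName.indexOf("/") + 1));
}

// Hypothetical helper: report whether assertSafePath rejects a value.
function isRejected(p: string): boolean {
  try {
    assertSafePath(p);
    return false;
  } catch {
    return true;
  }
}

console.log(isRejected("src/index.ts")); // false
console.log(isRejected("a/../../b")); // true
console.log(entryNameFor("repo/../../../etc/crontab")); // "etc/crontab"
```

Note the difference in behaviour: request-derived paths are rejected outright, while zip entry names are rewritten so that extraction of the remaining entries can continue.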
+9 -3
@@ -109,6 +109,10 @@ export default async function start() {
 })
 );
 app.set("etag", "strong");
+// Trust exactly TRUST_PROXY proxy hops so Express derives request.ip from
+// the right X-Forwarded-For entry. This is what makes request.ip
+// trustworthy for rate limiting instead of a client-spoofable header.
+app.set("trust proxy", config.TRUST_PROXY);
 // handle session and connection
 app.use(initSession());
@@ -134,9 +138,11 @@ export default async function start() {
 request: express.Request,
 _response: express.Response
 ): string {
-if (request.headers["cf-connecting-ip"]) {
-return request.headers["cf-connecting-ip"] as string;
-}
+// Use request.ip, which Express resolves from X-Forwarded-For honouring
+// the configured "trust proxy" hop count. Do NOT key off the
+// cf-connecting-ip header directly: when the server isn't actually behind
+// Cloudflare a client can set that header to an arbitrary value per
+// request and trivially bypass the rate limiter (CWE-290).
 if (!request.ip && request.socket.remoteAddress) {
 logger.warn("request.ip is missing");
 return request.socket.remoteAddress;
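Reviewer note: to make the hop-count reasoning in the comments above concrete, here is a simplified model of how a numeric "trust proxy" setting selects the client address. It is a sketch of the idea, not Express's actual code; `resolveClientIp` and its parameters are hypothetical names, and the addresses are made up.

```typescript
// Simplified model: the socket address is the nearest hop, followed by the
// X-Forwarded-For entries read right to left. The first `hops` addresses are
// trusted proxies; the next one is taken as the client. If every address is
// trusted, the furthest one is used.
function resolveClientIp(
  socketAddr: string,
  xForwardedFor: string | undefined,
  hops: number
): string {
  const forwarded = (xForwardedFor ?? "")
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry !== "")
    .reverse();
  const chain = [socketAddr, ...forwarded]; // nearest hop first
  const index = Math.min(Math.max(hops, 0), chain.length - 1);
  return chain[index] ?? socketAddr;
}

// One trusted proxy: the entry it appended wins, and the value the client
// prepended to X-Forwarded-For is ignored.
console.log(resolveClientIp("10.0.0.1", "6.6.6.6, 203.0.113.7", 1)); // "203.0.113.7"
```

With the hop count set to the real number of proxies in front of the server, a client can add entries to the left of the header but cannot change which entry is selected.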
+10 -1
@@ -105,9 +105,18 @@ function escapeRegex(s: string): string {
 return s.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, "\\$&");
 }
+// Only plain field paths may be used as a Mongo sort key. Rejecting keys that
+// start with "$" or contain anything other than [A-Za-z0-9_.] prevents an
+// admin-supplied req.query.sort from injecting operator-prefixed keys
+// (e.g. "$where") into the query object (CWE-943).
+function isSafeSortField(field: unknown): field is string {
+return typeof field === "string" && /^[A-Za-z_][A-Za-z0-9_.]*$/.test(field);
+}
 function parseSort(req: express.Request, fallbackField = "_id"): Record<string, 1 | -1> {
 const direction = req.query.direction === "asc" ? 1 : -1;
-const field = (req.query.sort as string) || fallbackField;
+const requested = req.query.sort;
+const field = isSafeSortField(requested) ? requested : fallbackField;
 return { [field]: direction };
 }
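Reviewer note: the sort-key guard is easy to check without Express. The sketch below assumes a plain object in place of `req.query` (the `query` parameter is a hypothetical stand-in); the guard and the fallback logic mirror the hunk above.

```typescript
type SortSpec = Record<string, 1 | -1>;

// Accept only plain field paths such as "createdAt" or "owner.name".
function isSafeSortField(field: unknown): field is string {
  return typeof field === "string" && /^[A-Za-z_][A-Za-z0-9_.]*$/.test(field);
}

// `query` stands in for Express's req.query in this sketch.
function parseSort(query: Record<string, unknown>, fallbackField = "_id"): SortSpec {
  const direction = query.direction === "asc" ? 1 : -1;
  const requested = query.sort;
  const field = isSafeSortField(requested) ? requested : fallbackField;
  return { [field]: direction };
}

console.log(parseSort({ sort: "createdAt", direction: "asc" })); // { createdAt: 1 }
console.log(parseSort({ sort: "$where" })); // { _id: -1 }
console.log(parseSort({ sort: ["a", "b"] })); // { _id: -1 }
```

Because the guard also checks `typeof`, array or object values produced by repeated or bracketed query parameters fall back to the default field instead of being cast to a string.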
+17
@@ -50,6 +50,23 @@ router.get(
 res
 );
 }
+// Reject path traversal before the path reaches the storage layer. The
+// storage backends also validate, but failing fast here keeps a crafted
+// "../" URL from being treated as a real lookup (CWE-22/25).
+if (
+anonymizedPath
+.split(/[\\/]/)
+.some((segment) => segment === "..") ||
+/^[\\/]/.test(anonymizedPath)
+) {
+return handleError(
+new AnonymousError("invalid_path", {
+httpStatus: 400,
+object: anonymizedPath,
+}),
+res
+);
+}
 const repo = await getRepo(req, res, {
 nocheck: false,
+9
@@ -85,6 +85,15 @@ async function webView(req: express.Request, res: express.Response) {
 const filePath = req.path.substring(
 indexRepoId + req.params.repoId.length + 1
 );
+// Reject traversal in the URL-derived segment before joining it onto the
+// page-source root. Stripping a single leading "/" or "." is not enough
+// to stop "../../" sequences from climbing out of the repo (CWE-22).
+if (filePath.split(/[\\/]/).some((segment) => segment === "..")) {
+throw new AnonymousError("invalid_path", {
+httpStatus: 400,
+object: filePath,
+});
+}
 let requestPath = path.join(wRoot, filePath);
 if (requestPath.at(0) == "/" || requestPath.at(0) == ".") {
 requestPath = requestPath.substring(1);
+7
@@ -13,7 +13,11 @@ export const router = express.Router();
 router.post(
 "/download",
 async (req: express.Request, res: express.Response) => {
+req.body = req.body || {};
 const token: string = req.body.token;
+if (typeof req.body.repoFullName !== "string" || typeof token !== "string") {
+return res.status(400).json({ error: "invalid_request" });
+}
 const repoFullName = req.body.repoFullName.split("/");
 const repoId = req.body.repoId;
 const commit = req.body.commit;
@@ -41,6 +45,9 @@ router.post(
 router.post("/", async (req: express.Request, res: express.Response) => {
 req.body = req.body || {};
 const token: string = req.body.token;
+if (typeof req.body.repoFullName !== "string" || typeof token !== "string") {
+return res.status(400).json({ error: "invalid_request" });
+}
 const repoFullName = req.body.repoFullName.split("/");
 const repoId = req.body.repoId;
 const fileSha = req.body.sha;
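Reviewer note: both streamer routes now repeat the same two `typeof` checks. One possible way to express them is a shared type guard, sketched below. `StreamerRequest`, `isStreamerRequest` and `splitRepoFullName` are hypothetical names that do not appear in the change; the sketch only illustrates why the check has to run before `.split("/")` is called.

```typescript
// Hypothetical shape of the fields both routes require to be strings.
interface StreamerRequest {
  token: string;
  repoFullName: string;
}

// Type guard: true only when both required fields are present and are strings.
function isStreamerRequest(body: unknown): body is StreamerRequest {
  if (typeof body !== "object" || body === null) return false;
  const candidate = body as Record<string, unknown>;
  return (
    typeof candidate.token === "string" &&
    typeof candidate.repoFullName === "string"
  );
}

// Returns the owner/name parts, or null where the route would answer 400.
function splitRepoFullName(body: unknown): string[] | null {
  return isStreamerRequest(body) ? body.repoFullName.split("/") : null;
}

console.log(splitRepoFullName({ token: "t", repoFullName: "owner/repo" })); // [ "owner", "repo" ]
console.log(splitRepoFullName({ token: "t", repoFullName: ["owner"] })); // null
console.log(splitRepoFullName(undefined)); // null
```

Without the check, a JSON body whose `repoFullName` is an array or is missing would throw on `.split` and surface as an unhandled error rather than a 400 response.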