Security hardening + gist UI fixes (#731)

* security: harden against XSS, ReDoS, path traversal, and injection

Defensive fixes across the server, storage, and viewer:

- XSS (CWE-79): sanitise rendered notebooks with DOMPurify, escape file
  names interpolated into AngularJS expressions (escapeNgString), set
  Mermaid securityLevel to 'strict', and stop urlRel2abs from returning
  javascript:/vbscript:/data:text/html URLs.
- Path traversal / zip-slip (CWE-22/23/24): validate URL-derived path
  components before they reach the storage layer (file/webview routes +
  StorageBase.assertSafePath) and sanitise zip entry names on extract for
  both the filesystem and S3 backends.
- ReDoS (CWE-1333): escape anonymization terms with catastrophic
  backtracking shapes to literals instead of compiling them as regexes.
- Secret hardening (CWE-798): require SESSION_SECRET / OAuth creds / DB
  password in production, random dev SESSION_SECRET fallback.
- Rate-limit spoofing (CWE-290): derive request.ip via trust-proxy hop
  count instead of the client-settable cf-connecting-ip header.
- NoSQL injection (CWE-943): allow only plain field paths as admin sort keys.
- Reject malformed streamer requests missing required string fields.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(ui): make gists reachable/visible and clarify the ZIP button

- Gist & PR routes now accept a trailing slash (/gist/:id/:path*?), so the
  dashboard links (which end in "/") resolve to the gist/PR page instead of
  falling through to the 404 route (#725).
- Gist viewer picks the default tab after content loads, defaulting to
  "files" when files exist; previously the ng-init ran before the async
  load and a files-only gist rendered blank under the hidden comments tab.
- Explorer toolbar: relabel ZIP to "Full repo ZIP" with a tooltip, and add
  tooltips to Raw/Download clarifying they apply to the current file (#721).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix: report SAML-enforced orgs clearly instead of "token expired"

When a repo's organization enforces SAML SSO, GitHub returns a 403 whose
message differs from the OAuth-App-restriction case. That 403 fell through
to the generic handler and surfaced as "token_expired", pushing users to
re-login when the real fix is authorizing their token for the org. Detect
the "SAML enforcement" message and raise a dedicated, actionable error
instead (#379, #550).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* security: catch nested quantified groups in ReDoS guard and backslash path traversal

- hasCatastrophicBacktracking now scans across nested parens ([\s\S]*?)
  so shapes like ((a+))+ are detected; comment reframed as a heuristic
  backstop rather than a proof.
- file route path-traversal check now rejects backslash separators and a
  leading backslash, covering Windows-style "..\" payloads (CWE-22/25).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* chore(dev): track dev-proxy script, ignore .DS_Store and .claude/

scripts/dev-proxy.js is referenced by the "dev:ui" npm script but was
never committed, breaking the command on a fresh clone. Add it and
ignore local-only macOS/Claude Code files.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
Thomas Durieux
2026-06-18 04:50:55 -07:00
committed by GitHub
parent bdfcc56d81
commit e4ffd74068
21 changed files with 484 additions and 23 deletions
+43 -1
View File
@@ -1,4 +1,5 @@
import { resolve } from "path";
import { randomBytes } from "crypto";
interface Config {
SESSION_SECRET: string;
@@ -37,7 +38,10 @@ interface Config {
RATE_LIMIT: number;
}
const config: Config = {
SESSION_SECRET: "SESSION_SECRET",
// Predictable defaults are dangerous: a known SESSION_SECRET lets anyone
// forge session cookies. Default to empty and resolve below — random in
// dev, required in production. See the post-env block.
SESSION_SECRET: "",
CLIENT_ID: "CLIENT_ID",
CLIENT_SECRET: "CLIENT_SECRET",
GITHUB_TOKEN: "",
@@ -99,4 +103,42 @@ for (const conf in process.env) {
}
}
// Harden security-sensitive secrets that still hold an unset/predictable
// value after reading the environment (CWE-798).
const isProduction = process.env.NODE_ENV === "production";
// SESSION_SECRET: a known value allows session forgery. Require it in
// production; in development fall back to a per-process random value so the
// app still boots without shipping a guessable secret.
if (!config.SESSION_SECRET || config.SESSION_SECRET === "SESSION_SECRET") {
if (isProduction) {
throw new Error(
"SESSION_SECRET must be set to a strong random value in production"
);
}
config.SESSION_SECRET = randomBytes(32).toString("hex");
// eslint-disable-next-line no-console
console.warn(
"SESSION_SECRET not set — generated a random development secret. " +
"Sessions will not persist across restarts. Set SESSION_SECRET in production."
);
}
// Refuse to start in production with the placeholder OAuth credentials or the
// default database password baked into the image.
if (isProduction) {
const insecureDefaults: [string, string][] = [
["CLIENT_ID", "CLIENT_ID"],
["CLIENT_SECRET", "CLIENT_SECRET"],
["DB_PASSWORD", "password"],
];
for (const [key, badValue] of insecureDefaults) {
if ((config as unknown as Record<string, unknown>)[key] === badValue) {
throw new Error(
`${key} is using its insecure default value; set it via the environment in production`
);
}
}
}
export default config;
+30
View File
@@ -284,6 +284,29 @@ interface CompiledTermVariant {
mask: string;
}
// Detect exponential-backtracking regex shapes — a quantifier applied to a
// group that itself contains a quantifier or top-level alternation, e.g.
// (a+)+, (a*)*, (a|aa)+, and the nested form ((a+))+. Anonymization terms come
// from the repository owner and are applied as live regexes against file
// content, so a crafted term could otherwise hang the worker (ReDoS,
// CWE-1333/624). This is a heuristic, not a proof: the lazy [\s\S]*? body
// matches across nested parentheses so nested quantified groups are caught,
// and it errs toward over-escaping benign regexes rather than letting a
// dangerous one through. It is not exhaustive — exotic backtracking shapes may
// still slip past — so it backstops, rather than replaces, any execution-time
// bound on the regex.
function hasCatastrophicBacktracking(src: string): boolean {
const quantifiedGroup = /\(([\s\S]*?)\)\s*(?:[*+]|\{\d+(?:,\d*)?\})/g;
let match: RegExpExecArray | null;
while ((match = quantifiedGroup.exec(src)) !== null) {
const inner = match[1];
if (/[*+]|\{\d+(?:,\d*)?\}/.test(inner) || inner.includes("|")) {
return true;
}
}
return false;
}
function compileTerms(terms: string[] | undefined): CompiledTermVariant[] {
if (!terms || terms.length === 0) return [];
const compiled: CompiledTermVariant[] = [];
@@ -298,9 +321,16 @@ function compileTerms(terms: string[] | undefined): CompiledTermVariant[] {
parsed.replacement !== null
? parsed.replacement
: config.ANONYMIZATION_MASK + "-" + (i + 1);
// Use the term as a regex only when it both compiles AND is free of
// catastrophic-backtracking shapes; otherwise escape it to a literal so a
// malicious term cannot trigger ReDoS during anonymization.
let useAsRegex = true;
try {
new RegExp(term, "gi");
} catch {
useAsRegex = false;
}
if (!useAsRegex || hasCatastrophicBacktracking(term)) {
term = term.replace(/[-[\]{}()*+?.,\\^$|#]/g, "\\$&");
}
for (const variant of termVariants(term)) {
+18
View File
@@ -295,6 +295,24 @@ export async function getRepositoryFromGitHub(opt: {
cause: error as Error,
});
}
// SAML SSO-protected orgs return a 403 with a distinct message. Without
// this branch it falls through to the generic 401/403 handler below and
// is reported as "token_expired", which sends users to re-login instead
// of authorizing their token for the organization (the real fix). See
// #379/#550.
if (
error instanceof Error &&
error.message.includes("SAML enforcement")
) {
throw new AnonymousError("repo_saml_enforcement", {
httpStatus: 403,
object: {
owner: opt.owner,
repo: opt.repo,
},
cause: error as Error,
});
}
// If the name 404s but we know the GitHub repo id, the repo was
// probably renamed. Look it up by id and continue with the new name.
const status = (error as { status?: number }).status;
+15 -1
View File
@@ -24,6 +24,7 @@ export default class FileSystem extends StorageBase {
/** @override */
async exists(repoId: string, p: string = ""): Promise<FILE_TYPE> {
this.assertSafePath(p);
const fullPath = join(config.FOLDER, this.repoPath(repoId), p);
try {
const stat = await fs.promises.stat(fullPath);
@@ -37,17 +38,20 @@ export default class FileSystem extends StorageBase {
/** @override */
async send(repoId: string, p: string, res: Response) {
this.assertSafePath(p);
const fullPath = join(config.FOLDER, this.repoPath(repoId), p);
res.sendFile(fullPath, { dotfiles: "allow" });
}
/** @override */
async read(repoId: string, p: string): Promise<Readable> {
this.assertSafePath(p);
const fullPath = join(config.FOLDER, this.repoPath(repoId), p);
return fs.createReadStream(fullPath);
}
async fileInfo(repoId: string, path: string) {
this.assertSafePath(path);
const fullPath = join(config.FOLDER, this.repoPath(repoId), path);
const info = await fs.promises.stat(fullPath);
return {
@@ -67,6 +71,7 @@ export default class FileSystem extends StorageBase {
_source?: string,
expectedSize?: number
): Promise<void> {
this.assertSafePath(p);
const fullPath = join(config.FOLDER, this.repoPath(repoId), p);
// Atomic write: stream into a sibling .tmp and only rename into place
// when the source stream finishes successfully. If the source errors
@@ -126,6 +131,7 @@ export default class FileSystem extends StorageBase {
/** @override */
async rm(repoId: string, dir: string = ""): Promise<void> {
this.assertSafePath(dir);
const fullPath = join(config.FOLDER, this.repoPath(repoId), dir);
await fs.promises.rm(fullPath, {
force: true,
@@ -135,6 +141,7 @@ export default class FileSystem extends StorageBase {
/** @override */
async mk(repoId: string, dir: string = ""): Promise<void> {
this.assertSafePath(dir);
const fullPath = join(config.FOLDER, this.repoPath(repoId), dir);
try {
await fs.promises.mkdir(fullPath, {
@@ -155,6 +162,7 @@ export default class FileSystem extends StorageBase {
onEntry?: (file: { path: string; size: number }) => void;
} = {}
): Promise<IFile[]> {
this.assertSafePath(dir);
const fullPath = join(config.FOLDER, this.repoPath(repoId), dir);
const files = await fs.promises.readdir(fullPath);
const output2: IFile[] = [];
@@ -197,13 +205,18 @@ export default class FileSystem extends StorageBase {
/** @override */
async extractZip(repoId: string, p: string, data: Readable): Promise<void> {
this.assertSafePath(p);
const pipe = promisify(pipeline);
const fullPath = join(config.FOLDER, this.repoPath(repoId), p);
const extractor = Extract({
path: fullPath,
decodeString: (buf) => {
const name = buf.toString();
const newName = name.substr(name.indexOf("/") + 1);
// Strip the top-level directory GitHub wraps every entry in, then
// drop any "../" / absolute segments so a malicious entry name
// cannot escape the extraction root (zip-slip, CWE-24).
const stripped = name.substr(name.indexOf("/") + 1);
const newName = this.sanitizeZipEntryName(stripped);
if (newName == "") {
return "___IGNORE___";
}
@@ -223,6 +236,7 @@ export default class FileSystem extends StorageBase {
fileTransformer?: (path: string) => Transform;
}
) {
this.assertSafePath(dir);
const archive = archiver(opt?.format || "zip", {});
const fullPath = join(config.FOLDER, this.repoPath(repoId), dir);
+16 -1
View File
@@ -54,6 +54,7 @@ export default class S3Storage extends StorageBase {
/** @override */
async exists(repoId: string, path: string = ""): Promise<FILE_TYPE> {
if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
this.assertSafePath(path);
try {
// if we can get the file info, it is a file
await this.fileInfo(repoId, path);
@@ -79,6 +80,7 @@ export default class S3Storage extends StorageBase {
/** @override */
async rm(repoId: string, dir: string = ""): Promise<void> {
if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
this.assertSafePath(dir);
const data = await this.client(200000).listObjectsV2({
Bucket: config.S3_BUCKET,
Prefix: join(this.repoPath(repoId), dir),
@@ -110,6 +112,7 @@ export default class S3Storage extends StorageBase {
/** @override */
async send(repoId: string, path: string, res: Response) {
if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
this.assertSafePath(path);
try {
const command = new GetObjectCommand({
Bucket: config.S3_BUCKET,
@@ -145,6 +148,7 @@ export default class S3Storage extends StorageBase {
async fileInfo(repoId: string, path: string) {
if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
this.assertSafePath(path);
const info = await this.client(3000).headObject({
Bucket: config.S3_BUCKET,
Key: join(this.repoPath(repoId), path),
@@ -161,6 +165,7 @@ export default class S3Storage extends StorageBase {
/** @override */
async read(repoId: string, path: string): Promise<Readable> {
if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
this.assertSafePath(path);
const command = new GetObjectCommand({
Bucket: config.S3_BUCKET,
Key: join(this.repoPath(repoId), path),
@@ -184,6 +189,7 @@ export default class S3Storage extends StorageBase {
expectedSize?: number
): Promise<void> {
if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
this.assertSafePath(path);
// No fire-and-forget rm on stream error: Upload uses multipart and
// does not commit a partially-uploaded object, so there's nothing to
@@ -249,6 +255,7 @@ export default class S3Storage extends StorageBase {
/** @override */
async listFiles(repoId: string, dir: string = ""): Promise<IFile[]> {
if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
this.assertSafePath(dir);
if (dir && dir[dir.length - 1] != "/") dir = dir + "/";
const out: IFile[] = [];
let req: ListObjectsV2CommandOutput;
@@ -287,6 +294,7 @@ export default class S3Storage extends StorageBase {
data: Readable,
source?: string
): Promise<void> {
this.assertSafePath(path);
let toS3: ArchiveStreamToS3;
return new Promise((resolve, reject) => {
if (!config.S3_BUCKET) return reject("S3_BUCKET not set");
@@ -296,7 +304,13 @@ export default class S3Storage extends StorageBase {
s3: this.client(2 * 60 * 60 * 1000), // 2h timeout
type: "zip",
onEntry: (header) => {
header.name = header.name.substring(header.name.indexOf("/") + 1);
// Strip the wrapping top-level dir, then drop any "../" / absolute
// segments so a crafted entry name cannot write objects outside
// the repo key prefix (zip-slip, CWE-23).
const stripped = header.name.substring(
header.name.indexOf("/") + 1
);
header.name = this.sanitizeZipEntryName(stripped);
if (source) {
header.Tagging = `source=${source}`;
header.Metadata = {
@@ -329,6 +343,7 @@ export default class S3Storage extends StorageBase {
}
) {
if (!config.S3_BUCKET) throw new Error("S3_BUCKET not set");
this.assertSafePath(dir);
const archive = archiver(opt?.format || "zip", {});
if (dir && dir[dir.length - 1] != "/") dir = dir + "/";
+53
View File
@@ -6,6 +6,7 @@ import { Response } from "express";
import S3Storage from "./S3";
import FileSystem from "./FileSystem";
import { IFile } from "../model/files/files.types";
import AnonymousError from "../AnonymousError";
export type Storage = S3Storage | FileSystem;
@@ -124,4 +125,56 @@ export default abstract class StorageBase {
join(repoId, "original") + (process.platform === "win32" ? "\\" : "/")
);
}
/**
* Reject any path/dir argument that could escape the per-repo base
* directory (filesystem) or key prefix (S3) once joined. The storage
* methods take a path component that ultimately derives from the request
* URL; `path.join`/key concatenation normalises `../` but does NOT stop it
* from climbing above the base. Validating the raw component before it is
* joined is the load-bearing defence against path traversal / zip-slip
* (CWE-22/23/24) for both backends.
*
* Throws AnonymousError(400) when the path is absolute or contains a `..`
* segment. An empty string (the repo root) is allowed.
*/
protected assertSafePath(p: string | undefined): void {
if (p == null || p === "") return;
if (typeof p !== "string") {
throw new AnonymousError("invalid_path", {
httpStatus: 400,
object: String(p),
});
}
// Absolute paths (POSIX "/x", Windows "C:\x" / "\x") must not be allowed
// to override the base in a join.
if (/^([a-zA-Z]:)?[\\/]/.test(p)) {
throw new AnonymousError("invalid_path", { httpStatus: 400, object: p });
}
for (const segment of p.split(/[\\/]/)) {
if (segment === "..") {
throw new AnonymousError("invalid_path", {
httpStatus: 400,
object: p,
});
}
}
}
/**
* Sanitise a single zip entry name during extraction. The archive
* extractors strip the leading top-level directory of each entry; this
* additionally drops any `..` / absolute components so a crafted entry like
* `repo/../../../etc/crontab` cannot escape the extraction root
* (zip-slip, CWE-23/24). Returns "" when nothing safe remains.
*/
protected sanitizeZipEntryName(name: string): string {
return name
.split(/[\\/]/)
.filter(
(segment) =>
segment !== "" && segment !== "." && segment !== ".."
)
.join("/");
}
}
+9 -3
View File
@@ -109,6 +109,10 @@ export default async function start() {
})
);
app.set("etag", "strong");
// Trust exactly TRUST_PROXY proxy hops so Express derives request.ip from
// the right X-Forwarded-For entry. This is what makes request.ip
// trustworthy for rate limiting instead of a client-spoofable header.
app.set("trust proxy", config.TRUST_PROXY);
// handle session and connection
app.use(initSession());
@@ -134,9 +138,11 @@ export default async function start() {
request: express.Request,
_response: express.Response
): string {
if (request.headers["cf-connecting-ip"]) {
return request.headers["cf-connecting-ip"] as string;
}
// Use request.ip, which Express resolves from X-Forwarded-For honouring
// the configured "trust proxy" hop count. Do NOT key off the
// cf-connecting-ip header directly: when the server isn't actually behind
// Cloudflare a client can set that header to an arbitrary value per
// request and trivially bypass the rate limiter (CWE-290).
if (!request.ip && request.socket.remoteAddress) {
logger.warn("request.ip is missing");
return request.socket.remoteAddress;
+10 -1
View File
@@ -105,9 +105,18 @@ function escapeRegex(s: string): string {
return s.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, "\\$&");
}
// Only plain field paths may be used as a Mongo sort key. Rejecting keys that
// start with "$" or contain anything other than [A-Za-z0-9_.] prevents an
// admin-supplied req.query.sort from injecting operator-prefixed keys
// (e.g. "$where") into the query object (CWE-943).
function isSafeSortField(field: unknown): field is string {
return typeof field === "string" && /^[A-Za-z_][A-Za-z0-9_.]*$/.test(field);
}
function parseSort(req: express.Request, fallbackField = "_id"): Record<string, 1 | -1> {
const direction = req.query.direction === "asc" ? 1 : -1;
const field = (req.query.sort as string) || fallbackField;
const requested = req.query.sort;
const field = isSafeSortField(requested) ? requested : fallbackField;
return { [field]: direction };
}
+17
View File
@@ -50,6 +50,23 @@ router.get(
res
);
}
// Reject path traversal before the path reaches the storage layer. The
// storage backends also validate, but failing fast here keeps a crafted
// "../" URL from being treated as a real lookup (CWE-22/25).
if (
anonymizedPath
.split(/[\\/]/)
.some((segment) => segment === "..") ||
/^[\\/]/.test(anonymizedPath)
) {
return handleError(
new AnonymousError("invalid_path", {
httpStatus: 400,
object: anonymizedPath,
}),
res
);
}
const repo = await getRepo(req, res, {
nocheck: false,
+9
View File
@@ -85,6 +85,15 @@ async function webView(req: express.Request, res: express.Response) {
const filePath = req.path.substring(
indexRepoId + req.params.repoId.length + 1
);
// Reject traversal in the URL-derived segment before joining it onto the
// page-source root. Stripping a single leading "/" or "." is not enough
// to stop "../../" sequences from climbing out of the repo (CWE-22).
if (filePath.split(/[\\/]/).some((segment) => segment === "..")) {
throw new AnonymousError("invalid_path", {
httpStatus: 400,
object: filePath,
});
}
let requestPath = path.join(wRoot, filePath);
if (requestPath.at(0) == "/" || requestPath.at(0) == ".") {
requestPath = requestPath.substring(1);
+7
View File
@@ -13,7 +13,11 @@ export const router = express.Router();
router.post(
"/download",
async (req: express.Request, res: express.Response) => {
req.body = req.body || {};
const token: string = req.body.token;
if (typeof req.body.repoFullName !== "string" || typeof token !== "string") {
return res.status(400).json({ error: "invalid_request" });
}
const repoFullName = req.body.repoFullName.split("/");
const repoId = req.body.repoId;
const commit = req.body.commit;
@@ -41,6 +45,9 @@ router.post(
router.post("/", async (req: express.Request, res: express.Response) => {
req.body = req.body || {};
const token: string = req.body.token;
if (typeof req.body.repoFullName !== "string" || typeof token !== "string") {
return res.status(400).json({ error: "invalid_request" });
}
const repoFullName = req.body.repoFullName.split("/");
const repoId = req.body.repoId;
const fileSha = req.body.sha;