From 41799f98913a6004b84bded09aa916a4827a27e0 Mon Sep 17 00:00:00 2001
From: Shadowbroker <43977454+BigBodyCobain@users.noreply.github.com>
Date: Mon, 25 May 2026 04:22:09 -0600
Subject: [PATCH] feat(ci): switch GitLab mirror-to-github job to per-repo SSH deploy key (#331)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* feat(ci): switch mirror-to-github job from PAT to per-repo SSH deploy key

GitHub fine-grained PATs are capped at 366 days; classic PATs would need
'public_repo' (broader scope than needed). Per-repo SSH deploy keys are
tighter:
- Can ONLY push to BigBodyCobain/Shadowbroker (no access to anything
  else, not even other repos owned by the same account).
- Never expire.
- Rotating == one-click delete on github.com/.../settings/keys.

Changes:
- New CI/CD variable GITHUB_MIRROR_SSH_KEY (File, Protected) holding the
  ed25519 private half. Public half lives on the repo's deploy keys
  with write access enabled.
- mirror-to-github before_script writes the key to ~/.ssh/id_ed25519,
  pins github.com host keys (ed25519 + ecdsa + rsa, the RSA key being
  the one from the 2023-03-24 rotation) into ~/.ssh/known_hosts so we
  never trust a MITM, then pushes via git@github.com:... instead of
  HTTPS.
- Job rule now gates on GITHUB_MIRROR_SSH_KEY (the new var) instead of
  GITHUB_MIRROR_TOKEN (which never existed).

After this lands, every commit pushed directly to GitLab main will
mirror back to GitHub main automatically, closing the loop on
bi-directional sync.

Co-Authored-By: Claude Opus 4.7

* fix(secret-scan): exempt SSH known_hosts entries from leaked-key detection

PR #331 introduced github.com host fingerprints pinned in
.gitlab-ci.yml's mirror-to-github before_script. The scanner flagged
them as embedded secrets and blocked CI:

    BLOCKED: Embedded secrets/tokens found in:
      .gitlab-ci.yml
        133: github.com ssh-ed25519 AAAA...
        135: github.com ssh-rsa AAAA...

These are PUBLIC host keys: the whole point of pinning known_hosts is to
publish the fingerprint widely so a MITM is detectable. They are
documented at
https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/githubs-ssh-key-fingerprints
and committing them is the correct, secure practice.

Fix: add a KNOWN_HOSTS_LINE regex to the content-scan block that
recognizes `<host> [salt] <keytype> AAAA...` shape lines (the exact
format used in ~/.ssh/known_hosts) and filters them out before flagging
the file. Bare `ssh-rsa AAAA...` lines without a host prefix are still
caught; only the host-key shape is exempt.

Co-Authored-By: Claude Opus 4.7

---------

Co-authored-by: Claude Opus 4.7
---
 .gitlab-ci.yml                  | 44 ++++++++++++++++++++++++---------
 backend/scripts/scan-secrets.sh | 37 ++++++++++++++++++++-------
 2 files changed, 61 insertions(+), 20 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 131a214..28ab80e 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -20,10 +20,15 @@
 # Auth notes:
 # - The image build/push uses $CI_JOB_TOKEN, which GitLab provides
 #   automatically. No credentials need to be configured.
-# - The reverse mirror requires a GitHub personal access token stored
-#   as the GitLab CI/CD variable GITHUB_MIRROR_TOKEN (Protected + Masked).
-#   Scope: public_repo (or repo for private). If the variable isn't
-#   set the mirror job is skipped — image builds still run.
+# - The reverse mirror authenticates to GitHub via a per-repo SSH
+#   deploy key. The private half is stored as the File-type GitLab
+#   CI/CD variable GITHUB_MIRROR_SSH_KEY (Protected). The matching
+#   public key is added to github.com/BigBodyCobain/Shadowbroker/
+#   settings/keys with write access. This is a tighter-scoped
+#   replacement for a personal access token: it can ONLY push to
+#   Shadowbroker, never expires, and rotating it is a one-click
+#   delete on GitHub's deploy-keys page. If the variable isn't set,
+#   the mirror job is skipped — image builds still run.
 
 stages:
   - build
@@ -101,18 +106,35 @@ build-frontend:
         - .gitlab-ci.yml
 
 # ── Reverse mirror to GitHub ─────────────────────────────────────────────
-# Pushes refs/heads/main to github.com/BigBodyCobain/Shadowbroker.
-# Fast-forward-only — if GitLab main and GitHub main have diverged, this
-# fails loudly rather than silently overwriting either side.
+# Pushes refs/heads/main to github.com/BigBodyCobain/Shadowbroker via SSH
+# using a per-repo deploy key. Fast-forward-only by default — if GitLab
+# main and GitHub main have diverged, the push fails loudly rather than
+# silently overwriting either side.
 #
-# Only runs if GITHUB_MIRROR_TOKEN is set as a CI/CD variable. See the
-# header comment of this file for setup instructions.
+# Only runs if GITHUB_MIRROR_SSH_KEY is set as a File-type CI/CD variable.
+# See the header comment of this file for setup instructions.
 mirror-to-github:
   stage: mirror
   image: alpine:3.20
   needs: []
   before_script:
     - apk add --no-cache git openssh-client ca-certificates
+    - mkdir -p ~/.ssh
+    - chmod 700 ~/.ssh
+    # Install the deploy key. File-type CI variable exposes the path; copy
+    # to ~/.ssh/id_ed25519 with restrictive perms so ssh accepts it.
+    - cp "$GITHUB_MIRROR_SSH_KEY" ~/.ssh/id_ed25519
+    - chmod 600 ~/.ssh/id_ed25519
+    # Pin github.com's current host keys so we never trust a man-in-the-
+    # middle. Sourced from https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/githubs-ssh-key-fingerprints
+    # (rotated 2023-03-24 after the previous RSA key leak).
+    - |
+      cat > ~/.ssh/known_hosts <<'EOF'
+      github.com ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOMqqnkVzrm0SdG6UOoqKLsabgH5C9okWi0dh2l9GKJl
+      github.com ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBEmKSENjQEezOmxkZMy7opKgwFB9nkt5YRrYMjNuG5N87uRgg6CLrbo5wAdT/y6v0mKV0U2w0WZ2YB/++Tpockg=
+      github.com ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCj7ndNxQowgcQnjshcLrqPEiiphnt+VTTvDP6mHBL9j1aNUkY4Ue1gvwnGLVlOhGeYrnZaMgRK6+PKCUXaDbC7qtbW8gIkhL7aGCsOr/C56SJMy/BCZfxd1nWzAOxSDPgVsmerOBYfNqltV9/hWCqBywINIR+5dIg6JTJ72pcEpEjcYgXkE2YEFXV1JHnsKgbLWNlhScqb2UmyRkQyytRLtL+38TGxkxCflmO+5Z8CSSNY7GidjMIZ7Q4zMjA2n1nGrlTDkzwDCsw+wqFPGQA179cnfGWOWRVruj16z6XyvxvjJwbz0wQZ75XK5tKSb7FNyeIEs4TT4jk+S4dhPeAUC5y+bDYirYgM4GC7uEnztnZyaVWQ7B381AK4Qdrwt51ZqExKbQpTUNn+EjqoTwvqNj4kqx5QUCI0ThS/YkOxJCXmPUWZbhjpCg56i+2aB6CmK2JGhn57K5mj0MNdBXA4/WnwH6XoPWJzK5Nyu2zB3nAZp+S5hpQs+p1vN1/wsjk=
+      EOF
+    - chmod 644 ~/.ssh/known_hosts
   script:
     - git config --global user.email "ci-mirror@gitlab.com"
     - git config --global user.name "GitLab CI Mirror"
@@ -123,7 +145,7 @@ mirror-to-github:
     - cd repo
     - >
       git push
-      "https://x-access-token:${GITHUB_MIRROR_TOKEN}@github.com/BigBodyCobain/Shadowbroker.git"
+      "git@github.com:BigBodyCobain/Shadowbroker.git"
       "${CI_COMMIT_SHA}:refs/heads/main"
   rules:
-    - if: $CI_COMMIT_BRANCH == "main" && $GITHUB_MIRROR_TOKEN
+    - if: $CI_COMMIT_BRANCH == "main" && $GITHUB_MIRROR_SSH_KEY
diff --git a/backend/scripts/scan-secrets.sh b/backend/scripts/scan-secrets.sh
index dfd8d38..50910d1 100644
--- a/backend/scripts/scan-secrets.sh
+++ b/backend/scripts/scan-secrets.sh
@@ -92,18 +92,37 @@ SECRET_REGEX+='pypi-[0-9a-zA-Z-]{50,}' # PyPI token
 
 TEXT_FILES=$(grep -ivE '\.(png|jpg|jpeg|gif|ico|svg|woff2?|ttf|eot|pbf|zip|tar|gz|db|sqlite|xlsx|pdf|mp[34]|wav|ogg|webm|webp|avif)$' "$FILELIST" | grep -v 'scan-secrets\.sh$' || true)
 if [[ -n "$TEXT_FILES" ]]; then
+  # Known-public exclusions: lines matching `<host> ssh-<type> <base64>`
+  # are SSH known_hosts entries — the host's PUBLIC fingerprint, which is
+  # by definition safe to commit (the whole point of pinning known_hosts
+  # is to publish the fingerprint widely so MITM is detectable). Filter
+  # these out before flagging the file.
+  KNOWN_HOSTS_LINE='^[[:space:]]*[a-zA-Z0-9._:,*-]+([[:space:]]+[a-zA-Z0-9._:,*-]+)?[[:space:]]+(ssh-rsa|ssh-ed25519|ssh-dss|ecdsa-sha2-nistp256|ecdsa-sha2-nistp384|ecdsa-sha2-nistp521)[[:space:]]+AAAA'
+
   # Use grep with file list, skip missing/binary, limit output
   CONTENT_HITS=$(echo "$TEXT_FILES" | xargs grep -lE "$SECRET_REGEX" 2>/dev/null || true)
   if [[ -n "$CONTENT_HITS" ]]; then
-    echo -e "\n${RED}BLOCKED: Embedded secrets/tokens found in:${NC}"
-    echo "$CONTENT_HITS" | while read -r f; do
-      echo -e " ${RED}$f${NC}"
-      # Show first matching line for context
-      grep -nE "$SECRET_REGEX" "$f" 2>/dev/null | head -2 | while read -r line; do
-        echo -e " ${YELLOW}$line${NC}"
-      done
-    done
-    FOUND=1
+    REAL_HITS=""
+    REAL_REPORT=""
+    while IFS= read -r f; do
+      [[ -z "$f" ]] && continue
+      # Re-grep this file, but filter out known_hosts-style lines.
+      FILE_HITS=$(grep -nE "$SECRET_REGEX" "$f" 2>/dev/null | grep -vE "$KNOWN_HOSTS_LINE" || true)
+      if [[ -n "$FILE_HITS" ]]; then
+        REAL_HITS+="$f"$'\n'
+        REAL_REPORT+=" ${RED}$f${NC}"$'\n'
+        # Show first 2 matching lines for context
+        while IFS= read -r line; do
+          [[ -z "$line" ]] && continue
+          REAL_REPORT+=" ${YELLOW}$line${NC}"$'\n'
+        done < <(echo "$FILE_HITS" | head -2)
+      fi
+    done <<< "$CONTENT_HITS"
+    if [[ -n "$REAL_HITS" ]]; then
+      echo -e "\n${RED}BLOCKED: Embedded secrets/tokens found in:${NC}"
+      echo -en "$REAL_REPORT"
+      FOUND=1
+    fi
   fi
 fi
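
The known_hosts exemption can be tried in isolation. The following is a minimal sketch, not part of the patch: it copies KNOWN_HOSTS_LINE from the scan-secrets.sh hunk and runs it against a few made-up sample lines, including lines that carry the `NN:` prefix added by `grep -n`, since that is the form the scanner filters. The helper names `is_exempt` and `check` and every sample line are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: exercises the exemption regex from the patch on sample input.
# KNOWN_HOSTS_LINE is copied from the patch; is_exempt, check, and the
# sample lines below are assumptions made up for this illustration.
KNOWN_HOSTS_LINE='^[[:space:]]*[a-zA-Z0-9._:,*-]+([[:space:]]+[a-zA-Z0-9._:,*-]+)?[[:space:]]+(ssh-rsa|ssh-ed25519|ssh-dss|ecdsa-sha2-nistp256|ecdsa-sha2-nistp384|ecdsa-sha2-nistp521)[[:space:]]+AAAA'

# Succeeds when the line matches the known_hosts shape, i.e. when the
# scanner's `grep -vE "$KNOWN_HOSTS_LINE"` step would drop it.
is_exempt() {
  printf '%s\n' "$1" | grep -qE "$KNOWN_HOSTS_LINE"
}

# Prints the verdict for one sample line.
check() {
  if is_exempt "$1"; then
    echo "exempt:  $1"
  else
    echo "flagged: $1"
  fi
}

check 'github.com ssh-ed25519 AAAAC3Nza...'              # exempt: host + key type
check 'ssh-rsa AAAAB3Nza...'                             # flagged: no host field
check '133:      github.com ssh-ed25519 AAAAC3Nza...'    # exempt: grep -n form
check '135:ssh-rsa AAAAB3Nza...'                         # flagged: bare key, grep -n form
check '135:  ssh-rsa AAAAB3Nza...'                       # exempt: line number fills the host field
```

Under these assumptions, the last sample suggests that an indented bare key line is also exempted once `grep -n` has prefixed it, because the line number can satisfy the host field. Stripping the `NN:` prefix before applying the filter would be one way to tighten that, if it matters for this repository.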