Luong NGUYEN 6d1e0ae4af refactor(ci): shift quality checks to pre-commit, CI as 2nd pass (#34)
* refactor(ci): shift quality checks to pre-commit, CI as 2nd pass

- Remove ci.yml (lint, security, pytest were only for EPUB scripts)
- Move EPUB build to pre-commit local hook (runs on .md changes)
- Add check_cross_references.py, check_mermaid.py, check_links.py scripts
- Add markdown-lint, cross-references, mermaid-syntax, link-check as
  pre-commit hooks — mirrors all 4 CI doc-check jobs locally
- Remove spell check job from docs-check.yml (breaks on translations)
- Refactor docs-check.yml to reuse scripts/ instead of inline Python
- Add .markdownlint.json config shared by pre-commit and CI
- Update CONTRIBUTING.md with required dependencies and hook table

* fix(ci): resolve all CI check failures in docs-check workflow

- fix(check_cross_references): skip code blocks and inline code spans
  to avoid false positives from documentation examples; fix emoji
  heading anchor generation (rstrip not strip); add blog-posts,
  openspec, prompts, .agents to IGNORE_DIRS; ignore README.backup.md
- fix(check_links): strip trailing Markdown punctuation from captured
  URLs; add wikipedia, api.github.com to SKIP_DOMAINS; add placeholder
  URL patterns to SKIP_URL_PATTERNS; add .agents/.claude to IGNORE_DIRS
- fix(check_mermaid): add --no-sandbox puppeteer config support via
  MERMAID_PUPPETEER_NO_SANDBOX env var for GitHub Actions Linux runners
- fix(docs-check.yml): pass MERMAID_PUPPETEER_NO_SANDBOX=true to mermaid job
- fix(content): repair broken anchors in README.md, 09-advanced-features;
  fix #plugins -> #claude-code-plugins in claude_concepts_guide.md;
  remove non-existent ./docs/performance.md placeholder links; fix
  dependabot alerts URL in SECURITY_REPORTING.md; update auto-mode URL
  in resources.md; use placeholder pattern for 07-plugins example URL
- remove README.backup.md (stale file)

* fix(check-scripts): fix strip_code_blocks regex and URL fragment handling

- fix regex in strip_code_blocks to avoid conflicting MULTILINE+DOTALL
  flags that could fail to strip indented code fences; use DOTALL only
- strip URL fragments (#section) before dispatching link checks to avoid
  false-positive 404s on valid URLs with anchor fragments
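
As a rough illustration of the fragment handling described above, the sketch below drops the `#section` part of a URL with the standard library's `urllib.parse.urldefrag` before the URL would be requested. The helper name `strip_fragment` and the sample URLs are hypothetical; the actual code in check_links.py may do this differently.

```python
from urllib.parse import urldefrag


def strip_fragment(url: str) -> str:
    """Return the URL without its #fragment (hypothetical helper, not from check_links.py)."""
    # urldefrag returns a (url, fragment) named tuple; only the URL part is kept.
    return urldefrag(url).url


if __name__ == "__main__":
    print(strip_fragment("https://example.com/docs/page.html#install"))
    print(strip_fragment("https://example.com/a?b=1"))
```

Query strings are left untouched, so only the anchor is removed from the request target.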

* fix(check-scripts): fix anchor stripping, cross-ref enforcement, and mermaid temp file cleanup

- heading_to_anchor: use .strip("-") instead of .rstrip("-") to also strip leading hyphens
  produced by emoji-prefixed headings, preventing false-positive anchor errors
- check_cross_references: always exit with main()'s return code — filesystem checks
  should block pre-commit unconditionally, not silently pass on errors
- check_mermaid: wrap file-processing loop in try/finally so the puppeteer config
  temp file is cleaned up even if an unexpected exception (e.g. UnicodeDecodeError) occurs
- docs-check.yml: remove now-unused CROSS_REF_STRICT env var
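
The difference between `.strip("-")` and `.rstrip("-")` is easiest to see on an emoji-prefixed heading. The sketch below is an assumption about how such a slug function might look (lowercase, drop characters that are not word characters, whitespace, or hyphens, then turn spaces into hyphens); it is not the body of `heading_to_anchor` from check_cross_references.py, and the name `heading_to_anchor_sketch` is made up. Note that the last entry in this commit message switches back to `.rstrip("-")`.

```python
import re


def heading_to_anchor_sketch(heading: str, strip_leading: bool) -> str:
    """Build an anchor slug from a heading (illustrative only, assumed rules)."""
    # Drop anything that is not a word character, whitespace, or hyphen.
    slug = re.sub(r"[^\w\s-]", "", heading.lower())
    # Each space becomes a hyphen, so a removed emoji leaves a leading hyphen.
    slug = slug.replace(" ", "-")
    return slug.strip("-") if strip_leading else slug.rstrip("-")


if __name__ == "__main__":
    print(heading_to_anchor_sketch("🚀 Getting Started", strip_leading=True))
    print(heading_to_anchor_sketch("🚀 Getting Started", strip_leading=False))
```

With `strip_leading=True` the leading hyphen left behind by the emoji is removed; with `strip_leading=False` it is kept and only trailing hyphens are trimmed.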

* fix(scripts): fix anchor stripping and mermaid output path

- Replace .strip('-') with .rstrip('-') in heading_to_anchor() so leading
  hyphens from emoji-prefixed headings are preserved, matching GitHub's
  anchor generation behaviour.
- Use Path.with_suffix('.svg') in check_mermaid.py instead of
  str.replace('.mmd', '.svg') to avoid replacing all occurrences of .mmd
  in the full temp path.
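
A small sketch of why `with_suffix` is the safer choice here: `str.replace` rewrites every `.mmd` in the string, including one that happens to sit in a directory name. The path below is hypothetical, and `PurePosixPath` is used only so the example behaves the same on every platform.

```python
from pathlib import PurePosixPath

# Hypothetical temp path whose directory name also contains ".mmd".
tmp_path = "/tmp/work.mmd.d/diagram.mmd"

# str.replace touches every occurrence, so the directory name changes too.
naive = tmp_path.replace(".mmd", ".svg")

# with_suffix only swaps the suffix of the final path component.
safe = str(PurePosixPath(tmp_path).with_suffix(".svg"))

print(naive)  # /tmp/work.svg.d/diagram.svg
print(safe)   # /tmp/work.mmd.d/diagram.svg
```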
2026-04-02 02:20:45 +02:00


#!/usr/bin/env python3
"""Validate Mermaid diagram syntax in Markdown files using mmdc."""

import json
import os
import re
import shutil
import subprocess  # nosec B404
import sys
import tempfile
from pathlib import Path

IGNORE_DIRS = {".venv", "node_modules", ".git", "blog-posts", ".agents"}


def main() -> int:
    if not shutil.which("mmdc"):
        print(
            "⚠ mmdc not found — skipping Mermaid validation (install @mermaid-js/mermaid-cli)"
        )
        return 0

    errors = []
    checked = 0

    # On GitHub Actions Linux runners, Chrome/Puppeteer requires --no-sandbox.
    # Write a temporary puppeteer config when MERMAID_PUPPETEER_NO_SANDBOX is set.
    puppeteer_config_path = None
    extra_args: list[str] = []
    if os.environ.get("MERMAID_PUPPETEER_NO_SANDBOX") == "true":
        with tempfile.NamedTemporaryFile(
            suffix=".json", mode="w", delete=False
        ) as pcfg:
            json.dump({"args": ["--no-sandbox", "--disable-setuid-sandbox"]}, pcfg)
            puppeteer_config_path = pcfg.name
        extra_args = ["-p", puppeteer_config_path]

    md_files = [
        f
        for f in Path().rglob("*.md")
        if not any(part in IGNORE_DIRS for part in f.parts)
    ]

    try:
        for file_path in md_files:
            content = file_path.read_text()
            blocks = re.findall(r"```mermaid\n(.*?)```", content, re.DOTALL)
            for i, block in enumerate(blocks):
                with tempfile.NamedTemporaryFile(
                    suffix=".mmd", mode="w", delete=False
                ) as tmp:
                    tmp.write(block)
                    tmp_path = tmp.name
                out_path = str(Path(tmp_path).with_suffix(".svg"))
                try:
                    result = subprocess.run(  # nosec B603 B607
                        ["mmdc", "-i", tmp_path, "-o", out_path, *extra_args],
                        capture_output=True,
                        text=True,
                        check=False,
                    )
                    if result.returncode != 0:
                        errors.append(
                            f"{file_path} (block {i + 1}): {result.stderr.strip()}"
                        )
                    else:
                        checked += 1
                finally:
                    Path(tmp_path).unlink(missing_ok=True)
                    Path(out_path).unlink(missing_ok=True)
    finally:
        if puppeteer_config_path:
            Path(puppeteer_config_path).unlink(missing_ok=True)

    print(f"✅ Checked {checked} Mermaid diagram(s)")
    if errors:
        print("\n❌ Mermaid errors:")
        for e in errors:
            print(f" - {e}")
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main())
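
To see what the extraction step in `main()` captures without installing `mmdc`, the sketch below runs the same pattern against a small in-memory Markdown string. The sample document is made up, and the fence marker is built programmatically only so this example can itself sit inside a fenced block.

```python
import re

FENCE = "`" * 3  # three backticks, built in code to avoid nesting literal fences

# Made-up Markdown sample: one mermaid block and one unrelated code block.
sample = (
    "# Title\n\n"
    f"{FENCE}mermaid\ngraph TD\n  A --> B\n{FENCE}\n\n"
    f"{FENCE}python\nprint('not a diagram')\n{FENCE}\n"
)

# Same shape as the pattern in main(): non-greedy capture up to the next fence.
pattern = FENCE + r"mermaid\n(.*?)" + FENCE
blocks = re.findall(pattern, sample, re.DOTALL)

print(len(blocks))
print(blocks[0])
```

Only the mermaid block is captured; the non-greedy `(.*?)` stops at the first closing fence, so the Python block that follows is ignored.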