mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-05 05:05:08 +02:00
feat: Python/async/SSRF patterns in review checklist (#531)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -49,6 +49,13 @@ Be terse. For each issue: one line describing the problem, one line with the fix
|
||||
#### LLM Output Trust Boundary
|
||||
- LLM-generated values (emails, URLs, names) written to DB or passed to mailers without format validation. Add lightweight guards (`EMAIL_REGEXP`, `URI.parse`, `.strip`) before persisting.
|
||||
- Structured tool output (arrays, hashes) accepted without type/shape checks before database writes.
|
||||
- LLM-generated URLs fetched without allowlist — SSRF risk if URL points to internal network (Python: `urllib.parse.urlparse` → check hostname against blocklist before `requests.get`/`httpx.get`)
|
||||
- LLM output stored in knowledge bases or vector DBs without sanitization — stored prompt injection risk
|
||||
|
||||
#### Shell Injection (Python-specific)
|
||||
- `subprocess.run()` / `subprocess.call()` / `subprocess.Popen()` with `shell=True` AND f-string/`.format()` interpolation in the command string — use argument arrays instead
|
||||
- `os.system()` with variable interpolation — replace with `subprocess.run()` using argument arrays
|
||||
- `eval()` / `exec()` on LLM-generated code without sandboxing
|
||||
|
||||
#### Enum & Value Completeness
|
||||
When the diff introduces a new enum value, status string, tier name, or type constant:
|
||||
@@ -59,6 +66,16 @@ To do this: use Grep to find all references to the sibling values (e.g., grep fo
|
||||
|
||||
### Pass 2 — INFORMATIONAL
|
||||
|
||||
#### Async/Sync Mixing (Python-specific)
|
||||
- Synchronous `subprocess.run()`, `open()`, `requests.get()` inside `async def` endpoints — blocks the event loop. Use `asyncio.to_thread()`, `aiofiles`, or `httpx.AsyncClient` instead.
|
||||
- `time.sleep()` inside async functions — use `asyncio.sleep()`
|
||||
- Sync DB calls in async context without `run_in_executor()` wrapping
|
||||
|
||||
#### Column/Field Name Safety
|
||||
- Verify column names in ORM queries (`.select()`, `.eq()`, `.gte()`, `.order()`) against actual DB schema — wrong column names silently return empty results or throw swallowed errors
|
||||
- Check `.get()` calls on query results use the column name that was actually selected
|
||||
- Cross-reference with schema documentation when available
|
||||
|
||||
#### Conditional Side Effects
|
||||
- Code paths that branch on a condition but forget to apply a side effect on one branch. Example: item promoted to verified but URL only attached when a secondary condition is true — the other branch promotes without the URL, creating an inconsistent record.
|
||||
- Log messages that claim an action happened but the action was conditionally skipped. The log should reflect what actually occurred.
|
||||
|
||||
Reference in New Issue
Block a user