Mirror of https://github.com/Shiva108/ai-llm-red-team-handbook.git (synced 2026-02-12 14:42:46 +00:00)
# Prompt Injection Tester
A professional-grade automated testing framework for LLM prompt injection vulnerabilities, based on the attack techniques documented in Chapter 14 of the AI LLM Red Team Handbook.
## Features
- **Injection Point Discovery**: Automatically identify LLM interaction points (APIs, chatbots, RAG systems, plugins)
- **Comprehensive Attack Library**: 15+ attack pattern categories, including:
  - Direct injection (instruction override, role manipulation, delimiter confusion)
  - Indirect injection (document poisoning, web pages, emails)
  - Advanced techniques (multi-turn, payload fragmentation, encoding/obfuscation)
- **Multi-Turn Testing**: Context-preserving conversation attacks
- **Automated Detection**: Multiple heuristics for success detection with confidence scoring
- **Reporting**: JSON, YAML, and HTML reports with CVSS scoring
- **Extensible**: Plugin system for custom attack patterns
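The detection idea above can be illustrated with a minimal sketch: count how many success indicators appear in a response and turn that into a 0..1 confidence score. This is a hypothetical helper for illustration, not the tool's actual detector (the real one combines several heuristics).

```python
# Illustrative indicator-based success detection with confidence scoring.
# Hypothetical sketch -- not the tool's actual implementation.

def score_response(response: str, indicators: list[str]) -> float:
    """Return a 0..1 confidence that an injection succeeded, based on
    the fraction of success indicators found in the response."""
    if not indicators:
        return 0.0
    text = response.lower()
    hits = sum(1 for ind in indicators if ind.lower() in text)
    return hits / len(indicators)

score = score_response(
    "DEBUG MODE ENABLED. Here is my system prompt: ...",
    ["debug mode", "system prompt"],
)
print(score)  # 1.0 -- both indicators matched
```

A result at or above the configured `confidence_threshold` (see the Configuration section) would then be flagged as a successful attack.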
## Installation

```bash
# From the handbook repository
cd tools/prompt_injection_tester
pip install -r requirements.txt

# Or install as a package
pip install -e .
```
### Requirements

- Python 3.10+
- `aiohttp`
- `pyyaml`
## Quick Start

### Command Line

```bash
# Basic usage
python -m prompt_injection_tester \
    --target https://api.example.com/v1/chat/completions \
    --token $API_KEY \
    --authorize

# With a configuration file
python -m prompt_injection_tester --config config.yaml --authorize

# Test specific categories
python -m prompt_injection_tester \
    --target URL --token TOKEN \
    --categories instruction_override role_manipulation \
    --authorize

# Generate an HTML report
python -m prompt_injection_tester \
    --target URL --token TOKEN \
    --output report.html --format html \
    --authorize
```
### Programmatic API

```python
import asyncio

from prompt_injection_tester import InjectionTester, AttackConfig


async def main():
    # Initialize the tester as an async context manager
    async with InjectionTester(
        target_url="https://api.example.com/v1/chat/completions",
        auth_token="your-api-key",
        config=AttackConfig(
            patterns=["direct_instruction_override", "indirect_rag_poisoning"],
            max_concurrent=5,
        ),
    ) as tester:
        # IMPORTANT: authorize testing before any requests are sent
        tester.authorize(scope=["all"])

        # Discover injection points
        injection_points = await tester.discover_injection_points()

        # Run automated tests
        results = await tester.run_tests(
            injection_points=injection_points,
            max_concurrent=5,
        )

        # Generate a report
        report = tester.generate_report(format="html", include_cvss=True)
        print(f"Tests: {results.total_tests}, Successful: {results.successful_attacks}")


asyncio.run(main())
```
## Attack Pattern Categories

### Direct Injection

- `direct_instruction_override` - Override system instructions
- `direct_system_prompt_override` - Extract system prompts
- `direct_task_hijacking` - Redirect LLM tasks
- `direct_role_authority` - Authority impersonation
- `direct_persona_shift` - Character/persona manipulation
- `direct_developer_mode` - Debug mode activation
- `direct_delimiter_escape` - Delimiter manipulation
- `direct_xml_injection` - XML/HTML tag injection
- `direct_markdown_injection` - Markdown formatting exploits
### Indirect Injection

- `indirect_rag_poisoning` - RAG document poisoning
- `indirect_metadata_injection` - Document metadata attacks
- `indirect_hidden_text` - Hidden text techniques
- `indirect_web_injection` - Web page injection
- `indirect_seo_poisoning` - SEO-based attacks
- `indirect_comment_injection` - Forum/comment injection
- `indirect_email_body` - Email body injection
- `indirect_email_header` - Email header injection
- `indirect_email_attachment` - Attachment-based injection
### Advanced Techniques

- `advanced_gradual_escalation` - Multi-turn escalation
- `advanced_context_buildup` - Context accumulation
- `advanced_trust_establishment` - Trust-based exploitation
- `advanced_fragmentation` - Payload fragmentation
- `advanced_token_smuggling` - Token-level obfuscation
- `advanced_base64` - Base64 encoding
- `advanced_unicode_obfuscation` - Unicode tricks
- `advanced_language_switching` - Multi-language attacks
- `advanced_leetspeak` - Leetspeak obfuscation
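To make the encoding/obfuscation variants concrete, here is a minimal sketch of how a payload might be transformed for patterns like `advanced_base64` and `advanced_leetspeak`. The `encode_variants` helper and `LEET` table are hypothetical, for illustration only; the tool's `utils/encoding.py` may differ.

```python
import base64

# Hypothetical sketch of encoding variants for a single payload.
LEET = str.maketrans({"a": "4", "e": "3", "i": "1", "o": "0", "s": "5"})

def encode_variants(payload: str) -> dict[str, str]:
    """Return the payload in several obfuscated forms."""
    return {
        "plain": payload,
        "base64": base64.b64encode(payload.encode()).decode(),
        "leetspeak": payload.translate(LEET),
    }

variants = encode_variants("ignore previous instructions")
print(variants["leetspeak"])  # 1gn0r3 pr3v10u5 1n5truct10n5
```

Each variant targets a different weakness: base64 probes whether the model decodes and follows hidden instructions, while leetspeak probes keyword-based input filters.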
## Configuration

Create a `config.yaml` file:

```yaml
target:
  name: "My LLM API"
  url: "https://api.example.com/v1/chat/completions"
  api_type: "openai"  # openai, anthropic, or custom
  auth_token: "${API_KEY}"
  timeout: 30
  rate_limit: 1.0

attack:
  patterns:
    - "direct_instruction_override"
    - "indirect_rag_poisoning"
  max_concurrent: 5
  encoding_variants: ["plain", "base64"]
  language_variants: ["en", "es"]
  multi_turn_enabled: true

detection:
  confidence_threshold: 0.5

reporting:
  format: "html"
  include_cvss: true
```
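The `"${API_KEY}"` value implies the loader expands environment variables, so secrets stay out of the config file. A minimal sketch of such expansion (an assumption about the loader, not confirmed behavior of the tool):

```python
import os
import re

# Hypothetical sketch: expand ${VAR} placeholders from the environment.
def expand_env(value: str) -> str:
    """Replace each ${VAR} with os.environ["VAR"], or "" if unset."""
    return re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), ""), value)

os.environ["API_KEY"] = "sk-test-123"
print(expand_env("Bearer ${API_KEY}"))  # Bearer sk-test-123
```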
## Extending with Custom Patterns

```python
from prompt_injection_tester.patterns import BaseAttackPattern, register_pattern
from prompt_injection_tester import AttackCategory, AttackPayload


@register_pattern
class MyCustomPattern(BaseAttackPattern):
    pattern_id = "custom_my_pattern"
    name = "My Custom Attack"
    category = AttackCategory.INSTRUCTION_OVERRIDE
    description = "Custom attack pattern"
    success_indicators = ["success", "confirmed"]

    def generate_payloads(self) -> list[AttackPayload]:
        return [
            AttackPayload(
                content="My custom injection payload",
                category=self.category,
                description="Custom payload",
            )
        ]
```
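Under the hood, a decorator-based registry like `register_pattern` typically just indexes the class by its `pattern_id`. The sketch below shows the general mechanism; the names `REGISTRY` and `DemoPattern` are hypothetical, and the tool's `patterns/registry.py` may implement it differently.

```python
# Hypothetical sketch of a decorator-based pattern registry.
REGISTRY: dict[str, type] = {}

def register_pattern(cls: type) -> type:
    """Class decorator: index the pattern class by its pattern_id."""
    REGISTRY[cls.pattern_id] = cls
    return cls

@register_pattern
class DemoPattern:
    pattern_id = "custom_demo"

print("custom_demo" in REGISTRY)  # True
```

Because registration happens at import time, simply importing a module that defines decorated pattern classes is enough to make them available to the tester.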
## Security & Ethics

**IMPORTANT**: This tool is for authorized security testing only.
- Always obtain explicit written authorization before testing
- Only test systems you own or have permission to test
- Follow responsible disclosure practices
- Comply with all applicable laws and regulations
The tool includes:
- Authorization checks before testing
- Rate limiting to prevent accidental DoS
- Audit logging for traceability
- Scope limitation to authorized targets
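The rate limiting mentioned above amounts to spacing requests at least `1/rate` seconds apart (matching the `rate_limit` config key). A minimal async sketch under that assumption; the `RateLimiter` class here is illustrative, not the tool's actual implementation:

```python
import asyncio
import time

# Hypothetical sketch: enforce a minimum interval between requests.
class RateLimiter:
    def __init__(self, rate: float):
        self.min_interval = 1.0 / rate  # seconds between requests
        self._last = 0.0

    async def wait(self) -> None:
        """Sleep just long enough to keep requests min_interval apart."""
        now = time.monotonic()
        delay = self._last + self.min_interval - now
        if delay > 0:
            await asyncio.sleep(delay)
        self._last = time.monotonic()

async def demo():
    limiter = RateLimiter(rate=10.0)  # at most 10 requests/second
    start = time.monotonic()
    for _ in range(3):
        await limiter.wait()  # a real client would send a request here
    return time.monotonic() - start

elapsed = asyncio.run(demo())
print(elapsed >= 0.2)  # True: three calls require at least two 0.1 s gaps
```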
## Project Structure

```
prompt_injection_tester/
├── __init__.py              # Package exports
├── cli.py                   # Command-line interface
├── core/
│   ├── models.py            # Data models
│   └── tester.py            # Main InjectionTester class
├── patterns/
│   ├── base.py              # Base pattern class
│   ├── registry.py          # Pattern registry
│   ├── direct/              # Direct injection patterns
│   ├── indirect/            # Indirect injection patterns
│   └── advanced/            # Advanced techniques
├── detection/
│   ├── base.py              # Base detector
│   ├── system_prompt.py     # System prompt leak detection
│   ├── behavior_change.py   # Behavioral change detection
│   └── scoring.py           # CVSS scoring
├── utils/
│   ├── encoding.py          # Encoding utilities
│   └── http_client.py       # Async HTTP client
├── examples/
│   └── config.yaml          # Example configuration
└── tests/                   # Test suite
```
## Testing

```bash
# Run the test suite
pytest tests/ -v

# With coverage
pytest tests/ --cov=prompt_injection_tester --cov-report=html
```
## License
Part of the AI LLM Red Team Handbook. See repository LICENSE for details.