Previously, calculate_cost() was always called without a model parameter,
causing all scans to report costs based on deepseek-chat pricing regardless
of the actual target model (e.g. gpt-4, claude-3-opus).
Changes:
- http_spec.py: Add 'model_name' property to LLMSpec that extracts the
model field from the JSON request body. Returns 'unknown' if the body
is not valid JSON or has no 'model' field.
- probe_data/image_generator.py: Add 'model_name' pass-through property
to RequestAdapter, delegating to the underlying LLMSpec.
- probe_data/audio_generator.py: Same as above - add 'model_name'
pass-through property to RequestAdapter.
- probe_actor/cost_module.py:
  - Change return type from float to float | None.
  - Unknown models now log a warning and return None instead of raising
    ValueError, so scans are not interrupted by unsupported model names.
  - Add logger import for the warning message.
- probe_actor/fuzzer.py: Pass model_name to calculate_cost() in both
scan_module() and perform_many_shot_scan() using
getattr(request_factory, 'model_name', 'unknown').
- primitives/models.py: Update ScanResult.cost type from float to
float | None to accommodate unknown model pricing.
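
A minimal sketch of the behaviour described above, not the actual code.
The function signatures and the PRICING table are assumptions made for
illustration, and the rates are placeholders rather than real prices.

```python
import json
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Placeholder rates per 1K tokens (NOT real prices); the real table lives
# in probe_actor/cost_module.py.
PRICING = {"deepseek-chat": 0.001, "gpt-4": 0.03}


def extract_model_name(body: str) -> str:
    """Return the 'model' field of a JSON request body, or 'unknown'."""
    try:
        payload = json.loads(body)
    except (json.JSONDecodeError, TypeError):
        return "unknown"
    if not isinstance(payload, dict):
        return "unknown"
    model = payload.get("model")
    return model if isinstance(model, str) and model else "unknown"


def calculate_cost(tokens: int, model: str = "deepseek-chat") -> Optional[float]:
    """Return the estimated cost, or None when the model has no known price."""
    price = PRICING.get(model)
    if price is None:
        # Warn instead of raising, so the scan keeps running.
        logger.warning("No pricing for model %r; cost not reported", model)
        return None
    return tokens / 1000 * price
```

Callers such as ScanResult have to treat None as "cost unavailable"
rather than as zero.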
Remove the Content-Length header from requests before sending them with
httpx. This prevents a LocalProtocolError when placeholder replacement
(e.g. <<PROMPT>>) changes the body size; httpx then calculates the
correct Content-Length from the actual content.
Closes #139
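
A rough illustration of the header fix, assuming headers are held in a
plain dict. strip_content_length is a hypothetical helper name, not
necessarily what the code uses.

```python
def strip_content_length(headers: dict) -> dict:
    """Return a copy of headers without Content-Length (case-insensitive)."""
    return {k: v for k, v in headers.items() if k.lower() != "content-length"}


# The spec carries a Content-Length computed for the template body...
template = '{"prompt": "<<PROMPT>>"}'
headers = {
    "Content-Type": "application/json",
    "Content-Length": str(len(template)),
}

# ...which becomes stale once the placeholder is replaced.
body = template.replace("<<PROMPT>>", "a much longer prompt than the placeholder")
clean_headers = strip_content_length(headers)

# Passing clean_headers and content=body to httpx lets it set the
# Content-Length header from the real body size.
```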
Add export_full_log() method to FuzzerState that exports a comprehensive
log of all events including errors, refusals, and successful outputs.
Previously, only failures were exported. This change addresses issue #100
by creating a complete audit trail in full_scan_log.csv with event type,
module, prompt, status code, content, and refused flag columns.
Co-Authored-By: Nivesh Dandyan <niveshdandyan@gmail.com>
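
A sketch of what such an export might look like, assuming events are
stored as dicts. The column identifiers below are guesses derived from
the column list above, and the real method lives on FuzzerState and
writes to full_scan_log.csv rather than to a passed-in file object.

```python
import csv
import io

# Assumed column identifiers; the commit only names the columns in prose.
FIELDS = ["event_type", "module", "prompt", "status_code", "content", "refused"]


def export_full_log(events, out) -> None:
    """Write every event (errors, refusals, outputs) as one CSV row."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for event in events:
        # Missing keys become empty cells so all rows share one schema.
        writer.writerow({name: event.get(name, "") for name in FIELDS})


buffer = io.StringIO()
export_full_log(
    [
        {"event_type": "error", "module": "demo", "status_code": 500},
        {"event_type": "output", "module": "demo", "prompt": "hi",
         "status_code": 200, "content": "hello", "refused": False},
    ],
    buffer,
)
```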
Implement hybrid refusal classifier combining multiple detection methods:
- Add confidence scoring to refusal detection (HybridResult)
- Implement weighted voting with configurable thresholds
- Support require_unanimous mode for strict classification
- Add factory function create_hybrid_classifier for common setup
- Include 32 unit tests with table-driven test patterns
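
One way weighted voting with a threshold and a unanimous mode could
work, as a sketch only. The HybridResult fields, the weighted_vote
function and its defaults are assumptions; the actual scoring formula
may differ.

```python
from dataclasses import dataclass


@dataclass
class HybridResult:
    is_refusal: bool
    confidence: float  # share of total weight that voted "refusal"


def weighted_vote(votes, threshold=0.5, require_unanimous=False) -> HybridResult:
    """Combine (is_refusal, weight) votes from several detectors."""
    total = sum(weight for _, weight in votes)
    if total <= 0:
        return HybridResult(is_refusal=False, confidence=0.0)
    refusal_weight = sum(weight for flagged, weight in votes if flagged)
    confidence = refusal_weight / total
    if require_unanimous:
        # Strict mode: every detector must agree that it is a refusal.
        is_refusal = all(flagged for flagged, _ in votes)
    else:
        is_refusal = confidence >= threshold
    return HybridResult(is_refusal=is_refusal, confidence=confidence)
```

With votes [(True, 2.0), (False, 1.0)] the default mode classifies the
response as a refusal, while require_unanimous=True does not.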
Create unified provider abstraction layer for direct LLM integrations beyond
HTTP specs, inspired by FuzzyAI's comprehensive provider system.
- Add BaseLLMProvider abstract class with standard interface (generate, chat,
sync_generate, sync_chat methods)
- Implement OpenAIProvider supporting chat completions API
- Implement AnthropicProvider supporting messages API
- Create provider factory for instantiation by name (create_provider,
get_provider_class)
- Add 60 unit tests covering all provider implementations
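
A simplified sketch of the shape of this layer. The method signatures
are assumed, and EchoProvider is a hypothetical stand-in used here so
the example runs without network access; the real providers wrap the
OpenAI and Anthropic APIs.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseLLMProvider(ABC):
    @abstractmethod
    async def generate(self, prompt: str) -> str:
        """Return the model's completion for a single prompt."""

    async def chat(self, messages: list) -> str:
        # Assumption: by default, chat answers the last message's content.
        return await self.generate(messages[-1]["content"])

    def sync_generate(self, prompt: str) -> str:
        # Blocking wrapper; must not be called from a running event loop.
        return asyncio.run(self.generate(prompt))

    def sync_chat(self, messages: list) -> str:
        return asyncio.run(self.chat(messages))


class EchoProvider(BaseLLMProvider):
    """Hypothetical provider that echoes the prompt back."""

    async def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


_PROVIDERS = {"echo": EchoProvider}


def get_provider_class(name: str):
    try:
        return _PROVIDERS[name.lower()]
    except KeyError:
        raise ValueError(f"unknown provider: {name!r}") from None


def create_provider(name: str, **kwargs) -> BaseLLMProvider:
    return get_provider_class(name)(**kwargs)
```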
Implement FuzzNode and FuzzChain classes for multi-step attack chains
with pipe operator syntax, inspired by FuzzyAI architecture.
- FuzzNode: Single LLM call with {var} template substitution
- FuzzChain: Sequential execution passing output as input
- Pipe operator (|) for composing nodes into chains
- LLMProvider protocol for provider abstraction
- 22 unit tests covering composition and execution
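
The composition idea can be sketched as follows. The constructor
arguments, the run() method and the convention that later nodes receive
the previous output as {input} are assumptions for illustration;
UpperProvider is a hypothetical stub standing in for a real LLM.

```python
from typing import Protocol


class LLMProvider(Protocol):
    def generate(self, prompt: str) -> str: ...


class FuzzNode:
    """A single LLM call with {var} template substitution."""

    def __init__(self, provider: LLMProvider, template: str):
        self.provider = provider
        self.template = template

    def run(self, **variables) -> str:
        return self.provider.generate(self.template.format(**variables))

    def __or__(self, other):
        return FuzzChain([self, other])


class FuzzChain:
    """Runs nodes in order, feeding each output to the next as {input}."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def __or__(self, other):
        return FuzzChain(self.nodes + [other])

    def run(self, **variables) -> str:
        output = self.nodes[0].run(**variables)
        for node in self.nodes[1:]:
            output = node.run(input=output)
        return output


class UpperProvider:
    """Hypothetical stub provider: upper-cases the prompt."""

    def generate(self, prompt: str) -> str:
        return prompt.upper()


provider = UpperProvider()
chain = FuzzNode(provider, "say {word}") | FuzzNode(provider, "again: {input}")
result = chain.run(word="hi")
```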
Implement a YAML-based rule system for defining attack patterns and success
conditions, inspired by Promptmap's 50+ YAML rule definitions.
Features:
- AttackRule model with name, type, severity, prompt, pass/fail conditions
- RuleLoader for parsing YAML files with validation
- Support for recursive directory loading and filtering by type/severity
- Template variable substitution in prompts
- Dataset integration for converting rules to ProbeDataset format
- YAMLRulesDatasetLoader for loading rules from multiple directories
Tested with 47 unit tests covering models, loader, and dataset integration.
Successfully loads 69 rules from promptmap research directory.
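
A reduced sketch of the rule model, validation, filtering and template
substitution. It assumes each YAML file has already been parsed into a
dict (for example with yaml.safe_load). The field names
pass_conditions and fail_conditions, the {var} placeholder syntax and
the helper names are assumptions, not the actual schema.

```python
from dataclasses import dataclass, field

REQUIRED = ("name", "type", "severity", "prompt")


@dataclass
class AttackRule:
    name: str
    type: str
    severity: str
    prompt: str
    pass_conditions: list = field(default_factory=list)
    fail_conditions: list = field(default_factory=list)

    def render(self, **variables) -> str:
        """Substitute {var} placeholders; unknown placeholders are left as-is."""
        text = self.prompt
        for key, value in variables.items():
            text = text.replace("{" + key + "}", str(value))
        return text


def rule_from_mapping(data: dict) -> AttackRule:
    """Validate one parsed YAML document and build an AttackRule."""
    missing = [key for key in REQUIRED if not data.get(key)]
    if missing:
        raise ValueError(f"rule is missing required fields: {missing}")
    return AttackRule(
        name=data["name"],
        type=data["type"],
        severity=data["severity"],
        prompt=data["prompt"],
        pass_conditions=list(data.get("pass_conditions", [])),
        fail_conditions=list(data.get("fail_conditions", [])),
    )


def filter_rules(rules, rule_type=None, severity=None):
    """Keep rules matching the given type and/or severity."""
    return [
        rule for rule in rules
        if (rule_type is None or rule.type == rule_type)
        and (severity is None or rule.severity == severity)
    ]
```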
Add LLM-based refusal classifier inspired by Promptmap's dual-LLM
architecture. The controller LLM evaluates whether an attack succeeded
by analyzing the target's response against pass/fail conditions.
- Create LLMRefusalClassifier plugin integrating with existing system
- Support OpenAI and Anthropic providers with lazy initialization
- Add configurable system prompts and pass/fail conditions
- Include 20 unit tests for comprehensive coverage
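
A sketch of the lazy-initialization and prompt-building idea only. The
class name, the controller_factory parameter, the prompt wording and
the "pass"/"fail" reply format are all assumptions; the real plugin is
LLMRefusalClassifier and talks to OpenAI or Anthropic.

```python
DEFAULT_SYSTEM_PROMPT = (
    "You judge whether a response meets the given conditions. "
    "Answer with exactly one word: pass or fail."
)


class ControllerVerdictSketch:
    """Hypothetical stand-in for the controller-LLM classifier."""

    def __init__(self, controller_factory, pass_conditions, fail_conditions,
                 system_prompt=DEFAULT_SYSTEM_PROMPT):
        self._factory = controller_factory
        self._controller = None  # created on first use (lazy initialization)
        self.pass_conditions = list(pass_conditions)
        self.fail_conditions = list(fail_conditions)
        self.system_prompt = system_prompt

    def _get_controller(self):
        if self._controller is None:
            self._controller = self._factory()
        return self._controller

    def build_prompt(self, response: str) -> str:
        lines = ["Pass conditions:"]
        lines += [f"- {c}" for c in self.pass_conditions]
        lines += ["Fail conditions:"]
        lines += [f"- {c}" for c in self.fail_conditions]
        lines += ["Response to evaluate:", response]
        return "\n".join(lines)

    def evaluate(self, response: str) -> str:
        """Return 'pass' or 'fail' as reported by the controller."""
        controller = self._get_controller()
        answer = controller(self.system_prompt, self.build_prompt(response))
        # Assumption: anything that does not start with "pass" counts as "fail".
        return "pass" if answer.strip().lower().startswith("pass") else "fail"
```

The controller here is any callable taking (system_prompt, user_prompt)
and returning text, which keeps the sketch independent of a specific
provider SDK.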