Implement hybrid refusal classifier combining multiple detection methods:
- Add confidence scoring to refusal detection (HybridResult)
- Implement weighted voting with configurable thresholds
- Support require_unanimous mode for strict classification
- Add factory function create_hybrid_classifier for common setup
- Include 32 unit tests with table-driven test patterns
Add LLM-based refusal classifier inspired by Promptmap's dual-LLM
architecture. The controller LLM evaluates whether an attack succeeded
by analyzing the target's response against pass/fail conditions.
- Create LLMRefusalClassifier plugin integrating with existing system
- Support OpenAI and Anthropic providers with lazy initialization
- Add configurable system prompts and pass/fail conditions
- Include 20 unit tests for comprehensive coverage