agentic_security

mirror of https://github.com/msoedov/agentic_security.git synced 2026-06-24 06:09:55 +02:00

Author	SHA1	Message	Date
Alexander Myasoedov	bc7fdd7cfa	fix(pc):	2026-01-28 21:04:29 +02:00
Alexander Myasoedov	b38a27d78c	feat: US-005 - Enhanced Refusal Detection with Hybrid Approach. Implement hybrid refusal classifier combining multiple detection methods: add confidence scoring to refusal detection (HybridResult); implement weighted voting with configurable thresholds; support require_unanimous mode for strict classification; add factory function create_hybrid_classifier for common setup; include 32 unit tests with table-driven test patterns.	2026-01-28 18:52:20 +02:00
Alexander Myasoedov	32f103acbc	feat: US-001 - Dual-LLM Evaluation for Attack Success Detection. Add LLM-based refusal classifier inspired by Promptmap's dual-LLM architecture. The controller LLM evaluates whether an attack succeeded by analyzing the target's response against pass/fail conditions. Changes: create LLMRefusalClassifier plugin integrating with existing system; support OpenAI and Anthropic providers with lazy initialization; add configurable system prompts and pass/fail conditions; include 20 unit tests for comprehensive coverage.	2026-01-28 18:18:09 +02:00
Alexander Myasoedov	ce7636fe9e	feat(restruct tests):	2025-12-26 22:58:21 +02:00
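
The US-005 commit message above mentions weighted voting with a configurable threshold, a confidence score, and a require_unanimous mode. The sketch below illustrates that general idea only; it is not the repository's code. The names VoteResult, marker_detector, and classify, the marker strings, and the default threshold of 0.5 are all hypothetical, and the actual HybridResult and create_hybrid_classifier in the repository may look quite different.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Sequence

# A detector maps a model response to True (refusal) or False (no refusal).
Detector = Callable[[str], bool]


@dataclass(frozen=True)
class VoteResult:
    """Hypothetical result type; not the repository's HybridResult."""
    is_refusal: bool
    confidence: float          # weighted share of detectors voting "refusal"
    votes: tuple[bool, ...]    # one vote per detector, in input order


def marker_detector(markers: Sequence[str]) -> Detector:
    """Build a detector that fires if any marker substring occurs (case-insensitive)."""
    lowered = [m.lower() for m in markers]

    def detect(response: str) -> bool:
        text = response.lower()
        return any(m in text for m in lowered)

    return detect


def classify(
    response: str,
    detectors: Sequence[tuple[Detector, float]],
    threshold: float = 0.5,          # assumed default, for illustration only
    require_unanimous: bool = False,
) -> VoteResult:
    """Combine (detector, weight) pairs by weighted voting."""
    if not detectors:
        raise ValueError("at least one detector is required")
    total = sum(weight for _, weight in detectors)
    if total <= 0:
        raise ValueError("detector weights must sum to a positive number")

    votes = tuple(detect(response) for detect, _ in detectors)
    refusal_weight = sum(w for (_, w), vote in zip(detectors, votes) if vote)
    confidence = refusal_weight / total

    if require_unanimous:
        # Strict mode: every detector must agree that this is a refusal.
        is_refusal = all(votes)
    else:
        is_refusal = confidence >= threshold
    return VoteResult(is_refusal, confidence, votes)


# Usage: two marker-based detectors with different weights.
detectors = [
    (marker_detector(["I cannot", "I can't"]), 2.0),
    (marker_detector(["sorry"]), 1.0),
]
lenient = classify("I cannot help with that.", detectors)
strict = classify("I cannot help with that.", detectors, require_unanimous=True)
```

In this sketch the first detector carries two thirds of the total weight, so its vote alone clears the 0.5 threshold in the default mode, while strict mode rejects the same response because the second detector did not fire.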