mirror of
https://github.com/msoedov/agentic_security.git
synced 2026-06-25 14:49:57 +02:00
Compare commits
44 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| e752ebaeeb | |||
| 2549194bd1 | |||
| 4c580ea1b8 | |||
| 705fe21887 | |||
| 6505d29d36 | |||
| 801a330e27 | |||
| 92cabf6483 | |||
| 38f1bd7450 | |||
| ead883eeed | |||
| 5a57b997e5 | |||
| a8516a2da3 | |||
| cb3a9bcbc0 | |||
| 3b2f407f2d | |||
| 4b0ecc70ca | |||
| 59d77904dd | |||
| a8dd608f06 | |||
| f8102d1ee9 | |||
| ad6e0dbbc8 | |||
| 6a8cc9bb14 | |||
| 263a282f47 | |||
| 181e39bcfb | |||
| ec4bb0b086 | |||
| cfd621bd4f | |||
| 072ce574ad | |||
| a63106686f | |||
| 3d14cc3719 | |||
| b152e78de3 | |||
| 7e458dbfc4 | |||
| e12ef2d0db | |||
| ce3686e198 | |||
| c79172b4df | |||
| e26d4ab841 | |||
| a377e82a24 | |||
| 126bf11b63 | |||
| 4b0b6987cb | |||
| 0ce4aac682 | |||
| c15ac38bec | |||
| bf14877ef4 | |||
| b8069b809a | |||
| 5c37e33069 | |||
| 5bb5fafa89 | |||
| be85b21767 | |||
| 7e05716977 | |||
| 518cbf7fc3 |
@@ -1,37 +1,54 @@
|
|||||||
<p align="center">
|
<p align="center">
|
||||||
|
<h1 align="center">Agentic Security</h1>
|
||||||
<h1 align="center">Agentic Security</h1>
|
<p align="center">
|
||||||
|
An open-source vulnerability scanner for Agent Workflows and Large Language Models (LLMs)<br />
|
||||||
<p align="center">
|
Protecting AI systems from jailbreaks, fuzzing, and multimodal attacks.<br />
|
||||||
The open-source Agentic LLM Vulnerability Scanner
|
<a href="https://agentic-security.vercel.app">Explore the docs »</a> ·
|
||||||
<br />
|
<a href="https://github.com/msoedov/agentic_security/issues">Report a Bug »</a>
|
||||||
<br />
|
</p>
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<p align="center">
|
<p align="center">
|
||||||
<a href="https://github.com/msoedov/agentic_security/commits/main">
|
<a href="https://github.com/msoedov/agentic_security/commits/main">
|
||||||
<img alt="GitHub Last Commit" src="https://img.shields.io/github/last-commit/msoedov/agentic_security?style=for-the-badge&logo=git&labelColor=000000&logoColor=FFFFFF&label=Last Commit&color=6A35FF" />
|
<img alt="GitHub Last Commit" src="https://img.shields.io/github/last-commit/msoedov/agentic_security?style=for-the-badge&logo=git&labelColor=000000&color=6A35FF" />
|
||||||
</a>
|
</a>
|
||||||
<a href="https://github.com/msoedov/agentic_security">
|
<a href="https://github.com/msoedov/agentic_security">
|
||||||
<img alt="GitHub Repo Size" src="https://img.shields.io/github/repo-size/msoedov/agentic_security?style=for-the-badge&logo=database&labelColor=000000&logoColor=FFFFFF&label=Repo Size&color=yellow" />
|
<img alt="GitHub Repo Size" src="https://img.shields.io/github/repo-size/msoedov/agentic_security?style=for-the-badge&logo=database&labelColor=000000&color=yellow" />
|
||||||
</a>
|
|
||||||
</a>
|
</a>
|
||||||
<a href="https://github.com/msoedov/agentic_security/blob/master/LICENSE">
|
<a href="https://github.com/msoedov/agentic_security/blob/master/LICENSE">
|
||||||
<img alt="GitHub License" src="https://img.shields.io/github/license/msoedov/agentic_security?style=for-the-badge&logo=codeigniter&labelColor=000000&logoColor=FFFFFF&label=License&color=FFCC19" />
|
<img alt="GitHub License" src="https://img.shields.io/github/license/msoedov/agentic_security?style=for-the-badge&logo=codeigniter&labelColor=000000&color=FFCC19" />
|
||||||
|
</a>
|
||||||
|
<a href="https://pypi.org/project/agentic-security/">
|
||||||
|
<img alt="PyPI Version" src="https://img.shields.io/pypi/v/agentic-security?style=for-the-badge&logo=pypi&labelColor=000000&color=00CCFF" />
|
||||||
|
</a>
|
||||||
|
<a href="https://discord.gg/stw3DfZQ">
|
||||||
|
<img alt="Join Discord" src="https://img.shields.io/badge/Discord-Join%20Us-black?style=for-the-badge&logo=discord&labelColor=000000&color=DD55FF" />
|
||||||
</a>
|
</a>
|
||||||
<a href="https://discord.gg/stw3DfZQ"><img alt="Join the community" src="https://img.shields.io/badge/Join%20the%20community-black.svg?style=for-the-badge&logo=lightning&labelColor=000000&logoColor=FFFFFF&label=&color=DD55FF&logoWidth=20" /></a>
|
|
||||||
|
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
|
|
||||||
## Features
|
## Features
|
||||||
|
|
||||||
- Multi modal attacks and vulnerability scanners🛠️
|
|
||||||
- Multi-Step/multi-round Jailbreaks 🌀
|
|
||||||
- Comprehensive fuzzing for any LLMs 🧪
|
|
||||||
- LLM API integration and stress testing 🛠️
|
|
||||||
- RL based attacks 📡
|
|
||||||
|
|
||||||
Note: Please be aware that Agentic Security is designed as a safety scanner tool and not a foolproof solution. It cannot guarantee complete protection against all possible threats.
|
Agentic Security equips you with powerful tools to safeguard LLMs against emerging threats. Here's what you can do:
|
||||||
|
|
||||||
|
- **Multimodal Attacks** 🖼️🎙️
|
||||||
|
Probe vulnerabilities across text, images, and audio inputs to ensure your LLM is robust against diverse threats.
|
||||||
|
|
||||||
|
- **Multi-Step Jailbreaks** 🌀
|
||||||
|
Simulate sophisticated, iterative attack sequences to uncover weaknesses in LLM safety mechanisms.
|
||||||
|
|
||||||
|
- **Comprehensive Fuzzing** 🧪
|
||||||
|
Stress-test any LLM with randomized inputs to identify edge cases and unexpected behaviors.
|
||||||
|
|
||||||
|
- **API Integration & Stress Testing** 🌐
|
||||||
|
Seamlessly connect to LLM APIs and push their limits with high-volume, real-world attack scenarios.
|
||||||
|
|
||||||
|
- **RL-Based Attacks** 📡
|
||||||
|
Leverage reinforcement learning to craft adaptive, intelligent probes that evolve with your model’s defenses.
|
||||||
|
|
||||||
|
> **Why It Matters**: These features help developers, researchers, and security teams proactively identify and mitigate risks in AI systems, ensuring safer and more reliable deployments.
|
||||||
|
|
||||||
|
|
||||||
## 📦 Installation
|
## 📦 Installation
|
||||||
|
|
||||||
@@ -111,7 +128,7 @@ Init config
|
|||||||
```shell
|
```shell
|
||||||
agentic_security init
|
agentic_security init
|
||||||
|
|
||||||
2025-01-08 20:12:02.449 | INFO | agentic_security.lib:generate_default_cfg:324 - Default configuration generated successfully to agesec.toml.
|
2025-01-08 20:12:02.449 | INFO | agentic_security.lib:generate_default_settings:324 - Default configuration generated successfully to agesec.toml.
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|||||||
@@ -6,6 +6,7 @@ import uvicorn
|
|||||||
|
|
||||||
from agentic_security.app import app
|
from agentic_security.app import app
|
||||||
from agentic_security.lib import AgenticSecurity
|
from agentic_security.lib import AgenticSecurity
|
||||||
|
from agentic_security.misc.banner import init_banner
|
||||||
|
|
||||||
|
|
||||||
class CLI:
|
class CLI:
|
||||||
@@ -38,7 +39,7 @@ class CLI:
|
|||||||
Generate the default CI configuration file.
|
Generate the default CI configuration file.
|
||||||
"""
|
"""
|
||||||
sys.path.append(os.path.dirname("."))
|
sys.path.append(os.path.dirname("."))
|
||||||
AgenticSecurity().generate_default_cfg(host, port)
|
AgenticSecurity().generate_default_settings(host, port)
|
||||||
|
|
||||||
i = init
|
i = init
|
||||||
|
|
||||||
@@ -61,4 +62,5 @@ def main():
|
|||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
|
init_banner()
|
||||||
main()
|
main()
|
||||||
|
|||||||
@@ -0,0 +1,256 @@
|
|||||||
|
import asyncio
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
import httpx
|
||||||
|
from crewai import Agent, Crew, Task
|
||||||
|
from crewai_tools import tool
|
||||||
|
from pydantic import BaseModel, ConfigDict, Field
|
||||||
|
|
||||||
|
# Assuming LLMSpec is defined elsewhere; placeholder import
|
||||||
|
from agentic_security.http_spec import LLMSpec
|
||||||
|
|
||||||
|
LLM_SPECS = [] # Populate with LLM spec strings if needed
|
||||||
|
|
||||||
|
# Configure logging
|
||||||
|
logging.basicConfig(level=logging.INFO)
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
# Define AgentSpecification model
|
||||||
|
class AgentSpecification(BaseModel):
|
||||||
|
name: str | None = Field(None, description="Name of the LLM/agent")
|
||||||
|
version: str | None = Field(None, description="Version of the LLM/agent")
|
||||||
|
description: str | None = Field(None, description="Description of the LLM/agent")
|
||||||
|
capabilities: list[str] | None = Field(None, description="List of capabilities")
|
||||||
|
configuration: dict[str, Any] | None = Field(
|
||||||
|
None, description="Configuration settings"
|
||||||
|
)
|
||||||
|
endpoint: str | None = Field(None, description="Endpoint URL of the deployed agent")
|
||||||
|
|
||||||
|
model_config = ConfigDict(arbitrary_types_allowed=True)
|
||||||
|
|
||||||
|
|
||||||
|
# Define OperatorToolBox class (unchanged from original)
|
||||||
|
class OperatorToolBox:
|
||||||
|
def __init__(self, spec: AgentSpecification, datasets: list[dict[str, Any]]):
|
||||||
|
self.spec = spec
|
||||||
|
self.datasets = datasets
|
||||||
|
self.failures = []
|
||||||
|
self.llm_specs = [LLMSpec.from_string(spec) for spec in LLM_SPECS]
|
||||||
|
|
||||||
|
def get_spec(self) -> AgentSpecification:
|
||||||
|
return self.spec
|
||||||
|
|
||||||
|
def get_datasets(self) -> list[dict[str, Any]]:
|
||||||
|
return self.datasets
|
||||||
|
|
||||||
|
def validate(self) -> bool:
|
||||||
|
if not self.spec.name or not self.spec.version:
|
||||||
|
self.failures.append("Invalid specification: Name or version is missing.")
|
||||||
|
return False
|
||||||
|
if not self.datasets:
|
||||||
|
self.failures.append("No datasets provided.")
|
||||||
|
return False
|
||||||
|
return True
|
||||||
|
|
||||||
|
def stop(self) -> None:
|
||||||
|
logger.info("Stopping the toolbox...")
|
||||||
|
|
||||||
|
def run(self) -> None:
|
||||||
|
logger.info("Running the toolbox...")
|
||||||
|
|
||||||
|
def get_results(self) -> list[dict[str, Any]]:
|
||||||
|
return self.datasets
|
||||||
|
|
||||||
|
def get_failures(self) -> list[str]:
|
||||||
|
return self.failures
|
||||||
|
|
||||||
|
def run_operation(self, operation: str) -> str:
|
||||||
|
if operation not in ["dataset1", "dataset2", "dataset3"]:
|
||||||
|
self.failures.append(f"Operation '{operation}' failed: Dataset not found.")
|
||||||
|
return f"Operation '{operation}' failed: Dataset not found."
|
||||||
|
return f"Operation '{operation}' executed successfully."
|
||||||
|
|
||||||
|
async def test_llm_spec(self, llm_spec: LLMSpec, user_prompt: str) -> str:
|
||||||
|
try:
|
||||||
|
response = await llm_spec.verify()
|
||||||
|
response.raise_for_status()
|
||||||
|
logger.info(f"Verification succeeded for {llm_spec.url}")
|
||||||
|
|
||||||
|
test_response = await llm_spec.probe(user_prompt)
|
||||||
|
test_response.raise_for_status()
|
||||||
|
response_data = test_response.json()
|
||||||
|
return f"Test succeeded for {llm_spec.url}: {response_data}"
|
||||||
|
except httpx.HTTPStatusError as e:
|
||||||
|
self.failures.append(f"HTTP error occurred: {e}")
|
||||||
|
logger.error(f"Test failed for {llm_spec.url}: {e}")
|
||||||
|
return f"Test failed for {llm_spec.url}: {e}"
|
||||||
|
except Exception as e:
|
||||||
|
self.failures.append(f"An error occurred: {e}")
|
||||||
|
logger.error(f"Test failed for {llm_spec.url}: {e}")
|
||||||
|
return f"Test failed for {llm_spec.url}: {e}"
|
||||||
|
|
||||||
|
async def test_with_prompt(self, spec_index: int, user_prompt: str) -> str:
|
||||||
|
if not 0 <= spec_index < len(self.llm_specs):
|
||||||
|
return f"Invalid spec index: {spec_index}. Valid range is 0 to {len(self.llm_specs) - 1}"
|
||||||
|
llm_spec = self.llm_specs[spec_index]
|
||||||
|
return await self.test_llm_spec(llm_spec, user_prompt)
|
||||||
|
|
||||||
|
|
||||||
|
# Define CrewAI Tools
|
||||||
|
@tool("validate_toolbox")
|
||||||
|
def validate_toolbox(toolbox: OperatorToolBox) -> str:
|
||||||
|
"""Validate the toolbox configuration."""
|
||||||
|
is_valid = toolbox.validate()
|
||||||
|
return (
|
||||||
|
"ToolBox validation successful." if is_valid else "ToolBox validation failed."
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@tool("execute_operation")
|
||||||
|
def execute_operation(toolbox: OperatorToolBox, operation: str) -> str:
|
||||||
|
"""Execute a dataset operation."""
|
||||||
|
return toolbox.run_operation(operation)
|
||||||
|
|
||||||
|
|
||||||
|
@tool("retrieve_results")
|
||||||
|
def retrieve_results(toolbox: OperatorToolBox) -> str:
|
||||||
|
"""Retrieve the results of operations."""
|
||||||
|
results = toolbox.get_results()
|
||||||
|
return (
|
||||||
|
f"Operation Results:\n{results}"
|
||||||
|
if results
|
||||||
|
else "No operations have been executed yet."
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@tool("retrieve_failures")
|
||||||
|
def retrieve_failures(toolbox: OperatorToolBox) -> str:
|
||||||
|
"""Retrieve recorded failures."""
|
||||||
|
failures = toolbox.get_failures()
|
||||||
|
return f"Failures:\n{failures}" if failures else "No failures recorded."
|
||||||
|
|
||||||
|
|
||||||
|
@tool("list_llm_specs")
|
||||||
|
def list_llm_specs(toolbox: OperatorToolBox) -> str:
|
||||||
|
"""List available LLM specifications."""
|
||||||
|
spec_list = "\n".join(
|
||||||
|
f"{i}: {spec.url}" for i, spec in enumerate(toolbox.llm_specs)
|
||||||
|
)
|
||||||
|
return f"Available LLM Specs:\n{spec_list}"
|
||||||
|
|
||||||
|
|
||||||
|
@tool("test_llm_with_prompt")
|
||||||
|
async def test_llm_with_prompt(
|
||||||
|
toolbox: OperatorToolBox, spec_index: int, user_prompt: str
|
||||||
|
) -> str:
|
||||||
|
"""Test an LLM spec with a user prompt."""
|
||||||
|
return await toolbox.test_with_prompt(spec_index, user_prompt)
|
||||||
|
|
||||||
|
|
||||||
|
# Setup OperatorToolBox
|
||||||
|
spec = AgentSpecification(
|
||||||
|
name="DeepSeek Chat",
|
||||||
|
version="1.0",
|
||||||
|
description="A powerful language model",
|
||||||
|
capabilities=["text-generation", "question-answering"],
|
||||||
|
configuration={"max_tokens": 100},
|
||||||
|
)
|
||||||
|
toolbox = OperatorToolBox(
|
||||||
|
spec=spec, datasets=[{"id": "dataset1"}, {"id": "dataset2"}, {"id": "dataset3"}]
|
||||||
|
)
|
||||||
|
|
||||||
|
# Define CrewAI Agent
|
||||||
|
dataset_manager_agent = Agent(
|
||||||
|
role="Dataset Manager",
|
||||||
|
goal="Manage and operate the OperatorToolBox to validate configurations, run operations, and test LLMs.",
|
||||||
|
backstory="An expert in dataset management and LLM testing, designed to assist with toolbox operations.",
|
||||||
|
verbose=True,
|
||||||
|
llm="openai", # Using OpenAI-compatible API for DeepSeek; adjust if DeepSeek has a specific ID
|
||||||
|
tools=[
|
||||||
|
validate_toolbox,
|
||||||
|
execute_operation,
|
||||||
|
retrieve_results,
|
||||||
|
retrieve_failures,
|
||||||
|
list_llm_specs,
|
||||||
|
test_llm_with_prompt,
|
||||||
|
],
|
||||||
|
allow_delegation=False, # Single agent, no delegation needed
|
||||||
|
)
|
||||||
|
|
||||||
|
# Define Tasks
|
||||||
|
tasks = [
|
||||||
|
Task(
|
||||||
|
description="Validate the toolbox configuration.",
|
||||||
|
agent=dataset_manager_agent,
|
||||||
|
expected_output="A string indicating whether validation succeeded or failed.",
|
||||||
|
),
|
||||||
|
Task(
|
||||||
|
description="List available LLM specifications.",
|
||||||
|
agent=dataset_manager_agent,
|
||||||
|
expected_output="A string listing available LLM specs.",
|
||||||
|
),
|
||||||
|
Task(
|
||||||
|
description="Guide the user to test an LLM with the prompt: 'Tell me a short story about a robot'. Suggest listing specs first.",
|
||||||
|
agent=dataset_manager_agent,
|
||||||
|
expected_output="A string suggesting the user list specs and proceed with testing.",
|
||||||
|
),
|
||||||
|
]
|
||||||
|
|
||||||
|
# Define Crew
|
||||||
|
crew = Crew(
|
||||||
|
agents=[dataset_manager_agent],
|
||||||
|
tasks=tasks,
|
||||||
|
verbose=2, # Detailed logging
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
# Async wrapper to handle async tools
|
||||||
|
async def run_crew():
|
||||||
|
# Since CrewAI's process() is synchronous but our tool is async, we need to run it in an event loop
|
||||||
|
result = (
|
||||||
|
crew.kickoff()
|
||||||
|
) # Synchronous call; async tools are awaited internally by CrewAI
|
||||||
|
print("\nCrew Results:")
|
||||||
|
for task_result in result:
|
||||||
|
print(f"Task: {task_result.description}")
|
||||||
|
print(f"Output: {task_result.output}\n")
|
||||||
|
|
||||||
|
# Handle user interaction for LLM testing
|
||||||
|
print("Please select a spec index from the listed specs and confirm to proceed.")
|
||||||
|
user_input = (
|
||||||
|
input("Enter spec index and 'yes' to confirm (e.g., '0 yes'): ").strip().split()
|
||||||
|
)
|
||||||
|
if len(user_input) == 2 and user_input[1].lower() == "yes":
|
||||||
|
try:
|
||||||
|
spec_index = int(user_input[0])
|
||||||
|
user_prompt = "Tell me a short story about a robot"
|
||||||
|
# Create a new task for testing
|
||||||
|
test_task = Task(
|
||||||
|
description=f"Test LLM at index {spec_index} with prompt: '{user_prompt}'",
|
||||||
|
agent=dataset_manager_agent,
|
||||||
|
expected_output="A string with the test result from the LLM.",
|
||||||
|
)
|
||||||
|
test_crew = Crew(
|
||||||
|
agents=[dataset_manager_agent], tasks=[test_task], verbose=2
|
||||||
|
)
|
||||||
|
test_result = test_crew.kickoff()
|
||||||
|
print(f"Test Output: {test_result[0].output}\n")
|
||||||
|
except ValueError:
|
||||||
|
print("Invalid spec index provided.\n")
|
||||||
|
else:
|
||||||
|
print("Test canceled. Please provide a valid index and confirmation.\n")
|
||||||
|
|
||||||
|
|
||||||
|
# Ensure DeepSeek API key is set
|
||||||
|
os.environ["OPENAI_API_KEY"] = os.environ.get(
|
||||||
|
"DEEPSEEK_API_KEY", ""
|
||||||
|
) # CrewAI uses OPENAI_API_KEY
|
||||||
|
os.environ[
|
||||||
|
"OPENAI_MODEL_NAME"
|
||||||
|
] = "deepseek:chat" # Specify DeepSeek model (adjust if needed)
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
asyncio.run(run_crew())
|
||||||
@@ -0,0 +1,238 @@
|
|||||||
|
import asyncio
|
||||||
|
import logging
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
import httpx
|
||||||
|
from pydantic import BaseModel, ConfigDict, Field
|
||||||
|
from pydantic_ai import Agent, RunContext, Tool
|
||||||
|
|
||||||
|
# Assuming LLMSpec is defined elsewhere; placeholder import
|
||||||
|
from agentic_security.http_spec import LLMSpec
|
||||||
|
|
||||||
|
LLM_SPECS = [] # Populate this list with LLM spec strings if needed
|
||||||
|
|
||||||
|
# Configure logging
|
||||||
|
logging.basicConfig(level=logging.INFO)
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
# Define AgentSpecification model
|
||||||
|
class AgentSpecification(BaseModel):
|
||||||
|
name: str | None = Field(None, description="Name of the LLM/agent")
|
||||||
|
version: str | None = Field(None, description="Version of the LLM/agent")
|
||||||
|
description: str | None = Field(None, description="Description of the LLM/agent")
|
||||||
|
capabilities: list[str] | None = Field(None, description="List of capabilities")
|
||||||
|
configuration: dict[str, Any] | None = Field(
|
||||||
|
None, description="Configuration settings"
|
||||||
|
)
|
||||||
|
endpoint: str | None = Field(None, description="Endpoint URL of the deployed agent")
|
||||||
|
|
||||||
|
model_config = ConfigDict(arbitrary_types_allowed=True)
|
||||||
|
|
||||||
|
|
||||||
|
# Define OperatorToolBox class
|
||||||
|
class OperatorToolBox:
|
||||||
|
def __init__(self, spec: AgentSpecification, datasets: list[dict[str, Any]]):
|
||||||
|
self.spec = spec
|
||||||
|
self.datasets = datasets
|
||||||
|
self.failures = []
|
||||||
|
self.llm_specs = [LLMSpec.from_string(spec) for spec in LLM_SPECS]
|
||||||
|
|
||||||
|
def get_spec(self) -> AgentSpecification:
|
||||||
|
return self.spec
|
||||||
|
|
||||||
|
def get_datasets(self) -> list[dict[str, Any]]:
|
||||||
|
return self.datasets
|
||||||
|
|
||||||
|
def validate(self) -> bool:
|
||||||
|
if not self.spec.name or not self.spec.version:
|
||||||
|
self.failures.append("Invalid specification: Name or version is missing.")
|
||||||
|
return False
|
||||||
|
if not self.datasets:
|
||||||
|
self.failures.append("No datasets provided.")
|
||||||
|
return False
|
||||||
|
return True
|
||||||
|
|
||||||
|
def stop(self) -> None:
|
||||||
|
logger.info("Stopping the toolbox...")
|
||||||
|
|
||||||
|
def run(self) -> None:
|
||||||
|
logger.info("Running the toolbox...")
|
||||||
|
|
||||||
|
def get_results(self) -> list[dict[str, Any]]:
|
||||||
|
return self.datasets
|
||||||
|
|
||||||
|
def get_failures(self) -> list[str]:
|
||||||
|
return self.failures
|
||||||
|
|
||||||
|
def run_operation(self, operation: str) -> str:
|
||||||
|
if operation not in ["dataset1", "dataset2", "dataset3"]:
|
||||||
|
self.failures.append(f"Operation '{operation}' failed: Dataset not found.")
|
||||||
|
return f"Operation '{operation}' failed: Dataset not found."
|
||||||
|
return f"Operation '{operation}' executed successfully."
|
||||||
|
|
||||||
|
async def test_llm_spec(self, llm_spec: LLMSpec, user_prompt: str) -> str:
|
||||||
|
try:
|
||||||
|
response = await llm_spec.verify()
|
||||||
|
response.raise_for_status()
|
||||||
|
logger.info(f"Verification succeeded for {llm_spec.url}")
|
||||||
|
|
||||||
|
test_response = await llm_spec.probe(user_prompt)
|
||||||
|
test_response.raise_for_status()
|
||||||
|
response_data = test_response.json()
|
||||||
|
return f"Test succeeded for {llm_spec.url}: {response_data}"
|
||||||
|
except httpx.HTTPStatusError as e:
|
||||||
|
self.failures.append(f"HTTP error occurred: {e}")
|
||||||
|
logger.error(f"Test failed for {llm_spec.url}: {e}")
|
||||||
|
return f"Test failed for {llm_spec.url}: {e}"
|
||||||
|
except Exception as e:
|
||||||
|
self.failures.append(f"An error occurred: {e}")
|
||||||
|
logger.error(f"Test failed for {llm_spec.url}: {e}")
|
||||||
|
return f"Test failed for {llm_spec.url}: {e}"
|
||||||
|
|
||||||
|
async def test_with_prompt(self, spec_index: int, user_prompt: str) -> str:
|
||||||
|
if not 0 <= spec_index < len(self.llm_specs):
|
||||||
|
return f"Invalid spec index: {spec_index}. Valid range is 0 to {len(self.llm_specs) - 1}"
|
||||||
|
llm_spec = self.llm_specs[spec_index]
|
||||||
|
return await self.test_llm_spec(llm_spec, user_prompt)
|
||||||
|
|
||||||
|
|
||||||
|
# Define the Agent
|
||||||
|
class DatasetManagerAgent(Agent):
|
||||||
|
model: str = "deepseek:chat"
|
||||||
|
system_prompt: str = (
|
||||||
|
"You are an AI agent managing an OperatorToolBox. You can validate the toolbox, run operations, "
|
||||||
|
"retrieve results or failures, list LLM specs, and test LLM specs with user prompts. "
|
||||||
|
"Use the provided tools to assist the user based on their request."
|
||||||
|
)
|
||||||
|
|
||||||
|
def __init__(self, toolbox: OperatorToolBox, **kwargs):
|
||||||
|
super().__init__(**kwargs)
|
||||||
|
self.toolbox = toolbox
|
||||||
|
|
||||||
|
# Define async tools within __init__
|
||||||
|
async def validate_toolbox(ctx: RunContext[Any]) -> str:
|
||||||
|
is_valid = self.toolbox.validate()
|
||||||
|
return (
|
||||||
|
"ToolBox validation successful."
|
||||||
|
if is_valid
|
||||||
|
else "ToolBox validation failed."
|
||||||
|
)
|
||||||
|
|
||||||
|
async def execute_operation(ctx: RunContext[Any], operation: str) -> str:
|
||||||
|
return self.toolbox.run_operation(operation)
|
||||||
|
|
||||||
|
async def retrieve_results(ctx: RunContext[Any]) -> str:
|
||||||
|
results = self.toolbox.get_results()
|
||||||
|
return (
|
||||||
|
f"Operation Results:\n{results}"
|
||||||
|
if results
|
||||||
|
else "No operations have been executed yet."
|
||||||
|
)
|
||||||
|
|
||||||
|
async def retrieve_failures(ctx: RunContext[Any]) -> str:
|
||||||
|
failures = self.toolbox.get_failures()
|
||||||
|
return f"Failures:\n{failures}" if failures else "No failures recorded."
|
||||||
|
|
||||||
|
async def list_llm_specs(ctx: RunContext[Any]) -> str:
|
||||||
|
spec_list = "\n".join(
|
||||||
|
f"{i}: {spec.url}" for i, spec in enumerate(self.toolbox.llm_specs)
|
||||||
|
)
|
||||||
|
return f"Available LLM Specs:\n{spec_list}"
|
||||||
|
|
||||||
|
async def test_llm_with_prompt(
|
||||||
|
ctx: RunContext[Any], spec_index: int, user_prompt: str
|
||||||
|
) -> str:
|
||||||
|
return await self.toolbox.test_with_prompt(spec_index, user_prompt)
|
||||||
|
|
||||||
|
# Register tools
|
||||||
|
self.tools = [
|
||||||
|
Tool(
|
||||||
|
name="validate_toolbox",
|
||||||
|
description="Validate the toolbox configuration.",
|
||||||
|
function=validate_toolbox,
|
||||||
|
),
|
||||||
|
Tool(
|
||||||
|
name="execute_operation",
|
||||||
|
description="Execute a dataset operation.",
|
||||||
|
function=execute_operation,
|
||||||
|
),
|
||||||
|
Tool(
|
||||||
|
name="retrieve_results",
|
||||||
|
description="Retrieve the results of operations.",
|
||||||
|
function=retrieve_results,
|
||||||
|
),
|
||||||
|
Tool(
|
||||||
|
name="retrieve_failures",
|
||||||
|
description="Retrieve recorded failures.",
|
||||||
|
function=retrieve_failures,
|
||||||
|
),
|
||||||
|
Tool(
|
||||||
|
name="list_llm_specs",
|
||||||
|
description="List available LLM specifications.",
|
||||||
|
function=list_llm_specs,
|
||||||
|
),
|
||||||
|
Tool(
|
||||||
|
name="test_llm_with_prompt",
|
||||||
|
description="Test an LLM spec with a user prompt.",
|
||||||
|
function=test_llm_with_prompt,
|
||||||
|
),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
# Setup and run example
|
||||||
|
async def run_dataset_manager_agent_async():
|
||||||
|
# Initialize OperatorToolBox with AgentSpecification
|
||||||
|
spec = AgentSpecification(
|
||||||
|
name="DeepSeek Chat",
|
||||||
|
version="1.0",
|
||||||
|
description="A powerful language model",
|
||||||
|
capabilities=["text-generation", "question-answering"],
|
||||||
|
configuration={"max_tokens": 100},
|
||||||
|
)
|
||||||
|
toolbox = OperatorToolBox(
|
||||||
|
spec=spec, datasets=[{"id": "dataset1"}, {"id": "dataset2"}, {"id": "dataset3"}]
|
||||||
|
)
|
||||||
|
|
||||||
|
# Create the agent
|
||||||
|
agent = DatasetManagerAgent(toolbox=toolbox)
|
||||||
|
|
||||||
|
# Example prompts
|
||||||
|
prompts = [
|
||||||
|
"Validate the toolbox.",
|
||||||
|
"List available LLM specs.",
|
||||||
|
"I want to test an LLM with my prompt: 'Tell me a short story about a robot'. Which spec index should I use?",
|
||||||
|
]
|
||||||
|
|
||||||
|
for prompt in prompts:
|
||||||
|
result = await agent.run(prompt)
|
||||||
|
print(f"Prompt: {prompt}")
|
||||||
|
print(f"Response: {result}\n")
|
||||||
|
|
||||||
|
# Handle testing request
|
||||||
|
if "test an LLM with my prompt" in prompt:
|
||||||
|
print(
|
||||||
|
"Please select a spec index from the list above and confirm to proceed."
|
||||||
|
)
|
||||||
|
# Simulate user input (replace with real input in practice)
|
||||||
|
user_input = (
|
||||||
|
input("Enter spec index and 'yes' to confirm (e.g., '0 yes'): ")
|
||||||
|
.strip()
|
||||||
|
.split()
|
||||||
|
)
|
||||||
|
if len(user_input) == 2 and user_input[1].lower() == "yes":
|
||||||
|
try:
|
||||||
|
spec_index = int(user_input[0])
|
||||||
|
user_prompt = prompt.split("my prompt: ")[1].strip("'")
|
||||||
|
test_result = await agent.run(
|
||||||
|
f"Test LLM at index {spec_index} with prompt: {user_prompt}"
|
||||||
|
)
|
||||||
|
print(f"Test Response: {test_result}\n")
|
||||||
|
except ValueError:
|
||||||
|
print("Invalid spec index provided.\n")
|
||||||
|
else:
|
||||||
|
print("Test canceled. Please provide a valid index and confirmation.\n")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
asyncio.run(run_dataset_manager_agent_async())
|
||||||
@@ -1,16 +1,38 @@
|
|||||||
|
from functools import lru_cache
|
||||||
|
|
||||||
import tomli
|
import tomli
|
||||||
from loguru import logger
|
from loguru import logger
|
||||||
|
|
||||||
|
SETTINGS_VERSION = 1
|
||||||
|
|
||||||
class CfgMixin:
|
|
||||||
|
@lru_cache(maxsize=1)
|
||||||
|
def settings_var(name: str, default=None):
|
||||||
|
return get_or_create_config().get_config_value(name, default)
|
||||||
|
|
||||||
|
|
||||||
|
@lru_cache(maxsize=1)
|
||||||
|
def get_or_create_config():
|
||||||
|
cfg = SettingsMixin()
|
||||||
|
cfg.get_or_create_config()
|
||||||
|
return cfg
|
||||||
|
|
||||||
|
|
||||||
|
class SettingsMixin:
|
||||||
config = {}
|
config = {}
|
||||||
default_path = "agentic_security.toml"
|
default_path = "agentic_security.toml"
|
||||||
|
|
||||||
def get_or_create_config(self) -> bool:
|
def get_or_create_config(self) -> bool:
|
||||||
if not self.has_local_config():
|
if not self.has_local_config():
|
||||||
self.generate_default_cfg()
|
self.generate_default_settings()
|
||||||
return False
|
return False
|
||||||
self.load_config(self.default_path)
|
self.load_config(self.default_path)
|
||||||
|
settings_version = self.get_config_value("general.version")
|
||||||
|
if settings_version and settings_version != SETTINGS_VERSION:
|
||||||
|
logger.error(
|
||||||
|
f"Configuration version mismatch: expected {SETTINGS_VERSION}, got {settings_version}."
|
||||||
|
)
|
||||||
|
return False
|
||||||
return True
|
return True
|
||||||
|
|
||||||
def has_local_config(self):
|
def has_local_config(self):
|
||||||
@@ -64,7 +86,7 @@ class CfgMixin:
|
|||||||
return default
|
return default
|
||||||
return value
|
return value
|
||||||
|
|
||||||
def generate_default_cfg(self, host: str = "0.0.0.0", port: int = 8718):
|
def generate_default_settings(self, host: str = "0.0.0.0", port: int = 8718):
|
||||||
# Accept host / port as parameters
|
# Accept host / port as parameters
|
||||||
with open(self.default_path, "w") as f:
|
with open(self.default_path, "w") as f:
|
||||||
f.write(
|
f.write(
|
||||||
@@ -84,6 +106,7 @@ maxBudget = 1000000 # Maximum budget for the scan
|
|||||||
max_th = 0.3 # Maximum failure threshold (percentage)
|
max_th = 0.3 # Maximum failure threshold (percentage)
|
||||||
optimize = false # Enable optimization during scanning
|
optimize = false # Enable optimization during scanning
|
||||||
enableMultiStepAttack = false # Enable multi-step attack simulations
|
enableMultiStepAttack = false # Enable multi-step attack simulations
|
||||||
|
version = $SETTINGS_VERSION
|
||||||
|
|
||||||
# [modules.LLM-Jailbreak-Classifier]
|
# [modules.LLM-Jailbreak-Classifier]
|
||||||
# dataset_name = "markush1/LLM-Jailbreak-Classifier"
|
# dataset_name = "markush1/LLM-Jailbreak-Classifier"
|
||||||
@@ -110,11 +133,20 @@ high = 0.5
|
|||||||
OPENAI_API_KEY = "$OPENAI_API_KEY"
|
OPENAI_API_KEY = "$OPENAI_API_KEY"
|
||||||
DEEPSEEK_API_KEY = "$DEEPSEEK_API_KEY"
|
DEEPSEEK_API_KEY = "$DEEPSEEK_API_KEY"
|
||||||
|
|
||||||
|
[caching]
|
||||||
|
enable = true
|
||||||
|
cache_size = 10000
|
||||||
|
use_disk_cache = false
|
||||||
|
|
||||||
|
[network]
|
||||||
|
retry = 3
|
||||||
|
timeout_connect = 30
|
||||||
|
timeout_response = 90
|
||||||
""".replace(
|
""".replace(
|
||||||
"$HOST", host
|
"$HOST", host
|
||||||
).replace(
|
|
||||||
"$PORT", str(port)
|
|
||||||
)
|
)
|
||||||
|
.replace("$PORT", str(port))
|
||||||
|
.replace("$SETTINGS_VERSION", str(SETTINGS_VERSION))
|
||||||
)
|
)
|
||||||
|
|
||||||
logger.info(
|
logger.info(
|
||||||
|
|||||||
@@ -2,6 +2,7 @@ import os
|
|||||||
from asyncio import Event, Queue
|
from asyncio import Event, Queue
|
||||||
|
|
||||||
from fastapi import FastAPI
|
from fastapi import FastAPI
|
||||||
|
from fastapi.responses import ORJSONResponse
|
||||||
|
|
||||||
tools_inbox: Queue = Queue()
|
tools_inbox: Queue = Queue()
|
||||||
stop_event: Event = Event()
|
stop_event: Event = Event()
|
||||||
@@ -11,7 +12,7 @@ _secrets = {}
|
|||||||
|
|
||||||
def create_app() -> FastAPI:
|
def create_app() -> FastAPI:
|
||||||
"""Create and configure the FastAPI application."""
|
"""Create and configure the FastAPI application."""
|
||||||
app = FastAPI()
|
app = FastAPI(default_response_class=ORJSONResponse)
|
||||||
return app
|
return app
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
@@ -1,13 +1,11 @@
|
|||||||
from agentic_security.config import CfgMixin
|
from agentic_security.config import get_or_create_config
|
||||||
from agentic_security.core.app import set_secrets
|
from agentic_security.core.app import set_secrets
|
||||||
|
|
||||||
|
|
||||||
class InMemorySecrets:
|
class InMemorySecrets:
|
||||||
def __init__(self):
|
def __init__(self):
|
||||||
self.secrets = {}
|
config = get_or_create_config()
|
||||||
self.config = CfgMixin()
|
self.secrets = config.get_config_value("secrets", {})
|
||||||
self.config.get_or_create_config()
|
|
||||||
self.secrets = self.config.config.get("secrets", {})
|
|
||||||
set_secrets(self.secrets)
|
set_secrets(self.secrets)
|
||||||
|
|
||||||
def set_secret(self, key: str, value: str):
|
def set_secret(self, key: str, value: str):
|
||||||
|
|||||||
@@ -4,6 +4,8 @@ from enum import Enum
|
|||||||
import httpx
|
import httpx
|
||||||
from pydantic import BaseModel
|
from pydantic import BaseModel
|
||||||
|
|
||||||
|
from agentic_security.config import settings_var
|
||||||
|
|
||||||
|
|
||||||
class Modality(Enum):
|
class Modality(Enum):
|
||||||
TEXT = 0
|
TEXT = 0
|
||||||
@@ -28,7 +30,7 @@ def encode_audio_base64_by_url(url: str) -> str:
|
|||||||
|
|
||||||
|
|
||||||
class InvalidHTTPSpecError(Exception):
|
class InvalidHTTPSpecError(Exception):
|
||||||
...
|
pass
|
||||||
|
|
||||||
|
|
||||||
class LLMSpec(BaseModel):
|
class LLMSpec(BaseModel):
|
||||||
@@ -47,14 +49,21 @@ class LLMSpec(BaseModel):
|
|||||||
except Exception as e:
|
except Exception as e:
|
||||||
raise InvalidHTTPSpecError(f"Failed to parse HTTP spec: {e}") from e
|
raise InvalidHTTPSpecError(f"Failed to parse HTTP spec: {e}") from e
|
||||||
|
|
||||||
|
def timeout(self):
|
||||||
|
return (
|
||||||
|
settings_var("network.timeout_connect", 30),
|
||||||
|
settings_var("network.timeout_response", 90),
|
||||||
|
)
|
||||||
|
|
||||||
async def _probe_with_files(self, files):
|
async def _probe_with_files(self, files):
|
||||||
async with httpx.AsyncClient() as client:
|
transport = httpx.AsyncHTTPTransport(retries=settings_var("network.retry", 3))
|
||||||
|
async with httpx.AsyncClient(transport=transport) as client:
|
||||||
response = await client.request(
|
response = await client.request(
|
||||||
method=self.method,
|
method=self.method,
|
||||||
url=self.url,
|
url=self.url,
|
||||||
headers=self.headers,
|
headers=self.headers,
|
||||||
files=files,
|
files=files,
|
||||||
timeout=(30, 90),
|
timeout=self.timeout(),
|
||||||
)
|
)
|
||||||
|
|
||||||
return response
|
return response
|
||||||
@@ -90,13 +99,15 @@ class LLMSpec(BaseModel):
|
|||||||
content = self.body.replace("<<PROMPT>>", escape_special_chars_for_json(prompt))
|
content = self.body.replace("<<PROMPT>>", escape_special_chars_for_json(prompt))
|
||||||
content = content.replace("<<BASE64_IMAGE>>", encoded_image)
|
content = content.replace("<<BASE64_IMAGE>>", encoded_image)
|
||||||
content = content.replace("<<BASE64_AUDIO>>", encoded_audio)
|
content = content.replace("<<BASE64_AUDIO>>", encoded_audio)
|
||||||
async with httpx.AsyncClient() as client:
|
|
||||||
|
transport = httpx.AsyncHTTPTransport(retries=settings_var("network.retry", 3))
|
||||||
|
async with httpx.AsyncClient(transport=transport) as client:
|
||||||
response = await client.request(
|
response = await client.request(
|
||||||
method=self.method,
|
method=self.method,
|
||||||
url=self.url,
|
url=self.url,
|
||||||
headers=self.headers,
|
headers=self.headers,
|
||||||
content=content,
|
content=content,
|
||||||
timeout=(30, 90),
|
timeout=self.timeout(),
|
||||||
)
|
)
|
||||||
|
|
||||||
return response
|
return response
|
||||||
|
|||||||
@@ -9,8 +9,8 @@ from rich.console import Console
|
|||||||
from rich.table import Table
|
from rich.table import Table
|
||||||
from tabulate import tabulate
|
from tabulate import tabulate
|
||||||
|
|
||||||
from agentic_security.config import CfgMixin # Importing the configuration mixin
|
from agentic_security.config import SettingsMixin # Importing the configuration mixin
|
||||||
from agentic_security.models.schemas import Scan
|
from agentic_security.primitives import Scan
|
||||||
from agentic_security.probe_data import REGISTRY
|
from agentic_security.probe_data import REGISTRY
|
||||||
from agentic_security.routes.scan import streaming_response_generator
|
from agentic_security.routes.scan import streaming_response_generator
|
||||||
|
|
||||||
@@ -23,7 +23,7 @@ YELLOW = colorama.Fore.YELLOW
|
|||||||
BLUE = colorama.Fore.BLUE
|
BLUE = colorama.Fore.BLUE
|
||||||
|
|
||||||
|
|
||||||
class AgenticSecurity(CfgMixin):
|
class AgenticSecurity(SettingsMixin):
|
||||||
@classmethod
|
@classmethod
|
||||||
async def async_scan(
|
async def async_scan(
|
||||||
cls,
|
cls,
|
||||||
|
|||||||
@@ -0,0 +1,88 @@
|
|||||||
|
from pyfiglet import Figlet, FontNotFound
|
||||||
|
from termcolor import colored
|
||||||
|
|
||||||
|
try:
|
||||||
|
from importlib.metadata import version
|
||||||
|
except ImportError:
|
||||||
|
from importlib_metadata import version
|
||||||
|
|
||||||
|
|
||||||
|
def generate_banner(
|
||||||
|
title="Agentic Security",
|
||||||
|
font="slant",
|
||||||
|
version="v2.1.0",
|
||||||
|
tagline="Proactive Threat Detection & Automated Security Protocols",
|
||||||
|
author="Developed by: [Security Team]",
|
||||||
|
website="Website: https://github.com/msoedov/agentic_security",
|
||||||
|
warning="",
|
||||||
|
):
|
||||||
|
"""Generate a visually enhanced banner with dynamic width and borders."""
|
||||||
|
# Define the text elements
|
||||||
|
|
||||||
|
# Initialize Figlet with the specified font, fallback to default if not found
|
||||||
|
try:
|
||||||
|
f = Figlet(font=font)
|
||||||
|
except FontNotFound:
|
||||||
|
f = Figlet() # Fallback to default font
|
||||||
|
|
||||||
|
# Render the title text and calculate the maximum width of Figlet lines
|
||||||
|
banner_text = f.renderText(title)
|
||||||
|
banner_lines = banner_text.splitlines()
|
||||||
|
figlet_max_width = max(len(line) for line in banner_lines) if banner_lines else 0
|
||||||
|
|
||||||
|
# Create the details line and calculate its width
|
||||||
|
details_line = f"Version: {version} | {website}"
|
||||||
|
details_width = len(details_line)
|
||||||
|
|
||||||
|
# Calculate widths of other text elements
|
||||||
|
warning_width = len(warning)
|
||||||
|
tagline_width = len(tagline)
|
||||||
|
|
||||||
|
# Determine the overall maximum width for centering
|
||||||
|
overall_max_width = max(
|
||||||
|
figlet_max_width, warning_width, tagline_width, details_width
|
||||||
|
)
|
||||||
|
|
||||||
|
# Pad the Figlet lines to the overall maximum width
|
||||||
|
padded_banner_lines = [line.center(overall_max_width) for line in banner_lines]
|
||||||
|
|
||||||
|
# Define decorative characters and colors
|
||||||
|
decor_chars = ["▄", "■", "►"]
|
||||||
|
decor_colors = ["blue", "red", "yellow"]
|
||||||
|
|
||||||
|
# Create and color the content lines
|
||||||
|
content_lines = []
|
||||||
|
for line in padded_banner_lines:
|
||||||
|
content_lines.append(colored(line, "blue"))
|
||||||
|
content_lines.append(colored(decor_chars[0] * overall_max_width, decor_colors[0]))
|
||||||
|
content_lines.append(
|
||||||
|
colored(warning.center(overall_max_width), "red", attrs=["blink", "bold"])
|
||||||
|
)
|
||||||
|
content_lines.append(colored(decor_chars[1] * overall_max_width, decor_colors[1]))
|
||||||
|
content_lines.append(colored(tagline.center(overall_max_width), "red"))
|
||||||
|
content_lines.append(colored(decor_chars[2] * overall_max_width, decor_colors[2]))
|
||||||
|
content_lines.append(colored(details_line.center(overall_max_width), "magenta"))
|
||||||
|
|
||||||
|
# Define border color and create top and bottom borders
|
||||||
|
border_color = "blue"
|
||||||
|
top_border = colored("╔" + "═" * (overall_max_width + 2) + "╗", border_color)
|
||||||
|
bottom_border = colored("╚" + "═" * (overall_max_width + 2) + "╝", border_color)
|
||||||
|
|
||||||
|
# Add side borders to each content line with padding
|
||||||
|
bordered_content = [
|
||||||
|
colored("║ ", border_color) + line + colored(" ║", border_color)
|
||||||
|
for line in content_lines
|
||||||
|
]
|
||||||
|
|
||||||
|
# Assemble the full banner
|
||||||
|
banner = top_border + "\n" + "\n".join(bordered_content) + "\n" + bottom_border
|
||||||
|
return banner
|
||||||
|
|
||||||
|
|
||||||
|
def init_banner():
|
||||||
|
ver = version("agentic_security")
|
||||||
|
print(generate_banner(version=ver))
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
init_banner()
|
||||||
@@ -0,0 +1,11 @@
|
|||||||
|
from agentic_security.primitives.models import ( # noqa
|
||||||
|
CompletionRequest,
|
||||||
|
FileProbeResponse,
|
||||||
|
LLMInfo,
|
||||||
|
Message,
|
||||||
|
Probe,
|
||||||
|
Scan,
|
||||||
|
ScanResult,
|
||||||
|
Settings,
|
||||||
|
Table,
|
||||||
|
)
|
||||||
@@ -2,6 +2,7 @@ import asyncio
|
|||||||
import random
|
import random
|
||||||
import time
|
import time
|
||||||
from collections.abc import AsyncGenerator
|
from collections.abc import AsyncGenerator
|
||||||
|
from json import JSONDecodeError
|
||||||
|
|
||||||
import httpx
|
import httpx
|
||||||
import pandas as pd
|
import pandas as pd
|
||||||
@@ -10,7 +11,7 @@ from skopt import Optimizer
|
|||||||
from skopt.space import Real
|
from skopt.space import Real
|
||||||
|
|
||||||
from agentic_security.http_spec import Modality
|
from agentic_security.http_spec import Modality
|
||||||
from agentic_security.models.schemas import Scan, ScanResult
|
from agentic_security.primitives import Scan, ScanResult
|
||||||
from agentic_security.probe_actor.cost_module import calculate_cost
|
from agentic_security.probe_actor.cost_module import calculate_cost
|
||||||
from agentic_security.probe_actor.refusal import refusal_heuristic
|
from agentic_security.probe_actor.refusal import refusal_heuristic
|
||||||
from agentic_security.probe_data import audio_generator, image_generator, msj_data
|
from agentic_security.probe_data import audio_generator, image_generator, msj_data
|
||||||
@@ -19,6 +20,10 @@ from agentic_security.probe_data.data import prepare_prompts
|
|||||||
# TODO: full log file
|
# TODO: full log file
|
||||||
|
|
||||||
MAX_PROMPT_LENGTH = 2048
|
MAX_PROMPT_LENGTH = 2048
|
||||||
|
BUDGET_MULTIPLIER = 100_000_000
|
||||||
|
INITIAL_OPTIMIZER_POINTS = 25
|
||||||
|
MIN_FAILURE_SAMPLES = 5
|
||||||
|
FAILURE_RATE_THRESHOLD = 0.5
|
||||||
|
|
||||||
|
|
||||||
async def generate_prompts(
|
async def generate_prompts(
|
||||||
@@ -75,6 +80,31 @@ async def process_prompt(
|
|||||||
logger.error(f"Request error: {exc}")
|
logger.error(f"Request error: {exc}")
|
||||||
errors.append((module_name, prompt, "?", str(exc)))
|
errors.append((module_name, prompt, "?", str(exc)))
|
||||||
return tokens, True
|
return tokens, True
|
||||||
|
except JSONDecodeError as json_decode_error:
|
||||||
|
logger.error(f"Jason error: {json_decode_error}")
|
||||||
|
errors.append((module_name, prompt, "?", str(json_decode_error)))
|
||||||
|
return tokens, True
|
||||||
|
|
||||||
|
|
||||||
|
async def process_prompt_batch(
|
||||||
|
request_factory,
|
||||||
|
prompts: list[str],
|
||||||
|
tokens: int,
|
||||||
|
module_name: str,
|
||||||
|
refusals,
|
||||||
|
errors,
|
||||||
|
outputs,
|
||||||
|
) -> tuple[int, int]:
|
||||||
|
tasks = [
|
||||||
|
process_prompt(
|
||||||
|
request_factory, p, tokens, module_name, refusals, errors, outputs
|
||||||
|
)
|
||||||
|
for p in prompts
|
||||||
|
]
|
||||||
|
results = await asyncio.gather(*tasks)
|
||||||
|
total_tokens = sum(r[0] for r in results)
|
||||||
|
failures = sum(1 for r in results if r[1])
|
||||||
|
return total_tokens, failures
|
||||||
|
|
||||||
|
|
||||||
async def perform_single_shot_scan(
|
async def perform_single_shot_scan(
|
||||||
@@ -87,7 +117,7 @@ async def perform_single_shot_scan(
|
|||||||
secrets: dict[str, str] = {},
|
secrets: dict[str, str] = {},
|
||||||
) -> AsyncGenerator[str, None]:
|
) -> AsyncGenerator[str, None]:
|
||||||
"""Perform a standard security scan."""
|
"""Perform a standard security scan."""
|
||||||
max_budget = max_budget * 100_000_000
|
max_budget = max_budget * BUDGET_MULTIPLIER
|
||||||
selected_datasets = [m for m in datasets if m["selected"]]
|
selected_datasets = [m for m in datasets if m["selected"]]
|
||||||
request_factory = multi_modality_spec(request_factory)
|
request_factory = multi_modality_spec(request_factory)
|
||||||
try:
|
try:
|
||||||
@@ -123,6 +153,7 @@ async def perform_single_shot_scan(
|
|||||||
module_failures = 0
|
module_failures = 0
|
||||||
module_size = 0 if module.lazy else len(module.prompts)
|
module_size = 0 if module.lazy else len(module.prompts)
|
||||||
logger.info(f"Scanning {module.dataset_name} {module_size}")
|
logger.info(f"Scanning {module.dataset_name} {module_size}")
|
||||||
|
module_prompts = 0 # Reset for each module
|
||||||
|
|
||||||
async for prompt in generate_prompts(module.prompts):
|
async for prompt in generate_prompts(module.prompts):
|
||||||
if stop_event and stop_event.is_set():
|
if stop_event and stop_event.is_set():
|
||||||
@@ -132,9 +163,12 @@ async def perform_single_shot_scan(
|
|||||||
return
|
return
|
||||||
|
|
||||||
processed_prompts += 1
|
processed_prompts += 1
|
||||||
|
module_prompts += 1 # Fixed increment syntax
|
||||||
|
# Calculate progress based on total processed prompts
|
||||||
progress = (
|
progress = (
|
||||||
100 * processed_prompts / total_prompts if total_prompts else 0
|
100 * processed_prompts / total_prompts if total_prompts else 0
|
||||||
)
|
)
|
||||||
|
|
||||||
total_tokens -= tokens
|
total_tokens -= tokens
|
||||||
start = time.time()
|
start = time.time()
|
||||||
tokens, failed = await process_prompt(
|
tokens, failed = await process_prompt(
|
||||||
@@ -148,14 +182,13 @@ async def perform_single_shot_scan(
|
|||||||
)
|
)
|
||||||
end = time.time()
|
end = time.time()
|
||||||
total_tokens += tokens
|
total_tokens += tokens
|
||||||
# logger.debug(f"Trying prompt: {prompt}, {failed=}")
|
|
||||||
if failed:
|
if failed:
|
||||||
module_failures += 1
|
module_failures += 1
|
||||||
failure_rate = module_failures / max(processed_prompts, 1)
|
failure_rate = module_failures / max(module_prompts, 1)
|
||||||
failure_rates.append(failure_rate)
|
failure_rates.append(failure_rate)
|
||||||
cost = calculate_cost(tokens)
|
cost = calculate_cost(tokens)
|
||||||
|
|
||||||
# TODO: improve this cond
|
|
||||||
last_output = outputs[-1] if outputs else None
|
last_output = outputs[-1] if outputs else None
|
||||||
if last_output and last_output[1] == prompt:
|
if last_output and last_output[1] == prompt:
|
||||||
response_text = last_output[2]
|
response_text = last_output[2]
|
||||||
@@ -204,7 +237,6 @@ async def perform_single_shot_scan(
|
|||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.exception("Scan failed")
|
logger.exception("Scan failed")
|
||||||
yield ScanResult.status_msg(f"Scan failed: {str(e)}")
|
yield ScanResult.status_msg(f"Scan failed: {str(e)}")
|
||||||
# raise e
|
|
||||||
finally:
|
finally:
|
||||||
yield ScanResult.status_msg("Scan completed.")
|
yield ScanResult.status_msg("Scan completed.")
|
||||||
|
|
||||||
|
|||||||
@@ -3,10 +3,13 @@ import logging
|
|||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
import httpx
|
import httpx
|
||||||
from httpx import LLMSpec
|
|
||||||
from pydantic import BaseModel, Field
|
from pydantic import BaseModel, Field
|
||||||
from pydantic_ai import Agent, RunContext
|
from pydantic_ai import Agent, RunContext
|
||||||
|
|
||||||
|
from agentic_security.http_spec import LLMSpec
|
||||||
|
|
||||||
|
LLM_SPECS = []
|
||||||
|
|
||||||
# Configure logging
|
# Configure logging
|
||||||
logging.basicConfig(level=logging.INFO)
|
logging.basicConfig(level=logging.INFO)
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
@@ -28,6 +31,7 @@ class OperatorToolBox:
|
|||||||
self.spec = spec
|
self.spec = spec
|
||||||
self.datasets = datasets
|
self.datasets = datasets
|
||||||
self.failures = []
|
self.failures = []
|
||||||
|
self.llm_specs = [LLMSpec.from_string(spec) for spec in LLM_SPECS]
|
||||||
|
|
||||||
def get_spec(self) -> AgentSpecification:
|
def get_spec(self) -> AgentSpecification:
|
||||||
return self.spec
|
return self.spec
|
||||||
@@ -62,52 +66,33 @@ class OperatorToolBox:
|
|||||||
return f"Operation '{operation}' failed: Dataset not found."
|
return f"Operation '{operation}' failed: Dataset not found."
|
||||||
return f"Operation '{operation}' executed successfully."
|
return f"Operation '{operation}' executed successfully."
|
||||||
|
|
||||||
async def test(self, description: str, sample_test: dict[str, Any]) -> str:
|
async def test_llm_spec(self, llm_spec: LLMSpec, user_prompt: str) -> str:
|
||||||
agent = Agent(
|
try:
|
||||||
"openai:gpt-4o",
|
# Verify the spec
|
||||||
result_type=LLMSpec,
|
response = await llm_spec.verify()
|
||||||
system_prompt="Extract the LLM specification from the input",
|
response.raise_for_status()
|
||||||
)
|
logger.info(f"Verification succeeded for {llm_spec.url}")
|
||||||
|
|
||||||
async with agent.run_stream(description) as result:
|
# Run test with user prompt
|
||||||
async for spec in result.stream():
|
test_response = await llm_spec.probe(user_prompt)
|
||||||
self.spec.endpoint = spec.url
|
test_response.raise_for_status()
|
||||||
|
response_data = test_response.json()
|
||||||
|
return f"Test succeeded for {llm_spec.url}: {response_data}"
|
||||||
|
except httpx.HTTPStatusError as e:
|
||||||
|
self.failures.append(f"HTTP error occurred: {e}")
|
||||||
|
logger.error(f"Test failed for {llm_spec.url}: {e}")
|
||||||
|
return f"Test failed for {llm_spec.url}: {e}"
|
||||||
|
except Exception as e:
|
||||||
|
self.failures.append(f"An error occurred: {e}")
|
||||||
|
logger.error(f"Test failed for {llm_spec.url}: {e}")
|
||||||
|
return f"Test failed for {llm_spec.url}: {e}"
|
||||||
|
|
||||||
# Verify access to the endpoint
|
async def test_with_prompt(self, spec_index: int, user_prompt: str) -> str:
|
||||||
async with httpx.AsyncClient() as client:
|
if not 0 <= spec_index < len(self.llm_specs):
|
||||||
try:
|
return f"Invalid spec index: {spec_index}. Valid range is 0 to {len(self.llm_specs) - 1}"
|
||||||
access_response = await client.get(spec.url)
|
|
||||||
access_response.raise_for_status()
|
|
||||||
except httpx.HTTPStatusError as e:
|
|
||||||
self.failures.append(f"HTTP error occurred: {e}")
|
|
||||||
logger.error(f"Access verification failed: {e}")
|
|
||||||
return f"Access verification failed: {e}"
|
|
||||||
except Exception as e:
|
|
||||||
self.failures.append(f"An error occurred: {e}")
|
|
||||||
logger.error(f"Access verification failed: {e}")
|
|
||||||
return f"Access verification failed: {e}"
|
|
||||||
|
|
||||||
# Run the sample test
|
llm_spec = self.llm_specs[spec_index]
|
||||||
try:
|
return await self.test_llm_spec(llm_spec, user_prompt)
|
||||||
test_response = await client.post(
|
|
||||||
f"{spec.url}/test", json=sample_test
|
|
||||||
)
|
|
||||||
test_response.raise_for_status()
|
|
||||||
response_data = test_response.json()
|
|
||||||
if "choices" in response_data and len(response_data["choices"]) > 0:
|
|
||||||
return f"Testing agent at {spec.url} succeeded: {response_data}"
|
|
||||||
else:
|
|
||||||
self.failures.append("Invalid response format")
|
|
||||||
logger.error("Sample test failed: Invalid response format")
|
|
||||||
return "Sample test failed: Invalid response format"
|
|
||||||
except httpx.HTTPStatusError as e:
|
|
||||||
self.failures.append(f"HTTP error occurred: {e}")
|
|
||||||
logger.error(f"Sample test failed: {e}")
|
|
||||||
return f"Sample test failed: {e}"
|
|
||||||
except Exception as e:
|
|
||||||
self.failures.append(f"An error occurred: {e}")
|
|
||||||
logger.error(f"Sample test failed: {e}")
|
|
||||||
return f"Sample test failed: {e}"
|
|
||||||
|
|
||||||
|
|
||||||
# Initialize OperatorToolBox with AgentSpecification
|
# Initialize OperatorToolBox with AgentSpecification
|
||||||
@@ -126,104 +111,95 @@ dataset_manager_agent = Agent(
|
|||||||
model="gpt-4",
|
model="gpt-4",
|
||||||
deps_type=OperatorToolBox,
|
deps_type=OperatorToolBox,
|
||||||
result_type=str,
|
result_type=str,
|
||||||
system_prompt="You can validate the toolbox, run operations, and retrieve results or failures.",
|
system_prompt="You can validate the toolbox, run operations, retrieve results or failures, and test LLM specs.",
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@dataset_manager_agent.tool
|
@dataset_manager_agent.tool
|
||||||
async def validate_toolbox(ctx: RunContext[OperatorToolBox]) -> str:
|
async def validate_toolbox(ctx: RunContext[OperatorToolBox]) -> str:
|
||||||
is_valid = ctx.deps.validate()
|
is_valid = ctx.deps.validate()
|
||||||
if is_valid:
|
return (
|
||||||
return "ToolBox validation successful."
|
"ToolBox validation successful." if is_valid else "ToolBox validation failed."
|
||||||
else:
|
)
|
||||||
return "ToolBox validation failed."
|
|
||||||
|
|
||||||
|
|
||||||
@dataset_manager_agent.tool
|
@dataset_manager_agent.tool
|
||||||
async def execute_operation(ctx: RunContext[OperatorToolBox], operation: str) -> str:
|
async def execute_operation(ctx: RunContext[OperatorToolBox], operation: str) -> str:
|
||||||
result = ctx.deps.run_operation(operation)
|
return ctx.deps.run_operation(operation)
|
||||||
return result
|
|
||||||
|
|
||||||
|
|
||||||
@dataset_manager_agent.tool
|
@dataset_manager_agent.tool
|
||||||
async def retrieve_results(ctx: RunContext[OperatorToolBox]) -> str:
|
async def retrieve_results(ctx: RunContext[OperatorToolBox]) -> str:
|
||||||
results = ctx.deps.get_results()
|
results = ctx.deps.get_results()
|
||||||
if results:
|
return (
|
||||||
formatted_results = "\n".join([f"{op}: {res}" for op, res in results.items()])
|
f"Operation Results:\n{results}"
|
||||||
return f"Operation Results:\n{formatted_results}"
|
if results
|
||||||
else:
|
else "No operations have been executed yet."
|
||||||
return "No operations have been executed yet."
|
)
|
||||||
|
|
||||||
|
|
||||||
@dataset_manager_agent.tool
|
@dataset_manager_agent.tool
|
||||||
async def retrieve_failures(ctx: RunContext[OperatorToolBox]) -> str:
|
async def retrieve_failures(ctx: RunContext[OperatorToolBox]) -> str:
|
||||||
failures = ctx.deps.get_failures()
|
failures = ctx.deps.get_failures()
|
||||||
if failures:
|
return f"Failures:\n{failures}" if failures else "No failures recorded."
|
||||||
formatted_failures = "\n".join(failures)
|
|
||||||
return f"Failures:\n{formatted_failures}"
|
|
||||||
else:
|
|
||||||
return "No failures recorded."
|
|
||||||
|
|
||||||
|
|
||||||
@dataset_manager_agent.tool
|
@dataset_manager_agent.tool
|
||||||
async def test_agent(
|
async def list_llm_specs(ctx: RunContext[OperatorToolBox]) -> str:
|
||||||
ctx: RunContext[OperatorToolBox], description: str, sample_test: dict[str, Any]
|
spec_list = "\n".join(
|
||||||
|
f"{i}: {spec.url}" for i, spec in enumerate(ctx.deps.llm_specs)
|
||||||
|
)
|
||||||
|
return f"Available LLM Specs:\n{spec_list}"
|
||||||
|
|
||||||
|
|
||||||
|
@dataset_manager_agent.tool
|
||||||
|
async def test_llm_with_prompt(
|
||||||
|
ctx: RunContext[OperatorToolBox], spec_index: int, user_prompt: str
|
||||||
) -> str:
|
) -> str:
|
||||||
result = await ctx.deps.test(description, sample_test)
|
return await ctx.deps.test_with_prompt(spec_index, user_prompt)
|
||||||
return result
|
|
||||||
|
|
||||||
|
|
||||||
# Synchronous run example
|
# Asynchronous run example with user confirmation
|
||||||
def run_dataset_manager_agent_sync():
|
|
||||||
prompts = [
|
|
||||||
"Validate the toolbox.",
|
|
||||||
"Execute operation on 'dataset2'.",
|
|
||||||
"Execute operation on 'dataset4'.", # This should fail
|
|
||||||
"Retrieve the results.",
|
|
||||||
"Retrieve any failures.",
|
|
||||||
"Test my openAI compatible agent deployed at localhost:3000",
|
|
||||||
]
|
|
||||||
|
|
||||||
sample_test = {"prompt": "Hello, how are you?", "max_tokens": 5}
|
|
||||||
|
|
||||||
for prompt in prompts:
|
|
||||||
if "Test my" in prompt:
|
|
||||||
result = dataset_manager_agent.run_sync(
|
|
||||||
prompt, deps=toolbox, sample_test=sample_test
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
result = dataset_manager_agent.run_sync(prompt, deps=toolbox)
|
|
||||||
print(f"Prompt: {prompt}")
|
|
||||||
print(f"Response: {result.data}\n")
|
|
||||||
|
|
||||||
|
|
||||||
# Asynchronous run example
|
|
||||||
async def run_dataset_manager_agent_async():
|
async def run_dataset_manager_agent_async():
|
||||||
prompts = [
|
prompts = [
|
||||||
"Validate the toolbox.",
|
"Validate the toolbox.",
|
||||||
"Execute operation on 'dataset2'.",
|
"List available LLM specs.",
|
||||||
"Execute operation on 'dataset4'.", # This should fail
|
"I want to test an LLM with my prompt: 'Tell me a short story about a robot'. Which spec index should I use?",
|
||||||
"Retrieve the results.",
|
|
||||||
"Retrieve any failures.",
|
|
||||||
"Test my openAI compatible agent deployed at localhost:3000",
|
|
||||||
]
|
]
|
||||||
|
|
||||||
sample_test = {"prompt": "Hello, how are you?", "max_tokens": 5}
|
|
||||||
|
|
||||||
for prompt in prompts:
|
for prompt in prompts:
|
||||||
if "Test my" in prompt:
|
result = await dataset_manager_agent.run(prompt, deps=toolbox)
|
||||||
result = await dataset_manager_agent.run(
|
|
||||||
prompt, deps=toolbox, sample_test=sample_test
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
result = await dataset_manager_agent.run(prompt, deps=toolbox)
|
|
||||||
print(f"Prompt: {prompt}")
|
print(f"Prompt: {prompt}")
|
||||||
print(f"Response: {result.data}\n")
|
print(f"Response: {result.data}\n")
|
||||||
|
|
||||||
|
# Handle testing request
|
||||||
|
if "test an LLM with my prompt" in prompt:
|
||||||
|
print(
|
||||||
|
"Please select a spec index from the list above and confirm to proceed."
|
||||||
|
)
|
||||||
|
# Simulate user input for demo (in real app, you'd get this from user)
|
||||||
|
user_input = (
|
||||||
|
input("Enter spec index and 'yes' to confirm (e.g., '0 yes'): ")
|
||||||
|
.strip()
|
||||||
|
.split()
|
||||||
|
)
|
||||||
|
if len(user_input) == 2 and user_input[1].lower() == "yes":
|
||||||
|
try:
|
||||||
|
spec_index = int(user_input[0])
|
||||||
|
# Extract prompt from the original input
|
||||||
|
user_prompt = prompt.split("my prompt: ")[1].strip("'")
|
||||||
|
test_result = await dataset_manager_agent.run(
|
||||||
|
f"Test LLM at index {spec_index} with prompt: {user_prompt}",
|
||||||
|
deps=toolbox,
|
||||||
|
spec_index=spec_index,
|
||||||
|
user_prompt=user_prompt,
|
||||||
|
)
|
||||||
|
print(f"Test Response: {test_result.data}\n")
|
||||||
|
except ValueError:
|
||||||
|
print("Invalid spec index provided.\n")
|
||||||
|
else:
|
||||||
|
print("Test canceled. Please provide a valid index and confirmation.\n")
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
# Run synchronous example
|
|
||||||
run_dataset_manager_agent_sync()
|
|
||||||
|
|
||||||
# Run asynchronous example
|
|
||||||
asyncio.run(run_dataset_manager_agent_async())
|
asyncio.run(run_dataset_manager_agent_async())
|
||||||
|
|||||||
@@ -5,7 +5,7 @@ from unittest.mock import AsyncMock, MagicMock, Mock, patch
|
|||||||
import httpx
|
import httpx
|
||||||
import pytest
|
import pytest
|
||||||
|
|
||||||
from agentic_security.models.schemas import Scan
|
from agentic_security.primitives import Scan
|
||||||
from agentic_security.probe_actor.fuzzer import (
|
from agentic_security.probe_actor.fuzzer import (
|
||||||
generate_prompts,
|
generate_prompts,
|
||||||
perform_many_shot_scan,
|
perform_many_shot_scan,
|
||||||
|
|||||||
@@ -5,6 +5,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "simonycl/aya-23-8B_advbench_jailbreak",
|
"dataset_name": "simonycl/aya-23-8B_advbench_jailbreak",
|
||||||
"num_prompts": 416,
|
"num_prompts": 416,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -16,6 +17,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "acmc/jailbreaks_dataset_with_perplexity_bigcode_starcoder2-3b_bigcode_starcoder2-7b",
|
"dataset_name": "acmc/jailbreaks_dataset_with_perplexity_bigcode_starcoder2-3b_bigcode_starcoder2-7b",
|
||||||
"num_prompts": 11191,
|
"num_prompts": 11191,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -27,6 +29,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "karanxa/dolphin-jailbreak-finetuning-dataset",
|
"dataset_name": "karanxa/dolphin-jailbreak-finetuning-dataset",
|
||||||
"num_prompts": 42684,
|
"num_prompts": 42684,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -38,6 +41,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "karanxa/llama-2-jailbreak-dataset",
|
"dataset_name": "karanxa/llama-2-jailbreak-dataset",
|
||||||
"num_prompts": 40613,
|
"num_prompts": 40613,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -49,6 +53,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "karanxa/llama2-uncensored-jailbreak-dataset-finetuning",
|
"dataset_name": "karanxa/llama2-uncensored-jailbreak-dataset-finetuning",
|
||||||
"num_prompts": 42854,
|
"num_prompts": 42854,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -60,6 +65,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "liuyanchen1015/Llama-3.2-1B_jailbreak_responses",
|
"dataset_name": "liuyanchen1015/Llama-3.2-1B_jailbreak_responses",
|
||||||
"num_prompts": 9888,
|
"num_prompts": 9888,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -71,6 +77,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "liuyanchen1015/Llama-3.2-1B-Instruct_jailbreak_responses",
|
"dataset_name": "liuyanchen1015/Llama-3.2-1B-Instruct_jailbreak_responses",
|
||||||
"num_prompts": 9888,
|
"num_prompts": 9888,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -82,6 +89,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "liuyanchen1015/Llama-3.2-1B-Instruct_jailbreak_responses_with_judgment",
|
"dataset_name": "liuyanchen1015/Llama-3.2-1B-Instruct_jailbreak_responses_with_judgment",
|
||||||
"num_prompts": 9888,
|
"num_prompts": 9888,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -93,6 +101,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "jackhhao/jailbreak-classification",
|
"dataset_name": "jackhhao/jailbreak-classification",
|
||||||
"num_prompts": 1044,
|
"num_prompts": 1044,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -104,6 +113,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "markush1/LLM-Jailbreak-Classifier",
|
"dataset_name": "markush1/LLM-Jailbreak-Classifier",
|
||||||
"num_prompts": 201193,
|
"num_prompts": 201193,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -115,6 +125,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "walledai/JailbreakBench",
|
"dataset_name": "walledai/JailbreakBench",
|
||||||
"num_prompts": 200,
|
"num_prompts": 200,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -126,6 +137,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "walledai/JailbreakHub",
|
"dataset_name": "walledai/JailbreakHub",
|
||||||
"num_prompts": 15140,
|
"num_prompts": 15140,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -137,6 +149,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "Granther/evil-jailbreak",
|
"dataset_name": "Granther/evil-jailbreak",
|
||||||
"num_prompts": 1200,
|
"num_prompts": 1200,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -148,6 +161,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "sevdeawesome/jailbreak_success",
|
"dataset_name": "sevdeawesome/jailbreak_success",
|
||||||
"num_prompts": 10800,
|
"num_prompts": 10800,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -159,6 +173,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "IDA-SERICS/Disaster-tweet-jailbreaking",
|
"dataset_name": "IDA-SERICS/Disaster-tweet-jailbreaking",
|
||||||
"num_prompts": 3000,
|
"num_prompts": 3000,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -170,6 +185,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "GeorgeDaDude/Jailbreak_Complete_DS_labeled",
|
"dataset_name": "GeorgeDaDude/Jailbreak_Complete_DS_labeled",
|
||||||
"num_prompts": 11383,
|
"num_prompts": 11383,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -181,6 +197,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "dayone3nder/jailbreak_prompt_JBB_sft_trainset",
|
"dataset_name": "dayone3nder/jailbreak_prompt_JBB_sft_trainset",
|
||||||
"num_prompts": 4785,
|
"num_prompts": 4785,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -192,6 +209,7 @@ REGISTRY_V0 = [
|
|||||||
"dataset_name": "dayone3nder/general_safe_mix_jailbreak_prompt_JBB_trainset",
|
"dataset_name": "dayone3nder/general_safe_mix_jailbreak_prompt_JBB_trainset",
|
||||||
"num_prompts": 24679,
|
"num_prompts": 24679,
|
||||||
"tokens": None, # Add actual token count if available
|
"tokens": None, # Add actual token count if available
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -206,6 +224,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "AgenticBackend",
|
"dataset_name": "AgenticBackend",
|
||||||
"num_prompts": 2000,
|
"num_prompts": 2000,
|
||||||
"tokens": 0,
|
"tokens": 0,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Fine-tuned cloud hosted model",
|
"source": "Fine-tuned cloud hosted model",
|
||||||
"selected": True,
|
"selected": True,
|
||||||
@@ -221,6 +240,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "ShawnMenz/DAN_jailbreak",
|
"dataset_name": "ShawnMenz/DAN_jailbreak",
|
||||||
"num_prompts": 666,
|
"num_prompts": 666,
|
||||||
"tokens": 224196,
|
"tokens": 224196,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -232,6 +252,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "deepset/prompt-injections",
|
"dataset_name": "deepset/prompt-injections",
|
||||||
"num_prompts": 203,
|
"num_prompts": 203,
|
||||||
"tokens": 6988,
|
"tokens": 6988,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -243,6 +264,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "rubend18/ChatGPT-Jailbreak-Prompts",
|
"dataset_name": "rubend18/ChatGPT-Jailbreak-Prompts",
|
||||||
"num_prompts": 79,
|
"num_prompts": 79,
|
||||||
"tokens": 26971,
|
"tokens": 26971,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -254,6 +276,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "notrichardren/refuse-to-answer-prompts",
|
"dataset_name": "notrichardren/refuse-to-answer-prompts",
|
||||||
"num_prompts": 522,
|
"num_prompts": 522,
|
||||||
"tokens": 7172,
|
"tokens": 7172,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -265,6 +288,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "Lemhf14/EasyJailbreak_Datasets",
|
"dataset_name": "Lemhf14/EasyJailbreak_Datasets",
|
||||||
"num_prompts": 1630,
|
"num_prompts": 1630,
|
||||||
"tokens": 19758,
|
"tokens": 19758,
|
||||||
|
"is_active": False,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -276,6 +300,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "markush1/LLM-Jailbreak-Classifier",
|
"dataset_name": "markush1/LLM-Jailbreak-Classifier",
|
||||||
"num_prompts": 1119,
|
"num_prompts": 1119,
|
||||||
"tokens": 19758,
|
"tokens": 19758,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -287,6 +312,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "JailbreakV-28K/JailBreakV-28k",
|
"dataset_name": "JailbreakV-28K/JailBreakV-28k",
|
||||||
"num_prompts": 28300,
|
"num_prompts": 28300,
|
||||||
"tokens": 1975800,
|
"tokens": 1975800,
|
||||||
|
"is_active": False,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -298,6 +324,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "ShawnMenz/jailbreak_sft_rm_ds",
|
"dataset_name": "ShawnMenz/jailbreak_sft_rm_ds",
|
||||||
"num_prompts": 371000,
|
"num_prompts": 371000,
|
||||||
"tokens": 1975800,
|
"tokens": 1975800,
|
||||||
|
"is_active": False,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Hugging Face Datasets",
|
"source": "Hugging Face Datasets",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -309,6 +336,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "Steganography",
|
"dataset_name": "Steganography",
|
||||||
"num_prompts": 10,
|
"num_prompts": 10,
|
||||||
"tokens": 0,
|
"tokens": 0,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Local mutation dataset",
|
"source": "Local mutation dataset",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -320,6 +348,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "GPT fuzzer",
|
"dataset_name": "GPT fuzzer",
|
||||||
"num_prompts": 10,
|
"num_prompts": 10,
|
||||||
"tokens": 0,
|
"tokens": 0,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Local mutation dataset",
|
"source": "Local mutation dataset",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -331,6 +360,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "jailbreak_llms/2023_05_07",
|
"dataset_name": "jailbreak_llms/2023_05_07",
|
||||||
"num_prompts": 0,
|
"num_prompts": 0,
|
||||||
"tokens": 0,
|
"tokens": 0,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Github",
|
"source": "Github",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -342,6 +372,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "jailbreak_llms/2023_12_25.csv",
|
"dataset_name": "jailbreak_llms/2023_12_25.csv",
|
||||||
"num_prompts": 0,
|
"num_prompts": 0,
|
||||||
"tokens": 0,
|
"tokens": 0,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Github",
|
"source": "Github",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -353,6 +384,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "Malwaregen",
|
"dataset_name": "Malwaregen",
|
||||||
"num_prompts": 0,
|
"num_prompts": 0,
|
||||||
"tokens": 0,
|
"tokens": 0,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Local dataset",
|
"source": "Local dataset",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -364,6 +396,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "Hallucination",
|
"dataset_name": "Hallucination",
|
||||||
"num_prompts": 0,
|
"num_prompts": 0,
|
||||||
"tokens": 0,
|
"tokens": 0,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Local dataset",
|
"source": "Local dataset",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -375,6 +408,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "DataLeak",
|
"dataset_name": "DataLeak",
|
||||||
"num_prompts": 0,
|
"num_prompts": 0,
|
||||||
"tokens": 0,
|
"tokens": 0,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Local dataset",
|
"source": "Local dataset",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -386,6 +420,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "llm-adaptive-attacks",
|
"dataset_name": "llm-adaptive-attacks",
|
||||||
"num_prompts": 0,
|
"num_prompts": 0,
|
||||||
"tokens": 0,
|
"tokens": 0,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Github: tml-epfl/llm-adaptive-attacks#0.0.1",
|
"source": "Github: tml-epfl/llm-adaptive-attacks#0.0.1",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -397,6 +432,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "Garak",
|
"dataset_name": "Garak",
|
||||||
"num_prompts": 0,
|
"num_prompts": 0,
|
||||||
"tokens": 0,
|
"tokens": 0,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Github: https://github.com/leondz/garak#v0.9.0.1",
|
"source": "Github: https://github.com/leondz/garak#v0.9.0.1",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -412,6 +448,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "Reinforcement Learning Optimization",
|
"dataset_name": "Reinforcement Learning Optimization",
|
||||||
"num_prompts": 0,
|
"num_prompts": 0,
|
||||||
"tokens": 0,
|
"tokens": 0,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Cloud hosted model",
|
"source": "Cloud hosted model",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -427,6 +464,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"dataset_name": "InspectAI",
|
"dataset_name": "InspectAI",
|
||||||
"num_prompts": 0,
|
"num_prompts": 0,
|
||||||
"tokens": 0,
|
"tokens": 0,
|
||||||
|
"is_active": True,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
"source": "Github: https://github.com/UKGovernmentBEIS/inspect_ai",
|
"source": "Github: https://github.com/UKGovernmentBEIS/inspect_ai",
|
||||||
"selected": False,
|
"selected": False,
|
||||||
@@ -439,6 +477,7 @@ REGISTRY = REGISTRY_V0 + [
|
|||||||
"num_prompts": len(load_local_csv().prompts),
|
"num_prompts": len(load_local_csv().prompts),
|
||||||
"tokens": load_local_csv().tokens,
|
"tokens": load_local_csv().tokens,
|
||||||
"approx_cost": 0.0,
|
"approx_cost": 0.0,
|
||||||
|
"is_active": True,
|
||||||
"source": f"Local file dataset: {load_local_csv().metadata['src']}",
|
"source": f"Local file dataset: {load_local_csv().metadata['src']}",
|
||||||
"selected": len(load_local_csv().prompts),
|
"selected": len(load_local_csv().prompts),
|
||||||
"url": "",
|
"url": "",
|
||||||
|
|||||||
@@ -3,7 +3,7 @@ import random
|
|||||||
from fastapi import APIRouter, File, Header, HTTPException, UploadFile
|
from fastapi import APIRouter, File, Header, HTTPException, UploadFile
|
||||||
from fastapi.responses import JSONResponse
|
from fastapi.responses import JSONResponse
|
||||||
|
|
||||||
from ..models.schemas import FileProbeResponse, Probe
|
from ..primitives import FileProbeResponse, Probe
|
||||||
from ..probe_actor.refusal import REFUSAL_MARKS
|
from ..probe_actor.refusal import REFUSAL_MARKS
|
||||||
from ..probe_data import REGISTRY
|
from ..probe_data import REGISTRY
|
||||||
|
|
||||||
@@ -77,3 +77,11 @@ async def data_config():
|
|||||||
async def health_check():
|
async def health_check():
|
||||||
"""Health check endpoint."""
|
"""Health check endpoint."""
|
||||||
return JSONResponse(content={"status": "ok"})
|
return JSONResponse(content={"status": "ok"})
|
||||||
|
|
||||||
|
|
||||||
|
@router.post("/v1/self-probe-t5")
|
||||||
|
def self_probe_t5(probe: Probe):
|
||||||
|
import languagemodels as lm # noqa
|
||||||
|
|
||||||
|
message = lm.do(probe.prompt)
|
||||||
|
return make_mock_response(message)
|
||||||
|
|||||||
@@ -5,7 +5,7 @@ from fastapi import APIRouter
|
|||||||
from loguru import logger
|
from loguru import logger
|
||||||
|
|
||||||
from ..core.app import get_current_run, get_tools_inbox
|
from ..core.app import get_current_run, get_tools_inbox
|
||||||
from ..models.schemas import CompletionRequest, Settings
|
from ..primitives import CompletionRequest, Settings
|
||||||
from ..probe_actor.refusal import REFUSAL_MARKS
|
from ..probe_actor.refusal import REFUSAL_MARKS
|
||||||
|
|
||||||
router = APIRouter()
|
router = APIRouter()
|
||||||
|
|||||||
@@ -3,7 +3,7 @@ from pathlib import Path
|
|||||||
from fastapi import APIRouter, Response
|
from fastapi import APIRouter, Response
|
||||||
from fastapi.responses import FileResponse, StreamingResponse
|
from fastapi.responses import FileResponse, StreamingResponse
|
||||||
|
|
||||||
from ..models.schemas import Table
|
from ..primitives import Table
|
||||||
from ..report_chart import plot_security_report
|
from ..report_chart import plot_security_report
|
||||||
|
|
||||||
router = APIRouter()
|
router = APIRouter()
|
||||||
|
|||||||
@@ -14,7 +14,7 @@ from fastapi.responses import StreamingResponse
|
|||||||
from ..core.app import get_stop_event, get_tools_inbox, set_current_run
|
from ..core.app import get_stop_event, get_tools_inbox, set_current_run
|
||||||
from ..dependencies import InMemorySecrets, get_in_memory_secrets
|
from ..dependencies import InMemorySecrets, get_in_memory_secrets
|
||||||
from ..http_spec import LLMSpec
|
from ..http_spec import LLMSpec
|
||||||
from ..models.schemas import LLMInfo, Scan
|
from ..primitives import LLMInfo, Scan
|
||||||
from ..probe_actor import fuzzer
|
from ..probe_actor import fuzzer
|
||||||
|
|
||||||
router = APIRouter()
|
router = APIRouter()
|
||||||
|
|||||||
@@ -7,7 +7,7 @@ from fastapi.templating import Jinja2Templates
|
|||||||
from jinja2 import Environment, FileSystemLoader
|
from jinja2 import Environment, FileSystemLoader
|
||||||
from starlette.responses import Response
|
from starlette.responses import Response
|
||||||
|
|
||||||
from ..models.schemas import Settings
|
from ..primitives import Settings
|
||||||
|
|
||||||
router = APIRouter()
|
router = APIRouter()
|
||||||
STATIC_DIR = Path(__file__).parent.parent / "static"
|
STATIC_DIR = Path(__file__).parent.parent / "static"
|
||||||
|
|||||||
@@ -2,7 +2,7 @@ import sentry_sdk
|
|||||||
from loguru import logger
|
from loguru import logger
|
||||||
from sentry_sdk.integrations.logging import ignore_logger
|
from sentry_sdk.integrations.logging import ignore_logger
|
||||||
|
|
||||||
from ..models.schemas import Settings
|
from ..primitives import Settings
|
||||||
|
|
||||||
|
|
||||||
def setup(app):
|
def setup(app):
|
||||||
@@ -16,6 +16,7 @@ def setup(app):
|
|||||||
# Set traces_sample_rate to 1.0 to capture 100%
|
# Set traces_sample_rate to 1.0 to capture 100%
|
||||||
# of transactions for tracing.
|
# of transactions for tracing.
|
||||||
traces_sample_rate=1.0,
|
traces_sample_rate=1.0,
|
||||||
|
ignore_errors=[KeyboardInterrupt],
|
||||||
_experiments={
|
_experiments={
|
||||||
# Set continuous_profiling_auto_start to True
|
# Set continuous_profiling_auto_start to True
|
||||||
# to automatically start the profiler on when
|
# to automatically start the profiler on when
|
||||||
|
|||||||
@@ -6,7 +6,7 @@ import pytest
|
|||||||
from fastapi.testclient import TestClient
|
from fastapi.testclient import TestClient
|
||||||
|
|
||||||
from ..app import app
|
from ..app import app
|
||||||
from ..models.schemas import Probe
|
from ..primitives import Probe
|
||||||
from ..probe_actor.refusal import REFUSAL_MARKS
|
from ..probe_actor.refusal import REFUSAL_MARKS
|
||||||
from ..probe_data import REGISTRY
|
from ..probe_data import REGISTRY
|
||||||
|
|
||||||
|
|||||||
@@ -4,7 +4,7 @@ import pytest
|
|||||||
from fastapi import HTTPException
|
from fastapi import HTTPException
|
||||||
from fastapi.testclient import TestClient
|
from fastapi.testclient import TestClient
|
||||||
|
|
||||||
from ..models.schemas import Settings
|
from ..primitives import Settings
|
||||||
from .static import get_static_file, router
|
from .static import get_static_file, router
|
||||||
|
|
||||||
client = TestClient(router)
|
client = TestClient(router)
|
||||||
|
|||||||
@@ -1,4 +1,3 @@
|
|||||||
|
|
||||||
let SELF_URL = window.location.href;
|
let SELF_URL = window.location.href;
|
||||||
if (SELF_URL.endsWith('/')) {
|
if (SELF_URL.endsWith('/')) {
|
||||||
SELF_URL = SELF_URL.slice(0, -1);
|
SELF_URL = SELF_URL.slice(0, -1);
|
||||||
@@ -171,6 +170,21 @@ Content-Type: application/json
|
|||||||
{
|
{
|
||||||
"audio_url": "<<AUDIO_FILE_URL>>"
|
"audio_url": "<<AUDIO_FILE_URL>>"
|
||||||
}
|
}
|
||||||
|
`,
|
||||||
|
|
||||||
|
`POST https://api.openrouter.ai/v1/chat/completions
|
||||||
|
Authorization: Bearer $OPENROUTER_API_KEY
|
||||||
|
Content-Type: application/json
|
||||||
|
|
||||||
|
{
|
||||||
|
"model": "openrouter-latest",
|
||||||
|
"prompt": "<<PROMPT>>",
|
||||||
|
"temperature": 0.7,
|
||||||
|
"max_tokens": 150,
|
||||||
|
"top_p": 0.9,
|
||||||
|
"frequency_penalty": 0,
|
||||||
|
"presence_penalty": 0
|
||||||
|
}
|
||||||
`,
|
`,
|
||||||
|
|
||||||
]
|
]
|
||||||
@@ -190,6 +204,7 @@ let LLM_CONFIGS = [
|
|||||||
{ name: 'Claude', prompts: 40000, logo: '/icons/claude.png' },
|
{ name: 'Claude', prompts: 40000, logo: '/icons/claude.png' },
|
||||||
{ name: 'Cohere', prompts: 40000, logo: '/icons/cohere.png' },
|
{ name: 'Cohere', prompts: 40000, logo: '/icons/cohere.png' },
|
||||||
{ name: 'Azure OpenAI', prompts: 40000, logo: '/icons/azureai.png' },
|
{ name: 'Azure OpenAI', prompts: 40000, logo: '/icons/azureai.png' },
|
||||||
|
{ name: 'OpenRouter.ai', prompts: 40000, logo: '/icons/openrouter.png' },
|
||||||
{ name: 'assemblyai', prompts: 40000, logo: fallbackIcon },
|
{ name: 'assemblyai', prompts: 40000, logo: fallbackIcon },
|
||||||
];
|
];
|
||||||
function has_image(spec) {
|
function has_image(spec) {
|
||||||
@@ -226,5 +241,6 @@ function _getFailureRateScore(failureRate) {
|
|||||||
else if (strengthRate >= 80) return 'B';
|
else if (strengthRate >= 80) return 'B';
|
||||||
else if (strengthRate >= 70) return 'C';
|
else if (strengthRate >= 70) return 'C';
|
||||||
else if (strengthRate >= 60) return 'D';
|
else if (strengthRate >= 60) return 'D';
|
||||||
|
else if (strengthRate >= 1) return '?';
|
||||||
else return 'E'; // For strengthRate less than 60
|
else return 'E'; // For strengthRate less than 60
|
||||||
}
|
}
|
||||||
|
|||||||
Binary file not shown.
|
After Width: | Height: | Size: 21 KiB |
@@ -383,27 +383,26 @@
|
|||||||
class="text-gray-400 hover:underline">Deselect All</button>
|
class="text-gray-400 hover:underline">Deselect All</button>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<div class="grid grid-cols-1 sm:grid-cols-2 md:grid-cols-3 gap-4">
|
<div class="grid grid-cols-1 sm:grid-cols-2 md:grid-cols-3 gap-4">
|
||||||
<div
|
<div
|
||||||
v-for="(package, index) in dataConfig"
|
v-for="(package, index) in dataConfig"
|
||||||
:key="index"
|
:key="index"
|
||||||
@click="addPackage(index)"
|
@click="package.is_active !== false && addPackage(index)"
|
||||||
class="border rounded-lg p-3 cursor-pointer transition-all hover:shadow-md overflow-hidden"
|
class="border rounded-lg p-3 cursor-pointer transition-all hover:shadow-md overflow-hidden"
|
||||||
:class="{
|
:class="{
|
||||||
'border-dark-accent-green bg-dark-accent-green bg-opacity-20': package.selected,
|
'border-dark-accent-green bg-dark-accent-green bg-opacity-20': package.selected,
|
||||||
'border-gray-600': !package.selected
|
'border-gray-600': !package.selected,
|
||||||
}">
|
'opacity-30 pointer-events-none cursor-not-allowed': package.is_active === false
|
||||||
<div class="font-medium mb-1 truncate">{{ package.dataset_name
|
}">
|
||||||
}}</div>
|
<div class="font-medium mb-1 truncate">{{ package.dataset_name }}</div>
|
||||||
<div class="text-sm text-gray-400 truncate">
|
<div class="text-sm text-gray-400 truncate">
|
||||||
{{ package.source || 'Local dataset' }}
|
{{ package.source || 'Local dataset' }}
|
||||||
</div>
|
|
||||||
<div class="mt-2 text-sm font-semibold">
|
|
||||||
{{ package.dynamic ? 'Dynamic dataset' :
|
|
||||||
`${package.num_prompts.toLocaleString()} prompts` }}
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
</div>
|
</div>
|
||||||
|
<div class="mt-2 text-sm font-semibold">
|
||||||
|
{{ package.dynamic ? 'Dynamic dataset' : `${package.num_prompts.toLocaleString()} prompts` }}
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
</div>
|
</div>
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
|
|||||||
@@ -350,6 +350,10 @@ var app = new Vue({
|
|||||||
|
|
||||||
// If all are selected, deselect all. Otherwise, select all.
|
// If all are selected, deselect all. Otherwise, select all.
|
||||||
this.dataConfig.forEach(package => {
|
this.dataConfig.forEach(package => {
|
||||||
|
if (!package.is_active) {
|
||||||
|
package.selected = false;
|
||||||
|
return
|
||||||
|
}
|
||||||
package.selected = !allSelected;
|
package.selected = !allSelected;
|
||||||
});
|
});
|
||||||
|
|
||||||
|
|||||||
@@ -50,7 +50,7 @@ def make_test_registry():
|
|||||||
]
|
]
|
||||||
|
|
||||||
|
|
||||||
class TestAS:
|
class TestLibraryLevel:
|
||||||
# Handles an empty dataset list.
|
# Handles an empty dataset list.
|
||||||
def test_class(self, test_server):
|
def test_class(self, test_server):
|
||||||
llmSpec = test_spec_assets.SAMPLE_SPEC
|
llmSpec = test_spec_assets.SAMPLE_SPEC
|
||||||
@@ -62,8 +62,8 @@ class TestAS:
|
|||||||
print(result)
|
print(result)
|
||||||
assert len(result) in [0, 1]
|
assert len(result) in [0, 1]
|
||||||
|
|
||||||
# TODO: slow test
|
@pytest.mark.slow
|
||||||
def _test_class_msj(self, test_server):
|
def test_class_msj(self, test_server):
|
||||||
llmSpec = test_spec_assets.SAMPLE_SPEC
|
llmSpec = test_spec_assets.SAMPLE_SPEC
|
||||||
maxBudget = 1000
|
maxBudget = 1000
|
||||||
max_th = 0.3
|
max_th = 0.3
|
||||||
@@ -98,6 +98,7 @@ class TestAS:
|
|||||||
print(result)
|
print(result)
|
||||||
assert len(result) in [0, 1]
|
assert len(result) in [0, 1]
|
||||||
|
|
||||||
|
@pytest.mark.slow
|
||||||
def test_backend(self, test_server):
|
def test_backend(self, test_server):
|
||||||
llmSpec = test_spec_assets.SAMPLE_SPEC
|
llmSpec = test_spec_assets.SAMPLE_SPEC
|
||||||
maxBudget = 1000000
|
maxBudget = 1000000
|
||||||
@@ -156,7 +157,7 @@ class TestAS:
|
|||||||
class TestEntrypointCI:
|
class TestEntrypointCI:
|
||||||
def test_generate_default_cfg_to_tmp_path(self):
|
def test_generate_default_cfg_to_tmp_path(self):
|
||||||
"""
|
"""
|
||||||
Test that the `generate_default_cfg` method generates a valid default config file in a temporary path.
|
Test that the `generate_default_settings` method generates a valid default config file in a temporary path.
|
||||||
"""
|
"""
|
||||||
# Create a temporary directory
|
# Create a temporary directory
|
||||||
with tempfile.TemporaryDirectory() as tmpdir:
|
with tempfile.TemporaryDirectory() as tmpdir:
|
||||||
@@ -167,7 +168,7 @@ class TestEntrypointCI:
|
|||||||
|
|
||||||
# Generate the default configuration
|
# Generate the default configuration
|
||||||
security = AgenticSecurity()
|
security = AgenticSecurity()
|
||||||
security.generate_default_cfg()
|
security.generate_default_settings()
|
||||||
|
|
||||||
# Check that the config file was created at the temporary path
|
# Check that the config file was created at the temporary path
|
||||||
assert os.path.exists(temp_path), f"{temp_path} file should be generated."
|
assert os.path.exists(temp_path), f"{temp_path} file should be generated."
|
||||||
@@ -192,7 +193,7 @@ class TestEntrypointCI:
|
|||||||
|
|
||||||
# Generate the default configuration
|
# Generate the default configuration
|
||||||
security = AgenticSecurity()
|
security = AgenticSecurity()
|
||||||
security.generate_default_cfg()
|
security.generate_default_settings()
|
||||||
|
|
||||||
# Load the generated configuration
|
# Load the generated configuration
|
||||||
AgenticSecurity.load_config(temp_path)
|
AgenticSecurity.load_config(temp_path)
|
||||||
|
|||||||
@@ -0,0 +1,2 @@
|
|||||||
|
from:python-pytest-poetry
|
||||||
|
# This file was generated automatically by CodeBeaver based on your repository. Learn how to customize it here: https://docs.codebeaver.ai/configuration/
|
||||||
Generated
+109
-10
@@ -1,4 +1,4 @@
|
|||||||
# This file is automatically @generated by Poetry 1.7.1 and should not be changed by hand.
|
# This file is automatically @generated by Poetry 1.8.5 and should not be changed by hand.
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "aiohappyeyeballs"
|
name = "aiohappyeyeballs"
|
||||||
@@ -784,18 +784,18 @@ tests = ["asttokens (>=2.1.0)", "coverage", "coverage-enable-subprocess", "ipyth
|
|||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "fastapi"
|
name = "fastapi"
|
||||||
version = "0.115.8"
|
version = "0.115.11"
|
||||||
description = "FastAPI framework, high performance, easy to learn, fast to code, ready for production"
|
description = "FastAPI framework, high performance, easy to learn, fast to code, ready for production"
|
||||||
optional = false
|
optional = false
|
||||||
python-versions = ">=3.8"
|
python-versions = ">=3.8"
|
||||||
files = [
|
files = [
|
||||||
{file = "fastapi-0.115.8-py3-none-any.whl", hash = "sha256:753a96dd7e036b34eeef8babdfcfe3f28ff79648f86551eb36bfc1b0bf4a8cbf"},
|
{file = "fastapi-0.115.11-py3-none-any.whl", hash = "sha256:32e1541b7b74602e4ef4a0260ecaf3aadf9d4f19590bba3e1bf2ac4666aa2c64"},
|
||||||
{file = "fastapi-0.115.8.tar.gz", hash = "sha256:0ce9111231720190473e222cdf0f07f7206ad7e53ea02beb1d2dc36e2f0741e9"},
|
{file = "fastapi-0.115.11.tar.gz", hash = "sha256:cc81f03f688678b92600a65a5e618b93592c65005db37157147204d8924bf94f"},
|
||||||
]
|
]
|
||||||
|
|
||||||
[package.dependencies]
|
[package.dependencies]
|
||||||
pydantic = ">=1.7.4,<1.8 || >1.8,<1.8.1 || >1.8.1,<2.0.0 || >2.0.0,<2.0.1 || >2.0.1,<2.1.0 || >2.1.0,<3.0.0"
|
pydantic = ">=1.7.4,<1.8 || >1.8,<1.8.1 || >1.8.1,<2.0.0 || >2.0.0,<2.0.1 || >2.0.1,<2.1.0 || >2.1.0,<3.0.0"
|
||||||
starlette = ">=0.40.0,<0.46.0"
|
starlette = ">=0.40.0,<0.47.0"
|
||||||
typing-extensions = ">=4.8.0"
|
typing-extensions = ">=4.8.0"
|
||||||
|
|
||||||
[package.extras]
|
[package.extras]
|
||||||
@@ -1202,13 +1202,13 @@ files = [
|
|||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "inline-snapshot"
|
name = "inline-snapshot"
|
||||||
version = "0.20.1"
|
version = "0.20.3"
|
||||||
description = "golden master/snapshot/approval testing library which puts the values right into your source code"
|
description = "golden master/snapshot/approval testing library which puts the values right into your source code"
|
||||||
optional = false
|
optional = false
|
||||||
python-versions = ">=3.8"
|
python-versions = ">=3.8"
|
||||||
files = [
|
files = [
|
||||||
{file = "inline_snapshot-0.20.1-py3-none-any.whl", hash = "sha256:5b5c3fd037f340dff5adee1c2c58db9038325937a8190dedbba98e37b87c979a"},
|
{file = "inline_snapshot-0.20.3-py3-none-any.whl", hash = "sha256:1ea999fbf38dd11cc72d0e1a0b9303c63d496b77bdc406a394fe2424ae842f70"},
|
||||||
{file = "inline_snapshot-0.20.1.tar.gz", hash = "sha256:c56c871e59973500eca00610022eac19e79cd2c1b0b2d7a18abe14dde11a1431"},
|
{file = "inline_snapshot-0.20.3.tar.gz", hash = "sha256:7a353170b7e42aa89086c7ba790a973c9645523acf985532648dabd7ee2d71f2"},
|
||||||
]
|
]
|
||||||
|
|
||||||
[package.dependencies]
|
[package.dependencies]
|
||||||
@@ -1217,7 +1217,7 @@ executing = ">=2.2.0"
|
|||||||
rich = ">=13.7.1"
|
rich = ">=13.7.1"
|
||||||
|
|
||||||
[package.extras]
|
[package.extras]
|
||||||
black = ["black (>=23.3.0)", "click (>=8.1.4)"]
|
black = ["black (>=23.3.0)"]
|
||||||
dirty-equals = ["dirty-equals (>=0.9.0)"]
|
dirty-equals = ["dirty-equals (>=0.9.0)"]
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
@@ -2319,6 +2319,94 @@ files = [
|
|||||||
{file = "numpy-2.2.2.tar.gz", hash = "sha256:ed6906f61834d687738d25988ae117683705636936cc605be0bb208b23df4d8f"},
|
{file = "numpy-2.2.2.tar.gz", hash = "sha256:ed6906f61834d687738d25988ae117683705636936cc605be0bb208b23df4d8f"},
|
||||||
]
|
]
|
||||||
|
|
||||||
|
[[package]]
|
||||||
|
name = "orjson"
|
||||||
|
version = "3.10.15"
|
||||||
|
description = "Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy"
|
||||||
|
optional = false
|
||||||
|
python-versions = ">=3.8"
|
||||||
|
files = [
|
||||||
|
{file = "orjson-3.10.15-cp310-cp310-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:552c883d03ad185f720d0c09583ebde257e41b9521b74ff40e08b7dec4559c04"},
|
||||||
|
{file = "orjson-3.10.15-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:616e3e8d438d02e4854f70bfdc03a6bcdb697358dbaa6bcd19cbe24d24ece1f8"},
|
||||||
|
{file = "orjson-3.10.15-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:7c2c79fa308e6edb0ffab0a31fd75a7841bf2a79a20ef08a3c6e3b26814c8ca8"},
|
||||||
|
{file = "orjson-3.10.15-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:73cb85490aa6bf98abd20607ab5c8324c0acb48d6da7863a51be48505646c814"},
|
||||||
|
{file = "orjson-3.10.15-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:763dadac05e4e9d2bc14938a45a2d0560549561287d41c465d3c58aec818b164"},
|
||||||
|
{file = "orjson-3.10.15-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a330b9b4734f09a623f74a7490db713695e13b67c959713b78369f26b3dee6bf"},
|
||||||
|
{file = "orjson-3.10.15-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:a61a4622b7ff861f019974f73d8165be1bd9a0855e1cad18ee167acacabeb061"},
|
||||||
|
{file = "orjson-3.10.15-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:acd271247691574416b3228db667b84775c497b245fa275c6ab90dc1ffbbd2b3"},
|
||||||
|
{file = "orjson-3.10.15-cp310-cp310-musllinux_1_2_armv7l.whl", hash = "sha256:e4759b109c37f635aa5c5cc93a1b26927bfde24b254bcc0e1149a9fada253d2d"},
|
||||||
|
{file = "orjson-3.10.15-cp310-cp310-musllinux_1_2_i686.whl", hash = "sha256:9e992fd5cfb8b9f00bfad2fd7a05a4299db2bbe92e6440d9dd2fab27655b3182"},
|
||||||
|
{file = "orjson-3.10.15-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:f95fb363d79366af56c3f26b71df40b9a583b07bbaaf5b317407c4d58497852e"},
|
||||||
|
{file = "orjson-3.10.15-cp310-cp310-win32.whl", hash = "sha256:f9875f5fea7492da8ec2444839dcc439b0ef298978f311103d0b7dfd775898ab"},
|
||||||
|
{file = "orjson-3.10.15-cp310-cp310-win_amd64.whl", hash = "sha256:17085a6aa91e1cd70ca8533989a18b5433e15d29c574582f76f821737c8d5806"},
|
||||||
|
{file = "orjson-3.10.15-cp311-cp311-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:c4cc83960ab79a4031f3119cc4b1a1c627a3dc09df125b27c4201dff2af7eaa6"},
|
||||||
|
{file = "orjson-3.10.15-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ddbeef2481d895ab8be5185f2432c334d6dec1f5d1933a9c83014d188e102cef"},
|
||||||
|
{file = "orjson-3.10.15-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:9e590a0477b23ecd5b0ac865b1b907b01b3c5535f5e8a8f6ab0e503efb896334"},
|
||||||
|
{file = "orjson-3.10.15-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:a6be38bd103d2fd9bdfa31c2720b23b5d47c6796bcb1d1b598e3924441b4298d"},
|
||||||
|
{file = "orjson-3.10.15-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:ff4f6edb1578960ed628a3b998fa54d78d9bb3e2eb2cfc5c2a09732431c678d0"},
|
||||||
|
{file = "orjson-3.10.15-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b0482b21d0462eddd67e7fce10b89e0b6ac56570424662b685a0d6fccf581e13"},
|
||||||
|
{file = "orjson-3.10.15-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:bb5cc3527036ae3d98b65e37b7986a918955f85332c1ee07f9d3f82f3a6899b5"},
|
||||||
|
{file = "orjson-3.10.15-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:d569c1c462912acdd119ccbf719cf7102ea2c67dd03b99edcb1a3048651ac96b"},
|
||||||
|
{file = "orjson-3.10.15-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:1e6d33efab6b71d67f22bf2962895d3dc6f82a6273a965fab762e64fa90dc399"},
|
||||||
|
{file = "orjson-3.10.15-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:c33be3795e299f565681d69852ac8c1bc5c84863c0b0030b2b3468843be90388"},
|
||||||
|
{file = "orjson-3.10.15-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:eea80037b9fae5339b214f59308ef0589fc06dc870578b7cce6d71eb2096764c"},
|
||||||
|
{file = "orjson-3.10.15-cp311-cp311-win32.whl", hash = "sha256:d5ac11b659fd798228a7adba3e37c010e0152b78b1982897020a8e019a94882e"},
|
||||||
|
{file = "orjson-3.10.15-cp311-cp311-win_amd64.whl", hash = "sha256:cf45e0214c593660339ef63e875f32ddd5aa3b4adc15e662cdb80dc49e194f8e"},
|
||||||
|
{file = "orjson-3.10.15-cp312-cp312-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:9d11c0714fc85bfcf36ada1179400862da3288fc785c30e8297844c867d7505a"},
|
||||||
|
{file = "orjson-3.10.15-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:dba5a1e85d554e3897fa9fe6fbcff2ed32d55008973ec9a2b992bd9a65d2352d"},
|
||||||
|
{file = "orjson-3.10.15-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:7723ad949a0ea502df656948ddd8b392780a5beaa4c3b5f97e525191b102fff0"},
|
||||||
|
{file = "orjson-3.10.15-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:6fd9bc64421e9fe9bd88039e7ce8e58d4fead67ca88e3a4014b143cec7684fd4"},
|
||||||
|
{file = "orjson-3.10.15-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:dadba0e7b6594216c214ef7894c4bd5f08d7c0135f4dd0145600be4fbcc16767"},
|
||||||
|
{file = "orjson-3.10.15-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b48f59114fe318f33bbaee8ebeda696d8ccc94c9e90bc27dbe72153094e26f41"},
|
||||||
|
{file = "orjson-3.10.15-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:035fb83585e0f15e076759b6fedaf0abb460d1765b6a36f48018a52858443514"},
|
||||||
|
{file = "orjson-3.10.15-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:d13b7fe322d75bf84464b075eafd8e7dd9eae05649aa2a5354cfa32f43c59f17"},
|
||||||
|
{file = "orjson-3.10.15-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:7066b74f9f259849629e0d04db6609db4cf5b973248f455ba5d3bd58a4daaa5b"},
|
||||||
|
{file = "orjson-3.10.15-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:88dc3f65a026bd3175eb157fea994fca6ac7c4c8579fc5a86fc2114ad05705b7"},
|
||||||
|
{file = "orjson-3.10.15-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:b342567e5465bd99faa559507fe45e33fc76b9fb868a63f1642c6bc0735ad02a"},
|
||||||
|
{file = "orjson-3.10.15-cp312-cp312-win32.whl", hash = "sha256:0a4f27ea5617828e6b58922fdbec67b0aa4bb844e2d363b9244c47fa2180e665"},
|
||||||
|
{file = "orjson-3.10.15-cp312-cp312-win_amd64.whl", hash = "sha256:ef5b87e7aa9545ddadd2309efe6824bd3dd64ac101c15dae0f2f597911d46eaa"},
|
||||||
|
{file = "orjson-3.10.15-cp313-cp313-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:bae0e6ec2b7ba6895198cd981b7cca95d1487d0147c8ed751e5632ad16f031a6"},
|
||||||
|
{file = "orjson-3.10.15-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f93ce145b2db1252dd86af37d4165b6faa83072b46e3995ecc95d4b2301b725a"},
|
||||||
|
{file = "orjson-3.10.15-cp313-cp313-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:7c203f6f969210128af3acae0ef9ea6aab9782939f45f6fe02d05958fe761ef9"},
|
||||||
|
{file = "orjson-3.10.15-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:8918719572d662e18b8af66aef699d8c21072e54b6c82a3f8f6404c1f5ccd5e0"},
|
||||||
|
{file = "orjson-3.10.15-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:f71eae9651465dff70aa80db92586ad5b92df46a9373ee55252109bb6b703307"},
|
||||||
|
{file = "orjson-3.10.15-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e117eb299a35f2634e25ed120c37c641398826c2f5a3d3cc39f5993b96171b9e"},
|
||||||
|
{file = "orjson-3.10.15-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:13242f12d295e83c2955756a574ddd6741c81e5b99f2bef8ed8d53e47a01e4b7"},
|
||||||
|
{file = "orjson-3.10.15-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:7946922ada8f3e0b7b958cc3eb22cfcf6c0df83d1fe5521b4a100103e3fa84c8"},
|
||||||
|
{file = "orjson-3.10.15-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:b7155eb1623347f0f22c38c9abdd738b287e39b9982e1da227503387b81b34ca"},
|
||||||
|
{file = "orjson-3.10.15-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:208beedfa807c922da4e81061dafa9c8489c6328934ca2a562efa707e049e561"},
|
||||||
|
{file = "orjson-3.10.15-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:eca81f83b1b8c07449e1d6ff7074e82e3fd6777e588f1a6632127f286a968825"},
|
||||||
|
{file = "orjson-3.10.15-cp313-cp313-win32.whl", hash = "sha256:c03cd6eea1bd3b949d0d007c8d57049aa2b39bd49f58b4b2af571a5d3833d890"},
|
||||||
|
{file = "orjson-3.10.15-cp313-cp313-win_amd64.whl", hash = "sha256:fd56a26a04f6ba5fb2045b0acc487a63162a958ed837648c5781e1fe3316cfbf"},
|
||||||
|
{file = "orjson-3.10.15-cp38-cp38-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:5e8afd6200e12771467a1a44e5ad780614b86abb4b11862ec54861a82d677746"},
|
||||||
|
{file = "orjson-3.10.15-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:da9a18c500f19273e9e104cca8c1f0b40a6470bcccfc33afcc088045d0bf5ea6"},
|
||||||
|
{file = "orjson-3.10.15-cp38-cp38-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:bb00b7bfbdf5d34a13180e4805d76b4567025da19a197645ca746fc2fb536586"},
|
||||||
|
{file = "orjson-3.10.15-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:33aedc3d903378e257047fee506f11e0833146ca3e57a1a1fb0ddb789876c1e1"},
|
||||||
|
{file = "orjson-3.10.15-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:dd0099ae6aed5eb1fc84c9eb72b95505a3df4267e6962eb93cdd5af03be71c98"},
|
||||||
|
{file = "orjson-3.10.15-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7c864a80a2d467d7786274fce0e4f93ef2a7ca4ff31f7fc5634225aaa4e9e98c"},
|
||||||
|
{file = "orjson-3.10.15-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:c25774c9e88a3e0013d7d1a6c8056926b607a61edd423b50eb5c88fd7f2823ae"},
|
||||||
|
{file = "orjson-3.10.15-cp38-cp38-musllinux_1_2_aarch64.whl", hash = "sha256:e78c211d0074e783d824ce7bb85bf459f93a233eb67a5b5003498232ddfb0e8a"},
|
||||||
|
{file = "orjson-3.10.15-cp38-cp38-musllinux_1_2_armv7l.whl", hash = "sha256:43e17289ffdbbac8f39243916c893d2ae41a2ea1a9cbb060a56a4d75286351ae"},
|
||||||
|
{file = "orjson-3.10.15-cp38-cp38-musllinux_1_2_i686.whl", hash = "sha256:781d54657063f361e89714293c095f506c533582ee40a426cb6489c48a637b81"},
|
||||||
|
{file = "orjson-3.10.15-cp38-cp38-musllinux_1_2_x86_64.whl", hash = "sha256:6875210307d36c94873f553786a808af2788e362bd0cf4c8e66d976791e7b528"},
|
||||||
|
{file = "orjson-3.10.15-cp38-cp38-win32.whl", hash = "sha256:305b38b2b8f8083cc3d618927d7f424349afce5975b316d33075ef0f73576b60"},
|
||||||
|
{file = "orjson-3.10.15-cp38-cp38-win_amd64.whl", hash = "sha256:5dd9ef1639878cc3efffed349543cbf9372bdbd79f478615a1c633fe4e4180d1"},
|
||||||
|
{file = "orjson-3.10.15-cp39-cp39-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:ffe19f3e8d68111e8644d4f4e267a069ca427926855582ff01fc012496d19969"},
|
||||||
|
{file = "orjson-3.10.15-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d433bf32a363823863a96561a555227c18a522a8217a6f9400f00ddc70139ae2"},
|
||||||
|
{file = "orjson-3.10.15-cp39-cp39-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:da03392674f59a95d03fa5fb9fe3a160b0511ad84b7a3914699ea5a1b3a38da2"},
|
||||||
|
{file = "orjson-3.10.15-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:3a63bb41559b05360ded9132032239e47983a39b151af1201f07ec9370715c82"},
|
||||||
|
{file = "orjson-3.10.15-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:3766ac4702f8f795ff3fa067968e806b4344af257011858cc3d6d8721588b53f"},
|
||||||
|
{file = "orjson-3.10.15-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7a1c73dcc8fadbd7c55802d9aa093b36878d34a3b3222c41052ce6b0fc65f8e8"},
|
||||||
|
{file = "orjson-3.10.15-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:b299383825eafe642cbab34be762ccff9fd3408d72726a6b2a4506d410a71ab3"},
|
||||||
|
{file = "orjson-3.10.15-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:abc7abecdbf67a173ef1316036ebbf54ce400ef2300b4e26a7b843bd446c2480"},
|
||||||
|
{file = "orjson-3.10.15-cp39-cp39-musllinux_1_2_armv7l.whl", hash = "sha256:3614ea508d522a621384c1d6639016a5a2e4f027f3e4a1c93a51867615d28829"},
|
||||||
|
{file = "orjson-3.10.15-cp39-cp39-musllinux_1_2_i686.whl", hash = "sha256:295c70f9dc154307777ba30fe29ff15c1bcc9dfc5c48632f37d20a607e9ba85a"},
|
||||||
|
{file = "orjson-3.10.15-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:63309e3ff924c62404923c80b9e2048c1f74ba4b615e7584584389ada50ed428"},
|
||||||
|
{file = "orjson-3.10.15-cp39-cp39-win32.whl", hash = "sha256:a2f708c62d026fb5340788ba94a55c23df4e1869fec74be455e0b2f5363b8507"},
|
||||||
|
{file = "orjson-3.10.15-cp39-cp39-win_amd64.whl", hash = "sha256:efcf6c735c3d22ef60c4aa27a5238f1a477df85e9b15f2142f9d669beb2d13fd"},
|
||||||
|
{file = "orjson-3.10.15.tar.gz", hash = "sha256:05ca7fe452a2e9d8d9d706a2984c95b9c2ebc5db417ce0b7a49b91d50642a23e"},
|
||||||
|
]
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "packaging"
|
name = "packaging"
|
||||||
version = "24.1"
|
version = "24.1"
|
||||||
@@ -3014,6 +3102,17 @@ files = [
|
|||||||
[package.dependencies]
|
[package.dependencies]
|
||||||
typing-extensions = ">=4.6.0,<4.7.0 || >4.7.0"
|
typing-extensions = ">=4.6.0,<4.7.0 || >4.7.0"
|
||||||
|
|
||||||
|
[[package]]
|
||||||
|
name = "pyfiglet"
|
||||||
|
version = "1.0.2"
|
||||||
|
description = "Pure-python FIGlet implementation"
|
||||||
|
optional = false
|
||||||
|
python-versions = ">=3.9"
|
||||||
|
files = [
|
||||||
|
{file = "pyfiglet-1.0.2-py3-none-any.whl", hash = "sha256:889b351d79c99e50a3f619c8f8e6ffdb27fd8c939fc43ecbd7559bd57d5f93ea"},
|
||||||
|
{file = "pyfiglet-1.0.2.tar.gz", hash = "sha256:758788018ab8faaddc0984e1ea05ff330d3c64be663c513cc1f105f6a3066dab"},
|
||||||
|
]
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "pygments"
|
name = "pygments"
|
||||||
version = "2.18.0"
|
version = "2.18.0"
|
||||||
@@ -4439,4 +4538,4 @@ propcache = ">=0.2.0"
|
|||||||
[metadata]
|
[metadata]
|
||||||
lock-version = "2.0"
|
lock-version = "2.0"
|
||||||
python-versions = "^3.11"
|
python-versions = "^3.11"
|
||||||
content-hash = "a741ff960d86175204b90cdb4f935d3873a6a38d2d547c1ded73c17ab54b4312"
|
content-hash = "28a2b74bfafa9f93d14d2f8d1fcaffa340db212acce6469d6714d342203ad77f"
|
||||||
|
|||||||
@@ -49,6 +49,10 @@ tomli = "^2.2.1"
|
|||||||
rich = "13.9.4"
|
rich = "13.9.4"
|
||||||
gTTS = "^2.5.4"
|
gTTS = "^2.5.4"
|
||||||
sentry_sdk = "^2.22.0"
|
sentry_sdk = "^2.22.0"
|
||||||
|
orjson = "^3.10"
|
||||||
|
pyfiglet = "^1.0.2"
|
||||||
|
termcolor = "^2.4.0"
|
||||||
|
|
||||||
# garak = { version = "*", optional = true }
|
# garak = { version = "*", optional = true }
|
||||||
|
|
||||||
|
|
||||||
@@ -82,5 +86,7 @@ build-backend = "poetry.core.masonry.api"
|
|||||||
|
|
||||||
|
|
||||||
[tool.pytest.ini_options]
|
[tool.pytest.ini_options]
|
||||||
|
addopts = "--durations=5 -m 'not slow'"
|
||||||
asyncio_mode = "auto"
|
asyncio_mode = "auto"
|
||||||
asyncio_default_fixture_loop_scope = "function"
|
asyncio_default_fixture_loop_scope = "function"
|
||||||
|
markers = "slow: marks tests as slow"
|
||||||
|
|||||||
@@ -0,0 +1,8 @@
|
|||||||
|
import os
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
|
||||||
|
|
||||||
|
def pytest_runtest_setup(item):
|
||||||
|
if "slow" in item.keywords and not os.getenv("RUN_SLOW_TESTS"):
|
||||||
|
pytest.skip("Skipping slow test")
|
||||||
@@ -0,0 +1,161 @@
|
|||||||
|
import pytest
|
||||||
|
import asyncio
|
||||||
|
from fastapi import FastAPI
|
||||||
|
from asyncio import Queue, Event
|
||||||
|
from agentic_security.core.app import create_app, get_tools_inbox, get_stop_event, get_current_run, set_current_run
|
||||||
|
|
||||||
|
class TestApp:
|
||||||
|
"""Test suite for agentic_security.core.app module."""
|
||||||
|
|
||||||
|
def test_create_app(self):
|
||||||
|
"""Test that create_app returns a FastAPI instance."""
|
||||||
|
app = create_app()
|
||||||
|
assert isinstance(app, FastAPI)
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_get_tools_inbox(self):
|
||||||
|
"""Test that get_tools_inbox returns the global Queue instance."""
|
||||||
|
queue1 = get_tools_inbox()
|
||||||
|
await queue1.put("test item")
|
||||||
|
queue2 = get_tools_inbox()
|
||||||
|
result = queue2.get_nowait()
|
||||||
|
assert result == "test item"
|
||||||
|
|
||||||
|
def test_get_stop_event(self):
|
||||||
|
"""Test that get_stop_event returns the global Event instance and is not set initially."""
|
||||||
|
event = get_stop_event()
|
||||||
|
assert isinstance(event, Event)
|
||||||
|
assert not event.is_set()
|
||||||
|
|
||||||
|
def test_current_run_initial(self):
|
||||||
|
"""Test that get_current_run returns the global current_run with default values initially."""
|
||||||
|
run = get_current_run()
|
||||||
|
# Default values should be empty strings
|
||||||
|
assert run["spec"] == ""
|
||||||
|
assert run["id"] == ""
|
||||||
|
|
||||||
|
def test_set_current_run(self):
|
||||||
|
"""Test that set_current_run correctly updates current_run."""
|
||||||
|
spec = "test run"
|
||||||
|
result = set_current_run(spec)
|
||||||
|
expected_id = hash(id(spec))
|
||||||
|
# Verify that spec is set correctly
|
||||||
|
assert result["spec"] == spec
|
||||||
|
assert result["id"] == expected_id
|
||||||
|
|
||||||
|
def test_current_run_after_set(self):
|
||||||
|
"""Test that get_current_run returns the updated current_run after set_current_run is called."""
|
||||||
|
spec = "another test run"
|
||||||
|
set_current_run(spec)
|
||||||
|
current = get_current_run()
|
||||||
|
assert current["spec"] == spec
|
||||||
|
assert current["id"] == hash(id(spec))
|
||||||
|
def test_tools_inbox_same_instance(self):
|
||||||
|
"""Test that get_tools_inbox returns the same Queue instance by default."""
|
||||||
|
queue1 = get_tools_inbox()
|
||||||
|
queue2 = get_tools_inbox()
|
||||||
|
assert queue1 is queue2
|
||||||
|
|
||||||
|
def test_stop_event_set(self):
|
||||||
|
"""Test that setting the stop event is reflected in subsequent calls."""
|
||||||
|
event = get_stop_event()
|
||||||
|
event.set() # set the global event
|
||||||
|
# Now, subsequent calls should return the same event which is set.
|
||||||
|
event2 = get_stop_event()
|
||||||
|
assert event2.is_set()
|
||||||
|
|
||||||
|
def test_set_current_run_with_none(self):
|
||||||
|
"""Test that set_current_run handles None as a valid input and updates current_run accordingly."""
|
||||||
|
result = set_current_run(None)
|
||||||
|
expected_id = hash(id(None))
|
||||||
|
assert result["spec"] is None
|
||||||
|
assert result["id"] == expected_id
|
||||||
|
|
||||||
|
def test_multiple_current_run_assignments(self):
|
||||||
|
"""Test multiple assignments to current_run to ensure it always updates correctly."""
|
||||||
|
first_spec = "first run"
|
||||||
|
result1 = set_current_run(first_spec)
|
||||||
|
expected_id1 = hash(id(first_spec))
|
||||||
|
assert result1["spec"] == first_spec
|
||||||
|
assert result1["id"] == expected_id1
|
||||||
|
|
||||||
|
second_spec = "second run"
|
||||||
|
result2 = set_current_run(second_spec)
|
||||||
|
expected_id2 = hash(id(second_spec))
|
||||||
|
assert result2["spec"] == second_spec
|
||||||
|
assert result2["id"] == expected_id2
|
||||||
|
|
||||||
|
current = get_current_run()
|
||||||
|
# The current_run should reflect the latest assignment.
|
||||||
|
assert current["spec"] == second_spec
|
||||||
|
assert current["id"] == expected_id2
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_empty_tools_inbox_exception(self):
|
||||||
|
"""Test that calling get_nowait on an empty tools_inbox raises QueueEmpty."""
|
||||||
|
from asyncio import QueueEmpty
|
||||||
|
queue = get_tools_inbox()
|
||||||
|
# Clear any existing items in the queue
|
||||||
|
while True:
|
||||||
|
try:
|
||||||
|
queue.get_nowait()
|
||||||
|
except QueueEmpty:
|
||||||
|
break
|
||||||
|
with pytest.raises(QueueEmpty):
|
||||||
|
queue.get_nowait()
|
||||||
|
|
||||||
|
def test_set_current_run_with_dict(self):
|
||||||
|
"""Test that set_current_run correctly handles a dictionary input as spec."""
|
||||||
|
spec = {"key": "value"}
|
||||||
|
result = set_current_run(spec)
|
||||||
|
expected_id = hash(id(spec))
|
||||||
|
assert result["spec"] == spec
|
||||||
|
assert result["id"] == expected_id
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_stop_event_wait(self):
|
||||||
|
"""Test that waiting on the stop event returns once the event is set."""
|
||||||
|
event = get_stop_event()
|
||||||
|
event.clear() # ensure event is not set
|
||||||
|
async def waiter():
|
||||||
|
await event.wait()
|
||||||
|
return True
|
||||||
|
waiter_task = asyncio.create_task(waiter())
|
||||||
|
# Wait a moment to ensure the waiter is pending
|
||||||
|
await asyncio.sleep(0.1)
|
||||||
|
assert not waiter_task.done()
|
||||||
|
event.set()
|
||||||
|
result = await waiter_task
|
||||||
|
assert result is True
|
||||||
|
|
||||||
|
def test_set_current_run_with_int(self):
|
||||||
|
"""Test that set_current_run handles an integer input as spec."""
|
||||||
|
spec = 12345
|
||||||
|
result = set_current_run(spec)
|
||||||
|
expected_id = hash(id(spec))
|
||||||
|
assert result["spec"] == spec
|
||||||
|
assert result["id"] == expected_id
|
||||||
|
|
||||||
|
def test_create_app_routes(self):
|
||||||
|
"""Test that create_app returns a FastAPI instance with default routes available."""
|
||||||
|
app = create_app()
|
||||||
|
paths = [route.path for route in app.routes]
|
||||||
|
# Check that the default OpenAPI route exists
|
||||||
|
assert "/openapi.json" in paths
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_tools_inbox_async_put_get_order(self):
|
||||||
|
"""Test that tools_inbox preserves order when items are added and retrieved asynchronously."""
|
||||||
|
queue = get_tools_inbox()
|
||||||
|
# Clear any existing items in the queue
|
||||||
|
from asyncio import QueueEmpty
|
||||||
|
while True:
|
||||||
|
try:
|
||||||
|
queue.get_nowait()
|
||||||
|
except QueueEmpty:
|
||||||
|
break
|
||||||
|
items = ["first", "second", "third"]
|
||||||
|
for item in items:
|
||||||
|
await queue.put(item)
|
||||||
|
result_items = []
|
||||||
|
for _ in items:
|
||||||
|
result_items.append(await queue.get())
|
||||||
|
assert result_items == items
|
||||||
@@ -0,0 +1,341 @@
|
|||||||
|
import pytest
|
||||||
|
import base64
|
||||||
|
import httpx
|
||||||
|
import asyncio
|
||||||
|
from agentic_security.http_spec import (
|
||||||
|
LLMSpec,
|
||||||
|
parse_http_spec,
|
||||||
|
escape_special_chars_for_json,
|
||||||
|
encode_image_base64_by_url,
|
||||||
|
encode_audio_base64_by_url,
|
||||||
|
InvalidHTTPSpecError,
|
||||||
|
Modality
|
||||||
|
)
|
||||||
|
|
||||||
|
################################################################################
|
||||||
|
# Tests for agentic_security/http_spec.py
|
||||||
|
################################################################################
|
||||||
|
|
||||||
|
def test_escape_special_chars_for_json():
|
||||||
|
"""Test escaping special characters in a prompt for JSON safety."""
|
||||||
|
prompt = 'Line1\nLine2\t"Quote"\\Backslash'
|
||||||
|
escaped = escape_special_chars_for_json(prompt)
|
||||||
|
assert '\\n' in escaped
|
||||||
|
assert '\\t' in escaped
|
||||||
|
assert '\\"' in escaped
|
||||||
|
assert '\\\\' in escaped
|
||||||
|
|
||||||
|
def test_parse_http_spec_text():
|
||||||
|
"""Test parsing a text HTTP spec without image/audio/files requirements."""
|
||||||
|
spec = "POST http://example.com/api\nContent-Type: application/json\n\nThis is a prompt: <<PROMPT>>"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
assert llm_spec.method == "POST"
|
||||||
|
assert llm_spec.url == "http://example.com/api"
|
||||||
|
assert llm_spec.headers["Content-Type"] == "application/json"
|
||||||
|
assert "<<PROMPT>>" in llm_spec.body
|
||||||
|
assert not llm_spec.has_files
|
||||||
|
assert not llm_spec.has_image
|
||||||
|
assert not llm_spec.has_audio
|
||||||
|
|
||||||
|
def test_parse_http_spec_files():
|
||||||
|
"""Test parsing a HTTP spec with multipart/form-data header indicating files."""
|
||||||
|
spec = "PUT http://example.com/upload\nContent-Type: multipart/form-data\n\nFile upload test"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
assert llm_spec.has_files
|
||||||
|
|
||||||
|
def test_parse_http_spec_image_audio():
|
||||||
|
"""Test parsing a HTTP spec that requires image and audio via placeholders."""
|
||||||
|
spec = "GET http://example.com/api\nContent-Type: application/json\n\nImage: <<BASE64_IMAGE>> and Audio: <<BASE64_AUDIO>>"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
assert llm_spec.has_image
|
||||||
|
assert llm_spec.has_audio
|
||||||
|
|
||||||
|
def test_encode_image_base64_by_url(monkeypatch):
|
||||||
|
"""Test that image encoding returns the correct base64 string with prefix."""
|
||||||
|
dummy_content = b'test_image'
|
||||||
|
class DummyResponse:
|
||||||
|
def __init__(self, content):
|
||||||
|
self.content = content
|
||||||
|
|
||||||
|
def dummy_get(url):
|
||||||
|
return DummyResponse(dummy_content)
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx, "get", dummy_get)
|
||||||
|
result = encode_image_base64_by_url("http://dummyurl.com/image.jpg")
|
||||||
|
expected = "data:image/jpeg;base64," + base64.b64encode(dummy_content).decode("utf-8")
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
def test_encode_audio_base64_by_url(monkeypatch):
|
||||||
|
"""Test that audio encoding returns the correct base64 string with prefix."""
|
||||||
|
dummy_content = b'test_audio'
|
||||||
|
class DummyResponse:
|
||||||
|
def __init__(self, content):
|
||||||
|
self.content = content
|
||||||
|
|
||||||
|
def dummy_get(url):
|
||||||
|
return DummyResponse(dummy_content)
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx, "get", dummy_get)
|
||||||
|
result = encode_audio_base64_by_url("http://dummyurl.com/audio.mp3")
|
||||||
|
expected = "data:audio/mpeg;base64," + base64.b64encode(dummy_content).decode("utf-8")
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_probe_text(monkeypatch):
|
||||||
|
"""Test the probe function for text modality by replacing <<PROMPT>>."""
|
||||||
|
spec = "POST http://example.com/api\nContent-Type: application/json\n\n{\"prompt\": \"<<PROMPT>>\"}"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
|
||||||
|
async def dummy_request(self, method, url, headers, content, timeout):
|
||||||
|
return httpx.Response(200, text="ok")
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx.AsyncClient, "request", dummy_request)
|
||||||
|
response = await llm_spec.probe("Hello")
|
||||||
|
assert response.status_code == 200
|
||||||
|
assert "ok" in response.text
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_probe_with_files(monkeypatch):
|
||||||
|
"""Test that probe correctly branches to _probe_with_files when files are provided."""
|
||||||
|
spec = "POST http://example.com/api\nContent-Type: multipart/form-data\n\nFile data"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
files = {"file": ("dummy.txt", b"data")}
|
||||||
|
|
||||||
|
async def dummy_request(self, method, url, headers, files, timeout):
|
||||||
|
return httpx.Response(200, text="file upload ok")
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx.AsyncClient, "request", dummy_request)
|
||||||
|
response = await llm_spec.probe("Unused", files=files)
|
||||||
|
assert response.status_code == 200
|
||||||
|
assert "file upload ok" in response.text
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_verify_image(monkeypatch):
|
||||||
|
"""Test verify method branch for image modality by monkeypatching image encoder."""
|
||||||
|
spec = "POST http://example.com/api\nContent-Type: application/json\n\n{\"image\": \"<<BASE64_IMAGE>>\"}"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
|
||||||
|
# Replace the image encoder to return a dummy string
|
||||||
|
monkeypatch.setattr("agentic_security.http_spec.encode_image_base64_by_url", lambda url="": "dummy_image")
|
||||||
|
|
||||||
|
async def dummy_request(self, method, url, headers, content, timeout):
|
||||||
|
# Check that the dummy image is injected in the content
|
||||||
|
assert "dummy_image" in content
|
||||||
|
return httpx.Response(200, text="image ok")
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx.AsyncClient, "request", dummy_request)
|
||||||
|
response = await llm_spec.verify()
|
||||||
|
assert response.status_code == 200
|
||||||
|
assert "image ok" in response.text
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_verify_audio(monkeypatch):
|
||||||
|
"""Test verify method branch for audio modality by monkeypatching audio encoder."""
|
||||||
|
spec = "POST http://example.com/api\nContent-Type: application/json\n\n{\"audio\": \"<<BASE64_AUDIO>>\"}"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
|
||||||
|
monkeypatch.setattr("agentic_security.http_spec.encode_audio_base64_by_url", lambda url: "dummy_audio")
|
||||||
|
|
||||||
|
async def dummy_request(self, method, url, headers, content, timeout):
|
||||||
|
# Ensure that the dummy audio string is present in the request content
|
||||||
|
assert "dummy_audio" in content
|
||||||
|
return httpx.Response(200, text="audio ok")
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx.AsyncClient, "request", dummy_request)
|
||||||
|
response = await llm_spec.verify()
|
||||||
|
assert response.status_code == 200
|
||||||
|
assert "audio ok" in response.text
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_verify_files(monkeypatch):
|
||||||
|
"""Test verify method branch for files modality where _probe_with_files is invoked."""
|
||||||
|
spec = "POST http://example.com/api\nContent-Type: multipart/form-data\n\nFile data"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
|
||||||
|
async def dummy_request(self, method, url, headers, files, timeout):
|
||||||
|
return httpx.Response(200, text="files ok")
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx.AsyncClient, "request", dummy_request)
|
||||||
|
response = await llm_spec.verify()
|
||||||
|
assert response.status_code == 200
|
||||||
|
assert "files ok" in response.text
|
||||||
|
|
||||||
|
def test_llm_spec_modality_property():
|
||||||
|
"""Test that the modality property reflects the correct modality."""
|
||||||
|
spec_text = "POST http://example.com/api\nContent-Type: application/json\n\nPrompt: <<PROMPT>>"
|
||||||
|
llm_spec_text = parse_http_spec(spec_text)
|
||||||
|
assert llm_spec_text.modality == Modality.TEXT
|
||||||
|
|
||||||
|
spec_image = "POST http://example.com/api\nContent-Type: application/json\n\nImage: <<BASE64_IMAGE>>"
|
||||||
|
llm_spec_image = parse_http_spec(spec_image)
|
||||||
|
assert llm_spec_image.modality == Modality.IMAGE
|
||||||
|
|
||||||
|
spec_audio = "POST http://example.com/api\nContent-Type: application/json\n\nAudio: <<BASE64_AUDIO>>"
|
||||||
|
llm_spec_audio = parse_http_spec(spec_audio)
|
||||||
|
assert llm_spec_audio.modality == Modality.AUDIO
|
||||||
|
|
||||||
|
def test_from_string_invalid():
|
||||||
|
"""Test that LLMSpec.from_string raises an error for an invalid spec."""
|
||||||
|
invalid_spec = "INVALID_SPEC"
|
||||||
|
with pytest.raises(InvalidHTTPSpecError):
|
||||||
|
LLMSpec.from_string(invalid_spec)
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_validate_missing_files():
|
||||||
|
"""Test that LLMSpec.validate raises a ValueError when files are required but missing."""
|
||||||
|
spec = "POST http://example.com/api\nContent-Type: multipart/form-data\n\nFile upload test"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
with pytest.raises(ValueError, match="Files are required"):
|
||||||
|
llm_spec.validate("test prompt", "", "", {})
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_validate_missing_image():
|
||||||
|
"""Test that LLMSpec.validate raises a ValueError when an image is required but missing."""
|
||||||
|
spec = "POST http://example.com/api\nContent-Type: application/json\n\nImage: <<BASE64_IMAGE>>"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
with pytest.raises(ValueError, match="An image is required"):
|
||||||
|
llm_spec.validate("test prompt", "", "dummy_audio", {})
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_validate_missing_audio():
|
||||||
|
"""Test that LLMSpec.validate raises a ValueError when audio is required but missing."""
|
||||||
|
spec = "POST http://example.com/api\nContent-Type: application/json\n\nAudio: <<BASE64_AUDIO>>"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
with pytest.raises(ValueError, match="Audio is required"):
|
||||||
|
llm_spec.validate("test prompt", "dummy_image", "", {})
|
||||||
|
|
||||||
|
def test_fn_alias(monkeypatch):
|
||||||
|
"""Test that LLMSpec.fn is a functional alias for LLMSpec.probe."""
|
||||||
|
spec = "POST http://example.com/api\nContent-Type: application/json\n\n{\"prompt\": \"<<PROMPT>>\"}"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
|
||||||
|
# Instead of overriding the instance method, verify the alias at the class level.
|
||||||
|
assert LLMSpec.fn is LLMSpec.probe
|
||||||
|
|
||||||
|
def test_escape_special_chars_no_special():
|
||||||
|
"""Test that the escape function returns the original string if no special characters are present."""
|
||||||
|
prompt = "Simple text without specials"
|
||||||
|
escaped = escape_special_chars_for_json(prompt)
|
||||||
|
assert escaped == "Simple text without specials"
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_probe_text_with_special_chars(monkeypatch):
|
||||||
|
"""Test probe for text modality with special characters in prompt ensuring escaped content."""
|
||||||
|
spec = "POST http://example.com/api\nContent-Type: application/json\n\n{\"prompt\": \"<<PROMPT>>\"}"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
captured = {}
|
||||||
|
|
||||||
|
async def dummy_request(self, method, url, headers, content, timeout):
|
||||||
|
captured['content'] = content
|
||||||
|
return httpx.Response(200, text="ok")
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx.AsyncClient, "request", dummy_request)
|
||||||
|
test_prompt = 'Hello\nWorld\t"Test"'
|
||||||
|
response = await llm_spec.probe(test_prompt)
|
||||||
|
expected_escaped = escape_special_chars_for_json(test_prompt)
|
||||||
|
assert expected_escaped in captured['content']
|
||||||
|
assert response.status_code == 200
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_verify_both_image_audio(monkeypatch):
|
||||||
|
"""Test verify method when both image and audio placeholders are present.
|
||||||
|
Expect a ValueError because only the image branch is triggered by pattern matching and the missing audio causes validation to fail."""
|
||||||
|
spec = ("POST http://example.com/api\nContent-Type: application/json\n\n"
|
||||||
|
"{\"audio\": \"<<BASE64_AUDIO>>\", \"image\":\"<<BASE64_IMAGE>>\"}")
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
# Monkey patch the image encoder to return a dummy value
|
||||||
|
monkeypatch.setattr("agentic_security.http_spec.encode_image_base64_by_url", lambda url="": "dummy_image")
|
||||||
|
with pytest.raises(ValueError, match="Audio is required"):
|
||||||
|
await llm_spec.verify()
|
||||||
|
|
||||||
|
def test_parse_http_spec_invalid_header_format():
|
||||||
|
"""Test that parse_http_spec raises an error when a header line doesn't have the expected 'key: value' format."""
|
||||||
|
invalid_spec = "GET http://example.com/api\nInvalidHeaderWithoutColon\n\nBody with <<PROMPT>>"
|
||||||
|
with pytest.raises(ValueError):
|
||||||
|
parse_http_spec(invalid_spec)
|
||||||
|
|
||||||
|
def test_from_string_valid():
|
||||||
|
"""Test that LLMSpec.from_string returns a valid LLMSpec object when given a proper spec string."""
|
||||||
|
spec = "GET http://example.com/api\nContent-Type: application/json\n\n{ \"prompt\": \"<<PROMPT>>\" }"
|
||||||
|
llm_spec = LLMSpec.from_string(spec)
|
||||||
|
assert llm_spec.method == "GET"
|
||||||
|
assert llm_spec.url == "http://example.com/api"
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_parse_http_spec_multiline_body():
|
||||||
|
"""Test parsing an HTTP spec with a multiline body to ensure body concatenation works."""
|
||||||
|
spec = (
|
||||||
|
"PATCH http://example.com/api\n"
|
||||||
|
"Content-Type: application/json\n"
|
||||||
|
"\n"
|
||||||
|
"Line one of body\n"
|
||||||
|
"Line two of body\n"
|
||||||
|
"Line three"
|
||||||
|
)
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
# As implemented, the parser concatenates lines without newline delimiters
|
||||||
|
expected_body = "Line one of bodyLine two of bodyLine three"
|
||||||
|
assert llm_spec.body == expected_body
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_encode_image_default_argument(monkeypatch):
|
||||||
|
"""Test that encode_image_base64_by_url works with its default URL argument."""
|
||||||
|
dummy_content = b'default_image'
|
||||||
|
class DummyResponse:
|
||||||
|
def __init__(self, content):
|
||||||
|
self.content = content
|
||||||
|
|
||||||
|
def dummy_get(url):
|
||||||
|
# check that the default URL (which includes 'fluidicon.png') is used
|
||||||
|
assert "fluidicon.png" in url
|
||||||
|
return DummyResponse(dummy_content)
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx, "get", dummy_get)
|
||||||
|
result = encode_image_base64_by_url()
|
||||||
|
expected = "data:image/jpeg;base64," + base64.b64encode(dummy_content).decode("utf-8")
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_probe_without_prompt_placeholder(monkeypatch):
|
||||||
|
"""Test the probe function when the request body does not include the <<PROMPT>> placeholder."""
|
||||||
|
spec = "POST http://example.com/api\nContent-Type: application/json\n\n{\"message\": \"No placeholder here\"}"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
|
||||||
|
captured = {}
|
||||||
|
|
||||||
|
async def dummy_request(self, method, url, headers, content, timeout):
|
||||||
|
captured['content'] = content
|
||||||
|
return httpx.Response(200, text="ok without placeholder")
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx.AsyncClient, "request", dummy_request)
|
||||||
|
response = await llm_spec.probe("Ignored prompt")
|
||||||
|
assert "No placeholder here" in captured['content']
|
||||||
|
assert response.status_code == 200
|
||||||
|
|
||||||
|
def test_validate_success():
|
||||||
|
"""Test that LLMSpec.validate does not raise an error when all required data is provided."""
|
||||||
|
# Test case for files: files are provided as required
|
||||||
|
spec_files = "POST http://example.com/api\nContent-Type: multipart/form-data\n\nFile upload"
|
||||||
|
llm_spec_files = parse_http_spec(spec_files)
|
||||||
|
llm_spec_files.validate("some prompt", "dummy_image", "dummy_audio", {"file": ("dummy.txt", b"data")})
|
||||||
|
|
||||||
|
# Test case for image: image is provided as required
|
||||||
|
spec_image = "POST http://example.com/api\nContent-Type: application/json\n\nImage: <<BASE64_IMAGE>>"
|
||||||
|
llm_spec_image = parse_http_spec(spec_image)
|
||||||
|
llm_spec_image.validate("some prompt", "dummy_image", "dummy_audio", {})
|
||||||
|
|
||||||
|
# Test case for audio: audio is provided as required
|
||||||
|
spec_audio = "POST http://example.com/api\nContent-Type: application/json\n\nAudio: <<BASE64_AUDIO>>"
|
||||||
|
llm_spec_audio = parse_http_spec(spec_audio)
|
||||||
|
llm_spec_audio.validate("some prompt", "dummy_image", "dummy_audio", {})
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_probe_invalid_url(monkeypatch):
|
||||||
|
"""Test that probe raises an exception when the HTTP client fails due to an invalid URL."""
|
||||||
|
spec = "GET http://nonexistent_url/api\nContent-Type: application/json\n\n{\"prompt\": \"<<PROMPT>>\"}"
|
||||||
|
llm_spec = parse_http_spec(spec)
|
||||||
|
|
||||||
|
async def dummy_request(self, method, url, headers, content, timeout):
|
||||||
|
raise httpx.RequestError("Invalid URL")
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx.AsyncClient, "request", dummy_request)
|
||||||
|
with pytest.raises(httpx.RequestError):
|
||||||
|
await llm_spec.probe("Test")
|
||||||
@@ -0,0 +1,26 @@
|
|||||||
|
import pytest
|
||||||
|
from datasets import load_dataset
|
||||||
|
|
||||||
|
from agentic_security.probe_data import REGISTRY
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.slow
|
||||||
|
@pytest.mark.parametrize("dataset", REGISTRY)
|
||||||
|
def test_registry_accessibility(dataset):
|
||||||
|
source = dataset.get("source", "")
|
||||||
|
if "hugging" not in source.lower():
|
||||||
|
return pytest.skip("skipped dataset")
|
||||||
|
if not dataset.get("is_active"):
|
||||||
|
return pytest.skip("skipped dataset")
|
||||||
|
|
||||||
|
dataset_name = dataset.get("dataset_name")
|
||||||
|
if not dataset_name:
|
||||||
|
pytest.fail(f"No dataset_name found in {dataset}")
|
||||||
|
|
||||||
|
# Load only metadata (no data download)
|
||||||
|
try:
|
||||||
|
ds = load_dataset(dataset_name, split=None)
|
||||||
|
# Check if metadata is accessible without loading full data
|
||||||
|
assert ds is not None, f"Failed to load metadata for {dataset_name}"
|
||||||
|
except Exception as e:
|
||||||
|
pytest.fail(f"Error loading metadata for {dataset_name}: {str(e)}")
|
||||||
Reference in New Issue
Block a user