mirror of
https://github.com/Ed1s0nZ/CyberStrikeAI.git
synced 2026-07-02 18:55:52 +02:00
Add files via upload
This commit is contained in:
@@ -110,7 +110,7 @@ CyberStrikeAI is an **AI-native security testing platform** built in Go. It inte
|
||||
- 📄 Large-result pagination, compression, and searchable archives
|
||||
- 🔗 Attack-chain graph, risk scoring, and step-by-step replay
|
||||
- 🔒 Password-protected web UI, audit logs, and SQLite persistence
|
||||
- 📚 Knowledge base (RAG) with embedding-based vector retrieval (cosine similarity), optional **Eino Compose** indexing pipeline, and configurable post-retrieval budgets / reranking hooks
|
||||
- 📚 Knowledge base (RAG): **Eino MultiQuery** query rewrite + multi-path vector retrieval + **HTTP rerank** (DashScope `gte-rerank` / Cohere-compatible) + post-processing (dedupe, budget); **Eino Compose** indexing pipeline
|
||||
- 📁 Conversation grouping with pinning, rename, and batch management
|
||||
- 📂 **Project management**: shared facts (blackboard) across sessions, `upsert_project_fact` + `links` to chain paths; attack-chain and project fact graph views
|
||||
- 🛡️ Vulnerability management with CRUD operations, severity tracking, status workflow, and statistics
|
||||
@@ -455,9 +455,10 @@ A test SSE MCP server is available at `cmd/test-sse-mcp-server/` for validation
|
||||
|
||||
### Knowledge Base
|
||||
- **Vector search** – AI agent can automatically search the knowledge base for relevant security knowledge during conversations using the `search_knowledge_base` tool.
|
||||
- **Vector retrieval** – cosine similarity over stored embeddings, aligned with Eino `retriever.Retriever` usage.
|
||||
- **Auto-indexing** – scans the `knowledge_base/` directory for Markdown files and automatically indexes them with embeddings.
|
||||
- **Web management** – create, update, delete knowledge items through the web UI, with category-based organization.
|
||||
- **RAG pipeline (always on)** – **MultiQuery** (LLM query rewrite) → vector prefetch & fusion → **HTTP rerank** (DashScope `gte-rerank` or Cohere-compatible `/v1/rerank`) → post-processing (normalized dedupe, char/token budget, final top_k). Rerank failures degrade to fusion order without breaking search.
|
||||
- **Vector retrieval** – cosine similarity over stored embeddings with configurable threshold, aligned with Eino `retriever.Retriever` usage.
|
||||
- **Auto-indexing** – scans the `knowledge_base/` directory for Markdown files and automatically indexes them with embeddings (Markdown header split + recursive chunking via Eino).
|
||||
- **Web management** – create, update, delete knowledge items through the web UI, with category-based organization; settings page exposes MultiQuery / rerank / prefetch options.
|
||||
- **Retrieval logs** – tracks all knowledge retrieval operations for audit and debugging.
|
||||
|
||||
**Quick Start (Using Pre-built Knowledge Base):**
|
||||
@@ -479,6 +480,17 @@ A test SSE MCP server is available at `cmd/test-sse-mcp-server/` for validation
|
||||
retrieval:
|
||||
top_k: 5
|
||||
similarity_threshold: 0.7
|
||||
multi_query:
|
||||
max_queries: 4 # LLM rewrite variants (always on)
|
||||
rerank: # always on; empty fields inherit openai/embedding credentials
|
||||
provider: "" # auto: dashscope | cohere from base_url
|
||||
model: "" # empty: gte-rerank (DashScope) or rerank-multilingual-v3.0 (Cohere)
|
||||
base_url: ""
|
||||
api_key: ""
|
||||
post_retrieve:
|
||||
prefetch_top_k: 20 # vector candidates per MultiQuery variant; 0 = max(top_k×4, 20)
|
||||
max_context_chars: 0
|
||||
max_context_tokens: 0
|
||||
```
|
||||
2. **Add knowledge files** – place Markdown files in `knowledge_base/` directory, organized by category (e.g., `knowledge_base/SQL Injection/README.md`).
|
||||
3. **Scan and index** – use the web UI to scan the knowledge base directory, which will automatically import files and build vector embeddings.
|
||||
@@ -539,6 +551,17 @@ knowledge:
|
||||
retrieval:
|
||||
top_k: 5 # Number of top results to return
|
||||
similarity_threshold: 0.7 # Minimum cosine similarity (0-1)
|
||||
multi_query:
|
||||
max_queries: 4 # MultiQuery rewrite variants (always on)
|
||||
rerank: # HTTP rerank (always on); empty fields inherit openai/embedding credentials
|
||||
provider: ""
|
||||
model: ""
|
||||
base_url: ""
|
||||
api_key: ""
|
||||
post_retrieve:
|
||||
prefetch_top_k: 20 # per MultiQuery variant; 0 = max(top_k×4, 20)
|
||||
max_context_chars: 0
|
||||
max_context_tokens: 0
|
||||
roles_dir: "roles" # Role configuration directory (relative to config file)
|
||||
skills_dir: "skills" # Skills directory (relative to config file)
|
||||
agents_dir: "agents" # Multi-agent Markdown definitions (orchestrator + sub-agents)
|
||||
|
||||
+27
-4
@@ -109,7 +109,7 @@ CyberStrikeAI 是一款 **AI 原生安全测试平台**,基于 Go 构建,集
|
||||
- 📄 大结果分页、压缩与全文检索
|
||||
- 🔗 攻击链可视化、风险打分与步骤回放
|
||||
- 🔒 Web 登录保护、审计日志、SQLite 持久化
|
||||
- 📚 知识库(RAG):向量嵌入与余弦相似度检索(与 Eino `retriever.Retriever` 语义一致),可选 **Eino Compose** 索引流水线及检索后处理(预算、重排等配置项)
|
||||
- 📚 知识库(RAG):**Eino MultiQuery** 查询改写 + 多路向量检索 + **HTTP 精排**(DashScope `gte-rerank` / Cohere 兼容)+ 后处理(去重、预算);索引侧为 **Eino Compose** 流水线
|
||||
- 📁 对话分组管理:支持分组创建、置顶、重命名、删除等操作
|
||||
- 📂 **项目管理**:共享事实(黑板)跨会话沉淀认知,`upsert_project_fact` + `links` 串联攻击路径;聊天攻击链与项目事实图可视化
|
||||
- 🛡️ 漏洞管理功能:完整的漏洞 CRUD 操作,支持严重程度分级、状态流转、按对话/严重程度/状态过滤,以及统计看板
|
||||
@@ -453,9 +453,10 @@ CyberStrikeAI 支持通过三种传输模式连接外部 MCP 服务器:
|
||||
|
||||
### 知识库功能
|
||||
- **向量检索**:AI 智能体在对话过程中可自动调用 `search_knowledge_base` 工具搜索知识库中的安全知识。
|
||||
- **向量检索**:基于嵌入余弦相似度与相似度阈值过滤(与 Eino `retriever.Retriever` 语义一致)。
|
||||
- **自动索引**:扫描 `knowledge_base/` 目录下的 Markdown 文件,自动构建向量嵌入索引。
|
||||
- **Web 管理**:通过 Web 界面创建、更新、删除知识项,支持分类管理。
|
||||
- **RAG 管线(始终启用)**:**MultiQuery**(LLM 查询改写)→ 向量预取与融合 → **HTTP 精排**(DashScope `gte-rerank` 或 Cohere 兼容 `/v1/rerank`)→ 后处理(规范化去重、字符/token 预算、最终 top_k)。精排失败时自动降级为融合排序,检索仍可用。
|
||||
- **向量相似度**:基于嵌入余弦相似度与相似度阈值过滤(与 Eino `retriever.Retriever` 语义一致)。
|
||||
- **自动索引**:扫描 `knowledge_base/` 目录下的 Markdown 文件,自动构建向量嵌入索引(Eino Markdown 标题切分 + 递归分块)。
|
||||
- **Web 管理**:通过 Web 界面创建、更新、删除知识项,支持分类管理;设置页可配置 MultiQuery / 精排 / 预取候选数。
|
||||
- **检索日志**:记录所有知识检索操作,便于审计与调试。
|
||||
|
||||
**快速开始(使用预构建知识库):**
|
||||
@@ -477,6 +478,17 @@ CyberStrikeAI 支持通过三种传输模式连接外部 MCP 服务器:
|
||||
retrieval:
|
||||
top_k: 5
|
||||
similarity_threshold: 0.7
|
||||
multi_query:
|
||||
max_queries: 4 # LLM 改写变体上限(始终启用)
|
||||
rerank: # 精排始终启用;留空则继承 openai/embedding 凭据
|
||||
provider: "" # 空=按 base_url 推断 dashscope | cohere
|
||||
model: "" # 空=DashScope→gte-rerank,Cohere→rerank-multilingual-v3.0
|
||||
base_url: ""
|
||||
api_key: ""
|
||||
post_retrieve:
|
||||
prefetch_top_k: 20 # 每条 MultiQuery 变体的向量候选数;0=max(top_k×4, 20)
|
||||
max_context_chars: 0
|
||||
max_context_tokens: 0
|
||||
```
|
||||
2. **添加知识文件**:将 Markdown 文件放入 `knowledge_base/` 目录,按分类组织(如 `knowledge_base/SQL注入/README.md`)。
|
||||
3. **扫描索引**:在 Web 界面中点击"扫描知识库",系统会自动导入文件并构建向量索引。
|
||||
@@ -537,6 +549,17 @@ knowledge:
|
||||
retrieval:
|
||||
top_k: 5 # 检索返回的 Top-K 结果数量
|
||||
similarity_threshold: 0.7 # 余弦相似度阈值(0-1),低于此值的结果将被过滤
|
||||
multi_query:
|
||||
max_queries: 4 # MultiQuery 改写变体上限(始终启用)
|
||||
rerank: # HTTP 精排(始终启用);留空则继承 openai/embedding 凭据
|
||||
provider: ""
|
||||
model: ""
|
||||
base_url: ""
|
||||
api_key: ""
|
||||
post_retrieve:
|
||||
prefetch_top_k: 20 # 每条 MultiQuery 变体;0=max(top_k×4, 20)
|
||||
max_context_chars: 0
|
||||
max_context_tokens: 0
|
||||
roles_dir: "roles" # 角色配置文件目录(相对于配置文件所在目录)
|
||||
skills_dir: "skills" # Skills 目录(相对于配置文件所在目录)
|
||||
agents_dir: "agents" # 多代理 Markdown(主代理 orchestrator.md + 子代理 *.md)
|
||||
|
||||
@@ -26,7 +26,7 @@
|
||||
| OpenAPI | 多代理路径说明已更新(流式未启用为 SSE 错误事件)。 |
|
||||
| 机器人 | `ProcessMessageForRobot` 按 `robot_default_agent_mode`(默认 `eino_single`)调用 `RunEinoSingleChatModelAgent` 或 `RunDeepAgent`。 |
|
||||
| 预置编排 | 聊天 / WebShell:`POST /api/multi-agent*` 请求体 `orchestration`:`deep` \| `plan_execute` \| `supervisor`(缺省 `deep`)。`plan_execute` 不构建 YAML/Markdown 子代理;`plan_execute_loop_max_iterations` 仍来自配置。`supervisor` 至少需一个子代理。 |
|
||||
| Eino 中间件 | `multi_agent.eino_middleware`(可选):`patchtoolcalls`(默认开)、`toolsearch`(按阈值拆分 MCP 工具列表)、`plantask`(需 `eino_skills`)、`reduction`(大工具输出截断/落盘)、`checkpoint_dir`(Runner 断点)、`deep_output_key` / `deep_model_retry_max_retries` / `task_tool_description_prefix`(Deep 与 supervisor 主代理共享其中模型重试与 OutputKey)。`plan_execute` 的 Executor 无 Handlers:仅继承 **ToolsConfig** 侧效果(如 `tool_search` 列表拆分),不挂载 patch/plantask/reduction 中间件。 |
|
||||
| Eino 中间件 | `multi_agent.eino_middleware`(可选):`patchtoolcalls`(默认开)、`toolsearch`(按阈值拆分 MCP 工具列表)、`plantask`(需 `eino_skills`)、`reduction`(大工具输出截断/落盘)、`checkpoint_dir`(Runner 断点)、`deep_output_key` / `deep_model_retry_max_retries` / `task_tool_description_prefix`(Deep 与 supervisor 主代理共享其中模型重试与 OutputKey)。**`plan_execute`**:`runner.go` 将 `prependEinoMiddlewares(einoMWMain)` 产物作为 `ExecPreMiddlewares` 挂到 **Executor**(与 Deep/Supervisor 主代理同序:patch → reduction → toolsearch → plantask → filesystem → skill → summarization tail);Planner/Replanner 仅 summarization tail + prompt 预算截断,不跑 MCP 工具链。 |
|
||||
|
||||
## 进行中 / 待办( backlog )
|
||||
|
||||
@@ -37,7 +37,8 @@
|
||||
|
||||
## 关键文件索引
|
||||
|
||||
- `internal/multiagent/runner.go` — DeepAgent 组装与事件循环
|
||||
- `internal/multiagent/runner.go` — DeepAgent / plan_execute / supervisor 组装与事件循环
|
||||
- `internal/multiagent/eino_orchestration.go` — PlanExecute 根节点与 Executor 中间件栈(`buildPlanExecuteExecutorHandlers`)
|
||||
- `internal/handler/multi_agent.go` — SSE 与(同步)HTTP
|
||||
- `internal/handler/multi_agent_prepare.go` — 会话准备(含 WebShell)
|
||||
- `internal/einomcp/` — MCP → Eino Tool
|
||||
@@ -59,4 +60,5 @@
|
||||
| 2026-03-22 | `orchestrator.md` / `kind: orchestrator` 主代理、列表主/子标记、与 `orchestrator_instruction` 优先级。 |
|
||||
| 2026-04-19 | 主聊天「对话模式」:原生 ReAct 与 Deep / Plan-Execute / Supervisor;`POST /api/multi-agent*` 请求体 `orchestration` 与界面一致;`config.yaml` / 设置页不再维护预置编排字段(机器人/批量默认 `deep`)。 |
|
||||
| 2026-04-21 | 移除角色 `skills` 与 `/api/roles/skills/list`;`bind_role` 仅继承 tools;Skills 仅通过 Eino `skill` 工具按需加载。 |
|
||||
| 2026-07-02 | **plan_execute Executor 中间件对齐**:`ExecPreMiddlewares` 与 Deep 主代理同源;`buildPlanExecuteExecutorHandlers` + 回归测试;文档更正。 |
|
||||
| 2026-06-02 | **移除原生 ReAct**:删除 `/api/agent-loop*` 执行入口与 `AgentLoopWithProgress`;统一 Eino ADK(单代理 `/api/eino-agent*`,多代理 `/api/multi-agent*`);任务 cancel/tasks API 保留。 |
|
||||
|
||||
Reference in New Issue
Block a user