Integrate Cognee service updates

Songbird
2025-10-03 11:54:01 +02:00
parent 43d3eae1db
commit c7adfabe0a
17 changed files with 1649 additions and 99 deletions
+1 -1
@@ -12,7 +12,7 @@ Run the command from a project directory that already contains `.fuzzforge/`. Th
**Default directories**
- Logs: `.fuzzforge/logs/cognee.log`
- Cognee datasets: `.fuzzforge/cognee/project_<id>/{data,system}`
- Cognee datasets: `.fuzzforge/cognee/project_<id>/{data,system}` in embedded mode, or `s3://<bucket>/cognee/projects/<project-id>/` when the service backend is active.
- Artifact cache: `.fuzzforge/artifacts`
## HTTP Endpoints
+1 -1
@@ -140,7 +140,7 @@ graph LR
- **Session persistence** is controlled by `SESSION_PERSISTENCE`. When set to `sqlite`, ADK's `DatabaseSessionService` writes transcripts to the path configured by `SESSION_DB_PATH` (defaults to `./fuzzforge_sessions.db`). With `inmemory`, the context is scoped to the current process.
- **Semantic recall** stores vector embeddings so `/recall` queries can surface earlier prompts, even after restarts when using SQLite.
- **Hybrid memory manager** (`HybridMemoryManager`) stitches Cognee results into the ADK session. When a knowledge query hits Cognee, the relevant nodes are appended back into the session context so follow-up prompts can reference them naturally.
- **Cognee datasets** are unique per project. Ingestion runs populate `<project>_codebase` while custom calls to `ingest_to_dataset` let you maintain dedicated buckets (e.g., `insights`). Data is persisted inside `.fuzzforge/cognee/project_<id>/` and shared across CLI and A2A modes.
- **Cognee datasets** are unique per project. Ingestion runs populate `<project>_codebase` while custom calls to `ingest_to_dataset` let you maintain dedicated buckets (e.g., `insights`). Data is persisted inside `.fuzzforge/cognee/project_<id>/` when running embedded, or under `s3://<bucket>/cognee/projects/<project-id>/` when the hosted Cognee service is enabled.
- **Task metadata** (workflow runs, artifact descriptors) lives in the executor's in-memory caches but is also mirrored through A2A task events so remote agents can resubscribe if the CLI restarts.
- **Operational check**: Run `/recall <keyword>` or `You> search project knowledge for "topic" using INSIGHTS` after ingestion to confirm both ADK session recall and Cognee graph access are active.
- **CLI quick check**: `/memory status` summarises the current memory type, session persistence, and Cognee dataset directories from inside the agent shell.
+17
@@ -81,6 +81,23 @@ LLM_COGNEE_API_KEY=sk-your-key
If the Cognee variables are omitted, graph-specific tools remain available but return a friendly "not configured" response.
### Hosted Cognee Service
See [Hosted Cognee Service](./cognee-service.md) for step-by-step instructions on starting the shared backend with Docker.
When you want multiple projects to share a dedicated Cognee backend, point the CLI at the service and shared S3 bucket:
```env
COGNEE_STORAGE_MODE=service
COGNEE_SERVICE_URL=http://localhost:8000
COGNEE_S3_BUCKET=cognee-shared
COGNEE_S3_PREFIX=cognee/projects
COGNEE_SERVICE_USER_EMAIL=project_12345678@cognee.local
COGNEE_SERVICE_USER_PASSWORD=super-secret
```
During initialisation the CLI writes these values to `.fuzzforge/cognee/service/project_<id>/.env`. Each project gets its own scoped dataset (default `<project>_codebase`) while the service persists metadata in `s3://<bucket>/<prefix>/` using the project and tenant identifiers.
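The mapping from these variables to a storage location can be sketched as follows. This is a minimal illustration, not FuzzForge code: the helper name `resolve_storage_location` is hypothetical, the fallback to embedded mode when `COGNEE_STORAGE_MODE` is unset is an assumption, and the S3 layout follows the `s3://<bucket>/cognee/projects/<project-id>/` pattern quoted in these docs (tenant identifiers are left out).

```python
from typing import Mapping


def resolve_storage_location(env: Mapping[str, str], project_id: str) -> str:
    """Return where Cognee data for a project would live.

    Hypothetical helper for illustration only; not part of the FuzzForge API.
    """
    # Assumption: embedded mode is used when the variable is not set.
    mode = env.get("COGNEE_STORAGE_MODE", "embedded")
    if mode == "service":
        bucket = env["COGNEE_S3_BUCKET"]
        # Assumption: the prefix falls back to the value shown in the example.
        prefix = env.get("COGNEE_S3_PREFIX", "cognee/projects").strip("/")
        return f"s3://{bucket}/{prefix}/{project_id}/"
    # Embedded mode keeps data inside the project directory.
    return f".fuzzforge/cognee/project_{project_id}/"
```

Calling it with the example values above and a project id of `12345678` would yield a path under `s3://cognee-shared/cognee/projects/`, while an empty environment yields the local `.fuzzforge/cognee/` path.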
## MCP / Backend Integration
```env
+9 -3
@@ -38,14 +38,14 @@ All runs automatically skip `.fuzzforge/**` and `.git/**` to avoid recursive ing
- Primary dataset: `<project>_codebase`
- Additional datasets: create ad-hoc buckets such as `insights` via the `ingest_to_dataset` tool
- Storage location: `.fuzzforge/cognee/project_<id>/`
- Storage location: `.fuzzforge/cognee/project_<id>/` when running embedded, or `s3://<bucket>/cognee/projects/<project-id>/` when using the Cognee service mode.
### Persistence Details
- Every dataset lives under `.fuzzforge/cognee/project_<id>/{data,system}`. These directories are safe to commit to long-lived storage (they only contain embeddings and metadata).
- Every dataset lives under `.fuzzforge/cognee/project_<id>/{data,system}` when running locally. In service mode the same layout is mirrored to a shared S3 bucket so multiple projects can reuse the hosted Cognee instance without colliding.
- Cognee assigns deterministic IDs per project; if you move the repository, copy the entire `.fuzzforge/cognee/` tree to retain graph history.
- `HybridMemoryManager` ensures answers from Cognee are written back into the ADK session store so future prompts can refer to the same nodes without repeating the query.
- All Cognee processing runs locally against the files you ingest. No external service calls are made unless you configure a remote Cognee endpoint.
- In embedded mode all Cognee processing runs locally against the files you ingest. When `COGNEE_STORAGE_MODE=service`, the CLI streams files to the Cognee API, which stores them in the shared S3 prefix and runs the pipeline remotely before results flow back into the agent session.
## Prompt Examples
@@ -77,6 +77,12 @@ FUZZFORGE_MCP_URL=http://localhost:8010/mcp
LLM_COGNEE_PROVIDER=openai
LLM_COGNEE_MODEL=gpt-5-mini
LLM_COGNEE_API_KEY=sk-your-key
# Optional: hosted Cognee service
COGNEE_STORAGE_MODE=service
COGNEE_SERVICE_URL=http://localhost:8000
COGNEE_S3_BUCKET=cognee-shared
COGNEE_S3_PREFIX=cognee/projects
```
Add comments or project-specific overrides as needed; the agent reads these variables on startup.
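A quick sanity check of such a file before starting the agent could look like the sketch below. It is an illustration under stated assumptions: `parse_env` and `missing_service_keys` are hypothetical helpers, and treating `COGNEE_SERVICE_URL` and `COGNEE_S3_BUCKET` as mandatory in service mode is a guess, not documented FuzzForge behaviour.

```python
from typing import Dict, List

# Assumption: these two settings are needed whenever service mode is selected.
REQUIRED_SERVICE_KEYS = ("COGNEE_SERVICE_URL", "COGNEE_S3_BUCKET")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_service_keys(values: Dict[str, str]) -> List[str]:
    """List required service settings that are absent or empty."""
    if values.get("COGNEE_STORAGE_MODE") != "service":
        return []  # embedded mode needs none of them
    return [key for key in REQUIRED_SERVICE_KEYS if not values.get(key)]
```

An empty result from `missing_service_keys(parse_env(text))` means the file either selects embedded mode or carries both service settings.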