v3.3.0 GUI dashboard + reports + model expansion + root fix

Engine: - Fix: inject IS_SANDBOX=1 so Claude Code's --dangerously-skip-permissions works under root (real backend runs were exiting rc=1 immediately) - models: expand to 40 models / 13 providers, tagged CLI vs API (NVIDIA NIM, DeepSeek, Mistral, Qwen/DashScope, Groq, Together, OpenRouter, Ollama, Gemini) — Qwen/DeepSeek/Llama usable via API - backends: on_start callback surfaces the exact argv ("what runs behind it") - orchestrator: require a Playwright screenshot per confirmed finding; collect results/activity.json; auto-generate reports after a run - report.py: HTML always + PDF via Typst engine (.typ source emitted too) Web dashboard (webgui/, stdlib only — no npm/build): - Sidebar dashboard (PentAGI-style): Run / Agents / Insights / Reports / Settings - Multi-target runs; live execution console + per-task activity; finding cards with screenshots; backend+provider+model pickers (CLI & API) - Agents tab: browse 213 + add new .md agents from the UI - Insights: interactive RL-weight + severity charts - Reports: download/preview PDF + HTML - Settings/API: execution mode, per-provider API keys, orchestrator, verbosity - Endpoints: /api/agents (GET/POST), /api/rl, /api/config, /api/reports, /reports/* + /shots/* static serving Cleanup: retire replaced web stack (frontend React, FastAPI backend, core orchestration, old test) to legacy/. Active engine + GUI are fully standalone. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 07:15:30 +02:00 · 2026-06-14 23:26:11 -03:00
parent 22a7302a35
commit a5badefc29
205 changed files with 809 additions and 199 deletions
@@ -122,18 +122,30 @@ class RunResult:
 def run(backend: Backend, prompt: str, workdir: str, model: str = "",
        autonomous: bool = True, mcp_config: Optional[str] = None,
        env: Optional[Dict[str, str]] = None, timeout: int = 7200,
-        dry_run: bool = False) -> RunResult:
-    """Execute a backend against the composed prompt and stream logs to disk."""
+        dry_run: bool = False, on_start=None) -> RunResult:
+    """Execute a backend against the composed prompt and stream logs to disk.
+
+    on_start(argv): optional callback invoked with the exact command line, so
+    callers/UI can show precisely what is being executed behind the scenes.
+    """
    os.makedirs(workdir, exist_ok=True)
    prompt_file = os.path.join(workdir, "master_prompt.md")
    open(prompt_file, "w", encoding="utf-8").write(prompt)
    log_path = os.path.join(workdir, "backend.log")

    argv = backend.build_argv(prompt_file, workdir, model, autonomous, mcp_config)
+    if on_start:
+        on_start(argv)
    full_env = os.environ.copy()
    if env:
        full_env.update(env)

+    # Claude Code refuses --dangerously-skip-permissions when running as root
+    # unless IS_SANDBOX=1 is set. The engine already isolates each run in its own
+    # workdir, so opt into the sandbox flag rather than failing rc=1 under root.
+    if autonomous and backend.key == "claude" and hasattr(os, "geteuid") and os.geteuid() == 0:
+        full_env.setdefault("IS_SANDBOX", "1")
+
    if dry_run:
        open(log_path, "w").write("DRY RUN\n" + " ".join(argv) + "\n")
        return RunResult(backend.key, 0, log_path, workdir)