Feat/temporal (#46)

* refactor: modularize claude-executor and extract shared utilities

- Extract message handling into src/ai/message-handlers.ts with pure functions
- Extract output formatting into src/ai/output-formatters.ts
- Extract progress management into src/ai/progress-manager.ts
- Add audit-logger.ts with Null Object pattern for optional logging
- Add shared utilities: formatting.ts, file-io.ts, functional.ts
- Consolidate getPromptNameForAgent into src/types/agents.ts

* feat: add Claude Code custom commands for debug and review

* feat: add Temporal integration foundation (phase 1-2)

- Add Temporal SDK dependencies (@temporalio/client, worker, workflow, activity)
- Add shared types for pipeline state, metrics, and progress queries
- Add classifyErrorForTemporal() for retry behavior classification
- Add docker-compose for Temporal server with SQLite persistence

* feat: add Temporal activities for agent execution (phase 3)

- Add activities.ts with heartbeat loop, git checkpoint/rollback, and error classification
- Export runClaudePrompt, validateAgentOutput, ClaudePromptResult for Temporal use
- Track attempt number via Temporal Context for accurate audit logging
- Rollback git workspace before retry to ensure clean state

* feat: add Temporal workflow for 5-phase pipeline orchestration (phase 4)

* feat: add Temporal worker, client, and query tools (phase 5)

- Add worker.ts with workflow bundling and graceful shutdown
- Add client.ts CLI to start pipelines with progress polling
- Add query.ts CLI to inspect running workflow state
- Fix buffer overflow by truncating error messages and stack traces
- Skip git operations gracefully on non-git repositories
- Add kill.sh/start.sh dev scripts and Dockerfile.worker

* feat: fix Docker worker container setup

- Install uv instead of deprecated uvx package
- Add mcp-server and configs directories to container
- Mount target repo dynamically via TARGET_REPO env variable

* fix: add report assembly step to Temporal workflow

- Add assembleReportActivity to concatenate exploitation evidence files before report agent runs
- Call assembleFinalReport in workflow Phase 5 before runReportAgent
- Ensure deliverables directory exists before writing final report
- Simplify pipeline-testing report prompt to just prepend header

* refactor: consolidate Docker setup to root docker-compose.yml

* feat: improve Temporal client UX and env handling

- Change default to fire-and-forget (--wait flag to opt-in)
- Add splash screen and improve console output formatting
- Add .env to gitignore, remove from dockerignore for container access
- Add Taskfile for common development commands

* refactor: simplify session ID handling and improve Taskfile options

- Include hostname in workflow ID for better audit log organization
- Extract sanitizeHostname utility to audit/utils.ts for reuse
- Remove unused generateSessionLogPath and buildLogFilePath functions
- Simplify Taskfile with CONFIG/OUTPUT/CLEAN named parameters

* chore: add .env.example and simplify .gitignore

* docs: update README and CLAUDE.md for Temporal workflow usage

- Replace Docker CLI instructions with Task-based commands
- Add monitoring/stopping sections and workflow examples
- Document Temporal orchestration layer and troubleshooting
- Simplify file structure to key files overview

* refactor: replace Taskfile with bash CLI script

- Add shannon bash script with start/logs/query/stop/help commands
- Remove Taskfile.yml dependency (no longer requires Task installation)
- Update README.md and CLAUDE.md to use ./shannon commands
- Update client.ts output to show ./shannon commands

* docs: fix deliverable filename in README

* refactor: remove direct CLI and .shannon-store.json in favor of Temporal

- Delete src/shannon.ts direct CLI entry point (Temporal is now the only mode)
- Remove .shannon-store.json session lock (Temporal handles workflow deduplication)
- Remove broken scripts/export-metrics.js (imported non-existent function)
- Update package.json to remove main, start script, and bin entry
- Clean up CLAUDE.md and debug.md to remove obsolete references

* chore: remove licensing comments from prompt files to prevent leaking into actual prompts

* fix: resolve parallel workflow race conditions and retry logic bugs

- Fix save_deliverable race condition using closure pattern instead of global variable
- Fix error classification order so OutputValidationError matches before generic validation
- Fix ApplicationFailure re-classification bug by checking instanceof before re-throwing
- Add per-error-type retry limits (3 for output validation, 50 for billing)
- Add fast retry intervals for pipeline testing mode (10s vs 5min)
- Increase worker concurrent activities to 25 for parallel workflows

* refactor: pipeline vuln→exploit workflow for parallel execution

- Replace sync barrier between vuln/exploit phases with independent pipelines
- Each vuln type runs: vuln agent → queue check → conditional exploit
- Add checkExploitationQueue activity to skip exploits when no vulns found
- Use Promise.allSettled for graceful failure handling across pipelines
- Add PipelineSummary type for aggregated cost/duration/turns metrics

* fix: re-throw retryable errors in checkExploitationQueue

* fix: detect and retry on Claude Code spending cap errors

- Add spending cap pattern detection in detectApiError() with retryable error
- Add matching patterns to classifyErrorForTemporal() for proper Temporal retry
- Add defense-in-depth safeguard in runClaudePrompt() for $0 cost / low turn detection
- Add final sanity check in activities before declaring success

* fix: increase heartbeat timeout to prevent false worker-dead detection

Original 30s timeout was from POC spec assuming <5min activities. With hour-long activities and multiple concurrent workflows sharing one worker, resource contention causes event loop stalls exceeding 30s, triggering false heartbeat timeouts. Increased to 10min (prod) and 5min (testing).

* fix: temporal db init

* fix: persist home dir

* feat: add per-workflow unified logging with ./shannon logs ID=<workflow-id>

- Add WorkflowLogger class for human-readable, per-workflow log files
- Create workflow.log in audit-logs/{workflowId}/ with phase, agent, tool, and LLM events
- Update ./shannon logs to require ID param and tail specific workflow log
- Add phase transition logging at workflow boundaries
- Include workflow completion summary with agent breakdown (duration, cost)
- Mount audit-logs volume in docker-compose for host access

---------

Co-authored-by: ezl-keygraph <ezhil@keygraph.io>
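
The per-error-type retry limits and the testing-mode intervals listed in the message can be pictured as a small lookup. This is only a sketch: the names `ErrorKind`, `RetryPolicy`, `retryPolicyFor`, and `shouldRetry` are hypothetical and do not appear in the commit. It also assumes that the limits cap total attempts and that 10s/5min are initial retry intervals. Only the figures 3, 50, 10s, and 5min come from the commit message; the real classification is done by classifyErrorForTemporal(), which is not reproduced here.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not the repository's API.
type ErrorKind = 'output_validation' | 'billing';

interface RetryPolicy {
  maximumAttempts: number;   // assumed to cap total attempts, not retries
  initialIntervalMs: number; // assumed to be the wait before the first retry
}

// Limits from the commit message: 3 for output validation, 50 for billing.
const MAX_ATTEMPTS: Record<ErrorKind, number> = {
  output_validation: 3,
  billing: 50,
};

// Intervals from the commit message: 10s in pipeline testing mode, 5min otherwise.
const TESTING_INTERVAL_MS = 10_000;
const PRODUCTION_INTERVAL_MS = 5 * 60_000;

function retryPolicyFor(kind: ErrorKind, pipelineTesting: boolean): RetryPolicy {
  return {
    maximumAttempts: MAX_ATTEMPTS[kind],
    initialIntervalMs: pipelineTesting ? TESTING_INTERVAL_MS : PRODUCTION_INTERVAL_MS,
  };
}

// Decide whether another attempt is allowed (attempt numbers start at 1).
function shouldRetry(kind: ErrorKind, attempt: number): boolean {
  return attempt < MAX_ATTEMPTS[kind];
}
```

Keeping the limits in one table makes the asymmetry explicit: an output validation failure is unlikely to fix itself, so it gets few attempts, while a billing or spending cap error may clear after a reset and can reasonably be retried many more times.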
2026-05-28 19:31:34 +02:00 · 2026-01-15 10:36:11 -08:00
parent 45acb16711
commit 51e621d0d5
77 changed files with 6117 additions and 2417 deletions
@@ -0,0 +1,79 @@
+// Copyright (C) 2025 Keygraph, Inc.
+//
+// This program is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License version 3
+// as published by the Free Software Foundation.
+
+// Null Object pattern for audit logging - callers never check for null
+
+import type { AuditSession } from '../audit/index.js';
+import { formatTimestamp } from '../utils/formatting.js';
+
+export interface AuditLogger {
+  logLlmResponse(turn: number, content: string): Promise<void>;
+  logToolStart(toolName: string, parameters: unknown): Promise<void>;
+  logToolEnd(result: unknown): Promise<void>;
+  logError(error: Error, duration: number, turns: number): Promise<void>;
+}
+
+class RealAuditLogger implements AuditLogger {
+  private auditSession: AuditSession;
+
+  constructor(auditSession: AuditSession) {
+    this.auditSession = auditSession;
+  }
+
+  async logLlmResponse(turn: number, content: string): Promise<void> {
+    await this.auditSession.logEvent('llm_response', {
+      turn,
+      content,
+      timestamp: formatTimestamp(),
+    });
+  }
+
+  async logToolStart(toolName: string, parameters: unknown): Promise<void> {
+    await this.auditSession.logEvent('tool_start', {
+      toolName,
+      parameters,
+      timestamp: formatTimestamp(),
+    });
+  }
+
+  async logToolEnd(result: unknown): Promise<void> {
+    await this.auditSession.logEvent('tool_end', {
+      result,
+      timestamp: formatTimestamp(),
+    });
+  }
+
+  async logError(error: Error, duration: number, turns: number): Promise<void> {
+    await this.auditSession.logEvent('error', {
+      message: error.message,
+      errorType: error.constructor.name,
+      stack: error.stack,
+      duration,
+      turns,
+      timestamp: formatTimestamp(),
+    });
+  }
+}
+
+/** Null Object implementation - all methods are safe no-ops */
+class NullAuditLogger implements AuditLogger {
+  async logLlmResponse(_turn: number, _content: string): Promise<void> {}
+
+  async logToolStart(_toolName: string, _parameters: unknown): Promise<void> {}
+
+  async logToolEnd(_result: unknown): Promise<void> {}
+
+  async logError(_error: Error, _duration: number, _turns: number): Promise<void> {}
+}
+
+// Returns no-op when auditSession is null
+export function createAuditLogger(auditSession: AuditSession | null): AuditLogger {
+  if (auditSession) {
+    return new RealAuditLogger(auditSession);
+  }
+
+  return new NullAuditLogger();
+}
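
The factory above lets call sites log unconditionally instead of guarding each call with `if (auditSession)`. Below is a minimal self-contained sketch of that effect. It assumes a hypothetical `EventSink` stand-in for `AuditSession` (whose real interface is not shown in this diff) and uses a trimmed-down logger with a single method. All names in the sketch are illustrative.

```typescript
// Hypothetical stand-in for AuditSession; only a logEvent method is assumed.
interface EventSink {
  logEvent(type: string, data: Record<string, unknown>): Promise<void>;
}

interface ToolLogger {
  logToolStart(toolName: string, parameters: unknown): Promise<void>;
}

class RealToolLogger implements ToolLogger {
  private sink: EventSink;

  constructor(sink: EventSink) {
    this.sink = sink;
  }

  async logToolStart(toolName: string, parameters: unknown): Promise<void> {
    await this.sink.logEvent('tool_start', { toolName, parameters });
  }
}

// Null Object: same interface, every method is a no-op.
class NullToolLogger implements ToolLogger {
  async logToolStart(_toolName: string, _parameters: unknown): Promise<void> {}
}

function createToolLogger(sink: EventSink | null): ToolLogger {
  return sink ? new RealToolLogger(sink) : new NullToolLogger();
}

// In-memory sink used only for this demonstration.
const recorded: Array<{ type: string; data: Record<string, unknown> }> = [];
const memorySink: EventSink = {
  logEvent(type, data) {
    recorded.push({ type, data });
    return Promise.resolve();
  },
};

// The call site is identical whether or not auditing is enabled.
async function runTool(logger: ToolLogger): Promise<void> {
  await logger.logToolStart('Read', { path: 'README.md' });
}

const withAudit = runTool(createToolLogger(memorySink)); // appends one event to `recorded`
const withoutAudit = runTool(createToolLogger(null));     // appends nothing
```

The trade-off is that a missing session is silent by design, so a caller that actually requires auditing has to verify the session separately rather than rely on the logger to complain.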
@@ -4,35 +4,33 @@
 // it under the terms of the GNU Affero General Public License version 3
 // as published by the Free Software Foundation.

-import { $, fs, path } from 'zx';
+// Production Claude agent execution with retry, git checkpoints, and audit logging
+
+import { fs, path } from 'zx';
 import chalk, { type ChalkInstance } from 'chalk';
 import { query } from '@anthropic-ai/claude-agent-sdk';
-import { fileURLToPath } from 'url';
-import { dirname } from 'path';

 import { isRetryableError, getRetryDelay, PentestError } from '../error-handling.js';
-import { ProgressIndicator } from '../progress-indicator.js';
-import { timingResults, costResults, Timer } from '../utils/metrics.js';
-import { formatDuration } from '../audit/utils.js';
-import { createGitCheckpoint, commitGitSuccess, rollbackGitWorkspace } from '../utils/git-manager.js';
+import { timingResults, Timer } from '../utils/metrics.js';
+import { formatTimestamp } from '../utils/formatting.js';
+import { createGitCheckpoint, commitGitSuccess, rollbackGitWorkspace, getGitCommitHash } from '../utils/git-manager.js';
 import { AGENT_VALIDATORS, MCP_AGENT_MAPPING } from '../constants.js';
-import { filterJsonToolCalls, getAgentPrefix } from '../utils/output-formatter.js';
-import { generateSessionLogPath } from '../session-manager.js';
 import { AuditSession } from '../audit/index.js';
 import { createShannonHelperServer } from '../../mcp-server/dist/index.js';
 import type { SessionMetadata } from '../audit/utils.js';
-import type { PromptName } from '../types/index.js';
+import { getPromptNameForAgent } from '../types/agents.js';
+import type { AgentName } from '../types/index.js';
+
+import { dispatchMessage } from './message-handlers.js';
+import { detectExecutionContext, formatErrorOutput, formatCompletionMessage } from './output-formatters.js';
+import { createProgressManager } from './progress-manager.js';
+import { createAuditLogger } from './audit-logger.js';

-// Extend global for loader flag
 declare global {
  var SHANNON_DISABLE_LOADER: boolean | undefined;
 }

-const __filename = fileURLToPath(import.meta.url);
-const __dirname = dirname(__filename);
-
-// Result types
-interface ClaudePromptResult {
+export interface ClaudePromptResult {
  result?: string | null;
  success: boolean;
  duration: number;
@@ -40,14 +38,12 @@ interface ClaudePromptResult {
  cost: number;
  partialCost?: number;
  apiErrorDetected?: boolean;
-  logFile?: string;
  error?: string;
  errorType?: string;
  prompt?: string;
  retryable?: boolean;
 }

-// MCP Server types
 interface StdioMcpServer {
  type: 'stdio';
  command: string;
@@ -57,157 +53,29 @@ interface StdioMcpServer {

 type McpServer = ReturnType<typeof createShannonHelperServer> | StdioMcpServer;

-/**
- * Convert agent name to prompt name for MCP_AGENT_MAPPING lookup
- */
-function agentNameToPromptName(agentName: string): PromptName {
-  // Special cases
-  if (agentName === 'pre-recon') return 'pre-recon-code';
-  if (agentName === 'report') return 'report-executive';
-  if (agentName === 'recon') return 'recon';
-
-  // Pattern: {type}-vuln → vuln-{type}
-  const vulnMatch = agentName.match(/^(.+)-vuln$/);
-  if (vulnMatch) {
-    return `vuln-${vulnMatch[1]}` as PromptName;
-  }
-
-  // Pattern: {type}-exploit → exploit-{type}
-  const exploitMatch = agentName.match(/^(.+)-exploit$/);
-  if (exploitMatch) {
-    return `exploit-${exploitMatch[1]}` as PromptName;
-  }
-
-  // Default: return as-is
-  return agentName as PromptName;
-}
-
-// Simplified validation using direct agent name mapping
-async function validateAgentOutput(
-  result: ClaudePromptResult,
-  agentName: string | null,
-  sourceDir: string
-): Promise<boolean> {
-  console.log(chalk.blue(`    🔍 Validating ${agentName} agent output`));
-
-  try {
-    // Check if agent completed successfully
-    if (!result.success || !result.result) {
-      console.log(chalk.red(`    ❌ Validation failed: Agent execution was unsuccessful`));
-      return false;
-    }
-
-    // Get validator function for this agent
-    const validator = agentName ? AGENT_VALIDATORS[agentName as keyof typeof AGENT_VALIDATORS] : undefined;
-
-    if (!validator) {
-      console.log(chalk.yellow(`    ⚠️ No validator found for agent "${agentName}" - assuming success`));
-      console.log(chalk.green(`    ✅ Validation passed: Unknown agent with successful result`));
-      return true;
-    }
-
-    console.log(chalk.blue(`    📋 Using validator for agent: ${agentName}`));
-    console.log(chalk.blue(`    📂 Source directory: ${sourceDir}`));
-
-    // Apply validation function
-    const validationResult = await validator(sourceDir);
-
-    if (validationResult) {
-      console.log(chalk.green(`    ✅ Validation passed: Required files/structure present`));
-    } else {
-      console.log(chalk.red(`    ❌ Validation failed: Missing required deliverable files`));
-    }
-
-    return validationResult;
-
-  } catch (error) {
-    const errMsg = error instanceof Error ? error.message : String(error);
-    console.log(chalk.red(`    ❌ Validation failed with error: ${errMsg}`));
-    return false; // Assume invalid on validation error
-  }
-}
-
-// Pure function: Run Claude Code with SDK - Maximum Autonomy
-// WARNING: This is a low-level function. Use runClaudePromptWithRetry() for agent execution
-async function runClaudePrompt(
-  prompt: string,
+// Configures MCP servers for agent execution, with Docker-specific Chromium handling
+function buildMcpServers(
  sourceDir: string,
-  _allowedTools: string = 'Read',
-  context: string = '',
-  description: string = 'Claude analysis',
-  agentName: string | null = null,
-  colorFn: ChalkInstance = chalk.cyan,
-  sessionMetadata: SessionMetadata | null = null,
-  auditSession: AuditSession | null = null,
-  attemptNumber: number = 1
-): Promise<ClaudePromptResult> {
-  const timer = new Timer(`agent-${description.toLowerCase().replace(/\s+/g, '-')}`);
-  const fullPrompt = context ? `${context}\n\n${prompt}` : prompt;
-  let totalCost = 0;
-  let partialCost = 0; // Track partial cost for crash safety
+  agentName: string | null
+): Record<string, McpServer> {
+  const shannonHelperServer = createShannonHelperServer(sourceDir);

-  // Auto-detect execution mode to adjust logging behavior
-  const isParallelExecution = description.includes('vuln agent') || description.includes('exploit agent');
-  const useCleanOutput = description.includes('Pre-recon agent') ||
-                         description.includes('Recon agent') ||
-                         description.includes('Executive Summary and Report Cleanup') ||
-                         description.includes('vuln agent') ||
-                         description.includes('exploit agent');
+  const mcpServers: Record<string, McpServer> = {
+    'shannon-helper': shannonHelperServer,
+  };

-  // Disable status manager - using simple JSON filtering for all agents now
-  const statusManager = null;
+  if (agentName) {
+    const promptName = getPromptNameForAgent(agentName as AgentName);
+    const playwrightMcpName = MCP_AGENT_MAPPING[promptName as keyof typeof MCP_AGENT_MAPPING] || null;

-  // Setup progress indicator for clean output agents (unless disabled via flag)
-  let progressIndicator: ProgressIndicator | null = null;
-  if (useCleanOutput && !global.SHANNON_DISABLE_LOADER) {
-    const agentType = description.includes('Pre-recon') ? 'pre-reconnaissance' :
-                     description.includes('Recon') ? 'reconnaissance' :
-                     description.includes('Report') ? 'report generation' : 'analysis';
-    progressIndicator = new ProgressIndicator(`Running ${agentType}...`);
-  }
-
-  // NOTE: Logging now handled by AuditSession (append-only, crash-safe)
-  let logFilePath: string | null = null;
-  if (sessionMetadata && sessionMetadata.webUrl && sessionMetadata.id) {
-    const timestamp = new Date().toISOString().replace(/T/, '_').replace(/[:.]/g, '-').slice(0, 19);
-    const agentKey = description.toLowerCase().replace(/\s+/g, '-');
-    const logDir = generateSessionLogPath(sessionMetadata.webUrl, sessionMetadata.id);
-    logFilePath = path.join(logDir, `${timestamp}_${agentKey}_attempt-${attemptNumber}.log`);
-  } else {
-    console.log(chalk.blue(`  🤖 Running Claude Code: ${description}...`));
-  }
-
-  // Declare variables that need to be accessible in both try and catch blocks
-  let turnCount = 0;
-
-  try {
-    // Create MCP server with target directory context
-    const shannonHelperServer = createShannonHelperServer(sourceDir);
-
-    // Look up agent's assigned Playwright MCP server
-    let playwrightMcpName: string | null = null;
-    if (agentName) {
-      const promptName = agentNameToPromptName(agentName);
-      playwrightMcpName = MCP_AGENT_MAPPING[promptName as keyof typeof MCP_AGENT_MAPPING] || null;
-
-      if (playwrightMcpName) {
-        console.log(chalk.gray(`    🎭 Assigned ${agentName} → ${playwrightMcpName}`));
-      }
-    }
-
-    // Configure MCP servers: shannon-helper (SDK) + playwright-agentN (stdio)
-    const mcpServers: Record<string, McpServer> = {
-      'shannon-helper': shannonHelperServer,
-    };
-
-    // Add Playwright MCP server if this agent needs browser automation
    if (playwrightMcpName) {
+      console.log(chalk.gray(`    Assigned ${agentName} -> ${playwrightMcpName}`));
+
      const userDataDir = `/tmp/${playwrightMcpName}`;

-      // Detect if running in Docker via explicit environment variable
+      // Docker uses system Chromium; local dev uses Playwright's bundled browsers
      const isDocker = process.env.SHANNON_DOCKER === 'true';

-      // Build args array - conditionally add --executable-path for Docker
      const mcpArgs: string[] = [
        '@playwright/mcp@latest',
        '--isolated',
@@ -220,7 +88,6 @@ async function runClaudePrompt(
        mcpArgs.push('--browser', 'chromium');
      }

-      // Filter out undefined env values for type safety
      const envVars: Record<string, string> = Object.fromEntries(
        Object.entries({
          ...process.env,
@@ -236,335 +103,200 @@ async function runClaudePrompt(
        env: envVars,
      };
    }
+  }

-    const options = {
-      model: 'claude-sonnet-4-5-20250929', // Use latest Claude 4.5 Sonnet
-      maxTurns: 10_000, // Maximum turns for autonomous work
-      cwd: sourceDir, // Set working directory using SDK option
-      permissionMode: 'bypassPermissions' as const, // Bypass all permission checks for pentesting
-      mcpServers,
+  return mcpServers;
+}
+
+function outputLines(lines: string[]): void {
+  for (const line of lines) {
+    console.log(line);
+  }
+}
+
+async function writeErrorLog(
+  err: Error & { code?: string; status?: number },
+  sourceDir: string,
+  fullPrompt: string,
+  duration: number
+): Promise<void> {
+  try {
+    const errorLog = {
+      timestamp: formatTimestamp(),
+      agent: 'claude-executor',
+      error: {
+        name: err.constructor.name,
+        message: err.message,
+        code: err.code,
+        status: err.status,
+        stack: err.stack
+      },
+      context: {
+        sourceDir,
+        prompt: fullPrompt.slice(0, 200) + '...',
+        retryable: isRetryableError(err)
+      },
+      duration
    };
+    const logPath = path.join(sourceDir, 'error.log');
+    await fs.appendFile(logPath, JSON.stringify(errorLog) + '\n');
+  } catch (logError) {
+    const logErrMsg = logError instanceof Error ? logError.message : String(logError);
+    console.log(chalk.gray(`    (Failed to write error log: ${logErrMsg})`));
+  }
+}

-    // SDK Options only shown for verbose agents (not clean output)
-    if (!useCleanOutput) {
-      console.log(chalk.gray(`    SDK Options: maxTurns=${options.maxTurns}, cwd=${sourceDir}, permissions=BYPASS`));
+export async function validateAgentOutput(
+  result: ClaudePromptResult,
+  agentName: string | null,
+  sourceDir: string
+): Promise<boolean> {
+  console.log(chalk.blue(`    Validating ${agentName} agent output`));
+
+  try {
+    // Check if agent completed successfully
+    if (!result.success || !result.result) {
+      console.log(chalk.red(`    Validation failed: Agent execution was unsuccessful`));
+      return false;
    }

-    let result: string | null = null;
-    const messages: string[] = [];
-    let apiErrorDetected = false;
+    // Get validator function for this agent
+    const validator = agentName ? AGENT_VALIDATORS[agentName as keyof typeof AGENT_VALIDATORS] : undefined;

-    // Start progress indicator for clean output agents
-    if (progressIndicator) {
-      progressIndicator.start();
+    if (!validator) {
+      console.log(chalk.yellow(`    No validator found for agent "${agentName}" - assuming success`));
+      console.log(chalk.green(`    Validation passed: Unknown agent with successful result`));
+      return true;
    }

-    let lastHeartbeat = Date.now();
-    const HEARTBEAT_INTERVAL = 30000; // 30 seconds
+    console.log(chalk.blue(`    Using validator for agent: ${agentName}`));
+    console.log(chalk.blue(`    Source directory: ${sourceDir}`));

-    try {
-      for await (const message of query({ prompt: fullPrompt, options })) {
-        // Periodic heartbeat for long-running agents (only when loader is disabled)
-        const now = Date.now();
-        if (global.SHANNON_DISABLE_LOADER && now - lastHeartbeat > HEARTBEAT_INTERVAL) {
-          console.log(chalk.blue(`    ⏱️  [${Math.floor((now - timer.startTime) / 1000)}s] ${description} running... (Turn ${turnCount})`));
-          lastHeartbeat = now;
-        }
+    // Apply validation function
+    const validationResult = await validator(sourceDir);

-        if (message.type === "assistant") {
-          turnCount++;
+    if (validationResult) {
+      console.log(chalk.green(`    Validation passed: Required files/structure present`));
+    } else {
+      console.log(chalk.red(`    Validation failed: Missing required deliverable files`));
+    }

-          const messageContent = message.message as { content: unknown };
-          const content = Array.isArray(messageContent.content)
-            ? messageContent.content.map((c: { text?: string }) => c.text || JSON.stringify(c)).join('\n')
-            : String(messageContent.content);
+    return validationResult;

-          if (statusManager) {
-            // Smart status updates for parallel execution - disabled
-          } else if (useCleanOutput) {
-            // Clean output for all agents: filter JSON tool calls but show meaningful text
-            const cleanedContent = filterJsonToolCalls(content);
-            if (cleanedContent.trim()) {
-              // Temporarily stop progress indicator to show output
-              if (progressIndicator) {
-                progressIndicator.stop();
-              }
+  } catch (error) {
+    const errMsg = error instanceof Error ? error.message : String(error);
+    console.log(chalk.red(`    Validation failed with error: ${errMsg}`));
+    return false;
+  }
+}

-              if (isParallelExecution) {
-                // Compact output for parallel agents with prefixes
-                const prefix = getAgentPrefix(description);
-                console.log(colorFn(`${prefix} ${cleanedContent}`));
-              } else {
-                // Full turn output for single agents
-                console.log(colorFn(`\n    🤖 Turn ${turnCount} (${description}):`));
-                console.log(colorFn(`    ${cleanedContent}`));
-              }
+// Low-level SDK execution. Handles message streaming, progress, and audit logging.
+// Exported for Temporal activities to call single-attempt execution.
+export async function runClaudePrompt(
+  prompt: string,
+  sourceDir: string,
+  context: string = '',
+  description: string = 'Claude analysis',
+  agentName: string | null = null,
+  colorFn: ChalkInstance = chalk.cyan,
+  sessionMetadata: SessionMetadata | null = null,
+  auditSession: AuditSession | null = null,
+  attemptNumber: number = 1
+): Promise<ClaudePromptResult> {
+  const timer = new Timer(`agent-${description.toLowerCase().replace(/\s+/g, '-')}`);
+  const fullPrompt = context ? `${context}\n\n${prompt}` : prompt;

-              // Restart progress indicator after output
-              if (progressIndicator) {
-                progressIndicator.start();
-              }
-            }
-          } else {
-            // Full streaming output - show complete messages with specialist color
-            console.log(colorFn(`\n    🤖 Turn ${turnCount} (${description}):`));
-            console.log(colorFn(`    ${content}`));
-          }
+  const execContext = detectExecutionContext(description);
+  const progress = createProgressManager(
+    { description, useCleanOutput: execContext.useCleanOutput },
+    global.SHANNON_DISABLE_LOADER ?? false
+  );
+  const auditLogger = createAuditLogger(auditSession);

-          // Log to audit system (crash-safe, append-only)
-          if (auditSession) {
-            await auditSession.logEvent('llm_response', {
-              turn: turnCount,
-              content,
-              timestamp: new Date().toISOString()
-            });
-          }
+  console.log(chalk.blue(`  Running Claude Code: ${description}...`));

-          messages.push(content);
+  const mcpServers = buildMcpServers(sourceDir, agentName);
+  const options = {
+    model: 'claude-sonnet-4-5-20250929',
+    maxTurns: 10_000,
+    cwd: sourceDir,
+    permissionMode: 'bypassPermissions' as const,
+    mcpServers,
+  };

-          // Check for API error patterns in assistant message content
-          if (content && typeof content === 'string') {
-            const lowerContent = content.toLowerCase();
-            if (lowerContent.includes('session limit reached')) {
-              throw new PentestError('Session limit reached', 'billing', false);
-            }
-            if (lowerContent.includes('api error') || lowerContent.includes('terminated')) {
-              apiErrorDetected = true;
-              console.log(chalk.red(`    ⚠️  API Error detected in assistant response: ${content.trim()}`));
-            }
-          }
+  if (!execContext.useCleanOutput) {
+    console.log(chalk.gray(`    SDK Options: maxTurns=${options.maxTurns}, cwd=${sourceDir}, permissions=BYPASS`));
+  }

-        } else if (message.type === "system" && (message as { subtype?: string }).subtype === "init") {
-          // Show useful system info only for verbose agents
-          if (!useCleanOutput) {
-            const initMsg = message as { model?: string; permissionMode?: string; mcp_servers?: Array<{ name: string; status: string }> };
-            console.log(chalk.blue(`    ℹ️  Model: ${initMsg.model}, Permission: ${initMsg.permissionMode}`));
-            if (initMsg.mcp_servers && initMsg.mcp_servers.length > 0) {
-              const mcpStatus = initMsg.mcp_servers.map(s => `${s.name}(${s.status})`).join(', ');
-              console.log(chalk.blue(`    📦 MCP: ${mcpStatus}`));
-            }
-          }
+  let turnCount = 0;
+  let result: string | null = null;
+  let apiErrorDetected = false;
+  let totalCost = 0;

-        } else if (message.type === "user") {
-          // Skip user messages (these are our own inputs echoed back)
-          continue;
+  progress.start();

-        } else if ((message.type as string) === "tool_use") {
-          const toolMsg = message as unknown as { name: string; input?: Record<string, unknown> };
-          console.log(chalk.yellow(`\n    🔧 Using Tool: ${toolMsg.name}`));
-          if (toolMsg.input && Object.keys(toolMsg.input).length > 0) {
-            console.log(chalk.gray(`    Input: ${JSON.stringify(toolMsg.input, null, 2)}`));
-          }
+  try {
+    const messageLoopResult = await processMessageStream(
+      fullPrompt,
+      options,
+      { execContext, description, colorFn, progress, auditLogger },
+      timer
+    );

-          // Log tool start event
-          if (auditSession) {
-            await auditSession.logEvent('tool_start', {
-              toolName: toolMsg.name,
-              parameters: toolMsg.input,
-              timestamp: new Date().toISOString()
-            });
-          }
-        } else if ((message.type as string) === "tool_result") {
-          const resultMsg = message as unknown as { content?: unknown };
-          console.log(chalk.green(`    ✅ Tool Result:`));
-          if (resultMsg.content) {
-            // Show tool results but truncate if too long
-            const resultStr = typeof resultMsg.content === 'string' ? resultMsg.content : JSON.stringify(resultMsg.content, null, 2);
-            if (resultStr.length > 500) {
-              console.log(chalk.gray(`    ${resultStr.slice(0, 500)}...\n    [Result truncated - ${resultStr.length} total chars]`));
-            } else {
-              console.log(chalk.gray(`    ${resultStr}`));
-            }
-          }
+    turnCount = messageLoopResult.turnCount;
+    result = messageLoopResult.result;
+    apiErrorDetected = messageLoopResult.apiErrorDetected;
+    totalCost = messageLoopResult.cost;

-          // Log tool end event
-          if (auditSession) {
-            await auditSession.logEvent('tool_end', {
-              result: resultMsg.content,
-              timestamp: new Date().toISOString()
-            });
-          }
-        } else if (message.type === "result") {
-          const resultMessage = message as {
-            result?: string;
-            total_cost_usd?: number;
-            duration_ms?: number;
-            subtype?: string;
-            permission_denials?: unknown[];
-          };
-          result = resultMessage.result || null;
+    // === SPENDING CAP SAFEGUARD ===
+    // Defense-in-depth: Detect spending cap that slipped through detectApiError().
+    // When spending cap is hit, Claude returns a short message with $0 cost.
+    // Legitimate agent work NEVER costs $0 with only 1-2 turns.
+    if (turnCount <= 2 && totalCost === 0) {
+      const resultLower = (result || '').toLowerCase();
+      const BILLING_KEYWORDS = ['spending', 'cap', 'limit', 'budget', 'resets'];
+      const looksLikeBillingError = BILLING_KEYWORDS.some((kw) =>
+        resultLower.includes(kw)
+      );

-          if (!statusManager) {
-            if (useCleanOutput) {
-              // Clean completion output - just duration and cost
-              console.log(chalk.magenta(`\n    🏁 COMPLETED:`));
-              const cost = resultMessage.total_cost_usd || 0;
-              console.log(chalk.gray(`    ⏱️  Duration: ${((resultMessage.duration_ms || 0)/1000).toFixed(1)}s, Cost: $${cost.toFixed(4)}`));
-
-              if (resultMessage.subtype === "error_max_turns") {
-                console.log(chalk.red(`    ⚠️  Stopped: Hit maximum turns limit`));
-              } else if (resultMessage.subtype === "error_during_execution") {
-                console.log(chalk.red(`    ❌ Stopped: Execution error`));
-              }
-
-              if (resultMessage.permission_denials && resultMessage.permission_denials.length > 0) {
-                console.log(chalk.yellow(`    🚫 ${resultMessage.permission_denials.length} permission denials`));
-              }
-            } else {
-              // Full completion output for agents without clean output
-              console.log(chalk.magenta(`\n    🏁 COMPLETED:`));
-              const cost = resultMessage.total_cost_usd || 0;
-              console.log(chalk.gray(`    ⏱️  Duration: ${((resultMessage.duration_ms || 0)/1000).toFixed(1)}s, Cost: $${cost.toFixed(4)}`));
-
-              if (resultMessage.subtype === "error_max_turns") {
-                console.log(chalk.red(`    ⚠️  Stopped: Hit maximum turns limit`));
-              } else if (resultMessage.subtype === "error_during_execution") {
-                console.log(chalk.red(`    ❌ Stopped: Execution error`));
-              }
-
-              if (resultMessage.permission_denials && resultMessage.permission_denials.length > 0) {
-                console.log(chalk.yellow(`    🚫 ${resultMessage.permission_denials.length} permission denials`));
-              }
-
-              // Show result content (if it's reasonable length)
-              if (result && typeof result === 'string') {
-                if (result.length > 1000) {
-                  console.log(chalk.magenta(`    📄 ${result.slice(0, 1000)}... [${result.length} total chars]`));
-                } else {
-                  console.log(chalk.magenta(`    📄 ${result}`));
-                }
-              }
-            }
-          }
-
-          // Track cost for all agents
-          const cost = resultMessage.total_cost_usd || 0;
-          const agentKey = description.toLowerCase().replace(/\s+/g, '-');
-          costResults.agents[agentKey] = cost;
-          costResults.total += cost;
-
-          // Store cost for return value and partial tracking
-          totalCost = cost;
-          partialCost = cost;
-          break;
-        } else {
-          // Log any other message types we might not be handling
-          console.log(chalk.gray(`    💬 ${message.type}: ${JSON.stringify(message, null, 2)}`));
-        }
+      if (looksLikeBillingError) {
+        throw new PentestError(
+          `Spending cap likely reached (turns=${turnCount}, cost=$0): ${result?.slice(0, 100)}`,
+          'billing',
+          true // Retryable - Temporal will use 5-30 min backoff
+        );
      }
-    } catch (queryError) {
-      throw queryError; // Re-throw to outer catch
    }

    const duration = timer.stop();
-    const agentKey = description.toLowerCase().replace(/\s+/g, '-');
-    timingResults.agents[agentKey] = duration;
+    timingResults.agents[execContext.agentKey] = duration;

-    // API error detection is logged but not immediately failed
    if (apiErrorDetected) {
-      console.log(chalk.yellow(`  ⚠️ API Error detected in ${description} - will validate deliverables before failing`));
+      console.log(chalk.yellow(`  API Error detected in ${description} - will validate deliverables before failing`));
    }

-    // Show completion messages based on agent type
-    if (progressIndicator) {
-      const agentType = description.includes('Pre-recon') ? 'Pre-recon analysis' :
-                       description.includes('Recon') ? 'Reconnaissance' :
-                       description.includes('Report') ? 'Report generation' : 'Analysis';
-      progressIndicator.finish(`${agentType} complete! (${turnCount} turns, ${formatDuration(duration)})`);
-    } else if (isParallelExecution) {
-      const prefix = getAgentPrefix(description);
-      console.log(chalk.green(`${prefix} ✅ Complete (${turnCount} turns, ${formatDuration(duration)})`));
-    } else if (!useCleanOutput) {
-      console.log(chalk.green(`  ✅ Claude Code completed: ${description} (${turnCount} turns) in ${formatDuration(duration)}`));
-    }
+    progress.finish(formatCompletionMessage(execContext, description, turnCount, duration));

-    // Return result with log file path for all agents
-    const returnData: ClaudePromptResult = {
+    return {
      result,
      success: true,
      duration,
      turns: turnCount,
      cost: totalCost,
-      partialCost,
+      partialCost: totalCost,
      apiErrorDetected
    };
-    if (logFilePath) {
-      returnData.logFile = logFilePath;
-    }
-    return returnData;

  } catch (error) {
    const duration = timer.stop();
-    const agentKey = description.toLowerCase().replace(/\s+/g, '-');
-    timingResults.agents[agentKey] = duration;
+    timingResults.agents[execContext.agentKey] = duration;

-    const err = error as Error & { code?: string; status?: number; duration?: number; cost?: number };
+    const err = error as Error & { code?: string; status?: number };

-    // Log error to audit system
-    if (auditSession) {
-      await auditSession.logEvent('error', {
-        message: err.message,
-        errorType: err.constructor.name,
-        stack: err.stack,
-        duration,
-        turns: turnCount,
-        timestamp: new Date().toISOString()
-      });
-    }
-
-    // Show error messages based on agent type
-    if (progressIndicator) {
-      progressIndicator.stop();
-      const agentType = description.includes('Pre-recon') ? 'Pre-recon analysis' :
-                       description.includes('Recon') ? 'Reconnaissance' :
-                       description.includes('Report') ? 'Report generation' : 'Analysis';
-      console.log(chalk.red(`❌ ${agentType} failed (${formatDuration(duration)})`));
-    } else if (isParallelExecution) {
-      const prefix = getAgentPrefix(description);
-      console.log(chalk.red(`${prefix} ❌ Failed (${formatDuration(duration)})`));
-    } else if (!useCleanOutput) {
-      console.log(chalk.red(`  ❌ Claude Code failed: ${description} (${formatDuration(duration)})`));
-    }
-    console.log(chalk.red(`    Error Type: ${err.constructor.name}`));
-    console.log(chalk.red(`    Message: ${err.message}`));
-    console.log(chalk.gray(`    Agent: ${description}`));
-    console.log(chalk.gray(`    Working Directory: ${sourceDir}`));
-    console.log(chalk.gray(`    Retryable: ${isRetryableError(err) ? 'Yes' : 'No'}`));
-
-    // Log additional context if available
-    if (err.code) {
-      console.log(chalk.gray(`    Error Code: ${err.code}`));
-    }
-    if (err.status) {
-      console.log(chalk.gray(`    HTTP Status: ${err.status}`));
-    }
-
-    // Save detailed error to log file for debugging
-    try {
-      const errorLog = {
-        timestamp: new Date().toISOString(),
-        agent: description,
-        error: {
-          name: err.constructor.name,
-          message: err.message,
-          code: err.code,
-          status: err.status,
-          stack: err.stack
-        },
-        context: {
-          sourceDir,
-          prompt: fullPrompt.slice(0, 200) + '...',
-          retryable: isRetryableError(err)
-        },
-        duration
-      };
-
-      const logPath = path.join(sourceDir, 'error.log');
-      await fs.appendFile(logPath, JSON.stringify(errorLog) + '\n');
-    } catch (logError) {
-      const logErrMsg = logError instanceof Error ? logError.message : String(logError);
-      console.log(chalk.gray(`    (Failed to write error log: ${logErrMsg})`));
-    }
+    await auditLogger.logError(err, duration, turnCount);
+    progress.stop();
+    outputLines(formatErrorOutput(err, execContext, description, duration, sourceDir, isRetryableError(err)));
+    await writeErrorLog(err, sourceDir, fullPrompt, duration);

    return {
      error: err.message,
@@ -572,17 +304,85 @@ async function runClaudePrompt(
      prompt: fullPrompt.slice(0, 100) + '...',
      success: false,
      duration,
-      cost: partialCost,
+      cost: totalCost,
      retryable: isRetryableError(err)
    };
  }
 }

-// PREFERRED: Production-ready Claude agent execution with full orchestration
+
+interface MessageLoopResult {
+  turnCount: number;
+  result: string | null;
+  apiErrorDetected: boolean;
+  cost: number;
+}
+
+interface MessageLoopDeps {
+  execContext: ReturnType<typeof detectExecutionContext>;
+  description: string;
+  colorFn: ChalkInstance;
+  progress: ReturnType<typeof createProgressManager>;
+  auditLogger: ReturnType<typeof createAuditLogger>;
+}
+
+async function processMessageStream(
+  fullPrompt: string,
+  options: NonNullable<Parameters<typeof query>[0]['options']>,
+  deps: MessageLoopDeps,
+  timer: Timer
+): Promise<MessageLoopResult> {
+  const { execContext, description, colorFn, progress, auditLogger } = deps;
+  const HEARTBEAT_INTERVAL = 30000;
+
+  let turnCount = 0;
+  let result: string | null = null;
+  let apiErrorDetected = false;
+  let cost = 0;
+  let lastHeartbeat = Date.now();
+
+  for await (const message of query({ prompt: fullPrompt, options })) {
+    // Heartbeat logging when loader is disabled
+    const now = Date.now();
+    if (global.SHANNON_DISABLE_LOADER && now - lastHeartbeat > HEARTBEAT_INTERVAL) {
+      console.log(chalk.blue(`    [${Math.floor((now - timer.startTime) / 1000)}s] ${description} running... (Turn ${turnCount})`));
+      lastHeartbeat = now;
+    }
+
+    // Increment turn count for assistant messages
+    if (message.type === 'assistant') {
+      turnCount++;
+    }
+
+    const dispatchResult = await dispatchMessage(
+      message as { type: string; subtype?: string },
+      turnCount,
+      { execContext, description, colorFn, progress, auditLogger }
+    );
+
+    if (dispatchResult.type === 'throw') {
+      throw dispatchResult.error;
+    }
+
+    if (dispatchResult.type === 'complete') {
+      result = dispatchResult.result;
+      cost = dispatchResult.cost;
+      break;
+    }
+
+    if (dispatchResult.type === 'continue' && dispatchResult.apiErrorDetected) {
+      apiErrorDetected = true;
+    }
+  }
+
+  return { turnCount, result, apiErrorDetected, cost };
+}
+
+// Main entry point for agent execution. Handles retries, git checkpoints, and validation.
 export async function runClaudePromptWithRetry(
  prompt: string,
  sourceDir: string,
-  allowedTools: string = 'Read',
+  _allowedTools: string = 'Read',
  context: string = '',
  description: string = 'Claude analysis',
  agentName: string | null = null,
@@ -593,9 +393,8 @@ export async function runClaudePromptWithRetry(
  let lastError: Error | undefined;
  let retryContext = context;

-  console.log(chalk.cyan(`🚀 Starting ${description} with ${maxRetries} max attempts`));
+  console.log(chalk.cyan(`Starting ${description} with ${maxRetries} max attempts`));

-  // Initialize audit session (crash-safe logging)
  let auditSession: AuditSession | null = null;
  if (sessionMetadata && agentName) {
    auditSession = new AuditSession(sessionMetadata);
@@ -603,29 +402,27 @@ export async function runClaudePromptWithRetry(
  }

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
-    // Create checkpoint before each attempt
    await createGitCheckpoint(sourceDir, description, attempt);

-    // Start agent tracking in audit system (saves prompt snapshot automatically)
    if (auditSession && agentName) {
      const fullPrompt = retryContext ? `${retryContext}\n\n${prompt}` : prompt;
      await auditSession.startAgent(agentName, fullPrompt, attempt);
    }

    try {
-      const result = await runClaudePrompt(prompt, sourceDir, allowedTools, retryContext, description, agentName, colorFn, sessionMetadata, auditSession, attempt);
+      const result = await runClaudePrompt(
+        prompt, sourceDir, retryContext,
+        description, agentName, colorFn, sessionMetadata, auditSession, attempt
+      );

-      // Validate output after successful run
      if (result.success) {
        const validationPassed = await validateAgentOutput(result, agentName, sourceDir);

        if (validationPassed) {
-          // Check if API error was detected but validation passed
          if (result.apiErrorDetected) {
-            console.log(chalk.yellow(`📋 Validation: Ready for exploitation despite API error warnings`));
+            console.log(chalk.yellow(`Validation: Ready for exploitation despite API error warnings`));
          }

-          // Record successful attempt in audit system
          if (auditSession && agentName) {
            const commitHash = await getGitCommitHash(sourceDir);
            const endResult: {
@@ -646,15 +443,13 @@ export async function runClaudePromptWithRetry(
            await auditSession.endAgent(agentName, endResult);
          }

-          // Commit successful changes (will include the snapshot)
          await commitGitSuccess(sourceDir, description);
-          console.log(chalk.green.bold(`🎉 ${description} completed successfully on attempt ${attempt}/${maxRetries}`));
+          console.log(chalk.green.bold(`${description} completed successfully on attempt ${attempt}/${maxRetries}`));
          return result;
+        // Validation failure is retryable - agent might succeed on retry with cleaner workspace
        } else {
-          // Agent completed but output validation failed
-          console.log(chalk.yellow(`⚠️ ${description} completed but output validation failed`));
+          console.log(chalk.yellow(`${description} completed but output validation failed`));

-          // Record failed validation attempt in audit system
          if (auditSession && agentName) {
            await auditSession.endAgent(agentName, {
              attemptNumber: attempt,
@@ -666,20 +461,17 @@ export async function runClaudePromptWithRetry(
            });
          }

-          // If API error detected AND validation failed, this is a retryable error
          if (result.apiErrorDetected) {
-            console.log(chalk.yellow(`⚠️ API Error detected with validation failure - treating as retryable`));
+            console.log(chalk.yellow(`API Error detected with validation failure - treating as retryable`));
            lastError = new Error('API Error: terminated with validation failure');
          } else {
            lastError = new Error('Output validation failed');
          }

          if (attempt < maxRetries) {
-            // Rollback contaminated workspace
            await rollbackGitWorkspace(sourceDir, 'validation failure');
            continue;
          } else {
-            // FAIL FAST - Don't continue with broken pipeline
            throw new PentestError(
              `Agent ${description} failed output validation after ${maxRetries} attempts. Required deliverable files were not created.`,
              'validation',
@@ -694,7 +486,6 @@ export async function runClaudePromptWithRetry(
      const err = error as Error & { duration?: number; cost?: number; partialResults?: unknown };
      lastError = err;

-      // Record failed attempt in audit system
      if (auditSession && agentName) {
        await auditSession.endAgent(agentName, {
          attemptNumber: attempt,
@@ -706,24 +497,21 @@ export async function runClaudePromptWithRetry(
        });
      }

-      // Check if error is retryable
      if (!isRetryableError(err)) {
-        console.log(chalk.red(`❌ ${description} failed with non-retryable error: ${err.message}`));
+        console.log(chalk.red(`${description} failed with non-retryable error: ${err.message}`));
        await rollbackGitWorkspace(sourceDir, 'non-retryable error cleanup');
        throw err;
      }

      if (attempt < maxRetries) {
-        // Rollback for clean retry
        await rollbackGitWorkspace(sourceDir, 'retryable error cleanup');

        const delay = getRetryDelay(err, attempt);
        const delaySeconds = (delay / 1000).toFixed(1);
-        console.log(chalk.yellow(`⚠️ ${description} failed (attempt ${attempt}/${maxRetries})`));
+        console.log(chalk.yellow(`${description} failed (attempt ${attempt}/${maxRetries})`));
        console.log(chalk.gray(`    Error: ${err.message}`));
        console.log(chalk.gray(`    Workspace rolled back, retrying in ${delaySeconds}s...`));

-        // Preserve any partial results for next retry
        if (err.partialResults) {
          retryContext = `${context}\n\nPrevious partial results: ${JSON.stringify(err.partialResults)}`;
        }
@@ -731,7 +519,7 @@ export async function runClaudePromptWithRetry(
        await new Promise(resolve => setTimeout(resolve, delay));
      } else {
        await rollbackGitWorkspace(sourceDir, 'final failure cleanup');
-        console.log(chalk.red(`❌ ${description} failed after ${maxRetries} attempts`));
+        console.log(chalk.red(`${description} failed after ${maxRetries} attempts`));
        console.log(chalk.red(`    Final error: ${err.message}`));
      }
    }
@@ -739,13 +527,3 @@ export async function runClaudePromptWithRetry(

  throw lastError;
 }
-
-// Helper function to get git commit hash
-async function getGitCommitHash(sourceDir: string): Promise<string | null> {
-  try {
-    const result = await $`cd ${sourceDir} && git rev-parse HEAD`;
-    return result.stdout.trim();
-  } catch {
-    return null;
-  }
-}
@@ -0,0 +1,272 @@
+// Copyright (C) 2025 Keygraph, Inc.
+//
+// This program is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License version 3
+// as published by the Free Software Foundation.
+
+// Handlers for SDK message types. The handle* functions are pure; dispatchMessage logs to the console and audit log and updates costResults.
+
+import { PentestError } from '../error-handling.js';
+import { filterJsonToolCalls } from '../utils/output-formatter.js';
+import { formatTimestamp } from '../utils/formatting.js';
+import chalk from 'chalk';
+import {
+  formatAssistantOutput,
+  formatResultOutput,
+  formatToolUseOutput,
+  formatToolResultOutput,
+} from './output-formatters.js';
+import { costResults } from '../utils/metrics.js';
+import type { AuditLogger } from './audit-logger.js';
+import type { ProgressManager } from './progress-manager.js';
+import type {
+  AssistantMessage,
+  ResultMessage,
+  ToolUseMessage,
+  ToolResultMessage,
+  AssistantResult,
+  ResultData,
+  ToolUseData,
+  ToolResultData,
+  ApiErrorDetection,
+  ContentBlock,
+  SystemInitMessage,
+  ExecutionContext,
+} from './types.js';
+import type { ChalkInstance } from 'chalk';
+
+// Handles both array and string content formats from SDK
+export function extractMessageContent(message: AssistantMessage): string {
+  const messageContent = message.message;
+
+  if (Array.isArray(messageContent.content)) {
+    return messageContent.content
+      .map((c: ContentBlock) => c.text || JSON.stringify(c))
+      .join('\n');
+  }
+
+  return String(messageContent.content);
+}
+
+export function detectApiError(content: string): ApiErrorDetection {
+  if (!content || typeof content !== 'string') {
+    return { detected: false };
+  }
+
+  const lowerContent = content.toLowerCase();
+
+  // === BILLING/SPENDING CAP ERRORS (Retryable with long backoff) ===
+  // When Claude Code hits its spending cap, it returns a short message like
+  // "Spending cap reached resets 8am" instead of throwing an error.
+  // These should retry with 5-30 min backoff so workflows can recover when cap resets.
+  const BILLING_PATTERNS = [
+    'spending cap',
+    'spending limit',
+    'cap reached',
+    'budget exceeded',
+    'usage limit',
+  ];
+
+  const isBillingError = BILLING_PATTERNS.some((pattern) =>
+    lowerContent.includes(pattern)
+  );
+
+  if (isBillingError) {
+    return {
+      detected: true,
+      shouldThrow: new PentestError(
+        `Billing limit reached: ${content.slice(0, 100)}`,
+        'billing',
+        true // RETRYABLE - Temporal will use 5-30 min backoff
+      ),
+    };
+  }
+
+  // === SESSION LIMIT (Non-retryable) ===
+  // Different from spending cap - usually means something is fundamentally wrong
+  if (lowerContent.includes('session limit reached')) {
+    return {
+      detected: true,
+      shouldThrow: new PentestError('Session limit reached', 'billing', false),
+    };
+  }
+
+  // Non-fatal API errors - detected but continue
+  if (lowerContent.includes('api error') || lowerContent.includes('terminated')) {
+    return { detected: true };
+  }
+
+  return { detected: false };
+}
+
+export function handleAssistantMessage(
+  message: AssistantMessage,
+  turnCount: number
+): AssistantResult {
+  const content = extractMessageContent(message);
+  const cleanedContent = filterJsonToolCalls(content);
+  const errorDetection = detectApiError(content);
+
+  const result: AssistantResult = {
+    content,
+    cleanedContent,
+    apiErrorDetected: errorDetection.detected,
+    logData: {
+      turn: turnCount,
+      content,
+      timestamp: formatTimestamp(),
+    },
+  };
+
+  // Only add shouldThrow if it exists (exactOptionalPropertyTypes compliance)
+  if (errorDetection.shouldThrow) {
+    result.shouldThrow = errorDetection.shouldThrow;
+  }
+
+  return result;
+}
+
+// Final message of a query with cost/duration info
+export function handleResultMessage(message: ResultMessage): ResultData {
+  const result: ResultData = {
+    result: message.result || null,
+    cost: message.total_cost_usd || 0,
+    duration_ms: message.duration_ms || 0,
+    permissionDenials: message.permission_denials?.length || 0,
+  };
+
+  // Only add subtype if it exists (exactOptionalPropertyTypes compliance)
+  if (message.subtype) {
+    result.subtype = message.subtype;
+  }
+
+  return result;
+}
+
+export function handleToolUseMessage(message: ToolUseMessage): ToolUseData {
+  return {
+    toolName: message.name,
+    parameters: message.input || {},
+    timestamp: formatTimestamp(),
+  };
+}
+
+// Truncates long results for display (500 char limit), preserves full content for logging
+export function handleToolResultMessage(message: ToolResultMessage): ToolResultData {
+  const content = message.content;
+  const contentStr =
+    typeof content === 'string' ? content : JSON.stringify(content, null, 2);
+
+  const displayContent =
+    contentStr.length > 500
+      ? `${contentStr.slice(0, 500)}...\n[Result truncated - ${contentStr.length} total chars]`
+      : contentStr;
+
+  return {
+    content,
+    displayContent,
+    timestamp: formatTimestamp(),
+  };
+}
+
+// Output helper for console logging
+function outputLines(lines: string[]): void {
+  for (const line of lines) {
+    console.log(line);
+  }
+}
+
+// Message dispatch result types
+export type MessageDispatchAction =
+  | { type: 'continue'; apiErrorDetected?: boolean }
+  | { type: 'complete'; result: string | null; cost: number }
+  | { type: 'throw'; error: Error };
+
+export interface MessageDispatchDeps {
+  execContext: ExecutionContext;
+  description: string;
+  colorFn: ChalkInstance;
+  progress: ProgressManager;
+  auditLogger: AuditLogger;
+}
+
+// Dispatches SDK messages to appropriate handlers and formatters
+export async function dispatchMessage(
+  message: { type: string; subtype?: string },
+  turnCount: number,
+  deps: MessageDispatchDeps
+): Promise<MessageDispatchAction> {
+  const { execContext, description, colorFn, progress, auditLogger } = deps;
+
+  switch (message.type) {
+    case 'assistant': {
+      const assistantResult = handleAssistantMessage(message as AssistantMessage, turnCount);
+
+      if (assistantResult.shouldThrow) {
+        return { type: 'throw', error: assistantResult.shouldThrow };
+      }
+
+      if (assistantResult.cleanedContent.trim()) {
+        progress.stop();
+        outputLines(formatAssistantOutput(
+          assistantResult.cleanedContent,
+          execContext,
+          turnCount,
+          description,
+          colorFn
+        ));
+        progress.start();
+      }
+
+      await auditLogger.logLlmResponse(turnCount, assistantResult.content);
+
+      if (assistantResult.apiErrorDetected) {
+        console.log(chalk.red(`    API Error detected in assistant response`));
+        return { type: 'continue', apiErrorDetected: true };
+      }
+
+      return { type: 'continue' };
+    }
+
+    case 'system': {
+      if (message.subtype === 'init' && !execContext.useCleanOutput) {
+        const initMsg = message as SystemInitMessage;
+        console.log(chalk.blue(`    Model: ${initMsg.model}, Permission: ${initMsg.permissionMode}`));
+        if (initMsg.mcp_servers && initMsg.mcp_servers.length > 0) {
+          const mcpStatus = initMsg.mcp_servers.map(s => `${s.name}(${s.status})`).join(', ');
+          console.log(chalk.blue(`    MCP: ${mcpStatus}`));
+        }
+      }
+      return { type: 'continue' };
+    }
+
+    case 'user':
+      return { type: 'continue' };
+
+    case 'tool_use': {
+      const toolData = handleToolUseMessage(message as unknown as ToolUseMessage);
+      outputLines(formatToolUseOutput(toolData.toolName, toolData.parameters));
+      await auditLogger.logToolStart(toolData.toolName, toolData.parameters);
+      return { type: 'continue' };
+    }
+
+    case 'tool_result': {
+      const toolResultData = handleToolResultMessage(message as unknown as ToolResultMessage);
+      outputLines(formatToolResultOutput(toolResultData.displayContent));
+      await auditLogger.logToolEnd(toolResultData.content);
+      return { type: 'continue' };
+    }
+
+    case 'result': {
+      const resultData = handleResultMessage(message as ResultMessage);
+      outputLines(formatResultOutput(resultData, !execContext.useCleanOutput));
+      costResults.agents[execContext.agentKey] = resultData.cost;
+      costResults.total += resultData.cost;
+      return { type: 'complete', result: resultData.result, cost: resultData.cost };
+    }
+
+    default:
+      console.log(chalk.gray(`    ${message.type}: ${JSON.stringify(message, null, 2)}`));
+      return { type: 'continue' };
+  }
+}
@@ -0,0 +1,169 @@
+// Copyright (C) 2025 Keygraph, Inc.
+//
+// This program is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License version 3
+// as published by the Free Software Foundation.
+
+// Pure functions for formatting console output
+
+import chalk from 'chalk';
+import { extractAgentType, formatDuration } from '../utils/formatting.js';
+import { getAgentPrefix } from '../utils/output-formatter.js';
+import type { ExecutionContext, ResultData } from './types.js';
+
+export function detectExecutionContext(description: string): ExecutionContext {
+  const isParallelExecution =
+    description.includes('vuln agent') || description.includes('exploit agent');
+
+  const useCleanOutput =
+    description.includes('Pre-recon agent') ||
+    description.includes('Recon agent') ||
+    description.includes('Executive Summary and Report Cleanup') ||
+    description.includes('vuln agent') ||
+    description.includes('exploit agent');
+
+  const agentType = extractAgentType(description);
+
+  const agentKey = description.toLowerCase().replace(/\s+/g, '-');
+
+  return { isParallelExecution, useCleanOutput, agentType, agentKey };
+}
+
+export function formatAssistantOutput(
+  cleanedContent: string,
+  context: ExecutionContext,
+  turnCount: number,
+  description: string,
+  colorFn: typeof chalk.cyan = chalk.cyan
+): string[] {
+  if (!cleanedContent.trim()) {
+    return [];
+  }
+
+  const lines: string[] = [];
+
+  if (context.isParallelExecution) {
+    // Compact output for parallel agents with prefixes
+    const prefix = getAgentPrefix(description);
+    lines.push(colorFn(`${prefix} ${cleanedContent}`));
+  } else {
+    // Full turn output for sequential agents
+    lines.push(colorFn(`\n    Turn ${turnCount} (${description}):`));
+    lines.push(colorFn(`    ${cleanedContent}`));
+  }
+
+  return lines;
+}
+
+export function formatResultOutput(data: ResultData, showFullResult: boolean): string[] {
+  const lines: string[] = [];
+
+  lines.push(chalk.magenta(`\n    COMPLETED:`));
+  lines.push(
+    chalk.gray(
+      `    Duration: ${(data.duration_ms / 1000).toFixed(1)}s, Cost: $${data.cost.toFixed(4)}`
+    )
+  );
+
+  if (data.subtype === 'error_max_turns') {
+    lines.push(chalk.red(`    Stopped: Hit maximum turns limit`));
+  } else if (data.subtype === 'error_during_execution') {
+    lines.push(chalk.red(`    Stopped: Execution error`));
+  }
+
+  if (data.permissionDenials > 0) {
+    lines.push(chalk.yellow(`    ${data.permissionDenials} permission denials`));
+  }
+
+  if (showFullResult && data.result && typeof data.result === 'string') {
+    if (data.result.length > 1000) {
+      lines.push(chalk.magenta(`    ${data.result.slice(0, 1000)}... [${data.result.length} total chars]`));
+    } else {
+      lines.push(chalk.magenta(`    ${data.result}`));
+    }
+  }
+
+  return lines;
+}
+
+export function formatErrorOutput(
+  error: Error & { code?: string; status?: number },
+  context: ExecutionContext,
+  description: string,
+  duration: number,
+  sourceDir: string,
+  isRetryable: boolean
+): string[] {
+  const lines: string[] = [];
+
+  if (context.isParallelExecution) {
+    const prefix = getAgentPrefix(description);
+    lines.push(chalk.red(`${prefix} Failed (${formatDuration(duration)})`));
+  } else if (context.useCleanOutput) {
+    lines.push(chalk.red(`${context.agentType} failed (${formatDuration(duration)})`));
+  } else {
+    lines.push(chalk.red(`  Claude Code failed: ${description} (${formatDuration(duration)})`));
+  }
+
+  lines.push(chalk.red(`    Error Type: ${error.constructor.name}`));
+  lines.push(chalk.red(`    Message: ${error.message}`));
+  lines.push(chalk.gray(`    Agent: ${description}`));
+  lines.push(chalk.gray(`    Working Directory: ${sourceDir}`));
+  lines.push(chalk.gray(`    Retryable: ${isRetryable ? 'Yes' : 'No'}`));
+
+  if (error.code) {
+    lines.push(chalk.gray(`    Error Code: ${error.code}`));
+  }
+  if (error.status) {
+    lines.push(chalk.gray(`    HTTP Status: ${error.status}`));
+  }
+
+  return lines;
+}
+
+export function formatCompletionMessage(
+  context: ExecutionContext,
+  description: string,
+  turnCount: number,
+  duration: number
+): string {
+  if (context.isParallelExecution) {
+    const prefix = getAgentPrefix(description);
+    return chalk.green(`${prefix} Complete (${turnCount} turns, ${formatDuration(duration)})`);
+  }
+
+  if (context.useCleanOutput) {
+    return chalk.green(
+      `${context.agentType.charAt(0).toUpperCase() + context.agentType.slice(1)} complete! (${turnCount} turns, ${formatDuration(duration)})`
+    );
+  }
+
+  return chalk.green(
+    `  Claude Code completed: ${description} (${turnCount} turns) in ${formatDuration(duration)}`
+  );
+}
+
+export function formatToolUseOutput(
+  toolName: string,
+  input: Record<string, unknown> | undefined
+): string[] {
+  const lines: string[] = [];
+
+  lines.push(chalk.yellow(`\n    Using Tool: ${toolName}`));
+  if (input && Object.keys(input).length > 0) {
+    lines.push(chalk.gray(`    Input: ${JSON.stringify(input, null, 2)}`));
+  }
+
+  return lines;
+}
+
+export function formatToolResultOutput(displayContent: string): string[] {
+  const lines: string[] = [];
+
+  lines.push(chalk.green(`    Tool Result:`));
+  if (displayContent) {
+    lines.push(chalk.gray(`    ${displayContent}`));
+  }
+
+  return lines;
+}
@@ -0,0 +1,76 @@
+// Copyright (C) 2025 Keygraph, Inc.
+//
+// This program is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License version 3
+// as published by the Free Software Foundation.
+
+// Null Object pattern for progress indicator - callers never check for null
+
+import { ProgressIndicator } from '../progress-indicator.js';
+import { extractAgentType } from '../utils/formatting.js';
+
+export interface ProgressContext {
+  description: string;
+  useCleanOutput: boolean;
+}
+
+export interface ProgressManager {
+  start(): void;
+  stop(): void;
+  finish(message: string): void;
+  isActive(): boolean;
+}
+
+class RealProgressManager implements ProgressManager {
+  private indicator: ProgressIndicator;
+  private active: boolean = false;
+
+  constructor(message: string) {
+    this.indicator = new ProgressIndicator(message);
+  }
+
+  start(): void {
+    this.indicator.start();
+    this.active = true;
+  }
+
+  stop(): void {
+    this.indicator.stop();
+    this.active = false;
+  }
+
+  finish(message: string): void {
+    this.indicator.finish(message);
+    this.active = false;
+  }
+
+  isActive(): boolean {
+    return this.active;
+  }
+}
+
+/** Null Object implementation - all methods are safe no-ops */
+class NullProgressManager implements ProgressManager {
+  start(): void {}
+
+  stop(): void {}
+
+  finish(_message: string): void {}
+
+  isActive(): boolean {
+    return false;
+  }
+}
+
+// Returns a no-op manager unless clean output is enabled and the loader is not disabled
+export function createProgressManager(
+  context: ProgressContext,
+  disableLoader: boolean
+): ProgressManager {
+  if (!context.useCleanOutput || disableLoader) {
+    return new NullProgressManager();
+  }
+
+  const agentType = extractAgentType(context.description);
+  return new RealProgressManager(`Running ${agentType}...`);
+}
@@ -0,0 +1,134 @@
+// Copyright (C) 2025 Keygraph, Inc.
+//
+// This program is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License version 3
+// as published by the Free Software Foundation.
+
+// Type definitions for Claude executor message processing pipeline
+
+export interface ExecutionContext {
+  isParallelExecution: boolean;
+  useCleanOutput: boolean;
+  agentType: string;
+  agentKey: string;
+}
+
+export interface ProcessingState {
+  turnCount: number;
+  result: string | null;
+  apiErrorDetected: boolean;
+  totalCost: number;
+  partialCost: number;
+  lastHeartbeat: number;
+}
+
+export interface ProcessingResult {
+  result: string | null;
+  turnCount: number;
+  apiErrorDetected: boolean;
+  totalCost: number;
+}
+
+export interface AssistantResult {
+  content: string;
+  cleanedContent: string;
+  apiErrorDetected: boolean;
+  shouldThrow?: Error;
+  logData: {
+    turn: number;
+    content: string;
+    timestamp: string;
+  };
+}
+
+export interface ResultData {
+  result: string | null;
+  cost: number;
+  duration_ms: number;
+  subtype?: string;
+  permissionDenials: number;
+}
+
+export interface ToolUseData {
+  toolName: string;
+  parameters: Record<string, unknown>;
+  timestamp: string;
+}
+
+export interface ToolResultData {
+  content: unknown;
+  displayContent: string;
+  timestamp: string;
+}
+
+export interface ContentBlock {
+  type?: string;
+  text?: string;
+}
+
+export interface AssistantMessage {
+  type: 'assistant';
+  message: {
+    content: ContentBlock[] | string;
+  };
+}
+
+export interface ResultMessage {
+  type: 'result';
+  result?: string;
+  total_cost_usd?: number;
+  duration_ms?: number;
+  subtype?: string;
+  permission_denials?: unknown[];
+}
+
+export interface ToolUseMessage {
+  type: 'tool_use';
+  name: string;
+  input?: Record<string, unknown>;
+}
+
+export interface ToolResultMessage {
+  type: 'tool_result';
+  content?: unknown;
+}
+
+export interface ApiErrorDetection {
+  detected: boolean;
+  shouldThrow?: Error;
+}
+
+// Message types from SDK stream
+export type SdkMessage =
+  | AssistantMessage
+  | ResultMessage
+  | ToolUseMessage
+  | ToolResultMessage
+  | SystemInitMessage
+  | UserMessage;
+
+export interface SystemInitMessage {
+  type: 'system';
+  subtype: 'init';
+  model?: string;
+  permissionMode?: string;
+  mcp_servers?: Array<{ name: string; status: string }>;
+}
+
+export interface UserMessage {
+  type: 'user';
+}
+
+// Dispatch result types for message processing
+export type MessageDispatchResult =
+  | { action: 'continue' }
+  | { action: 'break'; result: string | null; cost: number }
+  | { action: 'throw'; error: Error };
+
+export interface MessageDispatchContext {
+  turnCount: number;
+  execContext: ExecutionContext;
+  description: string;
+  colorFn: (text: string) => string;
+  useCleanOutput: boolean;
+}
@@ -12,8 +12,10 @@
 */

 import { AgentLogger } from './logger.js';
+import { WorkflowLogger, type AgentLogDetails, type WorkflowSummary } from './workflow-logger.js';
 import { MetricsTracker } from './metrics-tracker.js';
-import { initializeAuditStructure, formatTimestamp, type SessionMetadata } from './utils.js';
+import { initializeAuditStructure, type SessionMetadata } from './utils.js';
+import { formatTimestamp } from '../utils/formatting.js';
 import { SessionMutex } from '../utils/concurrency.js';

 // Global mutex instance
@@ -36,7 +38,9 @@ export class AuditSession {
  private sessionMetadata: SessionMetadata;
  private sessionId: string;
  private metricsTracker: MetricsTracker;
+  private workflowLogger: WorkflowLogger;
  private currentLogger: AgentLogger | null = null;
+  private currentAgentName: string | null = null;
  private initialized: boolean = false;

  constructor(sessionMetadata: SessionMetadata) {
@@ -53,6 +57,7 @@ export class AuditSession {

    // Components
    this.metricsTracker = new MetricsTracker(sessionMetadata);
+    this.workflowLogger = new WorkflowLogger(sessionMetadata);
  }

  /**
@@ -70,6 +75,9 @@ export class AuditSession {
    // Initialize metrics tracker (loads or creates session.json)
    await this.metricsTracker.initialize();

+    // Initialize workflow logger
+    await this.workflowLogger.initialize();
+
    this.initialized = true;
  }

@@ -97,6 +105,9 @@ export class AuditSession {
      await AgentLogger.savePrompt(this.sessionMetadata, agentName, promptContent);
    }

+    // Track current agent name for workflow logging
+    this.currentAgentName = agentName;
+
    // Create and initialize logger for this attempt
    this.currentLogger = new AgentLogger(this.sessionMetadata, agentName, attemptNumber);
    await this.currentLogger.initialize();
@@ -110,6 +121,9 @@ export class AuditSession {
      attemptNumber,
      timestamp: formatTimestamp(),
    });
+
+    // Log to unified workflow log
+    await this.workflowLogger.logAgent(agentName, 'start', { attemptNumber });
  }

  /**
@@ -120,7 +134,30 @@ export class AuditSession {
      throw new Error('No active logger. Call startAgent() first.');
    }

+    // Log to agent-specific log file (JSON format)
    await this.currentLogger.logEvent(eventType, eventData);
+
+    // Also log to unified workflow log (human-readable format)
+    const data = eventData as Record<string, unknown>;
+    const agentName = this.currentAgentName || 'unknown';
+    switch (eventType) {
+      case 'tool_start':
+        await this.workflowLogger.logToolStart(
+          agentName,
+          String(data.toolName || ''),
+          data.parameters
+        );
+        break;
+      case 'llm_response':
+        await this.workflowLogger.logLlmResponse(
+          agentName,
+          Number(data.turn || 0),
+          String(data.content || '')
+        );
+        break;
+      // tool_end and error events are intentionally not logged to workflow log
+      // to reduce noise - the agent completion message captures the outcome
+    }
  }

  /**
@@ -142,10 +179,23 @@ export class AuditSession {
      this.currentLogger = null;
    }

+    // Reset current agent name
+    this.currentAgentName = null;
+
+    // Log to unified workflow log
+    const agentLogDetails: AgentLogDetails = {
+      attemptNumber: result.attemptNumber,
+      duration_ms: result.duration_ms,
+      cost_usd: result.cost_usd,
+      success: result.success,
+      ...(result.error !== undefined && { error: result.error }),
+    };
+    await this.workflowLogger.logAgent(agentName, 'end', agentLogDetails);
+
    // Mutex-protected update to session.json
    const unlock = await sessionMutex.lock(this.sessionId);
    try {
-      // Reload metrics (in case of parallel updates)
+      // Reload inside mutex to prevent lost updates during parallel exploitation phase
      await this.metricsTracker.reload();

      // Update metrics
@@ -177,4 +227,28 @@ export class AuditSession {
    await this.ensureInitialized();
    return this.metricsTracker.getMetrics();
  }
+
+  /**
+   * Log phase start to unified workflow log
+   */
+  async logPhaseStart(phase: string): Promise<void> {
+    await this.ensureInitialized();
+    await this.workflowLogger.logPhase(phase, 'start');
+  }
+
+  /**
+   * Log phase completion to unified workflow log
+   */
+  async logPhaseComplete(phase: string): Promise<void> {
+    await this.ensureInitialized();
+    await this.workflowLogger.logPhase(phase, 'complete');
+  }
+
+  /**
+   * Log workflow completion to unified workflow log
+   */
+  async logWorkflowComplete(summary: WorkflowSummary): Promise<void> {
+    await this.ensureInitialized();
+    await this.workflowLogger.logWorkflowComplete(summary);
+  }
 }
@@ -18,5 +18,6 @@

 export { AuditSession } from './audit-session.js';
 export { AgentLogger } from './logger.js';
+export { WorkflowLogger } from './workflow-logger.js';
 export { MetricsTracker } from './metrics-tracker.js';
 export * as AuditUtils from './utils.js';
@@ -15,10 +15,10 @@ import fs from 'fs';
 import {
  generateLogPath,
  generatePromptPath,
-  atomicWrite,
-  formatTimestamp,
  type SessionMetadata,
 } from './utils.js';
+import { atomicWrite } from '../utils/file-io.js';
+import { formatTimestamp } from '../utils/formatting.js';

 interface LogEvent {
  type: string;
@@ -96,22 +96,13 @@ export class AgentLogger {
        return;
      }

-      // Write and flush immediately (crash-safe)
      const needsDrain = !this.stream.write(text, 'utf8', (error) => {
-        if (error) {
-          reject(error);
-        }
+        if (error) reject(error);
      });

      if (needsDrain) {
-        // Buffer is full, wait for drain
-        const drainHandler = (): void => {
-          this.stream!.removeListener('drain', drainHandler);
-          resolve();
-        };
-        this.stream.once('drain', drainHandler);
+        this.stream.once('drain', resolve);
      } else {
-        // Buffer has space, resolve immediately
        resolve();
      }
    });
@@ -13,13 +13,12 @@

 import {
  generateSessionJsonPath,
-  atomicWrite,
-  readJson,
-  fileExists,
-  formatTimestamp,
-  calculatePercentage,
  type SessionMetadata,
 } from './utils.js';
+import { atomicWrite, readJson, fileExists } from '../utils/file-io.js';
+import { formatTimestamp, calculatePercentage } from '../utils/formatting.js';
+import { AGENT_PHASE_MAP, type PhaseName } from '../session-manager.js';
+import type { AgentName } from '../types/index.js';

 interface AttemptData {
  attempt_number: number;
@@ -152,16 +151,14 @@ export class MetricsTracker {
    }

    // Initialize agent metrics if not exists
-    if (!this.data.metrics.agents[agentName]) {
-      this.data.metrics.agents[agentName] = {
-        status: 'in-progress',
-        attempts: [],
-        final_duration_ms: 0,
-        total_cost_usd: 0,
-      };
-    }
-
-    const agent = this.data.metrics.agents[agentName]!;
+    const existingAgent = this.data.metrics.agents[agentName];
+    const agent = existingAgent ?? {
+      status: 'in-progress' as const,
+      attempts: [],
+      final_duration_ms: 0,
+      total_cost_usd: 0,
+    };
+    this.data.metrics.agents[agentName] = agent;

    // Add attempt to array
    const attempt: AttemptData = {
@@ -255,36 +252,19 @@ export class MetricsTracker {
  private calculatePhaseMetrics(
    successfulAgents: Array<[string, AgentMetrics]>
  ): Record<string, PhaseMetrics> {
-    const phases: Record<string, AgentMetrics[]> = {
+    const phases: Record<PhaseName, AgentMetrics[]> = {
      'pre-recon': [],
-      recon: [],
+      'recon': [],
      'vulnerability-analysis': [],
-      exploitation: [],
-      reporting: [],
+      'exploitation': [],
+      'reporting': [],
    };

-    // Map agents to phases
-    const agentPhaseMap: Record<string, string> = {
-      'pre-recon': 'pre-recon',
-      recon: 'recon',
-      'injection-vuln': 'vulnerability-analysis',
-      'xss-vuln': 'vulnerability-analysis',
-      'auth-vuln': 'vulnerability-analysis',
-      'authz-vuln': 'vulnerability-analysis',
-      'ssrf-vuln': 'vulnerability-analysis',
-      'injection-exploit': 'exploitation',
-      'xss-exploit': 'exploitation',
-      'auth-exploit': 'exploitation',
-      'authz-exploit': 'exploitation',
-      'ssrf-exploit': 'exploitation',
-      report: 'reporting',
-    };
-
-    // Group agents by phase
+    // Group agents by phase using imported AGENT_PHASE_MAP
    for (const [agentName, agentData] of successfulAgents) {
-      const phase = agentPhaseMap[agentName];
-      if (phase && phases[phase]) {
-        phases[phase]!.push(agentData);
+      const phase = AGENT_PHASE_MAP[agentName as AgentName];
+      if (phase) {
+        phases[phase].push(agentData);
      }
    }

@@ -296,7 +276,6 @@ export class MetricsTracker {
      if (agentList.length === 0) continue;

      const phaseDuration = agentList.reduce((sum, agent) => sum + agent.final_duration_ms, 0);
-
      const phaseCost = agentList.reduce((sum, agent) => sum + agent.total_cost_usd, 0);

      phaseMetrics[phaseName] = {
@@ -31,12 +31,18 @@ export interface SessionMetadata {
 }

 /**
- * Generate standardized session identifier: {hostname}_{sessionId}
+ * Extract and sanitize hostname from URL for use in identifiers
+ */
+export function sanitizeHostname(url: string): string {
+  return new URL(url).hostname.replace(/[^a-zA-Z0-9-]/g, '-');
+}
+
+/**
+ * Generate standardized session identifier from workflow ID
+ * Workflow IDs already contain hostname, so we use them directly
 */
 export function generateSessionIdentifier(sessionMetadata: SessionMetadata): string {
-  const { id, webUrl } = sessionMetadata;
-  const hostname = new URL(webUrl).hostname.replace(/[^a-zA-Z0-9-]/g, '-');
-  return `${hostname}_${id}`;
+  return sessionMetadata.id;
 }

 /**
@@ -79,6 +85,14 @@ export function generateSessionJsonPath(sessionMetadata: SessionMetadata): strin
  return path.join(auditPath, 'session.json');
 }

+/**
+ * Generate path to workflow.log file
+ */
+export function generateWorkflowLogPath(sessionMetadata: SessionMetadata): string {
+  const auditPath = generateAuditPath(sessionMetadata);
+  return path.join(auditPath, 'workflow.log');
+}
+
 /**
 * Ensure directory exists (idempotent, race-safe)
 */
@@ -0,0 +1,382 @@
+// Copyright (C) 2025 Keygraph, Inc.
+//
+// This program is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License version 3
+// as published by the Free Software Foundation.
+
+/**
+ * Workflow Logger
+ *
+ * Provides a unified, human-readable log file per workflow.
+ * Optimized for `tail -f` viewing during concurrent workflow execution.
+ */
+
+import fs from 'fs';
+import path from 'path';
+import { generateWorkflowLogPath, ensureDirectory, type SessionMetadata } from './utils.js';
+import { formatDuration, formatTimestamp } from '../utils/formatting.js';
+
+export interface AgentLogDetails {
+  attemptNumber?: number;
+  duration_ms?: number;
+  cost_usd?: number;
+  success?: boolean;
+  error?: string;
+}
+
+export interface AgentMetricsSummary {
+  durationMs: number;
+  costUsd: number | null;
+}
+
+export interface WorkflowSummary {
+  status: 'completed' | 'failed';
+  totalDurationMs: number;
+  totalCostUsd: number;
+  completedAgents: string[];
+  agentMetrics: Record<string, AgentMetricsSummary>;
+  error?: string;
+}
+
+/**
+ * WorkflowLogger - Manages the unified workflow log file
+ */
+export class WorkflowLogger {
+  private sessionMetadata: SessionMetadata;
+  private logPath: string;
+  private stream: fs.WriteStream | null = null;
+  private initialized: boolean = false;
+
+  constructor(sessionMetadata: SessionMetadata) {
+    this.sessionMetadata = sessionMetadata;
+    this.logPath = generateWorkflowLogPath(sessionMetadata);
+  }
+
+  /**
+   * Initialize the log stream (creates file and writes header)
+   */
+  async initialize(): Promise<void> {
+    if (this.initialized) {
+      return;
+    }
+
+    // Ensure directory exists
+    await ensureDirectory(path.dirname(this.logPath));
+
+    // Create write stream with append mode
+    this.stream = fs.createWriteStream(this.logPath, {
+      flags: 'a',
+      encoding: 'utf8',
+      autoClose: true,
+    });
+
+    this.initialized = true;
+
+    // Write header only if file is new (empty)
+    const stats = await fs.promises.stat(this.logPath).catch(() => null);
+    if (!stats || stats.size === 0) {
+      await this.writeHeader();
+    }
+  }
+
+  /**
+   * Write header to log file
+   */
+  private async writeHeader(): Promise<void> {
+    const header = [
+      `================================================================================`,
+      `Shannon Pentest - Workflow Log`,
+      `================================================================================`,
+      `Workflow ID: ${this.sessionMetadata.id}`,
+      `Target URL:  ${this.sessionMetadata.webUrl}`,
+      `Started:     ${formatTimestamp()}`,
+      `================================================================================`,
+      ``,
+    ].join('\n');
+
+    return this.writeRaw(header);
+  }
+
+  /**
+   * Write raw text to the log stream, waiting for 'drain' when the buffer is full
+   */
+  private writeRaw(text: string): Promise<void> {
+    return new Promise((resolve, reject) => {
+      if (!this.initialized || !this.stream) {
+        reject(new Error('WorkflowLogger not initialized'));
+        return;
+      }
+
+      const needsDrain = !this.stream.write(text, 'utf8', (error) => {
+        if (error) reject(error);
+      });
+
+      if (needsDrain) {
+        this.stream.once('drain', resolve);
+      } else {
+        resolve();
+      }
+    });
+  }
+
+  /**
+   * Format timestamp for log line (UTC, human readable)
+   */
+  private formatLogTime(): string {
+    const now = new Date();
+    return now.toISOString().replace('T', ' ').slice(0, 19);
+  }
+
+  /**
+   * Log a phase transition event
+   */
+  async logPhase(phase: string, event: 'start' | 'complete'): Promise<void> {
+    await this.ensureInitialized();
+
+    const action = event === 'start' ? 'Starting' : 'Completed';
+    const line = `[${this.formatLogTime()}] [PHASE] ${action}: ${phase}\n`;
+
+    // Add blank line before phase start for readability
+    if (event === 'start') {
+      await this.writeRaw('\n');
+    }
+
+    await this.writeRaw(line);
+  }
+
+  /**
+   * Log an agent event
+   */
+  async logAgent(
+    agentName: string,
+    event: 'start' | 'end',
+    details?: AgentLogDetails
+  ): Promise<void> {
+    await this.ensureInitialized();
+
+    let message: string;
+
+    if (event === 'start') {
+      const attempt = details?.attemptNumber ?? 1;
+      message = `${agentName}: Starting (attempt ${attempt})`;
+    } else {
+      const parts: string[] = [agentName + ':'];
+
+      if (details?.success === false) {
+        parts.push('Failed');
+        if (details?.error) {
+          parts.push(`- ${details.error}`);
+        }
+      } else {
+        parts.push('Completed');
+      }
+
+      if (details?.duration_ms !== undefined) {
+        // Emit one token so join(' ') leaves no stray space before ')'
+        const cost =
+          details?.cost_usd !== undefined
+            ? ` $${details.cost_usd.toFixed(2)}`
+            : '';
+        parts.push(`(${formatDuration(details.duration_ms)}${cost})`);
+      }
+
+      message = parts.join(' ');
+    }
+
+    const line = `[${this.formatLogTime()}] [AGENT] ${message}\n`;
+    await this.writeRaw(line);
+  }
+
+  /**
+   * Log a general event
+   */
+  async logEvent(eventType: string, message: string): Promise<void> {
+    await this.ensureInitialized();
+
+    const line = `[${this.formatLogTime()}] [${eventType.toUpperCase()}] ${message}\n`;
+    await this.writeRaw(line);
+  }
+
+  /**
+   * Log an error
+   */
+  async logError(error: Error, context?: string): Promise<void> {
+    await this.ensureInitialized();
+
+    const contextStr = context ? ` (${context})` : '';
+    const line = `[${this.formatLogTime()}] [ERROR] ${error.message}${contextStr}\n`;
+    await this.writeRaw(line);
+  }
+
+  /**
+   * Truncate string to max length with ellipsis
+   */
+  private truncate(str: string, maxLen: number): string {
+    if (str.length <= maxLen) return str;
+    return str.slice(0, maxLen - 3) + '...';
+  }
+
+  /**
+   * Format tool parameters for human-readable display
+   */
+  private formatToolParams(toolName: string, params: unknown): string {
+    if (!params || typeof params !== 'object') {
+      return '';
+    }
+
+    const p = params as Record<string, unknown>;
+
+    // Tool-specific formatting for common tools
+    switch (toolName) {
+      case 'Bash':
+        if (p.command) {
+          return this.truncate(String(p.command).replace(/\n/g, ' '), 100);
+        }
+        break;
+      case 'Read':
+        if (p.file_path) {
+          return String(p.file_path);
+        }
+        break;
+      case 'Write':
+        if (p.file_path) {
+          return String(p.file_path);
+        }
+        break;
+      case 'Edit':
+        if (p.file_path) {
+          return String(p.file_path);
+        }
+        break;
+      case 'Glob':
+        if (p.pattern) {
+          return String(p.pattern);
+        }
+        break;
+      case 'Grep':
+        if (p.pattern) {
+          const location = p.path ? ` in ${p.path}` : '';
+          return `"${this.truncate(String(p.pattern), 50)}"${location}`;
+        }
+        break;
+      case 'WebFetch':
+        if (p.url) {
+          return String(p.url);
+        }
+        break;
+      case 'mcp__playwright__browser_navigate':
+        if (p.url) {
+          return String(p.url);
+        }
+        break;
+      case 'mcp__playwright__browser_click':
+        if (p.selector) {
+          return this.truncate(String(p.selector), 60);
+        }
+        break;
+      case 'mcp__playwright__browser_type':
+        if (p.selector) {
+          const text = p.text ? `: "${this.truncate(String(p.text), 30)}"` : '';
+          return `${this.truncate(String(p.selector), 40)}${text}`;
+        }
+        break;
+    }
+
+    // Default: show first string-valued param truncated
+    for (const [key, val] of Object.entries(p)) {
+      if (typeof val === 'string' && val.length > 0) {
+        return `${key}=${this.truncate(val, 60)}`;
+      }
+    }
+
+    return '';
+  }
+
+  /**
+   * Log tool start event
+   */
+  async logToolStart(agentName: string, toolName: string, parameters: unknown): Promise<void> {
+    await this.ensureInitialized();
+
+    const params = this.formatToolParams(toolName, parameters);
+    const paramStr = params ? `: ${params}` : '';
+    const line = `[${this.formatLogTime()}] [${agentName}] [TOOL] ${toolName}${paramStr}\n`;
+    await this.writeRaw(line);
+  }
+
+  /**
+   * Log LLM response
+   */
+  async logLlmResponse(agentName: string, turn: number, content: string): Promise<void> {
+    await this.ensureInitialized();
+
+    // Show full content, replacing newlines with escaped version for single-line output
+    const escaped = content.replace(/\n/g, '\\n');
+    const line = `[${this.formatLogTime()}] [${agentName}] [LLM] Turn ${turn}: ${escaped}\n`;
+    await this.writeRaw(line);
+  }
+
+  /**
+   * Log workflow completion with full summary
+   */
+  async logWorkflowComplete(summary: WorkflowSummary): Promise<void> {
+    await this.ensureInitialized();
+
+    const status = summary.status === 'completed' ? 'COMPLETED' : 'FAILED';
+
+    await this.writeRaw('\n');
+    await this.writeRaw(`================================================================================\n`);
+    await this.writeRaw(`Workflow ${status}\n`);
+    await this.writeRaw(`────────────────────────────────────────\n`);
+    await this.writeRaw(`Workflow ID: ${this.sessionMetadata.id}\n`);
+    await this.writeRaw(`Status:      ${summary.status}\n`);
+    await this.writeRaw(`Duration:    ${formatDuration(summary.totalDurationMs)}\n`);
+    await this.writeRaw(`Total Cost:  $${summary.totalCostUsd.toFixed(4)}\n`);
+    await this.writeRaw(`Agents:      ${summary.completedAgents.length} completed\n`);
+
+    if (summary.error) {
+      await this.writeRaw(`Error:       ${summary.error}\n`);
+    }
+
+    await this.writeRaw(`\n`);
+    await this.writeRaw(`Agent Breakdown:\n`);
+
+    for (const agentName of summary.completedAgents) {
+      const metrics = summary.agentMetrics[agentName];
+      if (metrics) {
+        const duration = formatDuration(metrics.durationMs);
+        const cost = metrics.costUsd !== null ? `$${metrics.costUsd.toFixed(4)}` : 'N/A';
+        await this.writeRaw(`  - ${agentName} (${duration}, ${cost})\n`);
+      } else {
+        await this.writeRaw(`  - ${agentName}\n`);
+      }
+    }
+
+    await this.writeRaw(`================================================================================\n`);
+  }
+
+  /**
+   * Ensure initialized (helper for lazy initialization)
+   */
+  private async ensureInitialized(): Promise<void> {
+    if (!this.initialized) {
+      await this.initialize();
+    }
+  }
+
+  /**
+   * Close the log stream
+   */
+  async close(): Promise<void> {
+    if (!this.initialized || !this.stream) {
+      return;
+    }
+
+    return new Promise((resolve) => {
+      this.stream!.end(() => {
+        this.initialized = false;
+        resolve();
+      });
+    });
+  }
+}
@@ -14,6 +14,12 @@ import type {
  PromptErrorResult,
 } from './types/errors.js';

+// Temporal error classification for ApplicationFailure wrapping
+export interface TemporalErrorClassification {
+  type: string;
+  retryable: boolean;
+}
+
 // Custom error class for pentest operations
 export class PentestError extends Error {
  name = 'PentestError' as const;
@@ -37,11 +43,11 @@ export class PentestError extends Error {
 }

 // Centralized error logging function
-export const logError = async (
+export async function logError(
  error: Error & { type?: PentestErrorType; retryable?: boolean; context?: PentestErrorContext },
  contextMsg: string,
  sourceDir: string | null = null
-): Promise<LogEntry> => {
+): Promise<LogEntry> {
  const timestamp = new Date().toISOString();
  const logEntry: LogEntry = {
    timestamp,
@@ -80,13 +86,13 @@ export const logError = async (
  }

  return logEntry;
-};
+}

 // Handle tool execution errors
-export const handleToolError = (
+export function handleToolError(
  toolName: string,
  error: Error & { code?: string }
-): ToolErrorResult => {
+): ToolErrorResult {
  const isRetryable =
    error.code === 'ECONNRESET' ||
    error.code === 'ETIMEDOUT' ||
@@ -105,13 +111,13 @@ export const handleToolError = (
      { toolName, originalError: error.message, errorCode: error.code }
    ),
  };
-};
+}

 // Handle prompt loading errors
-export const handlePromptError = (
+export function handlePromptError(
  promptName: string,
  error: Error
-): PromptErrorResult => {
+): PromptErrorResult {
  return {
    success: false,
    error: new PentestError(
@@ -121,78 +127,63 @@ export const handlePromptError = (
      { promptName, originalError: error.message }
    ),
  };
-};
+}

-// Check if an error should trigger a retry for Claude agents
-export const isRetryableError = (error: Error): boolean => {
+// Patterns that indicate retryable errors
+const RETRYABLE_PATTERNS = [
+  // Network and connection errors
+  'network',
+  'connection',
+  'timeout',
+  'econnreset',
+  'enotfound',
+  'econnrefused',
+  // Rate limiting
+  'rate limit',
+  '429',
+  'too many requests',
+  // Server errors
+  'server error',
+  '5xx',
+  'internal server error',
+  'service unavailable',
+  'bad gateway',
+  // Claude API errors
+  'mcp server',
+  'model unavailable',
+  'service temporarily unavailable',
+  'api error',
+  'terminated',
+  // Max turns
+  'max turns',
+  'maximum turns',
+];
+
+// Patterns that indicate non-retryable errors (checked before default)
+const NON_RETRYABLE_PATTERNS = [
+  'authentication',
+  'invalid prompt',
+  'out of memory',
+  'permission denied',
+  'session limit reached',
+  'invalid api key',
+];
+
+// Conservative retry classification - unknown errors don't retry (fail-safe default)
+export function isRetryableError(error: Error): boolean {
  const message = error.message.toLowerCase();

-  // Network and connection errors - always retryable
-  if (
-    message.includes('network') ||
-    message.includes('connection') ||
-    message.includes('timeout') ||
-    message.includes('econnreset') ||
-    message.includes('enotfound') ||
-    message.includes('econnrefused')
-  ) {
-    return true;
-  }
-
-  // Rate limiting - retryable with longer backoff
-  if (
-    message.includes('rate limit') ||
-    message.includes('429') ||
-    message.includes('too many requests')
-  ) {
-    return true;
-  }
-
-  // Server errors - retryable
-  if (
-    message.includes('server error') ||
-    message.includes('5xx') ||
-    message.includes('internal server error') ||
-    message.includes('service unavailable') ||
-    message.includes('bad gateway')
-  ) {
-    return true;
-  }
-
-  // Claude API specific errors - retryable
-  if (
-    message.includes('mcp server') ||
-    message.includes('model unavailable') ||
-    message.includes('service temporarily unavailable') ||
-    message.includes('api error') ||
-    message.includes('terminated')
-  ) {
-    return true;
-  }
-
-  // Max turns without completion - retryable once
-  if (message.includes('max turns') || message.includes('maximum turns')) {
-    return true;
-  }
-
-  // Non-retryable errors
-  if (
-    message.includes('authentication') ||
-    message.includes('invalid prompt') ||
-    message.includes('out of memory') ||
-    message.includes('permission denied') ||
-    message.includes('session limit reached') ||
-    message.includes('invalid api key')
-  ) {
+  // Check for explicit non-retryable patterns first
+  if (NON_RETRYABLE_PATTERNS.some((pattern) => message.includes(pattern))) {
    return false;
  }

-  // Default to non-retryable for unknown errors
-  return false;
-};
+  // Check for retryable patterns
+  return RETRYABLE_PATTERNS.some((pattern) => message.includes(pattern));
+}

-// Get retry delay based on error type and attempt number
-export const getRetryDelay = (error: Error, attempt: number): number => {
+// Rate limit errors get longer base delay (30s) vs standard exponential backoff (2s)
+export function getRetryDelay(error: Error, attempt: number): number {
  const message = error.message.toLowerCase();

  // Rate limiting gets longer delays
@@ -204,4 +195,125 @@ export const getRetryDelay = (error: Error, attempt: number): number => {
  const baseDelay = Math.pow(2, attempt) * 1000; // 2s, 4s, 8s
  const jitter = Math.random() * 1000; // 0-1s random
  return Math.min(baseDelay + jitter, 30000); // Max 30s
-};
+}
+
+/**
+ * Classifies errors for Temporal workflow retry behavior.
+ * Returns error type and whether Temporal should retry.
+ *
+ * Used by activities to wrap errors in ApplicationFailure:
+ * - Retryable errors: Temporal retries with configured backoff
+ * - Non-retryable errors: Temporal fails immediately
+ */
+export function classifyErrorForTemporal(error: unknown): TemporalErrorClassification {
+  const message = (error instanceof Error ? error.message : String(error)).toLowerCase();
+
+  // === BILLING ERRORS (Retryable with long backoff) ===
+  // Anthropic returns billing as 400 invalid_request_error
+  // Human can add credits OR wait for spending cap to reset (5-30 min backoff)
+  if (
+    message.includes('billing_error') ||
+    message.includes('credit balance is too low') ||
+    message.includes('insufficient credits') ||
+    message.includes('usage is blocked due to insufficient credits') ||
+    message.includes('please visit plans & billing') ||
+    message.includes('please visit plans and billing') ||
+    message.includes('usage limit reached') ||
+    message.includes('quota exceeded') ||
+    message.includes('daily rate limit') ||
+    message.includes('limit will reset') ||
+    // Claude Code spending cap patterns (returns short message instead of error)
+    message.includes('spending cap') ||
+    message.includes('spending limit') ||
+    message.includes('cap reached') ||
+    message.includes('budget exceeded') ||
+    message.includes('billing limit reached')
+  ) {
+    return { type: 'BillingError', retryable: true };
+  }
+
+  // === PERMANENT ERRORS (Non-retryable) ===
+
+  // Authentication (401) - bad API key won't fix itself
+  if (
+    message.includes('authentication') ||
+    message.includes('api key') ||
+    message.includes('401') ||
+    message.includes('authentication_error')
+  ) {
+    return { type: 'AuthenticationError', retryable: false };
+  }
+
+  // Permission (403) - access won't be granted
+  if (
+    message.includes('permission') ||
+    message.includes('forbidden') ||
+    message.includes('403')
+  ) {
+    return { type: 'PermissionError', retryable: false };
+  }
+
+  // === OUTPUT VALIDATION ERRORS (Retryable) ===
+  // Agent didn't produce expected deliverables - retry may succeed
+  // IMPORTANT: Must come BEFORE generic 'validation' check below
+  if (
+    message.includes('failed output validation') ||
+    message.includes('output validation failed')
+  ) {
+    return { type: 'OutputValidationError', retryable: true };
+  }
+
+  // Invalid Request (400) - malformed request is permanent
+  // Note: Checked AFTER billing and AFTER output validation
+  if (
+    message.includes('invalid_request_error') ||
+    message.includes('malformed') ||
+    message.includes('validation')
+  ) {
+    return { type: 'InvalidRequestError', retryable: false };
+  }
+
+  // Request Too Large (413) - won't fit no matter how many retries
+  if (
+    message.includes('request_too_large') ||
+    message.includes('too large') ||
+    message.includes('413')
+  ) {
+    return { type: 'RequestTooLargeError', retryable: false };
+  }
+
+  // Configuration errors - missing files need manual fix
+  if (
+    message.includes('enoent') ||
+    message.includes('no such file') ||
+    message.includes('cli not installed')
+  ) {
+    return { type: 'ConfigurationError', retryable: false };
+  }
+
+  // Execution limits - max turns/budget reached
+  if (
+    message.includes('max turns') ||
+    message.includes('budget') ||
+    message.includes('execution limit') ||
+    message.includes('error_max_turns') ||
+    message.includes('error_max_budget')
+  ) {
+    return { type: 'ExecutionLimitError', retryable: false };
+  }
+
+  // Invalid target URL - bad URL format won't fix itself
+  if (
+    message.includes('invalid url') ||
+    message.includes('invalid target') ||
+    message.includes('malformed url') ||
+    message.includes('invalid uri')
+  ) {
+    return { type: 'InvalidTargetError', retryable: false };
+  }
+
+  // === TRANSIENT ERRORS (Retryable) ===
+  // Default for anything not matched above: rate limits (429), server errors (5xx), network issues
+  // Let Temporal retry with configured backoff
+  return { type: 'TransientError', retryable: true };
+}
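The check order in `classifyErrorForTemporal` is load-bearing: billing is tested before the generic 400 check, and output validation before the generic `'validation'` substring. The following is a minimal sketch of that first-match-wins behavior, assuming a table-driven layout. The names `RULES`, `Rule`, and `classifySketch` are hypothetical and not part of this PR, and the pattern lists are trimmed to a few entries.

```typescript
// Hypothetical, simplified rule table; the names are illustrative and not part of the PR.
type Classification = { type: string; retryable: boolean };

interface Rule {
  type: string;
  retryable: boolean;
  patterns: readonly string[];
}

// Order matters: the first rule with a matching substring wins.
const RULES: readonly Rule[] = [
  { type: 'BillingError', retryable: true, patterns: ['billing_error', 'spending cap'] },
  { type: 'OutputValidationError', retryable: true, patterns: ['output validation failed'] },
  { type: 'InvalidRequestError', retryable: false, patterns: ['invalid_request_error', 'validation'] },
];

function classifySketch(error: unknown): Classification {
  const message = (error instanceof Error ? error.message : String(error)).toLowerCase();
  const hit = RULES.find((rule) => rule.patterns.some((p) => message.includes(p)));
  // Anything unmatched is treated as transient, mirroring the fall-through in the diff.
  return hit
    ? { type: hit.type, retryable: hit.retryable }
    : { type: 'TransientError', retryable: true };
}

const outputFailure = classifySketch(new Error('Output validation failed: missing deliverable'));
const schemaFailure = classifySketch(new Error('Schema validation error'));
const billingAs400 = classifySketch(new Error('invalid_request_error: billing_error'));
const networkBlip = classifySketch('socket hang up');
console.log(outputFailure.type, schemaFailure.type, billingAs400.type, networkBlip.type);
```

Swapping the second and third entries in the table would turn every output validation failure into a permanent `InvalidRequestError`, which is the regression the "Must come BEFORE" comment guards against.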
@@ -7,7 +7,7 @@
 import { $, fs, path } from 'zx';
 import chalk from 'chalk';
 import { Timer } from '../utils/metrics.js';
-import { formatDuration } from '../audit/utils.js';
+import { formatDuration } from '../utils/formatting.js';
 import { handleToolError, PentestError } from '../error-handling.js';
 import { AGENTS } from '../session-manager.js';
 import { runClaudePromptWithRetry } from '../ai/claude-executor.js';
@@ -40,11 +40,17 @@ interface PromptVariables {
  repoPath: string;
 }

+// Discriminated union for Wave1 tool results - clearer than loose union types
+type Wave1ToolResult =
+  | { kind: 'scan'; result: TerminalScanResult }
+  | { kind: 'skipped'; message: string }
+  | { kind: 'agent'; result: AgentResult };
+
 interface Wave1Results {
-  nmap: TerminalScanResult | string | AgentResult;
-  subfinder: TerminalScanResult | string | AgentResult;
-  whatweb: TerminalScanResult | string | AgentResult;
-  naabu?: TerminalScanResult | string | AgentResult;
+  nmap: Wave1ToolResult;
+  subfinder: Wave1ToolResult;
+  whatweb: Wave1ToolResult;
+  naabu?: Wave1ToolResult;
  codeAnalysis: AgentResult;
 }

@@ -57,7 +63,7 @@ interface PreReconResult {
  report: string;
 }

-// Pure function: Run terminal scanning tools
+// Runs external security tools (nmap, whatweb, etc.). Schemathesis requires schemas from code analysis.
 async function runTerminalScan(tool: ToolName, target: string, sourceDir: string | null = null): Promise<TerminalScanResult> {
  const timer = new Timer(`command-${tool}`);
  try {
@@ -89,7 +95,7 @@ async function runTerminalScan(tool: ToolName, target: string, sourceDir: string
        return { tool: 'whatweb', output: result.stdout, status: 'success', duration: whatwebDuration };
      }
      case 'schemathesis': {
-        // Only run if API schemas found
+        // Schemathesis depends on code analysis output - skip if no schemas found
        const schemasDir = path.join(sourceDir || '.', 'outputs', 'schemas');
        if (await fs.pathExists(schemasDir)) {
          const schemaFiles = await fs.readdir(schemasDir) as string[];
@@ -146,6 +152,8 @@ async function runPreReconWave1(

  const operations: Promise<TerminalScanResult | AgentResult>[] = [];

+  const skippedResult = (message: string): Wave1ToolResult => ({ kind: 'skipped', message });
+
  // Skip external commands in pipeline testing mode
  if (pipelineTestingMode) {
    console.log(chalk.gray('    ⏭️ Skipping external tools (pipeline testing mode)'));
@@ -163,9 +171,9 @@ async function runPreReconWave1(
    );
    const [codeAnalysis] = await Promise.all(operations);
    return {
-      nmap: 'Skipped (pipeline testing mode)',
-      subfinder: 'Skipped (pipeline testing mode)',
-      whatweb: 'Skipped (pipeline testing mode)',
+      nmap: skippedResult('Skipped (pipeline testing mode)'),
+      subfinder: skippedResult('Skipped (pipeline testing mode)'),
+      whatweb: skippedResult('Skipped (pipeline testing mode)'),
      codeAnalysis: codeAnalysis as AgentResult
    };
  } else {
@@ -192,9 +200,9 @@ async function runPreReconWave1(
  const [nmap, subfinder, whatweb, codeAnalysis] = await Promise.all(operations);

  return {
-    nmap: nmap as TerminalScanResult,
-    subfinder: subfinder as TerminalScanResult,
-    whatweb: whatweb as TerminalScanResult,
+    nmap: { kind: 'scan', result: nmap as TerminalScanResult },
+    subfinder: { kind: 'scan', result: subfinder as TerminalScanResult },
+    whatweb: { kind: 'scan', result: whatweb as TerminalScanResult },
    codeAnalysis: codeAnalysis as AgentResult
  };
 }
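The `kind` tag on `Wave1ToolResult` lets a consumer narrow each variant without `typeof` checks or casts. Below is a small sketch of that narrowing with an added exhaustiveness guard. `ScanLike`, `AgentLike`, `ToolResultSketch`, and `summarize` are stand-ins invented for this illustration; the real `TerminalScanResult` and `AgentResult` carry more fields.

```typescript
// Stand-in types for illustration only; not the PR's actual interfaces.
interface ScanLike { status: string; output: string }
interface AgentLike { success: boolean }

type ToolResultSketch =
  | { kind: 'scan'; result: ScanLike }
  | { kind: 'skipped'; message: string }
  | { kind: 'agent'; result: AgentLike };

function summarize(r: ToolResultSketch | undefined): { status: string; output: string } {
  if (!r) return { status: 'Skipped', output: 'No output' };
  switch (r.kind) {
    case 'scan':
      return { status: r.result.status, output: r.result.output };
    case 'skipped':
      return { status: 'Skipped', output: r.message };
    case 'agent':
      return { status: r.result.success ? 'success' : 'error', output: 'See agent output' };
    default: {
      // If a new `kind` is added to the union, this assignment stops compiling.
      const unreachable: never = r;
      throw new Error(`Unhandled result kind: ${JSON.stringify(unreachable)}`);
    }
  }
}

const skipped = summarize({ kind: 'skipped', message: 'Skipped (pipeline testing mode)' });
const scanned = summarize({ kind: 'scan', result: { status: 'success', output: '80/tcp open' } });
const missing = summarize(undefined);
console.log(skipped, scanned, missing);
```

The `never` assignment in the `default` branch is optional, but it turns a forgotten variant into a compile error instead of a silent `undefined` return.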
@@ -250,17 +258,21 @@ async function runPreReconWave2(
  return response;
 }

-// Helper type for stitching results
-interface StitchableResult {
-  status?: string;
-  output?: string;
-  tool?: string;
+// Extracts status and output from a Wave1 tool result
+function extractResult(r: Wave1ToolResult | undefined): { status: string; output: string } {
+  if (!r) return { status: 'Skipped', output: 'No output' };
+  switch (r.kind) {
+    case 'scan':
+      return { status: r.result.status || 'Skipped', output: r.result.output || 'No output' };
+    case 'skipped':
+      return { status: 'Skipped', output: r.message };
+    case 'agent':
+      return { status: r.result.success ? 'success' : 'error', output: 'See agent output' };
+  }
 }

-// Pure function: Stitch together pre-recon outputs and save to file
-async function stitchPreReconOutputs(outputs: (StitchableResult | string | undefined)[], sourceDir: string): Promise<string> {
-  const [nmap, subfinder, whatweb, naabu, codeAnalysis, ...additionalScans] = outputs;
-
+// Combines tool outputs into a single deliverable. Falls back to a pointer to the deliverable path if the code analysis file cannot be read.
+async function stitchPreReconOutputs(wave1: Wave1Results, additionalScans: TerminalScanResult[], sourceDir: string): Promise<string> {
  // Try to read the code analysis deliverable file
  let codeAnalysisContent = 'No analysis available';
  try {
@@ -269,62 +281,45 @@ async function stitchPreReconOutputs(outputs: (StitchableResult | string | undef
  } catch (error) {
    const err = error as Error;
    console.log(chalk.yellow(`⚠️ Could not read code analysis deliverable: ${err.message}`));
-    // Fallback message if file doesn't exist
    codeAnalysisContent = 'Analysis located in deliverables/code_analysis_deliverable.md';
  }

-
  // Build additional scans section
  let additionalSection = '';
-  if (additionalScans && additionalScans.length > 0) {
+  if (additionalScans.length > 0) {
    additionalSection = '\n## Authenticated Scans\n';
-    additionalScans.forEach(scan => {
-      const s = scan as StitchableResult;
-      if (s && s.tool) {
-        additionalSection += `
-### ${s.tool.toUpperCase()}
-Status: ${s.status}
-${s.output}
+    for (const scan of additionalScans) {
+      additionalSection += `
+### ${scan.tool.toUpperCase()}
+Status: ${scan.status}
+${scan.output}
 `;
-      }
-    });
+    }
  }

-  const nmapResult = nmap as StitchableResult | string | undefined;
-  const subfinderResult = subfinder as StitchableResult | string | undefined;
-  const whatwebResult = whatweb as StitchableResult | string | undefined;
-  const naabuResult = naabu as StitchableResult | string | undefined;
-
-  const getStatus = (r: StitchableResult | string | undefined): string => {
-    if (!r) return 'Skipped';
-    if (typeof r === 'string') return 'Skipped';
-    return r.status || 'Skipped';
-  };
-
-  const getOutput = (r: StitchableResult | string | undefined): string => {
-    if (!r) return 'No output';
-    if (typeof r === 'string') return r;
-    return r.output || 'No output';
-  };
+  const nmap = extractResult(wave1.nmap);
+  const subfinder = extractResult(wave1.subfinder);
+  const whatweb = extractResult(wave1.whatweb);
+  const naabu = extractResult(wave1.naabu);

  const report = `
 # Pre-Reconnaissance Report

 ## Port Discovery (naabu)
-Status: ${getStatus(naabuResult)}
-${getOutput(naabuResult)}
+Status: ${naabu.status}
+${naabu.output}

 ## Network Scanning (nmap)
-Status: ${getStatus(nmapResult)}
-${getOutput(nmapResult)}
+Status: ${nmap.status}
+${nmap.output}

 ## Subdomain Discovery (subfinder)
-Status: ${getStatus(subfinderResult)}
-${getOutput(subfinderResult)}
+Status: ${subfinder.status}
+${subfinder.output}

 ## Technology Detection (whatweb)
-Status: ${getStatus(whatwebResult)}
-${getOutput(whatwebResult)}
+Status: ${whatweb.status}
+${whatweb.output}
 ## Code Analysis
 ${codeAnalysisContent}
 ${additionalSection}
@@ -375,16 +370,8 @@ export async function executePreReconPhase(
  console.log(chalk.green('  ✅ Wave 2 operations completed'));

  console.log(chalk.blue('📝 Stitching pre-recon outputs...'));
-  // Combine wave 1 and wave 2 results for stitching
-  const allResults: (StitchableResult | string | undefined)[] = [
-    wave1Results.nmap as StitchableResult | string,
-    wave1Results.subfinder as StitchableResult | string,
-    wave1Results.whatweb as StitchableResult | string,
-    wave1Results.naabu as StitchableResult | string | undefined,
-    wave1Results.codeAnalysis as unknown as StitchableResult,
-    ...(wave2Results.schemathesis ? [wave2Results.schemathesis as StitchableResult] : [])
-  ];
-  const preReconReport = await stitchPreReconOutputs(allResults, sourceDir);
+  const additionalScans = wave2Results.schemathesis ? [wave2Results.schemathesis] : [];
+  const preReconReport = await stitchPreReconOutputs(wave1Results, additionalScans, sourceDir);
  const duration = timer.stop();

  console.log(chalk.green(`✅ Pre-reconnaissance complete in ${formatDuration(duration)}`));
@@ -48,9 +48,12 @@ export async function assembleFinalReport(sourceDir: string): Promise<string> {
  }

  const finalContent = sections.join('\n\n');
-  const finalReportPath = path.join(sourceDir, 'deliverables', 'comprehensive_security_assessment_report.md');
+  const deliverablesDir = path.join(sourceDir, 'deliverables');
+  const finalReportPath = path.join(deliverablesDir, 'comprehensive_security_assessment_report.md');

  try {
+    // Ensure deliverables directory exists
+    await fs.ensureDir(deliverablesDir);
    await fs.writeFile(finalReportPath, finalContent);
    console.log(chalk.green(`✅ Final report assembled at ${finalReportPath}`));
  } catch (error) {
@@ -6,6 +6,7 @@

 import { fs, path } from 'zx';
 import { PentestError } from './error-handling.js';
+import { asyncPipe } from './utils/functional.js';

 export type VulnType = 'injection' | 'xss' | 'auth' | 'ssrf' | 'authz';

@@ -16,9 +17,11 @@ interface VulnTypeConfigItem {

 type VulnTypeConfig = Record<VulnType, VulnTypeConfigItem>;

+type ErrorMessageResolver = string | ((existence: FileExistence) => string);
+
 interface ValidationRule {
  predicate: (existence: FileExistence) => boolean;
-  errorMessage: string;
+  errorMessage: ErrorMessageResolver;
  retryable: boolean;
 }

@@ -94,40 +97,36 @@ const VULN_TYPE_CONFIG: VulnTypeConfig = Object.freeze({
  }),
 }) as VulnTypeConfig;

-// Functional composition utilities - async pipe for promise chain
-type PipeFunction = (x: any) => any | Promise<any>;
-
-const pipe =
-  (...fns: PipeFunction[]) =>
-  (x: any): Promise<any> =>
-    fns.reduce(async (v, f) => f(await v), Promise.resolve(x));
-
 // Pure function to create validation rule
-const createValidationRule = (
+function createValidationRule(
  predicate: (existence: FileExistence) => boolean,
-  errorMessage: string,
+  errorMessage: ErrorMessageResolver,
  retryable: boolean = true
-): ValidationRule => Object.freeze({ predicate, errorMessage, retryable });
+): ValidationRule {
+  return Object.freeze({ predicate, errorMessage, retryable });
+}

-// Validation rules for file existence (following QUEUE_VALIDATION_FLOW.md)
+// Symmetric deliverable rules: queue and deliverable must exist together (prevents partial analysis from triggering exploitation)
 const fileExistenceRules: readonly ValidationRule[] = Object.freeze([
-  // Rule 1: Neither deliverable nor queue exists
  createValidationRule(
-    ({ deliverableExists, queueExists }) => deliverableExists || queueExists,
-    'Analysis failed: Neither deliverable nor queue file exists. Analysis agent must create both files.'
-  ),
-  // Rule 2: Queue doesn't exist but deliverable exists
-  createValidationRule(
-    ({ deliverableExists, queueExists }) => !(!queueExists && deliverableExists),
-    'Analysis incomplete: Deliverable exists but queue file missing. Analysis agent must create both files.'
-  ),
-  // Rule 3: Queue exists but deliverable doesn't exist
-  createValidationRule(
-    ({ deliverableExists, queueExists }) => !(queueExists && !deliverableExists),
-    'Analysis incomplete: Queue exists but deliverable file missing. Analysis agent must create both files.'
+    ({ deliverableExists, queueExists }) => deliverableExists && queueExists,
+    getExistenceErrorMessage
  ),
 ]);

+// Generate appropriate error message based on which files are missing
+function getExistenceErrorMessage(existence: FileExistence): string {
+  const { deliverableExists, queueExists } = existence;
+
+  if (!deliverableExists && !queueExists) {
+    return 'Analysis failed: Neither deliverable nor queue file exists. Analysis agent must create both files.';
+  }
+  if (!queueExists) {
+    return 'Analysis incomplete: Deliverable exists but queue file missing. Analysis agent must create both files.';
+  }
+  return 'Analysis incomplete: Queue exists but deliverable file missing. Analysis agent must create both files.';
+}
+
 // Pure function to create file paths
 const createPaths = (
  vulnType: VulnType,
@@ -170,7 +169,7 @@ const checkFileExistence = async (
  });
 };

-// Pure function to validate existence rules
+// Validates deliverable/queue symmetry - both files must exist
 const validateExistenceRules = (
  pathsWithExistence: PathsWithExistence | PathsWithError
 ): PathsWithExistence | PathsWithError => {
@@ -182,9 +181,14 @@ const validateExistenceRules = (
  const failedRule = fileExistenceRules.find((rule) => !rule.predicate(existence));

  if (failedRule) {
+    const message =
+      typeof failedRule.errorMessage === 'function'
+        ? failedRule.errorMessage(existence)
+        : failedRule.errorMessage;
+
    return {
      error: new PentestError(
-        `${failedRule.errorMessage} (${vulnType})`,
+        `${message} (${vulnType})`,
        'validation',
        failedRule.retryable,
        {
@@ -224,7 +228,7 @@ const validateQueueStructure = (content: string): QueueValidationResult => {
  }
 };

-// Pure function to read and validate queue content
+// Queue parse failures are retryable - agent can fix malformed JSON on retry
 const validateQueueContent = async (
  pathsWithExistence: PathsWithExistence | PathsWithError
 ): Promise<PathsWithQueue | PathsWithError> => {
@@ -273,7 +277,7 @@ const validateQueueContent = async (
  }
 };

-// Pure function to determine exploitation decision
+// Final decision: skip if queue says no vulns, proceed if vulns found, error otherwise
 const determineExploitationDecision = (
  validatedData: PathsWithQueue | PathsWithError
 ): ExploitationDecision => {
@@ -294,17 +298,18 @@ const determineExploitationDecision = (
 };

 // Main functional validation pipeline
-export const validateQueueAndDeliverable = async (
+export async function validateQueueAndDeliverable(
  vulnType: VulnType,
  sourceDir: string
-): Promise<ExploitationDecision> =>
-  (await pipe(
-    () => createPaths(vulnType, sourceDir),
+): Promise<ExploitationDecision> {
+  return asyncPipe<ExploitationDecision>(
+    createPaths(vulnType, sourceDir),
    checkFileExistence,
    validateExistenceRules,
    validateQueueContent,
    determineExploitationDecision
-  )(() => createPaths(vulnType, sourceDir))) as ExploitationDecision;
+  );
+}

 // Pure function to safely validate (returns result instead of throwing)
 export const safeValidateQueueAndDeliverable = async (
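The diff imports `asyncPipe` from `./utils/functional.js`, but its implementation is not shown in these hunks. Judging only from the call site (an initial value followed by steps, returning `Promise<T>`), a helper of that shape might look like the sketch below. `asyncPipeSketch`, `Step`, `double`, and `addOneLater` are hypothetical names, and the real utility may differ.

```typescript
// Hypothetical sketch; the PR's actual asyncPipe in utils/functional.ts may be written differently.
type Step = (value: any) => any | Promise<any>;

async function asyncPipeSketch<T>(initial: unknown, ...steps: Step[]): Promise<T> {
  let value: unknown = initial;
  for (const step of steps) {
    // Awaiting each step lets sync and async functions be mixed freely.
    value = await step(value);
  }
  return value as T;
}

const double = (n: number): number => n * 2;
const addOneLater = async (n: number): Promise<number> => n + 1;

// 2 -> double -> 4 -> addOneLater -> 5
asyncPipeSketch<number>(2, double, addOneLater).then((result) => console.log(result));
```

Passing the initial value directly, rather than wrapping it in a thunk as the removed `pipe(...)(() => createPaths(...))` call did, removes the double invocation of `createPaths` and the trailing `as ExploitationDecision` cast.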
@@ -106,10 +106,24 @@ export const getParallelGroups = (): Readonly<{ vuln: AgentName[]; exploit: Agen
  exploit: ['injection-exploit', 'xss-exploit', 'auth-exploit', 'ssrf-exploit', 'authz-exploit']
 });

-// Generate a session-based log folder path (used by claude-executor.ts)
-export const generateSessionLogPath = (webUrl: string, sessionId: string): string => {
-  const hostname = new URL(webUrl).hostname.replace(/[^a-zA-Z0-9-]/g, '-');
-  const sessionFolderName = `${hostname}_${sessionId}`;
-  return path.join(process.cwd(), 'agent-logs', sessionFolderName);
-};
+// Phase names for metrics aggregation
+export type PhaseName = 'pre-recon' | 'recon' | 'vulnerability-analysis' | 'exploitation' | 'reporting';
+
+// Map agents to their corresponding phases (single source of truth)
+export const AGENT_PHASE_MAP: Readonly<Record<AgentName, PhaseName>> = Object.freeze({
+  'pre-recon': 'pre-recon',
+  'recon': 'recon',
+  'injection-vuln': 'vulnerability-analysis',
+  'xss-vuln': 'vulnerability-analysis',
+  'auth-vuln': 'vulnerability-analysis',
+  'authz-vuln': 'vulnerability-analysis',
+  'ssrf-vuln': 'vulnerability-analysis',
+  'injection-exploit': 'exploitation',
+  'xss-exploit': 'exploitation',
+  'auth-exploit': 'exploitation',
+  'authz-exploit': 'exploitation',
+  'ssrf-exploit': 'exploitation',
+  'report': 'reporting',
+});
+
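`AGENT_PHASE_MAP` is described as the single source of truth for metrics aggregation. A rough sketch of how per-agent metrics could be rolled up by phase through such a map follows. The trimmed `PHASE_OF` map, the `AgentMetric` shape, and `totalsByPhase` are assumptions made for this example and do not appear in the PR.

```typescript
// Trimmed, hypothetical stand-ins for PhaseName, AgentName, and AGENT_PHASE_MAP.
type Phase = 'vulnerability-analysis' | 'exploitation';
type Agent = 'injection-vuln' | 'xss-vuln' | 'injection-exploit';

const PHASE_OF: Readonly<Record<Agent, Phase>> = Object.freeze({
  'injection-vuln': 'vulnerability-analysis',
  'xss-vuln': 'vulnerability-analysis',
  'injection-exploit': 'exploitation',
});

// Assumed metric shape for illustration only.
interface AgentMetric { agent: Agent; durationMs: number; costUsd: number }
interface PhaseTotals { durationMs: number; costUsd: number }

function totalsByPhase(metrics: readonly AgentMetric[]): Map<Phase, PhaseTotals> {
  const totals = new Map<Phase, PhaseTotals>();
  for (const m of metrics) {
    const phase = PHASE_OF[m.agent];
    const current = totals.get(phase) ?? { durationMs: 0, costUsd: 0 };
    totals.set(phase, {
      durationMs: current.durationMs + m.durationMs,
      costUsd: current.costUsd + m.costUsd,
    });
  }
  return totals;
}

const phaseTotals = totalsByPhase([
  { agent: 'injection-vuln', durationMs: 1000, costUsd: 1 },
  { agent: 'xss-vuln', durationMs: 2000, costUsd: 2 },
  { agent: 'injection-exploit', durationMs: 500, costUsd: 4 },
]);
console.log(phaseTotals);
```

Because the vulnerability and exploitation agents run in parallel, a summed duration like this measures total agent time, not the wall-clock time of the phase.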

@@ -1,897 +0,0 @@
-#!/usr/bin/env node
-// Copyright (C) 2025 Keygraph, Inc.
-//
-// This program is free software: you can redistribute it and/or modify
-// it under the terms of the GNU Affero General Public License version 3
-// as published by the Free Software Foundation.
-
-import { path, fs, $ } from 'zx';
-import chalk, { type ChalkInstance } from 'chalk';
-import dotenv from 'dotenv';
-
-dotenv.config();
-
-// Config and Tools
-import { parseConfig, distributeConfig } from './config-parser.js';
-import { checkToolAvailability, handleMissingTools } from './tool-checker.js';
-
-// Session
-import { AGENTS, getParallelGroups } from './session-manager.js';
-import type { AgentName, PromptName } from './types/index.js';
-
-// Setup and Deliverables
-import { setupLocalRepo } from './setup/environment.js';
-
-// AI and Prompts
-import { runClaudePromptWithRetry } from './ai/claude-executor.js';
-import { loadPrompt } from './prompts/prompt-manager.js';
-
-// Phases
-import { executePreReconPhase } from './phases/pre-recon.js';
-import { assembleFinalReport } from './phases/reporting.js';
-
-// Utils
-import { timingResults, displayTimingSummary, Timer } from './utils/metrics.js';
-import { formatDuration, generateAuditPath } from './audit/utils.js';
-import type { SessionMetadata } from './audit/utils.js';
-import { AuditSession } from './audit/audit-session.js';
-
-// CLI
-import { showHelp, displaySplashScreen } from './cli/ui.js';
-import { validateWebUrl, validateRepoPath } from './cli/input-validator.js';
-
-// Error Handling
-import { PentestError, logError } from './error-handling.js';
-
-import type { DistributedConfig } from './types/config.js';
-import type { ToolAvailability } from './tool-checker.js';
-import { safeValidateQueueAndDeliverable } from './queue-validation.js';
-
-// Extend global namespace for SHANNON_DISABLE_LOADER
-declare global {
-  var SHANNON_DISABLE_LOADER: boolean | undefined;
-}
-
-// Session Lock File Management
-const STORE_PATH = path.join(process.cwd(), '.shannon-store.json');
-
-interface Session {
-  id: string;
-  webUrl: string;
-  repoPath: string;
-  status: 'in-progress' | 'completed' | 'failed';
-  startedAt: string;
-}
-
-interface SessionStore {
-  sessions: Session[];
-}
-
-function generateSessionId(): string {
-  return crypto.randomUUID();
-}
-
-async function loadSessions(): Promise<SessionStore> {
-  try {
-    if (await fs.pathExists(STORE_PATH)) {
-      return await fs.readJson(STORE_PATH) as SessionStore;
-    }
-  } catch {
-    // Corrupted file, start fresh
-  }
-  return { sessions: [] };
-}
-
-async function saveSessions(store: SessionStore): Promise<void> {
-  await fs.writeJson(STORE_PATH, store, { spaces: 2 });
-}
-
-async function createSession(webUrl: string, repoPath: string): Promise<Session> {
-  const store = await loadSessions();
-
-  // Check for existing in-progress session
-  const existing = store.sessions.find(
-    s => s.repoPath === repoPath && s.status === 'in-progress'
-  );
-  if (existing) {
-    throw new PentestError(
-      `Session already in progress for ${repoPath}`,
-      'validation',
-      false,
-      { sessionId: existing.id }
-    );
-  }
-
-  const session: Session = {
-    id: generateSessionId(),
-    webUrl,
-    repoPath,
-    status: 'in-progress',
-    startedAt: new Date().toISOString()
-  };
-
-  store.sessions.push(session);
-  await saveSessions(store);
-  return session;
-}
-
-async function updateSessionStatus(
-  sessionId: string,
-  status: 'in-progress' | 'completed' | 'failed'
-): Promise<void> {
-  const store = await loadSessions();
-  const session = store.sessions.find(s => s.id === sessionId);
-  if (session) {
-    session.status = status;
-    await saveSessions(store);
-  }
-}
-
-interface PromptVariables {
-  webUrl: string;
-  repoPath: string;
-  sourceDir: string;
-}
-
-interface MainResult {
-  reportPath: string;
-  auditLogsPath: string;
-}
-
-interface AgentResult {
-  success: boolean;
-  duration: number;
-  cost?: number;
-  error?: string;
-  retryable?: boolean;
-}
-
-interface ParallelAgentResult {
-  agentName: AgentName;
-  success: boolean;
-  timing?: number | undefined;
-  cost?: number | undefined;
-  attempts: number;
-  error?: string | undefined;
-}
-
-// Configure zx to disable timeouts (let tools run as long as needed)
-$.timeout = 0;
-
-// Helper function to get prompt name from agent name
-const getPromptName = (agentName: AgentName): PromptName => {
-  const mappings: Record<AgentName, PromptName> = {
-    'pre-recon': 'pre-recon-code',
-    'recon': 'recon',
-    'injection-vuln': 'vuln-injection',
-    'xss-vuln': 'vuln-xss',
-    'auth-vuln': 'vuln-auth',
-    'ssrf-vuln': 'vuln-ssrf',
-    'authz-vuln': 'vuln-authz',
-    'injection-exploit': 'exploit-injection',
-    'xss-exploit': 'exploit-xss',
-    'auth-exploit': 'exploit-auth',
-    'ssrf-exploit': 'exploit-ssrf',
-    'authz-exploit': 'exploit-authz',
-    'report': 'report-executive'
-  };
-
-  return mappings[agentName] || agentName as PromptName;
-};
-
-// Get color function for agent
-const getAgentColor = (agentName: AgentName): ChalkInstance => {
-  const colorMap: Partial<Record<AgentName, ChalkInstance>> = {
-    'injection-vuln': chalk.red,
-    'injection-exploit': chalk.red,
-    'xss-vuln': chalk.yellow,
-    'xss-exploit': chalk.yellow,
-    'auth-vuln': chalk.blue,
-    'auth-exploit': chalk.blue,
-    'ssrf-vuln': chalk.magenta,
-    'ssrf-exploit': chalk.magenta,
-    'authz-vuln': chalk.green,
-    'authz-exploit': chalk.green
-  };
-  return colorMap[agentName] || chalk.cyan;
-};
-
-/**
- * Consolidate deliverables from target repo into the session folder
- */
-async function consolidateOutputs(sourceDir: string, sessionPath: string): Promise<void> {
-  const srcDeliverables = path.join(sourceDir, 'deliverables');
-  const destDeliverables = path.join(sessionPath, 'deliverables');
-
-  try {
-    if (await fs.pathExists(srcDeliverables)) {
-      await fs.copy(srcDeliverables, destDeliverables, { overwrite: true });
-      console.log(chalk.gray(`📄 Deliverables copied to session folder`));
-    } else {
-      console.log(chalk.yellow(`⚠️ No deliverables directory found at ${srcDeliverables}`));
-    }
-  } catch (error) {
-    const err = error as Error;
-    console.log(chalk.yellow(`⚠️ Failed to consolidate deliverables: ${err.message}`));
-  }
-}
-
-/**
- * Run a single agent
- */
-async function runAgent(
-  agentName: AgentName,
-  sourceDir: string,
-  variables: PromptVariables,
-  distributedConfig: DistributedConfig | null,
-  pipelineTestingMode: boolean,
-  sessionMetadata: SessionMetadata
-): Promise<AgentResult> {
-  const agent = AGENTS[agentName];
-  const promptName = getPromptName(agentName);
-  const prompt = await loadPrompt(promptName, variables, distributedConfig, pipelineTestingMode);
-
-  return await runClaudePromptWithRetry(
-    prompt,
-    sourceDir,
-    '*',
-    '',
-    agent.displayName,
-    agentName,
-    getAgentColor(agentName),
-    sessionMetadata
-  );
-}
-
-/**
- * Run vulnerability agents in parallel
- */
-async function runParallelVuln(
-  sourceDir: string,
-  variables: PromptVariables,
-  distributedConfig: DistributedConfig | null,
-  pipelineTestingMode: boolean,
-  sessionMetadata: SessionMetadata
-): Promise<ParallelAgentResult[]> {
-  const { vuln: vulnAgents } = getParallelGroups();
-
-  console.log(chalk.cyan(`\nStarting ${vulnAgents.length} vulnerability analysis specialists in parallel...`));
-  console.log(chalk.gray('    Specialists: ' + vulnAgents.join(', ')));
-  console.log();
-
-  const startTime = Date.now();
-
-  const results = await Promise.allSettled(
-    vulnAgents.map(async (agentName, index) => {
-      // Add 2-second stagger to prevent API overwhelm
-      await new Promise(resolve => setTimeout(resolve, index * 2000));
-
-      let lastError: Error | undefined;
-      let attempts = 0;
-      const maxAttempts = 3;
-
-      while (attempts < maxAttempts) {
-        attempts++;
-        try {
-          const result = await runAgent(
-            agentName,
-            sourceDir,
-            variables,
-            distributedConfig,
-            pipelineTestingMode,
-            sessionMetadata
-          );
-
-          // Validate vulnerability analysis results
-          const vulnType = agentName.replace('-vuln', '');
-          try {
-            const validation = await safeValidateQueueAndDeliverable(vulnType as 'injection' | 'xss' | 'auth' | 'ssrf' | 'authz', sourceDir);
-
-            if (validation.success && validation.data) {
-              console.log(chalk.blue(`${agentName}: ${validation.data.shouldExploit ? `Ready for exploitation (${validation.data.vulnerabilityCount} vulnerabilities)` : 'No vulnerabilities found'}`));
-            }
-          } catch {
-            // Validation failure is non-critical
-          }
-
-          return {
-            agentName,
-            success: result.success,
-            timing: result.duration,
-            cost: result.cost,
-            attempts
-          };
-        } catch (error) {
-          lastError = error as Error;
-          if (attempts < maxAttempts) {
-            console.log(chalk.yellow(`Warning: ${agentName} failed attempt ${attempts}/${maxAttempts}, retrying...`));
-            await new Promise(resolve => setTimeout(resolve, 5000));
-          }
-        }
-      }
-
-      return {
-        agentName,
-        success: false,
-        attempts,
-        error: lastError?.message || 'Unknown error'
-      };
-    })
-  );
-
-  const totalDuration = Date.now() - startTime;
-
-  // Process and display results
-  console.log(chalk.cyan('\nVulnerability Analysis Results'));
-  console.log(chalk.gray('-'.repeat(80)));
-  console.log(chalk.bold('Agent                  Status     Attempt  Duration    Cost'));
-  console.log(chalk.gray('-'.repeat(80)));
-
-  const processedResults: ParallelAgentResult[] = [];
-
-  results.forEach((result, index) => {
-    const agentName = vulnAgents[index]!;
-    const agentDisplay = agentName.padEnd(22);
-
-    if (result.status === 'fulfilled') {
-      const data = result.value;
-      processedResults.push(data);
-
-      if (data.success) {
-        const duration = formatDuration(data.timing || 0);
-        const cost = `$${(data.cost || 0).toFixed(4)}`;
-
-        console.log(
-          `${chalk.green(agentDisplay)} ${chalk.green('Success')}    ` +
-          `${data.attempts}/3      ${duration.padEnd(11)} ${cost}`
-        );
-      } else {
-        console.log(
-          `${chalk.red(agentDisplay)} ${chalk.red('Failed ')}    ` +
-          `${data.attempts}/3      -           -`
-        );
-        if (data.error) {
-          console.log(chalk.gray(`  Error: ${data.error.substring(0, 60)}...`));
-        }
-      }
-    } else {
-      processedResults.push({
-        agentName,
-        success: false,
-        attempts: 3,
-        error: String(result.reason)
-      });
-
-      console.log(
-        `${chalk.red(agentDisplay)} ${chalk.red('Failed ')}    ` +
-        `3/3      -           -`
-      );
-    }
-  });
-
-  console.log(chalk.gray('-'.repeat(80)));
-  const successCount = processedResults.filter(r => r.success).length;
-  console.log(chalk.cyan(`Summary: ${successCount}/${vulnAgents.length} succeeded in ${formatDuration(totalDuration)}`));
-
-  return processedResults;
-}
-
-/**
- * Run exploitation agents in parallel
- */
-async function runParallelExploit(
-  sourceDir: string,
-  variables: PromptVariables,
-  distributedConfig: DistributedConfig | null,
-  pipelineTestingMode: boolean,
-  sessionMetadata: SessionMetadata
-): Promise<ParallelAgentResult[]> {
-  const { exploit: exploitAgents, vuln: vulnAgents } = getParallelGroups();
-
-  // Load validation module
-  const { safeValidateQueueAndDeliverable } = await import('./queue-validation.js');
-
-  // Check eligibility
-  const eligibilityChecks = await Promise.all(
-    exploitAgents.map(async (agentName) => {
-      const vulnAgentName = agentName.replace('-exploit', '-vuln') as AgentName;
-      const vulnType = vulnAgentName.replace('-vuln', '') as 'injection' | 'xss' | 'auth' | 'ssrf' | 'authz';
-
-      const validation = await safeValidateQueueAndDeliverable(vulnType, sourceDir);
-
-      if (!validation.success || !validation.data?.shouldExploit) {
-        console.log(chalk.gray(`Skipping ${agentName} (no vulnerabilities found in ${vulnAgentName})`));
-        return { agentName, eligible: false };
-      }
-
-      console.log(chalk.blue(`${agentName} eligible (${validation.data.vulnerabilityCount} vulnerabilities from ${vulnAgentName})`));
-      return { agentName, eligible: true };
-    })
-  );
-
-  const eligibleAgents = eligibilityChecks
-    .filter(check => check.eligible)
-    .map(check => check.agentName);
-
-  if (eligibleAgents.length === 0) {
-    console.log(chalk.gray('No exploitation agents eligible (no vulnerabilities found)'));
-    return [];
-  }
-
-  console.log(chalk.cyan(`\nStarting ${eligibleAgents.length} exploitation specialists in parallel...`));
-  console.log(chalk.gray('    Specialists: ' + eligibleAgents.join(', ')));
-  console.log();
-
-  const startTime = Date.now();
-
-  const results = await Promise.allSettled(
-    eligibleAgents.map(async (agentName, index) => {
-      await new Promise(resolve => setTimeout(resolve, index * 2000));
-
-      let lastError: Error | undefined;
-      let attempts = 0;
-      const maxAttempts = 3;
-
-      while (attempts < maxAttempts) {
-        attempts++;
-        try {
-          const result = await runAgent(
-            agentName,
-            sourceDir,
-            variables,
-            distributedConfig,
-            pipelineTestingMode,
-            sessionMetadata
-          );
-
-          return {
-            agentName,
-            success: result.success,
-            timing: result.duration,
-            cost: result.cost,
-            attempts
-          };
-        } catch (error) {
-          lastError = error as Error;
-          if (attempts < maxAttempts) {
-            console.log(chalk.yellow(`Warning: ${agentName} failed attempt ${attempts}/${maxAttempts}, retrying...`));
-            await new Promise(resolve => setTimeout(resolve, 5000));
-          }
-        }
-      }
-
-      return {
-        agentName,
-        success: false,
-        attempts,
-        error: lastError?.message || 'Unknown error'
-      };
-    })
-  );
-
-  const totalDuration = Date.now() - startTime;
-
-  // Process and display results
-  console.log(chalk.cyan('\nExploitation Results'));
-  console.log(chalk.gray('-'.repeat(80)));
-  console.log(chalk.bold('Agent                  Status     Attempt  Duration    Cost'));
-  console.log(chalk.gray('-'.repeat(80)));
-
-  const processedResults: ParallelAgentResult[] = [];
-
-  results.forEach((result, index) => {
-    const agentName = eligibleAgents[index]!;
-    const agentDisplay = agentName.padEnd(22);
-
-    if (result.status === 'fulfilled') {
-      const data = result.value;
-      processedResults.push(data);
-
-      if (data.success) {
-        const duration = formatDuration(data.timing || 0);
-        const cost = `$${(data.cost || 0).toFixed(4)}`;
-
-        console.log(
-          `${chalk.green(agentDisplay)} ${chalk.green('Success')}    ` +
-          `${data.attempts}/3      ${duration.padEnd(11)} ${cost}`
-        );
-      } else {
-        console.log(
-          `${chalk.red(agentDisplay)} ${chalk.red('Failed ')}    ` +
-          `${data.attempts}/3      -           -`
-        );
-        if (data.error) {
-          console.log(chalk.gray(`  Error: ${data.error.substring(0, 60)}...`));
-        }
-      }
-    } else {
-      processedResults.push({
-        agentName,
-        success: false,
-        attempts: 3,
-        error: String(result.reason)
-      });
-
-      console.log(
-        `${chalk.red(agentDisplay)} ${chalk.red('Failed ')}    ` +
-        `3/3      -           -`
-      );
-    }
-  });
-
-  console.log(chalk.gray('-'.repeat(80)));
-  const successCount = processedResults.filter(r => r.success).length;
-  console.log(chalk.cyan(`Summary: ${successCount}/${eligibleAgents.length} succeeded in ${formatDuration(totalDuration)}`));
-
-  return processedResults;
-}
-
-// Setup graceful cleanup on process signals
-process.on('SIGINT', async () => {
-  console.log(chalk.yellow('\n⚠️ Received SIGINT, cleaning up...'));
-
-  process.exit(0);
-});
-
-process.on('SIGTERM', async () => {
-  console.log(chalk.yellow('\n⚠️ Received SIGTERM, cleaning up...'));
-
-  process.exit(0);
-});
-
-// Main orchestration function
-async function main(
-  webUrl: string,
-  repoPath: string,
-  configPath: string | null = null,
-  pipelineTestingMode: boolean = false,
-  disableLoader: boolean = false,
-  outputPath: string | null = null
-): Promise<MainResult> {
-  // Set global flag for loader control
-  global.SHANNON_DISABLE_LOADER = disableLoader;
-
-  const totalTimer = new Timer('total-execution');
-  timingResults.total = totalTimer;
-
-  // Display splash screen
-  await displaySplashScreen();
-
-  console.log(chalk.cyan.bold('🚀 AI PENETRATION TESTING AGENT'));
-  console.log(chalk.cyan(`🎯 Target: ${webUrl}`));
-  console.log(chalk.cyan(`📁 Source: ${repoPath}`));
-  if (configPath) {
-    console.log(chalk.cyan(`⚙️ Config: ${configPath}`));
-  }
-  if (outputPath) {
-    console.log(chalk.cyan(`📂 Output: ${outputPath}`));
-  }
-  console.log(chalk.gray('─'.repeat(60)));
-
-  // Parse configuration if provided
-  let distributedConfig: DistributedConfig | null = null;
-  if (configPath) {
-    try {
-      // Resolve config path - check configs folder if relative path
-      let resolvedConfigPath = configPath;
-      if (!path.isAbsolute(configPath)) {
-        const configsDir = path.join(process.cwd(), 'configs');
-        const configInConfigsDir = path.join(configsDir, configPath);
-        // Check if file exists in configs directory, otherwise use original path
-        if (await fs.pathExists(configInConfigsDir)) {
-          resolvedConfigPath = configInConfigsDir;
-        }
-      }
-
-      const config = await parseConfig(resolvedConfigPath);
-      distributedConfig = distributeConfig(config);
-      console.log(chalk.green(`✅ Configuration loaded successfully`));
-    } catch (error) {
-      await logError(error as Error, `Configuration loading from ${configPath}`);
-      throw error; // Let the main error boundary handle it
-    }
-  }
-
-  // Check tool availability
-  const toolAvailability: ToolAvailability = await checkToolAvailability();
-  handleMissingTools(toolAvailability);
-
-  // Setup local repository
-  console.log(chalk.blue('📁 Setting up local repository...'));
-  let sourceDir: string;
-  try {
-    sourceDir = await setupLocalRepo(repoPath);
-    console.log(chalk.green('✅ Local repository setup successfully'));
-  } catch (error) {
-    const err = error as Error;
-    console.log(chalk.red(`❌ Failed to setup local repository: ${err.message}`));
-    console.log(chalk.gray('This could be due to:'));
-    console.log(chalk.gray('  - Insufficient permissions'));
-    console.log(chalk.gray('  - Repository path not accessible'));
-    console.log(chalk.gray('  - Git initialization issues'));
-    console.log(chalk.gray('  - Insufficient disk space'));
-    process.exit(1);
-  }
-
-  const variables: PromptVariables = { webUrl, repoPath, sourceDir };
-
-  // Create session (acts as lock file)
-  const session: Session = await createSession(webUrl, repoPath);
-  console.log(chalk.blue(`Session created: ${session.id.substring(0, 8)}...`));
-
-  // Session metadata for audit logging
-  const sessionMetadata: SessionMetadata = {
-    id: session.id,
-    webUrl,
-    repoPath: sourceDir,
-    ...(outputPath && { outputPath })
-  };
-
-  // Create outputs directory in source directory
-  try {
-    const outputsDir = path.join(sourceDir, 'outputs');
-    await fs.ensureDir(outputsDir);
-    await fs.ensureDir(path.join(outputsDir, 'schemas'));
-    await fs.ensureDir(path.join(outputsDir, 'scans'));
-  } catch (error) {
-    const err = error as Error;
-    throw new PentestError(
-      `Failed to create output directories: ${err.message}`,
-      'filesystem',
-      false,
-      { sourceDir, originalError: err.message }
-    );
-  }
-
-  try {
-  // PHASE 1: PRE-RECONNAISSANCE
-    const { duration: preReconDuration } = await executePreReconPhase(
-      webUrl,
-      sourceDir,
-      variables,
-      distributedConfig,
-      toolAvailability,
-      pipelineTestingMode,
-      session.id,
-      outputPath
-    );
-    console.log(chalk.green(`Pre-reconnaissance complete in ${formatDuration(preReconDuration)}`));
-
-  // PHASE 2: RECONNAISSANCE
-    console.log(chalk.magenta.bold('\n🔎 PHASE 2: RECONNAISSANCE'));
-    console.log(chalk.magenta('Analyzing initial findings...'));
-    const reconTimer = new Timer('phase-2-recon');
-
-    await runAgent(
-      'recon',
-      sourceDir,
-      variables,
-      distributedConfig,
-      pipelineTestingMode,
-      sessionMetadata
-    );
-    const reconDuration = reconTimer.stop();
-    console.log(chalk.green(`✅ Reconnaissance complete in ${formatDuration(reconDuration)}`));
-
-  // PHASE 3: VULNERABILITY ANALYSIS
-    const vulnTimer = new Timer('phase-3-vulnerability-analysis');
-    console.log(chalk.red.bold('\n🚨 PHASE 3: VULNERABILITY ANALYSIS'));
-
-    const vulnResults = await runParallelVuln(
-      sourceDir,
-      variables,
-      distributedConfig,
-      pipelineTestingMode,
-      sessionMetadata
-    );
-
-    const vulnDuration = vulnTimer.stop();
-    console.log(chalk.green(`✅ Vulnerability analysis phase complete in ${formatDuration(vulnDuration)}`));
-
-  // PHASE 4: EXPLOITATION
-    const exploitTimer = new Timer('phase-4-exploitation');
-    console.log(chalk.red.bold('\n💥 PHASE 4: EXPLOITATION'));
-
-    const exploitResults = await runParallelExploit(
-      sourceDir,
-      variables,
-      distributedConfig,
-      pipelineTestingMode,
-      sessionMetadata
-    );
-
-    const exploitDuration = exploitTimer.stop();
-    console.log(chalk.green(`✅ Exploitation phase complete in ${formatDuration(exploitDuration)}`));
-
-  // PHASE 5: REPORTING
-    console.log(chalk.greenBright.bold('\n📊 PHASE 5: REPORTING'));
-    console.log(chalk.greenBright('Generating executive summary and assembling final report...'));
-    const reportTimer = new Timer('phase-5-reporting');
-
-    // Assemble all deliverables into a single concatenated report
-    console.log(chalk.blue('📝 Assembling deliverables from specialist agents...'));
-    try {
-      await assembleFinalReport(sourceDir);
-    } catch (error) {
-      const err = error as Error;
-      console.log(chalk.red(`❌ Error assembling final report: ${err.message}`));
-    }
-
-    // Run reporter agent to create executive summary
-    console.log(chalk.blue('Generating executive summary and cleaning up report...'));
-    await runAgent(
-      'report',
-      sourceDir,
-      variables,
-      distributedConfig,
-      pipelineTestingMode,
-      sessionMetadata
-    );
-
-    const reportDuration = reportTimer.stop();
-    console.log(chalk.green(`✅ Final report generated in ${formatDuration(reportDuration)}`));
-
-    // Calculate final timing
-    timingResults.total.stop();
-
-    // Mark session as completed in both stores
-    await updateSessionStatus(session.id, 'completed');
-
-    // Update audit system's session.json status
-    const auditSession = new AuditSession(sessionMetadata);
-    await auditSession.updateSessionStatus('completed');
-
-    // Display comprehensive timing summary
-    displayTimingSummary();
-
-  console.log(chalk.cyan.bold('\n🎉 PENETRATION TESTING COMPLETE!'));
-  console.log(chalk.gray('─'.repeat(60)));
-
-  // Calculate audit logs path
-    const auditLogsPath = generateAuditPath(sessionMetadata);
-
-  // Consolidate deliverables into the session folder
-  await consolidateOutputs(sourceDir, auditLogsPath);
-  console.log(chalk.green(`\n📂 All outputs consolidated: ${auditLogsPath}`));
-
-    return {
-      reportPath: path.join(sourceDir, 'deliverables', 'comprehensive_security_assessment_report.md'),
-      auditLogsPath
-    };
-
-  } catch (error) {
-    // Mark session as failed in both stores
-    await updateSessionStatus(session.id, 'failed');
-
-    // Update audit system's session.json status
-    const auditSession = new AuditSession(sessionMetadata);
-    await auditSession.updateSessionStatus('failed');
-
-    throw error;
-  }
-}
-
-// Entry point - handle both direct node execution and shebang execution
-let args = process.argv.slice(2);
-// If first arg is the script name (from shebang), remove it
-if (args[0] && args[0].includes('shannon')) {
-  args = args.slice(1);
-}
-
-// Parse flags and arguments
-let configPath: string | null = null;
-let outputPath: string | null = null;
-let pipelineTestingMode = false;
-let disableLoader = false;
-const nonFlagArgs: string[] = [];
-
-for (let i = 0; i < args.length; i++) {
-  if (args[i] === '--config') {
-    if (i + 1 < args.length) {
-      configPath = args[i + 1]!;
-      i++; // Skip the next argument
-    } else {
-      console.log(chalk.red('❌ --config flag requires a file path'));
-      process.exit(1);
-    }
-  } else if (args[i] === '--output') {
-    if (i + 1 < args.length) {
-      outputPath = path.resolve(args[i + 1]!);
-      i++; // Skip the next argument
-    } else {
-      console.log(chalk.red('❌ --output flag requires a directory path'));
-      process.exit(1);
-    }
-  } else if (args[i] === '--pipeline-testing') {
-    pipelineTestingMode = true;
-  } else if (args[i] === '--disable-loader') {
-    disableLoader = true;
-  } else if (!args[i]!.startsWith('-')) {
-    nonFlagArgs.push(args[i]!);
-  }
-}
-
-// Handle help flag
-if (args.includes('--help') || args.includes('-h') || args.includes('help')) {
-  showHelp();
-  process.exit(0);
-}
-
-// Handle no arguments - show help
-if (nonFlagArgs.length === 0) {
-  console.log(chalk.red.bold('❌ Error: No arguments provided\n'));
-  showHelp();
-  process.exit(1);
-}
-
-// Handle insufficient arguments
-if (nonFlagArgs.length < 2) {
-  console.log(chalk.red('❌ Both WEB_URL and REPO_PATH are required'));
-  console.log(chalk.gray('Usage: shannon <WEB_URL> <REPO_PATH> [--config config.yaml]'));
-  console.log(chalk.gray('Help:  shannon --help'));
-  process.exit(1);
-}
-
-const [webUrl, repoPath] = nonFlagArgs;
-
-// Validate web URL
-const webUrlValidation = validateWebUrl(webUrl!);
-if (!webUrlValidation.valid) {
-  console.log(chalk.red(`❌ Invalid web URL: ${webUrlValidation.error}`));
-  console.log(chalk.gray(`Expected format: https://example.com`));
-  process.exit(1);
-}
-
-// Validate repository path
-const repoPathValidation = await validateRepoPath(repoPath!);
-if (!repoPathValidation.valid) {
-  console.log(chalk.red(`❌ Invalid repository path: ${repoPathValidation.error}`));
-  console.log(chalk.gray(`Expected: Accessible local directory path`));
-  process.exit(1);
-}
-
-// Success - show validated inputs
-console.log(chalk.green('✅ Input validation passed:'));
-console.log(chalk.gray(`   Target Web URL: ${webUrl}`));
-console.log(chalk.gray(`   Target Repository: ${repoPathValidation.path}\n`));
-console.log(chalk.gray(`   Config Path: ${configPath}\n`));
-if (outputPath) {
-  console.log(chalk.gray(`   Output Path: ${outputPath}\n`));
-}
-if (pipelineTestingMode) {
-  console.log(chalk.yellow('⚡ PIPELINE TESTING MODE ENABLED - Using minimal test prompts for fast pipeline validation\n'));
-}
-if (disableLoader) {
-  console.log(chalk.yellow('⚙️  LOADER DISABLED - Progress indicator will not be shown\n'));
-}
-
-try {
-  const result = await main(webUrl!, repoPathValidation.path!, configPath, pipelineTestingMode, disableLoader, outputPath);
-  console.log(chalk.green.bold('\n📄 FINAL REPORT AVAILABLE:'));
-  console.log(chalk.cyan(result.reportPath));
-  console.log(chalk.green.bold('\n📂 AUDIT LOGS AVAILABLE:'));
-  console.log(chalk.cyan(result.auditLogsPath));
-
-} catch (error) {
-  // Enhanced error boundary with proper logging
-  if (error instanceof PentestError) {
-    await logError(error, 'Main execution failed');
-    console.log(chalk.red.bold('\n🚨 PENTEST EXECUTION FAILED'));
-    console.log(chalk.red(`   Type: ${error.type}`));
-    console.log(chalk.red(`   Retryable: ${error.retryable ? 'Yes' : 'No'}`));
-
-    if (error.retryable) {
-      console.log(chalk.yellow('   Consider running the command again or checking network connectivity.'));
-    }
-  } else {
-    const err = error as Error;
-    console.log(chalk.red.bold('\n🚨 UNEXPECTED ERROR OCCURRED'));
-    console.log(chalk.red(`   Error: ${err?.message || err?.toString() || 'Unknown error'}`));
-
-    if (process.env.DEBUG) {
-      console.log(chalk.gray(`   Stack: ${err?.stack || 'No stack trace available'}`));
-    }
-  }
-
-  process.exit(1);
-}
@@ -0,0 +1,469 @@
+// Copyright (C) 2025 Keygraph, Inc.
+//
+// This program is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License version 3
+// as published by the Free Software Foundation.
+
+/**
+ * Temporal activities for Shannon agent execution.
+ *
+ * Each activity wraps a single agent execution with:
+ * - Heartbeat loop (2s interval) to signal worker liveness
+ * - Git checkpoint/rollback/commit per attempt
+ * - Error classification for Temporal retry behavior
+ * - Audit session logging
+ *
+ * Temporal handles retries based on error classification:
+ * - Retryable: BillingError, TransientError (429, 5xx, network)
+ * - Non-retryable: AuthenticationError, PermissionError, ConfigurationError, etc.
+ */
+
+import { heartbeat, ApplicationFailure, Context } from '@temporalio/activity';
+import chalk from 'chalk';
+
+import {
+  runClaudePrompt,
+  validateAgentOutput,
+  type ClaudePromptResult,
+} from '../ai/claude-executor.js';
+import { loadPrompt } from '../prompts/prompt-manager.js';
+import { parseConfig, distributeConfig } from '../config-parser.js';
+import { classifyErrorForTemporal } from '../error-handling.js';
+import {
+  safeValidateQueueAndDeliverable,
+  type VulnType,
+  type ExploitationDecision,
+} from '../queue-validation.js';
+import {
+  createGitCheckpoint,
+  commitGitSuccess,
+  rollbackGitWorkspace,
+  getGitCommitHash,
+} from '../utils/git-manager.js';
+import { assembleFinalReport } from '../phases/reporting.js';
+import { getPromptNameForAgent } from '../types/agents.js';
+import { AuditSession } from '../audit/index.js';
+import type { WorkflowSummary } from '../audit/workflow-logger.js';
+import type { AgentName } from '../types/agents.js';
+import type { AgentMetrics } from './shared.js';
+import type { DistributedConfig } from '../types/config.js';
+import type { SessionMetadata } from '../audit/utils.js';
+
+// Max lengths to prevent Temporal protobuf buffer overflow
+const MAX_ERROR_MESSAGE_LENGTH = 2000;
+const MAX_STACK_TRACE_LENGTH = 1000;
+
+// Max attempts (compared against Temporal's 1-based attempt number) for output validation errors
+// (agent didn't save deliverables). Lower than default 50 since this is unlikely to self-heal
+const MAX_OUTPUT_VALIDATION_RETRIES = 3;
+
+/**
+ * Truncate error message to prevent buffer overflow in Temporal serialization.
+ */
+function truncateErrorMessage(message: string): string {
+  if (message.length <= MAX_ERROR_MESSAGE_LENGTH) {
+    return message;
+  }
+  return message.slice(0, MAX_ERROR_MESSAGE_LENGTH - 20) + '\n[truncated]';
+}
+
+/**
+ * Truncate stack trace on an ApplicationFailure to prevent buffer overflow.
+ */
+function truncateStackTrace(failure: ApplicationFailure): void {
+  if (failure.stack && failure.stack.length > MAX_STACK_TRACE_LENGTH) {
+    failure.stack = failure.stack.slice(0, MAX_STACK_TRACE_LENGTH) + '\n[stack truncated]';
+  }
+}
+
+const HEARTBEAT_INTERVAL_MS = 2000; // Must be < heartbeatTimeout (10min production, 5min testing)
+
+/**
+ * Input for all agent activities.
+ * Matches PipelineInput but with required workflowId for audit correlation.
+ */
+export interface ActivityInput {
+  webUrl: string;
+  repoPath: string;
+  configPath?: string;
+  outputPath?: string;
+  pipelineTestingMode?: boolean;
+  workflowId: string;
+}
+
+/**
+ * Core activity implementation.
+ *
+ * Executes a single agent with:
+ * 1. Heartbeat loop for worker liveness
+ * 2. Config loading (if configPath provided)
+ * 3. Audit session initialization
+ * 4. Prompt loading
+ * 5. Git checkpoint before execution
+ * 6. Agent execution (single attempt)
+ * 7. Output validation
+ * 8. Git commit on success, rollback on failure
+ * 9. Error classification for Temporal retry
+ */
+async function runAgentActivity(
+  agentName: AgentName,
+  input: ActivityInput
+): Promise<AgentMetrics> {
+  const {
+    webUrl,
+    repoPath,
+    configPath,
+    outputPath,
+    pipelineTestingMode = false,
+    workflowId,
+  } = input;
+
+  const startTime = Date.now();
+
+  // Get attempt number from Temporal context (tracks retries automatically)
+  const attemptNumber = Context.current().info.attempt;
+
+  // Heartbeat loop - signals worker is alive to Temporal server
+  const heartbeatInterval = setInterval(() => {
+    const elapsed = Math.floor((Date.now() - startTime) / 1000);
+    heartbeat({ agent: agentName, elapsedSeconds: elapsed, attempt: attemptNumber });
+  }, HEARTBEAT_INTERVAL_MS);
+
+  try {
+    // 1. Load config (if provided)
+    let distributedConfig: DistributedConfig | null = null;
+    if (configPath) {
+      try {
+        const config = await parseConfig(configPath);
+        distributedConfig = distributeConfig(config);
+      } catch (err) {
+        throw new Error(`Failed to load config ${configPath}: ${err instanceof Error ? err.message : String(err)}`);
+      }
+    }
+
+    // 2. Build session metadata for audit
+    const sessionMetadata: SessionMetadata = {
+      id: workflowId,
+      webUrl,
+      repoPath,
+      ...(outputPath && { outputPath }),
+    };
+
+    // 3. Initialize audit session (idempotent, safe across retries)
+    const auditSession = new AuditSession(sessionMetadata);
+    await auditSession.initialize();
+
+    // 4. Load prompt
+    const promptName = getPromptNameForAgent(agentName);
+    const prompt = await loadPrompt(
+      promptName,
+      { webUrl, repoPath },
+      distributedConfig,
+      pipelineTestingMode
+    );
+
+    // 5. Create git checkpoint before execution
+    await createGitCheckpoint(repoPath, agentName, attemptNumber);
+    await auditSession.startAgent(agentName, prompt, attemptNumber);
+
+    // 6. Execute agent (single attempt - Temporal handles retries)
+    const result: ClaudePromptResult = await runClaudePrompt(
+      prompt,
+      repoPath,
+      '', // context
+      agentName, // description
+      agentName,
+      chalk.cyan,
+      sessionMetadata,
+      auditSession,
+      attemptNumber
+    );
+
+    // 6.5. Sanity check: Detect spending cap that slipped through all detection layers
+    // Defense-in-depth: A successful agent execution should never have ≤2 turns with $0 cost
+    if (result.success && (result.turns ?? 0) <= 2 && (result.cost || 0) === 0) {
+      const resultText = result.result || '';
+      const looksLikeBillingError = /spending|cap|limit|budget|resets/i.test(resultText);
+
+      if (looksLikeBillingError) {
+        await rollbackGitWorkspace(repoPath, 'spending cap detected');
+        await auditSession.endAgent(agentName, {
+          attemptNumber,
+          duration_ms: result.duration,
+          cost_usd: 0,
+          success: false,
+          error: `Spending cap likely reached: ${resultText.slice(0, 100)}`,
+        });
+        // Throw a plain Error; the catch block below classifies it (expected: billing) so Temporal retries with long backoff
+        throw new Error(`Spending cap likely reached: ${resultText.slice(0, 100)}`);
+      }
+    }
+
+    // 7. Handle execution failure
+    if (!result.success) {
+      await rollbackGitWorkspace(repoPath, 'execution failure');
+      await auditSession.endAgent(agentName, {
+        attemptNumber,
+        duration_ms: result.duration,
+        cost_usd: result.cost || 0,
+        success: false,
+        error: result.error || 'Execution failed',
+      });
+      throw new Error(result.error || 'Agent execution failed');
+    }
+
+    // 8. Validate output
+    const validationPassed = await validateAgentOutput(result, agentName, repoPath);
+    if (!validationPassed) {
+      await rollbackGitWorkspace(repoPath, 'validation failure');
+      await auditSession.endAgent(agentName, {
+        attemptNumber,
+        duration_ms: result.duration,
+        cost_usd: result.cost || 0,
+        success: false,
+        error: 'Output validation failed',
+      });
+
+      // Limit output validation retries (unlikely to self-heal)
+      if (attemptNumber >= MAX_OUTPUT_VALIDATION_RETRIES) {
+        throw ApplicationFailure.nonRetryable(
+          `Agent ${agentName} failed output validation after ${attemptNumber} attempts`,
+          'OutputValidationError',
+          [{ agentName, attemptNumber, elapsed: Date.now() - startTime }]
+        );
+      }
+      // Let Temporal retry (will be classified as OutputValidationError)
+      throw new Error(`Agent ${agentName} failed output validation`);
+    }
+
+    // 9. Success - commit and log
+    const commitHash = await getGitCommitHash(repoPath);
+    await auditSession.endAgent(agentName, {
+      attemptNumber,
+      duration_ms: result.duration,
+      cost_usd: result.cost || 0,
+      success: true,
+      ...(commitHash && { checkpoint: commitHash }),
+    });
+    await commitGitSuccess(repoPath, agentName);
+
+    // 10. Return metrics
+    return {
+      durationMs: Date.now() - startTime,
+      inputTokens: null, // Not currently exposed by SDK wrapper
+      outputTokens: null,
+      costUsd: result.cost ?? null,
+      numTurns: result.turns ?? null,
+    };
+  } catch (error) {
+    // Rollback git workspace before Temporal retry to ensure clean state
+    try {
+      await rollbackGitWorkspace(repoPath, 'error recovery');
+    } catch (rollbackErr) {
+      // Log but don't fail - rollback is best-effort
+      console.error(`Failed to rollback git workspace for ${agentName}:`, rollbackErr);
+    }
+
+    // If error is already an ApplicationFailure (e.g., from our retry limit logic),
+    // re-throw it directly without re-classifying
+    if (error instanceof ApplicationFailure) {
+      throw error;
+    }
+
+    // Classify error for Temporal retry behavior
+    const classified = classifyErrorForTemporal(error);
+    // Truncate message to prevent protobuf buffer overflow
+    const rawMessage = error instanceof Error ? error.message : String(error);
+    const message = truncateErrorMessage(rawMessage);
+
+    if (classified.retryable) {
+      // Temporal will retry with configured backoff
+      const failure = ApplicationFailure.create({
+        message,
+        type: classified.type,
+        details: [{ agentName, attemptNumber, elapsed: Date.now() - startTime }],
+      });
+      truncateStackTrace(failure);
+      throw failure;
+    } else {
+      // Fail immediately - no retry
+      const failure = ApplicationFailure.nonRetryable(message, classified.type, [
+        { agentName, attemptNumber, elapsed: Date.now() - startTime },
+      ]);
+      truncateStackTrace(failure);
+      throw failure;
+    }
+  } finally {
+    clearInterval(heartbeatInterval);
+  }
+}
+
+// === Individual Agent Activity Exports ===
+// Each function is a thin wrapper around runAgentActivity with the agent name.
+
+export async function runPreReconAgent(input: ActivityInput): Promise<AgentMetrics> {
+  return runAgentActivity('pre-recon', input);
+}
+
+export async function runReconAgent(input: ActivityInput): Promise<AgentMetrics> {
+  return runAgentActivity('recon', input);
+}
+
+export async function runInjectionVulnAgent(input: ActivityInput): Promise<AgentMetrics> {
+  return runAgentActivity('injection-vuln', input);
+}
+
+export async function runXssVulnAgent(input: ActivityInput): Promise<AgentMetrics> {
+  return runAgentActivity('xss-vuln', input);
+}
+
+export async function runAuthVulnAgent(input: ActivityInput): Promise<AgentMetrics> {
+  return runAgentActivity('auth-vuln', input);
+}
+
+export async function runSsrfVulnAgent(input: ActivityInput): Promise<AgentMetrics> {
+  return runAgentActivity('ssrf-vuln', input);
+}
+
+export async function runAuthzVulnAgent(input: ActivityInput): Promise<AgentMetrics> {
+  return runAgentActivity('authz-vuln', input);
+}
+
+export async function runInjectionExploitAgent(input: ActivityInput): Promise<AgentMetrics> {
+  return runAgentActivity('injection-exploit', input);
+}
+
+export async function runXssExploitAgent(input: ActivityInput): Promise<AgentMetrics> {
+  return runAgentActivity('xss-exploit', input);
+}
+
+export async function runAuthExploitAgent(input: ActivityInput): Promise<AgentMetrics> {
+  return runAgentActivity('auth-exploit', input);
+}
+
+export async function runSsrfExploitAgent(input: ActivityInput): Promise<AgentMetrics> {
+  return runAgentActivity('ssrf-exploit', input);
+}
+
+export async function runAuthzExploitAgent(input: ActivityInput): Promise<AgentMetrics> {
+  return runAgentActivity('authz-exploit', input);
+}
+
+export async function runReportAgent(input: ActivityInput): Promise<AgentMetrics> {
+  return runAgentActivity('report', input);
+}
+
+/**
+ * Assemble the final report by concatenating exploitation evidence files.
+ * This must be called BEFORE runReportAgent to create the file that the report agent will modify.
+ */
+export async function assembleReportActivity(input: ActivityInput): Promise<void> {
+  const { repoPath } = input;
+  console.log(chalk.blue('📝 Assembling deliverables from specialist agents...'));
+  try {
+    await assembleFinalReport(repoPath);
+  } catch (error) {
+    const message = error instanceof Error ? error.message : String(error);
+    console.log(chalk.yellow(`⚠️ Error assembling final report: ${message}`));
+    // Don't throw - the report agent can still create content even if no exploitation files exist
+  }
+}
+
+/**
+ * Check if exploitation should run for a given vulnerability type.
+ * Reads the vulnerability queue file and returns the decision.
+ *
+ * This activity allows the workflow to skip exploit agents entirely
+ * when no vulnerabilities were found, saving API calls and time.
+ *
+ * Error handling:
+ * - Retryable errors (missing files, invalid JSON): re-throw for Temporal retry
+ * - Non-retryable errors: skip exploitation gracefully
+ */
+export async function checkExploitationQueue(
+  input: ActivityInput,
+  vulnType: VulnType
+): Promise<ExploitationDecision> {
+  const { repoPath } = input;
+
+  const result = await safeValidateQueueAndDeliverable(vulnType, repoPath);
+
+  if (result.success && result.data) {
+    const { shouldExploit, vulnerabilityCount } = result.data;
+    console.log(
+      chalk.blue(
+        `🔍 ${vulnType}: ${shouldExploit ? `${vulnerabilityCount} vulnerabilities found` : 'no vulnerabilities, skipping exploitation'}`
+      )
+    );
+    return result.data;
+  }
+
+  // Validation failed - check if we should retry or skip
+  const error = result.error;
+  if (error?.retryable) {
+    // Re-throw retryable errors so Temporal can retry the vuln agent
+    console.log(chalk.yellow(`⚠️ ${vulnType}: ${error.message} (retrying)`));
+    throw error;
+  }
+
+  // Non-retryable error - skip exploitation gracefully
+  console.log(
+    chalk.yellow(`⚠️ ${vulnType}: ${error?.message ?? 'Unknown error'}, skipping exploitation`)
+  );
+  return {
+    shouldExploit: false,
+    shouldRetry: false,
+    vulnerabilityCount: 0,
+    vulnType,
+  };
+}
+
+/**
+ * Log phase transition to the unified workflow log.
+ * Called at phase boundaries for per-workflow logging.
+ */
+export async function logPhaseTransition(
+  input: ActivityInput,
+  phase: string,
+  event: 'start' | 'complete'
+): Promise<void> {
+  const { webUrl, repoPath, outputPath, workflowId } = input;
+
+  const sessionMetadata: SessionMetadata = {
+    id: workflowId,
+    webUrl,
+    repoPath,
+    ...(outputPath && { outputPath }),
+  };
+
+  const auditSession = new AuditSession(sessionMetadata);
+  await auditSession.initialize();
+
+  if (event === 'start') {
+    await auditSession.logPhaseStart(phase);
+  } else {
+    await auditSession.logPhaseComplete(phase);
+  }
+}
+
+/**
+ * Log workflow completion with full summary to the unified workflow log.
+ * Called at the end of the workflow to write a summary breakdown.
+ */
+export async function logWorkflowComplete(
+  input: ActivityInput,
+  summary: WorkflowSummary
+): Promise<void> {
+  const { webUrl, repoPath, outputPath, workflowId } = input;
+
+  const sessionMetadata: SessionMetadata = {
+    id: workflowId,
+    webUrl,
+    repoPath,
+    ...(outputPath && { outputPath }),
+  };
+
+  const auditSession = new AuditSession(sessionMetadata);
+  await auditSession.initialize();
+  await auditSession.logWorkflowComplete(summary);
+}
@@ -0,0 +1,212 @@
+#!/usr/bin/env node
+// Copyright (C) 2025 Keygraph, Inc.
+//
+// This program is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License version 3
+// as published by the Free Software Foundation.
+
+/**
+ * Temporal client for starting Shannon pentest pipeline workflows.
+ *
+ * Starts a workflow and optionally waits for completion with progress polling.
+ *
+ * Usage:
+ *   npm run temporal:start -- <webUrl> <repoPath> [options]
+ *   # or
+ *   node dist/temporal/client.js <webUrl> <repoPath> [options]
+ *
+ * Options:
+ *   --config <path>       Configuration file path
+ *   --output <path>       Output directory for audit logs
+ *   --pipeline-testing    Use minimal prompts for fast testing
+ *   --workflow-id <id>    Custom workflow ID (default: <hostname>_shannon-<timestamp>)
+ *   --wait                Wait for workflow completion with progress polling
+ *
+ * Environment:
+ *   TEMPORAL_ADDRESS - Temporal server address (default: localhost:7233)
+ */
+
+import { Connection, Client } from '@temporalio/client';
+import dotenv from 'dotenv';
+import chalk from 'chalk';
+import { displaySplashScreen } from '../splash-screen.js';
+import { sanitizeHostname } from '../audit/utils.js';
+// Import types only - these don't pull in workflow runtime code
+import type { PipelineInput, PipelineState, PipelineProgress } from './shared.js';
+
+dotenv.config();
+
+// Query name must match the one defined in workflows.ts
+const PROGRESS_QUERY = 'getProgress';
+
+function showUsage(): void {
+  console.log(chalk.cyan.bold('\nShannon Temporal Client'));
+  console.log(chalk.gray('Start a pentest pipeline workflow\n'));
+  console.log(chalk.yellow('Usage:'));
+  console.log(
+    '  node dist/temporal/client.js <webUrl> <repoPath> [options]\n'
+  );
+  console.log(chalk.yellow('Options:'));
+  console.log('  --config <path>       Configuration file path');
+  console.log('  --output <path>       Output directory for audit logs');
+  console.log('  --pipeline-testing    Use minimal prompts for fast testing');
+  console.log(
+    '  --workflow-id <id>    Custom workflow ID (default: <hostname>_shannon-<timestamp>)'
+  );
+  console.log('  --wait                Wait for workflow completion with progress polling\n');
+  console.log(chalk.yellow('Examples:'));
+  console.log('  node dist/temporal/client.js https://example.com /path/to/repo');
+  console.log(
+    '  node dist/temporal/client.js https://example.com /path/to/repo --config config.yaml\n'
+  );
+}
+
+async function startPipeline(): Promise<void> {
+  const args = process.argv.slice(2);
+
+  if (args.includes('--help') || args.includes('-h') || args.length === 0) {
+    showUsage();
+    process.exit(0);
+  }
+
+  // Parse arguments
+  let webUrl: string | undefined;
+  let repoPath: string | undefined;
+  let configPath: string | undefined;
+  let outputPath: string | undefined;
+  let pipelineTestingMode = false;
+  let customWorkflowId: string | undefined;
+  let waitForCompletion = false;
+
+  for (let i = 0; i < args.length; i++) {
+    const arg = args[i];
+    if (arg === '--config') {
+      const nextArg = args[i + 1];
+      if (nextArg && !nextArg.startsWith('-')) {
+        configPath = nextArg;
+        i++;
+      }
+    } else if (arg === '--output') {
+      const nextArg = args[i + 1];
+      if (nextArg && !nextArg.startsWith('-')) {
+        outputPath = nextArg;
+        i++;
+      }
+    } else if (arg === '--workflow-id') {
+      const nextArg = args[i + 1];
+      if (nextArg && !nextArg.startsWith('-')) {
+        customWorkflowId = nextArg;
+        i++;
+      }
+    } else if (arg === '--pipeline-testing') {
+      pipelineTestingMode = true;
+    } else if (arg === '--wait') {
+      waitForCompletion = true;
+    } else if (arg && !arg.startsWith('-')) {
+      if (!webUrl) {
+        webUrl = arg;
+      } else if (!repoPath) {
+        repoPath = arg;
+      }
+    }
+  }
+
+  if (!webUrl || !repoPath) {
+    console.log(chalk.red('Error: webUrl and repoPath are required'));
+    showUsage();
+    process.exit(1);
+  }
+
+  // Display splash screen
+  await displaySplashScreen();
+
+  const address = process.env.TEMPORAL_ADDRESS || 'localhost:7233';
+  console.log(chalk.gray(`Connecting to Temporal at ${address}...`));
+
+  const connection = await Connection.connect({ address });
+  const client = new Client({ connection });
+
+  try {
+    const hostname = sanitizeHostname(webUrl);
+    const workflowId = customWorkflowId || `${hostname}_shannon-${Date.now()}`;
+
+    const input: PipelineInput = {
+      webUrl,
+      repoPath,
+      ...(configPath && { configPath }),
+      ...(outputPath && { outputPath }),
+      ...(pipelineTestingMode && { pipelineTestingMode }),
+    };
+
+    console.log(chalk.green.bold(`Starting workflow: ${workflowId}`));
+    console.log();
+    console.log(chalk.white('  Target:     ') + chalk.cyan(webUrl));
+    console.log(chalk.white('  Repository: ') + chalk.cyan(repoPath));
+    if (configPath) {
+      console.log(chalk.white('  Config:     ') + chalk.cyan(configPath));
+    }
+    if (pipelineTestingMode) {
+      console.log(chalk.white('  Mode:       ') + chalk.yellow('Pipeline Testing'));
+    }
+    console.log();
+
+    // Start workflow by name (not by importing the function)
+    const handle = await client.workflow.start<(input: PipelineInput) => Promise<PipelineState>>(
+      'pentestPipelineWorkflow',
+      {
+        taskQueue: 'shannon-pipeline',
+        workflowId,
+        args: [input],
+      }
+    );
+
+    if (!waitForCompletion) {
+      console.log(chalk.bold('Monitor progress:'));
+      console.log(chalk.white('  Web UI:  ') + chalk.blue(`http://localhost:8233/namespaces/default/workflows/${workflowId}`));
+      console.log(chalk.white('  Logs:    ') + chalk.gray(`./shannon logs ID=${workflowId}`));
+      console.log(chalk.white('  Query:   ') + chalk.gray(`./shannon query ID=${workflowId}`));
+      console.log();
+      return;
+    }
+
+    // Poll for progress every 30 seconds
+    const progressInterval = setInterval(async () => {
+      try {
+        const progress = await handle.query<PipelineProgress>(PROGRESS_QUERY);
+        const elapsed = Math.floor(progress.elapsedMs / 1000);
+        console.log(
+          chalk.gray(`[${elapsed}s]`),
+          chalk.cyan(`Phase: ${progress.currentPhase || 'unknown'}`),
+          chalk.gray(`| Agent: ${progress.currentAgent || 'none'}`),
+          chalk.gray(`| Completed: ${progress.completedAgents.length}/13`)
+        );
+      } catch {
+        // Workflow may have completed
+      }
+    }, 30000);
+
+    try {
+      const result = await handle.result();
+      clearInterval(progressInterval);
+
+      console.log(chalk.green.bold('\nPipeline completed successfully!'));
+      if (result.summary) {
+        console.log(chalk.gray(`Duration: ${Math.floor(result.summary.totalDurationMs / 1000)}s`));
+        console.log(chalk.gray(`Agents completed: ${result.summary.agentCount}`));
+        console.log(chalk.gray(`Total turns: ${result.summary.totalTurns}`));
+        console.log(chalk.gray(`Total cost: $${result.summary.totalCostUsd.toFixed(4)}`));
+      }
+    } catch (error) {
+      clearInterval(progressInterval);
+      console.error(chalk.red.bold('\nPipeline failed:'), error);
+      process.exitCode = 1; // Exit non-zero only after the finally block closes the connection
+    }
+  } finally {
+    await connection.close();
+  }
+}
+
+startPipeline().catch((err) => {
+  console.error(chalk.red('Client error:'), err);
+  process.exit(1);
+});
@@ -0,0 +1,155 @@
+#!/usr/bin/env node
+// Copyright (C) 2025 Keygraph, Inc.
+//
+// This program is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License version 3
+// as published by the Free Software Foundation.
+
+/**
+ * Temporal query tool for inspecting Shannon workflow progress.
+ *
+ * Queries a running or completed workflow and displays its state.
+ *
+ * Usage:
+ *   npm run temporal:query -- <workflowId>
+ *   # or
+ *   node dist/temporal/query.js <workflowId>
+ *
+ * Environment:
+ *   TEMPORAL_ADDRESS - Temporal server address (default: localhost:7233)
+ */
+
+import { Connection, Client } from '@temporalio/client';
+import dotenv from 'dotenv';
+import chalk from 'chalk';
+
+dotenv.config();
+
+// Query name must match the one defined in workflows.ts
+const PROGRESS_QUERY = 'getProgress';
+
+// Types duplicated from shared.ts to avoid importing workflow APIs
+interface AgentMetrics {
+  durationMs: number;
+  inputTokens: number | null;
+  outputTokens: number | null;
+  costUsd: number | null;
+  numTurns: number | null;
+}
+
+interface PipelineProgress {
+  status: 'running' | 'completed' | 'failed';
+  currentPhase: string | null;
+  currentAgent: string | null;
+  completedAgents: string[];
+  failedAgent: string | null;
+  error: string | null;
+  startTime: number;
+  agentMetrics: Record<string, AgentMetrics>;
+  workflowId: string;
+  elapsedMs: number;
+}
+
+function showUsage(): void {
+  console.log(chalk.cyan.bold('\nShannon Temporal Query Tool'));
+  console.log(chalk.gray('Query progress of a running or completed workflow\n'));
+  console.log(chalk.yellow('Usage:'));
+  console.log('  node dist/temporal/query.js <workflowId>\n');
+  console.log(chalk.yellow('Examples:'));
+  console.log('  node dist/temporal/query.js shannon-1704672000000\n');
+}
+
+function getStatusColor(status: string): string {
+  switch (status) {
+    case 'running':
+      return chalk.yellow(status);
+    case 'completed':
+      return chalk.green(status);
+    case 'failed':
+      return chalk.red(status);
+    default:
+      return status;
+  }
+}
+
+function formatDuration(ms: number): string {
+  const seconds = Math.floor(ms / 1000);
+  const minutes = Math.floor(seconds / 60);
+  const hours = Math.floor(minutes / 60);
+
+  if (hours > 0) {
+    return `${hours}h ${minutes % 60}m`;
+  } else if (minutes > 0) {
+    return `${minutes}m ${seconds % 60}s`;
+  }
+  return `${seconds}s`;
+}
+
+async function queryWorkflow(): Promise<void> {
+  const workflowId = process.argv[2];
+
+  if (!workflowId || workflowId === '--help' || workflowId === '-h') {
+    showUsage();
+    process.exit(workflowId ? 0 : 1);
+  }
+
+  const address = process.env.TEMPORAL_ADDRESS || 'localhost:7233';
+
+  const connection = await Connection.connect({ address });
+  const client = new Client({ connection });
+
+  try {
+    const handle = client.workflow.getHandle(workflowId);
+    const progress = await handle.query<PipelineProgress>(PROGRESS_QUERY);
+
+    console.log(chalk.cyan.bold('\nWorkflow Progress'));
+    console.log(chalk.gray('\u2500'.repeat(40)));
+    console.log(`${chalk.white('Workflow ID:')} ${progress.workflowId}`);
+    console.log(`${chalk.white('Status:')} ${getStatusColor(progress.status)}`);
+    console.log(
+      `${chalk.white('Current Phase:')} ${progress.currentPhase || 'none'}`
+    );
+    console.log(
+      `${chalk.white('Current Agent:')} ${progress.currentAgent || 'none'}`
+    );
+    console.log(`${chalk.white('Elapsed:')} ${formatDuration(progress.elapsedMs)}`);
+    console.log(
+      `${chalk.white('Completed:')} ${progress.completedAgents.length}/13 agents`
+    );
+
+    if (progress.completedAgents.length > 0) {
+      console.log(chalk.gray('\nCompleted agents:'));
+      for (const agent of progress.completedAgents) {
+        const metrics = progress.agentMetrics[agent];
+        const duration = metrics ? formatDuration(metrics.durationMs) : 'unknown';
+        const cost = metrics?.costUsd ? `$${metrics.costUsd.toFixed(4)}` : '';
+        console.log(
+          chalk.green(`  - ${agent}`) +
+            chalk.gray(` (${duration}${cost ? ', ' + cost : ''})`)
+        );
+      }
+    }
+
+    if (progress.error) {
+      console.log(chalk.red(`\nError: ${progress.error}`));
+      console.log(chalk.red(`Failed agent: ${progress.failedAgent}`));
+    }
+
+    console.log();
+  } catch (error) {
+    const err = error as Error;
+    if (err.message?.includes('not found')) {
+      console.log(chalk.red(`Workflow not found: ${workflowId}`));
+    } else {
+      console.error(chalk.red('Query failed:'), err.message);
+    }
+    process.exitCode = 1; // Exit non-zero only after the finally block closes the connection
+  } finally {
+    await connection.close();
+  }
+}
+
+queryWorkflow().catch((err) => {
+  console.error(chalk.red('Query error:'), err);
+  process.exit(1);
+});
@@ -0,0 +1,61 @@
+import { defineQuery } from '@temporalio/workflow';
+
+// === Types ===
+
+export interface PipelineInput {
+  webUrl: string;
+  repoPath: string;
+  configPath?: string;
+  outputPath?: string;
+  pipelineTestingMode?: boolean;
+  workflowId?: string; // Optional here; the workflow reads its ID from workflowInfo() for audit correlation
+}
+
+export interface AgentMetrics {
+  durationMs: number;
+  inputTokens: number | null;
+  outputTokens: number | null;
+  costUsd: number | null;
+  numTurns: number | null;
+}
+
+export interface PipelineSummary {
+  totalCostUsd: number;
+  totalDurationMs: number; // Wall-clock time (end - start)
+  totalTurns: number;
+  agentCount: number;
+}
+
+export interface PipelineState {
+  status: 'running' | 'completed' | 'failed';
+  currentPhase: string | null;
+  currentAgent: string | null;
+  completedAgents: string[];
+  failedAgent: string | null;
+  error: string | null;
+  startTime: number;
+  agentMetrics: Record<string, AgentMetrics>;
+  summary: PipelineSummary | null;
+}
+
+// Extended state returned by getProgress query (includes computed fields)
+export interface PipelineProgress extends PipelineState {
+  workflowId: string;
+  elapsedMs: number;
+}
+
+// Result from a single vuln→exploit pipeline
+export interface VulnExploitPipelineResult {
+  vulnType: string;
+  vulnMetrics: AgentMetrics | null;
+  exploitMetrics: AgentMetrics | null;
+  exploitDecision: {
+    shouldExploit: boolean;
+    vulnerabilityCount: number;
+  } | null;
+  error: string | null;
+}
+
+// === Queries ===
+
+export const getProgress = defineQuery<PipelineProgress>('getProgress');
@@ -0,0 +1,79 @@
+#!/usr/bin/env node
+// Copyright (C) 2025 Keygraph, Inc.
+//
+// This program is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License version 3
+// as published by the Free Software Foundation.
+
+/**
+ * Temporal worker for Shannon pentest pipeline.
+ *
+ * Polls the 'shannon-pipeline' task queue and executes activities.
+ * Handles up to 25 concurrent activities to support multiple parallel workflows.
+ *
+ * Usage:
+ *   npm run temporal:worker
+ *   # or
+ *   node dist/temporal/worker.js
+ *
+ * Environment:
+ *   TEMPORAL_ADDRESS - Temporal server address (default: localhost:7233)
+ */
+
+import { NativeConnection, Worker, bundleWorkflowCode } from '@temporalio/worker';
+import { fileURLToPath } from 'node:url';
+import path from 'node:path';
+import dotenv from 'dotenv';
+import chalk from 'chalk';
+import * as activities from './activities.js';
+
+dotenv.config();
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+
+async function runWorker(): Promise<void> {
+  const address = process.env.TEMPORAL_ADDRESS || 'localhost:7233';
+  console.log(chalk.cyan(`Connecting to Temporal at ${address}...`));
+
+  const connection = await NativeConnection.connect({ address });
+
+  // Bundle workflows for Temporal's V8 isolate
+  console.log(chalk.gray('Bundling workflows...'));
+  const workflowBundle = await bundleWorkflowCode({
+    workflowsPath: path.join(__dirname, 'workflows.js'),
+  });
+
+  const worker = await Worker.create({
+    connection,
+    namespace: 'default',
+    workflowBundle,
+    activities,
+    taskQueue: 'shannon-pipeline',
+    maxConcurrentActivityTaskExecutions: 25, // Support multiple parallel workflows (5 agents × ~5 workflows)
+  });
+
+  // Graceful shutdown handling
+  const shutdown = async (): Promise<void> => {
+    console.log(chalk.yellow('\nShutting down worker...'));
+    worker.shutdown();
+  };
+
+  process.on('SIGINT', shutdown);
+  process.on('SIGTERM', shutdown);
+
+  console.log(chalk.green('Shannon worker started'));
+  console.log(chalk.gray('Task queue: shannon-pipeline'));
+  console.log(chalk.gray('Press Ctrl+C to stop\n'));
+
+  try {
+    await worker.run();
+  } finally {
+    await connection.close();
+    console.log(chalk.gray('Worker stopped'));
+  }
+}
+
+runWorker().catch((err) => {
+  console.error(chalk.red('Worker failed:'), err);
+  process.exit(1);
+});
@@ -0,0 +1,325 @@
+// Copyright (C) 2025 Keygraph, Inc.
+//
+// This program is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License version 3
+// as published by the Free Software Foundation.
+
+/**
+ * Temporal workflow for Shannon pentest pipeline.
+ *
+ * Orchestrates the penetration testing workflow:
+ * 1. Pre-Reconnaissance (sequential)
+ * 2. Reconnaissance (sequential)
+ * 3-4. Vulnerability + Exploitation (5 pipelined pairs in parallel)
+ *      Each pair: vuln agent → queue check → conditional exploit
+ *      No synchronization barrier - exploits start when their vuln finishes
+ * 5. Reporting (sequential)
+ *
+ * Features:
+ * - Queryable state via getProgress
+ * - Automatic retry with backoff for transient/billing errors
+ * - Non-retryable classification for permanent errors
+ * - Audit correlation via workflowId
+ * - Graceful failure handling: pipelines continue if one fails
+ */
+
+import {
+  proxyActivities,
+  setHandler,
+  workflowInfo,
+} from '@temporalio/workflow';
+import type * as activities from './activities.js';
+import type { ActivityInput } from './activities.js';
+import {
+  getProgress,
+  type PipelineInput,
+  type PipelineState,
+  type PipelineProgress,
+  type PipelineSummary,
+  type VulnExploitPipelineResult,
+  type AgentMetrics,
+} from './shared.js';
+import type { VulnType } from '../queue-validation.js';
+
+// Retry configuration for production (long intervals for billing recovery)
+const PRODUCTION_RETRY = {
+  initialInterval: '5 minutes',
+  maximumInterval: '30 minutes',
+  backoffCoefficient: 2,
+  maximumAttempts: 50,
+  nonRetryableErrorTypes: [
+    'AuthenticationError',
+    'PermissionError',
+    'InvalidRequestError',
+    'RequestTooLargeError',
+    'ConfigurationError',
+    'InvalidTargetError',
+    'ExecutionLimitError',
+  ],
+};
+
+// Retry configuration for pipeline testing (fast iteration)
+const TESTING_RETRY = {
+  initialInterval: '10 seconds',
+  maximumInterval: '30 seconds',
+  backoffCoefficient: 2,
+  maximumAttempts: 5,
+  nonRetryableErrorTypes: PRODUCTION_RETRY.nonRetryableErrorTypes,
+};
+
+// Activity proxy with production retry configuration (default)
+const acts = proxyActivities<typeof activities>({
+  startToCloseTimeout: '2 hours',
+  heartbeatTimeout: '10 minutes', // Long timeout for resource-constrained workers with many concurrent activities
+  retry: PRODUCTION_RETRY,
+});
+
+// Activity proxy with testing retry configuration (fast)
+const testActs = proxyActivities<typeof activities>({
+  startToCloseTimeout: '10 minutes',
+  heartbeatTimeout: '5 minutes', // Shorter for testing but still tolerant of resource contention
+  retry: TESTING_RETRY,
+});
+
+/**
+ * Compute aggregated metrics from the current pipeline state.
+ * Called on both success and failure to provide partial metrics.
+ */
+function computeSummary(state: PipelineState): PipelineSummary {
+  const metrics = Object.values(state.agentMetrics);
+  return {
+    totalCostUsd: metrics.reduce((sum, m) => sum + (m.costUsd ?? 0), 0),
+    totalDurationMs: Date.now() - state.startTime,
+    totalTurns: metrics.reduce((sum, m) => sum + (m.numTurns ?? 0), 0),
+    agentCount: state.completedAgents.length,
+  };
+}
+
+export async function pentestPipelineWorkflow(
+  input: PipelineInput
+): Promise<PipelineState> {
+  const { workflowId } = workflowInfo();
+
+  // Select activity proxy based on testing mode
+  // Pipeline testing uses fast retry intervals (10s) for quick iteration
+  const a = input.pipelineTestingMode ? testActs : acts;
+
+  // Workflow state (queryable)
+  const state: PipelineState = {
+    status: 'running',
+    currentPhase: null,
+    currentAgent: null,
+    completedAgents: [],
+    failedAgent: null,
+    error: null,
+    startTime: Date.now(),
+    agentMetrics: {},
+    summary: null,
+  };
+
+  // Register query handler for real-time progress inspection
+  setHandler(getProgress, (): PipelineProgress => ({
+    ...state,
+    workflowId,
+    elapsedMs: Date.now() - state.startTime,
+  }));
+
+  // Build ActivityInput with required workflowId for audit correlation
+  // Activities require workflowId, whereas PipelineInput declares it optional
+  // Use spread to conditionally include optional properties (exactOptionalPropertyTypes)
+  const activityInput: ActivityInput = {
+    webUrl: input.webUrl,
+    repoPath: input.repoPath,
+    workflowId,
+    ...(input.configPath !== undefined && { configPath: input.configPath }),
+    ...(input.outputPath !== undefined && { outputPath: input.outputPath }),
+    ...(input.pipelineTestingMode !== undefined && {
+      pipelineTestingMode: input.pipelineTestingMode,
+    }),
+  };
+
+  try {
+    // === Phase 1: Pre-Reconnaissance ===
+    state.currentPhase = 'pre-recon';
+    state.currentAgent = 'pre-recon';
+    await a.logPhaseTransition(activityInput, 'pre-recon', 'start');
+    state.agentMetrics['pre-recon'] =
+      await a.runPreReconAgent(activityInput);
+    state.completedAgents.push('pre-recon');
+    await a.logPhaseTransition(activityInput, 'pre-recon', 'complete');
+
+    // === Phase 2: Reconnaissance ===
+    state.currentPhase = 'recon';
+    state.currentAgent = 'recon';
+    await a.logPhaseTransition(activityInput, 'recon', 'start');
+    state.agentMetrics['recon'] = await a.runReconAgent(activityInput);
+    state.completedAgents.push('recon');
+    await a.logPhaseTransition(activityInput, 'recon', 'complete');
+
+    // === Phases 3-4: Vulnerability Analysis + Exploitation (Pipelined) ===
+    // Each vuln type runs as an independent pipeline:
+    // vuln agent → queue check → conditional exploit agent
+    // This eliminates the synchronization barrier between phases - each exploit
+    // starts immediately when its vuln agent finishes, not waiting for all.
+    state.currentPhase = 'vulnerability-exploitation';
+    state.currentAgent = 'pipelines';
+    await a.logPhaseTransition(activityInput, 'vulnerability-exploitation', 'start');
+
+    // Helper: Run a single vuln→exploit pipeline
+    async function runVulnExploitPipeline(
+      vulnType: VulnType,
+      runVulnAgent: () => Promise<AgentMetrics>,
+      runExploitAgent: () => Promise<AgentMetrics>
+    ): Promise<VulnExploitPipelineResult> {
+      // Step 1: Run vulnerability agent
+      const vulnMetrics = await runVulnAgent();
+
+      // Step 2: Check exploitation queue (starts immediately after vuln)
+      const decision = await a.checkExploitationQueue(activityInput, vulnType);
+
+      // Step 3: Conditionally run exploit agent
+      let exploitMetrics: AgentMetrics | null = null;
+      if (decision.shouldExploit) {
+        exploitMetrics = await runExploitAgent();
+      }
+
+      return {
+        vulnType,
+        vulnMetrics,
+        exploitMetrics,
+        exploitDecision: {
+          shouldExploit: decision.shouldExploit,
+          vulnerabilityCount: decision.vulnerabilityCount,
+        },
+        error: null,
+      };
+    }
+
+    // Run all 5 pipelines in parallel with graceful failure handling
+    // Promise.allSettled ensures other pipelines continue if one fails
+    const pipelineResults = await Promise.allSettled([
+      runVulnExploitPipeline(
+        'injection',
+        () => a.runInjectionVulnAgent(activityInput),
+        () => a.runInjectionExploitAgent(activityInput)
+      ),
+      runVulnExploitPipeline(
+        'xss',
+        () => a.runXssVulnAgent(activityInput),
+        () => a.runXssExploitAgent(activityInput)
+      ),
+      runVulnExploitPipeline(
+        'auth',
+        () => a.runAuthVulnAgent(activityInput),
+        () => a.runAuthExploitAgent(activityInput)
+      ),
+      runVulnExploitPipeline(
+        'ssrf',
+        () => a.runSsrfVulnAgent(activityInput),
+        () => a.runSsrfExploitAgent(activityInput)
+      ),
+      runVulnExploitPipeline(
+        'authz',
+        () => a.runAuthzVulnAgent(activityInput),
+        () => a.runAuthzExploitAgent(activityInput)
+      ),
+    ]);
+
+    // Aggregate results from all pipelines
+    const failedPipelines: string[] = [];
+    for (const result of pipelineResults) {
+      if (result.status === 'fulfilled') {
+        const { vulnType, vulnMetrics, exploitMetrics } = result.value;
+
+        // Record vuln agent metrics
+        if (vulnMetrics) {
+          state.agentMetrics[`${vulnType}-vuln`] = vulnMetrics;
+          state.completedAgents.push(`${vulnType}-vuln`);
+        }
+
+        // Record exploit agent metrics (if it ran)
+        if (exploitMetrics) {
+          state.agentMetrics[`${vulnType}-exploit`] = exploitMetrics;
+          state.completedAgents.push(`${vulnType}-exploit`);
+        }
+      } else {
+        // Pipeline failed - log error but continue with others
+        const errorMsg =
+          result.reason instanceof Error
+            ? result.reason.message
+            : String(result.reason);
+        failedPipelines.push(errorMsg);
+      }
+    }
+
+    // Log any pipeline failures (workflow continues despite failures)
+    if (failedPipelines.length > 0) {
+      console.log(
+        `⚠️ ${failedPipelines.length} pipeline(s) failed:`,
+        failedPipelines
+      );
+    }
+
+    // Update phase markers
+    state.currentPhase = 'exploitation';
+    state.currentAgent = null;
+    await a.logPhaseTransition(activityInput, 'vulnerability-exploitation', 'complete');
+
+    // === Phase 5: Reporting ===
+    state.currentPhase = 'reporting';
+    state.currentAgent = 'report';
+    await a.logPhaseTransition(activityInput, 'reporting', 'start');
+
+    // First, assemble the concatenated report from exploitation evidence files
+    await a.assembleReportActivity(activityInput);
+
+    // Then run the report agent to add executive summary and clean up
+    state.agentMetrics['report'] = await a.runReportAgent(activityInput);
+    state.completedAgents.push('report');
+    await a.logPhaseTransition(activityInput, 'reporting', 'complete');
+
+    // === Complete ===
+    state.status = 'completed';
+    state.currentPhase = null;
+    state.currentAgent = null;
+    state.summary = computeSummary(state);
+
+    // Log workflow completion summary
+    await a.logWorkflowComplete(activityInput, {
+      status: 'completed',
+      totalDurationMs: state.summary.totalDurationMs,
+      totalCostUsd: state.summary.totalCostUsd,
+      completedAgents: state.completedAgents,
+      agentMetrics: Object.fromEntries(
+        Object.entries(state.agentMetrics).map(([name, m]) => [
+          name,
+          { durationMs: m.durationMs, costUsd: m.costUsd },
+        ])
+      ),
+    });
+
+    return state;
+  } catch (error) {
+    state.status = 'failed';
+    state.failedAgent = state.currentAgent;
+    state.error = error instanceof Error ? error.message : String(error);
+    state.summary = computeSummary(state);
+
+    // Log workflow failure summary
+    await a.logWorkflowComplete(activityInput, {
+      status: 'failed',
+      totalDurationMs: state.summary.totalDurationMs,
+      totalCostUsd: state.summary.totalCostUsd,
+      completedAgents: state.completedAgents,
+      agentMetrics: Object.fromEntries(
+        Object.entries(state.agentMetrics).map(([name, m]) => [
+          name,
+          { durationMs: m.durationMs, costUsd: m.costUsd },
+        ])
+      ),
+      error: state.error ?? undefined,
+    });
+
+    throw error;
+  }
+}
@@ -47,10 +47,6 @@ export type PlaywrightAgent =

 export type AgentValidator = (sourceDir: string) => Promise<boolean>;

-export type AgentValidatorMap = Record<AgentName, AgentValidator>;
-
-export type McpAgentMapping = Record<PromptName, PlaywrightAgent>;
-
 export type AgentStatus =
  | 'pending'
  | 'in_progress'
@@ -63,3 +59,26 @@ export interface AgentDefinition {
  displayName: string;
  prerequisites: AgentName[];
 }
+
+/**
+ * Maps an agent name to its corresponding prompt file name.
+ */
+export function getPromptNameForAgent(agentName: AgentName): PromptName {
+  const mappings: Record<AgentName, PromptName> = {
+    'pre-recon': 'pre-recon-code',
+    'recon': 'recon',
+    'injection-vuln': 'vuln-injection',
+    'xss-vuln': 'vuln-xss',
+    'auth-vuln': 'vuln-auth',
+    'ssrf-vuln': 'vuln-ssrf',
+    'authz-vuln': 'vuln-authz',
+    'injection-exploit': 'exploit-injection',
+    'xss-exploit': 'exploit-xss',
+    'auth-exploit': 'exploit-auth',
+    'ssrf-exploit': 'exploit-ssrf',
+    'authz-exploit': 'exploit-authz',
+    'report': 'report-executive',
+  };
+
+  return mappings[agentName];
+}
@@ -31,13 +31,12 @@ type UnlockFunction = () => void;
 * }
 * ```
 */
+// Promise-based mutex with queue semantics - safe for parallel agents on same session
 export class SessionMutex {
  // Map of sessionId -> Promise (represents active lock)
  private locks: Map<string, Promise<void>> = new Map();

-  /**
-   * Acquire lock for a session
-   */
+  // Wait for existing lock, then acquire. Queue ensures FIFO ordering.
  async lock(sessionId: string): Promise<UnlockFunction> {
    if (this.locks.has(sessionId)) {
      // Wait for existing lock to be released
@@ -0,0 +1,73 @@
+// Copyright (C) 2025 Keygraph, Inc.
+//
+// This program is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License version 3
+// as published by the Free Software Foundation.
+
+/**
+ * File I/O Utilities
+ *
+ * Core utility functions for file operations including atomic writes,
+ * directory creation, and JSON file handling.
+ */
+
+import fs from 'fs/promises';
+
+/**
+ * Ensure directory exists (idempotent, race-safe)
+ */
+export async function ensureDirectory(dirPath: string): Promise<void> {
+  try {
+    await fs.mkdir(dirPath, { recursive: true });
+  } catch (error) {
+    // Ignore EEXIST errors (race condition safe)
+    if ((error as NodeJS.ErrnoException).code !== 'EEXIST') {
+      throw error;
+    }
+  }
+}
+
+/**
+ * Atomic write using temp file + rename pattern
+ * Readers never observe a partially written file; no fsync is issued, so durability after a crash is not guaranteed
+ */
+export async function atomicWrite(filePath: string, data: object | string): Promise<void> {
+  const tempPath = `${filePath}.tmp`;
+  const content = typeof data === 'string' ? data : JSON.stringify(data, null, 2);
+
+  try {
+    // Write to temp file
+    await fs.writeFile(tempPath, content, 'utf8');
+
+    // Atomic rename (POSIX guarantee: atomic on same filesystem)
+    await fs.rename(tempPath, filePath);
+  } catch (error) {
+    // Clean up temp file on failure
+    try {
+      await fs.unlink(tempPath);
+    } catch {
+      // Ignore cleanup errors
+    }
+    throw error;
+  }
+}
+
+/**
+ * Read and parse JSON file
+ */
+export async function readJson<T = unknown>(filePath: string): Promise<T> {
+  const content = await fs.readFile(filePath, 'utf8');
+  return JSON.parse(content) as T;
+}
+
+/**
+ * Check if file exists
+ */
+export async function fileExists(filePath: string): Promise<boolean> {
+  try {
+    await fs.access(filePath);
+    return true;
+  } catch {
+    return false;
+  }
+}
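Note on `atomicWrite` above: the temp file is always `${filePath}.tmp`, so two overlapping writers to the same target would share one temp path. The following is a minimal sketch of one way around that, assuming a random per-call suffix is acceptable. `atomicWriteUnique` and `demoAtomicWrite` are hypothetical names used for illustration and are not part of this PR.

```typescript
import { mkdtemp, readFile, rename, unlink, writeFile } from 'fs/promises';
import { randomUUID } from 'crypto';
import { tmpdir } from 'os';
import { join } from 'path';

// Hypothetical variant of atomicWrite: each call gets its own temp file, so
// concurrent writers to the same target never write into a shared temp path.
async function atomicWriteUnique(filePath: string, data: object | string): Promise<void> {
  const tempPath = `${filePath}.${randomUUID()}.tmp`;
  const content = typeof data === 'string' ? data : JSON.stringify(data, null, 2);

  try {
    await writeFile(tempPath, content, 'utf8');
    // rename() swaps the target in one step when both paths share a filesystem
    await rename(tempPath, filePath);
  } catch (error) {
    // Best-effort cleanup; the original error is what the caller needs to see
    await unlink(tempPath).catch(() => undefined);
    throw error;
  }
}

// Usage: two overlapping writes. Whichever rename lands last wins, but the
// file always holds one complete JSON payload.
async function demoAtomicWrite(): Promise<string> {
  const dir = await mkdtemp(join(tmpdir(), 'atomic-write-'));
  const target = join(dir, 'state.json');
  await Promise.all([
    atomicWriteUnique(target, { writer: 'a' }),
    atomicWriteUnique(target, { writer: 'b' }),
  ]);
  return readFile(target, 'utf8');
}
```

The sketch keeps the same cleanup-on-failure behavior as the original and does not add `fsync`, so it makes no stronger durability claim than `atomicWrite` does.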
@@ -0,0 +1,60 @@
+// Copyright (C) 2025 Keygraph, Inc.
+//
+// This program is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License version 3
+// as published by the Free Software Foundation.
+
+/**
+ * Formatting Utilities
+ *
+ * Generic formatting functions for durations, timestamps, and percentages.
+ */
+
+/**
+ * Format duration in milliseconds to human-readable string
+ */
+export function formatDuration(ms: number): string {
+  if (ms < 1000) {
+    return `${ms}ms`;
+  }
+
+  const seconds = ms / 1000;
+  if (seconds < 60) {
+    return `${seconds.toFixed(1)}s`;
+  }
+
+  const minutes = Math.floor(seconds / 60);
+  const remainingSeconds = Math.floor(seconds % 60);
+  return `${minutes}m ${remainingSeconds}s`;
+}
+
+/**
+ * Format timestamp to ISO 8601 string
+ */
+export function formatTimestamp(timestamp: number = Date.now()): string {
+  return new Date(timestamp).toISOString();
+}
+
+/**
+ * Calculate percentage
+ */
+export function calculatePercentage(part: number, total: number): number {
+  if (total === 0) return 0;
+  return (part / total) * 100;
+}
+
+/**
+ * Extract agent type from description string for display purposes
+ */
+export function extractAgentType(description: string): string {
+  if (description.includes('Pre-recon')) {
+    return 'pre-reconnaissance';
+  }
+  if (description.includes('Recon')) {
+    return 'reconnaissance';
+  }
+  if (description.includes('Report')) {
+    return 'report generation';
+  }
+  return 'analysis';
+}
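To make the three branches of `formatDuration` above concrete, here is a small worked example. The function body is copied from the diff so the snippet runs on its own; the sample inputs are arbitrary values chosen for illustration.

```typescript
// Copy of formatDuration from the diff above, repeated so the example is self-contained.
function formatDuration(ms: number): string {
  if (ms < 1000) {
    return `${ms}ms`;
  }

  const seconds = ms / 1000;
  if (seconds < 60) {
    return `${seconds.toFixed(1)}s`;
  }

  const minutes = Math.floor(seconds / 60);
  const remainingSeconds = Math.floor(seconds % 60);
  return `${minutes}m ${remainingSeconds}s`;
}

const samples: Array<[number, string]> = [
  [950, formatDuration(950)],       // "950ms"
  [1500, formatDuration(1500)],     // "1.5s"
  [59960, formatDuration(59960)],   // "60.0s": toFixed rounds up while still in the seconds branch
  [125000, formatDuration(125000)], // "2m 5s"
];

for (const [ms, text] of samples) {
  console.log(`${ms} -> ${text}`);
}
```

The `59960` case shows a boundary quirk: values just under one minute can print as `60.0s` rather than `1m 0s`. That is cosmetic for log output, but worth knowing if the strings are ever compared or parsed.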
@@ -0,0 +1,29 @@
+// Copyright (C) 2025 Keygraph, Inc.
+//
+// This program is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License version 3
+// as published by the Free Software Foundation.
+
+/**
+ * Functional Programming Utilities
+ *
+ * Generic functional composition patterns for async operations.
+ */
+
+// eslint-disable-next-line @typescript-eslint/no-explicit-any
+type PipelineFunction = (x: any) => any | Promise<any>;
+
+/**
+ * Async pipeline that passes result through a series of functions.
+ * Clearer than reduce-based pipe and easier to debug.
+ */
+export async function asyncPipe<TResult>(
+  initial: unknown,
+  ...fns: PipelineFunction[]
+): Promise<TResult> {
+  let result = initial;
+  for (const fn of fns) {
+    result = await fn(result);
+  }
+  return result as TResult;
+}
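A short usage sketch for `asyncPipe` above, mixing synchronous and asynchronous stages. The helper is copied from the diff so the snippet is self-contained; the stage functions `trimInput`, `toWords`, and `countWords` are hypothetical and exist only for this example.

```typescript
// eslint-disable-next-line @typescript-eslint/no-explicit-any
type PipelineFunction = (x: any) => any | Promise<any>;

// Copy of asyncPipe from the diff above.
async function asyncPipe<TResult>(
  initial: unknown,
  ...fns: PipelineFunction[]
): Promise<TResult> {
  let result = initial;
  for (const fn of fns) {
    // await works for both plain values and promises, so stages can be sync or async
    result = await fn(result);
  }
  return result as TResult;
}

// Hypothetical stages: each receives the previous stage's resolved output.
const trimInput = (s: string): string => s.trim();
const toWords = async (s: string): Promise<string[]> => s.split(/\s+/);
const countWords = (words: string[]): number => words.length;

const wordCount: Promise<number> = asyncPipe<number>(
  '  scan  login  flow ',
  trimInput,
  toWords,
  countWords
);

wordCount.then((n) => console.log(n)); // logs 3
```

Because the stages are typed as `any`, the compiler does not check that one stage's output matches the next stage's input. The `TResult` parameter is a caller-supplied assertion about the final value, not something the function verifies.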
@@ -7,13 +7,76 @@
 import { $ } from 'zx';
 import chalk from 'chalk';

+/**
+ * Check if a directory is a git repository.
+ * Returns true if the directory contains a .git folder or is inside a git repo.
+ */
+export async function isGitRepository(dir: string): Promise<boolean> {
+  try {
+    await $`cd ${dir} && git rev-parse --git-dir`.quiet();
+    return true;
+  } catch {
+    return false;
+  }
+}
+
 interface GitOperationResult {
  success: boolean;
  hadChanges?: boolean;
  error?: Error;
 }

-// Global git operations semaphore to prevent index.lock conflicts during parallel execution
+/**
+ * Get list of changed files from git status --porcelain output
+ */
+async function getChangedFiles(
+  sourceDir: string,
+  operationDescription: string
+): Promise<string[]> {
+  const status = await executeGitCommandWithRetry(
+    ['git', 'status', '--porcelain'],
+    sourceDir,
+    operationDescription
+  );
+  return status.stdout
+    .trim()
+    .split('\n')
+    .filter((line) => line.length > 0);
+}
+
+/**
+ * Log a summary of changed files with truncation for long lists
+ */
+function logChangeSummary(
+  changes: string[],
+  messageWithChanges: string,
+  messageWithoutChanges: string,
+  color: typeof chalk.green,
+  maxToShow: number = 5
+): void {
+  if (changes.length > 0) {
+    console.log(color(messageWithChanges.replace('{count}', String(changes.length))));
+    changes.slice(0, maxToShow).forEach((change) => console.log(chalk.gray(`       ${change}`)));
+    if (changes.length > maxToShow) {
+      console.log(chalk.gray(`       ... and ${changes.length - maxToShow} more files`));
+    }
+  } else {
+    console.log(color(messageWithoutChanges));
+  }
+}
+
+/**
+ * Convert unknown error to GitOperationResult
+ */
+function toErrorResult(error: unknown): GitOperationResult {
+  const errMsg = error instanceof Error ? error.message : String(error);
+  return {
+    success: false,
+    error: error instanceof Error ? error : new Error(errMsg),
+  };
+}
+
+// Serializes git operations to prevent index.lock conflicts during parallel agent execution
 class GitSemaphore {
  private queue: Array<() => void> = [];
  private running: boolean = false;
@@ -41,33 +104,38 @@ class GitSemaphore {

 const gitSemaphore = new GitSemaphore();

-// Execute git commands with retry logic for index.lock conflicts
-export const executeGitCommandWithRetry = async (
+const GIT_LOCK_ERROR_PATTERNS = [
+  'index.lock',
+  'unable to lock',
+  'Another git process',
+  'fatal: Unable to create',
+  'fatal: index file',
+];
+
+function isGitLockError(errorMessage: string): boolean {
+  return GIT_LOCK_ERROR_PATTERNS.some((pattern) => errorMessage.includes(pattern));
+}
+
+// Retries git commands on lock conflicts with exponential backoff
+export async function executeGitCommandWithRetry(
  commandArgs: string[],
  sourceDir: string,
  description: string,
  maxRetries: number = 5
-): Promise<{ stdout: string; stderr: string }> => {
+): Promise<{ stdout: string; stderr: string }> {
  await gitSemaphore.acquire();

  try {
    for (let attempt = 1; attempt <= maxRetries; attempt++) {
      try {
-        // For arrays like ['git', 'status', '--porcelain'], execute parts separately
        const [cmd, ...args] = commandArgs;
        const result = await $`cd ${sourceDir} && ${cmd} ${args}`;
        return result;
      } catch (error) {
        const errMsg = error instanceof Error ? error.message : String(error);
-        const isLockError =
-          errMsg.includes('index.lock') ||
-          errMsg.includes('unable to lock') ||
-          errMsg.includes('Another git process') ||
-          errMsg.includes('fatal: Unable to create') ||
-          errMsg.includes('fatal: index file');

-        if (isLockError && attempt < maxRetries) {
-          const delay = Math.pow(2, attempt - 1) * 1000; // Exponential backoff: 1s, 2s, 4s, 8s, 16s
+        if (isGitLockError(errMsg) && attempt < maxRetries) {
+          const delay = Math.pow(2, attempt - 1) * 1000; // 1s, 2s, 4s, 8s with the default maxRetries = 5
          console.log(
            chalk.yellow(
              `    ⚠️ Git lock conflict during ${description} (attempt ${attempt}/${maxRetries}). Retrying in ${delay}ms...`
@@ -80,84 +148,81 @@ export const executeGitCommandWithRetry = async (
        throw error;
      }
    }
-    // Should never reach here but TypeScript needs a return
    throw new Error(`Git command failed after ${maxRetries} retries`);
  } finally {
    gitSemaphore.release();
  }
-};
+}

-// Pure functions for Git workspace management
-const cleanWorkspace = async (
+// Two-phase reset: hard reset (tracked files) + clean (untracked files)
+export async function rollbackGitWorkspace(
  sourceDir: string,
-  reason: string = 'clean start'
-): Promise<GitOperationResult> => {
-  console.log(chalk.blue(`    🧹 Cleaning workspace for ${reason}`));
-  try {
-    // Check for uncommitted changes
-    const status = await $`cd ${sourceDir} && git status --porcelain`;
-    const hasChanges = status.stdout.trim().length > 0;
-
-    if (hasChanges) {
-      // Show what we're about to remove
-      const changes = status.stdout
-        .trim()
-        .split('\n')
-        .filter((line) => line.length > 0);
-      console.log(chalk.yellow(`    🔄 Rolling back workspace for ${reason}`));
-
-      await $`cd ${sourceDir} && git reset --hard HEAD`;
-      await $`cd ${sourceDir} && git clean -fd`;
-
-      console.log(
-        chalk.yellow(`    ✅ Rollback completed - removed ${changes.length} contaminated changes:`)
-      );
-      changes.slice(0, 3).forEach((change) => console.log(chalk.gray(`       ${change}`)));
-      if (changes.length > 3) {
-        console.log(chalk.gray(`       ... and ${changes.length - 3} more files`));
-      }
-    } else {
-      console.log(chalk.blue(`    ✅ Workspace already clean (no changes to remove)`));
-    }
-    return { success: true, hadChanges: hasChanges };
-  } catch (error) {
-    const errMsg = error instanceof Error ? error.message : String(error);
-    console.log(chalk.yellow(`    ⚠️ Workspace cleanup failed: ${errMsg}`));
-    return { success: false, error: error instanceof Error ? error : new Error(errMsg) };
+  reason: string = 'retry preparation'
+): Promise<GitOperationResult> {
+  // Skip git operations if not a git repository
+  if (!(await isGitRepository(sourceDir))) {
+    console.log(chalk.gray(`    ⏭️  Skipping git rollback (not a git repository)`));
+    return { success: true };
  }
-};

-export const createGitCheckpoint = async (
+  console.log(chalk.yellow(`    🔄 Rolling back workspace for ${reason}`));
+  try {
+    const changes = await getChangedFiles(sourceDir, 'status check for rollback');
+
+    await executeGitCommandWithRetry(
+      ['git', 'reset', '--hard', 'HEAD'],
+      sourceDir,
+      'hard reset for rollback'
+    );
+    await executeGitCommandWithRetry(
+      ['git', 'clean', '-fd'],
+      sourceDir,
+      'cleaning untracked files for rollback'
+    );
+
+    logChangeSummary(
+      changes,
+      '    ✅ Rollback completed - removed {count} contaminated changes:',
+      '    ✅ Rollback completed - no changes to remove',
+      chalk.yellow,
+      3
+    );
+    return { success: true };
+  } catch (error) {
+    const result = toErrorResult(error);
+    console.log(chalk.red(`    ❌ Rollback failed after retries: ${result.error?.message}`));
+    return result;
+  }
+}
+
+// Creates checkpoint before each attempt. First attempt preserves workspace; retries clean it.
+export async function createGitCheckpoint(
  sourceDir: string,
  description: string,
  attempt: number
-): Promise<GitOperationResult> => {
+): Promise<GitOperationResult> {
+  // Skip git operations if not a git repository
+  if (!(await isGitRepository(sourceDir))) {
+    console.log(chalk.gray(`    ⏭️  Skipping git checkpoint (not a git repository)`));
+    return { success: true };
+  }
+
  console.log(chalk.blue(`    📍 Creating checkpoint for ${description} (attempt ${attempt})`));
  try {
-    // Only clean workspace on retry attempts (attempt > 1), not on first attempts
-    // This preserves deliverables between agents while still cleaning on actual retries
+    // First attempt: preserve existing deliverables. Retries: clean workspace to prevent pollution
    if (attempt > 1) {
-      const cleanResult = await cleanWorkspace(sourceDir, `${description} (retry cleanup)`);
+      const cleanResult = await rollbackGitWorkspace(sourceDir, `${description} (retry cleanup)`);
      if (!cleanResult.success) {
-        const errMsg = cleanResult.error?.message || 'Unknown error';
        console.log(
-          chalk.yellow(`    ⚠️ Workspace cleanup failed, continuing anyway: ${errMsg}`)
+          chalk.yellow(`    ⚠️ Workspace cleanup failed, continuing anyway: ${cleanResult.error?.message}`)
        );
      }
    }

-    // Check for uncommitted changes with retry logic
-    const status = await executeGitCommandWithRetry(
-      ['git', 'status', '--porcelain'],
-      sourceDir,
-      'status check'
-    );
-    const hasChanges = status.stdout.trim().length > 0;
+    const changes = await getChangedFiles(sourceDir, 'status check');
+    const hasChanges = changes.length > 0;

-    // Stage changes with retry logic
    await executeGitCommandWithRetry(['git', 'add', '-A'], sourceDir, 'staging changes');
-
-    // Create commit with retry logic
    await executeGitCommandWithRetry(
      ['git', 'commit', '-m', `📍 Checkpoint: ${description} (attempt ${attempt})`, '--allow-empty'],
      sourceDir,
@@ -171,106 +236,64 @@ export const createGitCheckpoint = async (
    }
    return { success: true };
  } catch (error) {
-    const errMsg = error instanceof Error ? error.message : String(error);
-    console.log(chalk.yellow(`    ⚠️ Checkpoint creation failed after retries: ${errMsg}`));
-    return { success: false, error: error instanceof Error ? error : new Error(errMsg) };
+    const result = toErrorResult(error);
+    console.log(chalk.yellow(`    ⚠️ Checkpoint creation failed after retries: ${result.error?.message}`));
+    return result;
  }
-};
+}

-export const commitGitSuccess = async (
+export async function commitGitSuccess(
  sourceDir: string,
  description: string
-): Promise<GitOperationResult> => {
+): Promise<GitOperationResult> {
+  // Skip git operations if not a git repository
+  if (!(await isGitRepository(sourceDir))) {
+    console.log(chalk.gray(`    ⏭️  Skipping git commit (not a git repository)`));
+    return { success: true };
+  }
+
  console.log(chalk.green(`    💾 Committing successful results for ${description}`));
  try {
-    // Check what we're about to commit with retry logic
-    const status = await executeGitCommandWithRetry(
-      ['git', 'status', '--porcelain'],
-      sourceDir,
-      'status check for success commit'
-    );
-    const changes = status.stdout
-      .trim()
-      .split('\n')
-      .filter((line) => line.length > 0);
+    const changes = await getChangedFiles(sourceDir, 'status check for success commit');

-    // Stage changes with retry logic
    await executeGitCommandWithRetry(
      ['git', 'add', '-A'],
      sourceDir,
      'staging changes for success commit'
    );
-
-    // Create success commit with retry logic
    await executeGitCommandWithRetry(
      ['git', 'commit', '-m', `✅ ${description}: completed successfully`, '--allow-empty'],
      sourceDir,
      'creating success commit'
    );

-    if (changes.length > 0) {
-      console.log(chalk.green(`    ✅ Success commit created with ${changes.length} file changes:`));
-      changes.slice(0, 5).forEach((change) => console.log(chalk.gray(`       ${change}`)));
-      if (changes.length > 5) {
-        console.log(chalk.gray(`       ... and ${changes.length - 5} more files`));
-      }
-    } else {
-      console.log(chalk.green(`    ✅ Empty success commit created (agent made no file changes)`));
-    }
+    logChangeSummary(
+      changes,
+      '    ✅ Success commit created with {count} file changes:',
+      '    ✅ Empty success commit created (agent made no file changes)',
+      chalk.green,
+      5
+    );
    return { success: true };
  } catch (error) {
-    const errMsg = error instanceof Error ? error.message : String(error);
-    console.log(chalk.yellow(`    ⚠️ Success commit failed after retries: ${errMsg}`));
-    return { success: false, error: error instanceof Error ? error : new Error(errMsg) };
+    const result = toErrorResult(error);
+    console.log(chalk.yellow(`    ⚠️ Success commit failed after retries: ${result.error?.message}`));
+    return result;
  }
-};
+}

-export const rollbackGitWorkspace = async (
-  sourceDir: string,
-  reason: string = 'retry preparation'
-): Promise<GitOperationResult> => {
-  console.log(chalk.yellow(`    🔄 Rolling back workspace for ${reason}`));
+/**
+ * Get current git commit hash.
+ * Returns null if not a git repository.
+ */
+export async function getGitCommitHash(sourceDir: string): Promise<string | null> {
+  if (!(await isGitRepository(sourceDir))) {
+    return null;
+  }
  try {
-    // Show what we're about to remove with retry logic
-    const status = await executeGitCommandWithRetry(
-      ['git', 'status', '--porcelain'],
-      sourceDir,
-      'status check for rollback'
-    );
-    const changes = status.stdout
-      .trim()
-      .split('\n')
-      .filter((line) => line.length > 0);
-
-    // Reset to HEAD with retry logic
-    await executeGitCommandWithRetry(
-      ['git', 'reset', '--hard', 'HEAD'],
-      sourceDir,
-      'hard reset for rollback'
-    );
-
-    // Clean untracked files with retry logic
-    await executeGitCommandWithRetry(
-      ['git', 'clean', '-fd'],
-      sourceDir,
-      'cleaning untracked files for rollback'
-    );
-
-    if (changes.length > 0) {
-      console.log(
-        chalk.yellow(`    ✅ Rollback completed - removed ${changes.length} contaminated changes:`)
-      );
-      changes.slice(0, 3).forEach((change) => console.log(chalk.gray(`       ${change}`)));
-      if (changes.length > 3) {
-        console.log(chalk.gray(`       ... and ${changes.length - 3} more files`));
-      }
-    } else {
-      console.log(chalk.yellow(`    ✅ Rollback completed - no changes to remove`));
-    }
-    return { success: true };
-  } catch (error) {
-    const errMsg = error instanceof Error ? error.message : String(error);
-    console.log(chalk.red(`    ❌ Rollback failed after retries: ${errMsg}`));
-    return { success: false, error: error instanceof Error ? error : new Error(errMsg) };
+    const result = await $`cd ${sourceDir} && git rev-parse HEAD`;
+    return result.stdout.trim();
+  } catch {
+    return null;
  }
-};
+}
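The retry policy in `executeGitCommandWithRetry` above can be checked in isolation. This sketch copies `GIT_LOCK_ERROR_PATTERNS` and `isGitLockError` from the diff and adds a hypothetical helper, `backoffSchedule`, that lists the delays the loop would use. It assumes the loop waits for `delay` before the next attempt, since that line falls outside the hunks shown here.

```typescript
// Copied from the diff above.
const GIT_LOCK_ERROR_PATTERNS = [
  'index.lock',
  'unable to lock',
  'Another git process',
  'fatal: Unable to create',
  'fatal: index file',
];

function isGitLockError(errorMessage: string): boolean {
  return GIT_LOCK_ERROR_PATTERNS.some((pattern) => errorMessage.includes(pattern));
}

// Hypothetical helper (not in the PR): the delays between attempts.
// A retry only happens while attempt < maxRetries, so there are
// maxRetries - 1 delays, each double the previous one.
function backoffSchedule(maxRetries: number): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt < maxRetries; attempt++) {
    delays.push(Math.pow(2, attempt - 1) * 1000);
  }
  return delays;
}

console.log(backoffSchedule(5)); // [ 1000, 2000, 4000, 8000 ]

// A lock conflict is retried; any other failure is rethrown immediately.
console.log(isGitLockError("fatal: Unable to create '/repo/.git/index.lock': File exists.")); // true
console.log(isGitLockError('fatal: not a git repository')); // false
```

With the default `maxRetries` of 5, the worst case adds 15 seconds of waiting on top of the time spent in the five git invocations. Matching is by case-sensitive substring, so a message that only differs in capitalization from the patterns would not be classified as a lock error.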
@@ -5,7 +5,7 @@
 // as published by the Free Software Foundation.

 import chalk from 'chalk';
-import { formatDuration } from '../audit/utils.js';
+import { formatDuration } from './formatting.js';

 // Timing utilities