
Agent Harnesses ​

Coleo supports multiple harness types for different use cases:

  1. opencode-api - Headless HTTP-based harness (default for API usage)
  2. opencode-tui - Terminal UI harness with visual debugging (default when terminal specified)
  3. opencode - Legacy PTY harness

Lifecycle Policy ​

For restart resilience, API-driven arm spawning uses a daemon-first policy:

  • opencode-api and opencode are daemon-managed harnesses and should be launched through ArmAgent (coleo agent start) so they survive API restarts.
  • opencode-tui may run locally without daemon management when an operator wants a visible terminal session.
  • Local fallback for daemon-managed harnesses is available only as an explicit override for debugging/recovery scenarios.
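
The policy above can be read as a small decision function. The sketch below is illustrative only: `launchMode`, `LaunchOptions`, and both option flags are hypothetical names, not part of Coleo's code.

```typescript
type HarnessType = "opencode-api" | "opencode-tui" | "opencode";
type LaunchMode = "daemon" | "local";

interface LaunchOptions {
  visibleTerminal?: boolean;     // hypothetical: operator wants a visible session
  allowLocalFallback?: boolean;  // hypothetical: explicit debugging/recovery override
}

// Harnesses that the policy treats as daemon-managed.
const DAEMON_MANAGED: ReadonlySet<HarnessType> = new Set<HarnessType>([
  "opencode-api",
  "opencode",
]);

function launchMode(harness: HarnessType, opts: LaunchOptions = {}): LaunchMode {
  if (DAEMON_MANAGED.has(harness)) {
    // Local launch is only an explicit override, never the default.
    return opts.allowLocalFallback ? "local" : "daemon";
  }
  // opencode-tui may run locally when a visible terminal is wanted.
  return opts.visibleTerminal ? "local" : "daemon";
}
```

With no options, every harness resolves to `"daemon"`, which matches the daemon-first wording of the policy.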

Recovery Model ​

Arm recovery is now an explicit first-class flow in both the CLI and web UI:

  • POST /api/arms/:id/recover is the shared recovery endpoint for CLI and web clients.
  • Recovery always uses the API as the source of truth for runtime metadata and session identity.
  • For distributed arms, the API prefers a currently live agent on the recorded host/harness instead of blindly trusting a stale persisted agent_id.
  • If the runtime cannot be confirmed as live, recovery falls back to a restart instead of reporting a false reattach.
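
As a rough illustration, a client call to the recovery endpoint might look like the following. Only the `POST /api/arms/:id/recover` route comes from this page; `recoverArm`, `FetchLike`, and the base URL are assumptions, and the response body is returned untyped because its shape is not specified here.

```typescript
// Minimal fetch-compatible signature so the sketch does not depend on DOM typings.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string> },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// POSTs to the shared recovery endpoint and returns the parsed JSON body.
async function recoverArm(
  baseUrl: string,
  armId: string,
  fetchImpl: FetchLike,
): Promise<unknown> {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const url = `${root}/api/arms/${encodeURIComponent(armId)}/recover`;
  const res = await fetchImpl(url, {
    method: "POST",
    headers: { Accept: "application/json" },
  });
  if (!res.ok) {
    throw new Error(`Recovery request failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

In a runtime that provides a global `fetch`, it can be passed directly as `fetchImpl`; injecting it keeps the function easy to exercise with a stub.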

Runtime health is summarized from the same signals everywhere:

  • database status (idle, busy, stopped, etc.)
  • runtime metadata (pid, port, session_id, agent_id, workdir)
  • last_heartbeat
  • last_activity_at
  • last_output_at

That summary is exposed on arm list/detail responses and powers:

  • CLI arm list
  • CLI arm status
  • Web UI arm cards
  • recovery button enablement
  • watch/session routing
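
One way to picture the summary is as a pure function over the signals listed above. The field names follow that list; the `Health` labels, the `summarizeHealth` name, and the 30-second heartbeat threshold are assumptions made for this sketch.

```typescript
interface ArmRuntime {
  status: string;              // database status, e.g. "idle", "busy", "stopped"
  pid?: number;
  port?: number;
  session_id?: string;
  agent_id?: string;
  workdir?: string;
  last_heartbeat?: string;     // timestamps assumed to be ISO-8601 strings
  last_activity_at?: string;
  last_output_at?: string;
}

type Health = "live" | "stale" | "stopped";

function summarizeHealth(
  arm: ArmRuntime,
  now: Date,
  heartbeatTimeoutMs = 30000,
): Health {
  if (arm.status === "stopped") return "stopped";
  if (!arm.last_heartbeat) return "stale";
  const age = now.getTime() - new Date(arm.last_heartbeat).getTime();
  // An unparseable timestamp yields NaN, which is treated as stale.
  return !Number.isNaN(age) && age <= heartbeatTimeoutMs ? "live" : "stale";
}
```

Keeping the function pure (the clock is passed in) means the CLI, the web UI, and the recovery path could all derive the same answer from the same record.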

Current Harnesses ​

opencode-api (Headless) ​

The opencode-api harness is the default for API-based arm management. It:

  • Connects to an opencode-api server over HTTP (no PTY required)
  • Exposes MCP tools to arms
  • Avoids terminal/PTY complexity and cross-platform issues
  • Is best suited to production and automated environments

High-level responsibilities:

  • Manage arm sessions with the opencode-api service
  • Route MCP/tool calls between the Brain and the agent
  • Report arm status and activity back into SQLite and the API

opencode-tui (Visual + API) ​

The opencode-tui harness combines a visible terminal session with programmatic control:

  • Spawns OpenCode in a visible terminal window (Ghostty, iTerm2, Terminal.app, tmux)
  • Controls the agent via well-defined HTTP API endpoints
  • Enables visual debugging and observability
  • Supports real-time event streaming and health monitoring

Key features:

  • Visual feedback: See the agent's work in a real terminal window
  • Programmatic control: Full API access for automation
  • Health monitoring: Automatic detection when processes die
  • Event streaming: Real-time updates for dashboards
  • Terminal flexibility: Works with multiple terminal emulators

This harness is automatically selected when using coleo arm spawn -t <terminal>.
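
The default selection described on this page (opencode-api for API usage, opencode-tui when a terminal is specified) could be sketched as below. `selectHarness` and the explicit `harness` override are hypothetical; only the two defaults are taken from the text.

```typescript
type HarnessType = "opencode-api" | "opencode-tui" | "opencode";

interface SpawnOptions {
  terminal?: string;      // value passed to -t <terminal>, e.g. "tmux"
  harness?: HarnessType;  // hypothetical explicit override
}

function selectHarness(opts: SpawnOptions): HarnessType {
  if (opts.harness) return opts.harness;
  // A requested terminal implies the visual harness; otherwise stay headless.
  return opts.terminal ? "opencode-tui" : "opencode-api";
}
```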

opencode (Legacy) ​

The original PTY-based harness. It is still available, but opencode-tui is recommended for terminal-based workflows.


Model Selection & Fallback ​

Both opencode-api and opencode-tui harnesses include automatic model validation and fallback. This ensures arms can spawn successfully even when configured with unavailable models.

How It Works ​

When spawning an arm with a provider/model configuration, the harness:

  1. Validates the requested model against available providers
  2. Falls back automatically if the model isn't available
  3. Logs the fallback reason for debugging

Resolution Order ​

The model resolver tries these options in order:

| Step | Condition | Action |
| --- | --- | --- |
| 1 | Exact match exists | Use requested provider/model |
| 2 | Model not found in provider | Use cheapest model from same provider |
| 3 | Provider not available | Find same model from different provider |
| 4 | Both invalid | Use first available model from any connected provider |

Example Fallback Scenarios ​

# Requested model doesn't exist
Request: opencode/grok-code
Fallback: opencode/gemini-2.5-pro
Reason: Model "grok-code" not available in OpenCode Zen, using cheapest alternative

# Provider not connected
Request: anthropic/claude-sonnet-4
Fallback: opencode/claude-sonnet-4
Reason: Provider "anthropic" not available, using OpenCode Zen

# Both invalid
Request: fake-provider/fake-model
Fallback: opencode/gemini-2.5-pro
Reason: Neither provider "fake-provider" nor model "fake-model" available
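
The four resolution steps can be sketched as a single function over the providers payload. This is not the code in src/harness/model-resolver.ts: `resolveModelSketch`, `Resolved`, and the `costOf` callback are hypothetical, and step 4 simply takes the first model in list order, which may differ from how the real resolver orders candidates.

```typescript
interface Provider { id: string; name: string; models: { id: string; name: string }[] }
interface ProvidersResponse { providers: Provider[]; connected: string[] }
interface Resolved { provider: string; model: string; fallback: boolean; reason?: string }

function resolveModelSketch(
  provider: string,
  model: string,
  data: ProvidersResponse,
  costOf: (modelId: string) => number, // hypothetical pricing lookup
): Resolved {
  // Only connected providers that actually list models are candidates.
  const connected = data.providers.filter(
    (p) => data.connected.includes(p.id) && p.models.length > 0,
  );
  const own = connected.find((p) => p.id === provider);

  // Step 1: exact match.
  if (own && own.models.some((m) => m.id === model)) {
    return { provider, model, fallback: false };
  }
  // Step 2: provider available, model missing: cheapest model from the same provider.
  if (own) {
    const byCost = [...own.models].sort((a, b) => {
      const ca = costOf(a.id);
      const cb = costOf(b.id);
      return ca < cb ? -1 : ca > cb ? 1 : 0;
    });
    return {
      provider,
      model: byCost[0].id,
      fallback: true,
      reason: `Model "${model}" not available in ${own.name}, using cheapest alternative`,
    };
  }
  // Step 3: provider unavailable: same model from a different provider.
  const other = connected.find((p) => p.models.some((m) => m.id === model));
  if (other) {
    return {
      provider: other.id,
      model,
      fallback: true,
      reason: `Provider "${provider}" not available, using ${other.name}`,
    };
  }
  // Step 4: neither is valid: first available model from any connected provider.
  const first = connected[0];
  if (!first) throw new Error("No connected provider exposes any model");
  return {
    provider: first.id,
    model: first.models[0].id,
    fallback: true,
    reason: `Neither provider "${provider}" nor model "${model}" available`,
  };
}
```

Fed the sample providers payload from this page and a price lookup based on the pricing table, this sketch reproduces the first two scenarios above: `opencode/grok-code` falls back to `opencode/gemini-2.5-pro`, and `anthropic/claude-sonnet-4` falls back to `opencode/claude-sonnet-4`.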

Configuration ​

Model selection happens at arm spawn time. Configure via:

CLI:

bash
# Specify model when spawning
coleo arm spawn --model claude-sonnet-4 --provider opencode

# Or use combined format
coleo arm spawn --model opencode/claude-sonnet-4

Database default: The default model is stored in the arms table and can be changed in src/db/index.ts seed data.

Cost-Based Selection ​

When falling back to "cheapest model from same provider", the resolver uses known pricing data:

| Model | Input ($/M tokens) | Output ($/M tokens) |
| --- | --- | --- |
| gemini-2.5-flash | $0.075 | $0.30 |
| gpt-4o-mini | $0.15 | $0.60 |
| gemini-2.5-pro | $1.25 | $5.00 |
| gpt-4o | $2.50 | $10.00 |
| claude-sonnet-4 | $3.00 | $15.00 |
| claude-opus-4 | $15.00 | $75.00 |
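
A cost lookup built on this table might look like the sketch below. The prices are copied from the table; `PRICING`, `cheapestModel`, and the choice to rank by input plus output price are assumptions (for this table, ranking by input, output, or their sum gives the same order).

```typescript
// USD per million tokens, copied from the pricing table.
const PRICING: Record<string, { input: number; output: number }> = {
  "gemini-2.5-flash": { input: 0.075, output: 0.3 },
  "gpt-4o-mini": { input: 0.15, output: 0.6 },
  "gemini-2.5-pro": { input: 1.25, output: 5.0 },
  "gpt-4o": { input: 2.5, output: 10.0 },
  "claude-sonnet-4": { input: 3.0, output: 15.0 },
  "claude-opus-4": { input: 15.0, output: 75.0 },
};

// Returns the cheapest candidate with known pricing, or undefined if none has any.
function cheapestModel(candidates: string[]): string | undefined {
  let best: string | undefined;
  let bestCost = Infinity;
  for (const id of candidates) {
    const price = PRICING[id];
    if (!price) continue; // no pricing data for this model
    const cost = price.input + price.output;
    if (cost < bestCost) {
      best = id;
      bestCost = cost;
    }
  }
  return best;
}
```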

API Endpoint ​

The resolver fetches available models from the Coleo API:

GET /api/opencode/providers

Response:

json
{
  "providers": [
    {
      "id": "opencode",
      "name": "OpenCode Zen",
      "models": [
        { "id": "claude-sonnet-4", "name": "Claude Sonnet 4" },
        { "id": "gemini-2.5-pro", "name": "Gemini 2.5 Pro" }
      ]
    }
  ],
  "connected": ["opencode"]
}
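
Since this payload drives model resolution, a caller may want to validate it before use. The types below mirror the sample response; `isProvidersResponse` and `connectedModels` are hypothetical helper names for this sketch.

```typescript
interface ProviderModel { id: string; name: string }
interface ProviderInfo { id: string; name: string; models: ProviderModel[] }
interface ProvidersResponse { providers: ProviderInfo[]; connected: string[] }

const isRecord = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null;

const isModel = (v: unknown): v is ProviderModel =>
  isRecord(v) && typeof v.id === "string" && typeof v.name === "string";

const isProvider = (v: unknown): v is ProviderInfo =>
  isRecord(v) &&
  typeof v.id === "string" &&
  typeof v.name === "string" &&
  Array.isArray(v.models) &&
  v.models.every(isModel);

// Runtime check that an unknown JSON value has the documented shape.
function isProvidersResponse(v: unknown): v is ProvidersResponse {
  return (
    isRecord(v) &&
    Array.isArray(v.providers) &&
    v.providers.every(isProvider) &&
    Array.isArray(v.connected) &&
    v.connected.every((c) => typeof c === "string")
  );
}

// Lists "provider/model" pairs for connected providers only.
function connectedModels(res: ProvidersResponse): string[] {
  const out: string[] = [];
  for (const p of res.providers) {
    if (!res.connected.includes(p.id)) continue;
    for (const m of p.models) out.push(`${p.id}/${m.id}`);
  }
  return out;
}
```

The combined `provider/model` strings use the same format that `coleo arm spawn --model` accepts.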

Implementation ​

The model resolver is implemented in src/harness/model-resolver.ts with these key functions:

typescript
// Main resolution function
resolveModel(provider: string, model: string, apiUrl?: string): Promise<ResolvedModel>

// Check if a specific model is available
isModelAvailable(provider: string, model: string, apiUrl?: string): Promise<boolean>

// Get all models sorted by cost
getAvailableModelsByCost(apiUrl?: string): Promise<Array<{providerId, modelId, cost}>>

Future Harnesses (Design Notes) ​

The rest of this document describes a future harness architecture that may be implemented later if PTY and GUI automation concerns are resolved.

Coleo needs to interface with various AI coding agents, each with its own proprietary CLI/TUI. Rather than depending on specific proprietary APIs, we can treat these tools as interactive terminal applications and communicate via keystrokes and text parsing.

The Problem ​

Most AI coding agents are distributed as proprietary client applications:

| Agent | Interface | MCP Support | API Access |
| --- | --- | --- | --- |
| OpenCode | TUI (terminal) | Yes | No |
| Claude Code | TUI (terminal) | Yes | Limited |
| Codex CLI | TUI (terminal) | No | OpenAI API |
| Roo | TUI (terminal) | Yes | No |
| Kilo | TUI (terminal) | Unknown | No |
| Aider | CLI (interactive) | No | Multiple APIs |
| Gemini CLI | TUI (terminal) | No | Google API |
| Cursor | GUI (Electron) | Partial | No |

Key insight: The common denominator is the interactive terminal. Every terminal-based agent in the table (all but Cursor, which is a GUI application) can be controlled by sending keystrokes and reading terminal output.

Harness Architecture ​

A harness is an adapter that translates Coleo's commands into agent-specific interactions.

┌──────────────────────────────────────────────────────────────┐
│                    HARNESS ARCHITECTURE                      │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  Coleo Brain                                                 │
│       │                                                      │
│       │ Unified Interface                                    │
│       ▼                                                      │
│  ┌─────────────┐                                             │
│  │   Harness   │ ◄── Abstract interface                      │
│  │   Manager   │                                             │
│  └─────────────┘                                             │
│       │                                                      │
│       ├──────────────┬──────────────┬──────────────┐         │
│       ▼              ▼              ▼              ▼         │
│  ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐    │
│  │OpenCode │    │ Claude  │    │  Codex  │    │  Aider  │    │
│  │ Harness │    │ Harness │    │ Harness │    │ Harness │    │
│  └─────────┘    └─────────┘    └─────────┘    └─────────┘    │
│       │              │              │              │         │
│       ▼              ▼              ▼              ▼         │
│  ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐    │
│  │   PTY   │    │   PTY   │    │   PTY   │    │   PTY   │    │
│  │ Session │    │ Session │    │ Session │    │ Session │    │
│  └─────────┘    └─────────┘    └─────────┘    └─────────┘    │
│                                                              │
└──────────────────────────────────────────────────────────────┘

Harness Interface ​

Every harness implements this interface:

typescript
interface AgentHarness {
  // Metadata
  name: string;                    // e.g., "opencode", "claude-code"
  version: string;
  capabilities: HarnessCapabilities;
  
  // Lifecycle
  spawn(config: SpawnConfig): Promise<HarnessSession>;
  kill(session: HarnessSession): Promise<void>;
  
  // Communication
  sendPrompt(session: HarnessSession, prompt: string): Promise<void>;
  waitForResponse(session: HarnessSession, timeout?: number): Promise<string>;
  waitForIdle(session: HarnessSession, timeout?: number): Promise<void>;
  
  // State detection
  getState(session: HarnessSession): Promise<AgentState>;
  isProcessing(session: HarnessSession): Promise<boolean>;
  
  // Special actions
  interrupt(session: HarnessSession): Promise<void>;
  compact(session: HarnessSession): Promise<void>;  // If supported
  
  // MCP (if supported)
  hasMCP(): boolean;
  getMCPEndpoint?(session: HarnessSession): string;
}

interface HarnessCapabilities {
  mcp: boolean;                    // Supports MCP protocol
  streaming: boolean;              // Can stream responses
  interrupt: boolean;              // Can interrupt mid-response
  compact: boolean;                // Can compact/summarize context
  multiTurn: boolean;              // Maintains conversation context
  fileEditing: boolean;            // Can edit files directly
  commandExecution: boolean;       // Can run shell commands
}

interface SpawnConfig {
  workdir: string;
  env: Record<string, string>;
  headless: boolean;
  mcpServers?: string[];           // MCP servers to connect
}

type AgentState = 
  | "initializing"
  | "idle"                         // Waiting for input
  | "processing"                   // Thinking/generating
  | "executing"                    // Running tools/commands
  | "waiting_approval"             // Asking user for confirmation
  | "error"
  | "dead";
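
To show how a caller might consume `getState`, here is a minimal polling helper. It is a sketch, not part of the interface: `waitForState`, `StateSource`, and the poll interval are assumptions, and treating "error" and "dead" as failure states is a design choice made for this example.

```typescript
type AgentState =
  | "initializing"
  | "idle"
  | "processing"
  | "executing"
  | "waiting_approval"
  | "error"
  | "dead";

// Only the slice of the harness interface that this helper needs.
interface StateSource<S> {
  getState(session: S): Promise<AgentState>;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls getState until `target` is reached; rejects on failure states or timeout.
async function waitForState<S>(
  harness: StateSource<S>,
  session: S,
  target: AgentState,
  timeoutMs = 60000,
  pollMs = 250,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const state = await harness.getState(session);
    if (state === target) return;
    if (state === "error" || state === "dead") {
      throw new Error(`Agent entered state "${state}" while waiting for "${target}"`);
    }
    if (Date.now() >= deadline) {
      throw new Error(`Timed out waiting for "${target}", last state "${state}"`);
    }
    await sleep(pollMs);
  }
}
```

Because the helper depends only on `getState`, any harness that implements the interface above could be passed in unchanged.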

PTY Session Management ​

Each agent runs in a pseudo-terminal (PTY) for realistic terminal interaction:

typescript
import { spawn } from "node-pty";
import type { IPty } from "node-pty";

interface PTYSession {
  pty: IPty;
  buffer: string;                  // Accumulated output
  lineBuffer: string[];            // Line-by-line history
  onData: (data: string) => void;
  onExit: (code: number) => void;
}

class PTYManager {
  async spawn(command: string, args: string[], config: SpawnConfig): Promise<PTYSession> {
    const pty = spawn(command, args, {
      name: "xterm-256color",
      cols: 120,
      rows: 40,
      cwd: config.workdir,
      env: { ...process.env, ...config.env },
    });
    
    const session: PTYSession = {
      pty,
      buffer: "",
      lineBuffer: [],
      onData: () => {},
      onExit: () => {},
    };
    
    pty.onData((data) => {
      session.buffer += data;
      // Parse into lines, handle ANSI escape codes
      const lines = parseTerminalOutput(data);
      // lineBuffer holds plain strings, so keep only the stripped text
      session.lineBuffer.push(...lines.map((line) => line.text));
      session.onData(data);
    });
    
    pty.onExit(({ exitCode }) => {
      session.onExit(exitCode);
    });
    
    return session;
  }
  
  write(session: PTYSession, text: string): void {
    session.pty.write(text);
  }
  
  sendKey(session: PTYSession, key: TerminalKey): void {
    session.pty.write(KEY_SEQUENCES[key]);
  }
  
  resize(session: PTYSession, cols: number, rows: number): void {
    session.pty.resize(cols, rows);
  }
  
  kill(session: PTYSession): void {
    session.pty.kill();
  }
}

// Common terminal key sequences
const KEY_SEQUENCES = {
  ENTER: "\r",
  TAB: "\t",
  ESCAPE: "\x1b",
  CTRL_C: "\x03",
  CTRL_D: "\x04",
  CTRL_L: "\x0c",
  CTRL_Z: "\x1a",
  UP: "\x1b[A",
  DOWN: "\x1b[B",
  RIGHT: "\x1b[C",
  LEFT: "\x1b[D",
  BACKSPACE: "\x7f",
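} as const;

// Union of the key names above, used by PTYManager.sendKey
type TerminalKey = keyof typeof KEY_SEQUENCES;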

Example Harness: OpenCode ​

typescript
class OpenCodeHarness implements AgentHarness {
  name = "opencode";
  version = "1.0.0";
  capabilities = {
    mcp: true,
    streaming: true,
    interrupt: true,
    compact: true,
    multiTurn: true,
    fileEditing: true,
    commandExecution: true,
  };
  
  private ptyManager = new PTYManager();
  
  async spawn(config: SpawnConfig): Promise<HarnessSession> {
    const pty = await this.ptyManager.spawn("opencode", [], config);
    
    // Wait for initial prompt
    await this.waitForPattern(pty, /^>/m, 30000);
    
    return { id: generateId(), pty, harness: this };
  }
  
  async sendPrompt(session: HarnessSession, prompt: string): Promise<void> {
    // OpenCode uses simple text input
    this.ptyManager.write(session.pty, prompt);
    this.ptyManager.sendKey(session.pty, "ENTER");
  }
  
  async waitForResponse(session: HarnessSession, timeout = 300000): Promise<string> {
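    // Wait for the prompt to reappear, indicating response complete.
    // Caveat: the buffer already contains the initial prompt, so waitForPattern
    // should only scan output appended after startIndex, or it resolves at once.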
    const startIndex = session.pty.buffer.length;
    await this.waitForPattern(session.pty, /^>/m, timeout);
    return session.pty.buffer.slice(startIndex);
  }
  
  async waitForIdle(session: HarnessSession, timeout = 60000): Promise<void> {
    // Wait for no output for 2 seconds
    await this.waitForQuiet(session.pty, 2000, timeout);
  }
  
  async getState(session: HarnessSession): Promise<AgentState> {
    const recentOutput = session.pty.buffer.slice(-500);
    
    if (recentOutput.includes("Error:")) return "error";
    if (recentOutput.includes("[Y/n]") || recentOutput.includes("(yes/no)")) {
      return "waiting_approval";
    }
    if (recentOutput.match(/^>/m)) return "idle";
    return "processing";
  }
  
  async interrupt(session: HarnessSession): Promise<void> {
    this.ptyManager.sendKey(session.pty, "CTRL_C");
  }
  
  async compact(session: HarnessSession): Promise<void> {
    // OpenCode supports /compact command
    await this.sendPrompt(session, "/compact");
    await this.waitForIdle(session);
  }
  
  hasMCP(): boolean {
    return true;
  }
  
  getMCPEndpoint(session: HarnessSession): string {
    // OpenCode exposes MCP on a local socket
    return `unix:/tmp/opencode-${session.id}.sock`;
  }
  
  // Helper methods
  private async waitForPattern(pty: PTYSession, pattern: RegExp, timeout: number): Promise<void> {
    // Implementation
  }
  
  private async waitForQuiet(pty: PTYSession, quietMs: number, timeout: number): Promise<void> {
    // Implementation
  }
}
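
The two helper methods are left as stubs above. One possible shape, assuming only that a session exposes its accumulated `buffer` string, is a simple polling loop. The `fromIndex` and `pollMs` parameters and the `BufferedSession` type are additions for this sketch and are not part of the class as written.

```typescript
interface BufferedSession {
  buffer: string; // accumulated terminal output
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Resolves once `pattern` matches output appended after `fromIndex`.
async function waitForPattern(
  session: BufferedSession,
  pattern: RegExp,
  timeoutMs: number,
  fromIndex = 0,
  pollMs = 50,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    pattern.lastIndex = 0; // keep global/sticky regexes stateless between polls
    if (pattern.test(session.buffer.slice(fromIndex))) return;
    if (Date.now() >= deadline) throw new Error(`Timed out waiting for ${pattern}`);
    await sleep(pollMs);
  }
}

// Resolves once the buffer has not grown for `quietMs`.
async function waitForQuiet(
  session: BufferedSession,
  quietMs: number,
  timeoutMs: number,
  pollMs = 50,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  let lastLength = session.buffer.length;
  let lastChange = Date.now();
  for (;;) {
    await sleep(pollMs);
    const now = Date.now();
    if (session.buffer.length !== lastLength) {
      lastLength = session.buffer.length;
      lastChange = now;
    } else if (now - lastChange >= quietMs) {
      return;
    }
    if (now >= deadline) throw new Error("Timed out waiting for output to settle");
  }
}
```

Passing the buffer length recorded before a prompt was sent as `fromIndex` keeps an older prompt line from satisfying the match. An event-driven version built on the session's `onData` callback would avoid polling, at the cost of more bookkeeping.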

Example Harness: Claude Code ​

typescript
class ClaudeCodeHarness implements AgentHarness {
  name = "claude-code";
  version = "1.0.0";
  capabilities = {
    mcp: true,
    streaming: true,
    interrupt: true,
    compact: true,
    multiTurn: true,
    fileEditing: true,
    commandExecution: true,
  };

  private ptyManager = new PTYManager();

  async spawn(config: SpawnConfig): Promise<HarnessSession> {
    const pty = await this.ptyManager.spawn("claude", [], config);
    
    // Claude Code has a different startup sequence
    await this.waitForPattern(pty, /Claude Code/i, 10000);
    await this.waitForPattern(pty, />/m, 30000);
    
    return { id: generateId(), pty, harness: this };
  }
  
  async sendPrompt(session: HarnessSession, prompt: string): Promise<void> {
    // Claude Code may need special handling for multi-line prompts
    const lines = prompt.split("\n");
    lines.forEach((line, i) => {
      this.ptyManager.write(session.pty, line);
      // Compare by index: indexOf(line) misfires when the same line appears twice
      if (i < lines.length - 1) {
        // Shift+Enter for newline without submit
        this.ptyManager.write(session.pty, "\x1b[13;2u");
      }
    });
    this.ptyManager.sendKey(session.pty, "ENTER");
  }
  
  async getState(session: HarnessSession): Promise<AgentState> {
    const recentOutput = session.pty.buffer.slice(-500);
    
    // Claude Code specific patterns
    if (recentOutput.includes("Do you want to")) return "waiting_approval";
    if (recentOutput.includes("Thinking...")) return "processing";
    if (recentOutput.includes("Running:")) return "executing";
    if (recentOutput.match(/>\s*$/)) return "idle";
    return "processing";
  }
  
  async compact(session: HarnessSession): Promise<void> {
    // Claude Code's /compact summarizes the conversation; /clear would discard it
    await this.sendPrompt(session, "/compact");
    await this.waitForIdle(session);
  }
}

Example Harness: Aider ​

typescript
class AiderHarness implements AgentHarness {
  name = "aider";
  version = "1.0.0";
  capabilities = {
    mcp: false,                    // Aider doesn't support MCP
    streaming: true,
    interrupt: true,
    compact: false,
    multiTurn: true,
    fileEditing: true,
    commandExecution: true,
  };

  private ptyManager = new PTYManager();

  async spawn(config: SpawnConfig): Promise<HarnessSession> {
    // Aider accepts file paths as arguments; none are passed in this sketch
    const args = ["--yes-always"];  // Auto-confirm prompts
    
    const pty = await this.ptyManager.spawn("aider", args, config);
    await this.waitForPattern(pty, /aider>/i, 30000);
    
    return { id: generateId(), pty, harness: this };
  }
  
  async sendPrompt(session: HarnessSession, prompt: string): Promise<void> {
    this.ptyManager.write(session.pty, prompt);
    this.ptyManager.sendKey(session.pty, "ENTER");
  }
  
  async getState(session: HarnessSession): Promise<AgentState> {
    const recentOutput = session.pty.buffer.slice(-500);
    
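    // Match at the end of the output: an earlier "aider>" prompt or an
    // already-answered confirmation may still sit inside the 500-char window.
    if (/Commit\? \[y\/n\]\s*$/.test(recentOutput)) return "waiting_approval";
    if (/aider>\s*$/.test(recentOutput)) return "idle";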
    return "processing";
  }
  
  hasMCP(): boolean {
    return false;
  }
}

Harness Registry ​

typescript
class HarnessRegistry {
  private harnesses = new Map<string, () => AgentHarness>();
  
  register(name: string, factory: () => AgentHarness): void {
    this.harnesses.set(name, factory);
  }
  
  get(name: string): AgentHarness {
    const factory = this.harnesses.get(name);
    if (!factory) {
      throw new Error(`Unknown harness: ${name}`);
    }
    return factory();
  }
  
  list(): string[] {
    return Array.from(this.harnesses.keys());
  }
}

// Default registry
const registry = new HarnessRegistry();
registry.register("opencode", () => new OpenCodeHarness());
registry.register("claude-code", () => new ClaudeCodeHarness());
registry.register("aider", () => new AiderHarness());
// The harness classes below are placeholders; they are not defined in this document
registry.register("codex", () => new CodexHarness());
registry.register("roo", () => new RooHarness());
registry.register("kilo", () => new KiloHarness());
registry.register("gemini", () => new GeminiHarness());

Harness Test Suite ​

Every harness must pass a standard test suite to ensure compatibility:

typescript
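import { access, mkdtemp, readFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Small helpers used by the tests below
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));
const createTempDir = () => mkdtemp(join(tmpdir(), "harness-test-"));
const exists = (path: string) => access(path).then(() => true, () => false);
// testMCPConnection(endpoint) is not defined here; it depends on the MCP client in use

interface HarnessTestSuite {
  name: string;
  tests: HarnessTest[];
}

interface TestResult {
  pass: boolean;
  details: string;
  name?: string;                  // Filled in by the runner
}

interface TestReport {
  harness: string;
  passed: number;
  total: number;
  compatible: boolean;            // True when no required test failed
  results: TestResult[];
}

interface HarnessTest {
  name: string;
  run: (harness: AgentHarness) => Promise<TestResult>;
  timeout: number;
  required: boolean;              // Fail suite if this fails
}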

const STANDARD_TESTS: HarnessTest[] = [
  {
    name: "spawn_and_idle",
    required: true,
    timeout: 60000,
    run: async (harness) => {
      const session = await harness.spawn({ workdir: "/tmp/test", env: {}, headless: true });
      const state = await harness.getState(session);
      await harness.kill(session);
      return { pass: state === "idle", details: `State: ${state}` };
    },
  },
  {
    name: "simple_prompt",
    required: true,
    timeout: 120000,
    run: async (harness) => {
      const session = await harness.spawn({ workdir: "/tmp/test", env: {}, headless: true });
      await harness.sendPrompt(session, "What is 2 + 2?");
      const response = await harness.waitForResponse(session);
      await harness.kill(session);
      return { pass: response.includes("4"), details: response.slice(0, 200) };
    },
  },
  {
    name: "file_creation",
    required: true,
    timeout: 180000,
    run: async (harness) => {
      const testDir = await createTempDir();
      const session = await harness.spawn({ workdir: testDir, env: {}, headless: true });
      await harness.sendPrompt(session, "Create a file called hello.txt with the content 'Hello World'");
      await harness.waitForIdle(session);
      await harness.kill(session);
      
      const fileExists = await exists(join(testDir, "hello.txt"));
      const content = fileExists ? await readFile(join(testDir, "hello.txt"), "utf-8") : "";
      return { pass: content.includes("Hello"), details: content };
    },
  },
  {
    name: "interrupt",
    required: false,
    timeout: 60000,
    run: async (harness) => {
      if (!harness.capabilities.interrupt) {
        return { pass: true, details: "Skipped: not supported" };
      }
      const session = await harness.spawn({ workdir: "/tmp/test", env: {}, headless: true });
      await harness.sendPrompt(session, "Count from 1 to 1000000 slowly");
      await sleep(2000);
      await harness.interrupt(session);
      const state = await harness.getState(session);
      await harness.kill(session);
      return { pass: state === "idle", details: `State after interrupt: ${state}` };
    },
  },
  {
    name: "state_detection",
    required: true,
    timeout: 120000,
    run: async (harness) => {
      const session = await harness.spawn({ workdir: "/tmp/test", env: {}, headless: true });
      
      // Should be idle initially
      let state = await harness.getState(session);
      if (state !== "idle") {
        return { pass: false, details: `Expected idle, got ${state}` };
      }
      
      // Should be processing after prompt
      await harness.sendPrompt(session, "Write a haiku about programming");
      await sleep(500);
      state = await harness.getState(session);
      if (state !== "processing" && state !== "executing") {
        return { pass: false, details: `Expected processing, got ${state}` };
      }
      
      await harness.waitForIdle(session);
      state = await harness.getState(session);
      await harness.kill(session);
      return { pass: state === "idle", details: `Final state: ${state}` };
    },
  },
  {
    name: "mcp_connection",
    required: false,
    timeout: 60000,
    run: async (harness) => {
      if (!harness.hasMCP()) {
        return { pass: true, details: "Skipped: MCP not supported" };
      }
      const session = await harness.spawn({ 
        workdir: "/tmp/test", 
        env: {}, 
        headless: true,
        mcpServers: ["test-server"],
      });
      const endpoint = harness.getMCPEndpoint!(session);
      // Try to connect to MCP endpoint
      const connected = await testMCPConnection(endpoint);
      await harness.kill(session);
      return { pass: connected, details: `Endpoint: ${endpoint}` };
    },
  },
];

// Run tests for a harness
async function testHarness(harnessName: string): Promise<TestReport> {
  const harness = registry.get(harnessName);
  const results: TestResult[] = [];
  
  for (const test of STANDARD_TESTS) {
    console.log(`Running ${test.name}...`);
    try {
      const result = await Promise.race([
        test.run(harness),
        sleep(test.timeout).then(() => ({ pass: false, details: "Timeout" })),
      ]);
      results.push({ ...result, name: test.name });
    } catch (error) {
      results.push({ pass: false, name: test.name, details: String(error) });
    }
  }
  
  const passed = results.filter(r => r.pass).length;
  const failed = results.filter(r => !r.pass && STANDARD_TESTS.find(t => t.name === r.name)?.required);
  
  return {
    harness: harnessName,
    passed,
    total: results.length,
    compatible: failed.length === 0,
    results,
  };
}
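
The runner above races each test against `sleep(test.timeout)`, which leaves the timer pending after a test finishes early and can keep the process alive until it fires. A hedged alternative is a small wrapper that always clears its timer; `withTimeout` is a hypothetical helper name, not part of the suite as written.

```typescript
interface TestResult {
  pass: boolean;
  details: string;
}

// Races `work` against a timer and clears the timer once either side settles.
async function withTimeout<T>(
  work: Promise<T>,
  ms: number,
  onTimeout: () => T,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(onTimeout()), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Usage inside a runner loop (sketch):
//   const result = await withTimeout(test.run(harness), test.timeout,
//     () => ({ pass: false, details: "Timeout" }));
```

Note that a timed-out test still leaves its spawned session running; a fuller runner would also kill the session when the timeout wins.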

Planned Harness Support (Future) ​

| Agent | Priority | Status | Notes |
| --- | --- | --- | --- |
| opencode-api | High | Implemented | HTTP-based, reliable harness used in production |
| Other PTY/GUI agents | Low | Future | May be revisited if PTY/GUI automation is stable |

Terminal Output Parsing ​

Handling ANSI escape codes and terminal control sequences:

typescript
// Strip ANSI escape codes for text analysis
function stripAnsi(text: string): string {
  return text.replace(/\x1b\[[0-9;]*[a-zA-Z]/g, "");
}

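interface TerminalLine {
  text: string;                    // Line with ANSI codes removed
  raw: string;                     // Line as received
  timestamp: Date;
}

// Parse terminal output into structured lines
function parseTerminalOutput(raw: string): TerminalLine[] {
  // Split once and strip per line so raw and text stay aligned by index
  const timestamp = new Date();
  return raw.split(/\r?\n/).map((rawLine) => ({
    text: stripAnsi(rawLine),
    raw: rawLine,
    timestamp,
  }));
}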

// Detect common UI patterns
interface UIPatterns {
  prompt: RegExp;                  // Input prompt
  thinking: RegExp;                // Processing indicator
  approval: RegExp;                // Confirmation request
  error: RegExp;                   // Error message
  success: RegExp;                 // Success message
}

const OPENCODE_PATTERNS: UIPatterns = {
  prompt: /^>\s*$/m,
  thinking: /thinking|processing/i,
  approval: /\[Y\/n\]|\(yes\/no\)/i,
  error: /^Error:|^Failed:/m,
  success: /^Done|^Completed/m,
};
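
The `stripAnsi` regex above covers colour codes and other simple CSI sequences, but not private-mode sequences such as `ESC[?25l` or OSC sequences such as window-title updates. The sketch below widens the stripping and shows how a `UIPatterns` set could drive state detection. `stripAnsiFull`, `classifyOutput`, and the check order (error, then approval, then prompt, mirroring `getState` in the OpenCode example) are assumptions for illustration.

```typescript
type OutputState = "idle" | "processing" | "waiting_approval" | "error";

interface UIPatterns {
  prompt: RegExp;
  thinking: RegExp;
  approval: RegExp;
  error: RegExp;
  success: RegExp;
}

const OPENCODE_PATTERNS: UIPatterns = {
  prompt: /^>\s*$/m,
  thinking: /thinking|processing/i,
  approval: /\[Y\/n\]|\(yes\/no\)/i,
  error: /^Error:|^Failed:/m,
  success: /^Done|^Completed/m,
};

// OSC: ESC ] ... terminated by BEL or ESC \
const OSC = /\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)/g;
// CSI: ESC [ parameter bytes (0x30-0x3F), intermediate bytes (0x20-0x2F), final byte (0x40-0x7E)
const CSI = /\x1b\[[0-?]*[ -\/]*[@-~]/g;

function stripAnsiFull(text: string): string {
  return text.replace(OSC, "").replace(CSI, "");
}

function classifyOutput(raw: string, patterns: UIPatterns): OutputState {
  // Normalize CR LF and bare CR to LF so the ^ and $ anchors see clean lines.
  const text = stripAnsiFull(raw).replace(/\r\n?/g, "\n");
  if (patterns.error.test(text)) return "error";
  if (patterns.approval.test(text)) return "waiting_approval";
  if (patterns.prompt.test(text)) return "idle";
  return "processing";
}
```

Stripping escape sequences before matching is what lets anchored patterns such as `/^>\s*$/m` work on a coloured prompt.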

Future: Visual Harnesses ​

For GUI-based agents like Cursor, consider:

typescript
interface VisualHarness extends AgentHarness {
  // Additional methods for GUI automation
  click(selector: string): Promise<void>;
  type(text: string): Promise<void>;
  screenshot(): Promise<Buffer>;
  findElement(selector: string): Promise<Element | null>;
}

// Could use Playwright or similar for automation
// Lower priority - focus on terminal-based agents first