Components

The Coleo system consists of five major components that work together.

Brain (Coordinator)

The Brain is the central nervous system of Coleo.

┌─────────────────────────────────────────────────────────────┐
│                         BRAIN                                │
├─────────────────────────────────────────────────────────────┤
│  Responsibilities:                                           │
│  ├── Arm Lifecycle      - Spawn, monitor, kill arms         │
│  ├── Conflict Resolution - Mediate ownership disputes        │
│  ├── Governance         - Process proposals, enforce rules   │
│  ├── Human Interface    - Route approvals, notifications     │
│  ├── State Management   - Persist system state               │
│  └── Misbehavior Detection - Identify and stop bad actors   │
├─────────────────────────────────────────────────────────────┤
│  Powers:                                                     │
│  ├── PAUSE arm          - Temporarily halt an arm            │
│  ├── KILL arm           - Terminate destructive arm          │
│  ├── VETO proposal      - Override arm consensus             │
│  └── ESCALATE to human  - Require human decision             │
└─────────────────────────────────────────────────────────────┘

Misbehavior Detection

The Brain monitors arms for problematic behavior:

Behavior	Detection	Response
Touching files outside task scope/claims	Pattern matching on file paths vs. claims	WARN, then PAUSE
Ignoring consensus without override	Proposal tracking	WARN, reputation penalty
Destructive changes	Pattern matching (rm -rf, DROP, etc.)	KILL immediately
Resource exhaustion	Token/API call counters	PAUSE, notify human
Stuck in loop	Action repetition detection	PAUSE with exponential backoff
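
As an illustration of the "Destructive changes" row, pattern matching on a command string could look roughly like the sketch below. The pattern list, the `DestructivePattern` type, and the `findDestructivePattern` helper are all hypothetical; the document does not give the Brain's actual pattern set, and a real detector would need to cover many more cases.

```typescript
// Illustrative patterns only; the real detector's pattern list is not given in this document.
interface DestructivePattern {
  name: string;
  pattern: RegExp;
}

const DESTRUCTIVE_PATTERNS: DestructivePattern[] = [
  // rm with both -r and -f in a single flag group, in either order (e.g. -rf, -fr, -Rf)
  {
    name: "recursive force delete",
    pattern: /\brm\s+-(?:[a-z]*r[a-z]*f|[a-z]*f[a-z]*r)[a-z]*\b/i,
  },
  // SQL statements that drop whole objects
  { name: "SQL DROP", pattern: /\bDROP\s+(?:TABLE|DATABASE|SCHEMA)\b/i },
];

// Returns the name of the first matching pattern, or null if nothing matches.
function findDestructivePattern(command: string): string | null {
  const hit = DESTRUCTIVE_PATTERNS.find((p) => p.pattern.test(command));
  return hit ? hit.name : null;
}
```

A non-null result would correspond to the "KILL immediately" response in the table.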

Loop Detection & Backoff Throttling (Design)

When an arm gets stuck in a loop (repeating the same actions, hitting the same errors, or consuming tokens without progress), the brain intervenes with an escalating backoff strategy:

typescript

interface LoopDetection {
  armId: string;
  detectedAt: Date;
  loopType: "action_repeat" | "error_loop" | "token_burn" | "thrashing";
  consecutiveLoops: number;      // How many times we've caught this arm looping
  backoffMinutes: number;        // Current pause duration
}

const BACKOFF_SCHEDULE = [
  1,    // First loop: 1 minute pause
  5,    // Second: 5 minutes
  15,   // Third: 15 minutes
  30,   // Fourth: 30 minutes
  60,   // Fifth+: 1 hour
];
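
A minimal sketch of how `consecutiveLoops` might index into the schedule. The helper name `backoffMinutesFor` is hypothetical, and clamping to the final entry is an assumption based on the "Fifth+" comment above.

```typescript
// Mirrors the schedule above (pause duration in minutes per detection).
const BACKOFF_SCHEDULE_MINUTES: readonly number[] = [1, 5, 15, 30, 60];

// Hypothetical helper: consecutiveLoops is 1-based (1 = first detected loop).
// Values past the end of the schedule clamp to the final entry ("Fifth+").
function backoffMinutesFor(consecutiveLoops: number): number {
  if (!Number.isInteger(consecutiveLoops) || consecutiveLoops < 1) {
    throw new RangeError("consecutiveLoops must be a positive integer");
  }
  const index = Math.min(consecutiveLoops, BACKOFF_SCHEDULE_MINUTES.length) - 1;
  return BACKOFF_SCHEDULE_MINUTES[index];
}
```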

Loop Response Protocol:

Detect: Brain notices arm repeating actions or burning tokens without progress
Pause: Arm is immediately paused (cannot consume more tokens)
Instruct: Brain sends message: "You appear stuck. Compact your session and reassess."
Wait: Arm remains paused for backoff duration
Resume: After backoff, arm is resumed with instruction to compact context and retry
Check Relevance: If original task is no longer relevant, arm is reassigned

typescript

interface LoopRecoveryMessage {
  type: "loop_recovery";
  armId: string;
  instruction: string;
  actions: ("compact_session" | "reassess_task" | "request_help")[];
  originalTaskStillRelevant: boolean;
  pauseDuration: number;
}
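
For illustration, a recovery message for a paused arm might be assembled as follows. The `buildLoopRecoveryMessage` helper is hypothetical, as are two choices it makes: that `pauseDuration` is expressed in minutes, and that `request_help` replaces `reassess_task` when the original task is no longer relevant. The document specifies neither.

```typescript
interface LoopRecoveryMessage {
  type: "loop_recovery";
  armId: string;
  instruction: string;
  actions: ("compact_session" | "reassess_task" | "request_help")[];
  originalTaskStillRelevant: boolean;
  pauseDuration: number; // assumed to be minutes in this sketch
}

// Hypothetical builder for the message sent in the "Instruct" step.
function buildLoopRecoveryMessage(
  armId: string,
  pauseMinutes: number,
  originalTaskStillRelevant: boolean
): LoopRecoveryMessage {
  return {
    type: "loop_recovery",
    armId,
    instruction: "You appear stuck. Compact your session and reassess.",
    // Assumption: an arm whose task is stale asks for help instead of reassessing it.
    actions: originalTaskStillRelevant
      ? ["compact_session", "reassess_task"]
      : ["compact_session", "request_help"],
    originalTaskStillRelevant,
    pauseDuration: pauseMinutes,
  };
}
```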

Token Budget Protection (Design):

typescript

interface TokenThrottle {
  windowMinutes: number;         // Rolling window (default: 10)
  maxTokensPerWindow: number;    // Hard cap (default: 50000)
  warningThreshold: number;      // Warn at 70%
}
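
One way the rolling-window check could work is sketched below. The `TokenUsage` record, the `ThrottleVerdict` type, and the `checkTokenBudget` function are hypothetical, and `warningThreshold` is assumed to be a fraction of the cap (0.7 for 70%), which the interface above does not state explicitly.

```typescript
interface TokenThrottle {
  windowMinutes: number;
  maxTokensPerWindow: number;
  warningThreshold: number; // assumed to be a fraction of the cap, e.g. 0.7
}

// Hypothetical usage record: tokens consumed at a point in time (epoch ms).
interface TokenUsage {
  at: number;
  tokens: number;
}

type ThrottleVerdict = "ok" | "warn" | "pause";

// Sums usage inside the rolling window and compares it against the cap.
function checkTokenBudget(
  usage: TokenUsage[],
  cfg: TokenThrottle,
  now: number
): ThrottleVerdict {
  const windowStart = now - cfg.windowMinutes * 60_000;
  const used = usage
    .filter((u) => u.at > windowStart && u.at <= now)
    .reduce((sum, u) => sum + u.tokens, 0);

  if (used >= cfg.maxTokensPerWindow) return "pause";
  if (used >= cfg.maxTokensPerWindow * cfg.warningThreshold) return "warn";
  return "ok";
}
```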

Brain State

typescript

interface BrainState {
  status: "running" | "stopped" | "paused";
  startedAt: Date;
  lastPollAt: Date;
  pollIntervalMs: number;
  activeArms: string[];
  pendingProposals: number;
  pendingApprovals: number;
}

Brain Logic

The brain runs a polling cycle every 30 seconds that orchestrates arm lifecycle and task assignment. It includes event-window based arm health monitoring and automatic intervention capabilities.

Poll Cycle

mermaid

flowchart TD
    classDef process fill:#e1f5fe,stroke:#01579b,stroke-width:2px
    classDef decision fill:#fff3e0,stroke:#e65100,stroke-width:2px,shape:rhombus

    subgraph POLL["Poll Cycle - Runs every 30s"]
        A[Start Poll] --> B[scanForRunningArms]
        B --> C{API Server Available?}
        C -->|Yes| D[Get Idle Arms]
        C -->|No| E[Skip API Arms]
        D --> F[promptIdleArms]
        F --> G[checkIdleArmStuckLoops]
        G --> H[End Poll]
    end

    class A,B,D,F,G,H process
    class C decision

Event-Window Based Health Monitoring

The brain continuously monitors arm health using an event-window based system that analyzes recent activity patterns to detect issues before they become critical.

Event Window Analysis

The health monitoring system fetches event windows for each arm from JetStream, grouping events by type and analyzing patterns to classify arm states:

typescript

// Possible activity states for an arm, as classified from its event window.
type ArmActivityState =
  | "productive"          // actively doing useful work
  | "idle"                // waiting for work
  | "waiting_permission"  // blocked on a permission request
  | "looping"             // stuck in a repetitive pattern
  | "silent"              // no events for an extended period
  | "error"               // encountered an error state
  | "starting";           // in startup grace period

Health Monitoring Components

BrainEventWindow: Centralized JetStream event window fetcher that retrieves event slices per arm
ArmActivityAnalyzer: Classifies arm states based on event windows using pattern recognition
ArmHealthMonitor: Coordinates health checks and automatic interventions

Automatic Intervention Capabilities

When issues are detected, the health monitoring system can automatically intervene:

Prompting: Send messages to arms that appear stuck or idle
Interrupting: Send /compact commands to arms stuck in loops
Killing: Terminate arms that are consistently problematic
Escalating: Notify humans about permission requests that time out
Recovering: Restart arms that crash or become unresponsive

Configuration Options

The health monitoring system exposes the following configuration options:

typescript

interface HealthMonitorConfig {
  checkIntervalMs: number;      // How often to run health checks
  eventWindowMs: number;        // Size of event window to analyze
  autoInterventionEnabled: boolean; // Whether automatic interventions are enabled
  silentThresholdMs: number;    // Time before arm considered silent
  loopRepetitionThreshold: number; // Repetitions before considering looping
  permissionEscalationMs: number; // Time before escalating permission requests
}
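
To show how `silentThresholdMs` and `loopRepetitionThreshold` might feed into classification, here is a simplified sketch. The `ArmEvent` shape, the `classifyWindow` function, and the specific rules (including treating a trailing run of identical event types as a loop) are assumptions for illustration, not the actual ArmActivityAnalyzer logic. The event type names come from the JetStream event list in this document. The "starting" state is omitted because it depends on spawn time rather than on events.

```typescript
type ArmActivityState =
  | "productive" | "idle" | "waiting_permission"
  | "looping" | "silent" | "error" | "starting";

// Hypothetical event shape: a type name plus a timestamp (epoch ms).
interface ArmEvent {
  type: string;
  at: number;
}

interface ClassifierConfig {
  silentThresholdMs: number;
  loopRepetitionThreshold: number;
}

function classifyWindow(
  events: ArmEvent[],
  now: number,
  cfg: ClassifierConfig
): ArmActivityState {
  if (events.length === 0) return "silent";

  const sorted = [...events].sort((a, b) => a.at - b.at);
  const last = sorted[sorted.length - 1];

  if (now - last.at >= cfg.silentThresholdMs) return "silent";
  if (last.type === "session.error") return "error";
  if (last.type === "permission.asked") return "waiting_permission";
  if (last.type === "session.idle") return "idle";

  // Count how many times the most recent event type repeats back-to-back.
  let run = 1;
  for (let i = sorted.length - 2; i >= 0 && sorted[i].type === last.type; i--) {
    run++;
  }
  if (run >= cfg.loopRepetitionThreshold) return "looping";

  return "productive";
}
```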

File Reading During Poll

The brain reads markdown files during its poll cycle to sync tasks from human-editable sources:

Poll Cycle File Reading:
├── Step 8: syncPlanTasks()
│   ├── Reads .project/plan.md
│   ├── Extracts tasks from `- [ ]` checkbox items
│   └── Creates/updates tasks in SQLite
│
├── Step 8a: processInbox()
│   ├── Reads .project/inbox.md
│   ├── Parses ## headers and - [ ] items as new tasks
│   ├── Deduplicates against existing tasks (by title similarity)
│   └── Clears inbox after processing
│
├── Step 8b: checkDocUpdateTrigger()
│   └── Checks if documentation needs updating
│
└── Step 8c: reEvaluatePlanProgress()
    └── Creates verification tasks for issues
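
The checkbox extraction in Step 8 could be approximated as shown here. This is a sketch only: the `PlanTask` shape and the `extractPlanTasks` function are hypothetical, and returning checked items with a `done` flag is an assumption, since the document only mentions `- [ ]` items.

```typescript
// Hypothetical result shape for a parsed checkbox line.
interface PlanTask {
  title: string;
  done: boolean;
  line: number; // 1-based line number in the markdown source
}

// Extracts markdown checkbox items such as "- [ ] Write docs" or "- [x] Ship v1".
function extractPlanTasks(markdown: string): PlanTask[] {
  const checkbox = /^\s*[-*]\s+\[( |x|X)\]\s+(.+?)\s*$/;
  const tasks: PlanTask[] = [];

  markdown.split(/\r?\n/).forEach((text, index) => {
    const match = checkbox.exec(text);
    if (match) {
      tasks.push({
        title: match[2],
        done: match[1] !== " ",
        line: index + 1,
      });
    }
  });

  return tasks;
}
```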

Files the brain reads:

.project/plan.md - Main plan with phases and deliverables
.project/inbox.md - Quick task input (cleared after processing)
**/*.plan.md - Any file ending in .plan.md
**/plans/*.md - Files in plans/ directories

Arm State Machine

Arms follow a formal state machine with 7 states:

mermaid

flowchart LR
    classDef state fill:#f3e5f5,stroke:#4a148c,stroke-width:2px

    subgraph LIFECYCLE["Arm State Machine"]
        direction LR
        disconnected["disconnected<br/>(not tracked)"] -->|scan finds process| idle["idle<br/>(ready for work)"]
        idle -->|brain assigns task| task_assigned["task_assigned<br/>(task assigned, awaiting ack)"]
        task_assigned -->|arm acknowledges| working["working<br/>(has task, processing)"]
        working -->|complete_task| idle
        idle -->|no activity after prompt| stuck["stuck detection"]
        stuck -->|productive activity detected| working
        stuck -->|confirmed stuck| intervention["intervention"]
        intervention -->|recover| idle
        working -->|process dies| stopped["stopped"]
        idle -->|process dies| stopped
        task_assigned -->|process dies| stopped
        stopped -->|restart| disconnected
    end

    class disconnected,idle,task_assigned,working,stuck,intervention,stopped state

State Transition Truth Table

From State	Event	To State	Notes
spawning	PROCESS_STARTED	starting	Process is running
starting	HARNESS_CONNECTED	idle	Harness ready
idle	TASK_ASSIGNED	task_assigned	Brain assigns task
task_assigned	TASK_ACKNOWLEDGED	working	Arm accepts task
task_assigned	TIMEOUT (3min)	idle	Arm didn't ack, release task
working	TASK_COMPLETED	idle	Task done
working	TASK_FAILED	idle	Task failed
idle/working	CONNECTION_LOST	disconnected	Network issue
disconnected	CONNECTION_RESTORED	previous	Reconnected
any	STOP	stopped	Intentional stop
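
The truth table maps naturally onto a lookup table. The sketch below encodes the rows as written; the `ArmSnapshot` type, the `transition` function, and the choice to throw on an unlisted transition are assumptions for illustration.

```typescript
type ArmState =
  | "spawning" | "starting" | "idle" | "task_assigned"
  | "working" | "disconnected" | "stopped";

type ArmEventName =
  | "PROCESS_STARTED" | "HARNESS_CONNECTED" | "TASK_ASSIGNED"
  | "TASK_ACKNOWLEDGED" | "TIMEOUT" | "TASK_COMPLETED" | "TASK_FAILED"
  | "CONNECTION_LOST" | "CONNECTION_RESTORED" | "STOP";

// Hypothetical snapshot: `previous` remembers where to return after a reconnect.
interface ArmSnapshot {
  state: ArmState;
  previous?: ArmState;
}

const TRANSITIONS: Partial<Record<ArmState, Partial<Record<ArmEventName, ArmState>>>> = {
  spawning: { PROCESS_STARTED: "starting" },
  starting: { HARNESS_CONNECTED: "idle" },
  idle: { TASK_ASSIGNED: "task_assigned", CONNECTION_LOST: "disconnected" },
  task_assigned: { TASK_ACKNOWLEDGED: "working", TIMEOUT: "idle" },
  working: { TASK_COMPLETED: "idle", TASK_FAILED: "idle", CONNECTION_LOST: "disconnected" },
};

function transition(current: ArmSnapshot, event: ArmEventName): ArmSnapshot {
  // "any + STOP -> stopped"
  if (event === "STOP") return { state: "stopped" };

  // "disconnected + CONNECTION_RESTORED -> previous"
  if (current.state === "disconnected" && event === "CONNECTION_RESTORED") {
    if (!current.previous) throw new Error("No previous state recorded");
    return { state: current.previous };
  }

  const next = TRANSITIONS[current.state]?.[event];
  if (!next) {
    throw new Error(`Invalid transition: ${current.state} + ${event}`);
  }
  return next === "disconnected"
    ? { state: next, previous: current.state }
    : { state: next };
}
```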

Task Reordering and Management

Tasks include a sort_order field that allows manual reordering. Set toSortOrder to 0 to move a task to the top, or to -1 to move it to the bottom:

http

POST /api/tasks/reorder
{
  "taskId": "task-123",
  "toSortOrder": 0
}

Tasks can also be removed directly from plan.md files:

http

POST /api/tasks/:id/remove-from-plan

Lines in plan.md are linked to tasks via plan_line_uid; the endpoint removes both the linked line and the database entry.

Task Assignment Flow

mermaid

sequenceDiagram
    participant Arm
    participant Brain
    participant DB as SQLite Database
    participant State as State Machine

    Note over Arm,State: Arm is IDLE and waiting

    Arm->>Brain: heartbeat (status: idle)
    Brain->>Arm: "You have tasks, call get_full_briefing"

    Note over Arm: Arm calls get_full_briefing

    Arm->>Brain: get_full_briefing()
    Brain->>Arm: Task context bundle

    Note over Arm: Arm decides to claim task
    Arm->>Brain: claim_task(task-id)

    Brain->>State: TASK_ASSIGNED event
    State->>State: idle → task_assigned
    Brain->>DB: UPDATE tasks SET status='claimed', assigned_to=arm
    Brain->>DB: UPDATE arms SET status='busy'

    Brain->>Arm: task_assignment message (via queue)
    Arm->>Brain: acknowledge_task(task-id)

    Brain->>State: TASK_ACKNOWLEDGED event
    State->>State: task_assigned → working
    Brain->>DB: UPDATE tasks SET status='in_progress'

    Note over Arm,State: Arm is now WORKING on the task

Key insight: The brain assigns tasks to arms in idle state, transitioning them to task_assigned. The arm must then explicitly acknowledge to transition to working. This two-step process ensures both brain and arm agree on task ownership.

Grace Period for Autonomous Arms

When the brain starts up and finds arms that were working autonomously (before the brain was running), it protects them from being interrupted:

mermaid

sequenceDiagram
    participant Brain
    participant DB as SQLite Database
    participant Arm

    Note over Brain,DB: Brain starts up - finds arm working autonomously

    Brain->>DB: scanForRunningArms()
    DB->>Brain: arm "rand" exists, status=busy
    Brain->>Arm: Check if process alive (kill(pid, 0))
    Arm-->>Brain: Process alive!
    Brain->>DB: UPDATE arms SET status='idle'
    Brain->>DB: armDetectionTimes.set("rand", now)

    Note over Brain: Grace period - no prompting for configured time

    Brain->>DB: checkIdleArmStuckLoops()
    Brain->>DB: Query productive activity in last 60 min
    DB-->>Brain: Found: complete_task at 14:30
    Brain->>Brain: lastProductiveAt = 14:30<br/>skip stuck detection<br/>arm was working autonomously!

    Note over Brain,Arm: Arm continues working without interruption

Grace period behavior:

Arms detected during scanForRunningArms() are not prompted for a configurable grace period
Default: 5 minutes
Configurable via brain.arm_grace_period_minutes in config.toml
The brain also checks for recent productive activity (heartbeat, claim_task, complete_task) before marking an arm as stuck
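
The two checks described above can be combined into a single guard, sketched here. The `ArmTiming` shape and the `shouldRunStuckDetection` function are hypothetical. The 5-minute default comes from the list above and the 60-minute lookback from the sequence diagram.

```typescript
// Hypothetical per-arm timing data (all timestamps in epoch ms).
interface ArmTiming {
  detectedAt: number;        // when scanForRunningArms() first saw the arm
  lastProductiveAt?: number; // most recent productive action, if any
}

// Returns true only when the arm is past its grace period AND has shown
// no productive activity within the lookback window.
function shouldRunStuckDetection(
  arm: ArmTiming,
  now: number,
  graceMinutes: number = 5,
  productiveLookbackMinutes: number = 60
): boolean {
  const inGracePeriod = now - arm.detectedAt < graceMinutes * 60_000;
  const recentlyProductive =
    arm.lastProductiveAt !== undefined &&
    now - arm.lastProductiveAt < productiveLookbackMinutes * 60_000;

  return !inGracePeriod && !recentlyProductive;
}
```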

Stuck Loop Detection

The brain detects when arms repeatedly respond with "idle" without productive work:

Idle arm prompting: Brain prompts idle arms to check for work
Activity tracking: Recent activity is analyzed for prompt-response patterns
Productive actions: heartbeat, claim_task, acknowledge_task, complete_task, file_changed, tool_call
Escalation: If arm doesn't respond productively:
- Level 0: Interrupt + different prompt
- Level 1: Force context compaction
- Level 2: Kill and respawn arm
- Level 3: Notify human via email
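
The escalation levels can be modelled as an ordered ladder. In this sketch the action names, `actionForLevel`, and `nextLevel` are hypothetical, and resetting to Level 0 after a productive response is an assumption the document does not confirm.

```typescript
type EscalationAction =
  | "interrupt_and_reprompt"
  | "force_compaction"
  | "kill_and_respawn"
  | "notify_human";

// Index in this array corresponds to the escalation level listed above.
const ESCALATION_LADDER: readonly EscalationAction[] = [
  "interrupt_and_reprompt", // Level 0
  "force_compaction",       // Level 1
  "kill_and_respawn",       // Level 2
  "notify_human",           // Level 3
];

// Maps a level to its action, clamping out-of-range levels to the ladder.
function actionForLevel(level: number): EscalationAction {
  const clamped = Math.max(
    0,
    Math.min(Math.trunc(level), ESCALATION_LADDER.length - 1)
  );
  return ESCALATION_LADDER[clamped];
}

// Assumption: a productive response resets escalation; otherwise climb one level.
function nextLevel(level: number, respondedProductively: boolean): number {
  if (respondedProductively) return 0;
  return Math.min(level + 1, ESCALATION_LADDER.length - 1);
}
```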

Arms (General-Purpose Agents)

Each arm is a semi-autonomous general-purpose AI agent. Its behavior is determined by the task classification it is executing (architect, development, QA, documentation, etc.), not by a permanently assigned domain.

Arm Profile

typescript

interface ArmProfile {
  id: string;
  name: string;
  agent: "opencode-api" | "custom";

  // Task execution
  supportedClassifications: string[]; // e.g., ["architect", "development", "qa"]

  // Context Management
  contextBudget: ContextBudget;
  currentContext: ContextSnapshot;

  // Ownership
  claims: FileClaim[];         // Files/dirs this arm is tending

  // Governance
  reputation: number;          // 0-100, affects persuasion weight
  activeProposals: Proposal[];

  // State
  status: "idle" | "working" | "blocked" | "proposing" | "paused" | "dead";
  currentTask?: Task;
}

Arms can be used for different task classifications over time. The same arm might run an architect task for one assignment, then a development or QA task for the next.

MCP Server Catalog

Arms access the Garden through MCP servers. Each server provides specialized capabilities:

MCP Server	Purpose	Key Tools
`git-mcp`	Version control	`commit`, `push`, `branch`, `diff`, `log`
`env-mcp`	Environment variables	`get`, `set`, `list` (filtered for secrets)
`docs-mcp`	Library documentation	`search`, `fetch`, `summarize`
`devtools-mcp`	Browser automation	`screenshot`, `console`, `network`, `lighthouse`
`deploy-mcp`	Deployment operations	`request`, `status`, `rollback`, `logs`
`db-mcp`	Database operations	`query`, `migrate`, `seed`, `backup`
`pkg-mcp`	Package management	`install`, `update`, `audit`, `outdated`
`observability-mcp`	Logs & metrics	`logs.search`, `metrics.query`, `traces.find`
`alerts-mcp`	Alert management	`list`, `ack`, `silence`, `escalate`

Observability MCP Server

For production operations, arms need visibility into running systems:

typescript

interface ObservabilityMCP {
  // Log operations
  "logs.search": (params: {
    service: string;
    environment: string;
    query: string;
    timeRange: { start: Date; end: Date };
    limit?: number;
  }) => LogEntry[];
  
  "logs.tail": (params: {
    service: string;
    environment: string;
    follow: boolean;
  }) => AsyncIterable<LogEntry>;
  
  // Metrics
  "metrics.query": (params: {
    query: string;              // PromQL or similar
    timeRange: { start: Date; end: Date };
    step?: string;             // e.g., "1m", "5m"
  }) => MetricSeries[];
  
  "metrics.dashboard": (params: {
    name: string;
  }) => DashboardSnapshot;
  
  // Traces (distributed tracing)
  "traces.find": (params: {
    service?: string;
    traceId?: string;
    minDuration?: number;
    error?: boolean;
    limit?: number;
  }) => Trace[];
  
  // Health
  "health.check": (params: {
    service: string;
    environment: string;
  }) => HealthStatus;
}

interface LogEntry {
  timestamp: Date;
  level: "debug" | "info" | "warn" | "error";
  service: string;
  message: string;
  metadata: Record<string, unknown>;
}

interface HealthStatus {
  status: "healthy" | "degraded" | "unhealthy";
  checks: { name: string; status: string; message?: string }[];
  lastCheck: Date;
}

Alerts MCP Server

For incident response and on-call operations:

typescript

interface AlertsMCP {
  "alerts.list": (params: {
    status?: "firing" | "resolved" | "silenced";
    severity?: "critical" | "warning" | "info";
    service?: string;
  }) => Alert[];
  
  "alerts.ack": (params: {
    alertId: string;
    message: string;
  }) => void;
  
  "alerts.silence": (params: {
    matchers: { label: string; value: string }[];
    duration: string;         // e.g., "2h", "1d"
    reason: string;
  }) => Silence;
  
  "alerts.escalate": (params: {
    alertId: string;
    reason: string;
  }) => void;
  
  "runbook.fetch": (params: {
    alertName: string;
  }) => Runbook;
}

interface Alert {
  id: string;
  name: string;
  severity: "critical" | "warning" | "info";
  status: "firing" | "resolved" | "silenced";
  service: string;
  summary: string;
  firedAt: Date;
  resolvedAt?: Date;
  labels: Record<string, string>;
}

Legacy: Arm Domain Definition

typescript

interface ArmDomain {
  name: string;
  description: string;
  defaultPatterns: string[];   // Glob patterns for auto-claiming
  mcpServers: string[];        // Which MCP servers this arm can use
}

// Note: ArmDomain reflects an earlier design that relied on static domains.
// In the current design, arms are general-purpose and behavior is primarily
// guided by task classifications, task history, and configuration templates.

Garden (Shared Environment)

The Garden is the workspace that arms tend. It's represented as a 3D space for visualization.

Layers

┌─────────────────────────────────────────────────────────────┐
│                       THE GARDEN                             │
├─────────────────────────────────────────────────────────────┤
│  Physical Layer (files, dirs, repos):                        │
│  ├── Source code                                             │
│  ├── Configuration                                           │
│  ├── Documentation                                           │
│  ├── Tests                                                   │
│  └── Build artifacts                                         │
├─────────────────────────────────────────────────────────────┤
│  Logical Layer (exposed via MCP):                            │
│  ├── git-mcp        - VCS operations                         │
│  ├── env-mcp        - Environment variables                  │
│  ├── runtime-mcp    - Node/Python/Bun versions               │
│  ├── docs-mcp       - Library documentation                  │
│  ├── nx-mcp         - Monorepo orchestration                 │
│  ├── devtools-mcp   - Browser automation                     │
│  ├── deploy-mcp     - Deployment per environment             │
│  └── pkg-mcp        - Package management                     │
├─────────────────────────────────────────────────────────────┤
│  Ownership Layer:                                            │
│  ├── Who owns what (claims)                                  │
│  ├── Who touched what recently (activity)                    │
│  └── Conflict zones (multiple claims)                        │
└─────────────────────────────────────────────────────────────┘

3D Coordinate System (Radial)

Files are positioned in a radial 3D space where distance from center indicates activity - frequently touched files appear closer to the center, making the visualization naturally focus attention on what's actively being worked on.

typescript

interface GardenCoordinate {
  // Radial coordinates
  category: number;    // Angle in degrees (0-360) - which "slice" of the pie
  activity: number;    // Distance from center (0-100) - 0=hot, 100=cold
  depth: number;       // Vertical position (0-100) - stack layer
}

interface GardenNode {
  path: string;
  type: "file" | "directory";
  coords: GardenCoordinate;
  owner?: string;           // Arm ID
  lastTouchedBy?: string;   // Arm ID
  lastTouchedAt?: Date;
  conflictZone: boolean;
}

Axis	Heuristic	Meaning
Category (angle)	File type/category	Each file category gets a slice: UI (0-60°), API (60-120°), DB (120-180°), Infra (180-240°), Tests (240-300°), Docs (300-360°)
Activity (radius)	Recency & frequency	0=center=very active, 100=edge=dormant. Based on touches in last 7 days
Depth (vertical)	Stack layer	0=frontend/surface, 100=infrastructure/deep
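
For rendering, the radial coordinates have to be converted to Cartesian positions at some point. A minimal sketch of that conversion follows; the `Vec3` type, the `toCartesian` function, and the y-up axis convention are assumptions, not something the document specifies.

```typescript
interface GardenCoordinate {
  category: number; // angle in degrees (0-360)
  activity: number; // distance from center (0-100)
  depth: number;    // vertical position (0-100)
}

// Hypothetical Cartesian position, assuming a y-up scene.
interface Vec3 {
  x: number;
  y: number;
  z: number;
}

// Standard polar-to-Cartesian conversion in the horizontal plane,
// with depth mapped straight onto the vertical axis.
function toCartesian(coord: GardenCoordinate): Vec3 {
  const theta = (coord.category * Math.PI) / 180;
  return {
    x: coord.activity * Math.cos(theta),
    y: coord.depth,
    z: coord.activity * Math.sin(theta),
  };
}
```

With this mapping, a file at `activity: 0` sits on the central vertical axis regardless of its category angle.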

Activity Calculation

typescript

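// Minimal shape assumed here; the document does not define Touch.
interface Touch {
  timestamp: Date;
}

function calculateActivity(path: string, touches: Touch[]): number {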
  const now = Date.now();
  const weekMs = 7 * 24 * 60 * 60 * 1000;
  
  // Only consider touches in last 7 days
  const recentTouches = touches.filter(t => 
    now - t.timestamp.getTime() < weekMs
  );
  
  if (recentTouches.length === 0) return 100; // Edge - dormant
  
  // Score based on recency and frequency
  let score = 0;
  for (const touch of recentTouches) {
    const ageMs = now - touch.timestamp.getTime();
    const recencyWeight = 1 - (ageMs / weekMs); // 1.0 for now, 0.0 for week ago
    score += recencyWeight;
  }
  
  // Normalize: more touches + more recent = closer to center
  // Cap at 20 touches for max activity
  const normalized = Math.min(score / 20, 1);
  return Math.round((1 - normalized) * 100); // Invert: 0=active, 100=dormant
}

Visual Effect

In the 3D Garden view:

Center cluster: Files being actively worked on right now
Middle ring: Recently touched, part of ongoing work
Outer ring: Stable files, not recently modified
Colored by owner: Each arm has a color, files glow with owner's color
Pulsing: Files with active claims pulse gently
Conflict zones: Red highlight when multiple arms are contending

Observatory (Web UI + API)

The Observatory is how humans observe and control the system.

Server Components

┌─────────────────────────────────────────────────────────────┐
│                      OBSERVATORY                             │
├─────────────────────────────────────────────────────────────┤
│  Web Server (Hono):                                          │
│  ├── REST API          - CRUD operations, queries            │
│  ├── WebSocket         - Real-time updates                   │
│  ├── Static files      - React SPA                           │
│  └── Push endpoint     - Browser notifications               │
└─────────────────────────────────────────────────────────────┘

UI Views

View	Purpose
Dashboard	System overview, arm status at a glance
Garden	3D visualization of workspace with ownership
Arms	Arm details, context, activity log
Proposals	Active debates, arguments, signals
Approvals	Pending human decisions
Activity	Timeline of all system actions
Config	System settings, arm configuration

Available Actions

Spawn/Kill arms
Approve/Reject proposals
Override arm decisions
Configure context budgets
Reassign file ownership
Trigger deployments

Model Fallback System

The model fallback system lets arms spawn even when they are configured with an unavailable model. It resolves the request to an alternative model based on availability and cost.

Model Resolution Process

Exact Match: Try the requested provider/model combination
Provider Fallback: If model not found, find cheapest model from same provider
Cross-Provider: If provider not available, find same model from another provider
Last Resort: Use first available model from any connected provider

Model Pricing Information

The resolver includes known model pricing to make cost-based decisions when multiple fallback options are available:

typescript

interface ModelPricing {
  input: number;   // Price per million input tokens
  output: number;  // Price per million output tokens
}

// Examples of known model pricing:
// Claude Sonnet 4: { input: 3, output: 15 }
// GPT-4o: { input: 2.5, output: 10 }
// Gemini 2.5 Flash: { input: 0.075, output: 0.3 }

Model Resolution Response

typescript

interface ResolvedModel {
  providerId: string;
  modelId: string;
  providerName: string;
  modelName: string;
  fallback: boolean;
  fallbackReason?: string;
}

When a fallback occurs, the system logs the reason for debugging and transparency.
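
The four resolution steps could be implemented along the following lines. This is a sketch under stated assumptions: `AvailableModel`, `Resolution` (a trimmed-down stand-in for `ResolvedModel`), `cost`, and `resolveModel` are hypothetical names, the cost metric (input price plus output price) is a guess, and models with unknown pricing are treated as the most expensive.

```typescript
interface ModelPricing {
  input: number;
  output: number;
}

// Hypothetical description of a model that a connected provider offers.
interface AvailableModel {
  providerId: string;
  modelId: string;
  pricing?: ModelPricing;
}

// Trimmed-down stand-in for ResolvedModel (display names omitted).
interface Resolution {
  providerId: string;
  modelId: string;
  fallback: boolean;
  fallbackReason?: string;
}

// Assumed cost metric: input + output price; unknown pricing sorts last.
function cost(model: AvailableModel): number {
  return model.pricing ? model.pricing.input + model.pricing.output : Infinity;
}

function resolveModel(
  providerId: string,
  modelId: string,
  available: AvailableModel[]
): Resolution | null {
  const pick = (m: AvailableModel, reason?: string): Resolution => ({
    providerId: m.providerId,
    modelId: m.modelId,
    fallback: reason !== undefined,
    ...(reason !== undefined ? { fallbackReason: reason } : {}),
  });

  // 1. Exact match
  const exact = available.find(
    (m) => m.providerId === providerId && m.modelId === modelId
  );
  if (exact) return pick(exact);

  // 2. Provider fallback: cheapest model from the same provider
  const sameProvider = available.filter((m) => m.providerId === providerId);
  if (sameProvider.length > 0) {
    const cheapest = sameProvider.reduce((best, m) =>
      cost(m) < cost(best) ? m : best
    );
    return pick(cheapest, "model not found; using cheapest model from same provider");
  }

  // 3. Cross-provider: same model from another provider
  const sameModel = available.find((m) => m.modelId === modelId);
  if (sameModel) return pick(sameModel, "provider unavailable; using same model from another provider");

  // 4. Last resort: first available model, if any
  return available.length > 0
    ? pick(available[0], "no match; using first available model")
    : null;
}
```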

Provider Availability Checking

The system can validate model availability before spawning arms:

typescript

// Signature only; the function is asynchronous and resolves to a boolean.
declare function isModelAvailable(
  providerId: string,
  modelId: string,
  apiUrl: string
): Promise<boolean>;

Cost-Based Model Selection

When cost matters, the system can list all available models sorted by cost:

typescript

// Signature only; the function is asynchronous.
declare function getAvailableModelsByCost(
  apiUrl: string
): Promise<Array<{ providerId: string; modelId: string; cost: number }>>;

Nerve System (Communication Layer)

All communication flows through the Nerve System using NATS for distributed messaging and an internal queue for brain↔arm messages.

Message Flow

Human ◄──────► Observatory ◄──────► Brain ◄──────► Arms
         │                    │              │
         │ WebSocket          │ NATS         │ NATS/Queue
         │ Push Notifications │ JetStream    │ MCP Tools
         │ REST API           │              │

NATS Event Types (Arm Lifecycle)

Events published via NATS for distributed arm management:

Event Type	Direction	Description
`arm.spawned`	Agent → All	Arm process started
`arm.killed`	Agent → All	Arm process terminated
`arm.recovered`	Agent → All	Arm recovered from error
`arm.status_changed`	Agent → All	Arm status transition (idle→busy, etc.)
`arm.activity`	Agent → Brain	Arm performed an action
`arm.log`	Agent → Brain	Log message from arm
`agent.connected`	Agent → All	Agent host came online
`agent.disconnected`	Agent → All	Agent host went offline

Queue Message Types (Brain ↔ Arm)

Messages passed through the brain's message queue:

Message Type	Direction	Description
`task_assignment`	Brain → Arm	Assign task to arm
`task_complete`	Arm → Brain	Task finished successfully
`task_failed`	Arm → Brain	Task failed with error
`discovery`	Arm → Brain	Arm found something noteworthy
`dependency_discovery`	Arm → Brain	Arm found a dependency
`status_report`	Arm → Brain	Progress report on current task
`status_update`	Arm → Brain	General status update
`heartbeat`	Arm → Brain	Arm is alive and working
`approval_request`	Arm → Brain	Arm needs approval for action
`approval_response`	Brain → Arm	Approval granted/denied
`human_message`	Human → Brain	Message from human
`share_note`	Arm → All	Note shared between arms
`tool_discovery`	Arm → Brain	Arm found useful tool
`doc_update`	Brain → Arm	Documentation needs updating
`file_subscription`	Arm → Brain	Arm watching a file
`file_change`	Brain → Arm	Watched file changed
`claim_transfer`	Brain → Arm	File claim transferred
`bug_report`	Arm → Brain	Bug discovered
`bug_assignment`	Brain → Arm	Bug assigned for fixing
`context_compression`	Arm → Brain	Context was compressed
`dev_server_restart_request`	Arm → Brain	Request to restart dev server

Activity Event Types (JetStream)

Events tracked in JetStream for activity analysis:

Category	Event Types
Session	`session.status`, `session.idle`, `session.error`, `session.updated`, `session.diff`
Message	`message.updated`, `message.removed`, `message.part.updated`, `message.part.removed`
Permission	`permission.asked`, `permission.replied`
Todo	`todo.updated`
File	`file.edited`, `file.watcher.updated`
Command	`command.executed`
Arm Lifecycle	`arm.spawned`, `arm.status_changed`, `arm.heartbeat`, `arm.killed`, `arm.stopped`
Task	`task.created`, `task.assigned`, `task.claimed`, `task.completed`, `task.blocked`, `task.failed`
Brain	`arm_prompted`, `event-status`, `started`, `stopped`
