Daemion docs

How does Daemion work?

Daemion is a local-first agent system: a persistent gateway running on your machine, a 6-substrate kernel that handles everything from storage to streaming, and a universal extension model where every capability is data — not code.


System overview

Phone / Browser (PWA)

        │  HTTPS via Tailscale (or localhost)

┌──────────────────────────────────────────┐
│           Gateway  :3001 (default)       │
│                                          │
│  HTTP/WebSocket API                      │
│                                          │
│  ┌────────────────────────────────────┐  │
│  │         6 Kernel Substrates        │  │
│  │ Extension │ Context │ Execution    │  │
│  │ Trigger   │ Storage │ Presentation │  │
│  └────────────────────────────────────┘  │
│                                          │
│  ┌────────────────────────────────────┐  │
│  │          Local Storage             │  │
│  │  SQLite  ·  Wiki  ·  Filesystem    │  │
│  └────────────────────────────────────┘  │
└──────────────────────────────────────────┘

        │  Agent SDK (not child_process)

   Runtime model path

The frontend is a static PWA served from Vercel. It is only a window into your local system: all turns, threads, extensions, and config live in SQLite on your machine. The gateway serves all of that data; the frontend never talks to an external database.


What is the gateway?

The gateway is a local HTTP/WebSocket server (port 3001 by default). It is the single entry point for all client communication — every message, thread list, extension CRUD, job run, and streaming response goes through it.

Core API surface:

| Method | Path | What it does |
| --- | --- | --- |
| GET | /health | Health check, no auth required |
| POST | /chat | Send a turn, get a streaming response |
| GET | /threads | List conversation threads |
| GET | /threads/:id/turns | Turns for a thread |
| POST | /threads | Create a thread |
| POST | /run/:job | Execute a job by name |
| GET | /extensions | List all extensions |
| POST | /extensions | Create or update an extension |
| DELETE | /extensions/:id | Remove an extension |
| POST | /reseed | Re-sync built-in extensions from disk (no restart needed) |
| WS | /stream | Streaming turns and tool-call events |

The API data model uses “turns” throughout — not “messages.” A turn is one exchange unit in a thread.

The gateway is local-first. Remote access goes through Tailscale when needed, which preserves the “lives on your machine” model instead of turning Daemion into a hosted service. Bearer token auth is required for all endpoints except /health.
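As a rough illustration of the API surface and the auth rule, the sketch below assembles a POST /chat request without sending it. The `buildChatRequest` helper and the `GatewayRequest` shape are hypothetical, and the `Authorization: Bearer <token>` header form is an assumption (the docs only state that bearer token auth is required). The body fields follow the POST /chat payload shown in the message flow on this page.

```typescript
// Hypothetical request descriptor; not part of any Daemion client library.
interface GatewayRequest {
  url: string;
  method: "GET" | "POST" | "DELETE";
  headers: Record<string, string>;
  body?: string;
}

// Describe a POST /chat call. The header format is an assumption.
function buildChatRequest(
  baseUrl: string, // e.g. "http://localhost:3001" (default gateway port)
  token: string,
  threadId: string,
  content: string,
): GatewayRequest {
  if (token.length === 0) {
    // Every endpoint except /health requires a token.
    throw new Error("missing bearer token");
  }
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/chat`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({ thread_id: threadId, content }),
  };
}

const chatRequest = buildChatRequest(
  "http://localhost:3001/",
  "example-token",
  "thr_01abc123",
  "Hello",
);
```

Keeping the builder free of network calls makes it easy to inspect the request before handing it to whatever HTTP client the frontend uses.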


What are the 6 kernel substrates?

The substrates are the OS primitives. Everything else — jobs, agents, commands, themes — plugs into them as extensions.

1. Extension Substrate

The meta-substrate. Registers, validates, loads, and manages all 12 extension types. Extensions are stored as JSON/YAML in SQLite — not compiled code. The agent can create extensions at runtime through chat.

See Extending Daemion for the full extension model.

2. Context Substrate

How the agent knows things. Inspired by Recursive Language Models — externalizes context rather than bulk-loading everything into the prompt.

Per request, the Context Substrate:

  1. Loads the last 10–15 turns from SQLite (always present, full text)
  2. Searches the knowledge store in parallel (compiled wiki, raw sources, and history surfaces)
  3. Provides the agent with history and knowledge tools such as search_history, get_thread, list_threads, find_relevant, search_all, knowledge_search, and knowledge_read

Older turns are queryable on demand — the agent retrieves them when it detects it needs them. This keeps prompts lean while preserving access to full history and durable knowledge.
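The split between "always loaded" and "queryable on demand" can be sketched as follows. This is a minimal illustration under stated assumptions: the `Turn` shape, `splitContext`, and the substring-based `searchHistory` stand-in are all hypothetical, and the real search_history tool is not specified here.

```typescript
// Hypothetical turn shape; the real schema is not shown in these docs.
interface Turn {
  id: string;
  threadId: string;
  text: string;
}

// Split a thread into the window that is always loaded in full and the
// older remainder that stays in storage until a history tool asks for it.
function splitContext(
  turns: Turn[],
  windowSize: number,
): { recent: Turn[]; older: Turn[] } {
  const cut = Math.max(0, turns.length - windowSize);
  return { recent: turns.slice(cut), older: turns.slice(0, cut) };
}

// Stand-in for a tool like search_history: a case-insensitive substring match.
function searchHistory(olderTurns: Turn[], query: string): Turn[] {
  const needle = query.toLowerCase();
  return olderTurns.filter((turn) => turn.text.toLowerCase().includes(needle));
}

// A 40-turn thread where one early turn holds a fact needed later.
const thread: Turn[] = Array.from({ length: 40 }, (_, i) => ({
  id: `trn_${i}`,
  threadId: "thr_01abc123",
  text: i === 3 ? "The deploy key lives in the vault" : `turn ${i}`,
}));

const { recent, older } = splitContext(thread, 15);
const hits = searchHistory(older, "deploy key");
```

Only `recent` would go into the prompt; `hits` is what the agent would pull in after deciding it needs the earlier detail.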

3. Execution Substrate

How the agent does work. Manages model selection, tool access, budgets, turn limits, streaming, cancellation, and concurrency.

| Request type | Model | Max turns | Budget |
| --- | --- | --- | --- |
| Chat (quick) | sonnet | 10 | $0.50 |
| Chat (complex) | sonnet/opus | 25 | $5.00 |
| Job execution | configurable | 30 | $5.00 |
| Build task | sonnet | 50 | $10.00 |

All Claude invocations use the Agent SDK — never child_process (known hang bug #771). The SDK provides streaming, tool access, and session management.
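To make the limits table concrete, here is a small sketch of how turn and budget limits could be enforced per request type. The numbers come from the table above; the key names, the `Verdict` strings, and the choice to check budget before turns are assumptions for illustration, not the engine's actual logic.

```typescript
type RequestType = "chat-quick" | "chat-complex" | "job" | "build";

interface Limits {
  maxTurns: number;
  budgetUsd: number;
}

// Values copied from the limits table; the key names are hypothetical.
const LIMITS: Record<RequestType, Limits> = {
  "chat-quick": { maxTurns: 10, budgetUsd: 0.5 },
  "chat-complex": { maxTurns: 25, budgetUsd: 5 },
  job: { maxTurns: 30, budgetUsd: 5 },
  build: { maxTurns: 50, budgetUsd: 10 },
};

type Verdict = "continue" | "turn limit reached" | "budget exceeded";

// Decide whether a run may take another turn.
function checkLimits(
  type: RequestType,
  turnsUsed: number,
  spentUsd: number,
): Verdict {
  const limits = LIMITS[type];
  if (spentUsd >= limits.budgetUsd) return "budget exceeded";
  if (turnsUsed >= limits.maxTurns) return "turn limit reached";
  return "continue";
}
```

For example, `checkLimits("chat-quick", 10, 0.1)` reports that the turn limit is reached even though most of the budget is unspent.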

4. Presentation Substrate

How output appears in the UI. Renders all content types, streams tokens, shows tool calls as collapsible step indicators, and handles interactive elements (approve/deny buttons, forms).

Content pipeline: Agent output → type detection → renderer selection → display

Custom renderers are extensions of type renderer — a proposal-card renderer, a diff renderer, etc.
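The content pipeline can be pictured as a detection step followed by a registry lookup. The sketch below is illustrative only: the three content types, the detection heuristics, and the renderer outputs are assumptions, and real renderer extensions produce UI components rather than strings.

```typescript
// Hypothetical content types and renderer registry, for illustration only.
type ContentType = "diff" | "json" | "text";
type Renderer = (content: string) => string;

// Type detection: a couple of cheap heuristics, falling back to plain text.
function detectType(content: string): ContentType {
  if (content.startsWith("diff --git ")) return "diff";
  try {
    const parsed: unknown = JSON.parse(content);
    if (typeof parsed === "object" && parsed !== null) return "json";
  } catch {
    // Not JSON; fall through to plain text.
  }
  return "text";
}

// Renderer selection: one renderer per detected type.
const renderers: Record<ContentType, Renderer> = {
  diff: (c) => `[diff view, ${c.split("\n").length} lines]`,
  json: (c) => JSON.stringify(JSON.parse(c), null, 2),
  text: (c) => c,
};

// Agent output -> type detection -> renderer selection -> display string.
function render(content: string): string {
  return renderers[detectType(content)](content);
}
```

Adding a custom renderer in this picture means adding one entry to the registry and one rule to the detector.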

5. Trigger Substrate

What causes things to happen. Evaluates conditions and fires the appropriate response.

| Type | Fires when |
| --- | --- |
| message | User sends a turn |
| command | User types /command |
| cron | Time-based schedule |
| watch | File changes on disk |
| webhook | HTTP request received |
| event | Internal event fires |
| chain | Another extension completes |
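One way to model "evaluate conditions and fire" is a tagged union of triggers matched against incoming occurrences. The trigger type names below come from the table above; the payload fields (`name`, `path`, `after`, and so on) and the `fires` function are hypothetical, and cron, webhook, and event matching are left out to keep the sketch short.

```typescript
// Trigger kinds from the docs; the payload fields are hypothetical.
type Trigger =
  | { type: "message" }
  | { type: "command"; name: string }
  | { type: "cron"; schedule: string }
  | { type: "watch"; path: string }
  | { type: "webhook"; route: string }
  | { type: "event"; name: string }
  | { type: "chain"; after: string };

// Things that can happen in the system (a subset, for illustration).
type Occurrence =
  | { type: "message"; text: string }
  | { type: "command"; name: string }
  | { type: "watch"; changedPath: string }
  | { type: "chain"; completed: string };

// Return true when the occurrence satisfies the trigger's condition.
function fires(trigger: Trigger, occurrence: Occurrence): boolean {
  switch (trigger.type) {
    case "message":
      return occurrence.type === "message";
    case "command":
      return occurrence.type === "command" && occurrence.name === trigger.name;
    case "watch":
      // Match files inside the watched directory.
      return (
        occurrence.type === "watch" &&
        occurrence.changedPath.startsWith(trigger.path + "/")
      );
    case "chain":
      return occurrence.type === "chain" && occurrence.completed === trigger.after;
    default:
      return false; // cron, webhook, and event are omitted from this sketch
  }
}
```

The union type means the compiler flags any trigger payload that does not match its declared kind, which suits a model where triggers are stored as data.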

6. Storage Substrate

All data persists locally. No cloud database.

| Backend | Stores |
| --- | --- |
| SQLite | Turns, threads, extensions, config, metrics, costs |
| Knowledge wiki + raw files | Compiled durable knowledge, raw captured sources, agent-specific wiki spaces |
| Filesystem | Project files, images, attachments, job outputs |

What is the extension model?

Everything that isn’t the kernel is an extension. There are 12 types:

| Type | What it is |
| --- | --- |
| command | Input handler (/, @, !, #) |
| theme | Visual identity (colors, fonts) |
| job | Autonomous work unit |
| renderer | Custom content display component |
| integration | External service connection (GitHub, Slack, Vercel) |
| action | Per-turn contextual action (copy, edit, regenerate) |
| widget | Dashboard UI component |
| app | Embedded Vite application |
| artifact | Agent-created output (code file, document) |
| capability | Agent skill or behavior |
| control | System configuration (budget limits, model defaults) |
| agent | Persistent agent identity |

Extensions are data stored in SQLite — they don’t require compilation or deployment. The primary way to create one is by asking Daemion in chat. Agent-created extensions start disabled and require your approval to activate.
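Because extensions are data, the approval rule can be expressed as a plain record transformation. The record shape below is hypothetical (see Extending Daemion for the real schema), and treating user-created and built-in extensions as enabled on creation is an assumption; the docs only state that agent-created extensions start disabled.

```typescript
type ExtensionType =
  | "command" | "theme" | "job" | "renderer" | "integration" | "action"
  | "widget" | "app" | "artifact" | "capability" | "control" | "agent";

// Hypothetical record shape, for illustration only.
interface Extension {
  id: string;
  type: ExtensionType;
  name: string;
  createdBy: "user" | "agent" | "builtin";
  enabled: boolean;
  config: Record<string, unknown>;
}

// Agent-created extensions start disabled until the user approves them.
function createExtension(draft: Omit<Extension, "enabled">): Extension {
  return { ...draft, enabled: draft.createdBy !== "agent" };
}

// Approval flips the flag and leaves the rest of the record untouched.
function approve(extension: Extension): Extension {
  return { ...extension, enabled: true };
}

const proposed = createExtension({
  id: "ext_09def789",
  type: "job",
  name: "daily-digest",
  createdBy: "agent",
  config: { schedule: "0 9 * * *" },
});
const active = approve(proposed);
```

Nothing is compiled or deployed at any point; activating an extension is a change to one stored field.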

POST /reseed re-syncs built-in extensions from disk without restarting the gateway process.

See Extending Daemion for full schema, lifecycle, and examples.


How does a message flow end to end?

1. You type a message in the PWA

2. Frontend sends POST /chat {"thread_id": "thr_01abc123", "content": "..."}
   via Tailscale (or localhost) to the gateway

3. Gateway receives the request and authenticates the bearer token

4. Context Substrate assembles knowledge:
   a. Load last 10–15 turns from SQLite (always present)
   b. Search the knowledge store for relevant compiled or raw context (parallel)
   c. Check for extension-provided context (integrations, workspace state)

5. Execution Substrate invokes the agent (Agent SDK):
   a. Select model from request metadata or thread default
   b. Apply budget and turn limits based on detected complexity
   c. Stream tool calls + text tokens back via WebSocket /stream

6. Presentation Substrate formats output:
   a. Tool calls appear as step indicators ("Reading file...")
   b. Text streams token-by-token
   c. Code blocks get syntax highlighting on completion
   d. Final turn stored to SQLite

7. Frontend displays the streamed response

For jobs (autonomous, no user turn):

1. Trigger fires (cron schedule, file watch, event chain)
2. Engine loads the job definition (extension of type "job")
3. Context Substrate assembles job-specific context
4. Execution Substrate invokes the agent with the job prompt
5. Output routed: file write, proposal, notification, or other configured destination
6. If job has chains → trigger the next job
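Step 6 of the job flow can be sketched as a walk over job definitions. The `JobDef` shape and its `chains` field are hypothetical, and the cycle guard is an assumption added for safety; the docs do not say how the engine handles a chain that loops back on itself.

```typescript
// Hypothetical job definition: `chains` names the jobs to trigger next.
interface JobDef {
  name: string;
  prompt: string;
  chains?: string[];
}

// Walk the chain from a starting job, visiting each job at most once so a
// cyclic chain cannot loop forever.
function chainOrder(jobs: Map<string, JobDef>, start: string): string[] {
  const order: string[] = [];
  const queue: string[] = [start];
  const seen = new Set<string>();
  while (queue.length > 0) {
    const name = queue.shift() as string;
    if (seen.has(name)) continue;
    const job = jobs.get(name);
    if (job === undefined) continue; // unknown job names are skipped
    seen.add(name);
    order.push(name);
    queue.push(...(job.chains ?? []));
  }
  return order;
}

const jobs = new Map<string, JobDef>([
  ["digest", { name: "digest", prompt: "Collect today's notes", chains: ["summarize"] }],
  ["summarize", { name: "summarize", prompt: "Summarize the digest", chains: ["notify", "digest"] }],
  ["notify", { name: "notify", prompt: "Send the summary" }],
]);

const order = chainOrder(jobs, "digest");
```

Here `summarize` chains back to `digest`, and the `seen` set is what stops the walk from repeating it.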

What events does the WebSocket send?

The WebSocket at WS /stream sends typed events. Every event is a JSON object with a type field. The event types shown here are:

```json
{ "type": "connected", "threadId": "thr_01abc123" }
{ "type": "start", "messageId": "trn_07xyz456", "model": "claude-sonnet-4-5" }
{ "type": "text-delta", "messageId": "trn_07xyz456", "delta": "Hello" }
{ "type": "tool-start", "messageId": "trn_07xyz456", "tool": "Read", "input": "…" }
{ "type": "tool-end", "messageId": "trn_07xyz456", "tool": "Read", "output": "…" }
{ "type": "finish", "messageId": "trn_07xyz456", "costUsd": 0.003, "durationMs": 1240 }
{ "type": "error", "messageId": "trn_07xyz456", "error": "budget exceeded" }
{ "type": "stopped", "messageId": "trn_07xyz456" }
{ "type": "warning", "text": "…" }
{ "type": "extension-changed", "extensionId": "ext_09def789" }
{ "type": "thread-updated", "threadId": "thr_01abc123" }
```

The frontend reconstructs the full turn from streamed events. Tool calls render as collapsible step indicators in the Presentation Substrate.
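Reconstructing a turn from the stream is a fold over events. The sketch below models a subset of the event shapes from the examples above; the `TurnState` shape and the `reduceTurn` function are hypothetical and not the frontend's actual code.

```typescript
// Event shapes follow the documented examples; only a subset is modeled.
type StreamEvent =
  | { type: "start"; messageId: string; model: string }
  | { type: "text-delta"; messageId: string; delta: string }
  | { type: "tool-start"; messageId: string; tool: string; input: string }
  | { type: "tool-end"; messageId: string; tool: string; output: string }
  | { type: "finish"; messageId: string; costUsd: number; durationMs: number }
  | { type: "error"; messageId: string; error: string };

// Hypothetical client-side view of a turn while it streams in.
interface TurnState {
  messageId: string;
  text: string;
  tools: { tool: string; done: boolean }[];
  status: "streaming" | "finished" | "failed";
}

// Apply one event to the current state and return the next state.
function reduceTurn(state: TurnState | null, event: StreamEvent): TurnState | null {
  if (event.type === "start") {
    return { messageId: event.messageId, text: "", tools: [], status: "streaming" };
  }
  // Ignore events that arrive before a start or belong to another turn.
  if (state === null || state.messageId !== event.messageId) return state;
  switch (event.type) {
    case "text-delta":
      return { ...state, text: state.text + event.delta };
    case "tool-start":
      return { ...state, tools: [...state.tools, { tool: event.tool, done: false }] };
    case "tool-end": {
      // Mark the oldest unfinished call of this tool as done.
      const index = state.tools.findIndex((t) => t.tool === event.tool && !t.done);
      return {
        ...state,
        tools: state.tools.map((t, i) => (i === index ? { ...t, done: true } : t)),
      };
    }
    case "finish":
      return { ...state, status: "finished" };
    case "error":
      return { ...state, status: "failed" };
    default:
      return state;
  }
}

const events: StreamEvent[] = [
  { type: "start", messageId: "trn_07xyz456", model: "claude-sonnet-4-5" },
  { type: "text-delta", messageId: "trn_07xyz456", delta: "Hel" },
  { type: "tool-start", messageId: "trn_07xyz456", tool: "Read", input: "…" },
  { type: "tool-end", messageId: "trn_07xyz456", tool: "Read", output: "…" },
  { type: "text-delta", messageId: "trn_07xyz456", delta: "lo" },
  { type: "finish", messageId: "trn_07xyz456", costUsd: 0.003, durationMs: 1240 },
];

const finalTurn = events.reduce<TurnState | null>(reduceTurn, null);
```

Each `tools` entry maps naturally onto one collapsible step indicator: open while `done` is false, collapsed once the matching tool-end arrives.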


What are the storage backends?

SQLite is the primary store — turns, threads, extensions, config, and cost metrics all live there. Path defaults to ~/.daemion/daemion.db, overridable via DAEMION_DB_PATH.
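The path rule reads as a single fallback. In this sketch, `resolveDbPath` is a hypothetical helper, and treating an empty or whitespace-only DAEMION_DB_PATH as unset is an assumption. The environment and home directory are passed in as arguments so the function stays pure; in Node they would typically come from process.env and os.homedir().

```typescript
// Hypothetical helper: pick the database path from the environment,
// falling back to the documented default under the home directory.
function resolveDbPath(
  env: Record<string, string | undefined>,
  home: string,
): string {
  const override = env["DAEMION_DB_PATH"];
  // Assumption: an empty or whitespace-only value counts as "not set".
  if (override !== undefined && override.trim() !== "") return override;
  return `${home}/.daemion/daemion.db`;
}

const defaultPath = resolveDbPath({}, "/home/me");
const customPath = resolveDbPath({ DAEMION_DB_PATH: "/data/daemion.db" }, "/home/me");
```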

Knowledge in the current product is provided through the compiled wiki, raw source capture, and searchable history surfaces. That is the memory model users should expect today.

Filesystem access is provided via GET /filesystem/ls and GET /filesystem/search. Both take a path param that defaults to os.homedir() — the agent can browse your home directory. Scope this appropriately for your threat model.
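One way a client could scope its own filesystem calls is to normalize the requested path and refuse anything outside an allowed root before building the URL. Everything here is a hypothetical client-side sketch: the helper names are invented, passing `path` as a query parameter is an assumption, only POSIX-style absolute paths are handled, and symlinks are not resolved. It does not describe any check the gateway itself performs.

```typescript
// Normalize a POSIX-style absolute path, resolving "." and ".." segments.
function normalize(path: string): string {
  const out: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") out.pop();
    else out.push(segment);
  }
  return "/" + out.join("/");
}

// Return the normalized path if it stays inside `root`, otherwise null.
function scopeTo(root: string, requested: string): string | null {
  const base = normalize(root);
  const target = normalize(requested);
  const prefix = base === "/" ? "/" : base + "/";
  return target === base || target.startsWith(prefix) ? target : null;
}

// Build an ls URL with an explicit path so the homedir default never applies.
function lsUrl(gateway: string, root: string, requested: string): string {
  const path = scopeTo(root, requested);
  if (path === null) throw new Error(`path escapes ${root}`);
  return `${gateway}/filesystem/ls?path=${encodeURIComponent(path)}`;
}

const safeUrl = lsUrl(
  "http://localhost:3001",
  "/home/me/projects",
  "/home/me/projects/app/../docs",
);
```

A request for `/home/me/projects/../.ssh` normalizes to `/home/me/.ssh`, falls outside the root, and is rejected before any request is made.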


Common questions

Q: Are Daemion agents separate AIs?
A: No. Agents are persistent roles inside the same Daemion system. They have different souls, prompts, permissions, history, and responsibilities, but they are all part of one local agent environment.

Q: Why local-first instead of cloud?
A: Your turns, threads, extensions, and config live on your machine in SQLite. The Vercel PWA is just static HTML/JS, a window into your local system. There's no external database. Tailscale provides encrypted remote access without any cloud intermediary.

Q: What's the difference between the gateway and the daemon?
A: The gateway is the HTTP/WebSocket API server. The daemon is the gateway plus autonomous capabilities: heartbeat (ambient awareness on wake), cron scheduler, file watcher, and job executor. daemion start runs the gateway only. The background service (launchd/systemd) runs the full daemon.

Q: What is the heartbeat?
A: The heartbeat is a core engine concept, not a job. On each system wake, the daemon reads HEARTBEAT.md, checks ambient state, and replies HEARTBEAT_OK if nothing needs attention, or sends a notification if something does. It's how Daemion maintains ambient awareness without polling.

Q: How does context work for long conversations?
A: The last 10–15 turns are always loaded in full. Older turns are retrieved on demand through the history tools, while durable knowledge comes from the compiled wiki and raw capture surfaces. The system keeps prompts lean by retrieving what it needs instead of stuffing everything into one context window.

What can go wrong

Common architecture issues

Gateway unreachable from phone: Tailscale must be connected on both devices. The gateway binds to 127.0.0.1 only; Tailscale routes correctly when both devices are on the same Tailscale network. Check tailscale status on both.

401 {"error": "unauthorized"}: The bearer token is missing or expired. Re-pair the device: run daemion start, scan the QR code, and let the new token replace the old one in localStorage.

Filesystem endpoint returns homedir contents unexpectedly: GET /filesystem/ls defaults to os.homedir() when no path param is provided. Always pass an explicit path if you're using this endpoint programmatically.


What’s next?