Quick Start
Get daimon running and make your first streaming request in under five minutes.
1. Install
Download a prebuilt binary from the latest release.
2. Set up a model
You can point daimon at a hosted provider (export your API key, then save a config.yaml for that provider) or run fully locally. For the local route, start Ollama in Docker and pull a model:
```shell
docker run -d -p 11434:11434 --name ollama ollama/ollama
docker exec ollama ollama pull qwen2.5:1.5b
```
Save a config.yaml pointing at it:
config.yaml

```yaml
port: 3500
components:
  - name: local
    type: llamacpp
    metadata:
      base_url: http://localhost:11434/v1
      default_model: qwen2.5:1.5b
```
Tip
Swap qwen2.5:1.5b for any model on ollama.com/library. Larger models are slower but more capable.
3. Start daimon
4. Make a request
Note
Examples below use claude. If you used the Docker setup, replace it with local.
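A minimal streaming-request sketch in Python, assuming daimon exposes an OpenAI-compatible chat endpoint at `/v1/chat/completions` on the configured port and streams Server-Sent Events (the endpoint path, payload shape, and SSE framing are assumptions here, not daimon's documented API):

```python
import json
import urllib.request

# Port from config.yaml; the endpoint path is an assumption.
DAIMON_URL = "http://localhost:3500/v1/chat/completions"

def parse_sse_data(line: str):
    """Return the decoded JSON payload of one SSE 'data:' line, or None
    for blank lines, comments, and the terminal [DONE] sentinel."""
    line = line.strip()
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None
    return json.loads(payload)

def stream_chat(prompt: str, model: str = "claude") -> str:
    """POST a streaming chat request and join the content deltas."""
    # model is a component name from your config; use "local" for the
    # Docker setup above.
    body = json.dumps({
        "model": model,
        "stream": True,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        DAIMON_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    parts = []
    with urllib.request.urlopen(req) as resp:
        for raw in resp:
            chunk = parse_sse_data(raw.decode("utf-8"))
            if chunk is not None:
                parts.append(chunk["choices"][0]["delta"].get("content", ""))
    return "".join(parts)
```

The SDKs linked under Next steps wrap this same request flow with sessions and typed responses.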
Next steps
- Configuration: components, inference defaults, MCP servers, telemetry — all in one YAML.
- Python SDK: multi-turn conversations, sessions, tool calls, async.
- TypeScript SDK: native fetch, async generators, full type safety.
- Tool Calls (MCP): wire up filesystem, GitHub, search, and custom tools.