Quick Start

Get daimon running and make your first streaming request in under five minutes.

1. Install

macOS (Homebrew):

brew tap sonicboom15/tap
brew install daimon

Windows (winget):

winget install sonicboom15.daimon

Windows (Scoop):

scoop bucket add sonicboom15 https://github.com/sonicboom15/scoop-bucket
scoop install daimon

Debian/Ubuntu: download the .deb from the latest release, then install it:

sudo dpkg -i daimon_*_linux_amd64.deb

From source (requires Go 1.23+):

git clone https://github.com/sonicboom15/daimon.git
cd daimon && make build
# binary at ./bin/daimon

2. Set up a model

Cloud APIs (Anthropic or OpenAI): export the matching API key, then save a config.yaml:

export ANTHROPIC_API_KEY=sk-ant-...
# export OPENAI_API_KEY=sk-...
config.yaml
port: 3500

components:
  - name: claude
    type: anthropic
    metadata:
      default_model: claude-haiku-4-5-20251001

  - name: gpt4o
    type: openai
    metadata:
      default_model: gpt-4o-mini

Local model (Ollama): start Ollama in Docker and pull a model:

docker run -d -p 11434:11434 --name ollama ollama/ollama
docker exec ollama ollama pull qwen2.5:1.5b

Save a config.yaml pointing at it:

config.yaml
port: 3500

components:
  - name: local
    type: llamacpp
    metadata:
      base_url: http://localhost:11434/v1
      default_model: qwen2.5:1.5b

Tip

Swap qwen2.5:1.5b for any model on ollama.com/library. Larger models are slower but more capable.
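Because components is a list, both setups can live in one file; a sketch combining the two snippets above (component names are arbitrary labels that clients reference per request):

```yaml
port: 3500

components:
  # Cloud backend (reads ANTHROPIC_API_KEY from the environment)
  - name: claude
    type: anthropic
    metadata:
      default_model: claude-haiku-4-5-20251001

  # Local backend served by Ollama's OpenAI-compatible endpoint
  - name: local
    type: llamacpp
    metadata:
      base_url: http://localhost:11434/v1
      default_model: qwen2.5:1.5b
```

A client then picks a backend by name, e.g. client.stream("claude", ...) or client.stream("local", ...).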


3. Start daimon

daimon serve --config config.yaml

You should see a line like:

INFO daimon listening addr=127.0.0.1:3500

4. Make a request

Note

Examples below use claude. If you used the Docker setup, replace it with local.

Python:

pip install daimon-client

import daimon_client as daimon

with daimon.Client() as client:
    for text in client.stream("claude", "What is a daimon?"):
        print(text, end="", flush=True)
TypeScript:

npm install daimon-client

import { Client } from 'daimon-client';

const client = new Client();
for await (const text of client.stream('claude', 'What is a daimon?')) {
  process.stdout.write(text);
}
Python (async):

import asyncio
import daimon_client as daimon

async def main():
    async with daimon.AsyncClient() as client:
        async for text in client.stream("claude", "What is a daimon?"):
            print(text, end="", flush=True)

asyncio.run(main())
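In each SDK, stream() yields text deltas as they arrive rather than one final string. A minimal sketch of the consumption pattern, using a plain generator as a stand-in for a live client (the deltas here are made up):

```python
def fake_stream():
    # Stand-in for client.stream(...): yields text deltas in arrival order.
    yield from ["A daimon ", "is a guiding ", "spirit."]

chunks = []
for text in fake_stream():
    print(text, end="", flush=True)  # render each delta immediately
    chunks.append(text)              # keep deltas to rebuild the full reply

reply = "".join(chunks)  # complete response once the stream ends
```

Printing with end="" and flush=True is what makes the output appear token by token instead of buffering until the stream finishes.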

Next steps

  • Configuration: components, inference defaults, MCP servers, and telemetry, all in one YAML.

  • Python SDK: multi-turn conversations, sessions, tool calls, async.

  • TypeScript SDK: native fetch, async generators, full type safety.

  • Tool Calls (MCP): wire up filesystem, GitHub, search, and custom tools.