Cersei

Introduction

The Rust SDK for building coding agents. Tool execution, LLM streaming, graph memory, sub-agent orchestration, MCP — as composable library functions.

Every building block of a production coding agent — tool execution, LLM streaming, sub-agent orchestration, persistent memory with an embedded graph database, skills, MCP integration — shipped as composable library functions. Build a Claude Code replacement, embed an agent in your app, or create something new.

use cersei::prelude::*;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let output = Agent::builder()
        .provider(Anthropic::from_env()?)
        .tools(cersei::tools::coding())
        .permission_policy(AllowAll)
        .run_with("Fix the failing tests in src/")
        .await?;

    println!("{}", output.text());
    Ok(())
}

Cersei is modular — use the full SDK or pick individual crates. The cersei facade re-exports everything. Individual crates like cersei-tools or cersei-memory work standalone.
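For example, depending on individual crates instead of the facade might look like this in Cargo.toml — a sketch using the crate names and git source mentioned on this page (only the "graph" feature is documented here; anything else would be an assumption):

```toml
# Cargo.toml — picking individual crates instead of the `cersei` facade.
[dependencies]
cersei-tools = { git = "https://github.com/pacifio/cersei" }
cersei-memory = { git = "https://github.com/pacifio/cersei", features = ["graph"] }
tokio = { version = "1", features = ["full"] }
```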


Why Cersei

Claude Code and OpenCode are monolithic CLI apps. You can't embed them in your own application, swap the memory backend, or customize the tool pipeline. Cersei gives you the same capabilities as decomposed library functions.

| | Claude Code | OpenCode | Cersei SDK | Abstract CLI |
|---|---|---|---|---|
| Form factor | CLI app | CLI app | Library | CLI app |
| Embeddable | No | No | Yes | No (uses SDK) |
| Provider | Anthropic only | Multi | Multi | Multi |
| Language | TypeScript | TypeScript | Rust | Rust |
| Memory | File + LLM recall | SQLite | File + Graph DB | File + Graph DB |
| Custom tools | Plugins | Plugins | impl Tool trait | Via SDK |
| Startup | ~266ms | ~300ms | N/A (library) | ~32ms |
| Peak RSS | 333 MB | N/A | N/A (library) | 4.9 MB |

Performance at a Glance

Numbers from run_tool_bench_claude.sh and run_tool_bench_codex.sh. Apple Silicon, release builds.

Startup Time

| Tool | Startup | Ratio |
|---|---|---|
| Abstract | 22ms | |
| Codex CLI | 57ms | 2.6x slower |
| Claude Code | 266ms | 12x slower |

Peak Memory (RSS)

| Tool | RSS | Ratio |
|---|---|---|
| Abstract | 4.7 MB | |
| Codex CLI | 44.7 MB | 9.5x more |
| Claude Code | 333 MB | 71x more |

Memory Recall — The Largest Gap

Both Claude Code and Codex call an LLM every turn to rank memory files. Cersei's embedded graph database does indexed lookups in 98 microseconds — no LLM call, no API cost.

| Operation | Abstract | Claude Code | Codex CLI |
|---|---|---|---|
| Memory recall (agent) | 98us (graph) | 7545ms (Sonnet) | 5751ms (GPT) |
| Memory write (agent) | 28us (graph) | 20687ms | 5882ms |
| File scan (100 files) | 1.2ms | 26.6ms | |

Sequential Throughput

10 consecutive prompts, per-request average:

| Tool | Per-request | Ratio |
|---|---|---|
| Abstract | 1564ms | |
| Codex CLI | 4152ms | 2.7x slower |
| Claude Code | 12079ms | 7.7x slower |

Graph ON vs OFF

Enabling the graph database adds zero overhead to scan and context building, and makes recall 92.5% faster: 1359us without the graph versus 103us with it. There's no reason to turn it off.
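The shape of that gap is easy to reproduce in plain Rust. This standalone sketch (no Cersei APIs — a HashMap stands in for the graph index) contrasts scan-based recall, which touches every entry, with indexed recall, which does a single lookup regardless of store size:

```rust
use std::collections::HashMap;

// Scan-based recall: filter every entry (how file-based recall behaves).
fn scan_recall<'a>(entries: &'a [(String, String)], tag: &str) -> Vec<&'a str> {
    entries
        .iter()
        .filter(|(t, _)| t.as_str() == tag)
        .map(|(_, text)| text.as_str())
        .collect()
}

// Index-based recall: one hash lookup, independent of store size.
fn indexed_recall<'a>(index: &'a HashMap<String, Vec<String>>, tag: &str) -> Vec<&'a str> {
    index
        .get(tag)
        .map(|v| v.iter().map(String::as_str).collect())
        .unwrap_or_default()
}

fn main() {
    // Build 100_000 entries, tagging every 1000th one "coding-style".
    let entries: Vec<(String, String)> = (0..100_000)
        .map(|i| {
            let tag = if i % 1000 == 0 { "coding-style" } else { "misc" };
            (tag.to_string(), format!("memory {i}"))
        })
        .collect();

    // Build the index once; recalls after that never rescan the store.
    let mut index: HashMap<String, Vec<String>> = HashMap::new();
    for (tag, text) in &entries {
        index.entry(tag.clone()).or_default().push(text.clone());
    }

    let scanned = scan_recall(&entries, "coding-style");
    let indexed = indexed_recall(&index, "coding-style");
    assert_eq!(scanned, indexed); // same results, very different cost curves
    println!("{} matches either way", scanned.len());
}
```

The index is paid for once at write time, which is exactly the trade the graph backend makes.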

Tool Dispatch (SDK-level)

50 iterations per tool, in-process — no IPC, no subprocess. Grep and Bash latency is dominated by subprocess spawn; everything else is pure in-process I/O.
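That spawn cost is easy to observe directly. This standalone snippet (no Cersei APIs) times 50 in-process operations against 50 "echo" subprocess spawns, mirroring the benchmark's iteration count:

```rust
use std::process::Command;
use std::time::{Duration, Instant};

// Time `n` in-process operations (a stand-in for pure in-process tool I/O).
fn time_in_process(n: u32) -> Duration {
    let start = Instant::now();
    for i in 0..n {
        let s = format!("iteration {i}");
        std::hint::black_box(s); // keep the work from being optimized away
    }
    start.elapsed()
}

// Time `n` subprocess spawns (what Grep/Bash-style tools pay per call).
fn time_spawns(n: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..n {
        Command::new("echo").arg("hi").output().expect("failed to spawn echo");
    }
    start.elapsed()
}

fn main() {
    let in_process = time_in_process(50);
    let spawned = time_spawns(50);
    println!("in-process: {in_process:?}, subprocess: {spawned:?}");
    // Spawning dominates by orders of magnitude on typical machines.
    assert!(spawned > in_process);
}
```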

Full benchmark methodology, per-test breakdowns, and reproduction steps: Benchmark Report


Quick Start

Anthropic:

use cersei::prelude::*;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let output = Agent::builder()
        .provider(Anthropic::from_env()?)  // reads ANTHROPIC_API_KEY
        .tools(cersei::tools::coding())
        .permission_policy(AllowAll)
        .run_with("What files are in the current directory?")
        .await?;

    println!("{}", output.text());
    Ok(())
}
use cersei::prelude::*;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let output = Agent::builder()
        .provider(OpenAi::from_env()?)  // reads OPENAI_API_KEY
        .model("gpt-4o")
        .tools(cersei::tools::coding())
        .permission_policy(AllowAll)
        .run_with("Explain the architecture of this project")
        .await?;

    println!("{}", output.text());
    Ok(())
}

Local model via an OpenAI-compatible endpoint (Ollama):

use cersei::prelude::*;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let provider = OpenAi::builder()
        .base_url("http://localhost:11434/v1")
        .model("llama3.1:70b")
        .api_key("ollama")
        .build()?;

    let output = Agent::builder()
        .provider(provider)
        .tools(cersei::tools::coding())
        .permission_policy(AllowAll)
        .run_with("Refactor this function for readability")
        .await?;

    println!("{}", output.text());
    Ok(())
}

With Streaming

let agent = Agent::builder()
    .provider(Anthropic::from_env()?)
    .tools(cersei::tools::coding())
    .permission_policy(AllowAll)
    .build()?;

let mut stream = agent.run_stream("Fix the failing tests");
while let Some(event) = stream.next().await {
    match event {
        AgentEvent::TextDelta(t) => print!("{t}"),
        AgentEvent::ToolStart { name, .. } => eprintln!("\n[{name}]"),
        AgentEvent::ToolEnd { name, duration, .. } => {
            eprintln!("[{name} done in {}ms]", duration.as_millis());
        }
        AgentEvent::Complete(output) => {
            eprintln!("\n--- {} turns, {} tool calls ---", output.turns, output.tool_calls.len());
            break;
        }
        _ => {}
    }
}
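The event-matching pattern above is ordinary Rust. Here is the same shape with a plain channel and a local event enum — no Cersei, no async, and the variant names are only illustrative stand-ins for AgentEvent — to show how a consumer fans out on event variants and stops at completion:

```rust
use std::sync::mpsc;

// A local stand-in for the SDK's event enum (names are illustrative).
#[derive(Debug)]
enum Event {
    TextDelta(String),
    ToolStart { name: String },
    Complete { turns: u32 },
}

// Drain events until Complete, accumulating text along the way.
fn consume(rx: mpsc::Receiver<Event>) -> (String, u32) {
    let mut text = String::new();
    let mut turns = 0;
    for event in rx {
        match event {
            Event::TextDelta(t) => text.push_str(&t),
            Event::ToolStart { name } => eprintln!("[{name}]"),
            Event::Complete { turns: n } => {
                turns = n;
                break; // the stream is done
            }
        }
    }
    (text, turns)
}

fn main() {
    let (tx, rx) = mpsc::channel();
    tx.send(Event::TextDelta("Fixing ".into())).unwrap();
    tx.send(Event::ToolStart { name: "bash".into() }).unwrap();
    tx.send(Event::TextDelta("tests".into())).unwrap();
    tx.send(Event::Complete { turns: 2 }).unwrap();

    let (text, turns) = consume(rx);
    assert_eq!(text, "Fixing tests");
    assert_eq!(turns, 2);
}
```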

With Graph Memory

use std::path::Path;
use cersei::memory::manager::MemoryManager;

let mm = MemoryManager::new(project_root)
    .with_graph(Path::new("./agent.grafeo"))?;

// Store facts the agent discovers
let id = mm.store_memory("User prefers functional patterns", MemoryType::User, 0.9);
mm.tag_memory(&id.unwrap(), "coding-style");

// Later — recall in 98 microseconds, no LLM call
let results = mm.recall("coding style", 5);

Install

[dependencies]
cersei = { git = "https://github.com/pacifio/cersei" }
tokio = { version = "1", features = ["full"] }
anyhow = "1"

For graph memory:

cersei-memory = { git = "https://github.com/pacifio/cersei", features = ["graph"] }

For the Abstract CLI (complete coding agent):

cargo install --git https://github.com/pacifio/cersei abstract-cli
