Cersei

Introduction

The Rust SDK for building coding agents. Tool execution, LLM streaming, graph memory, sub-agent orchestration, MCP — as composable library functions.

Every building block of a production coding agent — tool execution, LLM streaming, sub-agent orchestration, persistent memory with an embedded graph database, skills, MCP integration — shipped as composable library functions. Build a Claude Code replacement, embed an agent in your app, or create something new.

use cersei::prelude::*;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let output = Agent::builder()
        .provider(Anthropic::from_env()?)
        .tools(cersei::tools::coding())
        .permission_policy(AllowAll)
        .run_with("Fix the failing tests in src/")
        .await?;

    println!("{}", output.text());
    Ok(())
}

Cersei is modular — use the full SDK or pick individual crates. The cersei facade re-exports everything. Individual crates like cersei-tools or cersei-memory work standalone.
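For example, depending on individual crates instead of the facade might look like this in Cargo.toml — a sketch using the crate names and git source mentioned on this page (only the "graph" feature is documented here; anything else would be an assumption):

```toml
# Cargo.toml — picking individual crates instead of the `cersei` facade.
[dependencies]
cersei-tools = { git = "https://github.com/pacifio/cersei" }
cersei-memory = { git = "https://github.com/pacifio/cersei", features = ["graph"] }
tokio = { version = "1", features = ["full"] }
```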


Why Cersei

Claude Code and OpenCode are monolithic CLI apps. You can't embed them in your own application, swap the memory backend, or customize the tool pipeline. Cersei gives you the same capabilities as decomposed library functions.

| | Claude Code | OpenCode | Cersei SDK | Abstract CLI |
|---|---|---|---|---|
| Form factor | CLI app | CLI app | Library | CLI app |
| Embeddable | No | No | Yes | No (uses SDK) |
| Provider | Anthropic only | Multi | Multi | Multi |
| Language | TypeScript | TypeScript | Rust | Rust |
| Memory | File + LLM recall | SQLite | File + Graph DB | File + Graph DB |
| Custom tools | Plugins | Plugins | impl Tool trait | Via SDK |
| Startup | ~266ms | ~300ms | N/A (library) | ~32ms |
| Peak RSS | 333 MB | N/A | N/A (library) | 4.9 MB |

Performance at a Glance

Numbers from run_tool_bench_claude.sh and run_tool_bench_codex.sh. Apple Silicon, release builds.

Startup Time

| Tool | Startup | Ratio |
|---|---|---|
| Abstract | 22ms | |
| Codex CLI | 57ms | 2.6x slower |
| Claude Code | 266ms | 12x slower |

Peak Memory (RSS)

| Tool | RSS | Ratio |
|---|---|---|
| Abstract | 4.7 MB | |
| Codex CLI | 44.7 MB | 9.5x more |
| Claude Code | 333 MB | 71x more |

Memory Recall — The Largest Gap

Both Claude Code and Codex call an LLM every turn to rank memory files. Cersei's embedded graph database does indexed lookups in 98 microseconds — no LLM call, no API cost.

| Operation | Abstract | Claude Code | Codex CLI |
|---|---|---|---|
| Memory recall (agent) | 98us (graph) | 7545ms (Sonnet) | 5751ms (GPT) |
| Memory write (agent) | 28us (graph) | 20687ms | 5882ms |
| File scan (100 files) | 1.2ms | 26.6ms | |

Sequential Throughput

10 consecutive prompts, per-request average:

| Tool | Per-request | Ratio |
|---|---|---|
| Abstract | 1564ms | |
| Codex CLI | 4152ms | 2.7x slower |
| Claude Code | 12079ms | 7.7x slower |

Graph ON vs OFF

Enabling the graph database adds zero overhead to scan and context building, and makes recall 92.5% faster: 1359us without the graph versus 103us with it. There's no reason to turn it off.
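The shape of that gap is easy to reproduce in plain Rust. This standalone sketch (no Cersei APIs — a HashMap stands in for the graph index) contrasts scan-based recall, which touches every entry, with indexed recall, which does a single lookup regardless of store size:

```rust
use std::collections::HashMap;

// Scan-based recall: filter every entry (how file-based recall behaves).
fn scan_recall<'a>(entries: &'a [(String, String)], tag: &str) -> Vec<&'a str> {
    entries
        .iter()
        .filter(|(t, _)| t.as_str() == tag)
        .map(|(_, text)| text.as_str())
        .collect()
}

// Index-based recall: one hash lookup, independent of store size.
fn indexed_recall<'a>(index: &'a HashMap<String, Vec<String>>, tag: &str) -> Vec<&'a str> {
    index
        .get(tag)
        .map(|v| v.iter().map(String::as_str).collect())
        .unwrap_or_default()
}

fn main() {
    // Build 100_000 entries, tagging every 1000th one "coding-style".
    let entries: Vec<(String, String)> = (0..100_000)
        .map(|i| {
            let tag = if i % 1000 == 0 { "coding-style" } else { "misc" };
            (tag.to_string(), format!("memory {i}"))
        })
        .collect();

    // Build the index once; recalls after that never rescan the store.
    let mut index: HashMap<String, Vec<String>> = HashMap::new();
    for (tag, text) in &entries {
        index.entry(tag.clone()).or_default().push(text.clone());
    }

    let scanned = scan_recall(&entries, "coding-style");
    let indexed = indexed_recall(&index, "coding-style");
    assert_eq!(scanned, indexed); // same results, very different cost curves
    println!("{} matches either way", scanned.len());
}
```

The index is paid for once at write time, which is exactly the trade the graph backend makes.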

Tool Dispatch (SDK-level)

50 iterations per tool, in-process — no IPC, no subprocess. Grep and Bash latency is dominated by subprocess spawn; everything else is pure in-process I/O.
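That spawn cost is easy to observe directly. This standalone snippet (no Cersei APIs) times 50 in-process operations against 50 "echo" subprocess spawns, mirroring the benchmark's iteration count:

```rust
use std::process::Command;
use std::time::{Duration, Instant};

// Time `n` in-process operations (a stand-in for pure in-process tool I/O).
fn time_in_process(n: u32) -> Duration {
    let start = Instant::now();
    for i in 0..n {
        let s = format!("iteration {i}");
        std::hint::black_box(s); // keep the work from being optimized away
    }
    start.elapsed()
}

// Time `n` subprocess spawns (what Grep/Bash-style tools pay per call).
fn time_spawns(n: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..n {
        Command::new("echo").arg("hi").output().expect("failed to spawn echo");
    }
    start.elapsed()
}

fn main() {
    let in_process = time_in_process(50);
    let spawned = time_spawns(50);
    println!("in-process: {in_process:?}, subprocess: {spawned:?}");
    // Spawning dominates by orders of magnitude on typical machines.
    assert!(spawned > in_process);
}
```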

Full benchmark methodology, per-test breakdowns, and reproduction steps: Benchmark Report


Quick Start

Anthropic:

use cersei::prelude::*;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let output = Agent::builder()
        .provider(Anthropic::from_env()?)  // reads ANTHROPIC_API_KEY
        .tools(cersei::tools::coding())
        .permission_policy(AllowAll)
        .run_with("What files are in the current directory?")
        .await?;

    println!("{}", output.text());
    Ok(())
}
use cersei::prelude::*;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let output = Agent::builder()
        .provider(OpenAi::from_env()?)  // reads OPENAI_API_KEY
        .model("gpt-4o")
        .tools(cersei::tools::coding())
        .permission_policy(AllowAll)
        .run_with("Explain the architecture of this project")
        .await?;

    println!("{}", output.text());
    Ok(())
}

Local model via an OpenAI-compatible endpoint (Ollama):

use cersei::prelude::*;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let provider = OpenAi::builder()
        .base_url("http://localhost:11434/v1")
        .model("llama3.1:70b")
        .api_key("ollama")
        .build()?;

    let output = Agent::builder()
        .provider(provider)
        .tools(cersei::tools::coding())
        .permission_policy(AllowAll)
        .run_with("Refactor this function for readability")
        .await?;

    println!("{}", output.text());
    Ok(())
}

With Streaming

let agent = Agent::builder()
    .provider(Anthropic::from_env()?)
    .tools(cersei::tools::coding())
    .permission_policy(AllowAll)
    .build()?;

let mut stream = agent.run_stream("Fix the failing tests");
while let Some(event) = stream.next().await {
    match event {
        AgentEvent::TextDelta(t) => print!("{t}"),
        AgentEvent::ToolStart { name, .. } => eprintln!("\n[{name}]"),
        AgentEvent::ToolEnd { name, duration, .. } => {
            eprintln!("[{name} done in {}ms]", duration.as_millis());
        }
        AgentEvent::Complete(output) => {
            eprintln!("\n--- {} turns, {} tool calls ---", output.turns, output.tool_calls.len());
            break;
        }
        _ => {}
    }
}
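The event-matching pattern above is ordinary Rust. Here is the same shape with a plain channel and a local event enum — no Cersei, no async, and the variant names are only illustrative stand-ins for AgentEvent — to show how a consumer fans out on event variants and stops at completion:

```rust
use std::sync::mpsc;

// A local stand-in for the SDK's event enum (names are illustrative).
#[derive(Debug)]
enum Event {
    TextDelta(String),
    ToolStart { name: String },
    Complete { turns: u32 },
}

// Drain events until Complete, accumulating text along the way.
fn consume(rx: mpsc::Receiver<Event>) -> (String, u32) {
    let mut text = String::new();
    let mut turns = 0;
    for event in rx {
        match event {
            Event::TextDelta(t) => text.push_str(&t),
            Event::ToolStart { name } => eprintln!("[{name}]"),
            Event::Complete { turns: n } => {
                turns = n;
                break; // the stream is done
            }
        }
    }
    (text, turns)
}

fn main() {
    let (tx, rx) = mpsc::channel();
    tx.send(Event::TextDelta("Fixing ".into())).unwrap();
    tx.send(Event::ToolStart { name: "bash".into() }).unwrap();
    tx.send(Event::TextDelta("tests".into())).unwrap();
    tx.send(Event::Complete { turns: 2 }).unwrap();

    let (text, turns) = consume(rx);
    assert_eq!(text, "Fixing tests");
    assert_eq!(turns, 2);
}
```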

With Graph Memory

use std::path::Path;
use cersei::memory::manager::MemoryManager;

let mm = MemoryManager::new(project_root)
    .with_graph(Path::new("./agent.grafeo"))?;

// Store facts the agent discovers
let id = mm.store_memory("User prefers functional patterns", MemoryType::User, 0.9);
mm.tag_memory(&id.unwrap(), "coding-style");

// Later — recall in 98 microseconds, no LLM call
let results = mm.recall("coding style", 5);

Install

[dependencies]
cersei = { git = "https://github.com/pacifio/cersei" }
tokio = { version = "1", features = ["full"] }
anyhow = "1"

For graph memory:

cersei-memory = { git = "https://github.com/pacifio/cersei", features = ["graph"] }

For the Abstract CLI (complete coding agent):

cargo install --git https://github.com/pacifio/cersei abstract-cli
