Introduction
The Rust SDK for building coding agents. Tool execution, LLM streaming, graph memory, sub-agent orchestration, MCP — as composable library functions.
Cersei
The complete Rust SDK for building coding agents.
Every building block of a production coding agent — tool execution, LLM streaming, sub-agent orchestration, persistent memory with an embedded graph database, skills, MCP integration — shipped as composable library functions. Build a Claude Code replacement, embed an agent in your app, or create something new.
    use cersei::prelude::*;

    #[tokio::main]
    async fn main() -> anyhow::Result<()> {
        let output = Agent::builder()
            .provider(Anthropic::from_env()?)
            .tools(cersei::tools::coding())
            .permission_policy(AllowAll)
            .run_with("Fix the failing tests in src/")
            .await?;
        println!("{}", output.text());
        Ok(())
    }

Cersei is modular: use the full SDK or pick individual crates. The cersei facade re-exports everything. Individual crates like cersei-tools or cersei-memory work standalone.
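To make the facade idea concrete, here is a minimal single-file sketch of the pattern. The modules `tools_crate` and `memory_crate`, and the types `ToolSpec` and `MemoryStore`, are hypothetical stand-ins for separate crates. They are not Cersei's actual items; in a real workspace each module would be its own crate and the facade would only contain `pub use` lines.

```rust
// Hypothetical stand-ins for two standalone crates in a workspace.
pub mod tools_crate {
    /// Illustrative tool descriptor (not Cersei's real type).
    #[derive(Debug, Clone, PartialEq)]
    pub struct ToolSpec {
        pub name: &'static str,
    }

    /// Returns a small, made-up tool set.
    pub fn coding() -> Vec<ToolSpec> {
        vec![ToolSpec { name: "read" }, ToolSpec { name: "bash" }]
    }
}

pub mod memory_crate {
    /// Illustrative memory store (not Cersei's real type).
    #[derive(Default)]
    pub struct MemoryStore {
        entries: Vec<String>,
    }

    impl MemoryStore {
        pub fn store(&mut self, fact: &str) {
            self.entries.push(fact.to_string());
        }

        pub fn len(&self) -> usize {
            self.entries.len()
        }
    }
}

/// The facade holds no logic of its own; it only re-exports.
pub mod facade {
    pub use super::memory_crate as memory;
    pub use super::tools_crate as tools;

    /// A prelude gathers the most commonly used names in one import.
    pub mod prelude {
        pub use super::super::memory_crate::MemoryStore;
        pub use super::super::tools_crate::ToolSpec;
    }
}

fn main() {
    use facade::prelude::*;

    // Everything is reachable through the facade...
    let tools: Vec<ToolSpec> = facade::tools::coding();
    let mut memory = MemoryStore::default();
    memory.store("User prefers functional patterns");

    // ...while each underlying module stays usable on its own.
    let direct = tools_crate::coding();

    println!(
        "{} tools via facade, {} direct, {} memory entries",
        tools.len(),
        direct.len(),
        memory.len()
    );
}
```

The design choice here is that a user who wants only one piece depends on that piece directly, while the facade costs nothing beyond the re-export lines.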
Why Cersei
Claude Code and OpenCode ship as monolithic CLI apps. You can't embed them in your own application, swap the memory backend, or customize the tool pipeline. Cersei gives you the same capabilities as decomposed library functions.
| | Claude Code | OpenCode | Cersei SDK | Abstract CLI |
|---|---|---|---|---|
| Form factor | CLI app | CLI app | Library | CLI app |
| Embeddable | No | No | Yes | No (uses SDK) |
| Provider | Anthropic only | Multi | Multi | Multi |
| Language | TypeScript | TypeScript | Rust | Rust |
| Memory | File + LLM recall | SQLite | File + Graph DB | File + Graph DB |
| Custom tools | Plugins | Plugins | impl Tool trait | Via SDK |
| Startup | ~266ms | ~300ms | N/A (library) | ~22ms |
| Peak RSS | 333 MB | — | N/A | 4.7 MB |
Performance at a Glance
Numbers from run_tool_bench_claude.sh and run_tool_bench_codex.sh. Apple Silicon, release builds.
Startup Time
| Tool | Startup | Ratio |
|---|---|---|
| Abstract | 22ms | — |
| Codex CLI | 57ms | 2.6x slower |
| Claude Code | 266ms | 12x slower |
Peak Memory (RSS)
| Tool | RSS | Ratio |
|---|---|---|
| Abstract | 4.7 MB | — |
| Codex CLI | 44.7 MB | 9.5x more |
| Claude Code | 333 MB | 71x more |
Memory Recall — The Largest Gap
Both Claude Code and Codex call an LLM every turn to rank memory files. Cersei's embedded graph database does indexed lookups in 98 microseconds, with no LLM call and no API cost.
| Operation | Abstract | Claude Code | Codex CLI |
|---|---|---|---|
| Memory recall (agent) | 98us (graph) | 7545ms (Sonnet) | 5751ms (GPT) |
| Memory write (agent) | 28us (graph) | 20687ms | 5882ms |
| File scan (100 files) | 1.2ms | 26.6ms | — |
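The gap in the table comes from replacing a model call with an index lookup. As a rough illustration of why a lookup is so cheap, here is a minimal sketch of tag-indexed recall using only the standard library. `MemoryIndex`, its fields, and its ranking rule are assumptions made for this example; they do not describe the embedded graph database or the MemoryManager API.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory store standing in for an indexed memory backend.
#[derive(Default)]
struct MemoryIndex {
    /// (text, confidence) per memory; the position in the Vec is the id.
    memories: Vec<(String, f32)>,
    /// Inverted index: lowercase tag -> ids of memories carrying that tag.
    by_tag: HashMap<String, Vec<usize>>,
}

impl MemoryIndex {
    /// Stores a memory and indexes it under each tag. Returns the new id.
    fn store(&mut self, text: &str, confidence: f32, tags: &[&str]) -> usize {
        let id = self.memories.len();
        self.memories.push((text.to_string(), confidence));
        for tag in tags {
            self.by_tag.entry(tag.to_lowercase()).or_default().push(id);
        }
        id
    }

    /// Looks up each query word in the tag index. Results are ranked by
    /// number of matching tags, then confidence, then insertion order.
    fn recall(&self, query: &str, limit: usize) -> Vec<&str> {
        let mut hits: HashMap<usize, usize> = HashMap::new();
        for word in query.split_whitespace() {
            if let Some(ids) = self.by_tag.get(&word.to_lowercase()) {
                for &id in ids {
                    *hits.entry(id).or_insert(0) += 1;
                }
            }
        }
        let mut ranked: Vec<(usize, usize)> = hits.into_iter().collect();
        ranked.sort_by(|a, b| {
            b.1.cmp(&a.1)
                .then(self.memories[b.0].1.total_cmp(&self.memories[a.0].1))
                .then(a.0.cmp(&b.0))
        });
        ranked
            .into_iter()
            .take(limit)
            .map(|(id, _)| self.memories[id].0.as_str())
            .collect()
    }
}

fn main() {
    let mut index = MemoryIndex::default();
    index.store("User prefers functional patterns", 0.9, &["coding", "style"]);
    index.store("CI runs on every push", 0.7, &["ci"]);
    index.store("Tabs over spaces", 0.5, &["style"]);

    for hit in index.recall("coding style", 5) {
        println!("{hit}");
    }
}
```

Each recall is a handful of hash lookups and a sort over the matches, so its cost depends on the number of hits rather than on a network round trip.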
Sequential Throughput
10 consecutive prompts, per-request average:
| Tool | Per-request | Ratio |
|---|---|---|
| Abstract | 1564ms | — |
| Codex CLI | 4152ms | 2.7x slower |
| Claude Code | 12079ms | 7.7x slower |
Graph ON vs OFF
Enabling the graph database adds no overhead to scan and context building, and cuts recall latency by about 92%, from 1359us to 103us. There's no reason to turn it off.
Tool Dispatch (SDK-level)
50 iterations per tool, dispatched in-process with no IPC and no subprocess for the dispatch itself. Grep and Bash latency is dominated by the subprocess each of those tools spawns. Everything else is pure in-process I/O.
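For readers who want a feel for what in-process dispatch means, the sketch below looks up a tool by name in a registry and times repeated calls with `std::time::Instant`. The `Tool` trait shown here, the `Registry` type, and the two toy tools are hypothetical; the SDK's real trait and benchmark harness may look quite different.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical tool interface (an assumption, not the SDK's real trait).
trait Tool {
    fn name(&self) -> &'static str;
    fn call(&self, input: &str) -> Result<String, String>;
}

struct Echo;
impl Tool for Echo {
    fn name(&self) -> &'static str {
        "echo"
    }
    fn call(&self, input: &str) -> Result<String, String> {
        Ok(input.to_string())
    }
}

struct WordCount;
impl Tool for WordCount {
    fn name(&self) -> &'static str {
        "word_count"
    }
    fn call(&self, input: &str) -> Result<String, String> {
        Ok(input.split_whitespace().count().to_string())
    }
}

/// Name -> tool map; dispatch is a hash lookup plus a virtual call.
struct Registry {
    tools: HashMap<&'static str, Box<dyn Tool>>,
}

impl Registry {
    fn new(tools: Vec<Box<dyn Tool>>) -> Self {
        Self {
            tools: tools.into_iter().map(|t| (t.name(), t)).collect(),
        }
    }

    fn dispatch(&self, name: &str, input: &str) -> Result<String, String> {
        self.tools
            .get(name)
            .ok_or_else(|| format!("unknown tool: {name}"))?
            .call(input)
    }

    /// Average wall time per dispatch over `iters` calls.
    fn bench(&self, name: &str, input: &str, iters: u32) -> Result<Duration, String> {
        if iters == 0 {
            return Err("iters must be greater than zero".to_string());
        }
        let start = Instant::now();
        for _ in 0..iters {
            self.dispatch(name, input)?;
        }
        Ok(start.elapsed() / iters)
    }
}

fn main() {
    let tools: Vec<Box<dyn Tool>> = vec![Box::new(Echo), Box::new(WordCount)];
    let registry = Registry::new(tools);

    for name in ["echo", "word_count"] {
        match registry.bench(name, "fix the failing tests", 50) {
            Ok(avg) => println!("{name}: {avg:?} per call"),
            Err(err) => eprintln!("{name}: {err}"),
        }
    }
}
```

A tool that shells out would pay process-spawn cost inside `call`, which is why such tools dominate a chart like this even when dispatch itself is nearly free.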
Full benchmark methodology, per-test breakdowns, and reproduction steps: Benchmark Report
vs General Agent Frameworks
Cersei also competes directly with the Python agent stack — Agno, PydanticAI, LangGraph, CrewAI. Every number below is measured on the same Apple M1 Pro via the harness at bench/general-agents/. Methodology mirrors Agno's own performance cookbook: real agent constructors, no LLM invocation, no stub models. Full methodology →
Agent Instantiation
Time in μs to construct one ready-to-use agent with one tool attached (p50 over 1000 samples):
| Framework | p50 | vs Cersei |
|---|---|---|
| Cersei 0.1.6-patch.2 | 7.12 μs | 1× |
| Agno 2.5.17 | 6.50 μs | 0.9× |
| PydanticAI 1.22.0 | 219 μs | 31× |
| LangGraph 1.1.8 | 5 536 μs | 777× |
| CrewAI 1.14.2 | 28 509 μs | 4 004× |
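A p50 over 1000 samples can be collected with a few lines of standard-library code. The sketch below is one plausible shape for such a measurement, not the harness in bench/general-agents/. `DummyAgent` and `build_agent` are hypothetical placeholders for a real agent constructor, and the percentile uses the nearest-rank method as an assumption.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical agent; building it stands in for a real constructor.
struct DummyAgent {
    name: String,
    tools: Vec<&'static str>,
}

fn build_agent() -> DummyAgent {
    DummyAgent {
        name: String::from("bench-agent"),
        tools: vec!["read"],
    }
}

/// Times `construct` once per sample and returns every measurement.
fn measure<T>(samples: usize, mut construct: impl FnMut() -> T) -> Vec<Duration> {
    (0..samples)
        .map(|_| {
            let start = Instant::now();
            let value = construct();
            let elapsed = start.elapsed();
            // Keep the optimizer from discarding the constructed value.
            black_box(value);
            elapsed
        })
        .collect()
}

/// Nearest-rank percentile; `p` is a fraction in 0.0..=1.0.
/// Sorts the slice in place and returns None when it is empty.
fn percentile(samples: &mut [Duration], p: f64) -> Option<Duration> {
    if samples.is_empty() {
        return None;
    }
    samples.sort_unstable();
    let rank = ((samples.len() as f64) * p).ceil() as usize;
    Some(samples[rank.clamp(1, samples.len()) - 1])
}

fn main() {
    let mut samples = measure(1000, build_agent);
    let p50 = percentile(&mut samples, 0.50).expect("at least one sample");

    let agent = build_agent();
    println!(
        "agent `{}` with {} tool(s): p50 construct time {:?}",
        agent.name,
        agent.tools.len(),
        p50
    );
}
```

Reporting the median rather than the mean keeps a few slow outliers (page faults, scheduler noise) from skewing a microsecond-scale number.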
Per-Agent Memory
Bytes per agent held live. Cersei uses jemalloc::stats::allocated; Python uses tracemalloc — both count real bytes allocated by framework code:
| Framework | Per-agent | vs Cersei |
|---|---|---|
| Cersei | 704 B | 1× |
| Agno | 5.8 KiB | 8.4× |
| PydanticAI | 8.7 KiB | 12.6× |
| CrewAI | 17.7 KiB | 25.8× |
| LangGraph | 30.2 KiB | 44× |
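Per-object byte counts of this kind can be approximated without jemalloc by wrapping the system allocator. The sketch below is a standard-library substitute for the measurement named above, not the project's actual method: `CountingAlloc`, `bytes_per_item`, and the `Agent` struct are hypothetical, and the counter tracks requested bytes only, so allocator metadata and padding are not included.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Live heap bytes requested through the global allocator.
static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);

/// Wraps the system allocator and counts requested bytes.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            LIVE_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        LIVE_BYTES.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Hypothetical agent with some heap-backed state.
#[allow(dead_code)]
struct Agent {
    name: String,
    tools: Vec<String>,
}

/// Builds `count` items, holds them live, and returns the approximate
/// bytes per item: heap growth per item plus the inline size of `T`.
fn bytes_per_item<T>(count: usize, mut build: impl FnMut(usize) -> T) -> usize {
    // Allocate the container first so it is excluded from the heap delta.
    let mut held = Vec::with_capacity(count);
    let before = LIVE_BYTES.load(Ordering::Relaxed);
    for i in 0..count {
        held.push(build(i));
    }
    let after = LIVE_BYTES.load(Ordering::Relaxed);
    drop(held);
    after.saturating_sub(before) / count.max(1) + std::mem::size_of::<T>()
}

fn main() {
    let per_agent = bytes_per_item(1_000, |i| Agent {
        name: format!("agent-{i}"),
        tools: vec![String::from("read")],
    });
    println!("about {per_agent} bytes per agent");
}
```

Allocations made by other threads during the measurement window are counted too, so a measurement like this is most trustworthy when run single-threaded.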
Max Concurrent Agents — The Capacity Story
How many live agents can you build and hold on one host? Cersei sweeps to 10k; the Python frameworks were sampled at N=100 and N=500.
At N=500: Cersei holds them in 8.5 MB / 4.4 ms. CrewAI needs 1 739 MB / 50 697 ms — 204× more memory, 11 500× more wall time, for the same number of agents. Cersei's ramp continues cleanly to 10k agents in 22 MB of total RSS.
Why this matters in production. Agent workloads scale by holding one active instance per customer session, per parallel worker, per retry queue item. Memory per agent sets your ceiling. In a 4 GB process, Cersei fits ≈ 5.7 million agents (at 704 B each) versus ≈ 130 k for LangGraph (at 30.2 KiB each). That's not a rounding difference; it's a fundamentally different cost structure.
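The capacity estimate is a single division, worked through below with the per-agent figures from the table. It assumes a decimal 4 GB budget spent entirely on agent state, which ignores process overhead, allocator fragmentation, and per-session data, so treat the results as upper bounds. The function name `max_agents` is made up for this example.

```rust
/// Number of agents that fit in `budget_bytes` when each one holds
/// `per_agent_bytes` live. Returns None if the per-agent size is zero.
fn max_agents(budget_bytes: u64, per_agent_bytes: u64) -> Option<u64> {
    budget_bytes.checked_div(per_agent_bytes)
}

fn main() {
    // Assumption: 4 GB taken as 4 * 10^9 bytes.
    const BUDGET: u64 = 4_000_000_000;

    // Per-agent figures from the table above.
    let cersei_bytes: u64 = 704;
    let langgraph_bytes = (30.2_f64 * 1024.0).round() as u64; // 30.2 KiB = 30_925 B

    let cersei = max_agents(BUDGET, cersei_bytes).unwrap_or(0);
    let langgraph = max_agents(BUDGET, langgraph_bytes).unwrap_or(0);

    println!("Cersei:    {cersei} agents"); // 5681818
    println!("LangGraph: {langgraph} agents"); // 129345
}
```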
What's Inside
34 Built-in Tools
File I/O, shell execution, web fetch, planning, scheduling, orchestration, skills discovery. Every tool a coding agent needs, with a permission system and bash safety classifier.
Multi-Provider LLM
Anthropic (Claude), OpenAI (GPT), Ollama, Azure, vLLM — any OpenAI-compatible endpoint. Streaming SSE, token counting, prompt caching, extended thinking.
Graph Memory
Three-tier memory: embedded Grafeo graph DB for relationship-aware recall, flat files compatible with Claude Code, and CLAUDE.md hierarchy. Session persistence via append-only JSONL.
Agent Runtime
Builder pattern with 20+ configuration options. Agentic loop with 26-variant event streaming, auto-compact at 90% context, effort levels, bidirectional stream control.
Sub-Agent Orchestration
Spawn parallel workers, coordinate tasks, pass messages between agents. Built-in AgentTool, TaskCreate/Get/Update/Stop/Output, SendMessage, and Worktree isolation.
Hooks and Middleware
Intercept any lifecycle event — pre/post tool use, model turns. Cost gating, audit logging, tool blocking, input modification. Shell hooks for external integrations.
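As an illustration of how pre-tool hooks can block a call or rewrite its input, here is a small self-contained sketch. Every name in it (`ToolCall`, `Decision`, `PreToolHook`, `BlockTools`, `TruncateInput`, `run_hooks`) is hypothetical and chosen for this example; the SDK's actual hook API is documented in the API Reference and may differ.

```rust
/// Hypothetical description of a pending tool invocation.
#[derive(Debug, Clone, PartialEq)]
struct ToolCall {
    tool: String,
    input: String,
}

/// What a hook decides: pass the (possibly modified) call on, or stop it.
#[derive(Debug, PartialEq)]
enum Decision {
    Allow(ToolCall),
    Block(String),
}

/// Hypothetical pre-tool-use hook interface.
trait PreToolHook {
    fn before(&self, call: ToolCall) -> Decision;
}

/// Tool blocking: refuses any tool whose name is on the list.
struct BlockTools(Vec<&'static str>);

impl PreToolHook for BlockTools {
    fn before(&self, call: ToolCall) -> Decision {
        if self.0.iter().any(|blocked| *blocked == call.tool) {
            Decision::Block(format!("tool `{}` is blocked", call.tool))
        } else {
            Decision::Allow(call)
        }
    }
}

/// Input modification: caps the input at a maximum number of characters.
struct TruncateInput(usize);

impl PreToolHook for TruncateInput {
    fn before(&self, mut call: ToolCall) -> Decision {
        call.input = call.input.chars().take(self.0).collect();
        Decision::Allow(call)
    }
}

/// Runs hooks in order; the first Block short-circuits the chain.
fn run_hooks(hooks: &[Box<dyn PreToolHook>], mut call: ToolCall) -> Decision {
    for hook in hooks {
        match hook.before(call) {
            Decision::Allow(next) => call = next,
            blocked => return blocked,
        }
    }
    Decision::Allow(call)
}

fn main() {
    let hooks: Vec<Box<dyn PreToolHook>> = vec![
        Box::new(BlockTools(vec!["bash"])),
        Box::new(TruncateInput(16)),
    ];

    for (tool, input) in [("read", "src/main.rs"), ("bash", "rm -rf target")] {
        let call = ToolCall {
            tool: tool.to_string(),
            input: input.to_string(),
        };
        println!("{:?}", run_hooks(&hooks, call));
    }
}
```

Passing the call by value through each hook means a later hook always sees the modifications made by earlier ones, and a blocked call never reaches the tool.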
Quick Start
Anthropic:

    use cersei::prelude::*;

    #[tokio::main]
    async fn main() -> anyhow::Result<()> {
        let output = Agent::builder()
            .provider(Anthropic::from_env()?) // reads ANTHROPIC_API_KEY
            .tools(cersei::tools::coding())
            .permission_policy(AllowAll)
            .run_with("What files are in the current directory?")
            .await?;
        println!("{}", output.text());
        Ok(())
    }

OpenAI:

    use cersei::prelude::*;

    #[tokio::main]
    async fn main() -> anyhow::Result<()> {
        let output = Agent::builder()
            .provider(OpenAi::from_env()?) // reads OPENAI_API_KEY
            .model("gpt-4o")
            .tools(cersei::tools::coding())
            .permission_policy(AllowAll)
            .run_with("Explain the architecture of this project")
            .await?;
        println!("{}", output.text());
        Ok(())
    }

Ollama (local, through its OpenAI-compatible endpoint):

    use cersei::prelude::*;

    #[tokio::main]
    async fn main() -> anyhow::Result<()> {
        let provider = OpenAi::builder()
            .base_url("http://localhost:11434/v1")
            .model("llama3.1:70b")
            .api_key("ollama")
            .build()?;
        let output = Agent::builder()
            .provider(provider)
            .tools(cersei::tools::coding())
            .permission_policy(AllowAll)
            .run_with("Refactor this function for readability")
            .await?;
        println!("{}", output.text());
        Ok(())
    }

With Streaming
    // Fragment: runs inside an async fn that returns a Result,
    // with cersei::prelude::* in scope.
    let agent = Agent::builder()
        .provider(Anthropic::from_env()?)
        .tools(cersei::tools::coding())
        .permission_policy(AllowAll)
        .build()?;

    let mut stream = agent.run_stream("Fix the failing tests");
    while let Some(event) = stream.next().await {
        match event {
            // Print model text as it arrives.
            AgentEvent::TextDelta(t) => print!("{t}"),
            AgentEvent::ToolStart { name, .. } => eprintln!("\n[{name}]"),
            AgentEvent::ToolEnd { name, duration, .. } => {
                eprintln!("[{name} done in {}ms]", duration.as_millis());
            }
            // Final event: summarize the run and stop reading.
            AgentEvent::Complete(output) => {
                eprintln!("\n--- {} turns, {} tool calls ---", output.turns, output.tool_calls.len());
                break;
            }
            _ => {}
        }
    }

With Graph Memory
    use std::path::Path;
    use cersei::memory::manager::MemoryManager;

    // Fragment: project_root is supplied by the caller, and the `?`
    // requires an enclosing function that returns a Result.
    let mm = MemoryManager::new(project_root)
        .with_graph(Path::new("./agent.grafeo"))?;

    // Store facts the agent discovers
    let id = mm.store_memory("User prefers functional patterns", MemoryType::User, 0.9);
    mm.tag_memory(&id.unwrap(), "coding-style");

    // Later: recall in 98 microseconds, no LLM call
    let results = mm.recall("coding style", 5);

Install
    [dependencies]
    cersei = { git = "https://github.com/pacifio/cersei" }
    tokio = { version = "1", features = ["full"] }
    anyhow = "1"

For graph memory:

    cersei-memory = { git = "https://github.com/pacifio/cersei", features = ["graph"] }

For the Abstract CLI (complete coding agent), a one-line install on macOS or Linux detects the OS, installs Rust if needed, builds from source, and drops the binary on your PATH:

    curl -fsSL https://cersei.pacifio.dev/install-abstract.sh | bash

Or install directly via cargo if you already have a Rust toolchain:

    cargo install --git https://github.com/pacifio/cersei abstract-cli

Explore
Quick Start
First agent in 10 lines. Streaming, multi-provider, custom tools.
API Reference
Agent builder, provider trait, tool system, memory, hooks, MCP.
Architecture
Crate map, dependency flow, data flow through the agentic loop.
Cookbooks
Database tools, HTTP APIs, deploy pipelines, embedding in Tauri/Actix/WebSocket.
Abstract CLI
Complete coding agent CLI. REPL, graph memory, 34 tools, slash commands.
Benchmarks
Abstract vs Claude Code vs Codex. Startup, memory, throughput, graph recall.