
Agent (cersei-agent)

Agent builder, agentic loop, streaming events, auto-compact, sub-agents, coordinator mode.

cersei-agent

The high-level Agent API. Builder pattern for configuration, an agentic loop that handles tool dispatch and multi-turn conversations, a 26-variant event system for observation, and automatic context management.

Most users interact with Cersei through this crate. The cersei facade re-exports Agent, AgentBuilder, AgentOutput, AgentEvent, and AgentStream directly.


Agent Builder

Every agent starts with a builder. The only required field is .provider() — everything else has sensible defaults.

let agent = Agent::builder()
    .provider(Anthropic::from_env()?)
    .tools(cersei::tools::coding())
    .system_prompt("You are a coding assistant.")
    .model("claude-sonnet-4-6")
    .max_turns(10)
    .max_tokens(16384)
    .permission_policy(AllowAll)
    .build()?;

Builder Methods

.build() returns Result<Agent>. It can fail if the provider requires authentication that isn't configured.


Execution Modes

One-Shot (shorthand)

The simplest way to run an agent — builds, executes, and returns in one call:

let output = Agent::builder()
    .provider(Anthropic::from_env()?)
    .tools(cersei::tools::coding())
    .run_with("Fix the failing tests")
    .await?;

println!("{}", output.text());
println!("Turns: {}, Tool calls: {}", output.turns, output.tool_calls.len());

Blocking (reusable agent)

Build once, run multiple times. The agent maintains conversation history across calls:

let agent = Agent::builder()
    .provider(Anthropic::from_env()?)
    .tools(cersei::tools::coding())
    .build()?;

// First turn
let output1 = agent.run("What files are in src/?").await?;

// Second turn — the agent remembers the first
let output2 = agent.run("Now fix the bug in main.rs").await?;

Streaming (events in real-time)

run_stream() returns an AgentStream that yields events as they happen. The stream supports bidirectional control: you can respond to permission requests, inject messages, or cancel mid-stream.

let mut stream = agent.run_stream("Deploy the application");

while let Some(event) = stream.next().await {
    match event {
        AgentEvent::TextDelta(t) => print!("{t}"),
        AgentEvent::ThinkingDelta(t) => { /* thinking content */ }
        AgentEvent::ToolStart { name, input, .. } => {
            eprintln!("\n[Tool: {name}]");
        }
        AgentEvent::ToolEnd { name, duration, is_error, .. } => {
            let status = if is_error { "FAIL" } else { "OK" };
            eprintln!("[{name}: {status} in {}ms]", duration.as_millis());
        }
        AgentEvent::PermissionRequired(req) => {
            // Interactive approval
            stream.respond_permission(req.id, PermissionDecision::Allow);
        }
        AgentEvent::Complete(output) => {
            eprintln!("\nDone: {} turns", output.turns);
            break;
        }
        AgentEvent::Error(msg) => {
            eprintln!("Error: {msg}");
            break;
        }
        _ => {}
    }
}

AgentOutput

Returned by run(), run_with(), and AgentStream::collect().

pub struct AgentOutput {
    pub message: Message,
    pub usage: Usage,
    pub stop_reason: StopReason,
    pub turns: u32,
    pub tool_calls: Vec<ToolCallRecord>,
}

// Access the text response
let text = output.text();  // -> &str

// Inspect tool calls
for call in &output.tool_calls {
    println!("{}: {}ms (error: {})", call.name, call.duration.as_millis(), call.is_error);
}

AgentEvent

26 variants covering every observable moment in the agentic loop:

Content Events

Variant               | Fields         | Emitted When
----------------------|----------------|--------------------------------------------------
TextDelta(String)     | text chunk     | Model streams a text token
ThinkingDelta(String) | thinking chunk | Model streams a thinking token (extended thinking)

Tool Events

Variant                                          | Fields                                                       | Emitted When
-------------------------------------------------|--------------------------------------------------------------|--------------------------
ToolStart { name, id, input }                    | tool name, call ID, JSON input                               | Tool dispatch begins
ToolEnd { name, id, result, is_error, duration } | tool name, call ID, result text, error flag, wall-clock time | Tool execution completes

Lifecycle Events

Variant                                     | Fields                               | Emitted When
--------------------------------------------|--------------------------------------|--------------------------------------
TurnComplete { turn, usage }                | turn number, token usage             | One model call + tool cycle finishes
TokenWarning { pct_used, state }            | context % used, warning/critical     | Context window approaching limit
CompactStart { reason }                     | threshold/manual/overflow            | Context compaction begins
CompactEnd { messages_after, tokens_freed } | remaining messages, tokens reclaimed | Compaction finishes
SessionLoaded { session_id, message_count } | session ID, number of messages       | Session resumed from memory

Control Events

Variant                                                                | Fields                         | Emitted When
-----------------------------------------------------------------------|--------------------------------|------------------------------------------
PermissionRequired(PermissionRequest)                                  | tool name, description, level  | Tool needs approval (interactive policy)
CostUpdate { turn_cost, cumulative_cost, input_tokens, output_tokens } | costs and token counts         | After each model call
SubAgentSpawned { agent_id, prompt }                                   | sub-agent ID, task description | Sub-agent created
SubAgentComplete { agent_id, result }                                  | sub-agent ID, output           | Sub-agent finished

Terminal Events

Variant               | Fields         | Emitted When
----------------------|----------------|----------------------------------
Status(String)        | status message | Informational update
Error(String)         | error message  | Unrecoverable error
Complete(AgentOutput) | final output   | Agent loop finished successfully

AgentStream

Bidirectional control channel. Receive events and send commands back.


Effort Levels

Control thinking depth and temperature via a single setting:

use cersei_agent::effort::EffortLevel;

let effort = EffortLevel::from_str("max");
let budget = effort.thinking_budget_tokens();  // 32768
let temp = effort.temperature();               // Some(1.0)


Auto-Compact

When the conversation approaches the context window limit, the agent automatically summarizes older messages to free space. This happens transparently — the model continues working without interruption.

Agent::builder()
    .auto_compact(true)           // enable
    .compact_threshold(0.9)       // trigger at 90% of context window
    .tool_result_budget(50_000)   // also truncate oldest tool results above 50K chars

The compaction pipeline:

  1. Count tokens in the current conversation
  2. If above threshold, group old messages by topic
  3. Call the LLM to summarize each group
  4. Replace original messages with summaries
  5. Free tool results above the budget

Events emitted: CompactStart, CompactEnd (with messages_after and tokens_freed).


System Prompt Caching

The system prompt is split into two sections separated by __SYSTEM_PROMPT_DYNAMIC_BOUNDARY__:

[Static section — cached by the provider]
__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__
[Dynamic section — rebuilt each turn]

Agent::builder()
    .system_prompt("You are a coding assistant. Always use Rust.")  // static, cached
    .append_system_prompt("Current time: 2024-01-01")               // dynamic, per-turn

Anthropic's prompt caching covers everything before the boundary, saving tokens and latency on multi-turn conversations.
