
Benchmarks: Abstract CLI

Abstract CLI overhead — startup, RSS, subcommands, end-to-end latency.


These benchmarks measure framework overhead, not model speed. Measured on Apple Silicon with a release build.

Startup

| Metric | Abstract | Notes |
| --- | --- | --- |
| `--version` (avg, 50 runs) | 32ms | Includes Python subprocess overhead (~20ms) |
| `--version` (native) | ~8ms | Direct wall-clock |
| `--help` | 21ms | |
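The 50-run average can be reproduced with a plain wall-clock loop. A minimal sketch, assuming a POSIX shell and `python3` as a millisecond clock; `/bin/echo` stands in for the Abstract binary so the sketch runs anywhere (point `ABSTRACT_BIN` at the real release binary to reproduce the numbers above):

```shell
# Cold-start benchmark sketch. BIN is a stand-in, not the real binary.
BIN=${ABSTRACT_BIN:-/bin/echo}
N=50
start=$(python3 -c 'import time; print(int(time.time()*1000))')
i=0
while [ "$i" -lt "$N" ]; do
  "$BIN" --version > /dev/null 2>&1
  i=$((i + 1))
done
end=$(python3 -c 'import time; print(int(time.time()*1000))')
avg=$(( (end - start) / N ))
echo "avg over $N runs: ${avg}ms"
```

Timing the whole loop once, rather than each invocation, keeps the clock's own overhead out of the per-run average.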

Binary & Memory

| Metric | Value |
| --- | --- |
| Binary size | 6.0 MB |
| Peak RSS (`--help`) | 4.9 MB |
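Binary size is straightforward to check; peak RSS can be read from BSD `time`'s resource report on macOS (the `-l` flag; GNU time uses `-v` instead). A hedged sketch, again with `/bin/echo` as a stand-in binary:

```shell
# Size + peak-RSS sketch. BIN is a stand-in for the Abstract binary.
BIN=${ABSTRACT_BIN:-/bin/echo}
size=$(wc -c < "$BIN")
echo "binary size: ${size} bytes"
# macOS/BSD: -l prints "maximum resident set size" (in bytes).
# Linux/GNU time: use -v and read "Maximum resident set size (kbytes)".
/usr/bin/time -l "$BIN" --help 2>&1 | grep -i "resident set size" || true
```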

Subcommand Latency

20 iterations each:

| Subcommand | Avg |
| --- | --- |
| `--help` | 21ms |
| `sessions list` | 21ms |
| `config show` | 20ms |
| `mcp list` | 21ms |
| `memory show` | 22ms |

All subcommands complete in ~21ms, so process startup dominates; the operation itself adds under 1ms.

End-to-End Agentic Latency

Full round-trip: CLI startup + config + system prompt + API call + streaming + rendering.

Using OpenAI gpt-4o:

| Test | Avg (5 runs) |
| --- | --- |
| Simple response ("say OK") | 1078ms |
| Multi-word response | 1094ms |
| JSON output mode | 1004ms |

Estimated overhead breakdown:

| Phase | Time |
| --- | --- |
| Process startup | ~8ms |
| Config + memory context | ~2ms |
| System prompt assembly | ~1ms |
| Tool definition serialization | ~1ms |
| HTTP connection + TLS | ~50-100ms |
| Framework overhead (total) | ~60-110ms |

Everything else is network + model inference.
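The connection + TLS line can be sanity-checked independently with curl's timing variables (`time_connect` covers the TCP handshake, `time_appconnect` the end of the TLS handshake). The endpoint is illustrative, not necessarily what the benchmark hits; the request returns 401 without an API key, but the handshake timings are still valid:

```shell
# TLS handshake timing sketch; falls back to "offline" without a network.
out=$(curl -s -o /dev/null \
  -w 'tcp: %{time_connect}s  tls: %{time_appconnect}s  total: %{time_total}s' \
  https://api.openai.com/v1/models 2>/dev/null || echo "offline")
echo "$out"
```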

Sequential Throughput

10 consecutive prompts, compared across all three tools:

| Tool | Total | Per-request |
| --- | --- | --- |
| Abstract | 15637ms | 1564ms/req |
| Codex CLI | 41518ms | 4152ms/req |
| Claude Code | 120787ms | 12079ms/req |

Abstract processes 10 prompts in 16 seconds. Codex takes 42 seconds. Claude Code takes 2 minutes.
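The sequential run is just a loop of one-shot prompts. A sketch under assumptions: the `-p` prompt flag here is hypothetical (check the actual CLI invocation), and `/bin/echo` again stands in for the binary:

```shell
# Sequential-throughput sketch; BIN and the -p flag are assumptions.
BIN=${ABSTRACT_BIN:-/bin/echo}
REQS=10
start=$(python3 -c 'import time; print(int(time.time()*1000))')
i=0
while [ "$i" -lt "$REQS" ]; do
  "$BIN" -p "say OK" > /dev/null 2>&1
  i=$((i + 1))
done
end=$(python3 -c 'import time; print(int(time.time()*1000))')
total=$((end - start))
echo "total: ${total}ms  per-request: $((total / REQS))ms/req"
```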

Run

```shell
./run_tool_bench_claude.sh --iterations 20 --full
./run_tool_bench_codex.sh --iterations 20 --full
```
