Numbers for what Springtale costs to run. Everything here is
measured against the workspace's reference benchmarks
(crates/springtale-cooperation/benches/) and a typical
laptop-class machine (Apple M-series / x86_64 with 16 GB RAM and
NVMe SSD). Your numbers will vary; the order of magnitude shouldn't.
For the deep dive on cooperation-specific perf,
docs/benchmarks/cooperation-scaling.md
has the actual cargo-criterion output.
| Resource | Idle | Active (typical) | Stressed |
|---|---|---|---|
| RAM | 30 MB | 150 MB | 500 MB |
| CPU | <1% | 5% sustained | 50% bursty |
| Disk | 2 MB binary | + vault + db | + audit log |
| Cold start | 200–500 ms | — | — |
| HTTP request latency | — | <5ms p99 (local) | — |
| Cooperation tick | — | <10ms p99 | <50ms p99 |
| Sentinel verdict | — | <500µs p99 | <2ms p99 |
"Active typical" means a handful of formations, 5 connectors, a few hundred actions per hour. "Stressed" means tens of formations, sustained webhook firehose, AI tool calls under load.
| Component | Size |
|---|---|
| springtaled (release, stripped) | ~12 MB |
| springtale-cli (release, stripped) | ~6 MB |
| Tauri desktop bundle (.dmg / .deb / .msi) | ~25 MB |
The static-musl build (nix build .#springtaled-static) is ~15 MB,
slightly larger because we statically link everything.
By comparison, an Electron-equivalent desktop app would be 150–250 MB. A client built directly on the Anthropic or OpenAI SDKs would carry the same crypto, TLS, and HTTP cost as ours while offering fewer safety guarantees.
A fresh springtaled boot with no connectors, no rules, no
formations uses about 30 MB resident.
Breakdown (approximate, via heaptrack):
| Component | RAM |
|---|---|
| tokio runtime + executor | ~8 MB |
| rustls + root certs | ~3 MB |
| wasmtime engine + cranelift | ~10 MB |
| rusqlite + SQLite3MultipleCiphers | ~4 MB |
| Everything else | ~5 MB |
The bulk is wasmtime's engine. Compiling a fresh WASM connector adds ~5 MB per module.
A formation costs about 0.5–2 MB resident depending on:
- Blackboard size (grows linearly with writes; bounded by config).
- Mental model contents.
- Number of members.
For 100 active formations: ~150 MB. The cooperation-scaling benchmarks show linear scaling up to ~1000 formations on a 16 GB machine.
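The per-formation figure above turns into a quick back-of-envelope range. This is a minimal sketch, assuming the 30 MB idle baseline and the 0.5–2 MB per-formation cost add up linearly; `estimate_formation_ram_mb` is a hypothetical helper, not part of Springtale, and it ignores connectors and any other load-dependent memory.

```rust
/// Resident memory with no formations, in MB (the idle figure quoted in this document).
const IDLE_BASELINE_MB: f64 = 30.0;
/// Per-formation cost range in MB (0.5–2 MB, from the text above).
const FORMATION_MIN_MB: f64 = 0.5;
const FORMATION_MAX_MB: f64 = 2.0;

/// Returns a (low, high) estimate of resident memory in MB for `formations`
/// active formations. Hypothetical helper: assumes purely linear growth and
/// counts nothing except the idle baseline and the formations themselves.
pub fn estimate_formation_ram_mb(formations: u32) -> (f64, f64) {
    let n = f64::from(formations);
    (
        IDLE_BASELINE_MB + n * FORMATION_MIN_MB,
        IDLE_BASELINE_MB + n * FORMATION_MAX_MB,
    )
}
```

For 100 formations this gives a range of 80–230 MB, which brackets the ~150 MB quoted above. Measured totals under real load can sit outside the range because connectors and WASM modules are not counted.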
Native connectors: <1 MB each — they're just Rust structs.
WASM connectors: each instance allocates a 64 MB sandbox (configurable
via [connector.wasm] memory_limit_pages). Actual usage is bounded
by what the WASM code allocates. A simple webhook connector might
use ~5 MB resident.
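If you want to change the sandbox size, the limit is expressed in pages rather than megabytes. The sketch below converts between the two, assuming memory_limit_pages counts standard 64 KiB WebAssembly pages and that "64 MB" means 64 MiB; both helper names are hypothetical.

```rust
/// Size of one WebAssembly linear-memory page: 64 KiB, fixed by the WASM spec.
const WASM_PAGE_BYTES: u64 = 64 * 1024;
const BYTES_PER_MIB: u64 = 1024 * 1024;

/// Number of whole WASM pages that fit in `mib` mebibytes.
/// Assumption: the config value counts standard 64 KiB pages.
pub fn pages_for_mib(mib: u64) -> u64 {
    mib * BYTES_PER_MIB / WASM_PAGE_BYTES
}

/// Sandbox ceiling in whole mebibytes for a given page limit (rounds down).
pub fn mib_for_pages(pages: u64) -> u64 {
    pages * WASM_PAGE_BYTES / BYTES_PER_MIB
}
```

Under those assumptions the default 64 MB sandbox corresponds to 1024 pages, and halving the ceiling to 32 MB would mean a limit of 512.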
CPU usage at idle is dominated by tokio's reactor wakeups (~1 context switch per second from the heartbeat) and the wasmtime epoch ticker (1Hz). Total: <1% on a single core.
A 14-step formation tick takes 1–10 ms on a typical machine. The expensive steps are:
| Step | Typical cost |
|---|---|
| 1 — per-agent loop | 0.5–2 ms per agent |
| 4 — momentum update | <100 µs |
| 5 — momentum persist (SQLite) | 1–3 ms |
| 13 — mental model update | 0.5–2 ms |
| 14 — orchestrate (AI) | 100ms–10s per call |
A formation at Cold/Warming tier (no AI calls) ticks in <5 ms total. A Fever-tier formation calling the AI adapter per tick is dominated by AI latency.
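The step costs in the table can be summed into a rough per-tick range. This is a sketch under stated assumptions: only the five listed steps are counted, the other nine are treated as negligible, "<100 µs" is taken as 0–100 µs, and `estimate_tick_us` and `RangeUs` are hypothetical names.

```rust
/// A low/high estimate in microseconds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RangeUs {
    pub low: u64,
    pub high: u64,
}

/// Sums the per-step costs from the table above for one tick.
/// `agents` scales step 1; `ai_calls` scales step 14 (0 for Cold/Warming tiers).
/// Assumption: the steps not listed in the table cost nothing.
pub fn estimate_tick_us(agents: u64, ai_calls: u64) -> RangeUs {
    let low = agents * 500      // step 1: 0.5 ms per agent
        + 0                     // step 4: momentum update, under 100 µs
        + 1_000                 // step 5: momentum persist (SQLite)
        + 500                   // step 13: mental model update
        + ai_calls * 100_000;   // step 14: 100 ms per AI call
    let high = agents * 2_000   // step 1: 2 ms per agent
        + 100                   // step 4
        + 3_000                 // step 5
        + 2_000                 // step 13
        + ai_calls * 10_000_000; // step 14: 10 s per AI call
    RangeUs { low, high }
}
```

A single-agent formation with no AI call lands at 2–7.1 ms, in line with the 1–10 ms quoted above. Adding one AI call pushes the range to roughly 0.1–10 s, which is why Fever-tier ticks are dominated by adapter latency.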
Sentinel verdict path: <500 µs p99. The four checks (circuit-breaker, rate-limit, dead-man, impact-classify + approval) are all in-memory.
Connector dispatch is dominated by the connector's actual work (network I/O, file I/O, AI call). Pre-dispatch overhead is <1ms.
Cold compile of a fresh WASM module: 50–200 ms (cranelift's job). Subsequent invocations against the same module: <1 ms overhead from the sandbox itself.
The G-series WASM tier cache (crates/springtale-connector/src/wasm/tier/cache.rs)
caches compiled modules so the cold compile happens once.
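The compile-once behaviour can be illustrated with a small in-memory cache. This is not the code in cache.rs; it is a generic sketch of the pattern, with `ModuleCache`, `CompiledModule`, and the string key all hypothetical, and with a placeholder standing in for the real compile step.

```rust
use std::collections::HashMap;

/// Stand-in for a compiled module. A real cache would hold the engine's
/// compiled artifact; here we only record the input size.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CompiledModule {
    pub byte_len: usize,
}

/// Hypothetical cache keyed by a caller-chosen module identifier.
pub struct ModuleCache {
    compiled: HashMap<String, CompiledModule>,
    /// How many times the (expensive) compile path actually ran.
    pub compile_count: usize,
}

impl ModuleCache {
    pub fn new() -> Self {
        ModuleCache { compiled: HashMap::new(), compile_count: 0 }
    }

    /// Returns the cached module for `key`, compiling only on the first request.
    pub fn get_or_compile(&mut self, key: &str, wasm_bytes: &[u8]) -> &CompiledModule {
        if !self.compiled.contains_key(key) {
            // Cold path: this is where the 50–200 ms compile would happen.
            self.compile_count += 1;
            let module = CompiledModule { byte_len: wasm_bytes.len() };
            self.compiled.insert(key.to_string(), module);
        }
        // Warm path: a plain map lookup.
        &self.compiled[key]
    }
}
```

Requesting the same key twice increments `compile_count` only once. A production cache would more likely key on a content hash of the module bytes so that a changed module is recompiled.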
/usr/local/bin/springtaled: ~12 MB.
| File | Typical size | Growth driver |
|---|---|---|
| vault.bin | 131,152 bytes (constant) | — |
| springtale.db | 1–100 MB | event volume, audit retention |
| springtale.db-wal | <10 MB transient | write rate |
| audit.log | varies | rotation policy |
| connectors/ | 5–50 MB | number of WASM connectors |
| api_token | <100 bytes | — |
Under typical load (5 connectors, 100 actions/hour, 90-day audit retention):
- audit_trail: ~10 MB / month
- events: ~5 MB / month (with rolling purge)
- bot_memory: ~1 MB / month per active bot session
- mental_model_*: ~100 KB / month per active formation
- formation_momentum: ~50 KB / month per active formation
The DB grows roughly 50–100 MB / month under that profile. Heavy moderation bots can hit 500 MB / month.
Configure retention to bound growth. See
docs/operations/database.md.
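To see how the per-table rates combine for your own fleet, a rough sum is enough. The sketch below assumes the typical-load rates listed above, treats 1 MB as 1024 KB, and covers only the tables itemised there, so treat the result as a floor rather than a total; `LoadProfile` and `monthly_growth_kb` are hypothetical names.

```rust
/// How many sessions and formations are active. Hypothetical type.
pub struct LoadProfile {
    pub bot_sessions: u64,
    pub formations: u64,
}

/// Approximate monthly growth in KB for the itemised tables only.
/// Rates are the typical-load figures from this document.
pub fn monthly_growth_kb(profile: &LoadProfile) -> u64 {
    const AUDIT_TRAIL_KB: u64 = 10 * 1024; // ~10 MB / month
    const EVENTS_KB: u64 = 5 * 1024;       // ~5 MB / month, with rolling purge
    const BOT_MEMORY_KB: u64 = 1024;       // ~1 MB / month per bot session
    const MENTAL_MODEL_KB: u64 = 100;      // ~100 KB / month per formation
    const MOMENTUM_KB: u64 = 50;           // ~50 KB / month per formation

    AUDIT_TRAIL_KB
        + EVENTS_KB
        + profile.bot_sessions * BOT_MEMORY_KB
        + profile.formations * (MENTAL_MODEL_KB + MOMENTUM_KB)
}
```

With 2 bot sessions and 10 formations this comes to 18,908 KB, or about 18.5 MB per month from these tables alone.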
Zero outbound traffic at idle. We do not phone home, and you can verify this with tcpdump.
Dispatching an action makes one outbound HTTPS call (to the
connector's target service). The TLS handshake is amortised via
reqwest's connection pool — typically <100 ms for a fresh
endpoint, <10 ms for a reused connection.
A webhook delivery is ~5 ms of CPU on the daemon side: HMAC verify, dispatch decision, queue insertion. Doesn't block subsequent requests.
Detailed in docs/benchmarks/cooperation-scaling.md;
summary:
| Formations | RAM | CPU (idle) | Tick latency p99 |
|---|---|---|---|
| 10 | 80 MB | <1% | 5 ms |
| 100 | 150 MB | 2% | 10 ms |
| 500 | 400 MB | 8% | 25 ms |
| 1000 | 700 MB | 15% | 50 ms |
| 5000 | 3 GB | 40% | 200 ms |
Beyond 5000 formations, the cooperation tick begins to fall behind
the cadence bus (default 1Hz). For very large fleets, either lower
the cadence rate ([cooperation] cadence_interval_secs = 5) or
split across multiple daemons (cross-daemon coordination is a
Phase 3 / Veilid concern).
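For fleet sizes between the measured rows, linear interpolation over the table gives a usable first guess. This is a sketch, not a model of the daemon: it assumes growth is linear between adjacent rows, takes 3 GB as 3000 MB, and refuses to extrapolate outside the measured range. `interpolate_scaling` is a hypothetical name.

```rust
/// (formations, ram_mb, tick_p99_ms) rows copied from the table above.
/// Assumption: the 3 GB entry is taken as 3000 MB.
const SCALING_ROWS: [(u32, u32, u32); 5] = [
    (10, 80, 5),
    (100, 150, 10),
    (500, 400, 25),
    (1000, 700, 50),
    (5000, 3000, 200),
];

/// Linearly interpolates (ram_mb, tick_p99_ms) for `formations`.
/// Returns None outside the measured 10–5000 range.
pub fn interpolate_scaling(formations: u32) -> Option<(f64, f64)> {
    let n = f64::from(formations);
    for pair in SCALING_ROWS.windows(2) {
        let (f0, r0, t0) = pair[0];
        let (f1, r1, t1) = pair[1];
        if formations >= f0 && formations <= f1 {
            // Position of `formations` between the two rows, from 0.0 to 1.0.
            let frac = (n - f64::from(f0)) / f64::from(f1 - f0);
            let ram = f64::from(r0) + frac * f64::from(r1 - r0);
            let tick = f64::from(t0) + frac * f64::from(t1 - t0);
            return Some((ram, tick));
        }
    }
    None
}
```

For 750 formations this yields about 550 MB and a 37.5 ms p99 tick. Treat the output as a planning number and confirm it with the benchmarks on your own hardware.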
| Adapter | First-token | Total (100 tokens) |
|---|---|---|
| OllamaAdapter (local llama3.1:8b) | 200 ms | 5–15 s |
| AnthropicAdapter (Claude Sonnet 4.6) | 400 ms | 2–4 s |
| OpenAiCompatAdapter (gpt-4) | 500 ms | 3–5 s |
| NoopAdapter | <100 µs | <100 µs |
For tool-calling flows, multiply by max_tool_iterations.
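As a worked version of that multiplication, here is a minimal sketch assuming every iteration costs about the same as a single call; `worst_case_tool_latency` is a hypothetical helper, and only the max_tool_iterations name comes from the configuration mentioned above.

```rust
use std::time::Duration;

/// Upper bound on a tool-calling flow: one adapter round trip per iteration.
/// Assumption: each iteration takes roughly `per_call`; real flows also pay
/// for the tool's own execution time, which is not counted here.
pub fn worst_case_tool_latency(per_call: Duration, max_tool_iterations: u32) -> Duration {
    // saturating_mul avoids a panic if the product overflows Duration.
    per_call.saturating_mul(max_tool_iterations)
}
```

With a 4 s adapter call and max_tool_iterations set to 5, the flow can take up to 20 s before the tools' own work is added.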
| Phase | Time |
|---|---|
| Process start → tokio runtime | 20 ms |
| Vault unlock (Argon2id KDF) | 50–200 ms |
| Database open + schema apply | 50–100 ms |
| Connector registry load | 20 ms per connector |
| HTTP server listening | 50 ms |
| Total cold start | 200–500 ms |
Argon2id KDF is the dominant cost. We tuned its parameters (64 MB memory, 3 iterations, parallelism 4) for ~200 ms on a typical machine: fast enough to feel snappy, slow enough to make brute-force prohibitive.
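The phase table can be summed to see how the connector count moves the total. This is a rough sketch that assumes the phases run strictly one after another and that connector loading is the only part that scales; `cold_start_ms` is a hypothetical helper.

```rust
/// Returns a (low, high) cold-start estimate in milliseconds, summing the
/// phase costs from the table above. Assumption: phases are sequential.
pub fn cold_start_ms(connectors: u64) -> (u64, u64) {
    let runtime = 20;                 // process start to tokio runtime
    let vault = (50, 200);            // vault unlock (Argon2id KDF)
    let database = (50, 100);         // database open + schema apply
    let registry = 20 * connectors;   // 20 ms per connector
    let http = 50;                    // HTTP server listening

    (
        runtime + vault.0 + database.0 + registry + http,
        runtime + vault.1 + database.1 + registry + http,
    )
}
```

With 5 connectors the estimate is 270–470 ms, inside the 200–500 ms total. Each additional connector adds about 20 ms at both ends of the range.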
# Cold-start time:
time (springtale-cli init && springtale-cli server start --once)
# Memory under load (resident set size, reported in KiB):
ps -p $(pgrep springtaled) -o rss
# Per-tick latency:
springtale-cli trace --formation <id> | grep "tick_completed"
# Sentinel verdict latency:
RUST_LOG=springtale_sentinel=debug springtaled 2>&1 | grep "verdict_emitted"
# Cooperation benchmarks (your machine):
cargo bench -p springtale-cooperation

For deeper analysis:
# Flame graph of the daemon under load:
cargo flamegraph --bin springtaled
# Memory profiling:
heaptrack ./target/release/springtaled
# Async task tracing (tokio-console):
TOKIO_CONSOLE_BIND=127.0.0.1:6669 springtaled
tokio-console

The cooperation benches use cargo-criterion for statistical rigour.
Single-run timing can be misleading; run benches three times and
take the median.
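Taking the median of a few runs is simple enough to script. A minimal sketch, assuming you have copied each run's headline timing into a slice by hand; `median` here is a hypothetical helper, not something cargo-criterion provides.

```rust
/// Median of a set of timings. Returns None for an empty slice or if any
/// sample is NaN, since neither has a meaningful median.
pub fn median(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() || samples.iter().any(|x| x.is_nan()) {
        return None;
    }
    let mut sorted = samples.to_vec();
    // total_cmp gives a total order over f64, so sort_by needs no unwrap.
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 1 {
        Some(sorted[mid])
    } else {
        // Even count: average the two middle values.
        Some((sorted[mid - 1] + sorted[mid]) / 2.0)
    }
}
```

For three runs of 12.4 ms, 9.8 ms, and 10.1 ms the median is 10.1 ms, which discards the slow outlier that a mean would absorb.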
Some things we explicitly don't optimise for:
- Sub-millisecond formation ticks. The cadence is 1 Hz by default; aggressive sub-tick logic isn't in the design.
- Tens of thousands of concurrent webhook deliveries. The daemon can absorb bursts up to ~1000/sec but isn't designed as a load-balancing edge — put a real load balancer in front if you need that.
- Multi-process coordination. Pacing, sentinel, audit are all per-daemon. Cross-daemon is a Phase 3 concern.
- Cold-start under 100ms. The Argon2id passphrase derivation deliberately costs 200ms. We're not going to give that up.
Tuning becomes urgent when:
- RAM exceeds 1 GB on a single daemon. Usually a runaway formation that won't dissolve.
- Tick latency exceeds 100 ms consistently. Either too many active formations or an orchestrator stuck in an AI loop.
- Database larger than 1 GB. Audit retention too generous; set a shorter audit_retention_days.
- Cold start over 1s. Vault file corruption; run springtale-cli doctor.
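These thresholds are easy to wire into whatever monitoring you already run. The sketch below assumes you collect the four numbers yourself; `DaemonMetrics` and `tuning_warnings` are hypothetical names, 1 GB is taken as 1024 MB, and a single reading stands in for "consistently".

```rust
/// One snapshot of the four signals listed above. Hypothetical type.
pub struct DaemonMetrics {
    pub rss_mb: u64,
    pub tick_p99_ms: u64,
    pub db_size_mb: u64,
    pub cold_start_ms: u64,
}

/// Returns one message per threshold that the snapshot exceeds.
pub fn tuning_warnings(m: &DaemonMetrics) -> Vec<&'static str> {
    let mut warnings = Vec::new();
    if m.rss_mb > 1024 {
        warnings.push("RAM over 1 GB: look for a runaway formation that won't dissolve");
    }
    if m.tick_p99_ms > 100 {
        warnings.push("tick latency over 100 ms: too many formations or a stuck orchestrator");
    }
    if m.db_size_mb > 1024 {
        warnings.push("database over 1 GB: shorten audit_retention_days");
    }
    if m.cold_start_ms > 1000 {
        warnings.push("cold start over 1 s: run springtale-cli doctor");
    }
    warnings
}
```

A healthy daemon returns an empty list. In practice you would want the tick-latency check to fire only after several consecutive readings, since one slow tick is not a trend.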