Snakream is a durable, replayable stream layer for application state. Each document, session, task, room, or agent run gets its own ordered stream. That stream supports recovery, replay, live updates, snapshots, and eventual expiry. You can think of Snakream as a per-entity durable log runtime. It is purpose-built for this pattern, not a general-purpose event backbone or stream storage primitive.

Durability without the latency tax

Many durable stream systems force a choice: either writes are fast but best-effort, or writes are durable but too slow for interactive use. Snakream is designed so you do not have to choose. Every write is replicated to a majority of nodes before it is acknowledged, and write latency is typically 2-3 ms. Acknowledged data survives any single-node failure. After acknowledgment, a background process flushes data to S3 on a configurable interval. Once flushed, data has S3-grade durability. In a cross-region deployment with a 5-second flush interval, Snakream provides approximately 9-10 nines of per-message durability and 3-4 nines of annual zero-loss probability. Only acknowledged writes that have not yet been flushed are exposed to a simultaneous multi-region failure, so the window where data is at risk is measured in seconds.
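To make the acknowledgment rule concrete, here is a minimal sketch of majority-quorum arithmetic and of the "acknowledged but not yet flushed" window. It is a toy model, not Snakream's implementation: the function names (majority, tolerated_failures, is_acknowledged, unflushed_writes) and the timestamps are assumptions made up for illustration.

```python
# Toy model of majority acknowledgment and the pre-flush window.
# All names here are hypothetical; none of this is Snakream's API.

def majority(n_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return n_nodes // 2 + 1


def tolerated_failures(n_nodes: int) -> int:
    """Nodes that can fail while a majority is still reachable."""
    return n_nodes - majority(n_nodes)


def is_acknowledged(replica_acks: list[bool]) -> bool:
    """A write is acknowledged once a majority of replicas confirm it."""
    return sum(replica_acks) >= majority(len(replica_acks))


def unflushed_writes(ack_times: list[float], last_flush_time: float) -> list[float]:
    """Acknowledged writes that a flush at last_flush_time did not cover."""
    return [t for t in ack_times if t > last_flush_time]


# Three nodes: two confirmations are enough, and one node may fail.
print(majority(3), tolerated_failures(3))
print(is_acknowledged([True, True, False]))
# Writes acknowledged at these times (seconds); the last flush ran at t=5.0.
print(unflushed_writes([1.0, 4.0, 6.5, 9.0], last_flush_time=5.0))
```

In this model, a shorter flush interval shrinks the set returned by unflushed_writes, which is the only data that depends on the replicas alone.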

One stream per entity

Most event systems multiplex many entities into shared channels. Consumers reconstruct per-entity state downstream. Snakream inverts this: each entity owns its own durable timeline. Writers append directly to that stream, and readers resume from its offsets. Buckets group related streams under one namespace. This gives you:
  • Replayable recovery. When a worker, sandbox, or agent restarts, replay the entity’s log and continue.
  • Live tails with simple clients. Watch progress over HTTP with catch-up reads, long-poll, or SSE.
  • Built-in lifecycle. Snapshots, bootstrap, and TTL fit the same timeline model instead of becoming separate infrastructure.
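The per-entity model above can be sketched with a small in-memory stand-in. This is an illustration of the pattern only: ToyBucket, append, read, and replay are hypothetical names invented for this example and do not reflect Snakream's actual client API or its HTTP interface.

```python
# In-memory illustration of "one ordered stream per entity".
# ToyBucket and replay are hypothetical; they are not Snakream's API.
from typing import Any, Callable


class ToyBucket:
    """Groups independent, append-only streams under one namespace."""

    def __init__(self) -> None:
        self._streams: dict[str, list[Any]] = {}

    def append(self, stream_id: str, event: Any) -> int:
        """Append an event to one entity's stream and return its offset."""
        log = self._streams.setdefault(stream_id, [])
        log.append(event)
        return len(log) - 1

    def read(self, stream_id: str, from_offset: int = 0) -> list[tuple[int, Any]]:
        """Return (offset, event) pairs starting at from_offset."""
        log = self._streams.get(stream_id, [])
        return [(i, log[i]) for i in range(from_offset, len(log))]


def replay(
    bucket: ToyBucket,
    stream_id: str,
    apply: Callable[[Any, Any], Any],
    state: Any,
    from_offset: int = 0,
) -> tuple[Any, int]:
    """Fold events into state; return the state and the next offset to read."""
    next_offset = from_offset
    for offset, event in bucket.read(stream_id, from_offset):
        state = apply(state, event)
        next_offset = offset + 1
    return state, next_offset


def add_delta(total: int, event: dict) -> int:
    return total + event["delta"]


bucket = ToyBucket()
for delta in (5, 3, -2):
    bucket.append("task-42", {"delta": delta})
bucket.append("task-99", {"delta": 100})  # a different entity, untouched below

# A restarted worker rebuilds state from the entity's own log.
total, resume_at = replay(bucket, "task-42", add_delta, 0)

# Later it continues from the saved state and offset, snapshot-style,
# instead of replaying from the beginning.
bucket.append("task-42", {"delta": 10})
total, resume_at = replay(bucket, "task-42", add_delta, total, from_offset=resume_at)
print(total, resume_at)
```

Because each entity has its own log, recovery never has to filter a shared channel: the reader needs only a stream identifier and an offset.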

Tradeoffs

Snakream makes one durable timeline per entity cheap and ergonomic. In exchange, it is not trying to be the most general abstraction for cross-system event distribution or arbitrary stream processing pipelines. That tradeoff is intentional: the simpler the per-entity model, the easier it is to recover, inspect, and operate long-running application state.
