How does Turso Cloud keep your data durable and safe?

Glauber Costa

The Turso Cloud is a service that lets users create millions, or even billions, of small SQLite databases. Those databases can either be accessed over HTTP, powering use cases like agentic memory and AI application builders, or replicated to a physical device such as a mobile phone, an IoT device, or a more sophisticated machine like a robot or drone.

Since the public release of our AWS regions, we have committed to 99.999999999% data durability, powered by AWS S3 and S3 Express. We also provide features like instantaneous branching of databases and continuous backups, with point-in-time restore to any moment in the last 90 days.

In this article, we will do a deep dive into how our cloud platform stores your data.

#Data Residency

All data on Turso Cloud is durably stored on either S3 or S3 Express. The local storage on the server acts as a cache, so that queries are served with both low latency and low cost. Each database region has its own S3 bucket, guaranteeing that data doesn’t leave the jurisdiction where the compute takes place.

A modern SQLite database, much like the one you may have on your laptop, consists of two parts:

  • the database file itself,
  • and the WAL, or write-ahead-log.

The most recent writes are present in the write-ahead-log, and at some point, a process called checkpointing folds the new writes from the WAL back into the database file.

The Turso Cloud never has a full, one-piece copy of the database file, for reasons that will become clear later. Instead, the database file is split into 128kB segments. The collection of all the segments that make up a database file is called a generation. Splitting the database file into segments allows us to do three main things:

  • When a new version of the database file is created, we don’t have to write a full new copy: it is enough to point to the old segments and add the ones that have changed.
  • When a database is moved to a new compute node in the Turso Cloud, or replicated to a physical server or device, it is not necessary to download the entire database before queries can be served: only the segments needed to serve the next query have to be brought to the local media. The rest can be fetched lazily, or never fetched at all.
  • It makes it possible to limit how much physical storage is used in the compute nodes. For example, we may limit a specific user to 1GB of local storage, meaning 1GB will be available locally, while the rest is evicted and brought back from S3 on demand. This is critical for us to be able to offer a massive multitenant server.
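
The three properties above can be illustrated with a small sketch. This is not Turso's implementation: `ObjectStore`, `write_generation`, and `SegmentCache` are hypothetical names, the object store is an in-memory dictionary standing in for S3, and keying segments by content hash and evicting the least recently used segment are assumptions made only so the example is self-contained.

```python
import hashlib
from collections import OrderedDict

SEGMENT_SIZE = 128 * 1024  # 128kB segments, as described above


class ObjectStore:
    """Hypothetical in-memory stand-in for an S3 bucket."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}
        self.fetches = 0

    def put(self, data: bytes) -> str:
        # Assumption: segments are keyed by content hash, so an unchanged
        # segment maps to an object that already exists.
        key = hashlib.sha256(data).hexdigest()
        self.objects.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        self.fetches += 1
        return self.objects[key]


def write_generation(store: ObjectStore, db_file: bytes) -> list[str]:
    """Split a database file into segments and return the generation's keys."""
    return [
        store.put(db_file[i:i + SEGMENT_SIZE])
        for i in range(0, len(db_file), SEGMENT_SIZE)
    ]


class SegmentCache:
    """Bounded local cache: fetch on first use, evict least recently used."""

    def __init__(self, store: ObjectStore, max_segments: int) -> None:
        self.store = store
        self.max_segments = max_segments
        self.local: OrderedDict[str, bytes] = OrderedDict()

    def read(self, key: str) -> bytes:
        if key in self.local:
            self.local.move_to_end(key)  # mark as recently used
            return self.local[key]
        data = self.store.get(key)  # lazy fetch from the object store
        self.local[key] = data
        if len(self.local) > self.max_segments:
            self.local.popitem(last=False)  # evict the oldest entry
        return data


store = ObjectStore()
v1 = b"".join(bytes([i]) * SEGMENT_SIZE for i in range(4))
gen1 = write_generation(store, v1)

# Change only the third segment and write a second generation.
v2 = v1[:2 * SEGMENT_SIZE] + bytes([9]) * SEGMENT_SIZE + v1[3 * SEGMENT_SIZE:]
gen2 = write_generation(store, v2)
shared = sum(a == b for a, b in zip(gen1, gen2))

# A node limited to two local segments reads three of them.
cache = SegmentCache(store, max_segments=2)
for key in gen2[:3]:
    cache.read(key)
```

In this sketch the second generation adds a single new object to the store and reuses the other three, and the cache never holds more than two segments no matter how large the database is.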

All writes that have happened since the last generation (in other words, the most recent WAL) are sent to S3 Express, which has a write latency of single-digit milliseconds. Only after the data is safely stored in S3 Express is the transaction acknowledged to the user.

The entire WAL has to be present on the local storage of the server in order for any query to be serviced. Because of that, the Turso Cloud servers will checkpoint often, usually after each 1MB written, although the exact point depends on the specifics of the write pattern.

Because both the WAL and the database segments live in object storage, compute nodes can fail and be replaced at any time. When that happens, as soon as the database is accessed, the latest WAL is brought back from S3 Express, and the database segments are lazily fetched from S3.
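
As a rough illustration of the write path and of node replacement, here is a minimal sketch. `DurableStore` and `ComputeNode` are hypothetical names, the store is an in-memory stand-in for S3 Express (the WAL) and S3 (the latest generation), pages stand in for segments, and a fixed 1MB threshold is assumed even though the real checkpoint point depends on the write pattern.

```python
CHECKPOINT_BYTES = 1024 * 1024  # assumed fixed threshold of about 1MB


class DurableStore:
    """Hypothetical stand-in for S3 Express (WAL) and S3 (generation)."""

    def __init__(self) -> None:
        self.wal: list[tuple[int, bytes]] = []  # frames since last generation
        self.pages: dict[int, bytes] = {}       # latest generation


class ComputeNode:
    def __init__(self, store: DurableStore) -> None:
        self.store = store
        self.wal = None   # nothing is loaded until the database is accessed
        self.wal_bytes = 0
        self.cache: dict[int, bytes] = {}  # lazily fetched pages

    def _open(self) -> None:
        if self.wal is None:
            # The whole WAL must be local before any query is served.
            self.wal = list(self.store.wal)
            self.wal_bytes = sum(len(data) for _, data in self.wal)

    def write(self, page_no: int, data: bytes) -> str:
        self._open()
        self.store.wal.append((page_no, data))  # durable write comes first
        self.wal.append((page_no, data))
        self.wal_bytes += len(data)
        if self.wal_bytes >= CHECKPOINT_BYTES:
            self.checkpoint()
        return "ack"  # acknowledged only after the durable append

    def checkpoint(self) -> None:
        updates = dict(self.wal)  # later frames for the same page win
        self.store.pages = {**self.store.pages, **updates}  # new generation
        self.cache.update(updates)
        self.store.wal = []
        self.wal = []
        self.wal_bytes = 0

    def read(self, page_no: int) -> bytes:
        self._open()
        for no, data in reversed(self.wal):  # newest write wins
            if no == page_no:
                return data
        if page_no not in self.cache:
            self.cache[page_no] = self.store.pages[page_no]  # lazy fetch
        return self.cache[page_no]


store = DurableStore()
node = ComputeNode(store)
page = b"x" * 4096
for n in range(300):
    node.write(n, page)

# The node is lost; a replacement starts with empty local storage.
replacement = ComputeNode(store)
from_wal = replacement.read(299)       # served from the restored WAL
from_generation = replacement.read(0)  # fetched lazily from the generation
```

The replacement node holds no state of its own: everything it needs is rebuilt from the store the first time the database is touched, which is why discarding a node is safe in this model.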

#Continuous Backups

One advantage of this architecture is that there is no difference between writing to the database and backing up the database. Backups are always happening, continuously, and it is possible to restore the database to any point in time for up to 90 days (or more, for customers with custom needs). To restore, all we need to do is find the latest generation created before the target timestamp, plus the WAL fragments written after that generation and up to the target timestamp. Applying those fragments on top of that generation restores the database to that point.
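
The selection step for a point-in-time restore can be sketched as follows. The `Generation` and `WalFragment` records and the `plan_restore` function are hypothetical, timestamps are plain integers, and treating a generation or fragment stamped exactly at the target as included is an assumption.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Generation:
    created_at: int  # hypothetical timestamp, e.g. seconds since epoch
    name: str


@dataclass(frozen=True)
class WalFragment:
    written_at: int
    name: str


def plan_restore(generations, fragments, target):
    """Pick the base generation and the WAL fragments to replay.

    Both input lists are assumed to be sorted by timestamp.
    """
    idx = bisect_right([g.created_at for g in generations], target) - 1
    if idx < 0:
        raise ValueError("no generation exists at or before the target")
    base = generations[idx]
    replay = [
        f for f in fragments
        if base.created_at < f.written_at <= target
    ]
    return base, replay


generations = [Generation(100, "g100"), Generation(200, "g200"),
               Generation(300, "g300")]
fragments = [WalFragment(t, f"f{t}") for t in (110, 150, 210, 250, 310)]

base, replay = plan_restore(generations, fragments, target=260)
replay_names = [f.name for f in replay]
```

For a target of 260, this picks the generation created at 200 and replays only the fragments written at 210 and 250. Nothing is copied ahead of time; the restore is assembled from data that normal writes already produced.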

#Branching

Branching works in a very similar way. The generations belonging to a database, along with its WAL fragments, become shared between the original database and the branch. This is a metadata-only operation, and no data copying occurs, which means that branching is instantaneous. New writes to either database will create new generations, and history will diverge from that point on.
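
A minimal sketch of why branching needs no data copy, assuming a database is described by lists of references to objects in the store. The `Database` record, the `branch` function, and the object names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Database:
    name: str
    generations: list[str] = field(default_factory=list)    # object keys
    wal_fragments: list[str] = field(default_factory=list)  # object keys


def branch(parent: Database, name: str) -> Database:
    # Only the lists of references are duplicated; the objects they point
    # to are shared and never rewritten.
    return Database(name, list(parent.generations), list(parent.wal_fragments))


# Hypothetical object store contents: key -> bytes.
objects = {"gen-1": b"...", "gen-2": b"...", "wal-7": b"..."}

main = Database("main", ["gen-1", "gen-2"], ["wal-7"])
objects_before = len(objects)
dev = branch(main, "dev")
objects_after = len(objects)

# A later checkpoint on the branch adds a generation only to the branch.
objects["gen-3-dev"] = b"..."
dev.generations.append("gen-3-dev")
```

After the branch, both databases reference the same objects, and the later write to `dev` leaves `main` untouched.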

#Durability

Because of our reliance on S3 and S3 Express, both with an advertised durability of 99.999999999%, all data ever written to a database on the Turso Cloud, past and present, is guaranteed to be durable. This also means that even in the unlikely event that a SQLite issue leads to local corruption in the SQLite file, it will always be possible to revert the database to a point right before that.
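
To put the eleven nines in perspective, here is a short worked calculation. It assumes the figure is read the way AWS describes it for S3, as a design target per object over a given year. The object count of ten million is only an illustrative input.

```python
from fractions import Fraction

# Assumption: durability is per object, per year.
durability = Fraction("0.99999999999")
annual_loss_probability = 1 - durability  # 1 in 10**11

stored_objects = 10_000_000  # illustrative number of stored objects
expected_losses_per_year = stored_objects * annual_loss_probability
years_per_expected_loss = 1 / expected_losses_per_year
```

Under that reading, ten million stored objects would see an expected loss of one object roughly every 10,000 years.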

#Conclusion

Using a combination of S3 and S3 Express, the Turso Cloud can offer a service where data is guaranteed to be durable in object storage at all times. This gives us continuous backups, as well as branching, for free. It also makes the service more reliable, since destroying compute nodes is a safe operation. Combined with Turso Cloud's massive multitenant architecture, we can offer a service where users can create unlimited databases that come online in milliseconds to power their agentic workloads, or have a cheap and reliable database for any other need.

Get started with the Turso Cloud today!
