Register now for early access to concurrent writes in the Turso Cloud. Join the waitlist

How we used Quint to find over 10 bugs in SQLite while hardening Turso

A Turso community member modeled the SQLite C API in Quint, ran the generated traces against real SQLite, and uncovered more than 10 bugs along the way.

Cover image for How we used Quint to find over 10 bugs in SQLite while hardening Turso

From the moment we started Turso, we have taken testing extremely seriously. There is no other choice: Turso is a rewrite of SQLite, and SQLite sets a very high bar for reliability. We consistently hear that this, more than any specific feature, is what users love about SQLite.

Our first version shipped already with a built-in Deterministic Simulation Testing mechanism. Today, aside from our original DST, we have differential testers, fuzzers, concurrency simulators, as well as making good use of Antithesis.

We will never pass up an opportunity to improve things even more. Because of this, we have been attracting a healthy community of quality aficionados into our Open Source community. This article goes into the details of how one of our Open Source members, Pavan Nambi, used Quint to tighten our testing so hard that he ended up finding 10 bugs (and counting) in SQLite itself.

#Formal Methods

Despite our testing discipline, one area in which we are still making slow inroads is formal methods. Formal methods have traditionally been very inaccessible to mortals, and usually come with the "who-watches-the-watchmen" added challenge of how to verify that the model of the system itself is correct. TLA+ is the gold standard for formal verification, but other tools have been appearing recently that try to tackle the issue of accessibility in formal methods.

One such tool is Quint. As per their GitHub page, Quint "combines the robust theoretical basis of the Temporal Logic of Actions (TLA) with state-of-the-art type checking and development tooling."

Pavan decided to take it for a spin, but quickly concluded that trying to specify the entire system would be close to impossible. At that point, he had an idea: most drivers that consume SQLite do so through the C API. Turso exposes a C API that is compatible with SQLite's. The C API is also well documented and well specified. If he could model the C API in Quint, that would amount to a pretty good level of coverage. The good specification of the API makes it easier to write the Quint specification for it, and we could also run it against SQLite itself to double check that the specification is in fact correct.

#Quinting SQLite

With this idea in mind, the process was now simple. We just had to rinse and repeat the following steps:

  • Take one documented SQLite C API contract, then model only the state and properties needed for that.
  • Generate a small trace.
  • Run it against SQLite.
  • Compare the documented result with the observed one.

If the model checker fails, we get a counter-example: a sequence of states that lead to the violation.

The expectation was that running it against SQLite would help us validate the model. In many situations it did. But in others, upon closer inspection, the system deviated from the model because the system was wrong.

#One example: sqlite3_deserialize()

Here is one example: the function sqlite3_deserialize() lets you take a serialized SQLite database image and load it into a database connection as an in-memory database. This is useful when you have a file-based database but want to fully load it into memory and operate on it as an in-memory database from that point on.

According to the specification of this API, sqlite3_deserialize() should fail with SQLITE_BUSY if the target database is currently in a read transaction or if it is involved in a backup operation.

Quint works by generating traces. Traces are just a sequence of states, not SQL statements. Those traces can then be translated into whatever you want to allow execution against the system. In our case, a small C program.

The Quint model automatically generated a trace where this contract was violated:

  1. Open database.
  2. Create table.
  3. Serialize database.
  4. Start reading from the database.
  5. While the read transaction is active, call sqlite3_deserialize(), expecting SQLITE_BUSY.

In summary, the database was supposed to return busy if a read transaction was in progress during a deserialize operation. The C program was used to move the database from one state to the other according to the trace. But in this case, the database did not return SQLITE_BUSY: it crashed instead.

Crashes have one big advantage: it is easy to know that this is not an issue with the model, since crashing is almost never the right behavior.

This was fixed in SQLite in this commit.

#And many more followed

During the entire exploration with Quint, the following bugs were found:

In addition to the ones above, there are also other findings that we are working with the SQLite team to validate.

#Summary

Formal methods, and Quint in particular, were able to explore parts of the specification that the Turso simulator previously could not. The general concepts are similar, but instead of generating a random sequence of SQL statements and enforcing system-level properties, Quint generates a list of traces that can then be translated into any substrate we want.

In this case, we translated Quint traces into a C program that interacted with the C API. Turso got better thanks to this endeavor. So did SQLite: at least 10 bugs were found, reported, and fixed.

SQLite is a magnificent piece of software that has been around and thoroughly tested for decades. And yet, bugs were lurking that previous disciplines did not catch. Formal methods did.