I recently published an article on how code using database/sql can easily deadlock. That article started multiple interesting conversations, and one of them led to the discovery of a related issue when using SQLite in Golang. The issue is especially nasty because it does not manifest itself as a crash. Instead, it makes your database use orders of magnitude more disk space than necessary.
The topic is close to my heart because I spend every day working on a database for the Edge called Turso that's based on LibSQL — a SQLite fork.
To understand the problem, we first need to look under the hood of SQLite and see how it manages storage. There are actually multiple modes of SQLite operation, but we're going to focus on Write-Ahead Logging (WAL). It's not the default mode, so one needs to enable it by running the PRAGMA journal_mode=WAL SQL statement.
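As a minimal sketch, here's one way to enable WAL from Go and confirm it took effect. It assumes db is an already open *sql.DB backed by a SQLite driver (the full example later in this article uses a driver registered under the name "sqlite"). PRAGMA journal_mode=WAL returns the journal mode now in effect, so we can scan it back:
// Switch the database to WAL and read back the mode that is now in effect.
var mode string
if err := db.QueryRow("PRAGMA journal_mode=WAL").Scan(&mode); err != nil {
	panic(err)
}
fmt.Println("journal mode:", mode) // prints "wal" if WAL was enabled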
In case you're wondering why anyone would use WAL, the SQLite documentation has a useful list:
1. WAL is significantly faster in most scenarios.
2. WAL provides more concurrency as readers do not block writers and a writer does not block readers. Reading and writing can proceed concurrently.
3. Disk I/O operations tend to be more sequential using WAL.
4. WAL uses many fewer fsync() operations and is thus less vulnerable to problems on systems where the fsync() system call is broken.
On a very high level, here's how WAL works:
1. There are two important files: the main db file and the write-ahead log (WAL) file.
2. Writes only append to the WAL file and don't change the main db file at all.
3. Reads are performed from both the main db file and the WAL file.
4. The database periodically runs a checkpoint operation that transfers updates from the WAL file to the main db file and truncates the WAL file. A checkpoint can also be triggered manually, for example by running the PRAGMA wal_checkpoint(TRUNCATE) SQL statement, as shown in the sketch below.
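Here is a sketch of what a manual checkpoint can look like from Go (a fragment; checkpointWAL is just an illustrative name, and it assumes db is an open *sql.DB in WAL mode plus the database/sql and fmt imports from the full example below). PRAGMA wal_checkpoint(TRUNCATE) returns a row with three integers, so we can check whether the checkpoint actually completed:
// checkpointWAL runs a TRUNCATE checkpoint and reports whether it completed.
// The pragma returns three integers: busy (1 if the checkpoint was blocked),
// the number of frames in the WAL file, and the number of frames checkpointed.
func checkpointWAL(db *sql.DB) error {
	var busy, logFrames, checkpointed int
	err := db.QueryRow("PRAGMA wal_checkpoint(TRUNCATE)").Scan(&busy, &logFrames, &checkpointed)
	if err != nil {
		return err
	}
	if busy == 1 {
		return fmt.Errorf("checkpoint blocked: %d of %d WAL frames checkpointed", checkpointed, logFrames)
	}
	return nil
}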
Consider the following code written in Go that creates an SQLite database, populates it with data, and immediately queries that data back. At the end it prints the size of the main db file and its associated WAL file:
package main

import (
	"database/sql"
	"fmt"
	"os"

	"github.com/brianvoe/gofakeit/v6"  // fake data generator
	"github.com/dustin/go-humanize"    // human-readable byte sizes
	_ "modernc.org/sqlite"             // assumed SQLite driver; registers itself as "sqlite"
)

func insertUsers(db *sql.DB, count int) error {
	for i := 0; i < count; i++ {
		_, err := db.Exec("INSERT INTO users VALUES (?)", gofakeit.Email())
		if err != nil {
			return err
		}
	}
	return nil
}

func printFileSize(prefix, name string) error {
	fi, err := os.Stat(name)
	if err != nil {
		return err
	}
	fmt.Println(prefix, humanize.Bytes(uint64(fi.Size())))
	return nil
}

func main() {
	dbDir, err := os.MkdirTemp("", "test")
	if err != nil { panic(err) }
	defer os.RemoveAll(dbDir)
	db, err := sql.Open("sqlite", dbDir+"/test.db")
	if err != nil { panic(err) }
	defer db.Close()
	_, err = db.Exec("PRAGMA journal_mode=WAL")
	if err != nil { panic(err) }
	_, err = db.Exec("CREATE TABLE users (email TEXT)")
	if err != nil { panic(err) }
	err = insertUsers(db, 10)
	if err != nil { panic(err) }
	rows, err := db.Query("SELECT email FROM users")
	if err != nil { panic(err) }
	// Do stuff with rows here but
	// don't call rows.Close()
	// and don't read all the rows
	rows.Close() // <-- REMOVING THIS WILL HAVE AN INTERESTING EFFECT
	err = insertUsers(db, 10000)
	if err != nil { panic(err) }
	err = printFileSize("Main DB file size:", dbDir+"/test.db")
	if err != nil { panic(err) }
	err = printFileSize("WAL file size:", dbDir+"/test.db-wal")
	if err != nil { panic(err) }
}
When we run this code it prints:
Main DB file size: 406 kB
WAL file size: 4.1 MB
This means the main db file has a size of 406 kB and the WAL file has a size of 4.1 MB. We can see that at least some of the db updates generated by insertUsers were transferred from the WAL file to the main db file. To run a checkpoint manually, we could add the following code before printing the size of the files:
_, err = db.Exec("PRAGMA wal_checkpoint(TRUNCATE)")
if err != nil {
	panic(err)
}
When we run the code with this addition, it prints:
Main DB file size: 414 kB
WAL file size: 0 B
After such a manual checkpoint, the main db file is 414 kB in size and the WAL file is empty.
Now, let's pretend that we have forgotten to call rows.Close(). Without running a manual checkpoint, the new code prints:
Main DB file size: 4.1 kB
WAL file size: 42 MB
This means the main db file has only 4.1 kB while the WAL file has grown to 42 MB. Running a checkpoint manually does not change this result. It's worth noticing that in this second case, when we forgot to close the rows object, the storage taken by the same number of db rows is significantly (10–100x) higher.
A rows object represents a read operation on the SQLite database. Unless rows is closed, either explicitly by calling rows.Close() or implicitly by reading all of its data, SQLite treats the read as ongoing. It turns out that ongoing reads prevent the checkpoint operation from advancing. As a result, a single leaked rows object can prevent the database from ever transferring changes from the WAL file to the main db file and truncating the WAL file. This means the WAL file grows indefinitely, at least until the process is restarted, which can take a long time for a server application.
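The fix is simply to make sure every rows object gets closed. A minimal sketch of the usual pattern (queryEmails is just an illustrative helper name, not part of the example above):
// queryEmails shows the usual pattern: defer rows.Close() right after the
// error check, drain the rows, and check rows.Err() at the end. Once Close
// returns (or all rows have been read), SQLite no longer sees an ongoing
// read and checkpoints can advance again.
func queryEmails(db *sql.DB) ([]string, error) {
	rows, err := db.Query("SELECT email FROM users")
	if err != nil {
		return nil, err
	}
	defer rows.Close() // runs even if we return early below

	var emails []string
	for rows.Next() {
		var email string
		if err := rows.Scan(&email); err != nil {
			return nil, err
		}
		emails = append(emails, email)
	}
	return emails, rows.Err()
}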
It's very important to remember to close each result of calls to sql.DB.Query and sql.DB.QueryContext. When using SQLite, failing to do so can prevent the checkpoint operation from happening and leads to the database using orders of magnitude more disk space than necessary.
SQLite has a special operation to reclaim disk space and compact the database. The operation is called VACUUM. It turns out that the problem described in this article affects VACUUM in the same way it affects checkpoint.
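For completeness, running VACUUM from Go is just another statement. As a sketch, assuming the same db handle as above, and keeping in mind that, like a checkpoint, it only helps once every rows object has been closed:
// VACUUM rewrites the database file to reclaim free space.
// Like a checkpoint, it cannot do its job while a read is still treated as ongoing.
if _, err := db.Exec("VACUUM"); err != nil {
	panic(err)
}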