Is NoSQL dead?

A great friend of mine recently ran a poll with the following question: Is the era of alternative NoSQL architectures over?

Matt Silverlock 🐀

@elithrar

There's been a trend back to SQL databases in the latest set of database-y startups (@neondatabase, @supabase, @tursodatabase, @PlanetScale, et. al). Is the era of alternative NoSQL architectures (like DynamoDB, MongoDB or FaunaDB) over?

4:02 PM · Dec 15, 2023

According to this poll, it seems so. I previously worked for 8 years as employee #3 of a NoSQL company, ScyllaDB, and I still believe companies like theirs have a place in the sun.

Still, I voted for the first option. I summarized my reasoning on Twitter, but in this post, I want to expand a bit on that reasoning. I am drawing most of those comparisons with MongoDB, but a lot of that should be valid for many other NoSQL solutions.

Glauber Costa

@glcst

When you think about the reasons people reached out for Mongo in 2010: * It was dead simple. * It allows you to scale to 100s of GB, even a whole TB. Fast forward to 2023: SQLite is easier than Mongo, and with the database-per-tenant pattern you can essentially emulate

4:15 PM · Dec 15, 2023

#Not all NoSQL is created equal

NoSQL has always been my least favorite term in tech. It never really had a positive definition; it is just a band of database technologies lumped together by the thing they are not. Until, of course, someone started saying NoSQL meant “Not Only SQL”, at which point the term became frankly beyond useless.

And the problem here is that, of course, not all NoSQL is past its prime, because those technologies serve very different purposes.

#Specialized domains

Many NoSQL databases operate in specialized domains. Nobody was, in 2010, replacing their MySQL setups with them, so I don't expect anybody to switch back. Those include things like:

Time series databases and event stores: the data is so specialized and predictable that it is worth building dedicated abstractions for it.
Graph databases: the relationships between entities, and the queries over them, are so specialized that the dedicated model pays off.
Key-value stores: you pass a key, and you get a value. Why would you complicate that?
Scale-out, highly available column stores like ScyllaDB: at times I see people bragging on Twitter about how their apps scaled to 10,000 users with thousands of requests per day. In one particular benchmark, we scaled Scylla to one billion (with a B) requests per second (not day, week, or year), and customers were often at the petabyte level. Nodes can come and go at any time and the system is still available. If you need that, SQL really won't cut it.
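To put the gap in that last item into numbers, here is a rough back-of-the-envelope sketch. The figure of 5,000 requests per day is a hypothetical stand-in for "thousands of requests per day"; only the one-billion-per-second number comes from the benchmark mentioned above.

```typescript
// Average per-second rate for a given number of requests per day.
const SECONDS_PER_DAY = 24 * 60 * 60; // 86,400

function requestsPerSecond(requestsPerDay: number): number {
  return requestsPerDay / SECONDS_PER_DAY;
}

// Hypothetical small app: "thousands of requests per day".
const smallAppPerDay = 5_000;
const smallAppPerSecond = requestsPerSecond(smallAppPerDay);

// Benchmark figure quoted in the post: one billion requests per second.
const benchmarkPerSecond = 1_000_000_000;

// How many times larger the benchmark rate is than the small app's rate.
const ratio = benchmarkPerSecond / smallAppPerSecond;

console.log(smallAppPerSecond.toFixed(3)); // "0.058" requests per second
console.log(ratio.toExponential(2));       // "1.73e+10"
```

Under these assumptions the small app averages well under one request per second, roughly ten orders of magnitude below the benchmark. These are two very different problems, and they do not need the same database.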

#What changed?

The truth is that developers love SQL. The reason NoSQL became so popular in 2010 is that it was “WebScale”. But what does it mean to be “WebScale”? It means you can handle workloads that are characteristic of most companies operating on the web. The hyperscalers always had (and always will have) their own in-house solutions anyway, and if all you manage is a simple blog, anything you use is fine.

For those of you in your early 30s who may not be familiar with this absolute classic, here it is:

At the time this was written:

Most media was HDD, which was (and still is) slow and sequential.
“Big data” was around 1TB.
Machines had a couple of cores, at most, meaning any decent performance had to come from horizontal scalability.

Fast forward to today:

Most media is flash-based, which is very fast, and parallel.
There are USB sticks with 1TB for sale for $5.
Your baby monitor probably has 4 cores.

When we started Turso, we had a clear thesis in mind: after decades of NoSQL, those changes mattered, and would bring about a renewed interest in SQL in general, and SQLite in particular.

#The other NoSQL

But use cases that left SQL in 2010 can now come back. The data sizes grew, but not exponentially: think about your shopping cart data. Your user base and throughput also grew, but machines grew even faster. So a lot of those constraints calling for mandatory scale-out are just not there anymore. Machines grew so much that most web workloads today fit on… SQLite!

Besides the scale-out aspect, another thing driving people towards MongoDB was its flexible collection-based data model, and just how easy it was to get started, push some JSON, and be good to go.

Things in our industry rarely die, but if Matt's poll is any indication, developers building standard web workloads today will be reaching for SQL.

SQLite, in particular, has the same ease-of-use appeal as Mongo, without giving up SQL. You can get started by just pointing to a file. No configuration, no nothing. And you can easily test your database code by passing the file around in your git repository.

But another, less appreciated advantage is that SQLite can also be used to emulate a schemaless document pattern.

#Have your cake, eat it too

With SQLite, each database is just a file. It is cheap and easy to create and maintain millions of those files, so you don't have the same constraint of putting all of your data in the same database as with other SQL offerings.

import { createClient } from '@libsql/client';

const db1 = createClient({
  url: 'file:path/to/carts.dev',
});

const db2 = createClient({
  url: 'file:path/to/inventory.dev',
});

const db3 = createClient({
  url: 'file:path/to/payments.dev',
});

It is easy to give each equivalent of a Mongo document its own schema. You retain the advantages of SQL within that database: powerful support for indexes, a connection to the rich SQL ecosystem, structure, and cross-table joins, while at the same time being able to have different schemas per database.
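As a minimal sketch of what "different schemas per database" could look like in practice, the snippet below pairs each database file from the example above with its own DDL. The `databaseSpec` helper, the `DatabaseSpec` type, and the table definitions are all hypothetical and only for illustration; they are not part of `@libsql/client`.

```typescript
// Hypothetical description of one database: where it lives and its schema.
interface DatabaseSpec {
  url: string;    // file URL, in the same form used with createClient above
  schema: string; // DDL to run once when the database is first created
}

// Directory taken from the example above.
const DB_ROOT = "path/to";

// Each database gets its own, unrelated schema (illustrative tables only).
const schemas = new Map<string, string>([
  ["carts", "CREATE TABLE IF NOT EXISTS cart_items (user_id TEXT, sku TEXT, qty INTEGER, PRIMARY KEY (user_id, sku))"],
  ["inventory", "CREATE TABLE IF NOT EXISTS stock (sku TEXT PRIMARY KEY, on_hand INTEGER NOT NULL)"],
  ["payments", "CREATE TABLE IF NOT EXISTS payments (id TEXT PRIMARY KEY, user_id TEXT, amount_cents INTEGER NOT NULL)"],
]);

// Look up the file URL and schema for a named database.
function databaseSpec(name: string): DatabaseSpec {
  const schema = schemas.get(name);
  if (schema === undefined) {
    throw new Error(`unknown database: ${name}`);
  }
  return { url: `file:${DB_ROOT}/${name}.dev`, schema };
}

const carts = databaseSpec("carts");
console.log(carts.url); // "file:path/to/carts.dev"
```

From here, one plausible next step is to pass `carts.url` to `createClient` as in the example above and run `carts.schema` once when the file is created. Within each file you still get indexes and joins; across files, nothing forces the schemas to agree.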

The file-based nature of SQLite is usually also its downfall: it becomes hard to scale, manage, back up, and access from serverless environments.

This is, however, exactly the gap that Turso wants to plug. Turso uses a fork of SQLite, libSQL, to offer millions of independent databases that can be used to, among other things, emulate independent collections, with automatic backups and replication, all while keeping a local development workflow in a file.

#In summary

NoSQL isn't dead. There is still vibrant room for it in specialized domains and extremely large-scale systems. But whereas 10 years ago it seemed like NoSQL would be the default choice for the average developer, the trend is now clearly in the other direction. SQL can do a lot more than it could before, and whatever can be done with SQL, will be.

In fact, the landscape changed so dramatically that even SQLite can now play a big role. And if you're interested in SQLite and how to run it in production, join us on the Turso Discord to chat about it!