Aggregating Multi-Tenant Databases for Analytics and Reporting
Turso makes it super easy to create a database per-tenant, or even per-user with it's multi-tenant architecture. Creating a database per-tenant has many benefits, including but not limited to data isolation, data management, and security.
However, aggregating and reporting on each database becomes a challenge.
When running a multi-tenant platform such as an e-commerce SaaS where each store has its own SQLite database, getting a bird's-eye view of your platforms performance can be tricky.
SQLite's ATTACH
feature can be great if you're attaching up to 10 databases, but it's not the solution when your application scales.
For the purposes of this article, we'll assume you have a collection of SQLite databases, one per store, and you want to aggregate key metrics such as daily order totals, revenue, active shopping carts, and new products added.
Here's how you can replicate a similar schema for your database(s):
CREATE TABLE products (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
price INTEGER NOT NULL,
description TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
total_amount INTEGER NOT NULL,
status TEXT DEFAULT 'pending',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE carts (
id INTEGER PRIMARY KEY,
status TEXT DEFAULT 'active',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_products_created_at ON products(created_at);
CREATE INDEX idx_orders_created_at ON orders(created_at);
CREATE INDEX idx_carts_status ON carts(status, updated_at);
One solution to the problem is to connect to each database individually, perform a query to extract and transform the data we need, and store the results in a central database.
In this article, we'll create a basic Extract, Transform, and Load (ETL) script that:
You should already have some databases that all share a schema (either managed by you, or Turso with Multi-DB Schemas). If not, the Turso Platforms API can be used to create databases for tenants, users, or in our case — stores.
Turso Multi-DB Schemas lets you share a single schema across databases — fully managed. This means that if you need to modify schema, you apply it to the parent schema database, and Turso handles propogating that to all its children.
The implementation consists of several components:
You can use the Turso Platform API to fetch the databases that belong to an organization, as well performing any filtering (such as excluding any archived or schema databases):
import { createClient } from '@tursodatabase/api';
const turso = createClient({
org: process.env.TURSO_ORG,
token: process.env.TURSO_AUTH_TOKEN,
});
async function fetchAllDatabases() {
const { databases } = await turso.databases.list();
return databases
.filter((db) => !db.is_schema && !db.archived)
.map((db) => ({
name: db.name,
url: `libsql://${db.hostname}`,
}));
}
For each database returned from the response above, we can connect to it using a group token, and collect the following:
import { createClient } from '@libsql/client';
const client = createClient({
url, // Returned from `fetchAllDatabases()` above
authToken: process.env.TURSO_GROUP_AUTH_TOKEN,
});
const result = await client.execute(`
SELECT
(SELECT COUNT(*) FROM orders
WHERE DATE(created_at) = DATE(?)) as total_orders,
(SELECT COALESCE(SUM(total_amount), 0) FROM orders
WHERE DATE(created_at) = DATE(?)) as total_revenue,
(SELECT COUNT(*) FROM carts
WHERE status = 'active'
AND DATE(updated_at) = DATE(?)) as active_carts,
(SELECT COUNT(*) FROM products
WHERE DATE(created_at) = DATE(?)) as new_products
`);
The result above should return data that looks something like this:
We can store these results in a central database. Create a new metrics
table that looks soemthing like this:
CREATE TABLE metrics (
date DATE NOT NULL,
database_name TEXT NOT NULL,
total_orders INTEGER,
total_revenue INTEGER,
active_carts INTEGER,
new_products INTEGER,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (date, database_name)
);
Now with the results
from above, you can insert into the metrics
table the necessary data:
await client.execute({
sql: `
INSERT INTO metrics (date, database_name, total_orders, total_revenue, active_carts, new_products)
VALUES (?, ?, ?, ?, ?, ?)
ON CONFLICT (date, database_name)
DO UPDATE SET
total_orders = excluded.total_orders,
total_revenue = excluded.total_revenue,
active_carts = excluded.active_carts,
new_products = excluded.new_products,
created_at = CURRENT_TIMESTAMP;
`,
// Replace with actual date from previous steps
args: {
date: '2024-11-19',
database_name: 'store_1',
total_orders: 100,
total_revenue: 10000,
active_carts: 50,
new_products: 10,
},
});
Once you've collected metrics, you can then perform queries and display results as the multi-tenant vendor:
-- Top performing stores this month
SELECT
database_name,
SUM(total_revenue) as revenue,
SUM(total_orders) as orders
FROM metrics
WHERE date >= date('now', 'start of month')
GROUP BY database_name
ORDER BY revenue DESC
LIMIT 10;
-- Daily activity
SELECT
date,
SUM(total_orders) as platform_orders,
SUM(active_carts) as platform_active_carts
FROM metrics
WHERE date >= date('now', '-30 days')
GROUP BY date
ORDER BY date;
-- Average revenue per store
SELECT
date,
ROUND(AVG(total_revenue), 2) as avg_store_revenue
FROM metrics
GROUP BY date
ORDER BY date DESC
LIMIT 30;
By collecting metrics from individual stores on a scheduled basis and aggregating them in a central database — you can effectively monitor your platform's performance while maintaining the benefits of a multi-tenant architecture.
You will want to consider how frequently you need to collect metrics. If you need to view real-time metrics, you will need to extract the data more frequently.