Essential Reading For Engineering Leaders

Just Make It Scale: An Aurora DSQL Story

- Werner Vogels

Database
Rust

tl;dr: “In this post, Niko and Marc - the two senior principal engineers who built DSQL - provide deep technical insights on Rust and how we’ve used it to build DSQL. It’s an interesting story on the pursuit of engineering efficiency and why it’s so important to question past decisions – even if they’ve worked very well in the past.”

featured in #620

Why Query Caching Is the Most Cost-Effective Way To Scale Databases

- Gautam Gopinadhan

Management
Database

tl;dr: Most teams try to scale databases by throwing hardware at the problem, duplicating data, or rewriting slow queries, often at great cost. But there's a quieter and far more efficient path: SQL-layer query caching. It cuts load, reduces tail latency, and simplifies scaling, without migrations or infrastructure sprawl.

featured in #619

On The Road To Your Own Vector DB

- Doug Turnbull

Database

tl;dr: “These vectors correspond to vector embeddings, a representation of a word, sentence, image, or, really anything. Embeddings come out of models that move similar items closer. Our model might know that "Mary had a little lamb" is very similar to "Little bo peep had a sheep" - yielding nearly identical embeddings - despite sharing no important words.”

featured in #613

How Discord Indexes Trillions Of Messages

- Vicki Niu

Database
Architecture

tl;dr: “As guilds on Discord grow larger with longer histories, more and more of them bump up against Lucene’s MAX_DOC limit of ~2 billion messages. We needed a solution to scale search for these special cases, which we call BFGs, or Big Freaking Guilds. We wanted to retain the performance gains from storing all messages for a given guild on the same Elasticsearch shard, since that still works for the vast majority of guilds, but we needed a solution to scale search for BFGs as well.”

featured in #611

Database Design For Google Calendar: A Tutorial

- Alexey Makhotkin

Database

tl;dr: “In this database design tutorial I’m going to show how to design the database tables for a real-world project of substantial complexity. We’ll design a clone of Google Calendar. We will model as much as possible of the functionality that is directly related to the calendar.”

featured in #586 and #587

Database Sharding Explained

- Mahdi Yusuf

Database
Scale

tl;dr: Mahdi discusses why we shard data stores, when to use sharding, how it can be set up, and the options you have before resorting to it.

featured in #584

Migrating Billions Of Records: Moving Our Active DNS Database While It’s In Use

- Alex Fattouche, Corey Horton

Migration
Database

tl;dr: “When initially measured in 2022, DNS data took up approximately 40% of the storage capacity in Cloudflare’s main database cluster (cfdb). This database cluster, consisting of a primary system and multiple replicas, is responsible for storing DNS zones, propagated to our data centers in over 330 cities via our distributed KV store.”

featured in #565

Things You Should Know About Databases

- Mahdi Yusuf

Database

tl;dr: "So, without fully getting into the weeds on database-specific quirks, I will cover everything you should understand about RDBMS indexes. I will touch briefly on transactions and isolation levels and how they can impact your reasoning about specific transactions."

featured in #553

Dealing With Large Tables

- Benjamin Dicken

Database

tl;dr: “Large databases often have a small number of very large tables that makes scaling difficult. How can you scale with these while keeping your database performant? This article covers vertical scaling, vertical sharding and horizontal sharding.”

featured in #550
