/Database

You Don't Always Need Indexes

- Jeff Kaufman tl;dr: “Sometimes you have a lot of data, and one approach to support quick searches is pre-processing it to build an index so a search can involve only looking at a small fraction of the total data. The threshold at which it's worth switching to indexing, though, might be higher than you'd guess.” Jeff illustrates cases where full scans were better engineering choices.

featured in #418


Is 20M Of Rows Still A Valid Soft Limit Of MySQL Table In 2023?

- Yisheng Gong tl;dr: “There’s rumor around the internet that we should avoid having > 20M rows in a single MySQL table. Otherwise, the table’s performance will be downgraded, you will find SQL query much slower than usual when it’s above the soft limit. These judgements were made on HDD many years ago. I’m wondering if it’s still true for MySQL on SSD in 2023, and if true, why is that?”

featured in #417


Factors To Consider In Database Selection

- Alex Xu tl;dr: Alex examines key factors that influence database selection, such as scalability, performance, and data consistency.

featured in #409


Understanding Database Types

- Alex Xu tl;dr: “We’ll arm ourselves with the knowledge necessary to make informed decisions when faced with the challenge of choosing databases for various components of our application. We will dive into the process of database selection, examining the various types of databases, discussing factors that influence database performance and cost, and guiding ourselves toward the best choices for our application while balancing essential tradeoffs.”

featured in #408


The Inner Workings Of Distributed Databases

- Alex Pelagenko tl;dr: “We analyze how several popular time-series / OLAP databases implement high availability to highlight the pros and cons of each approach.” Alex also reviews the fundamentals of distributed databases.

featured in #407


Database Sharding Explained

- Mahdi Yusuf tl;dr: Mahdi discusses why we shard data stores, when to use sharding, how it can be set up, and the options you have before sharding.

featured in #401


From Postgres To Amazon DynamoDB

tl;dr: Instacart's engineering team has to efficiently store and query hundreds of terabytes of data. Postgres was the primary datastore of choice, but once specific use cases began to outpace the largest Amazon EC2 instance size AWS offers, the team chose Amazon DynamoDB. Here they discuss migrating existing tables from Postgres to DynamoDB.

featured in #394


Database Cryptography Fur The Rest Of Us

tl;dr: The author defines database cryptography, explains how it manifests in both relational and NoSQL databases, covers searchable encryption, and provides a case study of MongoDB’s Client-Side encryption.

featured in #394


In-Depth: ClickHouse vs PostgreSQL

- Mathew Pregasen tl;dr: "Most companies that invest in an online analytical processing (OLAP) database like ClickHouse originally used an online transaction processing (OLTP) stack like MySQL or Postgres." Despite the two being built for different purposes, most companies leverage features in both during their scaling period. The author compares the two technologies here.

featured in #372


Things You Should Know About Databases

- Mahdi Yusuf tl;dr: "So, without fully getting into the weeds on database-specific quirks, I will cover everything you should understand about RDBMS indexes. I will touch briefly on transactions and isolation levels and how they can impact your reasoning about specific transactions."

featured in #366