The Billion Row Challenge (1BRC) - Step-By-Step From 71s To 1.7s
- Marko Topolnik tl;dr: “The main thing I'd like to show you in this post is that a good part of that amazing speed comes from easy-to-grasp, reusable tricks that you could apply in your code as well. Towards the end, I'll also show you some of the magical parts that take it beyond that level.”featured in #491
featured in #481
featured in #477
Building A Faster Hash Table For High Performance SQL Joins
- Andrei Pechkurov tl;dr: Andrei delves into QuestDB’s unique hash table, FastMap, designed to enhance SQL execution for JOIN and GROUP BY operations. FastMap employs open addressing and linear probing, optimized for high performance in database environments. It supports variable-size keys and fixed-size values, facilitating efficient data handling and updates. Notably, FastMap operates off-heap, reducing garbage collection pressure to improve performance.featured in #475
featured in #474
The Case Of A Curious SQL Query
- Justin Jaffray tl;dr: Justin provides a deep dive into SQL's foundational aspects, highlighting the importance of a formalized approach to query behavior. Using the example of "predicate pushdown," Justin presents a SQL query that behaves differently across databases like DuckDB, SQLite, and CockroachDB. "I think it's a fun little mind bender that gives you some insight into the internals of these databases query engines without having to actually look at any code."featured in #460
Storage Challenges In The Evolution Of Database Architecture
- Sujay Venaik tl;dr: “Sync service has been running since 2014, and we started facing issues related to physical storage on the database layer. For context, sync service runs on an AWS RDS Aurora cluster that has a single primary writer node and 3-4 readers, all of which are r6g.8xlarge. AWS RDS has a physical storage size limit of 128TiB for each RDS cluster… We were hovering around ~95TB, and our rate of ingestion was ~2TB per month. At this rate, we realized we would see ingestion issues in another 6-8 months.” The team devised a three-pronged strategy: eliminating unused tables, revising their append-only tables approach, and methodically freeing up space from sizable tables. This strategy successfully reclaimed about 60TB of space.featured in #452
featured in #452
Navigating The Stars: How InfluxDB Powers Loft Orbital's Space Innovations
tl;dr: Loft Orbital, a leading space infrastructure service provider, simplifies space missions with technological advances. They operate customer payloads on microsatellites as a service. Using Telegraf, InfluxDB, and Google Cloud, they collect telemetry data from spacecraft and monitor mission infrastructure. InfluxDB aids in QA, performance monitoring, and reveals long-term data trends, enhancing their mission automation.featured in #451
How Do Databases Execute Expressions?
- Phil Eaton tl;dr: “Databases are fun. They sit at the confluence of Computer Science topics that might otherwise not seem practical in life as a developer. For example, every database with a query language is also a programming language implementation of some caliber. That doesn't include all databases though of course; see: RocksDB, FoundationDB, TigerBeetle, etc. This post looks at how various databases execute expressions in their query language.”featured in #451