The Perils Of Migrating A Large-Scale Service At Uber

tl;dr: Details of Uber's journey in migrating its invoice generation service, highlighting challenges and lessons learned. The initial service was written in Python and faced scalability issues due to early design choices, accumulated technical debt and a legacy software stack. The new service was developed in Go, chosen for its speed and flexibility. The migration strategy adopted was component-based, focusing on individual system components rather than entire flows. The migration led to a 97% reduction in computing requirements and enhanced self-serve capabilities, reducing engineers' support work from 60% to under 20%.

featured in #442

Migrating Critical Traffic At Scale With No Downtime

tl;dr: From the team at Netflix: “when undertaking system migrations, one of the main challenges is establishing confidence and seamlessly transitioning the traffic to the upgraded architecture without adversely impacting the customer experience. This blog series will examine the tools, techniques, and strategies we utilized to achieve this goal.”

featured in #415

How Discord Stores Trillions Of Messages

- Bo Ingram tl;dr: “Our Cassandra cluster exhibited serious performance issues that required increasing amounts of effort to just maintain, not improve.” Bo discusses the troubles with Cassandra and the migration to ScyllaDB, a Cassandra-compatible database written in C++.

featured in #396

From Postgres To Amazon DynamoDB

tl;dr: From the engineering team at Instacart, which has to efficiently store and query hundreds of terabytes of data. Postgres was the primary datastore of choice, but once specific use cases began to outpace the largest Amazon EC2 instance size AWS offers, the team chose Amazon DynamoDB. The post discusses migrating existing tables from Postgres to DynamoDB.

featured in #394

Strategies And Tools For Performing Migrations On Platform

- Mariana Ardoino Raul Herbster tl;dr: The authors present the following challenges, or scenarios, faced during the project: (1) Defining the scope of the project. (2) Scaling up. (3) Competing priorities. Each scenario comes with symptoms (“when”), what to avoid when facing the situation (“Don’t”), and what the authors suggest doing instead (“Do”).

featured in #371

Real-World Engineering Challenges #6: Migrations

- Gergely Orosz tl;dr: Gergely covers examples of companies that have carried out large-scale migrations, including: (1) Box: a zero-downtime data migration using a 6-step plan. (2) Pinterest: data migration using double writes. (3) LinkedIn: navigating the migration chaos when 100+ engineers were needed to write code and 600+ use cases needed to be moved. And more.

featured in #359

How We Reduced Our Annual Server Costs By 80% — From $1M To $200k — By Moving Away From AWS

- Trey Huffine tl;dr: Prerender saved $800k by removing their reliance on AWS and building in-house infrastructure to handle traffic and cached data. This post discusses the three-phase approach to the migration (testing, technical set-up, implementation and scaling).

featured in #356

Changing Tires At 100mph: A Guide To Zero Downtime Migrations

- Kiran Rao tl;dr: (1) Create the new empty table. (2) Write to both the old and new tables. (3) Copy data (in chunks) from old to new. (4) Validate consistency. (5) Switch reads to the new table. (6) Stop writes to the old table. (7) Clean up the old table. This guide goes through the step-by-step process of migrating tables in PostgreSQL.

featured in #315
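
A rough sketch of the seven steps listed in the zero-downtime guide above, for readers who want to see the order of operations in code. It is not taken from the guide itself: it uses Python's built-in sqlite3 module in place of PostgreSQL so it runs anywhere, and the table names (users_old, users_new) and helpers (write_user, backfill, is_consistent) are hypothetical names chosen for illustration. It also assumes positive integer ids.

```python
import sqlite3


def write_user(conn, user_id, email):
    """Step 2: the application writes every change to both tables."""
    for table in ("users_old", "users_new"):
        conn.execute(
            f"INSERT OR REPLACE INTO {table} (id, email) VALUES (?, ?)",
            (user_id, email),
        )


def backfill(conn, chunk_size=2):
    """Step 3: copy old rows in id-ordered chunks.

    INSERT OR IGNORE skips rows that dual writes already placed in the
    new table, so the backfill never overwrites fresher data.
    """
    last_id = 0  # assumption: ids are positive integers
    while True:
        rows = conn.execute(
            "SELECT id, email FROM users_old WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not rows:
            return
        conn.executemany(
            "INSERT OR IGNORE INTO users_new (id, email) VALUES (?, ?)", rows
        )
        conn.commit()  # one short transaction per chunk
        last_id = rows[-1][0]


def is_consistent(conn):
    """Step 4: any row present in only one table counts as drift."""
    only_old = conn.execute(
        "SELECT id, email FROM users_old EXCEPT SELECT id, email FROM users_new"
    ).fetchall()
    only_new = conn.execute(
        "SELECT id, email FROM users_new EXCEPT SELECT id, email FROM users_old"
    ).fetchall()
    return not only_old and not only_new


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users_old (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users_old (id, email) VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com"), (3, "c@example.com")],
)

# Step 1: create the new empty table.
conn.execute("CREATE TABLE users_new (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")

# Step 2: dual writes start before the backfill, so live traffic is not lost.
write_user(conn, 2, "b2@example.com")  # update to an existing row
write_user(conn, 4, "d@example.com")   # brand-new row

# Step 3: backfill the rows that predate dual writes.
backfill(conn)

# Step 4: validate before touching reads.
consistent = is_consistent(conn)

# Steps 5-7: switch reads, stop writing to the old table, then drop it.
if consistent:
    rows = conn.execute("SELECT id, email FROM users_new ORDER BY id").fetchall()
    conn.commit()
    conn.execute("DROP TABLE users_old")
```

Starting dual writes before the backfill is what makes the chunked copy safe: a row changed mid-copy already exists in the new table in its latest form, and the backfill simply skips it. In a real deployment, steps 5 and 6 would be application or configuration changes rolled out gradually rather than a single branch.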

Migrations Done Well

- Gergely Orosz tl;dr: "If you do some groundwork before starting the migration, you’ll reduce risk, gain confidence and understand the scope of the migration better." Gergely breaks the migration process into the following steps: (1) Preparation for migrations. (2) Pre-migration steps, such as monitoring and validation. (3) The migration itself, covering downtime, strategies & toolset. (4) After the migration. (5) The migration long-tail.

featured in #302

How We Migrated Dropbox From Nginx To Envoy

- Alexey Ivanov Oleg Guba tl;dr: "We’ll compare Nginx to Envoy across many software engineering and operational dimensions. We’ll also briefly touch on the migration process, its current state, and some of the problems encountered on the way."

featured in #197