Real Time Presence Platform System Design
tl;dr: “In layman’s terms, the presence status shows whether a particular user is currently online or offline. The presence status is popular on real-time messaging applications and social networking platforms such as LinkedIn, Facebook, and Slack. The presence status represents the availability of the user for communication on a chat application or a social network.” featured in #408
The Inner Workings Of Distributed Databases
- Alex Pelagenko tl;dr: “We analyze how several popular time-series / OLAP databases implement high availability to highlight the pros and cons of each approach.” Alex also reviews the fundamentals of distributed databases. featured in #407
You Want Modules, Not Microservices
- Ted Neward tl;dr: Ted dissects the concept of a microservice to “get to the real root of what's going on,” arguing that there is a mismatch between its promise and what it actually delivers. featured in #402
Automating Safe, Hands-Off Deployments
- Clare Liguori tl;dr: “In this article, we walk through the steps a code change goes through in a pipeline at Amazon on its way to production. A typical continuous delivery pipeline has four major phases - source, build, test, and production. We’ll dive into the details of what happens in each of these pipeline phases for a typical AWS service, and provide you with an example of how a typical AWS service team might set up one of their pipelines.” featured in #401
Keeping The Cloudflare API 'All Green' Using Python-Based Testing
- Elie Mitrani tl;dr: Elie discusses Scout, an automated system that runs Python tests to verify the end-to-end behavior of Cloudflare’s APIs. Scout evaluates APIs in production-like environments, green-lights a production deployment, and monitors the behavior of APIs in production. The article dives deep into how it operates. featured in #399
How Discord Stores Trillions Of Messages
- Bo Ingram tl;dr: “Our Cassandra cluster exhibited serious performance issues that required increasing amounts of effort to just maintain, not improve.” Bo discusses the troubles with Cassandra and the migration to ScyllaDB, a Cassandra-compatible database written in C++. featured in #396
Scaling Media Machine Learning At Netflix
tl;dr: Netflix’s goal in building ML infrastructure is to reduce the time from ideation to productization for the company. The team built infrastructure to (1) access and process media data (e.g. video, image, audio, and text), (2) train large-scale models efficiently, (3) productize models in a self-serve fashion, and (4) store and serve model outputs for consumption. featured in #396