On Making Architectural Decisions

- Evgeniy Nikonorov tl;dr: Evgeniy argues that the architect’s main task is to define a comprehensive context, meaning a set of evaluation criteria, for making well-balanced architectural decisions, and explains how to go about doing so.

featured in #252

Service Reliability Math That Every Engineer Should Know

- Matt Rickard tl;dr: "For a service to be up 99.99999% of the time, it can only be down at most 3 seconds every year. Unfortunately, achieving that milestone is a herculean task, even for the most experienced site reliability engineering teams."

featured in #251
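
The arithmetic behind the quoted figure can be checked with a short sketch. It assumes a 365-day year and treats availability as a simple fraction of wall-clock time; the function name `downtime_budget_seconds` is illustrative and does not come from the linked post.

```python
# Assumption: a 365-day year, ignoring leap years and leap seconds.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_budget_seconds(availability_percent: float) -> float:
    """Return the maximum downtime per year, in seconds, for an availability target."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


# Print the yearly downtime budget for a few common targets.
for target in (99.9, 99.99, 99.999, 99.99999):
    print(f"{target}% -> {downtime_budget_seconds(target):.2f} s/year")
```

Under these assumptions, 99.99999% leaves a budget of roughly 3.15 seconds per year, which matches the "at most 3 seconds" in the quote.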

How Does FaceTime Work?

- Mat Duggan tl;dr: "We need to establish a connection between two devices through various levels of networking abstraction, both at the ISP level and home level. This needs to be secure, reliable enough to maintain a conversation and also low bandwidth enough to be feasible given modern cellular data limits and home internet data caps. All of this needs to run on a device with limited battery capacity."

featured in #244

No, We Don’t Use Kubernetes

- Maik Zumstrull tl;dr: Maik is often asked, by customers and in interviews, whether his team uses Kubernetes as its primary dev platform. He believes the tech is "still very much at the peak of its hype cycle" and details a cost-benefit analysis of it in this post.

featured in #240

2020 Learnings, 2021 Expectations

- Chris Short tl;dr: Chris evaluates his 2020 predictions and lays out his expectations for 2021: 5G and Edge will have a significant impact, ARM will have a bigger year, live streaming will become the norm, and Developer Educator will become a job title.

featured in #220

Type In The Exact Number Of Machines To Proceed

tl;dr: When an enterprise has tooling that lets a single shell command change many machines at once, errors happen. The author finds it helpful to prompt the person, before the command runs, to manually enter the exact number of machines that will be affected.

featured in #214
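
A minimal sketch of such a confirmation step might look like the following. It assumes the tool already knows the list of target hosts and shows the count in the prompt; `confirm_machine_count` and its `read_input` parameter are hypothetical names, not taken from the linked post.

```python
def confirm_machine_count(hosts, read_input=input):
    """Return True only if the operator types the exact number of affected machines.

    `read_input` is injectable so the check can be exercised without a terminal.
    """
    expected = len(hosts)
    answer = read_input(
        f"This will affect {expected} machines. "
        "Type the exact number of machines to proceed: "
    )
    # Compare as strings after trimming whitespace, so "30" never passes for 3.
    return answer.strip() == str(expected)


# Demonstration with a canned answer instead of real keyboard input.
hosts = ["web1", "web2", "web3"]
if confirm_machine_count(hosts, read_input=lambda prompt: "3"):
    print("proceeding")
else:
    print("aborted")
```

Requiring the number itself, rather than a generic "y", makes the operator read the scale of the change before it runs.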

We Can't Send Email More Than 500 Miles

- Trey Harris tl;dr: Trey gets a call from the chairman of the statistics department at the college campus where he was working and is told: "we can't send mail more than 500 miles."

featured in #191

Deploys At Slack

- Michael Deng, Jonathan Chang tl;dr: "We had to invest in greater visibility and reliability in order to accommodate the amount of work being done. This post will outline our process and a few of the major projects that got us to where we are."

featured in #180

(A Few) Ops Lessons We All Learn The Hard Way

- Jan Schaumann tl;dr: A laundry list of 88 devops observations by Jan.

featured in #173