ML

Copilot Internals

- Parth Thakkar tl;dr: "In this post, I try to answer specific questions about the internals of Copilot, while also describing some interesting observations I made as I combed through the code. I will provide pointers to the relevant code for almost everything I talk about, so that interested folks can take a look at the code themselves."

featured in #376


Improving Instagram Notification Management With Machine Learning And Causal Inference

- Nailong Zhang tl;dr: "The key to solving this problem is figuring out the incremental value of sending a daily digest notification compared to not sending... For some cohorts, they would be active without receiving the daily digest notifications and thus the incremental values would be small; selecting these cohorts to send the digest notifications is inefficient and may even spam these users."
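The "incremental value" idea above is the core of uplift estimation: compare activity rates between users who received the digest and a randomized holdout who did not, per cohort. A minimal sketch with invented cohort numbers (not Instagram's data or code):

```python
# Hypothetical cohort data: (active, total) counts for users sent the daily
# digest vs. a randomized holdout that was not sent it. Numbers are made up
# purely to illustrate the cohort-selection logic described in the post.
cohorts = {
    "high_engagement": {"sent": (900, 1000), "holdout": (880, 1000)},
    "low_engagement":  {"sent": (400, 1000), "holdout": (250, 1000)},
}

def incremental_value(sent, holdout):
    """Uplift = P(active | notified) - P(active | not notified)."""
    return sent[0] / sent[1] - holdout[0] / holdout[1]

for name, d in cohorts.items():
    uplift = incremental_value(d["sent"], d["holdout"])
    print(f"{name}: uplift = {uplift:+.3f}")
```

The high-engagement cohort would be active anyway, so its uplift is small and sending it the digest is inefficient; the low-engagement cohort shows a large incremental value and is the better target.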

featured in #366


RecSysOps: Best Practices for Operating a Large-Scale Recommender System

- Ehsan Saberian, Justin Basilico tl;dr: "In this blog post, we introduce RecSysOps, a set of best practices and lessons that we learned while operating large-scale recommendation systems at Netflix. These practices helped us to keep our system healthy while: (1) reducing our firefighting time, (2) focusing on innovations and (3) building trust with our stakeholders."

featured in #360


What I Learned Building Platforms At Stitch Fix

tl;dr: "I was lucky enough to spend the last six years focusing on “engineering for data science” and learning to build great platforms." Stefan guides us through 5 lessons he learned: (1) Focus on adoption, not completeness. (2) Your users are not all equal. (3) Abstract away the internals of your system. (4) Live your users’ life cycle. (5) The two layer API trick. 

featured in #359


Machine Learning For Fraud Detection in Streaming Services

tl;dr: "Many users across many platforms make for a uniquely large attack surface that includes content fraud, account fraud, and abuse of terms of service. Detection of fraud and abuse at scale and in real-time is highly challenging."

featured in #355


How The New York Times Uses Machine Learning To Make Its Paywall Smarter

- Rohit Supekar tl;dr: "When the paywall was launched, the meter limit was the same for all users. However, as The Times has transformed into a data-driven digital company, we are now successfully using a causal machine learning model called the Dynamic Meter to set personalized meter limits and to make the paywall smarter."

featured in #345


Introducing Natural Language Search For Podcast Episodes

- Alexandre Tamborrino tl;dr: "To enable users to find more relevant content with less effort, we started investigating a technique called Natural Language Search, also known as Semantic Search. In a nutshell, Natural Language Search matches a query and a textual document that are semantically correlated instead of needing exact word matches. It matches synonyms, paraphrases, etc., and any variation of natural language that expresses the same meaning."
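The retrieval mechanics behind this kind of semantic search are worth sketching: queries and episode descriptions are mapped into a shared vector space, and ranking uses cosine similarity rather than exact word overlap. The tiny 3-d vectors below are invented for illustration; in a real system they would come from a trained bi-encoder model, not hand-written constants:

```python
import math

# Hypothetical episode embeddings (in practice produced by an encoder model).
episodes = {
    "ep1: sleep science":   [0.9, 0.1, 0.2],
    "ep2: startup funding": [0.1, 0.8, 0.3],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, top_k=1):
    """Rank episodes by similarity to the query embedding."""
    ranked = sorted(episodes.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [title for title, _ in ranked[:top_k]]

# A query like "how to get better rest" would embed near the sleep episode
# even though it shares no words with the title:
print(search([0.85, 0.05, 0.1]))  # → ['ep1: sleep science']
```

This is why synonyms and paraphrases match: closeness in embedding space, not shared tokens, drives the ranking.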

featured in #336


The Berkeley Crossword Solver

tl;dr: "The BCS uses a two-step process to solve crossword puzzles. First, it generates a probability distribution over possible answers to each clue using a question answering (QA) model; second, it uses probabilistic inference, combined with local search and a generative language model, to handle conflicts between proposed intersecting answers."
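The second stage can be illustrated with a toy version of the conflict-handling step: given per-clue candidate distributions (invented here, standing in for the QA model's outputs), choose the jointly most probable assignment that agrees at intersecting letters. The BCS uses probabilistic inference with local search and a generative language model for this; a brute-force search over two clues is only a sketch of the constraint being enforced:

```python
import itertools

# Hypothetical QA-model outputs: answer -> probability, per clue.
# 1-Across and 1-Down are assumed to intersect at their first letter.
candidates = {
    "1A": {"CAT": 0.6, "BAT": 0.4},
    "1D": {"BOAT": 0.7, "COAT": 0.3},
}

def best_consistent():
    """Return the highest-joint-probability pair satisfying the crossing."""
    best, best_p = None, 0.0
    for (a, pa), (d, pd) in itertools.product(candidates["1A"].items(),
                                              candidates["1D"].items()):
        if a[0] == d[0] and pa * pd > best_p:  # intersection constraint
            best, best_p = (a, d), pa * pd
    return best, best_p

print(best_consistent())
```

Note that the individually most likely answers (CAT, BOAT) conflict at the crossing square, so the solver backs off to BAT + BOAT, whose joint probability (0.28) beats the consistent alternative CAT + COAT (0.18).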

featured in #331


In Search Of The Least Viewed Article On Wikipedia

- Colin Morris tl;dr: "Based on our findings above, the least viewed articles on Wikipedia are not going to be merely about topics with little popular interest - they must also be “unlucky” in the sense of having very small random gaps... Of these 600,000 least lucky articles, all received at least a few views in 2021. The booby prize for least popular article of 2021 is shared by two articles which received exactly 3 probably-human pageviews."

featured in #322


Evolution Of ML Fact Store

- Vivek Kaushal tl;dr: "This post will focus on the large volume of high-quality data stored in Axion — our fact store that is leveraged to compute ML features offline. We built Axion primarily to remove any training-serving skew and make offline experimentation faster. We will share how its design has evolved over the years and the lessons learned while building it."

featured in #321