Tests

When, Why, And How GitHub And GitLab Use Feature Flags

- Ian Vanagas tl;dr: Ian discusses several benefits, such as reduced stress on developers, fewer failed deployments, and a higher rate of shipping features. GitLab calculated that fixing an issue without flags is as time-consuming as "developing a whole new feature." The article explores the advantages of feature flags over long-living feature branches for collaboration. Feature flags keep code changes small, make reviews easier, and limit merge conflicts. Both GitHub and GitLab use feature flags not just based on users but also on "actors" like organizations, teams, and repositories to create consistent experiences.
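To make the "actor" idea concrete, here is a minimal sketch (hypothetical names, not GitHub's or GitLab's actual code) of a flag check that hashes an actor id, so everyone in the same organization, team, or repository gets a consistent experience:

```python
# Minimal sketch of actor-based flag evaluation (hypothetical names, not
# GitHub's or GitLab's actual implementation). Hashing the actor's id keeps
# the decision stable, so a whole org/team/repo sees the same behavior.
import hashlib

def flag_enabled(flag_name: str, actor_id: str, rollout_percent: int) -> bool:
    """Return True if this actor falls inside the rollout percentage."""
    digest = hashlib.sha256(f"{flag_name}:{actor_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100          # stable bucket in [0, 100)
    return bucket < rollout_percent

# Keying on the organization (the "actor") rather than individual users
# gives every member of acme-corp the same experience.
print(flag_enabled("new_merge_ui", "org:acme-corp", rollout_percent=25))
```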

featured in #445


Keeping Figma Fast

- Slava Kim, Laurel Woods tl;dr: Figma's journey in evolving its performance testing system as the company scaled. Initially, Figma used a single MacBook for all its in-house performance testing. However, as the codebase grew more complex and the team expanded, this approach became unsustainable. The article outlines the challenges Figma faced, such as the need for more granular performance tests and the limitations of running tests on a single piece of hardware. To address these issues, Figma adopted a two-system approach: a cloud-based system for mass testing and a hardware-based system for more targeted, precise tests. Both systems are connected by the same Continuous Integration pipeline and aim to catch performance regressions early in the development cycle.
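As a rough illustration of the kind of check such a CI pipeline can run (a toy sketch, not Figma's system), a job can compare a benchmark's timing against a stored baseline and fail when it regresses beyond a tolerance:

```python
# Toy sketch of a CI performance gate (not Figma's system): time a workload,
# compare against a baseline, and fail the build if it slows down too much.
import statistics
import time

def benchmark(fn, runs: int = 10) -> float:
    """Median wall-clock time of fn() over several runs, in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

def check_regression(name: str, measured: float, baselines: dict, tolerance: float = 0.10) -> None:
    baseline = baselines[name]
    if measured > baseline * (1 + tolerance):
        raise SystemExit(f"{name}: {measured:.3f}s vs baseline {baseline:.3f}s, regression detected")

# Hypothetical benchmark name and a deliberately generous baseline for illustration.
baselines = {"sum_large_range": 0.5}
check_regression("sum_large_range", benchmark(lambda: sum(range(10**6))), baselines)
```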

featured in #444


TDD With GitHub Copilot

- Paul Sobocinski tl;dr: The article explores the relationship between Test-Driven Development (TDD) and AI coding assistants like GitHub Copilot. It argues that TDD remains essential even with AI assistance, as it provides fast, accurate feedback and helps divide and conquer problems. The article shares tips for using GitHub Copilot with TDD, including starting with context, following the Red-Green-Refactor cycle, backfilling tests, and recognizing Copilot's limitations in refactoring.
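A minimal Red-Green illustration of the cycle the article recommends (a hypothetical example, not code from the article):

```python
# Red: in TDD this test is written first and fails, because slugify() doesn't exist yet.
def test_slugify_replaces_spaces_and_lowercases():
    assert slugify("Hello World") == "hello-world"

# Green: the smallest implementation that makes the test pass (refactor comes after).
def slugify(title: str) -> str:
    return title.strip().lower().replace(" ", "-")

test_slugify_replaces_spaces_and_lowercases()
```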

featured in #441


Fuzz Testing Is the Best Thing To Happen To Our Application Tests

- Andrei Pechkurov tl;dr: The team at QuestDB faced challenges with segfaults, data corruption, and concurrency bugs. To address these, the team implemented fuzz testing, an automated testing technique that feeds invalid or unexpected data to a program and monitors it for exceptions. This article details the process of introducing fuzz testing, which revealed critical issues and led to more robust database performance. The team also worked with SQLancer, a tool for testing SQL Database Management Systems, to uncover issues in their SQL engine.
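A hand-rolled fuzz loop in the spirit the summary describes (illustrative only, not QuestDB's harness) feeds random, mostly invalid bytes to the code under test and keeps any input that triggers an unexpected exception:

```python
# Illustrative fuzz loop (not QuestDB's harness): throw random bytes at a parser,
# ignore the expected "invalid input" errors, and save anything that crashes it.
import json
import random

def fuzz_json_parser(iterations: int = 10_000, seed: int = 0) -> None:
    rng = random.Random(seed)                      # fixed seed makes failures reproducible
    for i in range(iterations):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(64)))
        try:
            json.loads(data)
        except (ValueError, UnicodeDecodeError):
            pass                                   # expected: most random input is invalid
        except Exception:                          # anything else is a bug worth keeping
            with open(f"crash_{i}.bin", "wb") as f:
                f.write(data)                      # preserve the input so it can be replayed
            raise

fuzz_json_parser()
```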

featured in #441


A/B Testing Examples From Airbnb And YC's Top Companies

- Ian Vanagas tl;dr: Ian provides a comprehensive look at A/B testing examples from various successful companies, including Monzo, Instacart, Coinbase, Airbnb, and Convoy. The post explores different approaches to A/B testing, such as Monzo's low-risk "pellets" strategy, Instacart's solution to a complex sampling problem, Coinbase's scaling of tests, Airbnb's interleaving and dynamic p-values, and Convoy's Bayesian approach.
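As a generic sketch of the Bayesian style attributed to Convoy (not Convoy's actual model), each variant's conversion rate can be modeled with a Beta posterior and compared by sampling:

```python
# Generic Bayesian A/B sketch: Beta posteriors over each variant's conversion
# rate, with Monte Carlo sampling to estimate P(variant B beats variant A).
import random

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=100_000, seed=0):
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Beta(1 + conversions, 1 + failures): posterior under a uniform prior.
        a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += b > a
    return wins / draws

# e.g. 120/1000 conversions for A vs 150/1000 for B
print(prob_b_beats_a(120, 1000, 150, 1000))
```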

featured in #437


A Software Engineer's Guide To A/B Testing

- Lior Neu-ner tl;dr: This guide provides an introduction to A/B testing for software engineers. It explains the basics of A/B testing, including how to devise, implement, monitor and analyze tests, and answers common questions about A/B testing. The guide also lists conditions under which you may want to avoid A/B testing, such as lack of traffic, high implementation costs, and ethical considerations. The post concludes with a launch checklist for A/B tests.
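For the analysis step, a common approach (a generic statistics sketch, not code from the guide) is a two-proportion z-test comparing conversion rates between control and test:

```python
# Generic two-proportion z-test for A/B results: pooled standard error,
# z statistic, and a two-sided p-value from the normal CDF.
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# e.g. 120/1000 conversions for control vs 150/1000 for test
print(two_proportion_z(120, 1000, 150, 1000))
```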

featured in #434


Snapshot Testing

- Kent Beck tl;dr: Kent explains what Snapshot Testing is and how it scores on the test desiderata - a list of 12 desirable properties of tests. This list is a useful framework for evaluating different types of tests.
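A bare-bones snapshot check might look like this (an illustrative sketch; Kent's post discusses the idea, not this code): serialize the output, compare it to a stored snapshot, and record the snapshot on the first run:

```python
# Bare-bones snapshot test sketch: write the snapshot file on first run,
# then fail on later runs whenever the serialized output changes.
import json
import pathlib

def check_snapshot(name: str, value, snapshot_dir: str = "__snapshots__") -> None:
    path = pathlib.Path(snapshot_dir) / f"{name}.json"
    current = json.dumps(value, indent=2, sort_keys=True)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(current)            # first run: record the snapshot
        return
    assert path.read_text() == current, f"snapshot {name!r} changed; update it if the change is intended"

check_snapshot("render_user_card", {"name": "Ada", "admin": False})
```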

featured in #431


Why We Test In Production (And You Should Too)

- Ian Vanagas tl;dr: "Testing in production successfully is a multi-step process, and this post goes over what it is, why we do it, and how to do it well." Ian covers various types of production testing, such as usage tracking, feedback, monitoring, load testing, and integration testing.

featured in #428


I Booted Linux 292,612 Times

- Richard Jones tl;dr: Richard discovered a bug in Linux where it occasionally hangs on boot. He ran guestfish in a loop, performing 10,000 boots and using a test harness with up to 8 threads. After an extensive bisection process between versions 6.0 and 6.4-rc6, he found that a regression in the printk time feature was responsible.
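A toy version of that kind of harness (not Richard's actual guestfish loop) runs a boot command repeatedly with a timeout and records which iterations hang:

```python
# Toy hang-detection harness: run the command many times with a timeout so an
# intermittent hang becomes a countable, reproducible event.
import subprocess

def count_hangs(cmd, runs: int = 10_000, timeout_s: int = 120) -> list[int]:
    """Run cmd repeatedly; return the iteration numbers that timed out (hung)."""
    hangs = []
    for i in range(runs):
        try:
            subprocess.run(cmd, timeout=timeout_s, capture_output=True)
        except subprocess.TimeoutExpired:
            hangs.append(i)
    return hangs

# Hypothetical invocation; the article boots a guest with guestfish in a loop.
# hangs = count_hangs(["guestfish", "-a", "/dev/null", "run"])
```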

featured in #423


When And How To Run Group-Targeted A/B Tests

- Lior Neu-ner tl;dr: Group-targeted tests are run when one user's interaction with your product impacts how others use it. “Suppose Slack wants to improve the usage of a new video calling feature. Improving the feature's discoverability for a single user will increase their own usage with it, but since they use it with their coworkers, their coworkers will also discover it.”
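A sketch of group-level assignment (hypothetical names, not the post's code): hash the workspace or team id instead of the user id, so coworkers who use the feature together all land in the same variant:

```python
# Group-targeted assignment sketch: bucket by the group id, not the user id,
# so interacting users cannot contaminate each other's variants.
import hashlib

def assign_variant(experiment: str, group_id: str, variants=("control", "test")) -> str:
    digest = hashlib.sha256(f"{experiment}:{group_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# Every member of the same workspace gets the same experience.
print(assign_variant("video-call-discovery", "workspace:acme"))
```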

featured in #421