/Tests

Prefer Narrow Assertions In Unit Tests

- Kai Kent tl;dr: “Broad assertions should only be used for unit tests that care about all of the implicitly tested behaviors, which should be a small minority of unit tests. Prefer to have at most one such test that checks for full equality of a complex object for the common case, and use narrow assertions for all other cases.” Examples are provided in this article. 
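
A minimal sketch of the distinction (the `UserProfile` / `make_profile` names are illustrative, not from the article): at most one test pins down the full object for the common case, while the other tests assert only on the behavior they care about.

```python
from dataclasses import dataclass

@dataclass
class UserProfile:
    name: str
    email: str
    theme: str

def make_profile(name: str, email: str) -> UserProfile:
    return UserProfile(name=name, email=email, theme="light")

def test_common_case_full_equality():
    # Broad assertion: at most one test like this per complex object.
    assert make_profile("Ada", "ada@example.com") == UserProfile(
        name="Ada", email="ada@example.com", theme="light"
    )

def test_email_is_stored():
    # Narrow assertion: checks only the behavior under test, so an
    # unrelated change (say, a new default theme) won't break it.
    assert make_profile("Ada", "ada@example.com").email == "ada@example.com"
```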

featured in #503


A Few Words On Testing

- Thorsten Ball tl;dr: “Too many flaky tests. Too much time spent getting the tests to pass after making a tiny change that I knew was correct but the tests didn’t. Too many integration tests that made people wait 20, 30, 40 minutes until they could merge their change, only to reveal — months later — that they never tested anything. Too many times have I fixed a bug and knew it was fixed because I tested it manually, thoroughly, and was 100% sure that I know how the code works and that this can’t happen again, but then spent hours — 10 times longer than it took me to fix the bug — to write a test only to prove what I knew all along, that the bug is fixed.” 

featured in #499


Increase Test Fidelity By Avoiding Mocks

- Andrew Trenk, Dillon Bly tl;dr: “Aim for as much fidelity as you can achieve without increasing the size of a test. At Google, tests are classified by size. Most tests should be small: they must run in a single process and must not wait on a system or event outside of their process. Increasing the fidelity of a small test is often a good choice if the test stays within these constraints. A healthy test suite also includes medium and large tests, which have higher fidelity since they can use heavyweight dependencies that aren’t feasible to use in small tests, e.g., dependencies that increase execution times or call other processes.”
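
A hedged sketch of the fidelity idea: inside a “small” test (single process, no external waits), a fake with real semantics gives higher fidelity than a canned mock. The `KeyValueStore` / `InMemoryStore` / `register_user` names are illustrative, not from the article.

```python
from typing import Dict, Optional, Protocol

class KeyValueStore(Protocol):
    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> Optional[str]: ...

class InMemoryStore:
    """A fake with real behavior (overwrites, misses), unlike a stub
    that only replays canned answers."""
    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
    def put(self, key: str, value: str) -> None:
        self._data[key] = value
    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

def register_user(store: KeyValueStore, user: str, email: str) -> None:
    store.put(f"user:{user}", email)

def test_register_user_round_trips():
    store = InMemoryStore()  # stays in-process, so the test remains "small"
    register_user(store, "ada", "ada@example.com")
    assert store.get("user:ada") == "ada@example.com"
```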

featured in #493


Meta's New LLM-Based Test Generator Is A Sneak Peek To The Future Of Development

- Leonardo Creed tl;dr: “Meta claims that “this is the first paper to report on LLM-generated code that has been developed independent of human intervention (other than final review sign off), and landed into large scale industrial production systems with guaranteed assurances for improvement over the existing code base.” Furthermore, there are solid principles that developers can take away in order to use AI effectively themselves.”

featured in #492


Too Much Of A Good Thing: The Trade-Off We Make With Tests

- Nicole Tietz-Sokolskaya tl;dr: “If you aim for 100% code coverage, you're saying that any risk of bug is a risk you want to avoid. And if you have no tests, you're saying that it's okay if you have severe bugs with maximum cost.” Nicole presents us with a way to think about how much code coverage is enough. You need two numbers: (1) The cost of writing tests. To get this, you have to measure how much time is spent on testing. (2) The cost of bugs. Getting this number is more complicated. You can measure the time your team spends on triaging and fixing bugs. The rest of it, you'll estimate with management and product. The idea here is just to get close enough to understand the trade-off, not to be exact.
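
A toy illustration of the trade-off, with invented numbers (none of these figures come from the article):

```python
# (1) Cost of writing tests: measured from time spent on testing.
hours_per_test = 1.5
# (2) Cost of bugs: measured triage/fix time plus estimates from
# management and product; only needs to be close enough, not exact.
hours_per_bug = 12.0
bugs_prevented_per_test = 0.2  # estimated from bug-tracker history

expected_savings = bugs_prevented_per_test * hours_per_bug
# Writing more tests pays off while expected savings exceed their cost.
print(f"net value per test: {expected_savings - hours_per_test:+.1f} hours")  # +0.9
```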

featured in #487


Feature Flags Spaghetti // FFs Missing Features

- Eliran Turgeman tl;dr: “I feel like there are some key features missing that would make me switch vendors. I mainly have two problems with current solutions: (1) It can get tedious and messy to turn on/off a feature when multiple FFs were placed for it. (2) Your codebase becomes a FF graveyard if you don’t remember cleaning it, and you probably don’t…” Eli provides suggestions on how to address these. 

featured in #486


The Day I Started Believing In Unit Tests

- Benjamin Richner tl;dr: “The test ran hundreds if not thousands of times successfully. What a waste of time... But then, one day, we started observing test failures. Not many, maybe three over the course of a few weeks. The test actually crashed with a Segmentation Fault, so it was clear that it was a severe error. Interestingly, none of the code under test had actually changed. Well, that's definitely something we had to investigate! I spare you the details of the search for the error, but eventually, I was able to reproduce the problem while a debugger was attached, so the entire context of the problem was handed to me on a silver platter.”

featured in #475


Canon TDD

- Kent Beck tl;dr: Test-driven development (TDD) is a programming method where new features are added without disrupting existing functions. It ensures new and old features work correctly, readies the system for future updates, and builds programmer confidence. The flow is as follows: (1) Write a list of the test scenarios you want to cover. (2) Turn exactly one item on the list into an actual, concrete, runnable test. (3) Change the code to make the test (& all previous tests) pass (adding items to the list as you discover them). (4) Optionally refactor to improve the implementation design. (5) Until the list is empty, go back to #2.
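
A minimal sketch of one cycle of that flow in Python’s standard `unittest` (the `slugify` example is illustrative, not from Kent’s article):

```python
import unittest

# Step 2: turn exactly one item from the scenario list into a
# concrete, runnable test.
class TestSlugify(unittest.TestCase):
    def test_lowercases_and_hyphenates(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

# Step 3: change the code just enough to make this test (and all
# previous tests) pass.
def slugify(text: str) -> str:
    return text.strip().lower().replace(" ", "-")

# Step 4 (optional): refactor. Step 5: return to step 2 until the
# scenario list is empty.
if __name__ == "__main__":
    unittest.main()
```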

featured in #473


Pytest Daemon: 10X Local Test Iteration Speed

- Ruby Feinstein tl;dr: Discord utilizes a Python monolith to power its API, from sending messages to managing subscriptions. To support this, they use pytest to write and run unit tests. Over the last 8 years, the time it takes to run a single test has continuously grown, reaching 13 seconds even if the test ends up doing absolutely nothing. This post discusses how tests were sped up.
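
The core daemon idea can be sketched in a few lines. This is an assumed, simplified design, not Discord’s implementation: the endpoint, wire protocol, and `serve` helper are all hypothetical. A long-lived process pays the heavy import cost once, then runs requested tests on demand.

```python
import socket
import pytest  # heavy application modules would also be imported here, once

HOST, PORT = "127.0.0.1", 8765  # hypothetical local endpoint

def serve() -> None:
    """Accept test IDs (e.g. "tests/test_api.py::test_ping") and run them
    in this warm process, skipping per-run import and startup cost."""
    with socket.create_server((HOST, PORT)) as server:
        while True:
            conn, _ = server.accept()
            with conn:
                test_id = conn.recv(4096).decode().strip()
                # A real daemon must also handle reloading edited modules;
                # this sketch just reruns pytest in the warm process.
                exit_code = pytest.main([test_id, "-q"])
                conn.sendall(str(exit_code).encode())

if __name__ == "__main__":
    serve()
```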

featured in #472