How I Learned To Stop Writing Brittle Tests and Love Expressive APIs

This article was adapted from a Google Testing on the Toilet (TotT) episode. You can download a printer-friendly version of this TotT episode and post it in your office.

By Titus Winters

A valuable but challenging property for tests is “resilience,” meaning a test should only fail when something important has gone wrong. However, an opposite property may be easier to see: A “brittle” test is one that fails not for real problems that would break in production, but because the test itself is fragile for innocuous reasons. Error messages, changing the order of metadata headers in a web request, or the order of calls to a heavily-mocked dependency can often cause a brittle test to fail.

Expressive test APIs are a powerful tool in the fight against brittle, implementation-detail heavy tests. A test written with IsSquare(output) is more expressive (and less brittle) than a test written with details such as JsonEquals(.width = 42, .length = 42), in cases where the size of the square is irrelevant. Similar expressive designs might include unordered element matching for hash containers, metadata comparisons for photos, and activity logs in processing objects, just to name a few. 

As an example, consider this C++ test code:

absl::flat_hash_set<int> GetValuesFromConfig(const Config&);


TEST(ConfigValues, DefaultConfigsArePrime) {

  // Note the strange order of these values. BAD CODE, DON’T DO THIS!

  EXPECT_THAT(GetValuesFromConfig(Config()), ElementsAre(29, 17, 31));

}

The reliance on hash ordering makes this test brittle, preventing improvements to the API being tested. A critical part of the fix to the above code was to provide better test APIs that allowed engineers to more effectively express the properties that mattered. Thus we added UnorderedElementsAre to the GoogleTest test framework and refactored brittle tests to use that: 

TEST(ConfigValues, DefaultConfigsArePrimeAndOrderDoesNotMatter) {

  EXPECT_THAT(GetValuesFromConfig(Config()), UnorderedElementsAre(17, 29, 31));

}

It’s easy to see brittle tests and think, “Whoever wrote this did the wrong thing! Why are these tests so bad?” But it’s far better to see that these brittle failures are a signal indicating where the available testing APIs are missing, under-advertised, or need attention.

Brittleness may indicate that the original test author didn’t have access to (or didn’t know about) test APIs that could more effectively identify the salient properties that the test meant to enforce. Without the right tools, it’s too easy to write tests that depend on irrelevant details, making those tests brittle. 

If your tests are brittle, look for ways to narrow down golden diff tests that compare exact pixel layouts or log outputs. Discover and learn more expressive APIs. File feature requests with the owners of the upstream systems.

If you maintain infrastructure libraries and can’t make changes because of brittleness, think about what your users are lacking, and invest in expressive test APIs.