Test Failures Should Be Actionable

This article was adapted from a Google Testing on the Toilet (TotT) episode. You can download a printer-friendly version of this TotT episode and post it in your office.

By Titus Winters

There are a lot of rules and best practices around unit testing. There are many posts on this blog; there is deeper material in the Software Engineering at Google book; there is specific guidance for every major language; there is guidance on test frameworks, test naming, and dozens of other test-related topics. Isn’t this excessive?

Good unit tests have several important properties, but you could focus on one key principle: Test failures should be actionable.

When a test fails, you should be able to begin investigation with nothing more than the test’s name and its failure messages—no need to add more information and rerun the test.

Effective use of unit test frameworks and assertion libraries (JUnit, Truth, pytest, GoogleTest, etc.) serves two important purposes. Firstly, the more precisely we express the invariants we are testing, the more informative and less brittle our tests will be. Secondly, when those invariants don’t hold and the tests fail, the failure info should be immediately actionable. This meshes well with Site Reliability Engineering guidance on alerting.

Consider this example of a C++ unit test of a function returning an absl::Status (an Abseil type that holds either an “OK” status or one of a number of different error codes). The first version asserts only a boolean:

  EXPECT_TRUE(LoadMetadata().ok());

Sample failure output:

  load_metadata_test.cc:42: Failure
  Value of: LoadMetadata().ok()
  Expected: true
  Actual: false

The second version asserts on the status itself:

  EXPECT_OK(LoadMetadata());

Sample failure output:

  load_metadata_test.cc:42: Failure
  Value of: LoadMetadata()
  Expected: is OK
  Actual: NOT_FOUND: /path/to/metadata.bin

If the first test fails, you still have to investigate why; the second test immediately gives you all the available detail, in this case because of a more precise GoogleTest matcher.
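The same idea applies beyond status values: the more structure a matcher understands, the more useful the failure message. As a rough sketch (GetPrimesBelow is a hypothetical function, not part of the example above), compare a boolean equality check on a container with a gMock container matcher:

  // Hypothetical function under test, assumed to return std::vector<int>.
  std::vector<int> GetPrimesBelow(int n);

  // Imprecise: on failure this prints only "Expected: true / Actual: false".
  EXPECT_TRUE(GetPrimesBelow(10) == std::vector<int>({2, 3, 5, 7}));

  // Precise: on failure, ElementsAre prints the actual elements alongside the
  // expected ones, so the mismatch is visible without rerunning the test.
  EXPECT_THAT(GetPrimesBelow(10), testing::ElementsAre(2, 3, 5, 7));

Both assertions pass on correct output; only their failure messages differ.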

Here are some other posts on this blog that emphasize making test failures actionable:

  • Writing Descriptive Test Names - If our tests are narrow and sufficiently descriptive, the test name itself may give us enough information to start debugging.

  • Keep Tests Focused - If we test multiple scenarios in a single test, it’s hard to identify exactly what went wrong; see the sketch after this list for a single-scenario example.

  • Prefer Narrow Assertions in Unit Tests - If we have overly wide assertions (such as depending on every field of a complex output proto), the test may fail for many unimportant reasons. False positives are the opposite of actionable.

  • Keep Cause and Effect Clear - Refrain from using large global test data structures shared across multiple unit tests, so that each test’s setup is clear and easy to identify.
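To tie several of these points together, here is a minimal sketch of a narrow, descriptively named test. The test name and the path-taking LoadMetadata variant are assumptions for illustration, not part of the example above:

  TEST(LoadMetadataTest, ReturnsNotFoundForMissingFile) {
    // One scenario per test: the test name alone says which behavior broke.
    absl::Status status = LoadMetadata("/nonexistent/metadata.bin");
    // Narrow assertion: check only the error code we care about rather than
    // every detail of the status, so unrelated changes don't fail this test.
    EXPECT_EQ(status.code(), absl::StatusCode::kNotFound);
  }

If this test fails, its name and its failure message together tell you where to start looking.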