Category Archives: Google Testing Blog

If it ain’t broke, you’re not trying hard enough

Include Only Relevant Details In Tests

This article was adapted from a Google Testing on the Toilet (TotT) episode. You can download a printer-friendly version of this TotT episode and post it in your office.

By Dagang Wei

What problem in the code below makes the test hard to follow?

def test_get_balance(self):
  settings = BankSettings(FDIC_INSURED, REGULATED, US_BASED)
  account = Account(settings, ID, BALANCE, ADDRESS, NAME, EMAIL, PHONE)
  self.assertEqual(account.GetBalance(), BALANCE)

The problem is that there is a lot of noise in the account creation code, which makes it hard to tell which details are relevant to the assert statement. 

But going from one extreme to the other can also make the test hard to follow:

def test_get_balance(self):
  account = _create_account()
  self.assertEqual(account.GetBalance(), BALANCE)

Now the problem is that critical details are hidden in the _create_account() helper function, so it’s not obvious where the BALANCE field comes from. In order to understand the test, you need to switch context by diving into the helper function.

A good test should include only details relevant to the test, while hiding noise:

def test_get_balance(self):
  account = _create_account(BALANCE)
  self.assertEqual(account.GetBalance(), BALANCE)

By following this advice, it should be easy to see the flow of data throughout a test. For example:

Bad (flow of data is hidden):

def test_bank_account_overdraw_fails(self):
  account = _create_account()
  outcome = _overdraw(account)
  self._assert_withdraw_failed(outcome, account)

def _create_account():
  settings = BankSettings(...)
  return Account(settings, BALANCE, ...)

def _overdraw(account):
  # Boilerplate code
  ...
  return account.Withdraw(BALANCE + 1)

def _assert_withdraw_failed(self, outcome, account):
  self.assertEqual(outcome, FAILED)
  self.assertEqual(account.GetBalance(), BALANCE)

Good (flow of data is clear):

def test_bank_account_overdraw_fails(self):
  account = _create_account(BALANCE)
  outcome = _withdraw(account, BALANCE + 1)
  self.assertEqual(outcome, FAILED)
  self.assertEqual(account.GetBalance(), BALANCE)

def _create_account(balance):
  settings = BankSettings(...)
  return Account(settings, balance, ...)

def _withdraw(account, amount):
  # Boilerplate code
  ...
  return account.Withdraw(amount)

Simplify Your Control Flows

This is another post in our Code Health series. A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

By Jeff Hoy

When adding loops and conditionals, even simple code can become difficult to understand.
Consider this change:

Before:

if (commode.HasPreferredCustomer()) {
  commode.WarmSeat();
}

After:

if (commode.HasPreferredCustomer()) {
  commode.WarmSeat();
} else if (commode.CustomerOnPhone()) {
  commode.ChillSeat();
}

While the above change may seem simple, even adding a single else statement can make the code harder to follow since the complexity of code grows quickly with its size. Below we see the code surrounding the above snippet; the control flow on the right illustrates how much a reader needs to retain:

while (commode.StillOccupied()) {
  if (commode.HasPreferredCustomer()) {
    commode.WarmSeat();
  } else if (commode.CustomerOnPhone()) {
    commode.ChillSeat();
  }
  if (commode.ContainsKale()) {
    commode.PrintHealthCertificate();
    break;
  }
}

(Control-flow graph: 5 structures and 9 edges, challenging for a reader to retain in memory.)


In order to fully understand the code, the reader needs to keep the entire control flow in their head. However, the retention capacity of working memory is limited (source). Code path complexity also challenges the reader, and can be measured using cyclomatic complexity.
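For reference, cyclomatic complexity for a single connected control-flow graph is M = E − N + 2 (edges minus nodes plus two); a quick sketch:

```python
def cyclomatic_complexity(edges: int, nodes: int) -> int:
    """M = E - N + 2 for a single connected control-flow graph."""
    return edges - nodes + 2

# A straight-line function: entry and exit joined by one edge, one path.
assert cyclomatic_complexity(edges=1, nodes=2) == 1

# An if/else (branch node, two arms, merge node) adds one independent path.
assert cyclomatic_complexity(edges=4, nodes=4) == 2
```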

To reduce cognitive overhead of complex code, push implementation logic down into functions and methods. For example, if the if/else structure in the above code is moved into an AdjustSeatTemp() method, the reviewer can review the two blocks independently, each having a much simpler control graph:

while (commode.StillOccupied()) {
  commode.AdjustSeatTemp();
  if (commode.ContainsKale()) {
    commode.PrintHealthCertificate();
    break;
  }
}

(Control-flow graph of the loop: 3 control structures and 5 edges, easier to remember.)

(Control-flow graph of Commode::AdjustSeatTemp(): 2 structures and 4 edges.)


Avoiding complexity makes code easier to follow. In addition, code reviewers are more likely to identify logic errors, and maintainers are less likely to introduce complex code.

Improve Readability With Positive Booleans

This is another post in our Code Health series. A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

By Max Kanat-Alexander

Reading healthy code should be as easy as reading a book in your native language. You shouldn’t have to stop and puzzle over what a line of code is doing. One small trick that can assist with this is to make boolean checks about something positive rather than about something negative.

Here’s an extreme example:

if not nodisable_kryptonite_shield:
  devise_clever_escape_plan()
else:
  engage_in_epic_battle()

What does that code do? Sure, you can figure it out, but healthy code is not a puzzle, it’s a simple communication. Let’s look at two principles we can use to simplify this code.

1. Name your flags and variables in such a way that they represent the positive check you wish to make (the presence of something, something being enabled, something being true) rather than the negative check you wish to make (the absence of something, something being disabled, something being false).


if not enable_kryptonite_shield:
  devise_clever_escape_plan()
else:
  engage_in_epic_battle()

That is already easier to read and understand than the first example.

2. If your conditional looks like “if not x: ... else: ...”, reverse it to put the positive case first.

if enable_kryptonite_shield:
  engage_in_epic_battle()
else:
  devise_clever_escape_plan()


Now the intention of the code is immediately obvious.

There are many other contexts in which this gives improvements to readability. For example, the command foo --disable_feature=False is harder to read and think about than
foo --enable_feature=True, particularly when you change the default to enable the feature.
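The command-line point can be sketched with Python's argparse: BooleanOptionalAction (Python 3.9+) defines one positively named flag and derives its negation automatically, so callers never face a double negative (the flag name here is illustrative):

```python
import argparse

parser = argparse.ArgumentParser()
# One positive flag; argparse derives --no-enable-feature automatically.
parser.add_argument("--enable-feature",
                    action=argparse.BooleanOptionalAction,
                    default=True)

args = parser.parse_args([])  # Default: feature enabled.
assert args.enable_feature is True

args = parser.parse_args(["--no-enable-feature"])
assert args.enable_feature is False
```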

There are some exceptions (for example, in Python, if foo is not None could be considered a “positive check” even though it has a “not” in it), but in general checking the presence or absence of a positive is simpler for readers to understand than checking the presence or absence of a negative.


Shell Scripts: Stay Small & Simple

A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

By David Mandelberg

Shell scripts (including Bash scripts) can be convenient for automating simple command-line procedures, and they are often better than keeping complicated commands in a single developer's history. However, shell scripts can be hard to understand and maintain, and are typically not as well supported as other programming languages: shell has less support for unit testing, and a reader is less likely to be experienced with the language.

Python, Go, or other general-purpose languages are often better choices than shell. Shell is convenient for some simple use cases, and the Google shell style guide can help with writing better shell scripts. But it is difficult, even for experienced shell scripters, to mitigate the risks of its many surprising behaviors. So whenever possible, use shell scripts only for small, simple use cases, or avoid shell entirely.

Here are some examples of mistakes that are far too easy to make when writing a shell script (see Bash Pitfalls for many more):

  • Forgetting to quote something can have surprising results, due to shell's complicated evaluation rules. E.g., even if a wildcard is properly quoted, it can still be unexpectedly expanded elsewhere:

$ msg='Is using bash a pro? Or a con?'
$ echo $msg  # Note that there's a subdirectory called 'proc' in the current directory.
Is using bash a proc Or a con?  # ? was unexpectedly treated as a wildcard.

  • Many things that would be function arguments in other languages are command line arguments in shell. Command line arguments are world-readable, so they can leak secrets:

$ curl -H "Authorization: Bearer ${SECRET}" "$URL" &
$ ps aux  # The current list of processes shows the secret.

  • By default, the shell ignores all errors from commands, which can cause severe bugs if code assumes that earlier commands succeeded. The command set -e can appear to force termination at the first error, but its behavior is inconsistent. For example, set -e does not affect some commands in pipelines (like false in false | cat), nor will it affect some command substitutions (such as the false in export FOO="$(false)"). Even worse, its behavior inside a function depends on how that function is called.

  • Many things run in subshells, which can (often unexpectedly) hide changes to variables from the main shell. It can also make manual error handling harder, compounding the issue above:

$ run_or_exit() { "$@" || exit $?; }  # Executes the arguments then exits on failure.
$ foo="$(run_or_exit false)"  # Exits the $() subshell, but the script continues.
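As a sketch of the alternative, the same steps in Python avoid word splitting, wildcard expansion, and silently ignored errors (assuming a POSIX system where echo and false are on the PATH):

```python
import subprocess

msg = "Is using bash a pro? Or a con?"
# List-form arguments: no word splitting and no glob expansion, so the
# '?' characters reach echo literally even if a 'proc' directory exists.
result = subprocess.run(["echo", msg], capture_output=True, text=True, check=True)
assert result.stdout.strip() == msg

# check=True raises on failure instead of silently continuing.
handled = False
try:
    subprocess.run(["false"], check=True)
except subprocess.CalledProcessError:
    handled = True
assert handled
```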




The Secret to Great Code Reviews: Respect Reviewers’ Comments

This is another post in our Code Health series. A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

By Marius Latinis

You prepared a code change and asked for a review. A reviewer left a comment you disagree with. Are you going to reply that you will not address the comment? 

When addressing review comments on your code, find a solution that makes both you and the reviewer happy. The fact that a reviewer left a comment suggests you may be able to improve the code further. Here are two effective ways to respond:

  • When it’s easy for you to make an improvement, update the code. Improved code benefits future readers and maintainers. You will also avoid a potentially long and emotional debate with a reviewer.

  • If the comment is unclear, ask the reviewer to explain. To facilitate the process, talk directly with the reviewer through chat, or in person.

Let’s demonstrate with an example code review scenario:

  1. You prepare a code change that modifies the following function:

    Before:

    // Return the post with the most upvotes.
    Post findMostUpvotedPost(
        List<Post> posts) {
      ...
    }

    After:

    // Return the post with the most upvotes.
    // Restrict to English if englishOnly = true.
    Post findMostUpvotedPost(
        List<Post> posts,
        boolean englishOnly) {
      ...  // Old and new logic mixed together.
    }

  2. The code reviewer leaves the following comment:

    “The new function signature is too complex. Can we keep the signature unchanged?”
You disagree with the comment that one additional parameter makes the signature too complex. Nevertheless, do not reject the suggestion outright.

There is another issue that might have prompted the comment: it is not the responsibility of this function to check the post’s language (https://en.wikipedia.org/wiki/Single-responsibility_principle).

  3. You rewrite your code to address the reviewer’s comment:

    ImmutableList<Post> englishPosts = selectEnglishPosts(posts);  // Your new logic.
    Post mostUpvotedEnglishPost = findMostUpvotedPost(englishPosts);  // No change needed.
Now the code is improved, and both you and the reviewer are happy.

Communicate Design Tradeoffs Visually

A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

By Tim Lyakhovetskiy


A goal of any written design or project proposal is to present and evaluate alternatives. However, documents that include multiple solutions can be difficult to read when the qualities of each solution are not clearly expressed.

A common approach to simplifying proposals is to use “pros and cons” for each alternative, but this leads to biased writing since the pros and cons may be weighed differently depending on the reader’s priorities.

In this example, can you quickly tell how this option would measure up against others?

Option 1 - Optimize Shoelace Untangling Wizard in ShoeApp UI


Pros

  • Shoelace Untangling Wizard UI will use 10% less CPU

  • Less than one quarter to implement

  • Users will see 100ms less UI lag

Cons

  • Security risk (shoelace colors exposed) until ShoeAppBackend team fixes lacing API

  • ShoeAppBackend will be blocked for 3 months

  • User documentation for Shoelace Untangling Wizard UI has to change

This format requires the reader to remember many details in order to evaluate which option they prefer. Instead, express tradeoffs in terms of their impact on quality attributes. Common attributes include Performance, Security, Maintainability, Usability, Testability, Scalability, and Cost.

Use colors and symbols in a table (➖ negative, 〰 somewhat negative, ➕ positive) to make it easy for readers to parse your ideas. The symbols are needed for accessibility, e.g. for color-blind readers and screen readers.

Option 1 - Optimize Shoelace Untangling Wizard in ShoeApp UI

Usability

➕ Users will see 100ms less UI lag

〰 User documentation for Shoelace Untangling Wizard UI has to change

Security

➖ Security risk (shoelace colors exposed) until ShoeAppBackend fixes lacing API

Partner impact

➖ ShoeAppBackend will be blocked for 3 months

Performance

➕ Shoelace Untangling Wizard UI will use 10% less CPU

Schedule/Cost

➕ Less than one quarter to implement

Notice that the content uses approximately the same space but communicates more visually. The benefit is even greater when there are many alternatives/attributes, as it’s possible to evaluate the whole option at a glance.



Else Nuances

 This is another post in our Code Health series. A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

By Sam Lee and Stan Chan

If your function exits early in an if statement, using or not using an else clause is equivalent in terms of behavior. However, the proper use of else clauses and guard clauses (lack of else) can help emphasize the intent of the code to the reader.

Consider the following guidelines to help you structure your functions:

  • Use a guard clause to handle special cases upfront, so that the rest of the code can focus on the core logic. A guard clause checks a criterion and fails fast or returns early if it is not met, which reduces nesting (see the Reduce Nesting article).

Instead of nesting the core logic under an else:

def parse_path(path: str) -> Path:
  if not path:
    raise ValueError("path is empty.")
  else:
    # Nested logic here.
    ...

Prefer a guard clause:

def parse_path(path: str) -> Path:
  if not path:
    raise ValueError("path is empty.")
  # No nesting needed for the valid case.
  ...

  • Use else if it is part of the core responsibility. Prefer to keep related conditional logic syntactically grouped in the same if...else structure when each branch is relevant to the function's core responsibility. Grouping the branches this way makes their complementary nature explicit through the else statement, rather than leaving it to be inferred from the behavior of the preceding return statements.

Instead of a chain of independent returns:

def get_favicon(self) -> Icon:
  if self.user.id is None:
    return Icon.SIGNED_OUT
  if self.browser.incognito:
    return Icon.INCOGNITO
  if not self.new_inbox_items:
    return Icon.EMPTY
  return Icon.HAS_ITEMS

Prefer grouping the complementary branches with else:

def get_favicon(self) -> Icon:
  if self.user.id is None:
    return Icon.SIGNED_OUT
  elif self.browser.incognito:
    return Icon.INCOGNITO
  elif not self.new_inbox_items:
    return Icon.EMPTY
  else:
    return Icon.HAS_ITEMS
  # No trailing return is needed or allowed.
When it’s idiomatic, use a switch (or similar) statement instead of if...else statements. (switch/when in Go/Kotlin can accept boolean conditions like if...else.)

Not every scenario is clear-cut; use your best judgment to choose between the two styles. A good rule of thumb: use a guard clause for a special case, and use else for core logic. Following these guidelines can improve code understandability by emphasizing the connections between different logical branches.


Use Abstraction to Improve Function Readability

This is another post in our Code Health series. A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.


By Palak Bansal and Mark Manley


Which version of the createPizza function below is easier to understand?

func createPizza(order *Order) *Pizza {
  pizza := &Pizza{Base: order.Size,
                  Sauce: order.Sauce,
                  Cheese: "Mozzarella"}

  if order.kind == "Veg" {
    pizza.Toppings = vegToppings
  } else if order.kind == "Meat" {
    pizza.Toppings = meatToppings
  }

  oven := oven.New()
  if oven.Temp != cookingTemp {
    for oven.Temp < cookingTemp {
      time.Sleep(checkOvenInterval)
      oven.Temp = getOvenTemp(oven)
    }
  }

  if !pizza.Baked {
    oven.Insert(pizza)
    time.Sleep(cookTime)
    oven.Remove(pizza)
    pizza.Baked = true
  }

  box := box.New()
  pizza.Boxed = box.PutIn(pizza)
  pizza.Sliced = box.SlicePizza(order.Size)
  pizza.Ready = box.Close()
  return pizza
}

func createPizza(order *Order) *Pizza {
  pizza := prepare(order)
  bake(pizza)
  box(pizza)
  return pizza
}

func prepare(order *Order) *Pizza {
  pizza := &Pizza{Base: order.Size,
                  Sauce: order.Sauce,
                  Cheese: "Mozzarella"}
  addToppings(pizza, order.kind)
  return pizza
}

func addToppings(pizza *Pizza, kind string) {
  if kind == "Veg" {
    pizza.Toppings = vegToppings
  } else if kind == "Meat" {
    pizza.Toppings = meatToppings
  }
}

func bake(pizza *Pizza) {
  oven := oven.New()
  heatOven(oven)
  bakePizza(pizza, oven)
}

func heatOven(oven *Oven) { … }

func bakePizza(pizza *Pizza, oven *Oven) { … }

func box(pizza *Pizza) { … }

You probably said the right-hand side is easier, but why? The left side mixes together several levels of abstraction: low-level implementation details (e.g., how to heat the oven), intermediate-level functions (e.g., how to bake pizza), and high-level abstractions (e.g., preparing, baking, and boxing the pizza).

The right side is easier to follow because the functions have a consistent level of abstraction, providing a top-down narrative of the code’s logic. createPizza is a high-level function that delegates the preparing, baking, and boxing steps to lower-level specialized functions with intuitive names. Those functions, in turn, delegate to their own lower-level specialized functions (e.g., heatOven) until they reach a function that handles implementation details without needing to call other functions. 

Avoid mixing different abstraction layers into a single function. Nest functions at equal abstraction levels to provide a narrative. This self-documenting style is simpler to follow, debug, and reuse.

You can learn more about this topic in the book Clean Code by Robert C. Martin. 



Better Feedback Makes for Faster Reviews


by Felipe Sodré and Adam Bender

Have you ever received a code review with a comment like this?

server_is_alive = InitializeServer()

Comment:

This doesn’t make any sense. Why don't you use InitializeServerWithAFewExtraSteps() instead?


If so, it probably left you scratching your head. Why is the reviewer asking this question? Are they testing you? Did you make a big mistake? The problem is the comment is vague (and maybe even a bit rude). It also leaves out important context as to what the reviewer is thinking, making it difficult to respond to. Here are a few simple ways to ensure your review comments are effective, helpful, and clear. 

  • Be kind! People are more receptive to feedback if you assume competence and treat them with respect. 

  • Focus your comments on the code, not the author. Avoid statements with the word ‘you’ which can give the impression that you are judging the person and not the artifact.

  • Explain why you are making the comment. You may be aware of alternatives that are not obvious to the author, or they may be aware of additional constraints.

  • Express the tradeoffs your suggestion entails and take a pragmatic approach to working with the author to achieve the right balance.

  • Approach your role as a guide, not a test grader. Balance giving direct guidance with leaving some degrees of freedom for the author.

  • Consider marking low-priority comments with a severity label such as Nit, Optional, or FYI to help the author prioritize the important ones.

Some comments can be difficult to express concisely. If you have a rewrite in mind, like a subtle API usage or structural change (e.g., split this big change into smaller changes), consider providing an example. If you find you are stuck after more than two rounds of feedback, consider moving to a more direct communication channel like chat or live call to discuss next steps.

When done well, comments can make code reviews more efficient and collaborative:

server_is_alive = InitializeServer()

Comment:

InitializeServerWithAFewExtraSteps() seems to achieve the same result but with built-in logging and auditing (see http://short_link_here). Would that be a better option here?


The goal of reviews of any kind is to get the best possible outcome for you and your team. Software Engineering is a team effort, and by helping each other more effectively you will deliver better, together!

References: Code Health: Respectful Reviews == Useful Reviews, Google's guide on Code Review Comments

More information, discussion, and archives: testing.googleblog.com

Copyright Google LLC. Licensed under a Creative Commons

Attribution–ShareAlike 4.0 License (http://creativecommons.org/licenses/by-sa/4.0/).


Sensenmann: Code Deletion at Scale

By Phil Norman

Code at Scale

At Google, tens of thousands of software engineers contribute to a multi-billion-line mono-repository. This repository, stored in a system called Piper, contains the source code of shared libraries, production services, experimental programs, diagnostic and debugging tools: basically anything that's code-related.


This open approach can be very powerful. For example, if an engineer is unsure how to use a library, they can find examples just by searching. It also allows kind-hearted individuals to perform important updates across the whole repository, be that migrating to newer APIs, or following language developments such as Python 3, or Go generics.


Code, however, doesn't come for free: it's expensive to produce, but also costs real engineering time to maintain. Such maintenance cannot easily be skipped, at least if one wants to avoid larger costs later on.


But what if there were less code to maintain? Are all those lines of code really necessary?


Deletion at Scale

Any large project accumulates dead code: there's always some module that is no longer needed, or a program that was used during early development but hasn't been run in years. Indeed, entire projects are created, function for a time, and then stop being useful. Sometimes they are cleaned up, but cleanups require time and effort, and it's not always easy to justify the investment.


However, while this dead code sits around undeleted, it's still incurring a cost: the automated testing system doesn't know it should stop running dead tests; people running large-scale cleanups aren't aware that there's no point migrating this code, as it is never run anyway.


So what if we could clean up dead code automatically? That was exactly what people started thinking several years ago, during the Zürich Engineering Productivity team's annual hackathon. The Sensenmann project, named after the German word for the embodiment of Death, has been highly successful. It submits over 1000 deletion changelists per week, and has so far deleted nearly 5% of all C++ at Google.


Its goal is simple (at least, in principle): automatically identify dead code, and send code review requests ('changelists') to delete it.


What to Delete?

Google's build system, Blaze (the internal version of Bazel) helps us determine this: by representing dependencies between binary targets, libraries, tests, source files and more, in a consistent and accessible way, we're able to construct a dependency graph. This allows us to find libraries that are not linked into any binary, and propose their deletion.


That's only a small part of the problem, though: what about all those binaries? All the one-shot data migration programs, and diagnostic tools for deprecated systems? If they don't get removed, all the libraries they depend on will be kept around too.


The only real way to know if programs are useful is to check whether they're being run, so for internal binaries (programs run in Google's data centres, or on employee workstations), a log entry is written when a program runs, recording the time and which specific binary it is. By aggregating this, we get a liveness signal for every binary used in Google. If a program hasn't been used for a long time, we try sending a deletion changelist.


What Not to Delete?

There are, of course, exceptions: some program code is there simply to serve as an example of how to use an API; some programs run only in places we can't get a log signal from. There are many other exceptions too, where removing the code would be deleterious. For this reason, it's important to have a blocklisting system so that exceptions can be marked, and we can avoid bothering people with spurious changelists.


The Devel's in the Details

Consider a simple case. We have two binaries, each depending on its own library, and also on a third, shared library. Drawing this (ignoring the source files and other dependencies), we find this kind of structure:


If we see that main1 is in active use, but main2 was last used over a year ago, we can propagate the liveness signal through the build tree, marking main1 as alive along with everything it depends upon. What is left can be removed; as main2 depends on lib2, we want to delete these two targets in the same change:


So far so good, but real production code has unit tests, whose build targets depend upon the libraries they test. This immediately makes the graph traversal a lot more complicated:


The testing infrastructure is going to run all those tests, including lib2_test, despite lib2 never being executed 'for real'. This means we cannot use test runs as a 'liveness' signal: if we did, we'd consider lib2_test to be alive, which would keep lib2 around forever. We would only be able to clean up untested code, which would severely hamper our efforts.


What we really want is for each test to share the fate of the library it is testing. We can do this by making the library and its test interdependent, thus creating loops in the graph:


This turns each library and its test into a strongly connected component. We can use the same technique as before, marking the 'live' nodes and then hunting for collections of 'dead' nodes to be deleted, but this time using Tarjan's strongly connected components algorithm to deal with the loops.
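The traversal can be sketched in a few lines of Python (a toy graph, not real Blaze data: the visited set absorbs the cycles created by the test↔library edges, while the real system additionally condenses strongly connected components with Tarjan's algorithm so each dead component becomes one deletion change):

```python
# Hypothetical build graph: target -> set of dependencies.
deps = {
    "main1": {"lib1", "shared"},
    "main2": {"lib2", "shared"},
    "lib1": set(), "lib2": set(), "shared": set(),
    "lib1_test": {"lib1"}, "lib2_test": {"lib2"},
}
# Make each library depend on its test, so the pair shares its fate.
deps["lib1"].add("lib1_test")
deps["lib2"].add("lib2_test")

def mark_live(roots):
    """Depth-first liveness propagation; the visited set handles cycles."""
    live, stack = set(), list(roots)
    while stack:
        node = stack.pop()
        if node not in live:
            live.add(node)
            stack.extend(deps[node])
    return live

# Only main1 has a recent execution-log entry.
live = mark_live({"main1"})
dead = set(deps) - live
assert dead == {"main2", "lib2", "lib2_test"}  # Deletion candidates.
```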


Simple, right? Well, yes, if it's easy to identify the relationships between tests and the libraries they're testing. Sadly, that is not always the case. In the examples above, there's a simple naming convention which allows us to match tests to libraries, but we can't rely on that heuristic in general.


Consider the following two cases:



On the left, we have an implementation of the LZW compression algorithm, as separate compressor and decompressor libraries. The test is actually testing both of them, to ensure data isn't corrupted after being compressed and then decompressed. On the right, we have a web_test that is testing our web server library; it uses a URL encoder library for support, but isn't actually testing the URL encoder itself. On the left, we want to consider the LZW test and both LZW libraries as one connected component, but on the right, we'd want to exclude the URL encoder and consider web_test and web_lib as the connected component.


Despite requiring different treatment, these two cases have identical structures. In practice, we can encourage engineers to mark libraries like url_encoder_lib as 'test only' (i.e., only for use in support of unit testing), which helps in the web-test case; otherwise, our current approach is to use the edit distance between test and library names to pick the most likely library to match a given test. Identifying cases like the LZW example, with one test and two libraries, is likely to involve processing test coverage data, and has not yet been explored.
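The name-matching heuristic can be sketched with the standard library's difflib, which ranks candidates by similarity ratio (the names come from the example above; the real matcher is more involved):

```python
import difflib

def likely_library(test_name, libraries):
    """Pick the library whose name is closest to the test's base name."""
    base = test_name.removesuffix("_test")  # Python 3.9+
    return difflib.get_close_matches(base, libraries, n=1, cutoff=0.0)[0]

assert likely_library("web_test", ["web_lib", "url_encoder_lib"]) == "web_lib"
```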


Focus on the User...

While the ultimate beneficiaries of dead code deletion are the software engineers themselves, many of whom appreciate the help in keeping their projects tidy, not everyone is happy to receive automated changelists trying to delete code they wrote. This is where the social engineering side of the project comes in, which is every bit as important as the software engineering.


Automatic code deletion is an alien concept to many engineers, and just as with the introduction of unit testing 20 years ago, many are resistant to it. It takes time and effort to change people's minds, along with a good deal of careful communication.


There are three main parts to Sensenmann's communication strategy. Of primary importance are the change descriptions, as they are the first thing a reviewer will see. They must be concise, but must provide enough background for all reviewers to be able to make a judgement. This is a difficult balance to achieve: too short, and many people will fail to find the information they need; too long, and one ends up with a wall of text no one will bother to read. Well-labelled links to supporting documentation and FAQs can really help here.


The second part is the supporting documentation. Concise and clear wording is vital here, too, as is a good navigable structure. Different people will need different information: some need reassurance that in a source control system, deletions can be rolled back; some will need guidance in how best to deal with a bad change, for example by fixing a misuse of the build system. Through careful thought, and iterations of user feedback, the supporting documentation can become a useful resource.


The third part is dealing with user feedback. This can be the hardest part: feedback is more frequently negative than positive, and can require a cool head and a good deal of diplomacy. However, accepting such feedback is the best way to improve the system, make users happier, and avoid negative feedback in the future.


Onwards and Upwards

Automatically deleting code may sound like a strange idea: code is expensive to write, and is generally considered to be an asset. However, unused code costs time and effort, whether in maintaining it, or cleaning it up. Once a code base reaches a certain size, it starts to make real sense to invest engineering time in automating the clean-up process. At Google's scale, it is estimated that automatic code deletion has paid for itself tens of times over, in saved maintenance costs.


The implementation requires solutions to problems both technical and social in nature. While a lot of progress has been made in both these areas, they are not entirely solved yet. As improvements are made, though, the rate of acceptance of the deletions increases and automatic deletion becomes more and more impactful. This kind of investment will not make sense everywhere, but if you have a huge mono-repository, maybe it'd make sense for you too. At least at Google, reducing the C++ maintenance burden by 5% is a huge win.