Tag Archives: Code Health

Arrange Your Code to Communicate Data Flow

This article was adapted from a Google Tech on the Toilet (TotT) episode. You can download a printer-friendly version of this TotT episode and post it in your office.

By Sebastian Dörner

We often read code linearly, from one line to the next. To make code easier to understand and to reduce cognitive load for your readers, make sure that adjacent lines of code are coherent. One way to achieve this is to order your lines of code to match the data flow inside your method:

fun getSandwich(
    bread: Bread, pasture: Pasture
): Sandwich {
  // This alternates between milk- and
  // bread-related code.
  val cow = pasture.getCow()
  val slicedBread = bread.slice()
  val milk = cow.getMilk()
  val toast = toastBread(slicedBread)
  val cheese = makeCheese(milk)
  return Sandwich(cheese, toast)
}

fun getSandwich(
    bread: Bread, pasture: Pasture
): Sandwich {
  // Linear flow from cow to milk
  // to cheese.
  val cow = pasture.getCow()
  val milk = cow.getMilk()
  val cheese = makeCheese(milk)

  // Linear flow from bread to slicedBread
  // to toast.
  val slicedBread = bread.slice()
  val toast = toastBread(slicedBread)

  return Sandwich(cheese, toast)
}

To visually emphasize the grouping of related lines, you can add a blank line between each code block.

Often you can further improve readability by extracting a method, e.g., by extracting the first three lines of the second version above into a getCheese method. However, in some scenarios, extracting a method isn’t possible or helpful, e.g., if data is used a second time for logging. If you order the lines to match the data flow, you can still increase code clarity:

fun getSandwich(bread: Bread, pasture: Pasture): Sandwich {
  // Both milk and cheese are used below, so this can’t easily be extracted into
  // a method.
  val cow = pasture.getCow()
  val milk = cow.getMilk()
  reportFatContentToStreamz(cow.age, milk)
  val cheese = makeCheese(milk)

  val slicedBread = bread.slice()
  val toast = toastBread(slicedBread)

  logWarningIfAnyExpired(bread, toast, milk, cheese)
  return Sandwich(cheese, toast)
}

It isn’t always possible to group variables perfectly if you have more complicated data flows, but even incremental changes in this direction improve the readability of your code. A good starting point is to declare your variables as close to the first use as possible.
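The method extraction mentioned earlier can be sketched as follows. This is a minimal Python rendering rather than the article's Kotlin, and everything in it is an assumption made for illustration: the string-based stand-ins for cows, milk, and bread, and the helper names get_cheese and get_toast.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sandwich:
    cheese: str
    toast: str


def get_cheese(pasture: dict) -> str:
    # Linear flow from cow to milk to cheese, kept together in one helper.
    cow = pasture["cow"]
    milk = f"milk from {cow}"
    return f"cheese made of {milk}"


def get_toast(bread: str) -> str:
    # Linear flow from bread to sliced bread to toast.
    sliced_bread = f"sliced {bread}"
    return f"toasted {sliced_bread}"


def get_sandwich(bread: str, pasture: dict) -> Sandwich:
    # Each ingredient's data flow now lives in its own function.
    return Sandwich(cheese=get_cheese(pasture), toast=get_toast(bread))
```

Once each flow has its own function, the top-level function reads as a summary of the two flows, which is the same effect that line ordering aims for when extraction isn't practical.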

Source: Google Testing Blog

Write Change-Resilient Code With Domain Objects

This is another post in our Code Health series. A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

By Amy Fu

Although a product's requirements can change often, its fundamental ideas usually change slowly. This leads to an interesting insight: if we write code that matches the fundamental ideas of the product, it will be more likely to survive future product changes.

Domain objects are building blocks (such as classes and interfaces) in our code that match the fundamental ideas of the product. Instead of writing code to match the desired behavior for the product's requirements ("configure text to be white"), we match the underlying idea ("text color settings").

For example, imagine you’re part of the gPizza team, which sells tasty, fresh pizzas to feed hungry Googlers. Due to popular demand, your team has decided to add a delivery service.

Without domain objects, the quickest path to pizza delivery is to simply create a deliverPizza method:

public class DeliveryService {
  public void deliverPizza(List<Pizza> pizzas) { ... }
}

Although this works well at first, what happens if gPizza expands its offerings to other foods?
You could add a new method:

public void deliverWithDrinks(List<Pizza> pizzas, List<Drink> drinks) { ... }

But as your list of requirements grows (snacks, sweets, etc.), you’ll be stuck adding more and more methods. How can you change your initial implementation to avoid this continued maintenance burden?

You could add a domain object that models the product's ideas, instead of its requirements:

A use case is a specific behavior that helps the product satisfy its business requirements.
(In this case, "Deliver pizzas so we make more money".)
A domain object represents a common idea that is shared by several similar use cases.

To identify the appropriate domain object, ask yourself:

What related use cases does the product support, and what do we plan to support in future?

A: gPizza wants to deliver pizzas now, and eventually other products such as drinks and snacks.

What common idea do these use cases share?

A: gPizza wants to send the customer the food they ordered.

What is a domain object we can use to represent this common idea?

A: The domain object is a food order. We can encapsulate the use cases in a FoodOrder class.

Domain objects can be a useful generalization - but avoid choosing objects that are too generic, since there is a tradeoff between improved maintainability and more complex, ambiguous code. Generally, aim to support only planned use cases - not all possible use cases (see YAGNI principles).

// GOOD: It's clear what we're delivering.
public void deliver(FoodOrder order) {}

// BAD: Don't support furniture delivery.
public void deliver(DeliveryList items) {}
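To make the FoodOrder idea more concrete, here is one possible shape for it, sketched in Python rather than Java. The FoodItem type, its fields, and the string returned by deliver are all hypothetical; the post only names FoodOrder and the deliver method.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class FoodItem:
    # Hypothetical item type; category might be "pizza", "drink", or "snack".
    name: str
    category: str


@dataclass
class FoodOrder:
    # The domain object: whatever food one customer ordered.
    customer: str
    items: List[FoodItem] = field(default_factory=list)

    def add(self, item: FoodItem) -> None:
        self.items.append(item)


class DeliveryService:
    def deliver(self, order: FoodOrder) -> str:
        # One method covers pizzas, drinks, and snacks alike.
        names = ", ".join(item.name for item in order.items)
        return f"Delivering to {order.customer}: {names}"
```

Under these assumptions, adding drinks or snacks means adding items to an order, not adding another deliverWithDrinks-style method to the service.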

Learn more about domain objects and the more advanced topic of domain-driven design in the book Domain-Driven Design by Eric Evans.

Source: Google Testing Blog

Less Is More: Principles for Simple Comments

By David Bendory

Simplicity is the ultimate sophistication. — Leonardo da Vinci

You’re staring at a wall of code resembling a Gordian knot of Klingon. What’s making it worse? A sea of code comments so long that you’d need a bathroom break just to read them all! Let’s fix that.

Adopt the mindset of someone unfamiliar with the project to ensure simplicity. One approach is to separate the process of writing your comments from reviewing them; proofreading your comments without code context in mind helps ensure they are clear and concise for future readers.

Use self-contained comments to clearly convey intent without relying on the surrounding code for context. If you need to read the code to understand the comment, you’ve got it backwards!

Not self-contained; requires reading the code:

    // Respond to flashing lights in
    // rearview mirror.

Suggested alternative:

    // Pull over for police and/or yield to
    // emergency vehicles.
    while flashing_lights_in_rearview_mirror() {
      move_to_slower_lane() || stop_on_shoulder();
    }

Include only essential information in the comments and leverage external references to reduce cognitive load on the reader. For comments suggesting improvements, links to relevant bugs or docs keep comments concise while providing a path for follow-up. Note that linked docs may be inaccessible, so use judgment in deciding how much context to include directly in the comments.

Too much potential improvement in the comment:

    // The local bus offers good average-case
    // performance. Consider using the subway
    // which may be faster depending on factors
    // like time of day, weather, etc.

Suggested alternative:

    // TODO: Consider various factors to
    // present the best transit option.
    // See issuetracker.fake/bus-vs-subway
    commute_by_local_bus();

Avoid extensive implementation details in function-level comments. When implementations change, such details often result in outdated comments. Instead, describe the public API contract, focusing on what the function does.

Too much implementation detail:

    // For high-traffic intersections
    // prone to accidents, pass through
    // the intersection and make 3 right
    // turns, which is equivalent to
    // turning left.

Suggested alternative:

    // Perform a safe left turn at a
    // high-traffic intersection.
    // See discussion in
    // dangerous-left-turns.fake/about.
    fn safe_turn_left() {
      go_straight();
      for _ in 0..3 {
        turn_right();
      }
    }

Source: Google Testing Blog

In Praise of Small Pull Requests

By Elliotte Rusty Harold

Note: A “pull request” refers to one self-contained change that has been submitted to version control or which is undergoing code review. At Google, this is referred to as a “CL”, which is short for “changelist”.

Prefer small, focused pull requests that do exactly one thing each. Why? Several reasons:

Small pull requests are easier to review. A mistake in a focused pull request is more obvious. In a 40-file pull request that does several things, would you notice that one if statement had its logic reversed and was using true instead of false? By contrast, if that if block and its test were the only things that changed in a pull request, you’d be a lot more likely to catch the error.
Small pull requests can be reviewed quickly. A reviewer can often respond quickly by slipping small reviews in between other tasks. Larger pull requests are a big task by themselves, often waiting until the reviewer has a significant chunk of time.
If something does go wrong and your continuous build breaks on a small pull request, the small size makes it much easier to figure out exactly where the mistake is. Small pull requests are also easier to roll back.
By virtue of their size, small pull requests are less likely to conflict with other developers’ work. Merge conflicts are less frequent and easier to resolve.
If you’ve made a critical error, it saves a lot of work when the reviewer can point this out after you’ve only gone a little way down the wrong path. Better to find out after an hour than after several weeks.
Pull request descriptions are more accurate when pull requests are focused on one task. The revision history becomes easier to read.
Small pull requests can lead to increased code coverage because it’s easier to make sure each individual pull request is completely tested.

Small pull requests are not always possible. In particular:

Frequent pull requests require reviewers to respond quickly to code review requests. If it takes multiple hours to get a pull request reviewed, developers spend more time blocked. Small pull requests often work better when reviewers are co-located (ideally within Nerf gun range for gentle reminders).
Some features cannot safely be committed in partial states. If this is a concern, try to put the new feature behind a flag.
Refactorings such as changing an argument type in a public method may require modifying many dozens of files at once.

Nonetheless, even if a pull request can’t be small, it can still be focused, e.g., fixing one bug, adding one feature or UI element, or refactoring one method.
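The suggestion above to put a new feature behind a flag could look roughly like this. It is a minimal Python sketch under assumed names: the FLAGS dictionary, the enable_new_checkout flag, and both checkout functions are hypothetical and not part of any real flag framework.

```python
# Hypothetical in-process flag store; real projects usually read flags
# from configuration or a flag service.
FLAGS = {"enable_new_checkout": False}


def legacy_checkout(cart):
    return f"legacy:{sum(cart)}"


def new_checkout(cart):
    # Partially built feature, committed in small pieces while the flag is off.
    return f"new:{sum(cart)}"


def checkout(cart, flags=FLAGS):
    # The flag keeps unfinished code unreachable until it is switched on.
    if flags.get("enable_new_checkout", False):
        return new_checkout(cart)
    return legacy_checkout(cart)
```

With the flag defaulting to off, each small pull request can extend new_checkout without changing what users see.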

Source: Google Testing Blog

Don’t DRY Your Code Prematurely

By Dan Maksimovich

Many of us have been told the virtues of “Don’t Repeat Yourself” or DRY. Pause and consider: Is the duplication truly redundant or will the functionality need to evolve independently over time? Applying DRY principles too rigidly leads to premature abstractions that make future changes more complex than necessary.

Consider carefully if code is truly redundant or just superficially similar. While functions or classes may look the same, they may also serve different contexts and business requirements that evolve differently over time. Think about how the functions’ purpose holds with time, not just about making the code shorter. When designing abstractions, do not prematurely couple behaviors that may evolve separately in the longer term.

When does introducing an abstraction harm our code? Let’s consider the following code:

from datetime import datetime

# Premature DRY abstraction assuming
# uniform rules, limiting entity-
# specific changes.
class DeadlineSetter:
    def __init__(self, entity_type):
        self.entity_type = entity_type

    def set_deadline(self, deadline):
        if deadline <= datetime.now():
            raise ValueError(
                "Date must be in the future")

task = DeadlineSetter("task")
task.set_deadline(
    datetime(2024, 3, 12))
payment = DeadlineSetter("payment")
payment.set_deadline(
    datetime(2024, 3, 18))

# Repetitive but allows for clear,
# entity-specific logic and future
# changes.
def set_task_deadline(task_deadline):
    if task_deadline <= datetime.now():
        raise ValueError(
            "Date must be in the future")

def set_payment_deadline(payment_deadline):
    if payment_deadline <= datetime.now():
        raise ValueError(
            "Date must be in the future")

set_task_deadline(
    datetime(2024, 3, 12))
set_payment_deadline(
    datetime(2024, 3, 18))

The second approach (the separate functions) seems to violate the DRY principle since the ValueError checks are coincidentally the same. However, tasks and payments represent distinct concepts with potentially diverging logic. If the payment date later required a new validation, you could easily add it to set_payment_deadline; adding it to the shared DeadlineSetter class is much more invasive.
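As an illustration of that divergence, suppose payments later had to fall on a weekday. That rule is invented here purely for the example, and the optional now parameter is also an assumption, added only so the sketch does not depend on the current date. With separate functions the change stays local:

```python
from datetime import datetime


def set_task_deadline(task_deadline, now=None):
    now = now or datetime.now()
    if task_deadline <= now:
        raise ValueError("Date must be in the future")
    return task_deadline


def set_payment_deadline(payment_deadline, now=None):
    now = now or datetime.now()
    if payment_deadline <= now:
        raise ValueError("Date must be in the future")
    # Hypothetical payment-only rule: weekday() is 5 or 6 on weekends.
    if payment_deadline.weekday() >= 5:
        raise ValueError("Payment deadline must fall on a weekday")
    return payment_deadline
```

The task function is untouched. Doing the same inside a shared class would likely mean branching on entity_type, which is the kind of coupling the post warns about.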

When in doubt, keep behaviors separate until enough common patterns emerge over time that justify the coupling. On a small scale, managing duplication can be simpler than resolving a premature abstraction’s complexity. In early stages of development, tolerate a little duplication and wait to abstract.

Future requirements are often unpredictable. Think about the “You Aren’t Gonna Need It” or YAGNI principle. Either the duplication will prove to be a nonissue, or with time, it will clearly indicate the need for a well-considered abstraction.

Source: Google Testing Blog


Avoid the Long Parameter List

By Gene Volovich

Have you seen code like this?

void transform(String fileIn, String fileOut, String separatorIn, String separatorOut);

This seems simple enough, but it can be difficult to remember the parameter ordering. It gets worse if you add more parameters (e.g., to specify the encoding, or to email the resulting file):

void transform(String fileIn, String fileOut, String separatorIn, String separatorOut,
    String encoding, String mailTo, String mailSubject, String mailTemplate);

To make the change, will you add another (overloaded) transform method? Or add more parameters to the existing method, and update every single call to transform? Neither seems satisfactory.

One solution is to encapsulate groups of the parameters into meaningful objects. The CsvFile class used here is a “value object” — simply a holder for the data.

class CsvFile {
  CsvFile(String filename, String separator, String encoding) { ... }
  String filename() { return filename; }
  String separator() { return separator; }
  String encoding() { return encoding; }
} // ... and do the same for the EmailMessage class

void transform(CsvFile src, CsvFile target, EmailMessage resultMsg) { ... }

How to define a value object varies by language. For example, in Java, you can use a record class, which is available in Java 16+ (for older versions of Java, you can use AutoValue to generate code for the value object); in Kotlin, you can use a data class; in C++, you can use an option struct.

Using a value object this way may still result in a long parameter list when instantiating it. Solutions for this vary by language. For example, in Python, you can use keyword arguments and default parameter values to shorten the parameter list; in Java, one option is to use the Builder pattern, which lets you call a separate function to set each field, and allows you to skip setting fields that have default values.

CsvFile src = CsvFile.builder().setFilename("a.txt").setSeparator(":").build();
CsvFile target = CsvFile.builder().setFilename("b.txt").setEncoding(UTF_8).build();
EmailMessage msg =
    EmailMessage.builder().setMailTo(rcpt).setMailTemplate("template").build();
transform(src, target, msg);
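For the Python option mentioned above (keyword arguments and default parameter values), a frozen dataclass is one way to get a value object. This is a sketch under assumptions: the default separator, encoding, and template values are invented for illustration, and transform only returns a summary string where the real function would do the work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CsvFile:
    filename: str
    separator: str = ","      # assumed default
    encoding: str = "utf-8"   # assumed default


@dataclass(frozen=True)
class EmailMessage:
    mail_to: str
    mail_subject: str = ""           # assumed default
    mail_template: str = "default"   # assumed default


def transform(src: CsvFile, target: CsvFile, result_msg: EmailMessage) -> str:
    # Placeholder body: describes the work instead of performing it.
    return f"{src.filename} -> {target.filename}, notify {result_msg.mail_to}"


src = CsvFile(filename="a.txt", separator=":")
target = CsvFile(filename="b.txt", encoding="utf-8")
msg = EmailMessage(mail_to="rcpt@example.com", mail_template="template")
summary = transform(src, target, msg)
```

Keyword arguments name each value at the call site, and defaults let callers skip fields they don't care about, much like the builder calls in the Java example.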

Always try to group data that belongs together and break up long, complicated parameter lists. The result will be code that is easier to read and maintain, and harder to make mistakes with.

Source: Google Testing Blog

isBooleanTooLongAndComplex

By Yiming Sun

You may have come across some complex, hard-to-read Boolean expressions in your codebase and wished they were easier to understand. For example, let's say we want to decide whether a pizza is fantastic:

// Decide whether this pizza is fantastic.
if ((!pepperoniService.empty() || sausages.size() > 0)
    && (useOnionFlag.get() || hasMushroom(ENOKI, PORTOBELLO)) && hasCheese()) {
  ...
}

A first step toward improving this is to extract the condition into a well-named variable:

boolean isPizzaFantastic =
    (!pepperoniService.empty() || sausages.size() > 0)
    && (useOnionFlag.get() || hasMushroom(ENOKI, PORTOBELLO)) && hasCheese();
if (isPizzaFantastic) {
  ...
}

However, the Boolean expression is still too complex. It's potentially confusing to calculate the value of isPizzaFantastic from a given set of inputs. You might need to grab a pen and paper, or start a server locally and set breakpoints.

Instead, try to group the details into intermediate Booleans that provide meaningful abstractions. Each Boolean below represents a single well-defined quality, and you no longer need to mix && and || within an expression. Without changing the business logic, you’ve made it easier to see how the Booleans relate to each other:

boolean hasGoodMeat = !pepperoniService.empty() || sausages.size() > 0;
boolean hasGoodVeggies = useOnionFlag.get() || hasMushroom(ENOKI, PORTOBELLO);
boolean isPizzaFantastic = hasGoodMeat && hasGoodVeggies && hasCheese();

Another option is to hide the logic in a separate method. This also offers the possibility of early returns using guard clauses, further reducing the need to keep track of intermediate states:

boolean isPizzaFantastic() {
  if (!hasCheese()) {
    return false;
  }
  if (pepperoniService.empty() && sausages.size() == 0) {
    return false;
  }
  return useOnionFlag.get() || hasMushroom(ENOKI, PORTOBELLO);
}
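One way to gain confidence that such a refactoring leaves the business logic unchanged is to compare the variants on every combination of inputs. The Python sketch below does that for the three forms shown above. Reducing each service call to a plain Boolean parameter is an assumption made to keep the example self-contained.

```python
from itertools import product


def original(has_pepperoni, has_sausage, use_onion, has_mushroom, has_cheese):
    return ((has_pepperoni or has_sausage)
            and (use_onion or has_mushroom) and has_cheese)


def with_intermediates(has_pepperoni, has_sausage, use_onion, has_mushroom,
                       has_cheese):
    has_good_meat = has_pepperoni or has_sausage
    has_good_veggies = use_onion or has_mushroom
    return has_good_meat and has_good_veggies and has_cheese


def with_guard_clauses(has_pepperoni, has_sausage, use_onion, has_mushroom,
                       has_cheese):
    if not has_cheese:
        return False
    if not has_pepperoni and not has_sausage:
        return False
    return use_onion or has_mushroom


def all_forms_agree():
    # Five Boolean inputs give 32 combinations; check each one.
    return all(
        original(*values)
        == with_intermediates(*values)
        == with_guard_clauses(*values)
        for values in product([False, True], repeat=5)
    )
```

With only five inputs the exhaustive check is cheap. For larger expressions, a handful of targeted unit tests is probably the more practical choice.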

Source: Google Testing Blog

What’s in a Name?

By Adam Raider

“There are only two hard things in computer science: cache invalidation and naming things.” —Phil Karlton

Have you ever read an identifier only to realize later it doesn’t do what you expected? Or had to read the implementation in order to understand an interface? These indirections eat up our cognitive bandwidth and make our work more difficult. We spend far more time reading code than we do writing it; thoughtful names can save the reader (and writer) a lot of time and frustration. Here are some naming tips:

Spend time considering names—it’s worth it. Don’t default to the first name that comes to mind. The more public the name, the more expensive it is to change. Past a certain scale, names become infeasible to change, especially for APIs. Pay attention to a name in proportion to the cost of renaming it later. If you’re feeling stuck, consider running a new name by a teammate.
Describe behavior. Encourage naming based on what functions do rather than when the functions are called. Avoid prefixes like “handle” or “on” as they describe when and provide no added meaning:

button.listen('click', handleClick)

button.listen('click', addItemToCart)

Reveal intent with a contextually appropriate level of abstraction:

High-abstraction functions describe the what and operate on high-level types.
Lower-abstraction functions describe the how and operate on lower-level types.

For example, logout might call into clearUserToken, and recordWithCamera might call into parseStreamBytes.

Prefer unique, precise names. Are you frequently asking for the UserManager? Manager, Util, and similar suffixes are a common but imprecise naming convention. What does it do? It manages! If you’re struggling to come up with a more precise name, consider splitting the class into smaller ones.
Balance clarity and conciseness—use abbreviations with care. Commonly used abbreviations, such as HTML, i18n, and RPC, can aid communication but less-known ones can confuse your average readers. Ask yourself, “Will my readers immediately understand this label? Will a reader five years from now understand it?”
Avoid repetition and filler words. Or in other words, don’t say the same thing twice. It adds unnecessary visual noise:

userData.userBirthdayDate

user.birthDate

Software changes—names should, too. If you see an identifier that doesn’t aptly describe itself—fix it!

Learn more about identifier naming in this post: IdentifierNamingPostForWorldWideWebBlog.

Source: Google Testing Blog

Let Code Speak for Itself

By Shiva Garg and Francois Aube

Comments can be invaluable for understanding and maintaining a code base. But excessive comments in code can become unhelpful clutter full of extraneous and/or outdated detail.

Comments that offer useless (or worse, obsolete) information hurt readability. Here are some tips to let your code speak for itself:

Write comments to explain the “why” behind a certain approach in code. The comment below has two good reasons to exist: documenting non-obvious behavior and answering a question that a reader is likely to have (i.e., why doesn’t this code render directly on the screen?):

// Eliminate flickering by rendering the next frame off-screen and swapping into the

// visible buffer.

RenderOffScreen();

SwapBuffers();

Use well-named identifiers to guide the reader and reduce the need for comments:

// Payout should not happen if the user is
// in an ineligible country.
std::unordered_set<std::string> ineligible =
    {"Atlantis", "Utopia"};
if (!ineligible.contains(country)) {
  Payout(user.user_id);
}

if (IsCountryEligibleForPayout(country)) {
  Payout(user.user_id);
}

Write function comments (a.k.a. API documentation) that describe intended meaning and purpose, not implementation details. Choose unambiguous function signatures that callers can use without reading any documentation. Don’t explain inner details that could change without affecting the contract with the caller:

// Reads an input string containing either a
// number of milliseconds since epoch or an
// ISO 8601 date and time. Invokes the
// Sole, Laces, and ToeCap APIs, then
// returns an object representing the Shoe
// available then or nullptr if none were.
Shoe* ModelAvailableAt(char* time);

// Returns the Shoe that was available for
// purchase at `time`. If no model was
// available, throws a runtime_error.
Shoe ModelAvailableAt(time_t time);

Omit comments that state the obvious. Superfluous comments increase code maintenance when code gets refactored and don’t add value, only overhead to keep these comments current:

// Increment counter by 1.

counter++;
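Returning to the tip on well-named identifiers, the helper behind a call such as IsCountryEligibleForPayout might be as small as the Python sketch below. The constant name, the function bodies, and the payouts list that stands in for a real payment call are assumptions for illustration.

```python
# Hypothetical data, mirroring the countries used in the example above.
INELIGIBLE_COUNTRIES = frozenset({"Atlantis", "Utopia"})


def is_country_eligible_for_payout(country: str) -> bool:
    return country not in INELIGIBLE_COUNTRIES


def pay_out_if_eligible(user_id: str, country: str, payouts: list) -> None:
    # The condition reads like the comment it replaces.
    if is_country_eligible_for_payout(country):
        payouts.append(user_id)
```

The function name carries the information the original comment held, so there is nothing left to drift out of date when the list of countries changes.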

Learn more about writing good comments: To Comment or Not to Comment?, Best practices for writing code comments
