Tag Archives: Code Health

Use Abstraction to Improve Function Readability

This is another post in our Code Health series. A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.


By Palak Bansal and Mark Manley


Which version of the createPizza function below is easier to understand?

func createPizza(order *Order) *Pizza {  

  pizza := &Pizza{Base: order.Size,

                  Sauce: order.Sauce,

                  Cheese: “Mozzarella”}


  if order.kind == “Veg” {

    pizza.Toppings = vegToppings

  } else if order.kind == “Meat” {

    pizza.Toppings = meatToppings

  }


  oven := oven.New()

  if oven.Temp != cookingTemp { 

    for (oven.Temp < cookingTemp) {

      time.Sleep(checkOvenInterval)

      oven.Temp = getOvenTemp(oven)

    }

  }


  if !pizza.Baked {

    oven.Insert(pizza)

    time.Sleep(cookTime)

    oven.Remove(pizza)

    pizza.Baked = true

  }


  box := box.New()

  pizza.Boxed = box.PutIn(pizza)

  pizza.Sliced = box.SlicePizza(order.Size)

  pizza.Ready = box.Close()

  return pizza  

}

func createPizza(order *Order) *Pizza {

  pizza := prepare(order)

  bake(pizza)

  box(pizza)

  return pizza

}


func prepare(order *Order) *Pizza {

  pizza := &Pizza{Base: order.Size,

                  Sauce: order.Sauce,

                  Cheese: “Mozzarella”}

  addToppings(pizza, order.kind)

  return pizza

}


func addToppings(pizza *Pizza, kind string) {

  if kind == “Veg” {

    pizza.Toppings = vegToppings

  } else if kind == “Meat” {

    pizza.Toppings = meatToppings

  }

}


func bake(pizza *Pizza) {

  oven := oven.New()

  heatOven(oven) 

  bakePizza(pizza, oven)

}


func heatOven(oven *Oven) { … }

func bakePizza(pizza *Pizza, oven *Oven) { … }

func box(pizza *Pizza) { … }

You probably said the right-hand side is easier, but why? The left side mixes together several levels of abstraction: low-level implementation details (e.g., how to heat the oven), intermediate-level functions (e.g., how to bake pizza), and high-level abstractions (e.g., preparing, baking, and boxing the pizza).

The right side is easier to follow because the functions have a consistent level of abstraction, providing a top-down narrative of the code’s logic. createPizza is a high-level function that delegates the preparing, baking, and boxing steps to lower-level specialized functions with intuitive names. Those functions, in turn, delegate to their own lower-level specialized functions (e.g., heatOven) until they reach a function that handles implementation details without needing to call other functions. 

Avoid mixing different abstraction layers into a single function. Nest functions at equal abstraction levels to provide a narrative. This self-documenting style is simpler to follow, debug, and reuse.

You can learn more about this topic in the book Clean Code by Robert C. Martin. 



Code Health: Now You’re Thinking With Functions

This is another post in our Code Health series. A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

by Cathal Weakliam

Loops are the standard way to process collections like arrays and lists. However, some loops implement the same patterns repeatedly, leading to duplicate code. Higher-order functions—functions that use other functions as inputs or outputs—can reduce duplication by providing a simpler way to express common operations with collections.

Consider these two loops in JavaScript that decide if every object in an array meets a condition:

let everyRequestValid = true;
for (const request of requests) {
if (!isValid(request)) {
everyRequestValid = false;
break;
}
}

if (everyRequestValid) {
// do something
}
let everyUserEligible = true;
for (const user of users) {
if (!isEligible(user)) {
everyUserEligible = false;
break;
}
}

if (everyUserEligible) {
// do something
}


The high similarity between the two loops violates the Don’t Repeat Yourself principle and creates an unnecessary burden on readers and maintainers of the code.

To reduce the maintenance burden, use the every method to replace each loop with a single expression.  (In other languages every may have a different name, like all or allMatch).

if (requests.every(isValid)) {
// do something
}
if (users.every(isEligible)) {
// do something
}


Processing collections with higher-order functions has several benefits:
  • It significantly reduces duplication by abstracting away the common looping code.
  • The resulting code is much shorter and simpler, with less opportunity for bugs.
  • A reader can quickly see the intent of the code as it is not obscured behind low-level control flow.
Two other common higher-order functions are map (apply a function to each element of a collection) and filter (select the elements of a collection that pass a predicate). While the exact syntax varies by language, here’s how they look in JavaScript (using an anonymous function as the argument):

// Double each value in `ints`
const doubled = ints.map(n => n * 2);
// Get only the even values in `ints`
const evens = ints.filter(n => n % 2 === 0);


Just don’t overdo it! Don’t rewrite a loop with functions if it makes it harder to understand, or if it would be considered unidiomatic in your language (for example, in Python, list comprehensions are equivalent to map and filter but are usually preferred).

Testing on the Toilet: Separation of Concerns? That’s a Wrap!

This article was adapted from a Google Testing on the Toilet (TotT) episode. You can download a printer-friendly version of this TotT episode and post it in your office.


By Stefan Kennedy


The following function decodes a byte array as an image using an API named SpeedyImg. What maintenance problems might arise due to referencing an API owned by a different team?

SpeedyImgImage decodeImage(List<SpeedyImgDecoder> decoders, byte[] data) {
SpeedyImgOptions options = getDefaultConvertOptions();
for (SpeedyImgDecoder decoder : decoders) {
SpeedyImgResult decodeResult = decoder.decode(decoder.formatBytes(data));
SpeedyImgImage image = decodeResult.getImage(options);
if (validateGoodImage(image)) { return image; }
}
throw new RuntimeException();
}



Details about how to call the API are mixed with domain logic, which can make the code harder to understand. For example, the call to decoder.formatBytes() is required by the API, but how the bytes are formatted isn’t relevant to the domain logic.


Additionally, if this API is used in many places across a codebase, then all usages may need to change if the way the API is used changes. For example, if the return type of this function is changed to the more generic SpeedyImgResult type, usages of SpeedyImgImage would need to be updated.


To avoid these maintenance problems, create wrapper types to hide API details behind an abstraction:

Image decodeImage(List<ImageDecoder> decoders, byte[] data) {
for (ImageDecoder decoder : decoders) {
Image decodedImage = decoder.decode(data);
if (validateGoodImage(decodedImage)) { return decodedImage; }
}
throw new RuntimeException();
}


Wrapping an external API follows the Separation of Concerns principle, since the logic for how the API is called is separated from the domain logic. This has many benefits, including:
  • If the way the API is used changes, encapsulating the API in a wrapper insulates how far those changes can propagate across your codebase.
  • You can modify the interface or the implementation of types you own, but you can’t for API types.
  • It is easier to switch or add another API, since they can still be represented by the introduced types (e.g. ImageDecoder/Image).
  • Readability can improve as you don’t need to sift through API code to understand core logic.

Not all external APIs need to be wrapped. For example, if an API would take a huge effort to separate or is simple enough that it doesn't pollute the codebase, it may be better not to introduce wrapper types (e.g. library types like List in Java or std::vector in C++). When in doubt, keep in mind that a wrapper should only be added if it will clearly improve the code (see the YAGNI principle).


“Separation of Concerns” in the context of external APIs is also described by Martin Fowler in his blog post, Refactoring code that accesses external services


Testing on the Toilet: Separation of Concerns? That’s a Wrap!

This article was adapted from a Google Testing on the Toilet (TotT) episode. You can download a printer-friendly version of this TotT episode and post it in your office.


By Stefan Kennedy


The following function decodes a byte array as an image using an API named SpeedyImg. What maintenance problems might arise due to referencing an API owned by a different team?

SpeedyImgImage decodeImage(List<SpeedyImgDecoder> decoders, byte[] data) {
SpeedyImgOptions options = getDefaultConvertOptions();
for (SpeedyImgDecoder decoder : decoders) {
SpeedyImgResult decodeResult = decoder.decode(decoder.formatBytes(data));
SpeedyImgImage image = decodeResult.getImage(options);
if (validateGoodImage(image)) { return image; }
}
throw new RuntimeException();
}



Details about how to call the API are mixed with domain logic, which can make the code harder to understand. For example, the call to decoder.formatBytes() is required by the API, but how the bytes are formatted isn’t relevant to the domain logic.


Additionally, if this API is used in many places across a codebase, then all usages may need to change if the way the API is used changes. For example, if the return type of this function is changed to the more generic SpeedyImgResult type, usages of SpeedyImgImage would need to be updated.


To avoid these maintenance problems, create wrapper types to hide API details behind an abstraction:

Image decodeImage(List<ImageDecoder> decoders, byte[] data) {
for (ImageDecoder decoder : decoders) {
Image decodedImage = decoder.decode(data);
if (validateGoodImage(decodedImage)) { return decodedImage; }
}
throw new RuntimeException();
}


Wrapping an external API follows the Separation of Concerns principle, since the logic for how the API is called is separated from the domain logic. This has many benefits, including:
  • If the way the API is used changes, encapsulating the API in a wrapper insulates how far those changes can propagate across your codebase.
  • You can modify the interface or the implementation of types you own, but you can’t for API types.
  • It is easier to switch or add another API, since they can still be represented by the introduced types (e.g. ImageDecoder/Image).
  • Readability can improve as you don’t need to sift through API code to understand core logic.

Not all external APIs need to be wrapped. For example, if an API would take a huge effort to separate or is simple enough that it doesn't pollute the codebase, it may be better not to introduce wrapper types (e.g. library types like List in Java or std::vector in C++). When in doubt, keep in mind that a wrapper should only be added if it will clearly improve the code (see the YAGNI principle).


“Separation of Concerns” in the context of external APIs is also described by Martin Fowler in his blog post, Refactoring code that accesses external services


Testing on the Toilet: Separation of Concerns? That’s a Wrap!

This article was adapted from a Google Testing on the Toilet (TotT) episode. You can download a printer-friendly version of this TotT episode and post it in your office.


By Stefan Kennedy


The following function decodes a byte array as an image using an API named SpeedyImg. What maintenance problems might arise due to referencing an API owned by a different team?

SpeedyImgImage decodeImage(List<SpeedyImgDecoder> decoders, byte[] data) {
SpeedyImgOptions options = getDefaultConvertOptions();
for (SpeedyImgDecoder decoder : decoders) {
SpeedyImgResult decodeResult = decoder.decode(decoder.formatBytes(data));
SpeedyImgImage image = decodeResult.getImage(options);
if (validateGoodImage(image)) { return image; }
}
throw new RuntimeException();
}



Details about how to call the API are mixed with domain logic, which can make the code harder to understand. For example, the call to decoder.formatBytes() is required by the API, but how the bytes are formatted isn’t relevant to the domain logic.


Additionally, if this API is used in many places across a codebase, then all usages may need to change if the way the API is used changes. For example, if the return type of this function is changed to the more generic SpeedyImgResult type, usages of SpeedyImgImage would need to be updated.


To avoid these maintenance problems, create wrapper types to hide API details behind an abstraction:

Image decodeImage(List<ImageDecoder> decoders, byte[] data) {
for (ImageDecoder decoder : decoders) {
Image decodedImage = decoder.decode(data);
if (validateGoodImage(decodedImage)) { return decodedImage; }
}
throw new RuntimeException();
}


Wrapping an external API follows the Separation of Concerns principle, since the logic for how the API is called is separated from the domain logic. This has many benefits, including:
  • If the way the API is used changes, encapsulating the API in a wrapper insulates how far those changes can propagate across your codebase.
  • You can modify the interface or the implementation of types you own, but you can’t for API types.
  • It is easier to switch or add another API, since they can still be represented by the introduced types (e.g. ImageDecoder/Image).
  • Readability can improve as you don’t need to sift through API code to understand core logic.

Not all external APIs need to be wrapped. For example, if an API would take a huge effort to separate or is simple enough that it doesn't pollute the codebase, it may be better not to introduce wrapper types (e.g. library types like List in Java or std::vector in C++). When in doubt, keep in mind that a wrapper should only be added if it will clearly improve the code (see the YAGNI principle).


“Separation of Concerns” in the context of external APIs is also described by Martin Fowler in his blog post, Refactoring code that accesses external services


Code Health: Respectful Reviews == Useful Reviews

This is another post in our Code Health series. A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

By Liz Kammer (Google), Maggie Hodges (UX research consultant), and Ambar Murillo (Google)

While code review is recognized as a valuable tool for improving the quality of software projects, code review comments that are perceived as being unclear or harsh can have unfavorable consequences: slow reviews, blocked dependent code reviews, negative emotions, or negative perceptions of other contributors or colleagues.

Consider these tips to resolve code review comments respectfully.

As a Reviewer or Author:
  • DO: Assume competence. An author’s implementation or a reviewer’s recommendation may be due to the other party having different context than you. Start by asking questions to gain understanding.
  • DO: Provide rationale or context, such as a best practices document, a style guide, or a design document. This can help others understand your decision or provide mentorship.
  • DO: Consider how comments may be interpreted. Be mindful of the differing ways hyperbole, jokes, and emojis may be perceived.
    Author Don’t:
    I prefer short names so I’d rather
    not change this. Unless you make
    me? :)
    Author Do:
    Best practice suggests omitting
    obvious/generic terms. I’m not
    sure how to reconcile that
    advice with this request.
  • DON’T: Criticize the person. Instead, discuss the code. Even the perception that a comment is about a person (e.g., due to using “you” or “your”) distracts from the goal of improving the code.
    Reviewer Don’t:
    Why are you using this approach?
    You’re adding unnecessary
    complexity.
    Reviewer Do:
    This concurrency model appears to
    be adding complexity to the
    system without any visible
    performance benefit.
  • DON’T: Use harsh language. Code review comments with a negative tone are less likely to be useful. For example, prior research found very negative comments were considered useful by authors 57% of the time, while more-neutral comments were useful 79% of the time.  

As a Reviewer:
  • DO: Provide specific and actionable feedback. If you don’t have specific advice, sometimes it’s helpful to ask for clarification on why the author made a decision.
    Reviewer Don’t:
    I don’t understand this.
    Reviewer Do:
    If this is an optimization, can you
    please add comments?
  • DO: Clearly mark nitpicks and optional comments by using prefixes such as ‘Nit’ or ‘Optional’. This allows the author to better gauge the reviewer’s expectations.

As an Author:
  • DO: Clarify code or reply to the reviewer’s comment in response to feedback. Failing to do so can signal a lack of receptiveness to implementing improvements to the code.
    Author Don’t:
    That makes sense in some cases but
    not here.
    Author Do:
    I added a comment about why
    it’s implemented that way.
  • DO: When disagreeing with feedback, explain the advantage of your approach. In cases where you can’t reach consensus, follow Google’s guidance for resolving conflicts in code review.

Code Health: Make Interfaces Hard to Misuse

This is another post in our Code Health series. A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

By Marek Kiszkis

We all try to avoid errors in our code. But what about errors created by callers of your code? A good interface design can make it easy for callers to do the right thing, and hard for callers to do the wrong thing. Don't push the responsibility of maintaining invariants required by your class on to its callers.
Can you see the issues that can arise with this code?
class Vector {
explicit Vector(int num_slots); // Creates an empty vector with `num_slots` slots.
int RemainingSlots() const; // Returns the number of currently remaining slots.
void AddSlots(int num_slots); // Adds `num_slots` more slots to the vector.
// Adds a new element at the end of the vector. Caller must ensure that RemainingSlots()
// returns at least 1 before calling this, otherwise caller should call AddSlots().
void Insert(int value);
}

If the caller forgets to call AddSlots(), undefined behavior might be triggered when Insert() is called. The interface pushes complexity onto the caller, exposing the caller to implementation details.

Since maintaining the slots is not relevant to the caller-visible behaviors of the class, don't expose them in the interface; make it impossible to trigger undefined behavior by adding slots as needed in Insert().
@Test public void class Vector {
explicit Vector(int num_slots);
// Adds a new element at the end of the vector. If necessary,
// allocates new slots to ensure that there is enough storage
// for the new value.
void Insert(int value);
}


Contracts enforced by the compiler are usually better than contracts enforced by runtime checks, or worse, documentation-only contracts that rely on callers to do the right thing.
Here are other examples that could signal that an interface is easy to misuse:
  • Requiring callers to call an initialization function (alternative: expose factory methods that return your object fully initialized).
  • Requiring callers to perform custom cleanup (alternative: use language-specific constructs that ensure automated cleanup when your object goes out of scope).
  • Allowing code paths that create objects without required parameters (e.g. a user without an ID).
  • Allowing parameters for which only some values are valid, especially if it is possible to use a more appropriate type (e.g. prefer Duration timeout instead of int timeout_in_millis).
It is not always practical to have a foolproof interface. In certain cases, relying on static analysis or documentation is necessary since some requirements are impossible to express in an interface (e.g. that a callback function needs to be thread-safe).

Don’t enforce what you don’t need to enforce - avoid code that is too defensive. For example, extensive validation of function parameters can increase complexity and reduce performance.

Code Health: Understanding Code In Review


This is another post in our Code Health series. A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

By Max Kanat-Alexander


It's easy to assume that a developer who sends you some code for review is smarter than you'll ever be, and that's why you don't understand their code.

But in reality, if code is hard to understand, it's probably too complex. If you're familiar with the programming language being used, reading healthy code should be almost as easy as reading a book in your native language.

Pretend a developer sends you this block of Python to be reviewed:
def IsOkay(n):
f = False
for i in range(2, n):
if n % i == 0:
f = True
return not f

Don't spend more than a few seconds trying to understand it. Simply add a code review comment saying, "It's hard for me to understand this piece of code," or be more specific, and say, "Please use more descriptive names here."

The developer then clarifies the code and sends it to you for review again:
def IsPrime(n):
for divisor in range(2, n / 2):
if n % divisor == 0:
return False

return True

Now we can read it pretty easily, which is a benefit in itself.

Often, just asking a developer to clarify a piece of code will result in fundamental improvements. In this case, the developer noticed possible performance improvements since the code was easier to read—the function now returns earlier when the number isn't prime, and the loop only goes to n/2 instead of n.

However, now that we can easily understand this code, we can see many problems with it. For example, it has strange behavior with 0 and 1, and there are other problems, too. But most importantly, it is now apparent that this entire function should be removed and be replaced with a preexisting function for detecting if a number is prime. Clarifying the code helped both the developer and reviewer.

In summary, don't waste time reviewing code that is hard to understand, just ask for it to be clarified. In fact, such review comments are one of the most useful and important tools a code reviewer has!

Code Health: Obsessed With Primitives?


This is another post in our Code Health series. A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

By Marc Eaddy

Programming languages provide basic types, such as integers, strings, and maps, which are useful in many contexts. For example, a string can be used to hold everything from a person's name to a web page URL. However, code that relies too heavily on basic types instead of custom abstractions can be hard to understand and maintain.

Primitive obsession is the overuse of basic ("primitive") types to represent higher-level concepts. For example, this code uses basic types to represent shapes:

vector<pair<int, int>> polygon = ...
pair<pair<int, int>, pair<int, int>> bounding_box = GetBoundingBox(polygon);
int area = (bounding_box.second.first - bounding_box.first.first) *
(bounding_box.second.second - bounding_box.first.seco
pair is not the right level of abstraction because its generically-named first and second fields are used to represent X and Y in one case and lower-left (er, upper-left?) and upper-right (er, lower-right?) in the other. Worse, basic types don't encapsulate domain-specific code such as computing the bounding box and area.

Replacing basic types with higher-level abstractions results in clearer and better encapsulated code:
Polygon polygon = ...
int area = polygon.GetBoundingBox().GetArea();
Here are some other examples of primitive obsession:
  • Related maps, lists, vectors, etc. that can be easily combined into a single collection by consolidating the values into a custom higher-level abstraction.
    map<UserId, string> id_to_name;
    map<UserId, int> id_to_age;
    map<UserId, Person> id_to_person;
  • A vector or map with magic indices/keys, e.g. string values at indices/keys 0, 1, and 2 hold name, address, and phone #, respectively. Instead, consolidate these values into a higher-level abstraction.
    person_data[kName] = "Foo";
    person.SetName("Foo");
  • A string that holds complex or structured text (e.g. a date). Instead, use a higher-level abstraction (e.g. Date) that provides self-documenting accessors (e.g. GetMonth) and guarantees correctness.
    string date = "01-02-03";
    Date date(Month::Feb, Day(1), Year(2003));
  • An integer or floating point number that stores a time value, e.g. seconds. Instead, use a structured timestamp or duration type.
    int timeout_secs = 5;
    Duration timeout = Seconds(5);
It's possible for any type—from a lowly int to a sophisticated red-black tree—to be too primitive for the job. If you see code that uses a lot of basic types that would be clearer or better encapsulated by using a higher-level abstraction, refactor it or politely remind the author to keep it classy!

Code Health: IdentifierNamingPostForWorldWideWebBlog


This is another post in our Code Health series. A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

By Chris Lewis and Bob Nystrom


It's easy to get carried away creating long identifiers. Longer names often make things more readable. But names that are too long can decrease readability. There are many examples of variable names longer than 60 characters on GitHub and elsewhere. In 58 characters, we managed this haiku for you to consider:

Name variables
Using these simple guidelines
Beautiful source code

Names should be two things: clear (know what it refers to) and precise (know what it does not refer to). Here are some guidelines to help:

• Omit words that are obvious given a variable's type declaration.
// Bad, the type tells us what these variables are:
String nameString; List<datetime> holidayDateList;
// Better:
String name; List<datetime> holidays;

• Omit irrelevant details.
// Overly specific names are hard to read:
Monster finalBattleMostDangerousBossMonster; Payments nonTypicalMonthlyPayments;
// Better, if there's no other monsters or payments that need disambiguation:
Monster boss; Payments payments;


• Omit words that are clear from the surrounding context.
// Bad, repeating the context:
class AnnualHolidaySale {int annualSaleRebate; boolean promoteHolidaySale() {...}}
// Better:
class AnnualHolidaySale {int rebate; boolean promote() {...}}

• Omit words that could apply to any identifier.
You know the usual suspects: data, state, amount, number, value, manager, engine, object, entity, instance, helper, util, broker, metadata, process, handle, context. Cut them out.

There are some exceptions to these rules; use your judgment. Names that are too long are still better than names that are too short. However, following these guidelines, your code will remain unambiguous and be much easier to read. Readers, including "future you,” will appreciate how clear your code is!