How reviews on Google Maps work

When exploring new places, reviews on Google are a treasure trove of local knowledge that can point you to the places and businesses you’ll enjoy most — whether it’s a bakery with the best gluten-free cupcake or a nearby restaurant with live music.

With millions of reviews posted every day from people around the world, we have around-the-clock support to keep the information on Google relevant and accurate. Much of our work to prevent inappropriate content is done behind the scenes, so we wanted to shed some light on what happens after you hit “post” on a review.

How we create and enforce our policies

We’ve created strict content policies to make sure reviews are based on real-world experiences and to keep irrelevant and offensive comments off of Google Business Profiles.

As the world evolves, so do our policies and protections. This helps us guard places and businesses from violative and off-topic content when there’s potential for them to be targeted for abuse. For instance, when governments and businesses started requiring proof of COVID-19 vaccination before entering certain places, we put extra protections in place to remove Google reviews that criticize a business for its health and safety policies or for complying with a vaccine mandate.

Once a policy is written, it’s turned into training material — both for our operators and machine learning algorithms — to help our teams catch policy-violating content and ultimately keep Google reviews helpful and authentic.

Moderating reviews with the help of machine learning

As soon as someone posts a review, we send it to our moderation system to make sure the review doesn’t violate any of our policies. You can think of our moderation system as a security guard that stops unauthorized people from getting into a building — but instead, our team is stopping bad content from being posted on Google.

Given the volume of reviews we regularly receive, we’ve found that we need both the nuanced understanding that humans offer and the scale that machines provide to help us moderate contributed content. They have different strengths, so we continue to invest heavily in both.

Machines are our first line of defense because they’re good at identifying patterns. These patterns often immediately help our machines determine if the content is legitimate, and the vast majority of fake and fraudulent content is removed before anyone actually sees it.

Our machines look at reviews from multiple angles, such as:

  • The content of the review: Does it contain offensive or off-topic content?
  • The account that left the review: Does the Google account have any history of suspicious behavior?
  • The place itself: Has there been uncharacteristic activity — such as an abundance of reviews over a short period of time? Has it recently gotten attention in the news or on social media that would motivate people to leave fraudulent reviews?
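The three angles above can be sketched as independent signals that are scored separately and combined. The logic and thresholds below are invented for illustration only; they are not Google's actual moderation system.

```python
# Hypothetical multi-signal review check: a review is held for closer
# inspection if any one of the three signals crosses its threshold.
from dataclasses import dataclass

# Words treated as policy-violating in this toy example.
BLOCKLIST = {"spamword", "slur"}

@dataclass
class Review:
    text: str
    author_flags: int             # prior policy strikes on the account
    place_reviews_last_hour: int  # recent review volume at the place

def moderation_signals(review: Review) -> dict:
    """Return a per-signal verdict mirroring the three angles above."""
    return {
        "content": any(w in BLOCKLIST for w in review.text.lower().split()),
        "account": review.author_flags >= 3,
        "place": review.place_reviews_last_hour > 50,
    }

def should_hold(review: Review) -> bool:
    """Hold the review if any signal fires."""
    return any(moderation_signals(review).values())
```

A benign review from an account in good standing passes all three checks, while a review containing a blocklisted term is held regardless of the other signals.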

Training a machine on the difference between acceptable and policy-violating content is a delicate balance. For example, sometimes the word “gay” is used as a derogatory term, and that’s not something we tolerate in Google reviews. But if we teach our machine learning models that it’s only used in hate speech, we might erroneously remove reviews that promote a gay business owner or an LGBTQ+ safe space. Our human operators regularly run quality tests and complete additional training to remove bias from the machine learning models. By thoroughly training our models on all the ways certain words or phrases are used, we improve our ability to catch policy-violating content and reduce the chance of inadvertently blocking legitimate reviews from going live.

If our systems detect no policy violations, the review can post within a matter of seconds. But our job doesn’t stop once a review goes live. Our systems continue to analyze the contributed content and watch for questionable patterns. These patterns can be anything from a group of people leaving reviews on the same cluster of Business Profiles to a business or place receiving an unusually high number of 1- or 5-star reviews over a short period of time.
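Two of the post-publication patterns mentioned above, a sudden spike in review volume and an abnormally polarized rating distribution, can be expressed as simple heuristics. The thresholds here are assumptions for the sketch, not real production values.

```python
# Toy post-publication pattern watchers (invented logic, not Google's).
from collections import Counter

def unusual_velocity(recent_count: int, baseline_per_window: float,
                     factor: float = 5.0) -> bool:
    """Flag a place whose recent review volume far exceeds its baseline."""
    return recent_count > factor * max(baseline_per_window, 1.0)

def polarized(ratings: list[int], threshold: float = 0.9) -> bool:
    """Flag a recent sample dominated by 1- and 5-star ratings."""
    if not ratings:
        return False
    counts = Counter(ratings)
    extreme = counts[1] + counts[5]
    return extreme / len(ratings) >= threshold
```

In practice a flag like this would not remove anything by itself; it would route the place's recent reviews to further automated and human review.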

Keeping reviews authentic and reliable

Like any platform that welcomes contributions from users, we also have to stay vigilant in our efforts to prevent fraud and abuse from appearing on Maps. Part of that is making it easy for people using Google Maps to flag any policy-violating reviews. If you think you see a policy-violating review on Google, we encourage you to report it to our team. Businesses can report reviews on their profiles here, and consumers can report them here.

Google Maps users and businesses can easily report reviews that they feel violate one of our policies.

Our team of human operators works around the clock to review flagged content. When we find reviews that violate our policies, we remove them from Google and, in some cases, suspend the user account or even pursue litigation.

In addition to reviewing flagged content, our team proactively works to identify potential abuse risks, which reduces the likelihood of successful abuse attacks. For instance, when there’s an upcoming event with a significant following — such as an election — we implement elevated protections to the places associated with the event and other nearby businesses that people might look for on Maps. We continue to monitor these places and businesses until the risk of abuse has subsided to support our mission of only publishing authentic and reliable reviews. Our investment in analyzing and understanding how contributed content can be abused has been critical in keeping us one step ahead of bad actors.

With more than 1 billion people turning to Google Maps every month to navigate and explore, we want to make sure the information they see — especially reviews — is reliable for everyone. Our work is never done; we’re constantly improving our system and working hard to keep abuse, including fake reviews, off of the map.

Giving $2 billion to nonprofits since 2017

Five years ago, Google.org committed to contributing $1 billion to organizations around the world that are working to create opportunities for everyone. Today, thanks in part to the generosity of our employees, we’ve doubled that goal. Since 2017, we’ve provided more than $2 billion in cash grants and employee contributions to nonprofits, and Google employees have collectively volunteered the equivalent of 160 years’ worth of time with organizations they’re passionate about.

When we started Google.org in 2004, we wanted to use the best of Google to help solve some of humanity’s biggest challenges. Today, our commitment to nonprofits goes beyond cash grants and volunteering to include access to our products and technical expertise.

In addition to grants and employee contributions, we’ve donated over $7 billion in Ad Grants since 2017. These donated ads help organizations connect with potential donors, recruit volunteers and inform people of their services — and do so at the moment when their services are needed most.

We know that magic happens when we pair funding with employee tech expertise through Google.org Fellowships, a pro bono program that matches Google employees with nonprofits and civic entities to work full time for up to six months on technical projects. Last year alone, Fellows worked on projects that included raising awareness about gaps in health equity, making it easier for people in Detroit to find affordable housing, using AI to stop pests devastating crops that feed communities across India, and more.

As we mark these milestones of nonprofit giving, we also want to look to the future and focus on where tech-driven philanthropy can have the most impact.

We believe in the importance of taking big bets on new ideas that can pay off in the long run. Take our grantee GiveDirectly for example. Back in 2012, they shared with us a new idea about giving cash directly to people in need, and we jumped on board to provide some of their first seed funding. Today, there are more than 300 studies on the effectiveness of direct cash transfers, and GiveDirectly has distributed more than $500 million, making them one of the fastest-growing nonprofits of the decade.

We also know that when solving global issues — whether it’s supporting vaccine roll-out or creating economic opportunity — equity and inclusion are critical. We’ll continue to advocate for the role technology can play in driving equitable outcomes in everything from health to racial justice.

Our grantees around the world inspire us every day, and we’re excited to continue this journey towards a better world, together.

Google Cloud is just the ticket for JustPark

JustPark are the nice guys of the parking world. For the uninitiated, the company exists to make parking more affordable, more convenient, and more sustainable. It started with one simple idea: to create societal change by tapping into the potential of unused spaces. And since 2006, the marketplace has allowed homeowners to get value from their empty garages and spaces while connecting drivers to otherwise underutilised parking spots all around the UK and US.

Today, JustPark connects a thriving community of some 45,000 space owners to over 5.5 million UK drivers, and 8 million worldwide, and manages parking spaces for some of the UK’s biggest Local Authorities and car parking companies. In the last year alone, they have partnered with London’s largest private transport provider, equipping it with access to off-street parking points and mobile payments technologies. In other words, it’s growing, and growing fast.

But growth means increased demand. To meet this demand, JustPark needed a technology partner that would enable it to scale up sustainably and improve its existing services while taking over management of its software infrastructure to save on time and admin costs — and all without compromising on reliability or quality of service. It found the right partner in Google Cloud.

From a tight spot to the right one

With a small and agile software team of 40, JustPark needed a managed offering to take care of its software infrastructure. The company started off using Google Kubernetes Engine (GKE), a managed, production-ready environment for running containerised applications, in a way that was scalable and extensible.

More recently, it’s adopted Google BigQuery and Looker to provide scalable analysis of its data, helping to uncover the insights it needs to hone its business model and improve its services. “This was the first time we started gaining real business insight,” explains Jack Wall, Head of Engineering at JustPark. “Using these tools, we were able to use our own data to guide us to the most commercially viable decisions, especially regarding supply and demand.”

“Our adoption of GKE meant that the transition to other Google products was a no-brainer. Since we started using GKE, everything became so much easier — we could build on our services, improve the customer experience and crucially, we realised that we could leave our IT and cloud infrastructure in the hands of the experts at Google. That meant we could concentrate on continuing to perfect our business model.”

Demand on the platform is growing fast, and JustPark needs a trustworthy partner to help it move quickly. “We’ve doubled our customer base in the last two years alone and anticipate this demand growing by a further 33% by summer 2022. Uptime is crucially important — we deal with immediate demands that require immediate connections. Any lags can be extremely annoying for customers, so it is vital that we have an architecture that is resilient and supportive. Moving forward, we’re keen to work with Google Cloud to improve the observability, reliability and resilience of our technical offering to deliver the best customer experience possible.”

A sustainable road to the future

With electric vehicle fleet offerings in growing demand, the company plans to use Looker to enable data-driven decisions that will help the UK continue to electrify its fleet. The ongoing partnership with Google, and full integration with Google Maps and G Suite, enables customers to enjoy all of the benefits of accurate traffic and location information, as well as effective business administration.

“Google Cloud’s networking model is a breath of fresh air compared to the platform we used before,” concludes Jack. “And with the architecture we have in place now, we’re confident that we’re ready to handle the volume of customers we’re anticipating in the next year.”

Stable Channel Update for Desktop

The Chrome team is delighted to announce the promotion of Chrome 98 to the stable channel for Windows, Mac and Linux. Chrome 98 is also promoted to our new extended stable channel for Windows and Mac. This will roll out over the coming days/weeks.

Chrome 98.0.4758.80/81/82 for Windows and 98.0.4758.80 for Mac and Linux contains a number of fixes and improvements; a list of changes is available in the log. Watch out for upcoming Chrome and Chromium blog posts about new features and big efforts delivered in 98.

Security Fixes and Rewards

Note: Access to bug details and links may be kept restricted until a majority of users are updated with a fix. We will also retain restrictions if the bug exists in a third party library that other projects similarly depend on, but haven’t yet fixed.


This update includes 27 security fixes. Below, we highlight fixes that were contributed by external researchers. Please see the Chrome Security Page for more information.


[$20000][1284584] High CVE-2022-0452: Use after free in Safe Browsing. Reported by avaue at S.S.L. on 2022-01-05

[$20000][1284916] High CVE-2022-0453: Use after free in Reader Mode. Reported by Rong Jian of VRI on 2022-01-06

[$12000][1287962] High CVE-2022-0454: Heap buffer overflow in ANGLE. Reported by Seong-Hwan Park (SeHwa) of SecunologyLab on 2022-01-17

[$7500][1270593] High CVE-2022-0455: Inappropriate implementation in Full Screen Mode. Reported by Irvan Kurniawan (sourc7) on 2021-11-16

[$7000][1289523] High CVE-2022-0456: Use after free in Web Search. Reported by Zhihua Yao of KunLun Lab on 2022-01-21

[$5000][1274445] High CVE-2022-0457: Type Confusion in V8. Reported by rax of the Group0x58 on 2021-11-29

[$1000][1267060] High CVE-2022-0458: Use after free in Thumbnail Tab Strip. Reported by Anonymous on 2021-11-05

[$TBD][1244205] High CVE-2022-0459: Use after free in Screen Capture. Reported by raven (@raid_akame) on 2021-08-28

[$7500][1250227] Medium CVE-2022-0460: Use after free in Window Dialog. Reported by 0x74960 on 2021-09-16

[$3000][1256823] Medium CVE-2022-0461: Policy bypass in COOP. Reported by NDevTK on 2021-10-05

[$2000][1270470] Medium CVE-2022-0462: Inappropriate implementation in Scroll. Reported by Youssef Sammouda on 2021-11-16

[$1000][1268240] Medium CVE-2022-0463: Use after free in Accessibility. Reported by Zhihua Yao of KunLun Lab on 2021-11-09

[$1000][1270095] Medium CVE-2022-0464: Use after free in Accessibility. Reported by Zhihua Yao of KunLun Lab on 2021-11-14

[$1000][1281941] Medium CVE-2022-0465: Use after free in Extensions. Reported by Samet Bekmezci @sametbekmezci on 2021-12-22

[$TBD][1115460] Medium CVE-2022-0466: Inappropriate implementation in Extensions Platform. Reported by David Erceg on 2020-08-12

[$TBD][1239496] Medium CVE-2022-0467: Inappropriate implementation in Pointer Lock. Reported by Alesandro Ortiz on 2021-08-13

[$TBD][1252716] Medium CVE-2022-0468: Use after free in Payments. Reported by Krace on 2021-09-24

[$TBD][1279531] Medium CVE-2022-0469: Use after free in Cast. Reported by Thomas Orlita on 2021-12-14

[$TBD][1269225] Low CVE-2022-0470: Out of bounds memory access in V8. Reported by Looben Yang on 2021-11-11


We would also like to thank all security researchers that worked with us during the development cycle to prevent security bugs from ever reaching the stable channel.

As usual, our ongoing internal security work was responsible for a wide range of fixes:

  • [1293087] Various fixes from internal audits, fuzzing and other initiatives


Many of our security bugs are detected using AddressSanitizer, MemorySanitizer, UndefinedBehaviorSanitizer, Control Flow Integrity, libFuzzer, or AFL.

Interested in switching release channels? Find out how here. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.

Srinivas Sista
Google Chrome

Beta Channel Update for Chrome OS

The Beta channel is being updated to 98.0.4758.79 (Platform version: 14388.44.0) for most Chrome OS devices.

If you find new issues, please let us know by visiting our forum or filing a bug. Interested in switching channels? Find out how. You can submit feedback using ‘Report an issue...’ in the Chrome menu (3 vertical dots in the upper right corner of the browser).

Matt Nelson,

Google Chrome OS

Chrome for Android Update

Hi, everyone! We've just released Chrome 98 (98.0.4758.87) for Android: it'll become available on Google Play over the next few days.

This release includes stability and performance improvements. You can see a full list of the changes in the Git log. If you find a new issue, please let us know by filing a bug.

Krishna Govind
Google Chrome

Chrome for iOS Update

Hi, everyone! We've just released Chrome 98 (98.0.4758.85) for iOS: it'll become available on the App Store in the next few hours.

This release includes stability and performance improvements. You can see a full list of the changes in the Git log. If you find a new issue, please let us know by filing a bug.

Harry Souders

Google Chrome

A Tale of Two Features

By George Pirocanac


I have often been asked, “What is the most memorable bug that you have encountered in your testing career?” For me, it is hands down a bug that happened quite a few years ago. I was leading an Engineering Productivity team that supported Google App Engine. At that time, App Engine was still in its early stages, and there were many challenges associated with testing rapidly evolving features. Our testing frameworks and processes were also evolving, so it was an exciting time to be on the team.

What makes this bug so memorable is that I spent so much time developing a comprehensive suite of test scenarios, yet a failure occurred during such an obvious use case that it left me shaking my head and wondering how I had missed it. Even with many years of testing experience it can be very humbling to construct scenarios that adequately mirror what will happen in the field.

I’ll try to provide enough background for the reader to play along and see if they can determine the anomalous scenario. As a further hint, the problem resulted from the interaction of two App Engine features, so I like calling this story A Tale of Two Features.

Feature 1 - Datastore Admin (backup, restore, delete)


Google App Engine was released 13 years ago as Google’s first Cloud product. It allowed users to build and deploy highly scalable web applications in the Cloud. To support this, it had its own scalable database called the Datastore. An administration console allowed users to manage the application and its Datastore through a web interface. Users wrote applications that consisted of request handlers that App Engine invoked according to the URL that was specified. The handlers could call App Engine services like Datastore through a remote procedure call (RPC) mechanism. Figure 1 illustrates this flow.
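The URL-to-handler model described above can be sketched in a few lines. The route, handler name, and RPC shape below are invented for illustration; real App Engine applications used frameworks such as webapp with their own APIs.

```python
# Hypothetical sketch of App Engine's request flow: a URL is mapped to a
# handler class, and the handler talks to backend services (such as the
# Datastore) through an RPC layer.

def datastore_rpc(method: str, payload: dict) -> dict:
    """Stand-in for the RPC mechanism to the Datastore service."""
    return {"method": method, "payload": payload, "status": "ok"}

class GuestbookHandler:
    def get(self) -> dict:
        # A handler calls services via RPC rather than touching storage
        # directly.
        return datastore_rpc("Query", {"kind": "Greeting"})

# URL routing table: request path -> handler class.
ROUTES = {"/guestbook": GuestbookHandler}

def dispatch(url: str) -> dict:
    """Invoke the handler registered for the URL, as in Figure 1."""
    handler_cls = ROUTES.get(url)
    if handler_cls is None:
        return {"status": "404"}
    return handler_cls().get()
```

The point of the sketch is that everything, including administrative traffic, enters through this same dispatch path, which matters for the story that follows.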



The first feature in this Tale of Two Features resided in the administration console, providing the ability to back up, restore, and delete selected or all of an application’s entities in the Datastore. It was implemented in a clever way that incorporated it directly into the application, rather than as an independent utility. As part of the application it could freely operate on the Datastore and incur the same billing charges as other datastore operations within the application. When the feature was invoked, traffic would be sent to its handler and the application would process it. Figure 2 illustrates this request flow.




By the time this memorable bug occurred, this Datastore administration feature was mature, well tested, and stable. No changes were being made to it.


Feature 2 - Utilities for Migrating to the HR Datastore


The second feature (or more accurately, set of features) came at least a year after the first feature was released and stable. It helped users migrate their applications to a new High Replication (HR) Datastore. The HR Datastore was more reliable than its predecessor, but using it meant creating a new application and copying over all the data from the old Datastore. To support such migrations, App Engine developers added two new features to the administration console. The first copied all the data from the Datastore of one application to another, and the second redirected all traffic from one application to another. The latter was particularly useful because it meant the new application would seamlessly receive the traffic after a migration. This set of features was written by another team, and we in Engineering Productivity supported them by creating processes for testing various Datastore migrations. The migration-support features were thoroughly tested and released to developers. Figure 3 illustrates the request flow of the redirection feature.



What Could Possibly Go Wrong?


So this was the situation when we released these utilities for migrating to the new Datastore. We were very confident that they worked, as we had tested migrations of many different types and sizes of Datastore entities. We had also tested that a migration could be interrupted without data loss. All checked out fine. I was confident that this new feature would work, yet soon after we released it, we started getting problem reports.

If you have been playing along, now is the time to ask yourself, “What could possibly go wrong?” As an added hint, the problem reports claimed that all the data in the newly migrated application was disappearing.


What Did Go Wrong

As mentioned above, developers began to report that data was disappearing from their newly migrated applications. It wasn’t at all common, yet of course it is most disconcerting when data “just disappears.” We had to investigate how this could occur. Our standard processes ensured that we had internal backups of the data, which were promptly restored. In parallel we tried to reproduce the problem, but we couldn’t—at least until we figured out what was happening. As I mentioned earlier, once we understood it, it was quite obvious, but that only made it all the more painful that we missed it.

What was happening was that, after migrating and automatically redirecting traffic to the new application, a number of customers thought they still needed to delete the data from their old application, so they used the first Datastore admin feature to do that. As expected, the feature sent traffic to that application to delete the entities from the Datastore. But that traffic was now being automatically redirected to the new application, and voila—all the data that had been copied earlier was now deleted there. Since only a handful of developers tried to delete the data from their old applications, this explained why the problem occurred only rarely. Figure 4 illustrates this request flow.
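The interaction of the two features can be simulated in a few lines. The class and request names are invented; the point is only that redirection (Feature 2) runs before any handler, so a delete-all request aimed at the old application lands on the new one.

```python
# Minimal simulation of the two-feature interaction described above.

class App:
    def __init__(self, name: str):
        self.name = name
        self.datastore: dict = {}
        self.redirect_to: "App | None" = None  # set after migration

    def handle(self, request: str) -> None:
        # Feature 2: traffic redirection applies before any handler runs.
        if self.redirect_to is not None:
            return self.redirect_to.handle(request)
        # Feature 1: Datastore admin runs inside the application itself.
        if request == "delete_all":
            self.datastore.clear()

old_app, new_app = App("old"), App("new")
old_app.datastore = {"k1": "v1", "k2": "v2"}

# Migration: copy the data, then redirect all old-app traffic.
new_app.datastore = dict(old_app.datastore)
old_app.redirect_to = new_app

# The "obvious" cleanup step: delete entities from the *old* app...
old_app.handle("delete_all")
# ...but the request was redirected, so it wiped the *new* app's data,
# while the old app's copy was never touched.
```

Running this leaves the new application's Datastore empty and the old one's intact, which is exactly the failure the problem reports described.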



Obvious, isn’t it, once you know what is happening?


Lessons Learned


This all occurred years ago, and App Engine is based on a far different and more robust framework today. Datastore migrations are but a memory from the past, yet this experience made a great impression on me.

The most important thing I learned from this experience is that, while it is important to test features for their functionality, it’s also important to think of them as part of workflows. In performing our testing we exercised a very limited number of steps in the migration process workflow and omitted a very reasonable step at the end: trying to delete the data from the old application. Our focus was in testing the variability of contents in the Datastore rather than different steps in the migration process. It was this focus that kept our eyes away from the relatively obvious failure case.

Another thing I learned was that this bug might have been caught if the developer of the first feature had been in the design review for the second set of migration features (particularly the feature that automatically redirects traffic). Unfortunately, that person had already joined a new team. A key step in reducing bugs can occur at the design stage if “what-if” questions are being asked.

Finally, I was enormously impressed that we were able to recover so quickly. Protecting against data loss is one of the most important aspects of Cloud management, and being able to recover from mistakes is at least as important as trying to prevent them. I have the utmost respect for my coworkers in Site Reliability Engineering (SRE).

