Author Archives: Pandu Nayak

New ways we’re helping you find high-quality information

People turn to Google every day for information in the moments that matter most. Sometimes that’s to look for the best recipe for dinner; other times it’s to check the facts about a claim they heard from a friend.

No matter what you’re searching for, we aim to connect you with high-quality information and help you understand and evaluate it. We’ve invested deeply in both information quality and information literacy on Google Search and News, and today we’re sharing a few new developments in this important work.

Our latest quality improvements to featured snippets

We design our ranking systems to surface relevant information from the most reliable sources available – sources that demonstrate expertise, authoritativeness and trustworthiness. We train our systems to identify and prioritize these signals of reliability. And we’re constantly refining these systems — we make thousands of improvements every year to help people get high-quality information quickly.

Today we’re announcing one such improvement: a significant innovation to improve the quality of featured snippets. A featured snippet is the descriptive box at the top of the results page that prominently highlights a piece of information from a result, along with its source, in response to your query. Featured snippets are helpful both for people searching on Google and for web publishers, since they drive traffic to sites.

By using our latest AI model, Multitask Unified Model (MUM), our systems can now understand the notion of consensus, which is when multiple high-quality sources on the web all agree on the same fact. Our systems can check snippet callouts (the word or words called out above the featured snippet in a larger font) against other high-quality sources on the web, to see if there’s a general consensus for that callout, even if sources use different words or concepts to describe the same thing. We've found that this consensus-based technique has meaningfully improved the quality and helpfulness of featured snippet callouts.

A screenshot shows a query for “how long does it take for light from the sun to reach earth,” with a featured snippet highlighting a helpful article about the question and a bolded callout saying “8 and ⅓ minutes.”

With a consensus-based technique, we’re improving featured snippets.
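To make the consensus idea concrete, here’s a deliberately tiny sketch of an agreement check. It’s purely illustrative: the hypothetical `normalize_answer` only matches strings, whereas MUM compares meaning, so differently worded answers can still agree.

```python
import re

def normalize_answer(text: str) -> str:
    """Crude normalization (lowercase, strip punctuation, collapse spaces)
    so trivially different wordings of an answer can match."""
    text = re.sub(r"[^\w\s/]", "", text.lower())
    return " ".join(text.split())

def has_consensus(callout: str, source_answers: list[str],
                  threshold: float = 0.6) -> bool:
    """True if enough independent high-quality sources agree with the callout."""
    target = normalize_answer(callout)
    agreeing = sum(1 for a in source_answers if normalize_answer(a) == target)
    return agreeing / len(source_answers) >= threshold

answers = ["8 and 1/3 minutes", "8 and 1/3 minutes!",
           "about 9 minutes", "8 and 1/3 minutes"]
print(has_consensus("8 and 1/3 minutes", answers))  # 3 of 4 agree -> True
```

The point of the sketch is the shape of the check (many sources, one callout, an agreement threshold), not the string matching, which is exactly the part the AI model replaces.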

AI models are also helping our systems understand when a featured snippet might not be the most helpful way to present information. This is particularly helpful for questions where there is no answer: for example, a recent search for “when did snoopy assassinate Abraham Lincoln” provided a snippet highlighting an accurate date and information about Lincoln’s assassination, but this clearly isn’t the most helpful way to display this result.

We’ve trained our systems to get better at detecting these sorts of false premises, which are not very common, but are cases where it’s not helpful to show a featured snippet. We’ve reduced the triggering of featured snippets in these cases by 40% with this update.
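One way to picture false-premise detection: a question presupposes a fact, and the snippet should only trigger if that fact is actually supported. The sketch below hard-codes a single fact, and the `has_false_premise` helper is entirely hypothetical; the real systems learn this from many high-quality sources rather than a hand-written set.

```python
# A tiny, hand-written fact store, purely for illustration.
KNOWN_FACTS = {
    ("john wilkes booth", "assassinated", "abraham lincoln"),
}

def has_false_premise(subject: str, relation: str, obj: str) -> bool:
    """'When did X assassinate Y' presupposes the fact (X, assassinated, Y).
    If nothing supports that presupposed fact, don't show a snippet."""
    return (subject, relation, obj) not in KNOWN_FACTS

print(has_false_premise("snoopy", "assassinated", "abraham lincoln"))  # True
```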

Information literacy

Beyond designing our systems to return high-quality information, we also build information literacy features in Google Search that help people evaluate information, whether they found it on social media or in conversations with family or friends. In fact, in a study this year, researchers found that people regularly use Google as a tool to validate information encountered on other platforms. We’ve invested in building a growing range of information literacy features — including Fact Check Explorer, Reverse image search, and About this result — and today, we’re announcing several updates to make these features even more helpful.

Expanding About this result to more places

About this result helps you see more context about any Search result before you ever visit a web page, just by tapping the three dots next to the result. Since launching last year, people have used About this result more than 2.4 billion times, and we’re bringing it to even more people and places: eight more languages (Portuguese, French, Italian, German, Dutch, Spanish, Japanese and Indonesian) are coming later this year.

This week, we’re adding more context to About this result, such as how widely a source is circulated, online reviews about a source or company, whether a company is owned by another entity, or even when our systems can’t find much info about a source – all pieces of information that can provide important context.

And we’ve now launched About this page in the Google app, so you can get helpful context about websites as you’re browsing the web. Just swipe up from the navigation bar on any page to get more information about the source – helping you explore with confidence, no matter where you are online.

A gif shows the About this page feature, where someone swipes up on the navigation bar in the Google app while browsing the website for the Rainforest Alliance, and sees a panel with information about the source from across the web.

With About this page in the Google app, you can get helpful context on websites as you’re browsing.

Expanding content advisories for information gaps

Sometimes interest in a breaking news topic travels faster than facts, or there isn’t enough reliable information online about a given subject. Information literacy experts often refer to these situations as data voids. To address these, we show content advisories in situations when a topic is rapidly evolving, indicating that it might be best to check back later when more sources are available.

Now we’re expanding content advisories to searches where our systems don’t have high confidence in the overall quality of the results available for the search. This doesn’t mean that no helpful information is available, or that a particular result is low-quality. These notices provide context about the whole set of results on the page, and you can always see the results for your query, even when the advisory is present.

A gif shows a content advisory that says “It looks like there aren’t many great results for this search” along with tips like checking the source and trying new search terms.

New content advisories on searches where our systems don’t have high confidence in the overall quality of the results.
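The triggering logic can be pictured as a simple confidence threshold over per-result quality scores. All of the numbers and names below are invented for illustration; only the shape of the check matters.

```python
def needs_content_advisory(result_scores: list[float],
                           min_score: float = 0.5,
                           min_confident: int = 3) -> bool:
    """Show an advisory when too few results clear a quality bar.
    The results are shown either way; the advisory just adds context."""
    return sum(s >= min_score for s in result_scores) < min_confident

print(needs_content_advisory([0.72, 0.31, 0.18, 0.05]))  # sparse topic -> True
print(needs_content_advisory([0.90, 0.80, 0.75, 0.60]))  # healthy topic -> False
```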

Educating people about misinformation

Beyond our products, we’re making investments into programs and partnerships to help educate people about misinformation. Since 2018, the Google News Initiative (GNI) has invested nearly $75 million in projects and partnerships working to strengthen media literacy and combat misinformation around the world.

Today, we’re announcing that Google is partnering with MediaWise at the Poynter Institute for Media Studies and PBS NewsHour Student Reporting Labs to develop information literacy lesson plans for teachers of middle and high school students. The lessons will be available for free to teachers using PBS Learning Media and for download on Poynter’s website. We’ve partnered with MediaWise since it was founded, and today’s announcement builds on the GNI’s support of its microlearning course, Find Facts Fast, delivered via text message and WhatsApp.

We also announced today the results of a survey conducted by the Poynter Institute and YouGov, with support from Google, on the ways people across generational lines verify information. You can read more in our blog post.

Helping people everywhere find the information they need

Google was built on the premise that information can be a powerful thing for people around the world. We’re determined to keep doing our part to help people everywhere find what they’re looking for and give them the context they need to make informed decisions about what they see online.

Source: Search



Using AI to keep Google Search safe

Every day, people come to Google looking for ways to keep themselves and their families safe. From highlighting resources in the wake of a natural disaster to providing time-sensitive health information, we’re constantly working on new features and improvements to help you quickly find what you need. And advancements in AI can power new technologies, like flood forecasting, to help people stay out of harm’s way.

Here’s a look at how our AI systems are helping us connect people to critical information while avoiding potentially shocking or harmful content — so you can stay safe, both online and off.

Finding trustworthy, actionable information when you need it most

We know that people come to Search in the moments that matter most. Today, if you search on Google for information on suicide, sexual assault, substance abuse and domestic violence, you’ll see contact information for national hotlines alongside the most relevant and helpful results.

But people in personal crises search in all kinds of ways, and it’s not always obvious to us that they’re in need. And if we can’t accurately recognize that, we can’t code our systems to show the most helpful search results. That's why using machine learning to understand language is so important.

Now, using our latest AI model, MUM, we can automatically and more accurately detect a wider range of personal crisis searches. MUM can better understand the intent behind people’s questions to detect when a person is in need, which helps us more reliably show trustworthy and actionable information at the right time. We’ll start using MUM to make these improvements in the coming weeks.

Steering clear of unexpected shocking content

Keeping you safe on Search also means helping you steer clear of unexpected shocking results. This can be challenging, because content creators sometimes use benign terms to label explicit or suggestive content. And the most prevalent content that matches your search may not be what you intended to find. In these cases, even if people aren't directly seeking explicit content, it can show up in their results.

One way we tackle this is with SafeSearch mode, which offers users the option to filter explicit results. This setting is on by default for Google accounts for people under 18. And even when users choose to have SafeSearch off, our systems still reduce unwanted racy results for searches that aren't seeking them out. In fact, every day, our safety algorithms improve hundreds of millions of searches globally across web, image and video modes.

But there’s still room for improvement, and we’re using advanced AI technologies like BERT to better understand what you’re looking for. BERT has improved our understanding of whether searches are truly seeking out explicit content, helping us vastly reduce your chances of encountering surprising search results.

This is a complex challenge we’ve been tackling for a while — but in the last year alone, this BERT improvement has reduced unexpected shocking results by 30%. It’s been especially effective in reducing explicit content for searches related to ethnicity, sexual orientation and gender, which can disproportionately impact women and especially women of color.

Scaling our protections around the world

MUM can transfer knowledge across the 75 languages it’s trained on, which can help us scale safety protections around the world much more efficiently. When we train one MUM model to perform a task — like classifying the nature of a query — it learns to do it in all the languages it knows.

For example, we use AI to reduce unhelpful and sometimes dangerous spam pages in your search results. In the coming months, we’ll use MUM to improve the quality of our spam protections and expand to languages where we have very little training data. We'll also be able to better detect personal crisis queries all over the world, working with trusted local partners to show actionable information in several more countries.

Like any improvement to Search, these changes have and will continue to go through rigorous evaluation — with input from our search raters around the world to make sure we’re providing more relevant, helpful results. Whatever you’re searching for, we’re committed to helping you safely find it.

How AI powers great search results

Do you ever wonder how Google understands what you’re looking for? There’s a lot that goes into delivering helpful search results, and understanding language is one of the most important skills. Thanks to advancements in AI and machine learning, our Search systems are understanding human language better than ever before. And we want to share a behind-the-scenes look at how this translates into relevant results for you.

But first, let's walk down memory lane: In the early days of Search, before we had advanced AI, our systems simply looked for matching words. For example, if you searched for “pziza” — unless there was a page with that particular misspelling, you’d likely have to redo the search with the correct spelling to find a slice near you. And eventually, we learned how to code algorithms to find classes of patterns, like popular misspellings or potential typos from neighboring keys. Now, with advanced machine learning, our systems can more intuitively recognize if a word doesn’t look right and suggest a possible correction.
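That "classes of patterns" era can be sketched with a classic edit-distance corrector. Giving adjacent transpositions their own rule is why a swap like "pziza" is a single edit away from "pizza". This is a toy: real spelling systems also weight candidates by word frequency, keyboard layout and surrounding context.

```python
def edit_distance(a: str, b: str) -> int:
    """Damerau-Levenshtein distance (with adjacent transpositions), so a
    swapped pair of letters counts as a single edit."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]

def suggest(word: str, vocabulary: list[str]) -> str:
    """Suggest the closest known word to a possible typo."""
    return min(vocabulary, key=lambda w: edit_distance(word, w))

print(suggest("pziza", ["pizza", "piazza", "plaza", "pita"]))  # pizza
```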

These kinds of AI improvements to our Search systems mean that they’re constantly getting better at understanding what you’re looking for. And since the world and people’s curiosities are always evolving, it’s really important that Search does, too. In fact, 15% of searches we see every day are entirely new. AI plays a major role in showing you helpful results, even at the outermost edges of your imagination.

How our systems play together

We’ve developed hundreds of algorithms over the years, like our early spelling system, to help deliver relevant search results. When we develop new AI systems, our legacy algorithms and systems don’t just get shelved away. In fact, Search runs on hundreds of algorithms and machine learning models, and we’re able to improve it when our systems — new and old — can play well together. Each algorithm and model has a specialized role, and they trigger at different times and in distinct combinations to help deliver the most helpful results. And some of our more advanced systems play a more prominent role than others. Let’s take a closer look at the major AI systems running in Search today, and what they do.

RankBrain — a smarter ranking system

When we launched RankBrain in 2015, it was the first deep learning system deployed in Search. At the time, it was groundbreaking — not only because it was our first AI system, but because it helped us understand how words relate to concepts. Humans understand this instinctively, but it’s a complex challenge for a computer. RankBrain helps us find information we weren’t able to before by more broadly understanding how words in a search relate to real-world concepts. For example, if you search for “what’s the title of the consumer at the highest level of a food chain,” our systems learn from seeing those words on various pages that the concept of a food chain may have to do with animals, and not human consumers. By understanding and matching these words to their related concepts, RankBrain understands that you’re looking for what’s commonly referred to as an “apex predator.”

Search bar with the query “what’s the title of the consumer at the highest level of a food chain,” and a mobile view of a featured snippet for “apex predator.”

Thanks to this type of understanding, RankBrain (as its name suggests) is used to help rank — or decide the best order for — top search results. Although it was our very first deep learning model, RankBrain continues to be one of the major AI systems powering Search today.

Neural matching — a sophisticated retrieval engine

Neural networks underpin many modern AI systems today. But it wasn’t until 2018, when we introduced neural matching to Search, that we could use them to better understand how queries relate to pages. Neural matching helps us understand fuzzier representations of concepts in queries and pages, and match them to one another. It looks at an entire query or page rather than just keywords, developing a better understanding of the underlying concepts represented in them. Take the search “insights how to manage a green,” for example. If a friend asked you this, you’d probably be stumped. But with neural matching, we’re able to make sense of it. By looking at the broader representations of concepts in the query — management, leadership, personality and more — neural matching can decipher that this searcher is looking for management tips based on a popular, color-based personality guide.

Search bar with the query “insights how to manage a green” with a mobile view of relevant search results.

When our systems understand the broader concepts represented in a query or page, they can more easily match them with one another. This level of understanding helps us cast a wide net when we scan our index for content that may be relevant to your query. This is what makes neural matching such a critical part of how we retrieve relevant documents from a massive and constantly changing information stream.
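Under the hood, matching "broader representations of concepts" usually means comparing vectors: a query and a page are each mapped to an embedding, and pages whose vectors point the same way as the query's are treated as related. Here's a toy version with made-up three-dimensional "concept" vectors; real embeddings are learned and have hundreds of dimensions.

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical axes: [management, color-based personality, food/plants]
query_vec = [0.9, 0.8, 0.0]   # "insights how to manage a green"
page_a    = [0.8, 0.9, 0.1]   # a page on color-based personality management
page_b    = [0.1, 0.0, 0.9]   # a page about green vegetables

print(cosine(query_vec, page_a) > cosine(query_vec, page_b))  # True
```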

BERT — a model for understanding meaning and context

Launched in 2019, BERT was a huge step change in natural language understanding, helping us understand how combinations of words express different meanings and intents. Rather than simply searching for content that matches individual words, BERT comprehends how a combination of words expresses a complex idea. BERT understands words in a sequence and how they relate to each other, so it ensures we don’t drop important words from your query — no matter how small they are. For example, if you search for “can you get medicine for someone pharmacy,” BERT understands that you’re trying to figure out if you can pick up medicine for someone else. Before BERT, we took that short preposition for granted, mostly sharing results about how to fill a prescription. Thanks to BERT, we understand that even small words can have big meanings.

Search bar with the query “can you get medicine for someone pharmacy” with a mobile view of a featured snippet highlighting relevant text from an HHS.gov result.

Today, BERT plays a critical role in almost every English query. This is because our BERT systems excel at two of the most important tasks in delivering relevant results — ranking and retrieving. Based on its complex language understanding, BERT can very quickly rank documents for relevance. We’ve also improved legacy systems with BERT training, making them more helpful in retrieving relevant documents for ranking. And while BERT plays a major role in Search, it’s never working alone — like all of our systems, BERT is part of an ensemble of systems that work together to share high-quality results.
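That ranking-and-retrieving division of labor is the classic two-stage pipeline: a cheap pass casts a wide net over the index, then a costlier scorer reorders only the survivors. The sketch below uses trivial word-overlap scorers as stand-ins for the learned models, so only the pipeline shape, not the scoring, reflects the real system.

```python
def retrieve(query: str, docs: list[str], k: int) -> list[str]:
    """Stage 1 (recall-oriented): keep the k docs sharing the most words
    with the query. Stands in for a fast retrieval system."""
    q = set(query.split())
    return sorted(docs, key=lambda d: len(q & set(d.split())), reverse=True)[:k]

def rerank(query: str, candidates: list[str]) -> list[str]:
    """Stage 2 (precision-oriented): a costlier scorer, standing in for a
    BERT-style model, reorders only the survivors. Here: the fraction of a
    candidate's words that appear in the query, rewarding focused pages."""
    q = set(query.split())
    def score(d: str) -> float:
        words = d.split()
        return sum(w in q for w in words) / len(words)
    return sorted(candidates, key=score, reverse=True)

docs = [
    "pharmacy hours and locations near you",
    "can you pick up medicine for someone else at the pharmacy",
    "how to fill a prescription online",
]
query = "can you get medicine for someone pharmacy"
top = rerank(query, retrieve(query, docs, k=2))
print(top[0])  # the pick-up-medicine page ranks first
```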

MUM — moving from language to information understanding

In May, we introduced our latest AI milestone in Search — Multitask Unified Model, or MUM. A thousand times more powerful than BERT, MUM is capable of both understanding and generating language. It’s trained across 75 languages and many different tasks at once, allowing it to develop a more comprehensive understanding of information and world knowledge. MUM is also multimodal, meaning it can understand information across multiple modalities such as text, images and more in the future.

While we’re still in the early days of tapping into MUM’s potential, we’ve already used it to improve searches for COVID-19 vaccine information, and we’ll offer more intuitive ways to search using a combination of both text and images in Google Lens in the coming months. These are very specialized applications — so MUM is not currently used to help rank and improve the quality of search results like RankBrain, neural matching and BERT systems do.

As we introduce more MUM-powered experiences to Search, we’ll begin to shift from advanced language understanding to a more nuanced understanding of information about the world. And as with all improvements to Search, any MUM application will go through a rigorous evaluation process, with special attention to the responsible application of AI. And when they’re deployed, they’ll join the chorus of systems that run together to make Search helpful.

How AI powers great search results

Do you ever wonder how Google understands what you’re looking for? There’s a lot that goes into delivering helpful search results, and understanding language is one of the most important skills. Thanks to advancements in AI and machine learning, our Search systems are understanding human language better than ever before. And we want to share a behind-the-scenes look at how this translates into relevant results for you.

But first, let's walk down memory lane: In the early days of Search, before we had advanced AI, our systems simply looked for matching words. For example, if you searched for “pziza” — unless there was a page with that particular misspelling, you’d likely have to redo the search with the correct spelling to find a slice near you. And eventually, we learned how to code algorithms to find classes of patterns, like popular misspellings or potential typos from neighboring keys. Now, with advanced machine learning, our systems can more intuitively recognize if a word doesn’t look right and suggest a possible correction.

These kinds of AI improvements to our Search systems mean that they’re constantly getting better at understanding what you’re looking for. And since the world and people’s curiosities are always evolving, it’s really important that Search does, too. In fact, 15% of searches we see every day are entirely new. AI plays a major role in showing you helpful results, even at the outermost edges of your imagination.

How our systems play together

We’ve developed hundreds of algorithms over the years, like our early spelling system, to help deliver relevant search results. When we develop new AI systems, our legacy algorithms and systems don’t just get shelved away. In fact, Search runs on hundreds of algorithms and machine learning models, and we’re able to improve it when our systems — new and old — can play well together. Each algorithm and model has a specialized role, and they trigger at different times and in distinct combinations to help deliver the most helpful results. And some of our more advanced systems play a more prominent role than others. Let’s take a closer look at the major AI systems running in Search today, and what they do.

RankBrain — a smarter ranking system

When we launched RankBrain in 2015, it was the first deep learning system deployed in Search. At the time, it was groundbreaking — not only because it was our first AI system, but because it helped us understand how words relate to concepts. Humans understand this instinctively, but it’s a complex challenge for a computer. RankBrain helps us find information we weren’t able to before by more broadly understanding how words in a search relate to real-world concepts. For example, if you search for “what’s the title of the consumer at the highest level of a food chain,” our systems learn from seeing those words on various pages that the concept of a food chain may have to do with animals, and not human consumers. By understanding and matching these words to their related concepts, RankBrain understands that you’re looking for what’s commonly referred to as an “apex predator.”

Search bar with the query “what’s the title of the consumer at the highest level of a food chain,” and a mobile view of a featured snippet for “apex predator.”

Thanks to this type of understanding, RankBrain (as its name suggests) is used to help rank — or decide the best order for — top search results. Although it was our very first deep learning model, RankBrain continues to be one of the major AI systems powering Search today.

Neural matching — a sophisticated retrieval engine

Neural networks underpin many modern AI systems today. But it wasn’t until 2018, when we introduced neural matching to Search, that we could use them to better understand how queries relate to pages. Neural matching helps us understand fuzzier representations of concepts in queries and pages, and match them to one another. It looks at an entire query or page rather than just keywords, developing a better understanding of the underlying concepts represented in them. Take the search “insights how to manage a green,” for example. If a friend asked you this, you’d probably be stumped. But with neural matching, we’re able to make sense of it. By looking at the broader representations of concepts in the query — management, leadership, personality and more — neural matching can decipher that this searcher is looking for management tips based on a popular, color-based personality guide.

Search bar with the query “insights how to manage a green” with a mobile view of relevant search results.

When our systems understand the broader concepts represented in a query or page, they can more easily match them with one another. This level of understanding helps us cast a wide net when we scan our index for content that may be relevant to your query. This is what makes neural matching such a critical part of how we retrieve relevant documents from a massive and constantly changing information stream.
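One way to picture this kind of retrieval is nearest-neighbor search over whole-query and whole-page embeddings. The hand-written three-dimensional embeddings below are purely illustrative stand-ins for what a neural encoder would produce from full text.

```python
import math

# Illustrative only: pretend each document already has a dense embedding.
# Real systems produce these with neural encoders over the entire page.
DOC_EMBEDDINGS = {
    "color-personality management guide": (0.9, 0.8, 0.1),
    "how to grow green vegetables":       (0.1, 0.0, 0.9),
    "leadership tips for new managers":   (0.8, 0.9, 0.0),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_embedding, k=2):
    """Return the k documents whose embeddings are nearest the query's."""
    ranked = sorted(DOC_EMBEDDINGS,
                    key=lambda d: cosine(query_embedding, DOC_EMBEDDINGS[d]),
                    reverse=True)
    return ranked[:k]

# A hypothetical embedding for "insights how to manage a green": close to the
# management/personality region of the space, far from gardening.
query_vec = (0.85, 0.9, 0.05)
print(retrieve(query_vec))
```

Because query and page are compared as whole representations rather than keyword overlaps, the gardening page (the only one sharing the literal word "green") is not retrieved for this query.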

BERT — a model for understanding meaning and context

Launched in 2019, BERT was a huge step change in natural language understanding, helping us understand how combinations of words express different meanings and intents. Rather than simply searching for content that matches individual words, BERT comprehends how a combination of words expresses a complex idea. BERT understands words in a sequence and how they relate to each other, so it ensures we don’t drop important words from your query — no matter how small they are. For example, if you search for “can you get medicine for someone pharmacy,” BERT understands that you’re trying to figure out if you can pick up medicine for someone else. Before BERT, we took that short preposition for granted, mostly sharing results about how to fill a prescription. Thanks to BERT, we understand that even small words can have big meanings.

Search bar with the query “can you get medicine for someone pharmacy” with a mobile view of a featured snippet highlighting relevant text from an HHS.gov result.

Today, BERT plays a critical role in almost every English query. This is because our BERT systems excel at two of the most important tasks in delivering relevant results — ranking and retrieving. Based on its complex language understanding, BERT can very quickly rank documents for relevance. We’ve also improved legacy systems with BERT training, making them more helpful in retrieving relevant documents for ranking. And while BERT plays a major role in Search, it’s never working alone — like all of our systems, BERT is part of an ensemble of systems that work together to share high-quality results.
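At the heart of BERT-style models is self-attention, in which every token, including a short preposition like "for," contributes to the contextual representation of every other token. The sketch below is a minimal single-head scaled dot-product attention over invented 2-d token embeddings; real models learn separate query, key and value projections and stack many such layers.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(embeddings):
    """Minimal single-head self-attention: each token's output is a
    weighted mix of ALL token embeddings, so no word is dropped.
    (Raw embeddings stand in for learned query/key/value projections.)"""
    d = len(embeddings[0])
    out = []
    for q in embeddings:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in embeddings]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, embeddings))
                    for i in range(d)])
    return out

# Invented 2-d embeddings for the tokens of "medicine for someone".
tokens = ["medicine", "for", "someone"]
embs = [[1.0, 0.2], [0.3, 1.0], [0.6, 0.8]]
for tok, vec in zip(tokens, self_attention(embs)):
    print(tok, [round(x, 3) for x in vec])
```

The point of the sketch is structural: unlike a bag-of-words match that might discard "for" as a stopword, every token's representation here is contextualized by the full sequence.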

MUM — moving from language to information understanding

In May, we introduced our latest AI milestone in Search — Multitask Unified Model, or MUM. A thousand times more powerful than BERT, MUM is capable of both understanding and generating language. It’s trained across 75 languages and many different tasks at once, allowing it to develop a more comprehensive understanding of information and world knowledge. MUM is also multimodal, meaning it can understand information across multiple modalities such as text, images and more in the future.

While we’re still in the early days of tapping into MUM’s potential, we’ve already used it to improve searches for COVID-19 vaccine information, and we’ll offer more intuitive ways to search using a combination of both text and images in Google Lens in the coming months. These are very specialized applications — so MUM is not currently used to help rank and improve the quality of search results like RankBrain, neural matching and BERT systems do.

As we introduce more MUM-powered experiences to Search, we’ll begin to shift from advanced language understanding to a more nuanced understanding of information about the world. And as with all improvements to Search, any MUM application will go through a rigorous evaluation process, with special attention to the responsible application of AI. And when they’re deployed, they’ll join the chorus of systems that run together to make Search helpful.

Responsibly applying AI models to Search

For over two decades of Search, we’ve been at the forefront of innovation in language understanding to help deliver on our mission of making the world’s information more accessible and useful for everyone. We’ve seen how critical these advancements are to making information more helpful, and being able to better connect people to creators, publishers and businesses on the web. It’s this constant improvement in understanding human language that’s enabled us to send more traffic to the web every year since Google was created.

We’ve also seen how AI models have driven significant innovation in language understanding. Each successive milestone, from neural nets to BERT to MUM, has blown us away with the step change in information understanding it offered. But with each step forward, we look closely at the limitations and risks new technologies can present.

Across Google, we have been examining the risks and challenges associated with more powerful language models, and we’re committed to responsibly applying AI in Search. Here are some of the ways we do that.

Training on high-quality data

We pretrain our models on high-quality data to reduce their potential to perpetuate undesirable biases that may exist in web content. In the case of MUM, we ensured that training data from the web was designated as high-quality based on our search quality metrics, which are informed by our Search Quality Rater Guidelines and driven by our quality rating and evaluation system. This substantially reduces the risk of training on misinformation or explicit content, for example, and is key to our approach.

And as part of our efforts to build a Search experience that works for everyone, MUM was trained on over 75 languages from around the world.

Rigorous evaluation

Every improvement to Google Search undergoes a rigorous evaluation process to ensure we’re providing more relevant, helpful results. Our Search Quality Rater Guidelines are our north star for how we evaluate great search results. Human raters follow these guidelines and help us understand if our improvements are better fulfilling people’s information needs.

This evaluation process is central to the responsible application of any improvement to Search, whether we’re introducing powerful new systems like BERT or MUM, or simply adding a new feature.

Some changes are bigger than others, so we have to adjust our process accordingly. At the time of its introduction to Search, BERT impacted 1 in 10 English-language queries, so we scaled our evaluation process to be even more rigorous than usual. We subjected our systems to an unprecedented amount of scrutiny, increasing both the scale and granularity of quality testing, to help ensure they weren’t introducing concerning patterns into our systems.

While our standard evaluation process helps us judge launches across a representative query stream, for some improvements, we also more closely examine whether changes provide quality gains or losses across specific slices of queries, or topic areas. This allows us to identify if concerning patterns exist and pursue mitigations before launching an improvement to Search.
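This kind of sliced analysis can be sketched as aggregating rater preferences per topic slice and flagging slices where a new system loses. The slice names and scoring scheme below are invented for illustration.

```python
from collections import defaultdict

def sliced_win_rate(ratings):
    """Aggregate side-by-side ratings per topic slice.

    `ratings` is a list of (slice_name, delta) pairs, where delta > 0 means
    raters preferred the experimental system on that query and delta < 0
    means they preferred the existing one. (The slice names and scoring
    scheme are hypothetical.)
    """
    totals = defaultdict(lambda: [0, 0])  # slice -> [wins, total]
    for slice_name, delta in ratings:
        wins, total = totals[slice_name]
        totals[slice_name] = [wins + (delta > 0), total + 1]
    return {s: wins / total for s, (wins, total) in totals.items()}

ratings = [("health", 1), ("health", -1), ("health", 1),
           ("navigation", 1), ("navigation", 1)]
by_slice = sliced_win_rate(ratings)
# A slice with a win rate well below 0.5 regresses overall quality and
# would be flagged for mitigation before launch.
regressions = [s for s, rate in by_slice.items() if rate < 0.5]
print(by_slice, regressions)
```

Even if the aggregate win rate looks healthy, breaking results out per slice surfaces concerning patterns that a single headline number would hide.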

Search is not perfect, and any application of AI will not be perfect — this is why any change to Search involves extensive and constant evaluation and testing.

Responsible application design

In addition to working with responsibly designed and trained models, the thoughtful design of products and applications is key to addressing some of the challenges of language models. In Search, many of these critical mitigations take place at the application level, where we can focus on the end-user experience and more effectively manage risk in smaller models designed for specific tasks.

When we adopt new AI technologies such as BERT or MUM, they’re able to help improve individual systems to perform tasks more efficiently and effectively. This approach allows us to focus the scope of our evaluation and understand if an application is introducing concerning patterns. In the event that we do find concerning behavior, we’re able to design much more targeted solutions.

Minding our footprint

Training and running advanced AI models can be energy-intensive. Another benefit of training smaller, application-specific models is that the energy costs of the larger base model, such as MUM, are amortized over the many different applications.

The Google Research team recently published research detailing the energy costs of training state-of-the-art language models, and their findings show that combining efficient models, processors, and data centers with clean energy sources can reduce the carbon footprint of a model by as much as one thousand-fold — and we follow this approach to train our models in Search.
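As a back-of-the-envelope illustration of how independent efficiency gains compound multiplicatively, the per-factor values below are hypothetical round numbers, not figures from the research:

```python
# Hypothetical reduction factors for each lever the research describes.
# If each factor cuts emissions independently, the reductions multiply.
factors = {
    "more efficient model architecture": 10,
    "ML-optimized processor":             5,
    "efficient cloud data center":        2,
    "low-carbon energy mix":             10,
}

combined = 1
for f in factors.values():
    combined *= f

print(f"combined reduction: {combined}x")  # 10 * 5 * 2 * 10 = 1000x
```

No single lever needs to deliver a thousand-fold improvement on its own; modest gains on each axis are enough when they stack.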

Language models in practice

New language models like MUM have enormous potential to transform our ability to understand language and information about the world. And while they may be powerful, they do not make our existing systems obsolete. Today, Google Search employs hundreds of algorithms and machine learning models, none of which are wholly reliant on any singular, large model.

Amongst these hundreds of applications are systems and protections designed specifically to ensure you have a safe, high-quality experience. For example, we design our ranking systems to surface relevant and reliable information. Even if a model were to present issues around low-quality content, our systems are built to counteract this.

As we’re able to introduce new technologies like MUM into Search, they’ll help us greatly improve our systems and introduce entirely new product experiences. And they can also help us tackle other challenges we face. Improved AI systems can help bolster our spam fighting capabilities and even help us combat known loss patterns. In fact, we recently introduced a BERT-based system to better identify queries seeking explicit content, so we can better avoid shocking or offending users not looking for that information, and ultimately make our Search experience safer for everyone.

We look forward to making Search a better, more helpful product with improved information understanding from these advanced language models, and bringing these new capabilities to Search in a responsible way.

How MUM improved Google Searches for vaccine information

Soda, pop; sweater, jumper; soccer, football. So many things go by different names. Sometimes it’s a function of language, but sometimes it’s a matter of cultural trends or nuance, or simply where you are in the world. 

One very relevant example is COVID-19. As people everywhere searched for information, we had to learn to identify all the different phrases people used to refer to the novel coronavirus to make sure we surfaced high quality and timely information from trusted health authorities like the World Health Organization and Centers for Disease Control and Prevention. A year later, we’re encountering a similar challenge with vaccine names, only this time, we have a new tool to help: Multitask Unified Model (MUM).  


Understanding searches for vaccine information 

AstraZeneca, CoronaVac, Moderna, Pfizer, Sputnik and other broadly distributed vaccines all have many different names all over the world — over 800, based on our analysis. People searching for information about the vaccines may look for “Coronavaccin Pfizer,” “mRNA-1273,” “CoVaccine” — the list goes on. 

Our ability to correctly identify all these names is critical to bringing people the latest trustworthy information about the vaccine. But identifying the different ways people refer to the vaccines all over the world is hugely time-intensive, taking hundreds of human hours. 

With MUM, we were able to identify over 800 variations of vaccine names in more than 50 languages in a matter of seconds. After validating MUM’s findings, we applied them to Google Search so that people could find timely, high-quality information about COVID-19 vaccines worldwide.

Three screenshots of Search results about COVID-19 vaccines.

Surfacing trustworthy information about COVID-19 vaccines in Search.

Transferring knowledge across languages

MUM was able to do a job that would otherwise take weeks in just seconds, thanks to its knowledge transfer skills. MUM can learn from and transfer knowledge across the 75+ languages it’s trained on. For example, imagine reading a book; if you’re multilingual, you’d be able to share the major takeaways of the book in the other languages you speak — depending on your fluency — because you have an understanding of the book that isn’t language- or translation-dependent. MUM transfers knowledge across languages much like this.

Similarly, with its knowledge transfer abilities, MUM doesn’t have to learn a new capability or skill in every new language — it can transfer learnings across them, helping us quickly scale improvements even when there isn’t much training data to work with. This is in part thanks to MUM’s sample efficiencies — meaning MUM requires far fewer data inputs than previous models to accomplish the same task. In the case of vaccines, with just a small sample of official vaccine names, MUM was able to rapidly identify these variations across languages.  
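As a non-neural caricature of the variant-matching task: given a small seed of official names, even surface-level normalization catches spelling, spacing and punctuation variants (the example names come from the post above). MUM goes much further, matching variants across languages via learned representations rather than string manipulation.

```python
import unicodedata

def normalize(name):
    """Crude surface normalization: strip accents, punctuation and case.
    This string-level sketch only catches spelling-level variation; it
    stands in for what MUM does with learned cross-lingual representations."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(c for c in stripped.lower() if c.isalnum())

# Seed of official names and candidate variants, from the examples above.
OFFICIAL = {"Coronavaccin Pfizer", "mRNA-1273"}
CANDIDATES = ["coronavaccin pfizer", "CoronavaccinPfizer", "mRNA 1273", "CoVaccine"]

known = {normalize(n) for n in OFFICIAL}
matches = [c for c in CANDIDATES if normalize(c) in known]
print(matches)
```

The last candidate, "CoVaccine," is a genuinely different surface form that string normalization cannot link to a seed name; bridging that gap, across 50+ languages, is what requires a learned model.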


Improving Google Search with MUM

This first application of MUM helped us get critical information to users around the world in a timely manner, and we’re looking forward to the many ways in which MUM can make Search more useful to people in the future. Our early testing indicates that not only will MUM be able to improve many aspects of our existing systems, but it will also help us create completely new ways to search and explore information.

Improving Search to better protect people from harassment

Over the past two decades of building Google Search, we’ve continued to improve and refine our ability to provide the highest quality results for the billions of queries we see every day. Our core principles guide every improvement, as we constantly update Search to work better for you. One area we’d like to shed more light on is how we balance maximizing access to information with the responsibility to protect people from online harassment.


We design our ranking systems to surface high-quality results for as many queries as possible, but some types of queries are more susceptible to bad actors and require specialized solutions. One such example is websites that employ exploitative removal practices. These are sites that require payment to remove content, and since 2018 we’ve had a policy that enables people to request removal of pages with information about them from our results.


Beyond removing these pages from appearing in Google Search, we also used these removals as a demotion signal in Search, so that sites that have these exploitative practices rank lower in results. This solution leads the industry, and is effective in helping people who are victims of harassment from these sites. 


However, we found that there are some extraordinary cases of repeated harassment. The New York Times highlighted one such case, and shed light on some limitations of our approach.


To help people who are dealing with extraordinary cases of repeated harassment, we’re implementing an improvement to our approach to further protect known victims. Now, once someone has requested a removal from one site with predatory practices, we will automatically apply ranking protections to help prevent content from other similar low quality sites appearing in search results for people’s names. We’re also looking to expand these protections further, as part of our ongoing work in this space.


This change was inspired by a similar approach we’ve taken with victims of non-consensual explicit content, commonly known as revenge porn. While no solution is perfect, our evaluations show that these changes meaningfully improve the quality of our results.


Over the years of building Search, our approach has remained consistent: We take examples of queries where we’re not doing the best job in providing high quality results, and look for ways to make improvements to our algorithms. In this way, we don’t “fix” individual queries, since they’re often a symptom of a class of problems that affect many different queries. Our ability to address issues continues to lead the industry, and we’ve deployed advanced technology, tools and quality signals over the last two decades, making Search work better every day.


Search is never a solved problem, and there are always new challenges we face as the web and the world change. We’re committed to listening to feedback and looking for ways to improve the quality of our results.


MUM: A new AI milestone for understanding information

When I tell people I work on Google Search, I’m sometimes asked, "Is there any work left to be done?" The short answer is an emphatic “Yes!” There are countless challenges we're trying to solve so Google Search works better for you. Today, we’re sharing how we're addressing one many of us can identify with: having to type out many queries and perform many searches to get the answer you need.

Take this scenario: You’ve hiked Mt. Adams. Now you want to hike Mt. Fuji next fall, and you want to know what to do differently to prepare. Today, Google could help you with this, but it would take many thoughtfully considered searches — you’d have to search for the elevation of each mountain, the average temperature in the fall, difficulty of the hiking trails, the right gear to use, and more. After a number of searches, you’d eventually be able to get the answer you need.

But if you were talking to a hiking expert, you could ask one question — “what should I do differently to prepare?” You’d get a thoughtful answer that takes into account the nuances of your task at hand and guides you through the many things to consider.

This example is not unique — many of us tackle all sorts of tasks that require multiple steps with Google every day. In fact, we find that people issue eight queries on average for complex tasks like this one. 

Today's search engines aren't quite sophisticated enough to answer the way an expert would. But with a new technology called Multitask Unified Model, or MUM, we're getting closer to helping you with these types of complex needs. So in the future, you’ll need fewer searches to get things done. 


Helping you when there isn’t a simple answer

MUM has the potential to transform how Google helps you with complex tasks. Like BERT, MUM is built on a Transformer architecture, but it’s 1,000 times more powerful. MUM not only understands language, but also generates it. It’s trained across 75 different languages and many different tasks at once, allowing it to develop a more comprehensive understanding of information and world knowledge than previous models. And MUM is multimodal, so it understands information across text and images and, in the future, can expand to more modalities like video and audio.

Take the question about hiking Mt. Fuji: MUM could understand you’re comparing two mountains, so elevation and trail information may be relevant. It could also understand that, in the context of hiking, to “prepare” could include things like fitness training as well as finding the right gear. 

Animated GIF visualization representing how MUM interprets the question “I’ve hiked Mt. Adams and now want to hike Mt. Fuji next fall, what should I do to prepare?”

Since MUM can surface insights based on its deep knowledge of the world, it could highlight that while both mountains are roughly the same elevation, fall is the rainy season on Mt. Fuji so you might need a waterproof jacket. MUM could also surface helpful subtopics for deeper exploration — like the top-rated gear or best training exercises — with pointers to helpful articles, videos and images from across the web. 


Removing language barriers

Language can be a significant barrier to accessing information. MUM has the potential to break down these boundaries by transferring knowledge across languages. It can learn from sources that aren’t written in the language you wrote your search in, and help bring that information to you. 

Say there’s really helpful information about Mt. Fuji written in Japanese; today, you probably won’t find it if you don’t search in Japanese. But MUM could transfer knowledge from sources across languages, and use those insights to find the most relevant results in your preferred language. So in the future, when you’re searching for information about visiting Mt. Fuji, you might see results like where to enjoy the best views of the mountain, onsen in the area and popular souvenir shops — all information more commonly found when searching in Japanese.

Animated GIF showing a visualization of different illustrations of news sources in different languages.

Understanding information across types

MUM is multimodal, which means it can understand information from different formats like webpages, pictures and more, simultaneously. Eventually, you might be able to take a photo of your hiking boots and ask, “can I use these to hike Mt. Fuji?” MUM would understand the image and connect it with your question to let you know your boots would work just fine. It could then point you to a blog with a list of recommended gear.  

Animated GIF showing a photo of hiking shoes. The question “can I use these to hike Mt. Fuji?” appears next to the shoes.

Applying advanced AI to Search, responsibly

Whenever we take a leap forward with AI to make the world’s information more accessible, we do so responsibly. Every improvement to Google Search undergoes a rigorous evaluation process to ensure we’re providing more relevant, helpful results. Human raters, who follow our Search Quality Rater Guidelines, help us understand how well our results help people find information. 

Just as we’ve carefully tested the many applications of BERT launched since 2019, MUM will undergo the same process as we apply these models in Search. Specifically, we’ll look for patterns that may indicate bias in machine learning to avoid introducing bias into our systems. We’ll also apply learnings from our latest research on how to reduce the carbon footprint of training systems like MUM, to make sure Search keeps running as efficiently as possible.

We’ll bring MUM-powered features and improvements to our products in the coming months and years. Though we’re in the early days of exploring MUM, it’s an important milestone toward a future where Google can understand all of the different ways people naturally communicate and interpret information.
