Get the full picture with helpful context on websites

When you think about how you can stay safe online, you might immediately think of protecting your data, updating your passwords, or having control over your personal information. But another important part of online safety is being confident in the information you find.

Information quality — in other words, surfacing relevant information from reliable sources — is a key principle of Google Search, and it’s one we relentlessly invest in. We also give you tools to evaluate for yourself the reliability of the information you come across.

Helpful context on websites

One of the tools we launched last year, About this Result, has now been used more than 1.6 billion times. This tool is available in English on individual Search results, helping you to see important context about a website before you even visit it. More languages will be available for this tool later this year.

But we want to ensure you have the tools to evaluate information wherever you are online — not just on the search results page, but also if you’ve already picked a webpage to visit. So we’re making this helpful context more accessible as you explore the web.

Soon, when you’re viewing a web page in the Google App, you'll be able to see information about the source with just a tap — including a brief description, what the site says about itself, and what others on the web say about it.

GIF showing the new helpful context feature for websites

Imagine you’re researching conservation efforts and find yourself on the unfamiliar website of a rainforest protection organization. Before you decide to donate, you’d like to understand whether it’s an organization you feel confident supporting. With this update, you’ll be able to find helpful context about a source while you’re already on its website.

You’ll be able to see context like this on any website — coming soon to the Google App on iOS and Android.

We hope this will not only give you more context and peace of mind when you search, but also help you explore with confidence.

Taking on the Next Generation of Phishing Scams

 

Every year, security technologies improve: browsers get better, encryption becomes ubiquitous on the web, authentication becomes stronger. But phishing remains a persistent threat (as shown by a recent phishing attack on the U.S. Department of Labor) because users retain the ability to log into their online accounts, often with a simple password, from anywhere in the world. It’s why today at I/O we announced new ways we’re reducing the risks of phishing: scaling phishing protections to Google Docs, Sheets and Slides, continuing to auto-enroll people in 2-Step Verification, and more. This post takes a deeper look at how phishing works and how it has evolved.

As phishing has grown more widespread, multi-factor authentication has become a particular focus for attackers. In some cases, attackers phish SMS codes directly by following a legitimate "one-time passcode" (triggered by the attacker trying to log into the victim's account) with a spoofed message asking the victim to "reply back with the code you just received."


Left: legitimate Google SMS verification. Right: spoofed message asking victim to share verification code.


In other cases, attackers have leveraged more sophisticated dynamic phishing pages to conduct relay attacks. In these attacks, a user thinks they're logging into the intended site, just as in a standard phishing attack. But instead of deploying a simple static phishing page that saves the victim's email and password when the victim tries to log in, the phisher deploys a web service that logs into the actual website at the same moment the user falls for the phishing page.

The simplest approach is an almost off-the-shelf "reverse proxy" which acts as a "person in the middle", forwarding the victim's inputs to the legitimate page and sending the response from the legitimate page back to the victim's browser.



These attacks are especially challenging to prevent because additional authentication challenges shown to the attacker—like a prompt for an SMS code—are also relayed to the victim, and the victim's response is in turn relayed back to the real website. In this way, the attacker can count on their victim to solve any authentication challenge presented.

Traditional multi-factor authentication with PIN codes can only do so much against these attacks, and authentication with smartphone approvals via a prompt — while more secure against SIM-swap attacks — is still vulnerable to this sort of real-time interception.

The Solution Space

Over the past year, we've started to automatically enable device-based two-factor authentication for our users. This authentication not only helps protect against traditional password compromise but, with technology improvements, we can also use it to help defend against these more sophisticated forms of phishing.

Taking a broad view, most efforts to protect and defend against phishing fall into the following categories:
  • Browser UI improvements to help users identify authentic websites.
  • Password managers that can validate the identity of the web page before logging in.
  • Phishing detection, both in email—the most common delivery channel—and in the browser itself, to warn users about suspicious web pages.
  • Preventing the person-in-the-middle attacks mentioned above by preventing automated login attempts.
  • Phishing-resistant authentication using FIDO with security keys or a Bluetooth connection to your phone.
  • Hardening the Google Prompt challenge to help users identify suspicious sign-in attempts, or to ask them to take additional steps that can defeat phishing (like navigating to a new web address, or joining the same wireless network as the computer they're logging into).

Expanding phishing-resistant authentication to more users


Over the last decade we’ve been working hard with a number of industry partners, as part of the FIDO Alliance, to expand phishing-resistant authentication mechanisms. Through these efforts we introduced physical FIDO security keys, such as the Titan Security Key, which prevent phishing by verifying the identity of the website you're logging into. (This verification protects against the "person-in-the-middle" phishing described above.) Recently, we announced a major milestone with the FIDO Alliance, Apple and Microsoft by expanding our support for the FIDO sign-in standards, helping to launch us into a truly passwordless, phishing-resistant future.
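To see why this verification defeats relay phishing, here is a minimal, conceptual sketch in Kotlin (not any real FIDO library's API): the authenticator's signature covers the origin the browser actually connected to and a hash of the relying party ID the credential was created for, so an assertion collected through a proxy fails verification at the real site.

data class ClientData(val challenge: String, val origin: String)

data class Assertion(
    val clientData: ClientData,      // reported by the browser, covered by the signature
    val rpIdHash: String,            // hash of the relying party ID the credential is scoped to
    val signatureValid: Boolean      // whether the signature checks out against the registered public key
)

fun verifyAssertion(
    assertion: Assertion,
    expectedChallenge: String,
    expectedOrigin: String,
    expectedRpIdHash: String
): Boolean {
    if (!assertion.signatureValid) return false
    if (assertion.clientData.challenge != expectedChallenge) return false
    // A victim phished through "https://accounts.example-proxy.net" fails this check,
    // because the browser reports the origin it actually talked to.
    if (assertion.clientData.origin != expectedOrigin) return false
    return assertion.rpIdHash == expectedRpIdHash
}

A relay proxy can forward every challenge to the victim, but it cannot change the origin the victim's browser reports or forge a signature scoped to the real site, which is what makes this approach phishing resistant.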

Even though security keys work great, we don't expect everyone to add one to their keyring.



Instead, to make this level of security more accessible, we're building it into mobile phones. Unlike physical FIDO security keys that need to be connected to your device via USB, we use Bluetooth to ensure your phone is close to the device you're logging into. Like physical security keys, this helps prevent a distant attacker from tricking you into approving a sign-in on their browser, giving us an added layer of security against the kind of "person in the middle" attacks that can still work against SMS or Google Prompt.

(But don't worry: this doesn't allow computers within Bluetooth range to log in as you—it only grants that approval to the computer you're logging into. And we only use Bluetooth to verify that your phone is near the device you're logging into, so you only need to have it on during login.)

Over the next couple of months we’ll be rolling out this technology in more places, which you might notice as a request for you to enable Bluetooth while logging in, so we can perform this additional security check. If you've signed into your Google account on your Android phone, we can enroll your phone automatically—just like with Google Prompt—allowing us to give this added layer of security to many of our users without the need for any additional setup.

But unfortunately this secure login doesn't work everywhere—for example, when logging into a computer that doesn't support Bluetooth, or a browser that doesn't support security keys. That's why, if we are to offer phishing-resistant security to everyone, we have to offer backups when security keys aren't available—and those backups must also be secure enough to prevent attackers from taking advantage of them.


Hardening existing challenges against phishing

Over the past few months, we've started experimenting with making our traditional Google Prompt challenges more phishing resistant.

We already use different challenge experiences depending on the situation—for example, sometimes we ask the user to match a PIN code with what they're seeing on the screen in addition to clicking "allow" or "deny". This can help prevent static phishing pages from tricking you into approving a challenge.

We've also begun experimenting with more involved challenges for higher-risk situations, including more prominent warnings when we see you logging in from a computer that we think might belong to a phisher, or asking you to join your phone to the same Wi-Fi network as the computer you're logging into so we can be sure the two are near each other. Similar to our use of Bluetooth for Security Keys, this prevents an attacker from tricking you into logging into a "person-in-the-middle" phishing page.


Bringing it all together

Of course, while all of these options dramatically increase account security, we also know that they can be a challenge for some of our users, which is why we're rolling them out gradually, as part of a risk-based approach that also focuses on usability. If we think an account is at a higher risk, or if we see abnormal behavior, we're more likely to use these additional security measures.

Over time, as FIDO2 authentication becomes more widely available, we expect to be able to make it the default for many of our users, and to rely on stronger versions of our existing challenges like those described above to provide secure fallbacks.

All these new tools in our toolbox—detecting browser automation to prevent "person in the middle" attacks, warning users in Chrome and Gmail, making the Google Prompt more secure, and automatically enabling Android phones as easy-to-use Security Keys—work together to allow us to better protect our users against phishing.

Phishing attacks have long been seen as a persistent threat, but these recent developments give us the ability to really move the needle and help more of our users stay safer online.

A new Search tool to help control your online presence

Have you ever searched for your name online to see what other people can find out about you? You’re not alone. And for many people, a key element of feeling safer and more private online is having greater control over where their sensitive, personally-identifiable information can be found.

These days, it’s important to have simple tools to manage your online presence. That’s why we’re introducing a new tool in Google Search to help you easily control whether your personally-identifiable information can be found in Search results, so you can have more peace of mind about your online footprint.

Remove results about you in Search

You might have seen that we recently updated our policies to enable people to request the removal of sensitive, personally-identifiable information — including contact information, like a phone number, email address, or home address — from Search.

Now, we’re making it easier for you to remove results that contain your contact information from Google. We’re rolling out a new tool to accompany our updated policies and streamline the request process.

A gif showing a representation of a new tool that will allow people to easily request the removal of Search results containing their phone number, home address, or email address.

When you’re searching on Google and find results about you that contain your phone number, home address, or email address, you’ll be able to quickly request their removal from Google Search — right as you find them. With this new tool, you can request removal of your contact details from Search with a few clicks, and you’ll also be able to easily monitor the status of these removal requests.

This feature will be available in the coming months in the Google App, and you’ll also be able to make removal requests by going to the three dots next to individual Google Search results. In the meantime, you can make requests to remove your info from our support page.

It’s important to note that when we receive removal requests, we will evaluate all content on the web page to ensure that we're not limiting the availability of other information that is broadly useful, for instance in news articles. And of course, removing contact information from Google Search doesn’t remove it from the web, which is why you may wish to contact the hosting site directly, if you're comfortable doing so.

At Google, we strongly believe in open access to information, and we also have a deep commitment to protecting people — and their privacy — online. These changes are significant and important steps to help you manage your online presence — and we want to make sure it’s as easy as possible for you to be in control.

Language Models Perform Reasoning via Chain of Thought

In recent years, scaling up the size of language models has been shown to be a reliable way to improve performance on a range of natural language processing (NLP) tasks. Today’s language models at the scale of 100B or more parameters achieve strong performance on tasks like sentiment analysis and machine translation, even with few or no training examples. Even the largest language models, however, can still struggle with certain multi-step reasoning tasks, such as math word problems and commonsense reasoning. How might we enable language models to perform such reasoning tasks?

In “Chain of Thought Prompting Elicits Reasoning in Large Language Models,” we explore a prompting method for improving the reasoning abilities of language models. Called chain of thought prompting, this method enables models to decompose multi-step problems into intermediate steps. With chain of thought prompting, language models of sufficient scale (~100B parameters) can solve complex reasoning problems that are not solvable with standard prompting methods.

Comparison to Standard Prompting
With standard prompting (popularized by GPT-3) the model is given examples of input–output pairs (formatted as questions and answers) before being asked to predict the answer for a test-time example (shown below on the left). In chain of thought prompting (below, right), the model is prompted to produce intermediate reasoning steps before giving the final answer to a multi-step problem. The idea is that a model-generated chain of thought would mimic an intuitive thought process when working through a multi-step reasoning problem. While producing a thought process has been previously accomplished via fine-tuning, we show that such thought processes can be elicited by including a few examples of chain of thought via prompting only, which does not require a large training dataset or modifying the language model’s weights.

Whereas standard prompting asks the model to directly give the answer to a multi-step reasoning problem, chain of thought prompting induces the model to decompose the problem into intermediate reasoning steps, in this case leading to a correct final answer.
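To make the difference concrete, here is a small illustrative sketch (in Kotlin, simply to assemble the prompt strings) of the two prompting styles; the exemplar wording is ours, written in the spirit of the paper's figures rather than copied from them.

// A worked example is included in the few-shot exemplar; only the chain of thought differs.
val question = "A juggler has 16 balls. Half of the balls are golf balls, " +
    "and half of the golf balls are blue. How many blue golf balls are there?"

val standardPrompt = """
    Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
    A: The answer is 11.
    Q: $question
    A:
""".trimIndent()

val chainOfThoughtPrompt = """
    Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
    A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11.
    Q: $question
    A:
""".trimIndent()

With the first prompt the model is nudged to answer immediately; with the second it tends to produce intermediate steps before its final answer, which is the behavior the paper evaluates.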

Chain of thought reasoning allows models to decompose complex problems into intermediate steps that are solved individually. Moreover, the language-based nature of chain of thought makes it applicable to any task that a person could solve via language. We find through empirical experiments that chain of thought prompting can improve performance on various reasoning tasks, and that successful chain of thought reasoning is an emergent property of model scale — that is, the benefits of chain of thought prompting only materialize with a sufficient number of model parameters (around 100B).

Arithmetic Reasoning
One class of tasks where language models typically struggle is arithmetic reasoning (i.e., solving math word problems). Two benchmarks in arithmetic reasoning are MultiArith and GSM8K, which test the ability of language models to solve multi-step math problems similar to the one shown in the figure above. We evaluate both the LaMDA collection of language models, ranging from 422M to 137B parameters, and the PaLM collection of language models, ranging from 8B to 540B parameters. We manually compose chains of thought to include in the examples for chain of thought prompting.

For these two benchmarks, using standard prompting leads to relatively flat scaling curves: increasing the scale of the model does not substantially improve performance (shown below). However, we find that when using chain of thought prompting, increasing model scale leads to improved performance that substantially outperforms standard prompting for large model sizes.

Employing chain of thought prompting enables language models to solve arithmetic reasoning problems for which standard prompting has a mostly flat scaling curve.

On the GSM8K dataset of math word problems, PaLM shows remarkable performance when scaled to 540B parameters. As shown in the table below, combining chain of thought prompting with the 540B parameter PaLM model leads to new state-of-the-art performance of 58%, surpassing the prior state of the art of 55% achieved by fine-tuning GPT-3 175B on a large training set and then ranking potential solutions via a specially trained verifier. Moreover, follow-up work on self-consistency shows that the performance of chain of thought prompting can be improved further by taking the majority vote of a broad set of generated reasoning processes, which results in 74% accuracy on GSM8K.

Chain of thought prompting with PaLM achieves a new state of the art on the GSM8K benchmark of math word problems. For a fair comparison against fine-tuned GPT-3 baselines, the chain of thought prompting results shown here also use an external calculator to compute basic arithmetic functions (i.e., addition, subtraction, multiplication and division).
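As a rough sketch of the self-consistency idea, the following Kotlin function samples several reasoning paths and takes a majority vote over their final answers. The sampleChainOfThought parameter is a hypothetical stand-in for a call to a language model with temperature sampling, and the answer extraction assumes each path ends with "The answer is ...".

fun selfConsistentAnswer(
    prompt: String,
    samples: Int,
    sampleChainOfThought: (String) -> String
): String {
    val finalAnswers = (1..samples)
        .map { sampleChainOfThought(prompt) }                            // independent sampled reasoning paths
        .map { it.substringAfterLast("The answer is").trim(' ', '.') }   // keep only each path's final answer
    return finalAnswers.groupingBy { it }.eachCount()
        .entries.maxByOrNull { it.value }!!.key                          // majority vote across paths
}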

Commonsense Reasoning
In addition to arithmetic reasoning, we consider whether the language-based nature of chain of thought prompting also makes it applicable to commonsense reasoning, which involves reasoning about physical and human interactions under the presumption of general background knowledge. For these evaluations, we use the CommonsenseQA and StrategyQA benchmarks, as well as two domain-specific tasks from the BIG-Bench collaboration concerning date understanding and sports understanding.

As shown below, for CommonsenseQA, StrategyQA, and Date Understanding, performance improved with model scale, and employing chain of thought prompting led to additional small improvements. Chain of thought prompting had the biggest improvement on sports understanding, for which PaLM 540B’s chain of thought performance surpassed that of an unaided sports enthusiast (95% vs 84%).

Chain of thought prompting also improves performance on various types of commonsense reasoning tasks.

Conclusions
Chain of thought prompting is a simple and broadly applicable method for improving the ability of language models to perform various reasoning tasks. Through experiments on arithmetic and commonsense reasoning, we find that chain of thought prompting is an emergent property of model scale. Broadening the range of reasoning tasks that language models can perform will hopefully inspire further work on language-based approaches to reasoning.

Acknowledgements
It was an honor and privilege to work with Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Ed Chi, Sharan Narang, Aakanksha Chowdhery, and Quoc Le on this project.

Source: Google AI Blog


Sunset of the Ad Manager API v202105

On Tuesday, May 31, 2022, in accordance with the deprecation schedule, v202105 of the Ad Manager API will sunset. At that time, any requests made to this version will return errors.

If you’re still using v202105, now is the time to upgrade to a newer release and take advantage of additional functionality. For example, in v202111 and newer versions we added new video opportunity Columns.

When you’re ready to upgrade, check the full release notes to identify any breaking changes. Here are a few examples of changes that may impact your applications:
  • v202108
    • The deprecated adExchangeEnvironment field has been removed from ProposalLineItem. To control which platforms your ProposalLineItem will serve on, you can use requestPlatformTargeting instead.
    • On the MobileApplication object, the singular field appStore was changed to the list field appStores (see the sketch after this list).
  • v202202
    • The deprecated pauseRole and pauseReason fields were removed from Proposals. These fields have been available on ProposalLineItems since v202108.
    • The deprecated exchangeRate and refreshExchangeRate fields were removed from Proposals.
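As a rough illustration of the appStore change above, code that previously read a single store now needs to handle a list. The sketch below uses a Kotlin stand-in type rather than the generated client classes, so check your Ad Manager client library for the exact field accessors.

// Stand-in for the generated MobileApplication class; only the field relevant to the change is modeled.
data class MobileApplication(val appStores: List<String>?)   // v202108+: list field replaces the singular appStore

fun describeAppStores(app: MobileApplication): String {
    // In v202105 and earlier this would have read the single appStore field instead.
    val stores = app.appStores ?: emptyList()
    return stores.joinToString(", ")
}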
As always, don't hesitate to reach out to us on the developer forum with any questions.

Improving skin tone representation across Google

Seeing yourself reflected in the world around you — in real life, media or online — is so important. And we know that challenges with image-based technologies and representation on the web have historically left people of color feeling overlooked and misrepresented. Last year, we announced Real Tone for Pixel, which is just one example of our efforts to improve representation of diverse skin tones across Google products.

Today, we're introducing a next step in our commitment to image equity and improving representation across our products. In partnership with Harvard professor and sociologist Dr. Ellis Monk, we’re releasing a new skin tone scale designed to be more inclusive of the spectrum of skin tones we see in our society. Dr. Monk has been studying how skin tone and colorism affect people’s lives for more than 10 years.

The culmination of Dr. Monk’s research is the Monk Skin Tone (MST) Scale, a 10-shade scale that will be incorporated into various Google products over the coming months. We’re openly releasing the scale so anyone can use it for research and product development. Our goal is for the scale to support inclusive products and research across the industry — we see this as a chance to share, learn and evolve our work with the help of others.

Ten circles in a row, ranging from dark to light.

The 10 shades of the Monk Skin Tone Scale.

This scale was designed to be easy to use for the development and evaluation of technology while representing a broader range of skin tones. In fact, in our research, participants in the U.S. found the Monk Skin Tone Scale to be more representative of their skin tones than the current tech industry standard. This was especially true for people with darker skin tones.

“In our research, we found that a lot of the time people feel they’re lumped into racial categories, but there’s all this heterogeneity with ethnic and racial categories,” Dr. Monk says. “And many methods of categorization, including past skin tone scales, don’t pay attention to this diversity. That’s where a lack of representation can happen…we need to fine-tune the way we measure things, so people feel represented.”

Using the Monk Skin Tone Scale to improve Google products

Updating our approach to skin tone can help us better understand representation in imagery, as well as evaluate whether a product or feature works well across a range of skin tones. This is especially important for computer vision, a type of AI that allows computers to see and understand images. When not built and tested intentionally to include a broad range of skin tones, computer vision systems have been found to perform worse for people with darker skin.

The MST Scale will help us and the tech industry at large build more representative datasets so we can train and evaluate AI models for fairness, resulting in features and products that work better for everyone — of all skin tones. For example, we use the scale to evaluate and improve the models that detect faces in images.
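As a hypothetical sketch of what evaluating a model against the scale can look like (this is illustrative, not Google's internal tooling), you can slice detection results by MST bucket and compare recall across buckets:

// Each example is annotated with a Monk Skin Tone bucket from 1 to 10.
data class LabeledExample(val mstBucket: Int, val hasFace: Boolean, val faceDetected: Boolean)

// Recall of a face detector per skin tone bucket; large gaps between buckets
// indicate the model needs more representative training and evaluation data.
fun recallPerBucket(results: List<LabeledExample>): Map<Int, Double> =
    results.filter { it.hasFace }
        .groupBy { it.mstBucket }
        .mapValues { (_, examples) -> examples.count { it.faceDetected }.toDouble() / examples.size }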

Here are other ways you’ll see this show up in Google products.

Improving skin tone representation in Search

Every day, millions of people search the web expecting to find images that reflect their specific needs. That’s why we’re also introducing new features using the MST Scale to make it easier for people of all backgrounds to find more relevant and helpful results.

For example, now when you search for makeup-related queries in Google Images, you'll see an option to further refine your results by skin tone. So if you’re looking for “everyday eyeshadow” or “bridal makeup looks,” you’ll more easily find results that work better for your needs.

Animated GIF showing a Google Images search for “bridal makeup looks.” The results include an option to filter by skin tone; the cursor selects a darker skin tone, which adjusts to results that are more relevant to this choice.

Seeing yourself represented in results can be key to finding information that's truly relevant and useful, which is why we’re also rolling out improvements to show a greater range of skin tones in image results for broad searches about people, or ones where people show up in the results. In the future, we’ll incorporate the MST Scale to better detect and rank images to include a broader range of results, so everyone can find what they're looking for.

Creating a more representative Search experience isn’t something we can do alone, though. How content is labeled online is a key factor in how our systems surface relevant results. In the coming months, we'll also be developing a standardized way to label web content. Creators, brands and publishers will be able to use this new inclusive schema to label their content with attributes like skin tone, hair color and hair texture. This will make it possible for content creators or online businesses to label their imagery in a way that search engines and other platforms can easily understand.

A photograph of a Black person looking into the camera. Tags hover over various areas of the photo; one over their skin says “Skin tone” with a circle matching their skin tone. Two additional tags over their hair read “Hair color” and “Hair texture.”

Improving skin tone representation in Google Photos

We’ll also be using the MST Scale to improve Google Photos. Last year, we introduced an improvement to our auto enhance feature in partnership with professional image makers. Now we’re launching a new set of Real Tone filters that are designed to work well across skin tones and evaluated using the MST Scale. We worked with a diverse range of renowned image makers, like Kennedi Carter and Joshua Kissi, who are celebrated for beautiful and accurate depictions of their subjects, to evaluate, test and build these filters. These new Real Tone filters allow you to choose from a wider assortment of looks and find one that reflects your style. Real Tone filters will be rolling out on Google Photos across Android, iOS and Web in the coming weeks.

Animated video showing before and after photos of images with the Real Tone Filter.

What’s next?

We’re openly releasing the Monk Skin Tone Scale so that others can use it in their own products and learn from this work — and so that we can partner with and learn from them. We want to get feedback, drive more interdisciplinary research, and make progress together. We encourage you to share your thoughts here. We’re continuing to collaborate with Dr. Monk to evaluate the MST Scale across different regions and product applications, and we’ll iterate and improve on it to make sure the scale works for people and use cases all over the world. And we’ll continue our efforts to make Google’s products work even better for every user.

The best part of working on this project is that it isn’t just ours — while we’re committed to making Google products better and more inclusive, we’re also excited about all the possibilities that exist as we work together to build for everyone across the web.

Unlocking Zero-Resource Machine Translation to Support New Languages in Google Translate

Machine translation (MT) technology has made significant advances in recent years, as deep learning has been integrated with natural language processing (NLP). Performance on research benchmarks like WMT has soared, and translation services have improved in quality and expanded to include new languages. Nevertheless, while existing translation services cover languages spoken by the majority of people worldwide, they only include around 100 languages in total, just over 1% of those actively spoken globally. Moreover, the languages that are currently represented are overwhelmingly European, largely overlooking regions of high linguistic diversity, like Africa and the Americas.

There are two key bottlenecks towards building functioning translation models for the long tail of languages. The first arises from data scarcity; digitized data for many languages is limited and can be difficult to find on the web due to quality issues with Language Identification (LangID) models. The second challenge arises from modeling limitations. MT models usually train on large amounts of parallel (translated) text, but without such data, models must learn to translate from limited amounts of monolingual text, which is a novel area of research. Both of these challenges need to be addressed for translation models to reach sufficient quality.

In “Building Machine Translation Systems for the Next Thousand Languages”, we describe how to build high-quality monolingual datasets for over a thousand languages that do not have translation datasets available and demonstrate how one can use monolingual data alone to train MT models. As part of this effort, we are expanding Google Translate to include 24 under-resourced languages. For these languages, we created monolingual datasets by developing and using specialized neural language identification models combined with novel filtering approaches. The techniques we introduce supplement massively multilingual models with a self-supervised task to enable zero-resource translation. Finally, we highlight how native speakers have helped us realize this accomplishment.

Meet the Data
Automatically gathering usable textual data for under-resourced languages is much more difficult than it may seem. Tasks like LangID, which work well for high-resource languages, are unsuccessful for under-resourced languages, and many publicly available datasets crawled from the web often contain more noise than usable data for the languages they attempt to support. In our early attempts to identify under-resourced languages on the web by training a standard Compact Language Detector v3 (CLD3) LangID model, we too found that the dataset was too noisy to be usable.

As an alternative, we trained a Transformer-based, semi-supervised LangID model on over 1000 languages. This model supplements the LangID task with the MAsked Sequence-to-Sequence (MASS) task to better generalize over noisy web data. MASS simply garbles the input by randomly removing sequences of tokens from it, and trains the model to predict these sequences. We applied the Transformer-based model to a dataset that had been filtered with a CLD3 model and trained to recognize clusters of similar languages.
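As a minimal sketch of the MASS-style objective described above (simplified to whitespace tokenization, with the removed span replaced by a placeholder), a training example can be produced like this:

import kotlin.random.Random

// Corrupt a sentence by masking a contiguous span of tokens; the model is trained to predict that span.
fun massExample(sentence: String, maskFraction: Double = 0.5, rng: Random = Random.Default): Pair<String, String> {
    val tokens = sentence.split(" ")
    val spanLength = maxOf(1, (tokens.size * maskFraction).toInt())
    val start = rng.nextInt(tokens.size - spanLength + 1)
    val corrupted = tokens.mapIndexed { i, token ->
        if (i in start until start + spanLength) "[MASK]" else token
    }
    val target = tokens.subList(start, start + spanLength)
    return corrupted.joinToString(" ") to target.joinToString(" ")   // (garbled input, span to reconstruct)
}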

We then applied the open sourced Term Frequency-Inverse Internet Frequency (TF-IIF) filtering to the resulting dataset to find and discard sentences that were actually in related high-resource languages, and developed a variety of language-specific filters to eliminate specific pathologies. The result of this effort was a dataset with monolingual text in over 1000 languages, of which 400 had over 100,000 sentences. We performed human evaluations on samples of 68 of these languages and found that the majority (>70%) reflected high-quality, in-language content.

The amount of monolingual data per language versus the amount of parallel (translated) data per language. A small number of languages have large amounts of parallel data, but there is a long tail of languages with only monolingual data.

Meet the Models
Once we had a dataset of monolingual text in over 1000 languages, we then developed a simple yet practical approach for zero-resource translation, i.e., translation for languages with no in-language parallel text and no language-specific translation examples. Rather than limiting our model to an artificial scenario with only monolingual text, we also include all available parallel text data with millions of examples for higher resource languages to enable the model to learn the translation task. Simultaneously, we train the model to learn representations of under-resourced languages directly from monolingual text using the MASS task. In order to solve this task, the model is forced to develop a sophisticated representation of the language in question, developing a complex understanding of how words relate to other words in a sentence.

Relying on the benefits of transfer learning in massively multilingual models, we train a single giant translation model on all available data for over 1000 languages. The model trains on monolingual text for all 1138 languages and on parallel text for a subset of 112 of the higher-resourced languages.

At training time, any input the model sees has a special token indicating which language the output should be in, exactly like the standard formulation for multilingual translation. Our additional innovation is to use the same special tokens for both the monolingual MASS task and the translation task. Therefore, the token translate_to_french may indicate that the source is in English and needs to be translated to French (the translation task), or it may mean that the source is in garbled French and needs to be translated to fluent French (the MASS task). By using the same tags for both tasks, a translate_to_french tag takes on the meaning, “Produce a fluent output in French that is semantically close to the input, regardless of whether the input is garbled in the same language or in another language entirely.” From the model’s perspective, there is not much difference between the two.
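A rough sketch of how the shared tag works in practice, with illustrative token and function names (not the actual training pipeline):

data class TrainingExample(val input: String, val target: String)

// Parallel data, available for the 112 higher-resource languages: translate English into French.
fun translationExample(english: String, french: String) =
    TrainingExample("<translate_to_french> $english", french)

// Monolingual data, available for all 1138 languages: reconstruct fluent French from MASS-garbled French.
fun reconstructionExample(garbledFrench: String, fluentFrench: String) =
    TrainingExample("<translate_to_french> $garbledFrench", fluentFrench)

Because both kinds of examples share the same tag, the model learns a single behavior: produce fluent French that is semantically close to whatever input it receives.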

Surprisingly, this simple procedure produces high quality zero-shot translations. The BLEU and ChrF scores for the resulting model are in the 10–40 and 20–60 ranges respectively, indicating mid- to high-quality translation. We observed meaningful translations even for highly inflected languages like Quechua and Kalaallisut, despite these languages being linguistically dissimilar to all other languages in the model. However, we only computed these metrics on the small subset of languages with human-translated evaluation sets. In order to understand the quality of translation for the remaining languages, we developed an evaluation metric based on round-trip translation, which allowed us to see that several hundred languages are reaching high translation quality.

To further improve quality, we use the model to generate large amounts of synthetic parallel data, filter the data based on round-trip translation (comparing a sentence translated into another language and back again), and continue training the model on this filtered synthetic data via back-translation and self-training. Finally, we fine-tune the model on a smaller subset of 30 languages and distill it into a model small enough to be served.

Translation accuracy scores for 638 of the languages supported in our model, using the metric we developed (RTTLangIDChrF), for both the higher-resource supervised languages and the low-resource zero-resource languages.
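A minimal sketch of the round-trip filtering step described above, assuming hypothetical translate and chrF functions standing in for the model and the ChrF similarity metric:

// Keep a synthetic (source, translation) pair only if translating back recovers something close to the source.
fun keepSyntheticPair(
    source: String,
    sourceLang: String,
    targetLang: String,
    translate: (text: String, to: String) -> String,
    chrF: (a: String, b: String) -> Double,
    threshold: Double = 0.5
): Boolean {
    val forward = translate(source, targetLang)      // source language -> target language
    val roundTrip = translate(forward, sourceLang)   // target language -> back to source language
    return chrF(source, roundTrip) >= threshold      // discard pairs that don't survive the round trip
}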

Contributions from Native Speakers
Regular communication with native speakers of these languages was critical for our research. We collaborated with over 100 people at Google and other institutions who spoke these languages. Some volunteers helped develop specialized filters to remove out-of-language content overlooked by automatic methods, for instance Hindi mixed with Sanskrit. Others helped with transliterating between different scripts used by the languages, for instance between Meetei Mayek and Bengali, for which sufficient tools didn’t exist; and yet others helped with a gamut of tasks related to evaluation. Native speakers were also key for advising in matters of political sensitivity, like the appropriate name for the language, and the appropriate writing system to use for it. And only native speakers could answer the ultimate question: given the current quality of translation, would it be valuable to the community for Google Translate to support this language?

Closing Notes
This advance is an exciting first step toward supporting more language technologies in under-resourced languages. Most importantly, we want to stress that the quality of translations produced by these models still lags far behind that of the higher-resource languages supported by Google Translate. These models are certainly a useful first tool for understanding content in under-resourced languages, but they will make mistakes and exhibit their own biases. As with any ML-driven tool, one should consider the output carefully.

The complete list of new languages added to Google Translate in this update:

Acknowledgements
We would like to thank Julia Kreutzer, Orhan Firat, Daan van Esch, Aditya Siddhant, Mengmeng Niu, Pallavi Baljekar, Xavier Garcia, Wolfgang Macherey, Theresa Breiner, Vera Axelrod, Jason Riesa, Yuan Cao, Mia Xu Chen, Klaus Macherey, Maxim Krikun, Pidong Wang, Alexander Gutkin, Apurva Shah, Yanping Huang, Zhifeng Chen, Yonghui Wu, and Macduff Hughes for their contributions to the research, engineering, and leadership of this project.

We would also like to extend our deepest gratitude to the following native speakers and members of affected communities, who helped us in a wide variety of ways: Yasser Salah Eddine Bouchareb (Algerian Arabic); Mfoniso Ukwak (Anaang); Bhaskar Borthakur, Kishor Barman, Rasika Saikia, Suraj Bharech (Assamese); Ruben Hilare Quispe (Aymara); Devina Suyanto (Balinese); Allahserix Auguste Tapo, Bakary Diarrassouba, Maimouna Siby (Bambara); Mohammad Jahangir (Baluchi); Subhajit Naskar (Bengali); Animesh Pathak, Ankur Bapna, Anup Mohan, Chaitanya Joshi, Chandan Dubey, Kapil Kumar, Manish Katiyar, Mayank Srivastava, Neeharika, Saumya Pathak, Tanya Sinha, Vikas Singh (Bhojpuri); Bowen Liang, Ellie Chio, Eric Dong, Frank Tang, Jeff Pitman, John Wong, Kenneth Chang, Manish Goregaokar, Mingfei Lau, Ryan Li, Yiwen Luo (Cantonese); Monang Setyawan (Caribbean Javanese); Craig Cornelius (Cherokee); Anton Prokopyev (Chuvash); Rajat Dogra, Sid Dogra (Dogri); Mohamed Kamagate (Dyula); Chris Assigbe, Dan Ameme, Emeafa Doe, Irene Nyavor, Thierry Gnanih, Yvonne Dumor (Ewe); Abdoulaye Barry, Adama Diallo, Fauzia van der Leeuw, Ibrahima Barry (Fulfulde); Isabel Papadimitriou (Greek); Alex Rudnick (Guarani); Mohammad Khdeir (Gulf Arabic); Paul Remollata (Hiligaynon); Ankur Bapna (Hindi); Mfoniso Ukwak (Ibibio); Nze Lawson (Igbo); D.J. Abuy, Miami Cabansay (Ilocano); Archana Koul, Shashwat Razdan, Sujeet Akula (Kashmiri); Jatin Kulkarni, Salil Rajadhyaksha, Sanjeet Hegde Desai, Sharayu Shenoy, Shashank Shanbhag, Shashi Shenoy (Konkani); Ryan Michael, Terrence Taylor (Krio); Bokan Jaff, Medya Ghazizadeh, Roshna Omer Abdulrahman, Saman Vaisipour, Sarchia Khursheed (Kurdish (Sorani));Suphian Tweel (Libyan Arabic); Doudou Kisabaka (Lingala); Colleen Mallahan, John Quinn (Luganda); Cynthia Mboli (Luyia); Abhishek Kumar, Neeraj Mishra, Priyaranjan Jha, Saket Kumar, Snehal Bhilare (Maithili); Lisa Wang (Mandarin Chinese); Cibu Johny (Malayalam); Viresh Ratnakar (Marathi); Abhi Sanoujam, Gautam Thockchom, Pritam Pebam, Sam Chaomai, Shangkar Mayanglambam, Thangjam Hindustani Devi (Meiteilon (Manipuri)); Hala Ajil (Mesopotamian Arabic); Hamdanil Rasyid (Minangkabau); Elizabeth John, Remi Ralte, S Lallienkawl Gangte,Vaiphei Thatsing, Vanlalzami Vanlalzami (Mizo); George Ouais (MSA); Ahmed Kachkach, Hanaa El Azizi (Morrocan Arabic); Ujjwal Rajbhandari (Newari); Ebuka Ufere, Gabriel Fynecontry, Onome Ofoman, Titi Akinsanmi (Nigerian Pidgin); Marwa Khost Jarkas (North Levantine Arabic); Abduselam Shaltu, Ace Patterson, Adel Kassem, Mo Ali, Yonas Hambissa (Oromo); Helvia Taina, Marisol Necochea (Quechua); AbdelKarim Mardini (Saidi Arabic); Ishank Saxena, Manasa Harish, Manish Godara, Mayank Agrawal, Nitin Kashyap, Ranjani Padmanabhan, Ruchi Lohani, Shilpa Jindal, Shreevatsa Rajagopalan, Vaibhav Agarwal, Vinod Krishnan (Sanskrit); Nabil Shahid (Saraiki); Ayanda Mnyakeni (Sesotho, Sepedi); Landis Baker (Seychellois Creole); Taps Matangira (Shona); Ashraf Elsharif (Sudanese Arabic); Sakhile Dlamini (Swati); Hakim Sidahmed (Tamazight); Melvin Johnson (Tamil); Sneha Kudugunta (Telugu); Alexander Tekle, Bserat Ghebremicael, Nami Russom, Naud Ghebre (Tigrinya); Abigail Annkah, Diana Akron, Maame Ofori, Monica Opoku-Geren, Seth Duodu-baah, Yvonne Dumor (Twi); Ousmane Loum (Wolof); and Daniel Virtheim (Yiddish).


Source: Google AI Blog


Introducing the Google Meet Live Sharing SDK

Posted by Mai Lowe, Product Manager & Ken Cenerelli, Technical Writer


The Google Meet Live Sharing SDK is in preview. To use the SDK, developers can apply for access through our Early Access Program.

Today at Google I/O 2022, we announced new functionality for app developers to leverage the Google Meet video conferencing product through our new Meet Live Sharing SDK. Users can now come together and share experiences with each other inside an app, such as streaming a TV show, queuing up videos to watch on YouTube, collaborating on a music playlist, joining in a dance party, or working out together through Google Meet. This SDK joins the large set of offerings available to developers under the Google Workspace Platform.

Partners like YouTube, Heads Up!, UNO!™ Mobile, and Kahoot! are already integrating our SDK into their applications so that their users can participate in these new, shared interactive experiences later this year.

Supports multiple use cases


The Live Sharing SDK allows developers to sync content across devices in real time and incorporate Meet into their apps, enabling them to bring new, fun, and genuinely connecting experiences to their users. It’s also a great way to reach new audiences as current users can introduce your app to friends and family.

The SDK supports two key use cases:
  • Co-Watching—Syncs streaming app content across devices in real time and lets users take turns sharing videos and playing the latest hits from their favorite artist. Users can share controls such as starting and pausing a video, or selecting new content in the app.
  • Co-Doing—Syncs arbitrary app content, allowing users to get together to perform an activity like playing a video game or following the same workout routine.


The co-watching and co-doing APIs are independent but can be used in parallel with each other.


Example workflow illustration of a user starting live sharing within an app using the Live Sharing SDK.


Get started


To learn more, watch our I/O 2022 session on the Google Meet Live Sharing SDK and check out the documentation for the Android version.

If you want to try out the SDK, you can apply for access through our Early Access Program.


What’s next?


We’re also continuing to improve the SDK and working to build the video-content experiences you want to bring to your users. For more announcements like this and for info about the Google Workspace Platform and APIs, subscribe to our developer newsletter.

What’s new in Jetpack Compose

Posted by Jolanda Verhoef, Android Developer Relations Engineer, and Anna-Chiara Bellini, Android Toolkit UI Product Manager


It’s been almost a year since Jetpack Compose 1.0 was released, and during this time we've seen the community adopt it with enthusiasm. You’ve told us you’re appreciating the conciseness of the Kotlin syntax and the declarative approach that makes thinking about UI so much faster and easier.

Compose in the Community

We've seen many companies adopt Compose at scale for the newest and boldest features of their apps. For instance, we've worked closely with the Play Store team, who started experimenting with Compose in the very early days, and learned that not only is it more enjoyable, it is beneficial to their developer productivity. They told us that "All new Play Store features are built on top of this framework. Compose has been instrumental in unlocking better velocity and smoother landings for the app." The team at Twitter has been using Jetpack Compose across different parts of the app, and they are reaping the benefits, as "Compose makes it much easier to define our own components and to make their API contracts more explicit, flexible, and intuitive." The Airbnb team adopted Compose as well: "Jetpack Compose is a critical part of our technical strategy. The productivity gains are massive."

We're very glad to see that these teams, who have carefully evaluated Compose in large, complex production environments, are experiencing not just more fun and clarity in their UI development, but broader engineering benefits! And these are just a few examples, because over 100 of the top 1000 apps in the Play Store are now using Compose.

These close collaborations, and listening carefully to feedback from the broader Android community, are always at the heart of our development process and are key to advancing our roadmap. We're now focusing on supporting your more advanced use cases, with new APIs and feature improvements, all together with new tools to make building with Compose easier. We know that Compose fundamentally changes the way UI is built. To help you with the necessary mindset shift, we're publishing more guidance, talks and codelabs on advanced topics, and more in-depth videos so you can write apps that look great and perform great. Here's what is new:

Compose 1.2 beta

Today, we’re releasing the first beta version of Compose 1.2, which includes a lot of features and improvements.

Text improvements

Font Padding

We’ve addressed one of the top-voted bugs in our issue tracker by making includeFontPadding a customizable parameter. We recommend you set this value to false, as this enables more precise alignment of text within a layout. We aim to eventually make this the default value in a future release. Please let us know in the issue above if setting the value to false leads to issues with your app. Additionally, when includeFontPadding is set to false, you can adapt the line height of your Text composable by setting the lineHeightStyle parameter. Combined, it can look like this:


Multi-line Text with includeFontPadding set to true (left, current default) vs false (right) and lineHeightStyle.

Text(
    text = myText,
    style = TextStyle(
        lineHeight = 2.5.em,
        // Remove the extra font padding above the first line and below the last line.
        platformStyle = PlatformTextStyle(
            includeFontPadding = false
        ),
        // Center each line within the line height, without trimming the first and last lines.
        lineHeightStyle = LineHeightStyle(
            alignment = Alignment.Center,
            trim = Trim.None
        )
    )
)

Downloadable Fonts

Compose 1.2 also introduces downloadable fonts in Compose. You can use the new APIs to access Google Fonts asynchronously, even defining fallback fonts, without any complex setup. With downloadable fonts, you can keep your APK size small and improve your users’ system health, as multiple apps can share the same font through a provider.
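At the time of writing, using the new API looks roughly like the sketch below (from the androidx.compose.ui.text.googlefonts package); check the Compose typography documentation for the exact signatures and the certificate resource setup.

// Resolve a Google Font asynchronously through the Google Play services font provider.
val provider = GoogleFont.Provider(
    providerAuthority = "com.google.android.gms.fonts",
    providerPackage = "com.google.android.gms",
    certificates = R.array.com_google_android_gms_fonts_certs
)

val fontFamily = FontFamily(
    Font(googleFont = GoogleFont("Lobster Two"), fontProvider = provider)
    // Additional Font(...) entries listed here act as fallbacks while the download completes or if it fails.
)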

Text Magnifier

Android text provides a magnifier widget, which makes selecting text easier. Compose now supports the text magnifier.


The magnifier is shown when dragging a selection handle to help you see what’s under your finger. Compose 1.1.0 brought the magnifier to selection within text fields, and now Compose 1.2.0 supports the magnifier in both text fields and SelectionContainer. The magnifier has also been enhanced to match the precise behavior of the Android magnifier in Views.

Layout features and improvements

Lazy Layouts

Lazy layouts continue to evolve, with the grid APIs LazyVerticalGrid and LazyHorizontalGrid graduating out of experimental, and a new experimental API being added, called LazyLayout, that lets you implement your own custom lazy layouts. Learn more about these APIs in the I/O talk Lazy layouts in Compose.
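A minimal sketch of the now-stable grid API (parameter names as of 1.2; Text stands in for your item composable):

@Composable
fun PhotoGrid(photoLabels: List<String>) {
    // Three fixed columns; items are composed lazily as they scroll into view.
    LazyVerticalGrid(columns = GridCells.Fixed(3)) {
        items(photoLabels) { label ->
            Text(text = label)   // replace with your own item composable
        }
    }
}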

Interop with CoordinatorLayout

When you embed a scrolling composable in a CoordinatorLayout from the view system, you can now make sure their scroll behaviors are interoperable. This makes the setup of a collapsible toolbar much easier. You can opt in to this behavior by passing the result of calling the new experimental rememberNestedScrollInteropConnection method into the nestedScroll modifier. Here’s a sample demonstrating this new functionality.
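In addition to that sample, a minimal sketch of the opt-in looks roughly like this (the API is experimental, so the exact signature may change):

@OptIn(ExperimentalComposeUiApi::class)
@Composable
fun InteropList(names: List<String>) {
    // Share scroll deltas with the hosting CoordinatorLayout so collapsing toolbars react correctly.
    val nestedScrollInterop = rememberNestedScrollInteropConnection()
    LazyColumn(modifier = Modifier.nestedScroll(nestedScrollInterop)) {
        items(names) { name ->
            Text(text = name)
        }
    }
}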

Window insets

The insets library in Accompanist has now graduated to the Compose Foundation library, using the WindowInsets class. Read more about it in our documentation on Integrating Compose with your existing UI.

Window size classes

To make it easier to design, develop and test resizable layouts, we’ve released window size classes - a set of opinionated viewport breakpoints. They are now available in alpha in a new library material3-window-size-class, as part of the Material 3 set of libraries. You can read more about size classes in the Supporting different screen sizes documentation and take a look at a sample implementation in Crane.
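A minimal sketch of branching a layout on the width size class (the library is in alpha, so names may still change; CompactLayout and ExpandedLayout are placeholder composables):

@Composable
fun MyApp(windowSizeClass: WindowSizeClass) {
    // windowSizeClass typically comes from calculateWindowSizeClass(activity) in your Activity.
    when (windowSizeClass.widthSizeClass) {
        WindowWidthSizeClass.Compact -> CompactLayout()   // most phones in portrait: single pane
        else -> ExpandedLayout()                          // tablets, foldables, large windows: wider layout
    }
}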

Focus on performance

To help you understand and improve your app’s performance, we focused a lot on new performance tooling and guidance. With this, it becomes much easier to understand why and where your app might be lagging.

Starting from Android Studio Dolphin, you can inspect how often composables recompose using the Layout Inspector. Unexpectedly high numbers of recomposition can point you to a composable that could be optimized. In addition, Android Studio Electric Eel now includes a recomposition highlighter, a visual aid to see which composables recompose when. Read more about this new tooling in the What’s new in Android Studio blog.


Layout Inspector showing recomposition count and recomposition highlighter.

Compose changes the way you write your UI at a fundamental level, so there are some best practices you can adopt to make sure your app is performant. The newly released documentation page suggests how to write and configure your Compose app for best performance. In the I/O talk Common performance gotchas in Jetpack Compose, the Compose team describes common performance mistakes and how to fix them.

Performance is an ongoing area of focus and we’re working hard on improving and extending tooling and guidance. In the meantime, we’d really appreciate your feedback on the work we’ve done so far. Please raise your bugs in the issue tracker or ask your questions on the KotlinLang Slack group.

New tools

On top of these improvements, there are also new tooling updates to help you use Compose more effectively. Android Studio Dolphin, now in Beta, brings exciting features for Compose development. Beyond recomposition counts, new tools include Animation Coordination, so you can see and scrub through all your animations at once, and the MultiPreview annotation to help you build for multiple screen sizes. To help you iterate faster, Android Studio Electric Eel (in Canary) brings Live Edit.

GIF of Android Studio. On the left side there is code, and on the right side there is celebration text for Android Developers reaching one million subscribers on YouTube.

Check out What's new in Android Development Tools for all the details, and make sure you share your feedback to help shape the tooling support you need for Compose.

Compose for Wear OS

If there is something better than Compose, it is more Compose! So we're very excited to see Compose for Wear OS moving to Beta! Following the same principle as any other Jetpack library, Beta means that it's feature complete and API stable, and you can start building your production-ready apps. Go ahead and watch the talk, and read the blog post!

New and improved guidance

We’ve added and revamped a lot of the guidance on Compose.

Happy Composing!

We hope that you find these new features as exciting as we do. If you haven't started yet, it's time to learn Jetpack Compose and see how it will fit in your team and development process, so that you can experience all the benefits of improved velocity and developer productivity. Happy Composing!

Google I/O 2022: Advancing knowledge and computing

[TL;DR]

Nearly 24 years ago, Google started with two graduate students, one product, and a big mission: to organize the world’s information and make it universally accessible and useful. In the decades since, we’ve been developing our technology to deliver on that mission.

The progress we've made is because of our years of investment in advanced technologies, from AI to the technical infrastructure that powers it all. And once a year — on my favorite day of the year :) — we share an update on how it’s going at Google I/O.

Today, I talked about how we’re advancing two fundamental aspects of our mission — knowledge and computing — to create products that are built to help. It’s exciting to build these products; it’s even more exciting to see what people do with them.

Thank you to everyone who helps us do this work, and most especially our Googlers. We are grateful for the opportunity.

- Sundar


Editor’s note: Below is an edited transcript of Sundar Pichai's keynote address during the opening of today's Google I/O Developers Conference.

Hi, everyone, and welcome. Actually, let’s make that welcome back! It’s great to return to Shoreline Amphitheatre after three years away. To the thousands of developers, partners and Googlers here with us, it’s great to see all of you. And to the millions more joining us around the world — we’re so happy you’re here, too.

Last year, we shared how new breakthroughs in some of the most technically challenging areas of computer science are making Google products more helpful in the moments that matter. All this work is in service of our timeless mission: to organize the world's information and make it universally accessible and useful.

I'm excited to show you how we’re driving that mission forward in two key ways: by deepening our understanding of information so that we can turn it into knowledge; and advancing the state of computing, so that knowledge is easier to access, no matter who or where you are.

Today, you'll see how progress on these two parts of our mission ensures Google products are built to help. I’ll start with a few quick examples. Throughout the pandemic, Google has focused on delivering accurate information to help people stay healthy. Over the last year, people used Google Search and Maps to find where they could get a COVID vaccine nearly two billion times.

A visualization of Google’s flood forecasting system, with three 3D maps stacked on top of one another, showing landscapes and weather patterns in green and brown colors. The maps are floating against a gray background.

Google’s flood forecasting technology sent flood alerts to 23 million people in India and Bangladesh last year.

We’ve also expanded our flood forecasting technology to help people stay safe in the face of natural disasters. During last year’s monsoon season, our flood alerts notified more than 23 million people in India and Bangladesh. And we estimate this supported the timely evacuation of hundreds of thousands of people.

In Ukraine, we worked with the government to rapidly deploy air raid alerts. To date, we’ve delivered hundreds of millions of alerts to help people get to safety. In March I was in Poland, where millions of Ukrainians have sought refuge. Warsaw’s population has increased by nearly 20% as families host refugees in their homes, and schools welcome thousands of new students. Nearly every Google employee I spoke with there was hosting someone.

Adding 24 more languages to Google Translate

In countries around the world, Google Translate has been a crucial tool for newcomers and residents trying to communicate with one another. We’re proud of how it’s helping Ukrainians find a bit of hope and connection until they are able to return home again.

Two boxes, one showing a question in English — “What’s the weather like today?” — the other showing its translation in Quechua. There is a microphone symbol below the English question and a loudspeaker symbol below the Quechua answer.

With machine learning advances, we're able to add languages like Quechua to Google Translate.

Real-time translation is a testament to how knowledge and computing come together to make people's lives better. More people are using Google Translate than ever before, but we still have work to do to make it universally accessible. There’s a long tail of languages that are underrepresented on the web today, and translating them is a hard technical problem. That’s because translation models are usually trained with bilingual text — for example, the same phrase in both English and Spanish. However, there's not enough publicly available bilingual text for every language.

So with advances in machine learning, we’ve developed a monolingual approach where the model learns to translate a new language without ever seeing a direct translation of it. By collaborating with native speakers and institutions, we found these translations were of sufficient quality to be useful, and we'll continue to improve them.

A list of the 24 new languages Google Translate now has available.

We’re adding 24 new languages to Google Translate.

Today, I’m excited to announce that we’re adding 24 new languages to Google Translate, including the first indigenous languages of the Americas. Together, these languages are spoken by more than 300 million people. Breakthroughs like this are powering a radical shift in how we access knowledge and use computers.

Taking Google Maps to the next level

So much of what’s knowable about our world goes beyond language — it’s in the physical and geospatial information all around us. For more than 15 years, Google Maps has worked to create rich and useful representations of this information to help us navigate. Advances in AI are taking this work to the next level, whether it’s expanding our coverage to remote areas, or reimagining how to explore the world in more intuitive ways.

An overhead image of a map of a dense urban area, showing gray roads cutting through clusters of buildings outlined in blue.

Advances in AI are helping to map remote and rural areas.

Around the world, we’ve mapped around 1.6 billion buildings and over 60 million kilometers of roads to date. Some remote and rural areas have previously been difficult to map because of the scarcity of high-quality imagery and their distinct building types and terrain. To address this, we’re using computer vision and neural networks to detect buildings at scale from satellite images. As a result, we have increased the number of buildings on Google Maps in Africa by 5X since July 2020, from 60 million to nearly 300 million.

We’ve also doubled the number of buildings mapped in India and Indonesia this year. Globally, over 20% of the buildings on Google Maps have been detected using these new techniques. We’ve gone a step further, and made the dataset of buildings in Africa publicly available. International organizations like the United Nations and the World Bank are already using it to better understand population density, and to provide support and emergency assistance.

Immersive view in Google Maps fuses together aerial and street level images.

We’re also bringing new capabilities into Maps. Using advances in 3D mapping and machine learning, we’re fusing billions of aerial and street level images to create a new, high-fidelity representation of a place. These breakthrough technologies are coming together to power a new experience in Maps called immersive view: it allows you to explore a place like never before.

Let’s go to London and take a look. Say you’re planning to visit Westminster with your family. You can get into this immersive view straight from Maps on your phone, and you can pan around the sights… here’s Westminster Abbey. If you’re thinking of heading to Big Ben, you can check if there's traffic, how busy it is, and even see the weather forecast. And if you’re looking to grab a bite during your visit, you can check out restaurants nearby and get a glimpse inside.

What's amazing is that this isn't a drone flying in the restaurant — we use neural rendering to create the experience from images alone. And Google Cloud Immersive Stream allows this experience to run on just about any smartphone. This feature will start rolling out in Google Maps for select cities globally later this year.

Another big improvement to Maps is eco-friendly routing. Launched last year, it shows you the most fuel-efficient route, giving you the choice to save money on gas and reduce carbon emissions. Eco-friendly routes have already rolled out in the U.S. and Canada — and people have used them to travel approximately 86 billion miles, helping save an estimated half million metric tons of carbon emissions, the equivalent of taking 100,000 cars off the road.

Still image of eco-friendly routing on Google Maps — a 53-minute driving route in Berlin is pictured, with text below the map showing it will add three minutes but save 18% more fuel.

Eco-friendly routes will expand to Europe later this year.

I’m happy to share that we’re expanding this feature to more places, including Europe later this year. In this Berlin example, you could reduce your fuel consumption by 18% by taking a route that’s just three minutes slower. These small decisions have a big impact at scale. With the expansion into Europe and beyond, we estimate carbon emission savings will double by the end of the year.

And we’ve added a similar feature to Google Flights. When you search for flights between two cities, we also show you carbon emission estimates alongside other information like price and schedule, making it easy to choose a greener option. These eco-friendly features in Maps and Flights are part of our goal to empower 1 billion people to make more sustainable choices through our products, and we’re excited about the progress here.

New YouTube features to help people easily access video content

Beyond Maps, video is becoming an even more fundamental part of how we share information, communicate, and learn. Often when you come to YouTube, you are looking for a specific moment in a video, and we want to help you get there faster.

Last year we launched auto-generated chapters to make it easier to jump to the part you’re most interested in.

This is also great for creators because it saves them time making chapters. We’re now applying multimodal technology from DeepMind. It simultaneously uses text, audio and video to auto-generate chapters with greater accuracy and speed. With this, we now have a goal to 10X the number of videos with auto-generated chapters, from eight million today, to 80 million over the next year.

Often the fastest way to get a sense of a video’s content is to read its transcript, so we’re also using speech recognition models to transcribe videos. Video transcripts are now available to all Android and iOS users.

Animation showing a video being automatically translated. Then text reads "Now available in sixteen languages."

Auto-translated captions on YouTube.

Next up, we’re bringing auto-translated captions on YouTube to mobile, which means viewers can now auto-translate video captions in 16 languages, and creators can grow their global audience. We’ll also be expanding auto-translated captions to Ukrainian YouTube content next month, as part of our larger effort to increase access to accurate information about the war.

Helping people be more efficient with Google Workspace

Just as we’re using AI to improve features in YouTube, we’re building it into our Workspace products to help people be more efficient. Whether you work for a small business or a large institution, chances are you spend a lot of time reading documents. Maybe you’ve felt that wave of panic when you realize you have a 25-page document to read ahead of a meeting that starts in five minutes.

At Google, whenever I get a long document or email, I look for a TL;DR at the top — TL;DR is short for “Too Long, Didn’t Read.” And it got us thinking, wouldn’t life be better if more things had a TL;DR?

That’s why we’ve introduced automated summarization for Google Docs. Using one of our machine learning models for text summarization, Google Docs will automatically parse the words and pull out the main points.

This marks a big leap forward for natural language processing. Summarization requires understanding of long passages, information compression and language generation, which used to be outside of the capabilities of even the best machine learning models.
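
To give a flavor of how automated summarization works in general (this is not the model behind Docs), here's a minimal sketch using an off-the-shelf open-source summarization model from the Hugging Face transformers library; the model name and sample text are placeholders chosen for illustration.

```python
# A minimal sketch of abstractive summarization using an off-the-shelf
# open-source model via the Hugging Face transformers library. This is
# illustrative only; it is not the model that powers summaries in Google Docs.
from transformers import pipeline

# Any pretrained summarization checkpoint works here; this one is a commonly
# used public example.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

document = (
    "Google Docs can now suggest a summary of a long document. The feature "
    "relies on a machine learning model that reads the full text, compresses "
    "the key information, and generates a short natural-language summary "
    "that the author can accept or edit before sharing."
)

result = summarizer(document, max_length=60, min_length=20, do_sample=False)
print(result[0]["summary_text"])
```

The production system differs in model, scale, and integration, but the shape of the task is the same: long text in, short summary out.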

And docs are only the beginning. We’re launching summarization for other products in Workspace. It will come to Google Chat in the next few months, providing a helpful digest of chat conversations, so you can jump right into a group chat or look back at the key highlights.

Animation showing summary in Google Chat

We’re bringing summarization to Google Chat in the coming months.

And we’re working to bring transcription and summarization to Google Meet as well so you can catch up on some important meetings you missed.

Visual improvements on Google Meet

Of course there are many moments where you really want to be in a virtual room with someone. And that’s why we continue to improve audio and video quality, inspired by Project Starline. We introduced Project Starline at I/O last year. And we’ve been testing it across Google offices to get feedback and improve the technology for the future. And in the process, we’ve learned some things that we can apply right now to Google Meet.

Starline inspired us to build machine learning-powered image processing that automatically improves your image quality in Google Meet. And it works on all types of devices, so you look your best wherever you are.

An animation of a man looking directly at the camera then waving and smiling. A white line sweeps across the screen, adjusting the image quality to make it brighter and clearer.

Machine learning-powered image processing automatically improves image quality in Google Meet.

We’re also bringing studio-quality virtual lighting to Meet. You can adjust the light position and brightness, so you’ll still be visible in a dark room or sitting in front of a window. We’re testing this feature to ensure everyone looks like their true selves, continuing the work we’ve done with Real Tone on Pixel phones and the Monk Skin Tone Scale.

These are just some of the ways AI is improving our products: making them more helpful, more accessible, and delivering innovative new features for everyone.

Gif shows a phone camera pointed towards a rack of shelves, generating helpful information about food items. Text on the screen shows the words ‘dark’, ‘nut-free’ and ‘highly-rated’.

Today at I/O Prabhakar Raghavan shared how we’re helping people find helpful information in more intuitive ways on Search.

Making knowledge accessible through computing

We’ve talked about how we’re advancing access to knowledge as part of our mission: from better language translation to improved Search experiences across images and video, to richer explorations of the world using Maps.

Now we’re going to focus on how we make that knowledge even more accessible through computing. The journey we’ve been on with computing is an exciting one. Every shift, from desktop to the web to mobile to wearables and ambient computing, has made knowledge more useful in our daily lives.

As helpful as our devices are, we’ve had to work pretty hard to adapt to them. I’ve always thought computers should be adapting to people, not the other way around. We continue to push ourselves to make progress here.

Here’s how we’re making computing more natural and intuitive with the Google Assistant.

Introducing LaMDA 2 and AI Test Kitchen

Animation shows demos of how LaMDA can converse on any topic and how AI Test Kitchen can help create lists.

A demo of LaMDA, our generative language model for dialogue applications, and the AI Test Kitchen.

We're continually working to advance our conversational capabilities. Conversation and natural language processing are powerful ways to make computers more accessible to everyone. And large language models are key to this.

Last year, we introduced LaMDA, our generative language model for dialogue applications that can converse on any topic. Today, we are excited to announce LaMDA 2, our most advanced conversational AI yet.

We are at the beginning of a journey to make models like these useful to people, and we feel a deep responsibility to get it right. To make progress, we need people to experience the technology and provide feedback. We opened LaMDA up to thousands of Googlers, who enjoyed testing it and seeing its capabilities. This yielded significant quality improvements, and led to a reduction in inaccurate or offensive responses.

That’s why we’ve made AI Test Kitchen. It’s a new way to explore AI features with a broader audience. Inside the AI Test Kitchen, there are a few different experiences. Each is meant to give you a sense of what it might be like to have LaMDA in your hands and use it for things you care about.

The first is called “Imagine it.” This demo tests whether the model can take a creative idea you give it and generate imaginative and relevant descriptions. These are not products; they are quick sketches that allow us to explore what LaMDA can do with you. The user interfaces are very simple.

Say you’re writing a story and need some inspirational ideas. Maybe one of your characters is exploring the deep ocean. You can ask what that might feel like. Here LaMDA describes a scene in the Mariana Trench. It even generates follow-up questions on the fly. You can ask LaMDA to imagine what kinds of creatures might live there. Remember, we didn’t hand-program the model for specific topics like submarines or bioluminescence. It synthesized these concepts from its training data. That’s why you can ask about almost any topic: Saturn’s rings or even being on a planet made of ice cream.

Staying on topic is a challenge for language models. Say you’re building a learning experience — you want it to be open-ended enough to allow people to explore where curiosity takes them, but stay safely on topic. Our second demo tests how LaMDA does with that.

In this demo, we’ve primed the model to focus on the topic of dogs. It starts by generating a question to spark conversation, “Have you ever wondered why dogs love to play fetch so much?” And if you ask a follow-up question, you get an answer with some relevant details: interestingly, it thinks it might have something to do with the sense of smell and treasure hunting.

You can take the conversation anywhere you want. Maybe you’re curious about how smell works and you want to dive deeper. You’ll get a unique response for that too. No matter what you ask, it will try to keep the conversation on the topic of dogs. If I start asking about cricket, which I probably would, the model brings the topic back to dogs in a fun way.

This challenge of staying on-topic is a tricky one, and it’s an important area of research for building useful applications with language models.
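
As a very rough sketch of the idea (not how LaMDA is actually built), one simple way to nudge a general-purpose dialogue model toward a single topic is to prepend a topic primer to every turn; the primer text and conversation below are purely illustrative.

```python
# A minimal sketch of one simple way to keep a dialogue model on topic:
# prepend a topic "primer" to every turn before sending the prompt to a
# model. This is illustrative only; LaMDA's actual grounding and safety
# mechanisms are far more sophisticated than a prompt prefix.

TOPIC_PRIMER = (
    "You are a friendly guide whose only topic is dogs. "
    "If the user drifts to another subject, gently steer the conversation "
    "back to dogs."
)

def build_prompt(history: list[str], user_turn: str) -> str:
    """Compose the text that would be sent to a dialogue model for this turn.

    Re-asserting the primer on every turn helps the topic constraint persist
    across a long conversation.
    """
    return "\n".join([TOPIC_PRIMER, *history, f"User: {user_turn}", "Guide:"])

# Example: the user drifts to another topic; the primer nudges the model back.
history = [
    "Guide: Have you ever wondered why dogs love to play fetch so much?",
    "User: How does a dog's sense of smell work?",
    "Guide: A dog's nose has hundreds of millions of scent receptors...",
]
print(build_prompt(history, "Tell me about cricket instead."))
```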

These experiences show the potential of language models to one day help us with things like planning, learning about the world, and more.

Of course, there are significant challenges to solve before these models can truly be useful. While we have improved safety, the model might still generate inaccurate, inappropriate, or offensive responses. That’s why we are inviting feedback in the app, so people can help report problems.

We will be doing all of this work in accordance with our AI Principles. Our process will be iterative, opening up access over the coming months, and carefully assessing feedback with a broad range of stakeholders — from AI researchers and social scientists to human rights experts. We’ll incorporate this feedback into future versions of LaMDA, and share our findings as we go.

Over time, we intend to continue adding other emerging areas of AI into AI Test Kitchen. You can learn more at: g.co/AITestKitchen.

Advancing AI language models

LaMDA 2 has incredible conversational capabilities. To explore other aspects of natural language processing and AI, we recently announced a new model. It’s called Pathways Language Model, or PaLM for short. It’s our largest model to date, with 540 billion parameters.

PaLM demonstrates breakthrough performance on many natural language processing tasks, such as generating code from text, answering a math word problem, or even explaining a joke.

It achieves this through greater scale. And when we combine that scale with a new technique called chain-of-thought prompting, the results are promising. Chain-of-thought prompting allows us to describe multi-step problems as a series of intermediate steps.

Let’s take an example of a math word problem that requires reasoning. Normally, you prompt the model with an example question and answer, and then you start asking it new questions. In this case: how many hours are in the month of May? As you can see, the model didn’t quite get it right.

In chain-of-thought prompting, we give the model a question-answer pair, but this time, an explanation of how the answer was derived. Kind of like when your teacher gives you a step-by-step example to help you understand how to solve a problem. Now, if we ask the model again — how many hours are in the month of May — or other related questions, it actually answers correctly and even shows its work.

There are two boxes below a heading saying ‘chain-of-thought prompting’. A box headed ‘input’ guides the model through answering a question about how many tennis balls a person called Roger has. The output box shows the model correctly reasoning through and answering a separate question (‘how many hours are in the month of May?’)

Chain-of-thought prompting leads to better reasoning and more accurate answers.

Chain-of-thought prompting increases accuracy by a large margin. This leads to state-of-the-art performance across several reasoning benchmarks, including math word problems. And we can do it all without ever changing how the model is trained.
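
To make the two prompting styles concrete, here's a minimal sketch shown purely as prompt construction; no specific model API is assumed, and the tennis-ball exemplar mirrors the one in the figure above.

```python
# A minimal sketch of chain-of-thought prompting, shown purely as prompt
# construction. In practice you would send these strings to a large language
# model; no specific model API is assumed here.

# Standard few-shot prompt: the exemplar gives only the final answer.
standard_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A: The answer is 11.\n\n"
    "Q: How many hours are in the month of May?\n"
    "A:"
)

# Chain-of-thought prompt: the exemplar also spells out the intermediate
# reasoning, which encourages the model to reason step by step and show its
# work on the new question (31 days x 24 hours = 744 hours).
cot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
    "Q: How many hours are in the month of May?\n"
    "A:"
)

print(standard_prompt)
print(cot_prompt)
```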

PaLM is highly capable and can do so much more. For example, you might be someone who speaks a language that’s not well-represented on the web today — which makes it hard to find information. It’s even more frustrating because the answer you’re looking for is probably out there. PaLM offers a new approach that holds enormous promise for making knowledge more accessible for everyone.

Let me show you an example in which we can help answer questions in a language like Bengali — spoken by a quarter billion people. Just like before, we prompt the model with two example questions in Bengali, each with both a Bengali and an English answer.

That’s it. Now we can start asking questions in Bengali: “What is the national song of Bangladesh?” The answer, by the way, is “Amar Sonar Bangla” — and PaLM got it right, too. This is not that surprising because you would expect that content to exist in Bengali.

You can also try something that is less likely to have related information in Bengali, such as: “What are popular pizza toppings in New York City?” The model again answers correctly in Bengali, though it probably just stirred up a debate amongst New Yorkers about how “correct” that answer really is.
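
As a rough illustration of the prompt structure described above, here's a minimal sketch; the Bengali text is an approximate rendering added for this example, and no specific model API is assumed.

```python
# A rough sketch of the cross-lingual few-shot prompt described above: two
# example questions in Bengali, each answered in both Bengali and English,
# followed by a new question. In practice you would send the final prompt
# to a large multilingual language model.

examples = [
    {
        "question_bn": "বাংলাদেশের রাজধানী কোথায়?",  # "What is the capital of Bangladesh?"
        "answer_bn": "ঢাকা",
        "answer_en": "Dhaka",
    },
    {
        "question_bn": "বাংলাদেশের জাতীয় ফুল কী?",  # "What is the national flower of Bangladesh?"
        "answer_bn": "শাপলা",
        "answer_en": "Water lily",
    },
]

# The new question: "What are popular pizza toppings in New York City?"
new_question = "নিউ ইয়র্ক সিটিতে জনপ্রিয় পিৎজা টপিং কী কী?"

prompt_parts = []
for ex in examples:
    prompt_parts.append(
        f"Question: {ex['question_bn']}\n"
        f"Answer (Bengali): {ex['answer_bn']}\n"
        f"Answer (English): {ex['answer_en']}"
    )
prompt_parts.append(f"Question: {new_question}\nAnswer (Bengali):")

prompt = "\n\n".join(prompt_parts)
print(prompt)
```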

What’s so impressive is that PaLM has never seen parallel sentences between Bengali and English. Nor was it ever explicitly taught to answer questions or translate at all! The model brought all of its capabilities together to answer questions correctly in Bengali. And we can extend the techniques to more languages and other complex tasks.

We're so optimistic about the potential for language models. One day, we hope we can answer questions on more topics in any language you speak, making knowledge even more accessible, in Search and across all of Google.

Introducing the world’s largest publicly available machine learning hub

The advances we’ve shared today are possible only because of our continued innovation in our infrastructure. Recently we announced plans to invest $9.5 billion in data centers and offices across the U.S.

One of our state-of-the-art data centers is in Mayes County, Oklahoma. I’m excited to announce that, at this site, we are launching the world’s largest publicly available machine learning hub for our Google Cloud customers.

Still image of a data center with Oklahoma map pin on bottom left corner.

One of our state-of-the-art data centers in Mayes County, Oklahoma.

This machine learning hub has eight Cloud TPU v4 pods, custom-built on the same networking infrastructure that powers Google’s largest neural models. They provide nearly nine exaflops of computing power in aggregate — bringing our customers an unprecedented ability to run complex models and workloads. We hope this will fuel innovation across many fields, from medicine to logistics, sustainability and more.

And speaking of sustainability, this machine learning hub is already operating at 90% carbon-free energy. This is helping us make progress on our goal to become the first major company to operate all of our data centers and campuses globally on 24/7 carbon-free energy by 2030.

Even as we invest in our data centers, we are working to innovate on our mobile platforms so more processing can happen locally on device. Google Tensor, our custom system on a chip, was an important step in this direction. It’s already running on Pixel 6 and Pixel 6 Pro, and it brings our AI capabilities — including the best speech recognition we’ve ever deployed — right to your phone. It’s also a big step forward in making those devices more secure. Combined with Android’s Private Compute Core, it can run data-powered features directly on device so that it’s private to you.

People turn to our products every day for help in moments big and small. Core to making this possible is protecting your private information each step of the way. Even as technology grows increasingly complex, we keep more people safe online than anyone else in the world, with products that are secure by default, private by design and that put you in control.

We also spent time today sharing updates to platforms like Android. They’re delivering access, connectivity, and information to billions of people through their smartphones and other connected devices like TVs, cars and watches.

And we shared our new Pixel portfolio, including the Pixel 6a, Pixel Buds Pro, Google Pixel Watch, Pixel 7, and Pixel tablet, all built with ambient computing in mind. We’re excited to share a family of devices that work better together — for you.

The next frontier of computing: augmented reality

Today we talked about all the technologies that are changing how we use computers and access knowledge. We see devices working seamlessly together, exactly when and where you need them and with conversational interfaces that make it easier to get things done.

Looking ahead, there's a new frontier of computing, which has the potential to extend all of this even further, and that is augmented reality. At Google, we have been heavily invested in this area. We’ve been building augmented reality into many Google products, from Google Lens to multisearch, scene exploration, and Live and immersive views in Maps.

These AR capabilities are already useful on phones, and the magic will really come alive when you can use them in the real world without the technology getting in the way.

That potential is what gets us most excited about AR: the ability to spend time focusing on what matters in the real world, in our real lives. Because the real world is pretty amazing!

It’s important we design in a way that is built for the real world — and doesn’t take you away from it. And AR gives us new ways to accomplish this.

Let’s take language as an example. Language is just so fundamental to connecting with one another. And yet, understanding someone who speaks a different language, or trying to follow a conversation if you are deaf or hard of hearing, can be a real challenge. Let's see what happens when we take our advancements in translation and transcription and deliver them in your line of sight in one of the early prototypes we’ve been testing.

You can see it in their faces: the joy that comes with speaking naturally to someone. That moment of connection. To understand and be understood. That’s what our focus on knowledge and computing is all about. And it’s what we strive for every day, with products that are built to help.

Each year we get a little closer to delivering on our timeless mission. And we still have so much further to go. At Google, we genuinely feel a sense of excitement about that. And we are optimistic that the breakthroughs you just saw will help us get there. Thank you to all of the developers, partners and customers who joined us today. We look forward to building the future with all of you.