Tag Archives: Gboard

Privacy-Preserving Smart Input with Gboard

Google Keyboard (a.k.a. Gboard) has a critical mission: to provide frictionless input on Android that empowers users to communicate accurately and express themselves effortlessly. To accomplish this mission, Gboard must also protect users' private and sensitive data: nothing users type is sent to Google servers. We recently launched privacy-preserving input by further advancing the latest federated technologies. In Android 11, Gboard also launched a contextual input suggestion experience, integrating on-device smarts into users' daily communication in a privacy-preserving way.

Before Android 11, input suggestions were surfaced to users in several different places. In Android 11, Gboard launched a consistent and coordinated way to access contextual input suggestions. For the first time, we've brought Smart Replies to the keyboard suggestions, powered by system intelligence running entirely on device. The smart input suggestions are rendered as a transparent layer on top of Gboard's suggestion strip. This structure maintains the trust boundary between the Android platform and Gboard: sensitive personal content cannot be accessed by Gboard. A suggestion is only sent to the app after the user taps to accept it.

For instance, when a user receives the message “Have a virtual coffee at 5pm?” in WhatsApp, on-device system intelligence predicts the smart text and emoji replies “Sounds great!” and “👍”. Android system intelligence can see the incoming message, but Gboard cannot. In Android 11, these Smart Replies are rendered by the Android platform on Gboard's suggestion strip as a transparent layer. The suggested reply is generated by the system intelligence. When the user taps the suggestion, the Android platform sends it to the input field directly. If the user doesn't tap the suggestion, Gboard and the app cannot see it. In this way, Android and Gboard surface the best of Google smarts while keeping users' data private: none of their data goes to any app, including the keyboard, unless they've tapped a suggestion.

Additionally, federated learning has enabled Gboard to train intelligent input models across many devices while keeping everything individual users type on their own device. Today, emoji are as common as punctuation and have become the way many of our users express themselves in messaging. Our users want fresh and diverse emoji to better express their thoughts in messaging apps. Recently, we launched new on-device transformer models in Gboard, fine-tuned with federated learning, to produce more contextual emoji predictions for English, Spanish, and Portuguese.
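To make the federated learning idea concrete, here is a minimal sketch of federated averaging in Java. It is a toy illustration under our own simplifying assumptions (the model is a bare weight vector, and clientUpdate stands in for on-device training), not Gboard's production training loop:

import java.util.Arrays;
import java.util.List;

// Toy federated averaging: the server never sees raw typing data,
// only model weights updated locally on each device.
public class FederatedAveragingSketch {

    // Hypothetical on-device step: one local SGD step on private data.
    static double[] clientUpdate(double[] global, double[] localGradient, double lr) {
        double[] updated = global.clone();
        for (int i = 0; i < updated.length; i++) {
            updated[i] -= lr * localGradient[i];
        }
        return updated;
    }

    // Server step: average the clients' updated weights into a new global model.
    static double[] federatedAverage(List<double[]> clientWeights) {
        double[] avg = new double[clientWeights.get(0).length];
        for (double[] w : clientWeights) {
            for (int i = 0; i < avg.length; i++) {
                avg[i] += w[i] / clientWeights.size();
            }
        }
        return avg;
    }

    public static void main(String[] args) {
        double[] global = {0.0, 0.0};
        // Pretend two devices computed these gradients from their local typing data.
        double[] w1 = clientUpdate(global, new double[]{0.2, -0.4}, 0.1);
        double[] w2 = clientUpdate(global, new double[]{0.6, 0.0}, 0.1);
        global = federatedAverage(Arrays.asList(w1, w2));
        System.out.println(Arrays.toString(global)); // [-0.04, 0.02]
    }
}

In the real system, updates are aggregated across many devices per round, and the raw text never leaves the phone; only the ephemeral weight updates do.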

Furthermore, following the success of these privacy-preserving machine learning techniques, Gboard continues to leverage federated analytics to understand how Gboard is used, based on decentralized data. What we've learned from privacy-preserving analysis has let us make better decisions about our product.

When a user shares an emoji in a conversation, their phone keeps an ongoing count of which emoji are used. Later, when the phone is idle, plugged in, and connected to WiFi, Google's federated analytics server invites the device to join a “round” of federated analytics computation with hundreds of other participating phones. Each device in the round computes its emoji share frequencies, encrypts the result, and sends it to the federated analytics server. The server can't decrypt any individual device's report, but the final tally of total emoji counts can be decrypted once the encrypted reports are combined across devices. The aggregated data shows that the most popular emoji is 😂 in WhatsApp, 😭 in Roblox (gaming), and ✔ in Google Docs. The emoji 😷 moved up from 119th to 42nd in frequency during COVID-19.
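The “encrypt so that only the total is visible” step can be illustrated with additive masking, the core trick behind secure aggregation. The sketch below is a deliberately tiny stand-in (plain longs and pairwise random masks in a single process, no real cryptography or dropout handling), not Gboard's actual protocol:

import java.util.Random;

// Toy secure aggregation: every pair of devices shares a random mask.
// One adds it and the other subtracts it, so the masks cancel in the sum:
// the server learns the total emoji count but no single device's count.
public class SecureAggregationSketch {
    public static void main(String[] args) {
        long[] emojiCounts = {3, 7, 2}; // per-device counts of one emoji (private)
        long[] masked = emojiCounts.clone();
        Random rng = new Random(42);    // stands in for pairwise shared secrets

        for (int i = 0; i < masked.length; i++) {
            for (int j = i + 1; j < masked.length; j++) {
                long mask = rng.nextInt(1_000_000);
                masked[i] += mask;      // device i adds the pairwise mask
                masked[j] -= mask;      // device j subtracts the same mask
            }
        }

        long total = 0;
        for (long m : masked) {
            total += m;                 // the server only ever sees masked values
        }
        System.out.println(total);      // 12: masks cancel, the true total appears
    }
}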

Gboard has always had a strong commitment to Google's Privacy Principles. Gboard strives to build privacy-preserving, effortless input products that let users freely express their thoughts in 900+ languages while safeguarding their data. We will keep pushing the state of the art in smart input technologies on Android. Stay tuned!

Create stickers for Gboard on Google Play

Posted by Alan Ni, Associate Product Manager, Gboard

Messaging is getting more and more expressive -- today you can say “I love you” with an emoji, a GIF, or a sticker. Millions of users share expressive content every day on Android devices using Gboard as their default keyboard. We want to push expression even further by allowing developers to create their own stickers for Gboard. Some of our early partners include Bitmoji, Disney, and even our own Allo team. Once published, your stickers can be seen and shared by millions of users around the world.

Using the Firebase App Indexing API, you'll be able to index any sticker assets you create, publish your app to the Play Store, and get featured in the Gboard sticker collection. Once a user downloads your sticker pack from the Play Store, they'll be able to send those stickers directly from their keyboard in any Android app that supports image insertion!

Getting Started with Stickers

To kick things off, you'll need to add the Firebase App Indexing library. Visit the Firebase Getting Started Guide for details. Once you've set up Firebase App Indexing, read through our sticker guide to learn how to index those stickers. Next, create your sticker assets!

You should build and index stickers on the first run after install or update, to minimize the lag between a user installing the app and seeing the stickers in Gboard. Our sample app should give you an idea of the end-to-end flow.

Making your Stickers Searchable

Users often look for stickers by searching on keywords. That means you'll want to add appropriate keywords so users can find your stickers; use the put method to add them. In the code snippet below, a Snoopy sticker is tagged with the keywords “bye”, “snoopy”, “see ya”, and “good bye”.

import com.google.firebase.appindexing.Indexable;

Indexable sticker = new Indexable.Builder("Sticker")
   .setName("Bye")
   // Add the URL for the sticker asset.
   .setImage("http://www.snoopysticker.com?id=1234")
   // See: Support links to your app content section.
   .setUrl("http://sticker/canonical/image/bye")
   // Set the accessibility label for the sticker.
   .setDescription("A sticker for Bye")
   // Add search keywords.
   .put("keywords", "bye", "snoopy", "see ya", "good bye")
   // Associate the sticker with its pack.
   .put("isPartOf",
        new Indexable.Builder("StickerPack")
          .setName("Snoopy Pack")
          .build())
   .build();
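Once built, the Indexable still needs to be added to the on-device index. Assuming the standard Firebase App Indexing dependency, that looks roughly like the following (a sketch; consult the current Firebase documentation for the exact call):

import com.google.firebase.appindexing.FirebaseAppIndex;

// Add or update the sticker in the on-device Firebase App Index so Gboard
// can surface it; a whole pack's Indexables can be passed in one call.
FirebaseAppIndex.getInstance().update(sticker);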

For larger sticker packs, you'll want to make sure you've tagged stickers with keywords so that they're easier for users to find. We've come up with a list of common English phrases/keywords you can use to tag your stickers. But don't forget to internationalize your stickers: to do this, you'll want to first detect the device language and then index keywords that correspond to that language, as sketched below.
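A minimal sketch of that per-language keyword selection (the keyword lists and helper below are our own illustration, not part of the Firebase API):

import java.util.Locale;
import java.util.Map;

public class StickerKeywords {
    // Pick the keyword list that matches the device language before indexing.
    // The per-language lists here are illustrative; ship your own translations.
    static String[] keywordsForDeviceLanguage(Map<String, String[]> byLanguage) {
        String language = Locale.getDefault().getLanguage(); // e.g. "en", "hi", "es"
        // Fall back to English when no localized keywords exist.
        return byLanguage.getOrDefault(language, byLanguage.get("en"));
    }

    public static void main(String[] args) {
        Map<String, String[]> byLanguage = Map.of(
            "en", new String[]{"bye", "see ya", "good bye"},
            "es", new String[]{"adiós", "hasta luego"});
        System.out.println(String.join(", ", keywordsForDeviceLanguage(byLanguage)));
    }
}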

Get Featured in the Sticker Collection

Finally, share your stickers with the world! To be featured in our sticker collection on the Play Store, fill out this form. But first, make sure to thoroughly test the sticker pack with the latest build of Gboard. If your app has high-quality stickers and works well with Gboard, we'll add it to our sticker collection; this is the best way to get it seen by millions of Gboard users!

We're really excited to see what sticker packs you're able to build.

The Machine Intelligence Behind Gboard

Most people spend a significant amount of time each day using mobile-device keyboards: composing emails, texting, engaging in social media, and more. Yet mobile keyboards are still cumbersome to use: the average user is roughly 35% slower typing on a mobile device than on a physical keyboard. To change that, we recently made many exciting improvements to Gboard for Android, working towards our vision of creating an intelligent mechanism that enables faster input while offering suggestions and correcting mistakes, in any language you choose.

With the realization that the way a mobile keyboard translates touch inputs into text is similar to how a speech recognition system translates voice inputs into text, we leveraged our experience in Speech Recognition to pursue our vision. First, we created robust spatial models that map fuzzy sequences of raw touch points to keys on the keyboard, just like acoustic models map sequences of sound bites to phonetic units. Second, we built a powerful core decoding engine based on finite state transducers (FST) to determine the likeliest word sequence given an input touch sequence. With its mathematical formalism and broad success in speech applications, we knew that an FST decoder would offer the flexibility needed to support a variety of complex keyboard input behaviors as well as language features. In this post, we will detail what went into the development of both of these systems.

Neural Spatial Models
Mobile keyboard input is subject to errors that are generally attributed to “fat finger typing” (or tracing spatially similar words in glide typing, as illustrated below), along with cognitive and motor errors (manifesting as misspellings, character insertions, deletions, or swaps, etc.). An intelligent keyboard needs to be able to account for these errors and predict the intended words rapidly and accurately. As such, we built a spatial model for Gboard that addresses these errors at the character level, mapping the touch points on the screen to actual keys.
Average glide trails for two spatially-similar words: “Vampire” and “Value”.
Until recently, Gboard used a Gaussian model to quantify the probability of tapping neighboring keys and a rule-based model to represent cognitive and motor errors. These models were simple and intuitive, but they didn't allow us to directly optimize metrics that correlate with better typing quality. Drawing on our experience with Voice Search acoustic models, we replaced both the Gaussian and rule-based models with a single, highly efficient long short-term memory (LSTM) model trained with a connectionist temporal classification (CTC) criterion.
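For intuition, the replaced baseline is easy to sketch: a Gaussian spatial model scores each key by how far the tap landed from the key's center. The snippet below is a toy illustration of that pre-LSTM idea (the coordinates and sigma are made up), not Gboard's code:

// Toy Gaussian spatial model: the likelihood of a key given a tap decays
// with the squared distance between the tap point and the key's center.
public class GaussianTapModel {
    static double tapLikelihood(double tapX, double tapY,
                                double keyX, double keyY, double sigma) {
        double dx = tapX - keyX;
        double dy = tapY - keyY;
        return Math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma));
    }

    public static void main(String[] args) {
        // A tap between "v" and "b", somewhat closer to "v".
        double v = tapLikelihood(115, 300, 100, 300, 15); // distance 15 to "v"
        double b = tapLikelihood(115, 300, 140, 300, 15); // distance 25 to "b"
        System.out.printf("P(v)=%.2f P(b)=%.2f%n", v / (v + b), b / (v + b));
    }
}

Such a model is easy to reason about, but its parameters cannot be trained end-to-end against the metrics that matter, which is what motivated the move to the LSTM.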

However, training this model turned out to be a lot more complicated than we had anticipated. While acoustic models are trained from human-transcribed audio data, one cannot easily transcribe millions of touch point sequences and glide traces. So the team exploited user-interaction signals (e.g., reverted auto-corrections as negative signals and accepted suggestions as positive ones) to form rich semi-supervised training and test sets.
Raw data points corresponding to the word “could” (left), and normalized sampled trajectory with per-sample variances (right).
A plethora of techniques from the speech recognition literature was used to iterate on the neural spatial model (NSM) to make it small and fast enough to run on any device. The TensorFlow infrastructure was used to train hundreds of models, optimizing various signals surfaced by the keyboard: completions, suggestions, gliding, etc. After more than a year of work, the resulting models were about 6 times faster and 10 times smaller than the initial versions; they also showed about a 15% reduction in bad autocorrects and a 10% reduction in wrongly decoded gestures on offline datasets.

Finite-State Transducers
While the NSM uses spatial information to help determine what was tapped or swiped, there are additional constraints — lexical and grammatical — that can be brought to bear. A lexicon tells us what words occur in a language, and a probabilistic grammar tells us what words are likely to follow other words. To encode this information, we use finite-state transducers. FSTs have long been a key component of Google’s speech recognition and synthesis systems. They provide a principled way to represent various probabilistic models (lexicons, grammars, normalizers, etc.) used in natural language processing, together with the mathematical framework needed to manipulate, optimize, combine and search the models*.

In Gboard, a key-to-word transducer compactly represents the keyboard lexicon as shown in the figure below. It encodes the mapping from key sequences to words, allowing for alternative key sequences and optional spaces.
This transducer encodes “I”, “I’ve”, “If” along paths from the start state (the bold circle 1) to final states (the double circle states 0 and 1). Each arc is labeled with an input key (before the “:”) and a corresponding output word (after the “:”) where ε encodes the empty symbol. The apostrophe in “I’ve” can be omitted. The user may skip the space bar sometimes. To account for that, the space key transition between words in the transducer is optional. The ε and space back arcs allow accepting more than one word.
A probabilistic n-gram transducer is used to represent the language model for the keyboard. A state in the model represents an (up to) n-1 word context and an arc leaving that state is labeled with a successor word together with its probability of following that context (estimated from textual data). These, together with the spatial model that gives the likelihoods of sequences of key touches (discrete tap entries or continuous gestures in glide typing), are combined and explored with a beam search.
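A drastically simplified picture of that combination: hypothetical spatial and bigram language-model scores (log-probabilities) are summed per candidate word, and only the best few hypotheses are kept, as in a beam search. The real decoder composes full transducers and searches over paths rather than scoring whole words, so treat this as a sketch of the idea only:

import java.util.Map;
import java.util.PriorityQueue;

// Toy decoder step: combine spatial likelihoods with a bigram language
// model and keep only the top hypotheses, as an FST beam search does at scale.
public class BeamSearchSketch {
    public static void main(String[] args) {
        // Spatial model: log-likelihood that the glide trace spells each word.
        Map<String, Double> spatial = Map.of(
            "vampire", -1.2, "value", -1.5, "vale", -2.8);
        // Bigram LM: log-probability of each word after the context "great".
        Map<String, Double> langModel = Map.of(
            "vampire", -6.0, "value", -2.0, "vale", -5.0);

        int beamWidth = 2;
        // Min-heap ordered by score, so the worst hypothesis is evicted first.
        PriorityQueue<Map.Entry<String, Double>> beam =
            new PriorityQueue<>(Map.Entry.comparingByValue());

        for (String word : spatial.keySet()) {
            double score = spatial.get(word) + langModel.get(word);
            beam.add(Map.entry(word, score));
            if (beam.size() > beamWidth) {
                beam.poll(); // prune hypotheses that fall outside the beam
            }
        }
        // "value" survives with the best total score: slightly worse
        // spatially than "vampire", but far likelier after "great".
        beam.forEach(e -> System.out.println(e.getKey() + " " + e.getValue()));
    }
}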

Generic FST principles, such as streaming and support for dynamic models, took us a long way towards building a new keyboard decoder, but several new functionalities also had to be added. When you speak, you don’t need the decoder to complete your words or guess what you will say next to save you a few syllables; but when you type, you appreciate the help of word completions and predictions. Also, we wanted the keyboard to provide seamless multilingual support, as shown below.
Trilingual input typing in Gboard.
It was a complex effort to get our new decoder off the ground, but the principled nature of FSTs has many benefits. For example, supporting transliterations for languages like Hindi is just a simple extension of the generic decoder.

Transliteration Models
In many languages with complex scripts, romanization systems have been developed to map characters into the Latin alphabet, often according to their phonetic pronunciations. For example, the Pinyin “xièxiè” corresponds to the Chinese characters “谢谢” (“thank you”). A Pinyin keyboard allows users to conveniently type words on a QWERTY layout and have them automatically “translated” into the target script. Likewise, a transliterated Hindi keyboard allows users to type “daanth” for “दांत” (teeth). Whereas Pinyin is an agreed-upon romanization system, Hindi transliterations are more fuzzy; for example “daant” would be a valid alternative for “दांत”.
Transliterated glide input for Hindi.
Just as we have a transducer mapping from letter sequences to words (a lexicon) and a weighted language-model automaton providing probabilities for word sequences, we built weighted transducer mappings between Latin key sequences and target-script symbol sequences for 22 Indic languages. Some languages have multiple writing systems (Bodo, for example, can be written in the Bengali or Devanagari scripts), so between transliterated and native layouts, we built 57 new input methods in just a few months.
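Conceptually, each transliteration entry is a weighted mapping from a Latin key sequence to one or more native-script words, with weights reflecting how likely each romanization is. A toy sketch with made-up costs (negative log-probabilities), standing in for the weighted transducers described above:

import java.util.List;
import java.util.Map;

// Toy weighted transliteration table: several Latin spellings can map to
// the same Devanagari word, each with its own cost; the decoder combines
// these costs with the language model when ranking candidates.
public class TransliterationSketch {
    record Candidate(String nativeWord, double cost) {}

    static final Map<String, List<Candidate>> TABLE = Map.of(
        "daant",  List.of(new Candidate("दांत", 0.4)),   // common romanization
        "daanth", List.of(new Candidate("दांत", 0.9)));  // fuzzier alternative

    public static void main(String[] args) {
        for (Candidate c : TABLE.get("daanth")) {
            System.out.println(c.nativeWord() + " cost=" + c.cost());
        }
    }
}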

The general nature of the FST decoder let us leverage all the work we had done to support completions, predictions, glide typing and many UI features with no extra effort, allowing us to offer a rich experience to our Indian users right from the start.

A More Intelligent Keyboard
All in all, our recent work cut the decoding latency by 50%, reduced the fraction of words users have to manually correct by more than 10%, allowed us to launch transliteration support for the 22 official languages of India, and enabled many new features you may have noticed.

While we hope that these recent changes improve your typing experience, we recognize that on-device typing is by no means solved. Gboard can still make suggestions that seem nonintuitive or of low utility and gestures can still be decoded to words a human would never pick. However, our shift towards powerful machine intelligence algorithms has opened new spaces that we’re actively exploring to make more useful tools and products for our users worldwide.

Acknowledgments
This work was done by Cyril Allauzen, Ouais Alsharif, Lars Hellsten, Tom Ouyang, Brian Roark and David Rybach, with help from the Speech Data Operation team. Special thanks go to Johan Schalkwyk and Corinna Cortes for their support.


* The toolbox of relevant algorithms is available in the OpenFst open-source library.

Bringing down the language barriers – making the internet more inclusive

There are currently over 400* million Internet users in India, but with only 20% of the population fluent in English, most Internet users face significant language barriers to getting the full value of the Internet. Speakers of Indian languages like Hindi or Tamil still have trouble finding content to read or services they can use in their own languages.

To build rich and empowering experiences for everyone means first and foremost making things work in the languages people speak. Today, we’re taking a huge step forward by launching a new set of products and features that will empower the Internet ecosystem to create more language content and better serve the needs of the billion Indians who are rapidly coming online.

Neural Machine Translation: The world’s content, in your language
Starting today, when you use Google Translate, you might notice that the translation is more accurate and easier to understand, especially when translating full sentences. That’s because we’ve brought our new Neural Machine Translation technology to translations between English and nine widely used Indian languages — Hindi, Bengali, Marathi, Tamil, Telugu, Gujarati, Punjabi, Malayalam and Kannada.

Neural translation is a lot better than our old phrase-based system, translating full sentences at a time, instead of pieces of a sentence. It uses this broader context to help it figure out the most relevant translation, which it then rearranges and adjusts to be more like a human speaking with proper grammar. This new technique improves the quality of translation more in a single jump than we’ve seen in the last ten years combined.

Just like it’s easier to learn a language when you already know a related language, we’ve discovered that our neural technology speaks each language better when it learns several at a time. For example, we have a whole lot more sample data for Hindi than its relatives Marathi and Bengali, but when we train them all together, the translations for all improve more than if we’d trained each individually.

[Image: side-by-side comparison of phrase-based translation and Neural Machine Translation output]

You can try these out in the Google Translate apps on iOS and Android, at translate.google.co.in, and through Google Search.

But how does this make the whole web better for everyone — Chrome has it covered!
That’s where Chrome’s built-in Translate functionality comes into play. Every day, more than 150 million web pages are translated by Chrome users through the magic of machine translation with one click or tap. The Chrome team and the Google Translate team have worked together to bring the power of Neural Machine Translation to web content, making full-page translations more accurate and easier to read.

Today, we’re extending the Neural Machine Translation built into Chrome to and from English for the same nine Indian languages (Bengali, Gujarati, Kannada, Malayalam, Marathi, Punjabi, Tamil, Telugu and Hindi). This means higher-quality translations of everything from song lyrics to news articles to cricket discussions.

Gboard in 22 Indian Languages and more
Being able to type in your language of choice is as important as understanding content on the web. Today, we are adding 11 new languages, with transliteration support, to the 11 existing Indian languages in Gboard, bringing the total to 22, including Hindi, Bengali, Telugu, Marathi, Tamil, Urdu, and Gujarati.

Gboard has all the things you love about your Google Keyboard — speed and accuracy, Glide Typing and voice typing — plus Google Search built in. It also allows you to search and use Google Translate right in your keyboard (just tap the “G” button to get started). And—as a reminder—Gboard already has a Hinglish language option for those of you who often switch back and forth between Hindi and English.

With today’s update, we’ve also dropped in a new text editing tool that makes it easier to select, copy and paste, plus new options for resizing and repositioning the keyboard so it fits your hand and texting style. And to top it all off, this Gboard update comes with some under-the-hood improvements, including better accuracy and predictions while you type.

Like Google Indic Keyboard, Gboard has auto-correction and prediction in these new languages, plus two layouts for each—one in the native language script and one with the QWERTY layout for transliteration, which lets you spell words phonetically using the QWERTY alphabet and get text output in your native language script. For example, type “aapko holi ki hardik shubhkamnay” and get “आपको होली की हार्दिक शुभकामनायें ”.

This is available today on Google Play Store, so make sure you’re running the latest version of the app.

Auto-translated local reviews in Maps
Local information across Google Search and Maps helps millions of people every day to discover and share great places. Our goal is to build a map tailored to each user's likes and preferences, and to make it work for everyone in their local language. Starting today, we’ll automatically add translations to local reviews on Google Maps, on both mobile and desktop. With this update, millions of these reviews – from restaurants to cafes or hotels – will appear in your own language.

All you need to do is launch Google Maps and open the reviews, and they’ll appear both in the original language and in the language set on your device. For instance, if you speak Tamil and travel to Kolkata, you can now automatically see reviews of the city's popular restaurants both in your own language and in the original language of the review.

Hindi Dictionary in Search
When you search for the meaning of an English word, for instance “meaning of nostalgic”, you get a dictionary straight in Google Search. Today, in collaboration with the Oxford University Press, we’re bringing the Rajpal & Sons Hindi dictionary online. This new experience supports transliteration, so you don’t even need to switch to a Hindi keyboard. The next time you’d like to know more about a word, say “Nirdeshak”, just type “Nirdeshak ka matlab” in Search, and you’ll instantly see word meanings and dictionary definitions on the search results page, including English translations.

All these new products and improvements take us closer to making the web more useful for Indian-language users. But we realise that we can’t do this alone; we need India’s Internet ecosystem to come together to build the apps and content that will make India’s Internet serve its users’ needs. One way to effectively bring the Internet industry together to solve for local-language users is to really understand those users and their needs, and let that understanding shape India’s Internet landscape. We have worked with KPMG India to compile an industry report titled “Indian Languages - Defining India’s Internet”, which provides rich insights on what we need to do together as an industry to bring the Internet alive for every Indian.

Source: *“Indian Languages - Defining India’s Internet” Report

Posted By Barak Turovsky, Group Product Manager, Google Translate