Online calls have become an everyday part of life for millions of people, helping to streamline their work and connect them to loved ones. To transmit a call across the internet, the audio and video data is split into short chunks, called packets. These packets make their way over the network from the sender to the receiver, where they are reassembled into continuous streams of video and audio. However, packets often arrive at the other end in the wrong order or at the wrong time, an issue generally referred to as jitter, and sometimes individual packets are lost entirely. Issues like these degrade call quality, since the receiver has to try to fill in the gaps, and they are a pervasive problem for both audio and video transmission. For example, 99% of Google Duo calls need to deal with packet loss, excessive jitter, or network delays. Of those calls, 20% lose more than 3% of their total audio duration to network issues, and 10% lose more than 8%.
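The receiver-side reassembly described above can be sketched in a few lines. The sequence numbers and the `reassemble` helper below are hypothetical illustrations of the general technique, not Duo's actual implementation:

```python
# Minimal sketch of receiver-side packet reassembly (hypothetical, not Duo's code).
# Packets carry a sequence number; the receiver buffers arrivals so that
# out-of-order packets can be played back in the right order, and marks any
# sequence number that never arrived as lost (a gap the PLC system must fill).

def reassemble(received, expected_count):
    """received: list of (seq, payload) tuples in arrival order."""
    buffer = dict(received)  # later duplicates overwrite earlier ones
    stream, lost = [], []
    for seq in range(expected_count):
        if seq in buffer:
            stream.append(buffer[seq])
        else:
            stream.append(None)  # gap: concealment needed here
            lost.append(seq)
    return stream, lost

# Packets 0-4: four arrive out of order, packet 2 is missing entirely.
stream, lost = reassemble([(1, "b"), (0, "a"), (4, "e"), (3, "d")], 5)
# stream == ["a", "b", None, "d", "e"], lost == [2]
```

In a real client this happens inside a jitter buffer that also trades off delay against reordering tolerance; the sketch keeps only the ordering-and-gaps logic.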
|Simplified diagram of network problems leading to packet loss, which needs to be counteracted by the receiver to allow reliable real-time communication.|
To address these audio issues, we present WaveNetEQ, a new packet loss concealment (PLC) system now being used in Duo. WaveNetEQ is a generative model, based on DeepMind’s WaveRNN technology, that is trained on a large corpus of speech data to realistically continue short speech segments, enabling it to fully synthesize the raw waveform of missing speech. Because Duo calls are end-to-end encrypted, all processing needs to be done on-device. The WaveNetEQ model is fast enough to run on a phone, while still providing state-of-the-art audio quality and more natural-sounding PLC than other systems currently in use.
A New PLC System for Duo
Like many other web-based communication systems, Duo is based on the WebRTC open source project. To conceal the effects of packet loss, WebRTC’s NetEQ component uses signal processing methods that analyze the speech and produce a smooth continuation. This works very well for small losses (20 ms or less), but does not sound good when the missing packets lead to gaps of 60 ms or more. In those cases the speech becomes robotic and repetitive, a characteristic sound that is unfortunately familiar to many internet voice callers.
To better manage packet loss, we replace the NetEQ PLC component with a modified version of WaveRNN, a recurrent neural network model for speech synthesis consisting of two parts, an autoregressive network and a conditioning network. The autoregressive network is responsible for the continuity of the signal and provides the short-term and mid-term structure for the speech by having each generated sample depend on the network’s previous outputs. The conditioning network influences the autoregressive network to produce audio that is consistent with the more slowly-moving input features.
However, WaveRNN, like its predecessor WaveNet, was created with the text-to-speech (TTS) application in mind. As a TTS model, WaveRNN is supplied with the information of what it is supposed to say and how to say it. The conditioning network directly receives this information as input, in the form of the phonemes that make up the words and additional prosody features (i.e., all non-text information like intonation or pitch). In a way, the conditioning network can “see into the future” and steer the autoregressive network towards the right waveforms to match it. In a PLC system for real-time communication, this context is not provided.
For a functional PLC system, one must both extract contextual information from the current speech (i.e., the past), and generate a plausible sound to continue it. Our solution, WaveNetEQ, does both at the same time, using the autoregressive network to provide the audio continuation during a packet loss event, and the conditioning network to model long term features, like voice characteristics. The spectrogram of the past audio signal is used as input for the conditioning network, which extracts limited information about the prosody and textual content. This condensed information is fed to the autoregressive network, which combines it with the audio of the recent past to predict the next sample in the waveform domain.
This differs slightly from the procedure that was followed during training of the WaveNetEQ model, where the autoregressive network receives the actual sample present in the training data as input for the next step, rather than using the last sample it produced. This process, called teacher forcing, assures that the model learns valuable information, even at an early stage of training when its predictions are still of low quality. Once the model is fully trained and put to use in an audio or video call, teacher forcing is only used to "warm up" the model for the first sample, and after that its own output is passed back as input for the next step.
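The difference between the two regimes can be sketched with a toy autoregressive model. The `model` function below is a hypothetical stand-in for the trained WaveRNN, chosen only so the teacher-forcing mechanics are easy to follow:

```python
# Toy illustration of teacher forcing vs. free-running inference for an
# autoregressive model (hypothetical stand-in; WaveNetEQ itself is a WaveRNN).
# `model` predicts the next sample from the previous one.

def model(prev_sample):
    return 0.5 * prev_sample + 0.1  # stand-in for the trained network

def training_step(ground_truth):
    """Teacher forcing: each prediction is conditioned on the *real* previous
    sample, so early, low-quality predictions don't derail learning."""
    return [model(ground_truth[t - 1]) for t in range(1, len(ground_truth))]

def generate(warmup_sample, n_steps):
    """Inference: after warming up on the last real sample, the model's own
    output is fed back in as the input for the next step."""
    samples = [warmup_sample]
    for _ in range(n_steps):
        samples.append(model(samples[-1]))
    return samples[1:]  # only the newly generated continuation

preds = training_step([0.0, 0.2, 0.4, 0.6])   # conditioned on real samples
continuation = generate(0.6, 3)               # conditioned on its own outputs
```

During a call, `generate` would run only while packets are missing, with the warm-up sample taken from the most recent audio that actually arrived.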
|WaveNetEQ architecture. During inference, we "warm up" the autoregressive network by teacher forcing with the most recent audio. Afterwards, the model is supplied with its own output as input for the next step. A MEL spectrogram from a longer audio part is used as input for the conditioning network.|
|60 ms Packet Loss|
|120 ms Packet Loss|
|Audio clips: Comparison of WebRTC’s default PLC system, NetEQ, with our model, WaveNetEQ. Audio clips were taken from LibriTTS and 10% of the audio was dropped in 60 or 120 ms chunks and then filled in by the PLC systems.|
One important factor during PLC is the ability of the network to adapt to variable input signals, including different speakers or changes in background noise. In order to ensure the robustness of the model across a wide range of users, we trained WaveNetEQ on a speech dataset that contains over 100 speakers in 48 different languages, which allows the model to learn the characteristics of human speech in general, instead of the properties of a specific language. To ensure WaveNetEQ is able to deal with noisy environments, such as answering your phone in the train station or in the cafeteria, we augment the data by mixing it with a wide variety of background noises.
While our model learns how to plausibly continue speech, this is only true on a short scale — it can finish a syllable but does not predict words, per se. Instead, for longer packet losses we gradually fade out until the model only produces silence after 120 milliseconds. To further ensure that the model is not generating false syllables, we evaluated samples from WaveNetEQ and NetEQ using the Google Cloud Speech-to-Text API and found no significant difference in the word error rate, i.e., how many mistakes were made transcribing the spoken text.
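The fade-out behavior can be sketched as a simple gain ramp. The linear curve and 16 kHz sample rate below are assumptions for illustration; the post doesn't specify either:

```python
# Sketch of the fade-to-silence behavior for long losses (assumed linear ramp;
# the exact curve used in WaveNetEQ is not specified). After `fade_ms` of
# synthesized audio the output is pure silence, so the model never "invents"
# whole words during an extended loss.

SAMPLE_RATE = 16000  # Hz, assumed for illustration

def apply_fade(samples, fade_ms=120):
    fade_len = SAMPLE_RATE * fade_ms // 1000  # samples until full silence
    out = []
    for i, s in enumerate(samples):
        gain = max(0.0, 1.0 - i / fade_len)  # 1.0 at loss onset, 0.0 at fade_ms
        out.append(s * gain)
    return out

# 240 ms of synthesized audio: everything after the 120 ms mark is silent.
faded = apply_fade([1.0] * (SAMPLE_RATE * 240 // 1000))
```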
We have been experimenting with WaveNetEQ in Duo, where the feature has demonstrated a positive impact on call quality and user experience. WaveNetEQ is already available in all Duo calls on Pixel 4 phones and is now being rolled out to additional models.
The core team includes Alessio Bazzica, Niklas Blum, Lennart Kolmodin, Henrik Lundin, Alex Narest, Olga Sharonova from Google and Tom Walters from DeepMind. We would also like to thank Martin Bruse (Google), Norman Casagrande, Ray Smith, Chenjie Gu and Erich Elsen (DeepMind) for their contributions.
Source: Google AI Blog
Bringing Android Enterprise to your organization opens up new possibilities for your business, and a well-structured communication plan can help employees understand all the capabilities.
We’ve created the Android Enterprise Employee Adoption Kit to help IT teams communicate the features and benefits to their employees.
Resources include helpful videos, flyers, email templates, and slide decks that walk through how to get started with Android device features and management tools. We’ve designed these assets to be useful for preparing your users, assisting them in getting started, and sharing out tips, especially for those switching to Android.
Getting teams ready for Android Enterprise
To generate buzz before introducing Android Enterprise to your organization, you can use and customize our email scripts to share details about the new mobile experience for your team. Some companies may wish to create a demo desk to give new users a guided tour of Android Enterprise features. We’ve included suggested scripts to help walk employees through what’s to come.
Our user adoption slides detail the benefits, features, and scope of different device management modes. New YouTube videos offer a helpful overview of using the work profile, managed Google Play, and zero-touch enrollment. These videos can be embedded into internal sites or shared out directly.
Also, customizable slide decks walk through initial steps with a new Android device, provide detailed instructions for key tasks like downloading an app, and highlight the many benefits of using the work profile.
Learning Android features
Giving your team regular tips and tricks helps them take advantage of Android features and gain confidence in their device. We’ve prepared assets that offer suggestions for using helpful productivity tools in Android and embracing the privacy and work-life balance the work profile offers.
This kit is available for all those who wish to help their teams find success with Android. Learn more about Android Enterprise and how it can transform your business.
Source: The Official Google Blog
At Google Pay, we’re always looking for ways to make things simple, helpful, and accessible for everyone, whether that’s consumers or developers. Today, we’re introducing a new resource for developers that does just that — the Business Console for Google Pay. The Business Console is a new tool that streamlines the way you integrate Google Pay into your apps and websites.
Many of you have already added support for Google Pay. In the process, you asked questions like:
- Can I see the current status of my integrations?
- Where can I find all other integrations I worked on?
- I need to add support for Google Pay to my new site. Can I get notified when additional information is needed?
We created the Business Console for Google Pay in response to your feedback. With the new console, you’ll be able to integrate Google Pay into your apps and websites more seamlessly, discover resources, get support at different stages throughout your integration, and keep track of your progress along the way.
And this is only the beginning. As we add new features, the Business Console will be your go-to place to manage all your new and existing integrations with Google Pay, see how your integrations perform over time, and add support for other business- and developer-focused products.
The new Business Console lets you simplify your Google Pay integrations by guiding you during the submission for approval and helping you keep track of progress.
Getting started is easy. Just head to pay.google.com/business/console. If you’ve already integrated with Google Pay, log in with your account to see your existing integrations or create new ones. And if you haven’t integrated with Google Pay yet, simply create your business profile, build an integration, and submit it for approval directly from the console.
Some businesses, system integrators, and developers have already started using the Business Console as part of our early-access program. “The new Google Pay Business Console helped us understand the integration requirements, and the examples made it easy to implement the Google Pay API into our website,” Gymondo GmbH CTO Christopher Weiss said. The Business Console also helped Weiss get their integration approved quickly. “Shortly after,” Weiss said, “we started seeing purchases coming from our customers paying with Google Pay.”
We hope the new console makes your integration process go just as smoothly, and we’d love to hear about your experience. You can share any feedback from the menu within the console. We’re looking forward to learning how we can make Google Pay even more helpful in the future.
Source: Google Developers Blog
Googlers use Code Search every day to help understand the codebase: they search for half-remembered functions and usages; jump through the codebase to figure out what calls the function they are viewing; and try to identify when and why a particular line of code changed.
The Code Search tool provides a rich code-browsing experience. For example, the blame button shows which user last changed each line, and history can be displayed on the same page as the file contents. In addition, it supports a powerful search language and, for some repositories, cross-references.
Suggest-as-you-type in any search box annotates suggestions with the type of code object, the repository and the path, helping users find what they want faster.
The search language supports regular expressions and a number of helpful search atoms. Instead of sifting through thousands of results containing foo, a user looking for a Go function named foo can search for lang:go function:foo to limit the results to Go files where foo is a function, not a struct or a word in a comment.
Another example is finding a file using only part of its path: the query file:KytheURI.java goes directly to the file, since only one file matches.
See the quick reference for more information.
In addition to text search, some of the open source repositories have cross-references powered by Kythe. Kythe is a Google open source project that includes tools to help understand code. Project owners instrument a build of their repository to output compilation information for Kythe. Kythe tools convert this data into a graph that connects definitions to declarations and code references to the abstract objects they represent (described by a graph schema). Google then runs an internal pipeline that combines these graphs for the different languages, prunes unnecessary pieces, and optimizes the result for serving cross-references. The whole process runs several times per day to keep the data fresh.
Open source communities use a broader set of build systems than Google. In order to support cross-references, Kythe added drop-in support for Bazel, CMake, Maven, and Go. Projects using other build systems can use Kythe-provided wrappers for clang and javac to instrument their builds; these are used by Chromium and Android AOSP to provide compilation information for Kythe.
Because Kythe is based on the build, Kythe cross-references include links to files generated as part of the build process, such as Java files generated for AutoValues or protos. For repositories where cross-references are enabled, clicking on a symbol will take you to a definition of that symbol.
Clicking on the definition of a symbol will open a cross-reference panel, showing all the places where that symbol is referenced. For example, clicking on toVName below, we can see the places that reference this method. One of the callers is parseVName, and clicking on that shows the callers of that method.
At this time, we only provide search on the repositories listed below, but we plan to add more over time:
- Bazel (with cross-references)
- Firebase SDK
- Go (with cross-references)
- gVisor (with cross-references)
- Kythe (with cross-references)
- Nomulus (with cross-references)
- TensorFlow (with cross-references)
We hope you find this tool useful!
Source: Google Open Source Blog
You can see a full list of the changes in the Git log. If you find a new issue, please let us know by filing a bug.
As usual, our ongoing internal security work was responsible for a wide range of fixes:
-  Various fixes from internal audits, fuzzing and other initiatives
Best practices for search visibility
By default, Google tries to show the most relevant, authoritative information in response to any search. This process is more effective when content owners help Google understand their content in appropriate ways.
To better guide health-related organizations through this process (known as SEO, for "search engine optimization"), we have produced a new help center article covering important best practices, with emphasis on health information sites, including:
- How to help users access your content on the go
- The importance of good page content and titles
- Ways to check how your site appears for coronavirus-related queries
- How to analyze the top coronavirus-related user queries
- How to add structured data for FAQ content
New support group for health organizations
In addition to our best practices help page, health organizations can take part in our new technical support group, which focuses on helping health organizations that publish COVID-19 information with Search-related questions.
We’ll be approving requests for access on a case-by-case basis. At first, we’ll accept only domains under national health ministries and US state-level agencies. We'll announce future expansions here in this blog post and on our Twitter account. You’ll need to register using either an email under those domains (e.g. [email protected]) or have access to the website's Search Console account.
Fill out this form to request access to the COVID-19 Google Search group
The group was created to respond to the current needs of health organizations, and we intend to retire it once the WHO no longer considers COVID-19 a Public Health Emergency, or a similar de-escalation is widely in place.
Everyone is welcome to use our existing webmaster help forum, and if you have any questions or comments, please let us know on Twitter.
Posted by Daniel Waisberg, Search Advocate & Ofir Roval, Search Console Lead PM
Source: Google Webmaster Central Blog
Teams quickly discover they need to customize, validate, audit, and re-publish their forked or generated bundles for their environment. Most packaging solutions to date are tightly coupled to some format written as code (e.g., templates, DSLs, etc.). This introduces a number of challenges when trying to extend, build on top of, or integrate them with other systems. For example, how does one update a forked template from upstream, or apply custom validation?
Packaging is the foundation of building reusable components, but it also incurs a productivity tax on the users of those components.
Today we’d like to introduce kpt, an OSS tool for Kubernetes packaging, which uses a standard format to bundle, publish, customize, update, and apply configuration manifests.
Kpt is built around an “as data” architecture, bundling Kubernetes resource configuration in a format that both humans and machines can read and write. The ability for tools to read and write the package contents using standardized data structures enables powerful new capabilities:
- Any existing directory in a Git repo with configuration files can be used as a kpt package.
- Packages can be arbitrarily customized and later pull in updates from upstream by merging them.
- Tools and automation can perform high-level operations by transforming and validating package data on behalf of users or systems.
- Organizations can develop their own tools and automation which operate against the package data.
- Existing tools and automation that work with resource configuration “just work” with kpt.
- Existing solutions that generate configuration (e.g. from templates or DSLs) can emit kpt packages which enable the above capabilities for them.
Example workflow with kpt
Now that we’ve established the benefits of using kpt to manage your packages of Kubernetes config, let’s walk through how an enterprise might leverage kpt to package, share, and use their best practices for Kubernetes across the organization.
First, a team within the organization may build and contribute to a repository of best practices (pictured in blue) for managing a certain type of application, for example a microservice (called “app”). As the best practices are developed within an organization, downstream teams will want to consume and modify configuration blueprints based on them. These blueprints provide a blessed starting point which adheres to organization policies and conventions.
The downstream team will get their own copy of a package by downloading it to their local filesystem (pictured in red) using kpt pkg get. This clones the git subdirectory, recording upstream metadata so that it can be updated later.
They may decide to update the number of replicas to fit their scaling requirements or may need to alter part of the image field to be the image name for their app. They can directly modify the configuration using a text editor (as would be done before). Alternatively, the package may define setters, allowing fields to be set programmatically using kpt cfg set. Setters streamline workflows by providing user and automation friendly commands to perform common operations.
Once the modifications have been made to the local filesystem, the team will commit and push their package to an app repository owned by them. From there, a CI/CD pipeline will kick off and the deployment process will begin. As a final customization before the package is deployed to the cluster, the CI/CD pipeline will inject the digest of the image it just built into the image field (using kpt cfg set). When the image digest has been set, the CI/CD pipeline can send the manifests to the cluster using kpt live apply. Kpt live operates like kubectl apply, providing additional functionality to prune resources deleted from the configuration and block on rollout completion (reporting status of the rollout back to the user).
Now that we’ve walked through how you might use kpt in your organization, we’d love it if you’d try it out, read the docs, or contribute.
One more thing
There’s still a lot to the story we didn’t cover here. Expect to hear more from us about:
- Using kpt with GitOps
- Building custom logic with functions
- Writing effective blueprints with kpt and kustomize