A Step Towards Protecting Patients from Medication Errors



While no doctor, nurse, or pharmacist wants to make a mistake that harms a patient, research shows that 2% of hospitalized patients experience serious preventable medication-related incidents that can be life-threatening, cause permanent harm, or result in death. There are many factors contributing to medical mistakes, often rooted in deficient systems, tools, processes, or working conditions, rather than the flaws of individual clinicians (IOM report). To mitigate these challenges, one can imagine a system more sophisticated than the current rules-based error alerts provided in standard electronic health record software. The system would identify prescriptions that looked abnormal for the patient and their current situation, similar to a system that produces warnings for atypical credit card purchases on stolen cards. However, determining which medications are appropriate for any given patient at any given time is complex — doctors and pharmacists train for years before acquiring the skill. With the widespread use of electronic health records, it may now be feasible to use this data to identify normal and abnormal patterns of prescriptions.

In an initial effort to explore solutions to this problem, we partnered with UCSF's Bakar Computational Health Sciences Institute to publish “Predicting Inpatient Medication Orders in Electronic Health Record Data” in Clinical Pharmacology and Therapeutics, which evaluates the extent to which machine learning could anticipate normal prescribing patterns by doctors, based on electronic health records. Similar to our prior work, we used comprehensive clinical data from de-identified patient records, including the sequence of vital signs, laboratory results, past medications, procedures, diagnoses and more. Based on the patient’s current clinical state and medical history, our best model was able to anticipate physicians’ actual prescribing decisions three-quarters of the time.

Model Training
The dataset used for model training included approximately three million medication orders from over 100,000 hospitalizations. It used retrospective electronic health record data, which was de-identified by randomly shifting dates and removing identifying portions of the record in accordance with HIPAA, including names, addresses, contact details, record numbers, physician names, free-text notes, images, and more. The data was not joined or combined with any other data. All research was done using the open-sourced Fast Healthcare Interoperability Resources (FHIR) format, which we’ve previously used to make healthcare data more effective for machine learning. The dataset was not restricted to a particular disease or therapeutic area, which made the machine learning task more challenging, but also helped to ensure that the model could identify a larger variety of conditions; e.g. patients suffering from dehydration require different medications than those with traumatic injuries.

We evaluated two machine learning models: a long short-term memory (LSTM) recurrent neural network and a regularized, time-bucketed logistic model, which are commonly used in clinical research. Both were compared to a simple baseline that ranked the most frequently ordered medications based on a patient’s hospital service (e.g., General Medical, General Surgical, Obstetrics, Cardiology, etc.) and amount of time since admission. Each time a medication was ordered in the retrospective data, the models ranked a list of 990 possible medications, and we assessed whether the models assigned high probabilities to the medications actually ordered by doctors in each case.
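
To make the ranking evaluation concrete, here is a small, self-contained sketch of a top-k metric of the kind used to score such models. This is a generic illustration, not the study's code; the function name, toy scores, and ground-truth indices are ours:

```python
import numpy as np

def top_k_recall(scores, ordered_ids, k):
    """Fraction of the medications actually ordered that appear among the
    model's k highest-scoring medications for one ordering event."""
    top_k = set(np.argsort(scores)[::-1][:k])      # indices of the top-k meds
    return sum(med in top_k for med in ordered_ids) / len(ordered_ids)

# Toy example over a 990-medication vocabulary, evaluated at k=10 and k=25.
rng = np.random.default_rng(seed=0)
scores = rng.random(990)                           # one model score per medication
ordered = [3, 41, 977]                             # hypothetical ground-truth orders
print(top_k_recall(scores, ordered, k=10))
print(top_k_recall(scores, ordered, k=25))
```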

As an example of how the model was evaluated, imagine a patient who arrived at the hospital with signs of an infection. The model reviewed the information recorded in the patient’s electronic health record — a high temperature, elevated white blood cell count, quick breathing rate — and estimated how likely it would be for different medications to be prescribed in that situation. The model’s performance was evaluated by comparing its ranked choices against the medications that the physician actually prescribed (in this example, the antibiotic vancomycin and sodium chloride solution for rehydration).
Based on a patient’s medical history and current clinical characteristics, the model ranks the medications a physician is most likely to prescribe.
Findings
Our best-performing model was the LSTM model, a class of models particularly effective for handling sequential data, including text and language. These models are capable of capturing the ordering and time recency of events in the data, making them a good choice for this problem.
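
For readers curious what such a model looks like in code, here is a minimal, hypothetical Keras sketch of an LSTM ranker over a 990-medication vocabulary. The feature and layer sizes are illustrative assumptions, not the architecture from the paper:

```python
import tensorflow as tf

NUM_MEDS = 990        # size of the ranked medication vocabulary
NUM_FEATURES = 256    # hypothetical per-timestep clinical features (labs, vitals, ...)

# Run an LSTM over the sequence of clinical events, then score all
# medications at the final step as a multi-label ranking problem.
model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(None, NUM_FEATURES)),  # variable-length history
    tf.keras.layers.LSTM(512),                               # captures ordering and recency
    tf.keras.layers.Dense(NUM_MEDS, activation="sigmoid"),   # one score per medication
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```

Framing the task as multi-label classification, with an independent sigmoid score per medication, is what lets a model rank all 990 candidates at each ordering event.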

Nearly all (93%) top-10 lists contained at least one medication that would be ordered by clinicians for the given patient within the next day. Fifty-five percent of the time, the model correctly placed medications prescribed by the doctor as one of the top-10 most likely medications, and 75% of ordered medications were ranked in the top-25. Even for ‘false negatives’ — cases where the medication ordered by doctors did not appear among the top-25 results — the model highly ranked a medication in the same class 42% of the time. This performance was not explained by the model simply predicting previously prescribed medications. Even when we blinded the model to previous medication orders, it maintained high performance.

What Does This Mean for Patients and Clinicians?
It’s important to remember that models trained this way reproduce physician behavior as it appears in historical data, and have not learned optimal prescribing patterns, how these medications might work, or what side effects might occur. However, learning ‘normal’ is a starting point to eventually spot abnormal, potentially dangerous orders. In our next phase of research, we will examine under which circumstances these models are useful for finding medication errors that could harm patients.

The results from this exploratory work are early first steps towards testing the hypothesis that machine learning can be applied to build systems that prevent mistakes and help to keep patients safe. We look forward to collaborating with doctors, pharmacists, other clinicians, and patients as we continue research to quantify whether models like this one are capable of catching errors and keeping patients safe in the hospital.

Acknowledgements
We would like to thank Atul Butte (UCSF), Claire Cui, Andrew Dai, Michael Howell, Laura Vardoulakis, Yuan (Emily) Xue, and Kun Zhang for their contributions towards the research work described in this post. We’d additionally like to thank members of our broader research team who have assisted in the development of analytical tools, data collection, maintenance of research infrastructure, assurance of data quality, and project management: Gabby Espinosa, Gerardo Flores, Michaela Hardt, Sharat Israni (UCSF), Jeff Love (UCSF), Dana Ludwig (UCSF), Hong Ji, Svetlana Kelman, I-Ching Lee, Mimi Sun, Patrik Sundberg, Chunfeng Wen, and Doris Wong.

Source: Google AI Blog


Chromebook accessibility tools for distance learning

Around the world, 1.5 billion students are now adjusting to learning from home. For students with disabilities, this adjustment is even more difficult without hands-on classroom instruction and support from teachers and learning specialists.

For educators and families using Chromebooks, there are a variety of built-in accessibility features to customize students’ learning experience and make them even more helpful. We’ve put together a list of some of these tools to explore as you navigate at-home learning for students with disabilities.

Supporting students who are low vision

To help students see screens more easily, you can find instructions for locating and turning on several Chromebook accessibility features in this Chromebook Help article. Here are a few examples of things you can try, based on students’ needs:

  • Increase the size of the cursor, or increase text size for better visibility. 

  • Add a highlighted circle around the cursor when moving the mouse, text caret when typing, or keyboard-focused item when tabbing. These colorful rings appear when the items are in motion to draw greater visual focus, and then fade away.

  • For students with light sensitivity or eye strain, you can turn on high-contrast mode to invert colors across the Chromebook (or add this Chrome extension for web browsing in high contrast).

  • Increase the size of browser or app content, or make everything on the screen—including app icons and Chrome tabs—larger for greater visibility. 

  • For higher levels of zoom, try the fullscreen or docked magnifiers in Chromebook accessibility settings. The fullscreen magnifier zooms the entire screen, whereas the docked magnifier makes the top one-third of the screen a magnified area. Learn more in this Chromebook magnification tutorial.


Helping students read and understand text

Features that read text out loud can be useful for students with visual impairments, learning and processing challenges, or even students learning a new language.

  • Select-to-speak lets students hear the text they choose on-screen spoken out loud, with word-by-word visual highlighting for better audio and visual connection.

  • With ChromeVox, the built-in screen reader for Chromebooks, students can navigate around the Chromebook interface using spoken feedback or braille. To hear whatever text is under the cursor, turn on Speak text under the mouse in ChromeVox options. This is most beneficial for students who have significant vision loss.

  • Add the Read&Write Chrome extension from Texthelp for spelling and grammar checks, talking and picture dictionaries, text-to-speech, and additional reading and writing supports, all in one easy-to-use toolbar.

  • For students with dyslexia, try the OpenDyslexic Font Chrome extension to replace web page fonts with a more readable font. Or use the BeeLine Reader Chrome extension to color-code text to reduce eye strain and help students better track from one line of text to the next. You can also use the Thomas Jockin font in Google Docs, Sheets and Slides.

Guiding students with writing challenges or mobility impairments

Students can continue to develop writing skills while they’re learning from home.

  • Students can use their voice to enter text by enabling dictation in Chromebook accessibility settings, which works in edit fields across the device. If dictating longer assignments, students can also use voice typing in Google Docs to access a rich set of editing and formatting voice commands. Dictating writing assignments can also be very helpful for students who get a little stuck and want to get thoughts flowing by speaking instead of typing. 

  • Students with mobility impairments can use features like the on-screen keyboard to type using a mouse or pointer device, or automatic clicks to hover over items to click or scroll.

  • Try the Co:Writer Chrome extension for word prediction and completion, as well as excellent grammar help. Don Johnston is offering free access to this and other eLearning tools. Districts, schools, and education practitioners can submit a request for access.

How to get started with Chromebook accessibility tools

We just shared a 12-part video series with training for G Suite and Chromebook accessibility features, made by teachers for teachers. These videos highlight teachers’ experiences using these features in the classroom, as well as the types of diverse learners each feature benefits. For more, you can watch these videos from the Google team, read our G Suite accessibility user guide, or join a Google Group to ask questions and get real-time answers. To find great accessibility apps and ideas on how to use them, check out the Chromebook App Hub, and for training, head to the Teacher Center.


We’re also eager to hear your ideas—leave your thoughts in this Google Form and help educators benefit from your experience.

Become a Developer Student Club Lead

Posted by Erica Hanson, Global Program Lead, Developer Student Clubs

Calling all student developers: If you’re someone who wants to lead, is passionate about technology, loves problem-solving, and is driven to give back to your community, then Developer Student Clubs has a home for you. Interest forms for the upcoming 2020-2021 academic year are now available. Ready to dive in? Get started at goo.gle/dsc-leads.

Want to know more? Check out these details below.

Image description: People holding up a Developer Student Clubs sign

What are Developer Student Clubs?

Developer Student Clubs (DSC) are university-based community groups for students interested in Google developer technologies. With programs that meet in person and online, students from all undergraduate and graduate programs with an interest in growing as a developer are welcome. By joining a DSC, students grow their knowledge in a peer-to-peer learning environment and build solutions for local businesses and their community.

Why should I join?

- Grow your skills as a developer with training content from Google.

- Think of your own project, then lead a team of your peers to scale it.

- Build prototypes and solutions for local problems.

- Participate in a global developer competition.

- Receive access to select Google events and conferences.

- Gain valuable experience.

Is there a Developer Student Club near me?

Developer Student Clubs are now in 68+ countries with 860+ groups. Find a club near you or learn how to start your own, here.

When do I need to submit the interest form?

You may express interest through the form until May 15th, 11:59pm PST. Get started here.

Make sure to learn more about our program criteria.

Our DSC Leads are working on meaningful projects around the world. Watch this video of how one lead worked to protect her community from dangerous floods in Indonesia. Similarly, read this story of how another lead helped modernize healthcare in Uganda.

We’re looking forward to welcoming a new group of leads to Developer Student Clubs. Have a friend who you think is a good fit? Pass this article along. Wishing all developer students the best on the path towards building great products and community.

Submit interest form here.



*Developer Student Clubs are student-led independent organizations, and their presence does not indicate a relationship between Google and the students' universities.

Improve email security in Gmail with TLS by default and other new features

What’s changing

Recently, the Google Security blog outlined how the usage of Transport Layer Security (TLS) has grown to more than 96% of all traffic seen by a Chrome browser on Chrome OS. The blog post also highlighted a significant goal: to enable TLS by default for our Google products and services, and to ensure that TLS works out of the box.

Gmail already supports TLS: if the Simple Mail Transfer Protocol (SMTP) mail connection can be secured through TLS, it will be. However, in order to encourage more organizations to increase their email security posture, and to further the above goal of enabling TLS by default, we’ve made the following changes:

  • TLS for mail connections will now be enabled by default
  • Admins are now able to test their SMTP outbound routes’ TLS configuration in the Admin console before deployment. They no longer need to wait for messages to bounce.

While admins have always had the ability to require TLS encryption for mail routes, it was previously off by default. Note that existing mail routes will not be impacted by these changes.

Who’s impacted

Admins

Why it’s important

We always recommend that admins enable existing mail security features, including SPF, DKIM, and DMARC, to help protect end users. We also recommend that admins turn on MTA Strict Transport Security (MTA-STS), which improves Gmail security by requiring authentication checks and encryption for email sent to their domains. Enabling TLS by default on new SMTP mail routes enhances our customers’ security posture, while letting admins test connections before enforcing TLS on existing routes makes it easier for them to deploy best-practice security policies.
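
As a reference point for admins setting this up, MTA-STS (RFC 8461) consists of a DNS TXT record at _mta-sts.<your-domain> (for example, "v=STSv1; id=20200401T000000") plus a small policy file served at https://mta-sts.<your-domain>/.well-known/mta-sts.txt. A minimal policy for a hypothetical domain looks like this:

```
version: STSv1
mode: enforce
mx: mail.example.com
max_age: 86400
```

Setting mode: testing first lets a domain validate its deployment without risking delivery, before switching to enforce.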

This change will not impact mail routes that were previously created.

Additional details


TLS enabled by default on new mail routes
With TLS enabled by default for new mail routes, all certificate validation requirements are also enabled by default. This ensures that recipient hosts have a certificate issued for the correct host that has been signed by a trusted Certificate Authority (CA). See more details about how we’re changing the requirements for trusted CAs below.

Admins will still have the ability to customize their TLS security settings on newly created mail routes. For example, if mail is forwarded to third-party or on-premise mail servers using internal CA certificates, admins may need to disable CA certificate validation. Disabling CA certificate validation, or even disabling TLS entirely, is not recommended. We encourage admins to test their SMTP TLS configuration in the Admin console in order to validate the TLS connection to external mail servers before disabling any recommended validations. See more details about how to test TLS connections in the Admin console.

Certificate Authority distrust in Gmail
In the past, the Google Security Blog has highlighted instances where Chrome would no longer trust root CA certificates used to intercept traffic on the public internet and where Chrome distrusts specific CAs.

If these scenarios occur in the future, these certificates will also be distrusted by Gmail. When this happens, mail sent using routes that require TLS with CA-signed certificate enforcement may bounce if the CA is no longer trusted. Although the list of root certificates trusted by Gmail can be retrieved from the Google Trust Services repository, we encourage admins to use the Test TLS Connections feature in the Admin console to confirm whether certificates have been distrusted.

Test TLS connections in Admin console
Admins can now use the new Test TLS Connection feature to verify whether a mail route can successfully establish a TLS connection with full validation to any destination, such as an on-premise mail server or a third-party mail relay, before enforcing TLS for that destination.
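
The Admin console feature itself is point-and-click, but conceptually it performs an SMTP STARTTLS handshake with full certificate validation. For illustration, here is a rough stand-alone equivalent in Python; the host name is a placeholder and this is not the Admin console API:

```python
import smtplib
import ssl

def check_smtp_tls(host, port=25, timeout=10):
    """Probe an SMTP server and report whether STARTTLS succeeds with full
    certificate validation -- roughly what a TLS connection test verifies."""
    context = ssl.create_default_context()    # trusted-CA and hostname checks on
    with smtplib.SMTP(host, port, timeout=timeout) as server:
        server.ehlo()
        if not server.has_extn("starttls"):
            return "server does not advertise STARTTLS"
        server.starttls(context=context)      # raises ssl.SSLCertVerificationError
        server.ehlo()                         # on an untrusted or mismatched cert
        return "TLS established with a trusted, matching certificate"

# Example with a placeholder host:
# print(check_smtp_tls("mail.example.com"))
```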

Getting started


Admins:

TLS settings
TLS will be ON by default for all new mail routes. We recommend that admins review all of their existing routes and enable all recommended TLS security options for these routes as well.

Testing TLS connections
Admins who want to require a secure TLS connection for emails can now verify that the connection to the recipient's mail server is valid simply by clicking on the “Test TLS Connection” button in the Admin console; they no longer need to wait for emails to bounce.

Learn more about requiring mail to be transmitted via a secure (TLS) connection and adding mail routes in the Help Center.

All certificate validations are now enabled by default when creating a new TLS compliance setting.

TLS and all certificate validations are now enabled by default when creating a new mail route.

End users: There are no end user settings for these features.


Availability


  • Available to all G Suite customers


The teen fact-checkers fighting misinformation

Editor's note: It’s International Fact-Checking Day today and teenager Lyndsay Valadez from Indianapolis, Indiana tells us why fact-checking matters. She’s a member of the Teen Fact-Checking Network at MediaWise—part of the Google News Initiative and a Google.org-funded partnership with The Poynter Institute for Media Studies.

Being a freshman in college and living in a dorm away from my mom and sister means we usually stay in touch by text. In between “What’s up?” and “Miss you!” I occasionally get a different kind of message from home: “Is this real?” But now that my school has switched to digital teaching because of the coronavirus, there’s no escape from my family who constantly bombard me in person about claims surrounding COVID-19. 

Scrolling through social media, it can be tough to tell the difference between fact and fiction—especially if you haven’t been trained to look for it. Just like my mom taught me to say “Please” and “Thank you,” I’m now teaching her how to tell fact from fiction online. And learning those skills is really crucial at this time, with people’s health on the line.

As a journalism major at Indiana University, I understand the need for truth-telling and how important facts are in this digitized age. That’s why I became an intern with the Teen Fact-Checking Network—part of the MediaWise Project—where I research, write and put together videos debunking false claims, half-truths and fantasy. 

During my time, some of the fact checks I’ve done include ones about Joe Biden and Bernie Sanders and about Hurricane Dorian. Of course, lately I am super busy covering the coronavirus.

One fact check that’s particularly special to me is one I did alongside my younger sister, Elizabeth Valadez, who recently joined MediaWise’s Teen Fact-Checking Network. It has been so neat to watch her fact-check while helping her along the way. Together, we worked on this fact-check about how long the coronavirus can live on different surfaces. 

Video showing Lyndsay and her sister Elizabeth talking about fact-checking coronavirus myths.

During the time working with my sister, I realized how our own media experiences affect the way we approach fact-checking. We have different tastes—she’s into the social aspect while I like the more informational side. But this variety of media viewpoints and understanding helped us present a fuller, more comprehensive fact-check. Together, we’re teaching people to ask three key questions created by MediaWise partner, the Stanford History Education Group: Who is behind the information? What is the evidence? And what do other sources say?

Teen fact-checking siblings

Surprisingly enough, we aren’t the only siblings fact-checking together at MediaWise. 

Fact-checking brothers Kush Patel, 16, and his little brother Parth, 13, from North Carolina debunked a Twitter claim about a book predicting the 2019 coronavirus. Brother-sister duo Jahin Rahman, 16, and Fahmin Rahman, 14, teamed up to fact-check a claim about CO2 emissions dropping 25 percent in China because of the virus. You might be surprised by the answer!


Left: Brother-sister duo Jahin and Fahmin Rahman. Right: Kush Patel and his little brother Parth.


Today the Teen Fact-Checking Network has 35 teenagers on staff from a dozen states. Through social media storytelling, we’ve debunked more than 300 claims—and that’s only the beginning. The staff is now solely fact-checking claims about COVID-19, and has debunked more than 20 social media posts. Who knows, in 10 years the TFCN could be fact-checking at a level similar to organizations like PolitiFact or Snopes.

Video showing teens talking about the Teen Fact-Checking Network.

And as we mark this fourth year of International Fact-Checking Day, we recognize the need for this kind of media literacy and teaching others how to fact-check. So far MediaWise has helped more than 5 million people learn how to be media savvy about what they see online. And through in-person training, the MediaWise team has taught more than 18,000 students at 70 different schools across the country.

MediaWise has taught me that no matter how old you are, we can all stand to be better. And we all need to work together to do our part in combating the spread of misinformation. Now more than ever. 

This International Fact-Checking Day, check out Civic Online Reasoning, a free curriculum developed by the Stanford History Education Group as part of MediaWise on how to evaluate online information. 

COVID-19: $6.5 million to help fight coronavirus misinformation

Health authorities have warned that an overabundance of information can make it harder for people to obtain reliable guidance about the coronavirus pandemic.

Helping the world make sense of this information requires a broad response, involving scientists, journalists, public figures, technology platforms and many others. Here are some ways we plan to help.

Supporting coronavirus fact-checking and verification efforts

We’re providing $6.5 million in funding to fact-checkers and nonprofits fighting misinformation around the world, with an immediate focus on coronavirus.

Collaboration is a crucial component of journalism’s response to a story as complicated and all-encompassing as COVID-19. For this reason, the Google News Initiative (GNI) is stepping up its support for First Draft. The nonprofit is providing an online resource hub, dedicated training and crisis simulations for reporters covering COVID-19 all over the globe. First Draft is also using its extensive CrossCheck network to help newsrooms respond quickly and address escalating content that is causing confusion and harm. We’re also renewing our support for the collaborative verification project Comprova in Brazil.

As fact-checkers address heightened demand for their work, we are providing immediate support to several organizations. Full Fact and Maldita.es will coordinate efforts in Europe focused on countries with the most cases (Italy, Spain, Germany, France and the United Kingdom) to amplify experts, share trends, and help reduce the spread of harmful false information. In Germany, CORRECTIV will step up its efforts to engage citizens in the fight against misinformation.

LatamChequea, coordinated by Chequeado, is providing a single hub to highlight the work of 21 fact-checking organizations across 15 countries in the Spanish-speaking world and Latin America. With our support, PolitiFact and Kaiser Health News will expand their health fact-checking partnership to focus on COVID-19 misinformation. 

Increasing access to data, scientific expertise and fact checks

Access to primary expert sources during an evolving public health crisis is both challenging and fundamental for journalists covering the story. To make this easier, we’re providing funding to SciLine, based at the American Association for the Advancement of Science, and the Australian Science Media Centre, creators of Scimex.org. We’re supporting the creation of a database for reporters developed by the journalism technology nonprofit Meedan in partnership with public health experts.

The GNI is also supporting the JSK Journalism Fellowships at Stanford University and Stanford's Big Local News group to create a global data resource for reporters working on COVID-19. The new project will collate data from around the world and help journalists tell data-driven stories that have impact in their communities.

The International Fact-Checking Network (IFCN) continues to advocate for fact-checkers worldwide; our renewed support will boost their efforts to uphold best practices in the fact-checking field and showcase the work of the CoronaVirusFacts alliance. In addition, Science Feedback will conduct a network analysis using the hundreds of COVID-19 fact checks published globally to track the spread of related misinformation.

We also want to highlight fact-check articles that address potentially harmful health misinformation more prominently to our users, and we’re experimenting with how best to include a dedicated fact check section in the COVID-19 Google News experience.

Providing insights to fact-checkers, reporters and health authorities

So that reporters can understand and explain how the world is searching for the virus, we’ve made Google Trends data readily available in localized pages with embeddable visualizations. 

We’re also making more local Google Trends data available for journalists, health organizations and local authorities to help them understand people's information needs around the world.


Questions in Search on coronavirus in cities around the world

Fact-checkers and health authorities need help to identify topics that people are searching for and where there might be a gap in the availability of good information online. Unanswered user questions—such as “what temperature kills coronavirus?”—can provide useful insights to fact-checkers and health authorities about content they may want to produce. 

To help, we’re supporting Data Leads in partnership with BOOM Live in India and Africa Check in Nigeria to leverage data from Question Hub. This will be complemented by an effort to train 1,000 journalists across India and Nigeria to spot health misinformation.

Our online resources are being updated to support the vital work journalists are doing. The GNI Training Center has tools for data journalism and verification in 16 languages, and our global team of Teaching Fellows is delivering workshops entirely online in 10 languages.

Today's announcement is one of several efforts we’re working on to support those working to cover this pandemic. We look forward to sharing more soon. 


Transform your photo in the style of an iconic artist

From the bold, swirling movement in Vincent van Gogh's paintings, to the surreal, confident brushstrokes of Frida Kahlo, many famous artists have instantly recognizable styles. Now you can use these styles to transform your own photos. With Art Transfer, a new feature in the Google Arts & Culture app, you can apply the characteristics of well-known paintings to your own images.

To try it, open the Camera menu in the bottom bar of the Google Arts & Culture app and select “Art Transfer.” After taking or uploading a photo, choose from dozens of masterpieces to transfer that style onto your image. (And while you wait, we’ll share a fun fact about the artwork, in case you’re curious to know a bit more about its history.) For more customization, you can use the scissors icon to select which part of the image you want the style applied to. 

Thanks to cultural institutions from around the world, such as the UK’s National Gallery and Japan’s MOA Museum of Art, we’re able to feature artists like van Gogh, Frida Kahlo, Edvard Munch or Leonardo da Vinci.

Animation: Art Transfer applied to a photo of a coffee cup, using the cutting tool to select part of the image.

Many Google Arts & Culture experiments show what’s possible when you combine art and technology. Artificial intelligence in particular can be a powerful tool not just in the hands of artists, but also as a way for people to experience and learn about art in new ways.  


In this case, Art Transfer is powered by an algorithmic model created by Google AI. Once you snap your photo and select a style, Art Transfer doesn’t just blend the two things or simply overlay your image. Instead, it kicks off a unique algorithmic recreation of your photo inspired by the specific art style you have chosen. 

And all of it happens right on your device without the help of the cloud or your image being processed online.
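
The on-device Art Transfer model itself isn’t public, but Google has released similar arbitrary style-transfer models on TensorFlow Hub, which make the “algorithmic recreation” idea easy to try. A minimal sketch, where the image file paths are placeholders:

```python
import tensorflow as tf
import tensorflow_hub as hub

# A publicly released arbitrary style-transfer model on TF Hub.
hub_module = hub.load(
    "https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2")

def load_image(path):
    img = tf.io.read_file(path)
    img = tf.image.decode_image(img, channels=3, dtype=tf.float32)  # [0, 1] floats
    return img[tf.newaxis, ...]                                     # add batch dim

content = load_image("photo.jpg")                   # your photo (placeholder)
style = tf.image.resize(load_image("style.jpg"),    # the painting (placeholder)
                        (256, 256))                 # model expects a 256x256 style
stylized = hub_module(tf.constant(content), tf.constant(style))[0]
tf.keras.preprocessing.image.save_img("stylized.jpg", stylized[0])
```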

We are curious to see what you will create with a little help from AI. Once you are happy with your Art Transfer, tap share to post the results as a still image or as a GIF with #ArtTransfer.

Discover more on Google Arts & Culture—or download our free app for iOS or Android.

Dev Channel Update for Desktop

The Dev channel has been updated to 83.0.4100.3 for Windows, Mac, and Linux platforms.
A partial list of changes is available in the log. Interested in switching release channels? Find out how. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.
Srinivas Sista
Google Chrome

Stable Channel Update for Chrome OS

The Stable channel is being updated to 80.0.3987.162 (Platform version: 12739.111.0) for most Chrome OS devices. This build contains a number of bug fixes and security updates. Systems will be receiving updates over the next several days.


If you find new issues, please let us know by visiting our forum or filing a bug. Interested in switching channels? Find out how. You can submit feedback using 'Report an issue...' in the Chrome menu (3 vertical dots in the upper right corner of the browser).

Daniel Gagnon
Google Chrome OS

Improving Audio Quality in Duo with WaveNetEQ



Online calls have become an everyday part of life for millions of people by helping to streamline their work and connect them to loved ones. To transmit a call across the internet, the data from calls is split into short chunks, called packets. These packets make their way over the network from the sender to the receiver, where they are reassembled to make continuous streams of video and audio. However, packets often arrive at the other end in the wrong order or at the wrong time, an issue generally referred to as jitter, and sometimes individual packets can be lost entirely. Issues such as these lead to lower call quality, since the receiver has to try to fill in the gaps, and are a pervasive problem for both audio and video transmission. For example, 99% of Google Duo calls need to deal with packet losses, excessive jitter or network delays. Of those calls, 20% lose more than 3% of the total audio duration due to network issues, and 10% of calls lose more than 8%.
Simplified diagram of network problems leading to packet loss, which needs to be counteracted by the receiver to allow reliable real-time communication.
In order to ensure reliable real-time communication, it is necessary to deal with packets that are missing when the receiver needs them. Specifically, if new audio is not provided continuously, glitches and gaps will be audible, but repeating the same audio over and over is not an ideal solution, as it produces artifacts and reduces the overall quality of the call. The process of dealing with the missing packets is called packet loss concealment (PLC). The receiver’s PLC module is responsible for creating audio (or video) to fill in the gaps created by packet losses, excessive jitter or temporary network glitches, all three of which result in an absence of data.

To address these audio issues, we present WaveNetEQ, a new PLC system now being used in Duo. WaveNetEQ is a generative model, based on DeepMind’s WaveRNN technology, that is trained using a large corpus of speech data to realistically continue short speech segments, enabling it to fully synthesize the raw waveform of missing speech. Because Duo calls are end-to-end encrypted, all processing needs to be done on-device. The WaveNetEQ model is fast enough to run on a phone, while still providing state-of-the-art audio quality and more natural-sounding PLC than other systems currently in use.

A New PLC System for Duo
Like many other web-based communication systems, Duo is based on the WebRTC open source project. To conceal the effects of packet loss, WebRTC’s NetEQ component uses signal processing methods, which analyze the speech and produce a smooth continuation that works very well for small losses (20ms or less), but does not sound good when the number of missing packets leads to gaps of 60ms or more. In those latter cases the speech becomes robotic and repetitive, a characteristic sound that is unfortunately familiar to many internet voice callers.

To better manage packet loss, we replace the NetEQ PLC component with a modified version of WaveRNN, a recurrent neural network model for speech synthesis consisting of two parts, an autoregressive network and a conditioning network. The autoregressive network is responsible for the continuity of the signal and provides the short-term and mid-term structure for the speech by having each generated sample depend on the network’s previous outputs. The conditioning network influences the autoregressive network to produce audio that is consistent with the more slowly-moving input features.

However, WaveRNN, like its predecessor WaveNet, was created with the text-to-speech (TTS) application in mind. As a TTS model, WaveRNN is supplied with the information of what it is supposed to say and how to say it. The conditioning network directly receives this information as input in the form of the phonemes that make up the words and additional prosody features (i.e., all non-text information like intonation or pitch). In a way, the conditioning network can “see into the future” and then steer the autoregressive network towards the right waveforms to match it. In the case of a PLC system and real-time communication, this context is not provided.

For a functional PLC system, one must both extract contextual information from the current speech (i.e., the past), and generate a plausible sound to continue it. Our solution, WaveNetEQ, does both at the same time, using the autoregressive network to provide the audio continuation during a packet loss event, and the conditioning network to model long term features, like voice characteristics. The spectrogram of the past audio signal is used as input for the conditioning network, which extracts limited information about the prosody and textual content. This condensed information is fed to the autoregressive network, which combines it with the audio of the recent past to predict the next sample in the waveform domain.
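
The production model isn’t published in this post, but the division of labor described above can be sketched in a few lines. In the toy Keras version below, the layer sizes, the GRU cell standing in for WaveRNN’s recurrent unit, and the 256-way quantized output are all our illustrative assumptions:

```python
import tensorflow as tf

# Toy sketch of a WaveRNN-style PLC model (not the production WaveNetEQ).
# A conditioning network summarizes a mel spectrogram of past audio into a
# vector; an autoregressive cell predicts the waveform sample by sample.
class PLCModel(tf.keras.Model):
    def __init__(self, hidden=896, quantization=256):
        super().__init__()
        self.cond = tf.keras.Sequential([           # slow features: voice, prosody
            tf.keras.layers.Conv1D(128, 3, padding="same", activation="relu"),
            tf.keras.layers.GlobalAveragePooling1D(),
            tf.keras.layers.Dense(hidden),
        ])
        self.embed = tf.keras.layers.Embedding(quantization, 64)
        self.rnn = tf.keras.layers.GRUCell(hidden)  # continuity, short-term structure
        self.out = tf.keras.layers.Dense(quantization)  # distribution over amplitudes

    def step(self, prev_sample, state, cond_vec):
        x = tf.concat([self.embed(prev_sample), cond_vec], axis=-1)
        h, state = self.rnn(x, state)
        return self.out(h), state                   # logits for the next sample

model = PLCModel()
cond_vec = model.cond(tf.random.normal([1, 100, 80]))  # 100 mel frames of past audio
state = [tf.zeros([1, 896])]
logits, state = model.step(tf.constant([128]), state, cond_vec)  # one sample step
```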

This differs slightly from the procedure that was followed during training of the WaveNetEQ model, where the autoregressive network receives the actual sample present in the training data as input for the next step, rather than using the last sample it produced. This process, called teacher forcing, ensures that the model learns valuable information, even at an early stage of training when its predictions are still of low quality. Once the model is fully trained and put to use in an audio or video call, teacher forcing is only used to "warm up" the model for the first sample, and after that its own output is passed back as input for the next step.
WaveNetEQ architecture. During inference, we "warm up" the autoregressive network by teacher forcing with the most recent audio. Afterwards, the model is supplied with its own output as input for the next step. A MEL spectrogram from a longer audio part is used as input for the conditioning network.
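
A sketch of that inference loop, reusing the toy PLCModel above (again an illustration, not the production code):

```python
import tensorflow as tf

def generate(model, cond_vec, warmup_samples, n_new):
    """Warm up the autoregressive net on real past samples (teacher forcing),
    then free-run, feeding each prediction back in as the next input.
    `model` is the toy PLCModel sketched above; `warmup_samples` is a
    non-empty list of (1,)-shaped integer tensors of recent real audio."""
    state = [tf.zeros([1, 896])]
    sample = None
    for sample in warmup_samples:               # teacher forcing: real inputs
        _, state = model.step(sample, state, cond_vec)
    out = []
    for _ in range(n_new):                      # free running: own outputs
        logits, state = model.step(sample, state, cond_vec)
        sample = tf.random.categorical(logits, num_samples=1)[:, 0]
        out.append(sample)
    return tf.stack(out, axis=1)                # (1, n_new) quantized samples
```
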
The model is applied to the audio data in Duo's jitter buffer. Once the real audio continues after a packet loss event, we seamlessly merge the synthetic and real audio stream. In order to find the best alignment between the two signals, the model generates slightly more output than is required and then cross-fades from one to the other. This makes the transition smooth and avoids noticeable noise.
Simulation of PLC events on audio over a moving span of 60 ms. The blue line represents the real audio signal, including past and future parts of the PLC event. At each timestep the orange line represents the synthetic audio WaveNetEQ would predict if the audio were to cut out at the vertical grey line.
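
The merging step can likewise be sketched: generate a little more synthetic audio than needed, search that slack for the offset that best matches the resumed real stream, then cross-fade. The parameter values below are illustrative assumptions, not Duo’s:

```python
import numpy as np

def merge_streams(synthetic, real, overlap=160, max_shift=80):
    """Cross-fade from concealment audio into the resumed real stream.
    Requires len(synthetic) >= max_shift + overlap."""
    head = real[:overlap]
    # Score each candidate alignment of the synthetic tail by correlation.
    start = len(synthetic) - max_shift - overlap
    scores = [np.dot(synthetic[start + s : start + s + overlap], head)
              for s in range(max_shift)]
    cut = start + int(np.argmax(scores))        # where the cross-fade begins
    fade_out = np.linspace(1.0, 0.0, overlap)
    blended = synthetic[cut:cut + overlap] * fade_out + head * (1.0 - fade_out)
    return np.concatenate([synthetic[:cut], blended, real[overlap:]])
```
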
Audio clips: Comparison of WebRTC’s default PLC system, NetEQ, with our model, WaveNetEQ. Audio clips were taken from LibriTTS and 10% of the audio was dropped in 60 or 120 ms chunks and then filled in by the PLC systems.
Ensuring Robustness
One important factor during PLC is the ability of the network to adapt to variable input signals, including different speakers or changes in background noise. In order to ensure the robustness of the model across a wide range of users, we trained WaveNetEQ on a speech dataset that contains over 100 speakers in 48 different languages, which allows the model to learn the characteristics of human speech in general, instead of the properties of a specific language. To ensure WaveNetEQ is able to deal with noisy environments, such as answering your phone in the train station or in the cafeteria, we augment the data by mixing it with a wide variety of background noises.

While our model learns how to plausibly continue speech, this is only true on a short scale — it can finish a syllable but does not predict words, per se. Instead, for longer packet losses we gradually fade out until the model only produces silence after 120 milliseconds. To further ensure that the model is not generating false syllables, we evaluated samples from WaveNetEQ and NetEQ using the Google Cloud Speech-to-Text API and found no significant difference in the word error rate, i.e., how many mistakes were made transcribing the spoken text.
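
A minimal sketch of that safeguard, where the sample rate and the linear ramp are our assumptions:

```python
import numpy as np

def fade_to_silence(synth, sample_rate=16000, fade_ms=120):
    """Ramp concealment audio down to silence so that nothing is produced
    past ~120 ms of continuous packet loss."""
    n_fade = int(sample_rate * fade_ms / 1000)
    gain = np.zeros(len(synth))
    ramp = np.linspace(1.0, 0.0, min(n_fade, len(synth)))
    gain[:len(ramp)] = ramp                     # fade over the first 120 ms,
    return synth * gain                         # silence after that
```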

We have been experimenting with WaveNetEQ in Duo, where the feature has demonstrated a positive impact on call quality and user experience. WaveNetEQ is already available in all Duo calls on Pixel 4 phones and is now being rolled out to additional models.

Acknowledgements
The core team includes Alessio Bazzica, Niklas Blum, Lennart Kolmodin, Henrik Lundin, Alex Narest, Olga Sharonova from Google and Tom Walters from DeepMind. We would also like to thank Martin Bruse (Google), Norman Casagrande, Ray Smith, Chenjie Gu and Erich Elsen (DeepMind) for their contributions.

Source: Google AI Blog