Author Archives: Julie Cattiau

A communication tool for people with speech impairments

For millions of people, speaking and being understood can be difficult because of conditions that affect speech, including stroke, ALS, cerebral palsy, traumatic brain injury or Parkinson's disease. Today, we’re inviting an initial group of people to test Project Relate, a new Android app that aims to help people with speech impairments communicate more easily with others and interact with the Google Assistant.

Project Relate is a continuation of years of research from both Google’s Speech and Research teams, made possible by over a million speech samples recorded by participants of our research effort. We are now looking for English-speaking testers in Australia, Canada, New Zealand and the United States to try out the app and provide feedback to help us improve it.

As an early tester of Project Relate, you will be asked to record a set of phrases. The app will use these phrases to automatically learn how to better understand your unique speech patterns, and give you access to the app's three main features: Listen, Repeat and Assistant.

Listen: Through the Listen feature, the Relate app transcribes your speech to text in real time, so you can copy-paste text into other apps, or let people read what you want to tell them.

Repeat: You can use the Repeat feature to restate what you’ve said using a clear, synthesized voice. We hope this can be especially helpful in face-to-face conversation or even when you want to speak a command to your home assistant device.

Assistant: Speak directly to your Google Assistant from within the Relate app, so you can take care of different tasks, such as turning on the lights or playing a song, with ease.

In creating the app, we worked closely with many people with speech impairments, including Aubrie Lee, a brand manager at Google, whose speech is affected by muscular dystrophy. “I’m used to the look on people’s faces when they can’t understand what I’ve said,” Aubrie shared with us. “Project Relate can make the difference between a look of confusion and a friendly laugh of recognition.” Since Aubrie works on the marketing team that names new products, she also helped us name the app!

If you have a condition that makes your speech difficult to understand, you may be able to help provide feedback on the Project Relate Android app as a trusted tester. To express interest, please fill out our interest form at g.co/ProjectRelate, and the team will get back to you in the coming months.

With your help, we hope to build a future in which people with disabilities can more easily communicate and be understood.

Source: The Official Google Blog

A whale of a tale about responsibility and AI

A couple of years ago, Google AI for Social Good’s Bioacoustics team created an ML model that helps the scientific community detect the presence of humpback whale sounds using acoustic recordings. This tool, developed in partnership with the National Oceanic and Atmospheric Administration, helps biologists study whale behaviors, patterns, population and potential human interactions.

We realized other researchers could use this model for their work, too — it could help them better understand the oceans and protect key biodiversity areas. We wanted to freely share this model, but struggled with a big dilemma: On one hand, it could help ocean scientists. On the other, though, we worried about whale poachers or other bad actors. What if they used our shared knowledge in a way we didn’t intend?

We decided to consult with experts in the field in order to help us responsibly open source this machine learning model. We worked with Google's Responsible Innovation team to use our AI Principles — a guide to responsibly developing technology — to make a decision.

The team gave us the guidance we needed to open source a machine learning model that could be socially beneficial and was built and tested for safety, while also upholding high standards of scientific excellence for marine biologists and researchers worldwide.

On Earth Day — and every day — putting the AI Principles into practice is important to the communities we serve, on land and in the sea.

Curious about diving deeper? You can use AI to explore thousands of hours of humpback whale songs and make your own discoveries with our Pattern Radio. You can also read about our collaboration with the National Oceanic and Atmospheric Administration of the United States, as well as our work with Fisheries and Oceans Canada (DFO) to apply machine learning to protect killer whales in the Salish Sea.

Source: The Official Google Blog

Project Euphonia’s new step: 1,000 hours of speech recordings

Muratcan Cicek, a PhD candidate at UC Santa Cruz, worked as a summer intern on Google’s Project Euphonia, which aims to improve computers’ abilities to understand impaired speech. This work was especially relevant and important for Muratcan, who was born with cerebral palsy and has a severe speech impairment.

Before his internship, Muratcan recorded 2,000 phrases for Project Euphonia. These phrases, expressions like “Turn the lights on” and “Turn up thermostat to 74 degrees,” were used to build a personalized speech recognition model that could better recognize the unique sound of his voice and transcribe his speech. The prototype allowed Muratcan to share the transcription in a video call so others could better understand him. He used the prototype to converse with co-workers, give status updates during team meetings and connect with people in ways that were previously impossible. Muratcan says, “Euphonia transformed my communication skills in a way that I can leverage in my career as an engineer without feeling insecure about my condition.”

Muratcan, a summer research intern on the Euphonia team, uses the Euphonia prototype app.

1,000 hours of speech samples

The phrases that Muratcan recorded were key to training custom machine learning models that could help him be more easily understood. To help other people who have impaired speech caused by ALS, Parkinson’s disease or Down syndrome, we need to gather samples of their speech patterns. So we’ve worked with partners like CDSS, ALS TDI, ALSA, LSVT Global, Team Gleason and CureDuchenne to encourage people with speech impairments to record their voices and contribute to this research.

Since 2018, nearly 1,000 participants have recorded over 1,000 hours of speech samples. For many, it’s been a source of pride and purpose to shape the future of speech recognition, not only for themselves but also for others who struggle to be understood.

“I contribute to this research so that I can help not only myself, but also a larger group of people with communication challenges that are often left out,” said one Project Euphonia participant.

While the technology is still under development, the speech samples we’ve collected helped us create personalized speech recognition models for individuals with speech impairments, like Muratcan. For more technical details about how these models work, see the Euphonia and Parrotron blog posts. We’re evaluating these personalized models with a group of early testers. The next phase of our research aims to improve speech recognition systems for many more people, but it requires many more speech samples from a broad range of speakers.

How you can contribute

To continue our research, we hope to collect speech samples from an additional 5,000 participants. If you have difficulty being understood by others and want to contribute to meaningful research to improve speech recognition technologies, learn more and consider signing up to record phrases. We look forward to hearing from more participants and experts and, together, helping everyone be understood.

Source: The Official Google Blog

AI’s killer (whale) app

The Salish Sea, which extends from British Columbia to Washington State in the U.S., was once home to hundreds of killer whales, also known as orcas. Now, the population of Southern Resident Killer Whales, a subgroup of orcas, is struggling to survive—there are only 73 of them left. Building on our work using AI for Social Good, we’re partnering with Fisheries and Oceans Canada (DFO) to apply machine learning to protect killer whales in the Salish Sea.

According to DFO, which monitors and protects this endangered population of orcas, the greatest threats to the animals are scarcity of prey (particularly Chinook salmon, their favorite meal), contaminants, and disturbance caused by human activity and passing vessels. Teaming up with DFO and Rainforest Connection, we used deep neural networks to track, monitor and observe the orcas’ behavior in the Salish Sea, and send alerts to Canadian authorities. With this information, marine mammal managers can monitor and treat whales that are injured, sick or distressed. In case of an oil spill, the detection system can allow experts to locate the animals and use specialized equipment to alter the direction of travel of the orcas to prevent exposure.

To teach a machine learning model to recognize orca sounds, DFO provided 1,800 hours of underwater audio and 68,000 labels that identified the origin of the sound. The model is used to analyze live sounds that DFO monitors across 12 locations within the Southern Resident Killer Whales’ habitat. When the model hears a noise that indicates the presence of a killer whale, it’s displayed on the Rainforest Connection (a grantee of the Google AI Impact Challenge) web interface, and live alerts on their location are provided to DFO and key partners through an app that Rainforest Connection developed.
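The alerting step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the system DFO and Rainforest Connection run: the `Detection` record, the station names, the 0.8 score threshold and the five-minute cooldown are all hypothetical, and the model itself is represented only by the confidence score it would emit for each audio window.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Detection:
    """One scored audio window (hypothetical record, not the real data format)."""
    station: str      # hypothetical name of a monitoring location
    timestamp: float  # seconds since the start of the stream
    score: float      # assumed model confidence, in [0, 1], that an orca call is present


def select_alerts(
    detections: Iterable[Detection],
    threshold: float = 0.8,     # assumed cutoff, chosen only for illustration
    cooldown_s: float = 300.0,  # assumed minimum gap between alerts per station
) -> List[Detection]:
    """Return detections that should trigger an alert.

    A detection triggers an alert when its score reaches the threshold and the
    same station has not already alerted within the cooldown window.
    """
    last_alert: Dict[str, float] = {}
    alerts: List[Detection] = []
    for d in sorted(detections, key=lambda item: item.timestamp):
        if d.score < threshold:
            continue  # too uncertain to notify anyone
        previous = last_alert.get(d.station)
        if previous is not None and d.timestamp - previous < cooldown_s:
            continue  # this station alerted recently; avoid duplicate alerts
        last_alert[d.station] = d.timestamp
        alerts.append(d)
    return alerts


if __name__ == "__main__":
    stream = [
        Detection("station-01", 0.0, 0.95),
        Detection("station-01", 60.0, 0.90),
        Detection("station-02", 90.0, 0.40),
        Detection("station-01", 400.0, 0.85),
    ]
    for alert in select_alerts(stream):
        print(f"ALERT {alert.station} at t={alert.timestamp:.0f}s (score {alert.score:.2f})")
```

The cooldown is one possible design choice: a single pod passing a hydrophone can produce many high-scoring windows in a row, and grouping them keeps the alert feed readable for the people who respond to it.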

Our next steps on this project include distinguishing between the three sub-populations of orcas (Southern Resident Killer Whales, Northern Resident Killer Whales and Bigg’s Killer Whales) so that we can better monitor their health and protect them in real time. We hope that advances in bioacoustics technology using AI can make a difference in animal conservation.

Source: The Official Google Blog

How Tim Shaw regained his voice

His entire life, Tim Shaw dedicated himself to football and dreamed of playing professionally. At 23, his dream came true: he was drafted and went on to spend six years as an NFL linebacker. Then, in 2013, Tim felt his body begin to change. It started with small muscle twitches or bicep spasms; once, a gallon of milk slipped out of his hand while he was unloading groceries. During a game when he was perfectly positioned to tackle his opponent, his arm couldn’t hang on and the player slid past. His performance kept inexplicably declining and just before the 2013 season, Tim was cut from the Titans.

Five months later, Tim was diagnosed with Amyotrophic Lateral Sclerosis (ALS, also known as Lou Gehrig’s disease). With no known cause or cure, ALS not only impacts movement, but can make speaking, swallowing and even breathing difficult. Through our partnership with the ALS Therapy Development Institute, we met Tim and learned that the inability to communicate was one of the hardest parts of living with the disease. We showcase Tim’s journey in the new YouTube Originals learning series “The Age of A.I.” hosted by Robert Downey Jr.

For many people with ALS, losing their voice can be one of the most devastating aspects of the disease. But technology has the potential to help. Earlier this year, we announced a research project called Project Euphonia, which aims to use AI to improve communication for people who have impaired speech caused by neurologic conditions, including ALS. When we heard Tim's story, we thought we might have a way to help him regain a part of his identity he'd lost: his voice.

Current text-to-speech technology requires at least 30-40 minutes of recordings to create a high-quality synthetic voice, which people with ALS don’t always have. In Tim’s case, though, we were able to pull together a bank of voice samples from the many interviews he had done while playing for the NFL. The DeepMind, Google AI and Project Euphonia teams created tools that were able to take these recordings and use them to create a voice that resembles how Tim sounded before his speech degraded; he was even able to use the voice to read out the letter he’d recently written to his younger self. While it lacks the expressiveness, quirks and controllability of a real voice, it shows that this technology holds promise.

“It has been so long since I’ve sounded like that, I feel like a new person,” Tim said when he first heard his recreated voice. “I felt like a missing part was put back in place. It’s amazing.”

In the aforementioned letter, Tim told his younger self to “wake up every day and choose to make a positive impact on other people.” Our research and work with Tim makes us hopeful we can do just that by improving communication systems and ultimately giving people with impaired speech more independence. You can learn more about our project with Tim and the vital role he played in our research in “The Age of A.I.” now streaming on YouTube.com/Learning.

Source: The Official Google Blog

Offline translations are now a lot better thanks to on-device AI

Just about two years ago we introduced neural machine translation (NMT) to Google Translate, significantly improving accuracy of our online translations. Today, we’re bringing NMT technology offline—on device. This means that the technology will run in the Google Translate apps directly on your Android or iOS device, so that you can get high-quality translations even when you don't have access to an internet connection.

The neural system translates whole sentences at a time, rather than piece by piece. It uses broader context to help determine the most relevant translation, which it then rearranges and adjusts to sound more like a real person speaking with proper grammar. This makes translated paragraphs and articles a lot smoother and easier to read.

Offline translations can be useful when traveling to other countries without a local data plan, if you don’t have access to the internet, or if you just don’t want to use cellular data. And since each language set is just 35-45MB, they won’t take too much storage space on your phone when you download them.
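To get a feel for the storage involved, here is a small back-of-the-envelope sketch. It assumes every downloaded language is one package in the 35-45MB range quoted above; the function name `offline_storage_mb` is made up for this example and is not part of the Translate app.

```python
from typing import Tuple


def offline_storage_mb(
    num_languages: int,
    min_mb: int = 35,  # lower bound per language set, from the figure quoted above
    max_mb: int = 45,  # upper bound per language set, from the figure quoted above
) -> Tuple[int, int]:
    """Return the (low, high) estimate in MB for downloading num_languages packages."""
    if num_languages < 0:
        raise ValueError("num_languages must be non-negative")
    return num_languages * min_mb, num_languages * max_mb


if __name__ == "__main__":
    low, high = offline_storage_mb(3)
    print(f"3 languages: roughly {low}-{high} MB")
```

Under that assumption, three languages for a trip come to roughly 105-135MB.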

Figure: A comparison between our current phrase-based machine translation (PBMT), new offline neural machine translation (on-device), and online neural machine translation.

To try NMT offline translations, go to your Translate app on Android or iOS. If you’ve used offline translations before, you’ll see a banner on your home screen which will take you to the right place to update your offline files. If not, go to your offline translation settings and tap the arrow next to the language name to download the package for that language. Now you’ll be ready to translate text whether you’re online or not.

We're rolling out this update in 59 languages over the next few days, so get out there and connect to the world around you!

Source: Translate
