
A more powerful Android assistant with Gemini

Earlier this week, I was in the kitchen watching my kids — at the (very fun) ages of seven and 11 — engaged in a conversation with our Google Assistant. My son, who has recently discovered a love of karaoke, asked it to play music so he could practice singing along to his favorite band, BTS. He and his sister ask it all kinds of questions: “How tall is the Eiffel Tower?” “How much do elephants weigh?” “Where was the Declaration of Independence signed?”
Whether we’re dictating a text message in the car or starting a timer while cooking at home, one thing is true: Voice plays an increasingly important role in the way we get things done — not just for us, but for our kids, too. It allows them to indulge their curiosities, learn new things and tap into their creative, inquisitive minds — all without having to look at a screen. As a mom, I see firsthand how kids’ relationship with technology starts by discovering the power of their own voice. And as today’s kids grow up in a world surrounded by technology, we want to help them have safer, educational and natural conversational experiences with Assistant. Here’s how we’re doing it.
Since we know kids — like my own — tend to use their families’ shared devices, we take very seriously our responsibility to help parents protect them from harmful and inappropriate content. Building on that long-standing commitment, we’re rolling out a number of new features that will make it safer for your kids to interact with Assistant.
To give parents more control and peace of mind over the interactions their children have on Google speakers and smart displays, we’re introducing parental controls for Google Assistant. Rolling out in the coming weeks through the Google Home, Family Link and Google Assistant apps on Android and iOS, these controls let you modify media settings, enable or disable certain Assistant features and set up downtime for your kids.
After selecting your child’s account, you can choose the music and video providers they can access — such as YouTube Kids, YouTube and YouTube Music — and your kids will only be able to explore content from those pre-selected providers. You can also decide whether you want your children to listen to news and podcasts on their devices.
Through parental controls, you can also manage the specific Assistant features your kids can use — for example, you can prevent them from making phone calls, or choose what kind of answers they get from Assistant. And to encourage a healthy relationship between kids and technology, just say, “Hey Google, open Assistant settings.” From there, navigate to parental controls, and you can block off time when they shouldn’t use their devices, just like you can on personal Android devices and Chromebooks. Whether you have parental controls turned on or not, we always make sure you’re in control of your privacy settings.
“What does telescope mean?” “What is the definition of ‘fluorescent’?”
Kids are naturally inquisitive and often turn to their Assistant to define words like these when they’re not sure what they mean. To help make those interactions even more engaging, we're introducing Kids Dictionary, which gives simplified and age-appropriate answers across speakers, smart displays and mobile devices.
With Kids Dictionary, children’s interactions with Assistant can be both educational and fun, allowing them to fuel their interests and learn new things. When your child is voice matched and Assistant detects their voice asking about a definition, it will automatically respond using this experience in Kids Dictionary.
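For readers curious how this kind of routing could work in principle, here’s a minimal sketch, assuming a simple lookup keyed on whether the voice-matched speaker is a child. The function, dictionaries and query parsing below are hypothetical illustrations, not Google’s actual implementation.

```python
# Illustrative sketch only: route definition requests to a simplified, age-appropriate
# answer when the voice-matched speaker is a child. All names and logic are assumptions.
def answer_definition(query: str, speaker_is_child: bool,
                      kids_dictionary: dict, standard_dictionary: dict) -> str:
    # Very naive query parsing, just for the example.
    word = query.lower().removeprefix("what does ").removesuffix(" mean?").strip(" ?")
    source = kids_dictionary if speaker_is_child else standard_dictionary
    return source.get(word, f"Sorry, I don't have a definition for '{word}'.")

kids = {"telescope": "A telescope is a tool that makes faraway things, like stars, look bigger."}
adults = {"telescope": "An optical instrument that uses lenses or mirrors to magnify distant objects."}

print(answer_definition("What does telescope mean?", True, kids, adults))   # kid-friendly answer
print(answer_definition("What does telescope mean?", False, kids, adults))  # standard answer
```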
Whether they’re doing their homework or simply curious about a new word they saw in a book, they’re only a “Hey Google” away from a little more help.
Kids today are growing up with technology, so it’s important that their experiences are developmentally appropriate. In addition to our increased efforts around safety and education, we’re also introducing four new kid-friendly voices. These new voices, which we designed alongside kids and parents, were developed with a diverse range of accents to reflect different communities and ways of speaking. And like a favorite teacher, these voices speak in slower and more expressive styles to help with storytelling and aid comprehension.
To activate one of Assistant’s new kid-friendly voices, kids can simply say, “Hey Google, change your voice!” Parents can also help their child navigate to Assistant settings, where they can select a new voice from the options available.
Like all parents, I’m always amazed by my kids’ insatiable curiosity. And every day, I see that curiosity come to life in the many questions they ask our Assistant. We’re excited to not only provide a safer experience, but an educational and engaging one, too — and to continue our work to truly build an Assistant for everyone.
I’ve been leading the Google Assistant team for over a year now, and I’m inspired every day by the meaningful questions it raises — like how voice can support underserved populations, teach kids new things or help people with impaired speech communicate more easily. This week, as part of VentureBeat’s annual Transform technology conference, I sat down for a virtual fireside chat with Jana Eggers, CEO of Nara Logics, to tackle some of these questions and talk about what’s ahead for Assistant.
As a computer scientist at heart, I had a lot of fun digging into topics like the machine learning (ML) renaissance, the future of conversational artificial intelligence (AI) and the incredible power of voice to transform people’s lives. You can watch the whole fireside chat or check out a few takeaways from our conversation below.
Many folks who’ve worked with me know that I like to challenge assumptions. When it comes to building products at Google, that means using technology in new, sometimes uncharted ways to try and solve real-world problems. When I worked on the Google Ads team, for example, I helped create the first ML-driven ads product by challenging existing assumptions about what ML could do. And I’m super excited to use that same challenger spirit to build a world-class, conversational assistant that truly understands you and helps you get things done. I firmly believe we can continue to change people’s lives if we harness new technologies and challenge the boundaries of what’s possible.
So many people are underserved in their information and access needs — new internet users, for example, or people who can’t read but want to access the world’s information. We now see hundreds of millions of voice queries every day, and that number keeps growing among new internet users. In India, for example, nearly 30% of Hindi search queries are spoken. That insight tells us a lot. If you think about reaching these people and making voice a democratizer for access, it’s a compelling area to continue to invest in.
The holy grail with Google Assistant is to figure out how a computer can understand humans the way humans understand each other. That’s an audacious, ambitious goal. Human language is ambiguous; we rely on many different cues when we speak to each other that are inherent to us as human beings. So we need to teach computers how humans express themselves and to ask: What are they trying to say? That’s what this product strives to be — a natural and conversational assistant. Every day we ask ourselves: How do we create a magical conversational experience, where the computer truly understands what you’re trying to say and adapts to you?
This work can’t be done without the right team. Building the best team of people possible is my number one piece of advice. This is hard stuff; it requires a type of individual I call a “pragmatic dreamer.” You want people who can dream big, but you also need people in the trenches figuring out the real, pragmatic engineering challenges standing in the way. I think it’s really important to create space for a team to ideate and explore the boundaries of what’s possible with technology.
Sometimes we get so enamored with technology that we forget what it’s for. I always ask myself: “What are we trying to do for human beings; what are we trying to make better for them?” Voice is sometimes considered a technology in search of a problem, but I think of it differently. There are real problems people have that this technology can solve. It’s the constant marriage of user problems and what technology can do to solve them. If you keep people as your north star, you can’t go wrong.
Like any other busy parent, I’m always looking for ways to make daily life a little bit easier. And Google Assistant helps me do that — from giving me cooking instructions as I’m making dinner for my family to sharing how much traffic there is on the way to the office. Assistant allows me to get more done at home and on the go, so I can make time for what really matters.
Every month, over 700 million people around the world get everyday tasks done with their Assistant. Voice has become one of the main ways we communicate with our devices. But we know it can feel unnatural to say “Hey Google” or touch your device every time you want to ask for help. So today, we’re introducing new ways to interact with your Assistant more naturally — just as if you were talking to a friend.
Our first new feature, Look and Talk, is beginning to roll out today in the U.S. on Nest Hub Max. Once you opt in, you can simply look at the screen and ask for what you need. From the beginning, we’ve built Look and Talk with your privacy in mind. It’s designed to activate when you opt in and both Face Match and Voice Match recognize it’s you. And video from these interactions is processed entirely on-device, so it isn’t shared with Google or anyone else.
Let’s say I need to fix my leaky kitchen sink. As I walk into the room, I can just look at my Nest Hub Max and say “Show plumbers near me” — without having to say “Hey Google” first.
There’s a lot going on behind the scenes to recognize whether you’re actually making eye contact with your device rather than just giving it a passing glance. In fact, it takes six machine learning models to process more than 100 signals from both the camera and microphone — like proximity, head orientation, gaze direction, lip movement, context awareness and intent classification — all in real time.
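To give a rough sense of what fusing signals like these can look like, here’s a minimal sketch, assuming a simple weighted combination behind Face Match and Voice Match gates. The signal names, weights and thresholds are illustrative assumptions, not the actual models described above.

```python
# Hypothetical sketch of a multi-signal activation decision. Weights, thresholds
# and signal definitions are assumptions for illustration, not Google's models.
from dataclasses import dataclass

@dataclass
class FrameSignals:
    proximity: float         # 0-1, higher when the user is close to the display
    head_orientation: float  # 0-1, higher when the head faces the display
    gaze_on_screen: float    # 0-1, confidence of sustained gaze, not a passing glance
    lip_movement: float      # 0-1, likelihood the user is speaking
    intent_score: float      # 0-1, classifier score that speech is directed at the device
    face_match: bool         # enrolled Face Match confirmed
    voice_match: bool        # enrolled Voice Match confirmed

def should_activate(s: FrameSignals, attention_threshold: float = 0.75) -> bool:
    # Privacy gate first: both matches must confirm it's the enrolled user.
    if not (s.face_match and s.voice_match):
        return False
    # Fuse the visual and audio cues into a single attention score.
    attention = (0.25 * s.proximity + 0.25 * s.head_orientation
                 + 0.30 * s.gaze_on_screen + 0.20 * s.lip_movement)
    return attention >= attention_threshold and s.intent_score >= 0.5
```

In a sketch like this, a passing glance scores low on sustained gaze and lip movement, so the fused attention score stays under the threshold and the device doesn’t activate.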
Last year, we announced Real Tone, an effort to improve Google’s camera and imagery products across skin tones. Continuing in that spirit, we’ve tested and refined Look and Talk to work across a range of skin tones so it works well for people with diverse backgrounds. We’ll continue to drive this work forward using the Monk Skin Tone Scale, released today.
We’re also expanding quick phrases to Nest Hub Max, which let you skip saying “Hey Google” for some of your most common daily tasks. So as soon as you walk through the door, you can just say “Turn on the hallway lights” or “Set a timer for 10 minutes.” Quick phrases are also designed with privacy in mind. If you opt in, you decide which phrases to enable, and they’ll work when Voice Match recognizes it’s you.
In everyday conversation, we all naturally say “um,” correct ourselves and pause occasionally to find the right words. But others can still understand us, because people are active listeners and can react to conversational cues in under 200 milliseconds. We believe your Google Assistant should be able to listen and understand you just as well.
To make this happen, we're building new, more powerful speech and language models that can understand the nuances of human speech — like when someone is pausing, but not finished speaking. And we’re getting closer to the fluidity of real-time conversation with the Tensor chip, which is custom-engineered to handle on-device machine learning tasks super fast. Looking ahead, Assistant will be able to better understand the imperfections of human speech without getting tripped up — including the pauses, “umms” and interruptions — making your interactions feel much closer to a natural conversation.
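As a toy illustration of the “pausing, but not finished” distinction, here’s a minimal sketch, assuming an endpointing heuristic that waits longer when a partial transcript ends in a filler word. The thresholds and filler list are made up for the example and are not the speech models described above.

```python
# Illustrative endpointing heuristic only; thresholds and filler list are assumptions.
FILLERS = {"um", "uh", "umm", "er"}

def utterance_finished(partial_transcript: str, silence_ms: int) -> bool:
    words = partial_transcript.strip().lower().split()
    if not words:
        return False
    # If the speaker trailed off on a filler, give them more time before responding.
    if words[-1].strip(",.") in FILLERS:
        return silence_ms > 1200
    # Otherwise a shorter silence is enough to hand the turn to the Assistant.
    return silence_ms > 600

print(utterance_finished("set a timer for, um", 700))          # False: keep listening
print(utterance_finished("set a timer for ten minutes", 700))  # True: respond now
```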
We're working hard to make Google Assistant the easiest way to get everyday tasks done at home, in the car and on the go. And with these latest improvements, we’re getting closer to a world where you can spend less time thinking about technology — and more time staying present in the moment.