
Ask a Techspert: What’s breaking my text conversations?

Not to brag, but I have a pretty excellent group chat with my friends. We use it to plan trips, to send happy birthdays and, obviously, to share lots and lots of GIFs. It’s the best — until it’s not. We don’t all have the same kind of phones; we’ve got both Android phones and iPhones in the mix. And sometimes, they don’t play well together. Enter “green bubble issues” — things like missing read receipts and typing indicators, low-res photos and videos, broken group chats…I could go on describing the various potential communication breakdowns, but you probably know what I’m talking about. Instead, I decided to ask Google’s Elmar Weber: What’s the problem with messaging between different phone platforms?

First, can you tell me what you do at Google?

I lead several engineering organizations including the team that builds Google’s Messages app, which is available on most Android phones today.

OK, then you’re the perfect person to talk to! So my first question: When did this start being a problem? I remember way back when I had my first Android phone, I would text iPhone friends…and it was fine.

Texting has been around for a long time. Basic SMS texting — which is what you’re talking about here — is 30 years old. SMS, which stands for Short Message Service, originally limited messages to just 160 characters. Back then you couldn’t do things like send photos or reactions or read receipts. In fact, mobile phones weren’t made for messaging at all; they were designed for making phone calls. To send a message you actually had to hit the number buttons to get to the letters you wanted to spell out. But people started using it a ton, and it sort of exploded into a global messaging industry. MMS (Multimedia Messaging Service) was introduced in the early 2000s, which let people send photos and videos for the first time. But that came with a lot of limitations too.

Got it. Then the messaging apps all started building their own systems to support modern messaging features like emoji reactions and typing indicators, because SMS/MMS were created long before those things were even dreamed of?

Yes, exactly.

I guess…we need a new SMS?

Well the new SMS is RCS, which stands for Rich Communication Services. It enables things like high-resolution photo and video sharing, read receipts, emoji reactions, better security and privacy with end-to-end encryption and more. Most major carriers support RCS, and Android users have been using it for years.

How long has RCS been around?

Version one of RCS was released December 15, 2008.

Who made it?

RCS isn’t a messaging app like Messages or WhatsApp — it’s an industry-wide standard. Similar to other technical standards (USB, 5G, email), it was developed by a group of different companies. In the case of RCS, it was coordinated by an association of global wireless operators, hardware chip makers and other industry players.

RCS makes messaging better, so if Android phones use this, then why are texts from iPhones still breaking? RCS sounds like an upgrade — so shouldn’t that fix everything?

There’s the hitch! Android phones use RCS, but iPhones still don’t. iPhones still rely on SMS and MMS for conversations with Android users, which is why your group chats feel so outdated. Think of it like this: If you have two groups of people who speak different languages, each group can communicate perfectly well among themselves, but the two groups can’t talk to each other. When they try, they have to act out what they're saying, as though they're playing charades. Now think of RCS as a magic translator that lets the groups speak to each other fluently — but every group has to use the translator, and if one doesn’t, they’re all back to playing charades.

Do you think iPhones will start using RCS too?

I hope so! It’s not just about things like the typing indicators, read receipts or emoji reactions — everyone should be able to pick up their phone and have a secure, modern messaging experience. Anyone who has a phone number should get that, and that’s been lost a little bit because we’re still finding ourselves using outdated messaging systems. But the good news is that RCS could bring that back and connect all smartphone users, and because so many different companies and carriers are working together on it, the future is bright.

Check out Android.com/GetTheMessage to learn why now is the time for Apple to fix texting.

Ask a Techspert: How do digital wallets work?

In recent months, you may have gone out to dinner only to realize you left your COVID vaccine card at home. Luckily, the host is OK with the photo of it on your phone. In this case, it’s acceptable to show someone a picture of a card, but for other things it isn’t — an image of your driver’s license or credit card certainly won’t work. So what makes digital versions of these items more legit than a photo? To better understand the digitization of what goes into our wallets and purses, I talked to product manager Dong Min Kim, who works on the brand new Google Wallet. Google Wallet, which will be coming soon in over 40 countries, is the new digital wallet for Android and Wear OS devices…but how does it work?

Let’s start with a basic question: What is a digital wallet?

A digital wallet is simply an application that holds digital versions of the physical items you carry around in your actual wallet or purse. We’ve seen this shift where something you physically carry around becomes part of your smartphone before, right?

Like..?

Look at the camera: You used to carry around a separate item, a camera, to take photos. It was a unique device that did a specific thing. Then, thanks to improvements in computing power, hardware and image processing algorithms, engineers merged the function of the camera — taking photos — into mobile phones. So now, you don’t have to carry around both, if you don’t want to.

Ahhh yes, I am old enough to remember attending college gatherings with my digital camera and my flip phone.

Ha! So think about what else you carry around: your wallet and your keys.

So the big picture here is that digital wallets help us carry around less stuff?

That’s certainly something we’re thinking about, but it’s more about how we can make these experiences — the ones where you need to use a camera, or in our case, items from your wallet — better. For starters, there’s security: It's really hard for someone to take your phone and use your Google Wallet, or to take your card and add it to their own phone. Your financial institution will verify who you are before you can add a card to your phone, and you can set a screen lock so a stranger can’t access what’s on your device. And should you lose your device, you can remotely locate, lock or even wipe it from “Find My Device.”

What else can Google Wallet do that my physical wallet can’t?

If you saved your boarding pass for a flight to Google Wallet, it will notify you of delays and gate changes. When you head to a concert, you’ll receive a notification on your phone beforehand, reminding you of your saved tickets.

Wallet also works with other Google apps — for instance if you’re taking the bus to see a friend and look up directions in Google Maps, your transit card and balance will show up alongside the route. If you're running low on fare, you can tap and add more. We’ll also give you complete control over how items in your wallet are used to enable these experiences; for example, the personal information on your COVID vaccine pass is kept on your device and never shared without your permission, not even with Google.

Plus, even if you lose your credit or debit card and you’re waiting for the replacement to show up, you can still use that card with Google Wallet because of the virtual number attached to it.

This might be taking a step backwards, but can I pay someone from my Google Wallet? As in can I send money from a debit card, or straight from my bank account?

That’s actually where the Google Pay app — which is available in markets like the U.S., India and Singapore — comes in. We’ll keep growing this app as a companion app where you can do more payments-focused things like send and receive money from friends or businesses, discover offers from your favorite retailers or manage your transactions.

OK, but can I pay with my Google Wallet?

Yes, you can still pay with the cards stored in your Google Wallet in stores where Google Pay is accepted; it’s simple and secure.

Use payment cards in Google Wallet in stores with Google Pay, got it — but how does everything else “get” into Wallet?

We've already partnered with hundreds of transit agencies, retailers, ticket providers, health agencies and airlines so they can create digital versions of their cards or tickets for Google Wallet. You can add a card or ticket directly to Wallet, or within the apps or sites of businesses we partner with, you’ll see an option to add it to Wallet. We’re working on adding more types of content for Wallet, too, like digital IDs, or office and hotel keys.

An image of the Google Wallet app open on a Pixel phone. The app is showing a Chase Freedom Unlimited credit card, a ticket for a flight from SFO to JFK, and a Walgreens cash reward pass. In the bottom right hand corner, there is an “Add to Wallet” button.

Developers can make almost any item into a digital pass. They can use the templates we’ve created, like the ones for boarding passes and event tickets — or they can use a generic template if it’s something more unique and we don’t have a specific solution for it yet. This invitation to developers is part of what I think makes Google Wallet interesting; it’s very open.
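
If you’re a developer and curious what that looks like in practice, here is a rough sketch of the generic-pass idea: you describe the item as structured data and sign it as a JWT that backs an “Add to Google Wallet” link. The field names, IDs, key file and save URL below are illustrative assumptions rather than the exact Wallet API schema, so treat it as a sketch and check the official Google Wallet documentation before building on it.

```python
# Hypothetical sketch: issuing a simple generic pass for Google Wallet.
# Field names, class/object IDs and the save URL are illustrative assumptions;
# consult the Google Wallet API documentation for the authoritative schema.
import jwt  # PyJWT, installed with the "cryptography" extra for RS256 signing

ISSUER_EMAIL = "wallet-issuer@example-project.iam.gserviceaccount.com"  # placeholder
PRIVATE_KEY = open("service-account-key.pem").read()                    # placeholder key file

generic_object = {
    "id": "3388000000000000000.example-loyalty-card",        # issuerId.objectSuffix (made up)
    "classId": "3388000000000000000.example-loyalty-class",  # made-up class ID
    "cardTitle": {"defaultValue": {"language": "en-US", "value": "Example Rewards"}},
    "header": {"defaultValue": {"language": "en-US", "value": "Coffee Punch Card"}},
    "barcode": {"type": "QR_CODE", "value": "member-12345"},
}

claims = {
    "iss": ISSUER_EMAIL,           # the issuing service account
    "aud": "google",
    "typ": "savetowallet",
    "payload": {"genericObjects": [generic_object]},
}

token = jwt.encode(claims, PRIVATE_KEY, algorithm="RS256")
print("Add to Wallet link:", f"https://pay.google.com/gp/v/save/{token}")
```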

What exactly do you mean by “open”?

Well, the Android platform is open — any Android developer can use and develop for Wallet. One thing that’s great about that is all these features and tools can be made available on less expensive phones, too, so it isn’t only people who can afford the most expensive, newest phones out there who can use Google Wallet. Even if a phone can’t use some features of Google Wallet, it’s possible for developers to use QR or barcodes for their content, which more devices can access.

So working with Google Wallet is easier for developers. Any ways you’re making things easier for users?

Plenty of them! In particular, we’re working on ways to make it easy to add objects directly from your phone too. For instance, today if you take a screenshot of your boarding pass or COVID vaccine card on an Android device, we’ll give you the option to add it directly to your Google Wallet!

Ask a Techspert: What’s that weird box next to my emoji?

A few months ago, I received a message from a friend that, I have to confess, made absolutely no sense. Rows of emoji followed by different boxes — like this 􏿿􏿿􏿿􏿿􏿿􏿿 — appeared…so I sent back a simple “huh?” Apparently she’d sent me a string of emoji that were meant to tell me about her weekend and let’s just say that it was all lost in translation.

To find out exactly what caused our communication breakdown, I decided to ask emoji expert Jennifer Daniel.

Why did the emoji my friend typed to me show up as 􏿿􏿿􏿿􏿿􏿿􏿿 ?

Oh boy. No bueno. Sounds like your friend was using some of the new emoji that were released this month. (Not to rub it in, but they are so good!!! There’s a salute 🫡, a face holding back tears 🥹 and another face that’s melting 🫠!) Sadly, you’re not the only one who’s losing things in translation. For way too long, 96% of Android users couldn’t see new emoji the year they debuted.

And it isn't just an Android problem: Despite being one of the earliest platforms to include emoji, last year Gmail received its first emoji update since 2016! (You read that right: Two-thousand-sixteen!) That gap often resulted in skin-toned and gendered emoji appearing broken.


A few examples of "broken" skin tone and gendered emoji.

What!? That’s crazy. Why?

Yeah, strong agree. Historically, emoji have been at the mercy of operating system updates. New OS? New emoji. If you didn’t update your device, it meant that when new emoji were released, they would display as those black boxes you saw, which are referred to as a “tofu.” It gets worse: What if your phone doesn’t offer OS updates? Well, you’d have to buy a newer phone. Maybe that’d be worth it so you can use the new finger heart emoji (🫰)???

Emoji are fundamental to digital communication. Meanwhile, there is a very real economic divide between people who can afford to get a new phone every year (or who can afford a fancy phone that generously updates the OS) and everyone else in the world. That is lame, absurd and I personally hate it. Now for the good news: Check your phone, I bet you can see the emoji from your friend’s email today.

Whaaaaat! You’re right. Why can I see them now but I couldn’t a few months ago?

Well, this year Google finally decoupled emoji updates from operating system updates. That means YOU get an emoji and YOU get an emoji and YOU get an emoji!

Examples of emoji

What does “decoupled” emoji updates mean?

It basically means emoji can be updated on your phone or your computer without you updating your operating system. As of this month, all apps that use Appcompat (a tool that enables Android apps to be compatible with several Android versions) will automatically get the latest and greatest emoji, so you can send and receive emoji even if you don’t have the newest phone. And this will work across Google: All 3,366 emoji will now appear in Gmail, on Chrome OS and in lots of other places when people send them to you. Apps that make their own emoji rather than defaulting to the operating system’s may find themselves falling behind, because maintaining and distributing emoji is a lot of work. This is why we're so thrilled to see Google rely on Noto Emoji so everyone can get the latest emoji quickly.

Since you mentioned Gmail being an early emoji adopter, it makes me wonder…how old are emoji? Where do they come from?

A volunteer-based organization called the Unicode Consortium digitizes the world’s languages. They’re the reason why, when you send Hindi text from one computer, the computer on the other end can render it in Hindi. In their mission to ensure different platforms and operating systems can work together, they standardize the underlying technology that Google, Apple, Twitter and others use to render their emoji fonts.

You see, emoji are a font. That’s right. A font. I know. They look like tiny pictures but they operate the same way any other letter of the alphabet does when it enters our digital realm.

Like the letter A (U+0041) or the letter अ (U+0905), each emoji is assigned a code point (for instance, 😤 is U+1F624) by the Unicode Consortium. (Some emoji contain multiple code points — I’m generalizing a bit! Don’t tell the Unicode Consortium.) Point being: Emoji are a font and like fonts, some emoji on iPhones look different than they do on Pixel phones.
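
If you want to see this for yourself, a few lines of Python make the “emoji are just characters with code points” idea concrete. (This is purely an illustration; it is not how any emoji font actually works under the hood.)

```python
# Emoji are characters with Unicode code points, just like letters.
import unicodedata

for ch in ["A", "अ", "😤"]:
    print(ch, f"U+{ord(ch):04X}", unicodedata.name(ch))
# A   U+0041  LATIN CAPITAL LETTER A
# अ   U+0905  DEVANAGARI LETTER A
# 😤  U+1F624 FACE WITH LOOK OF TRIUMPH

# Some emoji are built from several code points strung together,
# for example a waving hand plus a skin tone modifier:
wave = "\U0001F44B\U0001F3FD"  # 👋🏽
print([f"U+{ord(c):04X}" for c in wave])  # ['U+1F44B', 'U+1F3FD']
```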


A variety of the new emoji designs that are now visible across Google products including Gmail, Google Chat, YouTube Live Chat and Chrome OS.

So, the Unicode Consortium makes fonts?

No, they manage a universal character encoding set that written languages map to. Google's Noto project is a global font project to support those existing scripts and languages. Google uses Noto Emoji and provides resources to ensure your emoji render on Android and in desktop environments including metadata like iconography and shortcodes too! All Google chat products now support this.

We’re also working on ways for you to download or embed Noto Emoji into your website of choice via fonts.google.com. So, stay tuned 😉.

Emoji are a font. Black boxes are tofus. The more you know! I guess I have one final question: Now that I can send (and see!) the melting face emoji, will it look identical no matter who I send it to?

Well, every emoji font has its own flavor. Some of these design variations are minor and you might not even notice them. With others, primarily the smilies (😆🤣🥲), the details really matter — people are hardwired to read micro-expressions! The last thing anyone wants is an emoji you see as a smile and someone else sees as a downward smirk — it can ruin friendships! Fortunately, over the past three years designs have converged, so there’s less chance of being misunderstood 🌈.

Ask a Techspert: What does AI do when it doesn’t know?

As humans, we constantly learn from the world around us. We experience inputs that shape our knowledge — including the boundaries of both what we know and what we don’t know.

Many of today’s machines also learn by example. However, these machines are typically trained on datasets and information that doesn’t always include rare or out-of-the-ordinary examples that inevitably come up in real-life scenarios. What is an algorithm to do when faced with the unknown?

I recently spoke with Abhijit Guha Roy, an engineer on the Health AI team, and Ian Kivlichan, an engineer on the Jigsaw team, to hear more about using AI in real-world scenarios and better understand the importance of training it to know when it doesn’t know.

Abhijit, tell me about your recent research in the dermatology space.

We’re applying deep learning to a number of areas in health, including in medical imaging where it can be used to aid in the identification of health conditions and diseases that might require treatment. In the dermatological field, we have shown that AI can be used to help identify possible skin issues and are in the process of advancing research and products, including DermAssist, that can support both clinicians and people like you and me.

In these real-world settings, the algorithm might come up against something it's never seen before. Rare conditions, while individually infrequent, might not be so rare in aggregate. These so-called “out-of-distribution” examples are a common problem for AI systems, which can perform less well when they’re exposed to things they haven’t seen before in their training.

Can you explain what “out-of-distribution” means for AI?

Most traditional machine learning examples that are used to train AI deal with fairly unsubtle — or obvious — changes. For example, if an algorithm that is trained to identify cats and dogs comes across a car, then it can typically detect that the car — which is an “out-of-distribution” example — is an outlier. Building an AI system that can recognize the presence of something it hasn’t seen before in training is called “out-of-distribution detection,” and is an active and promising field of AI research.

Okay, let’s go back to how this applies to AI in medical settings.

Going back to our research in the dermatology space, the differences between skin conditions can be much more subtle than recognizing a car from a dog or a cat, even more subtle than recognizing a previously unseen “pick-up truck” from a “truck”. As such, the out-of-distribution detection task in medical AI demands even more of our focused attention.

This is where our latest research comes in. We trained our algorithm to recognize even the most subtle of outliers (a so-called “near-out of distribution” detection task). Then, instead of the model inaccurately guessing, it can take a safer course of action — like deferring to human experts.
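
To make the “defer instead of guess” idea concrete, here is a deliberately simplified Python sketch. It uses the model’s own confidence as a crude stand-in for out-of-distribution detection, which is far less sophisticated than the near-OOD techniques in the research described above; the logits and threshold are made up for illustration.

```python
# A minimal sketch of "know when you don't know": if the model's confidence in
# its top prediction is low (a crude proxy for unfamiliar input), defer to a
# human expert instead of guessing. Scores and threshold are illustrative only.
import numpy as np

def softmax(logits):
    z = logits - np.max(logits)
    p = np.exp(z)
    return p / p.sum()

def classify_or_defer(logits, threshold=0.80):
    probs = softmax(np.asarray(logits, dtype=float))
    top = int(np.argmax(probs))
    if probs[top] < threshold:
        return {"action": "defer_to_human", "confidence": float(probs[top])}
    return {"action": "predict", "label": top, "confidence": float(probs[top])}

print(classify_or_defer([4.0, 0.5, 0.3]))  # confident, so the model predicts
print(classify_or_defer([1.1, 1.0, 0.9]))  # uncertain, so it defers to a human
```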

Ian, you’re working on another area where AI needs to know when it doesn’t know something. What’s that?

The field of content moderation. Our team at Jigsaw used AI to build a free tool called Perspective that scores comments according to how likely they are to be considered toxic by readers. Our AI algorithms help identify toxic language and online harassment at scale so that human content moderators can make better decisions for their online communities. A range of online platforms use Perspective more than 600 million times a day to reduce toxicity and the human time required to moderate content.

In the real world, online conversations — both the things people say and even the ways they say them — are continually changing. For example, two years ago, nobody would have understood the phrase “non-fungible token (NFT).” Our language is always evolving, which means a tool like Perspective doesn't just need to identify potentially toxic or harassing comments, it also needs to “know when it doesn’t know,” and then defer to human moderators when it encounters comments very different from anything it has encountered before.

In our recent research, we trained Perspective to identify comments it was uncertain about and flag them for separate human review. By prioritizing these comments, human moderators can correct more than 80% of the mistakes the AI might otherwise have made.
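
For a sense of how that routing might look in code, here is a small sketch built on the publicly documented Perspective API client pattern. The “uncertain” score band is an arbitrary illustration, not the uncertainty method from the research; a production system would rely on the model’s own uncertainty estimates rather than a fixed score range.

```python
# Sketch: score a comment with the Perspective API and route borderline cases
# to human moderators. The 0.4-0.7 "uncertain" band is an illustrative choice.
from googleapiclient import discovery

API_KEY = "YOUR_API_KEY"  # placeholder

client = discovery.build(
    "commentanalyzer",
    "v1alpha1",
    developerKey=API_KEY,
    discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
    static_discovery=False,
)

def triage(comment_text):
    request = {
        "comment": {"text": comment_text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    response = client.comments().analyze(body=request).execute()
    score = response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
    if 0.4 <= score <= 0.7:
        return "send_to_human_moderator", score  # the model is unsure
    return ("likely_toxic" if score > 0.7 else "likely_fine"), score

print(triage("You are a wonderful person"))
```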

What connects these two examples?

We have more in common with the dermatology problem than you'd expect at first glance — even though the problems we try to solve are so different.

Building AI that knows when it doesn’t know something means you can prevent certain errors that might have unintended consequences. In both cases, the safest course of action for the algorithm entails deferring to human experts rather than trying to make a decision that could lead to potentially negative effects downstream.

There are some fields where this isn’t as important and others where it’s critical. You might not care if an automated vegetable sorter incorrectly sorts a purple carrot after being trained on orange carrots, but you would definitely care if an algorithm didn’t know what to do about an abnormal shadow on an X-ray that a doctor might recognize as an unexpected cancer.

How is AI uncertainty related to AI safety?

Most of us are familiar with safety protocols in the workplace. In safety-critical industries like aviation or medicine, protocols like “safety checklists” are routine and very important in order to prevent harm to both the workers and the people they serve.

It’s important that we also think about safety protocols when it comes to machines and algorithms, especially when they are integrated into our daily workflow and aid in decision-making or triaging that can have a downstream impact.

Teaching algorithms to refrain from guessing in unfamiliar scenarios and to ask for help from human experts falls within these protocols, and is one of the ways we can reduce harm and build trust in our systems. This is something Google is committed to, as outlined in its AI Principles.

Ask a Techspert: What’s a subsea cable?

Whenever I try to picture the internet at work, I see little pixels of information moving through the air and above our heads in space, getting where they need to go thanks to 5G towers and satellites in the sky. But it’s a lot deeper than that — literally. Google Cloud’s Vijay Vusirikala recently talked with me about why the coolest part of the internet is really underwater. So today, we’re diving into one of the best-kept secrets in submarine life: There wouldn’t be an internet without the ocean.

First question: How does the internet get underwater?

We use something called a subsea cable that runs along the ocean floor and transmits bits of information.

What’s a subsea cable made of?

These cables are about the same diameter as the average garden hose, but on the inside they contain thin optical fibers. Those fibers are surrounded by several layers of protection, including two layers of ultra-high strength steel wires, water-blocking structures and a copper sheath. Why so much protection? Imagine the pressure they are under. These cables are laid directly on the sea bed and have tons of ocean water on top of them! They need to be super durable.


A true inside look at subsea cables: On the left, a piece of the Curie subsea cable showing the additional steel armoring for protection close to the beach landing. On the right, a cross-sectional view of a typical deep water subsea cable showing the optical fibers, copper sheath, and steel wires for protection.

Why are subsea cables important?

Subsea cables are faster, can carry higher traffic loads and are more cost-effective than satellite networks. Subsea cables are like a highway with the right number of lanes to handle rush-hour traffic without getting bogged down in standstill jams. They combine high bandwidth (upwards of 300 to 400 terabits of data per second) with low lag time. To put that into context, 300 to 400 terabits per second is roughly the same as 17.5 million people streaming high-quality videos — at the same time!
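
As a quick sanity check on that comparison, the back-of-the-envelope math works out if you assume a high-quality video stream uses a couple dozen megabits per second (the exact bitrate here is my assumption, not a figure from the interview):

```python
# Rough arithmetic: how many simultaneous streams fit in ~400 terabits per second?
capacity_tbps = 400              # cable capacity, terabits per second
stream_mbps = 23                 # assumed bitrate of one high-quality video stream
streams = capacity_tbps * 1_000_000 / stream_mbps
print(f"About {streams / 1e6:.1f} million simultaneous streams")  # roughly 17 million
```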

So when you send a customer an email, share a YouTube video with a family member or talk with a friend or coworker on Google Meet, these underwater cables are like the "tubes" that deliver those things to the recipient.

Plus, they help increase internet access in places that have had limited connectivity in the past, like countries in South America and Africa. This leads to job creation and economic growth in the places where they’re constructed.

How many subsea cables are there?

There are around 400 subsea cables criss-crossing the planet in total. Currently, Google invests in 19 of them — a mix of cables we build ourselves and projects we’re a part of, where we work together with telecommunications providers and other companies.

Video introducing Curie, a subsea cable.

Wow, 400! Does the world need more of them?

Yes! Telecommunications providers alongside technology companies are still building them around the world. At Google, we invest in subsea cables for a few reasons: One, our Google applications and Cloud services keep growing. This means more network demand from people and businesses in every country around the world. And more demand means building more cables and upgrading existing ones, which have less capacity than their modern counterparts.

Two, you cannot have a single point of failure when you're on a mission to connect the world’s information and make it universally accessible. Repairing a subsea cable that goes down can take weeks, so to guard against this we place multiple cables in each cross section. This gives us sufficient extra cable capacity so that services aren’t affected for people around the world.

What’s your favorite fact about subsea cables?

Three facts, if I may!

First, I love that we name many of our cables after pioneering women, like Curie for Marie Curie, which connects California to Chile, and Grace Hopper, which links the U.S., Spain and the U.K. Firmina, which links the U.S., Argentina, Brazil and Uruguay, is named after Brazil’s first novelist, Maria Firmina dos Reis.

Second, I’m proud that the cables are kind to their undersea homes. They’re environmentally friendly, made of chemically inactive materials that don't harm the flora and fauna of the ocean, and for the most part they stay put, so they don’t disrupt the ocean floor or marine life. We’re also very careful about where we place them: We study each beach’s marine life conditions and adjust our attachment timeline so we don’t disrupt a natural lifecycle process, like sea turtle nesting season. Our goal is to integrate into the underwater landscape, not bother it.

And lastly, my favorite fact is actually a myth: Most people think sharks regularly attack our subsea cables, but I’m aware of exactly one shark attack on a subsea cable that took place more than 15 years ago. Truly, the most common problems for our cables are caused by people doing things like fishing, trawling (which is when a fishing net is pulled through the water behind a boat) and anchor drags (when a ship drifts without holding power even though it has been anchored).

Ask a Techspert: What exactly is a time crystal?

Quantum computers will allow us to do hard computations and help us rethink our understanding of the fundamentals of science. That’s because quantum computers harness the power of quantum mechanics — the subfield of physics that explains how the natural world around us works at the subatomic level. While we are a long way off from building a useful quantum computer, our team at Google Quantum AI is already running novel experiments on the quantum processors we have today. One such experiment, just published in the science journal Nature, is our work on a new phase of matter called a time crystal.

For years, scientists have theorized about the possibility of a time crystal and wondered whether one could ever be observed. By using our quantum processor, Sycamore, we now know it’s possible. Google Quantum AI research scientists Pedram Roushan and Kostyantyn Kechedzhi answered some frequently asked questions about this phenomenon.

What is a time crystal?

A time crystal may sound like it's from the pages of a science fiction novel, but it’s something that we’ve demonstrated is possible to observe, even though it may appear to go against the basic laws of nature. You might be familiar with crystals like emerald, diamond and salt. At the microscopic level, they’re made up of repeating patterns — many layers of atoms that ultimately form a physical structure. For example, a grain of salt is made up of sodium and chlorine atoms. A time crystal is similar, but instead of forming a repetitive pattern in space, an oscillating pattern is formed in time.

An artistic representation of a time crystal: a pattern on the faces of a 20-sided object changes from one instant of time to the next, but repeats itself, and the oscillation continues indefinitely.

Time crystals show an oscillating pattern in time.

Can you give an example of a time crystal?

Let’s say you used the Hubble Telescope to take a picture of a planet and its orbiting moon each time the moon finished an orbit. These pictures would all look the same, with the moon repeating its orbit over and over again. Now hypothetically, let’s say there were hundreds of new moons added to the planet’s orbit. Each new moon would exert gravitational pull on the orbits of the others. Over time, the moons would start to deviate from their orbits without ever coming back to their starting points. This increase in disorder, or entropy, is unavoidable due to the second law of thermodynamics, a fundamental law of physics. What if there were a system of a planet and many moons where the moons could periodically repeat their orbits, without ever increasing entropy? This configuration — evidently hard to achieve — would be considered a time crystal.

How do you use a quantum processor to observe a time crystal?

Quantum objects behave like waves, similar to how sonar uses sound waves reflected from solid objects on the ocean floor to detect them. If the medium that the quantum wave travels through contains multiple objects at random locations in space, then the wave could be confined and come to a complete stop. This key insight about quantum waves is what puts a cap on the spread of entropy and allows the production of a stable time crystal, even though it appears to be at odds with the second law of thermodynamics. This is where our quantum processor comes in. In our paper, we describe how we used our Sycamore processor as a quantum system to observe these oscillatory wave patterns of stable time crystals.

Our quantum processor, Sycamore, is made up of two chips. The top chip contains qubits and the bottom contains the wiring.

We observed a time crystal using Sycamore, our quantum processor.

Now that time crystals have been observed for the first time, what’s next?

Observing a time crystal shows how quantum processors can be used to study novel physical phenomena that have puzzled scientists for years. Moving from theory to actual observation is a critical leap and is the foundation for any scientific discovery. Research like this opens the door to many more experiments, not only in physics, but hopefully inspiring future quantum applications in many other fields.

Ask a Techspert: How do Nest Cams know people from pets?

The other day when I was sitting in my home office, I got an alert from my Nest Doorbell that a package had been delivered — and right from my phone, I could see it sitting on the porch. Moments later, my neighbor dropped by to return a piece of mail that had accidentally gone to her — and again, my Doorbell alerted me. But this time, it alerted me that someone (rather than something) was at the door. 

When I opened my door and saw my neighbor standing next to the package, I wondered…how does that little camera understand the world around it? 

For an answer, I turned to Yoni Ben-Meshulam, a Staff Software Engineer who works on the Nest team. 

Before I ask you how the camera knows what’s a person and what’s a vehicle, I first want to get into how it detects anything at all.

Our cameras run something called a perception algorithm which detects objects (people, animals, vehicles, and packages) that show up in the live video stream. For example, if a package is delivered within one of your Activity Zones, like your porch, the camera will track the movement of the delivery person and the package, and analyze all of this to give you a package delivery notification. If you have Familiar Face Alerts on and the camera detects a face, it analyzes the face on-device and checks whether it matches anyone you have identified as a Familiar Face. And the camera recognizes new faces as you identify and label them.

The camera also learns what its surroundings look like. For example, if you have a Nest Cam in your living room, the camera runs an algorithm that can identify where there is likely a TV, so that the camera won’t think the people on the screen are in your home. 

Perception algorithms sound a little like machine learning. Is ML involved in this process?

Yes — Nest cameras actually have multiple machine learning models running inside of them. One is an object detector that takes in video frames and outputs a bounding box around objects of interest, like a package or vehicle. This object detector was trained to solve a problem using millions of examples.
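
To picture what a pipeline like that does with a detector’s raw output, here is a simplified, hypothetical sketch. The Detection structure, confidence threshold and activity-zone check are illustrative stand-ins, not Nest’s actual code or APIs.

```python
# Illustrative sketch only: filtering raw detector output by confidence and
# activity zone. "frame_detections" stands in for the on-device ML model's output.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str    # e.g. "person", "package", "animal", "vehicle"
    score: float  # model confidence, 0.0 - 1.0
    box: tuple    # (x_min, y_min, x_max, y_max) in frame coordinates

def in_zone(box, zone):
    """True if the detection's center falls inside a user-defined activity zone."""
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    return zone[0] <= cx <= zone[2] and zone[1] <= cy <= zone[3]

def interesting_events(detections, zone, min_score=0.6):
    return [d for d in detections if d.score >= min_score and in_zone(d.box, zone)]

# Example: made-up detections for one frame, plus a "porch" zone in pixels.
frame_detections = [
    Detection("package", 0.91, (120, 300, 220, 380)),
    Detection("animal", 0.42, (500, 310, 640, 420)),  # low confidence, dropped
]
porch_zone = (100, 250, 400, 480)
print(interesting_events(frame_detections, porch_zone))
```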

Nest Cam (battery) in the rain

Is there a difference between creating an algorithm for a security camera versus a “regular” camera?

Yes! A security camera is a different domain. Generally, the pictures you take on your phone are closer and the object of interest is better focused. For a Nest camera, the environment is harder to control.

Objects may appear blurry due to lighting, weather or camera positioning. People usually aren’t posing or smiling for a security camera, and sometimes only part of an object, like a person’s arm, is in the frame. And Nest Cams analyze video in real time, versus some photos applications, which may have an entire video to analyze from start to finish. 

Cameras also see the world in 2D but they need to understand it in 3D. That’s why a Nest Cam may occasionally mistake a picture on your T-shirt for a real event. Finally, a lot of what a security camera sees is boring because our doorsteps and backyards are mostly quiet, and there are fewer examples of activity. That means you may occasionally get alerts where nothing actually happened. In order for security cameras to become more accurate, we need to have more high quality data to train the ML models on—and that’s one of the biggest challenges.

Nest Cam vs. camera photo of dog

On the left, an image of a dog from a Nest Cam feed on a Nest Hub. On the right, a photo of a dog taken with a Pixel phone.

So basically…it’s harder to detect people with a security camera than with a handheld camera, like a phone? 

In a word…yes. A model used for Google Image Search or Photos won't perform well on Nest Cameras, because the images used to train it were probably taken on handheld cameras, and those images are mostly centered and well-lit, unlike the ones a Nest Camera has to analyze.

A synthetic room with synthetic cats

Here's an example of a synthesized image, with bounding boxes around synthetic cats.

So, we increased the size and diversity of our datasets that were appropriate for security cameras. Then, we added synthesized data — which ranges from creating a fully simulated world to putting synthetic objects on real backgrounds. With full simulation, we were able to create a game-like world where we could manipulate room layout, object placement, lighting conditions, camera placement, and more to account for the many settings our cameras are installed in. Over the course of this project, we created millions of images — including 2.5 million synthetic cats! 

We also use common-sense rules when developing and tuning our algorithms — for example, heads are attached to people!

Our new cameras and doorbells also have hardware that can make the most of the improved software and they do machine learning on-device, rather than in the cloud, for added privacy. They have a Tensor Processing Unit (TPU) with 170 times more compute than our older devices—a fancy way of saying that the new devices have more accurate, reliable and timely alerts. 

So, does this mean Nest Cam notifications are accurate 100% of the time? 

No — we use machine learning to ensure Nest Cam notifications are very accurate, but the technology isn’t always perfect. Sometimes a camera could mistake a child crawling around on all fours as an animal, a statue may be confused with a real person, and sometimes the camera will miss things. The new devices have a significantly improved ability to catch previously missed events, but improving our models over time is a forever project.

One thing we’re working on is making sure our camera algorithms take data diversity into account across different genders, ages and skin tones with larger, more diverse training datasets. We’ve also built hardware that can accommodate these datasets, and process frames on-device for added privacy. We treat all of this very seriously across Google ML projects, and Nest is committed to the same.

Ask a Techspert: How do you build a chatbot?

Chatbots have become a normal part of daily life, from that helpful customer service pop-up on a website to the voice-controlled system in your home. As a conversational AI engineer at Google, Lee Boonstra knows everything about chatbots. When the pandemic started, many of the conferences she spoke at were canceled, which gave Lee the time to put her knowledge into book form. She started writing while she was pregnant, and now, along with her daughter Rebel, she has this book: The Definitive Guide to Conversational AI With Dialogflow and Google Cloud.


Lee, who lives and works in Amsterdam, is donating the proceeds of her royalties to Stichting Meer dan Gewenst, a nonprofit organization that helps people in the LGBTQ+ community who want to have children. The charity is close to her heart; as an LGBTQ+ parent herself, she wants others like her to have a chance at the joy she feels with her daughter. 


The book itself is for anyone interested in using chatbots, from developers to project managers and CEOs. Here she speaks to The Keyword about the art (and science) behind building a chatbot. 


What exactly is a chatbot?

A chatbot is a piece of software designed to simulate online conversations with people. Many people know chatbots as a chat window that appears when you open a website, but there are more forms — for instance, there are chatbots that answer questions via social media, and the voice of the Google Assistant is a chatbot. Chatbots have been around since the early days of computing, but they’ve only recently become more mainstream. That has everything to do with machine learning and natural language understanding.

Old-school chatbots required you to formulate your sentences carefully. If you said things differently, the chatbot wouldn't know how to answer. If you made a spelling mistake, the bot would run amok! But there are many different ways to say something. A chatbot built with natural language understanding can understand a specific piece of text and then retrieve a specific answer to your question. It doesn't matter if you spell it wrong or say things differently. 


What benefits can the use of chatbots offer companies?
A chatbot works quickly, knows (almost) everything and is available 24/7. That basically makes it the ideal customer service representative. The customer no longer has to wait, the company saves money and the employees experience less stress. As a customer, you get a chatbot on the phone that listens to your question and can answer like a human thanks to speech technology. This way, most customers already receive the answers they need. If the chatbot doesn’t know the answer, it can transfer them to an employee. The customer will not be prompted for information again, as the agent will see that the chat history and system fields are already filled.


Companies are finding more and more ways to use chatbots. For example, since the advent of artificial intelligence, KLM Royal Dutch Airlines has been handling twice as many questions from customers via social media. And technical developer Doop built a Google Assistant Action in the Netherlands in collaboration with AVROTROS, specifically for the Eurovision Song Contest. Anyone who asks for information about the Eurovision Song Contest will hear a chatbot with the voice of presenter Cornald Maas talk about the show. 


How do you build a chatbot?
You can build a chatbot using the Dialogflow tool and other services on the Google Cloud platform. Dialogflow is a tool in your web browser that allows you to build chatbots by entering examples. For example, if you already have a FAQ section on your website, that's a good start. With Dialogflow you can edit the content of that Q&A and then train the chatbot to find answers to questions that customers often ask. Dialogflow learns from all the conversation examples so that it can provide answers.
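
For the curious, here is roughly what querying a Dialogflow agent looks like from Python, based on the google-cloud-dialogflow client library’s documented pattern. The project and session IDs are placeholders, and a real deployment would add authentication setup and error handling.

```python
# A minimal sketch of sending a user's message to a Dialogflow ES agent
# and reading back the matched intent and its fulfillment text.
from google.cloud import dialogflow

def detect_intent(project_id, session_id, text, language_code="en"):
    client = dialogflow.SessionsClient()
    session = client.session_path(project_id, session_id)
    query_input = dialogflow.QueryInput(
        text=dialogflow.TextInput(text=text, language_code=language_code)
    )
    response = client.detect_intent(
        request={"session": session, "query_input": query_input}
    )
    result = response.query_result
    return result.intent.display_name, result.fulfillment_text

# Placeholders: replace with your own GCP project and a per-user session ID.
intent, answer = detect_intent("my-gcp-project", "user-123", "What are your opening hours?")
print(intent, "->", answer)
```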


But just like building a website, you probably need more resources, such as a place to host your chatbot and a database to store your data. You may also want to use additional machine learning models so that your chatbot can do things like detect the content of a PDF or the sentiment of a text. Google Cloud has more than 200 products available for this. It's just like playing with blocks: by stacking all these resources on top of each other, you build a product and you improve the experience, for yourself and for the customer.


Do you have any tips for getting started?
First things first: Start building the chatbot as soon as possible. Many people dread this, because they think it's hugely complex, but it’s better to just get going. You will need to keep track of the conversations and keep an eye on the statistics: What do customers ask, and what do they expect? Building a chatbot is an ongoing project. The longer a chatbot lasts, the more data is collected and the smarter and faster it becomes.


In addition, don't build a chatbot just for one specific channel. What you don't want is to have to build a chatbot for another channel next year and replicate the work. In a large company, teams often want to build a chatbot, but different chat channels are important to different departments. As a company you want to be present on all of those channels, whether that’s the website, social media, telephone or WhatsApp. Build an integrated bot so there’s no duplication of work and maintenance is much easier.


How do chatbots make life easier for people?
Many of the frustrations that you experience with traditional customer services, such as limited opening hours for contact by phone, waiting times and incomprehensible menus, can be removed with chatbots. People do find it important to know whether they are interacting with a human being or a chatbot, but, interestingly, a chatbot is more likely to be forgiven for making a mistake than a human. People might also have a specific preference for human interaction or a chatbot when discussing more sensitive topics like medical or financial issues, either because they want to have personal, human contact or they would rather not discuss a topic with a human being because they don’t feel comfortable doing so. Chatbots are getting better and better at understanding and interacting, and can be very helpful for interactions about these topics as well. 

Ask a Techspert: What is open source?

When I started working at Google, a colleague mentioned that the group projects I worked on in college sounded a lot like some of the open source projects we do here at Google. I thought there had to be some misunderstanding since my projects all happened in-person with my classmates in the corner of some building in the engineering quad. 

To find out how a real life study group could be like a type of computer software, I went straight to Rebecca Stambler, one of Google’s many open source experts.


Explain your job to me like I’m a first-grader.

Well, to start, computer programs have to be written in a language that computers understand — not in English or any other spoken language. At Google we have our own language called Go. When we write in a language to tell a computer what to do, that’s called source code. Just like you can write an essay or a letter in a Google Doc, you have to write your code in an “editor.” I work on making these editors work well for people who write code in Google’s programming language, Go. 


What does it mean for software to be open source?

A piece of software is considered open source if its source code is made publicly available to anyone, meaning they can freely copy, modify and redistribute the code. Usually, companies want to keep the source code of their products secret, so people can’t copy and reproduce their products. But sometimes a company shares their code publicly so anyone can contribute. This makes software more accessible and builds a community around a project. Anyone can work on an open source project no matter who they are or where they are. 


Anyone can contribute? How do they do it?

Before you actually write open source code, a good first step is thinking about what area you’re interested in, whether that’s web development, systems or front-end development. Then you can dive into that community by doing things like attending talks or joining online networks, where you can often learn more about what open source projects are out there. Also think about what topics matter to you — maybe it’s the environment, retail, banking or a specific type of web development. Some people write code just because they enjoy it; plenty of these people have contributed code to Google open source projects. So if you’re looking to contribute, make sure it’s something you’re really interested in.

Abstract illustration of three people putting together code.

Many open source projects are hosted on a site called GitHub, so once you narrow down your area of interest, that’s a great place to start! Once you’ve found something you want to work on, the easiest way to get involved is to fix errors in the code raised by other members of the project who don’t have the time to fix them. Even if you don’t know how to code, there’s a lot of non-technical work in open source projects, like prioritizing issues that need fixing, community organization or writing user guides. You just have to be passionate about the work and ready to jump in.


What’s the benefit of using open source code to create something?

We need lots of diverse perspectives to build good software, and open source helps with that. If you’re building something with a small team of three people, you might not consider all of the different ways someone might use your product. Or maybe your team doesn’t have the best equipment. Open source enables people from all over the world with different use cases, computers and experiences to chime in and say “hey, this doesn’t actually work for me” or “running this software drains my battery.” Without having open source projects, I don’t think we could make products that work for everyone. 

Projects like Android, which is Google’s operating system for mobile devices, are open source. And just last year Google Kubernetes Engine celebrated its five-year anniversary. This was really exciting because it showed how Google engineers contribute to the broader open source community outside of Google. Open source projects build a real sense of community among contributors. When people work on a lot of our projects, we send them thank-you notes and mention them when we release new software versions. We’ve created a whole community of contributors who’ve made our products more successful and exciting.

Ask a Techspert: What’s a neural network?

Back in the day, there was a surefire way to tell humans and computers apart: You’d present a picture of a four-legged friend and ask if it was a cat or dog. A computer couldn’t identify felines from canines, but we humans could answer with doggone confidence. 

That all changed about a decade ago thanks to leaps in computer vision and machine learning – specifically, major advancements in neural networks, which can train computers to learn in a way similar to humans. Today, if you give a computer enough images of cats and dogs and label which is which, it can learn to tell them apart purr-fectly.

But how exactly do neural networks help computers do this? And what else can — or can’t — they do? To answer these questions and more, I sat down with Google Research’s Maithra Raghu, a research scientist who spends her days helping computer scientists better understand neural networks. Her research helped the Google Health team discover new ways to apply deep learning to assist doctors and their patients.

So, the big question: What’s a neural network?

To understand neural networks, we need to first go back to the basics and understand how they fit into the bigger picture of artificial intelligence (AI). Imagine a Russian nesting doll, Maithra explains. AI would be the largest doll, then within that, there’s machine learning (ML), and within that, neural networks (... and within that, deep neural networks, but we’ll get there soon!).

If you think of AI as the science of making things smart, ML is the subfield of AI focused on making computers smarter by teaching them to learn, instead of hard-coding them. Within that, neural networks are an advanced technique for ML, where you teach computers to learn with algorithms that take inspiration from the human brain.

Your brain fires off groups of neurons that communicate with each other. In an artificial neural network (the computer type), a “neuron” (which you can think of as a computational unit) is grouped with a bunch of other “neurons” into a layer, and those layers stack on top of each other. Between each of those layers are connections. The more layers a neural network has, the “deeper” it is. That’s where the idea of “deep learning” comes from. “Neural networks depart from neuroscience because you have a mathematical element to it,” Maithra explains. “Connections between neurons are numerical values represented by matrices, and training the neural network uses gradient-based algorithms.”
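
If you like seeing the math, here is a toy, self-contained NumPy version of that picture: layers as matrices, learning as gradient-based updates. It is a teaching sketch with made-up data, not how any production neural network is built or trained.

```python
# Toy two-layer neural network: layers are matrices, training nudges them
# with gradients. Data and labels are made up purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))          # 8 example "images", 4 features each
y = (X[:, 0] > 0).astype(float)      # made-up labels: 1 = dog, 0 = cat

W1, b1 = rng.normal(size=(4, 5)), np.zeros(5)  # layer 1 (5 "neurons")
W2, b2 = rng.normal(size=(5, 1)), np.zeros(1)  # layer 2 (output)

def forward(X):
    h = np.tanh(X @ W1 + b1)                 # hidden-layer activations
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))     # probability of "dog"
    return h, p.ravel()

for step in range(200):                      # gradient-based training loop
    h, p = forward(X)
    grad_out = (p - y)[:, None] / len(y)     # gradient of cross-entropy loss
    grad_h = grad_out @ W2.T * (1 - h**2)    # backpropagate through tanh
    W2 -= 0.5 * h.T @ grad_out               # nudge each matrix downhill
    b2 -= 0.5 * grad_out.sum(axis=0)
    W1 -= 0.5 * X.T @ grad_h
    b1 -= 0.5 * grad_h.sum(axis=0)

print(np.round(forward(X)[1]), y)            # predictions vs. labels after training
```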

This might seem complex, but you probably interact with neural networks fairly often — like when you’re scrolling through personalized movie recommendations or chatting with a customer service bot.

So once you’ve set up a neural network, is it ready to go?

Not quite. The next step is training. That’s where the model becomes much more sophisticated. Similar to people, neural networks learn from feedback. If you go back to the cat and dog example, your neural network would look at pictures and start by randomly guessing. You’d label the training data (for example, telling the computer if each picture features a cat or dog), and those labels would provide feedback, telling the neural network when it’s right or wrong. Throughout this process, the neural network’s parameters adjust, and it gradually goes from not knowing anything to being able to tell cats and dogs apart.

Why don’t we use neural networks all the time?

“Though neural networks are based on our brains, the way they learn is actually very different from humans,” Maithra says. “Neural networks are usually quite specialized and narrow. This can be useful because, for example, it means a neural network might be able to process medical scans much quicker than a doctor, or spot patterns a trained expert might not even notice.”

But because neural networks learn differently from people, there's still a lot that computer scientists don’t know about how they work. Let’s go back to cats versus dogs: If your neural network gives you all the right answers, you might think it’s behaving as intended. But Maithra cautions that neural networks can work in mysterious ways.

“Perhaps your neural network isn’t able to identify between cats and dogs at all – maybe it’s only able to identify between sofas and grass, and all of your pictures of cats happen to be on couches, and all your pictures of dogs are in parks,” she says. “Then, it might seem like it knows the difference when it actually doesn’t.”

That’s why Maithra and other researchers are diving into the internals of neural networks, going deep into their layers and connections, to better understand them – and come up with ways to make them more helpful.

“Neural networks have been transformative for so many industries,” Maithra says, “and I’m excited that we’re going to realize even more profound applications for them moving forward.”