Tag Archives: Pixel

Making sure everyone feels “Seen on Pixel”

The Super Bowl has always been a special moment for Google. From our first Super Bowl ad in 2010, “Parisian Love,” to our 2020 spot “Loretta,” we try to shine a light on the challenges we’re focused on solving with our technology and tell the stories of real people impacted by our products.

And today, we’re continuing this legacy with our latest Super Bowl ad, “Seen on Pixel,” which tells the story of Real Tone, Google’s years-long effort to ensure all our camera and imaging products accurately represent all skin tones.

For too long, camera technology, including our own, has failed people of color by making them look washed out or unnaturally bright or dark. Because everyone deserves to be seen as they truly are, we are committed to addressing this gap. Internally, Googlers of color volunteered to test the camera on Pixel 6 before we launched it and provided input on what was working and what could be better. Externally, we partnered with image experts who spent months with our engineers, testing the camera and providing detailed and thoughtful feedback that helped improve our camera and editing products, including adding significantly more portraits of people of color to the image datasets that train our camera models. This collective teamwork allowed us to launch what we call Real Tone, with Pixel 6 as our first camera to feature these improvements.

Since the launch of Real Tone on Google Pixel 6 and Pixel 6 Pro last October, we have seen the difference camera representation can make. “Seen on Pixel” brings to life what Real Tone represents. It is a montage of beautiful photography of individuals and families from all walks of life, all photographed on Pixel 6 by our director Joshua Kissi and contributing photographers Deun Ivory and Aundre Larrow. We partnered with award-winning artist Lizzo, who truly embodies the spirit of our campaign by always being her authentic self, unapologetically. Her powerful vocals as the soundtrack bring “Seen on Pixel” to life with a preview of her new song, “If You Love Me.”

Representation and equity in everything should always be the norm and the default. And until we reach it, our goal at Google will always be to make gains in the world every day through our products and storytelling.

Accurate Alpha Matting for Portrait Mode Selfies on Pixel 6

Image matting is the process of extracting a precise alpha matte that separates foreground and background objects in an image. This technique has been traditionally used in the filmmaking and photography industry for image and video editing purposes, e.g., background replacement, synthetic bokeh and other visual effects. Image matting assumes that an image is a composite of foreground and background images, and hence, the intensity of each pixel is a linear combination of the foreground and the background.
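For intuition, that linear combination is the standard compositing equation, I = alpha * F + (1 - alpha) * B, where alpha is the per-pixel opacity of the foreground. A minimal NumPy sketch (array shapes are illustrative, not tied to any particular product):

```python
import numpy as np

def composite(foreground, background, alpha):
    """Per-pixel alpha compositing: I = alpha * F + (1 - alpha) * B."""
    # foreground, background: H x W x 3 float arrays in [0, 1]
    # alpha: H x W x 1 float array in [0, 1] (1 = fully opaque foreground)
    return alpha * foreground + (1.0 - alpha) * background
```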

In the case of traditional image segmentation, the image is segmented in a binary manner, in which a pixel either belongs to the foreground or background. This type of segmentation, however, is unable to deal with natural scenes that contain fine details, e.g., hair and fur, which require estimating a transparency value for each pixel of the foreground object.

Alpha mattes, unlike segmentation masks, are usually extremely precise, preserving strand-level hair details and accurate foreground boundaries. While recent deep learning techniques have shown their potential in image matting, many challenges remain, such as generating accurate ground truth alpha mattes, improving generalization on in-the-wild images and performing inference on high-resolution images on mobile devices.

With the Pixel 6, we have significantly improved the appearance of selfies taken in Portrait Mode by introducing a new approach to estimate a high-resolution and accurate alpha matte from a selfie image. When synthesizing the depth-of-field effect, the usage of the alpha matte allows us to extract a more accurate silhouette of the photographed subject and have a better foreground-background separation. This allows users with a wide variety of hairstyles to take great-looking Portrait Mode shots using the selfie camera. In this post, we describe the technology we used to achieve this improvement and discuss how we tackled the challenges mentioned above.

Portrait Mode effect on a selfie shot using a low-resolution and coarse alpha matte compared to using the new high-quality alpha matte.

Portrait Matting
In designing Portrait Matting, we trained a fully convolutional neural network consisting of a sequence of encoder-decoder blocks to progressively estimate a high-quality alpha matte. We concatenate the input RGB image together with a coarse alpha matte (generated using a low-resolution person segmenter) that is passed as an input to the network. The new Portrait Matting model uses a MobileNetV3 backbone and a shallow (i.e., having a low number of layers) decoder, operating on a low-resolution image, to first predict a refined low-resolution alpha matte. Then we use a shallow encoder-decoder and a series of residual blocks to process a high-resolution image and the refined alpha matte from the previous step. The shallow encoder-decoder relies more on lower-level features than the previous MobileNetV3 backbone, focusing on high-resolution structural features to predict final transparency values for each pixel. In this way, the model is able to refine an initial foreground alpha matte and accurately extract very fine details like hair strands. The proposed neural network architecture runs efficiently on Pixel 6 using TensorFlow Lite.

The network predicts a high-quality alpha matte from a color image and an initial coarse alpha matte. We use a MobileNetV3 backbone and a shallow decoder to first predict a refined low-resolution alpha matte. Then we use a shallow encoder-decoder and a series of residual blocks to further refine the initially estimated alpha matte.
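To make the coarse-to-fine structure concrete, here is a minimal Keras sketch. It is not Google's production Portrait Matting model: the resolutions, channel counts and layer counts are assumptions chosen only to illustrate the two-stage refinement (MobileNetV3 backbone plus shallow decoder at low resolution, then a shallow encoder-decoder with residual blocks at high resolution).

```python
import tensorflow as tf
from tensorflow.keras import layers

LOW_RES, HIGH_RES = 256, 1024  # assumed working resolutions (illustrative)

def residual_block(x, filters):
    """3x3 residual block used to refine high-resolution structure."""
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    return layers.ReLU()(layers.Add()([x, y]))

# Stage 1: MobileNetV3 backbone + shallow decoder on a low-resolution input.
# The input is the RGB image concatenated with the coarse alpha matte (4 channels).
low_in = layers.Input((LOW_RES, LOW_RES, 4), name="rgb_plus_coarse_alpha")
backbone = tf.keras.applications.MobileNetV3Small(
    input_shape=(LOW_RES, LOW_RES, 4), include_top=False, weights=None)
x = backbone(low_in)                              # stride-32 features (8x8)
for filters in (128, 64, 32, 16, 8):              # shallow decoder
    x = layers.UpSampling2D(2, interpolation="bilinear")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
low_alpha = layers.Conv2D(1, 3, padding="same", activation="sigmoid",
                          name="refined_low_res_alpha")(x)

# Stage 2: shallow encoder-decoder + residual blocks on the high-resolution image,
# refining the upsampled stage-1 alpha with low-level structural features.
high_in = layers.Input((HIGH_RES, HIGH_RES, 3), name="high_res_rgb")
up_alpha = layers.UpSampling2D(HIGH_RES // LOW_RES, interpolation="bilinear")(low_alpha)
h = layers.Concatenate()([high_in, up_alpha])
h = layers.Conv2D(16, 3, strides=2, padding="same", activation="relu")(h)
h = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(h)
for _ in range(3):
    h = residual_block(h, 32)
h = layers.UpSampling2D(4, interpolation="bilinear")(h)
final_alpha = layers.Conv2D(1, 3, padding="same", activation="sigmoid",
                            name="high_res_alpha")(h)

matting_model = tf.keras.Model([low_in, high_in], [low_alpha, final_alpha])
```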

Most recent deep learning work for image matting relies on manually annotated per-pixel alpha mattes used to separate the foreground from the background that are generated with image editing tools or green screens. This process is tedious and does not scale for the generation of large datasets. Also, it often produces inaccurate alpha mattes and foreground images that are contaminated (e.g., by reflected light from the background, or “green spill”). Moreover, this does nothing to ensure that the lighting on the subject appears consistent with the lighting in the new background environment.

To address these challenges, Portrait Matting is trained using a high-quality dataset generated using a custom volumetric capture system, Light Stage. Compared with previous datasets, this is more realistic, as relighting allows the illumination of the foreground subject to match the background. Additionally, we supervise the training of the model using pseudo–ground truth alpha mattes from in-the-wild images to improve model generalization, explained below. This ground truth data generation process is one of the key components of this work.

Ground Truth Data Generation
To generate accurate ground truth data, Light Stage produces near-photorealistic models of people using a geodesic sphere outfitted with 331 custom color LED lights, an array of high-resolution cameras, and a set of custom high-resolution depth sensors. Together with Light Stage data, we compute accurate alpha mattes using time-multiplexed lights and a previously recorded “clean plate”. This technique is also known as ratio matting.

This method works by recording an image of the subject silhouetted against an illuminated background as one of the lighting conditions. In addition, we capture a clean plate of the illuminated background. The silhouetted image, divided by the clean plate image, provides a ground truth alpha matte.
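In code, this division is essentially a one-liner. The sketch below is only illustrative: depending on the matte convention, either the ratio itself (background transmission) or its complement (foreground opacity) is kept as alpha, and the epsilon and clamping are assumptions added to keep the example numerically safe rather than details of the production pipeline.

```python
import numpy as np

def ratio_matte(silhouette, clean_plate, eps=1e-6):
    # silhouette: the subject photographed against the illuminated background
    # clean_plate: the same illuminated background with no subject present
    ratio = np.clip(silhouette / (clean_plate + eps), 0.0, 1.0)
    # ratio measures how much background light reaches the camera at each
    # pixel; its complement is the foreground opacity (alpha).
    return 1.0 - ratio
```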

Then, we extrapolate the recorded alpha mattes to all the camera viewpoints in Light Stage using a deep learning–based matting network that leverages captured clean plates as an input. This approach allows us to extend the alpha mattes computation to unconstrained backgrounds without the need for specialized time-multiplexed lighting or a clean background. This deep learning architecture was solely trained using ground truth mattes generated using the ratio matting approach.

Computed alpha mattes from all camera viewpoints at the Light Stage.

Leveraging the reflectance field for each subject and the alpha matte generated with our ground truth matte generation system, we can relight each portrait using a given HDR lighting environment. We composite these relit subjects into backgrounds corresponding to the target illumination following the alpha blending equation. The background images are then generated from the HDR panoramas by positioning a virtual camera at the center and ray-tracing into the panorama from the camera’s center of projection. We ensure that the projected view into the panorama matches its orientation as used for relighting. We use virtual cameras with different focal lengths to simulate the different fields-of-view of consumer cameras. This pipeline produces realistic composites by handling matting, relighting, and compositing in one system, which we then use to train the Portrait Matting model.
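The sketch below shows one simple way to realize the background-generation step just described: a virtual pinhole camera at the panorama center, with each camera ray converted to spherical coordinates and looked up in an equirectangular HDR panorama. The focal length, yaw parameter and nearest-neighbor lookup are simplifying assumptions rather than the production ray tracer; the rendered background would then be alpha-blended with the relit subject using the compositing equation shown earlier.

```python
import numpy as np

def render_background(panorama, width, height, focal_px, yaw=0.0):
    """Sample an equirectangular panorama (H x W x 3) with a virtual
    pinhole camera placed at the panorama center."""
    pano_h, pano_w, _ = panorama.shape
    # Build one unit-length ray per output pixel.
    xs = np.arange(width) - width / 2.0 + 0.5
    ys = np.arange(height) - height / 2.0 + 0.5
    x, y = np.meshgrid(xs, ys)
    z = np.full_like(x, float(focal_px))
    norm = np.sqrt(x**2 + y**2 + z**2)
    dx, dy, dz = x / norm, y / norm, z / norm
    # Ray direction -> spherical coordinates -> panorama pixel coordinates.
    lon = np.arctan2(dx, dz) + yaw              # azimuth (camera yaw)
    lat = np.arcsin(dy)                         # elevation
    u = ((lon / (2 * np.pi) + 0.5) * pano_w).astype(int) % pano_w
    v = np.clip(((lat / np.pi + 0.5) * pano_h).astype(int), 0, pano_h - 1)
    return panorama[v, u]

# Different focal_px values simulate different consumer-camera fields of view,
# e.g. (hypothetical numbers):
# background = render_background(hdr_panorama, 1024, 768, focal_px=900)
```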

Composited images on different backgrounds (high-resolution HDR maps) using ground truth generated alpha mattes.

Training Supervision Using In-the-Wild Portraits
To bridge the gap between portraits generated using Light Stage and in-the-wild portraits, we created a pipeline to automatically annotate in-the-wild photos generating pseudo–ground truth alpha mattes. For this purpose, we leveraged the Deep Matting model proposed in Total Relighting to create an ensemble of models that computes multiple high-resolution alpha mattes from in-the-wild images. We ran this pipeline on an extensive dataset of portrait photos captured in-house using Pixel phones. Additionally, during this process we performed test-time augmentation by doing inference on input images at different scales and rotations, and finally aggregating per-pixel alpha values across all estimated alpha mattes.
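A simplified sketch of that aggregation step is shown below. The model interface, scales and rotations are assumptions made for illustration; the real pipeline uses the ensemble of Deep Matting models described in Total Relighting.

```python
import numpy as np
import tensorflow as tf

def pseudo_ground_truth(image, models, scales=(0.5, 1.0, 1.5), rotations=(0, 1, 2, 3)):
    """Aggregate alpha mattes from an ensemble of matting models with
    test-time augmentation (multiple scales and 90-degree rotations).

    image: H x W x 3 float array in [0, 1].
    models: list of callables mapping an RGB batch [1, h, w, 3] to an
            alpha batch [1, h, w, 1] (a hypothetical interface).
    """
    h, w = image.shape[:2]
    estimates = []
    for model in models:
        for s in scales:
            for k in rotations:
                x = tf.image.resize(image[None], (int(h * s), int(w * s)))
                x = tf.image.rot90(x, k)
                alpha = model(x)
                # Undo the augmentation so all estimates are pixel-aligned.
                alpha = tf.image.rot90(alpha, (4 - k) % 4)
                alpha = tf.image.resize(alpha, (h, w))
                estimates.append(alpha[0, ..., 0].numpy())
    # Per-pixel aggregation across all estimates (here, a simple mean).
    return np.mean(estimates, axis=0)
```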

Generated alpha mattes are visually evaluated with respect to the input RGB image. The alpha mattes that are perceptually correct, i.e., following the subject's silhouette and fine details (e.g., hair), are added to the training set. During training, both datasets are sampled using different weights. Using the proposed supervision strategy exposes the model to a larger variety of scenes and human poses, improving its predictions on photos in the wild (model generalization).

Estimated pseudo–ground truth alpha mattes using an ensemble of Deep Matting models and test-time augmentation.

Portrait Mode Selfies
The Portrait Mode effect is particularly sensitive to errors around the subject boundary (see image below). For example, errors caused by the use of a coarse alpha matte can leave background regions near the subject boundary or hair area in sharp focus. Using a high-quality alpha matte allows us to extract a more accurate silhouette of the photographed subject and improve foreground-background separation.

Try It Out Yourself
We have made front-facing camera Portrait Mode on the Pixel 6 better by improving alpha matte quality, which results in fewer errors in the final rendered image and a better-looking blurred background around the hair region and subject boundary. Additionally, our ML model uses diverse training datasets that cover a wide variety of skin tones and hair styles. You can try this improved version of Portrait Mode by taking a selfie shot with the new Pixel 6 phones.

Portrait Mode effect on a selfie shot using a coarse alpha matte compared to using the new high quality alpha matte.

Acknowledgments
This work wouldn’t have been possible without Sergio Orts Escolano, Jana Ehmann, Sean Fanello, Christoph Rhemann, Junlan Yang, Andy Hsu, Hossam Isack, Rohit Pandey, David Aguilar, Yi Jinn, Christian Hane, Jay Busch, Cynthia Herrera, Matt Whalen, Philip Davidson, Jonathan Taylor, Peter Lincoln, Geoff Harvey, Nisha Masharani, Alexander Schiffhauer, Chloe LeGendre, Paul Debevec, Sofien Bouaziz, Adarsh Kowdle, Thabo Beeler, Chia-Kai Liang and Shahram Izadi. Special thanks to our photographers James Adamson, Christopher Farro and Cort Muller who took numerous test photographs for us.


Source: Google AI Blog


So you got new gear for the holidays. Now what?

The new year is here, and the holidays are (officially) over. If you were gifted a new Google gadget, that means it’s time to get your new gear out of the box and into your home or pocket.

We talked to the experts here at Google and asked for a few of their quick setup tips, so you can get straight to using your new…whatever you got...right away.

So you got a Pixel 6 Pro…

  1. Begin by setting up fingerprint unlock for quick and easy access.
  2. Prepare for future emergencies and turn on the extreme battery saver feature in the settings app. Extreme battery saver can extend your Pixel 6 Pro’s battery life by intelligently pausing apps and slowing processes, and you can preselect when you want to enable the feature — and what your priority apps are.
  3. Create a personal aesthetic with Material You, and express character by customizing wallpaper and interface designs that will give your Pixel 6 Pro’s display a more uniform look.

So you got a Nest Hub Max…

  1. First, set up Face Match to ensure your Nest Hub Max can quickly identify you as the user and share a more personal experience. Then, when you walk up to the device it can do things like present your daily schedule, play your favorite playlist or suggest recommended videos, news and podcasts.
  2. Set up a Duo account for video calling and messaging with your friends and family. From there, you can ask Nest Hub Max to call anyone in your Google contacts who has Duo — just say, “Hey Google, call (your contact name).” For family members or friends who don't already have Duo, the app is free and available for download on both Android and iOS.
  3. Be sure to connect your Nest Hub Max to any other Google gear, such as the Chromecast and Nest Mini for a smart home experience.
The Nest Hub Max in front of a white background.

The Nest Hub Max.

So you got the new Nest Thermostat…

  1. Use Quick Schedule to easily and quickly get your thermostat programmed. You can go with its recommended presets or adjust the settings further to create a custom schedule. You can make changes to your schedule anytime from the Home app.
  2. Then you can opt in to Home & Away Routines, which can help you avoid heating or cooling an empty house by using motion sensing and your phone’s location to know when nobody’s home and adjust the temperature accordingly to save energy.
  3. Make sure you’ve enabled notifications so that Savings Finder can proactively suggest small tweaks to your schedule, which you can accept from the Home app. For example, it might suggest a small change to your sleep temperature to save you energy.

So you got the new Pixel Buds A-Series…

  1. Check out the Pixel Buds A-Series’ latest feature, the bass customization option, to find your perfect sound. This addition doubles the bass range when connected to an Android 6.0 or newer device, and can be adjusted on a scale from -1 to +4 using the Pixel Buds app.
  2. Here’s a hardware tip: Try out the three different ear tip fit options to find the most comfortable fit for you.
  3. Start listening to your favorite podcasts and music right away by using Fast Pair to immediately connect your Pixel Buds to your phone.

Creator Labs artists take on the Pixel 6

“As humans we are constantly trying to understand ourselves … this is a universal experience, both socially and culturally. I find myself currently in a state of asking questions relating to my own sense of self.” This was what photographer MaryV was thinking while she was working on her latest project with Creator Labs.

Photograph of a woman sitting on a white horse. There is a pink and purple sunset in the background.

Photography by MaryV

Following the launch of Google Pixel 6 Pro in October, MaryV and 12 other lens-based artists were tasked with exploring the idea of “For All You Are,” a prompt referencing why we started the Creator Labs program: We want to give artists the tools to tell their own stories, in their own unique voices.

This year, Creator Labs artists were also able to use Real Tone on Google Pixel 6, a multi-year mission to make best-in-class smartphone cameras that photograph skin more equitably. As part of this initiative, the Pixel team made a suite of improvements and changes to how Pixel’s camera and supporting imaging products work to highlight the nuances of different skin tones beautifully and authentically.

One theme we saw multiple artists focus on was “ancestry,” both from the perspective of honoring traditions and redefining what constitutes family. Anthony Prince Leslie reimagined African folklore with his piece “Spyda,” which, in his words, showcases “the resilience of the Black diaspora and the importance of storytelling as a method of preserving history.” Texas Isaiah paid homage to his childhood home in East New York, Brooklyn. As the first generation of his family born in the U.S., he never spent time with his extended family. So as a child, his home was filled with native Canadian and South American photographs, souvenirs and other materials his family had collected over more than 30 years.

Myles Loftin challenged the “traditional” family structures with his piece by documenting and honoring his chosen family in New York City. This is an extension of a larger body of work called “In The Life” which centers on Black Queer life.

Photograph of three people standing together, hugging and leaning into one another. There is blue sky and white cloud in the background.

Photography by Myles Loftin

While each artist’s work is unique, they all invite us to reflect and be vulnerable.

Other Creator Labs artists include Mayan Toledano, Pegah Farahmand, Kennedi Carter, Aidan Cullen, Andre Wagner, Tim Kellner, Natalia Mantini, Josh Goldenberg (glassface) and June Canedo. You can see examples of their work and more from the artists above on the Pixel Instagram page.

Snap faster, hear better and do more with your Pixel

One of the sweet things about being a Pixel user is that your phone continues to get a boost of helpfulness with Feature Drops. Whether you want to quickly tap to access Snapchat from your Pixel lock screen or control the bass levels on your Pixel Buds A-Series, we’ve got an update you’ll love.

This latest Feature Drop will roll out to users over the next few weeks, starting today with relevant updates coming to Pixel 3a through Pixel 5a (5G) devices - see g.co/pixel/updates for details. Pixel 6 and Pixel 6 Pro devices will begin receiving their updates next week.

Snapchat, digital car key and ultra-wideband help Pixel do more

You can already customize the actions your Pixel takes when you use Quick Tap, from taking a screenshot to playing music. With Quick Tap to Snap, you can access Snapchat directly from your lock screen, making Pixel the fastest phone to make a Snap. Quick Tap to Snap is available on Pixel 4a (5G) and newer Pixel phones. Plus, starting this month, you’ll be able to add a new Pixel-exclusive Lens – Pixel Face – to your Snaps. Look out for more Pixel-exclusive Lenses in future Feature Drops.

Image showing  Quick Tap to Snap on Pixel 6 and Pixel 6 Pro. Two people, an adult and a child, looking into the camera. One is smiling and the other is making a silly face.

As you saw from our friends at Android, we’ve partnered with BMW to enable digital car key for Pixel 6 and Pixel 6 Pro. On select 2020-2022 BMW models in certain countries, you can now unlock and lock your car by tapping your phone on the door handle, and you can start your car by placing your Pixel on the interior key reader and pressing the engine start button.

And ultra-wideband is now enabled on Pixel 6 Pro. This technology improves Nearby Share so you can quickly and securely send files, videos, map locations and more to other ultra-wideband devices nearby.

Personalize your devices

Conversation mode, an early-stage accessibility feature in the Sound Amplifier app, is now available in beta first on Pixel. This feature uses on-device machine learning to help anyone better hear conversations in loud environments by tuning into their conversation partner and tuning out competing noise. While Google Research continues to work on conversation mode, you can get a sneak peek as an early tester and help make it better for everyone.

Animated GIF showing how Sound Amplifier works. A person's face is centered in a circle in the middle of the phone and while they speak, abstract sound icons illustrate the app amplifying their words.

Have you ever heard a catchy new track, but have no idea what it is? We’ve updated the Now Playing experience on Pixel to help you find your next favorite song. As always, Now Playing's automatic recognition is done entirely on-device. If Now Playing hasn’t automatically identified a song playing nearby, tap the new search button to have Pixel find it for you (available on Pixel 4 or newer Pixel phones). And if you’re really digging it, tap the music note icon next to the identified track on your lock screen to save it as a favorite.

Animated GIF showing how Now Playing recognizes songs that are playing nearby on a Pixel phone.

On-screen experience is simulated for illustrative purposes. Now Playing may not recognize every song.

Speaking of music: We’re also introducing improved bass-level control for the Pixel Buds A-Series. With any Android 6.0+ device, you can now open the Pixel Buds app and use a slider to adjust bass from -1 to +4, giving you twice the bass range you currently have.

We've also added to our wallpapers. In celebration of International Day of Persons with Disabilities, we collaborated with Dana Kearly, a disabled multidisciplinary artist from Vancouver B.C., to create three beautiful new wallpapers for the Curated Culture collection.

Image showing a Wallpaper by Dana Kearly on a Pixel phone lock screen. It has cartoon flowers standing up on grass with an abstract pink, yellow, purple and orange background behind them.

Wallpaper by Dana Kearly.

Car crash detection and Recorder

Car crash detection is now supported in Taiwan, Italy and France, in addition to Spain, Ireland, Japan, the U.K., Australia, Singapore and the U.S. When car crash detection is turned on in the Personal Safety app, your Pixel 3, Pixel 4 or newer Pixel phone can help detect if you’ve been in a severe car accident. If a crash is detected, your phone will check in with you to see if you’re OK. If there’s no response, Pixel can share your location and other relevant details with emergency responders. (This feature is dependent upon network connectivity and other factors and may not be reliable for emergency communications or available in all areas.)

And while car crash detection is expanding to new countries, we’re also enabling new languages for transcription in the Recorder app. These include Japanese, French and German on Pixel 3 and newer Pixel phones.

If you want to learn more about these updates visit our Pixel forum. Otherwise, that’s all for now — until our next Feature Drop!

Winter is coming: 9 ways to enjoy it with Google

As a native Oregonian, I thought living in California would be an incredible break from the nine months of rain I’d endured growing up. What I didn’t realize was that 70-degree winters felt…wrong. Where were the mittens? The down jackets? The occasional snowy days? I’ve since moved back to the Pacific Northwest, and I’ve had a renewed appreciation for winter weather.

In fact, I enjoy the chilly months of the year so much, I’ve put together a few ways to make the most of the cold weather.

  1. I love snowshoeing, and I always want to find new trails. I use Google Maps to look for mountain biking and hiking trails that are covered in snow in the winter. (Just look for the hiking icons, or the light dash lines that indicate trails.) If I come across a good one, I label it on Maps so I know how to get back.
Animated GIF showing trails on Google Maps and how you can select and label them; this one is being saved to a list called “trails.”

2. I’m a year-round runner, but once the temperature dips below 50 Fahrenheit and the roads get wet or icy, I need new gear — all of which I can find in one place using Google Shopping. You can select the Sports & Outdoors tab to browse — and turn on the deals filter for discounts.

3. And when I’m returning from a chilly run, I can use the Google Home app to turn on my Nest Thermostat before I get home, so I know I’m not wasting energy while I’m out and the house will be toasty when I come in. I also use Home & Away Routines so that Nest knows when I’m out and can adjust my temperature automatically.

4. OK fine, there’s one downside of winter weather, and that’s how early it gets dark. I use Google Assistant to notify me an hour before sunset so I can get outside for some sunshine before the sun goes down.

5. We’ve started cutting down our own Christmas tree, which is actually pretty easy to do. A quick Google Search for cutting down a tree on federal land will help you find a map (and information on how to purchase a permit). Then you can just use Google Maps to take you to the right area.

6. If I’m feeling really adventurous and ready to hit the slopes, I’ll check out the Explore tool on google.com/travel. I can set my home as the point of origin and then select “skiing” under the Interests filter and see what ski towns I can visit.

Animated GIF showing the United States on Google Maps. The arrow selects the “interests” tab and then “skiing” to surface ski towns in different parts of the country.

7. I love a good Google Alert to stay up to date on what’s going on locally. Once November rolls around, I set one for “Oregon winter festivals.”

8. Pixel cameras take incredible photos in dimly lit areas, so using Night Sight for shots of light displays or snowy nights is a no-brainer. And if you’ve already snagged a Pixel 6 or Pixel 6 Pro, those photos will look even better: The new Pixel camera lets in 2.5 times as much light as the Pixel 5, and you can try out the new Motion Mode setting to capture an artsy falling snow pic.

9. Most winter nights, I make a real fire — but when I don’t feel like hauling in wood, there’s always a YouTube version, complete with crackle.

10 ways Google Assistant can help you during the holidays

As fun as the holidays can be, they’re also filled with lots of to-do lists, preparation and planning. Before the hustle and bustle of the season begins, we wanted to share a few ways you can use Google Assistant to stay on top of things and do what matters most — spending quality time with family and friends.

  1. Get together over a good meal made easy with hands-free help in the kitchen. Surprise your family and friends with a new dish or dessert or find inspiration by saying, “Hey Google, find me Thanksgiving recipes.”
  2. …And if you happen to come across a few new favorites, tap on that recipe and ask your Assistant to save it for you by saying “Hey Google, add to my cookbook.” Then when it comes time for a holiday feast, all your recent recipes will be waiting for you on your Smart Display and will show up when you say “Hey Google, show me my cookbook.” Once you've gathered your ingredients, select the recipe you want to cook and say “Hey Google, start cooking” to get step-by-step instructions on your Smart Display.
Illustration of a Smart Display with a recipe on the screen. There is also a photo of a warm drink with whipped cream on the screen.

3. When the food is prepared and the table is set, let everyone know dinner is ready with Broadcast. Just say, “Hey Google, broadcast ‘dinner is ready.’”

4. How early is too early for festive music? The limit does not exist! And even if you don’t have something queued up, you can just say “Hey Google, play Christmas music.”

5. Want to avoid scrolling endlessly for gifts? Android users can use Assistant to browse shopping apps like Walmart with just their voice. If you have the Walmart app installed on your Android phone, try saying “Hey Google, search Walmart for bicycles.”

6. Avoid spending hours waiting on hold when you call to adjust travel plans or return a gift. Pixel users can take advantage of Hold For Me, where Google Assistant will wait on the line for you and let you know when a real person is ready to take your call.

7. Connect and feel close from anywhere with video calling. Make a group call with Duo supporting up to 32 people on your Nest Hub Max — or send a “happy holidays!” message using one of the fun AR effects on mobile devices. To start a Duo call, just say, “Hey Google, make a video call.”

8. Keep your family’s busy holiday schedule on track with Family Bell from Google. Say “Hey Google, set up a Family Bell” to be reminded with delightful sounds on your speakers or smart displays when it’s time to tackle important moments of your day, like holiday meals or volunteering at the local gift drive. And for routines that require a little extra work — like getting the kids to bed after a get together — create a Family Bell checklist on your Smart Display with get ready bells that remind them of key tasks to complete, like brushing their teeth and putting on pajamas.

9. Have some fun and create new memories with a hands-free family game night. Put your game face on and say, “Hey Google, let’s play a game.”

10. Spark some holiday magic with a story from Google. We’ve added a new interactive story from Grabbit, a twist on the classic fairytale, “Hansel and Gretel.” Play the story from either the perspective of Hansel and Gretel or the Witch, and decide how the story unfolds. Just say “Hey Google, talk to Twisted Hansel and Gretel” and let the adventure begin! More interactive stories from Grabbit like “Jungle Book,” “Alice in Wonderland” and “Sherlock Holmes” will be available on your Google Nest smart display devices between now and the new year.

Improved On-Device ML on Pixel 6, with Neural Architecture Search

This fall Pixel 6 phones launched with Google Tensor, Google’s first mobile system-on-chip (SoC), bringing together various processing components (such as central/graphic/tensor processing units, image processors, etc.) onto a single chip, custom-built to deliver state-of-the-art innovations in machine learning (ML) to Pixel users. In fact, every aspect of Google Tensor was designed and optimized to run Google’s ML models, in alignment with our AI Principles. That starts with the custom-made TPU integrated in Google Tensor that allows us to fulfill our vision of what should be possible on a Pixel phone.

Today, we share the improvements in on-device machine learning made possible by designing the ML models for Google Tensor’s TPU. We use neural architecture search (NAS) to automate the design of ML models, incentivizing the search algorithms to discover models that achieve higher quality while meeting latency and power requirements. This automation also allows us to scale the development of models for various on-device tasks. We’re making these models publicly available through the TensorFlow model garden and TensorFlow Hub so that researchers and developers can bootstrap further use case development on Pixel 6. Moreover, we have applied the same techniques to build a highly energy-efficient face detection model that is foundational to many Pixel 6 camera features.
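For developers who want to experiment with these models, the general recipe is to obtain a model (for example, from the TensorFlow model garden or TensorFlow Hub), convert it to TensorFlow Lite if needed, and run it with the TFLite interpreter. A minimal sketch follows; the model file name and the random input are placeholders, not an official sample.

```python
import numpy as np
import tensorflow as tf

# Hypothetical file name for a TFLite classification model exported from the
# TensorFlow model garden / TF Hub; not an official path.
interpreter = tf.lite.Interpreter(model_path="mobilenet_edgetpu_v2.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# Feed a single image shaped and typed to match the model's input tensor.
_, height, width, channels = input_details["shape"]
image = np.random.rand(1, height, width, channels).astype(input_details["dtype"])

interpreter.set_tensor(input_details["index"], image)
interpreter.invoke()
scores = interpreter.get_tensor(output_details["index"])
print("Top-1 class index:", int(np.argmax(scores)))
```

On a Pixel 6 itself, the same model would typically be run through a hardware delegate (for example, TensorFlow Lite's NNAPI delegate on Android) so that inference can reach the TPU; the CPU path above is only for quick experimentation.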

An illustration of NAS to find TPU-optimized models. Each column represents a stage in the neural network, with dots indicating different options, and each color representing a different type of building block. A path from inputs (e.g., an image) to outputs (e.g., per-pixel label predictions) through the matrix represents a candidate neural network. In each iteration of the search, a neural network is formed using the blocks chosen at every stage, and the search algorithm aims to find neural networks that jointly minimize TPU latency and/or energy and maximize accuracy.

Search Space Design for Vision Models
A key component of NAS is the design of the search space from which the candidate networks are sampled. We customize the search space to include neural network building blocks that run efficiently on the Google Tensor TPU.

One widely-used building block in neural networks for various on-device vision tasks is the Inverted Bottleneck (IBN). The IBN block has several variants, each with different tradeoffs, and is built using regular convolution and depthwise convolution layers. While IBNs with depthwise convolution have been conventionally used in mobile vision models due to their low computational complexity, fused-IBNs, wherein depthwise convolution is replaced by a regular convolution, have been shown to improve the accuracy and latency of image classification and object detection models on TPU.

However, fused-IBNs can have prohibitively high computational and memory requirements for neural network layer shapes that are typical in the later stages of vision models, limiting their use throughout the model and leaving the depthwise-IBN as the only alternative. To overcome this limitation, we introduce IBNs that use group convolutions to enhance the flexibility in model design. While regular convolution mixes information across all the features in the input, group convolution slices the features into smaller groups and performs regular convolution on features within each group, reducing the overall computational cost. The tradeoff of these group convolution–based IBNs (GC-IBNs) is that they may adversely impact model quality.

Inverted bottleneck (IBN) variants: (a) depthwise-IBN, a depthwise convolution layer with filter size KxK sandwiched between two convolution layers with filter size 1x1; (b) fused-IBN, in which the convolution and depthwise layers are fused into a single convolution layer with filter size KxK; and (c) group convolution–based GC-IBN, which replaces the KxK regular convolution in fused-IBN with a group convolution. The number of groups (group count) is a tunable parameter during NAS.
Inclusion of GC-IBN as an option provides additional flexibility beyond other IBNs. Computational cost and latency of different IBN variants depends on the feature dimensions being processed (shown above for two example feature dimensions). We use NAS to determine the optimal choice of IBN variants.
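To make the three variants concrete, here is a minimal Keras sketch of each block. The normalization, activation choices and channel counts are illustrative assumptions; the real search space contains many more knobs.

```python
import tensorflow as tf
from tensorflow.keras import layers

def depthwise_ibn(x, expand, out_ch, k=3):
    """(a) Depthwise IBN: 1x1 expand -> KxK depthwise -> 1x1 project."""
    y = layers.Conv2D(expand, 1, activation="relu")(x)
    y = layers.DepthwiseConv2D(k, padding="same", activation="relu")(y)
    return layers.Conv2D(out_ch, 1)(y)

def fused_ibn(x, expand, out_ch, k=3):
    """(b) Fused IBN: the 1x1 expand and KxK depthwise become one KxK convolution."""
    y = layers.Conv2D(expand, k, padding="same", activation="relu")(x)
    return layers.Conv2D(out_ch, 1)(y)

def gc_ibn(x, expand, out_ch, k=3, groups=4):
    """(c) GC-IBN: the KxK convolution mixes channels only within groups,
    lowering compute relative to the fused variant."""
    y = layers.Conv2D(expand, k, padding="same", activation="relu", groups=groups)(x)
    return layers.Conv2D(out_ch, 1)(y)

# Compare parameter counts for an example feature map (64x64 with 32 channels).
inp = layers.Input((64, 64, 32))
for block in (depthwise_ibn, fused_ibn, gc_ibn):
    out = block(inp, expand=128, out_ch=64)
    print(f"{block.__name__}: {tf.keras.Model(inp, out).count_params():,} params")
```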

Faster, More Accurate Image Classification
Which IBN variant to use at which stage of a deep neural network depends on the latency on the target hardware and the performance of the resulting neural network on the given task. We construct a search space that includes all of these different IBN variants and use NAS to discover neural networks for the image classification task that optimize the classification accuracy at a desired latency on TPU. The resulting MobileNetEdgeTPUV2 model family improves the accuracy at a given latency (or latency at a desired accuracy) compared to the existing on-device models when run on the TPU. MobileNetEdgeTPUV2 also outperforms its predecessor, MobileNetEdgeTPU, the family of image classification models designed for the previous generation of the TPU.

Network architecture families visualized as connected dots at different latency targets. Compared with other mobile models, such as FBNet, MobileNetV3, and EfficientNets, MobileNetEdgeTPUV2 models achieve higher ImageNet top-1 accuracy at lower latency when running on Google Tensor’s TPU.
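The search itself can be thought of as optimizing a joint reward over accuracy and measured latency. The toy sketch below uses random search and a hypothetical reward of the form accuracy * (latency / target)^beta, similar in spirit to multi-objective rewards used in mobile NAS work; the evaluation functions are dummy placeholders for real training and on-hardware benchmarking, and the actual search algorithm is more sophisticated than random sampling.

```python
import random

IBN_VARIANTS = ["depthwise_ibn", "fused_ibn", "gc_ibn"]
NUM_STAGES = 5
TARGET_LATENCY_MS = 10.0
BETA = -0.07  # soft latency penalty exponent (an illustrative setting)

def evaluate_accuracy(architecture):
    """Placeholder for training/evaluating the candidate; returns a dummy score
    so the sketch runs end to end."""
    return 0.70 + 0.05 * architecture.count("fused_ibn") / NUM_STAGES

def measure_latency_ms(architecture):
    """Placeholder for benchmarking the candidate on the target hardware."""
    return 8.0 + 1.5 * architecture.count("fused_ibn")

def reward(architecture):
    # Jointly rewards accuracy and closeness to the latency target.
    acc = evaluate_accuracy(architecture)
    lat = measure_latency_ms(architecture)
    return acc * (lat / TARGET_LATENCY_MS) ** BETA

def random_search(num_trials=200):
    """Sample one IBN variant per stage and keep the highest-reward candidate."""
    best, best_reward = None, float("-inf")
    for _ in range(num_trials):
        candidate = [random.choice(IBN_VARIANTS) for _ in range(NUM_STAGES)]
        if (r := reward(candidate)) > best_reward:
            best, best_reward = candidate, r
    return best, best_reward

print(random_search())
```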

MobileNetEdgeTPUV2 models are built using blocks that also improve the latency/accuracy tradeoff on other compute elements in the Google Tensor SoC, such as the CPU. Unlike accelerators such as the TPU, CPUs show a stronger correlation between the number of multiply-and-accumulate operations in the neural network and latency. GC-IBNs tend to have fewer multiply-and-accumulate operations than fused-IBNs, which leads MobileNetEdgeTPUV2 to outperform other models even on Pixel 6 CPU.

MobileNetEdgeTPUV2 models achieve higher ImageNet top-1 accuracy at lower latency on the Pixel 6 CPU, outperforming other CPU-optimized model architectures, such as MobileNetV3.

Improving On-Device Semantic Segmentation
Many vision models consist of two components, the base feature extractor for understanding general features of the image, and the head for understanding domain-specific features, such as semantic segmentation (the task of assigning labels, such as sky, car, etc., to each pixel in an image) and object detection (the task of detecting instances of objects, such as cats, doors, cars, etc., in an image). Image classification models are often used as feature extractors for these vision tasks. As shown below, the MobileNetEdgeTPUV2 classification model coupled with the DeepLabv3+ segmentation head improves the quality of on-device segmentation.

To further improve the segmentation model quality, we use the bidirectional feature pyramid network (BiFPN) as the segmentation head, which performs weighted fusion of different features extracted by the feature extractor. Using NAS we find the optimal configuration of blocks in both the feature extractor and the BiFPN head. The resulting models, named Autoseg-EdgeTPU, produce even higher-quality segmentation results, while also running faster.
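The core of that weighted fusion can be sketched in a few lines: each incoming feature map gets a learnable non-negative weight, and the weights are normalized before summing. The layer below follows the "fast normalized fusion" idea commonly used with BiFPN-style heads and assumes the inputs have already been resampled to a common resolution; it is an illustration, not the Autoseg-EdgeTPU implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers

class WeightedFusion(layers.Layer):
    """Weighted fusion of same-shaped feature maps, in the spirit of BiFPN:
    each input gets a learnable non-negative weight, normalized to sum to ~1."""

    def build(self, input_shape):
        # One scalar weight per incoming feature map.
        self.w = self.add_weight(name="fusion_weights",
                                 shape=(len(input_shape),),
                                 initializer="ones",
                                 trainable=True)

    def call(self, features):
        w = tf.nn.relu(self.w)
        w = w / (tf.reduce_sum(w) + 1e-4)
        return tf.add_n([w[i] * f for i, f in enumerate(features)])

# Example: fuse two pyramid features that have already been resampled to the
# same spatial size and channel count (hypothetical shapes for illustration).
f1 = tf.random.normal([1, 64, 64, 128])
f2 = tf.random.normal([1, 64, 64, 128])
fused = WeightedFusion()([f1, f2])
print(fused.shape)  # (1, 64, 64, 128)
```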

The final layers of the segmentation model contribute significantly to the overall latency, mainly due to the operations involved in generating a high resolution segmentation map. To optimize the latency on TPU, we introduce an approximate method for generating the high resolution segmentation map that reduces the memory requirement and provides a nearly 1.5x speedup, without significantly impacting the segmentation quality.

Left: Comparing the performance, measured as mean intersection-over-union (mIOU), of different segmentation models on the ADE20K semantic segmentation dataset (top 31 classes). Right: Approximate feature upsampling (e.g., increasing resolution from 32x32 → 512x512). Argmax operation used to compute per-pixel labels is fused with the bilinear upsampling. Argmax performed on smaller resolution features reduces memory requirements and improves latency on TPU without a significant impact to quality.
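The memory-saving idea behind the approximation can be illustrated as follows. This is a simplification: the real implementation fuses argmax with bilinear upsampling in a single TPU-friendly operation, whereas this sketch merely reorders argmax and a nearest-neighbor resize to show why the large intermediate tensor shrinks from C channels to one.

```python
import tensorflow as tf

def upsample_then_argmax(logits, size):
    """Baseline: upsample the full logits [1, h, w, C] to the output size,
    then take the per-pixel argmax. Peak memory scales with H * W * C."""
    up = tf.image.resize(logits, size, method="bilinear")
    return tf.argmax(up, axis=-1)

def argmax_then_upsample(logits, size):
    """Approximation: take the argmax at low resolution first, then resize
    the single-channel label map, so the large tensor has 1 channel, not C."""
    labels = tf.cast(tf.argmax(logits, axis=-1), tf.float32)[..., None]
    up = tf.image.resize(labels, size, method="nearest")
    return tf.cast(tf.squeeze(up, axis=-1), tf.int32)

# Example: 32x32 logits over 31 classes upsampled to a 512x512 label map.
logits = tf.random.normal([1, 32, 32, 31])
print(upsample_then_argmax(logits, (512, 512)).shape)   # (1, 512, 512)
print(argmax_then_upsample(logits, (512, 512)).shape)   # (1, 512, 512)
```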

Higher-Quality, Low-Energy Object Detection
Classic object detection architectures allocate ~70% of the compute budget to the feature extractor and only ~30% to the detection head. For this task we incorporate the GC-IBN blocks into a search space we call “Spaghetti Search Space”1, which provides the flexibility to move more of the compute budget to the head. This search space also uses the non-trivial connection patterns seen in recent NAS works such as MnasFPN to merge different but related stages of the network to strengthen understanding.

We compare the models produced by NAS to MobileDet-EdgeTPU, a class of mobile detection models customized for the previous generation of TPU. MobileDets have been demonstrated to achieve state-of-the-art detection quality on a variety of mobile accelerators: DSPs, GPUs, and the previous TPU. Compared with MobileDets, the new family of SpaghettiNet-EdgeTPU detection models achieves +2.2% mAP (absolute) on COCO at the same latency and consumes less than 70% of the energy used by MobileDet-EdgeTPU to achieve similar accuracy.

Comparing the performance of different object detection models on the COCO dataset with the mAP metric (higher is better). SpaghettiNet-EdgeTPU achieves higher detection quality at lower latency and energy consumption compared to previous mobile models, such as MobileDets and MobileNetV2 with Feature Pyramid Network (FPN).

Inclusive, Energy-Efficient Face Detection
Face detection is a foundational technology in cameras that enables a suite of additional features, such as fixing the focus, exposure and white balance, and even removing blur from the face with the new Face Unblur feature. Such features must be designed responsibly, and face detection in the Pixel 6 was developed with our AI Principles top of mind.

Left: The original photo without improvements. Right: An unblurred face in a dynamic environment. This is the result of Face Unblur combined with a more accurate face detector running at a higher frames per second.

Since mobile cameras can be power-intensive, it was important for the face detection model to fit within a power budget. To optimize for energy efficiency, we used the Spaghetti Search Space with an algorithm to search for architectures that maximize accuracy at a given energy target. Compared with a heavily optimized baseline model, SpaghettiNet achieves the same accuracy at ~70% of the energy. The resulting face detection model, called FaceSSD, is more power-efficient and accurate. This improved model, combined with our auto-white balance and auto-exposure tuning improvements, is part of Real Tone on Pixel 6. These improvements help better reflect the beauty of all skin tones. Developers can utilize this model in their own apps through the Android Camera2 API.

Toward Datacenter-Quality Language Models on a Mobile Device
Deploying low-latency, high-quality language models on mobile devices benefits ML tasks like language understanding, speech recognition, and machine translation. MobileBERT, a derivative of BERT, is a natural language processing (NLP) model tuned for mobile CPUs.

However, due to the various architectural optimizations made to run these models efficiently on mobile CPUs, their quality is not as high as that of the large BERT models. Since MobileBERT on TPU runs significantly faster than on CPU, it presents an opportunity to improve the model architecture further and reduce the quality gap between MobileBERT and BERT. We extended the MobileBERT architecture and leveraged NAS to discover models that map well to the TPU. These new variants of MobileBERT, named MobileBERT-EdgeTPU, achieve up to 2x higher hardware utilization, allowing us to deploy larger and more accurate models on TPU at latencies comparable to the baseline MobileBERT.

MobileBERT-EdgeTPU models, when deployed on Google Tensor’s TPU, produce on-device quality comparable to the large BERT models typically deployed in data centers.

Performance on the question answering task (SQuAD v1.1). While the TPU in Pixel 6 provides a ~10x acceleration over CPU, further model customization for the TPU achieves on-device quality comparable to the large BERT models typically deployed in data centers.

Conclusion
In this post, we demonstrated how designing ML models for the target hardware expands the on-device ML capabilities of Pixel 6 and brings high-quality, ML-powered experiences to Pixel users. With NAS, we scaled the design of ML models to a variety of on-device tasks and built models that provide state-of-the-art quality on-device within the latency and power constraints of a mobile device. Researchers and ML developers can try out these models in their own use cases by accessing them through the TensorFlow model garden and TF Hub.

Acknowledgements
This work is made possible through a collaboration spanning several teams across Google. We’d like to acknowledge contributions from Rachit Agrawal, Berkin Akin, Andrey Ayupov, Aseem Bathla, Gabriel Bender, Po-Hsein Chu, Yicheng Fan, Max Gubin, Jaeyoun Kim, Quoc Le, Dongdong Li, Jing Li, Yun Long, Hanxiao Lu, Ravi Narayanaswami, Benjamin Panning, Anton Spiridonov, Anakin Tung, Zhuo Wang, Dong Hyuk Woo, Hao Xu, Jiayu Ye, Hongkun Yu, Ping Zhou, and Yanqi Zhuo. Finally, we’d like to thank Tom Small for creating illustrations for this blog post.



1The resulting architectures tend to look like spaghetti because of the connection patterns formed between blocks. 

Source: Google AI Blog


Pixel art: How designers created the new Pixel 6 colors

During a recent visit to Google’s Color, Material and Finish (better known as CMF) studio, I watched while Jess Ng and Jenny Davis opened drawer after drawer and placed object after object on two white tables. A gold hoop earring, a pale pink shell — all pieces of inspiration that Google designers use to come up with new colors for devices, including the just-launched Pixel 6 and Pixel 6 Pro.

“We find inspiration everywhere,” Jenny says. “It’s not abnormal to have a designer come to the studio with a toothbrush or some random object they found on their walk or wherever.”

The CMF team designs how a Google device will physically look and feel. “Color, material and finish are a big part of what defines a product,” Jess, a CMF hardware designer, says. “It touches on the more emotional part of how we decide what to buy.” And Jenny, CMF Manager for devices and services, agrees. “We always joke around that in CMF, the F stands for ‘feelings,’ so we joke that we design feelings.”

The new Pixel 6 comes in Sorta Seafoam and Kinda Coral, while the Pixel 6 Pro comes in Sorta Sunny and Cloudy White, and both are available in Stormy Black. Behind those five shades are years of work, plenty of trial and error…and lots and lots of fine-tuning. “It’s actually a very complex process,” Jenny says.

Made more complex by COVID-19. Both Jenny and Jess describe the color selection process as highly collaborative and hands-on, which was difficult to accomplish while working from home. Designers aren’t just working with their own teams, but with those on the manufacturing and hardware side as well. “We don’t design color after the hardware design is done — we actually do it together,” Jenny says. The Pixel 6 and Pixel 6 Pro’s new premium look and feel influenced the direction of the new colors, and the CMF team needed to see colors and touch items in order to select and eliminate the shades.

They don’t only go hands-on with the devices, they do the same with sources of inspiration. “I remember one time I really wanted to share this color because I thought it would be really appropriate for one of our products, so I ended up sending my boss one of my sweaters through a courier delivery!” Jenny says. “We found creative workarounds.”

The team that designed the new Pixel 6 and Pixel 6 Pro case colors did as well. “The CMF team would make models and then take photos of the models and I would try to go in and look at them in person and physically match the case combinations against the different phone colors,” says Nasreen Shad, a Pixel Accessories product manager. “Then we’d render or photograph them and send them around to the team to review and see what was and wasn’t working.” In addition to the challenge of working remotely, Nasreen’s team was also working on something entirely new: colorful, translucent cases.

Nasreen says they didn’t want to cover up the phones, but complement them instead, so they went with a translucent tinted plastic. Each device has a case that corresponds to its color family, but you can mix and match them for interesting new shades.

That process involved lots of experimenting. For example, what eventually became the Golden Glow case started out closer to a bronze color, which didn’t pair as well with the Stormy Black phone. “We had to tune it to a peachy shade, so that it looked good with its ‘intended pairing,’ Sorta Sunny, but with everything else, too. That meant ordering more resins and color chips in different tones, but it ended with some really beautiful effects.”

Beautiful effects, and tons of options. “I posted a picture of all of the possible combinations you can make with the phones and the cases and people kept asking me, ‘how many phones did Google just release!?’” Nasreen laughs. “And I had to be like, ‘No, no, no, these are just the cases!’”

A photograph showing the various Pixel 6 and Pixel 6 Pro phones in different colors in different colored cases, illustrating how many options there are.

Google designers often only know the devices and colors by temporary, internal code names. It's up to their colleagues to come up with the names you see on the Google Store site now. But one person who absolutely knows their official names is Lily Hackett, a Product Marketing Manager who works on a team that names device colors. “The way that we go about color naming is unique,” she says. “We like to play on the color. When you think about it, it’s actually very difficult to describe color, and the colors we often use are subtle — so we like to be specific with our approach to the name.”

Because color can be so subjective (one person’s white and gold dress is another’s black and blue dress), Lily’s team often checks in with CMF designers to make sure the words and names they’re gravitating toward actually describe the colors accurately. “It’s so nice to go to color experts and say, ‘Is this right? Is this a word you would use to describe this color?’”

Lily says their early brainstorming sessions can result in lists of 75 or more options. “It’s truly a testament to our copywriting team. When we were brainstorming for Stormy Black, they had everything under the sun — they had everything under the moon! It was incredible to see how many words they came up with.”

These days, everyone is looking ahead at new colors and new names, but the team is excited to see the rest of the world finally get to see their work. “I couldn’t wait for them to come out,” Lily says. “My favorite color was even the first to sell out on the Google Store! I was like, ‘Yes, everyone else loves it, too!’”

8 more things to love about the new Pixel phones

Last week we unveiled the new Pixel 6 and Pixel 6 Pro — and we unveiled a lot. Aside from the two new phones themselves, there was also Google Tensor, our custom system on a chip (SoC) that takes advantage of our machine learning research. Then there’s Magic Eraser, which will take unwanted people and objects out of your photos — plus Pixel Pass, a new way to buy, and a ton of new features packed into Android 12.


Amid all the news, you may have missed a thing or two. But don’t worry, we went ahead and collected everything you might have missed, and some extras, too.

  1. One of the key differences between Pixel 6 and previous editions is the radical redesign of the hardware, encased in aluminum and glass.

2. Real Tone is a significant advancement, making the Pixel 6 camera more equitable, and that’s not all: It also improves Google Photos' auto enhance feature on Android and iOS with better face detection, auto white balance and auto exposure, so that it works well across skin tones.

3. Speech recognition has been updated to take advantage of Google Tensor so you can do more with voice. We’ve added automatic punctuation while dictating and support for voice commands like “send” and “clear” to send a message or edit it. With new emoji support, I can just say “pasta emoji” while dictating. (Which, I admit, is going to get a lot of use.)

4. We’ve partnered with Snap to bring exclusive Snapchat features to the Pixel. For example, you can set it up so when you tap the back of your Pixel 6 or Pixel 6 Pro twice, it will launch the Snapchat selfie camera.

5. When you're flipping through your photos on a Pixel 6 or Pixel 6 Pro, Google Photos can proactively suggest using Magic Eraser to remove photobombers in the background.

Animated GIF showing Magic Eraser being used to take people out of the background of a photo.

6. The camera bar is a major new hardware design feature in the Pixel 6, and part of the reason it’s there is to fit a much bigger sensor, which captures more light so photos look sharper — in fact, the new sensor lets in 150% more light than the Pixel 5’s. The Pixel 6 Pro’s telephoto camera also uses a prism inside the camera to bend the light so the lens can fit inside the camera bar.

7. The Pixel 6 comes in Kinda Coral, Sorta Seafoam and Stormy Black, and the Pixel 6 Pro comes in Cloudy White, Sorta Sunny and the same Stormy Black. These shades are stunning on their own, but you can customize them even more with the new translucent cases: Combine the Sorta Seafoam Pixel 6 with the Light Rain case for an icy new look.

8. New in Android 12 and exclusive to Pixel 6 and Pixel 6 Pro, Gboard now features Grammar Correction. Not only will it make communication easier, but it will also work entirely on-device to preserve privacy. You can learn more over on the Google AI blog.