Tag Archives: machine learning

The best hardware, software and AI—together

Today, we introduced our second-generation family of consumer hardware products, all made by Google: new Pixel phones, Google Home Mini and Max, an all-new Pixelbook, the Google Clips hands-free camera, Google Pixel Buds, and an updated Daydream View headset. We see tremendous potential for devices to be helpful, make your life easier, and even get better over time when they’re created at the intersection of hardware, software and advanced artificial intelligence (AI).


Why Google?

These days many devices—especially smartphones—look and act the same. That means in order to create a meaningful experience for users, we need a different approach. A year ago, Sundar outlined his vision of how AI would change how people would use computers. And in fact, AI is already transforming what Google’s products can do in the real world. For example, swipe typing has been around for a while, but AI lets people use Gboard to swipe-type in two languages at once. Google Maps uses AI to figure out what the parking is like at your destination and suggest alternative spots before you’ve even put your foot on the gas. But, for this wave of computing to reach new breakthroughs, we have to build software and hardware that can bring more of the potential of AI into reality—which is what we’ve set out to do with this year’s new family of products.

Hardware, built from the inside out

We’ve designed and built our latest hardware products around a few core tenets. First and foremost, we want them to be radically helpful. They’re fast, they’re there when you need them, and they’re simple to use. Second, everything is designed for you, so that the technology doesn’t get in the way and instead blends into your lifestyle. Lastly, by creating hardware with AI at the core, our products can improve over time. They’re constantly getting better and faster through automatic software updates. And they’re designed to learn from you, so you’ll notice features—like the Google Assistant—get smarter and more assistive the more you interact with them.


You’ll see this reflected in our 2017 lineup of new Made by Google products:

  • The Pixel 2 has the best camera of any smartphone, again, along with a gorgeous display and augmented reality capabilities. Pixel owners get unlimited storage for their photos and videos, and an exclusive preview of Google Lens, which uses AI to give you helpful information about the things around you.
  • Google Home Mini brings the Assistant to more places throughout your home, with a beautiful design that fits anywhere. And Max is our biggest and best-sounding Google Home device, powered by the Assistant. And with AI-based Smart Sound, Max has the ability to adapt your audio experience to you—your environment, context, and preferences.
  • With Pixelbook, we’ve reimagined the laptop as a high-performance Chromebook, with a versatile form factor that works the way you do. It’s the first laptop with the Assistant built in, and the Pixelbook Pen makes the whole experience even smarter.
  • Our new Pixel Buds combine Google smarts and the best digital sound. You’ll get elegant touch controls that put the Assistant just a tap away, and they’ll even help you communicate in a different language.
  • The updated Daydream View is the best mobile virtual reality (VR) headset on the market, and the simplest, most comfortable VR experience.
  • Google Clips is a totally new way to capture genuine, spontaneous moments—all powered by machine learning and AI. This tiny camera seamlessly sends clips to your phone, and even edits and curates them for you.

Assistant, everywhere

Across all these devices, you can interact with the Google Assistant any way you want—talk to it with your Google Home or your Pixel Buds, squeeze your Pixel 2, use your Pixelbook’s Assistant key, or circle things on your screen with the Pixelbook Pen. Wherever you are, and on any device with the Assistant, you can connect to the information you need and get help with the tasks that get you through your day. No other assistive technology comes close, and it continues to get better every day.

New hardware products

Google’s hardware business is just getting started, and we’re committed to building and investing for the long run. We couldn’t be more excited to introduce you to our second-generation family of products that truly brings together the best of Google software and thoughtfully designed hardware with cutting-edge AI. We hope you enjoy using them as much as we do.

Best commute ever? Ride along with Google execs Diane Greene and Fei-Fei Li

Editor’s Note: The Grace Hopper Celebration of Women in Computing is coming up, and Diane Greene and Dr. Fei-Fei Li—two of our senior leaders—are getting ready. Sometimes Diane and Fei-Fei commute to the office together, and this time we happened to be along to capture the ride. Diane took over the music for the commute, and with Aretha Franklin’s “Respect” in the background, she and Fei-Fei chatted about the conference, their careers in tech, motherhood, and amplifying female voices everywhere. Hop in the backseat for Diane and Fei-Fei’s ride to work.

(A quick note for the riders: This conversation has been edited for brevity, and so you don’t have to read Diane and Fei-Fei talking about U-turns.)


Fei-Fei: Are you getting excited for Grace Hopper?

Diane: I’m super excited for the conference. We’re bringing together technical women to surface a lot of things that haven’t been talked about as openly in the past.

Fei-Fei: You’ve had a long career in tech. What makes this point in time different from the early days when you entered this field?

Diane: I got a degree in engineering in 1976 (ed note: Fei-Fei jumped in to remind Diane that this was the year she was born!). Computers were so exciting, and I learned to program. When I went to grad school to study computer science in 1985, there was actually a fair number of women at UC Berkeley. I’d say we had at least 30 percent women, which is way better than today.

It was a new, undefined field. And whenever there’s a new industry or technology, it’s wide open for everyone because nothing’s been established. Tech was that way, so it was quite natural for women to work in artificial intelligence and theory, and even in systems, networking, and hardware architecture. I came from mechanical engineering and the oil industry, where I was the only woman. Tech was full of women then, but now women make up less than 15 percent of the people in tech.

Fei-Fei: So do you think it’s too late?

Diane: I don’t think it’s too late. Girls in grade school and high school are coding. And certainly in colleges the focus on engineering is really strong, and the numbers are growing again.

Fei-Fei: You’re giving a talk at Grace Hopper—how will you talk to them about what distinguishes your career?

Diane: It’s wonderful that we’re both giving talks! Growing up, I loved building things so it was natural for me to go into engineering. I want to encourage other women to start with what you’re interested in and what makes you excited. If you love building things, focus on that, and the career success will come. I’ve been so unbelievably lucky in my career, but it’s a proof point that you can end up having quite a good career while doing what you’re interested in.


Fei-Fei: And you are a mother of two grown, beautiful children. How did you prioritize them while balancing career?

Diane: When I was at VMware, I had the “go home for dinner” rule. When we founded the company, I was pregnant and none of the other founders had kids. But we were able to build a culture around families—every time someone had a kid we gave them a VMware diaper bag. Whenever my kids were having a school play or a parent-teacher conference, I would make a big show of leaving in the middle of the day so everyone would know they could do that too. And at Google, I encourage both men and women on my team to find that balance.

Fei-Fei: It’s so important for your message to get across because young women today are thinking about their goals and what they want to build for the world, but also for themselves and their families. And there are so many women and people of color doing great work, how do we lift up their work? How do we get their voices heard? This is something I think about all the time, the voice of women and underrepresented communities in AI.

Diane: This is about educating people—not just women—to surface the accomplishments of everybody and make sure there’s no unconscious bias going on. I think Grace Hopper is a phenomenal tool for this, and there are things that I incorporate into my workday to prevent that unconscious bias: pausing to make sure the right people are included in a meeting and that no one has been overlooked, and encouraging everyone in that meeting to participate so that all voices are heard.

Fei-Fei: Grace Hopper could be a great platform to share best practices for how to address these issues.


Diane: Every company is struggling to address diversity and there’s a school of thought that says having three or more people from one minority group makes all the difference in the world—I see it on boards. Whenever we have three or more women, the whole dynamic changes. Do you see that in your research group at all?

Fei-Fei: Yes, for a long time I was the only woman faculty member in the Stanford AI lab, but now it has attracted a lot of women who do very well because there’s a community. And that’s wonderful for me, and for the group.

Now back to you … you’ve had such a successful career, and I think a lot of women would love to know what keeps you going every day.

Diane: When you wake up in the morning, be excited about what’s ahead for the day. And if you’re not excited, ask yourself if it’s time for a change. Right now the Cloud is at the center of massive change in our world, and I’m lucky to have a front row seat to how it’s happening and what’s possible with it. We’re creating the next generation of technologies that are going to help people do things that we didn’t even know were possible, particularly in the AI/ML area. It’s exciting to be in the middle of the transformation of our world and the fast pace at which it’s happening.

Fei-Fei: Coming to Google Cloud, the most rewarding part is seeing how this is helping people go through that transformation and making a difference. And it’s at such a scale that it’s unthinkable on almost any other platform.

Diane: Cloud is making it easier for companies to work together and for people to work across boundaries together, and I love that. I’ve always found when you can collaborate across more boundaries you can get a lot more done.

To hear more from Fei-Fei and Diane, tune into Grace Hopper’s live stream on October 4. 

Source: Google Cloud


Access information quicker, do better work with Google Cloud Search

We all get sidetracked at work. We intend to be as efficient as possible, but inevitably, the “busyness” of business gets in the way through back-to-back meetings, unfinished docs or managing a rowdy inbox. To be more efficient, you need quick access to your information like relevant docs, important tasks and context for your meetings.

Sadly, according to a report by McKinsey, workers spend up to 20 percent of their time—an entire day each week—searching for and consolidating information across a number of tools. We made Google Cloud Search available to Enterprise and Business edition customers earlier this year so that teams can access important information quicker. Here are a few ways that Cloud Search can help you get the information you need to accomplish more throughout your day.

1. Search more intuitively, access information quicker

If you search for a doc, you’re probably not going to remember its exact name or where you saved it in Drive. Instead, you might remember who sent the doc to you or a specific piece of information it contains, like a statistic.

A few weeks ago, we launched a new, more intuitive way to search in Cloud Search using natural language processing (NLP) technology. Type questions in Cloud Search using everyday language, like “Documents shared with me by John?,” “What’s my agenda next Tuesday?,” or “What docs need my attention?” and it will track down useful information for you.

2. Prioritize your to-dos, use spare time more wisely

With so much work to do, deciding what to focus on and what to leave for later isn’t always simple. A study by McKinsey reports that only nine percent of executives surveyed feel “very satisfied” with the way they allocate their time. We think technology, like Cloud Search, should help you with more than just finding what you’re looking for—it should help you stay focused on what’s important.

Imagine if your next meeting gets cancelled and you suddenly have an extra half hour to accomplish tasks. You can open the Cloud Search app to help you focus on what’s important. Powered by machine intelligence, Cloud Search proactively surfaces information that it believes is relevant to you and organizes it into simple cards that appear in the app throughout your workday. For example, it suggests documents or tasks based on which documents need your attention or upcoming meetings you have in Google Calendar.

3. Prepare for meetings, get more out of them

Employees spend a lot of time in meetings. According to a study in the UK by the Centre for Economics and Business, office workers spend an average of four hours per week in meetings. It’s even become normal to join meetings unprepared. The same workers surveyed feel that nearly half (47 percent) of the time spent in meetings is unproductive.

Thankfully, Cloud Search can help. It uses machine intelligence to organize and present information to set you up for success in a meeting. In addition to surfacing relevant docs, Cloud Search also surfaces information about meeting attendees from your corporate directory, and even includes links to relevant conversations from Gmail.

Start by going into Cloud Search to see info related to your next meeting. If you’re interested in looking at another meeting later in the day, just click on “Today’s meetings” and it will show you your agenda for the day. Next, select an event in your agenda (sourced from your Calendar) and Cloud Search will recommend information that’s relevant to that meeting.


Take back your time and focus on what’s important—open the Cloud Search app and get started today, or ask your IT administrator to enable it in your domain. You can also learn more about how Cloud Search can help your teams here.

Source: Google Cloud


Now anyone can explore machine learning, no coding required

From helping you find your favorite dog photos, to helping farmers in Japan sort cucumbers, machine learning is changing the way people use code to solve problems. But how does machine learning actually work? We wanted to make it easier for people who are curious about this technology to learn more about it. So we created Teachable Machine, a simple experiment that lets you teach a machine using your camera—live in the browser, no coding required.

Teachable Machine is built with a new library called deeplearn.js, which makes it easier for any web developer to get into machine learning by training and running neural nets right in the browser. We’ve also open sourced the code to help inspire others to make new experiments.

Check it out at g.co/teachablemachine.

Source: Education


A GIPHY engineering intern goes the GIF-stance with Google Cloud Vision

Editor’s Note: Today, we’re GIFted with the presence of a guest author. Bethany Davis, current University of Pennsylvania student and former software engineering summer intern at GIPHY, shares the details of her summer project, which was powered by Google Cloud Vision. This is a condensed and modified version of a post published on the GIPHY Engineering blog.

When my friend was starting her first full-time job, I wanted to GIF her a pep talk before her first day. I had the perfect movie reference in mind: Becca from “Bridesmaids” saying, “You are more beautiful than Cinderella! You smell like pine needles and have a face like sunshine!”

I searched GIPHY for “you are more beautiful than Cinderella” to no avail, then searched for “bridesmaids” and scrolled through several dozen results before giving up.

[Screenshot: searching for “Bridesmaids” or the direct quote did not yield any useful results]

It was easy to search for GIFs with popular tags, but because no one had tagged this GIF with the full line from the movie, I couldn’t find it. Yet I knew this GIF was out there. I wished there was a way to find the exact GIF that was pulled from the line in a movie, scene from a TV show or lyric from a song. Luckily, I was about to start my internship at GIPHY and I had the opportunity to tackle the problem head on—by using optical character recognition (OCR) and Google Cloud Vision to help you (and me) find the perfect GIF.

GIF me the tools and I’ll finish the job

When I started my internship, GIPHY engineers had already generated metadata about our collection of GIFs using Google Cloud Vision, an image recognition tool that is powered by machine learning. Specifically, Cloud Vision had performed optical character recognition (OCR) on our entire GIF library to detect text or captions within the image. The OCR results we got back from Google Cloud Vision were so good that my team was ready to incorporate the data directly into our search engine. I was tasked with parsing the data and indexing each GIF, then updating our search query to leverage the new, bolstered metadata.
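
For readers curious what the underlying OCR request looks like, here is a minimal sketch using the current google-cloud-vision Python client; GIPHY processed its entire library as a batch job, so the single-image call and file name below are purely illustrative.

    from google.cloud import vision

    # Detect text in one extracted GIF frame (illustrative; the real pipeline
    # batch-processed the whole GIF library rather than single frames).
    client = vision.ImageAnnotatorClient()
    with open('gif_frame.png', 'rb') as f:
        image = vision.Image(content=f.read())
    response = client.text_detection(image=image)
    if response.text_annotations:
        # The first annotation is the full detected text; the rest are per word.
        print(response.text_annotations[0].description)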

Using Luigi, I wrote a batch job that processed the JSON data generated by Google Cloud Vision. Then I used AWS Simple Queue Service to coordinate data transfer from Google Cloud Vision to documents in our search index. GIPHY search is built on top of Elasticsearch, which stores GIF documents, and the search query returns results based on the data in our Elasticsearch index. Bringing all these components together looks something like this:

[Diagram: Cloud Vision OCR data flows through the Luigi batch job and SQS into the Elasticsearch index]
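
A minimal sketch of what such a Luigi task could look like; the field names, message format, and queue handling below are hypothetical, not GIPHY’s actual pipeline code.

    import json

    import boto3
    import luigi

    class IndexOcrCaptions(luigi.Task):
        """Parse Cloud Vision OCR output and queue Elasticsearch updates (sketch)."""
        input_path = luigi.Parameter()   # file of JSON lines with OCR annotations
        queue_url = luigi.Parameter()    # SQS queue consumed by the search indexer

        def output(self):
            return luigi.LocalTarget(self.input_path + '.indexed')

        def run(self):
            sqs = boto3.client('sqs')
            with open(self.input_path) as f:
                for line in f:
                    record = json.loads(line)   # e.g. {"gif_id": ..., "text_annotations": [...]}
                    annotations = record.get('text_annotations', [])
                    caption = annotations[0]['description'] if annotations else ''
                    update = {'_id': record['gif_id'], 'doc': {'ocr_caption': caption}}
                    sqs.send_message(QueueUrl=self.queue_url, MessageBody=json.dumps(update))
            with self.output().open('w') as out:
                out.write('done\n')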

One of the biggest challenges in building this update was ensuring that we could process data for millions of GIFs quickly. I had to learn how to optimize the runtime of the code that prepares GIF updates for Elasticsearch. My first iteration took 80+ hours, but eventually I got it to run in just eight.

Once all the data was indexed, the next step was to incorporate the text/caption metadata into our query. I used what’s called a match phrase query, which looks for words in the caption that appear in the same order as the words in the search input—guaranteeing that a substring of my movie quote is intact in the results. I also had to decide how much to weigh the data from Google Cloud Vision relative to other sources of data we have about a GIF (like its tags or the frequency with which users click on it) to determine the most relevant results.
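
As a rough illustration, a boosted match-phrase clause in the Elasticsearch query DSL looks something like the following; the field names and weights are hypothetical, not GIPHY’s production query.

    # Sketch of combining the OCR caption field with existing tag data in one
    # Elasticsearch query (hypothetical field names and boosts).
    search_text = "you are more beautiful than cinderella"
    query = {
        "query": {
            "bool": {
                "should": [
                    {"match_phrase": {"ocr_caption": {"query": search_text, "boost": 2.0}}},
                    {"match": {"tags": {"query": search_text, "boost": 1.0}}},
                ]
            }
        }
    }
    # results = es.search(index="gifs", body=query)   # es = elasticsearch.Elasticsearch(...)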

It was time to see how the change would affect results. Using an internal GIPHY tool called Search UX, I searched for “where are the turtles,” a quote from “The Office.” The difference between the old query and the new one was dramatic:

[Screenshot: old vs. new search results for “where are the turtles”]

I also used a tool that examines the change on a larger scale by running the old and new queries against a random set of search terms—useful for ensuring that the change won’t disrupt popular searches like “cat” or “happy birthday,” which already deliver high-quality results.

See the GIFference

After our internal tools indicated a positive change, I launched the updated query as an A/B experiment. The results looked promising, with an overall increase in click-through rate of 0.5 percent. But my change affects a very specific type of search, especially longer phrases, and the impact of the change is even more noticeable for queries in this category. For example, click-through rate when searching for the phrase “never give up never surrender” (from “Galaxy Quest”) increased 32 percent, and click-through rate for the phrase “gotta be quicker than that” increased 31 percent. In addition to quotes from movies and TV shows, we saw improvements for general phrases like “everything will be ok” and “there you go.” The final click-through rate for these queries is almost 100 percent!

The ultimate test was my own, though. I revisited my search query from the beginning of the summer:

[Screenshot: the original “Bridesmaids” quote search, now returning the right GIF]

Success! The search results are much improved. Now, the next time you use GIPHY to search for a specific scene or a direct quote, the results will show you exactly what you were looking for.

To learn more about the technical details behind my project, see the GIPHY Engineering blog.

How Machine Learning with TensorFlow Enabled Mobile Proof-Of-Purchase at Coca-Cola

In this guest editorial, Patrick Brandt of The Coca-Cola Company tells us how they're using AI and TensorFlow to achieve frictionless proof-of-purchase.

Coca-Cola's core loyalty program launched in 2006 as MyCokeRewards.com. The "MCR.com" platform included the creation of unique product codes for every Coca-Cola, Sprite, Fanta, and Powerade product sold in 20oz bottles and cardboard "fridge-packs" purchasable at grocery stores and other retail outlets. Users could enter these product codes at MyCokeRewards.com to participate in promotional campaigns.

Fast-forward to 2016: Coke's loyalty programs are still hugely popular with millions of product codes having been entered for promotions and sweepstakes. However, mobile browsing went from non-existent in 2006 to over 50% share by the end of 2016. The launch of Coke.com as a mobile-first web experience (replacing MCR.com) was a response to these changes in browsing behavior. Thumb-entering 14-character codes into a mobile device could be a difficult enough user experience to impact the success of our programs. We want to provide our mobile audience the best possible experience, and recent advances in artificial intelligence opened new opportunities.

The quest for frictionless proof-of-purchase

For years Coke attempted to use off-the-shelf optical character recognition (OCR) libraries and services to read product codes with little success. Our printing process typically uses low-resolution dot-matrix fonts with the cap or fridge-pack media running under the printhead at very high speeds. All of this translates into a low-fidelity string of characters that defeats off-the-shelf OCR offerings (and can sometimes be hard to read with the human eye as well). OCR is critical to simplifying the code-entry process for mobile users: they should be able to take a picture of a code and automatically have the purchase registered for a promotional entry. We needed a purpose-built OCR system to recognize our product codes.

Bottlecap and fridge-pack examples

Our research led us to a promising solution: Convolutional Neural Networks. CNNs are one of a family of "deep learning" neural networks that are at the heart of modern artificial intelligence products. Google has used CNNs to extract street address numbers from StreetView images. CNNs also perform remarkably well at recognizing handwritten digits. These number-recognition use-cases were a perfect proxy for the type of problem we were trying to solve: extracting strings from images that contain small character sets with lots of variance in the appearance of the characters.

CNNs with TensorFlow

In the past, developing deep neural networks like CNNs has been a challenge because of the complexity of available training and inference libraries. TensorFlow, a machine learning framework that was open sourced by Google in November 2015, is designed to simplify the development of deep neural networks.

TensorFlow provides high-level interfaces to different kinds of neuron layers and popular loss functions, which makes it easier to implement different CNN model architectures. The ability to rapidly iterate over different model architectures dramatically reduced the time required to build Coke's custom OCR solution because different models could be developed, trained, and tested in a matter of days. TensorFlow models are also portable: the framework supports model execution natively on mobile devices ("AI on the edge") or in servers hosted remotely in the cloud. This enables a "create once, run anywhere" approach for model execution across many different platforms, including web-based and mobile.
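
As a deliberately simplified illustration of what those high-level interfaces provide, the sketch below builds a small multi-head CNN in the Keras API that ships with TensorFlow, with one softmax classifier per character position. The input size, layer sizes, and character set are hypothetical, not Coke’s actual architecture.

    import tensorflow as tf

    NUM_POSITIONS = 14   # characters per product code
    CHARSET_SIZE = 32    # hypothetical character-set size

    def build_code_reader():
        # Shared convolutional trunk over the code image (hypothetical input shape).
        image = tf.keras.Input(shape=(64, 256, 1))
        x = tf.keras.layers.Conv2D(32, (3, 3), activation='relu')(image)
        x = tf.keras.layers.MaxPooling2D((2, 2))(x)
        x = tf.keras.layers.Conv2D(64, (3, 3), activation='relu')(x)
        x = tf.keras.layers.MaxPooling2D((2, 2))(x)
        x = tf.keras.layers.Flatten()(x)
        x = tf.keras.layers.Dense(256, activation='relu')(x)

        # One softmax head per character position; the total loss sums over heads.
        outputs = [tf.keras.layers.Dense(CHARSET_SIZE, activation='softmax',
                                         name='char_%d' % i)(x)
                   for i in range(NUM_POSITIONS)]
        model = tf.keras.Model(image, outputs)
        model.compile(optimizer='adam',
                      loss=['sparse_categorical_crossentropy'] * NUM_POSITIONS)
        return model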

Machine learning: practice makes perfect

Any neural network is only as good as the data used to train it. We knew that we needed a large set of labeled product-code images to train a CNN that would achieve our performance goals. Our training set would be built in three phases:

  1. Pre-launch simulated images
  2. Pre-launch real-world images
  3. Images labeled by our users in production

The pre-launch training phase began by programmatically generating millions of simulated product-code images. These simulated images included variations in tilt, lighting, shadows, and blurriness. The prediction accuracy (i.e. how often all 14 characters were correctly predicted within the top-10 predictions) was at 50% against real-world images when the model was trained using only simulated images. This provided a baseline for transfer-learning: a model initially trained with simulated images was the foundation for a more accurate model that would be trained against real-world images.
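
A hedged sketch of that transfer-learning step, reusing the hypothetical model from the earlier sketch; the weight files and the real_images/real_labels arrays are placeholders, not Coke’s actual training code.

    # Start from weights learned on simulated images, then keep training on
    # labeled real-world photos (placeholder file names and arrays).
    model = build_code_reader()                       # architecture from the sketch above
    model.load_weights('simulated_only_weights.h5')   # baseline: simulated images only
    model.fit(real_images,
              [real_labels[:, i] for i in range(NUM_POSITIONS)],
              epochs=5, batch_size=64)
    model.save_weights('fine_tuned_weights.h5')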

The challenge now turned to enriching the simulated images with enough real-world images to hit our performance goals. We created a purpose-built training app for iOS and Android devices that "trainers" could use to take pictures of codes and label them; these labeled images were then transferred to cloud storage for training. We did a production run of several thousand product codes on bottle caps and fridge-packs and distributed these to multiple suppliers who used the app to create the initial real-world training set.

Even with an augmented and enriched training set, there is no substitute for images created by end-users in a variety of environmental conditions. We knew that scans would sometimes result in an inaccurate code prediction, so we needed to provide a user-experience that would allow users to quickly correct these predictions. Two components are essential to delivering this experience: a product-code validation service that has been in use since the launch of our original loyalty platform in 2006 (to verify that a predicted code is an actual code) and a prediction algorithm that performs a regression to determine a per-character confidence at each one of the 14 character positions. If a predicted code is invalid, the top prediction as well as the confidence levels for each character are returned to the user interface. Low-confidence characters are visually highlighted to guide the user to update characters that need attention.

Error correction user interface lets users correct invalid predictions and generate useful training data
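
A sketch of what that correction flow might look like, with a hypothetical character set, confidence threshold, and validation-service wrapper (Coke’s actual interfaces are not public):

    import numpy as np

    CHARSET = "ABCDEFGHJKLMNPQRSTUVWXYZ2346789"   # hypothetical code alphabet
    CONFIDENCE_THRESHOLD = 0.5                    # hypothetical cutoff for flagging

    def review_prediction(per_position_probs, validate_code):
        """per_position_probs: (14, len(CHARSET)) softmax outputs, one row per position.
        validate_code: callable wrapping the existing product-code validation service."""
        chars = [CHARSET[int(np.argmax(p))] for p in per_position_probs]
        confidences = [float(np.max(p)) for p in per_position_probs]
        code = ''.join(chars)
        if validate_code(code):
            return code, []   # valid code: register the purchase, nothing to correct
        # Invalid code: return it with low-confidence positions flagged for the UI.
        flagged = [i for i, c in enumerate(confidences) if c < CONFIDENCE_THRESHOLD]
        return code, flagged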

This user interface innovation enables an active learning process: a feedback loop allows the model to gradually improve by returning corrected predictions to the training pipeline. In this way, our users organically improve the accuracy of the character recognition model over time.

Product-code recognition pipeline

Optimizing for maximum performance

To meet user expectations around performance, we established a few ambitious requirements for the product-code OCR pipeline:

  • It had to be fast: we needed a one-second average processing time once the image of the product-code was sent into the OCR pipeline
  • It had to be accurate: our goal was to achieve 95% string recognition accuracy at launch with the guarantee that the model could be improved over time via active learning
  • It had to be small: the OCR pipeline needs to be small enough to be distributed directly to mobile apps and accommodate over-the-air updates as the model improves over time
  • It had to handle diverse product code media: dozens of different combinations of font types, bottlecaps, and cardboard fridge-pack media

We initially explored an architecture that used a single CNN for all product-code media. This approach created a model that was too large to be distributed to mobile apps, and the execution time was longer than desired. Our applied-AI partners at Quantiphi, Inc. began iterating on different model architectures, eventually landing on one that used multiple CNNs.

This new architecture reduced the model size dramatically without sacrificing accuracy, but it was still on the high end of what we needed in order to support over-the-air updates to mobile apps. We next used TensorFlow's prebuilt quantization module to reduce the model size by reducing the fidelity of the weights between connected neurons. Quantization reduced the model size by a factor of 4, but a dramatic reduction in model size occurred when Quantiphi had a breakthrough using a new approach called SqueezeNet.
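
TensorFlow’s tooling handles the mechanics, but the underlying idea is easy to show: store each float32 weight as an 8-bit integer plus a scale and offset, trading a little fidelity for roughly a 4x size reduction. The NumPy sketch below illustrates the concept only; it is not the TensorFlow quantization module itself.

    import numpy as np

    def quantize_weights(w, num_bits=8):
        """Map float32 weights onto 8-bit codes plus a scale/offset (concept sketch)."""
        w_min, w_max = float(w.min()), float(w.max())
        scale = (w_max - w_min) / (2 ** num_bits - 1) or 1.0
        codes = np.round((w - w_min) / scale).astype(np.uint8)     # stored 8-bit values
        dequantized = codes.astype(np.float32) * scale + w_min     # used at inference
        return codes, dequantized

    weights = np.random.randn(1024, 1024).astype(np.float32)       # ~4 MB of float32
    codes, approx = quantize_weights(weights)                      # ~1 MB of uint8 codes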

The SqueezeNet model was published by a team of researchers from UC Berkeley and Stanford in November 2016. It uses a small but highly complex design to achieve accuracy levels on par with much larger models against popular benchmarks such as ImageNet. After re-architecting our character recognition models to use a SqueezeNet CNN, Quantiphi was able to reduce the model size of certain media types by a factor of 100. Since the SqueezeNet model was inherently smaller, a richer feature-detection architecture could be constructed, achieving much higher accuracy at much smaller sizes compared to our first batch of models trained without SqueezeNet. We now have a highly accurate model that can be easily updated on remote devices; the recognition success rate of our final model before active learning was close to 96%, which translates into 99.7% character recognition accuracy (just 3 misses for every 1,000 character predictions).
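
For reference, the heart of SqueezeNet is its “fire” module: a narrow 1x1 “squeeze” convolution feeding parallel 1x1 and 3x3 “expand” convolutions whose outputs are concatenated. The Keras-style sketch below shows the module shape only, with an illustrative input size; it is not Coke’s production model.

    import tensorflow as tf

    def fire_module(x, squeeze_filters, expand_filters):
        """SqueezeNet fire module: 1x1 squeeze, then parallel 1x1/3x3 expand layers."""
        squeeze = tf.keras.layers.Conv2D(squeeze_filters, (1, 1),
                                         activation='relu', padding='same')(x)
        expand_1x1 = tf.keras.layers.Conv2D(expand_filters, (1, 1),
                                            activation='relu', padding='same')(squeeze)
        expand_3x3 = tf.keras.layers.Conv2D(expand_filters, (3, 3),
                                            activation='relu', padding='same')(squeeze)
        return tf.keras.layers.concatenate([expand_1x1, expand_3x3])

    inputs = tf.keras.Input(shape=(64, 256, 1))   # hypothetical code-image size
    x = tf.keras.layers.Conv2D(64, (3, 3), activation='relu')(inputs)
    x = fire_module(x, squeeze_filters=16, expand_filters=64)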

Valid product-code recognition examples with different types of occlusion, translation, and camera focus issues

Crossing boundaries with AI

Advances in artificial intelligence and the maturity of TensorFlow enabled us to finally achieve a long-sought proof-of-purchase capability. Since launching in late February 2017, our product code recognition platform has fueled more than a dozen promotions and resulted in over 180,000 scanned codes; it is now a core component for all of Coca-Cola North America's web-based promotions.

Moving to an AI-enabled product-code recognition platform has been valuable for two key reasons:

  • Frictionless proof-of-purchase was enabled in a timely fashion, corresponding to our overall move to a mobile-first marketing platform.
  • Coke saved millions of dollars by avoiding the requirement to update printers in our production lines to support higher-fidelity fonts that would work with existing off-the-shelf OCR software.

Our product-code recognition platform is the first execution of new AI-enabled capabilities at scale within Coca-Cola. We're now exploring AI applications across multiple lines of business, from new product development to ecommerce retail optimization.