Tag Archives: machine learning

Best commute ever? Ride along with Google execs Diane Greene and Fei-Fei Li

Editor’s Note: The Grace Hopper Celebration of Women in Computing is coming up, and Diane Greene and Dr. Fei-Fei Li—two of our senior leaders—are getting ready. Sometimes Diane and Fei-Fei commute to the office together, and this time we happened to be along to capture the ride. Diane took over the music for the commute, and with Aretha Franklin’s “Respect” in the background, she and Fei-Fei chatted about the conference, their careers in tech, motherhood, and amplifying female voices everywhere. Hop in the backseat for Diane and Fei-Fei’s ride to work.

(A quick note for the riders: This conversation has been edited for brevity, and so you don’t have to read Diane and Fei-Fei talking about U-turns.)

[GIF: Fei-Fei and Diane on their morning commute]

Fei-Fei: Are you getting excited for Grace Hopper?

Diane: I’m super excited for the conference. We’re bringing together technical women to surface a lot of things that haven’t been talked about as openly in the past.

Fei-Fei: You’ve had a long career in tech. What makes this point in time different from the early days when you entered this field?

Diane: I got a degree in engineering in 1976 (ed note: Fei-Fei jumped in to remind Diane that this was the year she was born!). Computers were so exciting, and I learned to program. When I went to grad school to study computer science in 1985, there was actually a fair number of women at UC Berkeley. I’d say we had at least 30 percent women, which is way better than today.

It was a new, undefined field. And whenever there’s a new industry or technology, it’s wide open for everyone because nothing’s been established. Tech was that way, so it was quite natural for women to work in artificial intelligence and theory, and even in systems, networking, and hardware architecture. I came from mechanical engineering and the oil industry, where I was the only woman. Tech was full of women then, but now women make up less than 15 percent of the people in tech.

Fei-Fei: So do you think it’s too late?

Diane: I don’t think it’s too late. Girls in grade school and high school are coding. And certainly in colleges the focus on engineering is really strong, and the numbers are growing again.

Fei-Fei: You’re giving a talk at Grace Hopper—how will you talk to attendees about what distinguishes your career?

Diane: It’s wonderful that we’re both giving talks! Growing up, I loved building things so it was natural for me to go into engineering. I want to encourage other women to start with what you’re interested in and what makes you excited. If you love building things, focus on that, and the career success will come. I’ve been so unbelievably lucky in my career, but it’s a proof point that you can end up having quite a good career while doing what you’re interested in.

“I want to encourage other women to start with what you’re interested in and what makes you excited. If you love building things, focus on that, and the career success will come.” (Diane Greene)

Fei-Fei: And you are a mother of two grown, beautiful children. How did you prioritize them while balancing your career?

Diane: When I was at VMware, I had the “go home for dinner” rule. When we founded the company, I was pregnant and none of the other founders had kids. But we were able to build the culture around families—every time someone had a kid we gave them a VMware diaper bag. Whenever my kids had a school play or a parent-teacher conference, I would make a big show of leaving in the middle of the day so everyone would know they could do that too. And at Google, I encourage both men and women on my team to find that balance.

Fei-Fei: It’s so important for your message to get across, because young women today are thinking about their goals and what they want to build for the world, but also for themselves and their families. And there are so many women and people of color doing great work—how do we lift up that work? How do we get their voices heard? This is something I think about all the time: the voices of women and underrepresented communities in AI.

Diane: This is about educating people—not just women—to surface the accomplishments of everybody and make sure there’s no unconscious bias going on. I think Grace Hopper is a phenomenal tool for this, and there are things I incorporate into my work day to prevent that unconscious bias: pausing to make sure the right people are included in a meeting and that no one has been overlooked, and encouraging everyone in that meeting to participate so that all voices are heard.

Fei-Fei: Grace Hopper could be a great platform to share best practices for how to address these issues.

“...young women today are thinking about their goals and what they want to build for the world, but also for themselves and their families.” (Dr. Fei-Fei Li)

Diane: Every company is struggling to address diversity and there’s a school of thought that says having three or more people from one minority group makes all the difference in the world—I see it on boards. Whenever we have three or more women, the whole dynamic changes. Do you see that in your research group at all?

Fei-Fei: Yes, for a long time I was the only woman faculty member in the Stanford AI lab, but now it has attracted a lot of women who do very well because there’s a community. And that’s wonderful for me, and for the group.

Now back to you … you’ve had such a successful career, and I think a lot of women would love to know what keeps you going every day.

Diane: When you wake up in the morning, be excited about what’s ahead for the day. And if you’re not excited, ask yourself if it’s time for a change. Right now the Cloud is at the center of massive change in our world, and I’m lucky to have a front row seat to how it’s happening and what’s possible with it. We’re creating the next generation of technologies that are going to help people do things that we didn’t even know were possible, particularly in the AI/ML area. It’s exciting to be in the middle of the transformation of our world and the fast pace at which it’s happening.

Fei-Fei: Coming to Google Cloud, the most rewarding part is seeing how this is helping people go through that transformation and making a difference. And it’s at such a scale that it’s unthinkable on almost any other platform.

Diane: Cloud is making it easier for companies to work together and for people to work across boundaries together, and I love that. I’ve always found when you can collaborate across more boundaries you can get a lot more done.

To hear more from Fei-Fei and Diane, tune into Grace Hopper’s live stream on October 4. 

Source: Google Cloud



Access information quicker, do better work with Google Cloud Search

We all get sidetracked at work. We intend to be as efficient as possible, but inevitably, the “busyness” of business gets in the way through back-to-back meetings, unfinished docs or a rowdy inbox. To be more efficient, you need quick access to your information, like relevant docs, important tasks and context for your meetings.

Sadly, according to a report by McKinsey, workers spend up to 20 percent of their time—an entire day each week—searching for and consolidating information across a number of tools. We made Google Cloud Search available to Enterprise and Business edition customers earlier this year so that teams can access important information quicker. Here are a few ways that Cloud Search can help you get the information you need to accomplish more throughout your day.

1. Search more intuitively, access information quicker

If you search for a doc, you’re probably not going to remember its exact name or where you saved it in Drive. Instead, you might remember who sent the doc to you or a specific piece of information it contains, like a statistic.

A few weeks ago, we launched a new, more intuitive way to search in Cloud Search using natural language processing (NLP) technology. Type questions in Cloud Search using everyday language, like “Documents shared with me by John?,” “What’s my agenda next Tuesday?,” or “What docs need my attention?” and it will track down useful information for you.

2. Prioritize your to-dos, use spare time more wisely

With so much work to do, deciding what to focus on and what to leave for later isn’t always simple. A study by McKinsey reports that only nine percent of executives surveyed feel “very satisfied” with the way they allocate their time. We think technology, like Cloud Search, should help you with more than just finding what you’re looking for—it should help you stay focused on what’s important.

Imagine if your next meeting gets cancelled and you suddenly have an extra half hour to accomplish tasks. You can open the Cloud Search app to help you focus on what’s important. Powered by machine intelligence, Cloud Search proactively surfaces information that it believes is relevant to you and organizes it into simple cards that appear in the app throughout your workday. For example, it suggests documents or tasks based on which documents need your attention or upcoming meetings you have in Google Calendar.

3. Prepare for meetings, get more out of them

Employees spend a lot of time in meetings. According to a UK study by the Centre for Economics and Business Research, office workers spend an average of four hours per week in meetings—and joining them unprepared is common. The same survey found that office workers consider nearly half (47 percent) of the time spent in meetings unproductive.

Thankfully, Cloud Search can help. It uses machine intelligence to organize and present information to set you up for success in a meeting. In addition to surfacing relevant docs, Cloud Search also surfaces information about meeting attendees from your corporate directory, and even includes links to relevant conversations from Gmail.

Start by going into Cloud Search to see info related to your next meeting. If you’re interested in looking at another meeting later in the day, just click on “Today’s meetings” and it will show you your agenda for the day. Next, select an event in your agenda (sourced from your Calendar) and Cloud Search will recommend information that’s relevant to that meeting.


Take back your time and focus on what’s important—open the Cloud Search app and get started today, or ask your IT administrator to enable it in your domain. You can also learn more about how Cloud Search can help your teams here.

Source: Google Cloud


Now anyone can explore machine learning, no coding required

From helping you find your favorite dog photos, to helping farmers in Japan sort cucumbers, machine learning is changing the way people use code to solve problems. But how does machine learning actually work? We wanted to make it easier for people who are curious about this technology to learn more about it. So we created Teachable Machine, a simple experiment that lets you teach a machine using your camera—live in the browser, no coding required.

Teachable Machine is built with a new library called deeplearn.js, which makes it easier for any web developer to get into machine learning by training and running neural nets right in the browser. We’ve also open sourced the code to help inspire others to make new experiments.

Check it out at g.co/teachablemachine.


Source: Education


A GIPHY engineering intern goes the GIF-stance with Google Cloud Vision

Editor’s Note: Today, we’re GIFted with the presence of a guest author. Bethany Davis, current University of Pennsylvania student and former software engineering summer intern at GIPHY, shares the details of her summer project, which was powered by Google Cloud Vision. This is a condensed and modified version of a post published on the GIPHY Engineering blog.

When my friend was starting her first full-time job, I wanted to GIF her a pep talk before her first day. I had the perfect movie reference in mind: Becca from “Bridesmaids” saying, “You are more beautiful than Cinderella! You smell like pine needles and have a face like sunshine!”

I searched GIPHY for “you are more beautiful than Cinderella” to no avail, then searched for “bridesmaids” and scrolled through several dozen results before giving up.

[Image: searching for “Bridesmaids” or the direct quote did not yield any useful results]

It was easy to search for GIFs with popular tags, but because no one had tagged this GIF with the full line from the movie, I couldn’t find it. Yet I knew this GIF was out there. I wished there was a way to find the exact GIF that was pulled from the line in a movie, scene from a TV show or lyric from a song. Luckily, I was about to start my internship at GIPHY and I had the opportunity to tackle the problem head on—by using optical character recognition (OCR) and Google Cloud Vision to help you (and me) find the perfect GIF.

GIF me the tools and I’ll finish the job

When I started my internship, GIPHY engineers had already generated metadata about our collection of GIFs using Google Cloud Vision, an image recognition tool that is powered by machine learning. Specifically, Cloud Vision had performed optical character recognition (OCR) on our entire GIF library to detect text or captions within the image. The OCR results we got back from Google Cloud Vision were so good that my team was ready to incorporate the data directly into our search engine. I was tasked with parsing the data and indexing each GIF, then updating our search query to leverage the new, bolstered metadata.

Using Luigi, I wrote a batch job that processed the JSON data generated from Google Cloud Vision. Then I used AWS Simple Queue Service to coordinate data transfer from Google Cloud Vision to documents in our search index. GIPHY search is built on top of Elasticsearch, which stores GIF documents, and the search query returns results based on the data in our Elasticsearch index. Bringing all these components together looks something like this:

[Diagram: the GIPHY search indexing workflow, from Cloud Vision output through SQS to Elasticsearch]
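For the curious, here’s a rough sketch of what that glue code can look like. The names, file paths, and queue details below are illustrative placeholders, not GIPHY’s actual code:

    # A Luigi task that parses Cloud Vision OCR output and queues one
    # update message per GIF on SQS for the Elasticsearch indexer.
    import json

    import boto3
    import luigi


    class QueueOcrUpdates(luigi.Task):
        """Parse OCR JSON and enqueue one update message per GIF."""

        input_path = luigi.Parameter()   # file of Cloud Vision OCR results
        queue_url = luigi.Parameter()    # SQS queue consumed by the indexer

        def output(self):
            return luigi.LocalTarget(self.input_path + ".done")

        def run(self):
            sqs = boto3.client("sqs")
            with open(self.input_path) as f:
                for line in f:                  # one JSON record per GIF
                    record = json.loads(line)
                    message = {
                        "gif_id": record["gif_id"],
                        "caption_text": record.get("text", ""),
                    }
                    sqs.send_message(
                        QueueUrl=self.queue_url,
                        MessageBody=json.dumps(message),
                    )
            with self.output().open("w") as out:
                out.write("ok")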

One of the biggest challenges in building this update was ensuring that we could process data for millions of GIFs quickly. I had to learn how to optimize the runtime of the code that prepares GIF updates for Elasticsearch. My first iteration took 80+ hours, but eventually I got it to run in just eight.

Once all the data was indexed, the next step was to incorporate the text/caption metadata into our query. I used what’s called a match phrase query, which looks for words in the caption that appear in the same order as the words in the search input—guaranteeing that a substring of my movie quote is intact in the results. I also had to decide how much to weight the data from Google Cloud Vision relative to other sources of data we have about a GIF (like its tags or the frequency with which users click on it) to determine the most relevant results.
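Here’s a hedged sketch of what such a query can look like with the Elasticsearch Python client; the index name, field names, and boost values are illustrative, not GIPHY’s production query:

    # Combine a match-phrase clause on the OCR'd caption with a plain
    # match on existing tags, weighting the two against each other.
    from elasticsearch import Elasticsearch

    es = Elasticsearch()

    def search_gifs(query_text):
        query = {
            "bool": {
                "should": [
                    # words must appear in order within the caption text
                    {"match_phrase": {
                        "caption_text": {"query": query_text, "boost": 2.0}}},
                    # existing metadata (tags, etc.) still contributes
                    {"match": {"tags": {"query": query_text, "boost": 1.0}}},
                ]
            }
        }
        return es.search(index="gifs", query=query)

    hits = search_gifs("you are more beautiful than cinderella")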

It was time to see how the change would affect results. Using an internal GIPHY tool called Search UX, I searched for “where are the turtles,” a quote from “The Office.” The difference between the old query and the new one was dramatic:

[Image: search results for “where are the turtles” before and after the change]

I also used a tool that examines the change on a larger scale by running the old and new queries against a random set of search terms—useful for ensuring that the change won’t disrupt popular searches like “cat” or “happy birthday,” which already deliver high-quality results.

See the GIFference

After our internal tools indicated a positive change, I launched the updated query as an A/B experiment. The results looked promising, with an overall increase in click-through rate of 0.5 percent. But my change affects a very specific type of search, especially longer phrases, and the impact of the change is even more noticeable for queries in this category. For example, click-through rate when searching for the phrase “never give up never surrender” (from “Galaxy Quest”) increased 32 percent, and click-through rate for the phrase “gotta be quicker than that” increased 31 percent. In addition to quotes from movies and TV shows, we saw improvements for general phrases like “everything will be ok” and “there you go.” The final click-through rate for these queries is almost 100 percent!

The ultimate test was my own, though. I revisited my search query from the beginning of the summer:

[Image: the “you are more beautiful than Cinderella” search, now returning the right GIF]

Success! The search results are much improved. Now, the next time you use GIPHY to search for a specific scene or a direct quote, the results will show you exactly what you were looking for.

To learn more about the technical details behind my project, see the GIPHY Engineering blog.

How Machine Learning with TensorFlow Enabled Mobile Proof-Of-Purchase at Coca-Cola

In this guest editorial, Patrick Brandt of The Coca-Cola Company tells us how they're using AI and TensorFlow to achieve frictionless proof-of-purchase.

Coca-Cola's core loyalty program launched in 2006 as MyCokeRewards.com. The "MCR.com" platform included the creation of unique product codes for every Coca-Cola, Sprite, Fanta, and Powerade product sold in 20oz bottles and cardboard "fridge-packs" purchasable at grocery stores and other retail outlets. Users could enter these product codes at MyCokeRewards.com to participate in promotional campaigns.

Fast-forward to 2016: Coke's loyalty programs are still hugely popular with millions of product codes having been entered for promotions and sweepstakes. However, mobile browsing went from non-existent in 2006 to over 50% share by the end of 2016. The launch of Coke.com as a mobile-first web experience (replacing MCR.com) was a response to these changes in browsing behavior. Thumb-entering 14-character codes into a mobile device could be a difficult enough user experience to impact the success of our programs. We want to provide our mobile audience the best possible experience, and recent advances in artificial intelligence opened new opportunities.

The quest for frictionless proof-of-purchase

For years Coke attempted to use off-the-shelf optical character recognition (OCR) libraries and services to read product codes with little success. Our printing process typically uses low-resolution dot-matrix fonts with the cap or fridge-pack media running under the printhead at very high speeds. All of this translates into a low-fidelity string of characters that defeats off-the-shelf OCR offerings (and can sometimes be hard to read with the human eye as well). OCR is critical to simplifying the code-entry process for mobile users: they should be able to take a picture of a code and automatically have the purchase registered for a promotional entry. We needed a purpose-built OCR system to recognize our product codes.

Bottlecap and fridge-pack examples

Our research led us to a promising solution: convolutional neural networks (CNNs). CNNs are a family of "deep learning" neural networks that are at the heart of modern artificial intelligence products. Google has used CNNs to extract street address numbers from StreetView images. CNNs also perform remarkably well at recognizing handwritten digits. These number-recognition use-cases were a perfect proxy for the type of problem we were trying to solve: extracting strings from images that contain small character sets with lots of variance in the appearance of the characters.

CNNs with TensorFlow

In the past, developing deep neural networks like CNNs has been a challenge because of the complexity of available training and inference libraries. TensorFlow, a machine learning framework that was open sourced by Google in November 2015, is designed to simplify the development of deep neural networks.

TensorFlow provides high-level interfaces to different kinds of neuron layers and popular loss functions, which makes it easier to implement different CNN model architectures. The ability to rapidly iterate over different model architectures dramatically reduced the time required to build Coke's custom OCR solution because different models could be developed, trained, and tested in a matter of days. TensorFlow models are also portable: the framework supports model execution natively on mobile devices ("AI on the edge") or in servers hosted remotely in the cloud. This enables a "create once, run anywhere" approach for model execution across many different platforms, including web-based and mobile.
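As an illustration of how little code a model takes (the layer sizes and character-set size below are hypothetical; Coke’s production architectures are not public), here’s a small CNN in TensorFlow/Keras:

    # A minimal character-classifier CNN sketch: a few conv/pool blocks
    # feeding a softmax over a 36-character alphanumeric set.
    import tensorflow as tf

    def build_cnn(num_classes):
        return tf.keras.Sequential([
            tf.keras.layers.Conv2D(32, 3, activation="relu",
                                   input_shape=(64, 64, 1)),
            tf.keras.layers.MaxPooling2D(),
            tf.keras.layers.Conv2D(64, 3, activation="relu"),
            tf.keras.layers.MaxPooling2D(),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(128, activation="relu"),
            tf.keras.layers.Dense(num_classes, activation="softmax"),
        ])

    model = build_cnn(num_classes=36)  # e.g. an alphanumeric character set
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

Swapping layers in and out of a definition like this is what makes it practical to develop, train, and test several architectures in a matter of days.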

Machine learning: practice makes perfect

Any neural network is only as good as the data used to train it. We knew that we needed a large set of labeled product-code images to train a CNN that would achieve our performance goals. Our training set would be built in three phases:

  1. Pre-launch simulated images
  2. Pre-launch real-world images
  3. Images labeled by our users in production

The pre-launch training phase began by programmatically generating millions of simulated product-code images. These simulated images included variations in tilt, lighting, shadows, and blurriness. The prediction accuracy (i.e. how often all 14 characters were correctly predicted within the top-10 predictions) was at 50% against real-world images when the model was trained using only simulated images. This provided a baseline for transfer-learning: a model initially trained with simulated images was the foundation for a more accurate model that would be trained against real-world images.
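A sketch of that transfer-learning step, with placeholder datasets and a deliberately tiny stand-in model (the real pipelines and architecture are not public):

    import numpy as np
    import tensorflow as tf

    # Placeholder data standing in for the real image pipelines.
    simulated_images = np.random.rand(1000, 64, 64, 1)
    simulated_labels = np.random.randint(0, 36, size=1000)
    real_images = np.random.rand(200, 64, 64, 1)
    real_labels = np.random.randint(0, 36, size=200)

    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu",
                               input_shape=(64, 64, 1)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(36, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # Phase 1: pretrain on programmatically generated (simulated) images.
    model.fit(simulated_images, simulated_labels, epochs=5)

    # Phase 2: fine-tune the same weights on labeled real-world photos;
    # starting from the simulated-image weights beats training from scratch.
    model.fit(real_images, real_labels, epochs=10)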

The challenge now turned to enriching the simulated images with enough real-world images to hit our performance goals. We created a purpose-built training app for iOS and Android devices that "trainers" could use to take pictures of codes and label them; these labeled images were then transferred to cloud storage for training. We did a production run of several thousand product codes on bottle caps and fridge-packs and distributed these to multiple suppliers who used the app to create the initial real-world training set.

Even with an augmented and enriched training set, there is no substitute for images created by end-users in a variety of environmental conditions. We knew that scans would sometimes result in an inaccurate code prediction, so we needed to provide a user-experience that would allow users to quickly correct these predictions. Two components are essential to delivering this experience: a product-code validation service that has been in use since the launch of our original loyalty platform in 2006 (to verify that a predicted code is an actual code) and a prediction algorithm that performs a regression to determine a per-character confidence at each one of the 14 character positions. If a predicted code is invalid, the top prediction as well as the confidence levels for each character are returned to the user interface. Low-confidence characters are visually highlighted to guide the user to update characters that need attention.

Error correction user interface lets users correct invalid predictions and generate useful training data
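Here’s a simplified sketch of the per-character confidence idea; the character set, threshold, and array shapes are illustrative, not the production values:

    # The model emits a probability distribution per character position;
    # positions with low top-1 probability are flagged for the user to fix.
    import numpy as np

    CHARSET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
    LOW_CONFIDENCE = 0.6  # hypothetical cutoff for highlighting

    def decode_with_confidence(char_probs):
        """char_probs: shape (14, len(CHARSET)) of softmax outputs."""
        indices = np.argmax(char_probs, axis=1)
        code = "".join(CHARSET[i] for i in indices)
        confidences = char_probs[np.arange(len(indices)), indices]
        flagged = [pos for pos, c in enumerate(confidences)
                   if c < LOW_CONFIDENCE]
        return code, confidences, flagged

    probs = np.random.dirichlet(np.ones(len(CHARSET)), size=14)
    code, conf, flagged = decode_with_confidence(probs)
    # If the validation service rejects `code`, the UI highlights the
    # `flagged` positions so the user corrects those characters first.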

This user interface innovation enables an active learning process: a feedback loop allows the model to gradually improve by returning corrected predictions to the training pipeline. In this way, our users organically improve the accuracy of the character recognition model over time.

Product-code recognition pipeline

Optimizing for maximum performance

To meet user expectations around performance, we established a few ambitious requirements for the product-code OCR pipeline:

  • It had to be fast: we needed a one-second average processing time once the image of the product-code was sent into the OCR pipeline
  • It had to be accurate: our goal was to achieve 95% string recognition accuracy at launch with the guarantee that the model could be improved over time via active learning
  • It had to be small: the OCR pipeline needs to be small enough to be distributed directly to mobile apps and accommodate over-the-air updates as the model improves over time
  • It had to handle diverse product code media: dozens of different combinations of font types, bottlecaps, and cardboard fridge-pack media

We initially explored an architecture that used a single CNN for all product-code media. This approach created a model that was too large to be distributed to mobile apps, and the execution time was longer than desired. Our applied-AI partners at Quantiphi, Inc. began iterating on different model architectures, eventually landing on one that used multiple CNNs.

This new architecture reduced the model size dramatically without sacrificing accuracy, but it was still on the high end of what we needed in order to support over-the-air updates to mobile apps. We next used TensorFlow's prebuilt quantization module to reduce the model size by reducing the fidelity of the weights between connected neurons. Quantization reduced the model size by a factor of 4, but a dramatic reduction in model size occurred when Quantiphi had a breakthrough using a new approach called SqueezeNet.
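For readers who want to try weight quantization themselves, here’s a minimal sketch using today’s TensorFlow Lite converter (the current equivalent of the quantization module mentioned above, not necessarily the API in use at the time):

    import tensorflow as tf

    # A tiny stand-in model; in practice this is the trained OCR model.
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu",
                               input_shape=(64, 64, 1)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(36, activation="softmax"),
    ])

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]  # quantize weights
    tflite_model = converter.convert()

    with open("ocr_model.tflite", "wb") as f:
        f.write(tflite_model)  # much smaller, ready for OTA delivery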

The SqueezeNet model was published by a team of researchers from UC Berkeley and Stanford in November of 2016. It uses a small but highly complex design to achieve accuracy levels on par with much larger models against popular benchmarks such as ImageNet. After re-architecting our character recognition models to use a SqueezeNet CNN, Quantiphi was able to reduce the model size of certain media types by a factor of 100. Since the SqueezeNet model was inherently smaller, a richer feature detection architecture could be constructed, achieving much higher accuracy at much smaller sizes compared to our first batch of models trained without SqueezeNet. We now have a highly accurate model that can be easily updated on remote devices; the recognition success rate of our final model before active learning was close to 96%, which translates into a 99.7% character recognition accuracy (just 3 misses for every 1000 character predictions).
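For reference, here’s what SqueezeNet’s core building block, the “fire module,” looks like in Keras (the filter counts and class count below are illustrative):

    # A 1x1 "squeeze" layer feeds parallel 1x1 and 3x3 "expand" layers
    # whose outputs are concatenated, cutting parameters dramatically.
    import tensorflow as tf

    def fire_module(x, squeeze_filters, expand_filters):
        s = tf.keras.layers.Conv2D(squeeze_filters, 1, activation="relu")(x)
        e1 = tf.keras.layers.Conv2D(expand_filters, 1, activation="relu")(s)
        e3 = tf.keras.layers.Conv2D(expand_filters, 3, padding="same",
                                    activation="relu")(s)
        return tf.keras.layers.Concatenate()([e1, e3])

    inputs = tf.keras.Input(shape=(64, 64, 1))
    x = tf.keras.layers.Conv2D(64, 3, activation="relu")(inputs)
    x = fire_module(x, squeeze_filters=16, expand_filters=64)
    x = fire_module(x, squeeze_filters=16, expand_filters=64)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(36, activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)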

Valid product-code recognition examples with different types of occlusion, translation, and camera focus issues

Crossing boundaries with AI

Advances in artificial intelligence and the maturity of TensorFlow enabled us to finally achieve a long-sought proof-of-purchase capability. Since launching in late February 2017, our product code recognition platform has fueled more than a dozen promotions and resulted in over 180,000 scanned codes; it is now a core component for all of Coca-Cola North America's web-based promotions.

Moving to an AI-enabled product-code recognition platform has been valuable for two key reasons:

  • Frictionless proof-of-purchase was enabled in a timely fashion, corresponding to our overall move to a mobile-first marketing platform.
  • Coke saved millions of dollars by avoiding the requirement to update printers in our production lines to support higher-fidelity fonts that would work with existing off-the-shelf OCR software.

Our product-code recognition platform is the first execution of new AI-enabled capabilities at scale within Coca-Cola. We're now exploring AI applications across multiple lines of business, from new product development to ecommerce retail optimization.

How publishers can take advantage of machine learning

As the publishing world continues to face new challenges amidst the shift to digital, news media and publishers are tasked with unlocking new opportunities. With online news consumption continuing to grow, it’s crucial that publishers take advantage of new technologies to sustain and grow their business. Machine learning yields tremendous value for media and can help publishers tackle their hardest problems: engaging readers, increasing profits, and making newsrooms more efficient. Google has a suite of machine learning tools and services that are easy to use—here are a few ways they can help newsrooms and reporters do their jobs.

1. Improve your newsroom's efficiency 

Editors want their stories to be appealing and to stand out so that people will read them. Finding just the right photograph or video can be key in bringing a story to life, but with ever-pressing deadlines, there’s often not enough time to find that perfect image. This is where Google Cloud Vision and Video Intelligence can simplify the process by tagging images and videos based on their actual content. This metadata can then be used to make it easier and quicker to find the right visual.
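As a taste of what that looks like in practice, here’s a minimal Cloud Vision sketch (the file path is a placeholder):

    # Tag an image with Cloud Vision; returned labels become searchable
    # metadata for the photo desk.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    with open("newsroom_photo.jpg", "rb") as f:
        image = vision.Image(content=f.read())

    response = client.label_detection(image=image)
    for label in response.label_annotations:
        print(label.description, label.score)  # e.g. "stadium", 0.97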

2.  Better understand your audience

News publishers use analytics tools to grow their audiences and understand what that audience is reading and how they’re discovering content. Google Cloud Natural Language uses machine learning to understand what your content is about, independent of a website’s section and subsection structure (e.g., Sports, Local). Today, Cloud Natural Language announced a new content classifier and entity sentiment analysis that dig into the detail of what a story is actually about. For example, an article about a high-tech stadium for the Golden State Warriors may be classified under the “technology” section of a paper, when its content should fall under both “technology” and “sports.” This section-independent tagging can increase readership by driving smarter article recommendations, and it provides better data around trending topics. Naveed Ahmad, Senior Director of Data at Hearst, has emphasized that precision and speed are critical to engaging readers: “Google Cloud Natural Language is unmatched in its accuracy for content classification. At Hearst, we publish several thousand articles a day across 30+ properties and, with natural language processing, we're able to quickly gain insight into what content is being published and how it resonates with our audiences."
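A minimal sketch of the content classifier using the Cloud Natural Language client library (the article text is a placeholder):

    # Classify an article's content independent of where it was published.
    from google.cloud import language_v1

    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content="The Golden State Warriors' new stadium features ...",
        type_=language_v1.Document.Type.PLAIN_TEXT,
    )

    response = client.classify_text(request={"document": document})
    for category in response.categories:
        print(category.name, category.confidence)  # e.g. "/Sports", 0.92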

3. Engage with new audiences

As publications expand their reach into more countries, they have to write for multiple audiences in different languages and many cannot afford multi-language desks. Google Cloud Translation makes translating for different audiences easier by providing a simple interface to translate content into more than 100 languages. Vice launched GoogleFish earlier this year to help editors quickly translate existing Vice articles into the language of their market. Once text was auto-translated, an editor could then push the translation to a local editor to ensure tone and local slang were accurate. Early translation results are very positive and Vice is also uncovering new insights around global content sharing they could not previously identify.

DB Corp, India’s largest newspaper group, publishes 62 editions in four languages and sells about 6 million newspaper copies per day. To serve its growing, diverse readership, reporters use Google Cloud Translation to capture and document interviews and source material for articles, with accuracy rates of 95 percent for Hindi alone.
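Here’s what a basic Cloud Translation call looks like (the text and target language are placeholders):

    # Translate a sentence; one call covers 100+ languages.
    from google.cloud import translate_v2

    client = translate_v2.Client()
    result = client.translate(
        "Machine learning is changing how newsrooms work.",
        target_language="hi",  # Hindi, one of DB Corp's four languages
    )
    print(result["translatedText"])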

4. Monetize your audience

So far we’ve primarily outlined ways to improve content creation and engagement with readers; monetization, however, is a critical piece for all publishers. Using Cloud Datalab, publishers can identify new subscription opportunities and offerings. The metadata collected from image, video, and content tagging creates an invaluable dataset for advertisers, such as audiences interested in local events or personal finance, or those who watch videos about cars or travel. The Washington Post has seen success with its in-house solution through the ability to target native ads to likely interested readers. Lastly, improved content recommendation drives consumption, ultimately improving the bottom line.

5. Experiment with new formats

The ability to share news quickly and efficiently is a major concern for newsrooms across the world. Today more than ever, though, readers consume news in different ways across different platforms, and the “one format fits all” approach is not always best. TensorFlow’s text summarization models can help publishers quickly experiment with creating short-form content from longer stories, letting them test the best way to share their content across different platforms. Reddit recently launched a similar “tl;dr bot” that summarizes long posts into digestible snippets.
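To make the idea concrete, here’s a toy frequency-based extractive summarizer; it’s a simple stand-in for illustration, not the TensorFlow model itself:

    # Score sentences by how many frequent words they contain and keep
    # the top few, preserving their original order.
    import re
    from collections import Counter

    def summarize(text, max_sentences=2):
        sentences = re.split(r"(?<=[.!?])\s+", text.strip())
        words = re.findall(r"[a-z']+", text.lower())
        freq = Counter(words)
        scored = sorted(
            sentences,
            key=lambda s: sum(freq[w]
                              for w in re.findall(r"[a-z']+", s.lower())),
            reverse=True,
        )
        kept = set(scored[:max_sentences])
        return " ".join(s for s in sentences if s in kept)

    print(summarize("Long story text goes here. It has many sentences."))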

6. Keep your content safe for everyone

The comments section can be a place of fruitful discussion as well as toxicity. Users who comment are frequently the most highly engaged on the site overall, and while publishers want to keep discussion open, it can spiral into offensive speech and bad language. Jigsaw’s Perspective is an API that uses machine learning to spot harmful comments, which can then be flagged for moderators. Publishers like the New York Times have leveraged Perspective’s technology to improve the way all readers engage with comments. By making the task of moderating conversations at scale easier, it frees up valuable time for editors and improves online discussion.
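Here’s a minimal sketch of calling Perspective over REST; the API key and the moderation threshold are placeholders:

    # Request a TOXICITY score for a comment from the Perspective API.
    import requests

    API_KEY = "YOUR_API_KEY"  # placeholder
    URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
           f"comments:analyze?key={API_KEY}")

    def toxicity(comment_text):
        body = {
            "comment": {"text": comment_text},
            "requestedAttributes": {"TOXICITY": {}},
        }
        response = requests.post(URL, json=body).json()
        return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

    score = toxicity("You write like a third grader.")
    if score > 0.8:  # illustrative moderation threshold
        print("flag for human review")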

[Image: the New York Times’ moderator dashboard; each dot represents a negative comment]

From the printing press to machine learning, technology continues to spur new opportunities for publishers to reach more people, create engaging content and operate efficiently. We're only beginning to scratch the surface of what machine learning can do for publishers. Keep tabs on The Keyword for the latest developments.

Search more intuitively using natural language processing in Google Cloud Search

Earlier this year, we launched Google Cloud Search, a new G Suite tool that uses machine learning to help organizations find and access information quickly.

Just like in Google Search, which lets you search in a natural, intuitive way, we want to make it easy for you to find information in the workplace using everyday language. According to Gartner research, by 2018, 30 percent or more of enterprise search queries will start with a “what,” “who,” “how” or “when.”*

Today, we’re making it possible to use natural language processing (NLP) technology in Cloud Search so you can track down information—like documents, presentations or meeting details—fast.


Find information fast with Cloud Search

If you’re looking for a Google Doc, you’re more likely to remember who shared it with you than the exact name of a file. Now, you can use NLP technology, an intuitive way to search, to find information quickly in Cloud Search.

Type queries into Cloud Search using natural, everyday language. Ask questions like “Docs shared by Mary,” “Who’s Bob’s manager?” or “What docs need my attention?” and Cloud Search will show you answer cards with relevant information.


Having access to information quicker can help you make better and faster decisions in the workplace. If your organization runs on G Suite Business or Enterprise edition, start using Cloud Search now. If you’re new to Cloud Search, learn more on our website or check out this video to see it in action.

[Video: Introducing Google Cloud Search]

*Gartner, ‘Insight Engines’ Will Power Enterprise Search That is Natural, Total and Proactive, 09 December 2015, refreshed 05 April 2017