Noodle on this: Machine learning that can identify ramen by shop

There are casual ramen fans and then there are ramen lovers. There are people who are all about tonkotsu all the time, and others who swear by tsukemen. And then there’s machine learning, which, based on a recent case study out of Japan, might be the biggest ramen aficionado of them all.


Recently, data scientist Kenji Doi used machine learning models and AutoML Vision to classify bowls of ramen and identify, with about 95 percent accuracy, which of 41 ramen shops each bowl was made at. Sounds crazy (also delicious), especially when you see what these bowls look like:
Ramen bowls made at three different Ramen Jiro shops

With 41 locations around Tokyo, Ramen Jiro is one of the most popular restaurant franchises in Japan, thanks to its generous portions of toppings, noodles, and soup served at low prices. Every shop serves the same basic menu, and as you can see above, it's almost impossible for a human (especially one new to Ramen Jiro) to tell which shop a given bowl was made at.


But Kenji thought deep learning could discern the minute details that make one shop’s bowl of ramen different from the next. He had already built a machine learning model to classify ramen, but wanted to see if AutoML Vision could do it more efficiently.


AutoML Vision creates customized ML models automatically, whether to identify animals in the wild, recognize types of products to improve an online store, or, in this case, classify ramen. You don’t have to be a data scientist to use it: all you need to do is upload well-labeled images and click a button. In Kenji’s case, he compiled a set of 48,000 photos of bowls of soup from Ramen Jiro locations, along with labels for each shop, and uploaded them to AutoML Vision. The model took about 24 hours to train, all automatically (although a less accurate, “basic” mode had a model ready in just 18 minutes). The results were impressive: Kenji’s model predicted the shop from a photo alone with 94.5 percent accuracy.
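
For a sense of what “well-labeled images” means in practice, AutoML Vision ingests a CSV that pairs each image’s Cloud Storage URI with its label. Here’s a minimal Python sketch of building that CSV, assuming the photos sit in one local folder per shop and have already been mirrored to a bucket; the bucket name and folder layout are made up for illustration:

```python
import csv
from pathlib import Path

# Hypothetical layout: photos/<shop_name>/<image>.jpg locally, mirrored to
# the same paths under a Cloud Storage bucket you have already uploaded to.
BUCKET = "gs://my-ramen-bucket"   # assumption, not the bucket from the study
PHOTO_ROOT = Path("photos")

with open("ramen_labels.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for image in sorted(PHOTO_ROOT.glob("*/*.jpg")):
        shop = image.parent.name  # the folder name doubles as the label
        writer.writerow([f"{BUCKET}/{shop}/{image.name}", shop])
```

From there, the AutoML Vision UI takes over the architecture search, training, and evaluation.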

Confusion matrix of Ramen Jiro shop classifier by AutoML Vision (Advanced mode). Row = actual shop, column = predicted shop. You can see AutoML Vision incorrectly identified the restaurant location in only a couple of instances for each test case.

AutoML Vision is designed for people without ML expertise, but it also speeds things up dramatically for experts. Building a ramen classifier from scratch would be a time-consuming process that demands data science experience and multiple steps: labeling, hyperparameter tuning, repeated attempts with different neural net architectures, and even failed training runs. As Kenji puts it, “With AutoML Vision, a data scientist wouldn’t need to spend a long time training and tuning a model to achieve the best results. This means businesses could scale their AI work even with a limited number of data scientists." We wrote about another recent example of AutoML Vision at work in this Big Data blog post, which also has more technical details on Kenji’s model.


As for how AutoML detects the differences in ramen, it’s certainly not from the taste. Kenji’s first hypothesis was that the model was looking at the color or shape of the bowl or table, but that seems unlikely, since the model was highly accurate even when each shop used the same bowl and table design. His new theory is that the model is accurate enough to distinguish very subtle differences between cuts of meat, or the way toppings are served. He plans to keep experimenting with AutoML to see whether his theories hold. Sounds like a project that might involve more than a few bowls of ramen. Slurp on.

Source: Google Cloud


TensorFlow lends a hand to build a rock-paper-scissors machine

Editor’s note: It’s hard to think of a more “analog” game than rock-paper-scissors. But this summer, one Googler decided to use TensorFlow, Google’s open source machine learning system, to build a machine that could play rock-paper-scissors. For more technical details and source code, see the original post on the Google Cloud Big Data and Machine Learning Blog.

This summer, my 12-year-old son and I were looking for a science project to do together. He’s interested in CS and has studied programming with Scratch, so we knew we wanted to do something involving coding. After exploring several ideas, we decided to build a rock-paper-scissors machine that detects a hand gesture, then selects the appropriate pose to respond: rock, paper, or scissors.

But our rock-paper-scissors machine had a secret ingredient: TensorFlow, which in this case runs a very simple ML algorithm that detects your hand posture from the readings of an Arduino microcontroller connected to the glove.


To build the machine’s hardware, we used littleBits—kid-friendly kits that include a wide variety of components like LEDs, motors, switches, sensors, and controllers—to attach three bend sensors to a plastic glove. When you bend your fingers while wearing the glove, the sensors output an electric signal. To read the signals from the bend sensors and control the machine’s dial, we used an Arduino and Servo module.
The hardware components of the rock-paper-scissors machine

After putting the hardware together, we wrote the code to read data from the sensors. The Arduino module reads the input signal voltage from each bend sensor on the glove, and converts the three signals into a series of three numbers.
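
The original post has our actual source code; as a rough Python illustration of that step, here’s how a host machine could read those three numbers over USB serial (the port name, baud rate, and the comma-separated output format of the Arduino sketch are all assumptions):

```python
import serial  # pip install pyserial

# Assumed setup: the Arduino sketch prints the three bend-sensor readings
# as one comma-separated line per sample, e.g. "312,87,640". The port name
# varies by machine (often COM3 on Windows).
port = serial.Serial("/dev/ttyACM0", baudrate=9600, timeout=1)

def read_sensors():
    """Return the latest three sensor values as ints, or None if no data."""
    line = port.readline().decode("ascii", errors="ignore").strip()
    parts = line.split(",")
    if len(parts) != 3:
        return None
    return [int(p) for p in parts]
```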

Estimated probability distribution of rock (red), paper (green) and scissors (blue), as determined by TensorFlow and our linear model

The next step was to determine which combination of three numbers represents rock, paper, or scissors. We wanted to do it in a way that could stay flexible over time, for example if we later wanted to capture more than three hand positions with many more sensors. So we created a linear model (a simple algebraic formula that many of you might have learned in high school or college) and used machine learning in TensorFlow to solve for the formula’s coefficients from the recorded sensor data and the expected results (rock, paper, or scissors). What’s cool about this is that it’s like automated programming: we specify the input and output, and the computer learns the most important transformation in the middle.
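
To make that concrete, here’s a minimal sketch of the same idea written with tf.keras, a newer TensorFlow API than the one we used; the sensor values are made up, and our real code is in the original post:

```python
import numpy as np
import tensorflow as tf

# Toy stand-in readings: one row of three bend-sensor values per gesture.
# Labels: 0 = rock, 1 = paper, 2 = scissors. Real data would be recorded
# from the glove; these numbers are illustrative only.
X = np.array([[900.0, 880.0, 910.0],   # all fingers bent   -> rock
              [120.0, 100.0, 130.0],   # all fingers open   -> paper
              [110.0,  95.0, 905.0]])  # two open, one bent -> scissors
X = X / 1023.0  # scale 10-bit analog readings into [0, 1]
y = np.array([0, 1, 2])

# A single dense layer with softmax is exactly a linear model:
# probabilities = softmax(W @ x + b); training solves for W and b.
model = tf.keras.Sequential(
    [tf.keras.layers.Dense(3, activation="softmax", input_shape=(3,))])
model.compile(optimizer=tf.keras.optimizers.Adam(0.1),
              loss="sparse_categorical_crossentropy")
model.fit(X, y, epochs=200, verbose=0)

print(model.predict(X).argmax(axis=1))  # should recover [0 1 2]
```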

Finally, we put it all together. Once we determine the hand’s posture, the Servo turns the machine hand to win the game. (Of course, if you were feeling competitive, you could always program the machine to let YOU win every time… but we would never do that.)
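
The game logic itself is the simplest part. A hypothetical sketch of picking the counter-move, which the host would then send back to the Arduino to position the Servo:

```python
# Map the gesture TensorFlow detected to the move that beats it.
WINNING_MOVE = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

def machine_move(detected_gesture):
    return WINNING_MOVE[detected_gesture]

print(machine_move("scissors"))  # -> rock
```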

My son drawing the sign for the machine hand

Rock-paper-scissors probably isn’t what comes to mind when you think about ML, but this project demonstrates how ML can be useful for all kinds of programmers, regardless of the task—reducing human coding work and speeding up calculations. In this case, it also provided some family fun!