Tag Archives: machine learning

Full spectrum of on-device machine learning tools on Android

Posted by Hoi Lam, Android Machine Learning



This blog post is part of a weekly series for #11WeeksOfAndroid. Each week we’re diving into a key area of Android so you don’t miss anything. Throughout this week, we covered various aspects of Android on-device machine learning (ML). Whatever your stage of development, whether you are starting out or maintaining an established app; whatever role you play in design, product, or engineering; and whatever your skill level, from beginner to expert, we have a wide range of ML tools for you.

Design - ML as a differentiator

“Focus on the user and all else will follow” is a Google mantra that becomes even more relevant in our machine learning age. Our Design Advocate, Di Dang, highlighted the importance of finding the unique intersection of user problems and ML strengths. Too often, teams are so keen on the idea of machine learning that they lose sight of their user needs.



Di outlined how the People + AI Guidebook can help you make ML product decisions and used the example of the Read Along app to illustrate topics like precision and recall, which are unique to ML design and development. Check out her interview with the Read Along team together with your team for more inspiration.

New ML Kit fully focused on on-device

When you decide that on-device machine learning is the solution, the easiest way to implement it will be through turnkey SDKs like ML Kit. Sophisticated Google-trained models and processing pipelines are offered through an easy-to-use interface in Kotlin / Java. ML Kit is designed and built for on-device ML: it works offline, offers enhanced privacy, unlocks high performance for real-time use cases, and is free. We recently made ML Kit a standalone SDK, and it no longer requires a Firebase account. Add just one line to your build.gradle file and you can start bringing ML functionality into your app.



The team has also added new functionalities such as Jetpack lifecycle support and the option to use the face contour models via Google Play Services saving as much as 20MB in app size. Another much anticipated addition is the support for swapping Google models with your own for both Image Labeling as well as Object Detection and Tracking. This provides one of the easiest ways to add TensorFlow Lite models to your applications without interacting with ByteArray!

Customise with TensorFlow Lite and Android tools

If the base model provided by ML Kit doesn’t quite fit the bill, what should developers do? The first port of call should be TensorFlow Hub where ready-to-use TensorFlow Lite models from both Google and the wider community can be downloaded. From 100,000 US Supermarket products to tomato plant diseases classifiers, the choice is yours.



In addition to Firebase AutoML Vision Edge, you can also build your own model using TensorFlow Model Maker (image classification / text classification) with just a few lines of Python. Once you have a TensorFlow Lite model from either TensorFlow Hub or the Model Maker, you can easily integrate it with your Android app using ML Kit Image Labeling or Object Detection and Tracking. If you prefer an open source solution, Android Studio 4.1 beta introduces ML model binding, which wraps the TensorFlow Lite model in an easy-to-use Kotlin / Java wrapper. Adding a custom model to your Android app has never been easier. Check out this blog for more details.

Time for on-device ML is now

From the examples of the Android Developer Challenge winners, it is obvious that on-device machine learning has come of age and ML functionalities once reserved for the cloud or supercomputers are now available on your Android phone. Take a step forward with us by trying out our codelabs of the day:

Also check out the ML Week learning pathway and take the quiz to get your very own ML badge.

Android on-device machine learning is a rapidly evolving platform. If you have any enhancement requests or feedback on how it could be improved, please let us know, together with your use case (TensorFlow Lite / ML Kit). Time for on-device ML is now.

Resources

You can find the entire playlist of #11WeeksOfAndroid video content here, and learn more about each week here. We’ll continue to spotlight new areas each week, so keep an eye out and follow us on Twitter and YouTube. Thanks so much for letting us be a part of this experience with you!

Read Along: Grow child literacy with on-device ML design insights

Posted by Di Dang, Design Advocate

From our Machine Learning-themed week together, we’ve delved into an ML Kit x CameraX Codelab, and we learned how to train your own custom models and integrate them in your Android app. In addition to the technical considerations that go into using ML, it’s important that we design our ML-based apps in a way that enables our users to feel in control of the ML technology, and not the other way around. To help product creators understand some best practices for ML product decisions, the PAIR team published the People + AI Guidebook at Google I/O last year. Let’s take a look at some ML design considerations you can apply in your Android apps by learning from the example of Read Along.




Google recently launched Read Along, an Android app that uses on-device ML and voice UI to help children learn to read anytime, anywhere, using just their voice. According to the UN Division of Sustainable Development Goals, more than 50% of children worldwide are not achieving minimum proficiency in reading. First launched in India as “Bolo”, the “Read Along” app is now available globally. We recently went behind the scenes with the Read Along team in this episode of Centered, to learn how they made an ML- and voice-based app to improve child literacy.

Why Machine Learning and Voice UI?

Since using ML can be time- and cost-intensive, we need to find the intersection of ML strengths and user needs. To learn to read, children need time on task and one-on-one attention, which is challenging for areas where there is a lack of access to teachers or educational materials. “In many parts of the world, there are only so many schools that can be built, only so many teachers can be trained. So first and foremost, it's a scale problem,” said Nitin Kashyap, Read Along’s product manager. This creates a unique opportunity for the use of ML—to provide real-time reading feedback at scale, Read Along utilizes the Google Assistant’s text-to-speech and speech recognition capabilities. The Read Along team also added abilities on edge to preserve children’s privacy. The voice data is analyzed on-device without being sent to any Google servers, enabling children to use Read Along offline as well.

False positives vs. false negatives

Since ML-based systems are inherently probabilistic, they can generate “wrong” predictions in the form of false positives and false negatives. As we create ML-based applications, we need to decide which behavior to optimize for. Within the Read Along experience, a false positive denotes that the child has misread a word, though the system fails to recognize this and does not provide corrective feedback. On the other hand, an example of a false negative is when a child reads a word correctly, but Read Along predicts the word was read incorrectly, and thus prompts the child to try again. “We spent a little time to understand what really happens when the child gets false positive and false negative, and what impact does it have on the psychology, and also on the reading experience,” said Eshita Priyadarshini, Read Along’s UX Research Lead. “When the child is reading, we don't really tell him, ‘Oh, you got that word wrong. Why don't you read it again?’” By unpacking the impact of false positives and false negatives on the user, the Read Along team decided to optimize for recall, which reduces false negatives at the cost of more false positives and results in a user experience that feels more encouraging for children.
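
To make the trade-off concrete, here is a minimal Kotlin sketch. It is not Read Along's code: the scores, labels, and thresholds are invented for illustration, and it assumes the positive class is "the system accepts the word as read correctly", matching the definitions above.

```kotlin
// Hypothetical data: a confidence score that a word was read correctly,
// paired with whether the child actually read it correctly.
data class Reading(val score: Double, val actuallyCorrect: Boolean)

data class Counts(val tp: Int, val fp: Int, val fn: Int, val tn: Int) {
    // Precision: of the words the system accepted, how many were really correct.
    val precision: Double get() = if (tp + fp == 0) 0.0 else tp.toDouble() / (tp + fp)
    // Recall: of the words really read correctly, how many the system accepted.
    val recall: Double get() = if (tp + fn == 0) 0.0 else tp.toDouble() / (tp + fn)
}

// "Positive" means the system accepts the word (score >= threshold).
fun evaluate(readings: List<Reading>, threshold: Double): Counts {
    var tp = 0; var fp = 0; var fn = 0; var tn = 0
    for (r in readings) {
        val accepted = r.score >= threshold
        when {
            accepted && r.actuallyCorrect -> tp++
            accepted && !r.actuallyCorrect -> fp++   // misread word goes uncorrected
            !accepted && r.actuallyCorrect -> fn++   // correct word, child asked to retry
            else -> tn++
        }
    }
    return Counts(tp, fp, fn, tn)
}

// Invented sample: four words read correctly, four misread.
val sampleReadings = listOf(
    Reading(0.95, true), Reading(0.80, true), Reading(0.65, true), Reading(0.55, true),
    Reading(0.70, false), Reading(0.45, false), Reading(0.30, false), Reading(0.20, false)
)

fun main() {
    for (threshold in listOf(0.75, 0.40)) {
        val c = evaluate(sampleReadings, threshold)
        println("threshold=$threshold $c precision=${c.precision} recall=${c.recall}")
    }
}
```

On this sample, lowering the threshold from 0.75 to 0.40 raises recall from 0.5 to 1.0 (no correct word is rejected), while false positives go from 0 to 2. That is the direction of the trade-off described above: fewer discouraging retries, more uncorrected misreadings.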

To learn more about how the Read Along team made ML product decisions, check out the full Centered episode. For more guidance on how your cross-functional team (spanning UX, PM, and engineering) can come together to design ML-based applications, check out the People + AI Guidebook.

New tools for finding, training, and using custom machine learning models on Android

Posted by Hoi Lam, Android Machine Learning

Yesterday, we talked about turnkey machine learning (ML) solutions with ML Kit. But what if that doesn’t completely address your needs and you need to tweak it a little? Today, we will discuss how to find alternative models, and how to train and use custom ML models in your Android app.

Find alternative ML models

Crop disease models from the wider research community available on tfhub.dev

If the turnkey ML solutions don't suit your needs, TensorFlow Hub should be your first port of call. It is a repository of ML models from Google and the wider research community. The models on the site are ready for use in the cloud, in a web-browser or in an app on-device. For Android developers, the most exciting models are the TensorFlow Lite (TFLite) models that are optimized for mobile.

In addition to key vision models such as MobileNet and EfficientNet, the repository also boasts models powered by the latest research such as:

Many of these solutions were previously only available in the cloud, as the models are too large and too power intensive to run on-device. Today, you can run them on Android on-device, offline and live.

Train your own custom model

Besides the large repository of base models, developers can also train their own models. Developer-friendly tools are available for many common use cases. In addition to Firebase’s AutoML Vision Edge, the TensorFlow team launched TensorFlow Lite Model Maker earlier this year to give developers more choice of base models and support for more use cases. TensorFlow Lite Model Maker currently supports two common ML tasks:

The TensorFlow Lite Model Maker can run on your own developer machine or in Google Colab online machine learning notebooks. Going forward, the team plans to improve the existing offerings and to add new use cases.

Using custom model in your Android app

New TFLite Model import screen in Android Studio 4.1 beta

Once you have selected or trained a model, there are new easy-to-use tools to help you integrate it into your Android app without having to convert everything into ByteArrays. The first new tool is ML Model binding with Android Studio 4.1. This lets developers import any TFLite model, read the input / output signature of the model, and use it with just a few lines of code that call the open source TensorFlow Lite Android Support Library.

Another way to implement a TensorFlow Lite model is via ML Kit. Starting in June, ML Kit no longer requires a Firebase project for on-device functionality. In addition, the image classification and object detection and tracking (ODT) APIs support custom models. The latter ODT offering is especially useful in use-cases where you need to separate out objects from a busy scene.

So how should you choose between these three solutions? If you are trying to detect a product on a busy supermarket shelf, ML Kit object detection and tracking can help your user select a specific product for processing. The API then performs image classification on just the part of the image that contains the product, which results in better detection performance. On the other hand, if the scene or the object you are trying to detect takes up most of the input image, for example, a landmark such as Big Ben, using ML Model binding or the ML Kit image classification API might be more appropriate.
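
The rule of thumb above can be sketched in plain Kotlin as a check on how much of the frame the subject occupies. Everything here is illustrative: `Box`, `coverage`, `suggestApproach`, and the 0.5 cut-off are hypothetical names and values chosen for this sketch, not part of ML Kit or ML Model binding.

```kotlin
// Hypothetical axis-aligned box in pixel coordinates (not android.graphics.Rect).
data class Box(val left: Int, val top: Int, val right: Int, val bottom: Int) {
    val area: Long get() = maxOf(0, right - left).toLong() * maxOf(0, bottom - top)
}

enum class Approach { DETECT_THEN_CLASSIFY, CLASSIFY_WHOLE_IMAGE }

// Fraction of the frame covered by the subject, clamped to the frame bounds.
fun coverage(frameWidth: Int, frameHeight: Int, subject: Box): Double {
    val clipped = Box(
        maxOf(0, subject.left), maxOf(0, subject.top),
        minOf(frameWidth, subject.right), minOf(frameHeight, subject.bottom)
    )
    return clipped.area.toDouble() / (frameWidth.toLong() * frameHeight)
}

// minCoverage is an assumed cut-off for illustration, not a documented value.
fun suggestApproach(
    frameWidth: Int,
    frameHeight: Int,
    subject: Box,
    minCoverage: Double = 0.5
): Approach =
    if (coverage(frameWidth, frameHeight, subject) >= minCoverage) Approach.CLASSIFY_WHOLE_IMAGE
    else Approach.DETECT_THEN_CLASSIFY

fun main() {
    // A small product on a busy shelf versus a landmark filling the frame.
    println(suggestApproach(1000, 1000, Box(400, 400, 600, 600)))
    println(suggestApproach(1000, 1000, Box(100, 50, 950, 1000)))
}
```

In a real app you would not usually know the subject's box in advance; the sketch only illustrates why a small subject in a cluttered frame benefits from detection before classification, while a frame-filling subject can be classified directly.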

TensorFlow Hub bird detection model with ML Kit Object Detection & Tracking API

Two examples of how these tools can fit together

Here are some resources to help you get started:

Customizing your model is easier than ever

Finding, building and using custom models on Android has never been easier. As both Android and TensorFlow teams increase the coverage of machine learning use cases, please let us know how we can improve these tools for your use cases by filing an enhancement request with TensorFlow Lite or ML Kit.

Tomorrow, we will take a step back and focus on how to appropriately use and design for a machine learning first Android app. The content will be appropriate for the entire development team, so bring your product manager and designers along. See you next time.

Presenting a Challenge and Workshop in Efficient Open-Domain Question Answering



One of the primary goals of natural language processing is to build systems that can answer a user's questions. To do this, computers need to be able to understand questions, represent world knowledge, and reason their way to answers. Traditionally, answers have been retrieved from a collection of documents or a knowledge graph. For example, to answer the question, “When was the declaration of independence officially signed?” a system might first find the most relevant article from Wikipedia, and then locate a sentence containing the answer, “August 2, 1776”. However, more recent approaches, like T5, have shown that neural models trained on large amounts of web-text can also answer questions directly, without retrieving documents or facts from a knowledge graph. This has led to significant debate about how knowledge should be stored for use by our question answering systems: in human readable text and structured formats, or in the learned parameters of a neural network.

Today, we are proud to announce the EfficientQA competition and workshop at NeurIPS 2020, organized in cooperation with Princeton University and the University of Washington. The goal is to develop an end-to-end question answering system that contains all of the knowledge required to answer open-domain questions. There are no constraints on how the knowledge is stored — it could be in documents, databases, the parameters of a neural network, or any other form — but entries will be evaluated based on the number of bytes used to access this knowledge, including code, corpora, and model parameters. There will also be an unconstrained track, in which the goal is to achieve the best possible question answering performance regardless of system size. To build small, yet robust systems, participants will have to explore new methods of knowledge representation and reasoning.
An illustration of how the memory budget changes as a neural network and retrieval corpus grow and shrink. It is possible that successful systems will also use other resources such as a knowledge graph.
Competition Overview
The competition will be evaluated using the open-domain variant of the Natural Questions dataset. We will also provide further human evaluation of all the top performing entries to account for the fact that there are many correct ways to answer a question, not all of which will be covered by any set of reference answers. For example, for the question “What type of car is a Jeep considered?” both “off-road vehicles” and “crossover SUVs” are valid answers.

The competition is divided into four separate tracks: best performing system under 500 MB; best performing system under 6 GB; smallest system to get at least 25% accuracy; and the best performing system with no constraints. The winners of each of these tracks will be invited to present their work during the competition track at NeurIPS 2020, which will be hosted virtually. We will also put each of the winning systems up against human trivia experts (the 2017 NeurIPS Human-Computer competition featured Jeopardy! and Who Wants to Be a Millionaire champions) in a real-time contest at the virtual conference.
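
As a rough illustration of how the size budget and the tracks interact, the Kotlin sketch below sums the three components that count toward a system's size (code, corpora, model parameters) and lists the tracks a submission could enter. It is not the official evaluation code: the type and function names are hypothetical, and reading the limits as binary units (MiB / GiB) is an assumption of this sketch.

```kotlin
// Size limits from the track descriptions; interpreting them as binary
// units (MiB / GiB) is an assumption made for this sketch.
const val SMALL_LIMIT_BYTES = 500L * 1024 * 1024
const val LARGE_LIMIT_BYTES = 6L * 1024 * 1024 * 1024
const val MIN_ACCURACY_FOR_SMALLEST = 0.25

// Everything the system needs to answer questions counts toward its size.
data class Submission(
    val codeBytes: Long,
    val corpusBytes: Long,
    val parameterBytes: Long,
    val accuracy: Double
) {
    val totalBytes: Long get() = codeBytes + corpusBytes + parameterBytes
}

enum class Track { UNDER_500MB, UNDER_6GB, SMALLEST_AT_25_PERCENT, UNCONSTRAINED }

// Returns every track whose entry condition the submission satisfies.
fun eligibleTracks(s: Submission): Set<Track> {
    val tracks = mutableSetOf(Track.UNCONSTRAINED)
    if (s.totalBytes < SMALL_LIMIT_BYTES) tracks += Track.UNDER_500MB
    if (s.totalBytes < LARGE_LIMIT_BYTES) tracks += Track.UNDER_6GB
    if (s.accuracy >= MIN_ACCURACY_FOR_SMALLEST) tracks += Track.SMALLEST_AT_25_PERCENT
    return tracks
}

fun main() {
    val mib = 1024L * 1024
    // Invented examples: a retrieval system with a large corpus,
    // and a closed-book model that stores knowledge in its parameters.
    val retrieval = Submission(50 * mib, 4096 * mib, 400 * mib, accuracy = 0.40)
    val closedBook = Submission(20 * mib, 0, 450 * mib, accuracy = 0.30)
    println("retrieval:   ${eligibleTracks(retrieval)}")
    println("closed-book: ${eligibleTracks(closedBook)}")
}
```

The example sizes and accuracies are made up. The point is only that shrinking the corpus or the parameter count moves a system between tracks, which is the trade-off the competition asks participants to explore.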

Participation
To participate, go to the competition site where you will find the data and evaluation code available for download, as well as dates and instructions on how to participate, and a sign-up form for updates. Along with our academic collaborators, we have provided some example systems to help you get started.

We believe that the field of natural language processing will benefit from a greater exploration and comparison of small system question answering options. We hope that by encouraging the development of very small systems, this competition will pave the way for on-device question answering.

Acknowledgements
Creating this challenge and workshop has been a large team effort including Adam Roberts, Colin Raffel, Chris Alberti, Jordan Boyd-Graber, Jennimaria Palomaki, Kenton Lee, Kelvin Guu, and Michael Collins from Google; as well as Sewon Min and Hannaneh Hajishirzi from the University of Washington; and Danqi Chen from Princeton University.

Source: Google AI Blog


On-device machine learning solutions with ML Kit, now even easier to use

Posted by Christiaan Prins, Product Manager, ML Kit and Shiyu Hu, Tech Lead Manager, ML Kit

Two years ago at I/O 2018 we introduced ML Kit, making it easier for mobile developers to integrate machine learning into their apps. Today, more than 25,000 applications on Android and iOS make use of ML Kit’s features. Now, we are introducing some changes that will make it even easier to use ML Kit. In addition, we have a new feature and a set of improvements we’d like to discuss.

A new ML Kit SDK, fully focused on on-device ML

ML Kit API Overview

ML Kit's APIs are built to help you tackle common challenges in the Vision and Natural Language domains. We make it easy to recognize text, scan barcodes, track and classify objects in real-time, do translation of text, and more.

The original version of ML Kit was tightly integrated with Firebase, and we heard from many of you that you wanted more flexibility when implementing it in your apps. As a result, we are now making all the on-device APIs available in a new standalone ML Kit SDK that no longer requires a Firebase project. You can still use both ML Kit and Firebase to get the best of both products if you choose to.

With this change, ML Kit is now fully focused on on-device machine learning, giving you access to the unique benefits that on-device versus cloud ML offers:

  • It’s fast, unlocking real-time use cases - since processing happens on the device, there is no network latency. This means we can run inference on a stream of images / video, or multiple times a second on text strings.
  • Works offline - you can rely on our APIs even when the network is spotty or your app’s end-user is in an area without connectivity.
  • Privacy is retained - since all processing is performed locally, there is no need to send sensitive user data over the network to a server.

Naturally, you still get access to Google’s on-device models and processing pipelines, all accessible through easy-to-use APIs, and offered at no cost.

All ML Kit resources can now be found on our new website where we made it a lot easier to access sample apps, API reference docs and our community channels that are there to help you if you have questions.

What does this mean if I already use ML Kit today?

If you are using ML Kit for Firebase’s on-device APIs in your app today, we recommend that you migrate to the new standalone ML Kit SDK to benefit from new features and updates. For more information and step-by-step instructions to update your app, please follow our Migration guide. The cloud-based APIs, model deployment and AutoML Vision Edge remain available through Firebase Machine Learning.

Shrink your app footprint with Google Play Services

Besides asking us to make ML Kit easier to use, developers also asked whether we could ship ML Kit through Google Play Services, which results in a smaller app footprint and lets the model be reused between apps. In addition to Barcode scanning and Text recognition, we have now added Face detection / contour (model size: 20MB) to the list of APIs that support this functionality.

// Face detection / Face contour model
// Delivered via Google Play Services outside your app's APK…
implementation 'com.google.android.gms:play-services-mlkit-face-detection:16.0.0'

// …or bundled with your app's APK
implementation 'com.google.mlkit:face-detection:16.0.0'

Jetpack Lifecycle / CameraX support

Android Jetpack Lifecycle support has been added to all APIs. Developers can use addObserver to automatically manage teardown of ML Kit APIs as the app goes through screen rotation or closure by the user / system. This makes CameraX integration easier. With this release, we are also recommending that developers adopt CameraX in their apps due to the ease of integration and image quality improvements (compared to Camera1) on a wide range of devices.

// ML Kit now supports Lifecycle
val recognizer = TextRecognition.getClient()
lifecycle.addObserver(recognizer)

// ...

// Just like CameraX
val camera = cameraProvider.bindToLifecycle( /* lifecycleOwner= */this,
    cameraSelector, previewUseCase, analysisUseCase)
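
The idea behind lifecycle-driven teardown can be shown without any Android dependency. The plain-Kotlin sketch below is an analogy only: `SimpleLifecycle` and `FakeRecognizer` are hypothetical stand-ins, not Jetpack Lifecycle or ML Kit classes, and it assumes that teardown amounts to calling `close()` on each registered object.

```kotlin
import java.io.Closeable

// Hypothetical stand-in for a lifecycle owner; NOT the Jetpack Lifecycle API.
class SimpleLifecycle {
    private val observers = mutableListOf<Closeable>()

    fun addObserver(observer: Closeable) { observers += observer }

    // Called once when the owning screen is destroyed (rotation, close, ...).
    fun destroy() {
        observers.forEach { it.close() }
        observers.clear()
    }
}

// Hypothetical stand-in for a detector that holds resources until closed.
class FakeRecognizer : Closeable {
    var isClosed = false
        private set

    fun recognize(text: String): String {
        check(!isClosed) { "recognizer used after close()" }
        return text.trim()
    }

    override fun close() { isClosed = true }
}

fun main() {
    val lifecycle = SimpleLifecycle()
    val recognizer = FakeRecognizer()
    lifecycle.addObserver(recognizer)   // register once; no manual close() later

    println(recognizer.recognize("  hello  "))
    lifecycle.destroy()                 // teardown propagates to the recognizer
    println("closed = ${recognizer.isClosed}")
}
```

Registering the recognizer once means the code that creates it does not also have to remember to release it on every exit path, which is the convenience the lifecycle support described above is meant to provide.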

For an overview of all recent changes, check out the release notes for the new SDK.

Codelab of the day - ML Kit x CameraX

To help you get started with the new ML Kit and its support for CameraX, we have created this codelab to Recognize, Identify Language and Translate text. If you have any questions regarding this codelab, please raise them on Stack Overflow and tag them with [google-mlkit]. Our team will monitor this tag.

Early access program

Through our early access program, developers have an opportunity to partner with the ML Kit team and get access to upcoming features. Two new APIs are now available as part of this program:

  • Entity Extraction - Detect entities in text & make them actionable. We have support for phone numbers, addresses, payment numbers, tracking numbers, date/time and more.
  • Pose Detection - Low-latency pose detection supporting 33 skeletal points, including hands and feet tracking.

If you are interested, head over to our early access page for details.

Tomorrow - Support for custom models

ML Kit's turn-key solutions are built to help you tackle common challenges. However, if you needed a more tailored solution, one that required custom models, you typically needed to build an implementation from scratch. To help, we are now providing the option to swap out the default Google models with a custom TensorFlow Lite model. We’re starting with the Image Labeling and Object Detection and Tracking APIs, which now support custom image classification models.

Tomorrow, we will dive a bit deeper into how to find or train a TensorFlow Lite model and use it either with ML Kit, or with Android Studio’s new ML binding functionality.

13 Most Common Google Cloud Reference Architectures

Posted by Priyanka Vergadia, Developer Advocate

Google Cloud is a cloud computing platform that can be used to build and deploy applications. It allows you to take advantage of the flexibility of development while scaling the infrastructure as needed.

I'm often asked by developers to provide a list of Google Cloud architectures that help them get started on their cloud journey. Last month, I decided to start a mini-series on Twitter called “#13DaysOfGCP” where I shared the most common use cases on Google Cloud. I have compiled the list of all 13 architectures in this post. Some of the topics covered are hybrid cloud, mobile app backends, microservices, serverless, CICD and more. If you were not able to catch it, or if you missed a few days, here is the summary!

Series kickoff #13DaysOfGCP

#1: How to set up hybrid architecture in Google Cloud and on-premises

Day 1

#2: How to mask sensitive data in chatbots using Data loss prevention (DLP) API?

Day 2

#3: How to build mobile app backends on Google Cloud?

Day 3

#4: How to migrate Oracle Database to Spanner?

Day 4

#5: How to set up hybrid architecture for cloud bursting?

Day 5

#6: How to build a data lake in Google Cloud?

Day 6

#7: How to host websites on Google Cloud?

Day 7

#8: How to set up Continuous Integration and Continuous Delivery (CICD) pipeline on Google Cloud?

Day 8

#9: How to build serverless microservices in Google Cloud?

Day 9

#10: Machine Learning on Google Cloud

Day 10

#11: Serverless image, video or text processing in Google Cloud

Day 11

#12: Internet of Things (IoT) on Google Cloud

Day 12

#13: How to set up BeyondCorp zero trust security model?

Day 13

Wrap up with a puzzle

Wrap up!

We hope you enjoy this list of the most common reference architectures. Please let us know your thoughts in the comments below!

Announcing the winners of the #AndroidDevChallenge, powered by on-device machine learning

Posted by Jacob Lehrbaum, Director of Developer Relations, Android

Developers like you have always played an important role in Android innovation. Over 10 years ago, when we first launched the Android SDK, we also announced the Android Developer Challenge to reward model apps and highlight new ways of solving user problems. As Android pushes the boundaries of machine learning, 5G, foldables, and more, developers continue to help shape these new frontiers. To celebrate this work, we revived the challenge in 2019, with a focus on “Helpful Innovation,” powered by on-device machine learning.

We received hundreds of creative projects, and at the end of last year, picked 10 winners who each combined a strong idea and a thirst to bring it to life. Since then, we’ve been working with those winners to help turn their ideas into reality. And today, we’re announcing the 10 winners. Some are still at the beginning of their journey, but their apps are now ready for you to download and try out!

  • AgroDoc helps farmers diagnose plant disease and make treatment plans. [Navneet Krishna; Kochi, India]
  • AgriFarm helps farmers detect plant diseases and prevent major damage in fruits and vegetables such as tomatoes, corn and potatoes. [Balochistan, Pakistan]
  • Eskke streamlines mobile money management for people in the Congo, letting them transfer money, pay bills, buy subscriptions and essential airtime through SMS. [David Mumbere Kathoh; Goma, Democratic Republic of Congo]
  • Leepi helps students learn hand gestures and symbols for American Sign Language. [Prince Patel; Bengaluru, India]
  • MixPose is a live streaming platform that gives yoga teachers and fitness professionals the opportunity to teach, track alignment, and give feedback in real-time. [Peter Ma; San Francisco, California, USA]
  • Pathfinder could help people with visual impairments navigate complex situations by identifying and calculating the trajectories of objects moving in their path. [Colin Shelton; Addison, Texas, USA]
  • Snore & Cough helps you identify and analyze snoring and coughing, to help provide info to users seeking assistance from a medical professional. [Ethan Fan; Mountain View, California, USA]
  • Stila pairs with a wearable device, like the Fitbit wristband or a device running on Wear OS by Google, to monitor and track the body’s stress levels. By monitoring stress levels over time, you have the chance to better understand and manage stress in your life. [Yingdin Wing; Munich, Germany]
  • Trashly makes recycling easier. Just point the on-device camera at an item, and through object detection, the app identifies and classifies plastic and paper cups, bags, bottles, etc. [Elvin Rakhmankulov; Chicago, Illinois, USA]
  • UnoDogs helps owners better support their pet’s wellness, providing customized information and fitness programs. [Chinmany Mishra; New Delhi, India]

Making on-device machine learning more accessible, with ML Kit and TensorFlow Lite

Increasingly, machine learning is becoming a more accessible tool to developers with limited to no background in the technology. In fact, for most of the winners of the Android Developer Challenge, this was their first foray into machine learning. That’s thanks in part to two key offerings from Google, which bring on-device machine learning into reach for millions of developers around the world.

The first is ML Kit. ML Kit brings Google’s on-device machine learning technologies to mobile app developers, so they can build customized and interactive experiences into their apps. This includes tools such as language translation, text recognition, object detection and more. Eskke, for instance, uses offline text recognition and barcode scanning from ML Kit so users can scan the QR code at a mobile money kiosk and quickly withdraw money. And MixPose uses ML Kit's forthcoming Pose detection API to detect each user’s yoga positions and movements, so teachers can provide feedback.

The other Google resource that many of the Android Dev Challenge winners used was TensorFlow Lite. This powerful machine learning framework can help run machine learning models on Android, iOS and IoT devices that would never normally be able to support them. Its set of tools can be used for all kinds of powerful neural network-related applications, from image detection to speech recognition, bringing the latest cutting-edge technology to the devices we carry around with us wherever we go. Trashly, for instance, uses a custom TensorFlow Lite model to report if an object is recyclable and how to recycle it.

Helpful innovation, such as the 10 winning apps in the Android Developer Challenge, has the potential to change the way we access, use, and interpret information, making it available when we need it, where we need it most. By working with these developers focused on helpful innovation, we hope to inspire the next wave of developers to unlock what’s possible with this new technology.

What’s next in Android Machine Learning week?

As we kick off the second week of #11WeeksOfAndroid, focused on Machine Learning, we will highlight new tools and resources available to Android developers. Here’s a taste of the rest of this week:

  • Tuesday - ML Kit. The turnkey ML SDK went through a major overhaul with its new on-device offering this month. Check out the substantial improvement in developer usability, CameraX support and where the platform is going next.
  • Wednesday - Custom Models. When a prepackaged SDK doesn’t quite meet your needs, tools from Android Studio, TensorFlow Lite and ML Kit might just be the answer. Aside from individual offerings, we will also highlight how they can be used together.
  • Thursday - ML design. Learn some best practices for making ML product decisions from the People + AI Guidebook. We will go behind the scenes of the Read Along app, an on-device ML app that helps grow universal literacy. Bring your whole team because everyone, including UXers, engineers, and product managers, is invited!

On Tuesday and Wednesday, we will also have a “codelab of the day” so get your Android Studio 4.1 beta today, block off an hour in your schedule and take this ML journey with us!

*The apps presented here are the projects of the developers individually, and not Google.

Using Selective Attention in Reinforcement Learning Agents



Inattentional blindness is the psychological phenomenon that causes one to miss things in plain sight, and is a consequence of the selective attention that enables you to remain focused on important parts of the world without distraction from irrelevant details. It is believed that this selective attention mechanism enables people to condense broad sensory information into a form that is compact enough to be used for future decision making. While this may seem to be a limitation, such “bottlenecks” observed in nature can also inspire the design of machine learning systems that hope to mimic the success and efficiency of biological organisms. For example, most methods presented in the deep reinforcement learning (RL) literature allow an agent to access the entire visual input, and some even incorporate modules for predicting future sequences of visual inputs. Could reducing an agent’s access to its visual inputs via an attention constraint instead benefit its performance?

In our recent GECCO 2020 paper, “Neuroevolution of Self-Interpretable Agents” (AttentionAgent), we investigate the properties of such agents that employ a self-attention bottleneck. We show that not only are they able to solve challenging vision-based tasks from pixel inputs with 1000x fewer learnable parameters compared to conventional methods, they also generalize better to unseen modifications of their tasks, simply due to their ability to “not see details” that can confuse them. Furthermore, looking at where an agent focuses its attention provides visual interpretability of its decision-making process. The following diagram illustrates how the agent learned to deal with its attention bottleneck:
AttentionAgent learned to attend to task critical regions in its visual inputs. In a car driving task (CarRacing, top row), the agent mostly attends to the road borders, but shifts its focus to the turns before it changes heading directions. In a fireball dodging game (DoomTakeCover, bottom row), the agent focuses on fireballs and enemy monsters. Left: Visual inputs to the agent. Center: Agent’s attention overlaid on the visual inputs, the white patches indicate where the agent focuses its attention. Right: Visual cues based on which the agent makes decisions.
Agent with Artificial Attention
While there have been several works that explore how constraints such as sparsity may play a role in actually shaping the abilities of reinforcement learning agents, AttentionAgent takes inspiration from concepts related to inattentional blindness — when the brain is involved in effort-demanding tasks, it assigns most of its attention capacity only to task-relevant elements and is temporarily blind to other signals. To achieve this, we segment the input image into several patches and then rely on a modified self-attention architecture to simulate voting between patches to elect a subset to be considered important. The patches of interest are elected at each time step and, once determined, AttentionAgent makes decisions solely on these patches, ignoring the rest.
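The patch segmentation and voting described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`extract_patches`, `project`, `elect_patches`), the grayscale list-of-rows image format, the use of a column sum of the softmax attention matrix as the importance score, and the choice of patch centers as the returned feature are all hypothetical choices made for the example.

```python
import math


def extract_patches(image, size, stride):
    """Slide a size x size window over a 2D image and flatten each patch."""
    patches, centers = [], []
    for top in range(0, len(image) - size + 1, stride):
        for left in range(0, len(image[0]) - size + 1, stride):
            patches.append([image[top + r][left + c]
                            for r in range(size) for c in range(size)])
            centers.append((top + size // 2, left + size // 2))
    return patches, centers


def project(rows, weights):
    """Multiply each row vector by a (d_in x d_out) weight matrix."""
    return [[sum(x * w[j] for x, w in zip(row, weights))
             for j in range(len(weights[0]))] for row in rows]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def elect_patches(patches, w_key, w_query, top_k):
    """Return indices of the top_k most-voted patches and all importance scores."""
    keys = project(patches, w_key)
    queries = project(patches, w_query)
    scale = 1.0 / math.sqrt(len(patches[0]))
    # Row i holds the votes that patch i casts for every patch j.
    attention = [softmax([scale * sum(k * q for k, q in zip(key, query))
                          for query in queries]) for key in keys]
    # A patch's importance is the total vote it receives (column sum).
    importance = [sum(row[j] for row in attention) for j in range(len(patches))]
    # Sorting and slicing: the non-differentiable selection step.
    ranked = sorted(range(len(patches)), key=lambda j: importance[j], reverse=True)
    return ranked[:top_k], importance
```

A controller would then act only on the elected patches, for example on `[centers[j] for j in chosen]`, and never see the rest of the frame.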

In addition to extracting key factors from visual inputs, the ability to contextualize these factors as they change in time is just as crucial. For example, a batter in the game of baseball must use visual signals to continuously keep track of the baseball's location in order to predict its position and be able to hit it. In AttentionAgent, a long short-term memory (LSTM) model accepts information from the important patches and generates an action at each time step. LSTM keeps track of the changes in the input sequence, and can thus utilize the information to track how critical factors evolve over time.
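To make the role of the recurrent controller concrete, here is a minimal sketch of a single step of a standard LSTM cell in plain Python. The parameter layout (a dictionary keyed by the hypothetical gate names `input`, `forget`, `output` and `candidate`) is an assumption made for illustration; the source does not describe the controller's size or how its output is mapped to actions.

```python
import math


def sigmoid(x):
    """Logistic function, written via tanh so large inputs cannot overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * x))


def lstm_step(x, h, c, params):
    """Advance a standard LSTM cell by one time step.

    params maps each gate name to (weights, biases); every weight row has
    len(x) + len(h) entries. This layout is illustrative only.
    """
    z = list(x) + list(h)  # concatenate current input and previous hidden state

    def gate(name, activation):
        weights, biases = params[name]
        return [activation(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(weights, biases)]

    i = gate("input", sigmoid)        # how much new information to write
    f = gate("forget", sigmoid)       # how much old memory to keep
    o = gate("output", sigmoid)       # how much memory to expose
    g = gate("candidate", math.tanh)  # proposed new memory content
    c_next = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_next = [ok * math.tanh(ck) for ok, ck in zip(o, c_next)]
    return h_next, c_next
```

Calling `lstm_step` once per frame with the features of the elected patches as `x`, and carrying `h` and `c` forward, is what lets the controller relate the current patches to those seen earlier.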

It is conventional to optimize a neural network with backpropagation. However, because AttentionAgent contains non-differentiable operations for the generation of important patches, like sorting and slicing, it is not straightforward to apply such techniques for training. We therefore turn to derivative-free optimization algorithms to overcome this difficulty.
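As a rough illustration of what derivative-free optimization looks like, the sketch below runs a simple (1+λ) evolution strategy. This is a stand-in chosen for brevity and is not necessarily the optimizer used in the paper. The objective `toy_fitness` is hypothetical: it contains a sort-and-slice step and takes the place of the reward an agent would collect over an episode.

```python
import random


def toy_fitness(params):
    """Hypothetical objective with a sort-and-slice step (higher is better)."""
    top_two = sorted(params, reverse=True)[:2]
    return -abs(sum(top_two) - 1.0)


def evolve(fitness, initial, generations=50, offspring=16, sigma=0.3, seed=0):
    """(1+lambda) evolution strategy: mutate the parent, keep the best candidate."""
    rng = random.Random(seed)
    parent, parent_score = list(initial), fitness(initial)
    for _ in range(generations):
        # Sample offspring by adding Gaussian noise to the parent's parameters.
        children = [[p + rng.gauss(0.0, sigma) for p in parent]
                    for _ in range(offspring)]
        best_child = max(children, key=fitness)
        child_score = fitness(best_child)
        # Only fitness values are compared; no gradient is ever computed.
        if child_score >= parent_score:
            parent, parent_score = best_child, child_score
    return parent, parent_score
```

Because the parent is replaced only when a child scores at least as well, the returned score can never be worse than that of the starting parameters, regardless of how the objective is computed internally.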
Overview of our method and illustration of data processing flow in AttentionAgent. Top: Input transformation — A sliding window segments an input image into smaller patches, and then “flattens” them for future processing. Middle: Patch election — The modified self-attention module holds votes between patches to generate a patch importance vector. Bottom: Action generation — AttentionAgent picks the patches of the highest importance, extracts corresponding features and makes decisions based on them.
Generalization to Unseen Modifications of the Environment
We demonstrate that AttentionAgent learned to attend to a variety of regions in the input images. Visualizing the important patches provides a peek into how the agent is making decisions, illustrating that most selections make sense and are consistent with human intuition, which makes it a powerful tool for analyzing and debugging an agent in development. Furthermore, since the agent learned to ignore information non-critical to the core task, it can generalize to tasks where small environmental modifications are applied.

Here, we show that restricting the agent’s decision-making controller’s access to important patches only while ignoring the rest of the scene can result in better generalization, simply due to how the agent is restricted from “seeing things” that can confuse it. Our agent is trained to survive in the VizDoom TakeCover environment only, but it can also survive in unseen settings with higher walls, different floor textures, or when confronted with a distracting sign.
DoomTakeCover Generalization: The AttentionAgent is trained in the environment with no modifications (left). It is able to adapt to changes in the environment, such as a higher wall (middle, left), a different floor texture (middle, right), or floating text (right).
When one learns to drive during a sunny day, one also can transfer those skills (to some extent) to driving at night, on a rainy day, in a different car, or in the presence of bird droppings on the windshield. AttentionAgent is not only able to solve CarRacing-v0, it can also achieve similar performance in unseen conditions, such as brighter or darker scenery, or having its vision modified by artifacts such as side bars or background blobs, while requiring 1000x fewer parameters than conventional methods that fail to generalize.
CarRacing Generalization: No modification (left); color perturbation (middle, left); vertical bars on left and right (middle, right); added red blob (right).
Limitations and Future Work
While AttentionAgent is able to cope with various modifications of the environment, there are limitations to this approach, and much more work to be done to further enhance the generalization capabilities of the agent. For example, AttentionAgent does not generalize to cases where dramatic background changes are involved. The agent trained on the original car racing environment with the green grass background fails to generalize when the background is replaced with distracting YouTube videos. When we take this one step further and replace the background with pure uniform noise, we observe that the agent’s attention module breaks down and attends only to random patches of noise, rather than to the road-related patches. If we train an agent from scratch in the noisy background environment, it manages to get around the track, although the performance is mediocre. Interestingly, the agent still attends only to the noise rather than to the road; it appears to have learned to drive by estimating where the lane is based on the number of selected patches on the left and right of the screen.
AttentionAgent fails to generalize to drastically modified environments. Left: The background suddenly becomes a cat (Creative Commons video). Middle: The background suddenly becomes an arcade game (Creative Commons video). Right: AttentionAgent learned to drive on pure noise background by avoiding noise patches.
The simplistic method we use to extract information from important patches may be inadequate for more complicated tasks. How we can learn more meaningful features, and perhaps even extract symbolic information from the visual input, will be an exciting future direction. In addition to open sourcing the code to the research community, we have also released CarRacingExtension, a suite of car racing tasks that involve various environmental modifications, as testbeds and benchmarks for ML researchers who are interested in agent generalization.

Acknowledgements
This research was conducted by Yujin Tang, Duong Nguyen, and David Ha. We would like to thank Yingtao Tian, Lana Sinapayen, Shixin Luo, Krzysztof Choromanski, Sherjil Ozair, Ben Poole, Kai Arulkumaran, Eric Jang, Brian Cheung, Kory Mathewson, Ankur Handa, and Jeff Dean for valuable discussions.

Source: Google AI Blog