Posted by Hee Jung, Developer Relations Community Manager / Soonson Kwon, Developer Relations Program Manager
ML in Action is a virtual event to collect and share cool and useful machine learning (ML) use cases that leverage multiple Google ML products. This is the first run of an ML use case campaign by the ML Developer Programs team.
Let us announce the winners right now, right here. They have showcased practical uses of ML, and how ML was adapted to real life situations. We hope these projects can spark new applied ML project ideas and provide opportunities for ML community leaders to discuss ML use cases.
The four winners of "ML in Action" are:
Detecting Food Quality with Raspberry Pi and TensorFlow
By George Soloupis, ML Google Developer Expert (Greece)
This project helps people with smell impairment by identifying food degradation. The idea came suddenly when a friend revealed that he has no sense of smell due to a bike crash. Even after attending many IT meetings, we had not seen this issue addressed, and the power of machine learning seemed like something we could rely on. Hence the goal: to create a prototype that is affordable, accurate, and usable by people with minimal knowledge of computers.
The basic setup for food quality detection is as follows. A Raspberry Pi collects data from air sensors over time during the food degradation process. This single-board computer was very useful! With its GUI, it’s easy to execute Python scripts and see the results on screen. Eight sensors collected data on gases such as NH3, H2S, O3, CO, and CH4. After operating the prototype for one day, categories were set based on the results: readings from the first hours after the food left the refrigerator were labeled “good” and the rest “bad”. The dataset was then evaluated with the help of TensorFlow, and inference was done with TensorFlow Lite.
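The time-based labeling described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the `Reading` layout, the `label_readings` helper, and the six-hour `fresh_hours` cut-off are all assumptions, since the post only says that "the first hours" counted as good.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Reading:
    """One timestamped sample from the gas sensors (hypothetical layout)."""
    taken_at: datetime
    values: dict  # e.g. {"NH3": 0.4, "CH4": 0.1}; units depend on the sensor


def label_readings(readings, removed_at, fresh_hours=6.0):
    """Label each reading "good" or "bad" by time since leaving the fridge.

    fresh_hours is an assumed cut-off, not a value from the original project.
    """
    cutoff = removed_at + timedelta(hours=fresh_hours)
    return [
        (reading, "good" if reading.taken_at < cutoff else "bad")
        for reading in readings
    ]


# Made-up readings taken 0, 3, 6 and 12 hours after the food left the fridge.
start = datetime(2022, 1, 1, 8, 0)
samples = [
    Reading(start + timedelta(hours=h), {"NH3": 0.1 * h, "CH4": 0.05 * h})
    for h in (0, 3, 6, 12)
]
labels = [label for _, label in label_readings(samples, removed_at=start)]
print(labels)  # ['good', 'good', 'bad', 'bad']
```

The labeled pairs could then be fed to whatever training code is used; the cut-off is the one parameter worth tuning per food type.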
Since there were no open source prototypes out there with similar goals, it was a complete adventure. Sensors on PCBs and standalone sensors were used to get the best mixture of accuracy, stability and sensitivity. A logic level converter was used to minimize the use of resistors, and capacitors were placed for stability. The result is a compact prototype! The Raspberry Pi attaches directly to the board, which has slots for eight sensors. It is designed so that sensors can be replaced at any time, which lets users experiment with different sensors. The inference-time values are sent over Bluetooth to a mobile device. As an end result, a user with no advanced technical knowledge can see the food quality in an Android app built with Kotlin.
Reference: GitHub, more to read
* This project is supported by Google Impact Fund.
Election Watch: Applying ML in Analyzing Elections Discourse and Citizen Participation in Nigeria
By Victor Dibia, ML Google Developer Expert (USA)
This project explores the use of GCP tools in ingesting, storing and analyzing data on citizen participation and election discourse in Nigeria. It began on the premise that the proliferation of social media interactions provides an interesting lens to study human behavior, and ask important questions about election discourse in Nigeria as well as interrogate social/demographic questions.
It is based on data collected from Twitter between September 2018 and March 2019 (tweets geotagged to Nigeria and tweets containing election-related keywords). Overall, the dataset contains 25.2 million tweets and retweets, 12.6 million original tweets, 8.6 million geotagged tweets and 3.6 million tweets labeled (using an ML model) as political.
By analyzing election discourse, we can learn a few important things, including the issues that drive election discourse, how social media was utilized by candidates, and how participation was distributed across geographic regions in the country. Finally, in a country like Nigeria where up-to-date demographic data is lacking (e.g., on community structures, wealth distribution, etc.), this project shows how social media can be used as a surrogate to infer relative statistics (e.g., existence of diaspora communities based on election discussion and wealth distribution based on device type usage across the country).
Data for the project was collected using Python scripts that wrote tweets from the Twitter streaming API (matching certain criteria) to BigQuery. BigQuery queries were then used to generate aggregate datasets used for visualizations/analysis and for training machine learning models (political text classification models to label political text and multi-class classification models to label general discourse). The models were built using TensorFlow 2.0 and trained on Colab notebooks powered by GCP GPU compute VMs.
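A minimal sketch of the "matching certain criteria" step might look like the following. It is not the project's code: the keyword list, the `matches_criteria` and `to_row` helpers, and the output column names are hypothetical, and the tweet fields (`id_str`, `text`, `place.country_code`, `retweeted_status`) are assumed to follow the Twitter v1.1 payload layout. The sketch stops at building plain row dictionaries; with the `google-cloud-bigquery` client, rows like these could presumably be loaded via `Client.insert_rows_json`.

```python
import re

# Hypothetical keyword list; the post does not publish the real one.
ELECTION_KEYWORDS = ("election", "inec", "ballot", "vote")
_KEYWORD_RE = re.compile(
    r"\b(" + "|".join(ELECTION_KEYWORDS) + r")\b", re.IGNORECASE
)


def matches_criteria(tweet):
    """Keep tweets geotagged to Nigeria or containing an election keyword."""
    place = tweet.get("place") or {}
    in_nigeria = place.get("country_code") == "NG"
    has_keyword = bool(_KEYWORD_RE.search(tweet.get("text", "")))
    return in_nigeria or has_keyword


def to_row(tweet):
    """Flatten a tweet into a row dict (column names are assumptions)."""
    return {
        "id": tweet["id_str"],
        "text": tweet["text"],
        "country_code": (tweet.get("place") or {}).get("country_code"),
        "is_retweet": "retweeted_status" in tweet,
    }


# Made-up tweets standing in for the streaming API output.
stream = [
    {"id_str": "1", "text": "Good morning Lagos", "place": {"country_code": "NG"}},
    {"id_str": "2", "text": "Who will win the election?", "place": None},
    {"id_str": "3", "text": "Lunch time", "place": {"country_code": "US"}},
    {"id_str": "4", "text": "RT: INEC update", "place": None, "retweeted_status": {}},
]
rows = [to_row(t) for t in stream if matches_criteria(t)]
print([row["id"] for row in rows])  # ['1', '2', '4']
```

Keeping the filter and the row mapping as separate pure functions makes them easy to test without a live stream or a BigQuery table.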
References: Election Watch website, ML models descriptions one, two
Bioacoustic Sound Detector (To identify bird calls in soundscapes)
By Usha Rengaraju, TFUG Organizer (India)
(Bird image by Krisztian Toth on Unsplash)
The “Visionary Perspective Plan (2020-2030) for the conservation of avian diversity, their ecosystems, habitats and landscapes in the country”, proposed by the Indian government to help conserve birds and their habitats, inspired me to take up this project.
Extinction of bird species is an increasing global concern as it has a huge impact on food chains. Bioacoustic monitoring can provide a passive, low-labor, and cost-effective strategy for studying endangered bird populations. Recent advances in machine learning have made it possible to automatically identify bird songs for common species with ample training data. This innovation makes it easier for researchers and conservation practitioners to accurately survey population trends, so they can evaluate threats and adjust their conservation actions more regularly and effectively.
This project is an implementation of a bioacoustic monitor using Masked Autoencoders in TensorFlow and Cloud TPUs. The project will be presented as a browser-based application using Flask. The deep learning prototype can process continuous audio data and then acoustically recognize the species.
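To give a feel for the masking idea behind masked autoencoders, here is a small Python sketch that randomly hides a share of spectrogram patches and keeps the rest visible. It is an illustration under assumptions, not the project's implementation: the `random_patch_mask` helper, the 75% mask ratio, and the 128 x 256 spectrogram cut into 16 x 16 patches are all chosen for the example.

```python
import random


def random_patch_mask(n_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into (visible, masked) lists.

    A masked autoencoder encodes only the visible patches and is trained
    to reconstruct the masked ones. mask_ratio is an assumed value.
    """
    rng = random.Random(seed)
    indices = list(range(n_patches))
    rng.shuffle(indices)
    n_masked = int(n_patches * mask_ratio)
    masked = sorted(indices[:n_masked])
    visible = sorted(indices[n_masked:])
    return visible, masked


# A 128 x 256 spectrogram cut into 16 x 16 patches gives 8 * 16 = 128 patches.
n_patches = (128 // 16) * (256 // 16)
visible, masked = random_patch_mask(n_patches, seed=0)
print(len(visible), len(masked))  # 32 96
```

In a real model the index lists would select rows from a tensor of patch embeddings; only the bookkeeping is shown here.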
The goal of the project when I started was to build a basic prototype for monitoring of rare bird species in India. In future I would like to expand the project to monitor other endangered species as well.
References: Kaggle Notebook, Colab Notebook, GitHub, the dataset and more to read
Persona Labs' Digital Personas
By Martin Andrews and Sam Witteveen, ML Google Developer Experts (Singapore)
Over the last 3 years, Red Dragon AI (a company co-founded by Martin and Sam) has been developing real-time digital “Personas”. The key idea is to enable users to interact with life-like Personas in a format similar to a Zoom call: speaking to them and seeing them respond in real time, just as a human would. Naturally, each Persona can be tailored to the tasks required (by adjusting the appearance, voice, and ‘motivation’ of the dialog system behind the scenes and their corresponding backend APIs).
The components required to make the Personas work effectively include dynamic face models, expression generation models, Text-to-Speech (TTS), dialog backend(s) and Speech Recognition (ASR). Much of this was built on GCP, with GPU VMs running the (many) Deep Learning models and combining the outputs into dynamic WebRTC video that streams to users via a browser front-end.
Much of the previous years’ work focussed on making the Personas’ faces behave in a life-like way, while making sure that the overall latency (i.e. the time between the Persona hearing the user asking a question, to their lips starting the response) is kept low, and the rendering of individual images matches the 25 frames-per-second video rate required. As you might imagine, there were many Deep Learning modeling challenges, coupled with hard engineering issues to overcome.
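The 25 frames-per-second requirement translates into a fixed time budget per rendered image, which a short calculation makes concrete. The sketch below is illustrative only: the `frame_budget_ms` and `late_frames` helpers and the sample render times are made up for the example and are not measurements from the Personas system.

```python
def frame_budget_ms(fps):
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / fps


def late_frames(render_times_ms, fps=25):
    """Return the indices of frames whose render time exceeds the budget."""
    budget = frame_budget_ms(fps)
    return [i for i, t in enumerate(render_times_ms) if t > budget]


# Hypothetical per-frame render times in milliseconds.
times = [31.5, 38.0, 44.2, 40.0, 52.7]
print(frame_budget_ms(25))  # 40.0
print(late_frames(times))   # [2, 4]
```

At 25 fps every model that contributes to a frame has to fit, together with the others, inside that 40 ms window, which is why the rendering constraint is hard to separate from the modeling work.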
In terms of backend technologies, Google Cloud GPUs were used to train the Deep Learning models (built using TensorFlow/TFLite, PyTorch/ONNX and, more recently, JAX/Flax), and the real-time serving is done by Nvidia T4 GPU-enabled VMs, launched as required. Google ASR is currently used as a streaming backend for speech recognition, and Google’s WaveNet TTS is used when multilingual TTS is needed. The system also makes use of Google’s serverless stack, with Cloud Run and Cloud Functions being used in some of the dialog backends.
Visit the Persona Labs website (linked below) and you can see videos that demonstrate several aspects: what the Personas look like, their multilingual capability, potential applications, etc. However, the videos can’t really demonstrate what the interactivity ‘feels like’. For that, it’s best to get a live demo from Sam and Martin - and see what real-time Deep Learning model generation looks like!
Reference: The Persona Labs website