Category Archives: Google Developers Blog

News and insights on Google platforms, tools and events

Skip the setup — Run code directly from Google Cloud’s documentation

Posted by Abby Carey, Developer Advocate

Long gone are the days of looking for documentation, finding a how-to guide, and questioning whether the commands and code samples actually work.

Google Cloud recently added a Cloud Shell integration within each and every documentation page.

This new functionality lets you test code in a preprovisioned virtual machine instance while learning about Google Cloud services. Running commands and code directly from the documentation cuts down on the context switching between a tutorial and a separate terminal window.

This gif shows how Google Cloud’s documentation uses Cloud Shell, letting you run commands in a quickstart within your Cloud Shell environment.

If you’re new to developing on Google Cloud, this creates a low barrier to entry for trying Google Cloud services and APIs. After verifying billing on your Google Cloud account, you can test services that have a free tier, like Pub/Sub and Cloud Vision, at no charge.

  1. Open a Google Cloud documentation page (like this Pub/Sub quickstart).
  2. Sign into your Google account.
  3. In the top navigation, click Activate Cloud Shell.
  4. Select your project or create one if you don’t already have one. You can select a project by running the gcloud config set project command or by using this drop-down menu:
    image showing how to select a project
  5. Copy, paste, and run your commands (for example, something like the snippet sketched below).
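As a concrete example, the Pub/Sub quickstart has you publish a message from Cloud Shell. The snippet below is only a rough sketch of that kind of step, assuming the google-cloud-pubsub client library is installed and a topic named my-topic already exists in your selected project; the quickstart itself is the authoritative source.

```python
# Publish a test message to an existing Pub/Sub topic from Cloud Shell.
# Assumes: `pip install google-cloud-pubsub` and a topic named "my-topic"
# created in the project selected with `gcloud config set project`.
from google.cloud import pubsub_v1

project_id = "your-project-id"   # replace with your project ID
topic_id = "my-topic"            # replace with your topic name

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(project_id, topic_id)

# Message data must be a bytestring.
future = publisher.publish(topic_path, b"Hello from Cloud Shell!")
print(f"Published message ID: {future.result()}")
```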

If you want to test something a bit more adventurous, try to deploy a containerized web application, or get started with BigQuery.

A bit about Cloud Shell

If you’ve been developing on Google Cloud, chances are you’ve already interacted with Cloud Shell in the Cloud Console. Cloud Shell is a ready-to-go, online development and operations environment. It comes preinstalled with common command-line tools, programming languages, and the Cloud SDK.

Just like in the Cloud Console, your Cloud Shell terminal stays open as you navigate the site. As you work through tutorials within Google Cloud’s documentation, the Cloud Shell terminal stays on your screen. This helps when progressing between two connected tutorials, like the Pub/Sub quickstart and setting up a Pub/Sub Proxy.

Having a preprovisioned environment set up by Google eliminates the age-old question of “Is my machine the problem?” when you eventually try to run these commands locally.

What about code samples?

While Cloud Shell is useful for managing your Google Cloud resources, it also lets you test code samples. If you’re using Cloud Client Libraries, you can customize and run sample code in Cloud Shell’s built-in code editor: Cloud Shell Editor.

Cloud Shell Editor is Cloud Shell’s built-in, browser-based code editor, powered by the Eclipse Theia IDE platform. To open it, click the Open Editor button from your Cloud Shell terminal:

Image showing how to open Cloud Shell Editor

Cloud Shell Editor has rich language support and debuggers for Go, Java, .NET, Python, Node.js, and more, plus integrated source control, local emulators for Kubernetes, and other features. With the Cloud Shell Editor open, you can then walk through a client library tutorial like Cloud Vision’s Detect labels guide, running terminal commands and code from one browser tab.
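For a flavour of what that kind of tutorial involves, here is a rough sketch of label detection with the google-cloud-vision client library; the image file name is just a placeholder, and the Detect labels guide remains the authoritative reference.

```python
# Detect labels in a local image with the Cloud Vision client library.
# Assumes: `pip install google-cloud-vision` and that the Vision API is
# enabled in your project (the Detect labels guide walks through this).
from google.cloud import vision

client = vision.ImageAnnotatorClient()

with open("sample.jpg", "rb") as image_file:   # placeholder file name
    image = vision.Image(content=image_file.read())

response = client.label_detection(image=image)
for label in response.label_annotations:
    print(f"{label.description}: {label.score:.2f}")
```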

Open up a Google Cloud quickstart and give it a try! This could be a game-changer for your learning experience.

Mentoring future women Experts

Posted by Justyna Politanska-Pyszko

Google Developers Experts is a global community of developers, engineers and thought leaders who passionately share their technical knowledge with others.

Becoming a Google Developers Expert is no easy task. First, you need to have strong skills in one of the technical areas - Android, Kotlin, Google Cloud, Machine Learning, Web Technologies, Angular, Firebase, Google Workspace, Flutter or another. You also need a track record of sharing your knowledge - be it via conference talks, your personal blog, YouTube videos or some other form. Finally, you need one more thing: the courage to approach an existing Expert or a Google employee and ask them to support your application.

It’s not easy, but it’s worth it. Joining the Experts community comes with many opportunities: direct access to product teams at Google, invitations to events and projects, and entry into a network of technology enthusiasts from around the world.

On a quest to make these opportunities available to a diverse group of talented people globally, we launched “Road to GDE”: a mentoring program to support women in their journey to become Google Developers Experts.

Mentors and Mentees meeting online

For three months, 17 mentors from the Experts community mentored mentees on topics like public speaking, building a professional portfolio, and boosting confidence. What did the mentees learn during the program?

Glafira Zhur: No time for fear! With my Mentor’s help, I got invited to speak at several events, two of which are already scheduled for the summer. I created my speaker portfolio and made new friends in the community. It was a great experience.

Julia Miocene: I learned that I shouldn't be afraid to do what someone else has already done. Even if there are talks or articles on some topic already, I will do them differently anyway. And for people, it’s important to see things from different perspectives. Just do what you like and don’t be afraid.

Bhavna Thacker: I got motivated to continue my community contributions, and learnt how to promote my work and reach more developers, so that they can benefit from my efforts. Overall, it was an excellent program. Thanks to all organisers and my mentor - Garima Jain. I am definitely looking forward to applying to the Experts program soon!

Road to GDE mentee - Glafira Zhur and her mentor - Natalia Venditto.

Congratulations to all 17 mentees who completed the Program: Maris Botero, Clarissa Loures, Layale Matta, Bhavika Panara, Stefanie Urchs, Alisa Tsvetkova, Glafira Zhur, Wafa Waheeda Syed, Helen Kapatsa, Karin-Aleksandra Monoid, Sveta Krivosheeva, Ines Akrap, Julia Miocene, Vandana Srivastava, Anna Zharkova, Bhavana Thacker, Debasmita Sarkar

And to their mentors - all members of the Google Developers Experts community: Lesly Zerna, Bianca Ximenes, Kristina Simakova, Sayak Paul, Karthik Muthuswamy, Jeroen Meijer, Natalia Venditto, Martina Kraus, Merve Noyan, Annyce Davis, Majid Hajian, James Milner, Debbie O'Brien, Niharika Arora, Nicola Corti, Garima Jain, Kamal Shree Soundirapandian

To learn more about the Experts program, follow us on Twitter, LinkedIn or Medium.

Drone control via gestures using MediaPipe Hands

A guest post by Neurons Lab

Please note that the information, uses, and applications expressed in the below post are solely those of our guest author, Neurons Lab, and not necessarily those of Google.

How the idea emerged

With the advancement of technology, drones have become not only smaller but also more computationally capable. There are many examples of iPhone-sized quadcopters in the consumer drone market with the computing power to do live tracking while recording 4K video. However, the most important element has not changed much: the controller. It is still bulky and not intuitive for beginners to use. A smartphone with on-screen controls is an option; however, the control principle is still the same.

That is how the idea for this project emerged: a more personalised approach to controlling the drone using gestures. ML Engineer Nikita Kiselov (me), in consultation with my colleagues at Neurons Lab, undertook this project.

Figure 1: [GIF] Demonstration of drone flight control via gestures using MediaPipe Hands

Why use gesture recognition?

Gestures are the most natural way for people to express information non-verbally. Gesture control is an entire topic in computer science that aims to interpret human gestures using algorithms, so users can control devices or interact with them without physically touching them. Nowadays, this type of control can be found everywhere from smart TVs to surgical robots, and UAVs are no exception.

Although gesture control for drones has not been widely explored lately, the approach has some advantages:

  • No additional equipment needed.
  • More human-friendly controls.
  • All you need is a camera that is already on all drones.

With all these features, such a control method has many applications.

Flying action camera. In extreme sports, drones are a trendy video recording tool. However, they tend to have a very cumbersome control panel. The ability to use basic gestures to control the drone (while in action) without reaching for the remote control would make it easier to use the drone as a selfie camera. And the ability to customise gestures would completely cover all the necessary actions.

As an alternative, this type of control would be helpful in an industrial environment such as a construction site, where there may be several drone operators (a gesture can be used as a stop signal if the primary source of control is lost).

Emergency and rescue services could use this system for mini-drones indoors or in hard-to-reach places where one of the operator’s hands is busy. Together with an obstacle avoidance system, this would make the drone fully autonomous, but still manageable when needed without additional equipment.

Another area of application is FPV (first-person view) drones. Here, the camera on the headset could be used instead of the one on the drone to recognise gestures. Because hand movement can be impressively precise, this type of control, together with hand position in space, can simplify FPV drone control for new users.

However, all these applications need a reliable and fast (really fast) recognition system. Existing gesture recognition systems can be broadly divided into two categories: those that use special physical devices, such as smart gloves or other on-body sensors, and those based on visual recognition using various types of cameras. Most of those solutions need additional hardware or rely on classical computer vision techniques. The latter can be fast, but adding custom gestures, let alone motion gestures, is pretty hard. The answer we found, and used for this project, is MediaPipe Hands.

Overall project structure

To create a proof of concept for the stated idea, a Ryze Tello quadcopter was used as the UAV. This drone has an open Python SDK, which greatly simplified the development of the program. However, it also has technical limitations that do not allow it to run gesture recognition on the drone itself (yet), so a regular PC or Mac was used for that purpose. The video stream from the drone and the commands to the drone are transmitted over regular Wi-Fi, so no additional equipment was needed.

To keep the program structure as plain as possible and make it easy to add gestures, the architecture is modular, with a control module and a gesture recognition module.

Figure 2: Scheme that shows overall project structure and how videostream data from the drone is processed

The application is divided into two main parts: gesture recognition and the drone controller. These are independent components that can be easily modified, for example, to add new gestures or change the movement speed of the drone.

The video stream is passed to the main program, which is a simple script with module initialisation, connections, and the while-true cycle typical for this kind of hardware. Each frame of the video stream is passed to the gesture recognition module. After the ID of the recognised gesture is obtained, it is passed to the control module, which sends the corresponding command to the UAV. Alternatively, the user can control the drone from the keyboard in a more classical manner.

So the gesture recognition module is divided into keypoint detection and a gesture classifier. It is exactly this combination of the MediaPipe keypoint detector with a custom gesture classification model that distinguishes this gesture recognition system from most others.

Gesture recognition with MediaPipe

Utilizing MediaPipe Hands is a winning strategy not only in terms of speed, but also in flexibility. MediaPipe already has a simple gesture recognition calculator that can be inserted into the pipeline. However, we needed a more powerful solution with the ability to quickly change the structure and behaviour of the recognizer. To do so and classify gestures, a custom neural network was created with four fully connected layers and a softmax layer for classification.

Figure 3: Scheme that shows the structure of classification neural network

This simple structure takes a vector of 2D coordinates as input and outputs the ID of the classified gesture.
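The post does not specify the framework or the layer widths, so the Keras sketch below uses assumed sizes purely to illustrate the shape of such a classifier: a flattened vector of 21 x, y keypoints in, gesture class probabilities out.

```python
# Sketch of a small gesture classifier: the post specifies four fully
# connected layers plus a softmax classification layer, but the framework,
# layer widths, and class count below are assumptions for illustration.
import tensorflow as tf

NUM_KEYPOINTS = 21          # MediaPipe Hands landmarks
NUM_CLASSES = 8             # e.g. 3 basic + 5 extra gestures (assumed)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_KEYPOINTS * 2,)),   # flattened x, y coords
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),  # gesture ID probabilities
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```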

Instead of using cumbersome segmentation models with a more algorithmic recognition process, a simple neural network can easily handle such a task. Recognising gestures from keypoints, which form a simple vector of 21 points' coordinates, takes much less data and time. More critically, new gestures can be added easily, because retraining the model takes much less time than reworking an algorithmic approach.

To train the classification model, a dataset of normalised keypoint coordinates paired with gesture IDs was used. Numerically, the dataset consisted of:

  • 3 gestures with 300+ examples (basic gestures)
  • 5 gestures with 40–150 examples

Each sample is a vector of x, y coordinates, and the collected data includes small hand tilts and different hand shapes captured during data collection.
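The exact preprocessing code is not shown in the post; the sketch below illustrates one plausible way to produce such normalised keypoint vectors from MediaPipe landmarks (centring on the wrist and scaling by the hand's extent are assumptions).

```python
# Sketch: turn 21 MediaPipe hand landmarks into a normalised 42-value vector.
# The post only says normalised x, y coordinates were used; the exact scheme
# below (centre on the wrist, scale by the hand's extent) is an assumption.
import numpy as np

def normalise_landmarks(landmarks):
    """landmarks: iterable of 21 objects with .x and .y in [0, 1] image coords."""
    pts = np.array([[lm.x, lm.y] for lm in landmarks], dtype=np.float32)
    pts -= pts[0]                               # translate so the wrist is the origin
    scale = float(np.abs(pts).max()) or 1.0     # scale by the hand's extent
    pts /= scale
    return pts.flatten()                        # shape (42,): x0, y0, x1, y1, ...
```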

Figure 4: Confusion matrix and classification report for classification

We can see from the classification report that the model is almost error-free on the test dataset (30% of all data), with precision above 97% for every class. Due to the simple structure of the model, excellent accuracy can be obtained with a small number of training examples per class. After several experiments, it turned out that fewer than 100 new examples are enough for good recognition of a new gesture. Even more importantly, we don’t need to retrain the model for each motion under different illumination, because MediaPipe takes over all the detection work.

Figure 5: [GIF] Test that demonstrates how fast classification network can distinguish newly trained gestures using the information from MediaPipe hand detector

From gestures to movements

To control the drone, each gesture should represent a command for it. The most excellent part about the Tello is that it has a ready-made Python API that lets us do that without explicitly controlling the motor hardware. We just need to map each gesture ID to a command.
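The post does not list the actual pairs, but with a community wrapper such as djitellopy the mapping could look roughly like the sketch below; the gesture IDs, names, and speed values are illustrative assumptions, not the project's real table.

```python
# Sketch: map recognised gesture IDs to Tello velocity commands.
# Uses the djitellopy wrapper around the Tello SDK; the gesture IDs, names,
# and speed values below are assumptions for illustration. Requires being
# connected to the Tello's Wi-Fi network.
from djitellopy import Tello

SPEED = 30  # cm/s along a single axis (assumed)

# gesture ID -> (left/right, forward/back, up/down, yaw) velocities
GESTURE_TO_RC = {
    0: (0, 0, 0, 0),        # "no gesture": hover
    1: (0, SPEED, 0, 0),    # "forward": fly forward
    2: (0, 0, SPEED, 0),    # "up": gain altitude
    3: (0, 0, 0, SPEED),    # "rotate": yaw clockwise
}

tello = Tello()
tello.connect()
tello.streamon()
tello.takeoff()

def apply_gesture(gesture_id: int) -> None:
    """Translate a recognised gesture ID into a speed on one of the axes."""
    lr, fb, ud, yaw = GESTURE_TO_RC.get(gesture_id, (0, 0, 0, 0))
    tello.send_rc_control(lr, fb, ud, yaw)
```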

Figure 6: Command-gesture pairs representation

Each gesture sets the speed along one of the axes; that’s why the drone’s movement is smooth, without jitter. To filter out unwanted movements caused by false detections, even with such a precise model, a special buffer was created that stores the last N recognised gestures. This helps to remove glitches and inconsistent recognition.
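A buffer like that can be as simple as keeping the last N predictions and only acting once one gesture dominates; the sketch below assumes N = 10 and a 70% threshold, both illustrative choices rather than the project's actual values.

```python
# Sketch of the smoothing buffer described above: keep the last N predicted
# gesture IDs and only report a gesture once it dominates the buffer.
from collections import Counter, deque

class GestureBuffer:
    def __init__(self, size: int = 10):          # N = 10 is an assumption
        self.history = deque(maxlen=size)

    def update(self, gesture_id: int):
        """Add the latest prediction; return a gesture ID only when it is stable."""
        self.history.append(gesture_id)
        gesture, count = Counter(self.history).most_common(1)[0]
        if count >= int(0.7 * self.history.maxlen):   # stability threshold (assumed)
            return gesture
        return None                                   # not confident yet -> ignore
```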

The fundamental goal of this project is to demonstrate the superiority of the keypoint-based gesture recognition approach compared to classical methods. To demonstrate the full potential of this recognition model and its flexibility, there is an ability to create the dataset on the fly … during the drone’s flight! You can create your own combinations of gestures or rewrite an existing one without collecting massive datasets or manually designing a recognition algorithm. By pressing a button and an ID key, the vector of detected points is instantly saved to the overall dataset. This new dataset can be used to retrain the classification network and add new gestures for detection. For now, there is a notebook that can be run on Google Colab or locally. Retraining the classifier network takes about 1-2 minutes on a standard CPU instance. The new binary file of the model can then be used instead of the old one. It is as simple as that. In the future, there is a plan to do retraining right on a mobile device or even on the drone itself.
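A rough sketch of that on-the-fly collection idea is shown below; the key bindings and CSV layout are assumptions, not the project's actual implementation.

```python
# Sketch: append the current normalised keypoint vector to a CSV dataset
# when the user presses a digit key, so new gestures can be collected in
# flight. The key bindings and CSV layout are assumptions for illustration.
import csv

def save_example(keypoints, gesture_id, path="gesture_dataset.csv"):
    """keypoints: flat list of 42 normalised x, y values; gesture_id: int label."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([gesture_id, *keypoints])

# Inside the main loop, something like:
#   key = cv2.waitKey(1) & 0xFF
#   if ord("0") <= key <= ord("9"):
#       save_example(current_keypoints, key - ord("0"))
```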

Figure 7: Notebook for model retraining in action

Summary 

This project was created to push forward the area of gesture-controlled drones. The novelty of the approach lies in the ability to add new gestures or change old ones quickly. This is made possible thanks to MediaPipe Hands. It works incredibly fast and reliably, and is ready out of the box, making gesture recognition very fast and flexible to change. Our Neurons Lab team is excited about the demonstrated results and is going to try other incredible solutions that MediaPipe provides.

We will also keep track of MediaPipe updates, especially those that add more flexibility for creating custom calculators for our own models and reduce the barriers to entry when creating them. Since our classifier model currently lives outside the graph, such improvements would make it possible to quickly turn a custom calculator containing our model into reality.

Another highly anticipated feature is Flutter support (especially for iOS). In the original plans, the inference and visualisation were supposed to run on a smartphone with NPU/GPU utilisation, but at the moment the support quality does not satisfy our requirements. Flutter is a very powerful tool for rapid prototyping and concept checking. It allows us to try out and test an idea cross-platform without involving a dedicated mobile developer, so such support is in high demand.

Nevertheless, development of this demo project continues with the available functionality, and there are already several plans for the future, such as using MediaPipe Holistic for face recognition and subsequent authorisation. The drone would then be able to authorise the operator and give permission for gesture control. This also opens the way to personalisation: since the classifier network is straightforward, each user will be able to customise gestures for themselves (simply by using another version of the classifier model), and depending on the authorised user, one or another saved model will be applied. We also plan to add use of the Z-axis, for example, tilting the palm of your hand to control the speed of movement or the height more precisely. We encourage developers to innovate responsibly in this area, and to consider responsible AI practices such as testing for unfair biases and designing with safety and privacy in mind.

We strongly believe that this project will motivate even small teams to do projects in the field of ML computer vision for UAVs, and that MediaPipe will help them cope with the limitations and difficulties on their way (such as scalability, cross-platform support and GPU inference).


If you want to contribute, have ideas or comments about this project, please reach out to [email protected], or visit the GitHub page of the project.

This blog post is curated by Igor Kibalchich, ML Research Product Manager at Google AI.

Celebrating some of the best indie games

Posted by Patricia Correa, Director, Global Developer Marketing

Indie Games Accelerator - Meet the Class of 2021, Indie Games Festival - Meet the winners

In June this year we opened applications for our Indie Games Accelerator, a mentorship program to help top mobile game startups achieve their full potential, as well as for our Indie Games Festival, a competition open to small game studios who get the opportunity to win promotions and be featured on Google Play. These annual programs are part of our commitment to helping all developers thrive in the Google ecosystem.

We received thousands of applications from developers across the world and we were truly amazed by the response. We’re impressed by the innovation and passion of the indie game community, and the unique and creative games they bring to players worldwide.

Last month we announced the Festival finalists and today we hosted the finals.

This year, for the first time, the events were virtual so everyone could attend. Players from around the world joined the adventure, met the finalists, played their games, and cheered on the Top 10 and the winners as they were announced on stage.

We also took the opportunity to announce the Indie Games Accelerator selected class of 2021.

screenshot of Europe stage

Our deepest thanks to our amazing hosts: YouTube creator Papfi, Japanese comedians Kajisak and Kikuchiusotsukanai, and Inho Chung, who all shared their unique expertise and love of games.

Without further ado, here are this year's Festival winners…

Indie Games Festival Winners

Europe

Indie Games Festival Winners | Europe

Bird Alone by George Batchelor, United Kingdom

Cats in Time by Pine Studio, Croatia

Gumslinger by Itatake, Sweden

Korea

Indie Games Festival Winners | South Korea

CATS & SOUP by HIDEA

Rush Hour Rally by Soen Games

The Way Home by CONCODE


Users' Choice

Animal Doll Shop by Funnyeve

Japan

Indie Games Festival Winners | Japan

Mousebusters by Odencat

Quantum Transport by ruccho

Survivor's guilt by aso


Student Category Award

Japanese Train Drive Simulator 2 "OneMan2" by HAKOT


Check out the top 10 finalists in Europe, South Korea and Japan.

Indie Games Accelerator Class of 2021

The selected studios will receive exclusive education and mentorship over the 12 week program, to help them build and grow successful businesses.

Americas 


Aoca Game Lab, Brazil

Berimbau Game Studio, Brazil

Boomware Studio, Peru

Concrete Software, USA

Delotech Games, Brazil

DreamCraft Entertainment, Inc., USA

Ingames, Argentina

Ludare Games Group Inc., Canada

Whitethorn Games, USA

Asia Pacific


Banjiha Games, South Korea

CATS BY STUDIO, South Korea

dc1ab pte. Ltd., Singapore

Dreams & Co., Thailand

Gamestacy Entertainment, India

izzle Inc., South Korea

Limin Development and Investment Joint Stock Company, Vietnam 

Mugshot Games Pty Ltd, Australia

Odencat Inc., Japan

Playbae, India

Xigma Games, India

XOGAMES Inc., South Korea

YOMI Studio, Vietnam

Europe, Middle East & Africa


Cleverside Ltd, Belarus

Dali Games, Poland

Firegecko Ltd, United Kingdom

Hot Siberians, Russia

Infinity Games, Portugal

Itatake, Sweden

Jimjum Studios, Israel

LIVA Interactive, Tunisia 

Pale Blue Interactive, South Africa

Pine Studio, Croatia

Platonic Games, Spain

SMOKOKO LTD, Bulgaria

Spooky House Studios, Germany


If you missed the finals

If you missed the finals or would like to explore further, you can still sign in and wander around the space, but only for a limited time. Explore now.

An easier way to move your App Engine apps to Cloud Run

Posted by Wesley Chun (@wescpy), Developer Advocate, Google Cloud

An easier yet still optional migration

In the previous episode of the Serverless Migration Station video series, developers learned how to containerize their App Engine code for Cloud Run using Docker. While Docker has gained popularity over the past decade, not everyone has containers integrated into their daily development workflow, and some prefer "containerless" solutions while knowing that containers can be beneficial. Well, today's video is just for you, showing how you can still get your apps onto Cloud Run even if you don't have much experience with Docker, containers, or Dockerfiles.

App Engine isn't going away, as Google has expressed long-term support for legacy runtimes on the platform, so those who prefer source-based deployments can stay where they are; this is an optional migration. Moving to Cloud Run is for those who want to explicitly move to containerization.

Migrating to Cloud Run with Cloud Buildpacks video

So how can apps be containerized without Docker? The answer is buildpacks, an open-source technology that makes it fast and easy for you to create secure, production-ready container images from source code, without a Dockerfile. Google Cloud Buildpacks adheres to the buildpacks open specification and allows users to create images that run on all GCP container platforms: Cloud Run (fully-managed), Anthos, and Google Kubernetes Engine (GKE). If you want to containerize your apps while staying focused on building your solutions and not how to create or maintain Dockerfiles, Cloud Buildpacks is for you.

In the last video, we showed developers how to containerize a Python 2 Cloud NDB app as well as a Python 3 Cloud Datastore app. We targeted those specific implementations because Python 2 users are more likely to be using App Engine's ndb or Cloud NDB to connect with their app's Datastore while Python 3 developers are most likely using Cloud Datastore. Cloud Buildpacks do not support Python 2, so today we're targeting a slightly different audience: Python 2 developers who have migrated from App Engine ndb to Cloud NDB and who have ported their apps to modern Python 3 but now want to containerize them for Cloud Run.

Developers familiar with App Engine know that an HTTP server is provided and started automatically by default; however, if special launch instructions are needed, users can add an entrypoint directive in their app.yaml files, as illustrated below. When those App Engine apps are containerized for Cloud Run, developers must bundle their own server and provide startup instructions, which is the purpose of the ENTRYPOINT directive in the Dockerfile, also shown below.

Starting your web server with App Engine (app.yaml) and Cloud Run with Docker (Dockerfile) or Buildpacks (Procfile)

In this migration, there is no Dockerfile. While Cloud Buildpacks does the heavy lifting of determining how to package your app into a container, it still needs to be told how to start your service. This is exactly what a Procfile is for, represented by the last file in the image above. As specified, your web server will be launched in the same way as in app.yaml and the Dockerfile above; these config files are deliberately juxtaposed to expose their similarities.

Other than this swapping of configuration files and the expected lack of a .dockerignore file, the Python 3 Cloud NDB app containerized for Cloud Run is nearly identical to the Python 3 Cloud NDB App Engine app we started with. Cloud Run's build-and-deploy command (gcloud run deploy) will use a Dockerfile if present but otherwise selects Cloud Buildpacks to build and deploy the container image. The user experience is the same, only without the time and challenges required to maintain and debug a Dockerfile.

Get started now

If you're considering containerizing your App Engine apps without having to know much about containers or Docker, we recommend you try this migration on a sample app like ours before considering it for yours. A corresponding codelab leading you step-by-step through this exercise is provided in addition to the video which you can use for guidance.

All migration modules, their videos (when available), codelab tutorials, and source code, can be found in the migration repo. While our content initially focuses on Python users, we hope to one day also cover other legacy runtimes so stay tuned. Containerization may seem foreboding, but the goal is for Cloud Buildpacks and migration resources like this to aid you in your quest to modernize your serverless apps!

Introducing the Google For Startups Accelerator: Women Founders class of 2021

Posted by Ashley Francisco, Head of Startup Developer Ecosystem, Canada

Google for Startups Women Founders logo

Earlier this summer we shared details about how the Google for Startups Accelerator program is expanding its support for founders from underrepresented groups. In addition to our Black Founders accelerator program, the expansion included a second year of programming specifically designed for women-led startups in North America.

We launched the inaugural Google for Startups Accelerator: Women Founders program in 2020 to address gender disparity in the startup ecosystem and provide high-quality mentorship opportunities and support for women founders. Studies showed that only 16% of small and medium-sized businesses were owned by women, and that women often lack access to venture capital funding and accelerator programs to help launch and scale their businesses.

This year, we have designed another great program for our women founders, and today we are thrilled to announce the 12 women-led startups joining our class of 2021.

Without further ado, meet the Google For Startups Accelerator: Women Founders class of 2021!

  • Aquacycl (Escondido, California): Aquacycl makes energy-neutral wastewater treatment a reality, offering modular on-site systems to treat high-strength organic waste streams. The BioElectrochemical Treatment Technology (BETT) is the first commercially viable microbial fuel cell, which generates direct electricity from wastewater, treating untreatable streams and reducing wastewater management costs by 20-60%.
  • Braze Mobility (Toronto, Ontario): Braze Mobility provides affordable navigation solutions for wheelchair users. It developed the world's first blind spot sensor system that can be attached to any wheelchair, transforming it into a smart wheelchair that automatically detects obstacles and provides alerts to the user through intuitive lights, sounds, and vibrations.
  • Claira (Grand Rapids, Michigan): Claira is a competency analytics engine that helps organizations understand their people and hire better.
  • ImagoAI (Saint Paul, Minnesota): ImagoAI’s proprietary AI solution does real-time food safety and quality testing at food manufacturing facilities and on farms. Its solutions help companies reduce production-line hold times by more than 90%, deliver consistently high-quality products, and reduce waste through early inspection.
  • Journey Foods (Austin, Texas): Journey Foods solves food science and supply chain inefficiencies with software in order to help companies feed 8 billion people better. They build enterprise technology that improves product management, ingredient intelligence, and manufacturing insights for CPG companies, grocery stores, suppliers, and manufacturers.
  • Nyquist Data (Palo Alto, California): Nyquist Data helps companies around the world access critical data and insights, which maximizes efficiency, resources, and results for innovation.
  • Paperstack (Toronto, Ontario): Paperstack is an all-in-one platform that helps self-employed individuals with incorporation, bookkeeping, and taxes.
  • Pocketnest (Ann Arbor, Michigan): Pocketnest is a comprehensive financial planning tool targeting genX and millennials. The company licenses its software to financial institutions and employers, helping them connect with a younger audience and grow their business. Based on psychology, behavioral science and coaching, it leads users through all ten themes of personal finances, resulting in actionable items and recommendations customized for each user.
  • SAFETYDOCS Global (Vancouver, British Columbia): SAFETYDOCS Global is a document management solutions platform that streamlines and automates permitting and licensing documentation workflows.
  • Schoolytics (Washington, DC): Schoolytics, the Student Data Platform, enables schools to connect disparate student datasets, including the student information system (SIS), learning management systems (LMS), and assessments, to transform data into meaningful insights and action. Its web-based tool supports data-driven decision making through real-time analytics and reporting.
  • Tengiva (Montreal, Quebec): Tengiva is the first digital platform enabling real-time supply chain in the textile industry by optimizing the typical months-long procurement process into a single-click operation.
  • ThisWay (Austin, Texas): ThisWay matches all people to all jobs, fairly and without bias. The web platform accurately delivers qualified talent, while increasing diversity and inclusion so ROI is optimized.

Starting on September 27, the 10-week intensive virtual program will bring the best of Google's programs, products, people and technology to help these businesses reach their goals. Participating startups receive deep mentorship on technical challenges and machine learning, as well as connections to relevant teams across Google. The startups will also receive nontechnical programming to help address some of the unique barriers faced by women founders in the startup ecosystem.

We are excited to welcome these 12 women-led businesses to our Google for Startups Accelerator community, and look forward to working with them this fall!

The Google Cloud Startup Summit is coming on September 9, 2021

Posted by Chris Curtis, Startup Marketing Manager at Google Cloud

Startup Summit logo

We’re excited to announce our first-ever Google Cloud Startup Summit will be taking place on September 9, 2021.

We hope you will join us as we bring together our startup community, including startup founders, CTOs, VCs, and Google experts, to provide behind-the-scenes insights and inspiring stories of innovation. To kick off the event, we’ll be bringing in X’s Captain of Moonshots, Astro Teller, for a keynote focused on innovation. We’ll also have exciting technical and business sessions with Google leaders, industry experts, venture investors, and startup leaders. You can see the full agenda here to get more details on the sessions.

We can’t wait to see you at the Google Cloud Startup Summit at 10am PT on September 9! Register to secure your spot today.

Google Developer Group Spotlight: A conversation with GDG Juba Lead, Kose

Posted by Aniedi Udo-Obong, Sub-Saharan Africa Regional Lead, Google Developer Groups

Header image featuring Kose with text that says meet Kose

The Google Developer Groups Spotlight series interviews inspiring leaders of community meetup groups around the world. Our goal is to learn more about what developers are working on, how they’ve grown their skills with the Google Developer Group community, and what tips they might have for us all.

We recently spoke with Kose, community lead for Google Developer Groups Juba in South Sudan. Check out our conversation with Kose about building a GDG chapter, the future of GDG Juba, and the importance of growing the tech community in South Sudan.

Tell us a little about yourself?

I’m a village-grown software developer and the community lead of GDG Juba. I work with the JavaScript stack, with a focus on the backend. Learning through community has always been part of my journey, even before joining GDG Juba. I love tech volunteerism and building a community around me and beyond. I attended many local developer meetups and learned a lot, which led to my involvement with GDG Juba.

I am currently helping grow the GDG Juba community in South Sudan, and previously volunteered as a mentor in the Google Africa Developer Scholarship 2020.

Why did you want to get involved in tech?

I hail from a remote village in South Sudan with little to zero access to technology. My interest in tech has largely been driven by an enthusiasm for building things and for solving farming, agricultural economics, and social issues using technology.

I am currently researching and working on a farmers connection network to help transform our agricultural economics.

What is unique about pursuing a career as a developer in South Sudan?

When you talk about technology in South Sudan, we are relatively behind compared to our neighbors and beyond. Some challenges include the lack of support, resources, and mentorship among the few technology aspirants. Electricity and internet bills are so costly that an undetermined hustler won't sacrifice their day's hustle for exploring and learning the tech spectrum.

At the same time, there are a lot of areas technology developers can dive into. Finance, hospitality, agriculture, transportation, and content creation are all viable fields. As a determined techie, I tasked myself with allocating 10% of everything I do and earn to learning and exploring technology. This helped me to have some time, money, and resources for my tech journey. As for mentorship, I’m building a global network of resourceful folks to help me venture into new areas of the tech sector.

How did you become a GDG lead?

I’ve always been that person who joined tech events as often as I could find registration links. In my college days, I would skip classes to attend events located hours away. I would hardly miss Python Hyderabad, PyCons, and many other Android meetups. It was during the International Women's Day (IWD) 2018 event organized by WTM Hyderabad and GDG Hyderabad that I was lucky enough to give a short challenge pitch talk. I saw how excited and amazed the conference folks were, given that I was the only African in the huge Tech Mahindra conference hall. I met a lot of people, organizers, business personalities, and students.

Kose takes the stage for International Women's Day (IWD) 2018

At the end of the conference and subsequent events, I convinced myself to start a similar community. Since starting out with a WhatsApp group chat, we’ve grown to about 200 members on our GDG event platform and have event partners like Koneta Hub and others. Since then, GDG Juba has been helping grow the tech community around Juba, South Sudan.

How has the GDG community helped you grow in the tech industry?

From design thinking to public speaking and structuring technical meetups, the GDG community has become a resourceful part of organizing GDG Juba meetups and enhancing my organizational skills.

As a community lead, I continuously plan more impactful events and conferences, and network with potential event partners, speakers, mentors, and organizers. Being part of the GDG community has helped me get opportunities to share knowledge with others. In one instance, I became a mobile web mentor and judge for the Google Africa Developer Scholarship 2020 program.

What has been the most inspiring part of being a part of your local Google Developer Group?

As a tech aspirant, I had always wanted to be part of a tech community to learn, network, and grow. Unfortunately, back then there wasn't a single tech user group in my locality. The most inspiring thing about being part of this chapter is the network buildup and learning from the community. I get to network with people I could never have networked with on a typical day.

Kose at a GDG Juba meetup

A lot of our meetup attendees now share their knowledge and experiences with others to inspire many. We are seeing a community getting more engagement in technology. Students tell us they are learning things they hardly get in their college curriculum.

As a learner myself, I am very excited to see folks learn new tech skills and am also happy to see women participating in the tech events. I’m especially proud of the fact that we organized International Women's Day (IWD) 2021, making it possible for us to be featured in a famous local newspaper outlet.

What are some technical resources you have found the most helpful for your professional development?

The official documentation from Google Developers for Android, Firebase, and others has been, and still is, helpful for understanding and diving into the details of the new things I learn.

In addition to the cool resources from the awesome tech bloggers on the internet, these links are helping me a lot in my adventure:

  1. Google Developers Medium articles
  2. Android Developers Training courses
  3. Udacity Android/Firebase courses
  4. GitHub code review
  5. Google Developers India YouTube channel

What is coming up for GDG Juba that you are most excited about?

As part of our Android Study Jam conducted earlier this year, we are planning to host a mentorship program for Android application development. The program will run from scratch to building a fully-fledged, deployable Android app that the community can use for daily activities. I am particularly excited about the fact that we will be having a mentor who has been in the industry for quite a long time. I hope to see people who read this article participating in the mentorship program, too!

What would be one piece of advice you have for someone looking to learn more about a specific technology?

Be a learner. Join groups that can mentor your learning journey.

Ready to learn new skills with developers like Kose? Find a Google Developer Group near you, here.

Video-Touch: Multi-User Remote Robot Control in Google Meet call by DNN-based Gesture Recognition

A guest post by the Engineering team at Video-Touch

Please note that the information, uses, and applications expressed in the below post are solely those of our guest author, Video-Touch.

You may have watched science fiction movies where people control robots with the movements of their bodies. Modern computer vision and robotics approaches allow us to make such an experience real, and no less exciting and incredible.

Inspired by the idea of making remote control and teleoperation practical and accessible during such a hard period of coronavirus, we came up with the Video-Touch project.

Video-Touch is the first robot-human interaction system that allows multi-user control via a video call application (e.g. Google Meet, Zoom, Skype) from anywhere in the world.

Figure 1: The Video-Touch system in action: single user controls a robot during a Video-Touch call. Note the sensors’ feedback when grasping a tube [source video].

We wondered whether it is even possible to control a robot remotely using only your own hands, without any additional devices like gloves or a joystick, and without suffering from significant delay. We decided to use computer vision to recognize movements in real time and instantly pass them to the robot. Thanks to MediaPipe, this is now possible.

Our system looks as follows:

  1. The video conference application captures the webcam video on the user’s device and sends it to the robot computer (“server”);
  2. The user’s webcam video stream is captured on the robot computer’s display via the OBS virtual camera tool;
  3. The recognition module reads the user’s movements and gestures with the help of MediaPipe and sends them to the next module via ZeroMQ (a sketch of this hand-off follows the list);
  4. The robotic arm and its gripper are controlled from Python, given the motion capture data.

Figure 2: Overall scheme of the Video-Touch system: from users webcam to the robot control module [source video].

As the scheme shows, all the user needs to operate the robot is a stable internet connection and a video conferencing app. All the computation, such as screen capture, hand tracking, gesture recognition, and robot control, is carried out on a separate device (just another laptop) connected to the robot via Wi-Fi. Next, we describe each part of the pipeline in detail.

Video stream and screen capture

One can use any software that sends video from one computer to another. In our experiments, we used a desktop video conferencing application. The user calls from their device to a computer with a display connected to the robot, so that computer can show the video stream from the user's webcam.

Now we need some mechanism to pass the user's video from the video conference to the Recognition module. We use Open Broadcaster Software (OBS) and its virtual camera tool to capture the open video conference window. The result is a virtual camera that carries frames from the user's webcam and has its own unique device index, which can then be used in the Recognition module.
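
To make the device-index idea concrete, here is a minimal Python/OpenCV sketch of grabbing frames from the virtual camera. It is an illustration only: the production Recognition module consumes the frames inside a MediaPipe C++ graph, and the index value is an assumption that depends on the machine.

```python
# A minimal sketch (not the production code): addressing the OBS virtual camera
# by its device index. VIRTUAL_CAM_INDEX is an assumption; the actual index
# depends on how many cameras the robot computer has.
import cv2

VIRTUAL_CAM_INDEX = 2  # hypothetical index of the OBS virtual camera

cap = cv2.VideoCapture(VIRTUAL_CAM_INDEX)
if not cap.isOpened():
    raise RuntimeError(f"Could not open virtual camera #{VIRTUAL_CAM_INDEX}")

while True:
    ok, frame = cap.read()  # frame holds the remote user's webcam image
    if not ok:
        break
    cv2.imshow("user stream", frame)  # here the frame would go to hand tracking
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```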

Recognition module

The role of the Recognition module is to capture the user's movements and pass them to the Robot control module. This is where MediaPipe comes in. We searched for the most efficient and precise computer vision software for hand motion capture. We found many exciting solutions, but MediaPipe turned out to be the only tool suitable for such a challenging task: real-time, on-device, fine-grained hand movement recognition.

We made two key modifications to the MediaPipe Hand Tracking module: we added a gesture recognition calculator and integrated the ZeroMQ message-passing mechanism.

At the time of our previous publication we had two versions of the gesture recognition implementation. The first version is depicted in Figure 3 below and does all the computation inside the Hand Gesture Recognition calculator. The calculator takes scaled landmarks as its input, i.e. landmarks normalized to the size of the hand bounding box rather than to the whole image. It then recognizes one of four gestures (see also Figure 4): “move”, “angle”, “grab”, and “no gesture” (the “finger distance” gesture from the paper was experimental and was not included in the final demo) and outputs the gesture class name. Although this version is quite robust and useful, it is based only on simple heuristic rules such as “if landmark[i].x < landmark[j].x, then it is a `move` gesture”, and it fails in some real-life cases such as hand rotation (an illustrative sketch follows Figure 3 below).

Figure 3: Modified MediaPipe Hand Landmark CPU subgraph. Note the HandGestureRecognition calculator
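
As an illustration of this rule-based approach, the sketch below classifies a gesture from bounding-box-normalized landmarks with simple comparisons. The thresholds and landmark comparisons chosen here are assumptions for demonstration; the actual rules live inside the C++ HandGestureRecognition calculator.

```python
# Illustrative heuristic gesture rules on scaled (bounding-box-normalized) landmarks.
# Thresholds are arbitrary example values, not the ones used in the real calculator.
MOVE, ANGLE, GRAB, NO_GESTURE = "move", "angle", "grab", "no_gesture"

def classify_gesture(lm):
    """lm: 21 (x, y) MediaPipe hand landmarks normalized to the hand bounding box."""
    wrist, thumb_tip, index_tip, middle_tip = lm[0], lm[4], lm[8], lm[12]

    # "grab": index and middle fingertips folded close to the wrist (fist-like pose).
    if all(abs(tip[1] - wrist[1]) < 0.35 for tip in (index_tip, middle_tip)):
        return GRAB
    # "angle": thumb spread far to the side of the index fingertip.
    if thumb_tip[0] < index_tip[0] - 0.30:
        return ANGLE
    # "move": index fingertip clearly extended away from the wrist.
    if index_tip[1] < wrist[1] - 0.40:
        return MOVE
    return NO_GESTURE
```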

To alleviate the problem of poor generalization, we implemented a second version. We trained a Gradient Boosting classifier from scikit-learn on a manually collected and labeled dataset of 1000 keypoints: 200 each for the “move”, “angle”, and “grab” classes, and 400 for the “no gesture” class. Today, such a dataset could be obtained easily using the recently released Jesture AI SDK repo (note: another project by some of our team members).

We used scaled landmarks, angles between joints, and pairwise landmark distances as inputs to the model to predict the gesture class. Next, we tried passing only the scaled landmarks, without any angles or distances, and it resulted in a similar multi-class accuracy of 91% on a local validation set of 200 keypoints. One more point about this version of the gesture classifier: we were not able to run the scikit-learn model from C++ directly, so we implemented it in Python as part of the Robot control module.
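
A minimal sketch of this classifier, assuming landmark-only features (the 91% variant) and a hypothetical load_gesture_dataset() helper in place of the manually labeled 1000-keypoint set:

```python
# Sketch of the second-version classifier: scikit-learn Gradient Boosting on
# scaled landmarks only. load_gesture_dataset() is a hypothetical placeholder.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

def landmarks_to_features(landmarks):
    """Flatten 21 (x, y) landmarks, already scaled to the hand bounding box."""
    return np.asarray(landmarks, dtype=np.float32).reshape(-1)  # shape: (42,)

# X: (N, 42) landmark features; y: labels for "move", "angle", "grab", "no_gesture".
X, y = load_gesture_dataset()  # hypothetical loader for the labeled keypoints

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, stratify=y)

clf = GradientBoostingClassifier()
clf.fit(X_train, y_train)
print("validation accuracy:", clf.score(X_val, y_val))  # ~0.91 reported above
```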

Figure 4: Gesture classes recognized by our model (“no gesture” class is not shown).

Right after the publication, we trained a fully-connected neural network in Keras on the same dataset as the Gradient Boosting model, and it gave an even better result of 93%. We converted this model to the TensorFlow Lite format, so we can now run the gesture recognition ML model right inside the Hand Gesture Recognition calculator (a sketch follows Figure 5 below).

Figure 5: Fully-connected network for gesture recognition converted to TFLite model format.
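
A minimal sketch of such a network and its TFLite export is shown below. The layer sizes and training details are assumptions, since the post only states that the model is fully connected and trained on the same dataset as the Gradient Boosting classifier.

```python
# Sketch: fully-connected gesture classifier in Keras, exported to TensorFlow Lite
# so it can run inside the Hand Gesture Recognition calculator. Layer sizes are
# illustrative assumptions.
import tensorflow as tf

NUM_FEATURES = 42  # 21 scaled (x, y) landmarks
NUM_CLASSES = 4    # move, angle, grab, no_gesture

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_FEATURES,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=50)

# Convert to TensorFlow Lite for on-device inference inside the MediaPipe graph.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open("gesture_classifier.tflite", "wb") as f:
    f.write(tflite_model)
```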

Once we have the current hand location and gesture class, we need to pass them to the Robot control module. We do this with the help of the high-performance asynchronous messaging library ZeroMQ. To implement this in C++, we used the libzmq library and the cppzmq headers. We used the request-reply pattern: a REP socket (server) in the C++ code of the Recognition module and a REQ socket (client) in the Python code of the Robot control module.
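
On the Python side, the REQ client could look roughly like this sketch. The endpoint address and the JSON message layout are assumptions for illustration; the real format is whatever the C++ REP server in the Recognition module sends.

```python
# Sketch of the Robot control module's REQ client (pyzmq). Endpoint and message
# format are illustrative assumptions.
import json
import zmq

context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://127.0.0.1:5555")  # hypothetical address of the C++ REP server

while True:
    socket.send_string("get")                  # ask for the latest hand state
    reply = json.loads(socket.recv_string())   # e.g. {"center": [x, y], "gesture": "move"}
    hand_center, gesture = reply["center"], reply["gesture"]
    # ...hand_center and gesture are then fed into the robot control logic...
```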

So, using the hand tracking module with our modifications, we can now pass the motion capture information to the robot in real time.

Robot control module

The Robot control module is a Python script that takes hand landmarks and the gesture class as input and outputs a robot movement command on each frame. The script runs on a computer connected to the robot via Wi-Fi. In our experiments we used an MSI laptop with an Nvidia GTX 1050 Ti GPU. We also tried running the whole system on an Intel Core i7 CPU, and it was still real-time with negligible delay, thanks to the highly optimized MediaPipe compute graph implementation.

We use the 6-DoF UR10 robot by Universal Robots in our current pipeline. Since the gripper we are using is a two-finger one, we do not need a complete mapping of each landmark to a robot finger keypoint, only the location of the hand's center. Using these center coordinates and the python-urx package, we can change the robot's velocity in a desired direction and orientation: on each frame, we calculate the difference between the current hand center coordinate and the one from the previous frame, which gives us a velocity change vector or angle. In the end, the whole mechanism looks very similar to controlling a robot with a joystick (a sketch of this logic follows Figure 6 below).

Figure 6: Hand-robot control logic follows the idea of a joystick with pre-defined movement directions [source video].
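
A hedged sketch of this joystick-like control with python-urx follows. The robot's IP address, the gain, and the image-to-tool axis mapping are illustrative assumptions.

```python
# Sketch: map per-frame hand-center displacement to a UR10 tool velocity via
# python-urx. IP address, gain, and axis mapping are illustrative assumptions.
import urx

robot = urx.Robot("192.168.1.100")  # hypothetical robot address
GAIN = 0.5                          # scales normalized hand motion to m/s
prev_center = None

def on_frame(hand_center, gesture):
    """hand_center: (x, y) in normalized image coordinates, updated every frame."""
    global prev_center
    if prev_center is not None and gesture == "move":
        dx = hand_center[0] - prev_center[0]
        dy = hand_center[1] - prev_center[1]
        # Image-plane motion becomes a linear tool velocity; z and rotations stay zero.
        robot.speedl([GAIN * dx, -GAIN * dy, 0, 0, 0, 0], acc=0.3, min_time=0.1)
    prev_center = hand_center
```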

Tactile perception with high-density tactile sensors

Dexterous manipulation requires high spatial resolution and high-fidelity tactile perception of objects and environments. The newest sensor arrays are well suited for robotic manipulation, as they can be easily attached to any robotic end effector and adapted to any contact surface.

Figure 7: High fidelity tactile sensor array: a) Array placement on the gripper. b) Sensor data when the gripper takes a pipette. c) Sensor data when the gripper takes a tube [source publication].

Video-Touch is equipped with high-density tactile sensor arrays installed in the two-finger robotic gripper, with one array attached to each fingertip. A single electrode array senses a frame area of 5.8 cm² at a resolution of 100 points per frame, with a sensing frequency of 120 Hz and a force detection range of 1-9 N per point. Thus, the robot detects the pressure applied to solid or flexible objects grasped by the robotic fingers with a total resolution of 200 points (100 points per finger).

The data collected from the sensor arrays is processed and displayed to the user as dynamic finger-contact maps. The pressure sensor arrays allow the user to perceive the grasped object's physical properties, such as compliance, hardness, roughness, shape, and orientation.
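
Purely as an illustration, a dynamic contact map for one fingertip could be rendered roughly as below, assuming the 100 points arrive as a flat array and form a 10x10 grid; the real processing depends on the sensor's acquisition SDK.

```python
# Sketch: render one fingertip's 100-point pressure frame as a contact map.
# The 10x10 layout is an assumption about how the points are arranged.
import numpy as np
import matplotlib.pyplot as plt

def render_contact_map(frame_points):
    """frame_points: flat array of 100 force readings in newtons (1-9 N per point)."""
    pressure = np.asarray(frame_points, dtype=np.float32).reshape(10, 10)
    plt.imshow(pressure, cmap="hot", vmin=0.0, vmax=9.0)
    plt.colorbar(label="force, N")
    plt.title("Fingertip contact map")
    plt.pause(1.0 / 120.0)  # the sensor streams at 120 Hz
    plt.clf()
```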

Figure 8: Multi-user robotic arm control feature. The users are able to perform a COVID-19 test during a regular video call [source video].

Endnotes

Thus, by using MediaPipe and a robot, we built an effective multi-user robot teleoperation system. Potential future uses of teleoperation systems include medical testing and experiments in difficult-to-access environments such as outer space. The system's multi-user functionality addresses a real problem of effective remote collaboration, allowing a group of several people to work together on projects that need manual remote control.

Another nice feature of our pipeline is that one could control the robot using any device with a camera, e.g. a mobile phone. One could also operate other hardware form factors, such as edge devices, mobile robots, or drones, instead of a robotic arm. Of course, the current solution has some limitations: latency, the use of the z-coordinate (depth), and the convenience of the gesture types could all be improved. We can't wait to try out the updates from the MediaPipe team, and we are looking forward to trying new types of grippers (with fingers), two-hand control, or even whole-body control (hello, “Real Steel”!).

We hope the post was useful for you and your work. Keep coding and stay healthy. Thank you very much for your attention!

This blog post is curated by Igor Kibalchich, ML Research Product Manager at Google AI