Category Archives: Google Developers Blog

News and insights on Google platforms, tools and events

Google Pay picks Flutter to drive its global product development

Posted by David Ko, Engineering Director; Jeff Lim, Software Engineer; Pankaj Gupta, Director of Engineering; Will Horn, Software Engineer

Three years ago, when we launched Google Pay India (then called Tez), our vision was to create a simple and secure payment app for everyone in India. We started from the premise of simplicity and built a user interface that made sending a payment as easy as starting a conversation. The simplicity of the design resonated with users instantly, and over time we have added functionality to help users do more than just make payments. Today users can pay their bills, recharge their phones, get instant loans through banks, buy train tickets, and much more, all within the app. Last year, we also launched the Spot Platform in India, which allows merchants to create branded experiences within the Google Pay app so they can connect with their customers in a more engaging way.

As we looked at scaling our learnings from India to other parts of the world, we wanted a fast and efficient development environment that was modern and engaging, with the flexibility needed to keep the UI clean. More importantly, we wanted one that enabled us to write once and deploy to both iOS and Android, reaching a wide variety of users.

It was clear that we would need to build this ourselves and ensure that it worked across a wide variety of payment rails, infrastructure, and operating systems. But with the momentum we had for Google Pay in India and the fast-evolving product features, we had limited engineering resources to put behind this effort.

After evaluating various options, Flutter was the obvious choice. The three things that made it click for us were:

  • We could write once in Dart and deploy on both iOS and Android, giving users a uniform, best-in-class experience on both platforms;
  • The just-in-time compiler with hot reload during development enabled rapid iteration on the UI, which tremendously increased developer efficiency; and
  • Ahead-of-time compilation ensured high-performance deployment.

Now the task was to get it done. We started with a small team of three software engineers from both Android and iOS. Those days were focused and intense. To start with, we created a vertical slice of the app — home page, chat, and payments (with the critical native plugins for payments in India). The team first tried a hybrid approach, then decided on a clean rewrite because the hybrid approach was not scalable.

We ran a few small sprints for other engineers on the team to give them an opportunity to rewrite something in Flutter and provide feedback. Everyone loved Flutter — you could see the thrill on people’s faces as they talked about how fast it was to build a user interface. One of the most exciting things was that the team could get instant feedback while developing. We could also leverage the high quality widgets that Flutter provided to make development easier.

After carefully weighing the risks and our case for migration, we decided to go all in with Flutter. It was a monumental rewrite of a moving target: the existing app continued to evolve while we were rewriting features. After many months of hard work, the Google Pay Flutter implementation is now available in open beta in India and Singapore. Our users in India and Singapore can visit the Google Play Store page for Google Pay to opt into the beta program and experience the latest app built on Flutter. Next, we are looking forward to launching Google Pay on Flutter to everyone across the world on iOS and Android.


We hope this gives you a fair idea of how to approach and launch a complete rewrite of an active app that is used by millions of users and businesses of all sizes. It would not have been possible for us to deliver this without Flutter’s continued advances on the platform. Huge thanks to the Flutter team, as today, we are standing on their shoulders!

When fully migrated, Google Pay will be one of the largest production deployments on the Flutter platform. We look forward to sharing more learnings from our transition to Flutter in the future.

Doubling down on the edge with Coral’s new accelerator

Posted by The Coral Team


Moving into the fall, the Coral platform continues to grow with the release of the M.2 Accelerator with Dual Edge TPU. Its first application is in Google’s Series One room kits, where it helps remove interruptions and make the audio clearer for better video meetings. To help even more folks build products with Coral intelligence, we’re dropping the prices on several of our products. And for those looking to level up their at-home video production, we’re sharing a demo of a pose-based AI director that makes multi-camera video easier to produce.

Coral M.2 Accelerator with Dual Edge TPU

The newest addition to our product family brings two Edge TPU co-processors to systems in an M.2 E-key form factor. While the design requires a dual bus PCIe M.2 slot, it brings enhanced ML performance (8 TOPS) to tasks such as running two models in parallel or pipelining one large model across both Edge TPUs.

The ability to scale across multiple edge accelerators isn’t limited to only two Edge TPUs. As edge computing expands to local data centers, cell towers, and gateways, multi-Edge TPU configurations will be required to help process increasingly sophisticated ML models. Coral allows the use of a single toolchain to create models for one or more Edge TPUs that can address many different future configurations.

A great example of how the Coral M.2 Accelerator with Dual Edge TPU is being used is in the Series One meeting room kits for Google Meet.

The new Series One room kits for Google Meet run smarter with Coral intelligence


Google’s new Series One room kits use our Coral M.2 Accelerator with Dual Edge TPU to bring enhanced audio clarity to video meetings. TrueVoice®, a multi-channel noise cancellation technology, minimizes distractions to ensure every voice is heard with up to 44 channels of echo and noise cancellation, making distracting sounds like snacking or typing on a keyboard a concern of the past.

Enabling the clearest possible communication in challenging environments was the target for the Google Meet hardware team. The consideration of what makes a challenging environment was not limited to unusually noisy environments, such as lunchrooms doubling as conference rooms. Any conference room can present challenging acoustics that make it difficult for all participants to be heard.

The secret to clarity without expensive and cumbersome equipment is to use virtual audio channels and AI driven sound isolation. Read more about how Coral was used to enhance and future-proof the innovative design.

Expanding the AI edge

Earlier this year, we reduced the prices of our prototyping devices and sensors. We are excited to share further price drops on more of our products. Our System-on-Module is now available for $99.99, and our Mini PCIe Accelerator, M.2 Accelerator A+E Key, and M.2 Accelerator B+M Key are now available at $24.99. We hope these lower prices will make our edge AI more accessible to more creative minds around the world. Later this month, our SoM offering will also expand to include 2 GB and 4 GB RAM options.

Multi-cam with AI


As we expand our platform and product family, we continue to keep new edge AI use cases in mind. We are continually inspired by our developer community’s experimentation and implementations. When recently faced with the challenges of multicam video production from home, Markku Lepistö, Solutions Architect at Google Cloud, created this real-time, pose-based multicam tool he aptly dubbed the AI Director.

We love seeing such unique implementations of on-device ML and invite you to share your own projects and feedback at [email protected].

For a list of worldwide distributors, system integrators and partners, visit the Coral partnerships page. Please visit Coral.ai to discover more about our edge ML platform.

Applications are open for Google for Startups Accelerator in Japan

Posted by Takuo Suzuki, Developer Relations Program Manager

image from a recent accelerator

The Google for Startups Accelerator helps founders across the globe solve for important economic and societal challenges, while helping them grow and scale their business. Due to the continued success of the program around the world, we are pleased to open up applications for our third Accelerator class in Japan, commencing January 2021. Applications will remain open until October 30, 2020.

This accelerator is designed for established startups across Japan using technology to help solve important social and environmental issues, and that contribute to the Japanese economy. This includes (but is not limited to) startups tackling:

  • Ageing society and declining workforce
  • Energy, environment, and sustainability
  • Rural revitalization
  • Medicine, health, and well-being
  • Education
  • Diversity, inclusion, and social equity

Google for Startups Accelerators provide support to later-stage companies that have already launched their product and have strong market fit and the potential to scale rapidly in the future. Startups in the program benefit from tailored Google mentorship, product advice and credits, technical workshops, and connections to other founders, VCs, and industry experts.

Each participating startup selected for the Google for Startups Accelerator program will join a 500+ company alumni network of startups from around the world. These include Selan (Class #2 Japan), whose product Omister is improving education and childcare in Japan by providing bilingual instructors for children, and mDoc (Class #1 Sustainability, Europe), a Nigerian startup helping people in West Africa with chronic diseases get treatment via their app.

In summary:

  • Suitable for startups solving for societal or environmental issues in Japan
  • Application open: September 15, 2020
  • Application close: October 30, 2020
  • Announcement of selected startups: December 2020
  • Program runs from late-January 2021 to end of April 2021 (planned)
  • Please refer to the website for further information and to apply.

Building solutions using the G Suite developer platform

Posted by Charles Maxson, Developer Advocate, G Suite

Millions of users know G Suite as a collection of communication and productivity apps that enables teams to easily create, communicate, collaborate, and discover content to supercharge teamwork. Beneath the surface of this well-serving collection of apps is also an extensible platform that enables developers to build targeted custom experiences and integrations utilizing these apps, allowing G Suite’s vast user base to get even more value out of the platform.

At first glance, it may not be natural to think of the tools you use for day-to-day productivity and collaboration as a developer platform. But consider what makes up a developer platform: languages, APIs, runtimes, frameworks, IDEs, an ecosystem, and so on. G Suite offers developers all of these things and more.

Let’s take a closer look at what makes up the G Suite developer platform and how you can use it.

G Suite as a Developer Platform

There are a lot of components that make up G Suite as a platform. For a developer, probably none is more important than the data that your solution collects, processes, and presents. As a platform, G Suite is highly interoperable and secure, and also interestingly unique.

Being interoperable, G Suite lets you interact with your data--whether your data is in G Suite or elsewhere, no matter how you store it or how you want to analyze it. G Suite allows you to keep your data where it best suits your application, while offering you flexibility to access it easily. Some examples include rich integrations with sources like BigQuery or JDBC databases. Better yet, often little to no code is required to get you connected.

Where G Suite is unique as a platform is that it can be used to store data, or, perhaps even more interesting, to produce data. For native storage, you may use Drive as a content repository, store information in a Sheets spreadsheet, or collect it via Google Forms as a front end. Additionally, there are many scenarios where the content your users are engaging with (emails, chats, events, tasks, contacts, documents, identity, etc.) can be harnessed to create unique interactions with G Suite. Solutions that build on or integrate with G Suite provide unique business value, and regardless of where your data resides, accessing it as a developer is straightforward via the platform.

The core of the G Suite developer platform itself is composed of frameworks for developer features including G Suite Add-ons and Chatbots, as well as a comprehensive library of REST APIs. These allow you to interface with the full G Suite platform to create integrations, build extensions, add customizations, and access content or data.

G Suite Add-ons and Chatbots are frameworks specifically designed for G Suite that allow you to quickly and safely build experiences that enrich the way users interact within G Suite apps, while the REST APIs give you essentially unlimited access to G Suite apps and data, including Gmail, Classroom, Calendar, Drive, Docs, Sheets, Slides, Tasks, and more. What you build, and what you build with, including languages and dev environments, is up to you!

The beauty of G Suite as a platform is how you can unlock complementary technologies like Google Cloud that expand the platform to be even more powerful. Think about a G Suite UI connecting to a Google Cloud Platform backend; the familiar interface of G Suite coupled with the phenomenal power and scale of GCP!

Building with GCP from G Suite, you have access to components like the AI Platform. This enables scenarios like using Google Sheets as a front end to AI tools like the Vision, Natural Language, and Translation APIs. Imagine how you can change the way users interact with G Suite, your app, and your data when combined with the power of ML.

Another useful concept is how you can add natural conversational experiences to your app in G Suite with tools like Dialogflow. This way, instead of writing complicated interfaces that users have to learn, you could build a G Suite Chat bot that invokes Dialogflow to allow users to execute commands directly from within their team conversations in Chat. For example, users could just ask a Chat bot to “Add a task to the project list” or “Assign this issue to Matt”. A recent example of this is Data QnA, a natural language interface for analyzing BigQuery data.

BigQuery is another GCP tool that works natively with G Suite to allow you to analyze and leverage larger, complicated data sets while producing unique custom reports that can be surfaced in a user friendly way. One of the ways to leverage BigQuery with G Suite is through Connected Sheets, which provides the power and scale of a BigQuery data warehouse in the familiar context of Sheets. With Connected Sheets, you can analyze billions of rows of live BigQuery data in Google Sheets without requiring SQL knowledge. You can apply familiar tools—like pivot tables, charts, and formulas—to easily derive insights from big data.

One relatively new addition to the Google Cloud family also worth mentioning here is AppSheet. AppSheet is a no-code tool that can be used to quickly build mobile and web apps. Being no-code, it may seem out of place in a discussion for a development platform, but AppSheet is a dynamic and agile tool that makes it great for building apps fast or envisioning prototypes, while also connecting to G Suite apps like Google Sheets, allowing you to access G Suite platform data with ease.

When you do need the power of writing custom code, one of the foundational components of the G Suite developer platform is Apps Script. For over a decade, Apps Script has been the serverless, JavaScript-based runtime that natively powers G Suite extensibility. Built directly into G Suite with its own IDE, Apps Script makes it super fast and easy to get started building solutions with nothing to install or configure: just open it and start coding, or even let the macro recorder write code for you! Apps Script masks a lot of complexities that developers face, like handling user authentication, allowing you to focus on creating solutions quickly. Its native integration and relative simplicity also welcome developers with diverse skill levels to build customized workflows, menus and UI, automations, and more right inside G Suite.

While Apps Script is nimble and useful for many use cases, we know that many developers have preferences around tools, languages, and development environments. G Suite is an open platform that encourages developers to choose the options that make them most productive. In continuing to build on that principle, we recently introduced Alternate Runtimes for G Suite Add-ons. This new capability allows you to create solutions using the G Suite Add-ons framework without being bound to Apps Script as a toolset, giving you the choice and freedom to leverage your existing preferences and investments in hosting infrastructure, development tools, source control, languages, code libraries, and more.

Finally, what completes the vision of G Suite as a developer platform is that you have the confidence and convenience of an established platform that is broadly deployed and backed by tools like Google Identity Management and the G Suite Admin Console for administration and security. This enables you to build your solutions, whether it’s a customized solution for your internal users or an integration between your software platform and G Suite, and distribute them at a domain level or even globally via the G Suite Marketplace, which is an acquisition channel for developers and a discovery engine for end users and enterprise admins alike.

Now that you can see how G Suite is a developer platform, imagine what you can build!

Visit the G Suite Developer homepage and get started on your journey today.

Instant Motion Tracking with MediaPipe

Posted by Vikram Sharma, Software Engineering Intern; Jianing Wei, Staff Software Engineer; Tyler Mullen, Senior Software Engineer

Augmented Reality (AR) technology creates fun, engaging, and immersive user experiences. The ability to perform AR tracking across devices and platforms, without initialization, remains important for powering AR applications at scale.

Today, we are excited to release the Instant Motion Tracking solution in MediaPipe. It is built upon the MediaPipe Box Tracking solution we released previously. With Instant Motion Tracking, you can easily place fun virtual 2D and 3D content on static or moving surfaces, allowing them to seamlessly interact with the real world. This technology also powered MotionStills AR. Along with the library, we are releasing an open source Android application to showcase its capabilities. In this application, a user simply taps the camera viewfinder in order to place virtual 3D objects and GIF animations, augmenting the real-world environment.


Instant Motion Tracking in MediaPipe

Instant Motion Tracking

The Instant Motion Tracking solution provides the capability to seamlessly place virtual content on static or moving surfaces in the real world. To achieve that, we provide six-degrees-of-freedom tracking with relative scale in the form of rotation and translation matrices. This tracking information is then used in the rendering system to overlay virtual content on camera streams to create immersive AR experiences.

The core concept behind Instant Motion Tracking is to decouple the camera’s translation and rotation estimation, treating them instead as independent optimization problems. This approach enables AR tracking across devices and platforms without initialization or calibration. We do this by first finding the 3D camera translation using only the visual signals from the camera. This involves estimating the target region's apparent 2D translation and relative scale across frames. The process can be illustrated with a simple pinhole camera model, relating translation and scale of an object in the image plane to the final 3D translation.


By finding the change in relative size of our tracked region from view position V1 to V2, we can estimate the relative change in distance from the camera.
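
In symbols, under a simple pinhole camera model (the notation below is ours for illustration; the paper gives the full derivation), the apparent size of the tracked region is inversely proportional to its distance from the camera, so the relative change in size yields the relative change in distance:

$$
\frac{s_2}{s_1} \approx \frac{d_1}{d_2}
\quad\Longrightarrow\quad
d_2 \approx d_1 \cdot \frac{s_1}{s_2},
$$

where $s_1$, $s_2$ are the apparent sizes of the tracked region at view positions V1 and V2, and $d_1$, $d_2$ are the corresponding distances from the camera. Combining this relative depth change with the region's 2D translation in the image plane gives the 3D camera translation up to scale.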

Next, we obtain the device’s 3D rotation from its built-in IMU (Inertial Measurement Unit) sensor. By combining this translation and rotation data, we can track a target region with six degrees of freedom at relative scale. This information allows for the placement of virtual content on any system with a camera and IMU functionality, and is calibration free. For more details on Instant Motion Tracking, please refer to our paper.

A MediaPipe Pipeline for Instant Motion Tracking

A diagram of the Instant Motion Tracking pipeline is shown below, consisting of four major components: a Sticker Manager module, a Region Tracking module, a Matrices Manager module, and lastly a Rendering System. Each of the components consists of MediaPipe calculators or subgraphs.


Diagram of Instant Motion Tracking Pipeline

The Sticker Manager accepts sticker data from the application and produces initial anchors (tracked region information) based on user taps, and user gesture controls for every sticker object. Initial anchors are then sent to our Region Tracking module to generate tracked anchors. The Matrices Manager combines this data with our device’s rotation matrix to produce six degrees-of-freedom poses as model matrices. After integrating any user-specified transforms like asset scaling, our final poses are forwarded to the Rendering System to render all virtual objects overlaid on the camera frame to produce the output AR frame.

Using the Instant Motion Tracking Solution

The Instant Motion Tracking solution is easy to use by leveraging the MediaPipe cross-platform framework. With camera frames, device rotation matrix, and anchor positions (screen coordinates) as input, the MediaPipe graph produces AR renderings for each frame, providing engaging experiences. If you wish to integrate this Instant Motion Tracking library with your system or application, please visit our documentation to build your own AR experiences on any device with IMU functionality and a camera sensor.

Augmenting The World with 3D Stickers and GIFs

The Instant Motion Tracking solution lets you bring both 3D stickers and GIF animations into augmented reality experiences. GIFs are rendered on flat 3D billboards placed in the world, introducing fun and immersive experiences with animated content blended into the real environment. Try it for yourself!


Demonstration of GIF placement in 3D

MediaPipe Instant Motion Tracking is already helping PixelShift.AI, a startup applying cutting-edge vision technologies to facilitate video content creation, to track virtual characters seamlessly in the view-finder for a realistic experience. Building upon Instant Motion Tracking’s high-quality pose estimation, PixelShift.AI enables VTubers to create mixed reality experiences with web technologies. The product is going to be released to the broader VTuber community later this year.


Instant Motion Tracking helps PixelShift.AI create mixed reality experiences

Follow MediaPipe

We look forward to publishing more blog posts related to new MediaPipe pipeline examples and features. Please follow the MediaPipe label on the Google Developers Blog and the Google Developers Twitter account (@googledevs).

Acknowledgement

We would like to thank Vikram Sharma, Jianing Wei, Tyler Mullen, Chuo-Ling Chang, Ming Guang Yong, Jiuqiang Tang, Siarhei Kazakou, Genzhi Ye, Camillo Lugaresi, Buck Bourdon, and Matthias Grundman for their contributions to this release.

Guidance to developers affected by our effort to block less secure browsers and applications

Posted by Lillan Marie Agerup, Product Manager

We are always working to improve the security protections of Google accounts. Our security systems automatically detect, alert, and help protect our users against a range of security threats. One form of phishing, known as “man-in-the-middle” (MITM), is hard to detect when an embedded browser framework (e.g., Chromium Embedded Framework, or CEF) or another automation platform is being used for authentication. A MITM attack presents an authentication flow on these platforms and intercepts the communications between a user and Google to gather the user’s credentials (including the second factor in some cases) and sign in. To protect our users from these types of attacks, Google Account sign-ins from all embedded frameworks will be blocked starting on January 4, 2021. This block affects CEF-based apps and other non-supported browsers.

To minimize the disruption of service to our partners, we are providing this information to help developers set up OAuth 2.0 flows in supported user-agents. The information in this document outlines the following:

  • How to enable sign-in on your embedded framework-based apps using browser-based OAuth 2.0 flows.
  • How to test for compatibility.

Apps that use embedded frameworks

If you're an app developer and use CEF or other clients for authorization on devices, use browser-based OAuth 2.0 flows. Alternatively, you can use a compatible full native browser for sign-in.
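
As a rough sketch of what a browser-based flow can look like on Android, the snippet below opens Google's standard OAuth 2.0 authorization endpoint in a Custom Tab instead of an embedded framework; the client ID, redirect URI, and scope are placeholders for your app's own values, and libraries such as AppAuth for Android implement this pattern end to end.

    import android.content.Context
    import android.net.Uri
    import androidx.browser.customtabs.CustomTabsIntent

    // Launches the Google OAuth 2.0 authorization endpoint in the user's browser
    // (via a Custom Tab) rather than in an embedded framework such as CEF or a WebView.
    // clientId and redirectUri are placeholders for your app's registered values.
    fun startBrowserSignIn(context: Context, clientId: String, redirectUri: String) {
        val authUri = Uri.parse("https://accounts.google.com/o/oauth2/v2/auth")
            .buildUpon()
            .appendQueryParameter("client_id", clientId)
            .appendQueryParameter("redirect_uri", redirectUri)
            .appendQueryParameter("response_type", "code")
            .appendQueryParameter("scope", "openid email")
            .build()

        // The OAuth response is delivered back to the app through the redirect URI
        // (for example, a custom scheme or an https App Link) once the user signs in.
        CustomTabsIntent.Builder().build().launchUrl(context, authUri)
    }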

For limited-input device applications, such as applications that do not have access to a browser or have limited input capabilities, use limited-input device OAuth 2.0 flows.

Browsers

Modern browsers with security updates will continue to be supported.

Browser standards

The browser must have JavaScript enabled. For more details, see our previous blog post.

The browser must not proxy or alter the network communication. Your browser must not do any of the following:

  • Server-side rendering
  • HTTPS proxy
  • Replay requests
  • Rewrite HTTP headers

The browser must have a reasonably complete implementation of web standards and browser features. You must confirm that your browser is not one of the following:

  • Headless browsers
  • Node.js
  • Text-based browsers

The browser must identify itself clearly in the User-Agent. The browser must not try to impersonate another browser like Chrome or Firefox.

The browser must not provide automation features. This includes scripts that automate keystrokes or clicks, especially to perform automatic sign-ins. We do not allow sign-in from browsers based on frameworks like CEF or Embedded Internet Explorer.

Test for compatibility

If you're a developer that currently uses CEF for sign-in, be aware that support for this type of authentication ends on January 4, 2021. To verify whether you'll be affected by the change, test your application for compatibility. To test your application, add a specific HTTP header and value to disable the allowlist. The following steps explain how to disable the allowlist:

  1. Go to where you send requests to accounts.google.com.
  2. Add Google-Accounts-Check-OAuth-Login:true to your HTTP request headers.

The following example details how to disable the allowlist in CEF.

Note: You can add your custom headers in CefRequestHandler#OnBeforeResourceLoad.

    CefRequest::HeaderMap hdrMap;
    request->GetHeaderMap(hdrMap);
    hdrMap.insert(std::make_pair("Google-Accounts-Check-OAuth-Login", "true"));

To test manually in Chrome, use ModHeader to set the header. The header enables the changes for that particular request.

Setting the header using ModHeader

Related content

See our previous blog post about protection against man-in-the-middle phishing attacks.

ML Kit Pose Detection Makes Staying Active at Home Easier

Posted by Kenny Sulaimon, Product Manager, ML Kit; Chengji Yan and Areeba Abid, Software Engineers, ML Kit


Two months ago we introduced the standalone version of the ML Kit SDK, making it even easier to integrate on-device machine learning into mobile apps. Since then we’ve launched the Digital Ink Recognition API, and also introduced the ML Kit early access program. Our first two early access APIs were Pose Detection and Entity Extraction. We’ve received an overwhelming amount of interest in these new APIs and today, we are thrilled to officially add Pose Detection to the ML Kit lineup.


A New ML Kit API, Pose Detection


Examples of ML Kit Pose Detection

ML Kit Pose Detection is an on-device, cross platform (Android and iOS), lightweight solution that tracks a subject's physical actions in real time. With this technology, building a one-of-a-kind experience for your users is easier than ever.

The API produces a full body 33 point skeletal match that includes facial landmarks (ears, eyes, mouth, and nose), along with hands and feet tracking. The API was also trained on a variety of complex athletic poses, such as Yoga positions.


Skeleton image detailing all 33 landmark points

Under The Hood

Diagram of the ML Kit Pose Detection Pipeline

The power of the ML Kit Pose Detection API is in its ease of use. The API builds on the cutting-edge BlazePose pipeline and allows developers to build great experiences on Android and iOS with little effort. We offer a full body model, support for both video and static image use cases, and have added multiple pre- and post-processing improvements to help developers get started with only a few lines of code.

The ML Kit Pose Detection API utilizes a two step process for detecting poses. First, the API combines an ultra-fast face detector with a prominent person detection algorithm, in order to detect when a person has entered the scene. The API is capable of detecting a single (highest confidence) person in the scene and requires the face of the user to be present in order to ensure optimal results.

Next, the API applies a full body, 33 landmark point skeleton to the detected person. These points are rendered in 2D space and do not account for depth. The API also contains a streaming mode option for further performance and latency optimization. When enabled, instead of running person detection on every frame, the API only runs this detector when the previous frame no longer detects a pose.

The ML Kit Pose Detection API also features two operating modes, “Fast” and “Accurate”. With the “Fast” mode enabled, you can expect a frame rate of around 30+ FPS on a modern Android device, such as a Pixel 4 and 45+ FPS on a modern iOS device, such as an iPhone X. With the “Accurate” mode enabled, you can expect more stable x,y coordinates on both types of devices, but a slower frame rate overall.
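
As a rough sketch of how these modes are selected on Android (class and package names follow the ML Kit pose detection libraries and are worth verifying against the current documentation):

    import com.google.mlkit.vision.pose.PoseDetection
    import com.google.mlkit.vision.pose.PoseDetector
    import com.google.mlkit.vision.pose.accurate.AccuratePoseDetectorOptions
    import com.google.mlkit.vision.pose.defaults.PoseDetectorOptions

    // "Fast" mode (base model), configured here for a live camera stream.
    val fastOptions = PoseDetectorOptions.Builder()
        .setDetectorMode(PoseDetectorOptions.STREAM_MODE)
        .build()

    // "Accurate" mode, configured here for single still images.
    val accurateOptions = AccuratePoseDetectorOptions.Builder()
        .setDetectorMode(AccuratePoseDetectorOptions.SINGLE_IMAGE_MODE)
        .build()

    // Pick whichever options object matches your use case.
    val detector: PoseDetector = PoseDetection.getClient(fastOptions)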

Lastly, we’ve also added a per point “InFrameLikelihood” score to help app developers ensure their users are in the right position and filter out extraneous points. This score is calculated during the landmark detection phase and a low likelihood score suggests that a landmark is outside the image frame.
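
Continuing the sketch above, each detected landmark carries its InFrameLikelihood score, so low-confidence points can be filtered out before any downstream logic (again, names follow the ML Kit Android SDK, and the 0.8 threshold is an arbitrary example):

    import android.graphics.Bitmap
    import com.google.mlkit.vision.common.InputImage
    import com.google.mlkit.vision.pose.PoseDetector
    import com.google.mlkit.vision.pose.PoseLandmark

    fun detectPose(detector: PoseDetector, bitmap: Bitmap) {
        // Wrap a camera frame or still image; 0 is the image rotation in degrees.
        val image = InputImage.fromBitmap(bitmap, 0)

        detector.process(image)
            .addOnSuccessListener { pose ->
                // Keep only landmarks that are likely inside the image frame.
                val visible = pose.allPoseLandmarks.filter { it.inFrameLikelihood > 0.8f }
                // Individual points can also be read directly, e.g. the nose position
                // in 2D image coordinates (no depth).
                val nose = pose.getPoseLandmark(PoseLandmark.NOSE)?.position
            }
            .addOnFailureListener { e -> /* handle the error */ }
    }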

Real World Applications


Examples of a pushup and squat counter using ML Kit Pose Detection

Keeping up with regular physical activity is one of the hardest things to do while at home. We often rely on gym buddies or physical trainers to help us with our workouts, but this has become increasingly difficult. Apps and technology can often help with this, but with existing solutions, many app developers are still struggling to understand and provide feedback on a user’s movement in real time. ML Kit Pose Detection aims to make this problem a whole lot easier.

The most common applications for pose detection are fitness and yoga trackers. It’s possible to use our API to track pushups, squats, and a variety of other physical activities in real time. These complex use cases can be achieved by using the output of the API, either with angle heuristics, by tracking the distance between joints, or with your own proprietary classifier model.
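
For instance, the angle-heuristic approach can be as simple as computing the angle at a joint from three landmark positions. The helper below is a generic geometric sketch (not taken from the ML Kit samples); a push-up counter could threshold on the left-elbow angle it returns:

    import kotlin.math.abs
    import kotlin.math.atan2
    import com.google.mlkit.vision.pose.Pose
    import com.google.mlkit.vision.pose.PoseLandmark

    // Angle in degrees at `mid`, formed by the segments mid->first and mid->last.
    fun jointAngle(first: PoseLandmark, mid: PoseLandmark, last: PoseLandmark): Double {
        val raw = Math.toDegrees(
            atan2((last.position.y - mid.position.y).toDouble(),
                  (last.position.x - mid.position.x).toDouble()) -
            atan2((first.position.y - mid.position.y).toDouble(),
                  (first.position.x - mid.position.x).toDouble())
        )
        val angle = abs(raw) % 360.0
        return if (angle > 180.0) 360.0 - angle else angle
    }

    // The left-elbow angle is small at the bottom of a push-up and close to
    // 180 degrees when the arm is fully extended.
    fun leftElbowAngle(pose: Pose): Double? {
        val shoulder = pose.getPoseLandmark(PoseLandmark.LEFT_SHOULDER) ?: return null
        val elbow = pose.getPoseLandmark(PoseLandmark.LEFT_ELBOW) ?: return null
        val wrist = pose.getPoseLandmark(PoseLandmark.LEFT_WRIST) ?: return null
        return jointAngle(shoulder, elbow, wrist)
    }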

To help you jump-start classifying poses, we are sharing additional tips on how to use angle heuristics to classify popular yoga poses. Check it out here.

Learning to Dance Without Leaving Home

Learning a new skill is always tough, but learning to dance without the aid of a real time instructor is even tougher. One of our early access partners, Groovetime, has set out to solve this problem.

With the power of ML Kit Pose Detection, Groovetime allows users to learn their favorite dance moves from popular short-form dance videos, while giving users automated real time feedback on their technique. You can join their early access beta here.

Groovetime App using ML Kit Pose Detection

Staying Active Wherever You Are

Our Pose Detection API is also helping adidas Training, another one of our early access partners, build a virtual workout experience that will help you stay active no matter where you are. This one-of-a-kind innovation will help analyze and give feedback on users’ movements, using nothing more than your phone. Integration into the adidas Training app is still in the early phases of the development cycle, but stay tuned for more updates in the future.

How to get started?

If you would like to start using the Pose Detection API in your mobile app, head over to the developer documentation or check out the sample apps for Android and iOS to see the API in action. For questions or feedback, please reach out to us through one of our community channels.

Join us for Google Assistant Developer Day on October 8

Posted by Baris Gultekin, Director, Product Management Google Assistant and
Payam Shodjai, Director, Product Management Google Assistant

More and more people turn to Google Assistant every day to help them get the most out of their phones and smart displays: From playing games to using their favorite app by voice, there are more opportunities than ever for developers to create new and engaging experiences for Google Assistant.

We welcome you to join us virtually at our Google Assistant Developer Day on Thursday, October 8, to learn more about new tools and features we’re building for developers to bring Google Assistant to mobile apps and Smart Displays and help drive discoverability and engagement via voice. This will also be a great chance to chat live with Google leaders and engineers on the team to get your questions answered.

You’ll hear from our product experts and partnership leads on best practices for integrating with Google Assistant to help users more easily engage with their favorite apps by voice. Other sessions will include in-depth conversations around native development on Google Assistant, and so much more.

We’ll also have guest speakers, including Garrett Gaudini, Head of Product at Postmates; Laurens Rutten, Founder & CEO of CoolGames; and Corey Bozarth, VP of Product & Monetization at MyFitnessPal, among many others, join us on stage to share their stories about how voice has transformed the way people interact with their apps and services.

Whether you build for mobile or smart home, these new tools will help make your content and services available to people who want to use their voice to get things done.

Registration is FREE! Head on over to the event website to register and check out the schedule.

Helping the Haitian economy, one line of code at a time

Posted by Jennifer Kohl, Program Manager, Developer Community Programs


Eustache Luckens Yadley at a GDG Port-au-Prince meetup

Meet Eustache Luckens Yadley, or “Yadley” for short. As a web developer from Port-au-Prince, Yadley has spent his career building web applications that benefit the local Haitian economy. Whether it’s ecommerce platforms that bring local sellers to market or software tools that help local businesses operate more effectively, Yadley has always been there with a technical hand to lend.

However, Yadley has also spent his career watching Haiti’s unemployment numbers rise to among the highest in the Caribbean. As he describes it,


“Every day, several thousand young people have no job to get by.”


So with code in mind and mouse in hand, Yadley got right to work. His first step was to identify a need in the economy. He soon figured out that Haiti had a shortage of delivery methods for consumers, making home delivery purchases of any kind extremely unreliable. Yadley also noticed that there was a surplus of workers willing to deliver the goods, but no infrastructure to align their needs with those of the market.


Yadley watching a demo at a GDG Port-au-Prince meetup

In this moment, Yadley did what many good developers would do: build an app. He created the framework for what is now called “Livrezonpam,” an application that allows companies to post where and when they need a particular product delivered and workers to find the corresponding delivery jobs closest to them.

With a brilliant solution, Yadley’s last step was to find the right technical tools to build the concept out and make it a viable platform that users could work with to their benefit.

It was at this crucial step when Yadley found the Port-au-Prince Google Developer Group. With GDG Port-au-Prince, Yadley was able to bring his young app right into the developer community, run different demos of his product to experienced users, and get feedback from a wide array of developers with an intimate knowledge of the Haitian tech scene. The takeaways from working in the community translated directly to his work. Yadley learned how to build with the Google Cloud Platform Essentials, which proved key in managing all the data his app now collects. He also learned how to get the Google Maps Platform API working for his app, creating a streamlined user experience that helped workers and companies in Haiti locate one another with precision and ease.


This wide array of community technical resources, from trainings, to mentors, to helpful friends, allowed Yadley to grow his knowledge of several Google technologies, which in turn allowed him to grow his app for the Haitian community.

Today, Yadley is still an active member of the GDG community, growing his skills and those of the many friends around him. And at the same time, he is still growing Livrezonpam on the Google Play Store to help local businesses reach their customers and bring more jobs directly to the people of Haiti.


Ready to start building with a Google Developer Group near you? Find the closest community to you, here.