Tag Archives: Web

MediaPipe on the Web

Posted by Michael Hays and Tyler Mullen from the MediaPipe team

MediaPipe is a framework for building cross-platform multimodal applied ML pipelines. We have previously demonstrated building and running ML pipelines as MediaPipe graphs on mobile (Android, iOS) and on edge devices like Google Coral. In this article, we are excited to present MediaPipe graphs running live in the web browser, enabled by WebAssembly and accelerated by XNNPack ML Inference Library. By integrating this preview functionality into our web-based Visualizer tool, we provide a playground for quickly iterating over a graph design. Since everything runs directly in the browser, video never leaves the user’s computer and each iteration can be immediately tested on a live webcam stream (and soon, arbitrary video).

Running the MediaPipe face detection example in the Visualizer

Figure 1 shows the running of the MediaPipe face detection example in the Visualizer

MediaPipe Visualizer

MediaPipe Visualizer (see Figure 2) is hosted at viz.mediapipe.dev. MediaPipe graphs can be inspected by pasting graph code into the Editor tab or by uploading that graph file into the Visualizer. A user can pan and zoom into the graphical representation of the graph using the mouse and scroll wheel. The graph will also react to changes made within the editor in real time.

MediaPipe Visualizer hosted at https://viz.mediapipe.dev

Figure 2 MediaPipe Visualizer hosted at https://viz.mediapipe.dev

Demos on MediaPipe Visualizer

We have created several sample Visualizer demos from existing MediaPipe graph examples. These can be seen within the Visualizer by visiting the following addresses in your Chrome browser:

Edge Detection

Face Detection

Hair Segmentation

Hand Tracking

Edge detection
Face detection
Hair segmentation
Hand tracking

Each of these demos can be executed within the browser by clicking on the little running man icon at the top of the editor (it will be greyed out if a non-demo workspace is loaded):

This will open a new tab which will run the current graph (this requires a web-cam).

Implementation Details

In order to maximize portability, we use Emscripten to directly compile all of the necessary C++ code into WebAssembly, which is a special form of low-level assembly code designed specifically for web browsers. At runtime, the web browser creates a virtual machine in which it can execute these instructions very quickly, much faster than traditional JavaScript code.

We also created a simple API for all necessary communications back and forth between JavaScript and C++, to allow us to change and interact with the MediaPipe graph directly from JavaScript. For readers familiar with Android development, you can think of this as a similar process to authoring a C++/Java bridge using the Android NDK.

Finally, we packaged up all the requisite demo assets (ML models and auxiliary text/data files) as individual binary data packages, to be loaded at runtime. And for graphics and rendering, we allow MediaPipe to automatically tap directly into WebGL so that most OpenGL-based calculators can “just work” on the web.

Performance

While executing WebAssembly is generally much faster than pure JavaScript, it is also usually much slower than native C++, so we made several optimizations in order to provide a better user experience. We utilize the GPU for image operations when possible, and opt for using the lightest-weight possible versions of all our ML models (giving up some quality for speed). However, since compute shaders are not widely available for web, we cannot easily make use of TensorFlow Lite GPU machine learning inference, and the resulting CPU inference often ends up being a significant performance bottleneck. So to help alleviate this, we automatically augment our “TfLiteInferenceCalculator” by having it use the XNNPack ML Inference Library, which gives us a 2-3x speedup in most of our applications.

Currently, support for web-based MediaPipe has some important limitations:

  • Only calculators in the demo graphs above may be used
  • The user must edit one of the template graphs; they cannot provide their own from scratch
  • The user cannot add or alter assets
  • The executor for the graph must be single-threaded (i.e. ApplicationThreadExecutor)
  • TensorFlow Lite inference on GPU is not supported

We plan to continue to build upon this new platform to provide developers with much more control, removing many if not all of these limitations (e.g. by allowing for dynamic management of assets). Please follow the MediaPipe tag on the Google Developer blog and Google Developer twitter account. (@googledevs)

Acknowledgements

We would like to thank Marat Dukhan, Chuo-Ling Chang, Jianing Wei, Ming Guang Yong, and Matthias Grundmann for contributing to this blog post.

Security Crawl Maze: An Open Source Tool to Test Web Security Crawlers

Scanning modern web applications for security vulnerabilities can be a difficult task, especially if they are built with Javascript frameworks, which is why crawlers have to use a multi-stage crawling approach to discover all the resources on modern websites.

Living in the times of dynamically changing specifications and the constant appearance of new frameworks, we often have to adjust our crawlers so that they are able to discover new ways in which developers can link resources from their applications. The issue we face in such situations is measuring if changes to crawling logic improve the effectiveness. While working on replacing a crawler for a web security scanner that has been in use for a number of years, we found we needed a universal test bed, both to test our current capabilities and to discover cases that are currently missed. Inspired by Firing Range, today we’re announcing the open-source release of Security Crawl Maze – a universal test bed for web security crawlers.

Security Crawl Maze is a simple Python application built with the Flask framework that contains a wide variety of cases for ways in which a web based application can link other resources on the Web. We also provide a Dockerfile which allows you to build a docker image and deploy it to an environment of your choice. While the initial release is covering the most important cases for HTTP crawling, it’s a subset of what we want to achieve in the near future. You’ll soon be able to test whether your crawler is able to discover known files (robots.txt, sitemap.xml, etc…) or crawl modern single page applications written with the most popular JS frameworks (Angular, Polymer, etc.).

Security crawlers are mostly interested in code coverage, not in content coverage, which means the deduplication logic has to be different. This is why we plan to add cases which allow for testing if your crawler deduplicates URLs correctly (e.g. blog posts, e-commerce). If you believe there is something else, feel free to add a test case for it – it’s super simple! Code is available on GitHub and through a public deployed version.

We hope that others will find it helpful in evaluating the capabilities of their crawlers, and we certainly welcome any contributions and feedback from the broader security research community.

By Maciej Trzos, Information Security Engineer

Advance your career with the Google Africa Certifications Scholarships

Posted by William Florance, Global Head, Developer Training Programs

Building upon our pledge to provide mobile developer training to 100,000 Africans to develop world class apps, today we are pleased to announce the next round of Google Africa Certification Scholarships aimed at helping developers become certified on Google’s Android, Web, and Cloud technologies.

This year, we are offering 30,000 additional scholarship opportunities and 1,000 grants for the Google Associate Android Developer, Mobile Web Specialist, and Associate Cloud Engineer certifications. The scholarship program will be delivered by our partners, Pluralsight and Andela, through an intensive learning curriculum designed to prepare motivated learners for entry-level and intermediate roles as software developers. Interested students in Africa can learn more about the Google Africa Certifications Scholarships and apply here

According to World Bank, Africa is on track to have the largest working-age population (1.1 billion) by 2034. Today’s announcement marks a transition from inspiring new developers to preparing them for the jobs of tomorrow. Google’s developer certifications are performance-based. They are developed around a job-task analysis which test learners for skills that employers expect developers to have.

As announced during Sundar Pichai - Google CEO’s visit to Nigeria in 2017, our continued initiatives focused on digital skills training, education and economic opportunity, and support for African developers and startups, demonstrate our commitment to help advance a healthy and vibrant ecosystem. By providing support for training and certifications we will help bridge the unemployment gap on the continent through increasing the number of employable software developers.

Although Google’s developer certifications are relatively new, we have already seen evidence that becoming certified can make a meaningful difference to developers and employers. Adaobi Frank - a graduate of the Associate Android Developer certification - got a better job that paid ten times more than her previous salary after completing her certification. Her interview was expedited as her employer was convinced that she was great for the role after she mentioned that she was certified. Now, she's got a job that helps provide for her family - see her video here. Through our efforts this year, we want to help many more developers like Ada and support the growth of startups and technology companies throughout Africa.

Follow this link to learn more about the scholarships and apply.

Web Notifications API Support Now Available in FCM Send v1 API

Posted by Mertcan Mermerkaya, Software Engineer

We have great news for web developers that use Firebase Cloud Messaging to send notifications to clients! The FCM v1 REST API has integrated fully with the Web Notifications API. This integration allows you to set icons, images, actions and more for your Web notifications from your server! Better yet, as the Web Notifications API continues to grow and change, these options will be immediately available to you. You won't have to wait for an update to FCM to support them!

Below is a sample payload you can send to your web clients on Push API supported browsers. This notification would be useful for a web app that supports image posting. It can encourage users to engage with the app.

{
"message": {
"webpush": {
"notification": {
"title": "Fish Photos ?",
"body":
"Thanks for signing up for Fish Photos! You now will receive fun daily photos of fish!",
"icon": "firebase-logo.png",
"image": "guppies.jpg",
"data": {
"notificationType": "fishPhoto",
"photoId": "123456"
},
"click_action": "https://example.com/fish_photos",
"actions": [
{
"title": "Like",
"action": "like",
"icon": "icons/heart.png"
},
{
"title": "Unsubscribe",
"action": "unsubscribe",
"icon": "icons/cross.png"
}
]
}
},
"token": "<APP_INSTANCE_REGISTRATION_TOKEN>"
}
}

Notice that you are able to set new parameters, such as actions, which gives the user different ways to interact with the notification. In the example below, users have the option to choose from actions to like the photo or to unsubscribe.

To handle action clicks in your app, you need to add an event listener in the default firebase-messaging-sw.js file (or your custom service worker). If an action button was clicked, event.action will contain the string that identifies the clicked action. Here's how to handle the "like" and "unsubscribe" events on the client:

// Retrieve an instance of Firebase Messaging so that it can handle background messages.
const messaging = firebase.messaging();

// Add an event listener to handle notification clicks
self.addEventListener('notificationclick', function(event) {
if (event.action === 'like') {
// Like button was clicked

const photoId = event.notification.data.photoId;
like(photoId);
}
else if (event.action === 'unsubscribe') {
// Unsubscribe button was clicked

const notificationType = event.notification.data.notificationType;
unsubscribe(notificationType);
}

event.notification.close();
});

The SDK will still handle regular notification clicks and redirect the user to your click_action link if provided. To see more on how to handle click actions on the client, check out the guide.

Since different browsers support different parameters in different platforms, it's important to check out the browser compatibility documentation to ensure your notifications work as intended. Want to learn more about what the Send API can do? Check out the FCM Send API documentation and the Web Notifications API documentation. If you're using the FCM Send API and you incorporate the Web Notifications API in a cool way, then let us know! Find Firebase on Twitter at @Firebase, and Facebook and Google+ by searching "Firebase".

Funding 15,000 web and android scholarship in Africa – to provide employable developer skills

Posted by William Florance, Head, Economic Impact Programs

Africa's digital journey is rapidly gaining speed. According to the recent data, over 73 million people came online in Africa for the first time in 2017- that's more than the population of the UK! This means there are now about 435 million people on the continent using the Web to engage, connect and access information online. That's a good thing! But with this growth comes with an increased need to scale efforts to make the Web more relevant and useful to African users. This will require more skilled hands working with individuals and local businesses to develop content and platforms that will support Africa's digital growth.

In July 2017, Google's CEO, Sundar Pichai, announced a pledge to provide digital skills training to ten million people in Africa, and also to provide mobile developer training to 100,000 Africans. Today, in line with that commitment, we're excited to announce the launch of our new Africa Web and Android Scholarship program aimed at providing 15,000 scholarships to developers resident in Africa countries.

Working in partnership with Udacity and Andela, we will be offering 15,000 2-month 'single course' scholarships and 500 6-month nanodegree scholarships to aspiring and professional developers across Africa. The training will be available online via the Udacity training website, and the Andela Learning Community will support the students (in Nigeria and Kenya) through mentorship, in-person meet-ups, and online communities.

In order to access the full nanodegree scholarships, learners will have to complete lessons and quizzes courses being offered under the Udacity single course scholarships (also known as challenge courses) in addition to their active participation and support of classmates in the student community. We will be offering 10,000 scholarships to beginners (with little or no programming experience) and 5,000 to professional developers (with +1 year of experience) spread across Android and mobile web development tracks. The 10,000 beginner scholarships will include Android beginner courses and basic introduction to HTML & CSS; while the 5,000 intermediate scholarships include Android fundamentals for intermediate and building offline web applications courses respectively. Both courses are taught in English through an online program on Udacity open to Africa residence. The top 500 students at the end of the challenge will earn a full Nanodegree scholarship to one of four Nanodegree programs in Android or web development.

The application period closes on April 24th. Interested or want to learn more, visit https://www.udacity.com/google-africa-scholarships?utm_source=devblog

AMP stories: Bringing visual storytelling to the open web

Posted by Rudy Galfi, Product Manager for AMP at Google

The AMP story format is a recently launched addition to the AMP Project that provides content publishers with a mobile-focused format for delivering news and information as visually rich, tap-through stories.

A visual-driven format for evolving news consumption on mobile

Some stories are best told with text while others are best expressed through images and videos. On mobile devices, users browse lots of articles, but engage with few in-depth. Images, videos and graphics help publishers to get their readers' attention as quickly as possible and keep them engaged through immersive and easily consumable visual information.

Recently, as with many new or experimental features within AMP, contributors from multiple companies — in this case, Google and a group of publishers — came together to work toward building a story-focused format in AMP. The collective desire was that this format offer new, creative and visually rich ways of storytelling specifically designed for mobile.

Minimize technical challenges and let creators focus on the storytelling

The mobile web is great for distributing and sharing content, but mastering performance can be tricky. Creating visual stories on the web with the fast and smooth performance that users have grown accustomed to in native apps can be challenging. Getting these key details right often poses prohibitively high startup costs, particularly for small publishers.

AMP stories are built on the technical infrastructure of AMP to provide a fast, beautiful experience on the mobile web. Just like any web page, a publisher hosts an AMP story HTML page on their site and can link to it from any other part of their site to drive discovery. And, as with all content in the AMP ecosystem, discovery platforms can employ techniques like pre-renderable pages, optimized video loading and caching to optimize delivery to the end user.

AMP stories aim to make the production of stories as easy as possible from a technical perspective. The format comes with preset but flexible layout templates, standardized UI controls, and components for sharing and adding follow-on content.

Yet, the design gives great editorial freedom to content creators to tell stories true to their brand. Publishers involved in the early development of the AMP stories format — CNN, Conde Nast, Hearst, Mashable, Meredith, Mic, Vox Media, and The Washington Post — have brought together their reporters, illustrators, designers, producers, and video editors to creatively use this format and experiment with novel ways to tell immersive stories for a diverse set of content categories.

Developer preview for AMP stories is starting today

Today AMP stories are available for everyone to try on their websites. As part of the AMP Project, the AMP story format is free and open for anyone to use. To get started, check out the tutorial and documentation. We are looking forward to feedback from content creators and technical contributors alike.

Also, starting today, you can see AMP stories on Google Search. To try it out, search for publisher names (like the ones mentioned above) within g.co/ampstories using your mobile browser. At a later point, Google plans to bring AMP stories to more products across Google, and expand the ways they appear in Google Search.

Why I contribute to Chromium

This is a guest post by Yoav Weiss who was recently recognized through the Google Open Source Peer Bonus Program for his work on the Chromium project. We invited Yoav to share about his work on our blog.

I was recently recognized by Google for my contributions to Chromium and wanted to write a few words on why I contribute to the project, other rendering engines and the web platform in general. I also wanted to share how it helped me evolve as a developer and why more people should contribute to the web platform for their own benefit.

The web platform

I’ve written before about why I think the web platform is an extremely important asset for humanity and why we should make sure it'll thrive for years to come. It enables the distribution of knowledge to the corners of the earth and has fundamentally changed our world. Yet, compared to the amount of users (billions!) and web developers (millions), there are only a few hundred engineers working on maintaining and improving the platform itself.

That means that there are many aspects of the platform that are not as well maintained as they should be. We're at a real risk of a "tragedy of the commons" scenario, where despite usage and utility, the platform will collapse under its own weight because maintaining it is nobody's exclusive problem.

How I got started

Personally, I had been working on web performance for well over a decade before I decided to get more involved and lend my hand in building the platform. For a large part of my professional life, browsers were black boxes. They were given to us by the browser gods and that's what we had to work with for the next few years. Their undocumented bugs and quirks became gospel, passed from senior engineers to their juniors.

Then at some point, that situation changed. Slowly but surely, open source browsers started picking up market share. No longer black boxes, we can actually see what happens on the inside!

I first got involved by joining the responsive images discussions and the Responsive Images Community Group. Then, I saw a tweet from RICG's chair calling to develop a prototype of the current proposal to prove its feasibility and value. And I jumped in.

I created a prototype using Chromium and WebKit, demoed it to anyone that was interested, worked on the proposals and argued the viability of the proposals' approach on the various mailing lists. Eventually, we were able to get some browser folks on board, improve the proposals and their fit to the rest of the platform, and I started working on an implementation.

The amount of work this required was larger than I expected. Eventually I managed to ship the feature in Blink and Chromium, and complete large parts of the implementation in WebKit as well. WOOT!

Success! Now what?

After that project was done, I started looking into what I should do next. I was determined to continue working on browsers and find a gig that would let me do that. So I searched for an employer with a vested interest in the web and in making it faster, who would be happy to let me work on the platform's client - the web browser.

I found such an employer in Akamai, where I have been working as a Principal Architect ever since. As part of my job I'm working on our performance optimization features as well as performance-related browser features, making sure they make it into browsers in a timely fashion.

Why you should contribute, too

Now, chances are that if you're reading this, you're also relying on the web platform for your job in one way or another. Which means that there's a chance that it also makes sense for your organization to contribute to the web platform. Let’s explore the reasons:

1. Make sure work is done on features you care about

If you're like me, you love the web platform and the reach it provides you, but you're not necessarily happy with all of it. The web is great, but not perfect. Since browsers and web standards are no longer black boxes, you can help change that.

You can work on standards and browsers to change them to include your use-case. That's immense power at your fingertips: put in the work and the platform evolves for all the billions of users out there.

And you don’t have to wait years before new features can be used in production like with yesteryear's browser changes. With today’s browser update rates and progressive enhancement, you’ll probably be able to use changes in production within a few months.

2. Gain expertise that can help you do your job better

Knowing browser internals better can also give you superpowers in other parts of your job. Whenever questions about browser behavior arrive, you can take a peek into the source code and have concrete answers rather than speculation.

Keeping track of standards discussions give you visibility into new browser APIs that are coming along, so that you can opt to use those rather than settle for sub-optimal alternatives that are currently available.

3. Grow as an engineer

Working on browsers teaches you a lot about how things work under the surface and enables you to understand the internals of modern browsers, which are extremely complex machines. Further, this work allows you to get code reviews from the world's leading experts on these subjects. What better way to grow than to interact with the experts?

4. It's a fun and welcoming community

Contributing to the web platform has been a great experience for me. Working with the Chromium project, in particular, is always great fun. The project is Google backed, but there are many external contributors and the majority of work and decisions are being done in the open. The people I've worked with are super friendly and happy to help. All in all, it's really fun!

Join us

The web needs more people working on it, and working on the web platform can be extremely beneficial to you, your career and your business.

If you're interested in getting started with web standards, the Discourse instance of the web Platform Incubator Community Group (or WICG for short) is where it's at (disclaimer: I'm co-chairing that group). For getting started with Chromium development, this is the post for you.

And most important, don't be afraid to ask the community. People on blink-dev and IRC are super friendly and will be happy to point you in the right direction.

So come on over and join the good cause. We'll be happy to have you!

By Yoav Weiss, Chromium contributor

Get ready for Javascript “Promises” with Google and Udacity

Sarah Clark, Program Manager, Google Developer Training

Front-end web developers face challenges when using common “asynchronous” requests. These requests, such as fetching a URL or reading a file, often lead to complicated code, especially when performing multiple actions in a row. How can we make this easier for developers?

Javascript Promises are a new tool that simplifies asynchronous code, converting a tangle of callbacks and event handlers into simple, straightforward code such as: fetch(url).then(decodeJSON).then(addToPage)...

Promises are used by many new web standards, including Service Worker, the Fetch API, Quota Management, Font Load Events,Web MIDI, and Streams.


We’ve just opened up a online course on Promises, built in collaboration with Udacity. This brief course, which you can finish in about a day, walks you through building an “Exoplanet Explorer” app that reads and displays live data using Promises. You’ll also learn to use the Fetch API and finally kiss XMLHttpRequest goodbye!

This short course is a prerequisite for most of the Senior Web Developer Nanodegree. Whether you are in the paid Nanodegree program or taking the course for free, won’t you come learn to make your code simpler and more reliable today?