Category Archives: Google Developers Blog

News and insights on Google platforms, tools and events

Congratulations to our US Grow with Google Developer Scholars!

Posted by Peter Lubbers, Head of Google Developer Training

Grow with Google, in partnership with Udacity, is awarding 5,000 Nanodegree program scholarships to help aspiring developers in the US continue their digital skills training and prepare for jobs as Android or Mobile Web developers.

As part of the Grow with Google Developer Scholarship program, scholars completed an initial challenge course at Udacity, averaging over 100 hours of coursework while building coding project portfolios and engaging with their local developer communities. Today, Google and Udacity are excited to recognize the 5,000 top performers in the challenge course and offer them the chance to continue their training through a Nanodegree program with a full scholarship.

By successfully completing a Nanodegree program, scholars earn an industry-recognized credential that creates a path to increased job opportunities and prepares them for one of Google's Developer Certifications: Associate Android Developer or Mobile Web Specialist. These developer training programs give scholars the opportunity to build their skills and become job-ready, helping to close the gap created by the more than 500,000 open computing jobs in the US.

We are incredibly inspired by the hard work and passion shown by all our Grow with Google developer scholars -- including these stellar scholars:

Bela from Tennessee, a mother of two working toward her goal of becoming a web developer. Bela recently shared her personal story of determination to complete her developer training.

Desmend from Illinois, who is taking what he learns in his Android developer course and sharing it with the local high school students he mentors -- teaching them about technology and the career opportunities available to developers.

Sean from Alabama, a veteran using his course training to transition into the civilian workforce as an Android developer.

And Demetra from New York, who utilized the online training and forums to achieve her goal of advancing her skills in web development.

This scholarship effort is part of the Grow with Google initiative, which is aimed at helping create economic opportunities for Americans by offering free tools, training, and events. Udacity is excited to partner with Google on this powerful effort and together we look forward to seeing what these scholars will achieve in the coming year.

Showcase your innovations at the 2018 China-US Young Makers Competition

Posted by Bill Luan, Senior Program Manager & Greater China Regional Lead, Developer Relations

The 2018 China-U.S. Young Maker Competition was launched this week by event co-organizer Hackster.IO. Project submissions are now open to all makers, developers, and students ages 18-40 in both China and the United States. Google is the corporate sponsor for this year's competition.

Since 2014, this competition has run annually in support of the U.S.-China High-Level Consultation on People-to-People Exchange program. The competition encourages makers in both countries to create innovative products focused on community development, education, environmental protection, health & fitness, energy, transportation, and sustainable development.

Participants have the freedom to choose appropriate technologies to enable their innovations, and we encourage makers to consider open source technologies, such as TensorFlow and AIY Projects for artificial intelligence use cases, Android Studio for mobile applications, as well as Android Things for IoT solutions.

The top 10 projects in the U.S. will win an all-expenses-paid trip to Beijing to compete against Chinese makers on August 13-17 for a chance at $30,000 in prizes. Further, there are 35 additional chances to win Google prizes! So join the competition, and let your innovation shine on the global stage!

For more details, please see the event announcement on Hackster.IO.

AIY Projects: Updated kits for 2018

Posted by Billy Rutledge, Director of AIY Projects

Last year, AIY Projects launched to give makers the power to build AI into their projects with two do-it-yourself kits. We're seeing continued demand for the kits, especially from the STEM audience where parents and teachers alike have found the products to be great tools for the classroom. The changing nature of work in the future means students may have jobs that haven't yet been imagined, and we know that computer science skills, like analytical thinking and creative problem solving, will be crucial.

We're taking the first of many steps to help educators integrate AIY into STEM lesson plans and help prepare students for the challenges of the future by launching a new version of our AIY kits. The Voice Kit lets you build a voice controlled speaker, while the Vision Kit lets you build a camera that learns to recognize people and objects (check it out here). The new kits make getting started a little easier with clearer instructions, a new app and all the parts in one box.

To make setup easier, both kits have been redesigned to work with the new Raspberry Pi Zero WH, which comes included in the box, along with the USB connector cable and pre-provisioned SD card. Now users no longer need to download the software image and can get running faster. The updated AIY Vision Kit v1.1 also includes the Raspberry Pi Camera v2.

AIY Voice Kit v2 includes Raspberry Pi Zero WH and pre-provisioned SD card

AIY Vision Kit v1.1 includes Raspberry Pi Zero WH, Raspberry Pi Camera v2 and pre-provisioned SD card

We're also introducing the AIY companion app for Android, available here in Google Play, to make wireless setup and configuration a snap. The kits still work with monitor, keyboard and mouse as an alternate path, and we're working on iOS and Chrome companion apps, which are coming soon.

The AIY website has been refreshed with improved documentation, making it easier for young makers to get started and learn as they build. It also includes a new AIY Models area showcasing a collection of neural networks designed to work with AIY kits. While we've solved one barrier to entry for the STEM audience, we recognize that there are many other things we can do to make our kits even more useful. We'll once again be at #MakerFaire events to gather feedback from our users, and in June we'll be working with teachers from all over the world at the ISTE conference in Chicago.

The new AIY Voice Kit and Vision Kit have arrived at Target Stores and Target.com (US) this month and we're working to make them globally available through retailers worldwide. Sign up on our mailing list to be notified when our products become available.

We hope you'll pick up one of the new AIY kits and learn more about how to build your own smart devices. Be sure to share your recipes on Hackster.io and social media using #aiyprojects.

Text Embedding Models Contain Bias. Here’s Why That Matters.

Posted by Ben Packer, Yoni Halpern, Mario Guajardo-Céspedes & Margaret Mitchell (Google AI)

As Machine Learning practitioners, when faced with a task, we usually select or train a model primarily based on how well it performs on that task. For example, say we're building a system to classify whether a movie review is positive or negative. We take 5 different models and see how well each performs this task:

Figure 1: Model performances on a task. Which model would you choose?

Normally, we'd simply choose Model C. But what if we found that while Model C performs the best overall, it's also most likely to assign a more positive sentiment to the sentence "The main character is a man" than to the sentence "The main character is a woman"? Would we reconsider?

Bias in Machine Learning Models

Neural network models can be quite powerful, effectively helping to identify patterns and uncover structure in a variety of different tasks, from language translation to pathology to playing games. At the same time, neural models (as well as other kinds of machine learning models) can contain problematic biases in many forms. For example, classifiers trained to detect rude, disrespectful, or unreasonable comments may be more likely to flag the sentence "I am gay" than "I am straight" [1]; face classification models may not perform as well for women of color [2]; speech transcription may have higher error rates for African Americans than White Americans [3].

Many pre-trained machine learning models are widely available for developers to use -- for example, TensorFlow Hub recently launched its platform publicly. It's important that when developers use these models in their applications, they're aware of what biases they contain and how they might manifest in those applications.

Human data encodes human biases by default. Being aware of this is a good start, and the conversation around how to handle it is ongoing. At Google, we are actively researching unintended bias analysis and mitigation strategies because we are committed to making products that work well for everyone. In this post, we'll examine a few text embedding models, suggest some tools for evaluating certain forms of bias, and discuss how these issues matter when building applications.

WEAT scores, a general-purpose measurement tool

Text embedding models convert any input text into an output vector of numbers, and in the process map semantically similar words near each other in the embedding space:

Figure 2: Text embeddings convert any text into a vector of numbers (left). Semantically similar pieces of text are mapped nearby each other in the embedding space (right).
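
To make this concrete, here's a minimal sketch of querying one of these models, using the 2018-era TensorFlow 1.x API for TensorFlow Hub (the module URL is the Universal Sentence Encoder discussed later in this post); the cosine helper measures how close two embedded sentences sit in the embedding space.

```python
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

# Load a pre-trained text embedding module from TensorFlow Hub (TF 1.x style).
embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder/1")
sentences = ["The main character is a man", "The main character is a woman"]

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    vectors = sess.run(embed(sentences))  # one vector per input sentence

def cosine(a, b):
    # Cosine similarity: values near 1.0 mean the embeddings nearly coincide.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(vectors[0], vectors[1]))
```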

Given a trained text embedding model, we can directly measure the associations the model has between words or phrases. Many of these associations are expected and are helpful for natural language tasks. However, some associations may be problematic or hurtful. For example, the ground-breaking paper by Bolukbasi et al. [4] found that the vector-relationship between "man" and "woman" was similar to the relationship between "physician" and "registered nurse" or "shopkeeper" and "housewife" in the popular publicly-available word2vec embedding trained on Google News text.

The Word Embedding Association Test (WEAT) was recently proposed by Caliskan et al. [5] as a way to examine the associations in word embeddings between concepts captured in the Implicit Association Test (IAT). We use the WEAT here as one way to explore some kinds of problematic associations.

The WEAT test measures the degree to which a model associates sets of target words (e.g., African American names, European American names, flowers, insects) with sets of attribute words (e.g., "stable", "pleasant" or "unpleasant"). The association between two given words is defined as the cosine similarity between the embedding vectors for the words.

For example, the target lists for the first WEAT test are types of flowers and insects, and the attributes are pleasant words (e.g., "love", "peace") and unpleasant words (e.g., "hatred", "ugly"). The overall test score is the degree to which flowers are more associated with the pleasant words, relative to insects. A high positive score (the score can range between 2.0 and -2.0) means that flowers are more associated with pleasant words, and a high negative score means that insects are more associated with pleasant words.
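
As a sketch of how such a score can be computed (this follows the effect-size formula in Caliskan et al., not any code released with this post), assuming `emb` looks up a word's embedding vector:

```python
import numpy as np

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def association(w, A, B, emb):
    # How much more strongly word w associates with attribute set A than B.
    return (np.mean([cosine(emb(w), emb(a)) for a in A]) -
            np.mean([cosine(emb(w), emb(b)) for b in B]))

def weat_score(X, Y, A, B, emb):
    # X, Y: target word lists (e.g. flowers, insects).
    # A, B: attribute word lists (e.g. pleasant, unpleasant words).
    s = [association(w, A, B, emb) for w in X + Y]
    # Normalizing by the standard deviation keeps the score in roughly [-2, 2].
    return (np.mean(s[:len(X)]) - np.mean(s[len(X):])) / np.std(s)
```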

While the first two WEAT tests proposed in Caliskan et al. measure associations that are of little social concern (except perhaps to entomologists), the remaining tests measure more problematic biases.

We used the WEAT score to examine several word embedding models: word2vec and GloVe (previously reported in Caliskan et al.), and three newly-released models available on the TensorFlow Hub platform -- nnlm-en-dim50, nnlm-en-dim128, and universal-sentence-encoder. The scores are reported in Table 1.

Table 1: Word Embedding Association Test (WEAT) scores for different embedding models. Cell color indicates whether the direction of the measured bias is in line with (blue) or against (yellow) the common human biases recorded by the Implicit Association Tests. * Statistically significant (p < 0.01) using the permutation test from Caliskan et al. (2017). Rows 3-5 are variations whose word lists come from [6], [7], and [8]; see Caliskan et al. for all word lists. For GloVe, we follow Caliskan et al. and drop uncommon words from the word lists; all other analyses use the full word lists.

These associations are learned from the data that was used to train these models. All of the models have learned the associations for flowers, insects, instruments, and weapons that we might expect and that may be useful in text understanding. The associations learned for the other targets vary, with some -- but not all -- models reinforcing common human biases.

For developers who use these models, it's important to be aware that these associations exist, and that these tests only evaluate a small subset of possible problematic biases. Strategies to reduce unwanted biases are a new and active area of research, and there exists no "silver bullet" that will work best for all applications.

When focusing on associations in an embedding model, the clearest way to determine how they will affect downstream applications is to examine those applications directly. We turn now to a brief analysis of two sample applications: a Sentiment Analyzer and a Messaging App.

Case study 1: Tia's Movie Sentiment Analyzer

WEAT scores measure properties of word embeddings, but they don't tell us how those embeddings affect downstream tasks. Here we demonstrate how the way names are represented in a few common embeddings affects a movie review sentiment analysis task.

Tia is looking to train a sentiment classifier for movie reviews. She does not have many movie review samples, so she leverages pre-trained embeddings, which map the text into a representation that can make the classification task easier.

Let's simulate Tia's scenario using an IMDB movie review dataset [9], subsampled to 1,000 positive and 1,000 negative reviews. We'll use a pre-trained word embedding to map the text of the IMDB reviews to low-dimensional vectors and use these vectors as features in a linear classifier. We'll consider a few different word embedding models and train a linear sentiment classifier with each.

We'll evaluate the quality of the sentiment classifier using the area under the ROC curve (AUC) metric on a held-out test set.
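
Putting those steps together, a rough sketch of Tia's pipeline might look like the following; `embed` stands in for whichever pre-trained embedding she is testing, and the train/test variables hold the subsampled IMDB data.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Map review text to feature vectors with the pre-trained embedding.
X_train = embed(train_texts)
X_test = embed(test_texts)

# A simple linear classifier on top of the embedding features.
clf = LogisticRegression()
clf.fit(X_train, train_labels)

# Evaluate with AUC on the held-out test set.
scores = clf.predict_proba(X_test)[:, 1]
print("AUC:", roc_auc_score(test_labels, scores))
```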

Here are AUC scores for movie sentiment classification using each of the embeddings to extract features:

Figure 3: Performance scores on the sentiment analysis task, measured in AUC, for each of the different embeddings.

At first, Tia's decision seems easy. She should use the embedding that results in the classifier with the highest score, right?

However, let's think about some other aspects that could affect this decision. The word embeddings were trained on large datasets that Tia may not have access to. She would like to assess whether biases inherent in those datasets may affect the behavior of her classifier.

Looking at the WEAT scores for various embeddings, Tia notices that some embeddings consider certain names more "pleasant" than others. That doesn't sound like a good property of a movie sentiment analyzer. It doesn't seem right to Tia that names should affect the predicted sentiment of a movie review. She decides to check whether this "pleasantness bias" affects her classification task.

She starts by constructing some test examples to determine whether a noticeable bias can be detected.

In this case, she takes the 100 shortest reviews from her test set and appends the words "reviewed by _______", where the blank is filled in with a name. Using the lists of "African American" and "European American" names from Caliskan et al. and common male and female names from the United States Social Security Administration, she looks at the difference in average sentiment scores.
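
A sketch of that test, where `sentiment` stands in for the trained classifier's scoring function and `short_reviews` holds the 100 shortest test reviews; the name lists are placeholders for the lists cited above.

```python
import numpy as np

group_a_names = [...]  # e.g. the "African American" name list from Caliskan et al.
group_b_names = [...]  # e.g. the "European American" name list

def mean_sentiment(names):
    # Append "reviewed by <name>" to each short review and average the scores.
    modified = [review + " reviewed by " + name
                for review in short_reviews for name in names]
    return np.mean(sentiment(modified))

print("difference:", mean_sentiment(group_a_names) - mean_sentiment(group_b_names))
```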

Figure 4: Difference in average sentiment scores on the modified test sets where "reviewed by ______" had been added to the end of each review. The violin plots show the distribution over differences when models are trained on small samples of the original IMDB training data.

The violin plots above show the distribution of differences in average sentiment scores that Tia might see, simulated by taking subsamples of 1,000 positive and 1,000 negative reviews from the original IMDB training set. We show results for five word embeddings, as well as a model (No embedding) that doesn't use a word embedding.

The no-embedding model serves as a control, confirming that the sentiment associated with the names does not come from the small IMDB supervised dataset but rather is introduced by the pre-trained embeddings. We can also see that different embeddings lead to different system outcomes, demonstrating that the choice of embedding is a key factor in the associations that Tia's sentiment classifier will make.

Tia needs to think very carefully about how this classifier will be used. Maybe her goal is just to select a few good movies for herself to watch next. In this case, it may not be a big deal. The movies that appear at the top of the list are likely to be very well-liked movies. But what if she hires and pays actors and actresses according to their average movie review ratings, as assessed by her model? That sounds much more problematic.

Tia may not be limited to the choices presented here. There are other approaches that she may consider, like mapping all names to a single word type, retraining the embeddings using data designed to mitigate sensitivity to names in her dataset, or using multiple embeddings and handling cases where the models disagree.

There is no one "right" answer here. Many of these decisions are highly context dependent and depend on Tia's intended use. There is a lot for Tia to think about as she chooses between feature extraction methods for training text classification models.

Case study 2: Tamera's Messaging App

Tamera is building a messaging app, and she wants to use text embedding models to give users suggested replies when they receive a message. She's already built a system to generate a set of candidate replies for a given message, and she wants to use a text embedding model to score these candidates. Specifically, she'll run the input message through the model to get the message embedding vector, do the same for each of the candidate responses, and then score each candidate with the cosine similarity between its embedding vector and the message embedding vector.
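
A minimal sketch of that scoring step (assuming `embed` returns one vector per input string, as in the earlier examples):

```python
import numpy as np

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def rank_replies(message, candidates):
    # Score each candidate reply by its similarity to the incoming message.
    message_vec = embed([message])[0]
    candidate_vecs = embed(candidates)
    scored = [(cosine(message_vec, vec), text)
              for text, vec in zip(candidates, candidate_vecs)]
    return sorted(scored, reverse=True)  # best-scoring replies first
```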

While there are many ways that a model's bias may play a role in these suggested replies, she decides to focus on one narrow aspect in particular: the association between occupations and binary gender. An example of bias in this context is if the incoming message is "Did the engineer finish the project?" and the model scores the response "Yes he did" higher than "Yes she did." These associations are learned from the data used to train the embeddings, and while they reflect the degree to which each gendered response is likely to be the actual response in the training data (and the degree to which there's a gender imbalance in these occupations in the real world), it can be a negative experience for users when the system simply assumes that the engineer is male.

To measure this form of bias, she creates a templated list of prompts and responses. The templates include questions such as "Is/was your cousin a(n) ____?" and "Is/was the ____ here today?", with answer templates of "Yes, s/he is/was." For a given occupation and question (e.g., "Will the plumber be there today?"), the model's bias score is the difference between the model's score for the female-gendered response ("Yes, she will") and its score for the male-gendered response ("Yes, he will").

For a given occupation overall, the model's bias score is the sum of the bias scores for all question/answer templates with that occupation.
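
In code, the per-occupation score might be computed like this sketch, where `score(message, reply)` is the cosine-similarity scorer above and `templates(occupation)` yields the filled-in (question, female reply, male reply) triples:

```python
def occupation_bias(occupation, templates, score):
    # Positive totals indicate a female-biased occupation, negative male-biased.
    total = 0.0
    for question, female_reply, male_reply in templates(occupation):
        # e.g. ("Will the plumber be there today?", "Yes, she will", "Yes, he will")
        total += score(question, female_reply) - score(question, male_reply)
    return total
```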

Tamera runs 200 occupations through this analysis using the Universal Sentence Encoder embedding model. Table 2 shows the occupations with the highest female-biased scores (left) and the highest male-biased scores (right):


Table 2: Occupations with the highest female-biased scores (left) and the highest male-biased scores (right).

Tamera isn't bothered by the fact that "waitress" questions are more likely to induce a response that contains "she," but many of the other response biases give her pause. As with Tia, Tamera has several choices she can make. She could simply accept these biases as is and do nothing, though at least now she won't be caught off-guard if users complain. She could make changes in the user interface, for example by having it present two gendered responses instead of just one, though she might not want to do that if the input message has a gendered pronoun (e.g., "Will she be there today?"). She could try retraining the embedding model using a bias mitigation technique (e.g., as in Bolukbasi et al.) and examining how this affects downstream performance, or she could mitigate bias directly when training her classifier (e.g., as in Dixon et al. [1], Beutel et al. [10], or Zhang et al. [11]). No matter what she decides to do, it's important that Tamera has done this type of analysis so that she's aware of what her product does and can make informed decisions.

Conclusions

To better understand the potential issues that an ML model might create, both model creators and practitioners who use these models should examine the undesirable biases that models may contain. We've shown some tools for uncovering particular forms of stereotype bias in these models, but they certainly don't cover all forms of bias. Even the WEAT analyses discussed here are quite narrow in scope, and so should not be interpreted as capturing the full story on implicit associations in embedding models. For example, a model trained explicitly to eliminate negative associations for 50 names in one of the WEAT categories would likely not mitigate negative associations for other names or categories, and the resulting low WEAT score could give a false sense that negative associations as a whole have been well addressed. These evaluations are better used to inform us about the way existing models behave and to serve as one starting point in understanding how unwanted biases can affect the technology that we make and use. We're continuing to work on this problem because we believe it's important, and we invite you to join this conversation as well.

Acknowledgments

We would like to thank Lucy Vasserman, Eric Breck, Erica Greene, and the TensorFlow Hub and Semantic Experiences teams for collaborating on this work.

References

[1] Dixon, L., Li, J., Sorensen, J., Thain, N. and Vasserman, L. 2018. Measuring and Mitigating Unintended Bias in Text Classification. AIES.

[2] Buolamwini, J. and Gebru, T. 2018. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. FAT*.

[3] Tatman, R. and Kasten, C. 2017. Effects of Talker Dialect, Gender & Race on Accuracy of Bing Speech and YouTube Automatic Captions. INTERSPEECH.

[4] Bolukbasi, T., Chang, K., Zou, J., Saligrama, V. and Kalai, A. 2016. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. NIPS.

[5] Caliskan, A., Bryson, J. J. and Narayanan, A. 2017. Semantics derived automatically from language corpora contain human-like biases. Science.

[6] Greenwald, A. G., McGhee, D. E., and Schwartz, J. L. 1998. Measuring individual differences in implicit cognition: the implicit association test. Journal of personality and social psychology.

[7] Bertrand, M. and Mullainathan, S. 2004. Are Emily and Greg More Employable than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination. The American Economic Review.

[8] Nosek, B. A., Banaji, M., and Greenwald, A. G. 2002. Harvesting implicit group attitudes and beliefs from a demonstration web site. Group Dynamics: Theory, Research, and Practice.

[9] Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. 2011. Learning Word Vectors for Sentiment Analysis. ACL.

[10] Beutel, A., Chen, J., Zhao, Z. and Chi, E. H. 2017. Data Decisions and Theoretical Implications when Adversarially Learning Fair Representations. FAT/ML.

[11] Zhang, B., Lemoine, B., and Mitchell, M. 2018. Mitigating Unwanted Biases with Adversarial Learning. AIES.

Time to Upgrade from GCM to FCM

Originally posted by Jen Person on the Firebase Blog.

In 2016, we unveiled Firebase Cloud Messaging (FCM) as the next evolution of Google Cloud Messaging (GCM). Since then, we've been working hard to make Firebase Cloud Messaging even more powerful than its predecessor. Like GCM, Firebase Cloud Messaging allows you to send notifications and data messages reliably to iOS, Android, and the Web at no cost. In addition, FCM includes a host of new features, such as an intuitive notifications interface in the Firebase console, better reporting, and native integrations with other Firebase products. With FCM, you can target and test notifications to re-engage your users with greater ease and efficiency.

We're excited to devote more time and attention to improving FCM. That's why today we're announcing that all developers will need to upgrade to FCM within a year. The GCM server and client APIs have been deprecated and will be removed as soon as April 11th, 2019. We recommend you upgrade sooner rather than later so you can start taking advantage of the new features we're building in FCM right away!


To help you through the upgrade, we've created a step-by-step migration guide and answered a few of the most common questions below.

What else is new in FCM?

Once you upgrade, you'll be able to use all of the new features and functionality available in FCM, like platform overrides and topic combinations. You'll also be able to send notifications directly from the Firebase console! What's more, FCM integrates seamlessly with other Firebase products like A/B Testing and Predictions.

Want to test different messages to see which one drives more conversions? You can use FCM with A/B Testing to run experiments to optimize your notifications. Want to engage users who are likely to churn or spend money in your app? You can use FCM with Predictions to target notifications to users based on their predicted behavior.

These are some of the awesome features you'll have at your fingertips with FCM. In the future, we'll be adding many more!

Will I still be able to send messages to my existing users?

If you have projects that are still using the GCM APIs, you will need to update your client and server code to use FCM before April 11, 2019. But rest assured, your existing GCM tokens will continue to work with FCM so you won't lose the ability to send messages to your existing users.

How do I upgrade?

The full process is outlined in our migration guide, or if you prefer video content, you can also check out this Firecast for details.

At a high level, upgrading consists of three main parts: console-side, app-side, and server-side.

  • In the Firebase console, you'll need to create a new Firebase project using your app's existing Cloud Project ID.
  • In your app, you'll need to make some code changes. The amount of changes will depend on what features of GCM you currently use, such as topic subscriptions and token generation.
  • On the server side, you'll need to change the server endpoint from GCM to FCM, as shown in the sketch below.
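
For illustration, here's a rough sketch of that server-side change using the legacy HTTP protocol: the request payload stays the same, and only the endpoint URL moves. `SERVER_KEY` and `DEVICE_TOKEN` are placeholders for your own credentials.

```python
import requests

GCM_ENDPOINT = "https://gcm-http.googleapis.com/gcm/send"  # deprecated
FCM_ENDPOINT = "https://fcm.googleapis.com/fcm/send"       # switch to this

response = requests.post(
    FCM_ENDPOINT,
    headers={
        "Authorization": "key=" + SERVER_KEY,  # placeholder server key
        "Content-Type": "application/json",
    },
    json={
        "to": DEVICE_TOKEN,  # existing GCM tokens keep working with FCM
        "notification": {"title": "Hello", "body": "Sent via FCM"},
    },
)
print(response.status_code, response.text)
```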

Keep in mind that you don't have to complete all three parts of the process in one sitting - you can take it at your own pace. For example, you can choose to configure the console today and work on the app code another time. You're also free to update your app's code right now, and tackle the server-side requirements later.

What happens to my users who don't update their apps?

As long as users have GCM logic in their apps, they will still receive messages. FCM is backwards compatible with GCM, so even if you don't update your server endpoint now, you can still update your app's logic, and vice versa.

What data will Firebase collect and use? I'm concerned about privacy.

Please see the Firebase terms and the Firebase Privacy and Security Policy. You can disable Google Analytics for Firebase to reduce the amount of data that is collected, but keep in mind this will also disable some FCM features.

What if I still have questions?

We're here to help you through the upgrade process. Check out this nifty FAQ page as a start. We also encourage you to post your questions on StackOverflow. Or, feel free to reach out to Firebase support through any of these means.

To save you clicking time, here are some of the links that are also worth a read. Start with the upgrade guide, and then check out the other links to find out more.

What if I already migrated?

Awesome! How'd it go? Tweet me at @ThatJenPerson to tell me what went well and what didn't. Sharing your experience helps us make improvements!

We look forward to welcoming you to FCM, the next evolution of GCM!

Browse the updated Google I/O 2018 schedule and reserve seats for Sessions

The Google I/O 2018 schedule just got a big update!

Find additional Sessions and Codelabs, as well as new App Reviews, Office Hours, and After Hours events. Times and locations for all events are also now available, so you can start planning accordingly. New this year: we'll have a series of Keynote Sessions, which take a broader look at how the technology we build can impact the world around us! The I/O schedule is subject to change until the event, so check back often, and keep an eye out for scheduled Meetup events taking place at the Community Lounge to help you connect and network with other developers.

Attending in person

To help make it easier to attend your favorite talks and minimize lines, confirmed attendees will be able to reserve seats for Sessions in advance of I/O - as long as they’re signed in with the same email address used to register for the 2018 event. A portion of seats will still be available first-come, first-served onsite.

To reserve a seat:

  • Navigate to google.com/io/schedule, sign in, and click on the ticket icon for each Session you want to reserve.
  • If a particular Session has already reached the reservation capacity, you'll see an hourglass icon instead. If you've joined a waitlist and a spot opens up, we'll automatically change your reservation status to reserved.
  • You can reserve as many Sessions as you'd like per day, but only one reservation/waitlist per time slot is allowed.
  • Reservations will remain open until 1 hour before the start time for each Session.
  • NOTE: Reservations are only available for Sessions, not other event types listed on the schedule.

Reserve seats via the main Schedule page…

…Or via the Session detail pages.

Anyone who's signed in can also star all event types listed on the schedule as a way to easily find them later on or on other devices.

In addition to more than 160 technical and Keynote Sessions, onsite guests will have the chance to explore various Sandbox domes covering product areas like Android, Assistant, Design, IoT, and Web, just to name a few. Sandboxes are dedicated spaces to learn and play with our latest products and platforms via interactive demos, physical installations, and more.

You can also take advantage of 100+ Office Hours and App Reviews. Office Hours gives you a chance to meet one-on-one with Google experts to ask all your technical questions, and App Reviews will give you the opportunity to receive advice and tips on your specific app-related projects.

Don't forget to save time in your schedule for Codelabs. Here, you'll have everything you need to learn about the latest and greatest Google technologies via self-paced tutorials, or bring your own machine and take your work home with you. Google staff will be on hand for helpful advice and to provide direction if you get stuck.

Joining remotely?

Don't worry - you're not alone and you won't miss a thing! We'll be livestreaming the majority of our Keynotes and Sessions from Shoreline. If you prefer to watch I/O with your developer community, find an I/O Extended viewing party near you.

We'll also let you experience I/O firsthand via our I/O Guides who will be touring the venue and giving you eyes on the ground.

I/O is only 27 days away! We'll continue to share updates in the upcoming weeks to help you get ready and make the most of this year's event. Stay tuned!

Highlights from TensorFlow Dev Summit 2018

Originally posted by Sandeep Gupta, Product Manager for TensorFlow, on behalf of the TensorFlow team on the TensorFlow Blog.

On March 30th, we held the second TensorFlow Developer Summit at the Computer History Museum in Mountain View, CA! The event brought together over 500 TensorFlow users in person and thousands tuning in to the livestream at TensorFlow events around the world. The day was filled with new product announcements along with technical talks from the TensorFlow team and guest speakers. Here are the highlights from the event:

Machine learning is solving challenging problems that impact everyone around the world. Problems that we thought were impossible or too complex to solve are now possible with this technology. Using TensorFlow, we've already seen great advancements in many different fields.

We're excited to see these amazing uses of TensorFlow and are committed to making it accessible to more developers. This is why we're pleased to announce new updates to TensorFlow that will help improve the developer experience!

We're making TensorFlow easier to use

Researchers and developers want a simpler way of using TensorFlow. We're integrating a more intuitive programming model for Python developers called eager execution that removes the distinction between the construction and execution of computational graphs. You can develop with eager execution and then use the same code to generate the equivalent graph for training at scale using the Estimator high-level API. We're also announcing a new method for running Estimator models on multiple GPUs on a single machine. This allows developers to quickly scale their models with minimal code changes.
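
As a small illustration of what eager execution changes (a sketch assuming TensorFlow 1.7 or later):

```python
import tensorflow as tf

tf.enable_eager_execution()  # call once, at program startup

# Operations now run immediately and return concrete values, with no need
# to build a graph first and evaluate it later inside a Session.
x = tf.constant([[1.0, 2.0]])
w = tf.constant([[3.0], [4.0]])
print(tf.matmul(x, w))  # tf.Tensor([[11.]], shape=(1, 1), dtype=float32)
```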

As machine learning models become more abundant and complex, we want to make it easier for developers to share, reuse, and debug them. To help developers share and reuse models, we're announcing TensorFlow Hub, a library built to foster the publication and discovery of modules (self-contained pieces of TensorFlow graph) that can be reused across similar tasks. Modules contain weights that have been pre-trained on large datasets, and may be retrained and used in your own applications. By reusing a module, a developer can train a model using a smaller dataset, improve generalization, or simply speed up training. To make debugging models easier, we're also releasing a new interactive graphical debugger plug-in as part of the TensorBoard visualization tool that helps you inspect and step through internal nodes of a computation graph in real-time.
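
As a hedged sketch of module reuse, here's how a pre-trained text embedding module from TensorFlow Hub can feed an Estimator; `train_input_fn` is a placeholder for your own input pipeline.

```python
import tensorflow as tf
import tensorflow_hub as hub

# Wrap a pre-trained TensorFlow Hub module as a feature column.
embedded_text = hub.text_embedding_column(
    key="sentence",
    module_spec="https://tfhub.dev/google/nnlm-en-dim128/1",
    trainable=False)  # set True to fine-tune the module's pre-trained weights

estimator = tf.estimator.DNNClassifier(
    hidden_units=[128, 16],
    feature_columns=[embedded_text])

estimator.train(input_fn=train_input_fn, steps=1000)  # placeholder input_fn
```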

Model training is only one part of the machine learning process and developers need a solution that works end-to-end to build real-world ML systems. Towards this end, we're announcing the roadmap for TensorFlow Extended (TFX) along with the launch of TensorFlow Model Analysis, an open-source library that combines the power of TensorFlow and Apache Beam to compute and visualize evaluation metrics. The components of TFX that have been released thus far (including TensorFlow Model Analysis, TensorFlow Transform, Estimators, and TensorFlow Serving) are well integrated and let developers prepare data, train, validate, and deploy TensorFlow models in production.

TensorFlow is available in more languages and platforms

Along with making TensorFlow easier to use, we're announcing that developers can use TensorFlow in new languages. TensorFlow.js is a new ML framework for JavaScript developers. Machine learning in the browser using TensorFlow.js opens exciting new possibilities, including interactive ML and support for scenarios where all data remains client-side. It can be used to build and train models entirely in the browser, as well as to import TensorFlow and Keras models trained offline for inference using WebGL acceleration. The Emoji Scavenger Hunt game is a fun example of an application built using TensorFlow.js.

We also have some exciting news for Swift programmers: TensorFlow for Swift will be open sourced this April. TensorFlow for Swift is not your typical language binding for TensorFlow. It integrates first-class compiler and language support, providing the full power of graphs with the usability of eager execution. The project is still in development, with more updates coming soon!

We're also sharing the latest updates to TensorFlow Lite, TensorFlow's lightweight, cross-platform solution for deploying trained ML models on mobile and other edge devices. In addition to existing support for Android and iOS, we're announcing support for Raspberry Pi and increased support for ops/models (including custom ops), and we're describing how developers can easily use TensorFlow Lite in their own apps. The TensorFlow Lite core interpreter is now only 75KB in size (vs. 1.1MB for TensorFlow), and we're seeing speedups of up to 3x when running quantized image classification models on TensorFlow Lite vs. TensorFlow.

For hardware support, TensorFlow now integrates with NVIDIA's TensorRT. TensorRT is a library that optimizes deep learning models for inference and creates a runtime for deployment on GPUs in production environments. It brings a number of optimizations to TensorFlow and automatically selects platform-specific kernels to maximize throughput and minimize latency during inference on GPUs.

For users who run TensorFlow on CPUs, our partnership with Intel has delivered integration with the highly optimized Intel MKL-DNN open-source library for deep learning. When using Intel MKL-DNN, we observed inference speedups of up to 3x on various Intel CPU platforms.

The list of platforms that run TensorFlow has grown to include Cloud TPUs, which were released in beta last month. The Google Cloud TPU team has already delivered a strong 1.6x increase in ResNet-50 performance since launch. These improvements will be available to TensorFlow users with the upcoming 1.8 release.

Enabling new applications and domains using TensorFlow

Many data analysis problems are solved using statistical and probabilistic methods. Beyond deep learning and neural network models, TensorFlow now provides state-of-the-art methods for Bayesian analysis via the TensorFlow Probability API. This library contains building blocks like probability distributions, sampling methods, and new metrics and losses. Many other classical ML methods also have increased support. As an example, boosted decision trees can be easily trained and deployed using pre-made high-level classes.
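
For a taste of those building blocks, here's a minimal sketch using the TensorFlow Probability distributions API (assuming the tensorflow_probability package is installed and eager execution is enabled):

```python
import tensorflow as tf
import tensorflow_probability as tfp

tf.enable_eager_execution()
tfd = tfp.distributions

dist = tfd.Normal(loc=0.0, scale=1.0)  # a reusable distribution object
samples = dist.sample(1000)            # built-in sampling methods
print(dist.log_prob(0.5))              # and log-density evaluation
```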

Machine learning and TensorFlow have already helped solve challenging problems in many different fields. Another area where we see TensorFlow having a big impact is in genomics, which is why we're releasing Nucleus, a library for reading, writing, and filtering common genomics file formats for use in TensorFlow. This, along with DeepVariant, an open-source TensorFlow based tool for genome variant discovery, will help spur new research and advances in genomics.

Expanding community resources and engagement

These updates to TensorFlow aim to benefit and grow the community of users and contributors - the thousands of people who play a part in making TensorFlow one of the most popular ML frameworks in the world. To continue to engage with the community and stay up-to-date with TensorFlow, we've launched the new official TensorFlow blog and the TensorFlow YouTube channel. We're also making it easier for our community to collaborate by launching new mailing lists and Special Interest Groups designed to support open-source work on specific projects. To see how you can be a part of the community, visit the TensorFlow Community page and as always, you can follow TensorFlow on Twitter for the latest news.

We're incredibly thankful to everyone who has helped make TensorFlow a successful ML framework in the past two years. Thanks for attending, thanks for watching, and remember to use #MadeWithTensorFlow to share how you are solving impactful and challenging problems with machine learning and TensorFlow!

Launchpad Accelerator expands with regional programs

Posted by Josh Yellin - Global Lead, Launchpad Accelerator

For the past five years, Launchpad has been connecting startups from around the world with the best of Google - its people, network, methodologies, and technologies. We have worked with market leaders including BrainQ (Israel), Flutterwave (Nigeria), Jumo (South Africa), Nestaway (India), and Nubank (Brazil), empowering their sustainable growth through high-touch programs in AI/ML implementation, leadership best-practices, and access to global capital.

In order to better support startup ecosystems in key regions, we're thrilled to expand Launchpad Accelerator by introducing new, standalone initiatives. Building on Google VP Yossi Matias's initial engagement with startups in Tel Aviv, we are introducing accelerators in Tel Aviv, Israel; Lagos, Nigeria; and São Paulo, Brazil. These regional accelerators are representative of our long-term commitment to support and learn from startup ecosystems around the world.

In Tel Aviv, we are working with Machine Learning startups. Our first class launched in March with four ML startups focused on healthcare technology solutions. If you are interested in joining our next class, you can find more information here.

In Lagos, we are working with seed-stage companies solving a range of market-related problems. Our first class, which also launched in March, includes 12 startups working across e-commerce, education, and supply chain. Learn more about the Launchpad Africa program here.

In São Paulo, we will be working with growth-stage companies that operate across a number of sectors. Applications for Class One are currently open, and the class will begin in May 2018. If you wish to apply, please do so here.

We will continue to operate a program in San Francisco for top growth-stage global startups. With the addition of accelerators in key regions, we are able to design more customized programs, develop stronger relationships with our partners on the ground, and support the growth of local startup ecosystems.


Stay updated on developments and future opportunities by subscribing to the Google Developers newsletter.

Google Fonts launches Korean support

Posted by the Google Fonts team

The Google Fonts catalog now includes Korean web fonts for designers and developers working with Korea's unique Hangul writing system. While some of the fonts themselves have been available in beta for years, we introduced official support for Korean earlier this month after devising a more efficient means of serving Chinese, Japanese, and Korean (CJK) font files, which have very large character sets and file sizes.

We've always wanted to offer CJK fonts, and over the years we've worked on foundational technologies such as WOFF2 and CSS3 unicode-range in order to make this possible. Last year, Google engineers experimented with different approaches to slicing fonts into smaller subsets, and found that certain techniques had very good results that enabled this launch.

The Hangul script is distinct from Chinese Hanzi and Japanese Kanji characters. In some ways, it shares greater similarity with Western writing systems because it is constructed from a phonetic alphabet. Whereas the visual features of Hanzi and Kanji logograms give no direct indication of their pronunciation, Hangul is a phonographic script in which written words are built from their constituent sounds.

Hangul starts with a set of 19 consonants and 21 vowels (1). When writing a sentence, individual characters are first identified (2), then combined into blocks that represent complete words (3), and finally conjugated and arranged in grammatical form to create a sentence (4).

Despite the elegant logic underlying Hangul script, Korean fonts present the same basic difficulty for developers that Chinese and Japanese fonts do. Hangul characters may be constructed from just 40 basic elements, but the final forms add up quickly. Korean fonts eventually require over ten thousand characters, meaning the files are too large for most users to download quickly enough for the fonts to appear instantly upon visiting a website. A typical full Korean font hovers around 4MB, whereas even fairly extensive Latin fonts rarely exceed 250KB.

During the time that Korean fonts were only available on the Google Fonts Early Access system, we were surprised that many web developers were willing to accept the latency implications of serving full font files to their users. Still, in order to graduate these fonts out of our Early Access system, we needed to devise a way for them to work for a wider cross-section of web users, especially those with relatively slow connections.

The Google Fonts API offers larger font files as several subsets, such as "latin" and "cyrillic." When the service launched, these subsets had to be selected by developers. For a few years, we've enabled the 'unicode-range' property of CSS3 for browsers that support it. This means when a large font file is sliced into subsets, the ranges of the Unicode characters in each subset are declared as part of the @font-face declaration. This allows browsers to fetch only a particular subset when those characters appear in a web page.
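
To illustrate the idea (this is not Google's production pipeline, just a sketch using the open-source fontTools library with a placeholder file name and codepoint range), a subset file and its matching unicode-range declaration could be produced like this:

```python
from fontTools.subset import Options, Subsetter
from fontTools.ttLib import TTFont

font = TTFont("NotoSansKR-Regular.ttf")
options = Options(flavor="woff2")  # WOFF2 output requires the brotli package

# Keep only one slice of the Hangul syllables block.
subsetter = Subsetter(options=options)
subsetter.populate(unicodes=range(0xAC00, 0xAE00))
subsetter.subset(font)
font.save("korean-slice-01.woff2")

# The @font-face rule for this slice would then declare:
#   unicode-range: U+AC00-ADFF;
# so browsers fetch the file only when those characters appear on a page.
```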

One of the key benefits of the Google Fonts API is cross-site caching, and this benefit continues to apply to the delivery of font subsets through unicode-range. The font files we serve are used by many domains, so after you visit a site and your browser downloads its fonts, the files are saved in the browser's cache. Then the next time you visit another site that uses the same font files, there's no need for your browser to download them again. This latency benefit only increases over time, and since the many subsets of large font files are cached the same way, you'll see the same cross-site benefits with our CJK fonts.

Over the years we have worked with the W3C and browser developers to ensure that unicode-range would become well supported. Now that Chrome, Firefox, Safari, and Edge have shipped this feature, there is enough support to enable a new means of delivering Korean web fonts that works seamlessly for these browsers.

Support for the unicode-range feature has become widespread, according to caniuse.com

In order to maximize efficiency, we wanted to know which characters it made the most sense to cluster together in a subset. We devised a slicing strategy by analyzing text on the Korean-language web to extract patterns of Unicode characters, building topic models of which ones tend to appear together on the same page.

As we evaluated different slicing strategies to decide which Korean characters to include in each subset, our goal was to minimize both the number of subsets and the number of requests. If we sliced the script into 1,000 arbitrary subsets, without factoring in usage and commonality, we would get way too many HTTP requests. We built a testing framework to see how a variety of strategies worked with real-world traffic using our Early Access system, and we launched Korean fonts in our directory with the most efficient one we've found so far.

Strategy 1 is no slicing. The best strategy had 20 times fewer connection requests than the worst, which simply divides the font into equal parts without accounting for patterns of language use.

Moving forward, we think we can do even better. With our scale, a small improvement can justify a lot of effort. By continuing to use our testing framework on different approaches to slicing, we can tune our serving to be as efficient as possible. For the web developers who use our API, and all end users, these kinds of changes are totally transparent and don't require any further work on your part. For example, when WOFF2 came out in 2015, every user with a browser supporting WOFF2 got a 25% faster experience. We transparently make things better for all users on an ongoing basis, and there's enormous potential for future improvements in the delivery of CJK fonts.

This launch began with five Korean fonts originally designed by the leading Korean type foundry Sandoll for Naver. Since the initial launch, we have grown the collection to 23 Korean families, and to showcase them we commissioned a digital specimen website from Math Practice, a digital design studio in New York City. Here you can see beautiful Korean typography in action—and with fast page loads made possible by our new slicing technique.

Thanks to SooYoung Jang, Irin Kim, E Roon Kang, Wonyoung So, Guhong Min, Hannah Son, Aaron Bell, Marc Foley, and all the typeface designers involved in growing the Korean fonts collection and developing the minisite.

Transitioning Google URL Shortener to Firebase Dynamic Links

Posted by Michael Hermanto, Software Engineer, Firebase

We launched the Google URL Shortener back in 2009 as a way to help people more easily share links and measure traffic online. Since then, many popular URL shortening services have emerged and the ways people find content on the Internet have also changed dramatically, from primarily desktop webpages to apps, mobile devices, home assistants, and more.

To refocus our efforts, we're turning down support for goo.gl over the coming weeks and replacing it with Firebase Dynamic Links (FDL). FDLs are smart URLs that allow you to send existing and potential users to any location within an iOS, Android or web app. We're excited to grow and improve the product going forward. While most features of goo.gl will eventually sunset, all existing links will continue to redirect to the intended destination.

For consumers

Starting April 13, 2018, anonymous users and users who have never created short links before today will not be able to create new short links via the goo.gl console. If you are looking to create new short links, we recommend you use Firebase Dynamic Links or check out popular services like Bitly and Ow.ly as an alternative.

If you have existing goo.gl short links, you can continue to use all features of the goo.gl console for a period of one year, until March 30, 2019, when we will discontinue the console. You can manage all your short links and their analytics through the goo.gl console during this period.

After March 30, 2019, all links will continue to redirect to the intended destination. Your existing short links will not be migrated to the Firebase console; however, you will be able to export your link information from the goo.gl console.

For developers

Starting May 30, 2018, only projects that have accessed URL Shortener APIs before today can create short links. To create new short links, we recommend the FDL APIs. FDL short links will automatically detect the user's platform and send the user to either the web or your app, as appropriate.

If you are already calling URL Shortener APIs to manage goo.gl short links, you can continue to use them for a period of one year, until March 30, 2019, when we will discontinue the APIs.

As it is for consumers, all links will continue to redirect to the intended destination after March 30, 2019. However, existing short links will not be migrated to the Firebase console/API.

URL Shortener has been a great tool that we're proud to have built. As we look towards the future, we're excited about the possibilities of Firebase Dynamic Links, particularly when it comes to dynamic platform detection and links that survive the app installation process. We hope you are too!