Tag Archives: ML

Innovations in Graph Representation Learning



Relational data representing relationships between entities is ubiquitous on the Web (e.g., online social networks) and in the physical world (e.g., in protein interaction networks). Such data can be represented as a graph with nodes (e.g., users, proteins), and edges connecting them (e.g., friendship relations, protein interactions). Given the widespread prevalence of graphs, graph analysis plays a fundamental role in machine learning, with applications in clustering, link prediction, privacy, and others. To apply machine learning methods to graphs (e.g., predicting new friendships, or discovering unknown protein interactions) one needs to learn a representation of the graph that is amenable to be used in ML algorithms.

However, graphs are inherently combinatorial structures made of discrete parts like nodes and edges, while many common ML methods, like neural networks, favor continuous structures, in particular vector representations. Vector representations are particularly important in neural networks, as they can be directly used as input layers. To get around the difficulties in using discrete graph representations in ML, graph embedding methods learn a continuous vector space for the graph, assigning each node (and/or edge) in the graph to a specific position in a vector space. A popular approach in this area is that of random-walk-based representation learning, as introduced in DeepWalk.

Left: The well-known Karate graph representing a social network. Right: A continuous space embedding of the nodes in the graph using DeepWalk.
Here we present the results of two recent papers on graph embedding: “Is a Single Embedding Enough? Learning Node Representations that Capture Multiple Social Contexts” presented at WWW’19 and “Watch Your Step: Learning Node Embeddings via Graph Attention” at NeurIPS’18. The first paper introduces a novel technique to learn multiple embeddings per node, enabling a better characterization of networks with overlapping communities. The second addresses the fundamental problem of hyperparameter tuning in graph embeddings, allowing one to easily deploy graph embeddings methods with less effort. We are also happy to announce that we have released the code for both papers in the Google Research github repository for graph embeddings.

Learning Node Representations that Capture Multiple Social Contexts
In virtually all cases, the crucial assumption of standard graph embedding methods is that a single embedding has to be learned for each node. Thus, the embedding method can be said to seek to identify the single role or position that characterizes each node in the geometry of the graph. Recent work observed, however, that nodes in real networks belong to multiple overlapping communities and play multiple roles—think about your social network where you participate in both your family and in your work community. This observation motivates the following research question: is it possible to develop methods where nodes are embedded in multiple vectors, representing their participation in overlapping communities?

In our WWW’19 paper, we developed Splitter, an unsupervised embedding method that allows the nodes in a graph to have multiple embeddings to better encode their participation in multiple communities. Our method is based on recent innovations in overlapping clustering based on ego-network analysis, using the persona graph concept, in particular. This method takes a graph G, and creates a new graph P (called the persona graph), where each node in G is represented by a series of replicas called the persona nodes. Each persona of a node represents an instantiation of the node in a local community to which it belongs. For each node U in the graph, we analyze the ego-network of the node (i.e., the graph connecting the node to its neighbors, in this example A, B, C, D) to discover local communities to which the node belongs. For instance, in the figure below, node U belongs to two communities: Cluster 1 (with the friends A and B, say U’s family members) and Cluster 2 (with C and D, say U’s colleagues).
Ego-net of node U
Then, we use this information to “split” node U into its two personas U1 (the family persona) and U2 (the work persona). This disentangles the two communities, so that they no longer overlap.
The ego-splitting method separating the U nodes in 2 personas.
This technique has been used to improve the state-of-the-art results in graph embedding methods, showing up to 90% reduction in link prediction (i.e., predicting which link will form in the future) error on a variety of graphs. The key reason for this improvement is the ability of the method to disambiguate highly overlapping communities found in social networks and other real-world graphs. We further validate this result with an in-depth analysis of co-authorship graphs where authors belong to overlapping research communities (e.g., machine learning and data mining).
Top Left: A typical graphs with highly overlapping communities. Top Right: A traditional embedding of the graph on the left using node2vec. Bottom Left: A persona graph of the graph above. Bottom Right: The Splitter embedding of the persona graph. Notice how the persona graph clearly disentangles the overlapping communities of the original graph and Splitter outputs well-separated embeddings.
Automatic hyper-parameter tuning via graph attention.
Graph embedding methods have shown outstanding performance on various ML-based applications, such as link prediction and node classification, but they have a number of hyper-parameters that must be manually set. For example, are nearby nodes more important to capture when learning embeddings than nodes that are further away? Even though experts may be able to fine tune these hyper-parameters, one must do so independently for each graph. To obviate such manual work, in our second paper, we proposed a method to learn the optimal hyper-parameters automatically.

Specifically, many graph embedding methods, like DeepWalk, employ random walks to explore the context around a given node (i.e. the direct neighbors, the neighbors of the neighbors, etc). Such random walks can have many hyper-parameters that allow tuning of the local exploration of the graph, thus regulating the attention given by the embeddings to nearby nodes. Different graphs may present different optimal attention patterns and hence different optimal hyperparameters (see the picture below, where we show two different attention distributions). Watch Your Step formulates a model for the performance of the embedding methods based on the above mentioned hyper-parameters. Then we optimize the hyper-parameters to maximize the performance predicted by the model, using standard backpropagation. We found that the values learned by backpropagation agree with the optimal hyper-parameters obtained by grid search.
Our new method for automatic hyper-parameter tuning, Watch Your Step, uses an attention model to learn different graph context distributions. Shown above are two example local neighborhoods about a center node (in yellow) and the context distributions (red gradient) that was learned by the model. The left-side graph shows a more diffused attention model, while the distribution on the right shows one concentrated on direct neighbors.
This work falls under the growing family of AutoML, where we want to alleviate the burden of optimizing the hyperparameters—a common problem in practical machine learning. Many AutoML methods use neural architecture search. This paper instead shows a variant, where we use the mathematical connection between the hyperparameters in the embeddings and graph-theoretic matrix formulations. The “Auto” portion corresponds to learning the graph hyperparameters by backpropagation.

We believe that our contributions will further advance the state of the research in graph embedding in various directions. Our method for learning multiple node embeddings draws a connection between the rich and well-studied field of overlapping community detection, and the more recent one of graph embedding which we believe may result in fruitful future research. An open problem in this area is the use of multiple-embedding methods for classification. Furthermore, our contribution on learning hyperparameters will foster graph embedding adoption by reducing the need for expensive manual tuning. We hope the release of these papers and code will help the research community pursue these directions.

Acknowledgements
We thank Sami Abu-el-Haija who contributed to this work and is now a Ph.D. student at USC.

Source: Google AI Blog


ML Kit expands into NLP with Language Identification and Smart Reply

Posted by Christiaan Prins and Max Gubin

Today we are announcing the release of two new features to ML Kit: Language Identification and Smart Reply.

You might notice that both of these features are different from our existing APIs that were all focused on image/video processing. Our goal with ML Kit is to offer powerful but simple-to-use APIs to leverage the power of ML, independent of the domain. As such, we are excited to expand ML Kit with solutions for Natural Language Processing (NLP)!

NLP is a category of ML that deals with analyzing and generating text, speech, and other kinds of natural language data. We're excited to start out with two APIs: one that helps you identify the language of text, and one that generates reply suggestions in chat applications. Both of these features work fully on-device and are available on the latest version of the ML Kit SDK, on iOS (9.0 and higher) and Android (4.1 and higher).

Generate reply suggestions based on previous messages

A new feature popping up in messaging apps is to provide the user with a selection of suggested responses, either as actions on a notification or inside the app itself. This can really help a user to quickly respond when they are busy or a handy way to initiate a longer message.

With the new Smart Reply API you can now quickly achieve the same in your own apps. The API provides suggestions based on the last 10 messages in a conversation, although it still works if only one previous message is available. It is a stateless API that fully runs on-device, so we don't keep message history in memory nor send it to a server.

textPlus app providing response suggestions using Smart Reply

We have worked closely with partners like textPlus to ensure Smart Reply is ready for prime time and they have now implemented in-app response suggestions with the latest version of their app (screenshot above).

Adding Smart Reply to your own app is done with a simple function call (using Swift in this example):

let smartReply = NaturalLanguage.naturalLanguage().smartReply()
smartReply.suggestReplies(for: conversation) { result, error in
    guard error == nil, let result = result else {
        return
    }
    if (result.status == .success) {
        for suggestion in result.suggestions {
            print("Suggested reply: \(suggestion.text)")
        }
    }
}

After you initialize a Smart Reply instance, call suggestReplies with a list of recent messages. The callback provides the result which contains a list of suggestions.

For details on how to use the Smart Reply API, check out the documentation.

Tell me more ...

Although as a developer, you can just pick up this new API and easily get it integrated in your app, it may be interesting to reveal a bit on how it works under the hood. At the core of Smart Reply is a machine-learned model that is executed using TensorFlow Lite and has a state-of-the-art modern architecture based on SentencePiece text encoding[1] and Transformer[2].

However, as we realized when we started development of the API, the core suggestion model is not all that's needed to provide a solution that developers can use in their apps. For example, we added a model to detect sensitive topics, so that we avoid making suggestions in response to profanity or in cases of personal tragedy/hardship. Also, we included language identification, to ensure we do not provide suggestions for languages the core model is not trained on. The Smart Reply feature is launching with English support first.

Identify the language of a piece of text

The language of a given text string is a subtle but helpful piece of information. A lot of apps have functionality with a dependency on the language: you can think of features like spell checking, text translation or Smart Reply. Rather than asking a user to specify the language they use, you can use our new Language Identification API.

ML Kit recognizes text in 103 different languages and typically only requires a few words to make an accurate determination. It is fast as well, typically providing a response within 1 to 2 ms across iOS and Android phones.

Similar to the Smart Reply API, you can identify the language with a function call (using Swift in this example):

let languageId = NaturalLanguage.naturalLanguage().languageIdentification()
languageId.identifyLanguage(for: "¿Cómo estás?") { languageCode, error in
  guard error == nil, let languageCode = languageCode else {
    print("Failed to identify language with error: \(error!)")
    return
  }

  print("Identified Language: \(languageCode)")
}

The identifyLanguage functions takes a piece of a text and its callback provides a BCP-47 language code. If no language can be confidently recognized, ML Kit returns a code of und for undetermined. The Language Identification API can also provide a list of possible languages and their confidence values.

For details on how to use the Language Identification API, check out the documentation.

Get started today

We're really excited to expand ML Kit to include Natural Language APIs. Give the two new NLP APIs a spin today and let us know what you think! You can always reach us in our Firebase Talk Google Group.

As ML Kit grows we look forward to adding more APIs and categories that enables you to provide smarter experiences for your users. With that, please keep an eye out for some exciting ML Kit announcements at Google I/O.

Using Deep Learning to Improve Usability on Mobile Devices



Tapping is the most commonly used gesture on mobile interfaces, and is used to trigger all kinds of actions ranging from launching an app to entering text. While the style of clickable elements (e.g., buttons) in traditional desktop graphical user interfaces is often conventionally defined, on mobile interfaces it can still be difficult for people to distinguish tappable versus non-tappable elements due to the diversity of styles. This confusion can lead to false affordances (e.g., a feature that could be mistaken for a button) and a lack of discoverability that can lead to user frustration, uncertainty, and errors. To avoid this, interface designers can conduct a study or a visual affordance test to help clarify the tappability of items in their interfaces. However, such studies are time-consuming and their findings are often limited to a specific app or interface design.

In our CHI'19 paper, "Modeling Mobile Interface Tappability Using Crowdsourcing and Deep Learning", we introduced an approach for modeling the usability of mobile interfaces at scale. We crowdsourced a task to study UI elements across a range of mobile apps to measure the perceived tappability by a user. Our model predictions were consistent with the user group at the ~90% level, demonstrating that a machine learning model can be effectively used to estimate the perceived tappability of interface elements in their design without the need for expensive and time consuming user testing.
Predicting Tappability with Deep Learning
Designers often use visual properties such as the color or depth of an element to signify its availability for interaction on interfaces, e.g., the blue color and underline of a link. While these common signifiers are useful, it is not always clear when to apply them in each specific design setting. Furthermore, with design trends evolving, traditional signifiers are constantly being altered and challenged, potentially causing user uncertainty and mistakes.

To understand how users perceive this changing landscape, we analyzed the potential signifiers affecting tappability in real mobile apps—element type (e.g., check boxes, text boxes, etc.), location, size, color, and words. We started by crowdsourcing volunteers to label the perceived clickability of ~20,000 unique interface elements from ~3,500 apps. With the exception of text boxes, type signifiers yielded low uncertainty in user perceived tappability. The location signifier refers to the position of a feature on the screen and is informed by the common layout design in mobile apps, as demonstrated in the figure below.
Heatmaps displaying the accuracy of tappable and non-tappable elements by location, where warmer colors represent areas of higher accuracy. Users labeled non-tappable elements more accurately towards the upper center of the interface, and tappable elements towards the bottom center of the interface.
The impact of element size was relatively weak, but did indicate confusion in the case of large non-tappable elements. Users showed a tendency to bright colors and short word counts for tappable elements, though word semantics also played a significant role.

We used these labels to train a simple deep neural network that predicts the likelihood that a user will perceive an interface element as tappable versus non-tappable. For a given element of the interface, the model uses a range of features, including the spatial context of the element on the screen (location), the semantics and functionality of the element (words and type), and the visual appearance (size as well as raw pixels). The neural network model applies a convolutional neural network (CNN) to extract features from raw pixels, and uses learned semantic embeddings to represent text content and element properties. The concatenation of all these features are then fed to a fully-connected network layer, the output of which produces a binary classification of an element's tappability.

Evaluation of the Model
The model allowed us to automatically diagnose mismatches between the tappability of each interface element as perceived by a user—predicted by our model—and the intended or actual tappable state of the element specified by the developer or designer. In the example below, our model predicts that there is a 73% chance that a user would think the labels such as "Followers" or "Following" are tappable, while these interface elements are in fact not programmed to be tappable.
To understand how our model behaves compared to human users, particularly when there is ambiguity in human perception, we generated a second, independent dataset by crowdsourcing an effort among 290 volunteers to label each of 2,000 unique interface elements with respect to their perceived tappability. Each element was labeled independently by five different users. We found that more than 40% of the elements in our sample were labeled inconsistently by volunteers. Our model matches this uncertainty in human perception quite well, as demonstrated in the figure below.
The scatterplot of the tappability probability predicted by the model (the Y axis) versus the consistency in the human user labels (the X axis) for each element in the consistency dataset.
When users agree an element's tappability, our model tends to give a more definite answer—a probability close to 1 for tappable and close to 0 for not tappable. When workers are less consistent on an element (towards the middle of the X axis), our model is also less certain about the decision. Overall, our model achieved reasonable accuracy of matching human perception in identifying tappable UI elements with a mean precision of 90.2% and recall of 87.0%.

Predicting tappability is merely one example of what we can do with machine learning to solve usability issues in user interfaces. There are many other challenges in interaction design and user experience research where deep learning models can offer a vehicle to distill large, diverse user experience datasets and advance scientific understandings about interaction behaviors.

Acknowledgements
This research was a joint work of Amanda Swangson, summer intern at Google, and Yang Li, a Research Scientist in Deep Learning and Human Computer Interaction.

Source: Google AI Blog


A Summary of the Google Flood Forecasting Meets Machine Learning Workshop



Recently, we hosted the Google Flood Forecasting Meets Machine Learning workshop in our Tel Aviv office, which brought hydrology and machine learning experts from Google and the broader research community to discuss existing efforts in this space, build a common vocabulary between these groups, and catalyze promising collaborations. In line with our belief that machine learning has the potential to significantly improve flood forecasting efforts and help the hundreds of millions of people affected by floods every year, this workshop discussed improving flood forecasting by aggregating and sharing large data sets, automating calibration and modeling processes, and applying modern statistical and machine learning tools to the problem.

Panel on challenges and opportunities in flood forecasting, featuring (from left to right): Prof. Paolo Burlando (ETH Zürich), Dr. Tyler Erickson (Google Earth Engine), Dr. Peter Salamon (Joint Research Centre) and Prof. Dawei Han (University of Bristol).
The event was kicked off by Google's Yossi Matias, who discussed recent machine learning work and its potential relevance for flood forecasting, crisis response and AI for Social Good, followed by two introductory sessions aimed at bridging some of the knowledge gap between the two fields - introduction to hydrology for computer scientists by Prof. Peter Molnar of ETH Zürich, and introduction to machine learning for hydrologists by Prof. Yishay Mansour of Tel Aviv University and Google

Included in the 2-day event was a wide range of fascinating talks and posters across the flood forecasting landscape, from both hydrologic and machine learning points of view.

An overview of research areas in flood forecasting addressed in the workshop.
Presentations from the research community included:
Alongside these talks, we presented the various efforts across Google to try and improve flood forecasting and foster collaborations in the field, including:
Additionally, at this workshop we piloted an experimental "ML Consultation" panel, where Googlers Gal Elidan, Sasha Goldshtein and Doron Kukliansky gave advice on how to best use machine learning in several hydrology-related tasks. Finally, we concluded the workshop with a moderated panel on the greatest challenges and opportunities in flood forecasting, with hydrology experts Prof. Paolo Burlando of ETH Zürich, Prof. Dawei Han of the University of Bristol, Dr. Peter Salamon of the Joint Research Centre and Dr. Tyler Erickson of Google Earth Engine.
Flood forecasting is an incredibly important and challenging task that is one part of our larger AI for Social Good efforts. We believe that effective global-scale solutions can be achieved by combining modern techniques with the domain expertise already existing in the field. The workshop was a great first step towards creating much-needed understanding, communication and collaboration between the flood forecasting community and the machine learning community, and we look forward to our continued engagement with the broad research community to tackle this challenge.

Acknowledgements
We would like to thank Avinatan Hassidim, Carla Bromberg, Doron Kukliansky, Efrat Morin, Gal Elidan, Guy Shalev, Jennifer Ye, Nadav Rabani and Sasha Goldshtein for their contributions to making this workshop happen.

Source: Google AI Blog


This is the Future of Finance

Posted by Roy Glasberg, Head of Launchpad

Launchpad's mission is to accelerate innovation and to help startups build world-class technologies by leveraging the best of Google - its people, network, research, and technology.

In September 2018, the Launchpad team welcomed ten of the world's leading FinTech startups to join their accelerator program, helping them fast-track their application of advanced technology. Today, March 15th, we will see this cohort graduate from the program at the Launchpad team's inaugural event - The Future of Finance - a global discussion on the impact of applied ML/AI on the finance industry. These startups are ensuring that everyone has relevant insights at their fingertips and that all people, no matter where they are, have access to equitable money, banking, loans, and marketplaces.

Tune into the event from wherever you are via the livestream link

The Graduating Class of Launchpad FinTech Accelerator San Francisco'19

  • Alchemy (USA), bridging blockchain and the real world
  • Axinan (Singapore), providing smart insurance for the digital economy
  • Aye Finance (India), transforming financing in India
  • Celo (USA), increasing financial inclusion through a mobile-first cryptocurrency
  • Frontier Car Group (Germany), investing in the transformation of used-car marketplaces
  • GO-JEK (Indonesia), improving the welfare and livelihoods of informal sectors
  • GuiaBolso (Brazil), improving the financial lives of Brazilians
  • JUMO (South Africa), creating a transparent, fair money marketplace for mobile users to access loans
  • m.Paani (India), (em)powering local retailers and the next billion users in India
  • Starling Bank (UK), improving financial health with a 100% mobile-only bank

Since joining the accelerator, these startups have made great strides and are going from strength to strength. Some recent announcements from this cohort include:

  • JUMO have announced the launch of Opportunity Co, a 500M fund for credit where all the profits go back to the customers.
  • The team at Aye Finance have just closed $30m in Series D equity round.
  • Starling Bank has provided 150 new jobs in Southampton and have received a £100m grant from a fund aimed at increasing competition and innovation in the British banking sector, and also a £75m fundraise.
  • GuiaBolso ran a campaign to pay the bills of some its users (the beginning of the year in Brazil is a time of high expenses and debts) and is having a significant impact on credit with 80% of cases seeing interest rates on loans being cheaper than traditional banks.

We look forward to following the success of all our participating founders as they continue to make a significant impact on the global economy.

Want to know more about the Launchpad Accelerator? Visit our site, stay updated on developments and future opportunities by subscribing to the Google Developers newsletter and visit The Launchpad Blog.

Introducing Class II of Launchpad Accelerator India

https://lh6.googleusercontent.com/Gxl43TzIBGARTYC9VQmiY_1cbFn2_NSAuh0wL9GlaDG-dyr9P2hrFQuABDoN1ZrmVJuvTE8o4zfVEA87UgVveiHwJ00j_br_8Nxbe53FxqxLF6JYoShY3-zbPo75g0Qo8z8ceU4f
In December 2018, we opened applications for Class II of Launchpad Accelerator India and are thrilled to announce the start of the new class.
The second batch of the Launchpad Accelerator India announced
Similar to Class I, these 10 incredible startups will get access to the best of Google -- including mentorship from Google teams and industry experts, free support, cloud credits, and more. These startups will undergo an intensive 1-week mentorship bootcamp in March, followed by more engagements in April and May.  


At the bootcamp, they will meet with mentors both from Google and subject matter experts from the industry to set their goals for the upcoming three months. During the course of the program, the startups will receive insights and support on advanced technologies such as ML, in-depth design sprints for specifically identified challenges, guidance on focused tech projects, networking opportunities at industry events, and much more.


The first class kicks off today in Bangalore.


Meet the 10 startups of Class II:


(1) Opentalk Pvt Ltd: An app to talk to new people around the world, become a better speaker and make new friends
(2) THB: Helping healthcare providers organize and standardize healthcare information to drive clinical and commercial analytical applications and use cases
(3) Perceptiviti Data Solutions: An AI platform for insurance claim flagging, payment integrity, and fraud and abuse management
(4) DheeYantra: A cognitive conversational AI for Indian vernacular languages
(5) Kaleidofin: Customized financial solutions that combine multiple financial products such as savings, credit, and insurance in intuitive ways, to help customers achieve their real-life goals
(6) FinancePeer: A P2P lending company that connects lenders with borrowers online
(7) SmartCoin: An app for providing credit access to the vastly underserved lower- and middle-income segments through advanced AI/ML models
(8) HRBOT: Using AI and video analytics to find employable candidates in tier 2 and tier 3 cities, remotely.
(9) Savera.ai: A service that remotely maps your roof and helps you make an informed decision about having a solar panel, followed by chatbot-based support to help you learn about solar tech while enabling connections to local service providers
(10) Adiuvo Diagnostics: A rapid wound infection assessment and management device

By Paul Ravindranath, Program Manager, Launchpad Accelerator India

Introduction to Fairness in Machine Learning

Posted by Andrew Zaldivar, Developer Advocate, Google AI

A few months ago, we announced our AI Principles, a set of commitments we are upholding to guide our work in artificial intelligence (AI) going forward. Along with our AI Principles, we shared a set of recommended practices to help the larger community design and build responsible AI systems.

In particular, one of our AI Principles speaks to the importance of recognizing that AI algorithms and datasets are the product of the environment—and, as such, we need to be conscious of any potential unfair outcomes generated by an AI system and the risk it poses across cultures and societies. A recommended practice here for practitioners is to understand the limitations of their algorithm and datasets—but this is a problem that is far from solved.

To help practitioners take on the challenge of building fairer and more inclusive AI systems, we developed a short, self-study training module on fairness in machine learning. This new module is part of our Machine Learning Crash Course, which we highly recommend taking first—unless you know machine learning really well, in which case you can jump right into the Fairness module.

The Fairness module features a hands-on technical exercise. This exercise demonstrates how you can use tools and techniques that may already exist in your development stack (such as Facets Dive, Seaborn, pandas, scikit-learn and TensorFlow Estimators to name a few) to explore and discover ways to make your machine learning system fairer and more inclusive. We created our exercise in a Colaboratory notebook, which you are more than welcome to use, modify and distribute for your own purposes.

From exploring datasets to analyzing model performance, it's really easy to forget to make time for responsible reflection when building an AI system. So rather than having you run every code cell in sequential order without pause, we added what we call FairAware tasks throughout the exercise. FairAware tasks help you zoom in and out of the problem space. That way, you can remind yourself of the big picture: finding the undesirable biases that could disproportionately affect model performance across groups. We hope a process like FairAware will become part of your workflow, helping you find opportunities for inclusion.

FairAware task guiding practitioner to compare performances across gender.

The Fairness module was created to provide you with enough of an understanding to get started in addressing fairness and inclusion in AI. Keep an eye on this space for future work as this is only the beginning.

If you wish to learn more from our other examples, check out the Fairness section of our Responsible AI Practices guide. There, you will find a full set of Google recommendations and resources. From our latest research proposal on reporting model performance with fairness and inclusion considerations, to our recently launched diagnostic tool that lets anyone investigate trained models for fairness, our resource guide highlights many areas of research and development in fairness.

Let us know what your thoughts are on our Fairness module. If you have any specific comments on the notebook exercise itself, then feel free to leave a comment on our GitHub repo.


On behalf of many contributors and supporters,

Andrew Zaldivar – Developer Advocate, Google AI

Improving Search for the next 20 years

https://storage.googleapis.com/gweb-uniblog-publish-prod/images/BLR_-_Koshys_1_1.max-1000x1000.jpg
Growing up in India, there was one good library in my town that I had access to—run by the British Council.  It was modest by western standards, and I had to take two buses just to get there. But I was lucky, because for every child like me, there were many more who didn’t have access to the same information that I did. Access to information changed my life, bringing me to the U.S. to study computer science and opening up huge possibilities for me that would not have been available without the education I had.
Ben's library
The British Council Library in my hometown.


When Google started 20 years ago, our mission was to organize the world’s information and make it universally accessible and useful. That seemed like an incredibly ambitious mission at the time—even considering that in 1998 the web consisted of just 25 million pages (roughly the equivalent of books in a small library).
Fast forward to today, and now we index hundreds of billions of pages in our index—more information than all the libraries in the world could hold. We’ve grown to serve people all over the world, offering Search in more than 150 languages and over 190 countries.
Through all of this, we’ve remained grounded in our mission. In fact, providing greater access to information is as core to our work today as it was when we first started. And while almost everything has changed about technology and the information available to us, the core principles of Search have stayed the same.
  • First and foremost, we focus on the user. Whether you’re looking for recipes, studying for an exam, or finding information on where to vote, we’re focused on serving your information needs.
  • We strive to give you the most relevant, highest quality information as quickly as possible. This was true when Google started with the Page Rank algorithm—the foundational technology to Search. And it’s just as true today.
  • We see billions of queries every day, and 15 percent of queries are ones we’ve never seen before. Given this scale, the only way to provide Search effectively is through an algorithmic approach. This helps us not just solve all the queries we’ve seen yesterday, but also all the ones we can’t anticipate for tomorrow.
  • Finally, we rigorously test every change we make. A key part of this testing is the rater guidelines which define our goals in search, and which are publicly available for anyone to see. Every change to Search is evaluated by experimentation and by raters using these guidelines. Last year alone, we ran more than 200,000 experiments that resulted in 2,400+ changes to search. Search will serve you better today than it did yesterday, and even better tomorrow.
As Google marks our 20th anniversary, I wanted to share a first look at the next chapter of Search, and how we’re working to make information more accessible and useful for people everywhere. This next chapter is driven by three fundamental shifts in how we think about Search:
    Underpinning each of these are our advancements in AI, improving our ability to understand language in ways that weren’t possible when Google first started. This is incredibly exciting, because over 20 years ago when I studied neural nets at school, they didn’t actually work very well...at all!
    But we’ve now reached the point where neural networks can help us take a major leap forward from understanding words to understanding concepts. Neural embeddings, an approach developed in the field of neural networks, allow us to transform words to fuzzier representations of the underlying concepts, and then match the concepts in the query with the concepts in the document. We call this technique neural matching. This can enable us to address queries like: “why does my TV look strange?” to surface the most relevant results for that question, even if the exact words aren’t contained in the page. (By the way, it turns out the reason is called the soap opera effect).
    Finding the right information about my TV is helpful in the moment. But AI can have much more profound effects. Whether it’s predicting areas that might be affected in a flood, or helping you identify the best job opportunities for you, AI can dramatically improve our ability to make information more accessible and useful.
    I’ve worked on Search at Google since the early days of its existence. One of the things that keeps me so inspired about Search all these years is our mission and how timeless it is. Providing greater access to information is fundamental to what we do, and there are always more ways we can help people access the information they need. That’s what pushes us forward to continue to make Search better for our users. And that’s why our work here is never done.

    Posted by Ben Gomes, VP, Search, News and Assistant

    Keeping people safe with AI-enabled flood forecasting

    https://storage.googleapis.com/gweb-uniblog-publish-prod/original_images/Flood_Forecast.gif
    For 20 years, Google Search has provided people with the information they need, and in times of crisis, access to timely, actionable information is often crucial. Last year we launched SOS Alerts on Search and Maps to make emergency information more accessible. Since then, we’ve activated SOS Alerts in more than 200 crisis situations, in addition to tens of thousands of Google Public Alerts, which have been viewed more than 1.5 billion times.
    Floods are devastating natural disasters worldwide—it’s estimated that every year, 250 million people around the world are affected by floods, also costing billions of dollars in damages. Flood forecasting can help individuals and authorities better prepare to keep people safe, but accurate forecasting isn’t currently available in many areas. And the warning systems that do exist can be imprecise and non-actionable, resulting in far too many people being underprepared and under informed before a flood happens.
    To help improve awareness of impending floods, we're using AI and significant computational power to create better forecasting models that predict when and where floods will occur, and incorporating that information into Google Public Alerts. A variety of elements—from historical events, to river level readings, to the terrain and elevation of a specific area—feed into our models. From there, we generate maps and run up to hundreds of thousands of simulations in each location. With this information, we’ve created river flood forecasting models that can more accurately predict not only when and where a flood might occur, but the severity of the event as well.
    flood forecast
    These images depict a flood simulation of a river in Hyderabad, India. The left side uses publicly available data while the right side uses Google data and technology. Our models contain higher resolution, accuracy, and up-to-date information.


    We started these flood forecasting efforts in India, where 20 percent of global flood-related fatalities occur. We’re partnering with India’s Central Water Commission to get the data we need to roll out early flood warnings, starting with the Patna region. The first alert went out earlier this month after heavy rains in the region.
    alert
    Flood alert shown to users in the Patna region.


    We’re also looking to expand coverage to more countries, to help more people around the world get access to these early warnings, and help keep them informed and safe.

    Posted by Yossi Matias, VP, Engineering

    Cloud AutoML: Making AI accessible to every business

    https://img.youtube.com/vi/GbLQE2C181U/maxresdefault.jpg
    When we both joined Google Cloud just over a year ago, we embarked on a mission to democratize AI. Our goal was to lower the barrier of entry and make AI available to the largest possible community of developers, researchers and businesses.


    Our Google Cloud AI team has been making good progress towards this goal. In 2017, we introduced Google Cloud Machine Learning Engine, to help developers with machine learning expertise easily build ML models that work on any type of data, of any size. We showed how modern machine learning services, i.e., APIs—including Vision, Speech, NLP, Translation and Dialogflow—could be built upon pre-trained models to bring unmatched scale and speed to business applications. Kaggle, our community of data scientists and ML researchers, has grown to more than 1 million members. And today, more than 10,000 businesses are using Google Cloud AI services, including companies like Box, Rolls Royce Marine, Kewpie, and Ocado.


    But there’s much more we can do. Currently, only a handful of businesses in the world have access to the talent and budgets needed to fully appreciate the advancements of ML and AI. There’s a very limited number of people that can create advanced machine learning models. And if you’re one of the companies that has access to ML/AI engineers, you still have to manage the time-intensive and complicated process of building your own custom ML model. While Google has offered pre-trained machine learning models via APIs that perform specific tasks, there's still a long road ahead if we want to bring AI to everyone.


    To close this gap, and to make AI accessible to every business, we’re introducing Cloud AutoML. Cloud AutoML helps businesses with limited ML expertise start building their own high-quality custom models by using advanced techniques like learning2learn and transfer learning from Google. We believe Cloud AutoML will make AI experts even more productive, advance new fields in AI, and help less-skilled engineers build powerful AI systems they previously only dreamed of.


    Our first Cloud AutoML release will be Cloud AutoML Vision, a service that makes it faster and easier to create custom ML models for image recognition. Its drag-and-drop interface lets you easily upload images, train and manage models, and then deploy those trained models directly on Google Cloud. Early results using Cloud AutoML Vision to classify popular public datasets like ImageNet and CIFAR have shown more accurate results with fewer misclassifications than generic ML APIs.


    Here’s a little more on what Cloud AutoML Vision has to offer:
    • Increased accuracy: Cloud AutoML Vision is built on Google’s leading image recognition approaches, including transfer learning and neural architecture search technologies. This means you’ll get a more accurate model even if your business has limited machine learning expertise.
    • Faster turnaround time to production-ready models: With Cloud AutoML, you can create a simple model in minutes to pilot your AI-enabled application, or build out a full, production-ready model in as little as a day.
    • Easy to use: AutoML Vision provides a simple graphical user interface that lets you specify data, then turns that data into a high quality model customized for your specific needs.



      Urban Outfitters is constantly looking for new ways to enhance our customers’ shopping experience," says Alan Rosenwinkel, Data Scientist at URBN. "Creating and maintaining a comprehensive set of product attributes is critical to providing our customers relevant product recommendations, accurate search results, and helpful product filters; however, manually creating product attributes is arduous and time-consuming. To address this, our team has been evaluating Cloud AutoML to automate the product attribution process by recognizing nuanced product characteristics like patterns and necklines styles. Cloud AutoML has great promise to help our customers with better discovery, recommendation, and search experiences."


      Mike White, CTO and SVP, for Disney Consumer Products and Interactive Media, says: “Cloud AutoML’s technology is helping us build vision models to annotate our products with Disney characters, product categories, and colors. These annotations are being integrated into our search engine to enhance the impact on Guest experience through more relevant search results, expedited discovery, and product recommendations on shopDisney.”

      And Sophie Maxwell, Conservation Technology Lead at the Zoological Society of London, tells us: "ZSL is an international conservation charity devoted to the worldwide conservation of animals and their habitats. A key requirement to deliver on this mission is to track wildlife populations to learn more about their distribution and better understand the impact humans are having on these species. In order to achieve this, ZSL has deployed a series of camera traps in the wild that take pictures of passing animals when triggered by heat or motion. The millions of images captured by these devices are then manually analysed and annotated and with the relevant species such as elephants, lions, and giraffes, etc., which is a labour-intensive and expensive process. ZSL’s dedicated Conservation Technology Unit has been collaborating closely with Google’s CloudML team to help shape the development of this exciting technology, which ZSL aims to use to automate the tagging of these images—cutting costs, enabling wider-scale deployments, and gaining a deeper understanding of how to conserve the world’s wildlife effectively."


      If you’re interested in trying out AutoML Vision, you can request access via this form.

      AutoML Vision is the result of our close collaboration with Google Brain and other Google AI teams, and is the first of several Cloud AutoML products in development. While we’re still at the beginning of our journey to make AI more accessible, we’ve been deeply inspired by what our 10,000+ customers using Cloud AI products have been able to achieve. We hope the release of Cloud AutoML will help even more businesses discover what’s possible through AI.

      By Jia Li, Head of R&D, Cloud AI, and Fei-Fei Li, Chief Scientist, Cloud AI