Tag Archives: Research Awards

Emerging practices for Society-Centered AI

The first of Google’s AI Principles is to “Be socially beneficial.” As AI practitioners, we’re inspired by the transformative potential of AI technologies to benefit society and our shared environment at a scale and speed that weren’t possible before. From helping address the climate crisis to helping transform healthcare, to making the digital world more accessible, our goal is to apply AI responsibly to be helpful to more people around the globe. Achieving global scale requires researchers and communities to think ahead — and act — collectively across the AI ecosystem.

We call this approach Society-Centered AI. It is both an extension and an expansion of Human-Centered AI, focusing on the aggregate needs of society that are still informed by the needs of individual users, specifically within the context of the larger, shared human experience. Recent AI advances offer unprecedented, societal-level capabilities, and we can now methodically address those needs — if we apply collective, multi-disciplinary AI research to society-level, shared challenges, from forecasting hunger to predicting diseases to improving productivity.

The opportunity for AI to benefit society increases each day. We took a look at our work in these areas and at the research projects we have supported. Recently, Google announced that 70 professors were selected for the 2023 Award for Inclusion Research Program, which supports academic research that addresses the needs of historically marginalized groups globally. Through evaluation of this work, we identified a few emerging practices for Society-Centered AI:

  • Understand society’s needs
    Listening to communities and partners is crucial to understanding major issues deeply and identifying priority challenges to address. As an emerging general-purpose technology, AI has the potential to address major global societal issues that can significantly impact people’s lives (e.g., educating workers, improving healthcare, and improving productivity). We have found that the key to impact is to center on society’s needs. For this, we focus our efforts on goals society has agreed should be prioritized, such as the United Nations’ 17 Sustainable Development Goals, a set of interconnected goals jointly developed by more than 190 countries to address global challenges.
  • Collective efforts to address those needs
    Collective efforts bring stakeholders (e.g., local and academic communities, NGOs, private-public collaborations) into a joint process of design, development, implementation, and evaluation of AI technologies as they are being developed and deployed to address societal needs.
  • Measuring success by how well the effort addresses society’s needs
    It is important and challenging to measure how well AI solutions address society’s needs. In each of our cases, we identified primary and secondary indicators of impact that we optimized through our collaborations with stakeholders.

Why is Society-Centered AI important?

The case examples described below show how the Society-Centered AI approach has led to impact across topics, such as accessibility, health, and climate.


Understanding the needs of individuals with non-standard speech

There are millions of people with non-standard speech (e.g., impaired articulation, dysarthria, dysphonia) in the United States alone. In 2019, Google Research launched Project Euphonia, a methodology that allows individual users with non-standard speech to train personalized speech recognition models. Our success began with the impact we had on each individual who is now able to use voice dictation on their mobile device.

Euphonia started with a Society-Centered AI approach, including collective efforts with the non-profit organizations ALS Therapy Development Institute and ALS Residence Initiative to understand the needs of individuals with amyotrophic lateral sclerosis (ALS) and their ability to use automatic speech recognition systems. Later, we developed the world’s largest corpus of non-standard speech recordings, which enabled us to train a Universal Speech Model that recognizes disordered speech with a 37% improvement in word error rate (WER) measured on real conversations. This also led to the 2022 collaboration between the University of Illinois Urbana-Champaign, Alphabet, Apple, Meta, Microsoft, and Amazon to begin the Speech Accessibility Project, an ongoing initiative to create a publicly available dataset of disordered speech samples to improve products and make speech recognition more inclusive of diverse speech patterns. Other technologies that use AI to help remove barriers of modality and language include Live Transcribe, Live Caption, and read aloud.


Focusing on society’s health needs

Access to timely maternal health information can save lives globally: every two minutes a woman dies during pregnancy or childbirth, and 1 in 26 children die before reaching age five. In rural India, educating expectant and new mothers about key health issues pertaining to pregnancy and infancy required scalable, low-cost technology solutions. Together with ARMMAN, Google Research supported a program that uses mobile messaging and machine learning (ML) algorithms to predict when women might benefit from receiving interventions (i.e., targeted preventative care information) and encourages them to engage with the mMitra free voice call program. Within a year, the mMitra program showed a 17% increase in infants who tripled their birth weight and a 36% increase in women understanding the importance of taking iron tablets during pregnancy. Over 175K mothers have been reached so far through this automated solution, which public health workers use to improve the quality of information delivery.

These efforts have been successful in improving health due to the close collective partnership among the community and those building the AI technology. We have adopted this same approach via collaborations with caregivers to address a variety of medical needs. Some examples include: the use of the Automated Retinal Disease Assessment (ARDA) to help screen for diabetic retinopathy in 250,000 patients in clinics around the world; our partnership with iCAD to bring our mammography AI models to clinical settings to aid in breast cancer detection; and the development of Med-PaLM 2, a medical large language model that is now being tested with Cloud partners to help doctors provide better patient care.


Compounding impact from sustained efforts for crisis response

Google Research’s flood prediction efforts began in 2018 with flood forecasting in India and expanded to Bangladesh to help combat the catastrophic damage from yearly floods. The initial efforts began with partnerships with India’s Central Water Commission, local governments and communities. The implementation of these efforts used SOS Alerts on Search and Maps, and, more recently, broadly expanded access via Flood Hub. Continued collaborations and advancing an AI-based global flood forecasting model allowed us to expand this capability to over 80 countries across Africa, the Asia-Pacific region, Europe, and South, Central, and North America. We also partnered with networks of community volunteers to further amplify flood alerts. By working with governments and communities to measure the impact of these efforts on society, we refined our approach and algorithms each year.

We were able to apply those methodologies, and some of the underlying technology such as SOS Alerts, from flood forecasting to similar societal needs, such as wildfire forecasting and heat alerts. Our continued engagements with organizations led to the support of additional efforts, such as the World Meteorological Organization's (WMO) Early Warnings For All Initiative. This continued engagement with communities has allowed us to learn about our users' needs on a societal level over time, expand our efforts, and compound the societal reach and impact of our work.


Further supporting Society-Centered AI research

We recently funded 18 university research proposals exemplifying a Society-Centered AI approach, a new track within the Google Award for Inclusion Research Program. These researchers are taking the Society-Centered AI methodology and helping create beneficial applications across the world. Examples of some of the projects funded include:

  • AI-Driven Monitoring of Attitude Polarization in Conflict-Affected Countries for Inclusive Peace Process and Women’s Empowerment: This project’s goal is to create LLM-powered tools that can be used to monitor peace in online conversations in developing nations. The initial target communities are ones where peace is in flux, and the effort will place particular emphasis on mitigating polarization that impacts women and on promoting harmony.
  • AI-Assisted Distributed Collaborative Indoor Pollution Meters: A Case Study, Requirement Analysis, and Low-Cost Healthy Home Solution for Indian Communities: This project is examining the use of low-cost pollution monitors combined with AI-assisted methods to generate recommendations that help communities improve air quality and at-home health. The initial target communities are highly impacted by pollution, and the joint work with them includes developing ways to measure improvements in outcomes in the local community.
  • Collaborative Development of AI Solutions for Scaling Up Adolescent Access to Sexual and Reproductive Health Education and Services in Uganda: This project’s goal is to create LLM-powered tools to provide personalized coaching and learning for users' needs on topics of sexual and reproductive health education in low-income settings in Sub-Saharan Africa. The local societal need is significant, with an estimated 25% rate of teenage pregnancy, and the project aims to address the needs with a collective development process for the AI solution.

Future direction

Focusing on society’s needs, working via multidisciplinary collective research, and measuring the impact on society help lead to AI solutions that are relevant, long-lasting, empowering, and beneficial. See the AI for the Global Goals to learn more about potential Society-Centered AI research problems. Our efforts with non-profits in these areas are complementary to the research that we are doing and encouraging. We believe that further initiatives using Society-Centered AI will help the collective research community solve problems and positively impact society at large.


Acknowledgements

Many thanks to the many individuals who have worked on these projects at Google including Shruti Sheth, Reena Jana, Amy Chung-Yu Chou, Elizabeth Adkison, Sophie Allweis, Dan Altman, Eve Andersson, Ayelet Benjamini, Julie Cattiau, Yuval Carny, Richard Cave, Katherine Chou, Greg Corrado, Carlos De Segovia, Remi Denton, Dotan Emanuel, Ashley Gardner, Oren Gilon, Taylor Goddu, Brigitte Hoyer Gosselink, Jordan Green, Alon Harris, Avinatan Hassidim, Rus Heywood, Sunny Jansen, Pan-Pan Jiang, Anton Kast, Marilyn Ladewig, Ronit Levavi Morad, Bob MacDonald, Alicia Martin, Shakir Mohamed, Philip Nelson, Moriah Royz, Katie Seaver, Joel Shor, Milind Tambe, Aparna Taneja, Divy Thakkar, Jimmy Tobin, Katrin Tomanek, Blake Walsh, Gal Weiss, Kasumi Widner, Lihong Xi, and teams.

Source: Google AI Blog


Google Research, 2022 & beyond: Research community engagement


(This is Part 9 in our series of posts covering different topical areas of research at Google. You can find other posts in the series here.)

Sharing knowledge is essential to Google’s research philosophy — it accelerates technological progress and expands capabilities community-wide. Solving complex problems requires bringing together diverse minds and resources collaboratively. This can be accomplished through building local and global connections with multidisciplinary experts and impacted communities. In partnership with these stakeholders, we bring our technical leadership, product footprint, and resources to make progress against some of society's greatest opportunities and challenges.

We at Google see it as our responsibility to disseminate our work as contributing members of the scientific community and to help train the next generation of researchers. To do this well, collaborating with experts and researchers outside of Google is essential. In fact, just over half of our scientific publications highlight work done jointly with authors outside of Google. We are grateful to work collaboratively across the globe and have only increased our efforts with the broader research community over the past year. In this post, we will talk about some of the opportunities afforded by such partnerships: addressing social challenges together, training the next generation of researchers, collaborating to advance scientific innovations, fueling innovation in products and engineering, and open-sourcing datasets and tools.


Addressing social challenges together

Engaging the wider community helps us progress on seemingly intractable problems. For example, access to timely, accurate health information is a significant challenge among women in rural and densely populated urban areas across India. To solve this challenge, ARMMAN developed mMitra, a free mobile service that sends preventive care information to expectant and new mothers. Adherence to such public health programs is a prevalent challenge, so researchers from Google Research and the Indian Institute of Technology, Madras worked with ARMMAN to design an ML system that alerts healthcare providers about participants at risk of dropping out of the health information program. This early identification helps ARMMAN provide better-targeted support, improving maternal health outcomes.

This plot shows the cumulative engagement drops prevented by our restless multi-armed bandit (RMAB) model compared to the control group (round robin).
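
To make the approach concrete, here is a minimal sketch, in Python, of the kind of budget-constrained intervention policy described above. All numbers, names, and the two-state engagement model are illustrative assumptions, and the myopic index is a crude stand-in for the Whittle-index policies used in the RMAB literature, not ARMMAN's deployed system.

import numpy as np

# Illustrative parameters only; the deployed system fits models to real
# engagement data rather than using these synthetic values.
rng = np.random.default_rng(0)
N, BUDGET, ROUNDS = 1000, 50, 20
p_stay = rng.uniform(0.55, 0.95, size=N)     # P(engaged stays engaged), no intervention
p_recover = rng.uniform(0.05, 0.25, size=N)  # P(disengaged -> engaged), no intervention
lift = rng.uniform(0.05, 0.30, size=N)       # probability boost from intervening

def run(policy, seed=1):
    """Simulate ROUNDS rounds of N two-state (engaged/disengaged) chains,
    intervening on BUDGET beneficiaries per round; count engaged rounds."""
    sim = np.random.default_rng(seed)  # same randomness for every policy
    state = np.ones(N, dtype=int)      # everyone starts engaged
    total = 0
    for t in range(ROUNDS):
        acted = policy(state, t)
        p_engaged = np.where(state == 1, p_stay, p_recover) + acted * lift
        state = (sim.random(N) < np.clip(p_engaged, 0, 1)).astype(int)
        total += state.sum()
    return total

def myopic_policy(state, t):
    """Act on the BUDGET arms with the largest one-step gain from acting:
    a crude stand-in for the Whittle index used in RMAB work."""
    p_passive = np.where(state == 1, p_stay, p_recover)
    gain = np.clip(p_passive + lift, 0, 1) - p_passive
    acted = np.zeros(N, dtype=int)
    acted[np.argsort(-gain)[:BUDGET]] = 1
    return acted

def round_robin_policy(state, t):
    """Control baseline from the plot above: cycle through beneficiaries."""
    acted = np.zeros(N, dtype=int)
    acted[(t * BUDGET + np.arange(BUDGET)) % N] = 1
    return acted

print("engaged beneficiary-rounds, myopic index:", run(myopic_policy))
print("engaged beneficiary-rounds, round robin: ", run(round_robin_policy))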

We also support Responsible AI projects directly for other organizations — including our commitment of $3M to fund the new INSAIT research center based in Bulgaria. Further, to help build a foundation of fairness, interpretability, privacy, and security, we are supporting the establishment of a first-of-its-kind multidisciplinary Center for Responsible AI with a grant of $1M to the Indian Institute of Technology, Madras.



Training the next generation of researchers

Part of our responsibility in guiding how technology affects society is to help train the next generation of researchers. One example is our support for equitable student persistence in computing research through the Computer Science Research Mentorship Program, in which Googlers have mentored over one thousand students since 2018, 86% of whom identify as part of a historically marginalized group.

We work towards inclusive goals and work across the globe to achieve them. In 2022, we expanded our research interactions and programs to faculty and students across Latin America, which included grants to women in computer science in Ecuador. We partnered with ENS, a university in France, to help fund scholarships for students to train through research. Another example is our collaboration with the Computing Alliance of Hispanic-Serving Institutions (CAHSI) to provide $4.8 million to support more than 30 collaborative research projects and over 3,000 Hispanic students and faculty across a network of Hispanic-serving institutions.

Efforts like these foster the research ecosystem and help the community give back. Through exploreCSR, we partner with universities to provide students with introductory experiences in research, such as Rice University’s regional workshop on applications and research in data science (ReWARDS), which was delivered in rural Peru by faculty from Rice. Similarly, one of our Awards for Inclusion Research led to a faculty member helping startups in Africa use AI.

The funding we provide is most often unrestricted and leads to inspiring results. Last year, for example, Kean University was one of 53 institutions to receive an exploreCSR award. It used the funding to create the Research Recruits Program, a two-semester program designed to give undergraduates an introductory opportunity to participate in research with a faculty mentor. A student at Kean who has a chronic condition requiring him to take different medications every day, a struggle that affects so many, decided to pursue research on the topic with a peer. Their research, set to be published this year, demonstrates an ML solution, built with Google's TensorFlow, that can identify pills with 99.8% certainty when used correctly. Results like these are why we continue to invest in younger generations, further demonstrated by our long-term commitment to funding PhD Fellows every year across the globe.

Building an inclusive ecosystem is imperative. To this end, we've also partnered with the non-profit Black in Robotics (BiR), formed to address the systemic inequities in the robotics community. Together, we established doctoral student awards that provide financial support to graduate students, and we support BiR’s newly established Bay Area Robotics lab. We also help make global conferences accessible to more researchers around the world, for example, by funding 24 students this year to attend Deep Learning Indaba in Tunisia.



Collaborating to advance scientific innovations

In 2022, Google sponsored over 150 research conferences and even more workshops, leading to invaluable engagements with the broader research community. At research conferences, Googlers serve on program committees and organize workshops, tutorials, and numerous other activities to collectively advance the field. Additionally, last year we hosted over 14 dedicated workshops to bring together researchers, such as the 2022 Quantum Symposium, generating new ideas and directions for the research field and further advancing research initiatives. In 2022, we authored 2,400 papers, many of which were presented at leading research conferences, such as NeurIPS, EMNLP, ECCV, Interspeech, ICML, CVPR, and ICLR. More than 50% of these papers were authored in collaboration with researchers beyond Google.

Over the past year, we've expanded our engagement models to facilitate students, faculty, and Google's research scientists coming together across schools to form constructive research triads. One such project, undertaken in partnership with faculty and students from Georgia Tech, aims to develop a robot guide dog with human behavior modeling and safe reinforcement learning. Throughout 2022, we gave over 224 grants to researchers and over $10M in Google Cloud Platform credits for topics ranging from the improvement of algorithms for post-quantum cryptography with collaborators at CNRS in France to fostering cybersecurity research at TU Munich and Fraunhofer AISEC in Germany.

In 2022, we made 22 new multi-year commitments totaling ~$80M to 65 institutions across nine countries, where each year we will host workshops to select over 100 research projects of mutual interest. For example, in a growing partnership, we are supporting the new Max Planck VIA-Center in Germany to work together on robotics. Another large area of investment is a close partnership with four universities in Taiwan (NTU, NCKU, NYCU, NTHU) to increase innovation in silicon chip design and improve competitiveness in semiconductor design and manufacturing. We aim to collaborate by default and were proud to be recently named one of Australia's top collaborating companies.



Fueling innovation in products and engineering

The community fuels innovation at Google. For example, by making it possible for student researchers to work with us on defined research projects, we've experienced both incremental and more dramatic improvements. Together with visiting researchers, we combine information, compute power, and a great deal of expertise to bring about breakthroughs, such as leveraging our undersea internet cables to detect earthquakes. Visiting researchers also worked hand-in-hand with us to develop Minerva, a state-of-the-art solution that came about by training a deep learning model on a dataset that contains quantitative reasoning with symbolic expressions.

Minerva incorporates recent prompting and evaluation techniques to better solve mathematical questions. It then employs majority voting, in which it generates multiple solutions to each question and chooses the most common answer as the solution, thus improving performance significantly.
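
The voting step itself is simple. Below is a minimal sketch, assuming hypothetical sample_solution (a sampled model call) and extract_final_answer (answer parsing and normalization) helpers that stand in for Minerva's actual components.

from collections import Counter

def majority_vote(question, sample_solution, extract_final_answer, k=32):
    """Sample k solutions to a question and return the most common final
    answer. Both helper callables are hypothetical stand-ins: one draws a
    chain-of-thought solution from the model (at nonzero temperature), the
    other parses and normalizes the final answer out of the solution text."""
    answers = []
    for _ in range(k):
        solution = sample_solution(question)
        answer = extract_final_answer(solution)
        if answer is not None:
            answers.append(answer)
    if not answers:
        return None
    # Majority voting: the mode of the sampled final answers wins.
    return Counter(answers).most_common(1)[0][0]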



Open-sourcing datasets and tools

Engaging with the broader research community is a core part of our efforts to build a more collaborative ecosystem. We support the general advancement of ML and related research through the release of open-source code and datasets. We continued to grow open source datasets in 2022, for example, in natural language processing and vision, and expanded our global index of available datasets in Google Dataset Search. We also continued to release sustainability data via Data Commons and invite others to use it for their research. See some of the datasets and tools we released in 2022 listed below.


Datasets:

  • Auto-Arborist: A multiview urban tree classification dataset that consists of ~2.6M trees covering >320 genera, which can aid in the development of models for urban forest monitoring.
  • Bazel GitHub Metrics: A dataset with GitHub download counts of release artifacts from selected bazelbuild repositories.
  • BC-Z demonstration: Episodes of a robotic arm performing 100 different manipulation tasks. Data for each episode includes the RGB video, the robot's end-effector positions, and the natural language embedding.
  • BEGIN V2: A benchmark dataset for evaluating dialog systems and natural language generation metrics.
  • CLSE (Corpus of Linguistically Significant Entities): A dataset of named entities annotated by linguistic experts. It includes 34 languages and 74 different semantic types to support various applications from airline ticketing to video games.
  • CocoChorales: A dataset consisting of over 1,400 hours of audio mixtures containing four-part chorales performed by 13 instruments, all synthesized with realistic-sounding generative models.
  • Crossmodal-3600: A geographically diverse dataset of 3,600 images, each annotated with human-generated reference captions in 36 languages.
  • CVSS (A Massively Multilingual Speech-to-Speech Translation Corpus): A Common Voice-based speech-to-speech translation corpus that includes 2,657 hours of speech-to-speech translation sentence pairs from 21 languages into English.
  • DSTC11 Challenge Task: This challenge evaluates task-oriented dialog systems end-to-end, from users' spoken utterances to inferred slot values.
  • EditBench: A comprehensive diagnostic and evaluation dataset for text-guided image editing.
  • Few-shot Regional Machine Translation: FRMT is a few-shot evaluation dataset containing en-pt and en-zh bitexts translated from Wikipedia, in two regional varieties for each non-English language (pt-BR and pt-PT; zh-CN and zh-TW).
  • Google Patent Phrase Similarity: A human-rated contextual phrase-to-phrase matching dataset focused on technical terms from patents.
  • Hinglish-TOP: The largest code-switched semantic parsing dataset, with 10k entries annotated by humans and 170K utterances generated using the CST5 augmentation technique introduced in the paper.
  • ImPaKT: A dataset that contains semantic parsing annotations for 2,489 sentences from shopping web pages in the C4 corpus, corresponding to annotations of 3,719 expressed implication relationships and 6,117 typed and summarized attributes.
  • InFormal: A formality style transfer dataset for four Indic languages, made up of sentence pairs with gold labels identifying the more formal sentence and their semantic similarity.
  • MAVERICS: A suite of test-only visual question answering datasets, created from Visual Question Answering image captions with question answering validation and manual verification.
  • MetaPose: A dataset with 3D human poses and camera estimates predicted by the MetaPose model for a subset of the public Human3.6M dataset, with input files necessary to reproduce these results from scratch.
  • MGnify proteins: A 2.4B-sequence protein database with annotations.
  • MiQA (Metaphorical Inference Questions and Answers): MiQA assesses the capability of language models to reason with conventional metaphors. It combines the previously isolated topics of metaphor detection and commonsense reasoning into a single task that requires a model to make inferences by selecting between the literal and metaphorical register.
  • MT-Opt: A dataset of task episodes collected across a fleet of real robots, following the RLDS format to represent steps and episodes.
  • MultiBERTs Predictions on Winogender: Predictions of BERT on Winogender before and after several different interventions.
  • Natural Language Understanding Uncertainty Evaluation: NaLUE is a relabelled and aggregated version of three large NLU corpora: CLINC150, Banks77 and HWU64. It contains 50k utterances spanning 18 verticals, 77 domains, and ~260 intents.
  • NewsStories: A collection of URL links to publicly available news articles with their associated images and videos.
  • Open Images V7: Open Images V7 expands the Open Images dataset with new point-level label annotations, which provide localization information for 5.8k classes, and a new all-in-one visualization tool for better data exploration.
  • Pfam-NUniProt2: A set of 6.8 million new protein sequence annotations.
  • Re-contextualizing Fairness in NLP for India: A dataset of region- and religion-based societal stereotypes in India, with a list of identity terms and templates for reproducing the results from the "Re-contextualizing Fairness in NLP" paper.
  • Scanned Objects: A dataset with 1,000 common household objects that have been 3D scanned for use in robotic simulation and synthetic perception research.
  • Specialized Rater Pools: This dataset comes from a study designed to understand whether annotators with different self-described identities interpret toxicity differently. It contains the unaggregated toxicity annotations of 25,500 comments from pools of raters who self-identify as African American, LGBTQ, or neither.
  • UGIF: A multi-lingual, multi-modal UI grounded dataset for step-by-step task completion on the smartphone.
  • UniProt Protein Names: Data release of ~49M protein name annotations predicted from their amino acid sequence.
  • Upwelling irradiance from GOES-16: Four years of outgoing longwave radiation and reflected shortwave radiation data that climate researchers can use to analyze important climate forcers, such as aircraft condensation trails.
  • UserLibri: The UserLibri dataset reorganizes the existing popular LibriSpeech dataset into individual "user" datasets consisting of paired audio-transcript examples and domain-matching text-only data for each user. This dataset can be used for research in speech personalization or other language processing fields.
  • VideoCC: A dataset containing (video-URL, caption) pairs for training video-text machine learning models.
  • Wiki-conciseness: A manually curated evaluation set in English for concise rewrites of 2,000 Wikipedia sentences.
  • Wikipedia Translated Clusters: Introductions to English Wikipedia articles and their parallel versions in 10 other languages, with machine translations to English. Also includes synthetic corruptions to the English versions, to be identified with NLI models.
  • Workload Traces 2022: A dataset with traces that aim to help system designers better understand warehouse-scale computing workloads and develop new solutions for front-end and data-access bottlenecks.


Tools:

  • Differential Privacy Open Source Library: An open-source library to enable developers to use analytic techniques based on DP.
  • Mood Board Search: The result of collaborative work with artists, photographers, and image researchers to demonstrate how ML can enable people to visually explore subjective concepts in image datasets.
  • Project Relate: An Android beta app that uses ML to help people with non-standard speech make their voices heard.
  • TensorStore: An open-source C++ and Python library designed for storage and manipulation of n-dimensional data, which can address key engineering challenges in scientific computing through better management and processing of large datasets.
  • The Data Cards Playbook: A toolkit for transparency in dataset documentation.


Conclusion

Research is an amplifier, an accelerator, and an enabler — and we are grateful to partner with so many incredible people to harness it for the good of humanity. Even when investing in research that advances our products and engineering, we recognize that, ultimately, this fuels what we can offer our users. We welcome more partners to engage with us and maximize the benefits of AI for the world.


Acknowledgements

Thank you to our many research partners across the globe, including academics, universities, NGOs, and research organizations, for continuing to engage and work with Google on exciting research efforts. There are many teams within Google who make this work possible, including Google’s research teams and community, research partnerships, education, and policy teams. Finally, I would especially like to thank those who provided helpful feedback in the development of this post, including Sepi Hejazi Moghadam, Jill Alvidrez, Melanie Saldaña, Ashwani Sharma, Adriana Budura Skobeltsyn, Aimin Zhu, Michelle Hurtado, Salil Banerjee and Esmeralda Cardenas.



Google Research, 2022 & beyond

This was the ninth and final blog post in the “Google Research, 2022 & Beyond” series. You can find the other posts in the series here.


Source: Google AI Blog


Announcing the 2019 Google Faculty Research Award Recipients



In Fall 2019, we opened our annual call for the Google Faculty Research Awards, a program focused on supporting world-class technical research in Computer Science, Engineering, and related fields performed at academic institutions around the world. These awards give Google researchers the opportunity to partner with faculty who are doing impactful research, and they additionally cover tuition for a student.

This year we received 917 proposals from ~50 countries and over 330 universities, and had the opportunity to increase our investment in several research areas related to Health, Accessibility, AI for Social Good, and ML Fairness. All proposals went through an extensive review process involving 1,100 expert reviewers across Google who assessed the proposals on merit, innovation, connection to Google’s products/services, and alignment with our overall research philosophy.

As a result of these reviews, Google is funding 150 promising proposals across a wide range of research areas, including Machine Learning, Systems, Human-Computer Interaction, and many more, with 26% of the funding awarded to universities outside the United States. Additionally, 27% of our recipients this year identified as part of a group historically underrepresented within technology. This is just the beginning of a larger investment in underrepresented communities, and we look forward to sharing our 2020 initiatives soon.

Congratulations to the well-deserving recipients of this round's awards. More information on our faculty funding programs can be found on our website.

Source: Google AI Blog


Google Faculty Research Awards 2018



We just completed another round of the Google Faculty Research Awards, our annual open call for proposals on computer science and related topics, such as quantum computing, machine learning, algorithms and theory, natural language processing and more. Our grants cover tuition for a graduate student and provide both faculty and students the opportunity to work directly with Google researchers and engineers.

This round we received 910 proposals covering 40 countries and over 320 universities. After expert reviews and committee discussions, we decided to fund 158 projects. The subject areas that received the most support this year were human computer interaction, machine learning, machine perception, and systems.

Congratulations to the well-deserving recipients of this round's awards. More information on how to apply for the next round will be available at the end of the summer on our website. You can find award recipients from previous years here.

Source: Google AI Blog


Announcing the Google Cloud Platform Research Credits Program



Scientists across nearly every discipline are researching ever larger and more complex data sets, using tremendous amounts of compute power to learn, make discoveries and build new tools that few could have imagined only a few years ago. Traditionally, this kind of research has been limited by the availability of resources, with only the largest universities or industry partners able to successfully pursue these endeavors. However, the power of cloud computing has been removing obstacles that many researchers used to face, enabling projects that use machine learning tools to understand and address student questions and that study robotic interactions with humans, among many others.

In order to ensure that more researchers have access to powerful cloud tools, we’re launching Google Cloud Platform (GCP) research credits, a new program aimed to support faculty in qualified regions who want to take advantage of GCP’s compute, analytics, and machine-learning capabilities for research. Higher education researchers can use GCP research credits in a multitude of ways — below are just three examples to illustrate how GCP can help propel your research forward.

Andrew V. Sutherland, a computational number theorist and Principal Research Scientist at the Massachusetts Institute of Technology, is one of a growing number of academic researchers who has already made the transition and benefited from GCP. His team moved his extremely large database to GCP because “we are mathematicians who want to focus on our research, and not have to worry about hardware failures or scaling issues with the website.”

Ryan Abernathey, Assistant Professor of Earth and Environmental Sciences, Ocean and Climate Physics at the Lamont-Doherty Earth Observatory at Columbia University, used Google Cloud credits through an NSF partnership and, with his team, developed an open-source platform to manage the complex data sets of climate science. The platform, called Pangeo, can run Earth System Modeling simulations on petabytes of high-resolution, three-dimensional data. “This is the future of what day-to-day science research computing will look like,” he predicts.

At the Stanford Center for Genomics and Personalized Medicine (SCGPM), researchers using GCP and BigQuery can now run hundreds of genomes through a variant analysis pipeline and get query results quickly. Mike Snyder, director of SCGPM, notes, “We’re entering an era where people are working with thousands or tens of thousands or even million genome projects, and you’re never going to do that on a local cluster very easily. Cloud computing is where the field is going.”

The GCP research credits program is open to faculty doing cutting-edge research in eligible countries. We’re eager to hear how we can help accelerate your progress. If you’re interested, you can learn more on our website or apply now.

Source: Google AI Blog


Google Faculty Research Awards 2017



We’ve just completed another round of the Google Faculty Research Awards, our annual open call for proposals on computer science and related topics such as machine learning, machine perception, natural language processing, and quantum computing. Our grants cover tuition for a graduate student and provide both faculty and students the opportunity to work directly with Google researchers and engineers.

This round we received 1033 proposals covering 46 countries and over 360 universities. After expert reviews and committee discussions, we decided to fund 152 projects. The subject areas that received the most support this year were human computer interaction, machine learning, machine perception, and systems. Here are a few observations from this round:
  • There was a 17% increase in the total number of proposals received
  • There was an 87% increase in the number of proposals from Asia Pacific universities
  • Proposals focused on Computational Neuroscience increased 53%
  • Proposals focused on Quantum Computing more than doubled this round
Congratulations to the well-deserving recipients of this round’s awards. If you are interested in applying for the next round (September 2018 deadline), please visit our website for more information. You can find award recipients from previous years here.

Source: Google AI Blog


Google Research Awards 2016



We’ve just completed another round of the Google Research Awards, our annual open call for proposals on computer science and related topics including machine learning, machine perception, natural language processing, and security. Our grants cover tuition for a graduate student and provide both faculty and students the opportunity to work directly with Google researchers and engineers.

This round we received 876 proposals covering 44 countries and over 300 universities. After expert reviews and committee discussions, we decided to fund 143 projects.


Congratulations to the well-deserving recipients of this round’s awards. If you are interested in applying for the next round (deadline is September 30th), please visit our website for more information.

An Update on Fast Transit Routing with Transfer Patterns



What is the best way to get from A to B by public transit? Google Maps is answering such queries for over 20,000 cities and towns in over 70 countries around the world, including large metro areas like New York, São Paulo or Moscow, and some complete countries, such as Japan or Great Britain.

Since its beginnings in 2005 with the single city of Portland, Oregon, the number of cities and countries served by Google’s public transit directions has been growing rapidly. With more and larger regions, the amount of data we need to search in order to provide optimal directions has grown as well. In 2010, the search speed of transit directions made a leap ahead of that growth and became fast enough to update the result while you drag the endpoints. The technique behind that speed-up is the Transfer Patterns algorithm [1], which was created at Google’s engineering office in Zurich, Switzerland, by visiting researcher Hannah Bast and a number of Google engineers.

I am happy to report that this research collaboration has continued and expanded with the Google Focused Research Award on Next-Generation Route Planning. Over the past three years, this grant has supported Hannah Bast’s research group at the University of Freiburg, as well as the research groups of Peter Sanders and Dorothea Wagner at the Karlsruhe Institute of Technology (KIT).

From the project’s numerous outcomes, I’d like to highlight two recent ones that re-examine the Transfer Patterns approach and massively improve it for continent-sized networks: Scalable Transfer Patterns [2] and Frequency-Based Search for Public Transit [3] by Hannah Bast, Sabine Storandt and Matthias Hertel. This blogpost presents the results from these publications.

The notion of a transfer pattern is easy to understand. Suppose you are at a transit stop downtown, call it A, and want to go to some stop B as quickly as possible. Suppose further you brought a printed schedule book but no smartphone. (This sounded plausible only a few years ago!) As a local, you might know that there are only two reasonable options:
  1. Take a tram from A to C, then transfer at C to a bus to B.
  2. Take the direct bus from A to B, which only runs infrequently.
We say the first option has transfer pattern A-C-B, and the second option has transfer pattern A-B. Notice that no in-between stops are mentioned. This is very compact information, much less than the actual schedules, but it makes looking up the schedules significantly faster: Knowing that all optimal trips follow one of these patterns, you only need to look at those lines in the schedule book that provide direct connections from A to C, C to B and A to B. All other lines can safely be ignored: you know you will not miss a better option.
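
To make the lookup concrete, here is a small sketch in Python. The timetable encoding and the times are illustrative assumptions, not the representation used in the papers; the point is that only the direct connections named by the patterns are ever consulted.

# A direct-connection timetable: (from_stop, to_stop) -> sorted list of
# (departure_minute, arrival_minute) pairs. Illustrative numbers only.
timetable = {
    ("A", "C"): [(0, 12), (10, 22), (20, 32)],   # tram from A to C
    ("C", "B"): [(15, 30), (25, 40), (35, 50)],  # bus from C to B
    ("A", "B"): [(5, 45)],                        # infrequent direct bus
}

def earliest_arrival(leg, depart_after):
    """Earliest arrival on a direct leg, departing no earlier than
    depart_after; None if no such trip exists."""
    for dep, arr in timetable.get(leg, []):
        if dep >= depart_after:
            return arr
    return None

def best_arrival(patterns, start_time):
    """Evaluate each transfer pattern leg by leg; only the direct
    connections named by the patterns are ever looked at."""
    best = None
    for pattern in patterns:
        t = start_time
        for leg in zip(pattern, pattern[1:]):
            t = earliest_arrival(leg, t)
            if t is None:
                break
        if t is not None and (best is None or t < best):
            best = t
    return best

# The two transfer patterns from the example: A-C-B and A-B.
print(best_arrival([("A", "C", "B"), ("A", "B")], start_time=0))  # -> 30

Running this on the introductory example yields an arrival at minute 30 via the tram to C and the onward bus, beating the infrequent direct bus, without ever scanning lines outside the two patterns.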

While the basic idea of transfer patterns is indeed that simple, it takes more to make it work in practice. The transfer patterns of all optimal trips have to be computed ahead of time and stored, so that they are available to answer queries. Conceptually, we need transfer patterns for every pair of stops, because any pair could come up in a query. It is perfectly reasonable to compute them for all pairs within one city, or even one metro area that is densely criss-crossed by a transit network comprising, say, a thousand stops, yielding a million pairs to consider.

As the scale of the problem increases from one metro area to an entire country or continent, this “all pairs” approach rapidly becomes expensive: ten thousand stops (10x more than above) already yield a hundred million pairs (100x more than above), and so on. Also, the individual transfer patterns become quite repetitive: For example, from any stop in Paris, France to any stop in Cologne, Germany, all optimal connections end up using the same few long-distance train lines in the middle, only the local connections to the railway stations depend on the specific pair of stops considered.

However, designated long-distance connections are not the only way to travel between different local networks – they also overlap and connect to each other. For mid-range trips, there is no universally correct rule when to choose a long-distance train or intercity bus, short of actually comparing options with local or regional transit, too.

The Scalable Transfer Patterns algorithm [2] does just that, but in a smart way. For starters, it uses what is known as graph clustering to cut the network into pieces, called clusters, that have a lot of connections inside but relatively few to the outside. As an example, the figure below (kindly provided by the authors) shows a partitioning of Germany into clusters. The stops highlighted in red are border stops: They connect directly to stops outside the cluster. Notice how they are a small fraction of the total network.
The public transit network of Germany (dots and lines), split into clusters (shown in various colors). Of all 251,763 stops, only 10,886 (4.32%) are boundary stops, highlighted as red boxes. [source: S. Storandt, 2016]
Based on the clustering, the transfer patterns of all optimal connections are computed in two steps.

In step 1, transfer patterns are computed for optimal connections inside each cluster. They are stored for query processing later on, but they also accelerate the search through a cluster in the following step: between the stops on its border, we only need to consider the connections captured in the transfer patterns.

The next figure sketches how the transit network in the cluster around Berlin gets reduced to far fewer connections between border stations. (The central station stands out as a hub, as expected. It is a border station itself, because it has direct connections out of the cluster.)
The cluster of public transit connections around Berlin (shown as dots and lines in light blue), its border stops (highlighted as red boxes), and the transfer patterns of optimal connections between border stops (thick black lines; only the most important 111 of 592 are shown to keep the image legible). This cuts out 96.15% of the network (especially a lot of the high-frequency inner city trips, which makes the time savings even bigger). [source: S. Storandt, 2016]
In step 2, transfer patterns can be computed for the entire network, that is, between any pair of clusters. This is done with the following twists:

  • It suffices to consider trips from and to boundary stops of any cluster; the local transfer patterns from step 1 will supply the missing pieces later on.
  • The per-cluster transfer patterns from step 1 greatly accelerate the search across other clusters.
  • The search stops exploring any possible connection between two boundary stops as soon as it gets worse than a connection that sticks to long-distance transit between clusters (which may not always be optimal, but is always quick to compute).

The results of steps 1 and 2 are stored and used to answer queries. For any given query from some A to some B, one can now easily stitch together a network of transfer patterns that covers all optimal connections from A to B. Looking up the direct connections on that small network (like in the introductory example) and finding the best one for the queried time is very fast, even if A and B are far away.
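
A sketch of that stitching step, under assumed data structures (sets of stop-tuple patterns keyed by stop pairs); none of these names come from the papers, and edge cases such as A or B being border stops themselves are omitted.

def stitch_query_patterns(a, b, cluster_of, local_patterns, border_patterns):
    """Combine precomputed patterns into a small query network: local
    patterns from a to its cluster's border stops, border-to-border
    patterns between clusters (step 2), and local patterns from the
    border stops of b's cluster down to b (step 1). Both pattern maps
    hold sets of transfer patterns (tuples of stops) per stop pair."""
    if cluster_of[a] == cluster_of[b]:
        return local_patterns.get((a, b), set())
    stitched = set()
    for (x, y), long_haul in border_patterns.items():
        # Pairs whose border stops are unreachable locally contribute
        # nothing, because the local lookups below come back empty.
        for first in local_patterns.get((a, x), set()):
            for mid in long_haul:
                for last in local_patterns.get((y, b), set()):
                    # Join the three pieces, dropping duplicated junctions.
                    stitched.add(first + mid[1:] + last[1:])
    return stitched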

The total storage space needed for this is much smaller than the space that would be needed for all pairs of stops, and the advantage grows with the size of the network. Extrapolating from their experiments, the researchers estimate [2] that Scalable Transfer Patterns for the whole world could be stored in 30 GB, cutting their estimate for the basic Transfer Patterns by a factor of a thousand(!). This is considerably more powerful than the “hub station” idea from the original Transfer Patterns paper [1].

The time needed to compute Scalable Transfer Patterns is also estimated to shrink by three orders of magnitude: At a high level, the earlier phases of the algorithm accelerate the later ones, as described above. At a low level, a second optimization technique kicks in: exploiting the repetitiveness of schedules in time. Recall that finding transfer patterns is all about finding the optimal connections between pairs of stops at any possible departure time.

Frequency-based schedules (e.g., one bus every 10 minutes) cause a lot of similarity during the day, although it often doesn’t match up between lines (e.g., said bus runs every 10 minutes before 6pm and every 20 minutes after, and we seek connections to a train that runs every 12 minutes before 8pm and every 30 minutes after). Moreover, this similarity also exists from one day to the next, and we need to consider all permissible departure dates.

The Frequency-Based Search for Public Transit [3] is carefully designed to find and take advantage of repetitive schedules while representing all one-off cases exactly. Compared to the setup from the original Transfer Patterns paper [1], the authors estimate a whopping 60x acceleration of finding transfer patterns from this part alone.
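
The flavor of the frequency-based idea can be sketched as follows: fold sorted departure times into compact (first, last, headway) runs, with one-off trips kept exactly. The greedy folding and the encoding are simplifying assumptions for illustration, not the paper's exact representation.

def compress_departures(departures):
    """Greedily fold a sorted list of departure minutes into
    (first, last, headway) runs; one-off trips become (t, t, 0)."""
    runs, i = [], 0
    while i < len(departures):
        j = i + 1
        if j < len(departures):
            headway = departures[j] - departures[i]
            while (j + 1 < len(departures)
                   and departures[j + 1] - departures[j] == headway):
                j += 1
            if j > i + 1:  # at least three departures share the headway
                runs.append((departures[i], departures[j], headway))
                i = j + 1
                continue
        runs.append((departures[i], departures[i], 0))
        i += 1
    return runs

def next_departure(runs, t):
    """First departure at or after time t, scanning the compact runs."""
    best = None
    for first, last, headway in runs:
        if t <= first:
            cand = first
        elif t <= last:
            steps = -(-(t - first) // headway)  # ceiling division
            cand = first + steps * headway
        else:
            continue
        if best is None or cand < best:
            best = cand
    return best

# A bus every 10 minutes before 6pm (minute 1080), every 20 minutes after:
deps = list(range(360, 1080, 10)) + list(range(1080, 1440, 20))
runs = compress_departures(deps)
print(runs, next_departure(runs, 1075))  # two runs; next departure is 1080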

I am excited to see that the scalability questions originally left open by [1] have been answered so convincingly as part of this Focused Research Award. Please see the list of publications on the project’s website for more outcomes of this award. Besides more on transfer patterns, they contain a wealth of other results about routing on road networks, transit networks, and with combinations of travel modes.

References:

[1] Fast Routing in Very Large Public Transportation Networks Using Transfer Patterns
by H. Bast, E. Carlsson, A. Eigenwillig, R. Geisberger, C. Harrelson, V. Raychev and F. Viger
(ESA 2010). [doi]

[2] Scalable Transfer Patterns
by H. Bast, M. Hertel and S. Storandt (ALENEX 2016). [doi]

[3] Frequency-based Search for Public Transit
by H. Bast and S. Storandt (SIGSPATIAL 2014). [doi]