New York City is home to several Google algorithms research groups. We collaborate closely with the teams behind many Google products and work on a wide variety of algorithmic challenges, such as optimizing infrastructure, protecting privacy, and improving friend suggestions.
Today, we’re excited to provide more insights into the research done in the Big Apple with the launch of the NYC Algorithms and Optimization Team page. The NYC Algorithms and Optimization Team comprises multiple overlapping research groups working on large-scale graph mining, large-scale optimization and market algorithms.
Large-scale Graph Mining
The Large-scale Graph Mining Group is tasked with building the most scalable library for graph algorithms and analysis and applying it to a multitude of Google products. We formalize data mining and machine learning challenges as graph algorithms problems and perform fundamental research in those fields, leading to publications in top venues.
Our projects include:
- Large-scale Similarity Ranking: Our research in pairwise similarity ranking has produced a number of innovative methods, which we have published in top venues such as WWW, ICML, and VLDB, e.g., improving friend suggestion using ego-networks and computing similarity rankings in large-scale multi-categorical bipartite graphs.
- Balanced Partitioning: Balanced partitioning is often a crucial first step in solving large-scale graph optimization problems. As our paper shows, we are able to achieve a 15-25% reduction in cut size compared to state-of-the-art algorithms in the literature.
- Clustering and Connected Components: We have state-of-the-art implementations of many different algorithms including hierarchical clustering, overlapping clustering, local clustering, spectral clustering, and connected components. Our methods are 10-30x faster than the best previously studied algorithms and can scale to graphs with trillions of edges.
- Public-private Graph Computation: Our research on novel models of graph computation based on a personal view of private data preserves the privacy of each user.
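Connected components is a good example of how simple these graph primitives look at their core. Below is a minimal single-machine union-find sketch; this is illustrative only, as the group's library relies on distributed algorithms to reach trillion-edge scale:

```python
class UnionFind:
    """Disjoint-set forest with path compression and union by rank."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        while self.parent[x] != x:
            # Path compression (halving): point x at its grandparent.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1


def connected_components(num_nodes, edges):
    """Return a label per node; nodes share a label iff they are connected."""
    uf = UnionFind(num_nodes)
    for u, v in edges:
        uf.union(u, v)
    return [uf.find(x) for x in range(num_nodes)]


# Two components: {0, 1, 2} and {3, 4}.
labels = connected_components(5, [(0, 1), (1, 2), (3, 4)])
```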
Large-scale Optimization
The Large-scale Optimization Group’s mission is to develop large-scale optimization techniques and use them to improve the efficiency and robustness of infrastructure at Google. We apply techniques from areas such as combinatorial optimization, online algorithms, and control theory to make Google’s massive computational infrastructure do more with less. We combine online and offline optimizations to achieve such goals as increasing throughput, decreasing latency, minimizing resource contention, maximizing the efficacy of caches, and eliminating unnecessary work in distributed systems.
Our research is used in critical infrastructure that supports core products:
- Consistent Hashing: We designed memoryless balanced allocation algorithms to assign a dynamic set of clients to a dynamic set of servers such that the load on each server is bounded, and the allocation does not change by much for every update operation. This technique is currently implemented in Google Cloud Pub/Sub and externally in the open-source haproxy.
- Distributed Optimization Based on Core-sets: Composable core-sets provide an effective method for solving optimization problems on massive datasets. This technique can be used for several problems including distributed balanced clustering and distributed submodular maximization.
- Google Search Infrastructure Optimization: We partnered with the Google Search infrastructure team to build a distributed feedback control loop to govern the way queries are fanned out to machines. We also improved the efficacy of caching by increasing the homogeneity of the stream of queries seen by any single machine.
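The bounded-load flavor of consistent hashing described above can be illustrated with a small sketch. The class name, parameter names, and capacity rule here are our own simplifications, not the algorithm as deployed in Cloud Pub/Sub or haproxy: each client walks clockwise around a hash ring to the first server whose load is below roughly (1 + ε) times the average.

```python
import hashlib
from bisect import bisect_left
from math import ceil


def _hash(key):
    # Stable 64-bit hash of a string key.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class BoundedLoadRing:
    """Sketch of consistent hashing with bounded loads: each client walks
    clockwise from its hash position to the first server with spare capacity,
    so no server ever exceeds ceil((1 + eps) * clients / servers)."""

    def __init__(self, servers, eps=0.25):
        self.eps = eps
        self.points = sorted((_hash(s), s) for s in servers)
        self.load = {s: 0 for s in servers}
        self.assignment = {}

    def _capacity(self, num_clients):
        return ceil((1 + self.eps) * num_clients / len(self.load))

    def assign(self, client):
        cap = self._capacity(len(self.assignment) + 1)
        i = bisect_left(self.points, (_hash(client), ""))
        for step in range(len(self.points)):
            _, server = self.points[(i + step) % len(self.points)]
            if self.load[server] < cap:
                self.load[server] += 1
                self.assignment[client] = server
                return server
        # Unreachable: total capacity always exceeds the number of clients.
        raise RuntimeError("no capacity")


ring = BoundedLoadRing(["s0", "s1", "s2", "s3"], eps=0.25)
for c in range(100):
    ring.assign(f"client-{c}")
# No server holds more than ceil(1.25 * 100 / 4) = 32 clients.
```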
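The composable core-set pattern can likewise be sketched in a few lines for max coverage, a special case of submodular maximization. The function names and single-process "machines" below are illustrative; real deployments run the per-partition step in parallel:

```python
def greedy_max_coverage(sets, k):
    """Classic greedy for max k-cover: repeatedly pick the set that covers
    the most yet-uncovered elements."""
    covered, chosen = set(), []
    for _ in range(min(k, len(sets))):
        best = max(range(len(sets)), key=lambda i: len(sets[i] - covered))
        if not sets[best] - covered:
            break  # nothing new to cover
        chosen.append(sets[best])
        covered |= sets[best]
    return chosen, covered


def composable_coreset_cover(partitions, k):
    """Each machine reduces its partition to a small core-set with greedy;
    a final greedy pass over the union of core-sets picks the answer."""
    coreset = []
    for part in partitions:  # conceptually runs on separate machines
        chosen, _ = greedy_max_coverage(part, k)
        coreset.extend(chosen)
    return greedy_max_coverage(coreset, k)  # aggregation step


# Toy data: two "machines", each holding a few sets.
machine1 = [{1, 2, 3}, {3, 4}, {5}]
machine2 = [{6, 7, 8, 9}, {1, 6}, {9, 10}]
chosen, covered = composable_coreset_cover([machine1, machine2], k=2)
```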
Market Algorithms
The Market Algorithms Group analyzes, designs, and delivers economically and computationally efficient marketplaces across Google. Our research serves to optimize display ads for DoubleClick’s reservation ads and exchange, as well as sponsored search and mobile ads.
In the past few years, we have explored a number of areas, including:
- Display Ads Research: The display ads ecosystem provides a great platform for a variety of research problems in online stochastic optimization and computational economics, such as whole-page optimization and optimal contract design. An important part of this research is dedicated to auction optimization for advertising exchanges, where we deal with auctions with intermediaries, optimal pricing strategies, and optimal yield management for reservation contracts and ad exchanges.
- Online Stochastic Matching: We have developed new algorithms for online stochastic matching, budgeted allocation, handling traffic spikes, and more general variants of the problem, called submodular welfare maximization.
- Robust Stochastic Allocation: In one paper, we study online algorithms that achieve a good performance in both adversarial and stochastic arrival models. In another paper, we develop a hybrid model and algorithms with approximation factors that change as a function of the accuracy of the forecast.
- Optimizing Advertiser Campaigns: We have studied algorithmic questions such as positive carryover effects, budget optimization in search-based auctions, and concise bid optimization strategies with multiple budget constraints.
- Dynamic Mechanism Design: We have developed efficient mechanisms for sophisticated settings that occur in internet advertising, such as online settings and polyhedral constraints. We have also designed a new family of dynamic mechanisms, called bank account mechanisms, and showed their effectiveness in designing non-clairvoyant dynamic mechanisms that can be implemented without relying on forecasting the future steps.
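As a baseline for the online matching problems mentioned above, here is a sketch of the simple greedy rule, which is 1/2-competitive in the adversarial model; the stochastic algorithms aim to beat this guarantee when arrivals follow a known distribution. All names here are illustrative:

```python
def online_greedy_matching(arrivals):
    """Online bipartite matching: each arriving item (e.g., an ad impression)
    reports the offline nodes (advertisers) it is compatible with; greedy
    irrevocably matches it to the first still-free one. Any maximal matching,
    including this one, has at least half the size of the optimum."""
    matched = {}  # advertiser -> impression
    for impression, advertisers in arrivals:
        for a in advertisers:
            if a not in matched:
                matched[a] = impression
                break  # each impression is matched at most once
    return matched


# Impressions arrive one at a time with their eligible advertisers.
# Greedy matches i1 -> a1 and i3 -> a2, leaving i2 unmatched, while the
# offline optimum would match all three.
stream = [("i1", ["a1", "a2"]), ("i2", ["a1"]), ("i3", ["a2", "a3"])]
m = online_greedy_matching(stream)
```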
It is our hope that, with the help of this new Google NYC Algorithms and Optimization Team page, we can more effectively share our work and broaden our dialogue with the research and engineering community. Please visit the site to learn about our latest projects, publications, seminars, and research areas!