OptFormer: Towards Universal Hyperparameter Optimization with Transformers

One of the most important aspects in machine learning is hyperparameter optimization, as finding the right hyperparameters for a machine learning task can make or break a model’s performance. Internally, we regularly use Google Vizier as the default platform for hyperparameter optimization. Throughout its deployment over the last 5 years, Google Vizier has been used more than 10 million times, over a vast class of applications, including machine learning applications from vision, reinforcement learning, and language but also scientific applications such as protein discovery and hardware acceleration. As Google Vizier is able to keep track of use patterns in its database, such data, usually consisting of optimization trajectories termed studies, contain very valuable prior information on realistic hyperparameter tuning objectives, and are thus highly attractive for developing better algorithms.

While there have been many previous methods for meta-learning over such data, they share one major drawback: their meta-learning procedures depend heavily on numerical constraints such as the number of hyperparameters and their value ranges, and thus require all tasks to use the exact same hyperparameter search space (i.e., tuning specifications). Additional textual information in the study, such as its description and parameter names, is also rarely used, yet can hold meaningful information about the type of task being optimized. This drawback is exacerbated for larger datasets, which often contain significant amounts of such meaningful information.

Today in “Towards Learning Universal Hyperparameter Optimizers with Transformers”, we are excited to introduce the OptFormer, one of the first Transformer-based frameworks for hyperparameter tuning, learned from large-scale optimization data using flexible text-based representations. While numerous works have previously demonstrated the Transformer’s strong abilities across various domains, few have touched on its optimization-based capabilities, especially over text space. Our core findings demonstrate for the first time some intriguing algorithmic abilities of Transformers: 1) a single Transformer network is capable of imitating highly complex behaviors from multiple algorithms over long horizons; 2) the network is further capable of predicting objective values very accurately, in many cases surpassing Gaussian Processes, which are commonly used in algorithms such as Bayesian Optimization.

Approach: Representing Studies as Tokens
Rather than using only numerical data, as is common with previous methods, our approach borrows concepts from natural language and represents all of the study data as a sequence of tokens, including textual information from initial metadata. In the animation below, this includes “CIFAR10”, “learning rate”, “optimizer type”, and “Accuracy”, which inform the OptFormer of an image classification task. The OptFormer then generates new hyperparameters to try on the task, predicts the task accuracy, and finally receives the true accuracy, which will be used to generate the next round’s hyperparameters. Using the T5X codebase, the OptFormer is trained in a typical encoder-decoder fashion using standard generative pretraining over a wide range of hyperparameter optimization objectives, including real world data collected by Google Vizier, as well as public hyperparameter (HPO-B) and blackbox optimization benchmarks (BBOB).

The OptFormer can perform hyperparameter optimization encoder-decoder style, using token-based representations. It initially observes text-based metadata (in the gray box) containing information such as the title, search space parameter names, and metrics to optimize, and repeatedly outputs parameter and objective value predictions.
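
To make the token-based representation more concrete, here is a minimal Python sketch of flattening a study into a single text sequence. The separators, the `<bin>` token format, the 1000-bin quantization, and the names `quantize` and `serialize_study` are illustrative assumptions, not the OptFormer's actual tokenization, and only numeric parameters are handled.

```python
from typing import Dict, List, Tuple

Range = Tuple[float, float]


def quantize(value: float, low: float, high: float, num_bins: int = 1000) -> int:
    """Map a value in [low, high] to an integer bin in [0, num_bins - 1]."""
    fraction = (value - low) / (high - low)
    return min(num_bins - 1, max(0, int(fraction * num_bins)))


def serialize_study(
    title: str,
    metric: str,
    search_space: Dict[str, Range],
    metric_range: Range,
    trials: List[Tuple[Dict[str, float], float]],
) -> str:
    """Render text metadata followed by one chunk of tokens per trial."""
    header = f"title:{title} | metric:{metric} | params:{','.join(search_space)}"
    chunks = [header]
    for params, objective in trials:
        # One quantized token per parameter, in search-space order.
        tokens = [f"<{quantize(params[name], *search_space[name])}>" for name in search_space]
        tokens.append("*")  # hypothetical separator between parameters and objective
        tokens.append(f"<{quantize(objective, *metric_range)}>")
        chunks.append(" ".join(tokens))
    return " || ".join(chunks)


if __name__ == "__main__":
    print(
        serialize_study(
            title="CIFAR10",
            metric="Accuracy",
            search_space={"learning rate": (0.0, 1.0)},
            metric_range=(0.0, 1.0),
            trials=[({"learning rate": 0.5}, 0.25)],
        )
    )
```

Because every study becomes a string, studies with different numbers of parameters or different value ranges can share one model input format, which is the property the text-based representation is meant to provide.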

Imitating Policies
As the OptFormer is trained over optimization trajectories by various algorithms, it may now accurately imitate such algorithms simultaneously. By providing a text-based prompt in the metadata for the designated algorithm (e.g. “Regularized Evolution”), the OptFormer will imitate the algorithm’s behavior.

Over an unseen test function, the OptFormer produces optimization curves nearly identical to those of the original algorithm. Mean and standard deviation error bars are shown.
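
Comparisons like this one are usually made on best-so-far curves aggregated over repeated runs. The sketch below shows one way to compute such a curve with a mean and standard deviation at each step, assuming the objective is maximized and every run has the same number of trials. The function names are hypothetical and this is not the evaluation code used for the figure.

```python
from itertools import accumulate
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def best_so_far(objectives: Sequence[float]) -> List[float]:
    """Running maximum of the objective values seen up to each trial."""
    return list(accumulate(objectives, max))


def summarize_runs(runs: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Per-trial (mean, standard deviation) of best-so-far values across runs."""
    curves = [best_so_far(run) for run in runs]
    # zip(*curves) groups the values of all runs at the same trial index.
    return [(mean(step), pstdev(step)) for step in zip(*curves)]


if __name__ == "__main__":
    original_algorithm = [[0.1, 0.3, 0.2, 0.5], [0.2, 0.2, 0.4, 0.4]]
    imitation = [[0.1, 0.2, 0.4, 0.5], [0.3, 0.3, 0.3, 0.4]]
    print(summarize_runs(original_algorithm))
    print(summarize_runs(imitation))
```

If the imitation is faithful, the two summaries should overlap within their error bars at every step.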

Predicting Objective Values
In addition, the OptFormer may now predict the objective value being optimized (e.g. accuracy) and provide uncertainty estimates. We compared the OptFormer’s prediction with a standard Gaussian Process and found that the OptFormer was able to make significantly more accurate predictions. This can be seen below qualitatively, where the OptFormer’s calibration curve closely follows the ideal diagonal line in a goodness-of-fit test, and quantitatively through standard aggregate metrics such as log predictive density.

Left: Rosenblatt Goodness-of-Fit. Closer diagonal fit is better. Right: Log Predictive Density. Higher is better.
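
For readers unfamiliar with these metrics, the sketch below computes both for the simple case of Gaussian predictions, as a Gaussian Process would produce. It is an illustration under that assumption, not the evaluation used in the paper. The probability integral transform in `pit_values` is used here as a stand-in for the calibration check in the left panel, and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def gaussian_log_density(y: float, mu: float, sigma: float) -> float:
    """Log of the normal density N(y | mu, sigma^2)."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (y - mu) ** 2 / (2.0 * sigma ** 2)


def mean_log_predictive_density(
    ys: Sequence[float], mus: Sequence[float], sigmas: Sequence[float]
) -> float:
    """Average log density the predictions assign to the observed values (higher is better)."""
    terms = [gaussian_log_density(y, m, s) for y, m, s in zip(ys, mus, sigmas)]
    return sum(terms) / len(terms)


def pit_values(
    ys: Sequence[float], mus: Sequence[float], sigmas: Sequence[float]
) -> List[float]:
    """Predictive CDF evaluated at each observation.

    For a well-calibrated predictor these values are roughly uniform on [0, 1],
    so their sorted values plotted against uniform quantiles follow the diagonal.
    """
    return [
        0.5 * (1.0 + math.erf((y - m) / (s * math.sqrt(2.0))))
        for y, m, s in zip(ys, mus, sigmas)
    ]


if __name__ == "__main__":
    observed = [0.62, 0.71, 0.55]
    predicted_mean = [0.60, 0.70, 0.60]
    predicted_std = [0.05, 0.05, 0.05]
    print(mean_log_predictive_density(observed, predicted_mean, predicted_std))
    print(sorted(pit_values(observed, predicted_mean, predicted_std)))
```

Log predictive density rewards predictions that are both accurate and appropriately uncertain: an overconfident predictor with a small standard deviation is penalized heavily whenever the observed value falls far from its mean.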

Combining Both: Model-based Optimization
We may now use the OptFormer’s function prediction capability to better guide our imitated policy, similar to techniques found in Bayesian Optimization. Using Thompson Sampling, we may rank our imitated policy’s suggestions and only select the best according to the function predictor. This produces an augmented policy capable of outperforming our industry-grade Bayesian Optimization algorithm in Google Vizier when optimizing classic synthetic benchmark objectives and tuning the learning rate hyperparameters of a standard CIFAR-10 training pipeline.

Left: Best-so-far optimization curve over a classic Rosenbrock function. Right: Best-so-far optimization curve over hyperparameters for training a ResNet-50 on CIFAR-10 via init2winit. Both cases use 10 seeds per curve, and error bars at 25th and 75th percentiles.
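
The ranking step described above can be sketched in a few lines. This version draws one sample from the predictor's distribution for each candidate and keeps the candidate with the highest draw, a simplification of Thompson Sampling that treats candidates independently. `thompson_select` and `toy_predictor` are hypothetical stand-ins, not the OptFormer's policy or function predictor.

```python
import random
from typing import Callable, Sequence, TypeVar

Candidate = TypeVar("Candidate")


def thompson_select(
    candidates: Sequence[Candidate],
    sample_objective: Callable[[Candidate, random.Random], float],
    rng: random.Random,
) -> Candidate:
    """Draw one predicted objective per candidate and return the top-scoring candidate."""
    if not candidates:
        raise ValueError("need at least one candidate")
    sampled = [sample_objective(candidate, rng) for candidate in candidates]
    best_index = max(range(len(candidates)), key=sampled.__getitem__)
    return candidates[best_index]


def toy_predictor(learning_rate: float, rng: random.Random) -> float:
    """Hypothetical function predictor: noisy accuracy that peaks at a learning rate of 0.1."""
    predicted_mean = 1.0 - (learning_rate - 0.1) ** 2
    return rng.gauss(predicted_mean, 0.01)


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for suggestions sampled from the imitated policy.
    suggestions = [rng.uniform(0.0, 1.0) for _ in range(8)]
    print(thompson_select(suggestions, toy_predictor, rng))
```

Sampling from the predictive distribution, rather than ranking by its mean, keeps some exploration: a candidate with an uncertain prediction occasionally wins the draw and gets evaluated.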

Throughout this work, we discovered some useful and previously unknown optimization capabilities of the Transformer. In the future, we hope to pave the way for a universal hyperparameter and blackbox optimization interface to use both numerical and textual data to facilitate optimization over complex search spaces, and integrate the OptFormer with the rest of the Transformer ecosystem (e.g. language, vision, code) by leveraging Google’s vast collection of offline AutoML data.

The following members of DeepMind and the Google Research Brain Team conducted this research: Yutian Chen, Xingyou Song, Chansoo Lee, Zi Wang, Qiuyi Zhang, David Dohan, Kazuya Kawakami, Greg Kochanski, Arnaud Doucet, Marc'aurelio Ranzato, Sagi Perel, and Nando de Freitas.

We would like to also thank Chris Dyer, Luke Metz, Kevin Murphy, Yannis Assael, Frank Hutter, and Esteban Real for providing valuable feedback, and further thank Sebastian Pineda Arango, Christof Angermueller, and Zachary Nado for technical discussions on benchmarks. In addition, we thank Daniel Golovin, Daiyi Peng, Yingjie Miao, Jack Parker-Holder, Jie Tan, Lucio Dery, and Aleksandra Faust for multiple useful conversations.

Finally, we thank Tom Small for designing the animation for this post.

Source: Google AI Blog

Dev Channel Update for Desktop

The Dev channel has been updated to 106.0.5245.0 for Windows, Mac & Linux.

A partial list of changes is available in the log. Interested in switching release channels? Find out how. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.

Srinivas Sista
Google Chrome

Deprecation of Bid Manager API v1.1

Today we’re announcing the deprecation of the Bid Manager (DBM) API v1.1. This version will be fully sunset on February 28, 2023.

Please migrate to Bid Manager API v2 by the sunset date to avoid an interruption of service. v2 introduced a number of new features and breaking changes, which are listed in our release notes.

Follow the steps in our v2 migration guide to help you migrate from v1.1 to v2.

If you run into issues or need help with your migration, please contact us using our support contact form.

How the Chrome team uses Chrome

Before Chrome browser was even launched, the Chrome team was working behind the scenes to create a different browsing experience: one that was both personalized and helpful. This mission has remained central to the Chrome team’s values as we continuously strive to make the web work better for you, building a browser to make your daily life more simple, efficient, and organized. As Chrome celebrated its 100th update earlier this year, we thought it fitting to honor this milestone by asking the people who make Chrome to share how their own innovations are helpful in their daily lives.

Commuting smarter with recent tabs

My team is constantly thinking of ways to make sure Chrome meets the needs of iPhone users, no matter where they are. Chrome on iPhone is essential to my daily routine: I use my pesky metro commute to get some quick, simple to-dos out of the way before arriving at the office. Many of these tasks are continuations of what I started on desktop, and Chrome’s "recent tabs" feature gives me easy access to them on my iPhone. When I’m signed in and syncing, my in-office work blends seamlessly into my commute, without the stress of trying to remember what I was doing or hunting down URLs.

- Nasim Sedeghat, Chrome iOS product lead

Chrome browser on desktop is shown with a Google Slides presentation named “M+M Chrome Feature Launch Review” pulled up. An iPhone comes into view from the right, and the three dot menu on the bottom right corner of the iPhone is pressed. A menu is pulled up where a button labeled “recent tabs” is pushed. A menu showing the user’s recently accessed tabs comes up, the Google Slides tab “M+M Chrome Feature Launch Review” is selected, and the same Google slide presentation from the computer is pulled up on the iPhone.

Using tab groups…to make tab groups

My open tabs are often a direct reflection of my state of mind: the messier they are, the more I've got going on. Tab groups help me organize my thoughts, where I use titles to give context and color schemes to find where I left off. I started using tab groups before we had even finished developing them, in an effort to coordinate the project. In many ways, tab groups were responsible for their own creation, since we wouldn't have made it to launch without them!

- Connie Wan, software engineering manager, Chrome desktop user interface

A Chrome browser is shown with tab groups labeled “Sprint Planning” and “Bug Triage” at the top of the page. A cursor hovers over a new tab, travels through a menu of options, stops on “Add tab to group”, and selects “New Group”. A new tab group is shown, and the cursor types in “Team Management” to name the group and selects a pink color for the group.

Tracking down tabs with Tab Search

While I aspire to keep things neat and tidy in Chrome, the reality is that after a day of back-to-back meetings and checking up on Chrome engineering projects, my tabs accumulate faster than I can organize them into groups (though I love using tab groups, too!). Tab Search comes to my rescue by allowing me to directly look up an existing tab. It instantly narrows the list as I type, and it includes not only open tabs, but also the tabs I closed just a few moments ago.

- Max Christoff, senior director of Chrome browser engineering

A Chrome browser is shown with more than 20 tabs opened. A cursor navigates to the top right of the screen, clicks on a downfacing arrow, and brings up a search bar where the text reads “Search Tabs” and displays a list of the opened tabs. “Design reviews” is typed into the search bar, and the list of tabs narrows down to only those with that text in the title. The cursor selects a tab containing a Google Drive page with “Design reviews” results displayed.

Getting Faster Results with Site Search

As a user experience (UX) designer, much of my role is centered around making sure Chrome is accessible and enjoyable. I spend a lot of time jumping across multiple files or conducting quick website searches, making Chrome’s site search one of my personal favorite features. Rather than first navigating to a specific site, and then clicking into the site’s search field, site search gives me faster access by letting me start my search from the Chrome address bar. I type the site shortcut, press Tab, then add my search term so the page loads with exactly what I want to find.

- Elvin Hu, Chrome UX interaction designer

A cursor navigates to the Chrome search bar on an open window. “Drive” is typed into the address bar, and the search bar changes to show “Search Google Drive” in blue lettering within the search bar. Then “chrome logo” is typed in and the cursor clicks on the option to display search results for this term from within Google Drive. The page then opens to display a Google Drive page with the results for files containing the words “chrome logo” already shown.

Troubleshooting video calls with site permissions

Over the last few years, I've spent a lot of time on video calls, and I'm so glad I can stay in touch with my loved ones this way. When video calls don't go quite as planned, I troubleshoot by having everyone check their site permissions. You can access these easily in Chrome through the lock icon, located in the address bar. From here, make sure that "Microphone" and "Camera" are turned on. And if a website has been bugging you with too many notifications, this is the same place where you can turn those off. The lock icon remains a favorite feature of mine, especially since my team has always celebrated giving users control over their settings.

- Meggyn Watkins, senior UX writer, trust and safety

A Chrome browser window is shown with a Google Meet tab titled “Family Catch Up!” on the screen. The camera and microphone are turned off. A cursor navigates to the Chrome search bar, and clicks on the lock icon on the left. The site permissions menu is opened and the cursor toggles on the settings for camera, microphone, and notifications.

Building language models, one story at a time

One-third of the world's languages are spoken in Africa, but less than 1% of African languages are represented online. This is significant because the language you speak, write or sign shapes your online experience. Language is the cornerstone of your identity, the connection to your past and the key to your future. When we can’t experience the internet in our language, it limits what we can learn, what jobs we can have, what stories we can access, and so much more.

In my home country Mali, eighty percent of the population speaks Bambara as its first or second language. It is also spoken in Burkina Faso, Ivory Coast, Liberia and Guinea — making it one of West Africa's most widely spoken languages. But, if Bambara is your primary language, it can be difficult to have an immersive internet experience. That's why I've set out to make the internet more accessible to Bambara speakers, remove the language barrier, and bring this primarily spoken language online for everyone.

To achieve this goal, a language model for Bambara needs to be built. Language models require lots of data, which typically means having hours of transcribed recordings where humans are speaking the language so that computers can learn the language through a process called Natural Language Processing. Unfortunately, Bambara lacks readily available data to train on. Researchers call this being “low-resourced.” My team at Robots Mali has been trying to solve this challenge for years as part of a collaborative project called Bayɛlɛmabaga. Through collaboration with the Google Research team in Accra, we're closer to accomplishing our goals of building more resources (written and bilingual texts) for Bambara.

To overcome the challenge of being “low-resourced,” we teamed up with those who hold the culture's knowledge, rich history and teachings. Malian Griots are the real keepers of the Bambara collective memory, passing their knowledge only through oral storytelling. So, we gathered more than thirty griots to record them narrating generational stories. We transcribed and translated each tale to preserve the knowledge for future generations. While griots are traditionally older men, for this project, we worked to identify a diverse group of griots based on age, gender and background to build a representative group.

Using these recordings, we've been able to build a model for understanding Bambara speech and facilitating easy translation to other languages, known as an Automatic Speech Recognition (ASR) model. As a result, we are making the world's information more accessible to millions of Bambara speakers and releasing our findings for the research community and everyone to benefit. Our work has allowed us to uplift traditional practices while building a new future for Bambara speakers. We’re in contact with the National Museum of Mali to donate all of the beautiful stories that the griots have narrated. The rich history and teachings from the griots will be available to the local community and public. Furthermore, the project has been selected to be showcased at The Deep Learning Indaba 2022 next week, the largest machine learning conference in Africa.

Most importantly, we identified oral literature as a viable resource for languages. Many languages are underrepresented online, and this project represents a big step towards bringing more of them online. Of course, there's still a lot of work to do. But, by introducing this work to the community, researchers have new tools to keep breaking down the online language barrier.

More content by people, for people in Search

Many of us have experienced the frustration of visiting a web page that seems like it has what we’re looking for, but doesn’t live up to our expectations. The content might not have the insights you want, or it may not even seem like it was created for, or even by, a person.

We work hard to make sure the pages we show on Search are as helpful and relevant as possible. To do this, we constantly refine our systems: Last year, we launched thousands of updates to Search based on hundreds of thousands of quality tests, including evaluations where we gather feedback from human reviewers.

We know people don’t find content helpful if it seems like it was designed to attract clicks rather than inform readers. So starting next week for English users globally, we’re rolling out a series of improvements to Search to make it easier for people to find helpful content made by, and for, people. This ranking work joins a similar effort related to ranking better quality product review content over the past year, which will also receive an update. Together, these launches are part of a broader, ongoing effort to reduce low-quality content and make it easier to find content that feels authentic and useful in Search.

Better ranking of original, quality content

We continually update Search to make sure we're helping you find high quality content. Next week, we'll launch the “helpful content update” to tackle content that seems to have been primarily created for ranking well in search engines rather than to help or inform people. This ranking update will help make sure that unoriginal, low quality content doesn’t rank highly in Search, and our testing has found it will especially improve results related to online education, as well as arts and entertainment, shopping and tech-related content.

For example, if you search for information about a new movie, you might have previously seen articles that aggregated reviews from other sites without adding perspectives beyond what’s available elsewhere. This isn’t very helpful if you’re expecting to read something new. With this update, you’ll see more results with unique, authentic information, so you’re more likely to read something you haven't seen before.

As always, we'll continue to refine our systems and build on this improvement over time. If you’re a content creator, you can learn more about today’s update and guidance to consider on Search Central.

More helpful product reviews written by experts

We know product reviews can play an important role in helping you make a decision on something to buy. Last year, we kicked off a series of updates to show more helpful, in-depth reviews based on first-hand expertise in search results.

We've continued to refine these systems, and in the coming weeks, we’ll roll out another update to make it even easier to find high-quality, original reviews. We’ll continue this work to make sure you find the most useful information when you’re researching a purchase on the web.

We hope these updates will help you access more helpful information and valuable perspectives on Search. We look forward to building on this work to make it even easier to find original content by and for real people in the months ahead.

Helping people understand AI

If you’re like me, you may have noticed that AI has become a part of daily life. I wake up each morning and ask my smart assistant about the weather. I recently applied for a new credit card and the credit limit was likely determined by a machine learning model. And while typing the previous sentence, I got a word choice suggestion that “probably” might flow better than “likely,” a suggestion powered by AI.

As a member of Google’s Responsible Innovation team, I think a lot about how AI works and how to develop it responsibly. Recently, I spoke with Patrick Gage Kelley, Head of Research Strategy on Google’s Trust & Safety team, to learn more about developing products that help people recognize and understand AI in their daily lives.

How do you help people navigate a world with so much AI?

My goal is to ensure that people, at a basic level, know how AI works and how it impacts their lives. AI systems can be really complicated, but the goal of explaining AI isn’t to get everyone to become programmers and understand all of the technical details — it’s to make sure people understand the parts that matter to them.

When AI makes a decision that affects people (whether it’s recommending a video or qualifying for a loan), we want to explain how that decision was made. And we don’t want to just provide a complicated technical explanation, but rather, information that is meaningful, helpful, and equips people to act if needed.

We also want to find the best times to explain AI. Our goal is to help people develop AI literacy early, including in primary and secondary education. And when people use products that rely on AI (everything from online services to medical devices), we want to include a lot of chances for people to learn about the role AI plays, as well as its benefits and limitations. For example, if people are told early on what kinds of mistakes AI-powered products are likely to make, then they are better prepared to understand and remedy situations that might arise.

Do I need to be a mathematician or programmer to have a meaningful understanding of AI?

No! A good metaphor here is financial literacy. While we may not need to know every detail of what goes into interest rate hikes or the intricacies of financial markets, it’s important to know how they impact us — from paying off credit cards, to buying a home, or paying for student loans. In the same way, AI explainability isn’t about understanding every technical aspect of a machine learning algorithm – it’s about knowing how to interact with it and how it impacts our daily lives.

How should AI practitioners — developers, designers, researchers, students, and others — think about AI explainability?

Lots of practitioners are doing important work on explainability. Some focus on interpretability, making it easier to identify specific factors that influence a decision. Others focus on providing “in-the-moment explanations” right when AI makes a decision. These can be helpful, especially when carefully designed. However, AI systems are often so complex that we can’t rely on in-the-moment explanations entirely. It’s just too much information to pack into a single moment. Instead, AI education and literacy should be incorporated into the entire user journey and built continuously throughout a person’s life.

More generally, AI practitioners should think about AI explainability as fundamental to the design and development of the entire product experience. At Google, we use our AI Principles to guide responsible technology development. In accordance with AI Principle #4: “Be accountable to people,” we encourage AI practitioners to think about all the moments and ways they can help people understand how AI operates and makes decisions.

How are you and your collaborators working to improve explanations of AI?

We develop resources that help AI practitioners learn creative ways to incorporate AI explainability in product design. For example, in the PAIR Guidebook we launched a series of ethical case studies to help AI practitioners think through tricky issues and hone their skills for explaining AI. We also do fundamental research like this paper to learn more about how people perceive AI as a decision-maker, and what values they would like AI-powered products to embody.

We’ve learned that many AI practitioners want concrete examples of good explanations of AI that they can build on, so we’re currently developing a story-driven visual design toolkit for explanations of a fictional AI app. The toolkit will be publicly available, so teams in startups and tech companies everywhere can prioritize explainability in their work.

An illustration of a sailboat navigating the coast of Maine

The visual design toolkit provides story-driven examples of good explanations of AI.

I want to learn more about AI explainability. Where should I start?

This February, we released an Applied Digital Skills lesson, “Discover AI in Daily Life.” It’s a great place to start for anyone who wants to learn more about how we interact with AI every day.

We also hope to speak about AI explainability at the upcoming South by Southwest Conference. Our proposed session would dive deeper into these topics, including our visual design toolkit for product designers. If you’re interested in learning more about AI explainability and our work, you can vote for our proposal through the SXSW PanelPicker® here.

How a career in cloud technology led Johnson to Google

Welcome to the latest edition of “My Path to Google,” where we talk to Googlers, interns, apprentices and alumni about how they got to Google, what their roles are like and even some tips on how to prepare for interviews.

Today’s post is all about Johnson Jose, a Google Cloud leader based in Bangalore, India with a passion for shaping the future of cloud technology.

What’s your role at Google?

I lead the Application Engineering team in Google Cloud India, which builds tools and platforms to help onboard our partners. I spend most of my time in technical discussions, but I also meet with both internal and external partners to stay plugged into what’s happening in the cloud industry.

Can you tell us a bit about yourself?

I grew up in Kerala, India and received a master’s degree in engineering, followed by my MBA. I’ve also written two books, one about data quality excellence and the other about cloud development operations. I’m currently writing my third book about business management — stay tuned. When I’m not working, I love to hike and try new dishes.

Johnson stands with his wife and two children in front of a backdrop of mountains and trees.

Johnson hiking with his family.

How would you describe your path to your current role at Google?

I started my career working in cloud transformation at a few different companies, where I learned about local area networks, routing and switching technologies. I had always wanted to work at Google and I actually got the chance to work with Google Cloud as one of my clients. So when a Google recruiter approached me, I took the opportunity.

What inspires you to come in (or log in) every day?

I’m very passionate about cloud technology, and I enjoy knowing my work is shaping cloud infrastructure today and in the future. We’re influencing the future of the internet, simplifying and improving how quickly people can connect and work. Of course, I’m also inspired by my team and our amazing workplace. We have great food and a world-class gym.

What have you learned about leadership since joining Google?

Leadership at Google is rooted in inclusivity and respect. I remember when I joined, my own management team, which is based all around the world, rearranged the entire meeting schedule to accommodate my time zone. And there’s a strong focus on the wellbeing of our teams. I’ve also learned you don’t need to be a senior leader to lead at Google. Everyone can teach and make an impact.

Johnson stands in front of a new building with large glass windows and a triangle roof.

Johnson visiting a Google campus in California.

How did the application and interview process go for you?

I remember being impressed with how well my recruiter explained the interview process. From start to finish, the entire experience was professional, respectful and transparent. I actually interviewed right in the middle of the pandemic and needed some flexibility to help my previous company navigate through that time. Google was very respectful of that.

Do you have any tips you’d like to share with aspiring Googlers?

First, preparation is key. Take advantage of the many resources and videos available online, including on the Google Careers site. For the interview, focus on your strengths and be confident about your work. Remember to also be curious and ask for clarification so it’s a discussion rather than a one-sided process. If you’re interviewing for the Google Application Engineering team specifically, showcase your domain expertise and experience in writing well-structured programs. Google wants to hire you for you, so don’t be afraid of the interview and focus on enjoying it!