Image-Text Pre-training with Contrastive Captioners

Oftentimes, machine learning (ML) model developers begin their design with a generic backbone model that is trained at scale and whose capabilities transfer to a wide range of downstream tasks. In natural language processing, a number of popular backbone models, including BERT, T5, and GPT-3 (sometimes also referred to as “foundation models”), are pre-trained on web-scale data and have demonstrated generic multi-tasking capabilities through zero-shot, few-shot, or transfer learning. Compared with training over-specialized individual models, pre-training a backbone model for a large number of downstream tasks amortizes the training cost, allowing one to overcome resource limitations when building large-scale models.

In computer vision, pioneering work has shown that single-encoder models pre-trained for image classification capture generic visual representations that transfer effectively to other downstream tasks. More recently, contrastive dual-encoder (CLIP, ALIGN, Florence) and generative encoder-decoder (SimVLM) approaches trained on web-scale noisy image-text pairs have been explored. Dual-encoder models exhibit remarkable zero-shot image classification capabilities but are less effective for joint vision-language understanding. On the other hand, encoder-decoder methods are good at image captioning and visual question answering but cannot perform retrieval-style tasks.

In “CoCa: Contrastive Captioners are Image-Text Foundation Models”, we present a unified vision backbone model called Contrastive Captioner (CoCa). Our model is a novel encoder-decoder approach that simultaneously produces aligned unimodal image and text embeddings and joint multimodal representations, making it flexible enough to be directly applicable to all types of downstream tasks. Specifically, CoCa achieves state-of-the-art results on a series of vision and vision-language tasks spanning vision recognition, cross-modal alignment, and multimodal understanding. Furthermore, it learns highly generic representations, so it can perform as well as or better than fully fine-tuned models with zero-shot learning or frozen encoders.

Overview of Contrastive Captioners (CoCa) compared to single-encoder, dual-encoder and encoder-decoder models.

Method
We propose CoCa, a unified training framework that combines contrastive loss and captioning loss on a single training data stream consisting of image annotations and noisy image-text pairs, effectively merging single-encoder, dual-encoder and encoder-decoder paradigms.

To this end, we present a novel encoder-decoder architecture where the encoder is a vision transformer (ViT), and the text decoder transformer is decoupled into two parts, a unimodal text decoder and a multimodal text decoder. We skip cross-attention in unimodal decoder layers to encode text-only representations for contrastive loss, and cascade multimodal decoder layers with cross-attention to image encoder outputs to learn multimodal image-text representations for captioning loss. This design maximizes the model's flexibility and universality in accommodating a wide spectrum of tasks, and at the same time, it can be efficiently trained with a single forward and backward propagation for both training objectives, resulting in minimal computational overhead. Thus, the model can be trained end-to-end from scratch with training costs comparable to a naïve encoder-decoder model.

Illustration of forward propagation used by CoCa for both contrastive and captioning losses.
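
To make the combined objective concrete, below is a minimal PyTorch-style sketch of a single CoCa training step. It assumes hypothetical image_encoder, unimodal_decoder, and multimodal_decoder modules standing in for the ViT encoder and the two halves of the decoupled text decoder; the pooling, loss weighting, and tensor shapes are simplified for illustration and do not reproduce the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def coca_training_losses(image_encoder, unimodal_decoder, multimodal_decoder,
                         images, text_tokens, temperature=0.07):
    """Single forward pass yielding both CoCa objectives (illustrative sketch)."""
    # Image tower: ViT token features, mean-pooled into one contrastive embedding.
    image_tokens = image_encoder(images)                         # [B, N_img, D]
    image_embed = F.normalize(image_tokens.mean(dim=1), dim=-1)  # [B, D]

    # Unimodal text decoder: no cross-attention, text-only representation.
    text_feats = unimodal_decoder(text_tokens)                   # [B, N_txt, D]
    text_embed = F.normalize(text_feats[:, -1], dim=-1)          # last token as a CLS-style summary

    # Contrastive loss over in-batch image-text pairs (symmetric cross-entropy).
    logits = image_embed @ text_embed.t() / temperature          # [B, B]
    targets = torch.arange(logits.size(0), device=logits.device)
    contrastive_loss = 0.5 * (F.cross_entropy(logits, targets) +
                              F.cross_entropy(logits.t(), targets))

    # Multimodal text decoder: cross-attends to image tokens, predicts the next token.
    caption_logits = multimodal_decoder(text_feats, image_tokens)  # [B, N_txt, V]
    captioning_loss = F.cross_entropy(
        caption_logits[:, :-1].reshape(-1, caption_logits.size(-1)),
        text_tokens[:, 1:].reshape(-1))

    # In practice the two losses are combined with tunable weights; equal
    # weighting is used here purely for illustration.
    return contrastive_loss + captioning_loss
```

Because both losses are computed from the same forward pass, backpropagating through their sum is what keeps the training cost close to that of a plain encoder-decoder model.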

Benchmark Results
The CoCa model can be directly fine-tuned on many tasks with minimal adaptation. By doing so, our model achieves a series of state-of-the-art results on popular vision and multimodal benchmarks, including (1) visual recognition: ImageNet, Kinetics-400/600/700, and MiT; (2) cross-modal alignment: MS-COCO, Flickr30K, and MSR-VTT; and (3) multimodal understanding: VQA, SNLI-VE, NLVR2, and NoCaps.

Comparison of CoCa with other image-text backbone models (without task-specific customization) and multiple state-of-the-art task-specialized models.

It is noteworthy that CoCa attains these results as a single model adapted to all tasks while often being lighter than prior top-performing specialized models. For example, CoCa obtains 91.0% ImageNet top-1 accuracy while using less than half the parameters of prior state-of-the-art models. In addition, CoCa demonstrates strong generative capability, producing high-quality image captions.

Image classification scaling performance comparing fine-tuned ImageNet top-1 accuracy versus model size.
Text captions generated by CoCa with NoCaps images as input.

Zero-Shot Performance
Besides achieving excellent performance with fine-tuning, CoCa also outperforms previous state-of-the-art models on zero-shot learning tasks, including image classification and cross-modal retrieval. CoCa obtains 86.3% zero-shot accuracy on ImageNet while also robustly outperforming prior models on challenging variant benchmarks, such as ImageNet-A, ImageNet-R, ImageNet-V2, and ImageNet-Sketch. As shown in the figure below, CoCa obtains better zero-shot accuracy with smaller model sizes compared to prior methods.

Image classification scaling performance comparing zero-shot ImageNet top-1 accuracy versus model size.
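
For readers curious how the aligned unimodal embeddings translate into zero-shot classification, the sketch below shows the standard dual-encoder recipe: embed each class name with the text branch (e.g., using a prompt such as “a photo of a {class}”), embed the image, and pick the most similar class. The tensor shapes and prompt format are assumptions for illustration, not CoCa's exact evaluation protocol.

```python
import torch

@torch.no_grad()
def zero_shot_classify(image_embeds, class_text_embeds):
    """Assign each image to the class with the most similar text embedding.

    image_embeds:      [B, D] L2-normalized image embeddings.
    class_text_embeds: [C, D] L2-normalized embeddings of class-name prompts.
    """
    similarity = image_embeds @ class_text_embeds.t()  # [B, C] cosine similarities
    return similarity.argmax(dim=-1)                   # predicted class index per image
```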

Frozen Encoder Representation
One particularly exciting observation is that CoCa achieves results comparable to the best fine-tuned models using only a frozen visual encoder, in which features extracted after model training are used to train a classifier, rather than the more computationally intensive effort of fine-tuning the model. On ImageNet, a frozen CoCa encoder with a learned classification head obtains 90.6% top-1 accuracy, which is better than the fully fine-tuned performance of existing backbone models (90.1%). We also find this setup works extremely well for video recognition. We feed sampled video frames into the frozen CoCa image encoder individually, and fuse the output features by attentional pooling before applying a learned classifier. This simple approach with a frozen CoCa image encoder achieves 88.0% top-1 accuracy for video action recognition on the Kinetics-400 dataset, demonstrating that CoCa learns a highly generic visual representation with the combined training objectives.

Comparison of Frozen CoCa visual encoder with (multiple) best-performing fine-tuned models.
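
As a rough sketch of the frozen-encoder video setup described above, the module below attends over frozen per-frame features with a single learnable query (attentional pooling) and applies a linear classification head. The module names, dimensions, and single-query choice are illustrative assumptions rather than the released implementation.

```python
import torch
import torch.nn as nn

class AttentionalPoolingClassifier(nn.Module):
    """Learned head on top of frozen CoCa frame features (illustrative sketch)."""

    def __init__(self, feature_dim, num_classes, num_heads=8):
        super().__init__()
        # One learnable query pools the frame features via attention.
        self.query = nn.Parameter(torch.randn(1, 1, feature_dim))
        self.pool = nn.MultiheadAttention(feature_dim, num_heads, batch_first=True)
        self.head = nn.Linear(feature_dim, num_classes)

    def forward(self, frame_features):                   # [B, T * N, D] frozen features
        query = self.query.expand(frame_features.size(0), -1, -1)
        pooled, _ = self.pool(query, frame_features, frame_features)  # [B, 1, D]
        return self.head(pooled.squeeze(1))               # [B, num_classes]
```

Only the query, the attention pooler, and the linear head are trained; the CoCa image encoder stays frozen, which is what makes this setup so much cheaper than full fine-tuning.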

Conclusion
We present Contrastive Captioner (CoCa), a novel pre-training paradigm for image-text backbone models. This simple method is widely applicable to many types of vision and vision-language downstream tasks, and obtains state-of-the-art performance with minimal or even no task-specific adaptations.

Acknowledgements
We would like to thank our co-authors Vijay Vasudevan, Legg Yeung, Mojtaba Seyedhosseini, and Yonghui Wu who have been involved in all aspects of the project. We also would like to thank Yi-Ting Chen, Kaifeng Chen, Ye Xia, Zhen Li, Chao Jia, Yinfei Yang, Zhengdong Zhang, Wei Han, Yuan Cao, Tao Zhu, Futang Peng, Soham Ghosh, Zihang Dai, Xin Li, Anelia Angelova, Jason Baldridge, Izhak Shafran, Shengyang Dai, Abhijit Ogale, Zhifeng Chen, Claire Cui, Paul Natsev, Tom Duerig for helpful discussions, Andrew Dai for help with contrastive models, Christopher Fifty and Bowen Zhang for help with video models, Yuanzhong Xu for help with model scaling, Lucas Beyer for help with data preparation, Andy Zeng for help with MSR-VTT evaluation, Hieu Pham and Simon Kornblith for help with zero-shot evaluations, Erica Moreira and Victor Gomes for help with resource coordination, Liangliang Cao for proofreading, Tom Small for creating the animations used in this blogpost, and others in the Google Brain team for support throughout this project.

Source: Google AI Blog


The Google Cloud Startup Summit is coming on June 2, 2022

Posted by Chris Curtis, Startup Marketing Manager at Google Cloud

We’re excited to announce our annual Google Cloud Startup Summit will be taking place on June 2nd, 2022.

We hope you will join us as we bring together our startup & VC communities. Join us to dive into topics relevant to startups and enjoy sessions such as:

The future of web3

  • Hear from Google Cloud CEO Thomas Kurian and Dapper Labs Co-founder and CEO Roham Gharegozlou as they discuss web3 and how startups can prepare for the paradigm changes it brings.

VC AMA: Startup Summit Edition

  • Join us for a very special edition of the VC AMA series where we’ll have a discussion with Derek Zanutto from CapitalG, Alison Lange Engel from Greycroft and Matt Turck from FirstMark to discuss investment trends and advice for founders around cloud, data, and the future of disruption in legacy industries.

What’s new for the Google for Startups Cloud Program

  • Exciting announcements from Ryan Kiskis, Director of the Startup Ecosystem at Google Cloud, on how Google Cloud is investing in the startup ecosystem with tailored programs and offers.

Technical leaders & business sessions

  • Insights from top startups Discord, Swit, and Streak on how their tech stacks helped propel their growth.

Additionally, startups will have an opportunity to join ‘Ask me Anything’ live sessions after the event to interact with Google Cloud startup experts and technical teams to discuss questions that may come up throughout the event.

You can see the full agenda here to get more details on the sessions.

We can’t wait to see you at the Google Cloud Startup Summit. Register to secure your spot today.

Humans Behind Search: Meet Catherine

Catherine is an Engineering Director for Search and a Tech Site Lead in the Google London office. She’s been managing software engineering teams since the early 90’s and joined Google in 2017 to lead the engineering team working on the Google mobile app.

What’s your favorite feature on mobile?

It’s got to be Hum to Search, without a doubt. If you go into the Google app on your phone and press the microphone button, you can hum a song and it will tell you what the song is. This has helped me quickly identify a tune so many times!

We do have a rigorous testing process, even for fun features like this, to make sure these things are something users can use and actually want. It’s a continuation of the Search premise, to keep answering the questions that niggle at you – but this time via audio.

What excites you about the future of Search?

Probably the fact that it simply keeps getting more helpful, as we combine our understanding of text, voice and images — so you’ll be able to find helpful information about whatever you see, hear and experience, in ways that are most intuitive to you. We’ve developed a helpful new function called multisearch, which means you can search with images and text at the same time. So even if you don’t have the words to describe what you’re looking for, you can get help. For example, you can search for similar products in a different color, or take a picture of wallpaper and ask for it on a blanket instead, or even how to look after the basil plant on your windowsill. We’re envisioning a future where you can search your whole world, any way and anywhere.


You’ve said before that software engineering is a very social thing. Can you expand on this?

We have an incredible team working on Search — people developing the machine learning models, the services, the software on the phone. How well those people communicate determines how well the software fits together, so it’s important people have psychological safety in the job. If they do, it means easy feedback mechanisms, good communication and tight team work.

It’s also down to leadership to make sure teams realize everyone has to succeed for the business to succeed — that it’s really not a competition. When looking for our future Search stars, the whole person matters, not just their skills — so will you put users first, do the right thing, work well with others and create an inclusive environment? Those questions really help determine the right fit.

What do you think is a lesser known, but really useful fact about Search?

We’ve got a newish feature called ‘About this result’. When you’re searching for something, you can click an icon that then tells you more about how our systems determined a result might be a good match for your search. You can also find important context about a source or topic, before you visit a website. We’re trying to help people develop information literacy skills — so they can have more context about the sources of their information and understand how Search works. And it means they can be more savvy about what’s going on.

What do you enjoy most about working on a product like Search?

Just the impact. We have billions of users. Lots of people are relying on our information to help them in their daily lives, help them in extreme situations, help them always. It’s really nice to work on something you know people need and want. We are helpful — that’s it really. I rely on it – it’s how I live in my world. I worked in computers long before the internet, and I grew up spending hours in the library just looking things up – Search coming along changed all that. If you’d told me about this as a teenager I would have told you you were crazy!

Chrome for iOS Update

Hi, everyone! We've just released Chrome 102 (102.0.5005.67) for iOS; it'll become available on the App Store in the next few hours.

This release includes stability and performance improvements. You can see a full list of the changes in the Git log. If you find a new issue, please let us know by filing a bug.

Harry Souders

Google Chrome

Display & Video 360 brings Google audiences to connected TVs

Today at Google Marketing Live, we’re unveiling new products and features to help you build resilience for your business, drive results and prepare for the future of marketing.

As streaming continues to rise, that future includes more connected TV (CTV) advertising. That's why we’re unlocking even more CTV inventory in Display & Video 360 and extending Google audiences to CTV devices — helping you reach the right viewers as they watch top streaming content.

Connect with the right CTV audience

We’re committed to helping you deliver high quality ad experiences to all streamers by bringing the best of digital ad technology to the TV screen. One of the most effective digital strategies is creating relevant connections with your core audience. So we're introducing Google audiences for CTV inventory in Display & Video 360, which will work across Hulu, Peacock, YouTube and most other ad-supported CTV apps.

In a few months, you’ll be able to power your CTV campaigns with the same affinity, in-market and demographic audiences you’ve been using for your digital ads for years. Demographics and in-market segments will be available on CTV devices by the end of this quarter. Some affinity audiences are already available and more are coming later this summer.

GoDaddy has been using Google affinity audiences to reach the right people with their digital ads. Now, they can extend that strategy and reach audiences like “Business Professionals” or “Avid Investors” who are watching shows on Discovery’s HGTV Network or YouTube.

Google audiences on CTV gives us the opportunity to create ad campaigns that align directly with our prospective customers’ lifestyles. It's a great way to extend our digital best practices to the big screen. Jacob Jackson
Sr. Marketing Manager, GoDaddy

Reckitt has also found success using Google audiences on connected TV. Reckitt wanted Airborne, their immune support brand, to remain top of mind in the U.S. So they turned to Google’s custom audience segments to reach streamers interested in boosting their immune systems. Display & Video 360 analyzed the keywords Reckitt selected and automatically created tailored audiences that maximized the brand’s reach. This new audience strategy, combined with other digitally inspired approaches, resulted in over 18% more CTV reach for their Airborne campaign.

Our partnership with Display & Video 360 helped us reach CTV viewers in a much more data-driven way. James NT
Senior Performance Manager, Reckitt Benckiser

Access the biggest names in ad-supported streaming

Whether you’re looking for a good movie, the latest music video or a peaceful guided meditation — YouTube is the main stream. According to Nielsen, YouTube reached over 135 million people on connected TVs in the U.S. in December of last year. For brands like Uber Eats, YouTube is fundamental to reaching younger demographics, who are lighter TV viewers. And Display & Video 360’s capacity to plan, manage ad frequency and measure performance across YouTube and other CTV buys saves these brands time and money.

Display & Video 360’s capacity to control ad frequency across YouTube and our other video buys makes it the ideal partner. Hanna El Hourani
Global Head of Programmatic, Uber

A majority of ad-supported CTV services — including broadcasters and cable network apps — also offer their inventory through Display & Video 360. According to comScore, Display & Video 360 now reaches 93% of ad-supported connected TV households in the U.S. and provides access to nine of the top ten most-watched ad-supported CTV apps in the U.S.

Data visualization that reads “Display & Video 360 now reaches 93% of ad-supported connected TV households in the U.S.”

We’re also continuing to unlock top CTV inventory around the world. Peacock, NBCUniversal’s ad-supported streaming service, is the newest addition to available CTV publishers in the U.S. And you can now reach people watching Channel 4's programming in the UK.

Finally, we’re providing more innovative formats for brands to reach viewers the moment they turn on their TVs. Media and entertainment marketers in the U.S. have found success with the Google TV Masthead, which displays right on the home screen of Google TV devices. Today, in a new beta, we’re expanding the Google TV Masthead to allow more industry types to sponsor entertainment content. This cinematic teaser format can be tested using a Programmatic Guaranteed deal in Display & Video 360.

Check out the video below to learn more about new CTV solutions in Display & Video 360 and how to grow your business for the long run using enterprise ad tools.

Building the future of marketing together

Technology is powering more business growth around the world than ever before. And new consumer behaviors are redefining the role technology plays in everyday life. We see this in the surge of video watch time, the rise of browsing behavior on Search and the growth in online shopping. There is opportunity in all of this.

With the right tools, you can meet your customers where they are today while building resilience for tomorrow — and Google is here to be your partner. Your insights and feedback are helping shape every investment we’re making across Google Ads.

Throughout Google Marketing Live, you’ll see the many ways we’re working to help you unlock growth for your business and navigate today’s rapidly shifting advertising landscape. Join us at 9:00 a.m. PT (12:00 p.m. ET) to hear about Google’s latest product innovations across Ads and Commerce. Here’s a preview of some of those announcements.

Reimagining the future of marketing across Search and YouTube

Consumers are turning to Google Search and YouTube more than ever for help with purchase decisions. In fact, we see over one billion shopping journeys happen across Google every day.

When it comes to shopping on Google Search, we’ve made the experience more natural and intuitive. Categories like apparel have seen tremendous success as consumers explore information in more visual and browsable ways. Later this year, advertisers will be eligible to show new, highly visual Shopping ads to U.S. customers. These will be clearly labeled as ads and will be eligible to appear in dedicated ad slots throughout the page.

Reimagining Shopping ads in new, visually engaging search results (U.S.-only search results)

However, nothing can quite replace seeing a product in person or bringing it home to try out. Augmented reality (AR) on cameras gets us close, and shoppers are ready for it. More than 90% of Americans currently use, or would consider using, AR for shopping. Soon, merchants will be able to have 3D models of their products appear directly on Google Search, allowing shoppers to easily see them in their spaces.

Augmented reality in Search provides a fully immersive shopping experience

In the coming months, you’ll also be able to promote your loyalty benefits to potential customers in the U.S. when they’re shopping across Google. Loyalty programs represent a meaningful relationship between you and your customers, and soon you’ll be able to easily integrate them with Google Ads.

A mobile phone screen showing the search query “hair blow dryer brush.” One of the results shows a tag that says “Get free shipping and earn points.”

Showcase your loyalty benefits across Google to consumers in the U.S.

Using Performance Max campaigns — along with a product feed — you’ll be able to drive more online loyalty sign-ups across YouTube, Display, Search, Discover, Gmail and Maps.

Starting today, your Video action campaigns and App campaigns will automatically scale to YouTube Shorts. YouTube Shorts now averages over 30 billion daily views — four times as many as a year ago — and we want to help you reach people immersed in this short-form content. Later this year, you’ll also be able to connect your product feed to your campaigns and make your video ads on YouTube Shorts more shoppable. We've been experimenting with ads in YouTube Shorts since last year, and we’re now gradually rolling that out to all advertisers around the world. This is an exciting milestone for advertisers, and a key step on our road to developing a long-term YouTube Shorts monetization solution for our creators, which we'll share more about soon.

Product feeds on Video action campaigns will roll out to YouTube Shorts later this year

Delivering better results with automation and insights

The best way to unlock growth for your business is combining your data and marketing expertise with Google’s machine learning. This means building automated products that meet your objectives and are simple to use.

Performance Max campaigns are a powerful tool for helping you meet consumers where they are across Google channels. In fact, advertisers that use Performance Max campaigns in their account see an average increase of 13% in total incremental conversions at a similar cost per action. Today, we’re announcing six upcoming additions:

1. More tools for experimentation, like A/B tests to see how Performance Max is driving incremental conversions.
2. Expanded campaign management support in Search Ads 360 and the Google Ads mobile app.
3. Support for store sales goals to optimize for in-store sales, in addition to store visits and local actions.
4. Burst campaigns that run for a set time period to maximize impact and help meet in-store goals during seasonal events.
5. New insights and explanations, including attribution, audience and auction insights so you know what’s driving performance.
6. Optimization score and recommendations so you can see how to improve your campaign.

Rothy’s, a sustainable brand known for its iconic shoes and accessories, turned to Performance Max to connect with customers across channels. As a result, they increased conversions by 60% and grew revenue by 59%.

Case study video where Rothy’s discusses their experience with Performance Max

The Insights page uses machine learning to identify new pockets of consumer demand and provide personalized trend data. Only Google can surface these types of insights, based on the billions of searches we see every day and the millions of signals we analyze for every ad auction. Today, we’re introducing three new reports that will roll out over the coming months:

1. Attribution insights show how your ads work together across Google surfaces — like Search, Display and YouTube — to drive conversions.
2. Budget insights find new opportunities for budget optimization and show how your spend is pacing against your budget goals.
3. Audience insights for first-party data show how your customer segments, like those created with Customer Match, are driving campaign performance.

Building resilience in a shifting landscape

At Google we’re driven by a shared goal: to be the most helpful company in the world. But we know that our products can only be as helpful as they are safe. That's why we’re launching innovations like My Ad Center later this year to keep users in control of their privacy and online experience. People will be able to pick the types of ads they want to see more or less of, and control how their data informs ads they see across YouTube, Search and Discover.

Control your ads experience in My Ad Center

And these solutions can and should still work for advertisers. We can both advance privacy and continue supporting the ecosystem.

Join us at Google Marketing Live

What an extraordinary time to be in this industry — my team and I are humbled to be on this journey with you. It’s a big moment for all of us, and I know that we can meet it by working together.

Join us today at Google Marketing Live at 9:00 a.m. PT (12:00 p.m. ET) to learn more about these and other ads innovations and commerce announcements. We hope to see you there!

An accelerator for early-stage Latino founders

After 10 years of working with early-stage founders at Google for Startups, I’ve seen time and time again how access activates potential. Access to capital is the fuel that makes startups go, access to community keeps them running, and access to mentorship helps them navigate the road to success.

But access to the resources needed to grow one's business is still not evenly distributed. Despite being the fastest-growing group of entrepreneurs in the U.S., only 3% of Latino-owned companies ever reach $1 million in revenue. As part of our commitment to support the Latino founder community, today we're announcing a new partnership with Visible Hands, a Boston-based venture capital firm dedicated to investing in the potential of underrepresented founders.

During last year’s Google for Startups Founders Academy, I met Luis Suarez, a founder and fellow Chicagoan whose startup, Sanarai, addresses the massive gap in Spanish-speaking mental health providers in the U.S. Sanarai connects Latinos to therapists in Latin American countries for virtual sessions in their native language. When I asked Luis about the most helpful programs he had participated in, he highly recommended Visible Hands. The program gave Luis the opportunity to work alongside a community of diverse founders to grow his startup and also helped him craft his early fundraising strategy. Visible Hands also supplies stipends to its participants, helping founders who might otherwise not be able to take the leap into full-time entrepreneurship.

Inspired by feedback from founders like Luis, Google for Startups is partnering with Visible Hands to run a 20-week fellowship program, VHLX, to better support the next wave of early-stage Latino founders across the U.S. and to create greater economic opportunity for the Latino community. In addition to hands-on support from Google and industry experts, we are providing $10,000 in cash for every VHLX participant to help kickstart their ideas. Following the program, founders will have the opportunity to receive additional investment from Visible Hands, up to $150,000.

Our work with Visible Hands and our recent partnership with eMerge Americas are part of a $7 million commitment to increase representation and support of the Latino startup community. I’m also looking forward to the Google for Startups Latino Leaders Summit in Miami this June, where in partnership with Inicio Ventures we’re bringing together around 30 top community leaders and investors from across the country to discuss how we can collectively support Latino founders in ways that will truly make a difference. And soon, we'll share the recipients of the Google for Startups Latino Founders Fund.

If you or someone you know would be a great fit for VHLX, encourage them to apply by June 24.

Street View turns 15 with a new camera and fresh features

Fifteen years ago, Street View began as a far-fetched idea from Google co-founder Larry Page to build a 360-degree map of the entire world. Fast forward to today: There are now over 220 billion Street View images from over 100 countries and territories — a new milestone — allowing people to fully experience what it’s like to be in these places right from their phone or computer. And Street View doesn't just help you virtually explore, it’s also critical to our mapping efforts — letting you see the most up-to-date information about the world, while laying the foundation for a more immersive, intuitive map.

While that’s all worth celebrating, we aren’t stopping there. Today, we’re unveiling Street View’s newest camera, giving you more ways to explore historical imagery, and taking a closer look at how Street View is powering the future of Google Maps.

Bringing Street View to more places with our newest camera

From the back of a camel in the Arabian desert to a snowmobile zipping through the Arctic, we’ve gotten creative with the ways we’ve used Street View cameras to capture imagery. And if there’s one thing we’ve learned, it’s that our world changes at lightning speed. Our hardware is one way we’re able to keep up with the pace.

In addition to our Street View car and trekker, we’re piloting a new camera that will fully roll out next year to help us collect high-quality images in more places. This new camera takes all the power, resolution and processing capabilities that we’ve built into an entire Street View car, and shrinks it down into an ultra-transportable camera system that’s roughly the size of a house cat. But unlike house cats, it’s ready to be taken to remote islands, up to the tops of mountains or on a stroll through your local town square.

Street View’s newest camera featuring a blue top and two camera lenses and a metallic bottom with vents

Here’s a quick look at our new camera system:

  • It weighs less than 15 pounds. This means it can be shipped anywhere. This is especially handy when we work with partners around the world to capture imagery of traditionally under-mapped areas — like the Amazon jungle.
  • It’s extremely customizable. Previously, we needed to create an entirely new camera system whenever we wanted to collect different types of imagery. But now, we can add on to this modular camera with components like lidar — laser scanners — to collect imagery with even more helpful details, like lane markings or potholes. We can add these features when we need them, and remove them when we don’t.
  • It can fit on any car. Our new camera can be attached to any vehicle with a roof rack and operated right from a mobile device — no need for a specialized car or complex processing equipment. This flexibility will make collections easier for partners all over the world, and allow us to explore more sustainable solutions for our current fleet of cars — like plug-in hybrids or fully electric vehicles. You’ll start seeing our new camera in fun Google colors alongside our iconic Street View cars and trekkers next year.

Traveling back in time with Street View

Street View is all about capturing the world as it changes, and it’s also a powerful way to reminisce about the past. Starting today on Android and iOS globally, it’s now easier than ever to travel back in time right from your phone. Here’s how it works:

When you’re viewing Street View imagery of a place, tap anywhere on the photo to see information about the location. Then tap "See more dates" to see the historical imagery we’ve published of that place, dating back to when Street View launched in 2007. Browse each of the images to see a digital time capsule that shows how a place has changed — like how the Vessel in New York City’s Hudson Yards grew from the ground up.

A gif of a mobile phone scrolling through historical Street View imagery of The Vessel in New York on Google Maps

Building a more helpful, immersive map

Street View is also an essential part of how we map the world. Here’s a look at how imagery helps us do that:

  • Updates to business information that reflect your changing world. We use Street View imagery coupled with AI to make helpful updates to Google Maps — such as adding newly opened businesses, surfacing new hours at your favorite restaurants and updating speed limit information. In fact, over the last three years, AI has helped us make over 25 billion updates to Maps so you can be confident that the information you’re seeing is as fresh and up-to-date as possible.
  • Easier than ever navigation, indoors and out. Street View imagery powers popular features like Live View, which allows you to use your phone’s camera to overlay navigation instructions on top of the real world so you can walk to your destination in a snap.
  • Immersive view helps you know before you go. Thanks to advances in computer vision and AI over the last several years, we’re able to fuse together billions of Street View and aerial images to create a rich, digital model of places around the world. With our new immersive view launching later this year, you can easily glide down to street level on Maps and even check out the inside of a business as if you were walking around.

In celebration of Street View’s birthday, you’ll have the opportunity to make your navigation icon a celebratory Street View car – just tap the chevron when you’re in driving navigation. And on desktop, our beloved Pegman – who you can pick up and drop anywhere in Maps to see Street View – will be dressed up in a birthday hat and balloons for the celebration.

To keep the celebration going, check out our newest collections of places like The Pyramids of Meroë in Sudan and Les Invalides in France, popular spots to explore with Street View and some of our all-time favorite Street View images to date. Oh the places you’ll go!

Helping Ukrainian teachers keep teaching

The Russian invasion of Ukraine is a tragedy, not just for now but for generations to come. As the international community response evolves, we’ve continued to look for ways to help, whether by supporting the humanitarian effort, providing timely, trusted information and promoting cybersecurity.

With millions of people forced to leave their homes, and thousands of schools affected by bombings and shelling, the Ukrainian Ministry of Education and Science predicts that more than 3.7 million students are learning remotely.

Providing Chromebooks to schools

For Ukraine’s teachers, creating and delivering content to their students has become increasingly difficult with the move to distance learning. To help teachers keep teaching, Google is working with the Ukrainian Ministry of Education and Science, UNESCO, and partners from around the world to provide hardware, software, content and training.

To help education continue for both remaining and displaced students, Google is giving 43,000 Chromebooks to Ukrainian teachers - helping them to connect with their students, wherever they are now based.

To ensure those devices make the best possible impact, Google is partnering with local organisations to train around 50,000 teachers - and providing our Chrome Enterprise upgrade so that schools can set up and manage devices remotely. Through a series of workshops and online material, educators will learn how to make the best use of their devices and the suite of Google Workspace for Education tools we’re providing.

Google for Education will also continue to update resources such as Teach From Anywhere, a central hub of information, tips, training and tools, that was developed during the pandemic.

In the coming weeks, we’re expanding youtube.com/learning to include the Ukrainian language so that Ukrainian students aged 13-17 can discover content that supports their curriculum - wherever they are. This will include a range of subjects, aligned to the national curriculum, from Ukrainian Literature and Language studies, to Physics, Biology, Chemistry, Mathematics, and more.

Supporting universities and their students

Of course, university students have been impacted by the war in Ukraine too - with many now unable to attend their classes in person or in real-time. To help support them to continue their education, we have made several of our premium Google Workspace for Education features available to Ukrainian universities free of cost until the end of the year. That will allow universities to host larger meetings for up to 250 participants, as well as to record them directly in Drive.

Continuing to help Ukrainian refugees and students

Google will continue to search for ways it can partner with Ukraine's Ministry of Education and Science, and those of bordering countries, to help those impacted by the war in Ukraine - including supporting the millions of school-age refugees to access education in this difficult and trying time.

HBD to us! Let’s celebrate with Street View adventures

Street View is turning 15, and the birthday nostalgia is hitting us hard.

In 2007, we published our first Street View images of San Francisco, New York, Las Vegas, Miami and Denver. Since then, Street View cars equipped with cameras have captured and shared more than 220 billion Street View images and mapped 10 million miles — the equivalent of circling the globe more than 400 times! We’ve also captured Street View imagery inside cultural landmarks, high up in space and deep under the ocean.

To celebrate Street View’s 15th birthday, we’re sharing 15 amazing Street View collections — including three places the world’s been loving lately, four new collections (consider this our party favor to you), and Street View images that make us feel some kind of way. So raise your glasses — er, cursors — and let's cheers to exploring the world together.

Where you’ve been exploring and new places to go

With so many places and landmarks at your fingertips, three spots in particular piqued your interest over the past year. Here are the three most popular places to explore on Street View: head up to the 154th floor of the Burj Khalifa in the United Arab Emirates, the world’s tallest building; take in the iconic Eiffel Tower in France, complete with dazzling views of Paris from the top; and visit our special collection of imagery from the Taj Mahal in India.

And for your next Street View excursions, we’ve started rolling out four new collections that we think will become all-time favorites.

A Street View image of the Pyramids of Meroë in Sudan

The Pyramids of Meroë in Sudan: Thanks to new panoramic imagery, explore the ancient pyramids that are home to tombs of the kings and queens of the Kushite Kingdom.

A Street View image of the Crypt in the Duomo in Milan

The Duomo in Milan: The Duomo is the largest cathedral in Italy and the third-largest cathedral in Europe. Not to mention, it boasts one of the best views of Milan. We’ve been working with Google Arts & Culture and the Duomo of Milan since 2019 to bring imagery from inside the Duomo to Street View so that everyone can get a behind-the-scenes look at this architectural and cultural gem — and it’s now live!

A Street View image of Paris from Les Invalides’ golden dome

Les Invalides in Paris: Before the Eiffel Tower, Les Invalides’ golden dome was the highest point in Paris. New images of the historic Hôtel des Invalides buildings let you explore its museums and monuments. Learn more about French military history via a virtual tour.

Sydney Ferries in Australia: The iconic Sydney Ferries will soon be digitally preserved as a result of our work with Transport for New South Wales and Transdev. Later this year, we’ll bring this collection onto Street View so that people around the world can take a virtual tour of Sydney Ferries and get a glimpse of the journey along Sydney’s stunning harbor.

8 Street View images we love

With endless places to explore, it’s difficult to pick favorites — really, you should have seen the list we narrowed this down from — but we gave it our best shot. Here are eight Street View images we love.

Street View image of the active Ambrym Volcano Marum Crater

Does the thought of visiting an active volcano scare you? Us too! A New Zealand-based Googler took a trekker into the active Ambrym Volcano Marum Crater in Vanuatu so you don’t have to.

Street View image of a Greek town next to the ocean

Monemvasia is a Greek town whose name is derived from two Greek words meaning “single entry.” Fittingly, there is only one way into this rock fortress. Explore the town on Street View without the headache of getting there.

Street View image of an empty chamber with a large chandelier

The Wieliczka Salt Mine in Poland is a UNESCO site with a chamber where all decorative elements are made of salt.

Street View image of a grassy hill overlooking the ocean

Calling all scary movie buffs! Can you guess which 1998 horror film this active volcano in Japan served as a backdrop for? (Hint: the title rhymes with “The Wing.”)

Street View image of two camels in front of a castle in the rocks

Does Petra, Jordan look familiar? How about here? The filming location has made cameos in a number of movies, including “Aladdin,” “The Mummy Returns,” “Indiana Jones and the Last Crusade” and “Transformers: Revenge of the Fallen.”

Street View image inside the International Space Station looking down at Earth.

Thanks to a collaboration with NASA, Street Viewers can get a taste of what it’s like to be an astronaut. Ditch the gravity and float through the International Space Station.

Street View image of sea lions swimming underwater.

Dive into the Pacific Ocean and swim with sea lions off the shore of the Galapagos Islands.

Street View image of a person in a horse mask eating a banana next to a table on the side of the road

And if there’s one Street View image that lives in our heads rent free… it's this horse eating a banana on the side of the road in Canada.

We’re proud of the work we’ve done to capture so much of the world’s wonder, history and quirkiness in Street View. But we’d be remiss if we didn’t give a shout out to all of the Maps users around the world who have captured and shared their own Street View imagery. To help make exploring the world together even easier, we’re launching Street View Studio — a new platform with all the tools you need to publish 360 image sequences quickly and in bulk. Check out more ways we’re advancing Street View so we can explore together for another 15 years.