Category Archives: Google Webmaster Central Blog

Official news on crawling and indexing sites for the Google index

Crawling December: CDNs and crawling

Content delivery networks (CDNs) are particularly well suited for decreasing the latency of your website and, in general, keeping web traffic-related headaches away. This is their primary purpose, after all: speedy delivery of your content even if your site is getting loads of traffic. The "D" in CDN stands for delivering, or distributing, the content across the world, so transfer times to your users are also lower than when hosting in a single data center somewhere. In this post we're going to explore how to make use of CDNs in a way that improves crawling and users' experience on your site, and we'll also look at some nuances of crawling CDN-backed sites.
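A quick way to see how a CDN-backed page is being served is to look at the caching-related response headers it returns. Below is a minimal Python sketch; the URL is a placeholder, and "X-Cache" is only an example of a vendor-specific header that varies between CDN providers.

```python
import urllib.request

# Placeholder URL; point this at a page on your own CDN-backed site.
url = "https://www.example.com/"

request = urllib.request.Request(url, headers={"User-Agent": "cache-check/1.0"})
with urllib.request.urlopen(request) as response:
    # Standard caching-related headers; vendor-specific ones like "X-Cache"
    # differ between CDN providers and may be absent entirely.
    for name in ("Cache-Control", "Age", "ETag", "Last-Modified", "X-Cache"):
        value = response.headers.get(name)
        if value is not None:
            print(f"{name}: {value}")
```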

Search Central Live Kuala Lumpur and Taipei 2024: Recap

The Search Central Live events in Kuala Lumpur and Taipei were nothing short of amazing, in large part thanks to the over 600 people who attended! We were thrilled to see the level of enthusiasm and engagement from attendees, even though, on the day before the Taipei event, we collectively had to deal with Typhoon Kong-rey, the first supertyphoon in Taiwan's history to make landfall after mid-October. Here's a deeper dive into what made these events so special and what's next.

An improved way to view your recent performance data in Search Console

To help you better monitor the recent performance of your content, we're adding a '24 hours' view to the Search Console performance reports and improving the freshness of the data. We're rolling out these changes to all properties gradually over the next few months, so you might not see them right away.

Crawling December: HTTP caching

Allow us to cache, pretty please. As the internet grew over the years, so did the amount Google crawls. While Google's crawling infrastructure has always supported heuristic caching mechanisms, the share of requests that can be served from local caches has decreased: 10 years ago about 0.026% of total fetches were cacheable, which was already not that impressive; today that number is 0.017%.
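One way a site can make its pages cacheable for clients, including crawlers, is to support HTTP conditional requests: send an ETag (or Last-Modified) header with the response, and answer 304 Not Modified when a request carries a matching If-None-Match (or If-Modified-Since) header. Here's a minimal Python sketch of that behavior; the page content and ETag value are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical static page and ETag value, for illustration only.
PAGE = b"<html><body>Hello, crawlers!</body></html>"
ETAG = '"v1-example"'

class ConditionalHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # If the client already has this version cached, confirm it with
        # 304 Not Modified and skip re-sending the body.
        if self.headers.get("If-None-Match") == ETAG:
            self.send_response(304)
            self.send_header("ETag", ETAG)
            self.end_headers()
            return
        # Otherwise serve the full page along with its validator.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("ETag", ETAG)
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), ConditionalHandler).serve_forever()
```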

Crawling December: The how and why of Googlebot crawling

You may have heard that Google Search needs to do a bit of work before a web page can show up in the search results. One of these steps is called crawling. Crawling for Google Search is done by Googlebot, a program running on Google servers that retrieves a URL and handles things like network errors, redirects, and other small complications it might encounter as it works its way through the web. But there are a few details that aren't often talked about. Each week this month we're going to explore some of those details, as they may have a significant effect on how your sites are crawled.
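To make the "retrieves a URL and handles complications" part concrete, here's a toy fetcher in Python. It's only an illustrative sketch, not how Googlebot is actually implemented; the user-agent string and URL are placeholders.

```python
import urllib.error
import urllib.request

def fetch(url: str) -> bytes | None:
    """Toy fetcher: retrieve one URL, letting urllib follow redirects,
    and treat errors as "skip for now" the way a crawler might."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "toy-crawler/1.0"}  # placeholder user agent
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.read()  # 3xx redirects are followed automatically
    except urllib.error.HTTPError as err:
        print(f"{url}: HTTP error {err.code}")          # e.g. 404 or 500
    except urllib.error.URLError as err:
        print(f"{url}: network error ({err.reason})")   # DNS failure, timeout, ...
    return None

if __name__ == "__main__":
    body = fetch("https://www.example.com/")
    print(f"fetched {len(body or b'')} bytes")
```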

Updating our site reputation abuse policy

Earlier this year, as part of our work to fight spam and deliver a great Search experience, we launched a spam policy to combat site reputation abuse. We're clarifying our policy language to further target this type of spammy behavior. We're making it clear that using third-party content on a site in an attempt to exploit the site's ranking signals is a violation of this policy — regardless of whether there is first-party involvement or oversight of the content.