Tag Archives: crawling and indexing

Google I/O 2018 – What sessions should SEOs and Webmasters watch live ?

Google I/O 2018 is starting today in California, to an international audience of 7,000+ developers. It will run until Thursday night. It is our annual developers festival, where product announcements are made, new APIs and frameworks are introduced, and Product Managers present the latest from Google.

However, you don't have to physically attend the event to take advantage of this once-a-year opportunity: many conferences and talks are live streamed on YouTube for anyone to watch. You will find the full-event schedule here.

Dozens upon dozens of talks will take place over the next 3 days. We have hand picked the talks that we think will be the most interesting for webmasters and SEO professionals. Each link shared will bring you to pages with more details about each talk, and you will find out how to tune in to the live stream. All times are California time (PCT). We might add other sessions to this list.

Tuesday, May 8th

3pm - Web Security post Spectre/Meltdown, with Emily Schechter and Chris Palmer - more info.

5pm - Dru Knox and Stephan Somogyi talk about building a seamless web with Chrome - more info.

Wednesday, May 9th

9.30am - Ewa Gasperowicz and Addy Osmani talk about Web Performance and increasing control over the loading experience - more info.

10.30am - Alberto Medina and Thierry Muller will explain how to make a WordPress site progressive - more info.

11.30am - Rob Dodson and Dominic Mazzoni will cover "What's new in web accessibility" - more info.

3.30pm - Michael Bleigh will introduce how to leverage AMP in Firebase for a blazing fast website - more info.

4.30pm - Rick Viscomi and Vinamrata Singal will introduce the latest with Lighthouse and Chrome UX Report for Web Performance - more info.

Thursday, May 10th

8.30am - John Mueller and Tom Greenaway will talk about building Search-friendly JavaScript websites - more info.

9.30am - Build e-commerce sites for the modern web with AMP, PWA, and more, with Adam Greenberg and Rowan Merewood - more info.

12.30pm - Session on "Building a successful web presence with Google Search" by John Mueller and Mariya Moeva - more info.

This list is only a sample of the content at this year's Google I/O, and there might be many more that are interesting to you! To find out about those other talks, check out the full list of web sessions, but also the sessions about Design, the Cloud sessions, the machine learning sessions, and more…

We hope you can make the time to watch the talks online, and participate in the excitement of I/O ! The videos will also be available on Youtube after the event, in case you can't tune in live.

Posted by Vincent Courson, Search Outreach Specialist, and the Google Webmasters team

Source: Google Webmaster Central Blog

Rolling out mobile-first indexing

Today we’re announcing that after a year and a half of careful experimentation and testing, we’ve started migrating sites that follow the best practices for mobile-first indexing.

To recap, our crawling, indexing, and ranking systems have typically used the desktop version of a page's content, which may cause issues for mobile searchers when that version is vastly different from the mobile version. Mobile-first indexing means that we'll use the mobile version of the page for indexing and ranking, to better help our – primarily mobile – users find what they're looking for.

We continue to have one single index that we use for serving search results. We do not have a “mobile-first index” that’s separate from our main index. Historically, the desktop version was indexed, but increasingly, we will be using the mobile versions of content.

We are notifying sites that are migrating to mobile-first indexing via Search Console. Site owners will see significantly increased crawl rate from the Smartphone Googlebot. Additionally, Google will show the mobile version of pages in Search results and Google cached pages.

To understand more about how we determine the mobile content from a site, see our developer documentation. It covers how sites using responsive web design or dynamic serving are generally set for mobile-first indexing. For sites that have AMP and non-AMP pages, Google will prefer to index the mobile version of the non-AMP page.

Sites that are not in this initial wave don’t need to panic. Mobile-first indexing is about how we gather content, not about how content is ranked. Content gathered by mobile-first indexing has no ranking advantage over mobile content that’s not yet gathered this way or desktop content. Moreover, if you only have desktop content, you will continue to be represented in our index.

Having said that, we continue to encourage webmasters to make their content mobile-friendly. We do evaluate all content in our index -- whether it is desktop or mobile -- to determine how mobile-friendly it is. Since 2015, this measure can help mobile-friendly content perform better for those who are searching on mobile. Related, we recently announced that beginning in July 2018, content that is slow-loading may perform less well for both desktop and mobile searchers.

To recap:

Mobile-indexing is rolling out more broadly. Being indexed this way has no ranking advantage and operates independently from our mobile-friendly assessment.
Having mobile-friendly content is still helpful for those looking at ways to perform better in mobile search results.
Having fast-loading content is still helpful for those looking at ways to perform better for mobile and desktop users.
As always, ranking uses many factors. We may show content to users that’s not mobile-friendly or that is slow loading if our many other signals determine it is the most relevant content to show.

We’ll continue to monitor and evaluate this change carefully. If you have any questions, please drop by our Webmaster forums or our public events.

Posted by Fan Zhang, Software Engineer

Source: Google Webmaster Central Blog

Introducing the new Search Console

A few months ago we released a beta version of a new Search Console experience to a limited number of users. We are now starting to release this beta version to all users of Search Console, so that everyone can explore this simplified process of optimizing a website's presence on Google Search. The functionality will include Search performance, Index Coverage, AMP status, and Job posting reports. We will send a message once your site is ready in the new Search Console.

We started by adding some of the most popular functionality in the new Search Console (which can now be used in your day-to-day flow of addressing these topics). We are not done yet, so over the course of the year the new Search Console (beta) will continue to add functionality from the classic Search Console. Until the new Search Console is complete, both versions will live side-by-side and will be easily interconnected via links in the navigation bar, so you can use both.

The new Search Console was rebuilt from the ground up by surfacing the most actionable insights and creating an interaction model which guides you through the process of fixing any pending issues. We’ve also added ability to share reports within your own organization in order to simplify internal collaboration.

Search Performance: with 16 months of data!

If you've been a fan of Search Analytics, you'll love the new Search Performance report. Over the years, users have been consistent in asking us for more data in Search Analytics. With the new report, you'll have 16 months of data, to make analyzing longer-term trends easier and enable year-over-year comparisons. In the near future, this data will also be available via the Search Console API.

Index Coverage: a comprehensive view on Google's indexing

The updated Index Coverage report gives you insight into the indexing of URLs from your website. It shows correctly indexed URLs, warnings about potential issues, and reasons why Google isn't indexing some URLs. The report is built on our new Issue tracking functionality that alerts you when new issues are detected and helps you monitor their fix.

So how does that work?

When you drill down into a specific issue you will see a sample of URLs from your site. Clicking on error URLs brings up the page details with links to diagnostic-tools that help you understand what is the source of the problem.
Fixing Search issues often involves multiple teams within a company. Giving the right people access to information about the current status, or about issues that have come up there, is critical to improving an implementation quickly. Now, within most reports in the new Search Console, you can do that with the share button on top of the report which will create a shareable link to the report. Once things are resolved, you can disable sharing just as easily.
The new Search Console can also help you confirm that you've resolved an issue, and help us to update our index accordingly. To do this, select a flagged issue, and click validate fix. Google will then crawl and reprocess the affected URLs with a higher priority, helping your site to get back on track faster than ever.
The Index Coverage report works best for sites that submit sitemap files. Sitemap files are a great way to let search engines know about new and updated URLs. Once you've submitted a sitemap file, you can now use the sitemap filter over the Index Coverage data, so that you're able to focus on an exact list of URLs.

Search Enhancements: improve your AMP and Job Postings pages

The new Search Console is also aimed at helping you implement Search Enhancements such as AMP and Job Postings (more to come). These reports provide details into the specific errors and warnings that Google identified for these topics. In addition to the functionally described in the index coverage report, we augmented the reports with two extra features:

The first feature is aimed at providing faster feedback in the process of fixing an issue. This is achieved by running several instantaneous tests once you click the validate fix button. If your pages don’t pass this test we provide you with an immediate notification, otherwise we go ahead and reprocess the rest of the affected pages.
The second new feature is aimed at providing positive feedback during the fix process by expanding the validation log with a list of URLs that were identified as fixed (in addition to URLs that failed the validation or might still be pending).

Similar to the AMP report, the new Search Console provides a Job postings report. If you have jobs listings on your website, you may be eligible to have those shown directly through Google for Jobs (currently only available in certain locations).

Feedback welcome

We couldn’t have gotten so far without the ongoing feedback from our diligent trusted testers (we plan to share more on how their feedback helped us dramatically improve Search Console). However, your continued feedback is critical for us: if there's something you find confusing or wrong, or if there's something you really like, please let us know through the feedback feature in the sidebar. Also note that the mobile experience in the new Search Console is still a work in progress.

We want to end this blog sharing an encouraging response we got from a user who has been testing the new Search Console recently:

"The UX of new Search Console is clean and well laid out, everything is where we expect it to be. I can even kick-off validation of my fixes and get email notifications with the result. It’s been a massive help in fixing up some pesky AMP errors and warnings that were affecting pages on our site. On top of all this, the Search Analytics report now extends to 16 months of data which is a total game changer for us" - Noah Szubski, Chief Product Officer, DailyMail.com

Are there any other tools that would make your life as a webmaster easier? Let us know in the comments here, and feel free to jump into our webmaster help forums to discuss your ideas with others!

Posted by John Mueller, Ofir Roval and Hillel Maoz

Source: Google Webmaster Central Blog

Rendering AJAX-crawling pages

The AJAX crawling scheme was introduced as a way of making JavaScript-based webpages accessible to Googlebot, and we've previously announced our plans to turn it down. Over time, Google engineers have significantly improved rendering of JavaScript for Googlebot. Given these advances, in the second quarter of 2018, we'll be switching to rendering these pages on Google's side, rather than on requiring that sites do this themselves. In short, we'll no longer be using the AJAX crawling scheme.

As a reminder, the AJAX crawling scheme accepts pages with either a "#!" in the URL or a "fragment meta tag" on them, and then crawls them with an "?_escaped_fragment_=" in the URL. That escaped version needs to be a fully-rendered and/or equivalent version of the page, created by the website itself.

With this change, Googlebot will render the #! URL directly, making it unnecessary for the website owner to provide a rendered version of the page. We'll continue to support these URLs in our search results.

We expect that most AJAX-crawling websites won't see significant changes with this update. Webmasters can double-check their pages as detailed below, and we'll be sending notifications to any sites with potential issues.

If your site is currently using either #! URLs or the fragment meta tag, we recommend:

Verify ownership of the website in Google Search Console to gain access to the tools there, and to allow Google to notify you of any issues that might be found.
Test with Search Console's Fetch & Render. Compare the results of the #! URL and the escaped URL to see any differences. Do this for any significantly different part of the website. Check our developer documentation for more information on supported APIs, and see our debugging guide when needed.
Use Chrome's Inspect Element to confirm that links use "a" HTML elements and include a rel=nofollow where appropriate (for example, in user-generated content)
Use Chrome's Inspect Element to check the page's title and description meta tag, any robots meta tag, and other meta data. Also check that any structured data is available on the rendered page.
Content in Flash, Silverlight, or other plugin-based technologies needs to be converted to either JavaScript or "normal" HTML, if their content should be indexed in search.

We hope that this change makes it a bit easier for your website, and reduces the need to render pages on your end. Should you have any questions or comments, feel free to drop by our webmaster help forums, or to join our JavaScript sites working group.

Posted by John Mueller, Google Switzerland

Source: Google Webmaster Central Blog

Closing down for a day

Even in today's "always-on" world, sometimes businesses want to take a break. There are times when even their online presence needs to be paused. This blog post covers some of the available options so that a site's search presence isn't affected.

Option: Block cart functionality

If a site only needs to block users from buying things, the simplest approach is to disable that specific functionality. In most cases, shopping cart pages can either be blocked from crawling through the robots.txt file, or blocked from indexing with a robots meta tag. Since search engines either won't see or index that content, you can communicate this to users in an appropriate way. For example, you may disable the link to the cart, add a relevant message, or display an informational page instead of the cart.

Option: Always show interstitial or pop-up

If you need to block the whole site from users, be it with a "temporarily unavailable" message, informational page, or popup, the server should return a 503 HTTP result code ("Service Unavailable"). The 503 result code makes sure that Google doesn't index the temporary content that's shown to users. Without the 503 result code, the interstitial would be indexed as your website's content.

Googlebot will retry pages that return 503 for up to about a week, before treating it as a permanent error that can result in those pages being dropped from the search results. You can also include a "Retry after" header to indicate how long the site will be unavailable. Blocking a site for longer than a week can have negative effects on the site's search results regardless of the method that you use.

Option: Switch whole website off

Turning the server off completely is another option. You might also do this if you're physically moving your server to a different data center. For this, have a temporary server available to serve a 503 HTTP result code for all URLs (with an appropriate informational page for users), and switch your DNS to point to that server during that time.

Set your DNS TTL to a low time (such as 5 minutes) a few days in advance.
Change the DNS to the temporary server's IP address.
Take your main server offline once all requests go to the temporary server.
… your server is now offline ...
When ready, bring your main server online again.
Switch DNS back to the main server's IP address.
Change the DNS TTL back to normal.

We hope these options cover the common situations where you'd need to disable your website temporarily. If you have any questions, feel free to drop by our webmaster help forums!

PS If your business is active locally, make sure to reflect these closures in the opening hours for your local listings too!

Posted by John Mueller, Webmaster Trends Analyst, Switzerland

Source: Google Webmaster Central Blog

An update on Google’s feature-phone crawling & indexing

Limited mobile devices, "feature-phones", require a special form of markup or a transcoder for web content. Most websites don't provide feature-phone-compatible content in WAP/WML any more. Given these developments, we've made changes in how we crawl feature-phone content (note: these changes don't affect smartphone content):

1. We've retired the feature-phone Googlebot

We won't be using the feature-phone user-agents for crawling for search going forward.

2. Use "handheld" link annotations for dynamic serving of feature-phone content.

Some sites provide content for feature-phones through dynamic serving, based on the user's user-agent. To understand this configuration, make sure your desktop and smartphone pages have a self-referential alternate URL link for handheld (feature-phone) devices:

<link rel="alternate" media="handheld" href="[current page URL]" />

This is a change from our previous guidance of only using the "vary: user-agent" HTTP header. We've updated our documentation on making feature-phone pages accordingly. We hope adding this link element is possible on your side, and thank you for your help in this regard. We'll continue to show feature-phone URLs in search when we can recognize them, and when they're appropriate for users.

3. We're retiring feature-phone tools in Search Console

Without the feature-phone Googlebot, special sitemaps extensions for feature-phone, the Fetch as Google feature-phone options, and feature-phone crawl errors are no longer needed. We continue to support sitemaps and other sitemaps extensions (such as for videos or Google News), as well as the other Fetch as Google options in Search Console.

We've worked to make these changes as minimal as possible. Most websites don't serve feature-phone content, and wouldn't be affected. If your site has been providing feature-phone content, we thank you for your help in bringing the Internet to feature-phone users worldwide!

For any questions, feel free to drop by our Webmaster Help Forums!

Posted by John Mueller, Webmaster Trends Analyst, Google Switzerland

Source: Google Webmaster Central Blog

Deprecating our AJAX crawling scheme

tl;dr: We are no longer recommending the AJAX crawling proposal we made back in 2009.

In 2009, we made a proposal to make AJAX pages crawlable. Back then, our systems were not able to render and understand pages that use JavaScript to present content to users. Because "crawlers … [were] not able to see any content … created dynamically," we proposed a set of practices that webmasters can follow in order to ensure that their AJAX-based applications are indexed by search engines.

Times have changed. Today, as long as you're not blocking Googlebot from crawling your JavaScript or CSS files, we are generally able to render and understand your web pages like modern browsers. To reflect this improvement, we recently updated our technical Webmaster Guidelines to recommend against disallowing Googlebot from crawling your site's CSS or JS files.

Since the assumptions for our 2009 proposal are no longer valid, we recommend following the principles of progressive enhancement. For example, you can use the History API pushState() to ensure accessibility for a wider range of browsers (and our systems).

Questions and answers

Q: My site currently follows your recommendation and supports _escaped_fragment_. Would my site stop getting indexed now that you've deprecated your recommendation?
A: No, the site would still be indexed. In general, however, we recommend you implement industry best practices when you're making the next update for your site. Instead of the _escaped_fragment_ URLs, we'll generally crawl, render, and index the #! URLs.

Q: Is moving away from the AJAX crawling proposal to industry best practices considered a site move? Do I need to implement redirects?
A: If your current setup is working fine, you should not have to immediately change anything. If you're building a new site or restructuring an already existing site, simply avoid introducing _escaped_fragment_ urls. .

Q: I use a JavaScript framework and my webserver serves a pre-rendered page. Is that still ok?
A: In general, websites shouldn't pre-render pages only for Google -- we expect that you might pre-render pages for performance benefits for users and that you would follow progressive enhancement guidelines. If you pre-render pages, make sure that the content served to Googlebot matches the user's experience, both how it looks and how it interacts. Serving Googlebot different content than a normal user would see is considered cloaking, and would be against our Webmaster Guidelines.

If you have any questions, feel free to post them here, or in the webmaster help forum.

Posted by Kazushi Nagayama, Search Quality Analyst

googblogs.com

All Google blogs and Press in one site

Tag Archives: crawling and indexing

Google I/O 2018 – What sessions should SEOs and Webmasters watch live ?

Source: Google Webmaster Central Blog

Rolling out mobile-first indexing

Source: Google Webmaster Central Blog

Introducing the new Search Console

Search Performance: with 16 months of data!

Index Coverage: a comprehensive view on Google's indexing

Search Enhancements: improve your AMP and Job Postings pages

Feedback welcome

Source: Google Webmaster Central Blog

Rendering AJAX-crawling pages

Source: Google Webmaster Central Blog

Closing down for a day

Option: Block cart functionality

Option: Always show interstitial or pop-up

Option: Switch whole website off

Source: Google Webmaster Central Blog

An update on Google’s feature-phone crawling & indexing

Source: Google Webmaster Central Blog

Deprecating our AJAX crawling scheme

Source: Google Webmaster Central Blog