Category Archives: Google Webmaster Central Blog

Official news on crawling and indexing sites for the Google index

Getting your site ready for mobile-first indexing

When we announced almost a year ago that we're experimenting with mobile-first indexing, we said we'd update publishers about our progress, something that we've done the past few months through public talks in office hours on Hangouts on Air and at conferences like Pubcon.

To recap, currently our crawling, indexing, and ranking systems typically look at the desktop version of a page's content, which may cause issues for mobile searchers when that version is vastly different from the mobile version. Mobile-first indexing means that we'll use the mobile version of the content for indexing and ranking, to better help our – primarily mobile – users find what they're looking for. Webmasters will see significantly increased crawling by Smartphone Googlebot, and the snippets in the results, as well as the content on the Google cache pages, will be from the mobile version of the pages.

As we said, sites that make use of responsive web design and correctly implement dynamic serving (that include all of the desktop content and markup) generally don't have to do anything. Here are some extra tips that help ensure a site is ready for mobile-first indexing:
  • Make sure the mobile version of the site also has the important, high-quality content. This includes text, images (with alt-attributes), and videos - in the usual crawlable and indexable formats.
  • Structured data is important for indexing and search features that users love: it should be both on the mobile and desktop version of the site. Ensure URLs within the structured data are updated to the mobile version on the mobile pages.
  • Metadata should be present on both versions of the site. It provides hints about the content on a page for indexing and serving. For example, make sure that titles and meta descriptions are equivalent across both versions of all pages on the site.
  • No changes are necessary for interlinking with separate mobile URLs (m.-dot sites). For sites using separate mobile URLs, keep the existing link rel=canonical and link rel=alternate elements between these versions.
  • Check hreflang links on separate mobile URLs. When using link rel=hreflang elements for internationalization, link between mobile and desktop URLs separately. Your mobile URLs' hreflang should point to the other language/region versions on other mobile URLs, and similarly link desktop with other desktop URLs using hreflang link elements there.
  • Ensure the servers hosting the site have enough capacity to handle potentially increased crawl rate. This doesn't affect sites that use responsive web design and dynamic serving, only sites where the mobile version is on a separate host, such as m.example.com.
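
For sites with separate mobile URLs, the canonical/alternate and hreflang tips above can be sketched in Python. This is a minimal illustration, not an official tool: the `head_links` function, the `sites` mapping, the example hosts, and the `max-width: 640px` media query are assumptions made for the example, and it assumes every language version uses the same path.

```python
from html import escape

# Media query used on the desktop page to point at its mobile counterpart.
# The 640px breakpoint is an assumption for this sketch.
MOBILE_MEDIA = "only screen and (max-width: 640px)"


def head_links(path, lang, sites, mobile):
    """Return the <link> elements for one page.

    sites maps a language code to a (desktop_root, mobile_root) pair.
    Desktop pages get hreflang links to desktop URLs only, and mobile
    pages get hreflang links to mobile URLs only.
    """
    desktop_root, mobile_root = sites[lang]
    if mobile:
        # The mobile page keeps its canonical pointing at the desktop URL.
        tags = [f'<link rel="canonical" href="{escape(desktop_root + path)}">']
    else:
        # The desktop page keeps its alternate pointing at the mobile URL.
        tags = [
            f'<link rel="alternate" media="{MOBILE_MEDIA}" '
            f'href="{escape(mobile_root + path)}">'
        ]
    for code, (d_root, m_root) in sorted(sites.items()):
        root = m_root if mobile else d_root
        tags.append(
            f'<link rel="alternate" hreflang="{escape(code)}" '
            f'href="{escape(root + path)}">'
        )
    return tags


# Hypothetical hosts, for illustration only.
sites = {
    "en": ("https://example.com", "https://m.example.com"),
    "de": ("https://example.de", "https://m.example.de"),
}
mobile_tags = head_links("/shoes", "en", sites, mobile=True)
desktop_tags = head_links("/shoes", "en", sites, mobile=False)
```

Here the hreflang set includes the page's own language as well as the other versions, which is a common convention rather than something stated above.
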
We will be evaluating sites independently on their readiness for mobile-first indexing based on the above criteria and transitioning them when ready. This process has already started for a handful of sites and is being closely monitored by the search team.

We continue to be cautious with rolling out mobile-first indexing. We believe taking this slowly will help webmasters get their sites ready for mobile users, and because of that, we currently don't have a timeline for when it's going to be completed. If you have any questions, drop by our Webmaster forums or our public events.

Posted by Gary

#NoHacked 3.0: Tips on prevention

Last week on #NoHacked, we shared tips on hack detection and the reasons why you might get hacked. This week we focus on prevention, and here are some tips for you!

  • Be mindful of your sources! Be very careful with free premium themes and plugins!

You have probably heard about free premium plugins! If you've ever stumbled upon a site offering, for free, plugins that you would normally have to purchase, be very careful. Many hackers lure you in by copying a popular plugin and then adding backdoors or malware that will allow them to access your site. Read more about a similar case on the Sucuri blog. Additionally, even legitimate, good-quality plugins and themes can become dangerous if:

  • you do not update them as soon as a new version becomes available
  • the developer of the theme or plugin stops updating it, and it becomes outdated over time.
In any case, keeping all your site's software modern and updated is essential in keeping hackers out of your website.

  • Botnets in WordPress
    A botnet is a cluster of machines, devices, or websites under the control of a third party, often used to commit malicious acts such as operating spam campaigns, clickbots, or DDoS attacks. It's difficult to detect if your site has been infected by a botnet because there are often no specific changes to your site. However, your site's reputation, resources, and data are at risk if your site is in a botnet. Learn more about botnets, how to detect them, and how they can affect your site in the Botnet in WordPress and Joomla article.

As usual, if you have any questions, post on our Webmaster Help Forums for help from the friendly community. See you next week!

A revamped SEO Starter Guide

There are lots of resources out there to create great websites. Website owners often ask Google what our recommended practices are to make sure great websites are search-engine-friendly. Traditionally, our resources for beginners were the SEO Starter Guide and the Webmaster Academy. To help webmasters create modern, search-engine-friendly websites, we’re announcing today the launch of a new, updated SEO Starter Guide.

The traditional SEO Starter Guide lists best practices that make it easier for search engines to crawl, index and understand content on websites. The Webmaster Academy has the information and tools to teach webmasters how to create a site and have it found in Google Search. Since these two resources have some overlapping purpose and content, and could be more exhaustive on some aspects of creating a user friendly and safe website, we’re deprecating the Webmaster Academy and removing the old SEO Starter Guide PDF.



The updated SEO Starter Guide will replace both the old Starter Guide and the Webmaster Academy. The updated version builds on top of the previously available document, and has additional sections on the need for search engine optimization, adding structured data markup and building mobile-friendly websites.
This new Guide is available in nine languages (English, German, Spanish, French, Italian, Japanese, Portuguese, Russian and Turkish) starting today, and we’ll be adding sixteen more languages very soon.

Go check out the new SEO Starter Guide, and let us know what you think about it.

For any questions, feel free to drop by our Webmaster Help Forums!

Posted by Abhas Tripathi, Search Quality Strategist

#NoHacked 3.0: How do I know if my site is hacked?

Last week, #NoHacked returned to our G+ and Twitter channels! #NoHacked is our social campaign which aims to raise awareness about hacking attacks and offer tips on how to keep your sites safe from hackers. This time we would like to start sharing content from the #NoHacked campaign on this blog in your local language!

Why do sites get hacked? Hackers have different motives for compromising a website, and hack attacks can be very different, so they are not always easily detected. Here are some tips which will help you detect hacked sites!

  • Getting started:

    Start with our guide "How do I know if my site is hacked?" if you've received a security alert from Google or another party. This guide will walk you through basic steps to check for any signs of compromises on your site.

  • Understand the alert on Google Search:

    At Google, we have different processes to deal with hacking scenarios. Scanning tools will often detect malware, but they can miss some spamming hacks. A clean verdict from Safe Browsing does not mean that you haven't been hacked to distribute spam.

    • If you ever see "This site may be hacked", your site may have been hacked to display spam. Essentially, your site has been hijacked to serve some free advertising.
    • If you see "This site may harm your computer" beneath the site URL, then we think the site you're about to visit might allow programs to install malicious software on your computer.
    • If you see a big red screen before your site, that can mean a variety of things:
      • If you see "The site ahead contains malware", Google has detected that your site distributes malware.
      • If you see "The site ahead contains harmful programs", then the site has been flagged for distributing unwanted software.
      • "Deceptive site ahead" warnings indicate that your site may be serving phishing or social engineering. Your site could have been hacked to do any of these things.
  • Malvertising vs Hack:

    Malvertising happens when your site loads a bad ad. It may make it seem as though your site has been hacked, perhaps by redirecting your visitors, but in fact it is just an ad behaving badly.

  • Open redirects: check if your site is enabling open redirects

    Hackers might want to take advantage of a good site to mask their URLs. One way they do this is by using open redirects, which allow them to use your site to redirect users to any URL of their choice. You can read more here!

  • Mobile check: make sure to view your site from a mobile browser in incognito mode. Check for bad mobile ad networks.

    Sometimes bad content like ads or other third-party elements unknowingly redirect mobile users. This behavior can easily escape detection because it's only visible from certain browsers. Be sure to check that the mobile and desktop versions of your site show the same content.

  • Use Search Console and get messages:

    Search Console is a tool that Google uses to communicate with you about your website. It also includes many other tools that can help you improve and manage your website. Make sure you have your site verified in Search Console even if you aren't a primary developer on your site. The alerts and messages in Search Console will let you know if Google has detected any critical errors on your site.
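
For the open redirects tip above, one way to check a redirect target before using it can be sketched in Python. This is a minimal illustration under stated assumptions, not a complete defense: the `is_safe_redirect` function and the `allowed_hosts` set are hypothetical names, and the policy (same-site paths plus an explicit allowlist of hosts) is one possible choice.

```python
from urllib.parse import urlsplit


def is_safe_redirect(target: str, allowed_hosts: set) -> bool:
    """Return True only for same-site paths or explicitly allowed hosts."""
    # Reject empty values, backslashes (some browsers treat them as slashes),
    # and whitespace or control characters that can confuse URL parsers.
    if not target or "\\" in target:
        return False
    if any(ch.isspace() or ord(ch) < 0x20 for ch in target):
        return False
    parts = urlsplit(target)
    if not parts.scheme and not parts.netloc:
        # A plain relative target must be an absolute path on this site.
        return parts.path.startswith("/")
    if parts.scheme not in ("http", "https"):
        # Covers javascript:, data:, and scheme-relative //host targets.
        return False
    return parts.hostname in allowed_hosts


# Hypothetical allowlist, for illustration only.
allowed = {"www.example.com"}
checks = {
    "/account": is_safe_redirect("/account", allowed),
    "https://www.example.com/ok": is_safe_redirect("https://www.example.com/ok", allowed),
    "https://evil.example/": is_safe_redirect("https://evil.example/", allowed),
    "//evil.example/x": is_safe_redirect("//evil.example/x", allowed),
    "javascript:alert(1)": is_safe_redirect("javascript:alert(1)", allowed),
}
```

A redirect handler would call a check like this on the user-supplied URL parameter and fall back to a fixed page when it returns False.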

If you're still unable to find any signs of a hack, ask a security expert or post on our Webmaster Help Forums for a second look.

The #NoHacked campaign will run for the next 3 weeks. Follow us on our G+ and Twitter channels or look out for the content on this blog, as we will be posting a summary for each week right here at the beginning of each week! Stay safe in the meantime!

Rendering AJAX-crawling pages

The AJAX crawling scheme was introduced as a way of making JavaScript-based webpages accessible to Googlebot, and we've previously announced our plans to turn it down. Over time, Google engineers have significantly improved rendering of JavaScript for Googlebot. Given these advances, in the second quarter of 2018, we'll be switching to rendering these pages on Google's side, rather than requiring that sites do this themselves. In short, we'll no longer be using the AJAX crawling scheme.

As a reminder, the AJAX crawling scheme accepts pages with either a "#!" in the URL or a "fragment meta tag" on them, and then crawls them with an "?_escaped_fragment_=" in the URL. That escaped version needs to be a fully-rendered and/or equivalent version of the page, created by the website itself.
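
The URL mapping described above can be sketched in Python, which may help when comparing a #! URL with its escaped counterpart. This is a minimal sketch: the function names are hypothetical, and the set of characters that get percent-escaped (control characters and space, #, %, &, +, and bytes 0x7F and above) is an assumption based on the published scheme, so verify it against the specification before relying on it.

```python
from urllib.parse import urlsplit, urlunsplit


def _escape_fragment(fragment: str) -> str:
    """Percent-escape the bytes assumed to need escaping in the scheme."""
    out = []
    for byte in fragment.encode("utf-8"):
        if byte <= 0x20 or byte in (0x23, 0x25, 0x26, 0x2B) or byte >= 0x7F:
            out.append(f"%{byte:02X}")
        else:
            out.append(chr(byte))
    return "".join(out)


def to_escaped_fragment_url(url: str) -> str:
    """Map a #! URL to its ?_escaped_fragment_= form; other URLs are unchanged."""
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url
    param = "_escaped_fragment_=" + _escape_fragment(parts.fragment[1:])
    # Append to an existing query string if there is one.
    query = f"{parts.query}&{param}" if parts.query else param
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


simple = to_escaped_fragment_url("https://example.com/page#!key=value")
with_query = to_escaped_fragment_url("https://example.com/?a=1#!x&y")
```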

With this change, Googlebot will render the #! URL directly, making it unnecessary for the website owner to provide a rendered version of the page. We'll continue to support these URLs in our search results.

We expect that most AJAX-crawling websites won't see significant changes with this update. Webmasters can double-check their pages as detailed below, and we'll be sending notifications to any sites with potential issues.

If your site is currently using either #! URLs or the fragment meta tag, we recommend:

  • Verify ownership of the website in Google Search Console to gain access to the tools there, and to allow Google to notify you of any issues that might be found.
  • Test with Search Console's Fetch & Render. Compare the results of the #! URL and the escaped URL to see any differences. Do this for any significantly different part of the website. Check our developer documentation for more information on supported APIs, and see our debugging guide when needed.
  • Use Chrome's Inspect Element to confirm that links use "a" HTML elements and include a rel=nofollow where appropriate (for example, in user-generated content)
  • Use Chrome's Inspect Element to check the page's title and description meta tag, any robots meta tag, and other meta data. Also check that any structured data is available on the rendered page.
  • Content in Flash, Silverlight, or other plugin-based technologies needs to be converted to either JavaScript or "normal" HTML, if their content should be indexed in search.
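
As a complement to the manual checks with Inspect Element listed above, the same items can be pulled out of a saved copy of the rendered HTML with Python's standard library. This is a rough sketch, not a replacement for Fetch & Render: the `HeadAudit` class is a hypothetical helper, and it only inspects the title, named meta tags, and "a" elements with an href.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collect the title, named meta tags, and <a href> links of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}    # meta name (lowercased) -> content
        self.links = []   # (href, rel) pairs for "a" elements
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append((attrs["href"], attrs.get("rel") or ""))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Made-up rendered HTML, for illustration only.
rendered = (
    "<html><head><title>Shop</title>"
    '<meta name="description" content="All items">'
    '<meta name="robots" content="noindex">'
    "</head><body>"
    '<a href="/a" rel="nofollow">x</a>'
    '<span onclick="go()">y</span>'
    "</body></html>"
)
audit = HeadAudit()
audit.feed(rendered)
```

In this made-up page, the clickable span does not appear in `audit.links`, which is the kind of navigation the "a" element check is meant to surface.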

We hope that this change makes it a bit easier for your website, and reduces the need to render pages on your end. Should you have any questions or comments, feel free to drop by our webmaster help forums, or to join our JavaScript sites working group.


A reminder about “event” markup

Lately we’ve been receiving feedback from users seeing non-events like coupons or vouchers showing up in search results where “events” snippets appear. This is really confusing for users and also against our guidelines, where we have added additional clarification.

So, what’s the problem?

We’ve seen a number of  publishers in the coupons/vouchers space use the “event” markup to describe their offers. And as much as using a discount voucher can be a very special thing, that doesn’t make coupons or vouchers events or “saleEvents”. Using Event markup to describe something that is not an event creates a bad user experience, by triggering a rich result for something that will happen at a particular time, despite no actual event being present.

Here are some examples to illustrate the issue:

Since this creates a misleading user experience, we may take manual action on such cases. In case your website is affected by such a manual action, you will find a notification in your Search Console account. If a manual action is taken, it can result in structured data markup for the whole site not being used for search results.  

While we’re specifically highlighting coupons and vouchers in this blogpost, this applies to all other non-event items being annotated with “event” markup as well -- or, really, to applying any type of markup to something other than the type of thing it is meant to describe.

For more information, please visit our developer documentation or stop by our Webmaster Forum in case you have additional questions!


Engaging users through high quality AMP pages

To improve our users' experience with AMP results, we are making changes to how we enforce our policy on content parity with AMP. Starting Feb 1, 2018, the policy requires that the AMP page content be comparable to the (original) canonical page content. AMP is not a ranking signal and there is no change in terms of the ranking policy with respect to AMP.

The open source accelerated mobile pages project (AMP) launched in 2015 and has seen tremendous growth with over 25M domains having implemented the AMP format. This rapid progress comes with a sense of responsibility of ensuring that our users continue to have a great content consumption experience that ultimately leads to more engagement with publisher content.

In some cases, webmasters publish two versions of their content: a canonical page that is not based on AMP and an AMP page. In the ideal scenario, both these pages have equivalent content leading the user to get the same content but with a faster and smoother experience via AMP.  However, in some cases the content on the AMP page does not match the content on its original (canonical) page.

In a small number of cases, AMP pages are used as teaser pages, which create a particularly bad user experience since they only contain minimal content. In these instances, users have to click twice to get to the real content. Below is an example of what this may look like: a brief excerpt of the main article, followed by a prompt asking the user to click through to another page to finish reading the article.

AMP was introduced to dramatically improve the performance of the web and deliver a fast, consistent content consumption experience. In keeping with this goal, we'll be enforcing the requirement of close parity between the AMP page and its canonical page, for pages that wish to be shown in Google Search as AMPs.

Where we find that an AMP page doesn't contain the same critical content as its non-AMP equivalent, we will direct our users to the non-AMP page. This does not affect Search ranking. However, these pages will not be considered for Search features that require AMP, such as the Top Stories carousel with AMP. Additionally, we will notify the webmaster via Search Console as a manual action message and give the publisher the opportunity to fix the issue before its AMP page can be served again. The AMP open source website has several helpful guides to help produce fast, beautiful and high-performing AMP pages.

We hope this change encourages webmasters to maintain content parity between the canonical page and its AMP equivalent. This will lead to a better experience on your site and ultimately happier users.


Make your site’s complete jobs information accessible to job seekers

In June, we announced a new experience that put the convenience of Search into the hands of job seekers. Today, we are taking the next step in improving the job search experience on Google by adding a feature that shows estimated salary information from the web alongside job postings, as well as adding new UI features for users.

Salary information has been one of the most requested additions from job seekers. This helps people evaluate whether a job is a good fit, and is an opportunity for sites with estimated salary information to:

  • Increase brand awareness: Estimated salary information shows a representative logo from the estimated salary provider.
  • Get more referral traffic: Users can click through directly to salary estimate pages when salary information surfaces in job search results.

If your site provides salary estimates, you can take advantage of these changes in the following ways:

Specify actual salary information

Actual salary refers to the base salary information that is provided by the employer. If your site publishes job listings, you can add JobPosting structured data and populate the baseSalary property to be eligible for inclusion in job search results.

This salary information will be made available in both the list and the detail views.
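
As a rough illustration of the baseSalary property mentioned above, the following Python sketch builds JobPosting markup as JSON-LD. The `job_posting_with_salary` function, the example values, and the choice of an hourly unit are assumptions made for this sketch, and a complete JobPosting needs more properties than are shown here.

```python
import json


def job_posting_with_salary(title, employer, currency, amount, unit):
    """Build minimal JobPosting JSON-LD with an employer-provided base salary."""
    return {
        "@context": "https://schema.org",
        "@type": "JobPosting",
        "title": title,
        "hiringOrganization": {"@type": "Organization", "name": employer},
        "baseSalary": {
            "@type": "MonetaryAmount",
            "currency": currency,
            "value": {
                "@type": "QuantitativeValue",
                "value": amount,
                "unitText": unit,  # for example HOUR, MONTH, or YEAR
            },
        },
    }


# Made-up values, for illustration only.
posting = job_posting_with_salary("Barista", "Example Coffee", "USD", 15.0, "HOUR")
posting_json = json.dumps(posting, indent=2)
```

The resulting JSON would be embedded in the job listing page inside a script element of type application/ld+json.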

Provide estimated salary information

In cases where employers don’t provide actual salary, job seekers may see estimated salaries sourced from multiple partners for the same or similar occupation. If your site provides salary estimate information, you can add Occupation structured data to be eligible for inclusion in job search results.  
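
A salary estimate page could describe its data with Occupation markup along the following lines. This is a minimal Python sketch that builds the JSON-LD: the `occupation_estimate` function, the percentile fields chosen, the yearly duration, and the example figures are all assumptions for illustration, so check the developer documentation for the properties that are actually required.

```python
import json


def occupation_estimate(name, currency, p10, median, p90, country):
    """Build minimal Occupation JSON-LD with a yearly base salary estimate."""
    return {
        "@context": "https://schema.org",
        "@type": "Occupation",
        "name": name,
        "estimatedSalary": [
            {
                "@type": "MonetaryAmountDistribution",
                "name": "base",
                "currency": currency,
                "duration": "P1Y",  # ISO 8601 duration: per year
                "percentile10": p10,
                "median": median,
                "percentile90": p90,
            }
        ],
        "occupationLocation": [{"@type": "Country", "name": country}],
    }


# Made-up figures, for illustration only.
estimate = occupation_estimate("Software Developer", "USD", 60000, 85000, 120000, "US")
estimate_json = json.dumps(estimate, indent=2)
```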

Include exact location information

We've heard from users that having accurate, street-level location information helps them to focus on opportunities that work best for them. Sites that publish job listings can do this by using the jobLocation property in JobPosting structured data.
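
A street-level jobLocation value might look like the following, sketched in Python as the fragment that would sit inside JobPosting markup. The `job_location` function and the address are made up for this example.

```python
import json


def job_location(street, locality, region, postal_code, country):
    """Build the jobLocation value: a Place with a full postal address."""
    return {
        "@type": "Place",
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": locality,
            "addressRegion": region,
            "postalCode": postal_code,
            "addressCountry": country,
        },
    }


# Made-up address, for illustration only.
location = job_location("555 Example St", "Springfield", "IL", "62701", "US")
job_posting_fragment = json.dumps({"jobLocation": location}, indent=2)
```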

Validate your structured data

To double-check the structured data on your pages, we'll be updating the Structured Data Testing Tool and the Search Console reports in the near future. In the meantime, you can monitor the performance of your job postings in Search Analytics. Stay tuned!

Since launching this summer, we’ve seen over 60% growth in the number of companies with jobs showing on Google and connected tens of millions of people to new job opportunities. We are excited to help users find jobs with salaries that meet their needs, and to route them to your site for more information. We invite sites that provide salary estimates to mark up their salary pages using the Occupation structured data. Should you have any questions regarding the use of structured data on your site, feel free to drop by our webmaster help forums.


Enabling more high quality content for users

In Google’s mission to organize the world's information, we want to guide Google users to the highest quality content, the principle exemplified in our quality rater guidelines. Professional publishers provide the lion’s share of quality content that benefits users and we want to encourage their success.

The ecosystem is sustained via two main sources of revenue: ads and subscriptions, with the latter requiring a delicate balance to be effective in Search. Typically subscription content is hidden behind paywalls, so that users who don’t have a subscription don’t have access. Our evaluations have shown that users who are not familiar with the high quality content behind a paywall often turn to other sites offering free content. It is difficult to justify a subscription if one doesn't already know how valuable the content is, and in fact, our experiments have shown that a portion of users shy away from subscription sites. Therefore, it is essential that sites provide some amount of free sampling of their content so that users can learn how valuable their content is.

The First Click Free (FCF) policy for both Google web search and News was designed to address this issue. It offers promotion and discovery opportunities for publishers with subscription content, while giving Google users an opportunity to discover that content. Over the past year, we have worked with publishers to investigate the effects of FCF on user satisfaction and on the sustainability of the publishing ecosystem. We found that while FCF is a reasonable sampling model, publishers are in a better position to determine what specific sampling strategy works best for them. Therefore, we are removing FCF as a requirement for Search, and we encourage publishers to experiment with different free sampling schemes, as long as they stay within the updated webmaster guidelines. We call this Flexible Sampling.

One of the original motivations for FCF was to address the issues surrounding cloaking, where the content served to Googlebot is different from the content served to users. Spammers often seek to game search engines by showing interesting content to the search engine, say healthy food recipes, but then showing users an offer for diet pills. This “bait and switch” scheme creates a bad user experience since users do not get the content they expected. Sites with paywalls are strongly encouraged to apply the new structured data to their pages, because without it, the paywall may be interpreted as a form of cloaking, and the pages would then be removed from search results.
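
The post does not spell out the markup itself, so the following Python sketch is only one plausible shape for it, based on the schema.org isAccessibleForFree property. The `paywalled_article_jsonld` function, the NewsArticle type, and the `.paywall` CSS class are assumptions for illustration, so follow the developer documentation for the exact requirements.

```python
import json


def paywalled_article_jsonld(headline, paywall_selector):
    """Mark an article, and the page element behind the paywall, as not free."""
    return {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "isAccessibleForFree": False,
        "hasPart": {
            "@type": "WebPageElement",
            "isAccessibleForFree": False,
            # CSS selector of the element that wraps the paywalled content.
            "cssSelector": paywall_selector,
        },
    }


# Hypothetical headline and class name, for illustration only.
article = paywalled_article_jsonld("Example headline", ".paywall")
article_json = json.dumps(article, indent=2)
```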

Based on our investigations, we have created detailed best practices for implementing flexible sampling. There are two types of sampling we advise: metering, which provides users with a quota of free articles to consume, after which paywalls will start appearing; and lead-in, which offers a portion of an article’s content without it being shown in full.

For metering, we think that monthly (rather than daily) metering provides more flexibility and a safer environment for testing. The user impact of changing from one integer value to the next is less significant at, say, 10 monthly samples than at 3 daily samples. All publishers and their audiences are different, so there is no single value for optimal free sampling across publishers. However, we recommend that publishers start by providing 10 free clicks per month to Google search users in order to preserve a good user experience for new potential subscribers. Publishers should then experiment to optimize the tradeoff between discovery and conversion that works best for their businesses.
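
To make the metering idea concrete, here is a minimal Python sketch of a monthly meter. The `MonthlyMeter` class, the in-memory counter, and the way users are identified are all assumptions for this example, and a real site would need persistent storage and its own notion of a user.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class MonthlyMeter:
    """Allow each user a fixed number of free articles per calendar month."""

    quota: int = 10  # the suggested starting point of 10 free clicks per month
    _counts: dict = field(default_factory=dict)

    def allow(self, user_id: str, today: date) -> bool:
        """Return True and count the view if the user still has free quota."""
        key = (user_id, today.year, today.month)
        used = self._counts.get(key, 0)
        if used >= self.quota:
            return False  # quota exhausted: show the paywall
        self._counts[key] = used + 1
        return True


# Small quota so the behavior is easy to follow.
meter = MonthlyMeter(quota=2)
october = date(2017, 10, 5)
results = [meter.allow("reader-1", october) for _ in range(3)]
next_month = meter.allow("reader-1", date(2017, 11, 1))
```

Because the key includes the year and month, the count resets at the start of each calendar month without any cleanup step.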

Lead-in is generally implemented as truncated content, such as the first few sentences or 50-100 words of the article. Lead-in allows users a taste of how valuable the content may be. Compared to a page with completely blocked content, lead-in clearly provides more utility and added value to users.
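
Lead-in by word count can be sketched in a few lines of Python. The `lead_in` function and the default of 75 words (a value inside the 50-100 word range mentioned above) are assumptions for this example.

```python
def lead_in(text: str, max_words: int = 75) -> str:
    """Return the first max_words words of an article as a free preview."""
    words = text.split()
    if len(words) <= max_words:
        return text  # short articles are shown in full
    return " ".join(words[:max_words]) + " ..."


preview = lead_in("one two three four", max_words=2)
unchanged = lead_in("short text", max_words=5)
```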

We are excited by this change as it allows the growth of the premium content ecosystem, which ultimately benefits users. We look forward to the prospect of serving users more high quality content!