Tag Archives: Structured Data

Help Google Search know the best date for your web page

Sometimes, Google shows dates next to listings in its search results. In this post, we’ll answer some commonly-asked questions webmasters have about how these dates are determined and provide some best practices to help improve their accuracy.

How dates are determined

Google shows the date of a page when its automated systems determine that it would be relevant to do so, such as for pages that can be time-sensitive, including news content:

Google determines a date using a variety of factors, including but not limited to: any prominent date listed on the page itself or dates provided by the publisher through structured markup.

Google doesn’t depend on one single factor because all of them can be prone to issues. Publishers may not always provide a clear visible date. Sometimes, structured data may be lacking or may not be adjusted to the correct time zone. That’s why our systems look at several factors to come up with what we consider to be our best estimate of when a page was published or significantly updated.

How to specify a date on a page

To help Google to pick the right date, site owners and publishers should:

  • Show a clear date: Show a visible date prominently on the page.
  • Use structured data: Use the datePublished and dateModified schema with the correct time zone designator for AMP or non-AMP pages. When using structured data, make sure to use the ISO 8601 format for dates.

Guidelines specific to Google News

Google News requires clearly showing both the date and the time that content was published or updated. Structured data alone is not enough, though it is recommended to use in addition to a visible date and time. Date and time should be positioned between the headline and the article text. For more guidance, also see our help page about article dates.

If an article has been substantially changed, it can make sense to give it a fresh date and time. However, don't artificially freshen a story without adding significant information or some other compelling reason for the freshening. Also, do not create a very slightly updated story from one previously published, then delete the old story and redirect to the new one. That's against our article URLs guidelines.

More best practices for dates on web pages

In addition to the most important requirements listed above, here are additional best practices to help Google determine the best page to consider showing for a web page:

  • Show when a page has been updated: If you update a page significantly, also update the visible date (and time, if you display that). If desired, you can show two dates: when a page was originally published and when it was updated. Just do so in a way that’s visually clear to your readers. If showing both dates, it’s also highly recommended to use datePublished and dateModified for AMP or non-AMP pages to make it easier for algorithms to recognize.
  • Use the right time zone: If specifying a time, make sure to provide the correct timezone, taking into account daylight saving time as appropriate.
  • Be consistent in usage. Within a page, make sure to use exactly the same date (and, potentially, time) in structured data as well as in the visible part of the page. Make sure to use the same timezone if you specify one on the page.
  • Don’t use future dates or dates related to what a page is about: Always use a date for when a page itself was published or updated, not a date linked to something like an event that the page is writing about, especially for events or other subjects that happen in the future (you may use Event markup separately, if appropriate).
  • Follow Google's structured data guidelines: While Google doesn't guarantee that a date (or structured data in general) specified on a page will be used, following our structured data guidelines does help our algorithms to have it available in a machine-readable way.
  • Troubleshoot by minimizing other dates on the page: If you’ve followed the best practices above and find incorrect dates are being selected, consider if you can remove or minimize other dates that may appear on the page, such as those that might be next to related stories.

We hope these guidelines help to make it easier to specify the right date on your website's pages! For questions or comments on this, or other structured data topics, feel free to drop by our webmaster help forums.


Help customers discover your products on Google

People come to Google to discover new brands and products throughout their shopping journey. On Search and Google Images, shoppers are provided with rich snippets like product description, ratings, and price to help guide purchase decisions.

Connecting potential customers with up-to-date and accurate product information is key to successful shopping journeys on Google, so today, we’re introducing new ways for merchants to provide this information to improve results for shoppers.

  1. Search Console

    Many retailers and brands add structured data markup to their websites to ensure Google understands the products they sell. A new report for ‘Products’ is now available in Search Console for sites that use schema.org structured data markup to annotate product information. The report allows you to see any pending issues for markup on your site. Once an issue is fixed, you can use the report to validate if your issues were resolved by re-crawling your affected pages. Learn more about the rich result status reports

  1. Merchant Center

    While structured data markup helps Google properly display your product information when we crawl your site, we are expanding capabilities for all retailers to directly provide up-to-date product information to Google in real-time. Product data feeds uploaded to Google Merchant Center will now be eligible for display in results on surfaces like Search and Google Images. This product information will be ranked based only on relevance to users’ queries, and no payment is required or accepted for eligibility. We’re starting with the expansion in the US, and support for other countries will be announced later in the year.

    Get started

    You don’t need a Google Ads campaign to participate. If you don’t have an existing account and sell your products in the US, create a Merchant Center account and upload a product data feed.

  1. Manufacturer Center

    We’re also rolling out new features to improve your brand’s visibility and help customers find your products on Google by providing authoritative and up-to-date product information through Google Manufacturer Center. This information includes product description, variants, and rich content, such as high-quality images and videos that can show on the product’s knowledge panel.

These solutions give you multiple options to better reach and inform potential customers about your products as they shop across Google.

If you have any questions, be sure to post in our forum.

Ways to succeed in Google News

With the New Year now underway, we'd like to offer some best practices and advice we hope will lead publishers to more success within Google News in 2019.

General advice

There is a lot of helpful information to consider within the Google News Publisher Help Center. Be sure to have read the material in this area, in particular the content and technical guidelines.

Headlines and dates


  • Present clear headlines: Google News looks at a variety of signals to determine the headline of an article, including within your HTML title tag and for the most prominent text on the page. Review our headline tips.
  • Provide accurate times and dates: Google News tries to determine the time and date to display for an article in a variety of ways. You can help ensure we get it right by using the following methods:
    • Show one clear date and time: As per our date guidelines, show a clear, visible date and time between the headline and the article text. Prevent other dates from appearing on the page whenever possible, such as for related stories.
    • Use structured data: Use the datePublished and dateModified schema and use the correct time zone designator for AMP or non-AMP pages
  • Avoid artificially freshening stories: If an article has been substantially changed, it can make sense to give it a fresh date and time. However, don't artificially freshen a story without adding significant information or some other compelling reason for the freshening. Also, do not create a very slightly updated story from one previously published, then delete the old story and redirect to the new one. That's against our article URLs guidelines.

Duplicate content

Google News seeks to reward independent, original journalistic content by giving credit to the originating publisher, as both users and publishers would prefer. This means we try not to allow duplicate content—which includes scraped, rewritten, or republished material—to perform better than the original content. In line with this, these are guidelines publishers should follow:

  • Block scraped content: Scraping commonly refers to taking material from another site, often on an automated basis. Sites that scrape content must block scraped content from Google News.
  • Block rewritten content: Rewriting refers to taking material from another site, then rewriting that material so that it is not identical. Sites that rewrite content in a way that provides no substantial or clear added value must block that rewritten content from Google News. This includes, but is not limited to, rewrites that make only very slight changes or those that make many word replacements but still keep the original article's overall meaning.
  • Block or consider canonical for republished content: Republishing refers to when a publisher has permission from another publisher or author to republish an original work, such as material from wire services or in partnership with other publications.
    Publishers that allow others to republish content can help ensure that their original versions perform better in Google News by asking those republishing to block or make use of canonical.
    Google News also encourages those that republish material to consider proactively blocking such content or making use of the canonical, so that we can better identify the original content and credit it appropriately.
  • Avoid duplicate content: If you operate a network of news sites that share content, the advice above about republishing is applicable to your network. Select what you consider to be the original article and consider blocking duplicates or making use of the canonical to point to the original.

Transparency


  • Be transparent: Visitors to your site want to trust and understand who publishes it and information about those who have written articles. That's why our content guidelines stress that content should have posts with clear bylines, information about authors, and contact information for the publication.
  • Don't be deceptive: Our content policies do not allow sites or accounts that impersonate any person or organization, or that misrepresent or conceal their ownership or primary purpose. We do not allow sites or accounts that engage in coordinated activity to mislead users. This includes, but isn't limited to, sites or accounts that misrepresent or conceal their country of origin or that direct content at users in another country under false premises.

More tips


  • Avoid taking part in link schemes: Don't participate in link schemes, which can include large-scale article marketing programs or selling links that pass PageRank. Review our page on link schemes for more information.
  • Use structured for rich presentation: Both those using AMP and non-AMP pages can make use of structured data to optimize your content for rich results or carousel-like presentations.
  • Protect your users and their data: Consider securing every page of your website with HTTPS to protect the integrity and confidentiality of the data users exchange on your site. You can find more useful tips in our best practices on how to implement HTTPS.

Here's to a great 2019!

We hope these tips help publishers succeed in Google News over the coming year. For those who have more questions about Google News, we are unable to do one-to-one support. However, we do monitor our Google News Publisher Forum—which has been newly-revamped—and try to provide guidance on questions that might help a number of publishers all at once. The forum is also a great resource where publishers share tips and advice with each other.

Mobile-First indexing, structured data, images, and your site

It's been two years since we started working on "mobile-first indexing" - crawling the web with smartphone Googlebot, similar to how most users access it. We've seen websites across the world embrace the mobile web, making fantastic websites that work on all kinds of devices. There's still a lot to do, but today, we're happy to announce that we now use mobile-first indexing for over half of the pages shown in search results globally.

Checking for mobile-first indexing

In general, we move sites to mobile-first indexing when our tests assure us that they're ready. When we move sites over, we notify the site owner through a message in Search Console. It's possible to confirm this by checking the server logs, where a majority of the requests should be from Googlebot Smartphone. Even easier, the URL inspection tool allows a site owner to check how a URL from the site (it's usually enough to check the homepage) was last crawled and indexed.

If your site uses responsive design techniques, you should be all set! For sites that aren't using responsive web design, we've seen two kinds of issues come up more frequently in our evaluations:

Missing structured data on mobile pages

Structured data is very helpful to better understand the content on your pages, and allows us to highlight your pages in fancy ways in the search results. If you use structured data on the desktop versions of your pages, you should have the same structured data on the mobile versions of the pages. This is important because with mobile-first indexing, we'll only use the mobile version of your page for indexing, and will otherwise miss the structured data.

Testing your pages in this regard can be tricky. We suggest testing for structured data in general, and then comparing that to the mobile version of the page. For the mobile version, check the source code when you simulate a mobile device, or use the HTML generated with the mobile-friendly testing tool. Note that a page does not need to be mobile-friendly in order to be considered for mobile-first indexing.

Missing alt-text for images on mobile pages

The value of alt-attributes on images ("alt-text") is a great way to describe images to users with screen-readers (which are used on mobile too!), and to search engine crawlers. Without alt-text for images, it's a lot harder for Google Images to understand the context of images that you use on your pages.

Check "img" tags in the source code of the mobile version for representative pages of your website. As above, the source of the mobile version can be seen by either using the browser to simulate a mobile device, or by using the Mobile-Friendly test to check the Googlebot rendered version. Search the source code for "img" tags, and double-check that your page is providing appropriate alt-attributes for any that you want to have findable in Google Images.

For example, that might look like this:

With alt-text (good!):
<img src="cute-puppies.png" alt="A photo of cute puppies on a blanket">

Without alt-text:
<img src="sad-puppies.png">

It's fantastic to see so many great websites that work well on mobile! We're looking forward to being able to index more and more of the web using mobile-first indexing, helping more users to search the web in the same way that they access it: with a smartphone. We’ll continue to monitor and evaluate this change carefully. If you have any questions, please drop by our Webmaster forums or our public events.


Introducing the Indexing API and structured data for livestreams

Over the past few years, it's become easier than ever to stream live videos online, from celebrity updates to special events. But it's not always easy for people to determine which videos are live and know when to tune in.
Today, we're introducing new tools to help more people discover your livestreams in Search and Assistant. With livestream structured data and the Indexing API, you can let Google know when your video is live, so it will be eligible to appear with a red "live" badge:

Add livestream structured data to your page

If your website streams live videos, use the livestream developer documentation to flag your video as a live broadcast and mark the start and end times. In addition, VideoObject structured data is required to tell Google that there's a video on your page.

Update Google quickly with the Indexing API

The Indexing API now supports pages with livestream structured data. We encourage you to call the Indexing API to request that your site is crawled in time for the livestream. We recommend calling the Indexing API when your livestream begins and ends, and if the structured data changes.
For more information, visit our developer documentation. If you have any questions, ask us in the Webmaster Help Forum. We look forward to seeing your live videos on Google!

Rich Results expands for Question & Answer pages

People come to Google seeking information about all kinds of questions.
Frequently, the information they're looking for is on sites where users ask and answer each other's questions. Popular social news sites, expert forums, and help and support message boards are all examples of this pattern.

A screenshot of an example search result for a page titled “Why do touchscreens sometimes register a touch when ...” with a preview of the top answers from the page.
In order to help users better identify which search results may give the best information about their question, we have developed a new rich result type for question and answer sites. Search results for eligible Q&A pages display a preview of the top answers. This new presentation helps site owners reach the right users for their content and helps users get the relevant information about their questions faster.
A screenshot of an example search result for a page titled “Why do touchscreens sometimes register a touch when ...” with a preview of the top answers from the page.

To be eligible for this feature, add Q&A structured data to your pages with Q&A content. Be sure to use the Structured Data Testing Tool to see if your page is eligible and to preview the appearance in search results. You can also check out Search Console to see aggregate stats and markup error examples. The Performance report also tells you which queries show your Q&A Rich Result in Search results, and how these change over time.
If you have any questions, ask us in the Webmaster Help Forum or reach out on Twitter!

Building Google Dataset Search and Fostering an Open Data Ecosystem



Earlier this month we launched Google Dataset Search, a tool designed to make it easier for researchers to discover datasets that can help with their work. What we colloquially call "Google Scholar for data,” Google Dataset Search is a search engine across metadata for millions of datasets in thousands of repositories across the Web. In this post, we go into some detail of how Dataset Search is built, outlining what we believe will help develop an open data ecosystem, and we also address the question that we received frequently since the Dataset Search launch, "Why is my dataset not showing up in Google Dataset Search?

An Overview
At a very high level, Google Data Search relies on dataset providers, big and small, adding structured metadata on their sites using the open schema.org/Dataset standard. The metadata specifies the salient properties of each dataset: its name and description, spatial and temporal coverage, provenance information, and so on. Dataset Search uses this metadata, links it with other resources that are available at Google (more on this below!), and builds an index of this enriched corpus of metadata. Once we built the index, we can start answering user queries — and figuring out which results best correspond to the query.
An overview of the technology behind Google Dataset Search
Using Structured Metadata from Data Providers
When Google's search engine processes a Web page with schema.org/Dataset mark-up, it understands that there is dataset metadata there and processes that structured metadata to create "records" describing each annotated dataset on a page. The use of schema.org allows developers to embed this structured information into HTML, without affecting the appearance of the page while making the semantics of the information visible to all search engines.

However, no matter how precise schema.org definitions or guidelines are, some metadata will inevitably be incomplete, wrong, or entirely missing. Furthermore, distinctions between some fields can be vague: is the dataset repository a publisher or a provider of a dataset? How do we distinguish between citations to a scientific paper that describes the creation of the dataset vs. papers describing its use? Indeed, many of these questions often generate active scholarly discussions.

Despite these variations, Dataset Search must provide a uniform and predictable user experience on the front end. Therefore, in some cases we substitute a more general field name (e.g., “provided by”) to display the values coming from multiple other fields (e.g., “publisher”, “creator”, etc.). In other cases, we are not able to use some of the fields at all: if a specific field is being misinterpreted in many different ways by dataset providers, we bypass that field for now and work with the community to clarify the guidelines. In each decision, we had one specific question that helped us in difficult cases "What will help data discovery the most?" This focus on the task that we were addressing made some of the problems easier than they seemed at first.

Connecting Replicas of Datasets
It is very common for a dataset, in particular a popular one, to be present in more than one repository. We use a variety of signals to determine when two datasets are replicas of each other. For example, schema.org has a way to specify the connection explicitly, through schema.org/sameAs, which is the best way to link different replicas together and to point to the canonical source of a dataset. Other signals include two datasets descriptions pointing to the same canonical page, having the same Digital Object Identifier (DOI), sharing links for downloading the dataset, or having a large overlap in other metadata fields. None of these signals are perfect in isolation, therefore we combine them to get the strongest possible indication of when two datasets are the same.

Reconciling to the Google Knowledge Graph
Google's Knowledge Graph is a powerful platform that describes and links information about many entities, including the ones that appear in dataset metadata: organizations providing datasets, locations for spatial coverage of the data, funding agencies, and so on. Therefore, we try to reconcile information mentioned in the metadata fields with the items in the Knowledge Graph. We can do this reconciliation with good precision for two main reasons. First, we know the types of items in the Knowledge Graph and the types of entities that we expect in the metadata fields. Therefore, we can limit the types of entities from the Knowledge Graph that we match with values for a particular metadata field. For example, a provider of a dataset should match with an organization entity in the Knowledge Graph and not with, say, a location. Second, the context of the Web page itself helps reduce the number of choices, which is particularly useful for distinguishing between organizations that share the same acronym. For example, the acronym CAMRA can stand for “Chilbolton Advanced Meteorological Radar” or “Campaign for Real Ale”. If we use terms from the Web page, we can then more easily determine that CAMRA is in fact the Chilbolton Radar when we see terms such as “clouds”, “vapor”, and “water” on the page.

This type of reconciliation opens up lots of possibilities to improve the search experience for users. For instance, Dataset Search can localize results by showing reconciled values of metadata in the same language as the rest of the page. Additionally, it can rely on synonyms, correct misspellings, expand acronyms, or use other relations in the Knowledge Graph for query expansion.

Linking to other Google Resources
Google has many other data resources that are useful in augmenting the dataset metadata, such as Google Scholar. Knowing which datasets are referenced and cited in publications serves at least two purposes:
  1. It provides a valuable signal about the importance and prominence of a dataset.
  2. It gives dataset authors an easy place to see citations to their data and to get credit.
Indeed, we hope that highlighting publications that use the data will lead to a more healthy ecosystem of data citation. For the moment, our links to Google scholar are very approximate as we lack a good model on how people cite data. We try to go beyond DOIs to give somewhat better coverage, but the number of articles citing a dataset ends up being approximate. We hope to make more progress in this area in order to get a higher level of precision.

Search and Ranking of Results
When a user issues a query, we search through the corpus of datasets, in a way not unlike Google Search works over Web pages. Just like with any search, we need to determine whether a document is relevant for the query and then rank the relevant documents. Because there are no large-scale studies on how users search for datasets, as a first approximation, we rely on Google Web ranking. However, ranking datasets is different from ranking Web pages, and we add some additional signals that take into account the metadata quality, citations, and so on. As Dataset Search gets used more by our users and we understand better how users search for datasets, we hope that ranking will improve significantly.

A Better Open Data Ecosystem
We built Dataset Search in an attempt to create a tool that will positively impact the discoverability of data. The decision to rely on open standards (schema.org, W3C DCAT, JSON-LD, etc.) for markup is intentional, as Dataset Search can only be as good as the open-data ecosystem that it supports. As such, Google Dataset Search aims to support a strong open data ecosystem by encouraging:
  1. Widespread adoption of open metadata formats to describe published data.
  2. Further development of open metadata formats to describe more types of data and in more detail.
  3. The culture of citing data the way we cite research publications, giving those who create and publish the data the credit that they deserve.
  4. The development of tools that leverage this metadata to enable more discovery or better use of data. 
The increased adoption of open metadata standards in conjunction with the continued development of Dataset Search (and, hopefully, other tools) should foster a healthier open data ecosystem where data is a first-class citizen of research.

So, Where is Your Dataset?
It is probably clear by now that Dataset Search is only as good as the metadata that exists on the Web pages for datasets. The most common answer to the question of why a specific dataset does not show up in our results is that the Web page for that dataset does not have any markup. Just pop that page into the Structured Data Testing Tool and you will see whether the markup is there. If you don't see any markup there, and you own the page, you can add it and if you don't own the page, you can ask the page owners to do it, which will make their page more easily discoverable by everyone.

We hope that the community finds Dataset Search useful, users make serendipitous discoveries and save time and scientists and journalists spend less time searching for data and more time using it.

Acknowledgements
We would like to thank Xiaomeng Ban, Dan Brickley, Lee Butler, Thomas Chen, Corinna Cortes, Kevin Espinoza, Archana Jain, Mike Jones, Kishore Papineni, Chris Sater, Gokhan Turhan, Shubin Zhao and Andi Vajda for their work on the project and all our partners, collaborators, and early adopters for their help.

Source: Google AI Blog


Hey Google, what’s the latest news?

Since launching the Google Assistant in 2016, we have seen users ask questions about everything from weather to recipes and news. In order to fulfill news queries with results people can count on, we collaborated on a new schema.org structured data specification called speakable for eligible publishers to mark up sections of a news article that are most relevant to be read aloud by the Google Assistant.

When people ask the Google Assistant -- "Hey Google, what's the latest news on NASA?", the Google Assistant responds with an excerpt from a news article and the name of the news organization. Then the Google Assistant asks if the user would like to hear another news article and also sends the relevant links to the user's mobile device.

As a news publisher, you can surface your content on the Google Assistant by implementing Speakable markup according to the developer documentation. This feature is now available for English language users in the US and we hope to launch in other languages and countries as soon as a sufficient number of publishers have implemented speakable. As this is a new feature, we are experimenting over time to refine the publisher and user experience.

If you have any questions, ask us in the Webmaster Help Forum. We look forward to hearing from you!

Introducing the Indexing API for job posting URLs

Last June we launched a job search experience that has since connected tens of millions of job seekers around the world with relevant job opportunities from third party providers across the web. Timely indexing of new job content is critical because many jobs are filled relatively quickly. Removal of expired postings is important because nothing's worse than finding a great job only to discover it's no longer accepting applications.

Today we're releasing the Indexing API to address this problem. This API allows any site owner to directly notify Google when job posting pages are added or removed. This allows Google to schedule job postings for a fresh crawl, which can lead to higher quality user traffic and job applicant satisfaction. Currently, the Indexing API can only be used for job posting pages that include job posting structured data.

For websites with many short-lived pages like job postings, the Indexing API keeps job postings fresh in Search results because it allows updates to be pushed individually. This API can be integrated into your job posting flow, allowing high quality job postings to be searchable quickly after publication. In addition, you can check the last time Google received each kind of notification for a given URL.

Follow the Quickstart guide to see how the Indexing API works. If you have any questions, ask us in the Webmaster Help Forum. We look forward to hearing from you!

Google Search at I/O 2018

With the eleventh annual Google I/O wrapped up, it’s a great time to reflect on some of the highlights.

What we did at I/O


The event was a wonderful way to meet many great people from various communities across the globe, exchange ideas, and gather feedback. Besides many great web sessions, codelabs, and office hours we shared a few things with the community in two sessions specific to Search:




The sessions included the launch of JavaScript error reporting in the Mobile Friendly Test tool, dynamic rendering (we will discuss this in more detail in a future post), and an explanation of how CMS can use the Indexing and Search Console APIs to provide users with insights. For example, Wix lets their users submit their homepage to the index and see it in Search results instantly, and Squarespace created a Google Search keywords report to help webmasters understand what prospective users search for.

During the event, we also presented the new Search Console in the Sandbox area for people to try and were happy to get a lot of positive feedback, from people being excited about the AMP Status report to others exploring how to improve their content for Search.

Hands-on codelabs, case studies and more


We presented the Structured Data Codelab that walks you through adding and testing structured data. We were really happy to see that it ended up being one of the top 20 codelabs by completions at I/O. If you want to learn more about the benefits of using Structured Data, check out our case studies.



During the in-person office hours we saw a lot of interest around HTTPS, mobile-first indexing, AMP, and many other topics. The in-person Office Hours were a wonderful addition to our monthly Webmaster Office Hours hangout. The questions and comments will help us adjust our documentation and tools by making them clearer and easier to use for everyone.

Highlights and key takeaways


We also repeated a few key points that web developers should have an eye on when building websites, such as:


  • Indexing and rendering don’t happen at the same time. We may defer the rendering to a later point in time.
  • Make sure the content you want in Search has metadata, correct HTTP statuses, and the intended canonical tag.
  • Hash-based routing (URLs with "#") should be deprecated in favour of the JavaScript History API in Single Page Apps.
  • Links should have an href attribute pointing to a URL, so Googlebot can follow the links properly.

Make sure to watch this talk for more on indexing, dynamic rendering and troubleshooting your site. If you wanna learn more about things to do as a CMS developer or theme author or Structured Data, watch this talk.

We were excited to meet some of you at I/O as well as the global I/O extended events and share the latest developments in Search. To stay in touch, join the Webmaster Forum or follow us on Twitter, Google+, and YouTube.