Tag Archives: Structured Data

Introducing the Indexing API and structured data for livestreams

Over the past few years, it's become easier than ever to stream live videos online, from celebrity updates to special events. But it's not always easy for people to determine which videos are live and know when to tune in.
Today, we're introducing new tools to help more people discover your livestreams in Search and Assistant. With livestream structured data and the Indexing API, you can let Google know when your video is live, so it will be eligible to appear with a red "live" badge:

Add livestream structured data to your page

If your website streams live videos, use the livestream developer documentation to flag your video as a live broadcast and mark the start and end times. In addition, VideoObject structured data is required to tell Google that there's a video on your page.

Update Google quickly with the Indexing API

The Indexing API now supports pages with livestream structured data. We encourage you to call the Indexing API to request that your site is crawled in time for the livestream. We recommend calling the Indexing API when your livestream begins and ends, and if the structured data changes.
For more information, visit our developer documentation. If you have any questions, ask us in the Webmaster Help Forum. We look forward to seeing your live videos on Google!

Rich Results expands for Question & Answer pages

People come to Google seeking information about all kinds of questions.
Frequently, the information they're looking for is on sites where users ask and answer each other's questions. Popular social news sites, expert forums, and help and support message boards are all examples of this pattern.

A screenshot of an example search result for a page titled “Why do touchscreens sometimes register a touch when ...” with a preview of the top answers from the page.
In order to help users better identify which search results may give the best information about their question, we have developed a new rich result type for question and answer sites. Search results for eligible Q&A pages display a preview of the top answers. This new presentation helps site owners reach the right users for their content and helps users get the relevant information about their questions faster.
A screenshot of an example search result for a page titled “Why do touchscreens sometimes register a touch when ...” with a preview of the top answers from the page.

To be eligible for this feature, add Q&A structured data to your pages with Q&A content. Be sure to use the Structured Data Testing Tool to see if your page is eligible and to preview the appearance in search results. You can also check out Search Console to see aggregate stats and markup error examples. The Performance report also tells you which queries show your Q&A Rich Result in Search results, and how these change over time.
If you have any questions, ask us in the Webmaster Help Forum or reach out on Twitter!

Building Google Dataset Search and Fostering an Open Data Ecosystem



Earlier this month we launched Google Dataset Search, a tool designed to make it easier for researchers to discover datasets that can help with their work. What we colloquially call "Google Scholar for data,” Google Dataset Search is a search engine across metadata for millions of datasets in thousands of repositories across the Web. In this post, we go into some detail of how Dataset Search is built, outlining what we believe will help develop an open data ecosystem, and we also address the question that we received frequently since the Dataset Search launch, "Why is my dataset not showing up in Google Dataset Search?

An Overview
At a very high level, Google Data Search relies on dataset providers, big and small, adding structured metadata on their sites using the open schema.org/Dataset standard. The metadata specifies the salient properties of each dataset: its name and description, spatial and temporal coverage, provenance information, and so on. Dataset Search uses this metadata, links it with other resources that are available at Google (more on this below!), and builds an index of this enriched corpus of metadata. Once we built the index, we can start answering user queries — and figuring out which results best correspond to the query.
An overview of the technology behind Google Dataset Search
Using Structured Metadata from Data Providers
When Google's search engine processes a Web page with schema.org/Dataset mark-up, it understands that there is dataset metadata there and processes that structured metadata to create "records" describing each annotated dataset on a page. The use of schema.org allows developers to embed this structured information into HTML, without affecting the appearance of the page while making the semantics of the information visible to all search engines.

However, no matter how precise schema.org definitions or guidelines are, some metadata will inevitably be incomplete, wrong, or entirely missing. Furthermore, distinctions between some fields can be vague: is the dataset repository a publisher or a provider of a dataset? How do we distinguish between citations to a scientific paper that describes the creation of the dataset vs. papers describing its use? Indeed, many of these questions often generate active scholarly discussions.

Despite these variations, Dataset Search must provide a uniform and predictable user experience on the front end. Therefore, in some cases we substitute a more general field name (e.g., “provided by”) to display the values coming from multiple other fields (e.g., “publisher”, “creator”, etc.). In other cases, we are not able to use some of the fields at all: if a specific field is being misinterpreted in many different ways by dataset providers, we bypass that field for now and work with the community to clarify the guidelines. In each decision, we had one specific question that helped us in difficult cases "What will help data discovery the most?" This focus on the task that we were addressing made some of the problems easier than they seemed at first.

Connecting Replicas of Datasets
It is very common for a dataset, in particular a popular one, to be present in more than one repository. We use a variety of signals to determine when two datasets are replicas of each other. For example, schema.org has a way to specify the connection explicitly, through schema.org/sameAs, which is the best way to link different replicas together and to point to the canonical source of a dataset. Other signals include two datasets descriptions pointing to the same canonical page, having the same Digital Object Identifier (DOI), sharing links for downloading the dataset, or having a large overlap in other metadata fields. None of these signals are perfect in isolation, therefore we combine them to get the strongest possible indication of when two datasets are the same.

Reconciling to the Google Knowledge Graph
Google's Knowledge Graph is a powerful platform that describes and links information about many entities, including the ones that appear in dataset metadata: organizations providing datasets, locations for spatial coverage of the data, funding agencies, and so on. Therefore, we try to reconcile information mentioned in the metadata fields with the items in the Knowledge Graph. We can do this reconciliation with good precision for two main reasons. First, we know the types of items in the Knowledge Graph and the types of entities that we expect in the metadata fields. Therefore, we can limit the types of entities from the Knowledge Graph that we match with values for a particular metadata field. For example, a provider of a dataset should match with an organization entity in the Knowledge Graph and not with, say, a location. Second, the context of the Web page itself helps reduce the number of choices, which is particularly useful for distinguishing between organizations that share the same acronym. For example, the acronym CAMRA can stand for “Chilbolton Advanced Meteorological Radar” or “Campaign for Real Ale”. If we use terms from the Web page, we can then more easily determine that CAMRA is in fact the Chilbolton Radar when we see terms such as “clouds”, “vapor”, and “water” on the page.

This type of reconciliation opens up lots of possibilities to improve the search experience for users. For instance, Dataset Search can localize results by showing reconciled values of metadata in the same language as the rest of the page. Additionally, it can rely on synonyms, correct misspellings, expand acronyms, or use other relations in the Knowledge Graph for query expansion.

Linking to other Google Resources
Google has many other data resources that are useful in augmenting the dataset metadata, such as Google Scholar. Knowing which datasets are referenced and cited in publications serves at least two purposes:
  1. It provides a valuable signal about the importance and prominence of a dataset.
  2. It gives dataset authors an easy place to see citations to their data and to get credit.
Indeed, we hope that highlighting publications that use the data will lead to a more healthy ecosystem of data citation. For the moment, our links to Google scholar are very approximate as we lack a good model on how people cite data. We try to go beyond DOIs to give somewhat better coverage, but the number of articles citing a dataset ends up being approximate. We hope to make more progress in this area in order to get a higher level of precision.

Search and Ranking of Results
When a user issues a query, we search through the corpus of datasets, in a way not unlike Google Search works over Web pages. Just like with any search, we need to determine whether a document is relevant for the query and then rank the relevant documents. Because there are no large-scale studies on how users search for datasets, as a first approximation, we rely on Google Web ranking. However, ranking datasets is different from ranking Web pages, and we add some additional signals that take into account the metadata quality, citations, and so on. As Dataset Search gets used more by our users and we understand better how users search for datasets, we hope that ranking will improve significantly.

A Better Open Data Ecosystem
We built Dataset Search in an attempt to create a tool that will positively impact the discoverability of data. The decision to rely on open standards (schema.org, W3C DCAT, JSON-LD, etc.) for markup is intentional, as Dataset Search can only be as good as the open-data ecosystem that it supports. As such, Google Dataset Search aims to support a strong open data ecosystem by encouraging:
  1. Widespread adoption of open metadata formats to describe published data.
  2. Further development of open metadata formats to describe more types of data and in more detail.
  3. The culture of citing data the way we cite research publications, giving those who create and publish the data the credit that they deserve.
  4. The development of tools that leverage this metadata to enable more discovery or better use of data. 
The increased adoption of open metadata standards in conjunction with the continued development of Dataset Search (and, hopefully, other tools) should foster a healthier open data ecosystem where data is a first-class citizen of research.

So, Where is Your Dataset?
It is probably clear by now that Dataset Search is only as good as the metadata that exists on the Web pages for datasets. The most common answer to the question of why a specific dataset does not show up in our results is that the Web page for that dataset does not have any markup. Just pop that page into the Structured Data Testing Tool and you will see whether the markup is there. If you don't see any markup there, and you own the page, you can add it and if you don't own the page, you can ask the page owners to do it, which will make their page more easily discoverable by everyone.

We hope that the community finds Dataset Search useful, users make serendipitous discoveries and save time and scientists and journalists spend less time searching for data and more time using it.

Acknowledgements
We would like to thank Xiaomeng Ban, Dan Brickley, Lee Butler, Thomas Chen, Corinna Cortes, Kevin Espinoza, Archana Jain, Mike Jones, Kishore Papineni, Chris Sater, Gokhan Turhan, Shubin Zhao and Andi Vajda for their work on the project and all our partners, collaborators, and early adopters for their help.

Source: Google AI Blog


Hey Google, what’s the latest news?

Since launching the Google Assistant in 2016, we have seen users ask questions about everything from weather to recipes and news. In order to fulfill news queries with results people can count on, we collaborated on a new schema.org structured data specification called speakable for eligible publishers to mark up sections of a news article that are most relevant to be read aloud by the Google Assistant.

When people ask the Google Assistant -- "Hey Google, what's the latest news on NASA?", the Google Assistant responds with an excerpt from a news article and the name of the news organization. Then the Google Assistant asks if the user would like to hear another news article and also sends the relevant links to the user's mobile device.

As a news publisher, you can surface your content on the Google Assistant by implementing Speakable markup according to the developer documentation. This feature is now available for English language users in the US and we hope to launch in other languages and countries as soon as a sufficient number of publishers have implemented speakable. As this is a new feature, we are experimenting over time to refine the publisher and user experience.

If you have any questions, ask us in the Webmaster Help Forum. We look forward to hearing from you!

Introducing the Indexing API for job posting URLs

Last June we launched a job search experience that has since connected tens of millions of job seekers around the world with relevant job opportunities from third party providers across the web. Timely indexing of new job content is critical because many jobs are filled relatively quickly. Removal of expired postings is important because nothing's worse than finding a great job only to discover it's no longer accepting applications.

Today we're releasing the Indexing API to address this problem. This API allows any site owner to directly notify Google when job posting pages are added or removed. This allows Google to schedule job postings for a fresh crawl, which can lead to higher quality user traffic and job applicant satisfaction. Currently, the Indexing API can only be used for job posting pages that include job posting structured data.

For websites with many short-lived pages like job postings, the Indexing API keeps job postings fresh in Search results because it allows updates to be pushed individually. This API can be integrated into your job posting flow, allowing high quality job postings to be searchable quickly after publication. In addition, you can check the last time Google received each kind of notification for a given URL.

Follow the Quickstart guide to see how the Indexing API works. If you have any questions, ask us in the Webmaster Help Forum. We look forward to hearing from you!

Google Search at I/O 2018

With the eleventh annual Google I/O wrapped up, it’s a great time to reflect on some of the highlights.

What we did at I/O


The event was a wonderful way to meet many great people from various communities across the globe, exchange ideas, and gather feedback. Besides many great web sessions, codelabs, and office hours we shared a few things with the community in two sessions specific to Search:




The sessions included the launch of JavaScript error reporting in the Mobile Friendly Test tool, dynamic rendering (we will discuss this in more detail in a future post), and an explanation of how CMS can use the Indexing and Search Console APIs to provide users with insights. For example, Wix lets their users submit their homepage to the index and see it in Search results instantly, and Squarespace created a Google Search keywords report to help webmasters understand what prospective users search for.

During the event, we also presented the new Search Console in the Sandbox area for people to try and were happy to get a lot of positive feedback, from people being excited about the AMP Status report to others exploring how to improve their content for Search.

Hands-on codelabs, case studies and more


We presented the Structured Data Codelab that walks you through adding and testing structured data. We were really happy to see that it ended up being one of the top 20 codelabs by completions at I/O. If you want to learn more about the benefits of using Structured Data, check out our case studies.



During the in-person office hours we saw a lot of interest around HTTPS, mobile-first indexing, AMP, and many other topics. The in-person Office Hours were a wonderful addition to our monthly Webmaster Office Hours hangout. The questions and comments will help us adjust our documentation and tools by making them clearer and easier to use for everyone.

Highlights and key takeaways


We also repeated a few key points that web developers should have an eye on when building websites, such as:


  • Indexing and rendering don’t happen at the same time. We may defer the rendering to a later point in time.
  • Make sure the content you want in Search has metadata, correct HTTP statuses, and the intended canonical tag.
  • Hash-based routing (URLs with "#") should be deprecated in favour of the JavaScript History API in Single Page Apps.
  • Links should have an href attribute pointing to a URL, so Googlebot can follow the links properly.

Make sure to watch this talk for more on indexing, dynamic rendering and troubleshooting your site. If you wanna learn more about things to do as a CMS developer or theme author or Structured Data, watch this talk.

We were excited to meet some of you at I/O as well as the global I/O extended events and share the latest developments in Search. To stay in touch, join the Webmaster Forum or follow us on Twitter, Google+, and YouTube.

 

We updated our job posting guidelines

Last year, we launched job search on Google to connect more people with jobs. When you provide Job Posting structured data, it helps drive more relevant traffic to your page by connecting job seekers with your content. To ensure that job seekers are getting the best possible experience, it's important to follow our Job Posting guidelines.

We've recently made some changes to our Job Posting guidelines to help improve the job seeker experience.

  • Remove expired jobs
  • Place structured data on the job's detail page
  • Make sure all job details are present in the job description

Remove expired jobs

When job seekers put in effort to find a job and apply, it can be very discouraging to discover that the job that they wanted is no longer available. Sometimes, job seekers only discover that the job posting is expired after deciding to apply for the job. Removing expired jobs from your site may drive more traffic because job seekers are more confident when jobs that they visit on your site are still open for application. For more information on how to remove a job posting, see Remove a job posting.


Place structured data on the job's detail page

Job seekers find it confusing when they land on a list of jobs instead of the specific job's detail page. To fix this, put structured data on the most detailed leaf page possible. Don't add structured data to pages intended to present a list of jobs (for example, search result pages) and only add it to the most specific page describing a single job with its relevant details.

Make sure all job details are present in the job description

We've also noticed that some sites include information in the JobPosting structured data that is not present anywhere in the job posting. Job seekers are confused when the job details they see in Google Search don't match the job description page. Make sure that the information in the JobPosting structured data always matches what's on the job posting page. Here are some examples:

  • If you add salary information to the structured data, then also add it to the job posting. Both salary figures should match.
  • The location in the structured data should match the location in the job posting.

Providing structured data content that is consistent with the content of the job posting pages not only helps job seekers find the exact job that they were looking for, but may also drive more relevant traffic to your job postings and therefore increase the chances of finding the right candidates for your jobs.

If your site violates the Job Posting guidelines (including the guidelines in this blog post), we may take manual action against your site and it may not be eligible for display in the jobs experience on Google Search. You can submit a reconsideration request to let us know that you have fixed the problem(s) identified in the manual action notification. If your request is approved, the manual action will be removed from your site or page.

For more information, visit our Job Posting developer documentation and our JobPosting FAQ.

Introducing Rich Results & the Rich Results Testing Tool

Over the years, the different ways you can choose to highlight your website's content in search has grown dramatically. In the past, we've called these rich snippets, rich cards, or enriched results. Going forward - to simplify the terminology -  our documentation will use the name "rich results" for all of them. Additionally, we're introducing a new rich results testing tool to make diagnosing your pages' structured data easier.

The new testing tool focuses on the structured data types that are eligible to be shown as rich results. It allows you to test all data sources on your pages, such as JSON-LD (which we recommend), Microdata, or RDFa. The new tool provides a more accurate reflection of the page’s appearance on Search and includes improved handling for Structured Data found on dynamically loaded content. The tests for Recipes, Jobs, Movies, and Courses are currently supported -- but this is just a first step, we plan on expanding over time.

Testing a page is easy: just open the testing tool, enter a URL, and review the output. If there are issues, the tool will highlight the invalid code in the page source. If you're working with others on this page, the share-icon on the bottom-right lets you do that quickly. You can also use preview button to view all the different rich results the page is eligible for. And … once you're happy with the result, use Submit To Google to fetch & index this page for search.

Want to get started with rich snippets rich results? Check out our guides for marking up your content. Feel free to drop by our Webmaster Help forums should you have any questions or get stuck; the awesome experts there can often help resolve issues and give you tips in no time!


A reminder about “event” markup

Lately we’ve been receiving feedback from users seeing non-events like coupons or vouchers showing up in search results where “events” snippets appear. This is really confusing for users and also against our guidelines, where we have added additional clarification.

So, what’s the problem?

We’ve seen a number of  publishers in the coupons/vouchers space use the “event” markup to describe their offers. And as much as using a discount voucher can be a very special thing, that doesn’t make coupons or vouchers events or “saleEvents”. Using Event markup to describe something that is not an event creates a bad user experience, by triggering a rich result for something that will happen at a particular time, despite no actual event being present.

Here are some examples to illustrate the issue:

Since this creates a misleading user experience, we may take manual action on such cases. In case your website is affected by such a manual action, you will find a notification in your Search Console account. If a manual action is taken, it can result in structured data markup for the whole site not being used for search results.  

While we’re specifically highlighting coupons and vouchers in this blogpost, this applies to all other non-event items being annotated with “event” markup as well -- or, really, for applying a type of markup to something other than the type of thing it is meant to describe.

For more information, please visit our developer documentation or stop by our Webmaster Forum in case you have additional questions!


Make your site’s complete jobs information accessible to job seekers

In June, we announced a new experience that put the convenience of Search into the hands of job seekers. Today, we are taking the next step in improving the job search experience on Google by adding a feature that shows estimated salary information from the web alongside job postings, as well as adding new UI features for users.

Salary information has been one of the most requested additions from job seekers. This helps people evaluate whether a job is a good fit, and is an opportunity for sites with estimated salary information to:

  • Increase brand awareness: Estimated salary information shows a representative logo from the estimated salary provider.
  • Get more referral traffic: Users can click through directly to salary estimate pages when salary information surfaces in job search results.

If your site provides salary estimates, you can take advantage of these changes in the following ways:

Specify actual salary information

Actual salary refers to the base salary information that is provided by the employer. If your site publishes job listings, you can add JobPosting structured data and populate the baseSalary property to be eligible for inclusion in job search results.

This salary information will be made available in both the list and the detail views.

Provide estimated salary information

In cases where employers don’t provide actual salary, job seekers may see estimated salaries sourced from multiple partners for the same or similar occupation. If your site provides salary estimate information, you can add Occupation structured data to be eligible for inclusion in job search results.  

Include exact location information

We've heard from users that having accurate, street-level location information helps them to focus on opportunities that work best for them. Sites that publish job listings can do this can do this by using the jobLocation property in JobPosting structured data.

Validate your structured data

To double-check the structured data on your pages, we'll be updating the Structured Data Testing Tool and the Search Console reports in the near future. In the meantime, you can monitor the performance of your job postings in Search Analytics. Stay tuned!

Since launching this summer, we’ve seen over 60% growth in number of companies with jobs showing on Google and connected tens of millions of people to new job opportunities. We are excited to help users find jobs with salaries that meet their needs, and to route them to your site for more information. We invite sites that provide salary estimates to mark up their salary pages using the Occupation structured data. Should you have any questions regarding the use of structured data on your site, feel free to drop by our webmaster help forums.