How we detect, remove and report child sexual abuse material

Since Google’s earliest days, we have worked to prevent the spread of illegal child sexual abuse material (referred to as CSAM). Child safety organizations and governments rightly expect, and in many cases require, us to take action to remove it from our systems. That is why, when we find CSAM on our platforms, we remove it, report it and often suspend the account.

Although CSAM accounts for a very small portion of the material uploaded and shared across our platforms, we take the implications of both CSAM violations and suspending accounts seriously. Our goal is to prevent abuse on our platforms while minimizing the risk of an incorrect suspension. Today, we are sharing more information on how we detect this harmful content and the steps we are taking to be more transparent about our processes with users.

How we detect CSAM

We rely on two equally important technologies to help us proactively identify child sexual abuse material: hash matching and artificial intelligence (AI). We also have a team of highly specialized and trained content reviewers and subject matter experts who help ensure that our technology delivers accurate results.

This combination enables us to detect CSAM on our platforms at scale, while keeping our false positive rate extremely low.

How we use hash matching to identify known CSAM

CSAM that has been previously identified is automatically flagged by our systems using hash matching technology. This technology assigns each image and video a unique digital signature, known as a “hash”, and then compares it against a database of known signatures. If the two match, the content is considered to be the same as, or closely similar to, the known material.
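The lookup described above can be illustrated with a minimal sketch. It is not Google's implementation: it assumes a cryptographic hash (SHA-256) as a stand-in signature, which only matches byte-identical files, whereas detecting "closely similar" content generally requires perceptual hashing. The names `compute_hash`, `is_known` and `known_hashes` are hypothetical, and the sample bytes are placeholders.

```python
import hashlib


def compute_hash(content: bytes) -> str:
    """Return a hex digest that serves as the file's digital signature.

    Assumption: SHA-256 stands in for whatever signature a real system uses.
    """
    return hashlib.sha256(content).hexdigest()


def is_known(content: bytes, known_hashes: set[str]) -> bool:
    """Check whether the content's signature appears in the known-signature set."""
    return compute_hash(content) in known_hashes


# Hypothetical database of signatures for previously reviewed files.
known_hashes = {compute_hash(b"placeholder bytes of a previously reviewed file")}

# An identical upload matches; any other file does not.
match = is_known(b"placeholder bytes of a previously reviewed file", known_hashes)
no_match = is_known(b"an unrelated file", known_hashes)
```

Storing only signatures in a set means a lookup does not require keeping or redistributing the original files, which is one reason hash lists can be shared between organizations.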

We obtain hashes from a variety of highly trusted sources, including the Internet Watch Foundation (IWF), the National Center for Missing and Exploited Children (NCMEC), and others. NCMEC specifically hosts a hash-sharing service used by the tech industry and specialist NGOs from around the world. This repository serves as one starting point, but we review every purported CSAM hash independently to confirm its accuracy. Once we confirm that the underlying content is CSAM, we add the hash to our detection systems.

The overwhelming majority of imagery reported by Google – approximately 90% – matches previously identified CSAM, much of which is already in the NCMEC database.

How we use artificial intelligence to identify new content

While hash matching helps us find known CSAM, we use artificial intelligence to flag new content that closely resembles patterns of previously confirmed CSAM. Our systems are specifically designed to recognize benign imagery, such as a child playing in the bathtub or backyard, so that it is not flagged. A specialist team of trained personnel also reviews each piece of newly flagged imagery to confirm it is CSAM before it is ever reported.

Quick detection of new images means that children who are being sexually abused today are much more likely to be identified and protected from further abuse. To help promote safety across the web, we also provide other companies and NGOs with access to detection and processing technology through our Child Safety Toolkit. This includes our Content Safety API, which helps partners more quickly prioritize and review content that is highly likely to be abusive. In the past 30 days alone, the Content Safety API has helped partners process over four billion pieces of content. Through the toolkit, partners can also license our proprietary CSAI Match technology to detect known video CSAM on their platforms.

Our specialized content reviewers

While technology is essential in the fight against CSAM at scale, human reviewers also play a critical role in confirming hash matches and content discovered through AI. Our team members bring deep expertise to this work, with backgrounds in law, child safety and advocacy, social work, and cyber investigations, among other disciplines. They are specially trained on both our policy scope and what legally constitutes child sexual abuse material. We regularly update this training and our guidelines in consultation with legal counsel, independent experts and medical professionals.

We know this is incredibly sensitive work and have a number of measures in place to protect reviewers’ physical and mental wellness. Our teams have access to tools, workspaces, resources and professional expertise, including counseling.

Referring content to NCMEC

Following this review process, we report the imagery identified as CSAM to NCMEC as required by US law. NCMEC evaluates the report and may decide to refer the case to a relevant law enforcement agency. If the local law enforcement agency chooses to investigate the NCMEC report further, requests for additional information from Google must be made through valid legal process or in accordance with applicable laws. You can learn more here on how we handle these types of requests.

In doing this work, we also believe in the importance of transparency. Today, we updated our Transparency Report with the latest data on our detection and reporting efforts. In the first half of this year, we made over one million reports to NCMEC about content that met the legal definition of CSAM and, where appropriate, also suspended the Google accounts associated with that content (approximately 270,000 account suspensions).

By using existing, confirmed CSAM to identify identical or similar material uploaded or shared to our platforms, we maintain an incredibly low false positive rate. However, if someone believes their account was incorrectly disabled, including for content flagged as CSAM, they can appeal the determination. A member of our child safety team reviews the appeal, and if we find we have made a mistake, we reinstate the account as soon as possible.

Improving our processes

Preventing CSAM on our platforms is incredibly important work and is an area we’ll continue to invest in. At the same time, we recognize that we can improve the user experience when people come to us with questions about their accounts or believe we made the wrong decision. For example, we are actively working on ways to increase transparency and provide more detailed reasons for account suspensions (while making sure we don’t compromise the safety of children or interfere with potential law enforcement investigations). We will also update our appeals process to allow users to submit more context about their account, including information and documentation from relevant independent professionals or law enforcement agencies, to aid our understanding of the content detected in the account.

We will continue to explore additional ways to balance preventing this harmful content from spreading on our platforms with creating a more streamlined support experience for all users.
