Expanding access to Differential Privacy to create a safer online ecosystem

Posted by Miguel Guevara, Product Manager, Privacy and Data Protection Office

At Google, we believe in democratizing access to privacy technology for all. Today, on Data Privacy Day, we’re sharing updates on our effort to create free tools that help the developer community – researchers, governments, nonprofits, businesses and more – build and launch new applications for differential privacy, which can provide useful insights and services without revealing any information about individuals. We hope to push the industry forward in creating a safer ecosystem for every Internet user with products that are private by design.

Enabling more developers to use differential privacy

In 2019, we open-sourced our foundational differential privacy library in C++, Java and Go. Our goal was to be transparent and to allow researchers to inspect our code. We received a tremendous amount of interest from developers who wanted to use the library in their own applications, including startups like Arkhn, which enabled different hospitals to learn from medical data in a privacy-preserving way, and developers in Australia who have accelerated scientific discovery through provably private data.

Since then, we have been working on various projects and new ways to make differential privacy more accessible and usable. Today, after a year of development in partnership with OpenMined, an organization of open-source developers, we are happy to announce a new milestone for our differential privacy framework: a product that allows any Python developer to process data with differential privacy.

Previously, our differential privacy library was available in three programming languages. Now, we’re making it available in Python, reaching nearly half of the developers worldwide. This means millions more developers, researchers, and companies will be able to build applications with industry-leading privacy technology, enabling them to obtain insights and observe trends from their datasets while protecting and respecting the privacy of individuals.

With this new Python library, we’ve already had organizations begin experimenting with new use cases, such as showing a site’s most visited webpages on a per-country basis in an aggregate and anonymized way. The library is unique in that it can be used with the Spark and Beam frameworks, two of the leading engines for large-scale data processing, which gives more flexibility in how it is used and deployed. We are also releasing a new differential privacy tool that allows practitioners to visualize and better tune the parameters used to produce differentially private information. Finally, we are publishing a paper sharing the techniques that we use to efficiently scale differential privacy to datasets of a petabyte or more.
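To make the per-country webpage example concrete, here is a minimal sketch of a differentially private count using the standard Laplace mechanism. It is not the API of the library announced here. The function names, the `public_keys` argument, and the `max_partitions_per_user` parameter are hypothetical, and the sketch assumes that the set of (country, page) pairs to report is public and fixed in advance.

```python
import random
from collections import defaultdict


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_visit_counts(visits, public_keys, epsilon,
                         max_partitions_per_user=1, rng=None):
    """Return a noisy count of distinct visitors per (country, page).

    visits: iterable of (user_id, country, page) tuples.
    public_keys: the (country, page) pairs to report (assumed public).
    """
    rng = rng or random.Random()
    allowed = set(public_keys)

    # Contribution bounding, step 1: count each user at most once per key.
    per_user = defaultdict(set)
    for user_id, country, page in visits:
        if (country, page) in allowed:
            per_user[user_id].add((country, page))

    # Contribution bounding, step 2: keep at most max_partitions_per_user
    # keys per user, chosen at random.
    counts = {key: 0 for key in public_keys}
    for keys in per_user.values():
        k = min(len(keys), max_partitions_per_user)
        for key in rng.sample(sorted(keys), k):
            counts[key] += 1

    # One user changes at most max_partitions_per_user counts by 1 each,
    # so the L1 sensitivity is max_partitions_per_user.
    scale = max_partitions_per_user / epsilon
    return {key: count + laplace_noise(scale, rng)
            for key, count in counts.items()}
```

The two parameters illustrate the kind of tuning mentioned above: a larger `epsilon` adds less noise but gives a weaker privacy guarantee, and a larger `max_partitions_per_user` keeps more of each user's data but requires proportionally more noise.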

As with all open-source projects, the technology and outputs are only as strong as their community. Internally, we’ve trained a team that develops differentially private solutions, including the infrastructure behind our Mobility Reports and the popular times feature in Google Maps. True to our goal, we also helped OpenMined build a team of experts outside of Google to serve as a resource for anyone interested in learning how to deploy differential privacy technologies.

Looking forward

We encourage developers around the world to take this opportunity to experiment with differential privacy use cases like statistical analysis and machine learning and, most importantly, to give us feedback. We are excited to learn more about the applications you can develop and the features we can provide to help along the way.

We will continue investing in democratizing access to critical privacy-enhancing technologies and hope developers will join us in this journey to improve usability and coverage. As we’ve said before, we believe that every Internet user in the world deserves world-class privacy, and we’ll continue partnering with organizations to further that goal.