
Expanding our Differential Privacy Library

All developers have a responsibility to treat data with care and respect. Differential privacy helps organizations derive insights from data while simultaneously ensuring that those results do not allow any individual's data to be distinguished or re-identified. This principled approach supports data computation and analysis across many of Google’s core products and features.

Last summer, Google open sourced our foundational differential privacy library so developers and organizations around the world can benefit from this technology. Today, we’re announcing the addition of Go and Java to our library, an end-to-end solution for differential privacy: Privacy on Beam, and new tools to help developers implement this technology effectively.

We’ve listened to feedback from our developer community and, as of today, developers can now perform differentially private analysis in Java and Go. We’re working to bring these two libraries to full feature parity with C++.

We want all developers to have access to differential privacy, regardless of their level of expertise. Our new Privacy on Beam framework captures years of Googler developer experience and efficiency improvements in a comprehensive and easy-to-use solution that handles computation end-to-end. Built on Apache Beam, Privacy on Beam can reduce implementation mistakes, and take care of all the steps that are essential to differential privacy, including noise addition, partition selection, and contribution bounding. If you’re new to Apache Beam or differential privacy, our codelab can get you started.
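To make the three steps named above more concrete, here is a minimal sketch of a differentially private count per partition in plain Python. It is not the Privacy on Beam API: the function names (`laplace_noise`, `bounded_counts`, `private_counts`) and the `max_partitions` and `threshold` parameters are hypothetical and chosen only for illustration. The sketch assumes each record is a `(user_id, partition)` pair, uses simple thresholding on the noisy count as a stand-in for partition selection, and samples noise with ordinary floating-point arithmetic, which is not suitable for production use.

```python
import random
from collections import defaultdict


def laplace_noise(scale, rng):
    # The difference of two independent exponential samples with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def bounded_counts(records, max_partitions):
    """Contribution bounding: count each user at most once per partition,
    and in at most `max_partitions` partitions (first seen wins)."""
    kept = defaultdict(set)
    for user, partition in records:
        if partition in kept[user]:
            continue  # duplicate contribution to the same partition
        if len(kept[user]) < max_partitions:
            kept[user].add(partition)
    counts = defaultdict(int)
    for partitions in kept.values():
        for partition in partitions:
            counts[partition] += 1
    return dict(counts)


def private_counts(records, epsilon, max_partitions, threshold, rng):
    counts = bounded_counts(records, max_partitions)
    # Noise addition: one user changes at most `max_partitions` counts by 1
    # each, so the L1 sensitivity is `max_partitions`.
    scale = max_partitions / epsilon
    noisy = {p: c + laplace_noise(scale, rng) for p, c in counts.items()}
    # Partition selection (simplified): release only partitions whose noisy
    # count reaches the hypothetical `threshold`.
    return {p: v for p, v in noisy.items() if v >= threshold}


if __name__ == "__main__":
    records = [("u1", "a"), ("u1", "a"), ("u1", "b"), ("u1", "c"), ("u2", "a")]
    print(private_counts(records, epsilon=1.0, max_partitions=2,
                         threshold=1.5, rng=random.Random()))
```

Keeping the first partitions seen for each user makes the bounding step depend on input order; a real pipeline would more likely sample a user's contributions at random, and would choose the threshold to meet a target delta.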

Tracking privacy budgets is another challenge developers face when implementing differential privacy. So, we’re also releasing a new Privacy Loss Distribution tool for tracking privacy budgets. With this tool, developers can maintain an accurate estimate of the total cost to user privacy for collections of differentially private queries, and better evaluate the overall impact of their pipelines. Privacy Loss Distribution supports widely used mechanisms (such as Laplace, Gaussian, and randomized response) and can scale to hundreds of compositions.
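For readers new to budget tracking, the simplest accounting rule is basic sequential composition: the epsilons and deltas of successive queries add up. The sketch below illustrates only that rule. It is not the Privacy Loss Distribution tool, which computes tighter estimates, and the `BasicBudget` class and its methods are hypothetical names used for illustration.

```python
class BasicBudget:
    """Tracks (epsilon, delta) spending under basic sequential composition."""

    def __init__(self, total_epsilon, total_delta=0.0):
        self.total_epsilon = total_epsilon
        self.total_delta = total_delta
        self.spent_epsilon = 0.0
        self.spent_delta = 0.0

    def spend(self, epsilon, delta=0.0):
        # Under basic composition, the costs of successive queries simply add.
        new_epsilon = self.spent_epsilon + epsilon
        new_delta = self.spent_delta + delta
        if new_epsilon > self.total_epsilon or new_delta > self.total_delta:
            raise ValueError("privacy budget exceeded")
        self.spent_epsilon, self.spent_delta = new_epsilon, new_delta

    def remaining(self):
        return (self.total_epsilon - self.spent_epsilon,
                self.total_delta - self.spent_delta)


if __name__ == "__main__":
    budget = BasicBudget(total_epsilon=1.0, total_delta=1e-5)
    for _ in range(4):
        budget.spend(epsilon=0.25, delta=2e-6)
    print(budget.remaining())
```

Because basic composition is a worst-case bound, it tends to overestimate the total privacy cost when many queries are composed, which is the gap that more precise accounting is meant to close.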

We hope these new languages, tools, and features unlock differential privacy for even more developers. Continue to share your stories and suggestions with us at [email protected]—your feedback will help inform our future differential privacy launches and updates.

Acknowledgements

Software Engineers: Yurii Sushko, Daniel Simmons-Marengo, Christoph Dibak, Damien Desfontaines, Maria Telyatnikova
Research Scientists: Pasin Manurangsi, Ravi Kumar, Sergei Vassilvitskii, Alex Kulesza, Jenny Gillenwater, Kareem Amin


By: Miguel Guevara, Mirac Vuslat Basaran, Sasha Kulankhina, and Badih Ghazi – Google Privacy Team and Google Research

Apache Beam presents its mascot to the world

Four years after it graduated from The Apache Software Foundation’s incubator, Apache Beam welcomes a new addition to the family: its mascot, the Beam firefly! Apache Beam is an open source data streaming programming model that runs on the back end of some of your favorite apps. It is the technology behind many popular apps that need to process big data in real time. And the reason it has come this far is the community of developers that contribute to this open source project every week. In the last few months, we worked with Apache Beam contributors to collaboratively design a mascot for the project—a creative asset that can represent the values of the project and attract new users and contributors—to keep growing the project and expanding its reach.

In these four years, the Apache Beam project has been a busy… firefly. According to the Apache Software Foundation’s 2019 Annual Report, Beam ranks fourth in the top five most active projects by commits, and it ranks first in the top five most active developer mailing lists, showing a strong and transparent communication exchange within the community of developers. On top of that, 53 Meetup groups across the globe are directly or indirectly connected to sharing knowledge about Apache Beam use cases, applications and functionalities.

With this much momentum and enthusiasm for this project, it was a good opportunity to cement some of Apache Beam’s most valued characteristics, to help raise awareness of this rapidly growing project, and convert more users to contributors. “Projects with a mascot are more relatable. They signal that there is more to the project than its technical vision. It signals that there is more to the project than its code,” said project contributor Maximilian Michels. In November of last year, the community discussed adopting a mascot and as a result, 11 animal ideas emerged for a possible mascot: beaver, hedgehog, lemur, owl, salmon, trout, robot dinosaur, firefly, cuttlefish, dumbo octopus, and angler fish. After 48 contributors cast their votes, the collective decision was a firefly. In January of this year, artist Julián Bruno set out to bring the mascot to life. There were four rounds of feedback on different iterations of the mascot, plus a final vote, where 18 people participated; engagement increased with every new round. In the end, this process produced an original mascot, a model sheet (so that anyone may reproduce it), and two adaptations of the Beam firefly: one where it is learning, and one where it is doing what it does best… stream data! You can read more about this process on the Apache Software Foundation’s blog.

In a year that has presented a lot of challenges to bring people together, working on the mascot project with the Apache Beam community was refreshing, and felt like a medium for contributors to connect beyond code and technical questions. It is our wish that Apache Beam continues to grow as a project, and we hope to continue to support its community to: support newcomers, share what works, and collaborate with others to build great solutions.

By María Cruz, Google Open Source

The Apache Beam Community in 2019

2019 has already been a busy time for Apache Beam. The ASF blog featured our way of community building and we've had more Beam meetups around the world. Apache Beam also received the Technology of the Year Award from InfoWorld.
As these events happened, we were building up to the 20th anniversary of the Apache Software Foundation. The contributions of the Beam community were a part of Maximilian Michels's blog post on the success of the ASF's open source development model:

As the founder of the first Beam meetup in London back in 2017, seeing the community flourish on a larger and worldwide scale is something that makes me happy. And we have come quite a long way since 2017, both in terms of geographical spread:



As well as in numbers:



All of this culminates in two Beam Summits this year—one we already had a few weeks ago in Berlin, and the other which will take place in a few weeks in Las Vegas, where we worked together with Apache and the ApacheCon team.

In that spirit, let's have a more detailed overview of the things that have happened, what the next few months look like, and how we can foster even more community growth.

Meetups

We've had a flurry of activity, with several meetups in the planning process and more popping up globally over time. As diversity of contributors is a core ASF value, this geographic spread is exciting for the community. Here's a picture from the latest Apache Beam meetup organized at Lyft in San Francisco:



We have more Bay Area meetups coming soon, and the community is looking into kicking off meetups in Toronto and New York! In Europe, London had its first meetup of 2019 at the start of April, as did Stockholm at the start of May.
Meetup groups are becoming active in Berlin and New York also, so stay tuned for events there and more meetups internationally! If you are interested in starting your own meetup, feel free to reach out! Good places to start include our Slack channel, the dev and user mailing lists, or the Apache Beam Twitter. Even if you can’t travel to these meetups, you can stay informed on the happenings of the community. The talks and sessions from previous conferences and meetups are archived on the Apache Beam YouTube channel. If you want your session added to the channel, don’t hesitate to get in touch!

Summits

The first summit of the year was held in Berlin this past June. You can read about the inaugural edition of the Beam Summit Europe here. At these summits, you have the opportunity to meet with other Apache Beam creators and users, get expert advice, learn from the speaker sessions, and participate in workshops. We are proud to say that the Summit doubled in size this year with attendees from 24 countries across 4 continents.

You can find resources from this year’s Summit here:
  • The recordings can be found on our YouTube channel.
  • Presentations from the Summit are available via the website and in this folder.
  • We strongly encourage you to get involved again this year! You can still sign up for the upcoming summit in North America.
  • If you want to secure your ticket to attend the Beam Summit North America 2019, check out the ApacheCon website.
  • In case you want to get involved in speaking at events, do not hesitate to contact us via email or Twitter.

Why community engagement matters

Why we need a strong Apache Beam community:
  • We’re gaining lots of code contributions and need committers to review them
  • We want people to feel a sense of ownership of the project. By fostering this level of engagement, the work becomes even more exciting.
  • A healthy community has a further reach and leads to more growth. More hours can be contributed to the project as we can spread the work and ownership.
Why we are organizing these summits:
  • We’d like to give folks a place to meet and share ideas.
  • We know that offline interactions often change the nature of the online ones in a positive manner.
  • Building an active and diverse community is part of the Apache Way. These summits provide an opportunity for us to engage people from different locations, companies, and backgrounds.
By Matthias Baetens, Google Developer Expert for Cloud, Apache Beam committer and community organiser

A New Home for Google Open Source

Originally on Google Open Source Blog
Posted by Will Norris, Open Source Programs Office

Free and open source software has been part of our technical and organizational foundation since Google's early beginnings. From servers running the Linux kernel to an internal culture of being able to patch any other team's code, open source is part of everything we do. In return, we've released millions of lines of open source code, run programs like Google Summer of Code and Google Code-in, and sponsored open source projects and communities through organizations like Software Freedom Conservancy, the Apache Software Foundation, and many others.

Today, we're launching opensource.google.com, a new website for Google Open Source that ties together all of our initiatives with information on how we use, release, and support open source.

This new site showcases the breadth and depth of our love for open source. It will contain the expected things: our programs, organizations we support, and a comprehensive list of open source projects we've released. But it also contains something unexpected: a look under the hood at how we "do" open source.

Helping you find interesting open source

One of the tenets of our philosophy towards releasing open source code is that "more is better." We don't know which projects will find an audience, so we help teams release code whenever possible. As a result, we have released thousands of projects under open source licenses ranging from larger products like TensorFlow, Go, and Kubernetes to smaller projects such as Light My Piano, Neuroglancer, and Periph.io. Some are fully supported while others are experimental or just for fun. With so many projects spread across 100 GitHub organizations and our self-hosted Git service, it can be difficult to see the scope and scale of our open source footprint.

To provide a more complete picture, we are launching a directory of our open source projects which we will expand over time. For many of these projects we are also adding information about how they are used inside Google. In the future, we hope to add more information about project lifecycle and maturity.

How we do open source

Open source is about more than just code; it's also about community and process. Participating in open source projects and communities as a large corporation comes with its own unique set of challenges. In 2014, we helped form the TODO Group, which provides a forum to collaborate and share best practices among companies that are deeply committed to open source. Inspired by many discussions we've had over the years, today we are publishing our internal documentation for how we do open source at Google.

These docs explain the process we follow for releasing new open source projects, submitting patches to others' projects, and how we manage the open source code that we bring into the company and use ourselves. But in addition to the how, they outline why we do things the way we do, such as why we only use code under certain licenses or why we require contributor license agreements for all patches we receive.

Our policies and procedures are informed by many years of experience and lessons we've learned along the way. We know that our particular approach to open source might not be right for everyone—there's more than one way to do open source—and so these docs should not be read as a "how-to" guide. Similar to how it can be valuable to read another engineer's source code to see how they solved a problem, we hope that others find value in seeing how we approach and think about open source at Google.

To hear a little more about the backstory of the new Google Open Source site, we invite you to listen to the latest episode from our friends at The Changelog. We hope you enjoy exploring the new site!