Tag Archives: google open source

Open source by the numbers at Google

At Google, open source is at the core of our infrastructure, processes, and culture. As such, participation in these communities is vital to our productivity. Within OSPO (Open Source Programs Office), our mission is to bring the value of open source to Google and the resources of Google to open source. To ensure our actions match our commitment, in this post we will explore a variety of metrics intended to increase context, transparency, and accountability across all of the communities we engage with.

Why we contribute: Open source has become a pervasive component in modern software development, and Google is no exception. We use thousands of open source projects across our internal infrastructure and products. As participants in the ecosystem, our intentions are twofold: give back to the communities we depend on as well as expand support for open source overall. We firmly believe in open source and its ability to bring together users, contributors, and companies alike to deliver better software.

The majority of Google’s open source work is done within one of two hosting platforms: GitHub and git-on-borg, Google’s production Git service which integrates with Gerrit for code review and access control. While we also allow individual usage of Bitbucket, GitLab, Launchpad, and other platforms, this analysis will focus on GitHub and git-on-borg. We will continue to explore how best to incorporate activity across additional channels.

A little context about the numbers you’ll read below:
  • Business and personal: While git-on-borg hosts both internal and external Google created repos, GitHub is a mixture of Google projects, experimental efforts and personal projects created by Googlers.
  • Driven by humans: We have created many automated bots and systems that can propose changes on both hosting platforms. We have intentionally filtered these data to ensure we are only showing human initiated activities.
  • GitHub data: We are using GH Archive as the primary source for GitHub data, which is currently available as a public dataset on BigQuery. Google activity within GitHub is identified by self-registered accounts, which we expect underreports actual usage while employees acclimate to our policies.
  • Active counts: Where possible, we will show ‘active users’ and ‘active repositories’ defined by logged activity within each specified timeframe (for GH archive data, that’s any event type logged in the public GitHub event stream).
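As a rough illustration of the "active" definition above, the following sketch counts distinct human actors and distinct repositories per year from event records shaped like the public GitHub event stream (`actor.login`, `repo.name`, `created_at`). The set of registered logins, the sample events, and the bot rule (a `[bot]` login suffix) are assumptions made for this example; the post does not describe how the actual filtering is implemented.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical set of self-registered employee logins (not from the post).
REGISTERED_LOGINS = {"alice", "bob"}


def is_bot(login: str) -> bool:
    # Assumption: automation is recognizable by the "[bot]" suffix that
    # GitHub gives to app accounts; real filtering is likely more involved.
    return login.endswith("[bot]")


def active_counts(events, registered=REGISTERED_LOGINS):
    """Return {year: (active_users, active_repos)} for registered human actors."""
    users = defaultdict(set)
    repos = defaultdict(set)
    for event in events:
        login = event["actor"]["login"]
        if login not in registered or is_bot(login):
            continue
        # Any event type counts as activity, matching the definition above.
        year = datetime.strptime(event["created_at"], "%Y-%m-%dT%H:%M:%SZ").year
        users[year].add(login)
        repos[year].add(event["repo"]["name"])
    return {year: (len(users[year]), len(repos[year])) for year in sorted(users)}


# Made-up sample events for illustration only.
sample_events = [
    {"type": "PushEvent", "actor": {"login": "alice"},
     "repo": {"name": "org/a"}, "created_at": "2019-03-01T12:00:00Z"},
    {"type": "IssuesEvent", "actor": {"login": "alice"},
     "repo": {"name": "org/b"}, "created_at": "2019-04-01T12:00:00Z"},
    {"type": "PullRequestEvent", "actor": {"login": "bob"},
     "repo": {"name": "org/a"}, "created_at": "2019-05-01T12:00:00Z"},
    {"type": "PushEvent", "actor": {"login": "helper[bot]"},
     "repo": {"name": "org/c"}, "created_at": "2019-05-02T12:00:00Z"},
    {"type": "WatchEvent", "actor": {"login": "carol"},
     "repo": {"name": "org/d"}, "created_at": "2019-06-01T12:00:00Z"},
]

print(active_counts(sample_events))  # {2019: (2, 2)}
```

Using sets keeps each user and repository counted once per year, no matter how many events they generate.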
As numbers mean nothing without scale, let’s start by defining our applicable community: In 2019, more than 9% of Alphabet’s full time employees actively contributed to public repositories on git-on-borg and GitHub. While single digit, this percentage represents a portion of all full time Alphabet employees—from engineers to marketers to admins, across every business unit in Alphabet—and does not include those who contribute to open source projects outside of code. As our population has grown, so has our registered contributor base:
This chart shows the aggregate per year counts of Googlers active on public repositories hosted on GitHub and git-on-borg

What we create: As mentioned above, our contributing population works across a variety of Google, personal, and external repositories. Over the years, Google has released thousands of open source projects (many of which span multiple repositories) and ~2,600 are still active. Today, Google hosts over 8,000 public repositories on GitHub and more than 1,000 public repositories on git-on-borg. Over the last five years, we have doubled the number of public repos, growing our footprint by an average of 25% per year.

What we work on: In addition to our own repositories, we contribute to a wide pool of external projects. In 2019, Googlers were active in over 70,000 repositories on GitHub, pushing commits and/or opening pull requests on over 40,000 repositories. Note that more than 75% of the repos with Googler-opened pull requests were outside of Google-managed organizations (on GitHub).
This chart shows per year counts of activities initiated by Googlers on GitHub

What we contribute: For contribution volume on GitHub, we chose to focus on push events and on opened and merged pull requests rather than on commits, because commit counts on their own are difficult to contextualize. Note that push events and pull requests typically include one or more commits per event. In 2019, Googlers created over 570,000 issues, opened over 150,000 pull requests, and created more than 36,000 push events on GitHub. Since 2015, we have doubled our annual counts of issues created and push events, and more than tripled the number of opened pull requests. Over the last five years, more than 80% of pull requests opened by Googlers have been closed and merged into active repositories.

How we spend our time: Combining these two classes of metrics—contributions and repos—provides context on how our contributors focus their time. On GitHub: in 2015, about 40% of our opened pull requests were concentrated in just 25 repositories. However, over the next four years, our activity became more distributed across a larger set of projects, with the top 25 repos claiming about 20% of opened pull requests in 2019. For us, this indicates a healthy expansion and diversification of interests, especially given that this activity represents both Google, as well as a community of contributors that happen to work at Google.
This chart splits the total per year counts of Googler created pull requests on GitHub by Top 25 repos vs the remainder ranked by number of opened pull requests per repo per year.
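The concentration figure described above (the share of opened pull requests that fall in the top 25 repositories) can be sketched as follows. The function name and the sample data are hypothetical; the input is assumed to be one repository name per opened pull request within a single year.

```python
from collections import Counter


def top_n_share(pr_repos, n=25):
    """Fraction of opened pull requests that landed in the n busiest repos."""
    counts = Counter(pr_repos)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total


# Made-up data: one repo name per opened pull request.
opened_prs = ["org/a"] * 6 + ["org/b"] * 2 + ["org/c", "org/d"]
print(top_n_share(opened_prs, n=1))  # 0.6
```

A falling value from one year to the next, as in the shift from about 40% to about 20% reported above, means activity is spreading across more repositories.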

Open source contribution is about more than code

Every day, Google relies on the health and continuing availability of open source, and as such we actively invest in the security and sustainability of open source and its supply chain in three key areas:
  • Security: In addition to building security projects like OpenTitan and gVisor, Google’s OSS-Fuzz project aims to help other projects identify programming errors in software. As of the end of 2019, over 250 projects were using OSS-Fuzz, which had filed over 16,000 bugs, including 3,500 security vulnerabilities.
  • Community: Open source projects depend on communities of diverse individuals. We are committed to improving community sustainability and growth with programs like Google Summer of Code and Season of Docs. Over the last 15 years, about 15,000 students from over 105 countries have participated in Google Summer of Code, along with 25,000 mentors in more than 115 countries working on more than 680 open source projects.
  • Research: At the end of 2019, Google invested $1 million in open source research, partnering with researchers at UVM, with the goal to deepen understanding of how people, teams and organizations thrive in technology-rich settings, especially in open-source projects and communities.
Learn more about our open source initiatives at opensource.google.

By Sophia Vargas – Researcher, Google Open Source Programs Office

Welcoming 1,000+ Interns to Open Source at Google

One of the core tenets of open source is about finding ways for people to build great things by working together, regardless of location. This summer, through our intern program we’re gathering incredible talent from schools around the world, Googlers with a passion for open source, and project maintainers both inside and outside of Google to see what we can build together. 

Onboarding that many interns and turning them into new open source contributors was no easy task. So in partnership with the Intern Programs team and engineering teams across Google, we’ve grounded our planning by answering four key questions. 

How can we make our internship program a force for good in the open source ecosystem?

We knew that having more than a thousand interns contribute to open source projects could have a huge impact. However, many projects aren’t set up to onboard dozens of new contributors at one time, and many maintainers can’t take on hundreds of new pull requests. Early on, we established best practices for intern placement and support. We committed to:
  • Aligning interns’ work with project priorities to advance the project while also allowing the interns to learn and grow their skills.
  • Proactively communicating with project maintainers and contributors, keeping them in the loop on timelines and logistics.
  • Looking beyond Google. We prioritized projects that have full-time Google engineering support. That includes Google-owned projects like Go, TensorFlow, and Chromium, as well as Google-created projects we invest heavily in, such as Kubernetes, Apache Beam, and Tekton. But Google also has full-time engineers working on outside projects we rely on, so our interns will also be working on projects like Envoy, Rust, and Apache Maven.

How can we introduce the interns to open source at Google?

We are determined to support and empower the interns as they become lifelong contributors to open source. Every Noogler in engineering learns about using and contributing to open source in a training run by our Open Source Programs Office. With an unprecedented number of interns working on open source projects, we are also providing additional resources, including a platform for questions, office hours, enrichment talks, and partnerships with external open source organizations.

How can we learn from our interns about the experience of contributing to open source at Google and beyond?

We see a huge opportunity to listen to our interns this summer. By meeting with interns and hosts—as well as surveying the entire class of interns at the end of the summer—we can look for ways to improve open source at Google and the contributor experience for projects they’re working on. We’re excited to learn from the internship program and from interns’ perspectives working in and contributing to open source.

How can we have an impact on these students that carries on throughout their careers?

One of my favorite questions to ask Googlers who are active in open source is how they were first introduced to open source. There’s a well-trodden path of a developer fixing an annoying bug, then a few more bugs, then adding small features, becoming a core contributor, and eventually a project maintainer. That process requires persistence and patience, and projects lose a lot of great developers along the way.

But... What if your first experience with open source is being welcomed into a large and thriving community of contributors? What if you get to contribute to open source full time, mentored by creators and maintainers of the project you’re working on, collaborating across organizations and across time zones? Our hope is that this kind of experience will leave a lasting impression on this summer’s interns and that they’ll continue to contribute to open source for a long time to come.

By Jen Phillips, Google Open Source

COVID-19: How Google is helping the open source community

COVID-19 has affected so much of the world around us, and open source is no exception. Project resilience is being challenged by COVID-19. Community members have even less time to contribute. Event cancellations are impacting networking, collaboration, and fundraising.


Google wants to do everything it can to help. This means that it’s even more important for the Google Open Source Programs Office to step up our commitment to citizenship. We’re taking several steps to support industry organizations and the projects that we participate in to help them operate during this time.

Virtual Events Support

  • Participating in talks internally and externally to Google to share knowledge and insight into open source projects and practices with the wider open source communities.
  • To support the shift from an offline to online events model, we created an online guide to share resources and event planning knowledge: Open Source Virtual Events Guide.

Talent

  • COVIDActNow is a multidisciplinary team working to provide disease intelligence and data analysis on COVID in the U.S. Google contributed to this project by improving its data pipeline to allow county level data visualization, which provides more localized insight for crisis planning.
  • Nextstrain is a platform for real-time tracking of pathogen evolution. Google contributed engineering, design, and translation resources to help scientists conduct research into real-time tracking of pathogen evolution.
  • Schema.org - Google led Schema.org rapid response designs for structured data markup to contribute to the COVID-19 global response, leading to the UK making similar announcements.
  • Google’s annual internship program was converted to a digital program where interns will focus on open source projects, allowing projects to gain new contributors in a non-traditional environment.
  • Google Summer of Code brings over 1100 university students from around the world together with open source communities, many of which are working on various humanitarian efforts related to COVID-19. The program is completely online so students can work with their mentors remotely, allowing all organizations to continue receiving the support they need.
The impact from COVID-19 will have long-term effects on many organizations and projects that may not be immediately apparent. In the coming months, we will monitor the needs of projects and organizations across open source. We understand the value of open source not just to the tech world, but the impact it has on bringing communities together; Google has a long standing history in open source and we will continue supporting our community to stay strong during and after the passing of COVID-19.

We encourage folks who have the time and ability to support open source communities to do so by getting involved and reaching out directly to organizations that interest you. This is a time for all of us to come together and lift up each other and open source.

By Megan Byrd-Sanicki and Radha Jhatakia, Google Open Source

Google Summer of Code 2020 Statistics: Part 1

Since 2005, Google Summer of Code (GSoC) has been bringing new developers into the open source community every year. This year, we accepted 1,199 students from 66 countries into the 2020 GSoC program to work with 199 open source organizations over the summer. Students began coding June 1st and will spend the next 12 weeks working closely under the guidance of mentors from their open source communities.

Each year we like to share program statistics about the GSoC program and the accepted students and mentors involved in the program. 6,626 students from 121 countries submitted 8,903 applications for this year’s program.

Accepted Students

  • 86.6% are participating in their first GSoC
  • 71.7% are first time applicants to GSoC

Degrees

  • 77.4% are undergraduates, 16.8% are masters students, and 5.8% are in PhD programs
  • 72.5% are Computer Science majors, 3.6% are Mathematics majors, 23.9% are other majors including many from engineering fields like Electrical, Mechanical, Aerospace, etc.
  • Students are studying in a variety of fields including Atmospheric Science, Finance, Neuroscience, Economics, Biophysics, Linguistics, Geology, Pharmacy and Real estate.

Proposals

There were a record number of students submitting proposals for the program this year:
  • 6,626 students (18.2% increase from last year)
  • 121 countries
  • 8,902 proposals submitted

Registrations

We had a record breaking 51,244 students from 178 countries(!) register for the program this year—that’s a 65% increase in registrations from last year’s record numbers!

In our next GSoC statistics post, we will do a deeper dive into the schools and mentors for the 2020 program.

By Stephanie Taylor, Google Open Source

Google Summer of Code 2020 is now open for mentor organization applications!

We are looking for open source projects and organizations to participate in the 16th annual Google Summer of Code (GSoC)! GSoC is a global program that draws university student developers from around the world to contribute to open source projects. Each student will spend three months working on a coding project with the support of volunteer mentors from participating open source organizations, mid-May to mid-August.

Last year, 1,276 students worked with 206 open source organizations and over 2,000 mentors. Organizations include small and medium sized open source projects, as well as a number of umbrella organizations with many sub-projects under them (Apache Software Foundation, Python Software Foundation, etc.).

Our 2020 goal is to accept more organizations into their first GSoC than ever before! We ask that veteran organizations refer other organizations they think would be a good fit to participate in GSoC.

You can apply to be a mentoring organization for GSoC starting today. The deadline to apply is February 5 at 19:00 UTC. Organizations chosen for GSoC 2020 will be publicly announced on February 20.

Please visit the program site for more information on how to apply and review the detailed timeline of important deadlines. We also encourage you to check out the Mentor Guide and our short video on why open source projects apply to be a part of the program.

Best of luck to all of the open source mentoring organization applicants!

By Stephanie Taylor, Google Open Source

Paving the way for a more diverse open source landscape: The First OSS Contributor Summit in Mexico

“I was able to make my first contribution yesterday, and today it was merged. I'm so excited about my first steps in open source,” a participant said about the First Summit for Open Source Contributors, which took place this September in Guadalajara, México.
How do you involve others in open source? How can we make this space more inclusive for groups with low representation in the field?

With these questions in mind and the call to contribute to software that is powering the world's favorite products, Google partnered with Software Guru magazine, Wizeline Academy, OSOM (a consortium started by Googler Griselda Cuevas to engage more Mexican developers in open source), IBM, Intel, Salesforce and Indeed to organize the First Summit for Open Source Contributors in Mexico. The Apache Software Foundation and the CNCF were some of the organizations that sponsored the conference. The event consisted of two days of training and presentations on a selection of open source projects, including Apache Beam, GNOME, Node.js, Istio, Kubernetes, Firefox, Drupal, and others. Through 19 workshops, participants were able to learn about the state of open source in Latin America, and also get dedicated coaching and hands-on practice to become active contributors in OSS. While unpaid, these collaborations represent the most popular way of learning to code and building a portfolio for young professionals, or for people looking to make a career shift into tech.


As reported by many advocacy groups in the past few years, diversity remains a major gap in the tech industry. Only an average of 8.4% of employees in ten of the leading tech companies are Latinx(1). The gap is even bigger in open source software, where only 2.6% of committers to Apache projects are Latinx(2). Diversity in tech is not just the right thing to do, it is also good business: bringing more diverse participation into software development will result in more inclusive and successful products that serve a more comprehensive set of use cases and needs in any given population.


While representation numbers in the creation of software are still looking grim, the use of OSS is growing fast: It is estimated that Cloud and big-data OSS technologies will grow five times by 2025 in Latin America. The main barrier for contributing? Language. 

The First Summit for Open Source Contributors set out to close this fundamental gap between tech users and its makers. To tackle this problem, we created, in partnership with other companies, 135 hours of content in Spanish for 481 participants, which produced over 200 new contributors across 19 open source projects. When asked why contributions from the region are so low, 41% of participants said it was due to lack of awareness, and 34% said they thought their contributions were not valuable. After the event, 47% of participants reported that the workshops and presentations provided them with information or guidance on how to contribute to specific projects, and 39% said the event helped them overcome their fear of contributing. Almost 100% of participants stated that they plan to continue contributing to Open Source in the near future… and if they do, they would raise representation of Latinx in Open Source to 10%.
Organizing Team
This event left us with a lot of hope for the future of diversity and inclusion in open source. Going forward, we hope to continue supporting this summit in Latin America, and look for ways of reproducing this model in other regions of the world, as well as designing proactive outreach campaigns in other formats.

View more pictures of the event here.
View some of the recorded presentations here.


By: María Cruz for Google Open Source

(1) Aggregate data from TechCrunch: https://techcrunch.com/2019/06/17/the-future-of-diversity-and-inclusion-in-tech/
(2) Data from the last Apache Software Foundation Committer Survey, applied in 2016, 765 respondents (13% of committers)