2023 Open Source Contributions: A Year in Review


At Alphabet, open source remains a critical component of our business and internal systems. We depend on thousands of upstream projects and communities to run our infrastructure, products, and services. Within the Open Source Programs Office (OSPO), we continue to focus on investing in the sustainability of open source communities and expanding access to open source opportunities for contributors around the world. As participants in this global ecosystem, our goal with this report is to provide transparency and to report our work within and around open source communities.

In 2023, roughly 10% of Alphabet’s full-time workforce actively contributed to open source projects. This percentage has remained roughly consistent over the last five years, indicating that our open source contributions have remained proportional to Alphabet’s size over time. Over the last five years, Google has released more than 7,000 open source elements, representing a mix of new projects, features, libraries, SDKs, datasets, sample code, and more.


Most open source projects we contribute to are outside of Alphabet

In 2023, employees from Alphabet interacted with more than 70,000 public repositories on GitHub. Over the last five years, more than 70% of the non-personal GitHub repositories receiving Alphabet contributions were outside of Google-managed organizations. Our top external projects (by number of unique contributors at Alphabet) include both Google-initiated projects such as Kubernetes, Apache Beam, and gRPC as well as community-led projects such as LLVM, Envoy, and web-platform-tests.

In addition to Alphabet employees supporting external projects, in 2023 Alphabet-led projects received contributions from more than 180,000 external contributors (unique GitHub accounts not affiliated with Alphabet).


Open source remains vital to industry collaboration and innovation

As the technology industry turns to focus on novel AI and machine learning technologies, open source communities have continued to serve as a shared resource and avenue for collaboration on new frameworks and emerging standards. In addition to launching new projects such as Project Open Se Cura (an open-source framework to accelerate the development of secure, scalable, transparent and efficient AI systems), we also collaborated with AI/ML industry leaders including Alibaba, Amazon Web Services, AMD, Anyscale, Apple, Arm, Cerebras, Graphcore, Hugging Face, Intel, Meta, NVIDIA, and SiFive to release OpenXLA to the public for use and contribution. OpenXLA is an open source ML compiler enabling developers to train and serve highly-optimized models from all leading ML frameworks on all major ML hardware. In addition to technology development, Google’s OSPO has been supporting the OSI's Open Source AI definition initiative, which aims to clearly define 'Open Source AI' by the end of 2024.


Investing in the next generation of open source contributors

As a longstanding consumer of and contributor to open source projects, we believe it is vital to continue funding established communities as well as to invest in the next generation of contributors to ensure the sustainability of open source ecosystems. In 2023, OSPO provided $2.4M in sponsorships and membership fees to more than 60 open source projects and organizations. Note that this value only represents OSPO's financial contribution; other teams across Alphabet also directly fund open source work. In addition, we continue to support our longstanding programs:

  • In its 19th year, Google Summer of Code (GSoC) enabled more than 900 individuals to contribute to 168 organizations. Over the lifetime of this program, more than 20,000 individuals from 116 countries have contributed to more than 1,000 open source organizations across the globe.
  • In its fifth year, Google Season of Docs provided direct grants to 13 open source projects to improve open source project documentation. Each organization also created a case study to help other open source projects learn from their experience.
A map of the world highlighting every country that has had Google Summer of Code participants

Securing our shared supply chain remains a priority

We continue to invest in improving the security posture of open source projects and ecosystems. Since launching in 2016, Google's free OSS-Fuzz code testing service has helped discover and fix over 10,000 vulnerabilities and 34,000 bugs across more than 1,200 projects. In 2023, we added features, expanded our OSS-Fuzz Rewards Program, and continued our support for academic fuzzing research. We also applied the generative power of LLMs to improve fuzz testing. In addition to this work, we’ve been:

  • Helping more projects adopt security best practices as well as identify and remediate vulnerabilities: Over the last year, the upstream team has proposed security improvements to more than 181 critical open source projects including widely-used projects such as NumPy, etcd, XGBoost, Ruby, TypeScript, LLVM, curl, Docker, and more. In addition to this work, GOSST continues to support OSV-Scanner to help projects find existing vulnerabilities in their dependencies, and enable comprehensive detection and remediation by providing commit-level vulnerability detail for over 30,000 existing CVE records from the NVD.
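The core loop behind a fuzzing service like OSS-Fuzz can be illustrated with a toy sketch. The target parser and fuzzer below are hypothetical stand-ins, not OSS-Fuzz code: generate random inputs, run the target, and record inputs that trigger failures.

```python
import random

def parse_length_prefixed(data: bytes) -> bytes:
    """Deliberately buggy example target: trusts the length prefix."""
    if not data:
        return b""
    length = data[0]
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload

def fuzz(target, iterations=1000, seed=0):
    """Feed random byte strings to the target and collect crashing inputs."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(iterations):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(8)))
        try:
            target(data)
        except ValueError as exc:
            crashes.append((data, str(exc)))
    return crashes

crashes = fuzz(parse_length_prefixed)
print(f"found {len(crashes)} crashing inputs")
```

Real fuzzers like those OSS-Fuzz runs are coverage-guided and far more sophisticated, but the principle is the same: automated, randomized input generation surfaces bugs that hand-written tests miss.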

Our open source work will continue to grow and evolve to support the changing needs of our communities. Thank you to our colleagues and community members who continue to dedicate personal and professional time supporting the open source ecosystem. Follow our work at opensource.google.


Appendix: About this data

This report features metrics provided by many teams and programs across Alphabet. Regarding the code and code-adjacent activity data, we want to share more details about how those metrics were derived.

  • Data sources: These data represent the activities of Alphabet employees on public repositories hosted on GitHub and our internal production Git service Git-on-Borg. These sources represent a subset of open source activity currently tracked by Google OSPO.
      • GitHub: We continue to use GitHub Archive as the primary source for GitHub data, which is available as a public dataset on BigQuery. Alphabet activity within GitHub is identified by self-registered accounts, which we estimate underreports actual activity.
      • Git-on-Borg: This is a Google-managed Git service that hosts some of our larger, long-running open source projects such as Android and Chromium. While we continue to develop on this platform, most of our open source activity has moved to GitHub to increase exposure and encourage community growth.
  • Driven by humans: We have created many automated bots and systems that can propose changes on various hosting platforms. We have intentionally filtered these data to focus on human-initiated activities.
  • Business and personal: Activity on GitHub reflects a mixture of Alphabet projects, third-party projects, experimental efforts, and personal projects. Our metrics report on all of the above unless otherwise specified.
  • Alphabet contributors: Please note that unless additional detail is specified, activity counts attributed to Alphabet open source contributors will include our full-time employees as well as our extended Alphabet community (temps, vendors, contractors, and interns). In 2023, full-time employees at Alphabet represented more than 95% of our open source contributors.
  • GitHub Accounts: For counts of GitHub accounts not affiliated with Alphabet, we cannot assume that one account is equivalent to one person, as multiple accounts could be tied to one individual or bot account.
  • Active counts: Where possible, we show ‘active users’ as those with logged activity (excluding ‘WatchEvent’) within a specified timeframe (a month, year, etc.), and ‘active repositories’ and ‘active projects’ as those that have enough activity to meet our internal active-project criteria and have not been archived.
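As an illustration of the ‘active users’ definition above, here is a minimal sketch of counting distinct actors while excluding WatchEvent activity and automation accounts. The event records and field names are hypothetical, not the actual GitHub Archive schema.

```python
# Hypothetical event records in the rough shape of GitHub Archive rows:
# each has an actor login, an event type, and a month.
EVENTS = [
    {"actor": "alice", "type": "PushEvent", "month": "2023-01"},
    {"actor": "alice", "type": "WatchEvent", "month": "2023-01"},
    {"actor": "bob", "type": "PullRequestEvent", "month": "2023-01"},
    {"actor": "release-bot", "type": "PushEvent", "month": "2023-01"},
    {"actor": "carol", "type": "WatchEvent", "month": "2023-01"},
]

def active_users(events, month, bot_suffixes=("-bot", "[bot]")):
    """Count distinct human actors with logged activity in a month,
    excluding WatchEvent (starring) and accounts that look automated."""
    actors = {
        e["actor"]
        for e in events
        if e["month"] == month
        and e["type"] != "WatchEvent"
        and not e["actor"].endswith(bot_suffixes)
    }
    return sorted(actors)

print(active_users(EVENTS, "2023-01"))  # → ['alice', 'bob']
```

Here carol is excluded because her only activity is a WatchEvent (a star), and release-bot is filtered as automation, reflecting the "driven by humans" filter described above.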

By Sophia Vargas – Analyst and Researcher, OSPO

Post-Quantum Cryptography: Standards and Progress

The National Institute of Standards and Technology (NIST) just released three finalized standards for post-quantum cryptography (PQC), covering public-key encapsulation and two forms of digital signatures. In progress since 2016, this standardization effort represents a major milestone toward keeping information on the Internet secure and confidential for many years to come.

Here's a brief overview of what PQC is, how Google is using PQC, and how other organizations can adopt these new standards. You can also read more about PQC and Google's role in the standardization process in this 2022 post from Cloud CISO Phil Venables.

What is PQC?

Encryption is central to keeping information confidential and secure on the Internet. Today, most Internet sessions in modern browsers are encrypted to prevent anyone from eavesdropping or altering the data in transit. Digital signatures are also crucial to online trust, from code signing proving that programs haven't been tampered with, to signals that can be relied on for confirming online identity.

Modern encryption technologies are secure because the computing power required to "crack the code" is very large; larger than any computer in existence today or the foreseeable future. Unfortunately, that's an advantage that won't last forever. Practical large-scale quantum computers are still years away, but computer scientists have known for decades that a cryptographically relevant quantum computer (CRQC) could break existing forms of asymmetric key cryptography.

PQC is the effort to defend against that risk, by defining standards and collaboratively implementing new algorithms that will resist attacks by both classical and quantum computers.

You don't need a quantum computer to use post-quantum cryptography, or to prepare. All of the standards released by NIST today run on the classical computers we currently use.

How is encryption at risk?

While a CRQC doesn't exist yet, devices and data from today will still be relevant in the future. Some risks are already here:

  • Stored data: Through an attack known as Store Now, Decrypt Later, encrypted data captured and saved by attackers is stored for later decryption with the help of as-yet unbuilt quantum computers.
  • Hardware products: Defenders must ensure that future attackers cannot forge a digital signature and implant compromised firmware or software updates on pre-quantum devices that are still in use.

For more information on CRQC-related risks, see our PQC Threat Model post.

How can organizations prepare for PQC migrations?

Migrating to new cryptographic algorithms is often a slow process, even when weaknesses affect widely-used crypto systems, because of organizational and logistical challenges in fully completing the transition to new technologies. For example, NIST deprecated SHA-1 hashing algorithms in 2011 and recommends complete phase-out by 2030.

That’s why it's crucial to take steps now to improve organizational preparedness, independent of PQC, with the goal of making your transition to PQC easier.

These crypto agility best practices can be enacted anytime:

  • Cryptographic inventory: Understanding where and how your organization uses cryptography includes knowing what cryptographic algorithms are in use and, critically, managing key material safely and securely.
  • Key rotation: Any new cryptographic system will require the ability to generate new keys and move them to production without causing outages. Just like testing recovery from backups, regularly testing key rotation should be part of any good resilience plan.
  • Abstraction layers: You can use a tool like Tink, Google's multi-language, cross-platform open source library, designed to make it easy for non-specialists to use cryptography safely and to switch between cryptographic algorithms without extensive code refactoring.
  • End-to-end testing: PQC algorithms have different properties. Notably, public keys, ciphertexts, and signatures are significantly larger. Ensure that all layers of the stack function as expected.
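The abstraction-layer and key-rotation practices above can be sketched together. The following is a minimal illustrative stand-in, not Tink's actual API: callers sign and verify through one interface, so the key or the underlying algorithm can be swapped without touching call sites.

```python
import hashlib
import hmac
import secrets

class Signer:
    """Toy abstraction layer (illustrative only, not Tink): callers use
    sign/verify/rotate and never touch keys or algorithm details directly."""

    # Registry of primitives; a PQC algorithm would slot in the same way.
    ALGORITHMS = {
        "hmac-sha256": hashlib.sha256,
        "hmac-sha512": hashlib.sha512,
    }

    def __init__(self, algorithm="hmac-sha256"):
        self._algorithm = algorithm
        self._key = secrets.token_bytes(32)

    def sign(self, message: bytes) -> bytes:
        digest = self.ALGORITHMS[self._algorithm]
        return hmac.new(self._key, message, digest).digest()

    def verify(self, message: bytes, tag: bytes) -> bool:
        return hmac.compare_digest(self.sign(message), tag)

    def rotate(self, algorithm=None):
        """Generate a fresh key (and optionally switch algorithms) --
        the operation a resilience plan should exercise regularly."""
        if algorithm:
            self._algorithm = algorithm
        self._key = secrets.token_bytes(32)

signer = Signer()
tag = signer.sign(b"firmware-update")
assert signer.verify(b"firmware-update", tag)

signer.rotate(algorithm="hmac-sha512")   # old tags no longer verify
assert not signer.verify(b"firmware-update", tag)
```

An application written against an interface like this can migrate to a PQC signature scheme by registering a new primitive and rotating, rather than refactoring every call site.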

Our 2022 paper "Transitioning organizations to post-quantum cryptography" provides additional recommendations to help organizations prepare, and this recent post from the Google Security Blog has more detail on cryptographic agility and key rotation.

Google's PQC Commitments

Google takes these risks seriously, and is taking steps on multiple fronts. Google began testing PQC in Chrome in 2016 and has been using PQC to protect internal communications since 2022. In May 2024, Chrome enabled ML-KEM by default for TLS 1.3 and QUIC on desktop. ML-KEM is also enabled on Google servers. Connections between Chrome Desktop and Google's products, such as Cloud Console or Gmail, are already experimentally protected with post-quantum key exchange.

Google engineers have contributed to the standards released by NIST, as well as standards created by ISO, and have submitted Internet Drafts to the IETF for Trust Expressions, Merkle Tree Certificates, and managing state for hash-based signatures. Tink, Google's open source library that provides secure and easy-to-use cryptographic APIs, already provides experimental PQC algorithms in C++, and our engineers are working with partners to produce formally verified PQC implementations that can be used at Google, and beyond.

As we make progress on our own PQC transition, Google will continue to provide PQC updates on Google services, with updates to come from Android, Chrome, Cloud, and others.

Celebrating Māori and Pasifika Innovation: A Virtual Internship Journey



When 24 of the brightest, most curious minds from the Māori and Pasifika communities step into Google New Zealand's halls, amazing things happen.

Even though it was through a virtual internship, these students and working professionals didn't let that stop them from dreaming up solutions to some of the Pacific's most pressing challenges, and picking up important career skills.

This virtual internship program, a first in Aotearoa, was part of our collaboration with TupuToa to foster Māori and Pasifika representation in the tech industry.

Over three weeks, the 24 interns, divided into four groups of six, immersed themselves in mentorship and innovation, guided by four Googler mentors based in New Zealand and the United States, including Rob Coyne, Jacob Chalkley, Justin Keown and Hautahi Kingi.

The internship threw down the gauntlet: "If you could develop and change a Google product, what would it be and why?"

Fuelled by the challenge and their Googler mentors, the interns responded with four novel and innovative ideas, each with the potential to transform the lives of Māori and Pasifika communities. 


These include an AI-powered mental health product for Māori and Pasifika communities, earlier disaster alerts, enhanced ways to trace Māori and Pasifika ancestry using Google tools, and a more efficient Google Scholar indexing system for research related to these communities. 

As we delve into the feasibility of these four concepts, mentor Hautahi Kingi reflects on the profound impact of this internship program. His own journey, from growing up on a marae near Whanganui to becoming a Google Data Scientist in New York, resonates deeply with the aspirations of these interns. 

Hautahi Kingi says he’s proud to have become the representation he longed for as a young person - a symbol of success for Māori and Pasifika individuals in the tech industry.

“It was a privilege to have the opportunity to work with these impressive and talented rangatahi,” he says. “The future looks bright for tech in Aotearoa.”

For TupuToa Initiative’s chief executive, Anne Fitisemanu, this was a much-needed step in the right direction. “Programmes like this internship are the foundation for TupuToa, that really help support and grow curious minds and foster innovation. The talent pool in our communities is vast and deep, and we’re proud to work alongside our partners to provide a platform to seek and nurture it.”

Google New Zealand is proud and thrilled that this program has ignited a spark in these 24 youths. They leave with a deeper passion for tech, connections with the tech industry, and skills that will serve them well in any field they choose, among them problem-solving, collaboration, and critical thinking. 

We’re excited to see what the future holds for them and grateful to TupuToa for their partnership. We look forward to working together to build an even more inclusive tech landscape in and around New Zealand.

Give it up for these 24 interns! Amish Kumar, Anaya Cole, Asifa Hanif, Gloria Tawake, Hayden Richard-Marsters, Lachlan McCreanney, Lauryn Maxwell, Lenalei Chan Ting, Lomaloma Pepine, Lucas Bawden, Malia Carter, Maria Munsanda Analega Ioane, McKay Leehmann Rimbao, Michael Heavey, Miracle Faamalosi, Paulo Opetaia, Rahera Williams, Sakura Kawakami Potaka-Dewes, Tele Tamati, Tom Tamaira, Vensel Margraff, Zachariah Hunt.

And a big shout-out to the winning team who clinched the Google challenge with their idea for a mental health virtual assistant, designed to bridge the gap between young people and mental health resources: Sakura Kawakami Potaka-Dewes, Zachariah Hunt, Lucas Bawden, Maria Munsanda Analega and Lachlan McCreanney (with mentor Justin Keown).

By Nathan Laing, Head of Scaled, Google Customer Solutions, Google New Zealand

Improvements to conversion adjustment uploads

Starting on September 9, 2024, Google Ads API users will no longer need to wait 24 hours before uploading conversion adjustments - they can be uploaded immediately after the original conversion has been uploaded or recorded by Google tags.

This means that you will no longer need to keep track of the 24-hour window before uploading conversion adjustments, and can stop checking for certain error codes and retrying those upload requests.

Specifically, the following changes will take effect:

  1. The following error codes will no longer be returned in responses from the UploadConversionAdjustments method, and will no longer be visible in diagnostic reports:
  2. Conversion adjustments that would previously be rejected with these error codes will count towards the pending_count in diagnostics until they’re processed, at which point they’ll be counted towards either the successful_count or failed_count fields. This might take up to 24 hours.

Here is how these changes will affect older Google Ads API versions v15 and v16:

  1. The following error codes will no longer be returned in responses from the UploadConversionAdjustments method, and will no longer be visible in diagnostic reports:
  2. Any conversion that would have triggered these codes will, in diagnostic reports, count towards the total_event_count metric while being processed. Once processing is completed they will be counted towards either the successful_count or failed_count. This might take up to 24 hours.

What do I need to do?

  1. Remove any logic from your application that waits before uploading adjustments, and begin uploading conversion adjustments at any time after the original conversion has been uploaded.
  2. Modify your application logic and business processes so that you are not tracking the two conversion adjustment errors that are being removed.
  3. If you rely on the successful or failed event count metrics, revisit your application logic with the understanding that, when using v17, some uploaded events may at times be represented as pending.
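The change can be sketched as follows. The helper functions below are hypothetical stand-ins, not the real Google Ads API client: the old flow had to track a 24-hour window before uploading adjustments, while the new flow uploads immediately and treats in-flight adjustments as pending.

```python
import time

def upload_conversion(conversion):
    """Stand-in for recording a conversion (hypothetical, not the real API)."""
    conversion["uploaded_at"] = time.time()
    return conversion

def upload_adjustment_old(conversion, adjustment, now=None):
    """Pre-September-2024 behavior: callers had to track a 24-hour window
    and retry uploads that arrived too early."""
    now = now if now is not None else time.time()
    if now - conversion["uploaded_at"] < 24 * 3600:
        raise RuntimeError("too early: retry after the 24-hour window")
    return {"conversion": conversion["id"], "adjustment": adjustment}

def upload_adjustment_new(conversion, adjustment):
    """New behavior: adjustments can be uploaded immediately; while being
    processed (up to 24 hours) they count toward pending_count, then move
    to successful_count or failed_count."""
    return {"conversion": conversion["id"], "adjustment": adjustment,
            "status": "pending"}

conv = upload_conversion({"id": "order-123"})
result = upload_adjustment_new(conv, "RESTATEMENT")
print(result["status"])  # → pending
```

In practice this means the wait-and-retry scaffolding around adjustment uploads can simply be deleted, with pending counts monitored instead.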

If you have any questions or need help, see the Google Ads API support page for options.

Google Meet hardware event logs are now available in the security investigation tool and BigQuery

What’s changing 

We’re pleased to announce a new set of features to help you conduct deeper analysis and more flexible issue detection within your Google Meet hardware fleet:
 
First, Meet hardware log events are now captured in the security investigation tool. Within the tool, you’ll be able to view historical events for your devices and create customized alerts. You can also click out to Meet hardware log events from individual device pages (Devices > Google Meet Devices > [Device Name]), allowing you to find information on specific devices even faster.

Meet hardware logs in the security investigation tool




Second, through integration with BigQuery, Meet hardware logs can be imported from the security investigation tool and analyzed at scale. This powerful new capability can be used to build customized views of historical data across your entire hardware fleet. For example, you can use this data to identify which devices are the most used across your organization, which devices are experiencing the most issues within a specific timeframe, and more.
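The kind of fleet-wide analysis described above can be sketched with plain Python over exported rows. The field and event names below are illustrative, not the actual BigQuery schema for Meet hardware logs.

```python
from collections import Counter

# Hypothetical rows in the rough shape of exported Meet hardware log events.
LOG_EVENTS = [
    {"device": "Room-A", "event": "call_started"},
    {"device": "Room-A", "event": "call_started"},
    {"device": "Room-B", "event": "call_started"},
    {"device": "Room-B", "event": "peripheral_disconnected"},
    {"device": "Room-C", "event": "peripheral_disconnected"},
]

def most_used(events):
    """Rank devices by number of calls started."""
    calls = Counter(e["device"] for e in events if e["event"] == "call_started")
    return calls.most_common()

def most_issues(events):
    """Rank devices by number of issue events logged."""
    issues = Counter(e["device"] for e in events
                     if e["event"] == "peripheral_disconnected")
    return issues.most_common()

print(most_used(LOG_EVENTS))   # → [('Room-A', 2), ('Room-B', 1)]
```

In production the equivalent aggregation would be a GROUP BY query run directly in BigQuery against the imported log table.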




Specifically, you’ll be able to filter by the following details: 



Getting started

Rollout pace


Availability

The security investigation tool is available for Google Workspace:
  • Enterprise Standard and Plus
  • Education Standard and Plus
  • Enterprise Essentials Plus
  • Frontline Standard
  • Cloud Identity Premium
Reporting logs in BigQuery is available for Google Workspace:
  • Enterprise Standard and Plus
  • Education Standard and Plus
  • Enterprise Essentials Plus
  • Frontline Standard

Manage all Calendar interop settings from the Admin console

What’s changing

Previously, the interoperability settings that allow Calendar users to see the availability of colleagues using Outlook, and vice versa, were split between two separate locations: the Admin console and https://calendar.google.com/Exchange/tools. Going forward, all interoperability settings will be housed in the Admin console at Apps > Google Workspace > Settings for Calendar > Calendar Interop management, making it easier for admins to view and manage their interop setups.



Getting started


Rollout pace

Availability

  • Available for Google Workspace customers except Google Workspace Essentials and Workspace Individual Subscribers 

National 8/11 Day: Excavation Safety for Homeowners

Today is National 8/11 Day: August 11th highlights the importance of safe excavation during home projects. By reinforcing the importance of calling the national 811 hotline before digging, everyone can help prevent damage to underground utilities and ensure the safety of workers and homeowners alike.


In the United States, nearly every digging project, regardless of its size or location, requires contacting 811 a few days in advance of breaking ground. This free, national service connects homeowners with their local utility companies and others who may have infrastructure assets underground, who will mark the locations of underground lines such as gas, electric, and water pipes, as well as internet lines. By calling 811 before digging, homeowners can avoid accidentally damaging these lines, which can lead to injuries, property damage, loss of service, and costly repairs.

Excavation accidents are far too common, causing damage to underground utilities and posing a significant risk to public safety. According to the Common Ground Alliance’s (“CGA”) annual DIRT Report (issued in conjunction with CGA’s affiliated Damage Prevention Institute), over 213,000 excavation-related damages occurred in the United States in the year 2022 alone. However, data collection, transparency, and industry collaboration of the nature reflected in the DIRT Report are instrumental tools in the fight to mitigate the frequency of excavation-related accidents. We have been impressed with the work that CGA and the Damage Prevention Institute have done and would encourage others in the industry to join and participate.


Homeowners play a vital role in preventing excavation accidents by contacting 811 by phone or online before any digging project, no matter how small. Even if you are just planting a tree, installing a fence, or building a deck, it's crucial to contact 811 to have underground utilities marked before you start digging.

Once a call comes in to 811, a Utility Locator will come out to mark the location of existing underground infrastructure by spray painting the sidewalk or lawn or putting in flags. Being a Utility Locator is a tough job. These individuals are helping to save lives by taking affirmative safety measures designed to avoid major damages that in extreme circumstances lead to loss of life. Please help us get this important message out by sharing with your neighbors that these Locators are doing this job as a public service. In fact, you might even want to thank a Locator next time you see them! By working together, we can create a safer environment for everyone and make sure everyone’s utility services stay online.

Our internet services should “just work”, and typically that is the case. The foundation of the internet is millions of miles of fiber optic cable that connect each of us to the online world. These cables are continuously at risk of damage from construction activities (and the same could be said for utilities as well). Locators are the first line of defense, working diligently to provide a safe working environment for excavators, and ensuring access to the internet whenever we need it.


Thank you for being proactive when it comes to making that 811 call - on 8/11 and every day. GFiber is committed to doing our part. We use 811 for all of our construction projects, and we’re going even further. By leveraging large public databases of historic utility damages, a detailed model of the GFiber network, and 811 locate tickets, GFiber is developing an AI model that can better predict when and where damage to our network may occur. This will allow GFiber to implement additional protective measures in high-risk areas.

Posted by Ariane Schaffer, Government Affairs & Public Policy Manager, & Kelly Bell, Network Deployment & Operations Lead