
News about Google’s open source projects and programs

Finding Critical Open Source Projects

Comic graphic of modern digital infrastructure
Open source software (OSS) has long suffered from a "tragedy of the commons" problem. Most organizations, large and small, use open source software every day to build modern products, but many OSS projects are struggling for the time, resources, and attention they need. This is a resource allocation problem, and Google, as part of the Open Source Security Foundation (OpenSSF), can help solve it together with the community. We need ways to connect the critical open source projects we all rely on with organizations that can provide them with adequate support.

Criticality of an open source project is difficult to define; what might be a critical dependency for one consumer of open source software may be entirely absent for another. However, arriving at a shared understanding and framework allows us to have productive conversations about our dependencies. Simply put, we define criticality to be the influence and importance of a project.

In order for the OpenSSF to fund these critical open source projects, they need to be identified first. For this purpose, we are releasing a new project, “Criticality Score,” under the OpenSSF. The criticality score indicates a project’s criticality (a number between 0 and 1) and is derived from various project usage metrics in a fully automated way. Our initial evaluation metrics include a project’s age, number of individual contributors and organizations involved, user involvement (in terms of new issue requests and updates), and a rough estimate of its dependencies using commit mentions. We also provide a way to add your own metric(s). For example, you can add internal project usage data to re-adjust a project's criticality score for individualized prioritization needs.
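The project's exact formula and parameters live in its repository; as a rough sketch of the idea (the metric names, weights, and thresholds below are illustrative, not the project's actual parameters), each signal can be log-scaled and capped at a threshold so no single metric dominates, then combined as a weighted average to yield a score between 0 and 1:

```python
import math

# Illustrative (name, weight, threshold) triples -- NOT the project's
# real parameters. Real signals would come from e.g. the GitHub API.
METRICS = [
    ("created_since_months", 1.0, 120),
    ("contributor_count", 2.0, 5000),
    ("org_count", 1.0, 10),
    ("recent_issue_count", 0.5, 5000),
]

def criticality_score(signals):
    """Combine raw usage signals into a score between 0 and 1."""
    total_weight = sum(weight for _, weight, _ in METRICS)
    score = 0.0
    for name, weight, threshold in METRICS:
        value = signals.get(name, 0)
        # Log-scale each signal and cap its contribution at the threshold.
        score += weight * math.log(1 + value) / math.log(1 + max(value, threshold))
    return score / total_weight

print(criticality_score({"created_since_months": 100,
                         "contributor_count": 1200,
                         "org_count": 8,
                         "recent_issue_count": 300}))
```

In this shape, adding your own metric is just another (name, weight, threshold) entry, which is the spirit of the project's support for custom signals.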

Identifying these critical projects is only the first step in making security improvements. OpenSSF is also exploring ways to provide maintainers of these projects with the resources they need. If you're a maintainer of a critical software package and are interested in getting help, funding, or infrastructure to run your project, reach out to the OpenSSF’s Securing Critical Projects working group here.

Check out the Criticality Score project on GitHub, including a list of critical open source projects. Though we have made some progress on this problem, we have not solved it and are eager for the community’s help in refining these metrics to identify critical open source projects.

By Abhishek Arya, Kim Lewandowski, Dan Lorenc and Julia Ferraioli – Google Open Source

Expanding Fuchsia’s open source model

Fuchsia is a long-term project to create a general-purpose, open source operating system, and today we are expanding Fuchsia’s open source model to welcome contributions from the public.

Fuchsia is designed to prioritize security, updatability, and performance, and is currently under active development by the Fuchsia team. We have been developing Fuchsia in the open, in our git repository for the last four years. You can browse the repository history at https://fuchsia.googlesource.com to see how Fuchsia has evolved over time. We are laying this foundation from the kernel up to make it easier to create long-lasting, secure products and experiences.

Starting today, we are expanding Fuchsia's open source model to make it easier for the public to engage with the project. We have created new public mailing lists for project discussions, added a governance model to clarify how strategic decisions are made, and opened up the issue tracker for public contributors to see what’s being worked on. As an open source effort, we welcome high-quality, well-tested contributions from all. There is now a process to become a member to submit patches, or a committer with full write access.

In addition, we are also publishing a technical roadmap for Fuchsia to provide better insights for project direction and priorities. Some of the highlights of the roadmap are working on a driver framework for updating the kernel independently of the drivers, improving file systems for performance, and expanding the input pipeline for accessibility.

Fuchsia is an open source project that is inclusive by design, from the architecture of the platform itself, to the open source community that we’re building. The project is still evolving rapidly, but the underlying principles and values of the system have remained relatively constant throughout the project. More information about the core architectural principles is available in the documentation: secure, updatable, inclusive, and pragmatic.

Fuchsia is not ready for general product development or as a development target, but you can clone, compile, and contribute to it. It has support for a limited set of x64-based hardware, and you can also test it with Fuchsia’s emulator. You can download and build the source code by following the getting started guide.

Fuchsia emulator startup with fx emu
If you would like to learn more about Fuchsia, join our mailing lists and browse the documentation at fuchsia.dev. You can now be part of the project and help build the future of this operating system. We are looking forward to receiving contributions from the community as we grow Fuchsia together.

By Wayne Piekarski, Developer Advocate for Fuchsia

OpenTitan at one year: the open source journey to secure silicon

During the past year, OpenTitan has grown tremendously as an open source project and is on track to provide transparent, trustworthy, and cost-free security to the broader silicon ecosystem. OpenTitan, the industry’s first open source silicon root of trust, has rapidly increased engineering contributions, added critical new partners, selected our first tapeout target, and published a comprehensive logical security model for the OpenTitan silicon, among other accomplishments.

OpenTitan by the Numbers

OpenTitan has doubled many metrics in the year since our public launch, including design size, verification testing, software test suites, documentation, and unique collaborators. Crucially, this growth has been both in the design verification collateral required for high-volume, production-quality silicon and in the digital design itself, a first for any open source silicon project.
  • More than doubled the number of commits at launch: from 2,500 to over 6,100 (across OpenTitan and the Ibex RISC-V core sub-project).
  • Grew to over 141K lines of code (LOC) of System Verilog digital design and verification.
  • Added 13 new IP blocks to grow to a total of 29 distinct hardware units.
  • Implemented 14 Device Interface Functions (DIFs) for a total of 15 KLOC of C11 source code and 8 KLOC of test software.
  • Increased our design verification suite to over 66,000 lines of test code for all IP blocks.
  • Expanded documentation to over 35,000 lines of Markdown.
  • Accepted contributions from 52 new unique contributors, bringing our total to 100.
  • Increased community presence as shown by an aggregate of over 1,200 GitHub stars between OpenTitan and Ibex.
One year of OpenTitan and Ibex growth on GitHub: the total number of commits grew from 2,500 to over 6,100.
High quality development is one of OpenTitan’s core principles. Besides our many style guides, we require thorough documentation and design verification for each IP block. Each piece of hardware starts with auto-generated documentation to ensure consistency between documentation and design, along with extensive, progressively improving, design verification as it advances through the OpenTitan hardware stages to reach tapeout readiness.
One year of growth in Design Verification: from 30,000 to over 65,000 lines of testing source code. Each color represents design verification for an individual IP block.

Innovating for Open Silicon Development

Besides writing code, we have made significant advances in developing the processes and security framework needed for high quality, secure open source silicon development. Design success is not measured by the hardware alone: highly functional software and a firm contract between the two, with well-defined interfaces and well-understood behavior, play an equally important role.

OpenTitan’s hardware-software contract is realized by our DIF methodology, yet another way in which we ensure hardware IP quality. DIFs are a form of hardware-software co-design and the basis of our chip-level design verification testing infrastructure. Each OpenTitan IP block requires a style guide-compliant DIF, and this year we implemented 14 DIFs for a total of 15 KLOC of C11 source code and 8 KLOC of tests.

We also reached a major milestone by publishing an open Security Model for a silicon root of trust, an industry first. This comprehensive guidance demonstrates how OpenTitan provides the core security properties required of a secure root of trust. It covers provisioning, secure boot, device identity, attestation, and our ownership transfer mechanism, among other topics.

Expanding the OpenTitan Ecosystem

Besides engineering effort and methodology development, the OpenTitan coalition added two new Steering Committee members in support of lowRISC as an open source not-for-profit organization: Seagate, a leader in storage technology, and Giesecke and Devrient Mobile Security, a major producer of certified secure systems. We also chartered our Technical Committee to steer technical development of the project. Technical Committee members are drawn from across our organizational and individual contributors, approving 9 technical RFCs and adding 11 new project committers this past year.

On the strength of the OpenTitan open source project’s engineering progress, we are excited to announce today that Nuvoton and Google are collaborating on the first discrete OpenTitan silicon product. Much like the Linux kernel is itself not a complete operating system, OpenTitan’s open source design must be instantiated in a larger, complete piece of silicon. We look forward to sharing more on the industry’s first open source root of trust silicon tapeout in the coming months.

Onward to 2021

OpenTitan’s future is bright, and as a project it fully demonstrates the potential for open source design to enable collaboration across disparate, geographically far-flung teams and organizations, to enhance security through transparency, and to enable innovation in the open. We could not do this without our committed project partners and supporters, to whom we owe all this progress: Giesecke and Devrient Mobile Security, Western Digital, Seagate, the lowRISC CIC, Nuvoton, ETH Zürich, and many independent contributors.

Interested in contributing to the industry's first open source silicon root of trust? Contact us here.

By Dominic Rizzo, OpenTitan Lead – Google Cloud

Announcing the Atheris Python Fuzzer

Fuzz testing is a well-known technique for uncovering programming errors. Many of these detectable errors have serious security implications. Google has found thousands of security vulnerabilities and other bugs using this technique. Fuzzing is traditionally used on native languages such as C or C++, but last year, we built a new Python fuzzing engine. Today, we’re releasing the Atheris fuzzing engine as open source.

What can Atheris do?

Atheris can be used to automatically find bugs in Python code and native extensions. Atheris is a “coverage-guided” fuzzer, which means that it repeatedly tries various inputs to your program while watching how it executes, attempting to find interesting paths through the code.
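To make "coverage-guided" concrete, here is a toy illustration of the feedback loop (this is not how Atheris is implemented; it is a from-scratch sketch using Python's tracing hook): mutate inputs at random, and keep any input that executes a line of the target that no earlier input reached.

```python
import random
import sys

def toy_coverage_fuzz(target, seed=b"", rounds=500):
    """Toy coverage-guided fuzzing loop: inputs that reach new lines
    of `target` are kept in the corpus and mutated further."""
    corpus = [seed]
    seen = set()

    def run_with_coverage(data):
        lines = set()
        def tracer(frame, event, arg):
            if event == "line":
                lines.add((frame.f_code.co_name, frame.f_lineno))
            return tracer
        sys.settrace(tracer)
        try:
            target(data)
        finally:
            sys.settrace(None)
        return lines

    for _ in range(rounds):
        data = bytearray(random.choice(corpus))
        # Mutate: flip, insert, or delete one random byte.
        op = random.randrange(3)
        if op == 0 and data:
            data[random.randrange(len(data))] ^= random.randrange(1, 256)
        elif op == 1:
            data.insert(random.randrange(len(data) + 1), random.randrange(256))
        elif data:
            del data[random.randrange(len(data))]
        data = bytes(data)
        covered = run_with_coverage(data)
        if not covered <= seen:  # reached a new line: keep this input
            seen |= covered
            corpus.append(data)
    return corpus
```

Atheris does the equivalent with far better instrumentation and mutation strategies, handing each generated input to the test function you provide.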

One of the best uses for Atheris is building differential fuzzers. These are fuzzers that look for differences in the behavior of two libraries that are intended to do the same thing. One of the example fuzzers packaged with Atheris does exactly this to compare the Python “idna” package to the C “libidn2” package. Both of these packages are intended to decode and resolve internationalized domain names. However, the example fuzzer idna_uts46_fuzzer.py shows that they don’t always produce the same results. If you ever decided to purchase a domain containing (Unicode codepoints [U+0130, U+1df9]), you’d discover that the idna and libidn2 libraries resolve that domain to two completely different websites.
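The shape of a differential fuzzer is simple: run both implementations on the same input and raise if they disagree. As a hedged sketch (the real example binds to libidn2 through native code; here the two "implementations" are just stand-ins, two ways of lowercasing ASCII bytes from the standard library):

```python
def check_agreement(data: bytes):
    """Differential check: the two code paths below should always agree
    on ASCII input; any mismatch is reported as a crash by the fuzzer."""
    try:
        text = data.decode("ascii")
    except UnicodeDecodeError:
        return  # not valid input for both implementations; skip
    result_a = data.lower()                  # implementation A: bytes.lower
    result_b = text.lower().encode("ascii")  # implementation B: str.lower
    if result_a != result_b:
        raise RuntimeError(f"Mismatch on {data!r}: {result_a!r} != {result_b!r}")

# To fuzz this property with Atheris (assumes `pip3 install atheris`):
#   import atheris, sys
#   atheris.Setup(sys.argv, check_agreement)
#   atheris.Fuzz()
```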

In general, Atheris is useful on pure Python code whenever you have a way of expressing what the “correct” behavior is, or at least what behaviors are definitely not correct. This could be as complex as custom code in the fuzzer that evaluates the correctness of a library’s output, or as simple as a check that no unexpected exceptions are raised. This last case is surprisingly useful. While the worst outcome from an unexpected exception is typically denial-of-service (by causing a program to crash), unexpected exceptions tend to reveal more serious bugs in libraries. As an example, the one YAML parsing library we tested Atheris on says that it will only raise YAMLErrors; however, yaml_fuzzer.py detects numerous other exceptions, such as ValueError from trying to interpret “-_” as an integer, or TypeError from trying to use a list as a key in a dict. (Bug report.) This indicates flaws in the parser.
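A minimal "no unexpected exceptions" harness looks like the sketch below, using json from the standard library as a stand-in for the YAML library discussed above, and assuming for illustration that ValueError and UnicodeDecodeError are the only documented failure modes:

```python
import json

def check_no_unexpected_exceptions(data: bytes):
    """Property check: anything other than the documented exceptions
    propagates out, which the fuzzer reports as a finding."""
    try:
        json.loads(data)
    except (ValueError, UnicodeDecodeError):
        pass  # documented failure modes for malformed input

# Hooked up to Atheris the same way as any other test function:
#   import atheris, sys
#   atheris.Setup(sys.argv, check_no_unexpected_exceptions)
#   atheris.Fuzz()
```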

Finally, Atheris supports fuzzing native Python extensions, using libFuzzer. libFuzzer is a fuzzing engine integrated into Clang, typically used for fuzzing C or C++. When using libFuzzer with Atheris, Atheris can still find all the bugs previously described, but can also find memory corruption bugs that only exist in native code. Atheris supports the Clang sanitizers Address Sanitizer and Undefined Behavior Sanitizer. These make it easy to detect corruption when it happens, rather than far later. In one case, the author of this document found an LLVM bug using an Atheris fuzzer (now fixed).

What does Atheris support?

Atheris supports fuzzing Python code and native extensions in Python 2.7 and Python 3.3+. When fuzzing Python code, using Python 3.8+ is strongly recommended, as it allows for much better coverage information. When fuzzing native extensions, Atheris can be used in combination with Address Sanitizer or Undefined Behavior Sanitizer.

OSS-Fuzz is a fuzzing service hosted by Google, where we execute fuzzers on open source code free of charge. OSS-Fuzz will soon support Atheris!

How can I get started?

Take a look at the repo, in particular the examples. For fuzzing pure Python, it’s as simple as:

pip3 install atheris

And then, just define a TestOneInput function that runs the code you want to fuzz:

import atheris
import sys


def TestOneInput(data):
    if data == b"bad":
        raise RuntimeError("Badness!")


atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()

That’s it! Atheris will repeatedly invoke TestOneInput and monitor the execution flow, until a crash or exception occurs.

For more details, including how to fuzz native code, see the README.


By Ian Eldred Pudney, Google Information Security

From MLPerf to MLCommons: moving machine learning forward

Today, the community of machine learning researchers and engineers behind the MLPerf benchmark is launching an open engineering consortium called MLCommons. For us, this is the next step in a journey that started almost three years ago.


Early in 2018, we gathered a group of industry researchers and academics who had published work on benchmarking machine learning (ML), in a conference room to propose the creation of an industry standard benchmark to measure ML performance. Everyone had doubts: creating an industry standard is challenging under the best conditions and ML was (and is) a poorly understood stochastic process running on extremely diverse software and hardware. Yet, we all agreed to try.

Together, along with a growing community of researchers and academics, we created a new benchmark called MLPerf. The effort took off. MLPerf is now an industry standard with over 2,000 submitted results and multiple benchmark suites that span systems from smartphones to supercomputers. Over that time, the fastest result submitted to MLPerf for training the classic ML network ResNet improved by over 13x.

We created MLPerf because we believed in three principles:
  • Machine learning has tremendous potential: Already, machine learning helps billions of people find and understand information through tools like Google’s search engine and translation service. Active research in machine learning could one day save millions of lives through improvements in healthcare and automotive safety.
  • Transforming machine learning from promising research into widespread industrial practice requires investment in common infrastructure -- especially metrics: Much like computing in the ‘80s, real innovation is mixed with hype, and adopting new ideas is slow and cumbersome. We need good metrics to identify the best ideas, and good infrastructure to make adoption of new techniques fast and easy.
  • Developing common infrastructure is best done by an open, fast-moving collaboration: We need the vision of academics and the resources of industry. We need the agility of startups and the scale of leading tech companies. Working together, a diverse community can develop new ideas, launch experiments, and rapidly iterate to arrive at shared solutions.
Our belief in the principles behind MLPerf has only gotten stronger, and we are excited to be part of the next step for the MLPerf community with the launch of MLCommons.

MLCommons aims to accelerate machine learning to benefit everyone. MLCommons will build a common set of tools for ML practitioners, including:
  • Benchmarks to measure progress: MLCommons will leverage MLPerf to measure speed, but also expand benchmarking to other aspects of ML, such as accuracy and algorithmic efficiency. ML models continue to increase in size and consequently cost. Sustaining growth in capability will require learning how to do more (accuracy) with less (efficiency).
  • Public datasets to fuel research: MLCommons’ new People’s Speech project seeks to develop a public dataset that, in addition to being larger than any other public speech dataset by more than an order of magnitude, better reflects diverse languages and accents. Public datasets drive machine learning like nothing else; consider ImageNet’s impact on the field of computer vision.
  • Best practices to accelerate development: MLCommons will make it easier to develop and deploy machine learning solutions by fostering consistent best practices. For instance, MLCommons’ MLCube project provides a common container interface for machine learning models to make them easier to share, experiment with (including benchmarking), develop, and ultimately deploy.
Google believes in the potential of machine learning, the importance of common infrastructure, and the power of open, collaborative development. Our leadership in co-founding, and deep support in sustaining, MLPerf and MLCommons has echoed our involvement in other efforts like TensorFlow and NNAPI. Together with the MLCommons community, we can improve machine learning to benefit everyone.

Want to get involved? Learn more at mlcommons.org.


By Peter Mattson – ML Metrics, Naveen Kumar – ML Performance, and Cliff Young – Google Brain

Best practices for accessibility for virtual events

Most of our open source events moved from in-person to virtual this year. Due to the sudden change, however, not every event has been as accessible as it could be. We took this issue seriously and worked with one of our accessibility experts, Neighborhood Access, to share best practices for our community. We hope this will help you organize your digital events!


Virtual events can be a wonderful way to make information more accessible to people, including attendees with disabilities, but hosts often fall short in providing the appropriate logistical support needed for this audience to fully engage in the programming. In this article, we’ll go over some best practices for hosting virtual events to make them accessible to the d/Deaf/Hard of Hearing, Blind/Low Vision, Developmentally Disabled, and Neurodivergent communities. Since some of these accommodations require a bit of planning ahead of the event, we’ll use a timeline format to discuss what needs to be arranged and when.

Initial Planning Stages

Selecting a Platform
When selecting a video conferencing platform for your virtual event, you will want to make sure the platform has the following features:
  1. Ability to dial into the event via phone (see Accessible Event Links for more info)
  2. Ability to assign captioning to someone in the call or via a third party within the platform
Working With Your Speakers
Once you read through the rest of this guide and decide what accessibility tools you’ll use, you should brief your speakers on those practices so everyone is on the same page. For example, if you are going to do an image description of yourself (as detailed in Incorporating Audio Cues), make sure your guests do the same to provide a consistently accessible experience.

At Least 2 Weeks Before the Event

Disability Accommodation Requests
If you are hosting a large event with many attendees, you should provide accommodations like interpreters and audio description without people having to request them. If you are hosting a smaller event (<20 guests), the RSVP form for your event should link to a separate form where you will track accommodation requests. This should be a simple form that collects no identifying information about the person filling it out. A person should not have to disclose their disability in order to request accommodations; an anonymous form will get you all the information you need. The one question you need to ask is: “Are there any disability accommodations you need us to provide in order for you to fully participate in this event?” You should aim to collect this information at least two weeks before the event, so that you have ample time to arrange necessary accommodations.

Accessible Event Links
When sending out a link to join a video call, be sure to also include the dial-in number if you will not be providing an interpreter for the event. d/Deaf/HoH people who can access the Video Relay System (VRS) will need this number to be able to dial their interpreter into the meeting, and it is much easier for these attendees to have this ahead of time rather than trying to ask for it on the day of the event.

American Sign Language (ASL) Interpreters
This section applies to events that have a mostly-US-based audience. If you have an international audience, you will need to ask attendees which country's sign language is needed. Interpreters are booked to do everything from attending doctor’s appointments with d/Deaf/HoH patients to interpreting for larger scale events. Therefore, they are quite busy, and require booking a few weeks before you need their services. One of the easiest ways to find an interpreter is through your state or city’s Deaf and Hard of Hearing Services provider. Most agencies are called “(State name) Deaf and Hard of Hearing Services”, and if you search this term you should find the right agency. Most agencies will have you fill out a short form with a few details about your event, and then they will work to connect you with an available interpreter. Once connected, the interpreter may ask you a few additional questions about the event, so they can have any field-specific signs and names prepared ahead of time.

Live Captioning
There are several companies that offer live captioning (sometimes referred to as CART, or Communication Access Real-time Translation) for virtual events. If you are able to hire one of these services, they are your best bet—they know the ins and outs of the technology—and this gives you one less thing to stress about on the day of your event. They will usually schedule a time to perform a test run with you a few days ahead of the event, so you both can be sure things are in working order.

If you are unable to hire a service to provide live captions, most online meeting platforms give you the option to assign captioning to someone. You can have a volunteer transcribe the event, and the captions will show up for anyone who opts to see them.

Interpreting vs Live Captioning
There is a hierarchy in terms of which services to utilize in the event that you must choose between them. The best case scenario, and what you should absolutely strive for, is to have both an interpreter and live captioning. There are many people who can benefit from captions who may not benefit from sign language interpreting, and vice versa. In the event that you have to choose either an interpreter or live captioning, go with live captioning, as more people will be able to benefit from it.

One Week Before the Event

Make Your Visual Materials Accessible
There are a few steps you’ll need to take to ensure that your visual materials (slideshows and pictures) are accessible to Blind/Low Vision attendees. Many Blind/Low Vision people use screen readers, which read aloud the text that is on a screen. Since you cannot use screen readers during a presentation (both because the reader cannot read text in video format and because attendees will want to hear what you’re saying), you will need to provide a copy of your visual materials to attendees either before or after the event so they can review them. If you are unable to share the exact materials, try to share a version of them that has the same text. This ensures that everyone has full access to all event materials, albeit possibly at different times. Before sharing visual materials, be sure of two things:
  1. You are not sharing an image file of text. If you are sharing a slideshow, for example, do not send a PDF printout of the slides (screen readers will not recognize these as text). Instead, download the slideshow in an editable format and share that file directly.
  2. You must add alt text or captions to any pictures. How you will do this varies from software to software, but instructions can usually be found on the software’s website or help forum. Here is a guide on how to do this in Google Slides.
Format Visual Elements
People with visual processing disorders, such as dyslexia, may find some fonts harder to read than others. While needs can vary from person to person, it is generally agreed that sans serif fonts (such as Arial) are easier to read than serif fonts (such as Times New Roman). Use sans serif fonts wherever possible in your visual components.

Timing the Event
Most people can benefit from a five minute break every half hour or so (the 30:5 Rule), but this is especially true for people with disabilities that impact their need to go to the bathroom, and also people who struggle with focus. Following the 30:5 Rule ensures that people have time to take care of their needs throughout the event, and that everyone will be continually focused and refreshed. Schedule these five minute breaks into your event timeline.

During the Event

Incorporating Audio Cues
When you first introduce yourself at the event, provide a verbal description of yourself and your surroundings. This allows Blind/Low Vision attendees to learn key visual characteristics about each presenter, much like how a sighted person might remember someone by their statement necklace or unique hairstyle. Here is a format you can follow:
“Hi, I’m (name). I’m going to do a quick image description of myself for any Blind/Low-Vision attendees. I’m a (race) (gender), and I’m wearing (color of shirt, notable accessories). Behind me is (color of wall, clock, etc).”
Additionally, when switching between speakers, it is crucial that you briefly state your name before talking. Many peoples’ voices sound similar, and this practice is helpful to Blind/Low Vision people who need to know who is speaking. This is also important for d/Deaf/HoH attendees who are utilizing their own interpreter via VRS, because the interpreter needs to be able to sign to them the name of the person speaking.

Content Warnings
If your presentation includes very loud noises, flashing lights, or rapidly-transitioning imagery, you need to give attendees a 10-second warning before presenting that content. This is a safety measure for people who are prone to seizures and others who are sensitive to these elements.

Verbally Highlighting Key Visual Features
If you are sharing an important image with attendees (a chart or graph, for example), be sure to verbally describe all of the important information the image relays: general trends, names of data groups, axis titles, etc. Additionally, be sure to at least summarize the text on screen (if applicable), since not everyone is able to read and listen at the same time. Blind/Low Vision attendees will not be able to fully participate in your event if they are not getting the same access to information as sighted people are.

Be Prepared to Answer Questions from the Interpreter
Sometimes, an interpreter may need to ask for clarification on something you said, either because your audio cut out, or because you used a term they have not heard before and they want to make sure they are signing it correctly. Answer interpreter questions right away, so d/Deaf/HoH attendees are able to keep up with what you’re saying.

By following these guidelines, you will ensure that your event is inclusive and engaging for all attendees. If you have questions on how to implement some of these measures, or about how your organization can benefit from becoming more accessible, visit neighborhoodaccess.org to chat with our Accessibility Consulting Team. Let’s work together to create virtual events that work for everyone!

Introduction by Teresa Terasaki, Google Open Source Programs Office
Guidelines by guest author Juliana Good, Founder and Consulting Lead – Neighborhood Access

How Google’s 2020 summer interns became the newest contributors in open source

Our internship program changed in structure this year to accommodate a virtual environment, and we enjoyed seeing the intern involvement in our open source teams. Now, as the Summer 2020 Interns have departed Google, we’ve seen widespread impact across these OSS projects. Some accomplishments from the intern community included:
  • Mohamed Ibrahim, a Software Engineering major at the University of Ontario Institute of Technology, interned on the Earth Engine team in Geo. He built a web app from scratch that allows Earth Engine developers, who are primarily climate and remote-sensing researchers, to build rich UIs for their Earth Engine Apps without needing to write any code. Mohamed also learned two coding languages that were new to him, writing over 10,000 lines of TypeScript and 480 lines of Go, and merging over 30 PRs during one internship.
App creator demo
Web app demo
  • Vismita Uppalli, a Cloud intern and Computer Science major at the University of Virginia, wrote a tutorial showing how to use AI Platform Operators on Apache Airflow, which is now published in the official Airflow docs.
  • Colin Marsch interned with the Android team and published a blog post for Android developers, "Re-writing the AOSP DeskClock App in Kotlin," which has reached over 1,600 viewers! He is scheduled to graduate from the University of Waterloo with a major in Computer Science in Spring 2021.
  • Satyam Ralhan worked in the MyHeart team in Research to build a first-of-its-kind Android app that engages users in conversations to encourage healthy habits. He created a demo, which explores the different phases of the app and how it learns to personalize lifestyle suggestions for various kinds of users. He is in his fourth year at the Indian Institute of Technology, Kanpur, studying Computer Science and Engineering.
    MyHeart app demo
  • An Apigee intern, Nicole Gizzo, presented her work analyzing API vocabularies at the API Specifications Conference. She is majoring in Computer Science and Cognitive Science at Rensselaer Polytechnic Institute, and will graduate in May 2021.
  • The OSS Fuzzing Interns have found and reported over 600 bugs to critical open source projects like the Linux kernel and Nginx, over 100 of which were security vulnerabilities.
  • Madelyn Dubuk, a SWE Intern on the Cloud DPE team and a Computer Science major at USC, worked with three other interns to create a full stack web app to help better understand test flakiness, and enjoyed working directly with other interns.
Initial feedback from our interns indicates that their OSS contributions won’t stop when their internships end. Of the interns who worked on OSS projects, 69% plan to continue contributing to OSS, enjoying the ability to talk about their work and have a broader impact. Beyond the impact on OSS, we’ve seen tremendous professional growth for our interns. Lucia Cantu-Miller, an intern on the Chrome team and Computer Science major at ITESM Monterrey, reflected she was, “proud of seeing how I’ve grown during the internship. As the days passed I became more confident in my work and in asking questions, I have grown a lot as a person and as a professional student.” Lucia wasn’t the only intern to experience this: 98% of interns who worked on OSS feel that Google is a good place to start a career. The success of this summer’s internships is due in large part to the many contributions of Google’s OSS community, from the intern hosts to the project champions and mentors. We can’t thank them enough for their support.

By Emma Stamp, Google Engineering Education

Kubernetes: Efficient Multi-Zone Networking with Topology Aware Routing

Topology Aware Routing of Services, a feature first introduced as alpha in the Kubernetes 1.17 release, aims to solve an often overlooked issue with Kubernetes Services: they are not region aware.

Kubernetes Services provide a uniform, durable, and easy-to-use method of accessing a variety of different backend applications, most commonly an exposed app running within your pods. Kubernetes does this by reserving a static virtual IP and DNS name that are unique within the cluster, effectively turning each Service into a simple load balancer.

While this model works well for small clusters or applications, it can start to break down if you have thousands of nodes, your cluster spans multiple regions, or your application is latency sensitive. By default, each endpoint in a service has an equal chance of being selected as the destination, so even when you access a service that has a backend in your own zone, there’s a high probability that you’ll be directed to a pod in a completely separate zone, possibly in a completely separate region. This is the problem Topology Awareness intends to solve.

The Topology Aware Routing of Services feature added the concept of topologyKeys as an additional field in service objects. It allows you to define a set of node labels that could be used to route traffic closer to where it originated from.

Example Service with topologyKeys


apiVersion: v1
kind: Service
metadata:
  name: my-app-web
spec:
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  topologyKeys:
    - "topology.kubernetes.io/zone"
    - "topology.kubernetes.io/region"

In this example, the service uses two commonly used node labels as its topology preferences. It signals that when kube-proxy routes traffic for this service, it should prefer endpoints in the same zone the traffic originated from, falling back to endpoints in the same region only if no same-zone endpoint exists.
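To make the ordered-preference behavior concrete, here is a simplified sketch of how a list of topology keys selects endpoints. This is illustrative Python, not kube-proxy’s actual implementation; the endpoint and label data structures are invented for the example:

```python
def filter_by_topology(topology_keys, client_labels, endpoints):
    """Select endpoints using an ordered preference list of topology keys.

    endpoints: list of (address, node_labels) pairs.
    The first key whose value on the client's node matches at least one
    endpoint's node wins; the special key "*" matches every endpoint.
    If no key matches, no endpoint is selected (empty list).
    """
    for key in topology_keys:
        if key == "*":
            return [addr for addr, _ in endpoints]
        want = client_labels.get(key)
        matched = [addr for addr, labels in endpoints
                   if want is not None and labels.get(key) == want]
        if matched:
            return matched
    return []

# A client in us-east1-b prefers the same-zone endpoint over the
# same-region endpoint running in us-east1-c.
endpoints = [
    ("10.0.0.1", {"topology.kubernetes.io/zone": "us-east1-b",
                  "topology.kubernetes.io/region": "us-east1"}),
    ("10.0.1.1", {"topology.kubernetes.io/zone": "us-east1-c",
                  "topology.kubernetes.io/region": "us-east1"}),
]
client = {"topology.kubernetes.io/zone": "us-east1-b",
          "topology.kubernetes.io/region": "us-east1"}
keys = ["topology.kubernetes.io/zone", "topology.kubernetes.io/region"]
print(filter_by_topology(keys, client, endpoints))  # ['10.0.0.1']
```

Note that the fallback is per key: if the zone key matched nothing, the region key would match both endpoints and traffic would be spread across them.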

This is great! Traffic should remain “close” to where it originated and remove unnecessary latency.

While topologyKeys is available as alpha in 1.17, it hasn’t yet graduated to the next stage because the first pass at building topology-aware routing surfaced many challenges and scalability issues.

Each node in the cluster now has to manage a potentially complex ruleset for each service, and those rules require more frequent updating. In clusters with thousands of pods or thousands of nodes, this solution quickly becomes untenable.

Another pain point with this implementation depends on how your application is distributed across a zone or region: it’s quite possible that a single pod would receive ALL the traffic for its zone or region. The preference list doesn’t take into account the performance of the pod on the receiving end and could potentially cause an outage.

These problems have led the Kubernetes Network Special Interest Group (SIG) to do a full re-evaluation of how to approach the Topology Awareness implementation.

What’s Planned for Topology Aware Routing?
The new design is intended to automatically handle the routing of services so that they are load-balanced across a minimum number of the closest possible endpoints. It does this by applying an algorithm that uses two of the topology keys, topology.kubernetes.io/region and topology.kubernetes.io/zone, to signal affinity for service routing, without requiring you to specify them via topologyKeys at the service level.

This algorithm works by establishing a dynamic threshold for a service, calculating an expected number of endpoints per zone. It then builds a list of available endpoints for that service, prioritizing the ones in the same zone. If there are not enough endpoints to meet the expected number, it adds endpoints from other zones until the expected number is reached. This list of endpoints, or a subset of it, is then passed to the nodes within that zone.

These nodes no longer have to maintain the complex set of rules like they had in the first iteration, and now just manage the small subset of endpoints for each service. This is less flexible than its predecessor, but it drastically reduces the performance overhead when compared to the previous method, while also covering the majority of use-cases. A big win for everyone.

These features are slated to graduate to alpha in the 1.21 release in the first part of 2021. If Topology Aware Routing would be of value to you, please consider taking the time to test it when it becomes available. Early feedback is highly appreciated and helps shape the direction of the feature.

Until then, if you’d like to learn more about Service Topology, Endpoint Slice, and the various algorithms that have been evaluated for service routing, check out Rob Scott’s presentation: Improving Network Efficiency with Topology Aware Routing, on November 19th, at KubeCon + CloudNativeCon North America.



By Bob Killen, Program Manager – Google Open Source Programs Office

Welcome Android Open Source Project (AOSP) to the Bazel ecosystem

After significant investment in understanding how best to build the Android Platform correctly and quickly, we are pleased to announce that the Android Platform is migrating from its current build systems (Soong and Make) to Bazel. While components of Bazel have already been checked into the Android Open Source Project (AOSP) source tree, this will be a phased migration over the next few Android releases, with many concrete and digestible milestones to make the transformation as seamless and easy as possible. There will be no immediate impact to the Android Platform build workflow or the existing supported Android Platform Build tools in 2020 or 2021. Some of the changes to support Android Platform builds are already in Bazel, such as Bazel’s ability to parse and execute Ninja files to support a gradual migration.

Migrating to Bazel will enable AOSP to:
  • Provide more flexibility for configuring the AOSP build (better support for conditionals)
  • Allow for greater introspection into the AOSP build progress and dependencies
  • Enable correct and reproducible (hermetic) AOSP builds
  • Introduce a configuration mechanism that will reduce complexity of AOSP builds
  • Allow for greater integration of build and test activities
  • Combine all of these to drive significant improvements in build time and experience
The benefits of this migration to the Bazel community are:
  • Significant ongoing investment in Bazel to support Android Platform builds
  • Expansion of the Bazel ecosystem and community to include, initially, tens of thousands of Android Platform developers, as well as Android handset OEMs and chipset vendors
  • Google’s Bazel rules for building Android apps will be open sourced, used in AOSP, and maintained by Google in partnership with the Android / Bazel community
  • Better Bazel support for building Android Apps
  • Better rules support for other languages used to build the Android Platform (Rust, Java, Python, Go, etc.)
  • Strong support for Bazel Long Term Support (LTS) releases, which benefits the expanded Bazel community
  • Improved documentation (tutorials and reference)
The recent check-in of Bazel to AOSP begins an initial pilot phase, enabling Bazel to be used in place of Ninja as the execution engine to build AOSP. Bazel can also explore the AOSP build graph. We're pleased to be developing this functionality directly in the Bazel and AOSP codebases. As with most initial development efforts, this work is experimental in nature. Remember to use the currently supported Android Platform Build System for all production work.

We believe that these updates to the Android Platform Build System enable greater developer velocity, productivity, and happiness across the entire Android Platform ecosystem.

By Joe Hicks on behalf of the Bazel and AOSP infrastructure teams

Get ready for BazelCon 2020

With only 24 hours to go, BazelCon 2020 is shaping up to be a much anticipated gathering for the Bazel community and broader Build ecosystem. With over 1000 attendees, presentations by Googlers, as well as talks from industry Build leaders from Twitter, Dropbox, Uber, Pinterest, GrabTaxi, and more, we hope BazelCon 2020 will provide an opportunity for knowledge sharing, networking, and community building.

I am very excited by the keynote announcements, the migration stories at Twitter, Pinterest, and CarGurus, as well as technical deep dives on Bazel persistent workers, incompatible target skipping, migrating from Gradle to Bazel, and more. The “sold out” Birds of a Feather sessions and the Live Q&A with the Bazel team will bring the community together to discuss design docs, look at landings, and provide feedback on the direction of Bazel and the entire ecosystem.

We are also pleased to announce that, starting with the next major release (4.0), Bazel will support Long Term Support (LTS) releases as well as regular Rolling releases.

Some benefits of this new release cadence are:
  • Bazel will release stable, supported LTS releases on a predictable schedule with a long window without breaking changes
  • Bazel contributors / rules owners can prepare to support future LTS releases via rolling releases.
  • Bazel users can choose the release cadence that works best for them, since we will offer both LTS releases and rolling releases.
Long Term Support (LTS) releases:
  • We will create an LTS release approximately every 9 months; each LTS release starts a new release branch and increments the major version number.
  • Each LTS release will include all new features, bug fixes and (breaking) changes since the last major version.
  • Bazel will actively support each LTS branch for 9 months with critical bug fixes, but no new features.
  • Thereafter, Bazel will provide maintenance for two additional years with only security and OS compatibility fixes.
  • Bazel Federation reboot: Bazel will provide strong guidance about the ruleset versions that should be used with each Bazel release so that each user will not have to manage interoperability themselves.
Make sure that you register at http://goo.gle/bazelcon to be a part of the excitement of the premier build conference!

See you all at BazelCon 2020!

By Joe Hicks and the entire Bazel Team at Google