Author Archives: Open Source Programs Office

Optimizing gVisor filesystems with Directfs

gVisor is a sandboxing technology that provides a secure environment for running untrusted code. In our previous blog post, we discussed how gVisor performance improves with a root filesystem overlay. In this post, we'll dive into another filesystem optimization that was recently launched: directfs. It gives gVisor’s application kernel (the Sentry) secure direct access to the container filesystem, avoiding expensive round trips to the filesystem gofer.

Origins of the Gofer

gVisor is used internally at Google to run a variety of services and workloads. One of the challenges we faced while building gVisor was providing remote filesystem access securely to the sandbox. gVisor’s strict security model and defense in depth approach assumes that the sandbox may get compromised because it shares the same execution context as the untrusted application. Hence the sandbox cannot be given sensitive keys and credentials to access Google-internal remote filesystems.

To address this challenge, we added a trusted filesystem proxy called a "gofer". The gofer runs outside the sandbox, and provides a secure interface for untrusted containers to access such remote filesystems. For architectural simplicity, gofers were used to serve local filesystems as well as remote ones.

Gofer process intermediates filesystem operations

Isolating the Container Filesystem in runsc

When gVisor was open sourced as runsc, the same gofer model was copied over to maintain the same security guarantees. runsc was configured to start one gofer process per container which serves the container filesystem to the sandbox over a predetermined protocol (now LISAFS). However, a gofer adds a layer of indirection with significant overhead.

This gofer model (built for remote filesystems) brings very few advantages for the runsc use-case, where all the filesystems served by the gofer (like rootfs and bind mounts) are mounted locally on the host. The gofer directly accesses them using filesystem syscalls.

Linux provides some security primitives to effectively isolate local filesystems. These include mount namespaces, pivot_root, and detached bind mounts¹. Directfs is a new filesystem access mode that uses these primitives to expose the container filesystem to the sandbox in a secure manner. The sandbox’s view of the filesystem tree is limited to just the container filesystem. The sandbox process is not given access to anything mounted on the broader host filesystem. Even if the sandbox gets compromised, these mechanisms provide additional barriers to prevent broader system compromise.

Directfs

In directfs mode, the gofer still exists as a cooperative process outside the sandbox. As usual, the gofer enters a new mount namespace, sets up appropriate bind mounts to create the container filesystem in a new directory and then pivot_root(2)s into that directory. Similarly, the sandbox process enters new user and mount namespaces and then pivot_root(2)s into an empty directory to ensure it cannot access anything via path traversal. But instead of making RPCs to the gofer to access the container filesystem, the sandbox requests the gofer to provide file descriptors to all the mount points via SCM_RIGHTS messages. The sandbox then directly makes file-descriptor-relative syscalls (e.g. fstatat(2), openat(2), mkdirat(2), etc) to perform filesystem operations.

Sandbox directly accesses container filesystem with directfs

Earlier, when the gofer performed all filesystem operations, we could deny all these syscalls in the sandbox process using seccomp. But with directfs enabled, the sandbox process's seccomp filters must allow these syscalls. Most notably, the sandbox can now make openat(2) syscalls (which allow path traversal), but with certain restrictions: O_NOFOLLOW is required, no access to procfs, and no directory FDs from the host. We also had to give the sandbox the same privileges as the gofer (for example, CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH) so it can perform the same filesystem operations.
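
The Sentry itself is written in Go and issues these syscalls directly; purely to make the fd-relative pattern concrete, here is a hypothetical Python sketch (illustrative only, not gVisor code) in which every operation is performed relative to a directory file descriptor for a mount point, with O_NOFOLLOW on the final path component.

import os
import tempfile

# Stand-in for a mount-point directory FD that the gofer would hand over
# via SCM_RIGHTS; here we simply open a scratch directory to play that role.
mount_dir = tempfile.mkdtemp()
mount_fd = os.open(mount_dir, os.O_RDONLY | os.O_DIRECTORY)

# openat(2)-style create/open relative to the mount FD. O_NOFOLLOW mirrors
# the directfs restriction that the last path component must not be a symlink.
fd = os.open("data.txt", os.O_CREAT | os.O_WRONLY | os.O_NOFOLLOW, 0o644,
             dir_fd=mount_fd)
os.write(fd, b"hello from an fd-relative open\n")
os.close(fd)

# fstatat(2)- and mkdirat(2)-style operations, again relative to the mount FD
# rather than to any absolute host path.
print(os.stat("data.txt", dir_fd=mount_fd, follow_symlinks=False).st_size)
os.mkdir("subdir", 0o755, dir_fd=mount_fd)

os.close(mount_fd)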

It is noteworthy that only the trusted gofer provides FDs (of the container filesystem) to the sandbox. The sandbox cannot walk backwards (using ‘..’) or follow a malicious symlink to escape out of the container filesystem. In effect, we've decreased our dependence on the syscall filters to catch bad behavior, but correspondingly increased our dependence on Linux's filesystem isolation protections.

Performance

Making RPCs to the gofer for every filesystem operation adds a lot of overhead to runsc. Hence, avoiding gofer round trips significantly improves performance. Let's find out what this means for some of our benchmarks. We will run the benchmarks using our newly released systrap platform on bind mounts (as opposed to rootfs). This simulates more realistic use cases, because bind mounts are extensively used when configuring filesystems in containers. Bind mounts also do not have an overlay (unlike the rootfs mount), so all operations go through the goferfs / directfs mount.

Let's first look at our stat micro-benchmark, which repeatedly calls stat(2) on a file.
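
The actual gVisor benchmarks are maintained in the gVisor repository; purely to make the measured operation concrete, a micro-benchmark of this shape boils down to a loop like the following hypothetical Python sketch.

import os
import time

# Hypothetical sketch of a stat micro-benchmark: repeatedly stat one file
# and report the average latency. This is not the gVisor benchmark code.
PATH = "/tmp/stat-bench-target"
ITERATIONS = 100_000

with open(PATH, "w") as f:
    f.write("x")

start = time.perf_counter()
for _ in range(ITERATIONS):
    os.stat(PATH)  # issues a stat(2)/statx(2) syscall under the hood
elapsed = time.perf_counter() - start

print(f"{ITERATIONS} stat calls in {elapsed:.3f}s "
      f"({elapsed / ITERATIONS * 1e6:.2f} µs per call)")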

Stat benchmark improvement with directfs

The stat(2) syscall is more than 2x faster! However, since this is not representative of real-world applications, we should not extrapolate these results. So let's look at some real-world benchmarks.

Real-world benchmark improvements with directfs

We see a 12% reduction in the absolute time to run these workloads and a 17% reduction in Ruby load time!

Conclusion

The gofer model in runsc was overly restrictive for accessing host files. We were able to leverage existing filesystem isolation mechanisms in Linux to bypass the gofer without compromising security. Directfs significantly improves performance for certain workloads. This is part of our ongoing efforts to improve gVisor performance. You can learn more about gVisor at gvisor.dev. You can also use gVisor in GKE with GKE Sandbox. Happy sandboxing!


¹ Detached bind mounts can be created by first creating a bind mount using mount(MS_BIND) and then detaching it from the filesystem tree using umount(MNT_DETACH).


By Ayush Ranjan, Software Engineer – Google

OpenTitan RTL freeze

We are excited to announce that the OpenTitan® coalition has successfully reached a key milestone—RTL freeze of its first engineering sample release candidate! A snapshot of our high quality, open source silicon root of trust hardware implementation has been released for synthesis, layout and fabrication. We expect engineering sample chips to be available for lab testing and evaluation by the end of 2023.

This is a major achievement that represents the culmination of a multi-year investment and long-term, coordinated effort by the project’s active community of commercial and academic partners—including Google, G+D Mobile Security, ETH Zurich, Nuvoton, Winbond, Seagate, Western Digital, Rivos, and zeroRISC, plus a number of independent contributors. The OpenTitan project and its community are actively supported and maintained by lowRISC C.I.C., an independent non-profit.

Hitting this milestone demonstrates that large-scale engineering efforts can be successful when many organizations with aligned interests collaborate on an open source project. It also matters because traditionally, computing ecosystems have had to depend heavily on proprietary hardware (silicon) and software solutions to provide foundational, or “root,” trust assurances to their users. OpenTitan fundamentally changes that paradigm for the better, delivering secure root of trust silicon technology which is open source, high quality, and publicly verifiable.

Our belief is that core security features like the authenticity of the root of trust and the firmware it executes should be safely commoditized rights guaranteed to the end user—not areas for differentiation. To that end, we have made available a high-quality, industrial strength, reusable ecosystem of OpenTitan blocks, top-levels, infrastructure, and software IP adaptable for many use cases, delivered under a permissive, no-cost license and with known-good provenance. OpenTitan's now-complete, standalone “Earl Grey” chip implementation, design verification, full-chip testing, and continuous integration (CI) infrastructure are all available on GitHub today.

Flowchart illustrating the silicon process and OpenTitan

The silicon process and OpenTitan

This release means the OpenTitan chip digital design is complete and has been verified to be of sufficiently high quality that a tapeout is expected to succeed. In other words, the logical design is judged to be of sufficient maturity to translate into a physical layout and create a physical chip. The initial manufacturing will be performed in a smaller batch, delivering engineering samples which allow post-silicon verification of the physical silicon, prior to creating production devices in large volume.


Earl Grey: Discrete implementation of OpenTitan

Design Verification

Industrial quality implementation has been a core tenet of the OpenTitan project from the outset, both to ensure the design meets its goals—including security—and to ensure the first physical chips are successful. OpenTitan’s hardware development stages ensure all hardware blocks go through several gating design and verification reviews before final integration signoff. This verification has required development of comprehensive testbenches and test infrastructure, all part of the open source project. Both individual blocks and the top-level Earl Grey design have functional and code coverage above 90%—at or above the standards of typical closed-source designs—with 40k+ tests running nightly and reported publicly via the OpenTitan Design Verification Dashboard. Regressions are caught and resolved quickly, ensuring design quality is maintained over the long term.

Software tooling

OpenTitan has led the way in making open source silicon a reality, and doing so requires much more than just open source silicon RTL and Design Verification collateral. Successful chips require real software support to have broad industry impact and adoption. OpenTitan has created generalizable infrastructure for silicon projects (test frameworks, continuous integration infrastructure, per-block DIFs), host tools like opentitantool to support interactions with all OpenTitan instances, and formal releases (e.g. the ROM to guarantee important security functionality such as firmware verification and ownership transfer).

Documentation

A good design isn’t worth much if it’s hard to use. With this in mind, thorough and accurate documentation is a major component of the OpenTitan project too. This includes a Getting Started Guide, which is a ‘from scratch’ walkthrough on a Linux workstation, covering software and tooling installation, and hardware setup. It includes a playbook to run local simulations or even emulate the entire OpenTitan chip on an FPGA.

Furthermore, OpenTitan actively maintains live dashboards of quality metrics for its entire IP ecosystem (e.g. regression testing and coverage reports). If you’re new to open source silicon development, there are comprehensive resources describing project standards for technical contribution that have been honed to effectively facilitate inter-organizational collaboration.

Thriving open source community

OpenTitan’s broad community has been critical to its success. As the following metrics show (baselined from the project’s public launch in 2019), the OpenTitan community is rapidly growing:

  • More than eight times the number of commits at launch: from 2,500 to over 20,000.
  • 140 contributors to the code base
  • 13k+ merged pull requests
  • 1.5M+ LoC, including 500k LoC of HDL
  • 1.8k GitHub stars

Participating in OpenTitan

Reaching this key RTL freeze milestone is a major step towards transparency at the very foundation of the security stack: the silicon root of trust. The coordinated contributions of the OpenTitan project’s partners—enabled by lowRISC’s Silicon Commons™ approach to open source silicon development—are what has made it possible to get here today.

This is a watershed moment for the trustworthiness of systems we all rely on. The future of free and open, high quality silicon implementations is bright, and we expect to see many more devices including OpenTitan top-levels and ecosystem IP in the future!

If you are interested in contributing to OpenTitan, visit the open source GitHub repository or reach out to the OpenTitan team.

By Cyrus Stoller, Miguel Osorio, and Will Drewry, OpenTitan – Google

Controlling Stable Diffusion with JAX, diffusers, and Cloud TPUs

Diffusion models are state-of-the-art in generating photorealistic images from text. These models are hard to control through only text and generation parameters. To overcome this, the open source community developed ControlNet (GitHub), a neural network structure to control diffusion models by adding more conditions on top of the text prompts. These conditions include canny edge filters, segmentation maps, and pose keypoints. Thanks to the 🧨diffusers library, it is very easy to train, fine-tune or control diffusion models written in various frameworks, including JAX!

At Hugging Face, we were particularly excited to see the open source machine learning (ML) community leverage these tools to explore fun and creative diffusion models. We joined forces with Google Cloud to host a community sprint where participants explored the capabilities of controlling Stable Diffusion by building various open source applications with JAX and Diffusers, using Google Cloud TPU v4 accelerators. In this three week sprint, participants teamed up, came up with various project ideas, trained ControlNet models, and built applications based on them. The sprint resulted in 26 projects, accessible via a leaderboard here. These demos use Stable Diffusion (v1.5 checkpoint) initialized with ControlNet models. We worked with Google Cloud to provide access to TPU v4-8 hardware with 3TB storage, as well as NVIDIA A10G GPUs to speed up the inference in these applications.

Below, we showcase a few projects that stood out from the sprint, each of which anyone can try as a demo. When picking projects to highlight, we considered several factors:

  • How well-described are the models produced?
  • Are the models, datasets, and other artifacts fully open sourced?
  • Are the applications easy to use? Are they well described?

The projects were voted on by a panel of experts and the top ten projects on the leaderboard won prizes.

Control with SAM

One team used the state-of-the-art Segment Anything Model (SAM) output as an additional condition to control the generated images. SAM produces zero-shot segmentation maps with fine details, which helps extract semantic information from images for control. You can see an example below and try the demo here.

Screencap of the 'Control with SAM' project

Fusing MediaPipe and ControlNet

Another team used MediaPipe to extract hand landmarks to control Stable Diffusion. This application allows you to generate images based on your hand pose and prompt. You can also use a webcam to input an image. See an example below, and try it yourself here.

Screencap of a project fusing MediaPipe and ControlNet

Make-a-Video

At the top of the leaderboard is Make-a-Video, which generates video from a text prompt and a hint image. It is based on latent diffusion with temporal convolutions for video and attention. You can try the demo here.

Screencap of the 'Make-a-Video' project

Bootstrapping interior designs

The project that won the sprint is ControlNet for interior design. The application can generate an interior design based on a room image and a prompt. It can also perform segmentation and generation guided by image inpainting. See the application in inpainting mode below.

Screencap of a project using ControlNet for interior design

In addition to the projects above, many applications were built to enhance images, like this application to colorize grayscale images. You can check out the leaderboard to try all the projects.

Learning more about diffusion models

To kick off the sprint, we organized a three-day series of talks by leading scientists and engineers from Google, Hugging Face, and the open source diffusion community. We'd recommend that anyone interested in learning more about diffusion models and generative AI take a look at the recorded sessions below!

Tim Salimans (Google Research) speaking on Discrete Diffusion Models
You can watch all the talks from the links below.

You can check out the sprint homepage to learn more about the sprint.

Acknowledgements

We would like to thank Google Cloud for providing TPUs and storage to help make this great sprint happen, in particular Bertrand Rondepierre and Jonathan Caton for the hard work behind the scenes to get all of the Cloud TPUs allocated so participants had cutting-edge hardware to build on and an overall great experience. And also Andreas Steiner and Cristian Garcia for helping to answer questions in our Discord forum and for helping us make the training script example better. Their help is deeply appreciated.

By Merve Noyan and Sayak Paul – Hugging Face

Accelerate JAX models on Intel GPUs via PJRT

We are excited to announce the first PJRT plugin implementation in Intel Extension for TensorFlow, which seamlessly runs JAX models on Intel® GPU. The PJRT API simplified the integration, which allowed the Intel GPU plugin to be developed separately and quickly integrated into JAX. This same PJRT implementation also enables initial Intel GPU support for TensorFlow and PyTorch models with XLA acceleration.

Figure 1. Intel Data Center GPU Max Series

With the shared vision that modular interfaces make integration easier and enable faster, independent development, Intel and Google collaborated in developing the TensorFlow PluggableDevice mechanism. This is the supported way to extend TensorFlow to new devices and allows hardware vendors to release separate plugin binaries. Intel has continued to work with Google to build modular interfaces for the XLA compiler and to develop the PJRT plugin to run JAX workloads on Intel GPUs.

JAX

JAX is an open source Python library designed for complex numerical computations on high-performance computing devices like GPUs and TPUs. It supports NumPy functions and provides automatic differentiation as well as a composable function transformation system to build and train neural networks.
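
For readers new to JAX, the following minimal example (standard JAX, nothing Intel-specific) shows these pieces together: a NumPy-style function, automatic differentiation with grad, and compilation with jit.

import jax
import jax.numpy as jnp

# A NumPy-style function: mean squared error of a linear model.
def loss(w, x, y):
    return jnp.mean((x @ w - y) ** 2)

# Composable transformations: grad gives the derivative with respect to w,
# and jit compiles the result via XLA for the available backend.
grad_fn = jax.jit(jax.grad(loss))

w = jnp.zeros(3)
x = jnp.ones((8, 3))
y = jnp.ones(8)
print(grad_fn(w, x, y))  # gradient of the loss with respect to w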

JAX uses XLA as its compilation and execution backend to optimize and parallelize computations, particularly on AI hardware accelerators. When a JAX program is executed, the Python code is transformed into OpenXLA’s StableHLO operations, which are then passed to PJRT for compilation and execution. Underneath, the StableHLO operations are compiled into machine code by the XLA compiler, which can then be executed on the target hardware accelerator.
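
You can observe this pipeline from the JAX side. The sketch below uses jax.jit(...).lower(...) to inspect the StableHLO module before PJRT compiles and executes it; the exact inspection APIs can vary slightly between JAX releases.

import jax
import jax.numpy as jnp

def f(x, y):
    return jnp.dot(x, y) + 1.0

x = jnp.ones((2, 2))
y = jnp.ones((2, 2))

# Lower the Python function to StableHLO; PJRT then compiles and runs
# the result on the target backend (CPU, GPU, TPU, or a plugin device).
lowered = jax.jit(f).lower(x, y)
print(lowered.as_text())  # StableHLO / MLIR module for f

compiled = lowered.compile()
print(compiled(x, y))     # executed through the PJRT runtime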

PJRT

PJRT (used in conjunction with OpenXLA’s StableHLO) provides a hardware- and framework-independent interface for compilers and runtimes (recent announcement). The PJRT interface supports plugins for new device backends. It provides a means for straightforward integration of JAX into Intel's systems and enables JAX workloads on Intel GPUs. Through PJRT integration with various AI frameworks, Intel’s GPU plugin can deliver hardware acceleration and oneAPI optimizations to a wider range of developers using Intel GPUs.

The PJRT API is a framework-independent API that allows upper-level AI frameworks to compile and execute numeric computation represented in StableHLO on an AI hardware accelerator. It has been integrated with popular AI frameworks including JAX, TensorFlow (via TF-XLA), and PyTorch (via PyTorch/XLA), which enables hardware vendors to provide one plugin for their new AI hardware that all of these popular AI frameworks will support. It also provides low-level primitives for efficient interaction with upper-level AI frameworks, including zero-copy buffer donation and light-weight dependency management, which enables AI frameworks to best utilize hardware resources and achieve high-performance execution.

Figure 2. PJRT simplifies the integration of oneAPI on Intel GPU into AI Frameworks

PJRT Plugin for Intel GPU

The Intel GPU plugin implements the PJRT API by compiling StableHLO and dispatching the executable to Intel GPUs. The compilation is based on the XLA implementation, adding target-specific passes for Intel GPUs and leveraging oneAPI performance libraries for acceleration. Device execution is supported using the SYCL runtime. The Intel GPU plugin also implements device registration, enumeration, and SPMD execution mode.

PJRT’s high-level runtime abstraction allows the plugin to develop its own low-level device management modules and use the advanced runtime features provided by the new device. For example, the Intel GPU plugin takes advantage of the out-of-order queue feature provided by the SYCL runtime. Compared to fitting the plugin implementation to a low-level runtime interface, such as the stream executor C API used in PluggableDevice, implementing the PJRT runtime interface is straightforward and efficient.

It’s simple to get started using the Intel GPU plugin to run a JAX program, including JAX-based frameworks like Flax and T5X. Just build the plugin (example documentation) then set the environment variable and dependent library paths. JAX automatically looks for the plugin library and loads it into the current process.

Below are example code snippets of running JAX on an Intel GPU.

$ export PJRT_NAMES_AND_LIBRARY_PATHS='xpu:Your_itex_library/libitex_xla_extension.so'
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:Your_Python_site-packages/jaxlib
$ python
>>> import numpy as np
>>> import jax
>>> jax.local_devices()  # PJRT Intel GPU plugin loaded
[IntelXpuDevice(id=0, process_index=0), IntelXpuDevice(id=1, process_index=0)]
>>> x = np.random.rand(2,2).astype(np.float32)
>>> y = np.random.rand(2,2).astype(np.float32)
>>> z = jax.numpy.add(x, y)  # Runs on Intel XPU

This is the latest example of Intel AI tools and frameworks leveraging oneAPI software libraries to provide high performance on Intel GPU.

Future Work

This PJRT plugin for Intel GPUs has also been integrated into TensorFlow to run XLA supported ops in TensorFlow models. However, XLA has a smaller op set than TensorFlow. For many TensorFlow models in production, some parts of the model graph are executed with PJRT (XLA compatible) while other parts are executed with the classic TensorFlow runtime using TensorFlow OpKernel. This mixed execution model requires PJRT and TensorFlow OpKernel to work seamlessly with each other. The TensorFlow team has introduced the NextPluggableDevice API to enable this.

When using NextPluggableDevice API, PJRT manages all critical hardware states (e.g. allocator, stream manager, driver, etc) and NextPluggableDevice API allows hardware vendors to build new TensorFlow OpKernels that can access those hardware states via PJRT. PJRT and NextPluggableDevice API enable interoperability between classic TensorFlow runtime and XLA, allowing the XLA subgraph to produce a PJRT buffer and feed to TensorFlow and vice versa.

As a next step, Intel will continue working with Google to adopt the NextPluggableDevice API to implement non-XLA ops on Intel GPUs supporting all TensorFlow models.

Written in collaboration with Jianhui Li, Zhoulong Jiang, and Yiqiang Li from Intel.

By Jieying Luo, Chuanhao Zhuge, and Xiao Yu – Google

Google Open Source Peer Bonus program announces first group of winners for 2023



We are excited to announce the first group of winners for the 2023 Google Open Source Peer Bonus Program! This program recognizes external open source contributors who have been nominated by Googlers for their exceptional contributions to open source projects.

The Google Open Source Peer Bonus Program is a key part of Google's ongoing commitment to open source software. By supporting the development and growth of open source projects, Google is fostering a more collaborative and innovative software ecosystem that benefits everyone.

This cycle's Open Source Peer Bonus Program received a record-breaking 255 nominations, marking a 49% increase from the previous cycle. This growth speaks to the popularity of the program both within Google and the wider open source community. It's truly inspiring to see so many individuals dedicated to contributing their time and expertise to open source projects. We are proud to support and recognize their efforts through the Open Source Peer Bonus Program.

The winners of this year's Open Source Peer Bonus Program come from 35 different countries around the world, reflecting the program's global reach and the immense impact of open source software. Community collaboration is a key driver of innovation and progress, and we are honored to be able to support and celebrate the contributions of these talented individuals from around the world through this program.

In total, 203 winners were selected based on the impact of their contributions to the open source project, the quality of their work, and their dedication to open source. These winners represent around 150 unique open source projects, demonstrating a diverse range of domains, technologies, and communities. There are open source projects related to web development such as Angular, PostCSS, and the 2022 Web Almanac, and tools and libraries for programming languages such as Rust, Python, and Dart. Other notable projects include cloud computing frameworks like Apache Beam and Kubernetes, and graphics libraries like Mesa 3D and HarfBuzz. The projects also cover various topics such as security (CSP), testing (AFLPlusplus), and documentation (The Good Docs Project). Overall, it's an impressive list of open source projects from different areas of software development.

We would like to extend our congratulations to the winners! Included below are those who have agreed to be named publicly.

Winner – Open Source Project

Bram Stein – 2022 Web Almanac
Saptak Sengupta – 2022 Web Almanac
Thibaud Colas – 2022 Web Almanac
Andrea Fioraldi – AFLPlusplus
Marc Heuse – AFLplusplus
Joel Ostblom – Altair
Chris Dalton – ANGLE
Matthieu Riegler – Angular
Ryan Carniato – Angular
Johanna Öjeling – Apache Beam
Rickard Zwahlen – Apache Beam
Seunghwan Hong – Apache Beam
Claire McGinty – Apache Beam & Scio
Kellen Dye – Apache Beam & Scio
Michel Davit – Apache Beam & Scio
Stamatis Zampetakis – Apache Hive
Matt Casters – Apache Hop
Kevin Mihelich – Arch Linux ARM
Sergio Castaño Arteaga – Artifact Hub
Vincent Mayers – Atlanta Java Users Group
Xavier Bonaventura – Bazel
Jelle Zijlstra – Black
Clément Contet – Blockly
Yutaka Yamada – Blockly
Luiz Von Dentz – Bluez
Kate Gregory – Carbon Language
Ruth Ikegah – Chaoss
Dax Fohl – Cirq
Chad Killingsworth – closure-compiler
Yuan Li – Cloud Dataproc Initialization Actions
Manu Garg – Cloudprober
Kévin Petit – CLVK
Dimitris Koutsogiorgas – CocoaPods
Axel Obermeier – Conda Forge
Roman Dodin – Containerlab
Denis Pushkarev – core-js
Chris O'Haver – CoreDNS
Justine Tunney – cosmopolitan
Jakob Kogler – cp-algorithms
Joshua Hemphill – CSP (Content-Security-Policy)
Romain Menke – CSSTools’ PostCSS Plugins and Packages
Michael Sweet – CUPS
Daniel Stenberg – curl
Pokey Rule – Cursorless
Ahmed Ashour – Dart
Zhiguang Chen – Dart Markdown Package
Dmitry Zhifarsky – DCM
Mark Pearson – Debian
Salvatore Bonaccorso – Debian
Felix Palmer – deck.gl
Xiaoji Chen – deck.gl
Andreas Deininger – Docsy
Simon Binder – Drift
Hajime Hoshi – Ebitengine
Protesilaos Stavrou – Emacs modus-themes
Raven Black – envoy
Péter Szilágyi – ethereum
Sudheer Hebbale – evlua
Morten Bek Ditlevsen – Firebase SDK for Apple App Development
Victor Zigdon – Flashing Detection
Ashita Prasad – Flutter
Callum Moffat – Flutter
Greg Price – Flutter
Jami Couch – Flutter
Reuben Turner – Flutter
Heather Turner – FORWARDS
Donatas Abraitis – FRRouting/frr
Guillaume Melquiond – Gappa
Sam James – Gentoo
James Blair – Gerrit Code Review
Martin Paljak – GlobalPlatformPro
Jeremy Bicha – GNOME
Austen Novis – Goblet
Ka-Hing Cheung – goofys
Nicholas Junge – Google Benchmark
Robert Teller – Google Cloud VPC Firewall Rules
Nora Söderlund – Google Maps Platform Discord community and GitHub repositories
Aiden Grossman – google/ml-compiler-opt
Giles Knap – gphotos-sync
Behdad Esfahbod – HarfBuzz
Juan Font Alonso – headscale
Blaž Hrastnik – Helix
Paulus Schoutsen – home-assistant
Pietro Albini – Infrastructure team - Rust Programming Language
Eric Van Norman – Istio
Zhonghu Xu – Istio
Pierre Lalet – Ivre Rocks
Ayaka Mikazuki – JAX
Kyle Zhao – JGit | The Eclipse Foundation
Yuya Nishihara – jj (Jujutsu VCS)
Oscar Dowson – JuMP-dev
Mikhail Yakshin – Kaitai Struct
Daniel Seemaier – KaMinPar
Abheesht Sharma – KerasNLP
Jason Hall – ko
Jonas Mende – Kubeflow Pipelines Operator
Paolo Ambrosio – Kubeflow Pipelines Operator
Arnaud Meukam – Kubernetes
Patrick Ohly – Kubernetes
Ricardo Katz – Kubernetes
Akihiro Suda – Lima
Jan Dubois – Lima
Dongliang Mu – Linux Kernel
Johannes Berg – Linux Kernel
Mauricio Faria de Oliveira – Linux Kernel
Nathan Chancellor – Linux Kernel
Ondřej Jirman – Linux Kernel
Pavel Begunkov – Linux Kernel
Pavel Skripkin – Linux Kernel
Tetsuo Handa – Linux Kernel
Vincent Mailhol – Linux Kernel
Hajime Tazaki – Linux Kernel Library
Jonatan Kłosko – Livebook
Jonas Bernoulli – Magit
Henry Lim – Malaysia Vaccine Tracker Twitter Bot
Thomas Caswell – matplotlib
Matt Godbolt – mattgodbolt
Matthew Holt – mholt
Ralf Jung – Miri and Stacked Borrows
Markus Böck – mlir
Matt DeVillier – MrChromebox.tech
Levi Burner – MuJoCo
Hamel Husain – nbdev
Justin Keyes – Neovim
Wim Henderickx – Nephio
Janne Heß – nixpkgs
Martin Weinelt – nixpkgs
Brian Carlson – node-postgres
Erik Doernenburg – OCMock
Aaron Brethorst – OneBusAway for iOS, written in Swift.
Onur Mutlu – Onur Mutlu Lectures - YouTube
Alexander Alekhin – OpenCV
Alexander Smorkalov – OpenCV
Stafford Horne – OpenRISC
Peter Gadfort – OpenROAD
Christopher "CRob" Robinson – OpenSSF Best Practices WG
Arnaud Le Hors – OpenSSF Scorecard
Nate Wert – OpenSSF Scorecard
Kevin Thomas Abraham – Oppia
Praneeth Gangavarapu – Oppia
Mohit Gupta – Oppia Android
Jaewoong Eum – Orbital
Carsten Dominik – Org mode
Guido Vranken – oss-fuzz
Daniel Anderson – parlaylib
Richard Davey – Phaser
Juliette Reinders Folmer – PHP_CodeSniffer
Hassan Kibirige – plotnine
Andrey Sitnik – PostCSS · GitHub
Dominik Czarnota – pwndbg
Ee Durbin – PyPI
Adam Turner – Python PEPs
Peter Odding – python-rotate-backups
Christopher Courtney – QMK
Jay Berkenbilt – qpdf
Tim Everett – RTKLIB
James Higgins – Rust
Tony Arcieri – rustsec
Natsuki Natsume – Sass
Mohab Mohie – SHAFT
Cory LaViska – Shoelace
Catherine 'whitequark' – smoltcp
Kumar Shivendu – Software Heritage
Eriol Fox – SustainOSS
Richard Littauer – SustainOSS
Billy Lynch – Tekton
Trevor Morris – TensorFlow
Jiajia Qin – TensorFlow.js
Patty O'Callaghan – TensorFlow.js Training and Ecosystem
Luiz Carvalho – TEP-0084: End-to-end provenance in Tekton Chains
Hannes Hapke – TFX-Addons
Sakae Kotaro – The 2021 Web Almanac
Frédéric Wang – The Chromium Projects
Raphael Kubo da Costa – The Chromium Projects
Mengmeng Tang – The Good Docs Project
Ophy Boamah Ampoh – The Good Docs Project
Gábor Horváth – The LLVM Project
Shafik Yaghmour – The LLVM Project
Dave Airlie – The Mesa 3D Graphics Library
Faith Ekstrand – The Mesa 3D Graphics Library
Aivar Annamaa – Thonny
Lawrence Kesteloot – trs80
Josh Goldberg – TypeScript
Linus Seelinger – UM-Bridge
Joseph Kato – Vale.sh
Abdelrahman Awad – vee-validate
Maxi Ferreira – View Transitions API
Tim Pope – vim-fugitive
Michelle O'Connor – Web Almanac
Jan Grulich – WebRTC
Wez Furlong – WezTerm
Yao Wang – World Federation of Advertisers - Virtual People Core Serving
Duncan Ogilvie – x64dbg

We are incredibly proud of all of the nominees for their outstanding contributions to open source, and we look forward to seeing even more amazing contributions in the years to come.

By Maria Tabak, Google Open Source Peer Bonus Program Lead

Open sourcing our Rust crate audits

Many open-source projects at Google use Rust, a modern systems language designed for building reliable and efficient software. Google has been investing in the Rust community for a long time; we helped found the Rust Foundation, many Googlers work on upstream Rust as part of their job, and we financially support key Rust projects. Today, we're continuing our commitment to the open-source Rust community by aggregating and publishing audits for Rust crates that we use in open-source Google projects.

Rust makes it easy to encapsulate and share code in crates, which are reusable software components that are like packages in other languages. We embrace the broad ecosystem of open-source Rust crates, both by leveraging crates written outside of Google and by publishing several of our own.

All third-party code carries an element of risk. Before a project starts using a new crate, members usually perform a thorough audit to measure it against their standards for security, correctness, testing, and more. We end up using many of the same dependencies across our open-source projects, which can result in duplicated effort when several different projects audit the same crate. To de-duplicate that work, we've started sharing our audits across our projects. Now, we're excited to join other organizations in sharing them with the broader open-source community.

Our crate audits are continually aggregated and published on GitHub under our supply-chain repository. They work with cargo vet to mechanically verify that:

  • a human has audited all of our dependencies and recorded their relevant properties, and
  • those properties satisfy the requirements for the current project

You can easily import audits done by Googlers into your own projects that attest to the properties of many open-source Rust crates. Then, equipped with this data, you can decide whether crates meet the security, correctness, and testing requirements for your projects. Cargo vet has strong support for incrementally vetting your dependencies, so it's easy to introduce to existing projects.

Different use cases have different requirements, and cargo vet allows you to independently configure the requirements for each of your dependencies. It may be suitable to only check a local development tool for actively malicious code – making sure it doesn't violate privacy, exfiltrate data, or install malware. But code deployed to users usually needs to meet a much stricter set of requirements – making sure it won't introduce memory safety issues, uses up-to-date cryptography, and conforms to its standards and specifications. When consuming and sharing audits, it’s important to consider how your project’s requirements relate to the facts recorded during an audit.

We hope that by sharing our work with the open-source community, we can make the Rust ecosystem even safer and more secure for everyone. ChromeOS and Fuchsia have already begun performing and publishing their audits in the above-mentioned supply-chain repository, and other Google open-source projects are set to join them soon. As more projects participate and we work through our collective audit backlog, our audits will grow to provide even more value and coverage. We're still early on in performing and sharing our audits through cargo vet, and the tool is still under active development. The details are likely to change over time, and we're excited to evolve and improve our processes and tooling as they do. We hope you'll find value in the work Googlers have done, and join us in building a safer and more secure Rust ecosystem.

By David Koloski, Fuchsia and George Burgess, Chrome OS

Accelerate AI development for Digital Pathology using EZ WSI DICOMWeb Python library

Overview

Digital pathology is changing the way pathology is practiced by making it easier to share images, collaborate with colleagues, and develop new AI algorithms that can improve the quality and cost of medical care. One of the biggest challenges of digital pathology is storing and managing the large volume of data generated. The Google Cloud Healthcare API provides a solution for this with a managed DICOM store, which is a secure, scalable, and performant way to store digital pathology images in a manner that is both standardized and interoperable.

However, performing image retrieval of specific patches (i.e. regions of interest) of a whole slide image (WSI) from the managed DICOM store using DICOMweb can be complex and requires DICOM format expertise. To address this, we are open sourcing EZ WSI (Whole Slide Image) DICOMWeb, a Python library that makes fetching these patches both efficient and easy-to-use.

How EZ WSI DICOMWeb works

EZ WSI DICOMweb facilitates the retrieval of arbitrary and sequential patches of a DICOM WSI from a DICOMWeb compliant Google Cloud Healthcare API DICOM store. Unlike downloading the entire DICOM series WSI and extracting patches locally from that file, which can increase network traffic, latency and storage space usage, EZ WSI DICOMweb retrieves only the necessary tiles for the desired patch directly through the DICOMweb APIs. This is simpler to use and abstracts away the following:

  • The need to fetch many tiles, which requires an understanding of DICOM data structure (e.g. offset & data hierarchy).
  • The need for a detailed understanding of the DICOMWeb APIs, REST payloads, and authentication, as well as addressing the possibility of redundant requests if several patches are fetched and there are overlapping tiles.
  • The need to decode images on the server if client side decoding is not supported, which increases the time it takes to transfer data and the size of the data being transferred.

EZ WSI DICOMWeb allows researchers and developers to focus on their ML tasks rather than the intricacies of DICOM. Developers do not need to have an in-depth understanding of DICOM data structuring or the DICOM API. The library provides a simple and intuitive functionality that allows developers to efficiently fetch DICOM images using only the Google Cloud Platform (GCP) Resource Name and DICOM Series path without any pixel recompression.

Case Study: Generating Patches for AI Workflows

A typical pathology WSI could be on the order of 40,000 pixels in length or width. However, an AI model that is trained to assess that WSI may only analyze a patch that is 512 x 512 pixels at a time. The way the model can operate over the entire WSI is by using a sliding window approach. We demonstrate how that can be done using EZ WSI DICOMWeb.

First, we create a DicomSlide object using the DICOMweb client and interface. This can be done with just a few lines of code.

dicom_web_client = dicom_web.DicomWebClientImpl()
dwi = dicom_web_interface.DicomWebInterface(dicom_web_client)
ds = dicom_slide.DicomSlide(
    dwi=dwi,
    path=gcp_resource_name + dicom_series_path,
    enable_client_slide_frame_decompression=True,
)
ds.get_image(desired_magnification)  # e.g. '0.625X'

This DicomSlide represents the entire WSI, as illustrated below.

Image of a WSI at a magnification of 0.625X, rendered by matplotlib

The above image leverages EZ WSI’s DicomSlide module to fetch an entire WSI at the requested magnification of 0.625X and uses matplotlib to render it; see the sample code for more details.

By providing coordinates, DicomSlide’s get_patch() method allows us to manually extract just the two sections of tissue at a supported magnification, using the coordinates pictured below.

tissue_patch = ds.get_patch(
    desired_magnification,
    x=x_origin,
    y=y_origin,
    width=patch_width,
    height=patch_height,
)

Left tissue sample and right tissue sample at 0.625X magnification, rendered by matplotlib

We can effectively zoom in on patches programmatically by reducing the window size and increasing the magnification, using the same get_patch() method from above.
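
As a rough sketch, reusing the ds object and the get_patch() call shown above (the coordinate and size values below are placeholders, not taken from the sample code):

# Hypothetical values for a small region of interest; in practice these
# would come from the coordinates of a structure you want to inspect.
zoom_x, zoom_y = x_origin, y_origin
zoom_width, zoom_height = 256, 256

zoomed_patch = ds.get_patch(
    '40X',               # a higher magnification than the overview above
    x=zoom_x,
    y=zoom_y,
    width=zoom_width,    # a smaller window than the full-tissue patch
    height=zoom_height,
)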

Image of three panels showing the same patch of interest at 0.625X, 2.5X, and 40X magnification, rendered by matplotlib

Our ultimate goal is to generate a set of patches that can be used in a downstream AI application from this WSI.

image showing patch generation at 10X with 0.625X mask, rendered by matplotlib

To do this, we call PatchGenerator. It works by sliding a window of a specified size with a specified stride size across the image, heuristically ignoring tissue-less regions at a specified magnification level.

patch_gen = patch_generator.PatchGenerator(
    slide=ds,
    stride_size=stride_size,            # the number of pixels between patches
    patch_size=patch_size,              # the length and width of the patch in pixels
    magnification=patch_magnification,  # magnification to generate patches at
    max_luminance=0.8,                  # defaults to .8, heuristic to evaluate where tissue is
    tissue_mask_magnification=mask_magnification,
)

The result is a list of patches that can be used as input into a machine learning algorithm.

image showing patch generation at 40X with 0.625X mask, rendered by matplotlib

Conclusion

We have built this library to make it easy to directly interact with DICOM WSIs that are stored in Google's DICOMWeb compliant Healthcare API DICOM store and extract image patches for AI workflows. Our hope is that by making this available, we can help accelerate the development of cutting edge AI for digital pathology in Google Cloud and beyond.

Links: GitHub, GCP-DICOMWeb

By Google HealthAI and Google Cloud Healthcare teams

PJRT: Simplifying ML Hardware and Framework Integration

Infrastructure fragmentation in Machine Learning (ML) across frameworks, compilers, and runtimes makes developing new hardware and toolchains challenging. This inhibits the industry’s ability to quickly productionize ML-driven advancements. To simplify the growing complexity of ML workload execution across hardware and frameworks, we are excited to introduce PJRT and open source it as part of the recently available OpenXLA Project.

PJRT (used in conjunction with OpenXLA’s StableHLO) provides a hardware- and framework-independent interface for compilers and runtimes. It simplifies the integration of hardware with frameworks, accelerating framework coverage for the hardware, and thus hardware targetability for workload execution.

PJRT is the primary interface for TensorFlow and JAX and fully supported for PyTorch, and is well integrated with the OpenXLA ecosystem to execute workloads on TPU, GPU, and CPU. It is also the default runtime execution path for most of Google’s internal production workloads. The toolchain-independent architecture of PJRT allows it to be leveraged by any hardware, framework, or compiler, with extensibility for unique features. With this open-source release, we're excited to allow anyone to begin leveraging PJRT for their own devices.

If you’re developing an ML hardware accelerator or developing your own compiler and runtime, check out the PJRT source code on GitHub and sign up for the OpenXLA mailing list to quickly bootstrap your work.

Vision: Simplifying ML Hardware and Framework Integration

We are entering a world of ambient experiences where intelligent apps and devices surround us, from edge to the cloud, in a range of environments and scales. ML workload execution currently supports a combinatorial matrix of hardware, frameworks, and workflows, mostly through tight vertical integrations. Examples of such vertical integrations include specific kernels for TPU versus GPU, specific toolchains to train and serve in TensorFlow versus PyTorch. These bespoke 1:1 integrations are perfectly valid solutions but promote lock-in, inhibit innovation, and are expensive to maintain. This problem of a fragmented software stack is compounded over time as different computing hardware needs to be supported.

A variety of ML hardware exists today and hardware diversity is expected to increase in the future. ML users have options and they want to exercise them seamlessly: users want to train a large language model (LLM) on TPU in the Cloud, batch infer on GPU or even CPU, distill, quantize, and finally serve them on mobile processors. Our goal is to solve the challenge of making ML workloads portable across hardware by making it easy to integrate the hardware into the ML infrastructure (framework, compiler, runtime).

Portability: Seamless Execution

The workflow to enable this vision with PJRT is as follows (shown in Figure 1):

  1. The hardware-specific compiler and runtime provider implement the PJRT API, package it as a plugin containing the compiler and runtime hooks, and register it with the frameworks. The implementation can be opaque to the frameworks.
  2. The frameworks discover and load one or multiple PJRT plugins as dynamic libraries targeting the hardware on which to execute the workload.
  3. That’s it! Execute the workload from the framework onto the target hardware.

The PJRT API will be backward compatible. The plugin would not need to change often and would be able to do version-checking for features.

Diagram of PJRT architecture
Figure 1: To target specific hardware, provide an implementation of the PJRT API to package a compiler and runtime plugin that can be called by the framework.

Cohesive Ecosystem

As a foundational pillar of the OpenXLA Project, PJRT is well-integrated with projects within the OpenXLA Project including StableHLO and the OpenXLA compilers (XLA, IREE). It is the primary interface for TensorFlow and JAX and fully supported for PyTorch through PyTorch/XLA. It provides the hardware interface layer in solving the combinatorial framework x hardware ML infrastructure fragmentation (see Figure 2).

Diagram of PJRT hardware interface layer
Figure 2: PJRT provides the hardware interface layer in solving the combinatorial framework x hardware ML infrastructure fragmentation, well-integrated with OpenXLA.

Toolchain Independent

PJRT is hardware and framework independent. With framework integration through the self-contained IR StableHLO, PJRT is not coupled with a specific compiler, and can be used outside of the OpenXLA ecosystem, including with other proprietary compilers. The public availability and toolchain-independent architecture allows it to be used by any hardware, framework or compiler, with extensibility for unique features. If you are developing an ML hardware accelerator, compiler, or runtime targeting any hardware, or converging siloed toolchains to solve infrastructure fragmentation, PJRT can minimize bespoke hardware and framework integration, providing greater coverage and improving time-to-market at lower development cost.

Driving Impact with Collaboration

Industry partners such as Intel and others have already adopted PJRT.

Intel

Intel is leveraging PJRT in Intel® Extension for TensorFlow to provide the Intel GPU backend for TensorFlow and JAX. This implementation is based on the PJRT plugin mechanism (see RFC). Check out how this greatly simplifies the framework and hardware integration with this example of executing a JAX program on Intel GPU.

"At Intel, we share Google's vision of modular interfaces to make integration easier and enable faster, framework-independent development. Similar in design to the PluggableDevice mechanism, PJRT is a pluggable interface that allows us to easily compile and execute XLA's High Level Operations on Intel devices. Its simple design allowed us to quickly integrate it into our systems and start running JAX workloads on Intel® GPUs within just a few months. PJRT enables us to more efficiently deliver hardware acceleration and oneAPI-powered AI software optimizations to developers using a wide range of AI Frameworks." - Wei Li, VP and GM, Artificial Intelligence and Analytics, Intel.

Technology Leader

We’re also working with a technology leader to leverage PJRT to provide the backend targeting their proprietary processor for JAX. More details on this to follow soon.

Get Involved

PJRT is available on GitHub: source code for the API and a reference openxla-pjrt-plugin, and integration guides. If you develop ML frameworks, compilers, or runtimes, or are interested in improving portability of workloads across hardware, we want your feedback. We encourage you to contribute code, design ideas, and feature suggestions. We also invite you to join the OpenXLA mailing list to stay updated with the latest product and community announcements and to help shape the future of an interoperable ML infrastructure.

Acknowledgements

Allen Hutchison, Andrew Leaver, Chuanhao Zhuge, Jack Cao, Jacques Pienaar, Jieying Luo, Penporn Koanantakool, Peter Hawkins, Robert Hundt, Russell Power, Sagarika Chalasani, Skye Wanderman-Milne, Stella Laurenzo, Will Cromar, Xiao Yu.

By Aman Verma, Product Manager, Machine Learning Infrastructure

Google Summer of Code 2023 accepted contributors announced!

We are pleased to announce the Google Summer of Code (GSoC) Contributors for 2023. Over the last few weeks, our 171 mentoring organizations have read through applications, had discussions with applicants, and made the difficult decision of selecting the GSoC Contributors they will be mentoring this summer.

Some notable results from this year’s application period:
  • 43,765 applicants from 160 countries
  • 7,723 proposals submitted
  • 967 GSoC contributors accepted from 65 countries
  • Over 2,400 mentors and organization administrators

Over the next few weeks, our GSoC 2023 Contributors will be actively engaging with their new open source community and getting acclimated with their organization. Mentors will guide the GSoC Contributors through the documentation and processes used by the community, as well as help with planning their milestones and projects for the summer. This Community Bonding period helps familiarize the GSoC Contributors with the languages and tools they will need to successfully complete their projects. Coding begins May 29th and most folks will wrap up September 5th; however, for the second year in a row, GSoC Contributors can request a longer coding period and wrap up their projects by mid-November instead.

We’d like to express our appreciation to the thousands of applicants who took the time to reach out to our mentoring organizations and submit proposals. Through the experience of asking questions, researching, and writing your proposals we hope you all learned more about open source and maybe even found a community you want to contribute to outside of Google Summer of Code! We always say that communication is key, and staying connected with the community or reaching out to other organizations is a great way to set the stage for future opportunities. Open source communities are always looking for new and eager collaborators to bring fresh ideas to the table. We hope you connect with an open source community or even start your own open source project!

There are a handful of program changes to this 19th year of GSoC and we are excited to see how our GSoC Contributors and mentoring organizations take advantage of these adjustments. A big thank you to all of our mentors and organization administrators who make this program so special.

GSoC Contributors—have fun this summer and keep learning! Your mentors and community members have dozens, and in some cases hundreds, of years of combined experience. Let them share their knowledge with you and help you become awesome open source contributors!

By Perry Burnham, Associate Program Manager for the Google Open Source Programs Office