Author Archives: Open Source Programs Office

Google and NIST partner on nanotechnology development platform

We’re proud to announce Google’s cooperative research and development agreement with the U.S. National Institute of Standards and Technology (NIST) to develop an open source testbed for nanotechnology research and development for American universities. NIST—a bureau of the U.S. Department of Commerce—will start by migrating their existing planarized wafer designs to an open source framework, which can be manufactured in the U.S. on SkyWater Technologies’ open source 130nm process (SKY130). The physical wafers and source code will be available in the coming months. Together, NIST, Google, and the open source community will develop designs to facilitate research into both basic and applied science, including technology transfer into production with U.S. manufacturers.

Furthering Google’s goal of improving access to semiconductor technology, this agreement will provide academic researchers with unprecedented resources from a semiconductor foundry to enhance research into the physics of semiconductors and nanodevices. This includes their chemistry, defects, electrical properties, high-frequency operation, and switching behavior, while reducing overall costs through economies of scale. Most importantly, this access enhances the technology transfer process by enabling researchers to develop new and emerging technologies using foundry resources that can then be seamlessly transitioned into mass production, since universities will already be using an industrially relevant platform. This will greatly improve scientists’ ability to move their technologies through the tech-transfer “valley of death” and into practical use.

Nanotechnology research has benefited in a unique way from the silicon wafers normally used for chip manufacturing. Instead of being turned into packaged microchips, their smooth, planarized surfaces make a great substrate for building and testing nanoscale structures. This likewise helps test their transition into mass production.

Picture of a full wafer using the SKY130 open source PDK.

The wafer for this platform has a number of different metrology structures, from parametric test structures based on simple transistor arrays—which can be probed in a probe station—to thousands of complex measurements that users can operate using synthesized digital circuits. Critically, the wafers will be available to universities in a 200 mm form factor as mid-production planarized wafers with less than a nanometer of surface roughness. Smooth, flat surfaces are critical for advanced manufacturing at small sizes.

NIST researchers are also ensuring that the wafers have photolithographic and electron beam alignment marks commonly found in university nanofabrication facilities, allowing the foundry silicon to be used directly by university researchers with ease. Metal pads on the surface will allow scientists to access the semiconductor transistors from the surface.

NIST scientists anticipate the nanotechnology accelerator platform will enhance scientific investigations into a diverse set of technologies, including memory devices (resistive switches, magnetic tunnel junctions, flash memories), artificial intelligence, plasmonics, semiconductor bioelectronics, thin film transistors and even quantum information science.

Picture of a development die from Google's OpenMPW program for the nanotechnology accelerator developed by NIST and the University of Michigan

This program also benefits from Google’s previous contributions to and support of the GDSFactory and OpenFASOC open source projects, which help automate the construction of these important measurement devices, shortening it from months to days. Ahead of the full wafer tapeout in 2023, NIST scientists, working with partners at the University of Michigan, Carnegie Mellon, University of Maryland, The George Washington University, and Brown University, have been using Google's OpenMPW program to develop and test preliminary circuits that they expect to include in the nanotechnology accelerator. Preliminary testing will help ensure the program’s goals are met with working circuits that best serve the scientific community.

A key factor in cutting-edge research is reproducibility, or the ability for researchers from different institutions to repeat each other’s experiments and improve upon them. By migrating to an open source framework, researchers can more easily share reproducible results, contribute to the creation of open source datasets to enhance future simulation, and advance the scientific community’s state of the art of nanotechnology and semiconductor manufacturing.

NIST and Google will distribute the first production run of wafers to leading U.S. universities. After the program, American scientists will be able to purchase the wafers directly from SkyWater without license requirements, giving them the freedom to pursue their research without restrictions. Since wafers are hundreds of times cheaper than full mask sets or the cost of designing integrated circuits from scratch, scientists will have a much easier time getting and using this powerful industrial technology. Longer term, working with NIST to develop future platforms on the recently announced SKY90-FD open source PDK will further expand this R&D ecosystem.

To kick off this research effort, NIST is organizing the "NIST Integrated Circuits for Metrology Workshop" on September 20–21, 2022. The workshop will be held online, with a series of presentations and panel discussions on the first day. On the second day, a working group of researchers, scientists, and engineers will focus on creating parametric test structures for monolithic integration using open source silicon technology. Visit the event website for more details about the program, to register to attend, or to learn more about presenting.

By Ethan Mahintorabi and Johan Euphrosine, Software Engineers – Hardware Toolchains Team, and Aaron Cunningham, Technical Program Manager – Google Open Source Programs Office

Accelerate your models to production with Google Cloud and PyTorch

We believe in the power of choice for machine learning development, and we continue to invest resources to make it easy for ML practitioners to train, deploy, and orchestrate models from a single unified data and AI cloud platform. We’re excited to announce our role as a founding member of the newly formed PyTorch Foundation, which will better position Google Cloud to make meaningful contributions to the PyTorch community. As a member of the board, we will deepen our open source investment to deliver on the Foundation’s mission to drive adoption of AI tooling by building an ecosystem of open source projects with PyTorch. We strongly believe in choice and will continue to invest in frameworks such as JAX and TensorFlow, and to support integrations with other OSS projects including Spark, Airflow, XGBoost, and others.

In this blog, we provide an overview of existing resources to help you get started with PyTorch on Google Cloud. We also talk about how ML practitioners can leverage our end-to-end ML platform to train, tune, and deploy PyTorch models.

PyTorch on Google Cloud

Open source in the cloud is important because it gives you flexibility and control over where you train and deploy your ML workloads. PyTorch is extensively used in the research space and in recent years it has gained immense traction in the industry due to its ease of use and deployment. In fact, according to a survey of Kaggle users, PyTorch is the fastest growing ML framework today.

ML practitioners using PyTorch tell us that it can be challenging to advance their ML project past experimentation. This is why Google Cloud has built integrations with PyTorch that make it easier to train, deploy, and orchestrate models in production. Some examples are:

  • PyTorch integrates directly with Vertex AI, a fully managed ML platform that provides the tools you need to take a model from PyTorch to production, like the PyTorch DL containers and the Vertex AI Workbench PyTorch one-click JupyterLab environment.
  • PyTorch/XLA, an open source library, uses the XLA deep learning compiler to enable PyTorch to run on Cloud TPUs, custom accelerators designed by Google and optimized for performance and TCO on large-scale ML workloads. PyTorch/XLA also enables XLA-driven optimizations on GPUs.
  • TorchX provides an adapter to run and orchestrate TorchX components as part of Kubeflow Pipelines that you can easily scale on Vertex AI Pipelines.
  • With our OSS contributions to Apache Beam, we have made PyTorch models easy to deploy in batch and streaming data processing pipelines. Running on Google Dataflow, these pipelines scale to very large workloads in a fully managed and simple-to-maintain environment (a minimal sketch follows this list).
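
To make the Apache Beam integration above concrete, here is a minimal sketch of a batch inference pipeline using Beam's RunInference API with a PyTorch model handler. This is an illustrative example rather than an official sample: the model class is a toy stand-in and the GCS path is a hypothetical placeholder.

import torch
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor


class LinearRegression(torch.nn.Module):
    # Toy model; stands in for any trained PyTorch model.
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)


# Hypothetical GCS path to a saved state dict.
model_handler = PytorchModelHandlerTensor(
    state_dict_path="gs://my-bucket/model/state_dict.pth",
    model_class=LinearRegression,
    model_params={})

with beam.Pipeline() as pipeline:
    _ = (
        pipeline
        | "CreateInputs" >> beam.Create([torch.tensor([1.0]), torch.tensor([2.0])])
        | "Predict" >> RunInference(model_handler)  # batched model inference
        | "Print" >> beam.Map(print))

The same pipeline can run on Dataflow by switching the pipeline's runner options, with no changes to the inference code itself.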

To learn more and start using PyTorch on Google Cloud, check out the resources below:

PyTorch on Vertex AI Resources

  1. How to train and tune PyTorch models on Vertex AI: Learn how to use Vertex AI Training to build and train a sentiment text classification model using PyTorch, and Vertex AI Hyperparameter Tuning to tune the hyperparameters of PyTorch models.
  2. How to deploy PyTorch models on Vertex AI: Walk through the deployment of a PyTorch model using TorchServe as a custom container, deploying the model artifacts to a Vertex Prediction service (a minimal SDK sketch follows this list).
  3. Orchestrating PyTorch ML Workflows on Vertex AI Pipelines: See how to build and orchestrate ML pipelines for training and deploying PyTorch models on Google Cloud using Vertex AI Pipelines.
  4. Scalable ML Workflows using PyTorch on Kubeflow Pipelines and Vertex Pipelines: Take a look at examples of PyTorch-based ML workflows on two pipeline frameworks: OSS Kubeflow Pipelines, part of the Kubeflow project, and Vertex AI Pipelines. We share new PyTorch built-in components added to Kubeflow Pipelines.
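
For a flavor of the deployment flow in tutorial 2, here is a minimal sketch using the Vertex AI Python SDK (google-cloud-aiplatform). The project, region, bucket, container image, and request payload are hypothetical placeholders; the linked guide remains the authoritative walkthrough.

from google.cloud import aiplatform

# Placeholders: substitute your own project, region, artifacts, and image.
aiplatform.init(project="my-project", location="us-central1")

# Upload the model artifacts along with a serving container (e.g. TorchServe).
model = aiplatform.Model.upload(
    display_name="pytorch-text-classifier",
    artifact_uri="gs://my-bucket/model/",
    serving_container_image_uri="us-docker.pkg.dev/my-project/serving/torchserve:latest")

# Deploy the model behind a Vertex Prediction endpoint.
endpoint = model.deploy(machine_type="n1-standard-4")

# Online prediction against the deployed endpoint.
print(endpoint.predict(instances=[{"text": "what a great movie!"}]))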

PyTorch/XLA and Cloud TPU/GPU

  1. Scaling deep learning workloads with PyTorch/XLA and Cloud TPU VM: Describes the challenges associated with scaling deep learning jobs to distributed training settings using the Cloud TPU VM, and shows how to stream training data from Google Cloud Storage (GCS) to PyTorch/XLA models running on Cloud TPU Pod slices (see the training-step sketch after this list).
  2. PyTorch/XLA: Performance debugging on Cloud TPU VM: Part I: In the first part of the performance debugging series on Cloud TPU, we lay out the conceptual framework for PyTorch/XLA in the context of training performance, and introduce a case study to make sense of preliminary profiler logs and identify corrective actions.
  3. PyTorch/XLA: Performance debugging on Cloud TPU VM: Part II: In the second part, we dive deeper into the analysis to discover further performance improvement opportunities.
  4. PyTorch/XLA: Performance debugging on Cloud TPU VM: Part III: In the final part of the performance debugging series, we introduce user-defined code annotations and visualize these annotations in the form of a trace.
  5. Train ML models with PyTorch Lightning on TPUs: Learn how easy it is to start training models with PyTorch Lightning on TPUs using its built-in TPU support.
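
As a rough illustration of the PyTorch/XLA programming model these guides cover, the sketch below shows a single-device training loop with a toy model and random data. It is a simplified, assumed setup rather than an excerpt from the tutorials.

import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()  # resolves to a Cloud TPU core when one is attached

model = torch.nn.Linear(10, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.CrossEntropyLoss()

for step in range(100):
    # Toy random batch; real jobs would stream data, e.g. from GCS.
    inputs = torch.randn(8, 10, device=device)
    labels = torch.randint(0, 2, (8,), device=device)

    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()
    # optimizer_step applies the update and (with barrier=True) forces the
    # lazily built XLA graph to execute.
    xm.optimizer_step(optimizer, barrier=True)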

Other resources

  1. Increase your productivity using PyTorch Lightning: Learn how to use PyTorch Lightning on Vertex AI Workbench (previously Notebooks).

By Erwin Huizing and Grace Reed – Cloud AI and ML

TestParameterInjector gets JUnit5 support

In March 2021, we announced the open source release of TestParameterInjector: A parameterized test runner for JUnit4 (see GitHub page).

Over a year later, Google-internal usage of TestParameterInjector has continued to grow rapidly, and it is now by far the most popular parameterized test framework at Google.
Graph of the different parameterized test frameworks in Google
Guava's philosophy frames it nicely: "When trying to estimate the ubiquity of a feature, we frequently use the Google internal code base as a reference." We also believe that TestParameterInjector usage in Google is a decent proxy for its utility elsewhere.

As you can see in the graph above, not only did TestParameterInjector reduce the usage of the other frameworks, but it also drove a drastic increase in the total number of parameterized tests. This suggests that TestParameterInjector lowered the barrier to parameterizing a regular unit test, and that Googlers are actively using this tool to improve the quality of their tests.

JUnit5 (Jupiter) support

At Google, we use JUnit4 exclusively, but some developers outside of Google have moved on to JUnit5 (Jupiter). For those users, we have now expanded the scope of TestParameterInjector.

We've kept the API the same as much as possible:

// **************** JUnit4 **************** //

@RunWith(TestParameterInjector.class)
public class MyTest {

  @TestParameter boolean isDryRun;

  @Test public void test1(@TestParameter boolean enableFlag) { ... }

  @Test public void test2(@TestParameter MyEnum myEnum) { ... }

  enum MyEnum { VALUE_A, VALUE_B, VALUE_C }
}


// **************** JUnit5 (Jupiter) **************** //

class MyTest {

  @TestParameter boolean isDryRun;

  @TestParameterInjectorTest
  void test1(@TestParameter boolean enableFlag) {
    // This method is run 4 times: for all combinations of isDryRun and enableFlag
  }

  @TestParameterInjectorTest
  void test2(@TestParameter MyEnum myEnum) {
    // This method is run 6 times: for all combinations of isDryRun and myEnum
  }

  enum MyEnum { VALUE_A, VALUE_B, VALUE_C }
}

The only differences are that @RunWith / @ExtendWith are not necessary and that every test method needs a @TestParameterInjectorTest annotation.

The other features of TestParameterInjector work in a similar way with Jupiter:

class MyTest {

  // **************** Defining sets of parameters **************** //

  @TestParameterInjectorTest
  @TestParameters(customName = "teenager", value = "{age: 17, expectIsAdult: false}")
  @TestParameters(customName = "young adult", value = "{age: 22, expectIsAdult: true}")
  void personIsAdult_success(int age, boolean expectIsAdult) {
    assertThat(personIsAdult(age)).isEqualTo(expectIsAdult);
  }

  // **************** Dynamic parameter generation **************** //

  @TestParameterInjectorTest
  void matchesAllOf_throwsOnNull(
      @TestParameter(valuesProvider = CharMatcherProvider.class) CharMatcher charMatcher) {
    assertThrows(NullPointerException.class, () -> charMatcher.matchesAllOf(null));
  }

  private static final class CharMatcherProvider implements TestParameterValuesProvider {
    @Override
    public List<CharMatcher> provideValues() {
      return ImmutableList.of(
          CharMatcher.any(), CharMatcher.ascii(), CharMatcher.whitespace());
    }
  }
}

Other things we've been working on

Custom names for @TestParameters
When running the following parameterized test:

@Test
@TestParameters("{age: 17, expectIsAdult: false}")
@TestParameters("{age: 22, expectIsAdult: true}")
public void withRepeatedAnnotation(int age, boolean expectIsAdult) { ... }

the generated test names will be:

MyTest#withRepeatedAnnotation[{age: 17, expectIsAdult: false}]
MyTest#withRepeatedAnnotation[{age: 22, expectIsAdult: true}]

This is fine for small parameter sets, but when the number of @TestParameters or parameters within the YAML string gets large, it quickly becomes hard to figure out what each parameter set is supposed to represent.

For those cases, we added the option to add customName:

@Test
@TestParameters(customName = "teenager", value = "{age: 17, expectIsAdult: false}")
@TestParameters(customName = "young adult", value = "{age: 22, expectIsAdult: true}")
public void personIsAdult(int age, boolean expectIsAdult) { ... }

To support this API change, we had to allow @TestParameters to be used in a different way: the original way of specifying @TestParameters sets was a list of YAML strings inside a single @TestParameters annotation. We considered multiple options for specifying the custom name inside these YAML strings, such as a magic _name key or an extra YAML mapping layer where the keys would be the test names. But we eventually settled on the aforementioned API, which makes @TestParameters a repeated annotation, because it results in the least complex code and clearly separates the different parameter sets.

It should be noted that the original API (a list of YAML strings in a single annotation) still works, but it is now discouraged in favor of multiple @TestParameters annotations with a single YAML string each, even when customName isn't used. The main arguments for this recommendation are:
  • Consistency with the customName case, which needs a single YAML string per @TestParameters annotation
  • It presents the list of parameter sets (especially when it's long) in a more structured way

Integration with RobolectricTestRunner

Recently, we built an internal version of RobolectricTestRunner that supports TestParameterInjector annotations. A significant amount of work remains to open source it, and we are considering when and how to do so.

Learn more

Our GitHub README provides an overview of the framework. Let us know on GitHub if you have any questions, comments, or feature requests!

By Jens Nyman – TestParameterInjector

El Carro for Oracle: Data migration and improved backups

In May 2021, we released El Carro to make it easier to run Oracle databases on Kubernetes. A follow-up blog post dove deeper into El Carro’s features, announced support for Oracle 19c, and detailed more flexibility for building database images. Today we’re excited to open source two new features that enhance El Carro and make it easier to manage your Oracle deployments: Data Migration and Point-in-Time Recovery. Automated Data Migration makes it much easier to re-platform to El Carro, and Point-in-Time Recovery is a standard feature that database professionals have come to expect because it lets you drastically reduce your recovery point objective (RPO) and worry less about backup frequency.

Data Migration

The Data Migration feature of El Carro enables users to migrate data from their existing database to an El Carro database running on Kubernetes. This functionality allows users to re-platform to El Carro with minimal disruption. The two most common pathways shown in the image below are to 1) modernize in place by simply migrating your database to Kubernetes so you can leverage the automation of El Carro and to 2) migrate to Kubernetes in the cloud.
Migrate your database in place or to the cloud

Typical migration sources include AWS (RDS, EKS, EC2), Azure (AKS, VMs), GCP (GKE, GCE, BMS) or on-premises deployments. Typical migration targets include any Kubernetes installation on GCP (GKE), AWS (EKS), Azure (AKS), or on-premises.

The Data Migration feature offers two automated migration flows and two manual ones.

Category  | Options                          | Migration Downtime | Complexity
----------|----------------------------------|--------------------|-----------
Automated | Data Guard with physical standby | minimal            | lowest
Automated | Data pump                        | long               | low
Manual    | RMAN-based migration             | long               | medium
Manual    | Data pump                        | long               | medium

  • Migration Downtime: the downtime required to migrate the source database into El Carro without data loss. Minimal means less downtime; long means more downtime.
  • Complexity: summarizes the difficulty and complexity of the migration journey.

Point-in-Time Recovery

Since its release, El Carro has let users take backups and restore via RMAN or storage-based snapshots. Today we’re excited to release a new Point-in-Time Recovery feature that enhances El Carro’s backup functionality by automatically backing up archived redo logs to a GCS (Google Cloud Storage) bucket and allowing users to seamlessly restore their databases to any point in time within a user-configurable window. This optional feature provides an additional layer of protection and enhanced restore granularity without interfering with manual backups or affecting database performance.

The diagram below contrasts the new versus old functionality. Previously, there were discrete restore points (shown in green on the top arrow) which represented limited opportunities to restore. With Point-in-Time Recovery, the entire arrow is green, meaning the recovery functionality is continuous, with restore points at any time along the green arrow.

With Point-in-Time Recovery you can restore to any point after the first backup

Conclusion

As always, you can try the open source El Carro operator for free (Apache 2.0 license) wherever you run Oracle databases. Follow the quick start guide and try out provisioning instances, databases, and users. Import data via Data Pump, manage instance parameters, choose between different methods for backups, and try out a restore. Have a look at how we integrate with external logging and monitoring solutions. Reach out via our Google group and leave feedback on what features you would like to see next, or even create your own patch, issue, or pull request on GitHub.

By Kyle Meggs, Product Manager and Ash Gbadamassi, Software Engineer – Cloud Databases

GlobalFoundries joins Google’s open source silicon initiative

Over the last year we have been busy planning the expansion of our free open source silicon design and manufacturing program to further grow the community of developers and companies building custom silicon, and build a thriving ecosystem around open source hardware.

Today, we’re excited to announce an expansion of this program and our partnership with GlobalFoundries. Together, we're releasing the Process Design Kit (PDK) for the GlobalFoundries 180MCU technology platform under the Apache 2.0 license, along with a no-cost silicon realization program to manufacture open source designs on the Efabless platform. This open source PDK is the first result of our ongoing partnership with GF. Based on the scale and breadth of GF’s technology and manufacturing expertise, we expect to do more together to further access and innovation in semiconductor development and manufacturing.
GF180MCU 1P5M 5 metals stack-up, 9 kÅ top metal, with MIM between M3 and M4 layers.
Google started this program with SkyWater Technologies, by releasing one of their PDKs under the Apache 2.0 license. We sponsored six shuttle runs over the course of two years, allowing the open source community to submit more than 350 unique designs, of which around 240 were manufactured at no cost.
We cannot overstate the milestone that this new partnership represents in the foundry ecosystem market.

Over the past few years, the world has experienced an unprecedented acceleration of adoption of digital capabilities—driven by the pandemic, and technology megatrends that have shifted every aspect of human life. According to GlobalFoundries, this has led to roughly 73% of foundry revenue being associated with high growth markets such as mobile, IoT, and automotive. This transition has not only given rise to a “New Golden Age” of semiconductors but also a tectonic shift in how we define and deliver innovation as an industry.  

Specifically, applications using 180nm are at a global capacity of 16+ million wafers a year, and are expected to grow to 22+ million wafers a year by 2026, according to GlobalFoundries.

The 180nm application space continues to see strong market traction in motor controllers, RFID, general-purpose MCUs, and PMICs, along with emerging applications such as IoT sensors, dual-frequency RFID, and motor drives.

The collaboration between GlobalFoundries and Google will help drive innovation for the application and silicon engineers designing in these high growth areas, and is an unambiguous affirmation of the viability of the open source model for the foundry ecosystem.

The GF 180nm technology platform offers open source silicon designers new capabilities for high-volume production, affordability, and more voltage options. The PDK includes:
  • Digital standard cell libraries (7-track and 9-track)
  • Low (3.3V), medium (5V, 6V), and high (10V) voltage devices
  • SRAM macros (64x8, 128x8, 256x8, 512x8)
  • I/O and primitive (resistors, capacitors, transistors, eFuses) cell libraries
Open sourcing more PDKs is a critical step in the development of the open source silicon ecosystem:
  • Open source EDA tools can now add support for multiple process technologies.
  • Researchers can produce fully-reproducible designs against multiple technology baselines.
  • Popular open source IP blocks can be ported to different process technologies.
We cannot build this on our own. We need you: software developers and hardware engineers, researchers and undergrad students, hobbyists and industry veterans, new startups and industry players alike, to bring your fresh ideas and your proven experiences to help us grow the open source silicon ecosystem.

By Johan Euphrosine and Ethan Mahintorabi – Hardware Toolchains Team

SkyWater and Google expand open source program to new 90nm technology

Today, Google is announcing the expansion of our partnership with SkyWater Technology. We are working together to release an open source process design kit (PDK) for SKY90-FD, SkyWater’s commercial 90nm fully depleted silicon on insulator (FDSOI) CMOS process technology. SKY90-FD is based on MIT Lincoln Laboratory’s 90 nm commercial FDSOI technology, and enables designers to create complex integrated circuits for a diverse range of applications.

Over the last two years, Google and SkyWater Technology have partnered to make building open silicon accessible to all developers, starting with the open source release of the SKY130 PDK and continuing with a series of no-cost manufacturing shuttles for developers in the open source hardware ecosystem. To date, Google has sponsored six shuttles on the Efabless platform, manufacturing 240 designs from over 364 community submissions. This is the first partnership of its type ever launched, and the results to date have been impressive.
The latest MPW-6 shuttle received 90 submissions from a diverse community across 24 different countries.

Over the coming months, we'll work closely with SkyWater Technology to release their new SKY90-FD PDK under the Apache 2.0 license and organize additional Open MPW shuttles to manufacture open source designs for this new 90nm FDSOI technology, through the Efabless platform.

We believe that having access to different technologies through open source PDKs is critical to grow and strengthen the open silicon ecosystem:
  • Developers can go beyond the constraints of their familiar process nodes and explore different performance, power and area trade offs with existing or new designs.
  • Researchers can reproduce their research on different technologies to produce diverse figures of merit.
  • Tool maintainers can generalize their technologies' backends to support more than one process.
  • The community can refine the ways we structure, distribute and maintain these PDKs.
SKY90-FD is a 90nm FDSOI process. Unlike a traditional bulk CMOS process, SKY90-FD features a thin layer of insulator material between the substrate and the upper silicon layer. This thin-oxide process allows the transistor to be significantly thinner than in a bulk process, allowing the device to be “fully depleted” and simplifying the fabrication process. This extra insulation greatly reduces parasitic current leakage and lowers junction capacitances, providing improved speed and power performance under various environmental conditions.
The SKY90-FD process stack features five thin copper base metal layers for the main interconnect and two extra, thicker aluminum (Al) metal layers capable of conducting higher current.
Google is excited about the new range of applications this open source PDK will enable once it's released later this year, and we can't wait to hear from you and watch the growing stream of innovative project ideas originating from the open silicon community.

In the meantime, make sure to check https://developers.google.com/silicon for resources and pointers to start your open silicon journey!


By Johan Euphrosine and Ethan Mahintorabi – Hardware Toolchains Team

Cirq Turns 1.0



Today we are excited to announce the first full version release of the open source quantum programming framework Cirq: Cirq 1.0. Cirq is a Python framework for writing, running, and analyzing the results of quantum computer programs. It was designed for near-term quantum computers: those with a few hundred qubits and a few thousand quantum gates. The significance of the 1.0 release is that Cirq now supports the vast majority of workflows for these systems and is considered a stable API that we will only update with breaking changes at major version numbers.
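
As a quick taste of what Cirq programs look like, here is a small, standard example (not taken from the release notes) that prepares and samples a Bell state on Cirq's built-in simulator:

import cirq

# Two qubits on a line.
q0, q1 = cirq.LineQubit.range(2)

# Prepare a Bell state, then measure both qubits.
circuit = cirq.Circuit(
    cirq.H(q0),
    cirq.CNOT(q0, q1),
    cirq.measure(q0, q1, key="m"),
)

result = cirq.Simulator().run(circuit, repetitions=1000)
# Expect roughly even counts of 00 (key 0) and 11 (key 3).
print(result.histogram(key="m"))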

Getting to Cirq 1.0 is the culmination of a large amount of hard work by hundreds of contributors from Google, industry, and academia. We have been running a weekly meeting, called the “Cirq Cync,” for over four years, where community members gather to discuss work on Cirq, bugs, and to generally tell terrible but amusing quantum programming jokes. We’re proud of this inclusive community, and we’ve been particularly happy to see the growth of many software developers into quantum computing experts, and quantum computing experts into solid software developers. One of our contributors, Victory Omole, won the 2021 Wittek Quantum Prize for Open Source Software. Way to go Victory!

The first commit to Cirq on GitHub (an internal version of Cirq at Google existed prior to this) was on Dec 19, 2017 by Craig Gidney, and we publicly announced Cirq in July of 2018. More than 3,200 commits to the GitHub repo later, in the hands of the team at Google and the Cirq community, we’ve seen Cirq help accomplish some amazing things:
  • Cirq is the lingua franca that Google’s hardware team uses to write quantum programs that run on Google’s quantum computing hardware. Because of this, we have been able to post open source code in our ReCirq repo for these experiments for anyone to examine and extend. A few highlights of the past few years:
    • “Realizing topologically ordered states on a quantum processor”, K. J. Satzinger et al., Science 374, 6572, 1237–1241 (2021) [paper] [ReCirq code]
    • “Information scrambling in quantum circuits”, X. Mi, P. Roushan, C. Quintana et al., Science 374, 6574, 1479–1483 (2021) [paper] [ReCirq code]
    • “Hartree-Fock on a superconducting qubit quantum computer”, F. Arute et al., Science 369, 6507, 1084–1089 (2020) [paper] [ReCirq code]
  • A healthy community of libraries have now been built on top of Cirq, enabling different quantum computing research areas. These libraries include:
    • TensorFlow Quantum: a tool for exploring quantum machine learning. Using TensorFlow Quantum, researchers trained a machine learning model on 30 qubits at a rate of 1.1 petaflops (1.1 × 10^15 floating-point operations per second).
    • OpenFermion: an open source tool for quantum computations involved in chemistry simulations.
    • Pytket (pytket-cirq): an open source Python tool for optimizing and manipulating quantum circuits.
    • Mitiq: an open source library for error mitigation techniques, developed by the non-profit Unitary Fund.
    • qsim: a high-performance state vector simulator written using AVX/FMA vectorized instructions, with optional GPU acceleration. qsimcirq is the interface for accessing qsim from Cirq (see the sketch after this list).
  • Numerous quantum computing cloud services from companies in the industry have also integrated Cirq. Programs written in Cirq can run on hardware from AQT, IonQ, Pasqal, Rigetti, and IQM. In addition, Cirq can be used on Azure Quantum to run on the hardware Azure Quantum supports. Finally, one can get realistic noise simulations of Google’s quantum computing hardware using our newly released Quantum Virtual Machine.
  • Cirq is not just for stuffy research. Cirq has also been used to help develop Quantum Chess, a version of chess that uses superposition and entanglement. This notebook shows you how the game of Quantum Chess can be programmed using Cirq.
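
As a brief, illustrative sketch of the qsim integration mentioned in the list above (assuming the qsimcirq package is installed), QSimSimulator can act as a drop-in replacement for Cirq's built-in simulator:

import cirq
import qsimcirq

q0, q1 = cirq.LineQubit.range(2)
circuit = cirq.Circuit(cirq.H(q0), cirq.CNOT(q0, q1))

# QSimSimulator exposes the same simulate/run interface as cirq.Simulator.
simulator = qsimcirq.QSimSimulator()
print(simulator.simulate(circuit).final_state_vector)
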
Cirq’s move to its first full version comes not just with new features (see the 1.0 release notes), but also with stronger guarantees about stability. Cirq uses semantic versioning, which means that future point releases of Cirq will be compatible with the current full version release. For example, version 1.1 of Cirq will not introduce breaking changes to Cirq’s interfaces from version 1.0; only at major version bumps (from 1.x to 2.0, for example) will breaking changes occur.

When we began working on Cirq, quantum computers consisted of only a few qubits and a few quantum gates on those qubits. Building Cirq and its supporting software for these custom systems, and watching them start to scale to hundreds of qubits over the past (nearly) five years, has taught us many lessons. One key takeaway is this: as quantum computing hardware continues to grow in scale and complexity, making software that supports this growth will be essential to continued meaningful research and progress. In the next five years, with hardware expected to reach hundreds or even thousands of qubits, the software developed for quantum computing will need to keep a careful eye on supporting these bigger and bigger systems. Going forward we will need an ever wider set of frameworks, programming languages, and libraries to achieve quantum computing’s promise.

Acknowledgements

We are indebted to all 169 contributors to the Cirq GitHub repo, and the many more who have filed issues and used Cirq in their own software. A particular shout out to the original lead of Cirq, Craig Gidney; to Cirq’s second lead, Bálint Pató, who guided Cirq through its middle ages; and to Alan Ho and Catherine Vollgraff Heidweiller for product wisdom. A special thanks to the core Cirq contributors, including Doug Strain, Matthew Neely, Tanuj Khatter, Dax Fohl, Adam Zalcman, Kevin Sung, Matt Harrigan, Casey Duckering, Orion Martin, Smit Sanghavi, Bryan O'Gorman, Wojciech Mruczkiewicz, Ryan LaRose, Tony Bruguier, Victory Omole, and Cheng Xing, and our documentarians Auguste Hirth and Abe Asfaw.


By Dave Bacon and Michael Broughton – Quantum AI Team

Gateway API Graduates to Beta

For many years, Kubernetes users have wanted more advanced routing capabilities to be configurable in a Kubernetes API. With Google's leadership, Gateway API has been developed to dramatically increase the number of features available. This API enables many new capabilities in Kubernetes, including traffic splitting, header modification, and forwarding traffic to backends in different namespaces, to name just a few.
Since the project was originally proposed, Googlers have helped lead the open source efforts. Two of the top contributors to the project are from Google, while more than 10 engineers from Google have contributed to the API.

This week, the API has graduated from alpha to beta. This marks a significant milestone in the API and reflects new-found stability. There are now over a dozen implementations of the API and many are passing a comprehensive set of conformance tests. This ensures that users will have a consistent experience when using this API, regardless of environment or underlying implementation.

A Simple Example

To highlight some of the new features this API enables, it may help to walk through an example. We’ll start with a Gateway:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: store-xlb
spec:
  gatewayClassName: gke-l7-gxlb
  listeners:
  - name: http
    protocol: HTTP
    port: 80

This Gateway uses the gke-l7-gxlb Gateway Class, which means a new external load balancer will be provisioned to serve this Gateway. Of course, we still need to tell the load balancer where to send traffic. We’ll use an HTTPRoute for this:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: store
spec:
  parentRefs:
  - name: store-xlb
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: store-svc
      port: 3080
      weight: 9
    - name: store-canary-svc
      port: 3080
      weight: 1

This simple HTTPRoute tells the load balancer to route traffic to the “store-svc” or “store-canary-svc” Service on port 3080. We’re using weights to do some basic traffic splitting here: with weights of 9 and 1, approximately 10% of requests will be routed to our canary Service.

Now, imagine that you want to provide a way for users to opt in or out of the canary service. To do that, we’ll add an additional HTTPRoute with some header matching configuration:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: store-canary-option
spec:
  parentRefs:
  - name: store-xlb
  rules:
  - matches:
    - headers:
      - name: env
        value: stable
    backendRefs:
    - name: store-svc
      port: 3080
  - matches:
    - headers:
      - name: env
        value: canary
    backendRefs:
    - name: store-canary-svc
      port: 3080

This HTTPRoute works in conjunction with the first route we created. If a request sets the env header to “stable” or “canary”, it will be routed directly to the preferred backend.

Getting Started

Unlike previous Kubernetes APIs, this one doesn’t require the latest version of Kubernetes. Instead, the API is built on Custom Resource Definitions (CRDs) that can be installed in any Kubernetes cluster, as long as it is version 1.16 or newer (released almost 3 years ago).

To try this API on GKE, refer to the GKE specific installation instructions. Alternatively, if you’d like to use this API with another implementation, refer to the OSS getting started page.

What’s next for Gateway API?

As the core capabilities of Gateway API are stabilizing, new features and concepts are actively being explored. Ideas such as Route Delegation and a new GRPCRoute are deep in the design process. A new service mesh workstream has been established specifically to build consensus among mesh implementations for how this API can be used for service-to-service traffic. As with many open source projects, we’re trying to find the right balance between enabling new use cases and achieving API stability. This API has already accomplished a lot, but we’re most excited about what’s ahead.


By Rob Scott – GKE Networking

Establishing new baselines: Identifying open source work in an unstable world

For more than 17 years, Google's Open Source Programs Office (OSPO) has brought the best of open source to Google and the best of Google to open source. We sponsor, create, and invest in projects and programs that enable everyone to join and contribute to the global open source ecosystem. Because that's who open source is for—everyone.

In 2021, Google supported open source innovation, security, collaboration, and sustainability through our programs and services with $15 million of funding. This includes $7 million in direct funding to open source communities through Google’s OSPO. We also stepped up our investments in the governing organizations of open source ecosystems. We became the first Visionary Sponsor of the Python Software Foundation, supporting a number of key initiatives including funding for the first CPython Developer in Residence. We sustained our Platinum Membership of the OpenJS Foundation to continue supporting the essential work of the foundation, which fosters many critical projects in the JavaScript ecosystem.

In late 2021, we announced the release of Knative 1.0 and submitted Knative to become a Cloud Native Computing Foundation incubation project to enable the next phase of community-driven innovation. (Knative’s application was accepted in March 2022, followed by our submission of Istio to the CNCF.) We also collaborated with Antmicro to develop and release the Rowhammer Tester platform, which provides security professionals a flexible platform for experimenting with new types of attacks to find better mitigation techniques.

We sustain and expand the open source ecosystem through programs at scale

In addition to our new investments, we continued to maintain our long-term programs elevating open source contributors, both new and existing.
  • The Google Open Source Live virtual events program, initiated in 2020, hosted 11 events, expanding to cover more projects and technology areas.
  • Google Summer of Code, in its 17th year, matched 1,205 students from 67 countries with more than 2,100 mentors from 75 countries across 199 open source organizations to successfully complete this year’s program. At the end of this round, we announced our plans to broaden the scope of the program in 2022: expanding eligibility for who can participate, offering more options for project contribution, and increasing flexibility in the program’s timing.
  • Season of Docs, in its third year, had 30 open source organizations finish their projects with 93% of organizations and 96% of the technical writers reporting a positive experience with the program.
  • Through our Open Source Peer Bonus Program, we were able to award peer bonuses to 175 contributors working in 35 countries on 86 unique projects, including contributors to the CHAOSS, CocoaPods, git and Open Civic Data projects.
In 2021, Google's open source programs directly supported more than 120 events to safely bring together more than 88,000 attendees from open source communities. We proudly sponsored All Things Open 2021, which had nearly 5,000 registered participants from 86 countries. We were also a Cornerstone Sponsor for FOSDEM 2021's two-day virtual gathering.

Our contributing population continues to scale with the growth of Alphabet

In 2021, roughly 10% of Alphabet's full-time workforce (FTEs) actively contributed code or code-adjacent work to open source projects. This percentage has remained roughly consistent over the last five years, indicating that our open source contribution has continued to scale with the growth of Alphabet. Note that in 2021, FTEs represented over 95% of our open source contributors, while the remainder includes vendors, independent contractors, temporary staff, and interns that have contributed to open source projects during their tenure at Alphabet.

After 2020’s atypical growth—which was largely due to novel programs such as our open source internships and COVID-19 response work—in 2021, our activity numbers returned to pre-pandemic baselines. Over the last five years, the trajectory of monthly active usage has increased on both GitHub and Git-on-Borg by more than 15% on average each year.

Note that the dip in GitHub data reported in October of 2021 corresponds to an outage on GHarchive resulting in 4+ days of lost data (see figure below). Despite this outage, contribution levels remained fairly consistent month over month: more than 45% of our active contributing population for the year logged an activity on GitHub or Git-on-Borg in an average month.
For more details about the source data and analysis cited in this report, see “About this data” below.

The number of Alphabet open source projects remains steady

In 2021, we estimated that more than 2,000 open source projects that originated from Alphabet teams and employees were still active (not archived). To make this estimate, we chose a broad and variable definition of an open source project, including developer tools, utilities, languages, frameworks, libraries, demos, sample code, models, raw data, designs, and more. This estimate also includes personal projects that went through Alphabet's releasing process but not projects that have been moved to or originated under external organizations or foundations.

Project counts should not be confused with repositories; projects can include many repositories. Within Alphabet, we maintain more than 9,500 public repositories on GitHub and 1,700 public repositories on Git-on-Borg. While these efforts originated at Google and Alphabet, these repositories are open for anyone to use, contribute to, fork, or build on through open source licenses. In 2021, more than 500,000 unique GitHub accounts not affiliated with Alphabet employees contributed to Alphabet projects.

The majority of repositories we work on are outside of Alphabet organizations

Open source contributors at Alphabet work on a variety of projects and repositories—not just our own code. In 2021, contributors at Alphabet engaged with more than 70,000 repositories on GitHub (WatchEvents or “stars” were removed from this count to represent active engagement), pushing commits and/or opening pull requests on over 49,000 repositories. Consistent with our 2019 and 2020 reports, more than 75% of repositories with pull requests opened by Alphabet contributors on GitHub were outside of Google-managed organizations.

We continue to invest in open source quality, security, and long term viability

Alphabet continues to rely on the health and availability of open source projects. Through internal efforts and collaboration with industry-led efforts such as OpenSSF, Alphabet is committed to sharing relevant practices and tooling with the goal of improving overall code quality and bolstering the security posture of users and developers of open source software. We hope that by sharing our internal frameworks and best practices, we can spark industry-wide discussion and progress on the security and sustainability of the open source ecosystem.

In 2021, we launched Open Source Insights, a tool designed to list and visualize a project’s dependencies and their properties, helping developers review the packages that make up their software supply chains. The potential of this tool was put to the test after the disclosure of the Log4j vulnerabilities in December 2021. The Open Source Insights team reported that more than 35,000 Java packages, amounting to over 8% of the Maven Central repository, were impacted by the vulnerability. As part of their investigation, they compiled a list of 500 affected packages with some of the highest transitive usage to guide users and maintainers who were supporting patching and remediation activities.

In 2021, we also announced our sponsorship of the Secure Open Source (SOS) Rewards pilot program run by the Linux Foundation. This program financially rewards developers for enhancing the security of critical open source projects that we all depend on.

These efforts build on our existing open source security work, such as OSS-Fuzz, which was used by over 500 critical open source projects in 2021 and has helped find more than 7,000 vulnerabilities to date.

Our open source work will continue to grow and evolve to support the changing needs of our communities. Thank you to our colleagues and community members who continue to dedicate their personal and professional time to supporting the open source ecosystem. Follow our work at opensource.google.

By Sophia Vargas – Research Analyst and Amanda Casari, Open Source Researcher – Google Open Source Programs Office

About this data: 

This report features metrics provided by many teams and programs across Alphabet. In regards to the data centered on code and code adjacent activities, we wanted to share more details about the derivation of those metrics:
  • Data source: These data represent activities on repositories hosted on GitHub and our internal production Git service git-on-borg. These sources represent a subset of open source activity currently tracked by our OSPO.
    • GitHub: We continue to use GitHub Archive as the primary source for GitHub data, which is available as a public dataset on BigQuery. Alphabet activity within GitHub is identified by self-registered accounts, which we estimate underreports actual activity.
    • git-on-borg: This is our primary platform for internal projects and some of our larger, long running public projects like Android and Chromium. While we continue to develop on this platform, most of our open source activity has moved to GitHub to increase exposure and encourage community growth.
    • Distinct event types: Note that git-on-borg and GitHub APIs produce distinct sets of events—as such we will report activity metrics per platform. Where GitHub Event logs capture a wide range of activity from code creation and review to issue creation and comments, the Gerrit Event stream (used by git-on-borg) only captures code changes and reviews.
  • Driven by humans: We have created many automated bots and systems that can propose changes on various hosting platforms. We have intentionally filtered these data to focus on human-initiated activities. For our estimation of bots, we married our own records with a public list maintained by the devstats project.
  • Business and personal: Activity on GitHub reflects a mixture of Alphabet projects, third party projects, experimental efforts, and personal projects. Our metrics report on all of the above unless otherwise specified.
  • Alphabet contributors: Please note that unless additional detail is specified, activity counts attributed to Alphabet open source contributors will include our full-time employees as well as our extended Alphabet community (temps, vendors, contractors, and interns).
  • GitHub Accounts: For counts of GitHub accounts not affiliated with Alphabet, we cannot assume that one account translates to one person, as multiple accounts could be tied to one individual, and some accounts may be bots.
  • Active counts: Where possible, we will show ‘active users’ defined by logged activity within a specified timeframe (i.e. in month, year, etc) and ‘active repositories’ and ‘active projects’ as those that have not been archived.
  • Activity types: This year we explore GitHub activity types in more detail. Note that in some cases we have removed “Watch Events” or articulated this as passive engagement. Additionally, GitHub added an event type “PullRequestReviewEvent” that started logging activity in August 2020, but we chose to remove this from our charts and aggregate counts as it invalidates year over year comparisons.

Dart and Flutter enable Allstar and Security Scorecards

We are thrilled to announce that Dart and Flutter have enabled Allstar and Security Scorecards on their open source repositories. This achievement marks the first milestone in our journey towards SLSA compliance, securing builds and releases from supply chain attacks.

Allstar is a GitHub app that provides automated continuous enforcement of security checks such as the OpenSSF Security Scorecards. With Allstar, owners can check for security policy adherence, set desired enforcement actions, and continuously implement those enforcement actions when triggered by a setting or file change in the org or repo.

Security Scorecards is an automated tool that assesses several key heuristics ("checks") associated with software security and assigns each check a score of 0-10. These scores can be used to evaluate the security posture of the project and help assess the risks introduced by dependencies.

Scorecards have been enabled on the following open source repositories, prioritized by their criticality score.

Org     | Repos
--------|--------------------------------------
Flutter | github.com/flutter/flutter
        | github.com/flutter/engine
        | github.com/flutter/plugins
        | github.com/flutter/packages
        | github.com/flutter/samples
        | github.com/flutter/website
        | github.com/flutter/flutter-intellij
        | github.com/flutter/gallery
        | github.com/flutter/codelabs
Dart    | github.com/dart-lang/linter
        | github.com/dart-lang/sdk
        | github.com/dart-lang/dartdoc
        | github.com/dart-lang/site-www
        | github.com/dart-lang/test

With these security scanning tools, the Dart and Flutter teams have found and resolved more than 200 high- and medium-severity security findings. The issues fall into the following categories:
  • Pinned Dependencies: The project should pin its dependencies. A "pinned dependency" is a dependency that is explicitly set to a specific hash instead of allowing a mutable version or range of versions. This reduces several security risks related to dependencies.
  • Token Permissions: The project's automated workflow tokens should be set to read-only by default. This follows the principle of least privilege.
  • Branch Protection: The project's default and release branches should be protected with GitHub's branch protection settings. Branch protection allows maintainers to define rules that enforce certain workflows for branches, such as requiring review or passing certain status checks.
  • Code Review: The project should enforce a code review before pull requests (merge requests) are merged.
  • Dependency update tool: A dependency update tool should be used by the project to identify and update outdated and insecure dependencies.
  • Binary-Artifacts: The project should not have generated executable (binary) artifacts in the source repository. Embedded binary artifacts cannot be reviewed, which can allow obsolete or maliciously subverted executables into the source code.
Additionally, the Dart and Flutter teams have an aligned vulnerability management process. Details of these processes can be found on our respective developer sites at https://dart.dev/security and https://docs.flutter.dev/security. The internal process the teams use to handle vulnerabilities can be found on the Flutter GitHub wiki.

Learnings and Best Practices

  1. Allstar and Scorecards allowed Dart and Flutter to quickly identify areas of opportunity to improve security across hundreds of repositories, prompting the removal of binaries, the standardization of branch protection, and the enforcement of code reviews.
  2. Standardizing third-party dependency management and running vulnerability scanning were identified as the next milestones in the Dart and Flutter journey to improve their overall security posture.
  3. It is safer to not embed binary artifacts in your code. However, there are cases when this is unavoidable.
  4. It is important to track your dependencies and to keep them up to date using tools like Dependabot.

By Khyati Mehta, Technical Program Manager – Dart-Flutter