Tag Archives: Open source

Google Summer of Code 2023 contributor applications open!

Contributor applications for Google Summer of Code (GSoC) 2023 are now open! Students and open source beginners 18 years and older are welcome to apply during the registration period, which opened March 20th at 18:00 UTC and closes April 4th at 18:00 UTC.

Google Summer of Code is a global online program focused on bringing new contributors into open source software development. GSoC Contributors work with an open source organization on a 12+ week programming project under the guidance of mentors. GSoC’s mission is centered around bringing new contributors into open source communities through mentorship and collaboration.

Since 2005, GSoC has welcomed new developers into the open source community every year. The GSoC program has brought together over 19,000 contributors from 112 countries and 18,000 mentors from 800+ open source organizations.

2023 will be the 19th consecutive year hosting Google Summer of Code. We are keeping the big changes we made leading into the 2022 program, with one adjustment around eligibility described below:

  • Increased flexibility in project lengths (10-22 weeks, not a set 12 weeks for everyone).
  • Choice of project time commitment (medium at ~175 hours or large at ~350 hours)
  • For 2023, we are expanding the program to be open to students and beginners in open source software development.

We invite students and beginners in open source to check out Google Summer of Code. Now that applications are open, please keep the following in mind:

Interested contributors may register and submit project proposals on the GSoC site from now until Tuesday, April 4th at 18:00 UTC.

Best of luck to all our applicants!

By Stephanie Taylor, Program Manager, and Perry Burnham, Associate Program Manager for the Google Open Source Programs Office

Getting To SLSA Level 2 with Tekton and Tekton Chains

Overview

As application developers, we achieve amazing results quickly by leveraging a rich ecosystem of freely available libraries, modules and frameworks that provide ready-to-use capabilities and abstract away from underlying complexity. This is so foundational to how we work that we'll nonchalantly build and publish an app that pulls in hundreds of dependencies without even thinking about it. And it's only fairly recently, in the wake of some very high profile and high impact compromises, that we've started to reckon with the fact that this wonderful ecosystem is also a security quagmire. All of the dependencies that feed into your build make up your software supply chain, and supply chains need to be secured. In this post, we'll show how an increasingly popular open source CI/CD system, Tekton, implements the OpenSSF SLSA framework to provide you with supply chain security guarantees.

Software Supply Chain Security

A software supply chain is anything that goes into or affects your code from development, through your CI/CD pipeline, until it gets deployed into production. Increasingly, the software supply chain has become a vector for attacks. The recent Log4j, SolarWinds, Kaseya, and Codecov hacks highlight vulnerable surface areas exposed by an insecure software supply chain.

Between 2020 and 2021, there was a 650% surge in OSS supply chain attacks, and Gartner projects that 45% of organizations worldwide will have experienced software supply chain attacks by 2025.

Supply-chain Levels for Software Artifacts (SLSA)

The Supply-chain Levels for Software Artifacts (SLSA) framework is a checklist of controls to prevent tampering, improve integrity, and increase security in the packages and infrastructure used by projects, businesses, or enterprises. SLSA formalizes criteria around software supply chain integrity to help the industry and open source ecosystem secure the software development life cycle at all stages.

As part of the framework, SLSA defines multiple levels of assurance. These levels codify industry-recognized best practices into four levels of increasing assurance.

Supply-Chain Levels for Software Artifacts (SLSA) Levels 1 through 4

SLSA provides a set of requirements that need to be met for an artifact to be considered for a particular SLSA level.

Tekton + Tekton Chains

Tekton is a powerful and flexible open source, cloud-native framework for creating CI/CD systems, allowing developers to build, test, and deploy across cloud providers and on-premises systems. Tekton consists of several subprojects which are relevant to SLSA:

  • Pipelines: A system that allows one to define a pipeline of CI/CD tasks and have it be orchestrated by the Tekton controller.
  • Chains: A standalone system which observes Pipelines and generates provenance for the artifacts built by Pipelines.

Tekton build processes are defined as tasks and pipelines. A Task is a collection of Steps that are defined and arranged in a specific order of execution as part of a continuous integration flow.

A Pipeline is a collection of Tasks defined and arranged in a specific order of execution as part of a continuous integration flow.

A TaskRun is an instantiation of a Task with specific inputs, outputs and execution parameters while a PipelineRun is an instantiation of a Pipeline.

A Task/Pipeline can define a set of Results. TaskRuns and PipelineRuns create Results as defined in the Task/Pipeline. Results are used to communicate run specifics, such as the URI and the digest of the built artifact, to Tekton Chains.

Tasks, Pipelines, TaskRuns and PipelineRuns are defined through yaml files. The entire build is defined by the set of yaml files which define Tekton Tasks, Pipelines, TaskRuns and PipelineRuns. These yaml files can be checked in as code and run directly from the code repository.

Getting to SLSA L1: Automation + Provenance

For an artifact to be SLSA L1 compliant it should satisfy the following:

  1. Scripted build: All build steps are fully defined in some sort of “build script”. The only manual command, if any, is to invoke the build script.
  2. Provenance: The provenance is available to the consumer in a format that the consumer accepts. The format SHOULD be in-toto SLSA Provenance, but another format MAY be used if both producer and consumer agree and it meets all the other requirements.

Tekton Tasks, TaskRuns, Pipelines and PipelineRuns are specified in yaml files. These yaml files can be considered scripts and can even be checked into a code repository. They can also be run directly from code repositories. Tekton Chains provides a way to generate provenance in the in-toto SLSA format. As such, Tekton can easily produce builds which satisfy the SLSA L1 requirements.

Let's follow through with an example, which has the following files:

  • setup.sh: Sets up Google cloud to run an instance of the build specified in pipeline_run.yaml. It also installs Tekton Pipeline and Tekton Chains. In the production environment, this would be run once to set up the environment and all builds would use the same environment.
  • pipeline_run.yaml: This file is the actual build file that is run by Tekton Pipelines. The build here first clones a GitHub repo, builds the container specified in the source, and uploads it to a Docker repository.
A workflow diagram depicting how Tekton can be used to achieve SLSA L2 requirements
The build script, pipeline.yaml, is the definition of the build, while pipeline_run.yaml defines an instance of the build and provides instance-specific parameters. Though both pipeline_run.yaml and pipeline.yaml are in source control for this example, the build definition lives in pipeline.yaml, so having pipeline.yaml in source control satisfies the requirement for a source-controlled build script.

kubectl create -f https://raw.githubusercontent.com/google/tekton-slsa-demo/main/pipeline_run.yaml

Tekton Chains for Provenance Generation

Provenance is metadata about how an artifact was built, including the build process, top-level source, and dependencies. Knowing the provenance allows software consumers to make risk-based security decisions.

Tekton Chains observes TaskRuns and PipelineRuns in a Kubernetes cluster. Once the runs are done, Chains collects information (provenance) about the Run or the build process and the artifact created by the Run. It signs the provenance and stores the signed provenance. The provenance generated for the example build complies with the SLSA provenance schema and is explained further below.

Note that every step of the build has been recorded and can be reconstructed by following the steps in the provenance.

Next Steps: CI/CD @ SLSA L2

SLSA requires that for a build to be SLSA L2 compliant it should satisfy the following:

  1. Every change to the source is tracked in a version control system
  2. All build steps were fully defined in some sort of “build script”. The only manual command, if any, was to invoke the build script.
  3. All build steps ran using some build service, not on a developer’s workstation.
  4. The provenance is available to the consumer in a format that the consumer accepts. The format SHOULD be in-toto SLSA Provenance, but another format MAY be used if both producer and consumer agree and it meets all the other requirements.
  5. The provenance’s authenticity and integrity can be verified by the consumer. This SHOULD be through a digital signature from a private key accessible only to the service generating the provenance.
  6. The data in the provenance MUST be obtained from the build service (either because the generator is the build service or because the provenance generator reads the data directly from the build service). Regular users of the service MUST NOT be able to inject or alter the contents.

Every change to the source is tracked in a version control system

Tekton does not explicitly enforce that the source is version controlled. Tekton users can enforce that the source is version controlled by writing an appropriate Task which will check for version control. The source should also be communicated by Tekton Pipelines to Tekton Chains through a result variable that is suffixed with -ARTIFACT_INPUTS.

All build steps were fully defined in some sort of “build script”. The only manual command, if any, was to invoke the build script.

This is a requirement for SLSA L1 as well and as explained above, Tekton provides a way to script the build through yaml files. The build is defined as a Pipeline (or Task) which can be saved as a yaml file and submitted into source control. The build instance which is defined as a PipelineRun (or TaskRun) can resolve the Pipeline (or Task) yaml from source control and use it for the current instance of the build.

All build steps ran using some build service, not on a developer’s workstation.

Tekton can be hosted on a cloud provider or on a hosted Kubernetes cluster and run as a build service. The build scripts can be submitted into source control (like GitHub) and Tekton can read the scripts directly from source control.

Provenance should be available

This is a requirement for SLSA L1 and as explained above Tekton Chains provides build provenance.

Provenance should be signed and authenticated

As can be seen in the example, Tekton Chains creates and signs the build provenance. The signature can be verified at any time to ensure that the provenance has not been tampered with after the build and that the provenance was really created by the build process that claims to have built it. The signing is done according to the SLSA specification using the DSSE format.

Tekton Chains creates the provenance and signs it using a secure private key. Chains then uploads the signed provenance to a user-specified location, one of which is Google Cloud’s Container Analysis, which implements the open standard Grafeas API for storing provenance.

An annotated block of code depicting how Tekton Chains creates provenance and signs it

Provenance should be generated by a Service

Note that the provenance in the example is generated by the Tekton Chains service and it cannot be modified after it has been generated, which is guaranteed by the signature.

SLSA also sets requirements on the contents of the provenance for the build to be considered L2.

All images below are extracted from the provenance of the example build. These can be verified by re-running the example.

1. Identifies artifact: The provenance MUST identify the output artifact via at least one cryptographic hash. The subject field in the SLSA provenance captures the location of the built artifact and the cryptographic hash associated with it. To be able to capture the artifact, Tekton Pipelines should populate the result variable -ARTIFACT_OUTPUTS with the location and the digest of the artifact.
A block of code extracted from the provenance of the example build
2. Identifies builder: The provenance identifies the entity that performed the build and generated the provenance. The builder.id field captures the builder that built the artifact.
A block of code extracted from the provenance of the example build
3. Identifies build instructions: The provenance identifies the top-level instructions used to execute the build. In our example, the build script is in source control. Recording the repo, the path in the repo and the commit hash will uniquely identify the build instructions used to build the artifact.
A block of code extracted from the provenance of the example build
4. Identifies source code: The provenance identifies the repository origin(s) for the source code used in the build. The materials field records all the dependencies used to build the artifact, one of which is the source code. In the example the source used is in a GitHub repo, and as such the repo name and the commit hash will uniquely identify the source code.
A block of code extracted from the provenance of the example build
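
To make these requirements concrete, here is a minimal sketch of how a consumer might read those fields out of the provenance, assuming the in-toto statement has already been decoded to JSON. The file name is hypothetical and the field names follow the SLSA v0.2 provenance schema rather than the exact output of the example repository, so adjust it to the provenance your own run produces.

import json

# Hypothetical file holding the decoded in-toto statement produced by the build.
with open("provenance.json") as f:
    statement = json.load(f)

# 1. Identifies artifact: each subject carries at least one cryptographic digest.
for subject in statement["subject"]:
    print("artifact:", subject["name"], subject["digest"])

predicate = statement["predicate"]

# 2. Identifies builder.
print("builder:", predicate["builder"]["id"])

# 3. Identifies build instructions: repo, path, and commit of the build script.
print("build config source:", predicate.get("invocation", {}).get("configSource", {}))

# 4. Identifies source code and other dependencies.
for material in predicate.get("materials", []):
    print("material:", material["uri"], material.get("digest", {}))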

Conclusion

SLSA aims to secure the software supply chain by providing guidelines on how the software build should be done. Tekton Pipelines and Tekton Chains implement those guidelines and help secure the software supply chain.

By Prakash Jagatheesan (team: TektonCD), Brandon Lum (team: GOSST)

OpenXLA is available now to accelerate and simplify machine learning

ML development and deployment today suffer from fragmented and siloed infrastructure that can differ by framework, hardware, and use case. Such fragmentation restrains developer velocity and imposes barriers to model portability, efficiency, and productionization. 

Today, we’re taking a significant step towards eliminating these barriers by making the OpenXLA Project, including the XLA, StableHLO, and IREE repositories, available for use and contribution.

OpenXLA is an open source ML compiler ecosystem co-developed by AI/ML industry leaders including Alibaba, Amazon Web Services, AMD, Apple, Arm, Cerebras, Google, Graphcore, Hugging Face, Intel, Meta, and NVIDIA. It enables developers to compile and optimize models from all leading ML frameworks for efficient training and serving on a wide variety of hardware. Developers using OpenXLA will see significant improvements in training time, throughput, serving latency, and, ultimately, time-to-market and compute costs.

Start accelerating your workloads with OpenXLA on GitHub.

The Challenges with ML Infrastructure Today

Development teams across numerous industries are using ML to tackle complex real-world challenges, such as prediction and prevention of disease, personalized learning experiences, and black hole physics.

As model parameter counts grow exponentially and compute for deep learning models doubles every six months, developers seek maximum performance and utilization of their infrastructure. Teams are leveraging a wider array of hardware from power-efficient ML ASICs in the datacenter to edge processors that can deliver more responsive AI experiences. These hardware devices have bespoke software libraries with unique algorithms and primitives.

However, without a common compiler to bridge these diverse hardware devices to the multiple frameworks in use today (e.g. TensorFlow, PyTorch), significant effort is required to run ML efficiently; developers must manually optimize model operations for each hardware target. This means using bespoke software libraries or writing device-specific code, which requires domain expertise. The result is isolated, non-generalizable paths across frameworks and hardware that are costly to maintain, promote vendor lock-in, and slow progress for ML developers.

Our Solution and Goals

The OpenXLA Project provides a state-of-the-art ML compiler that can scale amidst the complexity of ML infrastructure. Its core pillars are performance, scalability, portability, flexibility, and extensibility for users. With OpenXLA, we aspire to realize the real-world potential of AI by accelerating its development and delivery.

Our goals are to:
  • Make it easy for developers to compile and optimize any model in their preferred framework, for a wide range of hardware through (1) a unified compiler API that any framework can target (2) pluggable device-specific back-ends and optimizations.
  • Deliver industry-leading performance for current and emerging models that (1) scales across multiple hosts and accelerators (2) satisfies the constraints of edge deployments (3) generalizes to novel model architectures of the future.
  • Build a layered and extensible ML compiler platform that provides developers with (1) MLIR-based components that are reconfigurable for their unique use cases (2) plug-in points for hardware-specific customization of the compilation flow.

A Community of AI/ML Leaders

The challenges we face in ML infrastructure today are immense and no single organization can effectively resolve them alone. The OpenXLA community brings together developers and industry leaders operating at different levels of the AI stack, from frameworks to compilers, runtimes, and silicon, and is thus well suited to address the fragmentation we see across the ML landscape.

As an open source project, we’re guided by the following set of principles:
  • Equal footing: Individuals contribute on equal footing regardless of their affiliation. Technical leaders are those who contribute the most time and energy.
  • Culture of respect: All members are expected to uphold project values and code of conduct, regardless of their position in the community.
  • Scalable, efficient governance: Small groups make consensus-based decisions, with clear but rarely-used paths for escalation.
  • Transparency: All decisions and rationale should be legible to the public community.

Performance, Scale, and Portability: Leveraging the OpenXLA Ecosystem

OpenXLA eliminates barriers for ML developers via a modular toolchain that is supported by all leading frameworks through a common compiler interface, leverages standardized model representations that are portable, and provides a domain-specific compiler with powerful target-independent and hardware-specific optimizations. This toolchain includes XLA, StableHLO, and IREE, all of which leverage MLIR: a compiler infrastructure that enables machine learning models to be consistently represented, optimized and executed on hardware.

Flow chart depicting high-level OpenXLA compilation flow and architecture showing depicted optimizations, frameworks and hardware targets
High-level OpenXLA compilation flow and architecture. Depicted optimizations, frameworks and hardware targets represent a select portion of what is available to developers through OpenXLA.

Here are some of the key benefits that OpenXLA provides:

Spectrum of ML Use Cases

Usage of OpenXLA today spans the gamut of ML use cases. This includes full-scale training of models like DeepMind’s AlphaFold, GPT2 and Swin Transformer on Alibaba Cloud, and multi-modal LLMs for Amazon.com. Users like Waymo leverage OpenXLA for on-vehicle, real-time inference. In addition, OpenXLA is being used to optimize serving of Stable Diffusion on AMD RDNA™ 3-equipped local machines.

Optimal Performance, Out of the Box

OpenXLA makes it easy for developers to speed up model performance without needing to write device-specific code. It features whole-model optimizations including simplification of algebraic expressions, optimization of in-memory data layout, and improved scheduling for reduced peak memory use and communication overhead. Advanced operator fusion and kernel generation help improve device utilization and reduce memory bandwidth requirements.

Scale Workloads With Minimal Effort

Developing efficient parallelization algorithms is time-consuming and requires expertise. With features like GSPMD, developers only need to annotate a subset of critical tensors that the compiler can then use to automatically generate a parallelized computation. This removes much of the work required to partition and efficiently parallelize models across multiple hardware hosts and accelerators.
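
As a rough illustration of what this kind of annotation looks like, here is a minimal JAX sketch (not taken from the OpenXLA documentation): only the input batch is sharded, and the compiler propagates the partitioning through the computation. The mesh axis name, shapes, and function are illustrative; on a single-device machine the program still runs, just without actual partitioning.

import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# 1-D device mesh over whatever devices are available (falls back to one CPU device).
mesh = Mesh(np.array(jax.devices()), axis_names=("data",))

@jax.jit
def predict(x, w):
    return jnp.dot(x, w)  # compiled by XLA; collectives are inserted as needed

x = jnp.ones((8, 128))
w = jnp.ones((128, 16))

# Annotate only the critical tensor: shard the batch dimension of x across "data".
x = jax.device_put(x, NamedSharding(mesh, P("data", None)))

print(predict(x, w).shape)  # the rest of the parallelization is derived automatically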

Portability and Optionality

OpenXLA provides out-of-the-box support for a multitude of hardware devices including AMD and NVIDIA GPUs, x86 CPU and Arm architectures, as well as ML accelerators like Google TPUs, AWS Trainium and Inferentia, Graphcore IPUs, Cerebras Wafer-Scale Engine, and many more. OpenXLA additionally supports TensorFlow, PyTorch, and JAX via StableHLO, a portability layer that serves as OpenXLA's input format.

Flexibility

OpenXLA gives users the flexibility to manually tune hotspots in their models. Extension mechanisms such as Custom-call enable users to write deep learning primitives with CUDA, HIP, SYCL, Triton and other kernel languages so they can take full advantage of hardware features.

StableHLO

StableHLO, a portability layer between ML frameworks and ML compilers, is an operation set for high-level operations (HLO) that supports dynamism, quantization, and sparsity. Furthermore, it can be serialized into MLIR bytecode to provide compatibility guarantees. All major ML frameworks (JAX, PyTorch, TensorFlow) can produce StableHLO. Through 2023, we plan to collaborate closely with the PyTorch team to enable an integration with the recent PyTorch 2.0 release.
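
As a small, hedged example of what producing this portable IR from a framework can look like, the JAX sketch below lowers a function without executing it; depending on the JAX version installed, the printed module uses the StableHLO or the older MHLO dialect.

import jax
import jax.numpy as jnp

def f(x):
    return jnp.sin(x) * 2.0

# Lower (but don't run) the function to obtain its portable MLIR representation.
lowered = jax.jit(f).lower(jnp.ones((4,)))
print(lowered.as_text())  # module text that OpenXLA compilers can consume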

We’re excited for developers to get their hands on these features and many more that will significantly accelerate and simplify their ML workflows.

Moving Forward Together

The OpenXLA Project is being built by a collaborative community, and we're excited to help developers extend and use it to address the gaps and opportunities we see in the ML industry today. Get started with OpenXLA today on GitHub and sign up for our mailing list here for product and community announcements. You can follow us on Twitter: @OpenXLA

Member Quotes

Here’s what our collaborators are saying about OpenXLA:

Alibaba

“At Alibaba, OpenXLA is leveraged by Elastic GPU Service customers for training and serving of large PyTorch models. We’ve seen significant performance improvements for customers using OpenXLA, notably speed-ups of 72% for GPT2 and 88% for Swin Transformer on NVIDIA GPUs. We're proud to be a founding member of the OpenXLA Project and work with the open-source community to develop an advanced ML compiler that delivers superior performance and user experience for Alibaba Cloud customers.” – Yangqing Jia, VP, AI and Data Analytics, Alibaba

AWS

“We're excited to be a founding member of the OpenXLA Project, which will democratize access to performant, scalable, and extensible AI infrastructure as well as further collaboration within the open source community to drive innovation. At AWS, our customers scale their generative AI applications on AWS Trainium and Inferentia and our Neuron SDK relies on XLA to optimize ML models for high performance and best in class performance per watt. With a robust OpenXLA ecosystem, developers can continue innovating and delivering great performance with a sustainable ML infrastructure, and know that their code is portable to use on their choice of hardware.” – Nafea Bshara, Vice President and Distinguished Engineer, AWS

AMD

“We are excited about the future direction of OpenXLA on the broad family of AMD devices (CPUs, GPUs, AIE) and are proud to be part of this community. We value projects with open governance, flexible and broad applicability, cutting edge features and top-notch performance and are looking forward to the continued collaboration to expand open source ecosystem for ML developers.”  – Alan Lee, Corporate Vice President, Software Development, AMD

Arm

“The OpenXLA Project marks an important milestone on the path to simplifying ML software development. We are fully supportive of the OpenXLA mission and look forward to leveraging the OpenXLA stability and standardization across the Arm® Neoverse™ hardware and software roadmaps.” – Peter Greenhalgh, vice president of technology and fellow, Arm.

Cerebras

“At Cerebras, we build AI accelerators that are designed to make training even the largest AI models quick and easy. Our systems and software meet users where they are -- enabling rapid development, scaling, and iteration using standard ML frameworks without change. OpenXLA helps extend our user reach and accelerated time to solution by providing the Cerebras Wafer-Scale Engine with a common interface to higher level ML frameworks. We are tremendously excited to see the OpenXLA ecosystem available for even broader community engagement, contribution, and use on GitHub.” – Andy Hock, VP and Head of Product, Cerebras Systems

Google

“Open-source software gives everyone the opportunity to help create breakthroughs in AI. At Google, we’re collaborating on the OpenXLA Project to further our commitment to open source and foster adoption of AI tooling that raises the standard for ML performance, addresses incompatibilities between frameworks and hardware, and is reconfigurable to address developers’ tailored use cases. We’re excited to develop these tools with the OpenXLA community so that developers can drive advancements across many different layers of the AI stack.” – Jeff Dean, Senior Fellow and SVP, Google Research and AI

Graphcore

“Our IPU compiler pipeline has used XLA since it was made public. Thanks to XLA's platform independence and stability, it provides an ideal frontend for bringing up novel silicon. XLA’s flexibility has allowed us to expose our IPU’s novel hardware features and achieve state of the art performance with multiple frameworks. Millions of queries a day are served by systems running code compiled by XLA. We are excited by the direction of OpenXLA and hope to continue contributing to the open source project. We believe that it will form a core component in the future of AI/ML.” – David Norman, Director of Software Design, Graphcore

Hugging Face

“Making it easy to run any model efficiently on any hardware is a deep technical challenge, and an important goal for our mission to democratize good machine learning. At Hugging Face, we enabled XLA for TensorFlow text generation models and achieved speed-ups of ~100x. Moreover, we collaborate closely with engineering teams at Intel, AWS, Habana, Graphcore, AMD, Qualcomm and Google, building open source bridges between frameworks and each silicon, to offer out of the box efficiency to end users through our Optimum library. OpenXLA promises standardized building blocks upon which we can build much needed interoperability, and we can't wait to follow and contribute!” – Morgan Funtowicz, Head of Machine Learning Optimization, Hugging Face

Intel

“At Intel, we believe in open, democratized access to AI. Intel CPUs, GPUs, Habana Gaudi accelerators, and oneAPI-powered AI software including OpenVINO, drive ML workloads everywhere from exascale supercomputers to major cloud deployments. Together with other OpenXLA members, we seek to support standards-based, componentized ML compiler tools that drive innovation across multiple frameworks and hardware environments to accelerate world-changing science and research.” – Greg Lavender, Intel SVP, CTO & GM of Software & Advanced Technology Group

Meta

“In research, at Meta AI, we have been using XLA, a core technology of the OpenXLA project, to enable PyTorch models for Cloud TPUs and were able to achieve significant performance improvements on important projects. We believe that open source accelerates the pace of innovation in the world, and are excited to be a part of the OpenXLA Project.” – Soumith Chintala, Lead Maintainer, PyTorch

NVIDIA

“As a founding member of the OpenXLA Project, NVIDIA is looking forward to collaborating on AI/ML advancements with the OpenXLA community and are positive that with wider engagement and adoption of OpenXLA, ML developers will be empowered with state-of-the-art AI infrastructure.” – Roger Bringmann, VP, Compiler Software, NVIDIA.

Acknowledgements

Abhishek Ratna, Allen Hutchison, Aman Verma, Amber Huffman, Andrew Leaver, Ashok Bhat, Chalana Bezawada, Chandan Damannagari, Chris Leary, Christian Sigg, Cormac Brick, David Dunleavy, David Huntsperger, David Majnemer, Elisa Garcia Anzano, Elizabeth Howard, Eugene Burmako, Gadi Hutt, Geeta Chauhan, Geoffrey Martin-Noble, George Karpenkov, Ian Chan, Jacinda Mein, Jacques Pienaar, Jake Hall, Jake Harmon, Jason Furmanek, Julian Walker, Kulin Seth, Kanglan Tang, Kuy Mainwaring, Magnus Hyttsten, Mahesh Balasubramanian, Mehdi Amini, Michael Hudgins, Milad Mohammadi, Navid Khajouei, Paul Baumstarck, Peter Hawkins, Puneith Kaul, Rich Heaton, Robert Hundt, Roman Dzhabarov, Rostam Dinyari, Scott Kulchycki, Scott Main, Scott Todd, Shantu Roy, Shauheen Zahirazami, Stella Laurenzo, Stephan Herhut, Thea Lamkin, Tomás Longeri, Tres Popp, Vartika Singh, Vinod Grover, Will Constable, and Zac Mustin.

By James Rubin, Product Manager, Machine Learning

Google Dev Library Letters: 19th Edition

Posted by the Dev Library team

In this newsletter, we’re highlighting the best projects developed with Google technologies that have been contributed to the Google Dev Library platform. We hope this will spark some inspiration for your next project!


Contributions of the Month


[ML] Serving Stable Diffusion by Chansung Park

Learn the various ways to deploy Stable Diffusion with TensorFlow Serving, Hugging Face Endpoint, and FastAPI.


[ML] Textual inversion pipeline for Stable Diffusion by Chansung Park

Dive into this repository which demonstrates how to manage multiple models and their prototype applications of fine-tuned Stable Diffusion on new concepts by Textual Inversion.

Read more on DevLibrary 


[Flutter] Animated soccer rating hexagon by Prateek Sharma

Create a hexagon widget in Flutter that displays the ratings of a soccer player or team. Each of the six sides represents a different aspect of the player or team's rating, such as speed, strength, and accuracy.

Read more on DevLibrary 


Android & Kotlin


Mastering Kotlin Coroutines by Amit Shekhar

Dive into an introduction to coroutines in the Kotlin programming language. Coroutines are a way to write asynchronous and non-blocking code in a sequential and easy-to-understand manner.

Kotlin Symbol Processing (KSP) for code generation by Tim Lin

Discover more about the KSP API, which you can use to develop lightweight compiler plugins that get complete source code information at compile time.

Form Conductor by Naing Aung Luu

Learn about Form Conductor. More than form validation, it provides a handful of reusable APIs to construct a form in simple, easy steps.

MovieDB by Gabriel Bronzatti Moro

Discover how this Android project fetches data from the Movie DB API, allowing users to search for movies, view details, and store them in a local database.


Angular


A complete guide to Angular Multilingual Application by Hossein Mousavi

Dive into the technical aspects of building a multilingual Angular application, starting with the localization of the application's text.


Flutter


Bank cards UI by Ethiel Adiassa

See how Flutter can be used to create aesthetically pleasing and functional UI designs for banking applications.

macOS UI by Reuben Turner

Dive into this repo, a resource for designers and developers looking for beautiful templates and tutorials to create macOS applications and interfaces.


Google Cloud


Search for Brazilian laws using Dialogflow CX and matching engine by Rubens Zimbres

Develop a chatbot using Dialogflow CX and a matching engine to help users search for something specific in legislation.

Awesome CloudOps automation by Doug Sillars

Learn how a single repository could satisfy all your day-to-day CloudOps automation needs.

Serverless Kubernetes on Google Cloud Platform by Gursimar Singh

Learn how serverless technologies like Cloud Run can be used to simplify and expedite the process of designing software applications.

Implement secure CI/CD with Workload Identity Federation, GitLab CI, and Cloud Deploy by Ezekias Bokove

See how to implement a secure Continuous Integration/Continuous Deployment (CI/CD) pipeline using Workload Identity Federation and GitLab CI.


Introducing Service Weaver: A Framework for Writing Distributed Applications

We are excited to introduce Service Weaver, an open source framework for building and deploying distributed applications. Service Weaver allows you to write your application as a modular monolith and deploy it as a set of microservices.

More concretely, Service Weaver consists of two core pieces:

  1. A set of programming libraries, which let you write your application as a single modular binary, using only native data structures and method calls, and
  2. A set of deployers, which let you configure the runtime topology of your application and deploy it as a set of microservices, either locally or on the cloud of your choosing.

Flow chart of Service Weaver Programming Libraries from development to execution, moving four modules labeled A through D from application across a level of microservices to deployers labeled Desktop, Google Cloud, and Other Cloud

By decoupling the process of writing the application from runtime considerations such as how the application is split into microservices, what data serialization formats are used, and how services are discovered, Service Weaver aims to improve distributed application development velocity and performance.

Motivation for Building Service Weaver

While writing microservices-based applications, we found that the overhead of maintaining multiple different microservice binaries—with their own configuration files, network endpoints, and serializable data formats—significantly slowed our development velocity.

More importantly, microservices severely impacted our ability to make cross-binary changes. It made us do things like flag-gate new features in each binary, evolve our data formats carefully, and maintain intimate knowledge of our rollout processes. Finally, having a predetermined number of specific microservices effectively froze our APIs; they became so difficult to change that it was easier to squeeze all of our changes into the existing APIs rather than evolve them.

As a result, we wished we had a single monolithic binary to work with. Monolithic binaries are easy to write: they use only language-native types and method calls. They are also easy to update: just edit the source code and re-deploy. They are easy to run locally or in a VM: simply execute the binary.

Service Weaver is a framework that offers the best of both worlds: the development velocity of a monolith, with the scalability, security, and fault-tolerance of microservices.

Service Weaver Overview

The core idea of Service Weaver is its modular monolith model. You write a single binary, using only language-native data structures and method calls. You organize your binary as a set of modules, called components, which are native types in the programming language. For example, here is a simple application written in Go using Service Weaver. It consists of a main() function and a single Adder component:
package main

import (
    "context"
    "fmt"
    "log"

    "github.com/ServiceWeaver/weaver"
)

// Adder is a component: an interface plus a struct that implements it.
type Adder interface {
    Add(context.Context, int, int) (int, error)
}

type adder struct {
    weaver.Implements[Adder]
}

func (adder) Add(_ context.Context, x, y int) (int, error) {
    return x + y, nil
}

func main() {
    ctx := context.Background()
    root := weaver.Init(ctx)

    // Get a client to the Adder component; the call below is a local method
    // call or a cross-machine RPC depending on how the app is deployed.
    adder, err := weaver.Get[Adder](root)
    if err != nil {
        log.Fatal(err)
    }
    sum, err := adder.Add(ctx, 1, 2)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(sum)
}
When running the above application, you can make a trivial configuration choice of whether to place the Adder component together with the main() function or to place it separately. When the Adder component is separate, the Service Weaver framework automatically translates the Add call into a cross-machine RPC; otherwise, the Add call remains a local method call.

To make a change to the above application, such as adding an unbounded number of arguments to the Add method, all you have to do is change the signature of Add, change its call-sites, and re-deploy your application. Service Weaver makes sure that the new version of main() communicates only with the new version of Adder, regardless of whether they are co-located or not. This behavior, combined with using language-native data structures and method calls, allows you to focus exclusively on writing your application logic, without worrying about the deployment topology and inter-service communication (e.g., there are no protos, stubs, or RPC channels in the code).

When it is time to run your application, Service Weaver allows you to run it anywhere—on your local desktop environment or on your local rack of machines or in the cloud—without any changes to your application code. This level of portability is achieved by a clear separation of concerns built into the Service Weaver framework. On one end, we have the programming framework, used for application development. On the other end, we have various deployer implementations, one per deployment environment.
Flow chart depicting Service Weaver Libraries deployer implementations across three separate platforms in one single iteration

This separation of concerns allows you to run your application locally in a single process via go run .; or run it on Google Cloud via weaver gke deploy; or enable and run it on other platforms. In all of these cases, you get the same application behavior without the need to modify or re-compile your application.

What’s in Service Weaver v0.1?

The v0.1 release of Service Weaver includes:

  • The core Go libraries used for writing your applications.
  • A number of deployers used for running your applications locally or on GKE.
  • A set of APIs that allow you to write your own deployers for any other platform.

All of the libraries are released under the Apache 2.0 license. Please be aware that we are likely to introduce breaking changes until version v1.0 is released.

Get Started and Get Involved

While Service Weaver is still in an early development stage, we would like to invite you to use it and share your feedback, thoughts, and contributions.

The easiest way to get started using Service Weaver is to follow the Step-By-Step instructions on our website. If you would like to contribute, please follow our contributor guidelines. To post a question or contact the team directly, use the Service Weaver mailing list.

The team is excited to host a Twitter Space with Kelsey Hightower on March 2nd, at 10am PST. Keep an eye out on the Service Weaver blog for the latest news, updates, and details on future events.

More Resources

  • Visit us at serviceweaver.dev to get the latest information about the project, such as getting started, tutorials, and blog posts.
  • Access one of our Service Weaver repositories on GitHub.

By Srdjan Petrovic and Garv Sawhney, on behalf of the Service Weaver team

Mentor organizations announced for Google Summer of Code 2023!

After careful review, we are pleased to announce that 172 open source projects have been selected for Google Summer of Code (GSoC) 2023! This year we are excited to welcome 18 new organizations for their first year as part of the program.

Please see our program site to view the complete list of GSoC 2023 accepted mentoring organizations. We invite you to learn more about each organization on their GSoC program page, which includes reading through the project ideas that they are looking for GSoC contributors to work on this year.

Are you interested in being a GSoC Contributor?

The 2023 GSoC program is open to students and to beginners in open source software development. Contributor applications will open on Monday, March 20, 2023 at 18:00 UTC with a deadline of Tuesday, April 4, 2023 18:00 UTC to submit your application (including your project proposal).

The most successful applications come from contributors who start preparing now. We can’t say this enough—if you want to significantly increase your chances of being selected as a 2023 GSoC contributor, we recommend you prepare and communicate early. Below are some tips for prospective GSoC contributors to accomplish before the application period begins March 20th:

  • Watch our new ‘Introduction to GSoC’ video to see a quick overview of the program, and view our Community Talks or Org Highlight Videos to get inspired and learn more about some projects that contributors have worked on in the past.
  • Check out the Contributor Guide and Advice for Applying to GSoC doc.
  • Review the list of accepted organizations here. We recommend finding two to four that interest you and reading through their project ideas lists.
  • As soon as you see an idea that sparks your interest, reach out to the organization via their preferred communication methods (listed on their org page on the GSoC program site). The earlier you start the conversation, the better your chances of being accepted as a GSoC contributor.
  • Talk with the mentors and community to determine if this project idea is something you would enjoy working on during the program. Find a project that excites you, otherwise it may be a challenging summer for you and your mentor.
  • Use the information you received during your communications with the mentors and other org community members to write up your proposal.

You can find more information about the program on our website which includes a full timeline of important dates. We also urge anyone interested in applying to read the FAQ and Program Rules and watch some of our other videos with more details about GSoC for contributors and mentors.

A hearty welcome—and thank you—to all of our mentor organizations! We look forward to working with all of you during Google Summer of Code 2023.

Google Dev Library Letters: 18th Edition

Posted by the Dev Library Team

In this newsletter, we’re highlighting the best projects developed with Google technologies that have been contributed to the Google Dev Library platform. We hope this will spark some inspiration for your next project!


Contributions of the month


Moving image showing SSImagePicker in different modes

[Android] SSImagePicker by Simform

See how to use a lightweight and easy-to-use image picker library that has features like cropping, compression and rotation, video, and Live Photos support.



Moving image showing overview of coroutines

[Kotlin] Mastering Coroutines in Kotlin by Reyhaneh Ezatpanah

Dive into a comprehensive overview of coroutines including tips and best practices, along with a detailed explanation of the different types of coroutines available in Kotlin and how to use them effectively.

Read more on DevLibrary


Flow Chart demonstrating Image to Image stable diffusion in Flax

[Machine Learning] Image2Image with Stable Diffusion in Flax by Bachir Chihani

Learn the uses of the Diffusion method, a technique used to improve the stability and performance of image-to-image translation models.

Read more on DevLibrary


Android


Jetpack Compose state, deconstructed by Yves Kalume

Learn how state management in Jetpack Compose is implemented, how it can be used to build a responsive and dynamic UI, and how it compares to other solutions in Android development.


Dynamic environment switching on Android by Ashwini Kumar

Find out how to switch between different environments (such as development, staging, and production) in an Android app.


Migration to Jetpack Compose for a legacy application by Abhishek Saxena

Migrate an existing legacy Android application to Jetpack Compose, a modern UI toolkit for building native Android apps



Machine Learning


Simple diffusion in TensorFlow by Bachir Chihani

Understand the benefits of using TensorFlow for image processing, including the ability to easily parallelize computations and utilize GPUs for faster processing.


Deep dive into stable diffusion by Bachir Chihani

Look into the Flax implementation of the Stable Diffusion model to better understand how it works.


Create-tf-app by Radi Cho

See the tool that allows you to quickly create a TensorFlow application by generating the necessary code and file structure.

 

Angular


NGX-Valdemort by Cédric Exbrayat

Dive into a set of pre-built validation rules and error messages for commonly encountered use cases, making it easy to quickly implement robust form validation for your application.


Passing configuration dynamically from one module to another using ModuleWithProviders by Madhusuthanan B

Learn how to pass configuration data dynamically between modules in an Angular application.


Flutter


Mastering Dart & Flutter DevTools by Ashita Prasad

Look at the first part of the series aimed at helping developers to understand how to use the tools effectively to build applications with Dart and Flutter.


Server-driven UI in Flutter - an experiment on remote widgets by Akshat Vinaybhai Patel

Explore the insights, code snippets, and results of the experiment to better understand the concept of server-driven UI and its potential in Flutter app development.


Flutter Photo Manager by Alex Li

Learn about an easy-to-use API for accessing the device's photo library that performs operations like retrieving images, videos, and albums, as well as deleting, creating, and updating files in the photo library.


Firebase


How to authenticate to Firebase using email and password in Jetpack Compose? By Alex Mamo 

Here’s a simple solution for implementing Firebase Authentication with email and password, using a clean architecture with Jetpack Compose on Android.


Google Cloud


Google Firestore Data Source plugin for Grafana by Prasanna Kumar

Learn how this plugin allows users to perform operations like querying, aggregating, and visualizing data from Firestore, making it a powerful tool for monitoring and analyzing real-time data in a variety of applications. The repository provides the source code for the plugin and documentation on how to install and use it with Grafana.


Cluster cloner by Joshua Fox

See how this project aims to replicate clusters across different cloud environments and examine these varying infrastructure models.


Getting to know Cloud Firestore by Mustapha Adekunle

Learn the basic features and benefits of Cloud Firestore, and how this document database is a scalable and versatile NoSQL cloud database.


Google’s Mandar Chaphalkar has submitted Data Governance with Dataplex

Discover how Dataplex can be used to transform data to meet specific business requirements, and how it can integrate with other Google Cloud services like BigQuery for efficient data storage and analysis.

Supporting DDR4 and DDR5 RDIMMs in open source DRAM security testing framework

In 2021, Google and Antmicro introduced a platform for testing DRAM memory chips against the unfortunate side effect of the physical shrinking of memory chips—the Rowhammer vulnerability. The platform was developed to propose a radical improvement over the “security through obscurity” approach that was predominant in the industry, as both Antmicro and Google believe that the open source approach to mitigating security threats is a way towards accelerating developments in the field.

The framework was originally developed in the context of securing consumer-facing devices, using off-the-shelf Digilent Arty (DDR3, Xilinx Series7 FPGA) and Xilinx ZCU104 (DDR4, Xilinx UltraScale+ FPGA) boards, then followed by a dedicated open hardware board from Antmicro that allowed work on custom LPDDR4 modules. The framework has since helped discover a new attack method named Blacksmith and continues to provide valuable insights into how the security of both edge device and data center memory can be improved.

In constant development since then, the project has welcomed two more major elements to the ecosystem to enable testing of DDR4 Registered Dual In-Line Memory Modules (RDIMMs)—commonly used in data centers—as well as the newer DDR5 standard, and it continues to provide useful data.

Memory testing for data center use cases

To extend the Rowhammer tester support from consumer-facing devices to shared-compute data center infrastructure, Antmicro developed the data center DRAM tester board. We adapted this open source hardware-test platform from the original LPDDR4 board to enable Rowhammer and other memory security experiments with DDR4 RDIMMs using a fully configurable, open source FPGA-based DDR controller.

The data center DRAM tester, a Xilinx Kintex-7 FPGA based board, features:

  • DDR4 RDIMM connector
  • 676-pin FPGA (compared to 484 pins for the LPDDR4 version)
  • RJ45 Gigabit Ethernet
  • Micro-USB console
  • HDMI output connector
  • JTAG programming connector
  • MicroSD card slot
  • 12 MBytes QSPI Flash memory
  • HyperRAM—external DRAM memory that can be used as an FPGA cache
Photo of the Antmicro data center DRAM Xilinx Kintex-7 FPGA based tester board

It’s worth mentioning that the RDIMM DDR4 memory modules (as opposed to the custom LPDDR4 modules designed for the original project) are generic and available off the shelf. This makes it easier for security researchers to get started with data center memory security research compared to edge devices using LPDDR.

The Data Center DRAM Tester board design has now been upgraded into revision 1.2, which brings new features for implementing even more complex DRAM testing scenarios. The 1.2 boards support a Power over Ethernet (PoE) supply option so the board can act as a standalone network device with data exchange and power-cycling done over a single Ethernet cable. This simplifies integration of the board in DRAM testing clusters and custom runners capable of doing hardware-in-the-loop testing.

The new revision of the board will support hot-swapping of the DRAM module under test, which should speed up testing of multiple DRAM modules without the need to power-cycle the tester. Finally, the new revision of the board will include power-measurement circuitry so it will be possible to compare the peak and average power consumption of DRAM while working with different DRAM refresh scenarios.

We are also working on a custom enclosure design suitable for desktop and networked installations.

Extending open source testing to DDR5

With DDR5 quickly becoming the new standard for data center memory, Antmicro and Google’s Platforms teams also set out to develop a platform capable of interfacing with DDR5 memories, again directly from a low-cost FPGA without a dedicated hard block. The resulting DDR5 tester platform follows the structure of the data center DDR4 tester, while expanding the functionality of Serial Presence Detect, which monitors the power supply states and system health, and adjusting the circuitry for a nominal I/O voltage of 1.1 V.

Photo of the Antmicro DDR5 testbed

Data center DRAM testing is part of Google’s and Antmicro’s belief in security through transparency. Both hyperscalers and a growing number of organizations who operate their own data centers increasingly embrace this perspective, and there is great value in providing them with a scalable, customizable, commercially supported open source platform that will help in collaborative research and mitigation of emerging security issues.

Rowhammer attacks, security threats, and countermeasures remain an active research area. Together with Google, Antmicro continues to adjust the Rowhammer test platform to the most recent developments, opening the way for researchers and memory vendors to apply more sophisticated testing methods to state-of-the-art memories used in data centers. This work stems from and complements other open source activities the companies jointly lead as members of RISC-V International and CHIPS Alliance, aimed at making the hardware ecosystem more open, secure, and collaborative. If you’re interested in open source solutions for DRAM security testing and memory controller development, or more broadly, FPGA and ASIC design and verification, don’t hesitate to reach out to Antmicro at [email protected].

By Michael Gielda – Antmicro

Open Source Vizier: Towards reliable and flexible hyperparameter and blackbox optimization

Google Vizier is the de-facto system for blackbox optimization over objective functions and hyperparameters across Google, having serviced some of Google’s largest research efforts and optimized a wide range of products (e.g., Search, Ads, YouTube). For research, it has not only reduced language model latency for users, designed computer architectures, accelerated hardware, assisted protein discovery, and enhanced robotics, but also provided a reliable backend interface for users to search for neural architectures and evolve reinforcement learning algorithms. To operate at the scale of optimizing thousands of users’ critical systems and tuning millions of machine learning models, Google Vizier solved key design challenges in supporting diverse use cases and workflows, while remaining strongly fault-tolerant.

Today we are excited to announce Open Source (OSS) Vizier (with an accompanying systems whitepaper published at AutoML Conference 2022), a standalone Python package based on Google Vizier. OSS Vizier is designed for two main purposes: (1) managing and optimizing experiments at scale in a reliable and distributed manner for users, and (2) developing and benchmarking algorithms for automated machine learning (AutoML) researchers.


System design

OSS Vizier works by having a server provide services, namely the optimization of blackbox objectives, or functions, from multiple clients. In the main workflow, a client sends a remote procedure call (RPC) and asks for a suggestion (i.e., a proposed input for the client’s blackbox function), from which the service begins to spawn a worker to launch an algorithm (i.e., a Pythia policy) to compute the following suggestions. The suggestions are then evaluated by clients to form their corresponding objective values and measurements, which are sent back to the service. This pipeline is repeated multiple times to form an entire tuning trajectory.
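
The snippet below is a rough sketch of that client-side loop using OSS Vizier's Python client. The module paths, method names, and toy objective are assumptions that may not match the current API exactly, so treat it as an outline of the workflow rather than a verified recipe and consult the documentation before relying on it.

from vizier.service import clients
from vizier.service import pyvizier as vz

# Describe the blackbox problem: one float parameter and one metric to maximize.
problem = vz.ProblemStatement()
problem.search_space.root.add_float_param("learning_rate", 1e-4, 1e-1)
problem.metric_information.append(
    vz.MetricInformation(name="accuracy", goal=vz.ObjectiveMetricGoal.MAXIMIZE))

study = clients.Study.from_study_config(
    vz.StudyConfig.from_problem(problem), owner="demo", study_id="example")

def evaluate(learning_rate):
    # Stand-in for the client's real blackbox function (e.g., training an ML model).
    return 1.0 - (learning_rate - 0.01) ** 2

# Each iteration is one round of the workflow described above: ask the service for a
# suggestion, evaluate it, and report the measurement back.
for _ in range(20):
    for suggestion in study.suggest(count=1):
        objective = evaluate(suggestion.parameters["learning_rate"])
        suggestion.complete(vz.Measurement(metrics={"accuracy": objective}))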

The use of the ubiquitous gRPC library, which is compatible with most programming languages, such as C++ and Rust, allows maximum flexibility and customization, where the user can also write their own custom clients and even algorithms outside of the default Python interface. Since the entire process is saved to an SQL datastore, a smooth recovery is ensured after a crash, and usage patterns can be stored as valuable datasets for research into meta-learning and multitask transfer-learning methods such as the OptFormer and HyperBO.

In the distributed pipeline, multiple clients each send a “Suggest” request to the Service API, which produces Suggestions for the clients using Pythia. The clients evaluate these suggestions and return measurements. All transactions are stored to allow fault-tolerance.

Usage

Because of OSS Vizier’s emphasis as a service, in which clients can send requests to the server at any point in time, it is thus designed for a broad range of scenarios — the budget of evaluations, or trials, can range from tens to millions, and the evaluation latency can range from seconds to weeks. Evaluations can be done asynchronously (e.g., tuning an ML model) or in synchronous batches (e.g., wet lab settings involving multiple simultaneous experiments). Furthermore, evaluations may fail due to transient errors and be retried, or may fail due to persistent errors (e.g., the evaluation is impossible) and should not be retried.

This broadly supports a variety of applications, which include hyperparameter tuning of deep learning models and optimizing non-computational objectives, which can be, e.g., physical, chemical, biological, mechanical, or even human-evaluated, such as cookie recipes.

The OSS Vizier API allows (1) developers to integrate other packages, with PyGlove and Vertex Vizier already included, and (2) users to optimize their experiments, such as machine learning pipelines and cookie recipes.

Integrations, algorithms, and benchmarks

As Google Vizier is heavily integrated with many of Google’s internal frameworks and products, OSS Vizier will naturally be heavily integrated with many of Google’s open source and external frameworks. Most prominently, OSS Vizier will serve as a distributed backend for PyGlove to allow large-scale evolutionary searches over combinatorial primitives such as neural architectures and reinforcement learning algorithms. Furthermore, OSS Vizier shares the same client-based API with Vertex Vizier, allowing users to quickly swap between open-source and production-quality services.

For AutoML researchers, OSS Vizier is also outfitted with a useful collection of algorithms and benchmarks (i.e., objective functions) unified under common APIs for assessing the strengths and weaknesses of proposed methods. Most notably, via TensorFlow Probability, researchers can now use the JAX-based Gaussian Process Bandit algorithm, based on the default algorithm in Google Vizier that tunes internal users’ objectives.


Resources and future direction

We provide links to the codebase, documentation, and systems whitepaper. We plan to allow user contributions, especially in the form of algorithms and benchmarks, and further integrate with the open-source AutoML ecosystem. Going forward, we hope to see OSS Vizier as a core tool for expanding research and development over blackbox optimization and hyperparameter tuning.


Acknowledgements

OSS Vizier was developed by members of the Google Vizier team in collaboration with the TensorFlow Probability team: Setareh Ariafar, Lior Belenki, Emily Fertig, Daniel Golovin, Tzu-Kuo Huang, Greg Kochanski, Chansoo Lee, Sagi Perel, Adrian Reyes, Xingyou (Richard) Song, and Richard Zhang.

In addition, we thank Srinivas Vasudevan, Jacob Burnim, Brian Patton, Ben Lee, Christopher Suter, and Rif A. Saurous for further TensorFlow Probability integrations, Daiyi Peng and Yifeng Lu for PyGlove integrations, Hao Li for Vertex/Cloud integrations, Yingjie Miao for AutoRL integrations, Tom Hennigan, Varun Godbole, Pavel Sountsov, Alexey Volkov, Mihir Paradkar, Richard Belleville, Bu Su Kim, Vytenis Sakenas, Yujin Tang, Yingtao Tian, and Yutian Chen for open source and infrastructure help, and George Dahl, Aleksandra Faust, Claire Cui, and Zoubin Ghahramani for discussions.

Finally we thank Tom Small for designing the animation for this post.

Source: Google AI Blog


The Flan Collection: Advancing open source methods for instruction tuning

Language models are now capable of performing many new natural language processing (NLP) tasks by reading instructions, often for tasks they had not seen before. The ability to reason on new tasks is mostly credited to training models on a wide variety of unique instructions, known as “instruction tuning”, which was introduced by FLAN and extended in T0, Super-Natural Instructions, MetaICL, and InstructGPT. However, much of the data that drives these advances remains unreleased to the broader research community.

In “The Flan Collection: Designing Data and Methods for Effective Instruction Tuning”, we closely examine and release a newer and more extensive publicly available collection of tasks, templates, and methods for instruction tuning to advance the community’s ability to analyze and improve instruction-tuning methods. This collection was first used in Flan-T5 and Flan-PaLM, the latter of which achieved significant improvements over PaLM. We show that training a model on this collection yields improved performance over comparable public collections on all tested evaluation benchmarks, e.g., a 3%+ improvement on the 57 tasks in the Massive Multitask Language Understanding (MMLU) evaluation suite and an 8% improvement on BigBench Hard (BBH). Analysis suggests the improvements stem both from the larger and more diverse set of tasks and from applying a set of simple training and data augmentation techniques that are cheap and easy to implement: mixing zero-shot, few-shot, and chain of thought prompts at training, enriching tasks with input inversion, and balancing task mixtures. Together, these methods enable the resulting language models to reason more competently over arbitrary tasks, even those for which they have not seen any fine-tuning examples. We hope making these findings and resources publicly available will accelerate research into more powerful and general-purpose language models.


Public instruction tuning data collections

Since 2020, several instruction tuning task collections have been released in rapid succession, shown in the timeline below. Recent research has yet to coalesce around a unified set of techniques, with different sets of tasks, model sizes, and input formats all represented. This new collection, referred to below as “Flan 2022”, combines prior collections from FLAN, P3/T0, and Natural Instructions with new dialog, program synthesis, and complex reasoning tasks.

A timeline of public instruction tuning collections, including: UnifiedQA, CrossFit, Natural Instructions, FLAN, P3/T0, MetaICL, ExT5, Super-Natural Instructions, mT0, Unnatural Instructions, Self-Instruct, and OPT-IML Bench. The table describes the release date, the task collection name, the model name, the base model(s) that were finetuned with this collection, the model size, whether the resulting model is Public (green) or Not Public (red), whether they train with zero-shot prompts (“ZS”), few-shot prompts (“FS”), chain-of-thought prompts (“CoT”) together (“+”) or separately (“/”), the number of tasks from this collection in Flan 2022, the total number of examples, and some notable methods, related to the collections, used in these works. Note that the number of tasks and examples vary under different assumptions and so are approximations. Counts for each are reported using task definitions from the respective works.

In addition to scaling to more instructive training tasks, The Flan Collection combines training with different types of input-output specifications, including just instructions (zero-shot prompting), instructions with examples of the task (few-shot prompting), and instructions that ask for an explanation with the answer (chain of thought prompting). Except for InstructGPT, which leverages a collection of proprietary data, Flan 2022 is the first work to publicly demonstrate the strong benefits of mixing these prompting settings together during training. Instead of a trade-off between the various settings, mixing prompting settings during training improves all prompting settings at inference time, as shown below for tasks both held in and held out of the set of fine-tuning tasks.

Training jointly with zero-shot and few-shot prompt templates improves performance on both held-in and held-out tasks. The stars indicate the peak performance in each setting. Red lines denote the zero-shot prompted evaluation, lilac denotes few-shot prompted evaluation.
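
To illustrate the prompt-mixing idea, the following sketch (hypothetical code, not the Flan Collection’s actual pipeline; all template and field names are made up for illustration) formats each training example under a randomly chosen prompting setting:

import random

# Illustrative templates for the three prompting settings.
ZERO_SHOT = "{instruction}\n\nInput: {inputs}\nAnswer:"
FEW_SHOT = "{exemplars}\n\n{instruction}\n\nInput: {inputs}\nAnswer:"
CHAIN_OF_THOUGHT = "{instruction}\n\nInput: {inputs}\nLet's think step by step."

def format_training_example(example, exemplars, rng=random):
    """Format one task example under a randomly chosen prompting setting."""
    template = rng.choice([ZERO_SHOT, FEW_SHOT, CHAIN_OF_THOUGHT])
    prompt = template.format(
        instruction=example["instruction"],
        inputs=example["inputs"],
        exemplars="\n\n".join(exemplars),  # ignored by templates without {exemplars}
    )
    # Chain-of-thought targets spell out the rationale before the final answer.
    if template is CHAIN_OF_THOUGHT:
        target = example["rationale"] + " So the answer is " + example["answer"]
    else:
        target = example["answer"]
    return {"inputs": prompt, "targets": target}

The other techniques mentioned above, input inversion and task balancing, would operate at the level of the overall task mixture rather than per example.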

Evaluating instruction tuning methods

To understand the overall effects of swapping one instruction tuning collection for another, we fine-tune equivalently-sized T5 models on popular public instruction-tuning collections, including Flan 2021, T0++, and Super-Natural Instructions. Each model is then evaluated on a set of tasks that are already included in each of the instruction tuning collections, a set of five chain-of-thought tasks, and then a set of 57 diverse tasks from the MMLU benchmark, both with zero-shot and few-shot prompts. In each case, the new Flan 2022 model, Flan-T5, outperforms these prior works, demonstrating a more powerful general-purpose NLP reasoner.

Comparing public instruction tuning collections on held-in, chain-of-thought, and held-out evaluation suites, such as BigBench Hard and MMLU. All models except OPT-IML-Max (175B) are trained by us, using T5-XL with 3B parameters. Green text indicates improvement over the next best comparable T5-XL (3B) model.

Single task fine-tuning

In applied settings, practitioners usually deploy NLP models fine-tuned specifically for one target task, where training data is already available. We examine this setting to understand how Flan-T5 compares to T5 models as a starting point for applied practitioners. Three settings are compared: fine-tuning T5 directly on the target task, using Flan-T5 without further fine-tuning on the target task, and fine-tuning Flan-T5 on the target task. For both held-in and held-out tasks, fine-tuning Flan-T5 offers an improvement over fine-tuning T5 directly. In some instances, usually where training data is limited for a target task, Flan-T5 without further fine-tuning outperforms T5 with direct fine-tuning.

Flan-T5 outperforms T5 on single-task fine-tuning. We compare single-task fine-tuned T5 (blue bars), single-task fine-tuned Flan-T5 (red), and Flan-T5 without any further fine-tuning (beige).
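
As a hypothetical sketch of the third setting above (not code from the paper), fine-tuning the public Flan-T5 checkpoint on a single target task with the Hugging Face transformers library differs from the conventional T5 baseline only in the starting checkpoint; train_dataset and eval_dataset are assumed to be already-tokenized splits for the target task.

from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)

# Start from the instruction-tuned checkpoint; swapping in "t5-3b" here
# reproduces the conventional (non-instruction-tuned) baseline.
checkpoint = "google/flan-t5-xl"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

args = Seq2SeqTrainingArguments(
    output_dir="flan_t5_single_task",
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    num_train_epochs=3,
)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # assumed: tokenized target-task training split
    eval_dataset=eval_dataset,    # assumed: tokenized target-task validation split
    tokenizer=tokenizer,
)
trainer.train()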

An additional benefit of using Flan-T5 as a starting point is that training is significantly faster and cheaper, converging more quickly than T5 fine-tuning, and usually peaking at higher accuracies. This suggests less task-specific training data may be necessary to achieve similar or better results on a particular task.

Flan-T5 converges faster than T5 on single-task fine-tuning, for each of five held-out tasks from Flan fine-tuning. Flan-T5’s learning curve is indicated with the solid lines, and T5’s learning curve with the dashed line. All tasks are held-out during Flan finetuning.

There are significant energy efficiency benefits for the NLP community in adopting instruction-tuned models like Flan-T5 for single-task fine-tuning, rather than conventional non-instruction-tuned models. While pre-training and instruction fine-tuning are financially and computationally expensive, they are a one-time cost, usually amortized over millions of subsequent fine-tuning runs, which, for the most prominent models, can become more costly in aggregate. Instruction-tuned models offer a promising way to significantly reduce the number of fine-tuning steps needed to achieve the same or better performance.


Conclusion

The new Flan instruction tuning collection unifies the most popular prior public collections and their methods, while adding new templates and simple improvements like training with mixed prompt settings. The resulting method outperforms Flan, P3, and Super-Natural Instructions on held-in, chain of thought, MMLU, and BBH benchmarks by 3–17% across zero-shot and few-shot variants. Results suggest this new collection serves as a more performant starting point for researchers and practitioners interested in both generalizing to new instructions and fine-tuning on a single new task.


Acknowledgements

It was a privilege to work with Jason Wei, Barret Zoph, Le Hou, Hyung Won Chung, Tu Vu, Albert Webson, Denny Zhou, and Quoc V Le on this project.

Source: Google AI Blog