Author Archives: Open Source Programs Office

Wrapping up Google Code-in 2017

Today marks the conclusion of the 8th annual Google Code-in (GCI), our contest that teaches teenage students through contributions to open source projects. As with most years, the contest evolved a bit and grew. And it grew. And it doubled. And then it grew some more...
These numbers may increase as mentors finish reviewing the final work submitted by students.
Mentors from each of the 25 open source organizations are now busy reviewing the last of the work submitted by student participants. We’re looking forward to sharing the stats.

Each organization will pick two Grand Prize Winners who will be flown to Northern California to visit Google’s headquarters, enjoy a day of adventure in San Francisco, and meet their mentors and Google engineers.

We’d like to congratulate all of the student participants for challenging themselves and making a contribution to open source in the process! We’d also like to congratulate the mentors for surviving the unusually busy contest.

Further, we’d like to thank the mentors and the organization administrators. They are the heart of this program, volunteering countless hours creating tasks, reviewing student work, and helping students into the world of open source. Mentors teach young students about the many facets of open source development, from community standards and communicating across time zones to version control and testing. We couldn’t run this program without you!

Stay tuned, we’ll be announcing the Grand Prize Winners and Finalists on January 31st.

By Josh Simmons, Google Open Source

OpenCensus: A Stats Collection and Distributed Tracing Framework

Today we’re pleased to announce the release of OpenCensus, a vendor-neutral open source library for metric collection and tracing. OpenCensus is built to add minimal overhead and be deployed fleet wide, especially for microservice-based architectures.

The Need for Instrumentation & Observability 

As a startup, often the focus is to get an initial version of the product out the door, rapidly prototype and iterate with customers. Most startups start out with monolithic applications as a simple model-view-controller (MVC) web application. As the customer base, code, and number of engineers increase, they migrate from monolithic architecture to a microservices architecture. A microservices architecture has its advantages, but often makes debugging more challenging as traditional debugging and monitoring tools don’t always work in these environments or are designed for monolithic use cases. When operating multiple microservices with strict service level objectives (SLOs), you need insights into the root cause of reliability and performance problems.

Not having proper instrumentation and observability can result in lost engineering hours, violated SLOs and frustrated customers. Instead, diagnostic data should be collected from across the stack. This data can be used for incident management to identify and debug potential bottlenecks or for system tuning and performance improvement.

OpenCensus

At Google scale, an instrumentation layer with minimal overhead is a requirement. As Google grew, we realized the importance of having a highly efficient tracing and stats instrumentation library that could be deployed fleet wide.

OpenCensus is the open source version of Google’s Census library, written based on years of optimization experience. It aims to make the collection and submission of app metrics and traces easier for developers. It is a vendor neutral, single distribution of libraries that automatically collects traces and metrics from your app, displays them locally, and sends them to analysis tools. OpenCensus currently supports Prometheus, SignalFX, Stackdriver and Zipkin.

Developers can use this powerful, out-of-the box library to instrument microservices and send data to any supported backend. For an Application Performance Management (APM) vendor, OpenCensus provides free instrumentation coverage with minimal work, and affords customers a simple setup experience.

Below are Stackdriver Trace and Monitor screenshots showing traces generated from a demo app, which calls Google’s Cloud Bigtable API and uses OpenCensus.



We’d love to hear your feedback on OpenCensus. Try using it in your app, tell us about your success story, and help by contributing to our existing language-specific libraries, or by creating one for an not-yet-supported language. You can also help us integrate OpenCensus with new APM tools!

We hope you find this as useful as we have. Visit opencensus.io for more information.

By Pritam Shah, Census team

Googlers on the road: Linux.conf.au 2018

It’s summer in Sydney and Linux.conf.au (LCA) 2018 is just a week away. LCA, an annual event that attracts people from all over the globe, including Googlers, runs January 22nd to 26th.

LCA is a cornerstone of the free and open source software (FOSS) community. It’s volunteer-run, administered by Linux Australia, and has been running since 1999. Despite its name, the conference program covers all things FOSS. The event is five days long and includes two days of miniconfs that make the program even more interesting.

The Google Open Source team is escaping “wintery” Northern California and will be hosting a Birds of a Feather (BoF) session and co-hosting an event with GDG Sydney, both focused on our student programs.

A few Googlers ended up with sessions in the program and one is running a miniconf:

Tuesday, January 23rd
All day     Create hardware with FPGAs, Linux and Python Miniconf hosted by Tim Ansell (sold out)
11:40am  Learn by Contributing to Open Source by Josh Simmons
5:15pm    Assembling a balsa-wood Raspberry Pi case by Josh Deprez

Wednesday, January 24th
3:50pm    Securing the Linux boot process by Matthew Garrett

Thursday, January 25th
12:25pm  Google Summer of Code and Google Code-in Birds of a Feather session
6:00pm    Google Summer of Code and Google Code-in Meetup with GDG Sydney

Friday, January 26th
11:40am  The State of Kernel Self-Protection by Kees Cook
1:40pm    QUIC: Replacing TCP for the Web by Jana Iyengar
2:35pm    The Web Is Dead! Long Live The Web! by Sam Thorogood

Not able to make the conference? They’ll be posting session recordings to YouTube afterwards, thanks in part to students who have worked on TimVideos, a suite of open source software and hardware for recording video, as part of Google Summer of Code.

Naturally, you will also find the Google Open Source team at other upcoming events including FOSDEM. We look forward to seeing you in 2018!

By Josh Simmons, Google Open Source

Container Structure Tests: Unit Tests for Docker Images

Usage of containers in software applications is on the rise, and with their increasing usage in production comes a need for robust testing and validation. Containers provide great testing environments, but actually validating the structure of the containers themselves can be tricky. The Docker toolchain provides us with easy ways to interact with the container images themselves, but no real way of verifying their contents. What if we want to ensure a set of commands runs successfully inside of our container, or check that certain files are in the correct place with the correct contents, before shipping?

The Container Tools team at Google is happy to announce the release of the Container Structure Test framework. This framework provides a convenient and powerful way to verify the contents and structure of your containers. We’ve been using this framework at Google to test all of our team’s released containers for over a year now, and we’re excited to finally share it with the public.

The framework supports four types of tests:
  • Command Tests - to run a command inside your container image and verify the output or error it produces
  • File Existence Tests - to check the existence of a file in a specific location in the image’s filesystem
  • File Content Tests - to check the contents and metadata of a file in the filesystem
  • A unique Metadata Test - to verify configuration and metadata of the container itself
Users can specify test configurations through YAML or JSON. This provides a way to abstract away the test configuration from the implementation of the tests, eliminating the need for hacky bash scripting or other solutions to test container images.

Command Tests

The Command Tests give the user a way to specify a set of commands to run inside of a container, and verify that the output, error, and exit code were as expected. An example configuration looks like this:
globalEnvVars:
- key: "VIRTUAL_ENV"
value: "/env"
- key: "PATH"
value: "/env/bin:$PATH"

commandTests:

# check that the python binary is in the correct location
- name: "python installation"
command: "which"
args: ["python"]
expectedOutput: ["/usr/bin/python\n"]

# setup a virtualenv, and verify the correct python binary is run
- name: "python in virtualenv"
setup: [["virtualenv", "/env"]]
command: "which"
args: ["python"]
expectedOutput: ["/env/bin/python\n"]

# setup a virtualenv, install gunicorn, and verify the installation
- name: "gunicorn flask"
setup: [["virtualenv", "/env"],
["pip", "install", "gunicorn", "flask"]]
command: "which"
args: ["gunicorn"]
expectedOutput: ["/env/bin/gunicorn"]
Regexes are used to match the expected output, and error, of each command (or excluded output/error if you want to make sure something didn’t happen). Additionally, setup and teardown commands can be run with each individual test, and environment variables can be specified to be set for each individual test, or globally for the entire test run (shown in the example).

File Tests

File Tests allow users to verify the contents of an image’s filesystem. We can check for existence of files, as well as examine the contents of individual files or directories. This can be particularly useful for ensuring that scripts, config files, or other runtime artifacts are in the correct places before shipping and running a container.
fileExistenceTests:

# check that the apt-packages text file exists and has the correct permissions
- name: 'apt-packages'
path: '/resources/apt-packages.txt'
shouldExist: true
permissions: '-rw-rw-r--'
Expected permissions and file mode can be specified for each file path in the form of a standard Unix permission string. As with the Command Tests’ “Excluded Output/Error,” a boolean can be provided to these tests to tell the framework to be sure a file is not present in a filesystem.

Additionally, the File Content Tests verify the contents of files and directories in the filesystem. This can be useful for checking package or repository versions, or config file contents among other things. Following the pattern of the previous tests, regexes are used to specify the expected or excluded contents.
fileContentTests:

# check that the default apt repository is set correctly
- name: 'apt sources'
path: '/etc/apt/sources.list'
expectedContents: ['.*httpredir\.debian\.org/debian jessie main.*']

# check that the retry policy is correctly specified
- name: 'retry policy'
path: '/etc/apt/apt.conf.d/apt-retry'
expectedContents: ['Acquire::Retries 3;']

Metadata Test

Unlike the previous tests which all allow any number to be specified, the Metadata test is a singleton test which verifies a container’s configuration. This is useful for making sure things specified in the Dockerfile (e.g. entrypoint, exposed ports, mounted volumes, etc.) are manifested correctly in a built container.
metadataTest:
env:
- key: "VIRTUAL_ENV"
value: "/env"
exposedPorts: ["8080", "2345"]
volumes: ["/test"]
entrypoint: []
cmd: ["/bin/bash"]
workdir: ["/app"]

Tiny Images

One interesting case that we’ve put focus on supporting is “tiny images.” We think keeping container sizes small is important, and sometimes the bare minimum in a container image might even exclude a shell. Users might be used to running something like:
`docker run -d "cat /etc/apt/sources.list && grep -rn 'httpredir.debian.org' image"`
… but this breaks without a working shell in a container. With the structure test framework, we convert images to in-memory filesystem representations, so no shell is needed to examine the contents of an image!

Dockerless Test Runs

At their core, Docker images are just bundles of tarballs. One of the major use cases for these tests is running in CI systems, and often we can't guarantee that we'll have access to a working Docker daemon in these environments. To address this, we created a tar-based test driver, which can handle the execution of all file-related tests through simple tar manipulation. Command tests are currently not supported in this mode, since running commands in a container requires a container runtime.

This means that using the tar driver, we can retrieve images from a remote registry, convert them into filesystems on disk, and verify file contents and metadata all without a working Docker daemon on the host! Our container-diff library is leveraged here to do all the image processing; see our previous blog post for more information.
structure-test -test.v -driver tar -image gcr.io/google-appengine/python:latest structure-test-examples/python/python_file_tests.yaml

Running in Bazel

Structure tests can also be run natively through Bazel, using the “container_test” rule. Bazel provides convenient build rules for building Docker images, so the structure tests can be run as part of a build to ensure any new built images are up to snuff before being released. Check out this example repo for a quick demonstration of how to incorporate these tests into a Bazel build.

We think this framework can be useful for anyone building and deploying their own containers in the wild, and hope that it can promote their usage everywhere through increasing the robustness of containers. For more detailed information on the test specifications, check out the documentation in our GitHub repository.

By Nick Kubala, Container Tools team

Talk shop with Google Open Source

Hello world! The Google Open Source team is ringing in the new year by launching accounts on Twitter, Facebook, and Google+ to engage more with the community and keep folks up to date.
Free and open source software (FOSS) is fundamental to computing, the internet, and Google. Since 2004, Google Open Source has helped Googlers get code in and out of Google and supported FOSS through student programs and financial support. One thing is clear after 14 years: FOSS is all about community.

We’re part of that community, seeing people at events, on mailing lists, and in the trenches of code repositories. And few things are more enjoyable and productive than talking with people in the community…

… so we thought we’d start doing more of that. We want to:
We hope you’ll come along and let us know. You’ll find us at @GoogleOSS and +GoogleOpenSource, as well as on Facebook and YouTube.

By Josh Simmons, Google Open Source

Talk shop with Google Open Source

Hello world! The Google Open Source team is ringing in the new year by launching accounts on Twitter, Facebook, and Google+ to engage more with the community and keep folks up to date.
Free and open source software (FOSS) is fundamental to computing, the internet, and Google. Since 2004, Google Open Source has helped Googlers get code in and out of Google and supported FOSS through student programs and financial support. One thing is clear after 14 years: FOSS is all about community.

We’re part of that community, seeing people at events, on mailing lists, and in the trenches of code repositories. And few things are more enjoyable and productive than talking with people in the community…

… so we thought we’d start doing more of that. We want to:
We hope you’ll come along and let us know. You’ll find us at @GoogleOSS and +GoogleOpenSource, as well as on Facebook and YouTube.

By Josh Simmons, Google Open Source

Talk shop with Google Open Source

Hello world! The Google Open Source team is ringing in the new year by launching accounts on Twitter, Facebook, and Google+ to engage more with the community and keep folks up to date.
Free and open source software (FOSS) is fundamental to computing, the internet, and Google. Since 2004, Google Open Source has helped Googlers get code in and out of Google and supported FOSS through student programs and financial support. One thing is clear after 14 years: FOSS is all about community.

We’re part of that community, seeing people at events, on mailing lists, and in the trenches of code repositories. And few things are more enjoyable and productive than talking with people in the community…

… so we thought we’d start doing more of that. We want to:
We hope you’ll come along and let us know. You’ll find us at @GoogleOSS and +GoogleOpenSource, as well as on Facebook and YouTube.

By Josh Simmons, Google Open Source

Seeking open source projects for Google Summer of Code 2018

Do you lead or represent a free or open source software organization? Are you seeking new contributors? (Who isn’t?) Do you enjoy the challenge and reward of mentoring new developers? Apply to be a mentor organization for Google Summer of Code 2018!

We are seeking open source projects and organizations to participate in the 14th annual Google Summer of Code (GSoC). GSoC is a global program that gets student developers contributing to open source. Each student spends three months working on a project, with the support of volunteer mentors, for participating open source organizations.

Last year 1,318 students worked with 198 open source organizations. Organizations include individual projects and umbrella organizations that serve as fiscal sponsors, such as Apache Software Foundation or the Python Software Foundation.

You can apply starting today. The deadline to apply is January 23 at 16:00 UTC. Organizations chosen for GSoC 2018 will be posted on February 12.

Please visit the program site for more information on how to apply, a detailed timeline of important deadlines and general program information. We also encourage you to check out the Mentor Guide and join the discussion group.

Best of luck to all of the applicants!

By Josh Simmons, Google Open Source

Google Code-in is breaking records

It’s been an incredible (and incredibly busy!) three weeks for the 25 mentor organizations participating in Google Code-in (GCI) 2017, our seven week global contest designed to introduce teens to open source software development. Participants complete bite sized “tasks” in topics that include coding, documentation, UI/UX, quality assurance and more. Volunteer mentors from each open source project help participants along the way.

Total registered students has already surpassed 2016 numbers and we are less than halfway to the finish! We’re thrilled that high school students are embracing GCI like never before.

Check out some of the statistics below (current as of Thursday, December 14):
  • Total registered students: 6,146
  • Number of students who have completed at least one task: 1,573 (51% of those students have completed more than 3 tasks, earning them a GCI t-shirt)
  • Total number of tasks completed: 5,499
  • Most tasks completed by one student: 39

Top 5 Countries by Tasks Completed

Countries Represented by Mentors and Students



Of course, GCI wouldn’t be possible without the effort of the more than 725 mentors and organization administrators. Based in 65 countries, mentors answer questions, review submissions, and approve tasks for students at all hours of the day -- and sometimes night! They work tirelessly to help encourage and guide the next generation of open source contributors.

Every year we express our gratitude to the mentors and organization administrators. We are particularly grateful for them given how many more students are participating in GCI this year. Thank you all, and hang in there!

By Mary Radomile, Google Open Source

TFGAN: A Lightweight Library for Generative Adversarial Networks

Crossposted on the Google Research Blog

Training a neural network usually involves defining a loss function, which tells the network how close or far it is from its objective. For example, image classification networks are often given a loss function that penalizes them for giving wrong classifications; a network that mislabels a dog picture as a cat will get a high loss. However, not all problems have easily-defined loss functions, especially if they involve human perception, such as image compression or text-to-speech systems. Generative Adversarial Networks (GANs), a machine learning technique that has led to improvements in a wide range of applications including generating images from text, superresolution, and helping robots learn to grasp, offer a solution. However, GANs introduce new theoretical and software engineering challenges, and it can be difficult to keep up with the rapid pace of GAN research.

A video of a generator improving over time. It begins by producing random noise, and eventually learns to generate MNIST digits.
In order to make GANs easier to experiment with, we’ve open sourced TFGAN, a lightweight library designed to make it easy to train and evaluate GANs. It provides the infrastructure to easily train a GAN, provides well-tested loss and evaluation metrics, and gives easy-to-use examples that highlight the expressiveness and flexibility of TFGAN. We’ve also released a tutorial that includes a high-level API to quickly get a model trained on your data.
This demonstrates the effect of an adversarial loss on image compression. The top row shows image patches from the ImageNet dataset. The middle row shows the results of compressing and uncompressing an image through an image compression neural network trained on a traditional loss. The bottom row shows the results from a network trained with a traditional loss and an adversarial loss. The GAN-loss images are sharper and more detailed, even if they are less like the original.
TFGAN supports experiments in a few important ways. It provides simple function calls that cover the majority of GAN use-cases so you can get a model running on your data in just a few lines of code, but is built in a modular way to cover more exotic GAN designs as well. You can just use the modules you want -- loss, evaluation, features, training, etc. are all independent.. TFGAN’s lightweight design also means you can use it alongside other frameworks, or with native TensorFlow code. GAN models written using TFGAN will easily benefit from future infrastructure improvements, and you can select from a large number of already-implemented losses and features without having to rewrite your own. Lastly, the code is well-tested, so you don’t have to worry about numerical or statistical mistakes that are easily made with GAN libraries.
Most neural text-to-speech (TTS) systems produce over-smoothed spectrograms. When applied to the Tacotron TTS system, a GAN can recreate some of the realistic-texture, which reduces artifacts in the resulting audio.
When you use TFGAN, you’ll be using the same infrastructure that many Google researchers use, and you’ll have access to the cutting-edge improvements that we develop with the library. Anyone can contribute to the github repositories, which we hope will facilitate code-sharing among ML researchers and users.

By Joel Shor, Senior Software Engineer, Machine Perception