
Kubernetes best practices: How and why to build small container images



Editor’s note: Today marks the first installment in a seven-part video and blog series from Google Developer Advocate Sandeep Dinesh on how to get the most out of your Kubernetes environment. Today he tackles the theory and practicalities of keeping your container images as small as possible.

Docker makes building containers a breeze. Just put a standard Dockerfile into your folder, run the “docker build” command, and shazam! Your container image is built!

The downside of this simplicity is that it’s easy to build huge containers full of things you don’t need—including potential security holes.

In this episode of “Kubernetes Best Practices,” let’s explore how to create production-ready container images using Alpine Linux and the Docker builder pattern, and then run some benchmarks that can determine how these containers perform inside your Kubernetes cluster.

The process for creating container images differs depending on whether you are using an interpreted language or a compiled language. Let’s dive in!

Containerizing interpreted languages


Interpreted languages, such as Ruby, Python, Node.js, and PHP, send source code through an interpreter that runs the code. This gives you the benefit of skipping the compilation step, but has the downside of requiring you to ship the interpreter along with the code.

Luckily, most of these languages offer pre-built Docker containers that include a lightweight environment that allows you to run much smaller containers.

Let’s take a Node.js application and containerize it. First, let’s use the “node:onbuild” Docker image as the base. The “onbuild” version of a Docker container pre-packages everything you need to run, so you don’t need to perform a lot of configuration to get things working. This means the Dockerfile is very simple (only two lines!). But you pay the price in terms of disk size: almost 700MB!

FROM node:onbuild
EXPOSE 8080
By using a smaller base image such as Alpine, you can significantly cut down on the size of your container. Alpine Linux is a small and lightweight Linux distribution that is very popular with Docker users because it’s compatible with a lot of apps, while still keeping containers small.

Luckily, there is an official Alpine image for Node.js (as well as other popular languages) that has everything you need. Unlike the default “node” Docker image, “node:alpine” removes many files and programs, leaving only enough to run your app.

The Alpine Linux-based Dockerfile is a bit more complicated to create as you have to run a few commands that the onbuild image otherwise does for you.

FROM node:alpine
WORKDIR /app
COPY package.json /app/package.json
RUN npm install --production
COPY server.js /app/server.js
EXPOSE 8080
CMD npm start
But it’s worth it, because the resulting image is much smaller, at only 65MB!

Containerizing compiled languages


Compiled languages such as Go, C, C++, Rust, and Haskell produce binaries that can run without many external dependencies. This means you can build the binary ahead of time and ship it to production without having to ship the tools used to create it, such as the compiler.

With Docker’s support for multi-stage builds, you can easily ship just the binary and a minimal amount of scaffolding. Let’s learn how.

Let’s take a Go application and containerize it using this pattern. First, let’s use the “golang:onbuild” Docker image as the base. As before, the Dockerfile is only two lines, but again you pay the price in terms of disk size—over 700MB!

FROM golang:onbuild
EXPOSE 8080
The next step is to use a slimmer base image, in this case the “golang:alpine” image. So far, this is the same process we followed for an interpreted language.

Again, creating the Dockerfile with an Alpine base image is a bit more complicated as you have to run a few commands that the onbuild image did for you.

FROM golang:alpine
WORKDIR /app
ADD . /app
RUN cd /app && go build -o goapp
EXPOSE 8080
ENTRYPOINT ./goapp

But again, the resulting image is much smaller, weighing in at only 256MB!

However, we can make the image even smaller: you don’t need any of the compilers or other build and debug tools that Go comes with, so you can remove them from the final container.

Let’s use a multi-stage build to take the binary created by the golang:alpine container and package it by itself.

FROM golang:alpine AS build-env
WORKDIR /app
ADD . /app
RUN cd /app && go build -o goapp

FROM alpine
RUN apk update && \
   apk add ca-certificates && \
   update-ca-certificates && \
   rm -rf /var/cache/apk/*
WORKDIR /app
COPY --from=build-env /app/goapp /app
EXPOSE 8080
ENTRYPOINT ./goapp

Would you look at that! This container is only 12MB in size!
While building this container, you may notice that the Dockerfile does strange things, such as manually installing HTTPS certificates into the container. This is because the Alpine Linux base image ships with almost nothing pre-installed. So even though you need to manually install any and all dependencies, the end result is super small containers!

Note: If you want to save even more space, you could statically compile your app and use the “scratch” container. Using “scratch” as a base container means you are literally starting from scratch with no base layer at all. However, I recommend using Alpine as your base image rather than “scratch” because the few extra MBs in the Alpine image make it much easier to use standard tools and install dependencies.
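For reference, a sketch of that approach might look like the following (this exact Dockerfile is an illustration, not one from the post): disable cgo so the Go binary is statically linked, then copy it into an empty “scratch” stage. Keep in mind that scratch has no shell, libc, or CA certificates.

FROM golang:alpine AS build-env
WORKDIR /app
ADD . /app
# CGO_ENABLED=0 produces a statically linked binary with no libc dependency
RUN cd /app && CGO_ENABLED=0 go build -o goapp

FROM scratch
COPY --from=build-env /app/goapp /goapp
EXPOSE 8080
# scratch has no shell, so use the exec form of ENTRYPOINT
ENTRYPOINT ["/goapp"]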

Where to build and store your containers


In order to build and store the images, I highly recommend the combination of Google Container Builder and Google Container Registry. Container Builder is very fast and automatically pushes images to Container Registry. Most developers should easily get everything done in the free tier, and Container Registry is the same price as raw Google Cloud Storage (cheap!).
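As a rough sketch, a typical build looks something like this at the time of writing (the image name is a placeholder; replace PROJECT_ID with your own project). Run the command from the directory containing your Dockerfile, and Container Builder builds the image remotely and pushes it to Container Registry:

gcloud container builds submit --tag gcr.io/PROJECT_ID/my-app .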

Platforms like Google Kubernetes Engine can securely pull images from Google Container Registry without any additional configuration, making things easy for you!

In addition, Container Registry gives you vulnerability scanning tools and IAM support out of the box. These tools can make it easier for you to secure and lock down your containers.

Evaluating performance of smaller containers


People claim that the big advantage of small containers is reduced time, both time-to-build and time-to-pull. Let’s test this using containers created with onbuild and ones created with Alpine in a multi-stage process!

TL;DR: There’s no significant difference on powerful machines or with Container Builder, but a significant difference on smaller machines and shared systems (like many CI/CD systems). Small images are always better in terms of absolute performance.

Building images on a large machine


For the first test, I am going to build using a pretty beefy laptop. I’m using our office WiFi, so the download speeds are pretty fast!


For each build, I remove all Docker images in my cache.
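One way to clear the cache (the post doesn’t specify the exact command, so this is just a suggestion) is to force-remove all local images:

docker rmi -f $(docker images -q)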

Build:
  • Go Onbuild: 35 seconds
  • Go Multistage: 23 seconds
The build takes about 12 seconds longer for the onbuild container. While this penalty is only paid on the initial build, your Continuous Integration system could pay this price with every build.

The next test is to push the containers to a remote registry. For this test, I used Container Registry to store the images.

Push:
  • Go Onbuild: 15 seconds
  • Go Multistage: 14 seconds
Well, this was interesting! Why does it take almost the same amount of time to push a 12MB image and a 700MB image? It turns out that Container Registry uses a lot of tricks under the covers, including a global cache for many popular base images.

Finally, I want to test how long it takes to pull the image from the registry to my local machine.

Pull:
  • Go Onbuild: 26 seconds
  • Go Multistage: 6 seconds
At 20 seconds, this is the biggest difference between using the two different container images. You can start to see the advantage of using a smaller image, especially if you pull images often.

You can also build the containers in the cloud using Container Builder, which has the added benefit of automatically storing them in Container Registry.

Build + Push:
  • Go Onbuild: 25 seconds
  • Go Multistage: 20 seconds
So again, there is a small advantage to using the smaller image, but not as dramatic as I would have expected.

Building images on small machines


So is there an advantage to using smaller containers? If you have a powerful laptop with a fast internet connection and/or use Container Builder, not really. However, the story changes if you’re using less powerful machines. To simulate this, I used a modest Google Compute Engine f1-micro VM to build, push and pull these images, and the results are staggering!

Pull:
  • Go Onbuild: 52 seconds
  • Go Multistage: 6 seconds

Build:
  • Go Onbuild: 54 seconds
  • Go Multistage: 28 seconds

Push:
  • Go Onbuild: 48 seconds
  • Go Multistage: 16 seconds
In this case, using smaller containers really helps!

Pulling on Kubernetes


While you might not care about the time it takes to build and push the container, you should really care about the time it takes to pull the container. When it comes to Kubernetes, this is probably the most important metric for your production cluster.

For example, let’s say you have a three-node cluster, and one of the nodes crashes. If you are using a managed system like Kubernetes Engine, the system automatically spins up a new node to take its place.

However, this new node will be completely fresh, and will have to pull all your containers before it can start working. The longer it takes to pull the containers, the longer your cluster is running below full capacity!

This can occur when you increase your cluster size (for example, using Kubernetes Engine Autoscaling), or upgrade your nodes to a new version of Kubernetes (stay tuned for a future episode on this).

We can see that the pull performance of multiple containers from multiple deployments can really add up here, and using small containers can potentially shave minutes from your deployment times!

Security and vulnerabilities


Aside from performance, there are significant security benefits from using smaller containers. Small containers usually have a smaller attack surface as compared to containers that use large base images.

I built the Go “onbuild” and “multistage” containers a few months ago, so they probably contain some vulnerabilities that have since been discovered. Using Container Registry’s built-in Vulnerability Scanning, it’s easy to scan your containers for known vulnerabilities. Let’s see what we find.

Wow, that’s a big difference between the two! Only three “medium” vulnerabilities in the smaller container, compared with 16 critical and over 300 other vulnerabilities in the larger container.

Let’s drill down and see which issues the larger container has.

You can see that most of the issues have nothing to do with our app, but rather programs that we are not even using! Because the multistage image is using a much smaller base image, there are just fewer things that can be compromised.

Conclusion

The performance and security advantages of using small containers speak for themselves. Using a small base image and the “builder pattern” can make it easier to build small images, and there are many other techniques for individual stacks and programming languages to minimize container size as well. Whatever you do, you can be sure that your efforts to keep your containers small are well worth it!

Check in next week when we’ll talk about using Kubernetes namespaces to isolate clusters from one another. And don’t forget to subscribe to our YouTube channel and Twitter for the latest updates.

If you haven’t tried GCP and our various container services before, you can quickly get started with our $300 free credits.

DeepVariant Accuracy Improvements for Genetic Datatypes



Last December we released DeepVariant, a deep learning model that has been trained to analyze genetic sequences and accurately identify the differences, known as variants, that make us all unique. Our initial post focused on how DeepVariant approaches “variant calling” as an image classification problem, and is able to achieve greater accuracy than previous methods.

Today we are pleased to announce the launch of DeepVariant v0.6, which includes some major accuracy improvements. In this post we describe how we train DeepVariant, and how we were able to improve DeepVariant's accuracy for two common sequencing scenarios, whole exome sequencing and polymerase chain reaction sequencing, simply by adding representative data into DeepVariant's training process.

Many Types of Sequencing Data
Approaches to genomic sequencing vary depending on the type of DNA sample (e.g., from blood or saliva), how the DNA was processed (e.g., amplification techniques), which technology was used to sequence the data (e.g., instruments can vary even within the same manufacturer) and what section or how much of the genome was sequenced. These differences result in a very large number of sequencing "datatypes".

Typically, variant calling tools have been tuned for one specific datatype and perform relatively poorly on others. Given the extensive time and expertise involved in tuning variant callers for new datatypes, it seemed infeasible to customize each tool for every one. In contrast, with DeepVariant we are able to improve accuracy for new datatypes simply by including representative data in the training process, without negatively impacting overall performance.

Truth Sets for Variant Calling
Deep learning models depend on having high quality data for training and evaluation. In the field of genomics, the Genome in a Bottle (GIAB) consortium, which is hosted by the National Institute of Standards and Technology (NIST), produces human genomes for use in technology development, evaluation, and optimization. The benefit of working with GIAB benchmarking genomes is that their true sequence is known (at least to the extent currently possible). To achieve this, GIAB takes a single person's DNA and repeatedly sequences it using a wide variety of laboratory methods and sequencing technologies (i.e. many datatypes) and analyzes the resulting data using many different variant calling tools. A tremendous amount of work then follows to evaluate and adjudicate discrepancies to produce a high-confidence "truth set" for each genome.

The majority of DeepVariant’s training data is from the first benchmarking genome released by GIAB, HG001. The sample, from a woman of northern European ancestry, was made available as part of the International HapMap Project, the first large-scale effort to identify common patterns of human genetic variation. Because DNA from HG001 is commercially available and so well characterized, it is often the first sample used to test new sequencing technologies and variant calling tools. By using many replicates and different datatypes of HG001, we can generate millions of training examples which helps DeepVariant learn to accurately classify many datatypes, and even generalize to datatypes it has never seen before.

Improved Exome Model in v0.5
In the v0.5 release we formalized a benchmarking-compatible training strategy to withhold from training a complete sample, HG002, as well as any data from chromosome 20. HG002, the second benchmarking genome released by GIAB, is from a male of Ashkenazi Jewish ancestry. Testing on this sample, which differs in both sex and ethnicity from HG001, helps to ensure that DeepVariant is performing well for diverse populations. Additionally reserving chromosome 20 for testing guarantees that we can evaluate DeepVariant's accuracy for any datatype that has truth data available.

In v0.5 we also focused on exome data, which is the subset of the genome that directly codes for proteins. The exome is only ~1% of the whole human genome, so whole exome sequencing (WES) costs less than whole genome sequencing (WGS). The exome also harbors many variants of clinical significance which makes it useful for both researchers and clinicians. To increase exome accuracy we added a variety of WES datatypes, provided by DNAnexus, to DeepVariant's training data. The v0.5 WES model shows 43% fewer indel (insertion-deletion) errors and a 22% reduction in single nucleotide polymorphism (SNP) errors.
The total number of exome errors for HG002 across DeepVariant versions, broken down by indel errors (left) and SNP errors (right). Errors are either false positive (FP), colored yellow, or false negative (FN), colored blue. The largest accuracy jump is between v0.4 and v0.5, largely attributable to a reduction in indel FPs.
Improved Whole Genome Sequencing Model for PCR+ data in v0.6
Our newest release of DeepVariant, v0.6, focuses on improved accuracy for data that has undergone DNA amplification via polymerase chain reaction (PCR) prior to sequencing. PCR is an easy and inexpensive way to amplify very small quantities of DNA, and once sequenced results in what is known as PCR positive (PCR+) sequencing data. It is well known, however, that PCR can be prone to bias and errors, and non-PCR-based (or PCR-free) DNA preparation methods are increasingly common. DeepVariant's training data prior to the v0.6 release was exclusively PCR-free data, and PCR+ was one of the few datatypes for which DeepVariant had underperformed in external evaluations. By adding PCR+ examples to DeepVariant's training data, also provided by DNAnexus, we have seen significant accuracy improvements for this datatype, including a 60% reduction in indel errors.
DeepVariant v0.6 shows major accuracy improvements for PCR+ data, largely attributable to a reduction in indel errors. Here we re-analyze two PCR+ samples that were used in external evaluations, including DNAnexus on the left (see details in figure 10) and bcbio on the right, showing how indel accuracy improves with each DeepVariant version.
Independent evaluations of DeepVariant v0.6 from both DNAnexus and bcbio are also available. Their analyses support our findings of improved indel accuracy, and also include comparisons to other variant calling tools.

Looking Forward
We released DeepVariant as open source software to encourage collaboration and to accelerate the use of this technology to solve real world problems. As the pace of innovation in sequencing technologies continues to grow, including more clinical applications, we are optimistic that DeepVariant can be further extended to produce consistent and highly accurate results. We hope that researchers will use DeepVariant v0.6 to accelerate discoveries, and if there is a sequencing datatype that you would like to see us prioritize, please let us know.

Introducing kaniko: Build container images in Kubernetes and Google Container Builder without privileges



Building images from a standard Dockerfile typically relies upon interactive access to a Docker daemon, which requires root access on your machine to run. This can make it difficult to build container images in environments that can’t easily or securely expose their Docker daemons, such as Kubernetes clusters (for more about this, check out the 16th oldest open Kubernetes issue).

To overcome these challenges, we’re excited to introduce kaniko, an open-source tool for building container images from a Dockerfile even without privileged root access. kaniko both builds an image from a Dockerfile and pushes it to a registry. Since it doesn’t require any special privileges or permissions, you can run kaniko in a standard Kubernetes cluster, in Google Kubernetes Engine, or in any environment that can’t have access to privileges or a Docker daemon.

How does kaniko work?


We run kaniko as a container image that takes in three arguments: a Dockerfile, a build context and the name of the registry to which it should push the final image. This image is built from scratch, and contains only a static Go binary plus the configuration files needed for pushing and pulling images.


The kaniko executor then fetches and extracts the base-image file system to root (the base image is the image in the FROM line of the Dockerfile). It executes each command in order, and takes a snapshot of the file system after each command. This snapshot is created in user-space by walking the filesystem and comparing it to the prior state that was stored in memory. It appends any modifications to the filesystem as a new layer to the base image, and makes any relevant changes to image metadata. After executing every command in the Dockerfile, the executor pushes the newly built image to the desired registry.

Kaniko unpacks the filesystem, executes commands and snapshots the filesystem completely in user-space within the executor image, which is how it avoids requiring privileged access on your machine. The docker daemon or CLI is not involved.
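To make the snapshotting idea concrete, here is a minimal sketch in Python (kaniko itself is written in Go, and this is not its actual implementation): hash every file after a command runs, then diff against the state recorded after the previous command to decide what goes into the next layer.

import hashlib
import os

def snapshot(root):
    # Record a content digest for every file under root.
    state = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                state[path] = hashlib.sha256(f.read()).hexdigest()
    return state

def changed_files(previous, current):
    # Files that are new or modified since the last snapshot belong in the new layer.
    return [path for path, digest in current.items() if previous.get(path) != digest]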

Running kaniko in a Kubernetes cluster


To run kaniko in a standard Kubernetes cluster your pod spec should look something like this, with the args parameters filled in. In this example, a Google Cloud Storage bucket provides the build context.

apiVersion: v1
kind: Pod
metadata:
 name: kaniko
spec:
 containers:
 - name: kaniko
   image: gcr.io/kaniko-project/executor:latest
   args: ["--dockerfile=<path to Dockerfile>",
           "--bucket=<GCS bucket>",
           "--destination=<gcr.io/$PROJECT/$REPO:$TAG"]
   volumeMounts:
     - name: kaniko-secret
       mountPath: /secret
   env:
     - name: GOOGLE_APPLICATION_CREDENTIALS
       value: /secret/kaniko-secret.json
 restartPolicy: Never
 volumes:
   - name: kaniko-secret
     secret:
       secretName: kaniko-secret

You’ll need to mount a Kubernetes secret that contains the necessary authentication to push the final image to a registry. You can find instructions for downloading the secret here.
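For example, once you have a service account key file, creating that secret can look something like this (the file name and secret name here are assumptions chosen to match the pod spec above):

kubectl create secret generic kaniko-secret --from-file=kaniko-secret.json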

Running kaniko in Google Cloud Container Builder


To run kaniko in Google Cloud Container Builder, we can add it as a build step to the build config:

steps:
 - name: gcr.io/kaniko-project/executor:latest
   args: ["--dockerfile=<path to Dockerfile>",
          "--context=<path to build context>",
          "--destination=<gcr.io/[PROJECT]/[IMAGE]:[TAG]>"]
The kaniko executor image will both build and push the image in this build step.


Comparison with other tools


Similar tools to kaniko include img and orca-build. Like kaniko, both tools build container images from Dockerfiles, but with different approaches and security trade-offs. The img tool builds as an unprivileged user within a container, while kaniko builds as a root user within a container in an unprivileged environment. The orca-build tool executes builds by wrapping runc, which uses kernel namespacing techniques to execute RUN commands. We're able to accomplish the same thing in kaniko by executing commands as a root user within a container.


Conclusion


You can find more documentation about running kaniko in our GitHub repo. Please open an issue if you run into any bugs! If you have any questions or feedback you can also reach out to us in our Google group.

Now, you can deploy to Kubernetes Engine from GitLab with a few clicks



In cloud developer circles, GitLab is a popular DevOps lifecycle tool. It lets you do everything from project planning and version control to CI/CD pipelines and monitoring, all in a single interface so different functional teams can collaborate. In particular, its Auto DevOps feature detects the language your app is written in and automatically builds your CI/CD pipelines for you.

Google Cloud started the cloud native movement with the invention and open sourcing of Kubernetes in 2014. Kubernetes draws on over a decade of experience running containerized workloads in production serving Google products at massive scale. Kubernetes Engine is our managed Kubernetes service, built by the same team that's the largest contributor to the Kubernetes open source project, and is run by experienced Google SREs, all of which enables you to focus on what you do best: creating applications to delight your users, while leaving the cluster deployment operations to us.

Today, GitLab and Google Cloud are announcing a new integration of GitLab and Kubernetes Engine that makes it easy for you to accelerate your application deployments by provisioning Kubernetes clusters, managed by Google, right from your GitLab DevOps pipeline. You can now connect a Kubernetes Engine cluster to your GitLab project in just a few clicks, use it to run your continuous integration jobs, and configure a complete continuous deployment pipeline that lets you preview your changes live and deploy them into production, all served by Kubernetes Engine.

Head over to GitLab, and add your first Kubernetes Engine cluster to your project from the CI/CD options in your repository today!

The Kubernetes Engine cluster can be added through the CI/CD -> Kubernetes menu option in the GitLab UI, which even supports creating a brand new Kubernetes cluster.
Once connected, you can deploy the GitLab Runner into your cluster. This means that your continuous integration jobs will run on your Kubernetes Engine cluster, giving you fine-grained control over the resources you allocate. For more information, read the GitLab Runner docs.

Even more exciting is the new GitLab Auto DevOps integration with Kubernetes Engine. Using Auto DevOps with Kubernetes Engine, you’ll have a continuous deployment pipeline that automatically creates a review app for each merge request (a special dynamic environment that allows you to preview changes before they go live) and, once you merge, deploys the application into production on production-ready Google Kubernetes Engine.

To get started, go to CI/CD -> General pipeline settings, and select “Enable Auto DevOps”. For more information, read the Auto DevOps docs.
Auto DevOps does the heavy lifting to detect what languages you’re using, and configure a Continuous Integration and Continuous Deployment pipeline that results in your app running live on the Kubernetes Engine cluster.
Now, whenever you create a merge request, GitLab will run a review pipeline to deploy a review app to your cluster where you can test your changes live. When you merge the code, GitLab will run a production pipeline to deploy your app to production, running on Kubernetes Engine!

Join us for a webinar co-hosted by Google Cloud and GitLab 


Want to learn more? We’re hosting a webinar to show how to build cloud-native applications with GitLab and Kubernetes Engine. Register here for the April 26th webinar.

Want to get started deploying to Kubernetes? GitLab is offering $500 in Google Cloud Platform credits for new accounts. Try it out.

MobileNetV2: The Next Generation of On-Device Computer Vision Networks



Last year we introduced MobileNetV1, a family of general purpose computer vision neural networks designed with mobile devices in mind to support classification, detection and more. The ability to run deep networks on personal mobile devices improves user experience, offering anytime, anywhere access, with additional benefits for security, privacy, and energy consumption. As new applications emerge allowing users to interact with the real world in real time, so does the need for ever more efficient neural networks.

Today, we are pleased to announce the availability of MobileNetV2 to power the next generation of mobile vision applications. MobileNetV2 is a significant improvement over MobileNetV1 and pushes the state of the art for mobile visual recognition, including classification, object detection and semantic segmentation. MobileNetV2 is released as part of the TensorFlow-Slim Image Classification Library, or you can start exploring MobileNetV2 right away in Colaboratory. Alternately, you can download the notebook and explore it locally using Jupyter. MobileNetV2 is also available as modules on TF-Hub, and pretrained checkpoints can be found on GitHub.

MobileNetV2 builds upon the ideas from MobileNetV1 [1], using depthwise separable convolutions as efficient building blocks. However, V2 introduces two new features to the architecture: 1) linear bottlenecks between the layers, and 2) shortcut connections between the bottlenecks¹. The basic structure is shown below.
Overview of MobileNetV2 Architecture. Blue blocks represent composite convolutional building blocks as shown above.
The intuition is that the bottlenecks encode the model’s intermediate inputs and outputs, while the inner layer encapsulates the model’s ability to transform from lower-level concepts such as pixels to higher-level descriptors such as image categories. Finally, as with traditional residual connections, shortcuts enable faster training and better accuracy. You can learn more about the technical details in our paper, “MobileNetV2: Inverted Residuals and Linear Bottlenecks”.
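As a rough illustration of those two ideas, here is a sketch of an inverted residual block written with tf.keras for readability (this is not the released TF-Slim implementation): expand the input with a 1x1 convolution, filter with a depthwise 3x3 convolution, project back down through a linear 1x1 bottleneck, and add a shortcut when the shapes match.

import tensorflow as tf

def inverted_residual_block(x, expansion, out_channels, stride):
    in_channels = int(x.shape[-1])
    # Expand: 1x1 convolution followed by ReLU6.
    h = tf.keras.layers.Conv2D(expansion * in_channels, 1, use_bias=False)(x)
    h = tf.keras.layers.BatchNormalization()(h)
    h = tf.keras.layers.ReLU(max_value=6.0)(h)
    # Filter: depthwise 3x3 convolution.
    h = tf.keras.layers.DepthwiseConv2D(3, strides=stride, padding="same", use_bias=False)(h)
    h = tf.keras.layers.BatchNormalization()(h)
    h = tf.keras.layers.ReLU(max_value=6.0)(h)
    # Project: linear 1x1 bottleneck, i.e. no activation.
    h = tf.keras.layers.Conv2D(out_channels, 1, use_bias=False)(h)
    h = tf.keras.layers.BatchNormalization()(h)
    # Shortcuts connect bottleneck to bottleneck when the shapes allow it.
    if stride == 1 and in_channels == out_channels:
        h = tf.keras.layers.Add()([x, h])
    return h

inputs = tf.keras.Input(shape=(224, 224, 32))
outputs = inverted_residual_block(inputs, expansion=6, out_channels=32, stride=1)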

How does it compare to the first generation of MobileNets?
Overall, the MobileNetV2 models are faster for the same accuracy across the entire latency spectrum. In particular, the new models use 2x fewer operations, need 30% fewer parameters and are about 30-40% faster on a Google Pixel phone than MobileNetV1 models, all while achieving higher accuracy.
MobileNetV2 improves speed (reduced latency) and increased ImageNet Top 1 accuracy
MobileNetV2 is a very effective feature extractor for object detection and segmentation. For example, for detection, when paired with the newly introduced SSDLite [2], the new model is about 35% faster than MobileNetV1 at the same accuracy. We have open sourced the model under the TensorFlow Object Detection API [4].

Model                    Params    Multiply-Adds    mAP      Mobile CPU
MobileNetV1 + SSDLite    5.1M      1.3B             22.2%    270ms
MobileNetV2 + SSDLite    4.3M      0.8B             22.1%    200ms

To enable on-device semantic segmentation, we employ MobileNetV2 as a feature extractor in a reduced form of DeepLabv3 [3], which was announced recently. On the semantic segmentation benchmark PASCAL VOC 2012, our resulting model attains similar performance to employing MobileNetV1 as the feature extractor, but requires 5.3 times fewer parameters and 5.2 times fewer operations in terms of Multiply-Adds.

Model                      Params    Multiply-Adds    mIOU
MobileNetV1 + DeepLabV3    11.15M    14.25B           75.29%
MobileNetV2 + DeepLabV3    2.11M     2.75B            75.32%

As we have seen, MobileNetV2 provides a very efficient mobile-oriented model that can be used as a base for many visual recognition tasks. We hope that by sharing it with the broader academic and open-source community we can help to advance research and application development.

Acknowledgements:
We would like to acknowledge our core contributors Menglong Zhu, Andrey Zhmoginov and Liang-Chieh Chen. We also give special thanks to Bo Chen, Dmitry Kalenichenko, Skirmantas Kligys, Mathew Tang, Weijun Wang, Benoit Jacob, George Papandreou and Hartwig Adam.

References
  1. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H, arXiv:1704.04861, 2017.
  2. MobileNetV2: Inverted Residuals and Linear Bottlenecks, Sandler M, Howard A, Zhu M, Zhmoginov A, Chen LC. arXiv preprint. arXiv:1801.04381, 2018.
  3. Rethinking Atrous Convolution for Semantic Image Segmentation, Chen LC, Papandreou G, Schroff F, Adam H. arXiv:1706.05587, 2017.
  4. Speed/accuracy trade-offs for modern convolutional object detectors, Huang J, Rathod V, Sun C, Zhu M, Korattikara A, Fathi A, Fischer I, Wojna Z, Song Y, Guadarrama S, Murphy K, CVPR 2017.
  5. Deep Residual Learning for Image Recognition, He K, Zhang X, Ren S, Sun J, arXiv:1512.03385, 2015.


1 The shortcut (also known as skip) connections, popularized by ResNets [5], are commonly used to connect the non-bottleneck layers. MobileNetV2 inverts this notion and connects the bottlenecks directly.

My first open source project and Google Code-in

This is a guest post from a mentor with coala, an open source tool for linting and fixing code in many different languages, which participated in Google Code-in 2017.

About two years ago, my friend Gyan and I built a small web app which checked whether or not a given username was available on a few popular social media websites. The idea was simple: judge availability of the username on the basis of an HTTP response. Here’s a pseudo-code example:
website_url = form_website_url(website, username)
# Eg: form_website_url('github', 'manu-chroma') returns 'github.com/manu-chroma'

if website_url_response.http_code == 404:
    username available
else:
    username taken
Much to our delight, it worked! Well, almost. It had a lot of bugs but we didn’t care much at the time. It was my first Python project and the first time I open sourced my work. I always look back on it as a cool idea, proud that I made it and learned a lot in the process.

But the project had been abandoned until John from coala approached me. John suggested we use it for Google Code-in because one of coala’s tasks for the students was to create accounts on a few common coding-related websites. Students could use the username availability tool to find a good single username (people like their usernames to be consistent across websites) and coala could use it to verify that the accounts were created.

I had submitted a few patches to coala in the past, so this sounded good to me! The competition clashed with my vacation plans, but I wanted to get involved, so I took the opportunity to become a mentor.

Over the course of the program, students not only used the username availability tool but they also began making major improvements. We took the cue and began adding tasks specifically about the tool. Here are just a few of the things students added:
  • Regex to determine whether a given username was valid for any given website
  • More websites, bringing it to a total of 13
  • Tests (!)
The web app is online so you can check username availability too!

I had such a fun time working with students in Google Code-in; their enthusiasm and energy were amazing. Special thanks to students Andrew, Nalin, Joshua, and biscuitsnake for all the time and effort you put into the project. You did really useful work and I hope you learned from the experience!

I want to thank John for approaching me in the first place and suggesting we use and improve the project. He was an unstoppable force throughout the competition, helping both students and fellow mentors. John even helped me with code reviews to really refine the work students submitted, and helped them improve based on the feedback.

Kudos to the Google Open Source team for organizing it so well and lowering the barriers of entry to open source for high school students around the world.

By Manvendra Singh, coala mentor

Kubernetes 1.10: an insider take on what’s new



The Kubernetes community today announced the release of Kubernetes 1.10, just a few weeks since it graduated from CNCF incubation. As a founding member of the CNCF and the primary authors of Kubernetes, Google continues to be the largest contributor to the project in this release, as well as reviewer of contributions and mentor to community members. At Google we believe growing a vibrant community helps deliver a platform that's open and portable, so users benefit by being able to run their workloads consistently anywhere they want.

In this post, we highlight a few elements of the 1.10 release that we helped contribute to.

Container storage plugins


The Kubernetes implementation of the Container Storage Interface (CSI) has moved to beta in Kubernetes 1.10. CSI enables third-party storage providers to develop solutions outside of the core Kubernetes codebase. Because these plugins are decoupled from the core codebase, installing them is as easy as deploying a Pod to your cluster.

Saad Ali (chair of SIG-Storage) is a primary author of both the CSI specification and Kubernetes' implementation of the specification. "Kubernetes provides a powerful volume plugin system that makes it easy to consume different types of block and file storage,” he explains. “However, adding support for new volume plugins has been challenging. With the adoption of the Container Storage Interface, the Kubernetes volume layer is finally becoming truly extensible. Third-parties can now write and deploy plugins exposing new storage systems in Kubernetes without ever having to touch the core Kubernetes code. Ultimately this will give Kubernetes and Kubernetes Engine users more options for the storage that backs their stateful containerized workloads."

Custom resolver configuration


A key feature of Kubernetes is being able to refer to your Services by a simple DNS name, rather than deal with the complexities of an external discovery service. While this works great for internal names, some Kubernetes Engine customers reported that it caused an overload on the internal DNS server for workloads that primarily look up external names.

Zihong Zheng implemented a feature to allow you to customize the resolver on a per-pod basis. "Kubernetes users can now avoid this trade-off if they want to, so that neither ease of use nor flexibility are compromised," he says. Building this upstream means that the feature is available to Kubernetes users wherever they run.
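For illustration, a per-pod resolver override looks something like this in Kubernetes 1.10 (a sketch; the nameserver and search suffix are placeholder values):

apiVersion: v1
kind: Pod
metadata:
  name: custom-dns
spec:
  dnsPolicy: "None"  # bypass the cluster DNS entirely for this pod
  dnsConfig:
    nameservers:
      - 8.8.8.8
    searches:
      - example.internal
  containers:
  - name: app
    image: nginx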


Device plugins and GPU support


Also moving to beta in 1.10 are Device Plugins, an extension mechanism that lets device vendors advertise their resources to the kubelet without changing Kubernetes core code. A primary use case for device plugins is connecting GPUs to Kubernetes.

Jiaying Zhang is Google's feature lead for device plugins. She worked closely with device vendors to understand their needs, identify common requirements, come up with an execution plan, and work with the OSS community to build a production-ready system. Kubernetes Engine support for GPUs is built on the Device Plugins framework, and our early access customers influenced the feature as it moved to production readiness in Kubernetes 1.10.

API extensions


Googler Daniel Smith (co-chair of SIG API Machinery) first proposed the idea of API extension just a couple months after Kubernetes was open-sourced. We now have two methods for extending the Kubernetes API: Custom Resource Definitions (formerly Third-Party Resources), and API Aggregation, which moves to GA in Kubernetes 1.10. Aggregation, which is used to power ecosystem extensions like the Service Catalog and Metrics Server, allows independently built API server binaries to be hosted through the Kubernetes master, with the same authorization, authentication and security configurations on both. “We’ve been running the aggregation layer in every Google Kubernetes Engine cluster since 1.7 without difficulties, so it’s clearly time to promote this mechanism to GA,” says Daniel. "We’re working to provide a complete extensibility solution, which involves getting both CRDs and admission control webhooks to GA by the end of the year.”

Use Kubernetes to run your Spark workloads


Google's contributions to the open Kubernetes ecosystem extend farther than the Kubernetes project itself. Anirudh Ramanathan (chair of SIG-Big Data) led the upstream implementation of native Kubernetes support in Apache Spark 2.3, a headline feature in that release. Along with Yinan Li, we are hard at work on a Spark Operator, which lets you run Spark workloads in an idiomatic Kubernetes fashion.
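To give a flavor of the native support that landed in Spark 2.3, a submission looks something like this (the API server address, image name and jar path are placeholders): you point spark-submit at your cluster’s API server and give it a container image in which to run the driver and executors.

bin/spark-submit \
  --master k8s://https://<api-server-host>:<port> \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=<your-spark-image> \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar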

Paired with the priority and preemption feature implemented by Bobby Salamat (chair of SIG-Scheduling) and David Oppenheimer (co-author of the Borg paper), you’ll soon be able to increase the efficiency of your cluster by using Spark to schedule batch work to run only when the cluster has free resources.

Growing the community


We’re also heavily invested in mentoring for the Kubernetes project. Outreachy is an internship program that helps traditionally underrepresented groups learn and grow tech skills by contributing to open-source projects. Kubernetes' SIG-CLI participated in Outreachy over the 1.10 timeframe with Google's Antoine Pelisse as mentor. With his help, Yolande Amate from Cameroon and Ellen Korbes from Brazil took on the challenge of making improvements to the "kubectl create" and "kubectl set" commands.

With the internship over, Ellen is now a proud Kubernetes project member (and has written a series of blog posts about her path to contribution), and Yolande continues to submit PRs and is working toward her membership.


1.10 available soon on Kubernetes Engine


This latest version of Kubernetes will start rolling out to alpha clusters on Kubernetes Engine in early April. If you want to be among the first to access it on your production clusters, join our early access program today.

If you haven’t tried GCP and Kubernetes Engine before, you can quickly get started with our $300 free credits.

A galactic experience in Google Code-in 2017

This is a guest post from Liquid Galaxy, one of the organizations that participated in both Google Summer of Code and Google Code-in 2017.

Liquid Galaxy, an open source project that powers panoramic views spanning multiple computers and displays, has been participating in Google Summer of Code (GSoC) since 2011. However, we never applied to participate in Google Code-in (GCI) because we heard stories from other projects about long hours and interrupted holidays in service of mentoring eager young students.

That changed in 2017! And, while the stories are true, we have to say it’s also an amazing and worthwhile experience.

It was hard for our small project to recruit the number of mentors needed. Thankfully, our GSoC mentors stepped up, as did many former GSoC students. We even had forward thinking students who were interested in participating in GSoC 2018 volunteer to mentor! While it was challenging, our team of mentors helped us have a nearly flawless GCI experience.

The Google Open Source team only had to nudge us once, when a student’s task had been pending review for more than 36 hours. We’re pretty happy with that considering we had nearly 500 tasks completed over the 50 days of the contest.

More important than our experience, though, is the student experience. We learned a lot, seeing how they chose tasks, the attention to detail some of them put into their work, and the level of interaction between the students and the mentors. Considering these were young students, ranging in age from 13 to 17, they far exceeded our expectations.

There was one piece of advice the Google Open Source team gave us that we didn’t understand as GCI newbies: have a large number of tasks ready from day one, and leave some unpublished until the halfway point. That ended up being key: it ensured we had enough tasks for the initial flood of students and some in reserve for the second flood around the holidays. Our team of mentors worked hard from the moment we were accepted into GCI, creating over 150 tasks in five different categories. Students seemed to think we did a good job and told us they enjoyed the variety of tasks and level of difficulty.

We’re glad we finally participated in Google Code-in and we’ll definitely be applying next time! You can learn more about the project and the students who worked with us on our blog.

By Andreu Ibáñez, Liquid Galaxy org admin

Cloudprober: open source black-box monitoring software

Ever wonder if users can actually access your microservices? Observing timeouts in your applications, but not sure if it’s the network or if your servers are too busy? Curious about the 99th-percentile network latency between your on-premise data center and services running in the cloud?

Cloudprober, which we open sourced last year, answers questions like these and more. It’s black-box monitoring software that "probes" your systems and services and generates metrics based on probe results. This kind of monitoring strategy doesn’t make assumptions about how your service is implemented and it works at the same layer as your service’s users. You can make changes to your service’s implementation with peace of mind, knowing you’ll notice if a change prevents users from accessing the service.

A probe can be anything: a ping, an HTTP request, or even a custom program that mimics how your services are consumed (for example, creating and accessing a blog post). Cloudprober builds and exports standard metrics, and provides a way to easily integrate them with your existing monitoring stack, such as Prometheus-Grafana, Stackdriver and soon InfluxDB. Cloudprober is written in Go and works on all major platforms: Linux, Mac OS, and Windows. It's released as a static binary as well as a Docker image.

Here’s an example probe config that runs an HTTP probe against your forwarding rules and exports data to Stackdriver and Prometheus:
probe {
  name: "internal-web"
  type: HTTP
  # Probe all forwarding rules that contain web-fr in their name.
  targets {
    gce_targets {
      forwarding_rules {}
    }
    regex: "web-fr-.*"
  }
  interval_msec: 5000
  timeout_msec: 1000
  http_probe {
    port: 8080
  }
}

// Export data to stackdriver
surfacer {
  type: STACKDRIVER
}

// Prometheus exporter
surfacer {
  type: PROMETHEUS
}

The probe config is run like this from the command-line:
./cloudprober --config_file $HOME/cloudprober/cloudprober.cfg

This example probe config highlights two major features of Cloudprober: automatic, continuous discovery of cloud targets, and data export over multiple channels (Stackdriver and Prometheus in this case). Cloud deployments are dynamic and are often changing constantly. Cloudprober's dynamic target discovery feature ensures you have one less thing to worry about when doing minor infrastructure changes. Data export in various formats helps it integrate well with your existing monitoring setup.

Other features include:
  • Go text template-based configuration, which adds programming capability to configs, such as "for" loops and conditionals
  • Fast and efficient implementation of core probe types
  • Custom probes through the "external" probe type
  • The ability to read config through metadata
  • And cloud (Stackdriver) logging
Though most of the cloud support is specific to Google Cloud Platform (GCP), it’s easy to add support for other providers. Cloudprober has an extensible architecture so you can add new types of targets, probes and monitoring backends.

Cloudprober was built by the Cloud Networking Site Reliability Engineering (SRE) team at Google to monitor network availability and associated features. Today, it's used by several other Google Cloud SRE teams as well.

We’re excited to share Cloudprober with the wider devops community! You can find more examples in the GitHub repository and more information on the project website.

By Manu Garg, Cloud Networking Team

Open sourcing GTXiLib, an accessibility test automation framework for iOS

Google believes everyone should be able to access and enjoy the web. We share guidance on building accessible tech over at Google Accessibility and we recently launched a dedicated disability support team. Today, we’re excited to announce that we’ve open sourced GTXiLib, an accessibility test automation framework for iOS, under the Apache license.

We want our products to be accessible, and automation with frameworks like GTXiLib is one of the ways we scale our accessibility testing. GTXiLib can automate the process of checking for some kinds of issues, such as missing labels, hints, or low contrast text.

GTXiLib is written in Objective-C and integrates with your existing XCTests to perform all the registered accessibility checks before the test tearDown. When the checks fail, the existing test fails as well. Fixing your tests will thus lead to better accessibility, and your tests can catch new accessibility issues as well.
  • Reuse your tests: GTXiLib integrates into your existing functional tests, enhancing the value of any tests that you have or any that you write.
  • Incremental accessibility testing: GTXiLib can be installed onto a single test case, test class or a specific subset of tests giving you the freedom to add accessibility testing incrementally. This helped drive GTXiLib adoption in large projects at Google.
  • Author your own checks: GTXiLib has a simple API to create custom checks based on the specific needs of your app. For example, you can ensure every button in your app has an accessibilityHint using a custom check.
Do you also care about accessibility? Help us sharpen GTXiLib by suggesting a check or, better yet, writing one. You can add GTXiLib to your project using CocoaPods or by using its Xcode project file.

We hope you find this useful and look forward to feedback and contributions from the community! Please check out the README for more information.

By Siddartha Janga, Google Central Accessibility Team