Kubernetes best practices: How and why to build small container images

Editor’s note: Today marks the first installment in a seven-part video and blog series from Google Developer Advocate Sandeep Dinesh on how to get the most out of your Kubernetes environment. Today he tackles the theory and practicalities of keeping your container images as small as possible.

Docker makes building containers a breeze. Just put a standard Dockerfile into your folder, run the docker ‘build’ command, and shazam! Your container image is built!

The downside of this simplicity is that it’s easy to build huge containers full of things you don’t need—including potential security holes.

In this episode of “Kubernetes Best Practices,” let’s explore how to create production-ready container images using Alpine Linux and the Docker builder pattern, and then run some benchmarks that can determine how these containers perform inside your Kubernetes cluster.

The process for creating container images differs depending on whether you are using an interpreted language or a compiled language. Let’s dive in!

Containerizing interpreted languages

Interpreted languages such as Ruby, Python, Node.js and PHP send source code through an interpreter that runs the code. This gives you the benefit of skipping the compilation step, but has the downside of requiring you to ship the interpreter along with the code.

Luckily, most of these languages offer pre-built Docker images that include a lightweight environment, which lets you run much smaller containers.

Let’s take a Node.js application and containerize it. First, let’s use the “node:onbuild” Docker image as the base. The “onbuild” version of a Docker container pre-packages everything you need to run so you don’t need to perform a lot of configuration to get things working. This means the Dockerfile is very simple (only two lines!). But you pay the price in terms of disk size: almost 700MB!

FROM node:onbuild

By using a smaller base image such as Alpine, you can significantly cut down on the size of your container. Alpine Linux is a small and lightweight Linux distribution that is very popular with Docker users because it’s compatible with a lot of apps, while still keeping containers small.

Luckily, there is an official Alpine image for Node.js (as well as other popular languages) that has everything you need. Unlike the default “node” Docker image, “node:alpine” removes many files and programs, leaving only enough to run your app.

The Alpine Linux-based Dockerfile is a bit more complicated to create as you have to run a few commands that the onbuild image otherwise does for you.

FROM node:alpine
# Run the following commands from /app so npm finds package.json
WORKDIR /app
COPY package.json /app/package.json
RUN npm install --production
COPY server.js /app/server.js
CMD npm start

But it’s worth it, because the resulting image is much smaller, at only 65MB!

Containerizing compiled languages

Compiled languages such as Go, C, C++, Rust and Haskell create binaries that can run without many external dependencies. This means you can build the binary ahead of time and ship it into production without shipping the tools that created it, such as the compiler.

With Docker’s support for multi-stage builds, you can easily ship just the binary and a minimal amount of scaffolding. Let’s learn how.

Let’s take a Go application and containerize it using this pattern. First, let’s use the “golang:onbuild” Docker image as the base. As before, the Dockerfile is only two lines, but again you pay the price in terms of disk size—over 700MB!

FROM golang:onbuild

The next step is to use a slimmer base image, in this case the “golang:alpine” image. So far, this is the same process we followed for an interpreted language.

Again, creating the Dockerfile with an Alpine base image is a bit more complicated as you have to run a few commands that the onbuild image did for you.

FROM golang:alpine
# Copy the source into the image and compile it there
ADD . /app
RUN cd /app && go build -o goapp
# Run the compiled binary when the container starts
ENTRYPOINT ["/app/goapp"]

But again, the resulting image is much smaller, weighing in at only 256MB!

However, we can make the image even smaller: You don’t need any of the compilers or other build and debug tools that Go comes with, so you can remove them from the final container.

Let’s use a multi-stage build to take the binary created by the golang:alpine container and package it by itself.

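# Stage 1: compile the binary with the full Go toolchain
FROM golang:alpine AS build-env
ADD . /app
RUN cd /app && go build -o goapp

# Stage 2: copy only the binary into a minimal Alpine image
FROM alpine
RUN apk update && \
   apk add ca-certificates && \
   update-ca-certificates && \
   rm -rf /var/cache/apk/*
COPY --from=build-env /app/goapp /app/goapp
ENTRYPOINT ["/app/goapp"]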

Would you look at that! This container is only 12MB in size!

While building this container, you may notice that the Dockerfile does strange things such as manually installing HTTPS certificates into the container. This is because the base Alpine Linux ships with almost nothing pre-installed. So even though you need to manually install any and all dependencies, the end result is super small containers!

Note: If you want to save even more space, you could statically compile your app and use the “scratch” container. Using “scratch” as a base container means you are literally starting from scratch with no base layer at all. However, I recommend using Alpine as your base image rather than “scratch” because the few extra MBs in the Alpine image make it much easier to use standard tools and install dependencies.

Where to build and store your containers

In order to build and store the images, I highly recommend the combination of Google Container Builder and Google Container Registry. Container Builder is very fast and automatically pushes images to Container Registry. Most developers should easily get everything done in the free tier, and Container Registry is the same price as raw Google Cloud Storage (cheap!).

Platforms like Google Kubernetes Engine can securely pull images from Google Container Registry without any additional configuration, making things easy for you!

In addition, Container Registry gives you vulnerability scanning tools and IAM support out of the box. These tools can make it easier for you to secure and lock down your containers.

Evaluating performance of smaller containers

People claim that small containers’ big advantage is reduced time—both time-to-build and time-to-pull. Let’s test this, using containers created with onbuild, and ones created with Alpine in a multistage process!

TL;DR: No significant difference for powerful computers or Container Builder, but a significant difference for smaller computers and shared systems (like many CI/CD systems). Small images are always better in terms of absolute performance.

Building images on a large machine

For the first test, I am going to build using a pretty beefy laptop. I’m using our office WiFi, so the download speeds are pretty fast!

For each build, I remove all Docker images in my cache.

Go Onbuild: 35 seconds
Go Multistage: 23 seconds

The build takes 12 seconds longer for the larger container. While this penalty is only paid on the initial build, your Continuous Integration system could pay this price with every build.
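
If you want to reproduce measurements like these on your own machine, a small timing helper is enough. The following is a minimal sketch, not the script used for the numbers in this post: `time_command` and `docker_build_argv` are hypothetical helper names, and the image tag is a placeholder. It also does not clear the local image cache between runs, which the test above did by hand.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run a command to completion and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    # check=True raises CalledProcessError on a non-zero exit code,
    # so a failed build is never mistaken for a fast one.
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def docker_build_argv(tag, context_dir="."):
    """Return the argument list for `docker build -t <tag> <context_dir>`."""
    return ["docker", "build", "-t", tag, context_dir]


# Smoke test with a command that exists wherever Python does.
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"no-op command took {elapsed:.2f}s")
```

With Docker installed, `time_command(docker_build_argv("goapp:multistage"))` would time one build; the same helper works for `docker push` and `docker pull`.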

The next test is to push the containers to a remote registry. For this test, I used Container Registry to store the images.

Go Onbuild: 15 seconds
Go Multistage: 14 seconds

Well, this was interesting! Why does it take nearly the same amount of time to push a 12MB object and a 700MB object? It turns out that Container Registry uses a lot of tricks under the covers, including a global cache for many popular base images.

Finally, I want to test how long it takes to pull the image from the registry to my local machine.

Go Onbuild: 26 seconds
Go Multistage: 6 seconds

At 20 seconds, this is the biggest difference between the two container images. You can start to see the advantage of using a smaller image, especially if you pull images often.

You can also build the containers in the cloud using Container Builder, which has the added benefit of automatically storing them in Container Registry.

Build + Push:
Go Onbuild: 25 seconds
Go Multistage: 20 seconds

So again, there is a small advantage to using the smaller image, but not as dramatic as I would have expected.

Building images on small machines

So is there an advantage to using smaller containers? If you have a powerful laptop with a fast internet connection and/or Container Builder, not really. However, the story changes if you’re using less powerful machines. To simulate this, I used a modest Google Compute Engine f1-micro VM to build, push and pull these images, and the results are staggering!

Build:
Go Onbuild: 52 seconds
Go Multistage: 6 seconds

Push:
Go Onbuild: 54 seconds
Go Multistage: 28 seconds

Pull:
Go Onbuild: 48 seconds
Go Multistage: 16 seconds

In this case, using smaller containers really helps!

Pulling on Kubernetes

While you might not care about the time it takes to build and push the container, you should really care about the time it takes to pull the container. When it comes to Kubernetes, this is probably the most important metric for your production cluster.

For example, let’s say you have a three-node cluster, and one of the nodes crashes. If you are using a managed system like Kubernetes Engine, the system automatically spins up a new node to take its place.

However, this new node will be completely fresh, and will have to pull all your containers before it can start working. The longer it takes to pull the containers, the longer your cluster isn’t performing as well as it should!

This can occur when you increase your cluster size (for example, using Kubernetes Engine Autoscaling), or upgrade your nodes to a new version of Kubernetes (stay tuned for a future episode on this).

We can see that the pull performance of multiple containers from multiple deployments can really add up here, and using small containers can potentially shave minutes from your deployment times!
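
To get a feel for how this adds up, here is a rough back-of-the-envelope sketch. It reuses the per-image pull times measured on the large machine earlier in this post. Everything else is an assumption for illustration: the image count is hypothetical, the images are treated as similar in size, pulls are assumed to happen one after another, and no layers are shared between images.

```python
def total_pull_seconds(seconds_per_image, image_count):
    """Estimate the time to pull image_count images one after another."""
    return seconds_per_image * image_count


# Per-image pull times measured on the large machine earlier in this post.
ONBUILD_PULL_S = 26
MULTISTAGE_PULL_S = 6

# Hypothetical: a fresh node that must pull 10 distinct images of similar size.
IMAGES_PER_NODE = 10

onbuild_total = total_pull_seconds(ONBUILD_PULL_S, IMAGES_PER_NODE)
multistage_total = total_pull_seconds(MULTISTAGE_PULL_S, IMAGES_PER_NODE)
saved = onbuild_total - multistage_total
print(f"onbuild: {onbuild_total}s, multistage: {multistage_total}s, saved: {saved}s")
# prints "onbuild: 260s, multistage: 60s, saved: 200s"
```

Under these assumptions the new node is ready a little over three minutes sooner with the smaller images.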

Security and vulnerabilities

Aside from performance, there are significant security benefits from using smaller containers. Small containers usually have a smaller attack surface as compared to containers that use large base images.

I built the Go “onbuild” and “multistage” containers a few months ago, so they probably contain some vulnerabilities that have since been discovered. Using Container Registry’s built-in Vulnerability Scanning, it’s easy to scan your containers for known vulnerabilities. Let’s see what we find.

Wow, that’s a big difference between the two! Only three “medium” vulnerabilities in the smaller container, compared with 16 critical and over 300 other vulnerabilities in the larger container.

Let’s drill down and see which issues the larger container has.

You can see that most of the issues have nothing to do with our app, but rather programs that we are not even using! Because the multistage image is using a much smaller base image, there are just fewer things that can be compromised.


The performance and security advantages of using small containers speak for themselves. Using a small base image and the “builder pattern” can make it easier to build small images, and there are many other techniques for individual stacks and programming languages to minimize container size as well. Whatever you do, you can be sure that your efforts to keep your containers small are well worth it!

Check in next week when we’ll talk about using Kubernetes namespaces to isolate clusters from one another. And don’t forget to subscribe to our YouTube channel and Twitter for the latest updates.

If you haven’t tried GCP and our various container services before, you can quickly get started with our $300 free credits.

Introducing kaniko: Build container images in Kubernetes and Google Container Builder without privileges

Building images from a standard Dockerfile typically relies upon interactive access to a Docker daemon, which requires root access on your machine to run. This can make it difficult to build container images in environments that can’t easily or securely expose their Docker daemons, such as Kubernetes clusters (for more about this, check out the 16th oldest open Kubernetes issue).

To overcome these challenges, we’re excited to introduce kaniko, an open-source tool for building container images from a Dockerfile even without privileged root access. With kaniko, we both build an image from a Dockerfile and push it to a registry. Since it doesn’t require any special privileges or permissions, you can run kaniko in a standard Kubernetes cluster, Google Kubernetes Engine, or in any environment that can’t provide privileged access or a Docker daemon.

How does kaniko work?

We run kaniko as a container image that takes in three arguments: a Dockerfile, a build context and the name of the registry to which it should push the final image. This image is built from scratch, and contains only a static Go binary plus the configuration files needed for pushing and pulling images.

The kaniko executor then fetches and extracts the base-image file system to root (the base image is the image in the FROM line of the Dockerfile). It executes each command in order, and takes a snapshot of the file system after each command. This snapshot is created in user-space by walking the filesystem and comparing it to the prior state that was stored in memory. It appends any modifications to the filesystem as a new layer to the base image, and makes any relevant changes to image metadata. After executing every command in the Dockerfile, the executor pushes the newly built image to the desired registry.

Kaniko unpacks the filesystem, executes commands and snapshots the filesystem completely in user-space within the executor image, which is how it avoids requiring privileged access on your machine. Neither the Docker daemon nor the Docker CLI is involved.
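
The snapshot step described above (walk the filesystem, compare it to the prior state) can be illustrated with a short sketch. This is not kaniko's code, which is written in Go; the function names `snapshot` and `diff` are hypothetical, and the sketch compares only file contents, ignoring permissions, ownership, symlinks and empty directories that a real image layer would have to record.

```python
import hashlib
import os


def snapshot(root):
    """Walk `root` and map each file's relative path to a SHA-256 of its contents."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            state[os.path.relpath(full, root)] = digest
    return state


def diff(before, after):
    """Compare two snapshots and return sorted (added, modified, deleted) path lists."""
    added = sorted(p for p in after if p not in before)
    deleted = sorted(p for p in before if p not in after)
    modified = sorted(p for p in after if p in before and after[p] != before[p])
    return added, modified, deleted
```

Taking a snapshot before and after each Dockerfile command and diffing the two yields the set of changes that would go into that command's layer.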

Running kaniko in a Kubernetes cluster

To run kaniko in a standard Kubernetes cluster your pod spec should look something like this, with the args parameters filled in. In this example, a Google Cloud Storage bucket provides the build context.

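apiVersion: v1
kind: Pod
metadata:
  name: kaniko
spec:
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:latest
    args: ["--dockerfile=<path to Dockerfile>",
           "--bucket=<GCS bucket>",
           "--destination=<registry/image:tag>"]
    volumeMounts:
      - name: kaniko-secret
        mountPath: /secret
    env:
      # Points Google client libraries at the mounted service-account key
      - name: GOOGLE_APPLICATION_CREDENTIALS
        value: /secret/kaniko-secret.json
  restartPolicy: Never
  volumes:
    - name: kaniko-secret
      secret:
        secretName: kaniko-secret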

You’ll need to mount a Kubernetes secret that contains the necessary authentication to push the final image to a registry. You can find instructions for downloading the secret here.

Running kaniko in Google Cloud Container Builder

To run kaniko in Google Cloud Container Builder, we can add it as a build step to the build config:

steps:
 - name: gcr.io/kaniko-project/executor:latest
   args: ["--dockerfile=<path to Dockerfile>",
          "--context=<path to build context>",
          "--destination=<registry/image:tag>"]

The kaniko executor image will both build and push the image in this build step.

Comparison with other tools

Similar tools to kaniko include img and orca-build. Like kaniko, both tools build container images from Dockerfiles, but with different approaches and security trade-offs. The img tool builds as an unprivileged user within a container, while kaniko builds as a root user within a container in an unprivileged environment. The orca-build tool executes builds by wrapping runc, which uses kernel namespacing techniques to execute RUN commands. We're able to accomplish the same thing in kaniko by executing commands as a root user within a container.


You can find more documentation about running kaniko in our GitHub repo. Please open an issue if you run into any bugs! If you have any questions or feedback you can also reach out to us in our Google group.

Reflecting on our ten year App Engine journey

Ten years ago, we announced Google App Engine with a simple video and a blog post. It is truly humbling to look back at the strides we have made with App Engine, how it’s changed application development, and inspired our customers to develop on the cloud and build their businesses on our platform.

As an early member of the engineering team, I have a few key memories that stand out from the early days. I remember the excitement of seeing the first App Engine app crack the 50 qps barrier, followed by the ensuing “uh oh” moment when we realized that it might keep growing. I remember the time someone wanted to acquire one of our hastily-developed demo apps, and also the team meeting shortly after launch when we decided to figure out a way to let customers pay us money. From those modest roots, it’s amazing to see how far we’ve come and to see App Engine’s DNA throughout Google Cloud Platform (GCP).

A decade of digital transformation

Over the past decade, technology has had an impressive influence on our everyday lives, from mobile experiences and ML/AI technologies to blockchains and quantum computing. Today, businesses reach their customers across a number of different web and mobile platforms, in real time.

This fundamental shift in technology means that application developers have a completely different set of requirements today than they did ten years ago: agility, faster time to market, and zero overhead to name a few. Core App Engine concepts like serverless, zero server management, event-driven programming, and paying only for the resources you use are just as relevant today as they were 10 years ago. Businesses have moved away from owning infrastructure resources on-premises to running virtual machines on the cloud, freeing them from managing infrastructure and allowing them to focus on writing code.

How App Engine has evolved

We introduced App Engine with the goal of empowering developers to be more productive. With App Engine, you have the freedom to focus on building great applications, not managing them. That goal was just as important then as it is today.

App Engine and its associated NoSQL storage system, Datastore, started out as a fully managed platform that let developers access the same building blocks that we use to run our own internal applications, and build applications that run reliably, even under heavy load, with large amounts of data at global scale. I want to highlight some of the key innovations that we’ve added to App Engine in the past decade.

Where we're headed from here

App Engine was one of the very first investments for GCP: a fully managed serverless platform before businesses really understood the concept of serverless. In ten years, we shipped a lot of features, learned countless lessons, and empowered many businesses such as Best Buy, IDEXX Laboratories, and Khan Academy, and we’re not done yet! App Engine has an exciting future ahead. We’re working on new features, new runtimes and customer-driven capabilities that we’re excited to share with you in the coming months. I’m sure the next ten years will be as exciting as the first ten.

If you haven’t already done so, try out App Engine today and share your feedback and comments with the team.

Whitepaper: Running your modern .NET Application on Kubernetes

This week we conclude our whitepaper series on migration with the final installment, “Running Your Modern .NET Application on Kubernetes.” If you’re just tuning in, you may want to read the first and second posts in the series, as well as the corresponding whitepapers.

Using the .NET-based PetShop application as an example, the first paper discusses monoliths versus microservices, and how to think about deconstructing a monolith into microservices using domain driven design. The second paper explores modernization techniques using microservices, including using Firebase for authentication, and Cloud SQL for PostgreSQL for our data layer.

By now, we’ve come a long way towards modernizing PetShop, but we aren’t quite at the point where we can consider it “cloud-native.” In this final paper we dive into the missing components of the equation: containerizing PetShop with Kubernetes, and orchestration, including autoscaling with Kubernetes.

Containerizing the application provides speed and agility. In a traditional development and deployment pipeline, there are usually environmental discrepancies—configuration differences between a software engineer’s laptop, test and staging environments, as well as your production environment. Containers allow you to run code in a consistent way across multiple environments.

All that said, containers are not a panacea, and they can be cumbersome to run at scale, which is why orchestration is imperative. Kubernetes orchestration tools can help you schedule container instances, monitor the health of containers, automate recovery and even automate scaling containers up and down to help handle load.
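
To make the scaling idea concrete, the sketch below applies the core formula that the Kubernetes documentation gives for the Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. This is a simplified illustration, not the whitepaper's configuration; the function name and the example numbers are hypothetical, and the real controller also applies a tolerance band, min/max replica bounds and readiness checks that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric):
    """Core autoscaling formula: ceil(current_replicas * current_metric / target_metric)."""
    return math.ceil(current_replicas * current_metric / target_metric)


# Hypothetical example: 4 replicas averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60))  # prints 6

# Load drops to 30% average CPU: the same formula scales back down.
print(desired_replicas(4, 30, 60))  # prints 2
```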

Are you ready to get started on the last leg of this .NET application modernization journey? Download the whitepaper, and visit the GitHub repository.

Now, you can deploy to Kubernetes Engine from GitLab with a few clicks

In cloud developer circles, GitLab is a popular DevOps lifecycle tool. It lets you do everything from project planning and version control to CI/CD pipelines and monitoring, all in a single interface so different functional teams can collaborate. In particular, its Auto DevOps feature detects the language your app is written in and automatically builds your CI/CD pipelines for you.

Google Cloud started the cloud native movement with the invention and open sourcing of Kubernetes in 2014. Kubernetes draws on over a decade of experience running containerized workloads in production serving Google products at massive scale. Kubernetes Engine is our managed Kubernetes service, built by the same team that's the largest contributor to the Kubernetes open source project, and is run by experienced Google SREs, all of which enables you to focus on what you do best: creating applications to delight your users, while leaving the cluster deployment operations to us.

Today, GitLab and Google Cloud are announcing a new integration of GitLab and Kubernetes Engine that makes it easy for you to accelerate your application deployments by provisioning Kubernetes clusters, managed by Google, right from your DevOps pipeline supported by GitLab. You can now connect your Kubernetes Engine cluster to your GitLab project in just a few clicks, and use it to run your continuous integration jobs, and configure a complete continuous deployment pipeline, including previewing your changes live, and deploying them into production, all served by Kubernetes Engine.

Head over to GitLab, and add your first Kubernetes Engine cluster to your project from the CI/CD options in your repository today!

The Kubernetes Engine cluster can be added through the CI/CD -> Kubernetes menu option in the GitLab UI, which even supports creating a brand new Kubernetes cluster.

Once connected, you can deploy the GitLab Runner into your cluster. This means that the continuous integration jobs will run on your Kubernetes Engine cluster, giving you fine-grained control over the resources you allocate. For more information, read the GitLab Runner docs.

Even more exciting is the new GitLab Auto DevOps integration with Kubernetes Engine. Using Auto DevOps with Kubernetes Engine, you'll have a continuous deployment pipeline that automatically creates a review app for each merge request (a special dynamic environment that allows you to preview changes before they go live) and, once you merge, deploys the application into production on production-ready Google Kubernetes Engine.

To get started, go to CI/CD -> General pipeline settings, and select “Enable Auto DevOps”. For more information, read the Auto DevOps docs.

Auto DevOps does the heavy lifting to detect what languages you’re using, and configures a Continuous Integration and Continuous Deployment pipeline that results in your app running live on the Kubernetes Engine cluster.

Now, whenever you create a merge request, GitLab will run a review pipeline to deploy a review app to your cluster where you can test your changes live. When you merge the code, GitLab will run a production pipeline to deploy your app to production, running on Kubernetes Engine!

Join us for a webinar co-hosted by Google Cloud and GitLab 

Want to learn more? We’re hosting a webinar to show how to build cloud-native applications with GitLab and Kubernetes Engine. Register here for the April 26th webinar.

Want to get started deploying to Kubernetes? GitLab is offering $500 in Google Cloud Platform credits for new accounts. Try it out.

How to run Windows Containers on Compute Engine

Container virtualization is a rapidly evolving technology that can simplify how you deploy and manage distributed applications. When people discuss containers, they usually mean Linux-based containers. This makes sense, because native Linux kernel features like cgroups introduced the idea of resource isolation, eventually leading to containers as we know them today.

For a long time, you could only containerize Linux processes, but Microsoft introduced support for Windows-based containers in Windows Server 2016 and Windows 10. With this, you can now take an existing Windows application, containerize it using Docker, and run it as an isolated container on Windows. Microsoft supports two flavors of Windows containers: Windows Server and Hyper-V. You can build Windows containers on either the microsoft/windowsservercore or the microsoft/nanoserver base image. You can read more about Windows containers in the Microsoft Windows containers documentation.

Windows containers, meet Google Cloud

Google Cloud provides container-optimized VM images that you can use to run containers on Compute Engine. We also offer a Windows VM image for containers that comes with Docker, microsoft/windowsservercore, and microsoft/nanoserver base images installed.

To run Windows containers on Compute Engine, first you need a Windows VM. In Google Cloud Platform Console, go to the Compute Engine section and create an instance. Make sure you’ve selected Windows Server version 1709 Datacenter Core for Containers (Beta) for the boot disk.

After you’ve created the instance, create a new Windows password either from the console or gcloud and RDP into the VM. Once inside the VM, you’ll notice that it’s a bare-minimum OS with a minimal UI. The good news is that Docker and the base images are already installed; that’s all you need to run Windows containers.

For the first Windows container, let’s create a HelloWorld PowerShell script that we can call, similar to this example.

The microsoft/nanoserver:1709 image is already installed, but that image does not include PowerShell. Instead, there's a microsoft/powershell image, based on the microsoft/nanoserver:1709 image, that we can use.

First, let’s pull the PowerShell image and run it:

C:\..> docker pull microsoft/powershell:6.0.1-nanoserver-1709
C:\..> docker run -it microsoft/powershell:6.0.1-nanoserver-1709

This takes us inside the PowerShell container. Now, we can create a HelloWorld PowerShell script and exit the container:

PS C:\> Add-Content C:\Users\Public\helloworld.ps1 'Write-Host "Hello World"'
PS C:\> exit

We now need to create a new image that has the PowerShell script, using the modified container. Get the container ID and create a new container image with that ID using the docker commit command:

C:\..> docker ps -a
C:\..> docker commit <container-id> helloworld

Finally, we can run the image by pointing to the script inside the container:

C:\..> docker run --rm helloworld pwsh c:\Users\Public\helloworld.ps1
Hello World

There you go, you’re running a Windows container on GCP!

If you want to try the steps on your own, we have a codelab you can try out. Just don’t forget to shut down or delete the VM when you’re done experimenting, to avoid incurring charges!

To learn more about Windows containers and Windows on GCP, check out our documentation. And if you have feedback or want to know more, drop us a note in the comments.

Kubernetes 1.10: an insider take on what’s new

The Kubernetes community today announced the release of Kubernetes 1.10, just a few weeks after it graduated from CNCF incubation. As a founding member of the CNCF and the primary author of Kubernetes, Google continues to be the largest contributor to the project in this release, as well as a reviewer of contributions and mentor to community members. At Google we believe growing a vibrant community helps deliver a platform that's open and portable, so users benefit by being able to run their workloads consistently anywhere they want.

In this post, we highlight a few elements of the 1.10 release that we helped contribute to.

Container storage plugins

The Kubernetes implementation of the Container Storage Interface (CSI) has moved to beta in Kubernetes 1.10. CSI enables third-party storage providers to develop solutions outside of the core Kubernetes codebase. Because these plugins are decoupled from the core codebase, installing them is as easy as deploying a Pod to your cluster.

Saad Ali (chair of SIG-Storage) is a primary author of both the CSI specification and Kubernetes' implementation of the specification. "Kubernetes provides a powerful volume plugin system that makes it easy to consume different types of block and file storage,” he explains. “However, adding support for new volume plugins has been challenging. With the adoption of the Container Storage Interface, the Kubernetes volume layer is finally becoming truly extensible. Third-parties can now write and deploy plugins exposing new storage systems in Kubernetes without ever having to touch the core Kubernetes code. Ultimately this will give Kubernetes and Kubernetes Engine users more options for the storage that backs their stateful containerized workloads."

Custom resolver configuration

A key feature of Kubernetes is being able to refer to your Services by a simple DNS name, rather than deal with the complexities of an external discovery service. While this works great for internal names, some Kubernetes Engine customers reported that it caused an overload on the internal DNS server for workloads that primarily look up external names.

Zihong Zheng implemented a feature to allow you to customize the resolver on a per-pod basis. "Kubernetes users can now avoid this trade-off if they want to, so that neither ease of use nor flexibility are compromised," he says. Building this upstream means that the feature is available to Kubernetes users wherever they run.
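
As an illustration of what per-pod resolver customization can look like, the sketch below builds a Pod manifest that uses the `dnsPolicy` and `dnsConfig` fields of the Kubernetes Pod API. The post itself does not name these fields, so treat the mapping as an assumption; the helper name, pod name, image and nameserver address are all hypothetical placeholders.

```python
import json


def pod_with_custom_dns(name, image, nameservers, searches=(), ndots=None):
    """Return a Pod manifest (as a dict) whose resolver settings come only from dnsConfig."""
    dns_config = {"nameservers": list(nameservers)}
    if searches:
        dns_config["searches"] = list(searches)
    if ndots is not None:
        # Resolver option values are strings in the Pod API.
        dns_config["options"] = [{"name": "ndots", "value": str(ndots)}]
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # "None" means: ignore the cluster's DNS settings and use dnsConfig alone.
            "dnsPolicy": "None",
            "dnsConfig": dns_config,
            "containers": [{"name": name, "image": image}],
        },
    }


manifest = pod_with_custom_dns("external-lookups", "example.com/app:1.0", ["8.8.8.8"], ndots=1)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest could be saved to a file and applied directly. A pod configured this way sends lookups straight to the listed nameserver instead of the cluster's internal DNS server.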

Device plugins and GPU support

Also moving to beta in 1.10 are Device Plugins, an extension mechanism that lets device vendors advertise their resources to the kubelet without changing Kubernetes core code. A primary use case for device plugins is connecting GPUs to Kubernetes.

Jiaying Zhang is Google's feature lead for device plugins. She worked closely with device vendors to understand their needs, identify common requirements, come up with an execution plan, and work with the OSS community to build a production-ready system. Kubernetes Engine support for GPUs is built on the Device Plugins framework, and our early access customers influenced the feature as it moved to production readiness in Kubernetes 1.10.
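
From a user's point of view, a device advertised by a device plugin is requested like any other resource, in the container's resource limits. The sketch below assumes the NVIDIA device plugin, which advertises GPUs under the resource name `nvidia.com/gpu`; other vendors use their own names. The helper name, pod name and image are hypothetical.

```python
import json


def gpu_pod(name, image, gpu_count=1):
    """Return a Pod manifest (as a dict) that requests GPUs via an extended resource."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "OnFailure",
            "containers": [{
                "name": name,
                "image": image,
                # Assumed resource name: the one advertised by the NVIDIA device plugin.
                "resources": {"limits": {"nvidia.com/gpu": gpu_count}},
            }],
        },
    }


gpu_manifest = gpu_pod("gpu-job", "example.com/trainer:1.0", gpu_count=2)
print(json.dumps(gpu_manifest, indent=2))
```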

API extensions

Googler Daniel Smith (co-chair of SIG API Machinery) first proposed the idea of API extension just a couple months after Kubernetes was open-sourced. We now have two methods for extending the Kubernetes API: Custom Resource Definitions (formerly Third-Party Resources), and API Aggregation, which moves to GA in Kubernetes 1.10. Aggregation, which is used to power ecosystem extensions like the Service Catalog and Metrics Server, allows independently built API server binaries to be hosted through the Kubernetes master, with the same authorization, authentication and security configurations on both. “We’ve been running the aggregation layer in every Google Kubernetes Engine cluster since 1.7 without difficulties, so it’s clearly time to promote this mechanism to GA,” says Daniel. "We’re working to provide a complete extensibility solution, which involves getting both CRDs and admission control webhooks to GA by the end of the year.”

Use Kubernetes to run your Spark workloads

Google's contributions to the open Kubernetes ecosystem extend farther than the Kubernetes project itself. Anirudh Ramanathan (chair of SIG-Big Data) led the upstream implementation of native Kubernetes support in Apache Spark 2.3, a headline feature in that release. Along with Yinan Li, we are hard at work on a Spark Operator, which lets you run Spark workloads in an idiomatic Kubernetes fashion.

Paired with the priority and preemption feature implemented by Bobby Salamat (chair of SIG-Scheduling) and David Oppenheimer (co-author of the Borg paper), you'll soon be able to increase the efficiency of your cluster by using Spark to schedule batch work to run only when the cluster has free resources.
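The priority mechanism itself can be sketched independently of Spark. The script below writes two PriorityClass objects and a batch Pod that uses the lower one, so that pods in the higher class may preempt it when resources run short. All names, values and the image are hypothetical, and the `scheduling.k8s.io/v1` API version shown is newer than the one available in 1.10.

```shell
#!/usr/bin/env bash
# Sketch only: two priority classes plus a low-priority batch Pod.
# All names, values and the image are hypothetical.
# Uses scheduling.k8s.io/v1, which postdates Kubernetes 1.10.
set -euo pipefail

cat > priority-example.yaml <<'EOF'
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: serving-high
value: 1000000
globalDefault: false
description: "Latency-sensitive serving workloads."
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: batch-low
value: 1000
globalDefault: false
description: "Batch work that may be preempted."
---
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker
spec:
  priorityClassName: batch-low   # lower value, so it yields to serving-high pods
  containers:
  - name: worker
    image: example.com/batch-worker:latest
EOF

echo "Wrote priority-example.yaml"
```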

Growing the community

We’re also heavily invested in mentoring for the Kubernetes project. Outreachy is an internship program that helps traditionally underrepresented groups learn and grow tech skills by contributing to open-source projects. Kubernetes' SIG-CLI participated in Outreachy over the 1.10 timeframe with Google's Antoine Pelisse as mentor. With his help, Yolande Amate from Cameroon and Ellen Korbes from Brazil took on the challenge of making improvements to the "kubectl create" and "kubectl set" commands.

With the internship over, Ellen is now a proud Kubernetes project member (and has written a series of blog posts about her path to contribution), and Yolande continues to submit PRs and is working toward her membership.

1.10 available soon on Kubernetes Engine

This latest version of Kubernetes will start rolling out to alpha clusters on Kubernetes Engine in early April. If you want to be among the first to access it on your production clusters, join our early access program today.

If you haven’t tried GCP and Kubernetes Engine before, you can quickly get started with our $300 free credits.

Kubernetes Engine Private Clusters now available in beta

Google Cloud Platform (GCP) employs several security measures to help ensure authenticity, privacy and integrity of your data in transit. As enterprise users turn to Google Kubernetes Engine (GKE) as their preferred deployment model, they too require the same levels of privacy for their data.

Today, we're excited to announce the beta launch of Kubernetes Engine Private Clusters. Now, Kubernetes Engine allows you to deploy clusters privately as part of the Google Virtual Private Cloud (VPC), which provides an isolated boundary where you can securely deploy your applications. With Kubernetes Engine Private Clusters, your cluster’s nodes can only be accessed from within the trusted VPC. In addition, private clusters protect the master node from unwanted access, as the master is completely blocked from access from the internet by default.

In the Kubernetes Engine Private Cluster model, your nodes have access to the rest of your VPC private deployments, including private access to Google managed services such as gcr.io, Google Cloud Storage and Google BigQuery. Access to the internet isn’t possible unless you set up additional mechanisms such as a NAT gateway.

Kubernetes Engine Private Clusters greatly simplify PCI-DSS compliance of your deployments, by limiting how a cluster can be reached from outside of a private network.

Let's take a closer look at how Kubernetes Engine Private Clusters fit into GCP’s private VPC model.

Get started with Private Clusters on Kubernetes Engine

The following tutorial highlights how you can enable Private Clusters for your deployments. In this private cluster model, the Kubernetes Engine cluster nodes are allocated private IP addresses and the master is protected from internet access. As you can see in the example below, you enable a Kubernetes Engine Private Cluster at cluster creation time, selecting the private IP range within your RFC 1918 IP space to use for your master, nodes, pods and services.

Note that you must deploy Kubernetes Engine Private Clusters with IP Aliases enabled. Private clusters also require cluster version 1.8.5 or later.

The following diagram displays the internals of private clusters:

The fastest way to get started is to use the UI during cluster creation.

Alternatively, you can create your private cluster with the GCP gcloud CLI:

# Create a Private Cluster (with IP Alias auto-subnetwork)
gcloud beta container clusters create <cluster> --project=<project_id> \
  --zone=<zone> --private-cluster --master-ipv4-cidr=<master_cidr_block> \
  --enable-ip-alias --create-subnetwork=""

The Master Authorized Network firewall protects access to the Kubernetes Engine master. When Kubernetes Engine Private Clusters is enabled, it's set to “default deny,” making your master inaccessible from the public internet at creation time.
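Because the default is to deny, you would normally allow at least one trusted range before administering the cluster from outside the VPC. The sketch below only prints a candidate `gcloud` command rather than running it. The cluster name, zone and CIDR range are hypothetical, and during the beta the command may need the `gcloud beta` prefix.

```shell
#!/usr/bin/env bash
# Sketch only: print (do not run) a gcloud command that allows one CIDR
# range to reach the master. Cluster name, zone and CIDR are hypothetical.
set -euo pipefail

authorize_cmd() {
  local cluster="$1" zone="$2" cidr="$3"
  printf 'gcloud container clusters update %s --zone=%s --enable-master-authorized-networks --master-authorized-networks=%s\n' \
    "$cluster" "$zone" "$cidr"
}

# Example: allow a documentation-only address range.
authorize_cmd my-private-cluster us-west1-a 203.0.113.0/24
```

Printing the command first makes it easy to review the range being opened before anything is changed on the cluster.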

Try it out today!

Create a Kubernetes Engine Private Cluster today. Stay tuned for more updates in this space as we continue to invest in Kubernetes Engine to ensure customers get defense-in-depth security features.

Interested in optimal load balancing?

Do you want to get access to a more container-native load balancing approach in Kubernetes Engine? Sign up here!

Easy HPC clusters on GCP with Slurm

High performance and technical computing is all about scale and speed. Many applications, such as drug discovery and genomics, financial services and image processing, require access to a large and diverse set of computing resources on demand. With more and faster computing power, you can convert an idea into a discovery, a hypothesis into a cure or an inspiration into a product. Google Cloud provides the HPC community with on-demand access to large amounts of high-performance resources with Compute Engine. But a challenge remains: how do you harness these powerful resources to execute your HPC jobs quickly, and seamlessly augment an existing HPC cluster with Compute Engine capacity?

To help with this problem, we teamed up with SchedMD to release a preview of tools that make it easier to launch the Slurm workload manager on Compute Engine, and to expand your existing cluster when you need extra resources. This integration was built by the experts at SchedMD in accordance with Slurm best practices. Slurm is a leading open-source HPC workload manager, used by many of the TOP500 supercomputers around the world.

With this integration, you can easily launch an auto-scaled Slurm cluster on Compute Engine. The cluster auto-scales according to job requirements and queue depth. Once the Slurm cloud cluster is set up, you can also use Slurm to federate jobs from your on-premises cluster to the Slurm cluster running in Compute Engine. With your HPC cluster in the cloud, you can give each researcher, team or job a dedicated, tailor-fit set of elastic resources so they can focus on solving their problems rather than waiting in queues.

Let’s walk through launching a Slurm cluster on Compute Engine.

Step 1: Grab the Cloud Deployment Manager scripts from SchedMD’s GitHub repository. Review the included README.md for more information. You may want to customize the deployment manager scripts for your needs. Many cluster parameters can be configured in the included slurm-cluster.yaml file.

At a minimum, you need to edit slurm-cluster.yaml to paste in your munge_key and specify your GCP username in default_users and the Slurm version you want to use (e.g., 17.11.5).

Step 2: Run the following command from the Cloud Shell or from a local terminal with the gcloud command installed:

gcloud deployment-manager deployments create slurm --config 

Then, navigate to the Deployment Manager section of the developer console and confirm that your deployment was successful.

Step 3: If you navigate to the Compute Engine section of the developer console, you’ll see that Deployment Manager created a number of VM instances for you, among them a Slurm login node. After the VMs are provisioned, Slurm will be installed and configured on the VMs. You can now SSH into the login node by clicking the SSH button in the console or by running gcloud compute ssh login1 --zone=us-west1-a (Note: You may need to change the zone if you modified it in the slurm-cluster.yaml file.)

Once you’ve logged in, you can interact with Slurm and submit jobs as usual using sbatch. For example, copy the sample script below into a new file called slurm-sample1.sh:

#!/bin/bash
#SBATCH --job-name=hostname_sleep_sample
#SBATCH --output=out_%j.txt
#SBATCH --nodes=2

srun hostname
sleep 60

Then, submit it with:

sbatch slurm-sample1.sh

You can then observe the job being distributed to the compute nodes using the sinfo and squeue commands. If a submitted job requires more resources than were initially deployed, new instances are created automatically, up to the maximum specified in slurm-cluster.yaml. To try this, set #SBATCH --nodes=4 and resubmit the job. Once the ephemeral compute instances have been idle for the specified period of time, they'll be deprovisioned.
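The four-node experiment described above can be sketched as a second batch script. The file name slurm-sample4.sh is a hypothetical choice; everything else mirrors the earlier sample, with only the node count changed.

```shell
#!/usr/bin/env bash
# Sketch only: write a 4-node variant of the sample batch script.
# The file name slurm-sample4.sh is hypothetical.
set -euo pipefail

cat > slurm-sample4.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hostname_sleep_sample
#SBATCH --output=out_%j.txt
#SBATCH --nodes=4

srun hostname
sleep 60
EOF

echo "Wrote slurm-sample4.sh"
```

On the login node you would submit it with sbatch slurm-sample4.sh, then run squeue and sinfo a few times to watch the job wait while the extra compute instances come up.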

Note that for your convenience the deployment manager scripts set up NFS as part of the deployment.

Check out the included README for more information, and if you need help getting started with Slurm, check out the quick start guide or contact SchedMD.

Network policies for Kubernetes are generally available

We're pleased to announce the GA of network policies for Kubernetes, which we originally announced in beta last September. Network policies are fully tested and supported for production workloads on Google Kubernetes Engine, and, as a community, we recommend users enable them.

Network policies are sets of constraints that allow Kubernetes admins to designate how groups of Pods can communicate with each other, allowing the creation of a hierarchy of network controls. For example, if you have a multi-tier application, you can create a network policy that ensures a compromised front-end service doesn’t communicate with a back-end service such as billing.
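The multi-tier example can be sketched as a single policy. The script below writes a NetworkPolicy that lets pods labeled `app: billing` accept traffic only from pods labeled `app: api` in the same namespace, so a front-end pod cannot reach billing directly. The labels and the policy name are hypothetical, and the policy is enforced only on clusters where network policy enforcement is enabled.

```shell
#!/usr/bin/env bash
# Sketch only: write a NetworkPolicy that restricts ingress to billing pods.
# The labels (app: billing, app: api) and the policy name are hypothetical.
set -euo pipefail

cat > billing-policy.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: billing-allow-api-only
spec:
  podSelector:
    matchLabels:
      app: billing          # the pods this policy protects
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: api          # the only pods allowed to connect
EOF

echo "Wrote billing-policy.yaml"
```

Once a policy selects a Pod for ingress, traffic not matched by an allow rule is dropped, so any source other than the `app: api` pods, including a compromised front end, is refused.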

Network policies for Kubernetes Engine were implemented in close collaboration with our partner Tigera, the company that’s driving Project Calico.

With GA, the community has added the following additional features:

  • Test support for up to 2,000 Kubernetes Engine nodes 
  • Support for the latest network policies API, currently at Kubernetes 1.9 
  • Calico version 2.6.7, which implements the network policies feature 
  • Calico Kubernetes Engine images on Google Container Registry 

What’s next for Kubernetes network policies?

  • Upgrading to Calico 3.0. For the purposes of this release, we adopted Calico 2.6, but will move to Calico 3.0 soon, giving you the ability to apply Calico network policies and extend base Kubernetes policies with advanced capabilities.
  • Application Layer Policy, which integrates with Istio to enable enforcement of security rules at multiple layers in the stack, and extend the existing network policies definition with layer 5-7 rules, for fine-grained control of application connectivity. Tigera recently shared a tech preview of this Calico feature, and we’re excited to see how Kubernetes Engine users will adopt this additional capability.

Kubernetes development moves fast, particularly in the area of network security. To learn how to get started with and make the most of network policies in Kubernetes, check out this recent blog post by Google developer experience engineer Ahmet Alp Balkan, then try out network policies for yourself.

If you haven’t tried GCP and Kubernetes Engine before, you can quickly get started with our $300 free credits.