Independent research firm names Google Cloud the Insight PaaS Leader



Forrester Research, a leading analyst firm, just named Google Cloud Platform (GCP) the leader in The Forrester Wave™: Insight Platforms-As-A-Service, Q3 2017, its analysis of cloud providers offering Platform as a Service. According to the report, an insight PaaS makes it easier to:

  • Manage and access large, complex data sets
  • Update and evolve applications that deliver insight at the moment of action
  • Update and upgrade technology
  • Integrate and coordinate team member activities

For this Wave, Forrester evaluated eight vendors against 36 evaluation criteria spanning three broad buckets: current offering, strategy and market presence.

Of the eight vendors, Google Cloud’s insight PaaS scored highest for both current offering and strategy.
“Google was the only vendor in our evaluation to offer insight execution features like full machine learning automation with hyperparameter tuning, container management and API management. Google will appeal to firms that want flexibility and extreme scalability for highly competent data scientists and cloud application development teams used to building solutions on PaaS.” (The Forrester Wave: Insight Platforms-As-A-Service, Q3 2017)
Our presence in the Insight Platform as a Service market goes way back. We started with a vision for serverless computing back in 2008 with Google App Engine and added serverless data processing in 2010 with Google BigQuery. In 2016 we added machine learning (Cloud Machine Learning Engine) to GCP to help bring the power of TensorFlow (Google’s open source machine learning framework) to everyone. We continue to be amazed by what companies like Snap and The Telegraph are doing with these technologies and look forward to building on these insight services to help you build the amazing applications of tomorrow.

Sign up here to get a complimentary copy of the report.

Google Cloud Platform at SIGGRAPH 2017



For decades, the SIGGRAPH conference has brought together pioneers in the field of computer graphics. This year at SIGGRAPH 2017, we're excited to announce several updates and product releases that reinforce Google Cloud Platform (GCP)’s leadership in cloud-based media and entertainment solutions.

As part of our ongoing collaboration with Autodesk, our hosted ZYNC Render service now supports its 3ds Max 3D modeling, animation and rendering software. 3ds Max is widely used in the media and entertainment, architecture and visualization industries, and artists using ZYNC Render can scale their rendering to tens of thousands of cores on demand to meet the ever-increasing need for high-resolution, large-format imagery. Support for 3ds Max builds on our success with Autodesk: since we announced Autodesk Maya support in April 2016, users have logged nearly 27 million core hours on that platform, and we look forward to what 3ds Max users will create.
ZYNC Render for Autodesk 3ds Max
At launch, 3ds Max support will also include leading renderers such as Arnold, an Autodesk product, and V-Ray from Chaos Group.

In addition, we’re showing a technology preview of V-Ray GPU for Autodesk Maya on ZYNC Render. Utilizing NVIDIA GPUs running on GCP, V-Ray GPU provides highly scalable, GPU-enhanced rendering performance.

We’re also previewing support for Foundry’s VR toolset CaraVR on ZYNC Render. Running on ZYNC Render, CaraVR can now leverage the massive scalability of Google Compute Engine to render large VR datasets.

We’re also presenting remote desktop workflows that leverage Google Cloud GPUs such as the new NVIDIA P100, which can perform both display and compute tasks. As a result, we're taking full advantage of V-Ray 3.6 Hybrid Rendering technology, as well as NVIDIA's NVLink to share data across multiple NVIDIA P100 cards. We're also showing how to deploy and manage a “farm” of hundreds of GPUs in the cloud.

Google Cloud’s suite of media and entertainment offerings is expansive, spanning content ingestion and creation, graphics rendering and distribution. Combined with our online video platform Anvato, core infrastructure offerings around compute, GPU and storage, cutting-edge machine learning and Hollywood studio-specific security engagements, Google Cloud provides comprehensive, end-to-end solutions for creative professionals to build the media solutions of their choosing.

To learn more about Google Cloud in the media and entertainment field, visit our Google Cloud Media Solutions page. And to experience the power of GCP for yourself, sign up for a free trial.

Three steps to Compute Engine startup-time bliss: Google Cloud Performance Atlas



Scaling to millions of requests, with fewer headaches, is one of the joys of working on Google Cloud Platform (GCP). With Compute Engine, you can leverage technologies like Instance Groups and Load Balancing to make it even easier. However, there comes a point with VM-based applications where the time it takes to boot up your instance can be problematic if you’re also trying to scale to handle a usage spike.

Before startup time causes woes in your application, let’s take a look at three simple steps to find which parts of bootup take the most time and how you can shorten your boot time.

Where does the time go?

One of the most important first steps to clearing up your startup time performance is to profile the official boot stages at a macro level. This gives you a sense of how long Compute Engine is taking to create your instance, vs. how much time your code is taking to run. While the official documentation lists the three startup phases as provisioning, staging and running, it’s a little easier to do performance testing on request, provisioning and booting, since we can time each stage externally, right from Cloud Shell.

  • Request is the time between asking for a VM and getting a response back from the Create Instance API acknowledging that you’ve asked for it. We can directly profile this by timing how long it takes GCP to respond to the insert instance REST command.
  • Provisioning is the time GCE takes to find space for your VM on its infrastructure; you can find this by polling the Get Instance API on a regular basis, waiting for the “status” flag to change from “provisioning” to “running.”
  • Boot time is when your startup scripts and other custom code execute, all the way up to the point when the instance is available. Fellow Cloud Developer Advocate Terry Ryan likes to profile this stage by repeatedly polling the endpoint and timing the change between receiving 500, 400 and 200 status codes.

An example graph generated by timing the request, provision and boot stages, measured 183 times
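To make this concrete, here’s a minimal sketch of how you might time all three stages externally with Python. The project, zone, instance body and health endpoint are placeholders, and it assumes an OAuth access token (for example, from `gcloud auth print-access-token`):

```python
import time
import requests

TOKEN = "..."  # e.g., output of `gcloud auth print-access-token`
BASE = ("https://compute.googleapis.com/compute/v1/"
        "projects/my-project/zones/us-central1-a")  # placeholders
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

body = {
    "name": "boot-test-vm",
    "machineType": "zones/us-central1-a/machineTypes/n1-standard-1",
    "disks": [{"boot": True, "initializeParams": {
        "sourceImage": "projects/debian-cloud/global/images/family/debian-12"}}],
    "networkInterfaces": [{"network": "global/networks/default",
                           "accessConfigs": [{"type": "ONE_TO_ONE_NAT"}]}],
}

# Stage 1, "request": time until the insert call is acknowledged.
t0 = time.monotonic()
requests.post(f"{BASE}/instances", json=body, headers=HEADERS).raise_for_status()
t_request = time.monotonic() - t0

# Stage 2, "provisioning": poll until the status flag flips to RUNNING.
status = ""
while status != "RUNNING":
    status = requests.get(f"{BASE}/instances/boot-test-vm",
                          headers=HEADERS).json()["status"]
    time.sleep(0.5)
t_provision = time.monotonic() - t0 - t_request

# Stage 3, "boot": poll the instance's HTTP endpoint until it returns 200.
while True:
    try:
        if requests.get("http://INSTANCE_IP/", timeout=1).status_code == 200:
            break
    except requests.RequestException:
        pass
    time.sleep(0.5)
t_boot = time.monotonic() - t0 - t_request - t_provision

print(f"request={t_request:.1f}s provision={t_provision:.1f}s boot={t_boot:.1f}s")
```

Run this a few dozen times and average the results; a single sample is too noisy to draw conclusions from.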

Profiling your startup scripts

Barring some unforeseen circumstance, the majority of boot-up time for your instances happens during the boot phase, when your instance executes its startup scripts. As such, it’s extremely helpful to profile your startup scripts to see which stages create performance bottlenecks.

Timing your startup scripts is a little trickier than it may seem at first glance. Chances are your code is already integrated with a powerful tooling system (like the Stackdriver Custom Metrics API, statsd or brubeck) to help you profile and monitor performance. However, applying these tools to startup scripts can add integration complexity and boot-time overhead of their own, skewing your profiling results and making the test meaningless.

One neat trick that gets the job done is wrapping each section of your startup script with the bash SECONDS variable (if you're on a Linux build), appending the elapsed time for each stage to a file, and setting up a new endpoint to serve that file on request.

This allows you to poll the endpoint from an external location and get data back without too much heavy lifting or modification to your service. This method will also give you a sense of what stages of your script are taking the most boot time.
An example graph generated by timing each stage in a Linux startup script
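If your startup script drives its stages from Python rather than bash, the same trick might look like this sketch (the stage commands, names and file path are illustrative):

```python
import subprocess
import time

TIMING_FILE = "/var/tmp/startup-times.txt"  # serve this file from a simple HTTP endpoint

def timed_stage(name, fn):
    """Run one stage of the startup sequence and append its elapsed time to the file."""
    start = time.monotonic()
    fn()
    with open(TIMING_FILE, "a") as f:
        f.write(f"{name} {time.monotonic() - start:.2f}s\n")

# Illustrative stages; replace with your real startup work.
timed_stage("apt_packages", lambda: subprocess.run(
    ["apt-get", "install", "-y", "nginx"], check=True))
timed_stage("fetch_binaries", lambda: subprocess.run(
    ["gsutil", "cp", "gs://my-bucket/app.tar.gz", "/opt/"], check=True))
timed_stage("init_service", lambda: subprocess.run(
    ["systemctl", "start", "myapp"], check=True))
```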

Moving to custom images

For most developers, most of a startup script’s runtime is spent downloading packages and installing applications so the service can run properly. That’s because many instances are created with public images: preconfigured combinations of OS and bootloaders. These images are great when you want to get up and running fast, but as you start building production-level systems, you’ll soon realize that a large portion of bootup time is no longer booting the OS, but the user-executed startup sequence that grabs packages and binaries and initializes them.

You can address this by creating custom images for your instances. Create a custom image by taking a snapshot of the host disk information (post-boot and install) and storing it in a distribution location. Later, when the target instance boots, the image information is copied right to the hard drive. This is ideal for situations where you've created and modified a root persistent disk to a certain state and would like to save that state to reuse with new instances. It’s also good when your setup includes installing (and compiling) a number of big libraries or pieces of software.
An example graph generated by timing the startup phases of the instance. Notice that the graph on the right side is in sub-second scale.
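As a rough sketch of baking an image, here’s how you might create one from a configured boot disk with the Compute Engine API via the google-api-python-client library (the project, zone and resource names are placeholders; gcloud compute images create does the same from the command line):

```python
# pip install google-api-python-client google-auth
import googleapiclient.discovery

compute = googleapiclient.discovery.build("compute", "v1")

# Bake an image from a boot disk you've already configured; stop the instance
# (or snapshot the disk) first so the image captures a consistent state.
operation = compute.images().insert(
    project="my-project",
    body={
        "name": "my-baked-image-v1",
        "sourceDisk": "zones/us-central1-a/disks/my-configured-disk",
    },
).execute()
print(operation["status"])

# New instances can then boot from this image and skip the package-download
# portion of the startup script entirely.
```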

Every millisecond counts

When you’re trying to scale to millions of requests per second, serviced by thousands of instances, small changes in boot time can make a big difference in costs, response time and, most importantly, your users’ perception of performance.

If you’d like to know more about ways to optimize your Google Cloud applications, check out the rest of the Google Cloud Performance Atlas blog posts and videos. Because when it comes to performance, every millisecond counts.

Guest post: Using GCP for massive drug discovery virtual screening



[Editor’s note: Today we hear from Boston, MA-based Silicon Therapeutics, which is applying computational methods in the context of complex biochemical problems relevant in human biology.]

As an integrated computational drug discovery firm, we recently deployed our INSITE Screening platform on Google Cloud Platform (GCP) to analyze over 10 million commercially available molecular compounds as potential starting materials for next-generation medicines. In one week, we performed over 500 million docking computations to evaluate how a protein responds to a given molecule. Each computation involved a docking program that predicted the preferred orientation of a small molecule to a protein and the associated energetics, so we could assess whether or not it would bind and alter the function of the target protein.

With a combination of Google Compute Engine standard and Preemptible VMs, we used up to 16,000 cores, for a total of 3 million core-hours and a cost of about $30,000. While this might sound like a lot of time and money, it's a lot less expensive and a lot faster than experimentally screening all compounds. Using a physics-based approach such as our INSITE platform is much more computationally expensive than some other computational screening approaches, but it allows us to find novel binders without the use of any prior information about active compounds (this particular target has no drug-like compounds known to bind). In a final stage of the calculations we performed all-atom molecular dynamics (MD) simulations on the top 1,000 molecules to determine which ones to purchase and experimentally assay for activity.

The bottom line: We successfully completed the screen using our INSITE platform on GCP and found several molecules that have recently been experimentally verified to have on-target and cell-based activity.

We chose to run this high-performance computing (HPC) job on GCP over other public cloud providers for a number of reasons:
  • Availability of high-performance compute infrastructure. Compute Engine has a good inventory of high-performance processors that can be configured with large numbers of cores and lots of memory. It also offers GPUs, a great fit for some of our computations, such as molecular dynamics and free energy calculations. SSD made a big difference in performance, as our total I/O for this screen exceeded 40 TB of raw data. Fast connectivity between the front-end and the compute nodes was also a big factor, as the front-end disk was NFS-mounted on the compute nodes.
  • Support for industry standard tools. As a startup, we value the ability to run our workloads wherever we see fit. Our priorities can change rapidly based on project challenges (chemistry and biology), competition, opportunities and the availability of compute resources. Our INSITE platform is built on a combination of open-source and proprietary in-house software, so portability and repeatability across in-house and public clouds is essential.
  • An attractive pricing model. Preemptible VMs are a great combination of cost-effective and predictable, offering up to 80% off standard instances, with no bidding and no surprises. That means we don't have to worry about jobs being killed due to a bidding war, which can create significant delays in completing our screens and require unnecessary human overhead to manage the jobs.
We initialized multiple clusters for the screening; specifically, our cluster’s front-end consisted of three full-priced n1-highmem-32 VM instances with 208GB of RAM that ran the queuing system, and that connected to a 2TB SSD NFS filestore that housed the compound library. Each of these front-end nodes then spawned up to 128 compute nodes configured as n1-highcpu-32 Preemptible VMs, each with 28.8GB of memory. Those compute nodes performed the actual molecular compound screens, and wrote their results back to the filestore. Preemptible VMs run for a maximum of 24 hours; when that time elapsed, the front-end nodes drained any jobs remaining on the compute nodes and re-spawned a new set of nodes until all 10 million compounds had been successfully run.
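For reference, spawning a preemptible compute node comes down to one scheduling block in the Compute Engine API. A minimal sketch (the names and image are placeholders, not our production configuration):

```python
# pip install google-api-python-client google-auth
import googleapiclient.discovery

compute = googleapiclient.discovery.build("compute", "v1")
ZONE = "us-central1-a"

compute.instances().insert(
    project="my-project",
    zone=ZONE,
    body={
        "name": "compute-node-001",
        "machineType": f"zones/{ZONE}/machineTypes/n1-highcpu-32",
        # Preemptible: billed at the discounted rate, runs at most 24 hours.
        "scheduling": {"preemptible": True,
                       "automaticRestart": False,
                       "onHostMaintenance": "TERMINATE"},
        "disks": [{"boot": True, "initializeParams": {
            "sourceImage": "projects/debian-cloud/global/images/family/debian-12"}}],
        "networkInterfaces": [{"network": "global/networks/default"}],
    },
).execute()
```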

To manage compute jobs, we enlisted the help of two popular open-source tools: Slurm, a workload manager used by 60% of the world’s TOP500 clusters, and ElastiCluster, which provides a command-line tool to create, manage and set up compute clusters hosted on a variety of cloud infrastructures. Using these open-source packages is economical, provides the lion’s share of the functionality of paid software solutions and ensures we can run our workloads in-house or elsewhere.

More compute = better results

But ultimately, the biggest benefit of using GCP was being able to more thoroughly screen compounds than we could have done with in-house resources. The target protein in this particular study was highly flexible, and having access to massive amounts of compute power allowed us to more accurately model the underlying physics of the system by accounting for protein flexibility. This yielded more active compounds than we would have found without the GCP resources.

The reality is that all proteins are flexible, and undergo some form of induced fit upon ligand binding, so treating protein flexibility is always important in virtual screening if you want the best results. Most molecular docking programs only account for ligand flexibility, so if the receptor structure is not quite right then active compounds might not fit and therefore be missed, no matter how good the docking program is. Our INSITE screening platform incorporates protein flexibility in a novel way that can greatly improve the hit rate in virtual screening, even as it requires a lot of computational resources when screening millions of commercially available compounds.

Example of the dynamic nature of the protein target (Interleukin-18, IL18)
From the initial 10 million compounds, we prioritized 250 promising compounds for experimental validation in our lab. As a small company, we don't have the capability to experimentally screen millions of compounds, and there's no need to do so with an accurate virtual screening approach like the one in our INSITE platform. We're excited to report that at least five of these compounds have shown activity in human cells, suggesting them as promising starting points for new medicines. To our knowledge, there are no drug-like small molecule activators of this important and challenging immuno-oncology target.

To learn more about the science at Silicon Therapeutics, please visit our website. And if you’re an engineer with expertise in high performance computing, GPUs and/or molecular simulations, be sure to visit our job listings.


Container Engine now runs Kubernetes 1.7 to drive enterprise-ready secure hybrid workloads



Just over a week ago, Google led the most recent open-source release of Kubernetes 1.7, and today that version is available on Container Engine, Google Cloud Platform’s (GCP) managed container service. Container Engine is one of the first commercial Kubernetes offerings to run the latest 1.7 release, and includes differentiated features for enterprise security, extensibility, hybrid networking and developer efficiency. Let’s take a look at what’s new in Container Engine.

Enterprise security


Container Engine is designed with enterprise security in mind. By default, Container Engine clusters run a minimal, Google-curated Container-Optimized OS (COS) to ensure you don’t have to worry about OS vulnerabilities. On top of that, a team of Google Site Reliability Engineers continuously monitors and manages Container Engine clusters, so you don’t have to. Now, Container Engine adds several new security enhancements:

  • Starting with this release, each kubelet only has access to the objects it needs. The Node authorizer beta restricts each kubelet’s API access to resources (such as secrets) belonging to its scheduled pods. This feature better protects a cluster from a compromised or untrusted node.
  • Network isolation can be an important extra boundary for sensitive workloads. The Kubernetes NetworkPolicy API allows users to control which pods can communicate with each other, providing defense in depth and improving secure multi-tenancy. Policy enforcement can now be enabled in alpha clusters (see the sketch at the end of this section).
  • HTTP re-encryption through Google Cloud Load Balancing (GCLB) allows customers to use HTTPS from the GCLB to their service backends. This often-requested feature gives customers the peace of mind of knowing that their data is fully encrypted in transit even after it enters Google’s global network.

Together, the above features improve workload isolation within a cluster, a frequently requested security capability in Kubernetes. The node authorizer and NetworkPolicy can be combined with the existing RBAC control in Container Engine to improve the foundations of multi-tenancy:
  • Network isolation between Pods (network policy)
  • Resource isolation between Nodes (node authorizer)
  • Centralized control over cluster resources (RBAC)
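As a concrete illustration of the NetworkPolicy piece, here’s a minimal sketch using the official Kubernetes Python client. The labels, names and namespace are hypothetical; the policy allows only pods labeled app=frontend to reach pods labeled app=db, and only on port 5432. (The sketch uses the networking.k8s.io/v1 API of current clients; in 1.7-era clusters the policy type lived under a beta API group.)

```python
# pip install kubernetes
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod

policy = client.V1NetworkPolicy(
    metadata=client.V1ObjectMeta(name="db-allow-frontend", namespace="default"),
    spec=client.V1NetworkPolicySpec(
        # Which pods the policy applies to.
        pod_selector=client.V1LabelSelector(match_labels={"app": "db"}),
        # Only frontend pods may connect, and only on the database port.
        ingress=[client.V1NetworkPolicyIngressRule(
            _from=[client.V1NetworkPolicyPeer(
                pod_selector=client.V1LabelSelector(
                    match_labels={"app": "frontend"}))],
            ports=[client.V1NetworkPolicyPort(port=5432)],
        )],
    ),
)

client.NetworkingV1Api().create_namespaced_network_policy(
    namespace="default", body=policy)
```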

Enterprise and hybrid networks


Perhaps the features most awaited by our enterprise users are networking support for hybrid cloud and VPN in Container Engine. New in this release:
  • GA Support for all private IP (RFC-1918) addresses, allowing users to create clusters and access resources in all private IP ranges and extending the ability to use Container Engine clusters with existing networks.
  • Exposing services through internal load balancing is now beta, allowing Kubernetes and non-Kubernetes services to access one another on a private network1.
  • Source IP preservation is now generally available, allowing applications to be fully aware of client IP addresses for services exposed through Kubernetes.

Enterprise extensibility

As more enterprises use Container Engine, we're making a major investment to improve extensibility. We heard feedback that customers want to offer custom Kubernetes-style APIs in their clusters.

API Aggregation, launching today in beta on Container Engine, enables you to extend the Kubernetes API with custom APIs. For example, you can now add existing API solutions such as service catalog, or build your own in the future.

Users also want to incorporate custom business logic and third-party solutions into their Container Engine clusters. So we’re introducing Dynamic Admission Control in alpha clusters, providing two ways to add business logic to your cluster:
  • Initializers can modify Kubernetes objects as they are created. For example, you can use an initializer to add Istio capability to a Container Engine alpha cluster by injecting an Istio sidecar container into every deployed Pod.
  • Webhooks enable you to validate enterprise policy. For example, you can verify that containers being deployed pass your enterprise security audits.
As part of our plans to improve extensibility for enterprises, we're replacing the Third Party Resource (TPR) API with the improved Custom Resource Definition (CRD) API. CRDs are a lightweight way to store structured metadata in Kubernetes, making it easy to interact with custom controllers via kubectl. If you use the TPR beta feature, please plan to migrate to CRDs before upgrading to the 1.8 release.

Workload diversity


Container Engine now enhances your ability to run stateful workloads like databases and key value stores, such as ZooKeeper, with a new automated application update capability. You can:
  • Select from a range of StatefulSet update strategies (beta), including rolling updates
  • Optimize roll-out speed with parallel or ordered pod provisioning, particularly useful for applications such as Kafka.
A popular workload on Google Cloud and Container Engine is training machine learning models for better predictive analytics. Many of you have requested GPUs to speed up training time, so we’ve updated Container Engine to support NVIDIA K80 GPUs in alpha clusters for experimentation with this exciting feature. We’ll support additional GPUs in the future.

Developer efficiency


When developers don’t have to worry about infrastructure, they can spend more time building applications. Kubernetes provides building blocks to de-couple infrastructure and application management, and Container Engine builds on that foundation with best-in-class automation features.

We’ve automated large parts of maintaining the health of the cluster, with auto-repair and auto-upgrade of nodes.
  • Auto-repair (beta) keeps your cluster healthy by proactively monitoring for unhealthy nodes and repairing them automatically, without developer involvement.
  • In this release, Container Engine’s auto-upgrade beta capability incorporates Pod Disruption Budgets at the node layer, making upgrades to infrastructure and application controllers predictable and safer.
Container Engine also offers cluster- and pod-level auto-scaling so applications can respond to user demand without manual intervention. This release introduces several GCP-optimized enhancements to cluster autoscaling:
  • Support for scaling node pools to 0 or 1, for when you don’t need capacity
  • Price-based expander for auto-scaling in the most cost-effective way
  • Balanced scale-out of similar node groups, useful for clusters that span multiple zones

The combination of auto-repair, auto-upgrades and cluster autoscaling in Container Engine enables application developers to deploy and scale their apps without being cluster admins.

We’ve also updated the Container Engine UI to assist in debugging and troubleshooting by including detailed workload-related views. For each workload, we show the type (DaemonSet, Deployment, StatefulSet, etc.), running status, namespace and cluster. You can also debug each pod and view annotations, labels, the number of replicas, status and so on. All views are cross-cluster, so if you're using multiple clusters, these views let you focus on your workloads no matter where they run. In addition, we include load balancing and configuration views with deep links to GCP networking, storage and compute. This new UI will be rolling out in the coming week.

Container Engine everywhere


Google Cloud is enabling a shift in enterprise computing: from local to global, from days to seconds, and from proprietary to open. The benefits of this model are becoming clear, as exemplified by Container Engine, which saw more than 10x growth last year.

To keep up with demand, we're expanding our global capacity with new Container Engine clusters in our latest GCP regions:
  • Sydney (australia-southeast1)
  • Singapore (asia-southeast1)
  • Oregon (us-west1)
  • London (europe-west2)

These new regions join the half dozen others from Iowa to Belgium to Taiwan where Container Engine clusters are already up and running.

This blog post highlighted some of the new features available in Container Engine. You can find the complete list of new features in the Container Engine release notes.

The rapid adoption of Container Engine and its technology is translating into real customer impact. Here are a few recent stories that highlight the benefits companies are seeing:

  • BQ, one of the leading technology companies in Europe that designs and develops consumer electronics, was able to scale quickly from 15 to 350 services while reducing its cloud hosting costs by approximately 60% through better utilization and use of Preemptible VMs on Container Engine. Read the full story here.
  • Meetup, the social media networking platform, switched from a monolithic application in on-premises data centers to an agile microservices architecture in a multi-cloud environment with the help of Container Engine. This gave its engineering teams autonomy to work on features and develop roadmaps that are independent from other teams, translating into faster release schedules, greater creativity and new functionality. Read the case study here.
  • Loot Crate, a leader in fan subscription boxes, launched a new offering on Container Engine to quickly get their Rails app production ready and able to scale with demand and zero downtime deployments. Read how it built its continuous deployment pipeline with Jenkins in this post.
At Google Cloud, we’re proud of our compute infrastructure, but what really makes it valuable are the services that run on top. Google creates game-changing services on top of world-class infrastructure and tooling. With Kubernetes and Container Engine, Google Cloud makes these innovations available to developers everywhere.

GCP is the first cloud to offer a fully managed way to try the newest Kubernetes release, and with our generous 12-month free trial of $300 in credits, there’s no excuse not to try it today.

Thanks for your feedback and support. Keep the conversation going and connect with us on the Container Engine Slack channel.



1 Support for accessing Internal Load Balancers over Cloud VPN is currently in alpha; customers can apply for access here.


Guest post: Loot Crate unboxes Google Container Engine for new Sports Crate venture



[Editor’s note: Gamers and superfans know Loot Crate, which delivers boxes of themed swag to 650,000 subscribers every month. Loot Crate built its back-end on Heroku, but for its next venture  Sports Crate  the company decided to containerize its Rails app with Google Container Engine, and added continuous deployment with Jenkins. Read on to learn how they did it.]

Founded in 2012, Loot Crate is the worldwide leader in fan subscription boxes, partnering with entertainment, gaming and pop culture creators to deliver monthly themed crates, produce interactive experiences and digital content and film original video productions. In our first five years, we’ve delivered over 14 million crates to fans in 35 territories across the globe.
In early 2017, we were tasked with launching an offering for Major League Baseball fans called Sports Crate. There were only a couple of months until the 2017 MLB season started on April 2nd, so we needed the site to be up and capturing emails from interested parties as fast as possible. Other items on our wish list included the ability to scale the site as traffic increased, automated zero-downtime deployments, effective secret management and the benefits of Docker images. Our other Loot Crate properties are built on Heroku, but for Sports Crate, we decided to try Container Engine, which we suspected would allow our app to scale better during peak traffic, let us manage our resources with a single Google login and better manage our costs.


Continuous deployment with Jenkins

Our goal was to be able to successfully deploy an application to Container Engine with a simple git push command. We created an auto-scaling, dual-zone Kubernetes cluster on Container Engine, and tackled how to do automated deployments to the cluster. After a lot of research and a conversation with Google Cloud Solutions Architect Vic Iglesias, we decided to go with Jenkins Multibranch Pipelines. We followed this guide on continuous deployment on Kubernetes and soon had a working Jenkins deployment running in our cluster ready to handle deploys.

Our next task was to create a Dockerfile of our Rails app to deploy to Container Engine. To speed up build time, we created our own base image with Ruby and our gems already installed, as well as a rake task to precompile assets and upload them to Google Cloud Storage when Jenkins builds the Docker image.

Dockerfile in hand, we set up the Jenkins Pipeline to build the Docker image, push it to Google Container Registry and deploy our Kubernetes deployments and services to the right environment. We put a Jenkinsfile in our GitHub repo that uses a switch statement based on the GitHub branch name to choose which Kubernetes namespace to deploy to. (We have three QA environments, a staging environment and a production environment.)

The Jenkinsfile checks out our code from GitHub, builds the Docker image, pushes the image to Container Registry, runs a Kubernetes job that performs any database migrations (checking for success or failure) and runs tests. It then deploys the updated Docker image to Container Engine and reports the status of the deploy to Slack. The entire process takes under 3 minutes.

Improving secret management in the local development environment

Next, we focused on making local development easier and more secure. We do our development locally, and with our Heroku-based applications, we deploy using environment variables that we add in the Heroku config or in the UI. That means that anyone with the Heroku login and permission can see them. For Sports Crate, we wanted to make the environment variables more secure; we put them in a Kubernetes secret that the applications can easily consume, which also keeps the secrets out of the codebase and off developer laptops.

The local development environment consumes those environmental variables using a railtie that goes out to Kubernetes, retrieves the secrets for the development environment, parses them and puts them into the Rails environment. This allows our developers to "cd" into a repo and run "rails server" or "rails console" with the Kubernetes secrets pulled down before the app starts.
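Our railtie is Ruby, but the same pattern sketched in Python with the Kubernetes client looks roughly like this (the secret and namespace names are illustrative):

```python
# pip install kubernetes
import base64
import os

from kubernetes import client, config

config.load_kube_config()  # developer laptops authenticate with their own kubeconfig

# Pull the development environment's secret and export it before the app starts.
secret = client.CoreV1Api().read_namespaced_secret("app-env", "development")
for key, value in secret.data.items():
    os.environ[key] = base64.b64decode(value).decode()  # secret data is base64-encoded
```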

TLS termination and load balancing

Another requirement was to set up effective TLS termination and load balancing. We used a Kubernetes Ingress resource with an Nginx Ingress Controller, whose automatic HTTP-to-HTTPS redirect functionality isn’t available from Google Cloud Platform's (GCP) Ingress controller. Once we had the Ingress resource configured with our certificate and our Nginx Ingress controller running behind a service with a static IP, we were able to get to our application from the outside world. Things were starting to come together!

Auto-scaling and monitoring

With all of the basic pieces of our infrastructure on GCP in place, we looked towards auto-scaling, monitoring and educating our QA team on deployment practices and logging. For pod auto-scaling, we implemented a Kubernetes Horizontal Pod Autoscaler on our deployment; it checks CPU utilization and scales the pods up if we start getting a lot of traffic to our app. For monitoring, we implemented Datadog’s Kubernetes Agent and set up metrics to check for critical issues and send alerts to PagerDuty. We use Stackdriver for logging, and educated our team on how to use the Stackdriver Logging console to drill down to the app, namespace and pod they want information about.
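For reference, here’s roughly what the autoscaler setup looks like, sketched with the Kubernetes Python client rather than the kubectl commands we actually ran (the names and thresholds are illustrative):

```python
# pip install kubernetes
from kubernetes import client, config

config.load_kube_config()

hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="web-hpa", namespace="production"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="web"),
        min_replicas=3,
        max_replicas=20,
        target_cpu_utilization_percentage=70,  # scale out above 70% average CPU
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="production", body=hpa)
```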

Net-net

With launch day around the corner, we ran load tests on our new app and were amazed at how well it handled large amounts of traffic. The pods auto-scaled exactly as we needed them to and our QA team fell in love with continuous deployment with Jenkins Multibranch Pipelines. All told, Container Engine met all of our requirements, and we were up and running within a month.
Our next project is to move our other monolithic Rails apps off of Heroku and onto Container Engine as decoupled microservices that can take advantage of the newest Kubernetes features. We look forward to improving on what has already been an extremely powerful tool.

Choosing the right compute option in GCP: a decision tree



When you start a new project on Google Cloud Platform (GCP), one of the earliest decisions you make is which computing service to use: Google Compute Engine, Google Container Engine, App Engine or even Google Cloud Functions and Firebase.

GCP offers a range of compute services, from options that give users full control (Compute Engine) to highly abstracted ones (Firebase and Cloud Functions), with Google taking care of more and more of the management and operations along the way.

Here’s how many long-time readers of our blog think about GCP compute options. If you're used to managing VMs and want a similar experience in the cloud, pick Compute Engine. If you use containers and Kubernetes, you can abstract away some of the necessary management overhead by using Container Engine. If you want to focus on your code and avoid the infrastructure pieces entirely, use App Engine. Finally, if you want to focus purely on code and build microservices that expose API endpoints for your applications, use Firebase and Cloud Functions.

Over the years, you've told us that this model works great if you have no constraints, but can be challenging if you do. We’ve heard your feedback, and propose another way to choose your compute options using a constraint-based set of questions. (It should go without saying that these questions consider only a few aspects of your project.)

1. Are you building a mobile or HTML application that does its heavy lifting, processing-wise, on the client? If you're building a thick client that only relies on a backend for synchronization and/or storage, Firebase is a great option. Firebase allows you to store complex NoSQL documents (or objects, if that’s how you think of them) and files using a very easy-to-use API and client available for iOS, Android and JavaScript. There’s also a REST API for access from other platforms.

2. Are you building a system based more on events than user interaction? In other words, are you building an app that responds to uploaded files, or maybe logins to other applications? Are you already looking at “serverless” or “Functions as a Service” solutions? Look no further than Cloud Functions. Cloud Functions allows you to write JavaScript functions that run on Node.js and can call any one of our APIs, including Cloud Vision, Translate, Cloud Storage or over 100 others. With Cloud Functions, you can build complex individual functions that get exposed as microservices, taking advantage of all our services without having to maintain systems and glue them all together.

3. Does your solution already exist somewhere else? Does it include licensed software? Does it require anything other than HTTP/S? If you answered “no,” App Engine is worth a look. App Engine is a serverless solution that runs your code on our infrastructure and charges you only for what you use. We scale it up or down for you depending on demand. In addition, App Engine has access to all the Google SDKs available so you can take advantage of the full Google Cloud ecosystem.

4. Are you looking to build a container-based system built on Kubernetes? If you're already using Kubernetes on GCP, you should really consider Container Engine. (You should think about it wherever you're going to run Kubernetes, actually.) Container Engine reduces building a Kubernetes solution to a single click. Additionally, it auto-scales Kubernetes cluster members, allowing you to build Kubernetes solutions that grow and contract based on demand.

5. Are you building a stateful system? Are you looking to use GPUs in your solution? Are you building a non-Kubernetes container-based solution? Are you migrating an existing on-prem solution to the cloud? Are you using licensed software? Are you using protocols other than HTTP/S? Have you not found another solution to meet your needs? If you answered “yes” to any of these questions, you’re probably going to need to run your solution on virtual machines on Compute Engine. Compute Engine is our most flexible computing product and allows you the most freedom to configure and manage your VMs however you like.

Put all of these questions together and you get the following flowchart:
This is by no means a comprehensive decision tree, and each one of our products supports a wider range of use cases than is presented here. But this should be a good guide to get you started.

To find out more about our computing solutions, please check out Computing on Google Cloud Platform, then try it out for yourself today with $300 in free credits when you sign up.

Happy building!


Solution guide: Building connected vehicle apps with Cloud IoT Core



With the Internet of Things (IoT), vehicles are evolving from self-contained commodities focused on transportation to sophisticated, Internet-connected endpoints often capable of two-way communication. The new data streams generated by modern connected vehicles drive innovative business models such as usage-based insurance, enable new in-vehicle experiences and build the foundation for advances such as autonomous driving and vehicle-to-vehicle (V2V) communication.
Through all this, we here at Google Cloud are excited to help make this world a reality. We recently published a solution guide that describes how various Google Cloud Platform (GCP) services fit into the picture.

A data deluge

Vehicles can produce upwards of 560 GB of data per vehicle, per day. This deluge of data represents both incredible opportunities and daunting challenges for the platforms that connect and manage vehicle data, including:

  • Device management. Connecting devices to any platform requires authentication, authorization, the ability to push software updates, configuration and monitoring. These services must scale to millions of devices with constant availability.
  • Data ingestion. Messages must be reliably received, processed and stored.
  • Data analytics. Complex analysis of the time-series data generated by devices is needed to gain insights into events, tolerances, trends and possible failures.
  • Applications. Business-level application logic must be developed and integrated with existing data sources, which may come from a third party or live in on-premises data centers.
  • Predictive models. In order to predict business-level outcomes, predictive models based on current and historical data must be developed.

GCP services, including the recently launched Cloud IoT Core, provide a robust computing platform that takes advantage of Google’s end-to-end security model. Let’s take a look at how we can implement a connected vehicle platform using Google Cloud services.

Device Management: To handle secure device management and communications, Cloud IoT Core makes it easy for you to securely connect your globally distributed devices to GCP and centrally manage them. IoT Core Device Manager provides authentication and authorization, while IoT Core Protocol Bridge enables the messaging between the vehicles and the platform.

Data Ingestion: Cloud Pub/Sub provides a scalable data ingestion point that can handle the large data volumes generated by vehicles sending GPS location, engine RPM or images. Cloud Bigtable’s scalable storage service is well suited for time-series data storage and analytics.
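As a sketch of the ingestion point, a vehicle gateway (or test harness) might publish telemetry to Cloud Pub/Sub like this, using the google-cloud-pubsub client; the project, topic and payload are hypothetical:

```python
# pip install google-cloud-pubsub
import json

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "vehicle-telemetry")  # placeholders

# One telemetry reading; attributes let subscribers filter without decoding the body.
payload = json.dumps({"lat": 37.42, "lng": -122.08, "rpm": 2200}).encode()
future = publisher.publish(topic_path, data=payload, vin="TESTVIN0123456789")
print(future.result())  # blocks until the message is acknowledged; returns a message ID
```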

Data Analytics: Cloud Dataflow can process data pipelines that combine the vehicle device data with corporate vehicle and customer data, then store the combined data in BigQuery. BigQuery provides a powerful analytics engine as-a-service and integrates with common visualization tools such as Tableau, Looker and Qlik.

Applications: Compute Engine, Container Engine and App Engine all provide computing components for a connected vehicle platform. Compute Engine offers a range of different machine types that make it an ideal service for any third-party integration components. Container Engine runs and manages containers, which provide a high degree of flexibility and scalability thanks to their microservices architecture. Finally, App Engine is a scalable serverless platform ideal for consumer mobile and web application frontend services.

Predictive Models: TensorFlow and Cloud Machine Learning Engine provide a sophisticated modeling framework and scalable execution environment. TensorFlow provides the framework to develop custom deep neural network models and is optimized for performance, flexibility and scale, all of which are critical when leveraging IoT-generated data. Machine Learning Engine provides a scalable environment to train TensorFlow models using specialized Google hardware, including GPUs and TPUs.

Summary

Vehicles are becoming sophisticated IoT devices with built-in mobile technology platforms to which third parties can connect and offer advanced services. GCP provides a secure, robust and scalable platform to connect IoT devices ranging from sophisticated head units to simple, low-powered sensors. You can learn more about the next generation of connected vehicles with GCP by reading the solution paper: Designing a Connected Vehicle Platform on Cloud IoT Core.

Google Compute Engine ranked #1 in price-performance by Cloud Spectator



Cloud Spectator, an independent benchmarking and consulting agency, has released a new comparative benchmarking study that ranks Google Cloud #1 for price-performance and block storage performance against AWS, Microsoft Azure and IBM SoftLayer.

In January 2017, Cloud Spectator tested the overall price-performance, VM performance and block storage performance of four major cloud service providers: Google Compute Engine, Amazon Web Services, Microsoft Azure, and IBM SoftLayer. The result is a rare apples-to-apples comparison among major Cloud Service Providers (CSPs), whose distinct pricing models can make them difficult to compare.

According to Cloud Spectator, “A lack of transparency in the public cloud IaaS marketplace for performance often leads to misinformation or false assumptions.” Indeed, RightScale estimates that up to 45% of cloud spending is wasted on resources that never end up being used — a serious hit to any company’s IT budget.

The report can be distilled into three key insights, which upend common misconceptions about cloud pricing and performance:
  • Insight #1: VM performance varies across cloud providers. In testing, Cloud Spectator observed differences of up to 1.4X in VM performance and 6.1X in block storage performance.
  • Insight #2: You don’t always get what you pay for. Cloud Spectator’s study found no correlation between price and performance.
  • Insight #3: Resource contention (the “Noisy Neighbor Effect”) can affect performance, but CSPs can limit those effects. Cloud Spectator points out that noisy neighbors are a real problem with some cloud vendors. To try to handle the problem, some vendors throttle their customers’ access to resources (like disks) in an attempt to compensate for other VMs (the so-called noisy neighbors) on the same host machine.

You can download the full report here, or keep reading for key findings.

Key finding: Google leads for overall price-performance

Value, defined as the ratio of performance to price, varies by 2.4x across the compared IaaS providers, with Google achieving the highest CloudSpecs Score (see Methodology, below) among the four cloud IaaS providers, thanks to strong disk performance and the most inexpensive packaged pricing found in the study.


To learn more, download “2017 Best Hyperscale Cloud Providers: AWS vs. Azure vs. Google vs. SoftLayer,” a report by Cloud Spectator.


Methodology

Cloud Spectator’s price-performance calculation, the CloudSpecs Score™, indicates how much performance the user receives for each unit of cost. It's an indexed, comparable score ranging from 0 to 100, based on a combination of cost and performance, and is calculated as follows:

price-performance_value = [VM performance score] / [VM cost]
best_VM_value = max{price-performance_values}
CloudSpecs Score™ = 100 × price-performance_value / best_VM_value
The overall CloudSpecs Score™ was calculated by averaging the block storage and vCPU-memory price-performance scores with equal weight for each VM size; the resulting per-size scores were then averaged together.
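To make the indexing step concrete, here’s a small sketch of the calculation with made-up numbers (not figures from the report):

```python
def cloudspecs_scores(perf, cost):
    """Index each provider's price-performance so the best value scores 100."""
    value = {p: perf[p] / cost[p] for p in perf}   # performance per unit of cost
    best = max(value.values())
    return {p: round(100 * v / best, 1) for p, v in value.items()}

# Hypothetical performance scores and hourly prices, for illustration only.
print(cloudspecs_scores(perf={"CSP-A": 90, "CSP-B": 80, "CSP-C": 70},
                        cost={"CSP-A": 1.00, "CSP-B": 1.20, "CSP-C": 0.95}))
```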