
Introducing VPC-native clusters for Google Kubernetes Engine



[Editor's note: This is one of many posts on enterprise features enabled by Kubernetes Engine 1.10. For the full coverage, follow along here.]

Over the past few weeks, we’ve made some exciting announcements around Google Kubernetes Engine, starting with the general availability of Kubernetes 1.10 in the service. This latest version brings new features for enterprise use cases, such as support for Shared Virtual Private Cloud (VPC) and Regional Clusters for high availability and reliability.

Building on that momentum, we are excited to announce the ability to create VPC-native clusters in Kubernetes Engine. A VPC-native cluster uses Alias IP routing built into the VPC network, resulting in a more scalable, secure and simple system that is suited for demanding enterprise deployments and use cases.

VPC-native clusters using Alias IP
VPC-native clusters rely on Alias IP, which provides integrated VPC support for container networking. Without Alias IP, Kubernetes Engine uses Routes for Pod networking, which requires the Kubernetes control plane to maintain static routes to each Node. With Alias IP, the VPC control plane automatically manages routing setup for Pods. Beyond this automatic management, native integration of container networking into the VPC fabric improves scalability and integration between Kubernetes and other VPC features.

Alias IP has been available on Google Cloud Platform (GCP) for Google Compute Engine instances for some time. Extending this functionality to Kubernetes Engine provides the following benefits:
  • Scale enhancements - VPC-native clusters no longer carry the burden of Routes and can scale to more nodes. VPC-native clusters are not subject to Route quotas and limits, allowing you to seamlessly increase your cluster size.
  • Hybrid connectivity - Alias IP subnets can be advertised by the Cloud Router over Cloud VPN or Cloud Interconnect, allowing you to connect your hybrid on-premises deployments with your Kubernetes Engine cluster. In addition, Alias IP advertisements with Cloud Router give you granular control over which subnetworks and secondary range(s) are published to peer routers.
  • Better VPC integration - Alias IP provides Kubernetes Engine Pods with direct access to Google services like Google Cloud Storage, BigQuery and any other services served from the googleapis.com domain, without the overhead of a NAT proxy. Alias IP also enables enhanced VPC features such as Shared VPC.
  • Security checks - Alias IP allows you to enable anti-spoofing checks for the Nodes in your cluster. These anti-spoofing checks are provisioned on instances by default to ensure that traffic is not sent from arbitrary source IPs. Since Alias IP ranges in VPC-native clusters are known to the VPC network, they pass anti-spoofing checks by default.
  • IP address management - VPC-native clusters integrate directly into your VPC IP address management system, preventing potential double allocation of your VPC IP space. Route-based clusters require you to manually reserve the set of IPs assigned to your cluster. VPC-native clusters provide two modes of allocating IPs, giving you a full spectrum of control. In the default mode, Kubernetes Engine auto-selects and assigns secondary ranges for Pod and Service ranges. If you need tight control over subnet assignments, you can create a custom subnet and secondary ranges and use them for Node, Pod and Service IPs. With Alias IP, GCP ensures that Pod IP addresses cannot conflict with IP addresses on other resources.
Early adopters are already benefiting from the security and scale of VPC-native clusters in Kubernetes Engine. Vungle, an in-app video advertising platform for performance marketers, uses VPC-native clusters in Kubernetes Engine for its demanding applications:
“VPC-native clusters, using Alias IPs, in Google Kubernetes Engine allowed us to run our bandwidth-hungry applications on Kubernetes without any of the performance degradation that we had seen when using overlay networks."
- Daniel Nelson, Director of Engineering, Vungle
Try it out today!
Create VPC-native clusters in Kubernetes Engine to get the ease of management and scale that enterprise workloads require. Also, don’t forget to sign up for our upcoming webinar, 3 reasons why you should run your enterprise workloads on Google Kubernetes Engine.
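If you want to try it from the command line, here is a minimal sketch of creating a VPC-native cluster with gcloud; the cluster name and zone are placeholders, and in this default mode Kubernetes Engine auto-selects the secondary ranges for Pods and Services:
# Create a VPC-native cluster by enabling Alias IP
gcloud container clusters create my-vpc-native-cluster \
    --zone us-central1-a \
    --enable-ip-alias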

Stackdriver brings powerful alerting capabilities to the condition editor UI



If you use Stackdriver, you probably rely on our alerting stack to be informed when your applications are misbehaving or aren’t performing as expected. We know how important it is to receive notifications at the right time as well as in the right situation. Imprecisely specifying what situation you want to be alerted on can lead to too many alerts (false positives) or too few (false negatives). When defining a Stackdriver alerting policy, it’s imperative that conditions be made as specific as possible, which is part of the reason that we introduced the ability to manage alerting policies in the Stackdriver Monitoring API last month. This, for example, enables users to create alerting conditions for resources filtered by certain metadata so that they can assign different conditions to parts of their applications that use similar resources but perform different functions.
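For illustration, here is a hedged sketch of a condition one might manage through the Monitoring v3 API (for example, with gcloud alpha monitoring policies create --policy-from-file); the display names, metric, and threshold are placeholder values:
displayName: Example CPU policy
combiner: OR
conditions:
- displayName: High CPU utilization
  conditionThreshold:
    # Alert only on time series matching this metric and resource type
    filter: metric.type="compute.googleapis.com/instance/cpu/utilization" resource.type="gce_instance"
    comparison: COMPARISON_GT
    thresholdValue: 0.8
    duration: 300s
    aggregations:
    - alignmentPeriod: 60s
      perSeriesAligner: ALIGN_MEAN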

But what about users who want to specify similar filters and aggregations using the Stackdriver UI? How can you get a more precise way to define the behavior that a metric must exhibit for the condition to be met (for example, alerting on certain resources filtered by metadata), as well as a more visual way of finding the right metrics to alert on for your applications?

We’ve got you covered. We are excited to announce the beta version of our new alerting condition configuration UI. In addition to letting you define alerting conditions more precisely, the new UI provides an easier, more visual way to find the metrics to alert on. It uses the same metrics selector as Stackdriver’s Metrics Explorer, so the UI you already use to select metrics for charts can now also create and edit threshold conditions for alerting policies, supporting a broader set of conditions. It’s a more powerful and complete method for identifying your time series and specific aggregations, letting you express more targeted, actionable alerts with fewer false positives.

We’ve already seen some great use cases for this functionality. Here are some ways in which our users have used this UI during early testing:

1. Alerting on aggregations of custom metrics and logs-based metrics
The ability to alert on aggregations of custom metrics or logs-based metrics is a common request from our users. This was recently made possible with the introduction of support for alerting policy management in the Stackdriver Monitoring v3 API. Until this beta launch, however, there was no visual equivalent. With the new UI, you can now visually explore metrics and preview their behavior before committing to an alerting policy.

For example, below is a screen recording that shows how to aggregate a sum across a custom metric, grouped by pod:
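In metrics-selector terms, that selection corresponds roughly to the following sketch; the custom metric name (queue_depth) and resource type are assumptions for illustration:
# Sum a custom metric across time series, grouped by pod
filter: metric.type="custom.googleapis.com/queue_depth" resource.type="gke_container"
aggregations:
- alignmentPeriod: 60s
  perSeriesAligner: ALIGN_MEAN
  crossSeriesReducer: REDUCE_SUM
  groupByFields:
  - resource.label.pod_id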

2. Filter metadata to alert on specific Kubernetes resources
With the recent introduction of Stackdriver Kubernetes Monitoring, you have more out-of-the-box observability into your Kubernetes clusters. Now, with the addition of this new threshold condition UI, you can set up alerts on specific resources defined by metadata fields, instead of having to include the entire cluster.

For example, below is a screen recording showing how to alert when Kubernetes resources with a specific service name (customers-service) cross a certain aggregated threshold of bytes transmitted. Using the metrics selector, you can configure the specific filters, grouping and aggregations that you’re interested in:
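A hedged sketch of what such a metadata-filtered condition might look like; the metric name and the metadata label key are assumptions for illustration:
# Restrict the condition to pods whose metadata carries the service name
filter: metric.type="kubernetes.io/pod/network/sent_bytes_count" metadata.user_labels."service_name"="customers-service"
aggregations:
- alignmentPeriod: 60s
  perSeriesAligner: ALIGN_RATE
  crossSeriesReducer: REDUCE_SUM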

3. Edit metric threshold conditions that were created via the API
Many Stackdriver users utilize both the API and the alerting UI to create and edit alerting conditions. With this release, many conditions that were previously created using the API can now be edited directly in the new UI.

Getting started with the new Stackdriver condition editor UI
To use the new UI, you must first opt in. When adding a policy condition, go to the Select condition type page. At the top of this page is an invitation to try a new variant of the UI.

Note that the new condition editor does not support process-health and uptime-check conditions, which continue to use the existing UI. The new UI supports all other condition types.

If you prefer to go back to the current UI, you can do so at any time by opting out. We’re looking forward to hearing more from users about what you’re accomplishing with the new UI.

To learn more, check out the specifics of using the alerting UI here.

Please send us feedback either via the feedback widget (click on your avatar -> Send Feedback), or by emailing us.

Related content:
New ways to manage and automate your Stackdriver alerting policies
Extracting value from your logs with Stackdriver logs-based metrics
Announcing Stackdriver Kubernetes Monitoring: Comprehensive Kubernetes observability from the start


Kubernetes best practices: mapping external services



Editor’s note: Today is the sixth installment in a seven-part video and blog series from Google Developer Advocate Sandeep Dinesh on how to get the most out of your Kubernetes environment.

If you’re like most Kubernetes users, chances are you use services that live outside your cluster. For example, maybe you use the Twilio API to send text messages, or the Google Cloud Vision API to do image analysis.

If your applications in all of your environments connect to the same external endpoint, and you have no plans to bring the external service into your Kubernetes cluster, it is perfectly fine to use the external service endpoint directly in your code. However, there are many scenarios where this is not the case.

A good example of this is databases. While some cloud-native databases such as Cloud Firestore or Cloud Spanner use a single endpoint for all access, most databases have separate endpoints for different instances.

At this point, you may be thinking that a good solution to finding the endpoint is to use ConfigMaps. Simply store the endpoint address in a ConfigMap, and use it in your code as an environment variable. While this solution works, there are a few downsides. You need to modify your deployment to include the ConfigMap and write additional code to read from the environment variables. But most importantly, if the endpoint address changes, you may need to restart all running containers to get the updated endpoint address.
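For reference, a minimal sketch of that ConfigMap approach; the ConfigMap name, key, and endpoint value are hypothetical:
apiVersion: v1
kind: ConfigMap
metadata:
  name: db-config
data:
  DB_ENDPOINT: db.example.com:27017
---
# In the Deployment's container spec, the value is read into an environment variable:
env:
- name: DB_ENDPOINT
  valueFrom:
    configMapKeyRef:
      name: db-config
      key: DB_ENDPOINT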

In this episode of “Kubernetes best practices”, let’s learn how to leverage Kubernetes’ built-in service discovery mechanisms for services running outside the cluster, just like you can for services inside the cluster! This gives you parity across your dev and prod environments, and if you eventually move the service inside the cluster, you don’t have to change your code at all.

Scenario 1: Database outside cluster with IP address

A very common scenario is when you are hosting your own database, but doing so outside the cluster, for example on a Google Compute Engine instance. This is very common if you run some services inside Kubernetes and some outside, or need more customization or control than Kubernetes allows.

Hopefully, at some point, you can move all services inside the cluster, but until then you are living in a hybrid world. Thankfully, you can use static Kubernetes services to ease some of the pain.

In this example, I created a MongoDB server using Cloud Launcher. Because it is created in the same network (or VPC) as the Kubernetes cluster, it can be accessed using its high-performance internal IP address. In Google Cloud, this is the default setup, so there is nothing special you need to configure.

Now that we have the IP address, the first step is to create a service:
kind: Service
apiVersion: v1
metadata:
 name: mongo
spec:
 type: ClusterIP
 ports:
 - port: 27017
   targetPort: 27017
You might notice there are no Pod selectors for this service. This creates a service, but it doesn’t know where to send the traffic. This allows you to manually create an Endpoints object that will receive traffic from this service.

kind: Endpoints
apiVersion: v1
metadata:
 name: mongo
subsets:
 - addresses:
     - ip: 10.240.0.4
   ports:
     - port: 27017
You can see that the Endpoints object manually defines the IP address for the database, and it uses the same name as the service. Kubernetes uses all the IP addresses defined in the Endpoints object as if they were regular Kubernetes Pods. Now you can access the database with a simple connection string:
mongodb://mongo
No need to use IP addresses in your code at all! If the IP address changes in the future, you can update the Endpoints object with the new IP address, and your applications won’t need to make any changes.

Scenario 2: Remotely hosted database with URI

If you are using a hosted database service from a third party, chances are they give you a uniform resource identifier (URI) that you can use to connect. If they give you an IP address instead, you can use the method in Scenario 1.

In this example, I have two MongoDB databases hosted on mLab. One of them is my dev database, and the other is production.

The connection strings for these databases are as follows:
mongodb://<dbuser>:<dbpassword>@ds149763.mlab.com:49763/dev
mongodb://<dbuser>:<dbpassword>@ds145868.mlab.com:45868/prod
mLab gives you a dynamic URI and a dynamic port, and you can see that they are both different. Let’s use Kubernetes to create an abstraction layer over these differences. In this example, let’s connect to the dev database.

You can create an “ExternalName” Kubernetes service, which gives you a static Kubernetes service that redirects traffic to the external service. This service does a simple CNAME redirection at the DNS level, so the performance impact is minimal.

The YAML for the service looks like this:
kind: Service
apiVersion: v1
metadata:
 name: mongo
spec:
 type: ExternalName
 externalName: ds149763.mlab.com
Now, you can use a much simpler connection string:
mongodb://<dbuser>:<dbpassword>@mongo:<port>/dev
Because “ExternalName” uses CNAME redirection, it can’t do port remapping. This might be okay for services with static ports, but unfortunately it falls short in this example, where the port is dynamic. mLab’s free tier gives you a dynamic port number and you cannot change it. This means you need a different connection string for dev and prod.

However, if you can get the IP address, then you can do port remapping as I will explain in the next section.

Scenario 3: Remotely hosted database with URI and port remapping

While the CNAME redirect works great for services with the same port for each environment, it falls short in scenarios where the different endpoints for each environment use different ports. Thankfully we can work around that using some basic tools.

The first step is to get the IP address from the URI.

If you run the nslookup, host, or ping command against the URI’s hostname, you can get the IP address of the database.
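For example, a quick sketch of resolving the dev database’s hostname (output abridged; the address here matches the Endpoints example below):
$ nslookup ds149763.mlab.com
...
Name:    ds149763.mlab.com
Address: 35.188.8.12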

You can now create a service that remaps the mLab port and an endpoint for this IP address.
kind: Service
apiVersion: v1
metadata:
 name: mongo
spec:
 ports:
 - port: 27017
   targetPort: 49763
---
kind: Endpoints
apiVersion: v1
metadata:
 name: mongo
subsets:
 - addresses:
     - ip: 35.188.8.12
   ports:
     - port: 49763
Note: A URI might use DNS to load-balance to multiple IP addresses, so this method can be risky if the IP addresses change! If you get multiple IP addresses from the above command, you can include all of them in the Endpoints YAML, and Kubernetes will load balance traffic to all the IP addresses.

With this, you can connect to the remote database without needing to specify the port. The Kubernetes service does the port remapping transparently!
mongodb://<dbuser>:<dbpassword>@mongo/dev

Conclusion

Mapping external services to internal ones gives you the flexibility to bring these services into the cluster in the future while minimizing refactoring efforts. Even if you don’t plan to bring them in today, you never know what tomorrow might bring! Additionally, it makes it easier to manage and understand which external services your organization is using.

If the external service has a valid domain name and you don’t need port remapping, then using the “ExternalName” service type is an easy and quick way to map the external service to an internal one. If you don’t have a domain name or need to do port remapping, simply add the IP addresses to an endpoint and use that instead.

Going to Google Cloud Next ’18? Stop by to meet me and other Kubernetes team members in the "Meet the Experts" zone! Hope to see you there!


Beyond CPU: horizontal pod autoscaling with custom metrics in Google Kubernetes Engine



Many customers of Kubernetes Engine, especially enterprises, need to autoscale their environments based on more than just CPU usage—for example, queue length or concurrent persistent connections. In Kubernetes Engine 1.9 we started adding features to address this, and today, with the latest beta release of Horizontal Pod Autoscaler (HPA) on Kubernetes Engine 1.10, you can configure your deployments to scale horizontally in a variety of ways.

To walk you through your different horizontal scaling options, meet Barbara, a DevOps engineer working at a global video-streaming company. Barbara runs her environment on Kubernetes Engine, including the following microservices:
  • A video transcoding service that processes newly uploaded videos
  • A Google Cloud Pub/Sub queue for the list of videos that the transcoding service needs to process
  • A video-serving frontend that streams videos to users
A high-level diagram of Barbara’s application.

To make sure she meets the service level agreement for the latency of processing the uploads (which her company defines as the total turnaround time for an uploaded file), Barbara configures the transcoding service to scale horizontally based on the queue length—adding more replicas when there are more videos to process, or removing replicas and saving money when the queue is short. In Kubernetes Engine 1.10 she accomplishes that by using the new ‘External’ metric type when configuring the Horizontal Pod Autoscaler. You can read more about this here.

apiVersion: autoscaling/v2beta1                                                 
kind: HorizontalPodAutoscaler                                                   
metadata:                                                                       
  name: transcoding-worker                                                                    
  namespace: video                                                            
spec:                                                                           
  minReplicas: 1                                                                
  maxReplicas: 20                                                                
  metrics:                                                                      
  - external:                                                                      
      metricName: pubsub.googleapis.com|subscription|num_undelivered_messages   
      metricSelector:                                                           
        matchLabels:                                                            
          resource.labels.subscription_id: transcoder_subscription                            
      targetAverageValue: "10"                                                   
    type: External                                                              
  scaleTargetRef:                                                               
    apiVersion: apps/v1                                              
    kind: Deployment                                                            
    name: transcoding-worker
To handle scaledowns correctly, Barbara also sets pod graceful-termination periods that are long enough to allow any transcoding already happening on the pods to complete. She also writes her application to stop processing new queue items after it receives the SIGTERM termination signal from Kubernetes Engine.
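A minimal sketch of that setting in the transcoding worker’s pod template; the 10-minute value is a placeholder sized to her longest expected transcode:
spec:
  template:
    spec:
      # Give in-flight transcodes up to 10 minutes to finish after SIGTERM
      terminationGracePeriodSeconds: 600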
A high-level diagram of Barbara’s application showing the scaling bottleneck.

Once the videos are transcoded, Barbara needs to ensure great viewing experience for her users. She identifies the bottleneck for the serving frontend: the number of concurrent persistent connections that a single replica can handle. Each of her pods already exposes its current number of open connections, so she configures the HPA object to maintain the average value of open connections per pod at a comfortable level. She does that using the Pods custom metric type.

apiVersion: autoscaling/v2beta1                                                 
kind: HorizontalPodAutoscaler                                                   
metadata:                                                                       
  name: frontend                                                                    
  namespace: video                                                            
spec:                                                                           
  minReplicas: 4                                                                
  maxReplicas: 40                                                                
  metrics:  
  - type: Pods
    pods:
      metricName: open_connections
      targetAverageValue: 100                                                                                                                            
  scaleTargetRef:                                                               
    apiVersion: apps/v1                                              
    kind: Deployment                                                            
    name: frontend
To scale based on the number of concurrent persistent connections as intended, Barbara also configures readiness probes so that any saturated pods are temporarily removed from the service until their situation improves. She also ensures that the streaming client can quickly recover if its current serving pod is scaled down.
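A hedged sketch of such a readiness probe, assuming the frontend exposes a health endpoint (the path and port are hypothetical) that fails while the pod is saturated:
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  # After 3 failed checks the pod is marked unready and removed from the Service
  failureThreshold: 3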

It is worth noting that her pods expose the open_connections metric on an endpoint that Prometheus can scrape. Barbara uses the prometheus-to-sd sidecar to make those metrics available in Stackdriver. To do that, she adds the following YAML to her frontend deployment config. You can read more about different ways to export metrics and use them for autoscaling here.

containers:
  ...
  - name: prometheus-to-sd
    image: gcr.io/google-containers/prometheus-to-sd:v0.2.6
    command:
    - /monitor
    - --source=:http://localhost:8080
    - --stackdriver-prefix=custom.googleapis.com
    - --pod-id=$(POD_ID)
    - --namespace-id=$(POD_NAMESPACE)
    env:
    - name: POD_ID
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.uid
    - name: POD_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
Recently, Barbara’s company introduced a new feature: streaming live videos. This introduces a new bottleneck to the serving frontend. It now needs to transcode some streams in real time, which consumes a lot of CPU and decreases the number of connections that a single replica can handle.
A high-level diagram of Barbara’s application showing the new bottleneck due to CPU intensive live transcoding.
To deal with that, Barbara uses an existing feature of the Horizontal Pod Autoscaler to scale based on multiple metrics at the same time—in this case both the number of persistent connections as well as CPU consumption. HPA selects the maximum signal of the two, which is then used to trigger autoscaling:

apiVersion: autoscaling/v2beta1                                                 
kind: HorizontalPodAutoscaler                                                   
metadata:                                                                       
  name: frontend                                                                    
  namespace: video                                                            
spec:                                                                           
  minReplicas: 4                                                                
  maxReplicas: 40                                                                
  metrics:  
  - type: Pods
    pods:
      metricName: open_connections
      targetAverageValue: 100
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 60                                                                                                                        
  scaleTargetRef:                                                               
    apiVersion: apps/v1                                              
    kind: Deployment                                                            
    name: frontend
These are just some of the scenarios that HPA on Kubernetes can help you with.

Take it for a spin

Try Kubernetes Engine today with our generous 12-month free trial of $300 credits. Spin up a cluster (or a dozen) and experience the difference of running Kubernetes on Google Cloud, the cloud built for containers. And watch this space for future posts about how to use Cluster Autoscaler and Horizontal Pod Autoscaler together to make the most out of Kubernetes Engine.

Get higher availability with Regional Persistent Disks on Google Kubernetes Engine



Building highly available stateful applications on Kubernetes has been a challenge for a long time. As such, many enterprises have to write complex application logic on top of Kubernetes APIs for running applications such as databases and distributed file systems.

Today, we’re excited to announce the beta launch of Regional Persistent Disks (Regional PD) for Kubernetes Engine, making it easier for organizations of all sizes to build and run highly available stateful applications in the cloud. While Persistent Disks (PD) are zonal resources, and applications built on top of PDs can become unavailable in the event of a zonal failure, Regional PDs provide network-attached block storage with synchronous replication of data between two zones in a region. This approach maximizes application availability without sacrificing consistency. Regional PD automatically handles transient storage unavailability in a zone, and provides an API to facilitate cross-zone failover (learn more about this in the documentation).

Regional PD has native integration with the Kubernetes master that manages health monitoring and failover to the secondary zone in case of an outage in the primary zone. With a Regional PD, you also take advantage of replication at the storage layer, rather than worrying about application-level replication. This offers a convenient building block for implementing highly available solutions on Kubernetes Engine, and can provide cross-zone replication to existing legacy services. Internally, for instance, we used Regional PD to implement high availability in Google Cloud SQL for Postgres.

To understand the power of Regional PD, imagine you want to deploy WordPress with MySQL in Kubernetes. Before Regional PD, if you wanted to build an HA configuration, you needed to write complex application logic, typically using custom resources, or use a commercial replication solution. Now, you can simply use Regional PD as the storage backend for WordPress and MySQL. Because of the block-level replication, the data is always present in another zone, so you don’t need to worry about a zonal outage.

With Regional PD, you can build a two-zone HA solution by simply changing the storage class definition in the dynamic provisioning specification—no complex Kubernetes controller management required!

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: repd
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  replication-type: regional-pd
  zones: us-central1-a, us-central1-b
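A PersistentVolumeClaim can then request regionally replicated storage through that class; a minimal sketch, with a hypothetical claim name and size:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mysql-data
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: repd
  resources:
    requests:
      storage: 200Gi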
Alternatively, you can manually provision a Regional PD using the gcloud command line tool:
gcloud beta compute disks create gce-disk-1 \
    --region europe-west1 \
    --replica-zones europe-west1-b,europe-west1-c
Regional PDs are now available in Kubernetes Engine clusters. To learn more, check out the documentation, Kubernetes guide and sample solution using Regional PD.

Better cost control with Google Cloud Billing programmatic notifications



By running your workloads on Google Cloud Platform (GCP) you have access to the tools you need to build and scale your business. At the same time, it’s important to keep your costs under control by informing users and managing their spending.

Today, we’re adding programmatic budget notifications to Google Cloud Billing, a powerful feature that helps you stick to your budget and take automatic action when your spending gets off track.

Monitor your costs
You can use Cloud Billing budget notifications with third-party or homegrown cost-management solutions, as well as Google Cloud services. For example, as an engineering manager, you can set up budget notifications to alert your entire team through Slack every time you hit 80 percent of your budget.

Control your costs
You can also configure automated actions based on the notifications to control your costs, such as selectively turning off particular resources or terminating all resources for a project. For example, as a PhD student working in a lab with a fixed grant amount, you can use budget notifications to trigger a billing cap that shuts down your project when you use up your grant. This way, you can be confident that you won’t go over budget.

Work with your existing workflow and tools
To make it easy to get started with budget notifications, we’ve included examples of reference architectures for a few common use cases in our documentation:
  • Monitoring - listen to your Cloud Pub/Sub notifications with Cloud Functions
  • Forward notifications to Slack - send custom billing alerts with the current spending for your budget to a Slack channel
  • Cap (disable) billing on a project - disable billing for a project and terminate all resources to make sure you don’t overspend
  • Selectively control resources - terminate expensive resources without disabling your whole environment
Get started
You can set up programmatic budget notifications in a few simple steps:

  1. Navigate to Billing in the Google Cloud Console and create your budget.
  2. Enable Cloud Pub/Sub, then set up a Cloud Pub/Sub topic for your budget (a command-line sketch follows these steps).
  3. When creating your budget, you will see a new section, “Manage notifications”, where you can configure your Cloud Pub/Sub topic.
  4. Set up a Cloud Function to listen to budget notifications and trigger an action.
Cloud Billing sends budget notifications multiple times per day, so you will always have the most up-to-date information on your spending.
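For those who prefer the command line, here is a hedged sketch of steps 2 and 4 using gcloud; the topic and function names are placeholders, and the function source code is assumed to exist in the working directory:
# Step 2: create the Cloud Pub/Sub topic the budget will publish to
gcloud pubsub topics create budget-notifications

# Step 4: deploy a Cloud Function triggered by messages on that topic
gcloud functions deploy handle-budget-notification \
    --trigger-topic budget-notifications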
You can get started today by reading the Google Cloud Billing documentation. If you’ll be at Google Cloud Next ‘18, be sure to come by my session on Google Cloud Billing and cost control.

Google Cloud named a leader in latest Forrester Research Public Cloud Platform Native Security Wave



Today, we are happy to announce that Forrester Research has named Google Cloud as one of just two leaders in The Forrester Wave™: Public Cloud Platform Native Security (PCPNS), Q2 2018 report, and rated Google Cloud highest in strategy. The report evaluates the native security capabilities and features of public cloud providers, such as encryption, identity and access management (IAM) and workload security.

The report finds that most security and risk professionals (S&R) now believe that “native security capabilities of large public cloud platforms actually offer more affordable and superior security than what S&R teams could deliver themselves if the workloads remained on-premises.”

The report particularly highlights that “Google has been continuing to invest in PCPNS. The platform’s security configuration policies are very granular in the admin console as well as in APIs. The platform has a large number of security certifications, broad partner ecosystem, offers deep native support for guest operating systems and Kubernetes containers, and supports auto-scaling (GPUs can be added to instances).”

In this wave, Forrester evaluated seven public cloud platforms against 37 criteria, looking at current offerings, strategy and market presence. Of the seven vendors, Google Cloud scored highest on strategy, and received the highest possible score for its strategic plans across the physical security, certifications and attestations, hypervisor security, guest OS workload security, network security, and machine learning criteria.

Further, Forrester cited Google Cloud’s security roadmap. As part of our roadmap, Google Cloud continues to redefine what’s possible in the cloud with unique security capabilities like Access Transparency, Istio, Identity-Aware Proxy, VPC Service Controls, and Asylo.

“The vendor plans to: 1) provide ongoing security improvements to the admin console using device trust, location, etc., 2) implement hardware-backed encryption key management, and 3) improve visibility into the platform by launching a unified risk dashboard."

At Google, we have worked for over a decade to build a secure, scalable and flexible cloud foundation. Our belief is that if you put security first, everything else will follow. Security continues to be top of mind—from our custom hardware like our Titan chip, to data encryption both at rest and in transit by default. On this strong foundation, we offer enterprises a rich set of controls and capabilities to meet their security and compliance requirements.

You can download the full Forrester Public Cloud Platform Native Security Wave Q2 2018 report here. To learn more about GCP, visit our website, and sign up for a free trial.

Let’s hit the road! Join Google Developers Community Roadshow

Posted by Przemek Pardel, Developer Relations Program Manager, Regional Lead

This summer, the Google Developers team is touring 10 countries and 14 cities in Europe in a colorful community bus. We'll be visiting university campuses and technology parks to meet you locally and talk about our programs for developers and start-ups.

Join us to find out how Google supports developer communities. Learn about Google Developer Groups, the Women Techmakers program, and the various ways we engage with the broader developer community in Europe and around the world.

Our bus will stop in the following locations between 12:00 and 4:00 pm:

  • 4th June, Estonia, Tallinn
  • 6th June, Latvia, Riga
  • 8th June, Lithuania, Vilnius
  • 11th June, Poland, Gdańsk
  • 13th June, Poland, Poznań
  • 15th June, Poland, Kraków
  • 18th June, Slovenia, Ljubljana
  • 19th June, Croatia, Zagreb
  • 21st June, Bulgaria, Sofia

Want to meet us on the way? Sign up for the event in your city here.

What to expect:

  • Information: learn more about how Google supports developer communities around the world, from content and speakers to a global network
  • Network: with other community organizers from your city
  • Workshops: join some of our product workshops on tour (Actions on Google, Google Cloud, Machine Learning), and meet with Google teams
  • Fun: live music, games and more!

Are you interested in starting a new developer community or are you an organizer who would like to join the global Google Community Program? Let us know and receive an invitation-only pass to our private events.

Google Developers team