Tag Archives: Announcements

BigQuery arrives in the Tokyo region



We launched BigQuery, our enterprise data warehouse solution, in 2011 so that our customers could leverage the processing power of Google's infrastructure to perform super-fast SQL queries. And although BigQuery is already available to all our customers no matter where they’re located, many enterprises need additional options for storing data and performing analysis in the countries where they operate. That’s why we’re rolling out regional availability for BigQuery. Google Cloud’s Tokyo region is the first expansion site in a globally phased rollout that will continue throughout 2018 and 2019.

By bringing BigQuery availability to places like Tokyo, we’re helping more enterprises analyze and draw actionable insights from their business data at scales that would be prohibitive or impossible with legacy data warehouses. Mizuho Bank—a leading global bank with one of the largest customer bases in Japan and a global network of financial and business centers—is one of many businesses exploring the potential of the local BigQuery resources newly at their disposal.
“We believe BigQuery will become a significant driver to transform the way our data analysts work. Mizuho Bank currently runs SQL queries in on-premise data warehouses, but we used to experience issues of long processing time due to a fixed limit on our data-processing resources. We have wanted to move the data to the cloud for some time. Now that BigQuery is available at Google Cloud’s Tokyo region, we were able to conduct a proof of concept (PoC) under the conditions that satisfy Mizuho’s security requirements. We used real-world data to design the PoC to reflect our day-to-day operations. 
With BigQuery, we no longer need to worry about limited data-processing resources. We can aggregate processing tasks, perform parallel processing on large queries, and engage multiprocessing in order to substantially reduce working hours of data analysts collaborating across multiple departments. This streamlining can potentially generate more time for our data analysts to work on designing queries and models, and then interpreting the results. BigQuery can not only achieve cost-savings over the existing system but also provides integrated services that are very easy to use. 
The PoC has been a great success as we were able to confirm that by using data processing and visualization features like Dataprep and Data Studio with BigQuery, we can conduct all these data analysis processes seamlessly in the cloud.”

—Yoshikazu Kurosu, Senior Manager of Business Development Department, Mizuho Bank, Ltd. 

Financial services is one of several industries that depend on the robust security Google Cloud offers when storing and analyzing sensitive data. Telecom providers share many of the same challenges. NTT Communications, a global leader in information and communication technology (ICT) solutions, is also looking to BigQuery for its speed and scale.
“We’ve been using BigQuery on our Enterprise Cloud, a service we provide for enterprises, to detect and analyze anomalies in our network and servers. In our proof of concept (PoC) using BigQuery in Google Cloud’s Tokyo region, we performed evaluations of large-scale log data (cumulative total of 27.8 billion records) streamed real-time from nine regions around the world. Data analytics infrastructure requires real-time handling of extensive log data generated by both equipment and software. Our infrastructure also depends on speed and power to perform ultra-high speed analysis and output of time-series data.

BigQuery achieves high-speed response even in emergency situations, offers an excellent cost-performance ratio, and enables usage and application of large-scale log data that exceeds the capabilities of traditional data warehouses. We will continue to strengthen our cooperation with GCP services, BigQuery included, to provide cloud solutions to support secure data usage on behalf of our customers.” 
—Masaaki Moribayashi, Senior Vice President, Cloud Services, NTT Communications Corporation 
We hope regional availability will help more enterprises use BigQuery to store and analyze their sensitive data in a way that addresses local requirements.

To learn more about BigQuery, visit our website. And to get started using BigQuery in the US, EU or Tokyo, read our documentation on working with Dataset Locations.
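If you’d like to try this from the command line, the sketch below shows one way to create a dataset pinned to the Tokyo region (asia-northeast1) and query it with the bq tool; the project, dataset and table names are placeholders, and the exact flags are worth confirming against the dataset locations documentation mentioned above.

# Create a dataset whose data is stored in the Tokyo region
bq --location=asia-northeast1 mk --dataset my_project:tokyo_dataset

# Run a standard SQL query against a table in that dataset
bq --location=asia-northeast1 query --use_legacy_sql=false \
    'SELECT COUNT(*) FROM `my_project.tokyo_dataset.my_table`'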

Dialogflow Enterprise Edition is now generally available



Back in November, we announced the beta of Dialogflow Enterprise Edition. Today, on the heels of introducing Cloud Text-to-Speech and updating Cloud Speech-to-Text, we’re making Dialogflow Enterprise Edition generally available, complete with the support and Service Level Agreement (SLA) that businesses need for their production deployments. It’s been a busy month for Cloud AI and conversational technology!

Hundreds of thousands of developers already use Dialogflow (formerly API.AI) to build voice- and text-based conversational experiences powered by machine learning and natural language understanding. Even if you have little or no experience in conversational interfaces, you can build natural and rich experiences in days or weeks with the easy-to-use Dialogflow platform. Further, Dialogflow's cross-platform capabilities let you build the experience once and launch it across websites, mobile apps and a variety of popular platforms and devices, including the Google Assistant, Amazon Alexa, and Facebook Messenger.

Starting today, Dialogflow API V2 is generally available and is now the default for all new agents. It integrates with Google Cloud Speech-to-Text, enables agent management via API, supports gRPC, and provides an easy transition to Enterprise Edition with no code migration.
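As a rough sketch of what calling the V2 API looks like, the request below sends a text query to an agent's detectIntent endpoint over REST; the project ID, session ID and query text are placeholders, and it assumes you've set up application-default credentials with access to the agent's project.

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  -d '{"queryInput": {"text": {"text": "I want to book a flight", "languageCode": "en-US"}}}' \
  "https://dialogflow.googleapis.com/v2/projects/<project id>/agent/sessions/<session id>:detectIntent"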

We're constantly expanding Dialogflow's capabilities to provide you with the best developer experience, and we've added a number of other new features since the beta.

KLM Royal Dutch Airlines, Domino’s and Ticketmaster have built conversational experiences with Dialogflow, helping them meet their customers where they are and assist them across their journeys. And for DPD, the UK’s leading parcel delivery company, “Dialogflow made it easy to build an AI-powered conversational experience that delights consumers using the resources and skill sets we already have. We estimate that Dialogflow helped us get our conversational interface to market 12 months sooner than planned,” says Max Glaisher, Product Innovation Manager at DPD.

Innovative businesses across industries are adopting Dialogflow Enterprise Edition


Companies are now turning to Dialogflow Enterprise Edition as they scale up to engage with growing numbers of users across multiple platforms.

Ubisoft

Video game publisher Ubisoft uses Dialogflow as part of “Sam,” a personal gaming assistant that delivers personalized information and tips related to its video games and services. “Using Dialogflow Enterprise Edition, Ubisoft has access to a natural language processing system for Sam that can understand text and voice input efficiently out of the box,” says Thomas Belmont, Producer at Ubisoft. He adds that while developing Sam, “The team needed tools that let them iterate quickly and make modifications immediately, and Dialogflow Enterprise Edition was the best choice for those needs.” Dialogflow has also helped ensure a good user experience, with Sam answering 88% of player (user) requests in its first three months as a beta.



Best Buy Canada

To enhance the online shopping experience, Best Buy Canada built a conversational experience to make it quicker and easier for consumers to find the information they need using the Google Assistant. “Using Dialogflow, we've been able to steadily increase user sessions with our agent,” says Chris Rowinski, Product Owner at Best Buy Canada. “In the coming months, we plan on moving to Dialogflow Enterprise Edition so we can continue to scale up as our users and user engagement grow on voice- and screen-based devices."

(Experience available in Canada only)

Ticketmaster

Ticketmaster, the world’s top live-event ticketing company, picked Dialogflow to build a conversational experience for event discovery and ticketing, and is now also developing a solution to optimize interactive voice response (IVR) for customer service. “I remember how excited I was the first time I saw Dialogflow; my mind started racing with ideas about how Ticketmaster could benefit from a cloud-based natural language processing provider,” says Tariq El-Khatib, Product Manager at Ticketmaster. “Now with the launch of Dialogflow Enterprise Edition, I can start turning those ideas into reality. With higher transaction quotas and support levels, we can integrate Dialogflow with our Customer Service IVR to increase our rate of caller intent recognition and improve customer experience.”


Try Dialogflow today using a free credit


See the quickstart to set up a Google Cloud Platform project and quickly create a Dialogflow Enterprise Edition agent. Remember, you get a $300 free credit to get started with any GCP product (good for 12 months).

Introducing kaniko: Build container images in Kubernetes and Google Container Builder without privileges



Building images from a standard Dockerfile typically relies upon interactive access to a Docker daemon, which requires root access on your machine to run. This can make it difficult to build container images in environments that can’t easily or securely expose their Docker daemons, such as Kubernetes clusters (for more about this, check out the 16th oldest open Kubernetes issue).

To overcome these challenges, we’re excited to introduce kaniko, an open-source tool for building container images from a Dockerfile even without privileged root access. kaniko both builds an image from a Dockerfile and pushes it to a registry. Since it doesn’t require any special privileges or permissions, you can run kaniko in a standard Kubernetes cluster, in Google Kubernetes Engine, or in any environment that can’t grant privileged access or run a Docker daemon.

How does kaniko work?


We run kaniko as a container image that takes in three arguments: a Dockerfile, a build context and the name of the registry to which it should push the final image. This image is built from scratch, and contains only a static Go binary plus the configuration files needed for pushing and pulling images.


The kaniko executor then fetches and extracts the base-image file system to root (the base image is the image in the FROM line of the Dockerfile). It executes each command in order, and takes a snapshot of the file system after each command. This snapshot is created in user-space by walking the filesystem and comparing it to the prior state that was stored in memory. It appends any modifications to the filesystem as a new layer to the base image, and makes any relevant changes to image metadata. After executing every command in the Dockerfile, the executor pushes the newly built image to the desired registry.

Kaniko unpacks the filesystem, executes commands and snapshots the filesystem completely in user-space within the executor image, which is how it avoids requiring privileged access on your machine. The docker daemon or CLI is not involved.

Running kaniko in a Kubernetes cluster


To run kaniko in a standard Kubernetes cluster your pod spec should look something like this, with the args parameters filled in. In this example, a Google Cloud Storage bucket provides the build context.

apiVersion: v1
kind: Pod
metadata:
  name: kaniko
spec:
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:latest
    args: ["--dockerfile=<path to Dockerfile>",
           "--bucket=<GCS bucket>",
           "--destination=<gcr.io/$PROJECT/$REPO:$TAG>"]
    volumeMounts:
    - name: kaniko-secret
      mountPath: /secret
    env:
    - name: GOOGLE_APPLICATION_CREDENTIALS
      value: /secret/kaniko-secret.json
  restartPolicy: Never
  volumes:
  - name: kaniko-secret
    secret:
      secretName: kaniko-secret

You’ll need to mount a Kubernetes secret that contains the necessary authentication to push the final image to a registry. You can find instructions for downloading the secret here.
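As a sketch, assuming you push to Google Container Registry with a dedicated service account, creating that secret could look something like the following. The key file name matches the GOOGLE_APPLICATION_CREDENTIALS path in the pod spec above; the service account email is a placeholder and needs permission to read the build-context bucket and push to your registry.

# Create a JSON key for the service account kaniko will use
gcloud iam service-accounts keys create kaniko-secret.json \
    --iam-account=<service account email>

# Store the key as the Kubernetes secret mounted by the pod spec above
kubectl create secret generic kaniko-secret --from-file=kaniko-secret.json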

Running kaniko in Google Cloud Container Builder


To run kaniko in Google Cloud Container Builder, we can add it as a build step to the build config:

steps:
 - name: gcr.io/kaniko-project/executor:latest
   args: ["--dockerfile=<path to Dockerfile>",
          "--context=<path to build context>",
          "--destination=<gcr.io/[PROJECT]/[IMAGE]:[TAG]>"]
The kaniko executor image will both build and push the image in this build step.
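Assuming the config above is saved as cloudbuild.yaml next to your Dockerfile, you can then submit the build with gcloud. The command below is a sketch using the Container Builder CLI surface available at the time of writing:

gcloud container builds submit --config cloudbuild.yaml .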


Comparison with other tools


Similar tools to kaniko include img and orca-build. Like kaniko, both tools build container images from Dockerfiles, but with different approaches and security trade-offs. The img tool builds as an unprivileged user within a container, while kaniko builds as a root user within a container in an unprivileged environment. The orca-build tool executes builds by wrapping runc, which uses kernel namespacing techniques to execute RUN commands. We're able to accomplish the same thing in kaniko by executing commands as a root user within a container.


Conclusion


You can find more documentation about running kaniko in our GitHub repo. Please open an issue if you run into any bugs! If you have any questions or feedback you can also reach out to us in our Google group.

How to dynamically generate GCP IAM credentials with a new HashiCorp Vault secrets engine



Applications often require secrets such as credentials at build- or run-time. These credentials are an assertion of a service’s or user’s identity that they can use to authenticate to other services. On Google Cloud Platform (GCP), you can manage services or temporary users using Cloud Identity and Access Management (IAM) service accounts, which are identities whose credentials your application code can use to access other GCP services. You can access a service account from code running on GCP, in your on-premises environment, or even another cloud.

Protecting service account keys is critical—you should tightly control access to them, rotate them, and make sure they're not committed in code. But managing these credentials shouldn’t get harder as your number of services increases. HashiCorp Vault is a popular open source tool for secret management that allows users to store, manage and control access to tokens, passwords, certificates, API keys and many other secrets. Vault supports pluggable mechanisms known as secrets engines for managing different secret types. These engines allow developers to store, rotate and dynamically generate many kinds of secrets in Vault. Because Vault's secrets engines are pluggable, they each provide different functionality. Some engines store static data, others manage PKI and certificates, and others manage database credentials.

Today, we're pleased to announce a Google Cloud Platform IAM secrets engine for HashiCorp Vault. This allows a user to dynamically generate IAM service account credentials with which a human, machine or application can access specific Google Cloud resources using their own identity. To limit the impact of a security incident, Vault allows credentials to be easily revoked.

This helps address some common use cases, for example:
  • Restricting application access for recurring jobs: An application runs a monthly batch job. Instead of a hard-coded, long-lived credential, the application can request a short-lived GCP IAM credential at the start of the job. After a short time, the credential automatically expires, reducing the surface area for a potential attack.
  • Restricting user access for temporary users: A contracting firm needs 90 days of read-only access to build dashboards. Instead of someone generating this credential and distributing it to the firm, the firm requests this credential through Vault. This creates a 1-1 mapping of the credential to its users in audit and access logs.

Getting started with the IAM service account secret engine


Let’s go through an example for generating new service account credentials using Vault. This example assumes the Vault server is already up and running.

You can also watch a demo of the backend in our new video below.
First, enable the secrets engine:

$ vault secrets enable gcp 
Then, set up the engine with initial config and role sets:

$ vault write gcp/config \
    credentials=@path/to/creds.json \
    ttl=3600 \
    max_ttl=86400

This config supplies default credentials that Vault will use to generate the service account keys and access tokens, as well as TTL metadata for the leases Vault assigns to these secrets when generated.

Role sets define the sets of IAM roles, bound to specific resources, that you assign to generated credentials. Each role set can generate one of two types of secrets: either `access_token` for one-use OAuth access tokens or `service_account_key` for long-lived service account keys. Here are some examples for both types of rolesets:

# Create role sets
$ vault write gcp/roleset/token-role-set \
    project="myproject" \
    secret_type="access_token" \
    bindings=@token_bindings.hcl \
    token_scopes="https://www.googleapis.com/auth/cloud-platform"

$ vault write gcp/roleset/key-role-set \
    project="myproject" \
    secret_type="service_account_key" \
    bindings=@key_bindings.hcl
The bindings parameter expects a string (or, using the special Vault `@` syntax, a path to a file containing that string, as in the examples above) in the following HCL (or JSON) format:
resource "path/to/my/resource" {
    roles = [
      "roles/viewer",
      "roles/my-other-role",
    ]
}

resource "path/to/another/resource" {
    roles = [
      "roles/editor",
    ]
}
Creating a role set also generates a dedicated service account for that role set. When a user generates a set of credentials, they specify a role set (and thus a service account) under which to create the credentials.

Once you have set up the secrets engine, a Vault client can easily generate new secrets:

$ vault read gcp/key/key-role-set

Key                 Value
---                 -----
lease_id            gcp/key/key-roleset/
lease_duration      1h
lease_renewable     true
key_algorithm       KEY_ALG_RSA_2048
key_type            TYPE_GOOGLE_CREDENTIALS_FILE
private_key_data    


$ vault read gcp/token/token-role-set

Key                 Value
---                 -----
lease_id           gcp/token/token-role-set/
lease_duration     59m59s
lease_renewable    false
token              ya29.c.restoftoken...
These credentials can then be used to make calls to GCP APIs as needed and can be automatically revoked by Vault.
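Because every credential Vault issues is tied to a lease, revoking it is a single command. As a sketch (the lease ID below is a placeholder for the value returned by vault read):

# Revoke a single credential by its lease ID
$ vault lease revoke gcp/key/key-role-set/<lease id>

# Or revoke everything issued under a path prefix
$ vault lease revoke -prefix gcp/key/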

To learn more, check out the GCP IAM service account secret engine documentation.

How WePay uses HashiCorp Vault on GCP


WePay is an online payment service provider that uses HashiCorp Vault on GCP. It currently runs HashiCorp Vault servers as virtual machines on Google Compute Engine for two primary use cases:
  • Plain vanilla secret storage using a configuration service: WePay has a service-oriented architecture built on Google Kubernetes Engine. Each microservice stores secrets such as passwords, tokens, private keys, and certificates in a centralized configuration service. This in turn uses the generic "kv" (key value) HashiCorp Vault secrets engine to manage application secrets. The configuration service authenticates services that talk to it, and authorizes those services to access their secrets at deployment time. Secrets are segmented by service using base paths, e.g., superSecurePaymentSystem would only be able to access secrets in the superSecurePaymentSystem path.
  • Key management using a key management service: WePay needs a way to centrally manage the provisioning, deprovisioning and rotation of encryption keys used in its applications. A central key management service generates encryption keys, and stores these in HashiCorp Vault using the "kv" secret engine.
WePay's infrastructure is built entirely on GCP, and the introduction of the GCP IAM service account secrets engine will help it put stronger security practices in place. WePay is exploring how to use the engine in its infrastructure, and is excited by the possibilities.

Continuing work for HashiCorp Vault on GCP


We're excited to see what amazing applications and services users will build using the new HashiCorp Vault GCP secrets engine. This release is part of our ongoing partnership with HashiCorp, which dates back to 2013, and we look forward to continuing to help HashiCorp users make the best use of GCP services and features. To get up and running with HashiCorp Vault for IAM service accounts, check out our solution brief “Using Vault on Compute Engine for Secret Management” for an overview of best practices, and a new video on authentication options.

As always, both HashiCorp and Google welcome contributions from the open-source community. Give us a tweet or open an issue on GitHub if you have any questions!

Introducing Kayenta: An open automated canary analysis tool from Google and Netflix



To perform continuous delivery at any scale, you need to be able to release software changes not just at high velocity, but safely as well. Today, Google and Netflix are pleased to announce Kayenta, an open-source automated canary analysis service that allows teams to reduce risk associated with rolling out deployments to production at high velocity.

Developed jointly by Google and Netflix, Kayenta is an evolution of Netflix’s internal canary system, reimagined to be completely open, extensible and capable of handling more advanced use cases. It gives enterprise teams the confidence to quickly push production changes by reducing error-prone, time-intensive and cumbersome manual or ad-hoc canary analysis.

Kayenta is integrated with Spinnaker, an open-source multi-cloud continuous delivery platform.

This allows teams to easily set up an automated canary analysis stage within a Spinnaker pipeline. Kayenta fetches user-configured metrics from their sources, runs statistical tests, and provides an aggregate score for the canary. Based on the score and set limits for success, Kayenta can automatically promote or fail the canary, or trigger a human approval path.
"Automated canary analysis is an essential part of the production deployment process at Netflix and we are excited to release Kayenta. Our partnership with Google on Kayenta has yielded a flexible architecture that helps perform automated canary analysis on a wide range of deployment scenarios such as application, configuration and data changes. Spinnaker’s integration with Kayenta allows teams to stay close to their pipelines and deployments without having to jump into a different tool for canary analysis. By the end of the year, we expect Kayenta to be making thousands of canary judgments per day. Spinnaker and Kayenta are fast, reliable and easy-to-use tools that minimize deployment risk, while allowing high velocity at scale."

Greg Burrell, Senior Reliability Engineer at Netflix (read more in Netflix’s blog post)
“Canary analysis along with Spinnaker deployment pipelines enables us to automatically identify bad deployments. With 1000+ pipelines running in production, any form of human intervention as a part of canary analysis can be a huge blocker to our continuous delivery efforts. Automated canary deployment, as enabled by Kayenta, has allowed our team to increase development velocity by detecting anomalies faster. Additionally, being open source, standardizing on Kayenta helps reduce the risk of vendor lock-in. We look forward to working closely with Google and Netflix to further integrate Kayenta into our development cycle and get rid of our self-developed Jenkins job canary.”

— Tom Feiner, Systems Operations Engineer at Waze

Continuous delivery challenges at scale


Canary analysis is a good way to reduce the risk associated with introducing new changes to end users in production. The basic idea is to route a small subset of production traffic, for example 1%, through a deployment that reflects the changes (the canary) and a newly deployed instance that has the same code and configuration as production (the baseline). The production instance is not modified in any way. Typically, three instances each are created for the baseline and the canary, while production has multiple instances. Creating a new baseline, rather than comparing against long-running production instances, helps minimize startup effects and limit the system variations between it and the canary. The system then compares the key performance and functionality metrics between the canary and the baseline, as chosen by the system owner. To continue with the deployment, the canary should behave as well as or better than the baseline.
Canary analysis is often carried out in a manual, ad-hoc or statistically incorrect manner. A team member, for instance, manually inspects logs and graphs showcasing a variety of metrics (CPU usage, memory usage, error rate, CPU usage per request) across the canary and production to make a decision on how to proceed with the proposed change. Manual or ad-hoc canary analysis can cause additional challenges:

  • Speed and scalability bottlenecks: For organizations like Google and Netflix that run at scale and that want to perform comparisons many times over multiple deployments in a single day, manual canary analysis isn't really an option. Even for other organizations, a manual approach to canary analysis can’t keep up with the speed and shorter delivery time frame of continuous delivery. Configuring dashboards for each canary release can be a significant manual effort, and manually comparing hundreds of different metrics across the canary and baseline is tiresome and laborious.
  • Accounting for human error: Manual canary analysis requires subjective assessment and is prone to human bias and error. It can be hard for people to separate real issues from noise. Humans often make mistakes while interpreting metrics and logs and deciding whether to promote or fail the canary. Collecting, monitoring and then aggregating multiple canary metrics for analysis in a manual manner further adds to the areas where an error can be made because of human judgement.
  • Risk of incorrect decisions: Comparing short-term metrics of a new deployment to long-running production instances in a manual or ad-hoc fashion is an inaccurate way to gauge the health of the canary, mainly because it can be hard to distinguish whether the performance deviations you see in the canary are statistically relevant or simply random. As a result, you may end up pushing bad deployments to production.
  • Poor support for advanced use cases: To optimize the continuous delivery cycle with canary analysis, you need a high degree of confidence about whether to promote or fail the canary. But gaining the confidence to make go/no-go decisions based on manual or ad-hoc processes is time-consuming, primarily because a manual or ad-hoc canary analysis can’t handle advanced use cases such as adjusting boundaries and parameters in real-time.

The Kayenta approach 

Compared to manual or ad-hoc analysis, Kayenta runs automatic statistical tests on user-specified metrics and returns an aggregate score (success, marginal, failure). This rigorous analysis helps better inform rollout or rollback decisions and identify bad deployments that might go unnoticed with traditional canary analysis. Other benefits of Kayenta include:

  • Open: Enterprise teams that want to perform automated canary analysis with commercial offerings must provide confidential metrics to the provider, resulting in vendor lock-in. Kayenta is fully open source, so you keep control of your metrics and avoid lock-in.
  • Built for hybrid and multi-cloud: Kayenta provides a consistent way to detect problems across canaries, irrespective of the target environment. Given its integration to Spinnaker, Kayenta lets teams perform automated canary analysis across multiple environments, including Google Cloud Platform (GCP), Kubernetes, on-premise servers or other cloud providers.
  • Extensible: Kayenta makes it easy to add new metric sources, judges, and data stores. As a result, you can configure Kayenta to serve more diverse environments as your needs change.
  • Gives confidence quickly: Kayenta lets you adjust boundaries and parameters while performing automatic canary analysis. This lets you move fast and decide whether to promote or fail the canary as soon as you’ve collected enough data.
  • Low overhead: It's easy to get started with Kayenta. No need to write custom scripts or manually fetch canary metrics, merge these metrics or perform statistical analysis to decide whether to either deploy or rollback the canary. Deep links are provided by Kayenta within canary analysis results for in-depth diagnostic purposes.
  • Insight: For advanced use cases, Kayenta can help perform retrospective canary analysis. This gives engineering and operations teams insights into how to refine and improve canary analysis over time.

Integration with Spinnaker

Kayenta’s integration with Spinnaker has produced a new “Canary” pipeline stage in Spinnaker. Here you can specify which metrics to check from which sources, including monitoring tools such as Stackdriver, Prometheus, Datadog or Netflix’s internal tool Atlas. Next, Kayenta fetches metrics data from the source, creates a pair of control/experiment time series datasets and calls a Canary Judge. The Canary Judge performs statistical tests, evaluating each metric individually, and returns an aggregate score from 0 to 100 using pre-configured metric weights. Depending on user configuration, the score is then classified as "success," "marginal," or "failure." A success promotes the canary and continues the deployment, a marginal score can trigger a human approval path, and a failure triggers a rollback.

Join us!


With Kayenta, you now have an open, automated way to perform canary analysis and quickly deploy changes to production with confidence. By open-sourcing Kayenta, our goal is to build a community where metric stores and judges are provided both by the open source community and via proprietary systems. Here are some ways you can learn more about Kayenta and contribute to the project:


  • Read more about Kayenta in Netflix’s blog
  • Attend a Spinnaker meetup in the Bay Area

We hope you will join us!

Cloud Endpoints: Introducing a new way to manage API configuration rollout



Google Cloud Endpoints is a distributed API gateway that you can use to develop, deploy, protect and monitor APIs that you expose. Cloud Endpoints is built on the same services that Google uses to power its own APIs, and you can now configure it to use a new managed rollout strategy that automatically uses the latest service configuration, without having to re-deploy or restart it.

Cloud Endpoints uses the distributed Extensible Service Proxy (ESP) to serve APIs with low latency and high performance. ESP is a service proxy based on NGINX, so you can be confident that it can scale to handle simultaneous requests to your API. ESP runs in its own Docker container for better isolation and scalability and is distributed in the Google Container Registry and Docker registry. You can run ESP on Google App Engine flexible, Google Kubernetes Engine, Google Compute Engine, open-source Kubernetes, or an on-premises server running Linux or Mac OS.

Introducing rollout_strategy: managed


APIs are a critical part of using cloud services, and Cloud Endpoints provides a convenient way to take care of API management tasks such as authorization, monitoring and rate limiting. With Cloud Endpoints, you can describe the surface of the API using an OpenAPI specification or a gRPC service configuration file. To manage your API with ESP and Cloud Endpoints, deploy your OpenAPI specification or gRPC service configuration file using the brand new command:

gcloud endpoints services deploy

This command generates a configuration ID. Previously, in order for ESP to apply a new configuration, you had to restart ESP with the generated configuration ID of the last API configuration deployment. If your service was deployed to the App Engine flexible environment, you had to re-deploy your service every time you deployed changes to the API configuration, even if there were no changes to the source code.

Cloud Endpoints' new rollout_strategy: managed option configures ESP to use the latest deployed service configuration. When you specify this option, ESP detects a newly deployed service configuration within one minute and automatically begins using it. We recommend specifying this option instead of pinning ESP to a specific configuration ID.

With the new managed rollout deployment strategy, Cloud Endpoints becomes an increasingly frictionless API management solution that doesn’t require you to re-deploy your services or restart ESP on every API configuration change.
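For example, when running ESP as a sidecar on Kubernetes Engine, the option is passed as a flag on the ESP container. The snippet below is only a sketch of the relevant container spec, not a full deployment manifest; the service name, ports and backend address are placeholders, and with --rollout_strategy=managed you no longer pass a --version flag with a configuration ID:

containers:
- name: esp
  image: gcr.io/endpoints-release/endpoints-runtime:1
  args: ["--http_port=8081",
         "--backend=127.0.0.1:8080",
         "--service=<your service name>",
         "--rollout_strategy=managed"]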

For information on deploying ESP with this new option, see the documentation for your API implementation.


Toward better phone call and video transcription with new Cloud Speech-to-Text



It’s been full speed ahead for our Cloud AI speech products as of late. Last month, we introduced Cloud Text-to-Speech, our speech synthesis API featuring DeepMind WaveNet models. And today, we’re announcing the largest overhaul of Cloud Speech-to-Text (formerly known as Cloud Speech API) since it was introduced two years ago.

We first unveiled the Cloud Speech API in 2016, and it’s been generally available for almost a year now, with usage more than doubling every six months. Today, with the opening of the NAB and SpeechTEK conferences, we’re introducing new features and updates that we think will make Speech-to-Text much more useful for business, including phone-call and video transcription.

Cloud Speech-to-Text now supports:

  1. A selection of pre-built models for improved transcription accuracy from phone calls and video
  2. Automatic punctuation, to improve readability of transcribed long-form audio
  3. A new mechanism (recognition metadata) to tag and group your transcription workloads, and provide feedback to the Google team
  4. A standard service level agreement (SLA) with a commitment to 99.9% availability

Let’s take a deeper look at the new updates to Cloud Speech-to-Text.

New video and phone call transcription models


There are lots of different ways to use speech recognition technology—everything from human-computer interaction (e.g., voice commands or IVRs) to speech analytics (e.g., call center analytics). In this version of Cloud Speech-to-Text, we’ve added models that are tailored to specific use cases, such as phone call transcription and transcription of audio from video.
For example, for processing phone calls, we’ve routed incoming US English phone call requests to a model that's optimized to handle phone calls and is considered by many customers to be best-in-class in the industry. Now we’re giving customers the power to explicitly choose the model they prefer rather than rely on automatic model selection.

Most major cloud providers use speech data from incoming requests to improve their products. Here at Google Cloud, we’ve avoided this practice, but customers routinely request that we use real data that's representative of theirs, to improve our models. We want to meet this need, while being thoughtful about privacy and adhering to our data protection policies. That’s why today, we’re putting forth one of the industry’s first opt-in programs for data logging, and introducing a first model based on this data: enhanced phone_call.

We developed the enhanced phone_call model using data from customers who volunteered to share their data with Cloud Speech-to-Text for model enhancement purposes. Customers who choose to participate in the program going forward will gain access to this and other enhanced models that result from customer data. The enhanced phone_call model has 54% fewer errors than our basic phone_call model for our phone call test set.
In addition, we’re also unveiling the video model, which has been optimized to process audio from videos and/or audio with multiple speakers. The video model uses machine learning technology similar to that used by YouTube captioning, and shows a 64% reduction in errors compared to our default model on a video test set.

Both the enhanced phone_call and the premium-priced video models are now available for en-US transcription and will soon be available for additional languages. We also continue to offer our existing command_and_search model, as well as our default model for long-form transcription.
Check out the demo on our product website to upload an audio file and see transcription results from each of these models.

Generate readable text with automatic punctuation


Most of us learn how to use basic punctuation (commas, periods, question marks) by the time we leave grade school. But properly punctuating transcribed speech is hard to do. Here at Google, we learned just how hard it can be from our early attempts at transcribing voicemail messages, which produced run-on sentences that were notoriously hard to read.
A few years ago, Google started providing automatic punctuation in our Google Voice voicemail transcription service. Recently, the team created a new LSTM neural network to improve automatic punctuation in long-form speech transcription. Architected with performance in mind, the model is now available to you in beta in Cloud Speech-to-Text, and can automatically suggest commas, question marks and periods for your text.
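Putting these pieces together, a recognition request that selects the enhanced phone_call model and asks for automatic punctuation might look roughly like this. The bucket and file are placeholders, the field names should be checked against the current v1p1beta1 API reference, and useEnhanced assumes you've opted in to the data logging program described above:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  https://speech.googleapis.com/v1p1beta1/speech:recognize \
  -d '{
        "config": {
          "languageCode": "en-US",
          "model": "phone_call",
          "useEnhanced": true,
          "enableAutomaticPunctuation": true
        },
        "audio": {
          "uri": "gs://<your bucket>/support-call.wav"
        }
      }'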

Describe your use cases with recognition metadata


The progress we've made with Cloud Speech-to-Text is due in large part to the feedback you’ve given us over the last two years, and we want to open up those lines of communication even further, with recognition metadata. Now, you can describe your transcribed audio or video with tags such as “voice commands for a shopping app” or “basketball sports tv shows.” We then aggregate this information across Cloud Speech-to-Text users to prioritize what we work on next. Providing recognition metadata increases the probability that your use case will improve with time, but the program is entirely optional.

Customer references

We’re really excited about this new version of Cloud Speech-to-Text, but don’t just take our word for it—here’s what our customers have to say.
“Unstructured data, like audio, is full of rich information but many businesses struggle to find applications that make it easy to extract value from it and manage it. Descript makes it easier to edit and view audio files, just like you would a document. We chose to power our application with Google Cloud Speech-to-Text. Based on our testing, it’s the most advanced speech recognition technology and the new video model had half as many errors as anything else we looked at. And, with its simple pricing model, we’re able to offer the best prices for our users.”  
Andrew Mason, CEO, Descript
"LogMeIn’s GoToMeeting provides market leading collaboration software to millions of users around the globe. We are always looking for the best customer experience and after evaluating multiple solutions to allow our users to transcribe meetings we found Google’s Cloud Speech-to-Text’s new video model to be far more accurate than anything else we’ve looked at. We are excited to work with Google to help drive value for our customers beyond the meeting with the addition of transcription for GoToMeeting recordings." 
 – Matt Kaplan, Chief Product Officer, Collaboration Products at LogMeIn
"At InteractiveTel, we've been using Cloud Speech-to-Text since the beginning to power our real-time telephone call transcription and analytics products. We've constantly been amazed by Google's ability to rapidly improve features and performance, but we were stunned by the results obtained when using the new phone_call model. Just by switching to the new phone_call model we experienced accuracy improvements in excess of 64% when compared to other providers, and 48% when compared to Google's generic narrow-band model."  
 Jon Findley, Lead Product Engineer, InteractiveTel
Access to quality speech transcription technology opens up a world of possibilities for companies that want to connect with and learn from their users. With this update to Cloud Speech-to-Text, you get access to the latest research from our team of machine learning experts, all through a simple REST API. Pricing is $0.006 per 15 seconds of audio for all models except the video model, which is $0.012 per 15 seconds. We'll be providing the new video model for the same price ($0.006 per 15 seconds) for a limited trial period through May 31. To learn more, try out the demo on our product page or visit our documentation.

Now, you can deploy to Kubernetes Engine from GitLab with a few clicks



In cloud developer circles, GitLab is a popular DevOps lifecycle tool. It lets you do everything from project planning and version control to CI/CD pipelines and monitoring, all in a single interface so different functional teams can collaborate. In particular, its Auto DevOps feature detects the language your app is written in and automatically builds your CI/CD pipelines for you.

Google Cloud started the cloud native movement with the invention and open sourcing of Kubernetes in 2014. Kubernetes draws on over a decade of experience running containerized workloads in production serving Google products at massive scale. Kubernetes Engine is our managed Kubernetes service, built by the same team that's the largest contributor to the Kubernetes open source project, and is run by experienced Google SREs, all of which enables you to focus on what you do best: creating applications to delight your users, while leaving the cluster deployment operations to us.

Today, GitLab and Google Cloud are announcing a new integration of GitLab and Kubernetes Engine that makes it easy to accelerate your application deployments by provisioning Kubernetes clusters, managed by Google, right from your GitLab-supported DevOps pipeline. You can now connect your Kubernetes Engine cluster to your GitLab project in just a few clicks, use it to run your continuous integration jobs, and configure a complete continuous deployment pipeline that lets you preview changes live and deploy them into production, all served by Kubernetes Engine.

Head over to GitLab, and add your first Kubernetes Engine cluster to your project from the CI/CD options in your repository today!

The Kubernetes Engine cluster can be added through the CI/CD -> Kubernetes menu option in the GitLab UI, which even supports creating a brand new Kubernetes cluster.
Once connected, you can deploy the GitLab Runner into your cluster. This means that your continuous integration jobs will run on your Kubernetes Engine cluster, giving you fine-grained control over the resources you allocate. For more information, read the GitLab Runner docs.
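As a minimal sketch of what those continuous integration jobs look like, the .gitlab-ci.yml below defines a single test job that the cluster-hosted Runner would pick up once it's installed; the image and script are illustrative placeholders for whatever your project actually needs:

stages:
  - test

unit-tests:
  stage: test
  image: golang:1.10
  script:
    - go test ./...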

Even more exciting is the new GitLab Auto DevOps integration with Kubernetes Engine. Using Auto DevOps with Kubernetes Engine, you'll have a continuous deployment pipeline that automatically creates a review app for each merge request (a special dynamic environment that allows you to preview changes before they go live) and, once you merge, deploys the application into production on production-ready Google Kubernetes Engine.

To get started, go to CI/CD -> General pipeline settings and select “Enable Auto DevOps”. For more information, read the Auto DevOps docs.
Auto DevOps does the heavy lifting to detect what languages you’re using, and configure a Continuous Integration and Continuous Deployment pipeline that results in your app running live on the Kubernetes Engine cluster.
Now, whenever you create a merge request, GitLab will run a review pipeline to deploy a review app to your cluster where you can test your changes live. When you merge the code, GitLab will run a production pipeline to deploy your app to production, running on Kubernetes Engine!

Join us for a webinar co-hosted by Google Cloud and GitLab 


Want to learn more? We’re hosting a webinar to show how to build cloud-native applications with GitLab and Kubernetes Engine. Register here for the April 26th webinar.

Want to get started deploying to Kubernetes? GitLab is offering $500 in Google Cloud Platform credits for new accounts. Try it out.