Category Archives: Google Cloud Platform Blog

Product updates, customer stories, and tips and tricks on Google Cloud Platform

Build a mobile gaming analytics platform

Popular mobile games can attract millions of players and generate terabytes of game-related data in a short period of time. This places extraordinary pressure on the infrastructure powering these games and requires scalable data analytics services to provide timely, actionable insights in a cost-effective way.

To address these needs, a growing number of successful gaming companies use Google’s web-scale analytics services to create personalized experiences for their players. They use telemetry and smart instrumentation to gain insight into how players engage with the game and to answer questions like: At what game level are players stuck? What virtual goods did they buy? And what's the best way to tailor the game to appeal to both casual and hardcore players?


A new reference architecture describes how you can collect, archive and analyze vast amounts of gaming telemetry data using Google Cloud Platform’s data analytics products. The architecture demonstrates two patterns for analyzing mobile game events:

  • Batch processing: This pattern helps you process game logs and other large files in a fast, parallelized manner. For example, leading mobile gaming company DeNA moved to BigQuery from Hadoop to get faster query responses for their log file analytics pipeline. In this GDC Lightning Talk video they explain the speed benefits of Google’s analytics tools and how the team was able to process large gaming datasets without the need to manage any infrastructure.
  • Real-time processing: Use this pattern when you want to understand what's happening in the game right now. Cloud Pub/Sub and Cloud Dataflow provide a fully managed way to perform a number of data-processing tasks, like data cleansing and fraud detection, in real time. For example, you can flag a player whose hit-point total falls outside the valid range. Real-time processing is also a great way to continuously update dashboards of key game metrics, like how many active users are currently logged in or which in-game items are most popular.

Some Cloud Dataflow features are especially useful in a mobile context, since messages may be delayed from the source due to mobile Internet connection issues or batteries running out. Cloud Dataflow's built-in session windowing functionality and triggers aggregate events based on the actual time they occurred (event time) rather than the time they're processed, so you can still group events by user session even if there's a delay from the source.
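To make that concrete, here's a minimal sketch of session windowing written with the Apache Beam Python SDK, the open-source evolution of the Dataflow programming model. The Cloud Storage paths, event field names and the 30-minute session gap are illustrative assumptions, not part of the reference architecture:

import json

import apache_beam as beam
from apache_beam.transforms import window


def to_timestamped_event(line):
    """Parse a JSON game event and stamp it with its event time (epoch seconds)."""
    event = json.loads(line)
    return window.TimestampedValue((event['player_id'], event), event['event_time'])


with beam.Pipeline() as pipeline:
    (pipeline
     | 'ReadEvents' >> beam.io.ReadFromText('gs://my-bucket/game_events.json')
     | 'StampEventTime' >> beam.Map(to_timestamped_event)
     | 'SessionWindow' >> beam.WindowInto(window.Sessions(30 * 60))  # 30-minute gap
     | 'GroupByPlayer' >> beam.GroupByKey()
     | 'CountPerSession' >> beam.Map(
           lambda kv: {'player_id': kv[0], 'events_in_session': len(list(kv[1]))})
     | 'Format' >> beam.Map(json.dumps)
     | 'Write' >> beam.io.WriteToText('gs://my-bucket/sessions'))

Because none of the transforms are tied to batch or streaming, the same pipeline could instead read from Cloud Pub/Sub and run with streaming options enabled, which is the "write once, run in either mode" benefit described in the next paragraph.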

But why choose between one or the other pattern? A key benefit of this architecture is that you can write your data pipeline processing once and execute it in either batch or streaming mode without modifying your codebase. So if you start processing your logs in batch mode, you can easily move to real-time processing in the future. This is an advantage of the high-level Cloud Dataflow model that was released as open source by Google.



Cloud Dataflow loads the processed data into one or more BigQuery tables. BigQuery is built for very large scale, and allows you to run aggregation queries against petabyte-scale datasets with fast response times. This is great for interactive analysis and data exploration, like the example screenshot above, where a simple BigQuery SQL query dynamically creates a Daily Active Users (DAU) graph using Google Cloud Datalab.
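As a rough illustration of that kind of query (the project, dataset, table and column names below are assumptions, not part of the reference architecture), here's how a DAU count might look when run through the BigQuery Python client library:

from google.cloud import bigquery

client = bigquery.Client()  # uses your default project and credentials
query = """
    SELECT DATE(event_time) AS day, COUNT(DISTINCT player_id) AS dau
    FROM `my-project.game_analytics.events`
    GROUP BY day
    ORDER BY day
"""
for row in client.query(query).result():
    print(row.day, row.dau)

Cloud Datalab can run the same SQL in a notebook cell and chart the result, which is how a DAU graph like the one described above can be produced.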


And what about player engagement and in-game dynamics? The BigQuery example above shows a bar chart of the ten toughest game bosses. It looks like boss10 killed players more than 75% of the time, much more than the next toughest. Perhaps it would make sense to lower the strength of this boss? Or maybe give the player some more powerful weapons? The choice is yours, but with this reference architecture you'll see the results of your changes straight away. Review the new reference architecture to jumpstart your data-driven quest to engage your players and make your games more successful, contact us, or sign up for a free trial of Google Cloud Platform to get started.

- Posted by Oyvind Roti, Solutions Architect

Last chance to present at NEXT: Call for speakers closing

We recently announced our global user conference in March, GCP NEXT 2016. This will be the largest gathering of the Google Cloud Platform community, where you can meet with developers like you who are using Google Cloud Platform to build amazing applications. You’ll be able to explore the latest cloud technologies, get hands-on with our platform and learn directly from those who built it.

We’d love for you to tell us your story at NEXT. Submit a proposal to our call for speakers by January 15, 2016 at 11:59pm PST and share your Google Cloud Platform project and experiences. Submissions should align to one of our four technical track topics:

  • Data and Analytics: Learn how Google Cloud Platform can help you build more intelligent applications and make better, more timely decisions.
  • Infrastructure and Operations: See how Google’s infrastructure — including our networks, storage, security, data center operations and DevOps tools — gives you scale, security and reliability.
  • App and Services Development: Understand how different components of Google Cloud Platform can work together to help you develop and deploy powerful apps.
  • Solutions Showcase: Learn how our customers and other developers are using Google Cloud Platform in production.

And for those of you who’d rather sit back and learn from other developers, we look forward to seeing you there, too. Register today and get our early bird rate.

To keep up to date on GCP NEXT 2016, follow us on Google+, Twitter, and LinkedIn.

- Posted by Julia Ferraioli, Developer Advocate, Google Cloud Platform

Top 5 Power Features of the Google Cloud CLI

If you’re a heavy user of Google Cloud Platform, you probably already know about Google Cloud SDK, the powerful command line tool for working with all things Cloud Platform, available for Windows, Linux and OS X. In this post, the Cloud SDK engineering team, based in New York City, shares their favorite power features of the Google Cloud command-line interface (CLI).

#1. Using cloud service emulators

Whether you’re on your commute without connectivity, want a fast reliable way of running tests, or just want to test your application on your development machine without talking to a remote service, gcloud emulators is your friend. Emulators provide a local mock implementation of Google Cloud services, so you can develop and test core functionality. Currently the Cloud SDK includes Google Cloud Datastore and Google Cloud Pub/Sub emulators, with more to come.

You can start an emulator, such as the emulator for Cloud Datastore, like so:

$ gcloud beta emulators datastore start
...
[datastore] To connect, set host to http://localhost:8967/datastore
[datastore] Admin console is running at http://localhost:8851/_ah/admin

Now you’ve got a local Datastore running on your machine! The API is available on the localhost port listed above.

Client libraries such as gcloud-node, gcloud-ruby, gcloud-python, and gcloud-java can be configured to use this local emulator by respecting the DATASTORE_LOCAL_HOST environment variable.

The gcloud emulators command has a neat little trick for automatically setting environment variables like this for each service.

$ $(gcloud beta emulators datastore env-init)
$ echo $DATASTORE_HOST 
http://localhost:8967

The emulator comes with a simple, web-based console which is also available on localhost. Read more in the gcloud emulators documentation.
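As an example, here's a minimal sketch of pointing the Python client library (today's google-cloud-datastore, the successor to gcloud-python) at the emulator. The exact environment variable a given library version reads can differ, so treat the variable name and project ID below as assumptions:

import os
from google.cloud import datastore

# Assumption: newer Python clients read DATASTORE_EMULATOR_HOST; env-init sets
# the equivalent variable for you, so this line is only a manual fallback.
os.environ.setdefault('DATASTORE_EMULATOR_HOST', 'localhost:8967')

client = datastore.Client(project='my-local-project')  # hypothetical project ID
key = client.key('Player', 'alice')
entity = datastore.Entity(key=key)
entity['score'] = 42
client.put(entity)
print(client.get(key))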

(Pro tip from Vilas, Engineer on the Cloud SDK)

#2. Type like the wind with autocompletion


Another power feature of the Cloud CLI is tab completion. You can tab-complete gcloud subcommands as well as many entity names, such as your instances and zones. Try it out!
(Pro tip from Mark, Engineer on the Cloud SDK)

#3. Using --format to filter, sort and transform output to CSV, JSON, and more


The Cloud CLI gives you a lot of information about your environment, which you might often want to use as input to another script or program. The --format flag provides an easy way to massage the output into a format that makes sense.

Here’s an example using the --format flag to list the zone and IP of your Google Compute Engine instances in CSV format, sorted by zone and name.

$ gcloud compute instances list \
    --format='csv(zone:sort=1,name:sort=2,networkInterfaces[0].networkIP)' \
    > list.csv

You can then open the CSV file in a viewer such as Google Sheets.

This is just a taste of what --format supports. You can also expose data in JSON and tabular formats, and use projections to select, sort and filter your data. See the Google Cloud SDK reference for the --format flag for more neat tricks.
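If you'd rather post-process the output in code instead of a spreadsheet, one approach is to request JSON and parse it. This sketch assumes the gcloud CLI is installed and authenticated on the machine running it:

import json
import subprocess

# Ask gcloud for machine-readable output and load it as Python objects.
output = subprocess.check_output(
    ['gcloud', 'compute', 'instances', 'list', '--format=json'])
for instance in json.loads(output):
    # The zone field is a full resource URL; keep only its last path segment.
    print(instance['name'], instance['zone'].rsplit('/', 1)[-1])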

(Pro tip from Glenn, Engineer on the Cloud SDK)

#4. Using the gcloud tool with PowerShell


If you’re a PowerShell user, it’s often handy to work with PowerShell objects. A small tweak to the --format flag lets you do this. For example, you can use this command to list all Compute Engine instances in Asia:

PS> (gcloud compute instances list --format=json | Out-String | 
ConvertFrom-Json) | ?{$_.zone -match 'asia'} | select name

name
----
asia-ops-1
asia-ops-2

You can use this command to restart all the instances in Asia:

PS> (gcloud compute instances list --format=json | Out-String |
ConvertFrom-Json) | ?{$_.zone -match 'asia'} |
%{gcloud compute instances reset $_.name --zone $_.zone}

(Pro tip from Valentin, Engineer on the Cloud SDK)

#5. Easily ssh to your Compute Engine instances


The gcloud CLI offers a number of ways to easily use secure shell to access any Linux-based Compute Engine instances you have. For example, you can run:

$ gcloud compute ssh my-instance-name

This command automatically connects you through SSH to any instance that can be accessed publicly. No more SSH keygen or looking up external IPs!

You can go a step further with:

$ gcloud compute config-ssh

This command adds an alias for each of your Compute Engine instances to your ~/.ssh/config file. These aliases are then available to system utilities that also use SSH, such as ssh, scp and sftp. Now you can type commands from your terminal such as:

$ ssh myvm.asia-east1-c.myproject

or

$ sftp myvm.asia-east1-c.myproject

Read more about the gcloud compute ssh command or the gcloud compute config-ssh command in the documentation.

(Pro tip from Stephen, Engineer on the Cloud SDK)

- Posted by the Google Cloud SDK team

Happy New Year from Google Cloud Platform – still the price/performance leader in public cloud!

Pay less, compute more!

Developers running cloud-based apps and services will find out whether it’s a happy new year or not once they take a look at their bill. In case you’ve been reading recent announcements and were wondering, rest assured: Google continues to be the price/performance leader in public cloud.

Even after their latest price reduction, we’re anywhere from 15% to 41% less expensive than AWS for compute resources. We use automatic Sustained Usage Discounts and our new Custom Machine Types to ensure that we’re presenting exact spec-to-spec comparisons here, something AWS can’t match.

*You might have noticed that the price is a little closer for high-memory instance types: AWS provides a particularly high ratio of RAM to CPU on these, so for our comparison instance we’re using one with 4 cores rather than 2… and it’s still over 15% less expensive.

While price cuts sound appealing on the surface, when you unpack the specifics of Amazon’s pricing model, it can be an unpleasant surprise. We often hear from customers who are locked into contracts and aren’t eligible for the new rates, or are stuck with instances that no longer fit their needs.

We designed Google Cloud Platform pricing to be as flexible and beneficial to our customers as possible. You’re not required to make a long-term commitment to a price, machine class or region ahead of time. Our combination of lower list prices, sustained use discounting, no prepaid lock-in, per-minute billing, Preemptible VMs and Custom Machine Types offers a structural price advantage that’s unmatched in the industry. For a detailed look at our pricing versus Amazon’s, see this post on understanding cloud pricing.

As you consider which cloud provider to build and host your apps with in 2016, check out our TCO Tool. Explore how different combinations of development and production instances, as well as environmental assumptions, change the total cost of a real-world application hosted in the cloud — and be sure to compare it to our competitors!

Happy New Year!

- Posted by Miles Ward, Global Head of Solutions, Google Cloud Platform

With Amadeus, Cloud is in the Air

Today we hear from Olivier Favorel, Senior Manager, Airline IT at Amadeus. Operating in 195 countries, Amadeus is a leading technology company dedicated exclusively to the global travel industry. When an increase in CPU consumption of just 100 microseconds can mean thousands of dollars of extra hosting, Amadeus turned to Google Cloud Platform to offer new alternatives to its airline customers.

At Amadeus, we develop the technology that will shape the future of travel. To understand the business needs of our customers and partners, we’re highly focused on the trends impacting airlines. One main trend is the exponential growth of consumers browsing and shopping for airline products across digital channels.

Airline “look-to-book” ratios, or the average number of search requests before a flight reservation is actually made, were previously as low as 10:1. Today, these can easily run to 1000:1. Moreover, demand is never constant, so managing demand fluctuations requires the ability to anticipate strong traffic peaks and make the necessary capacity arrangements, a challenging task for airlines. In order to cope with the pressure of ever-increasing online shopping transactions, shopping engines have developed cache-based solutions. However, cache-based systems have certain limitations, as they don’t accurately reflect an airline’s sophisticated revenue management policies.

Large network carriers are investing in advanced revenue management solutions to capture maximum traveler value and generate revenue. Maximizing revenue requires real-time capability to process every shopping request and make the right “flight availability” (availability of seats in a particular fare class) offer at an optimal price. Furthermore, it’s crucial for airlines to display consistent offers across various shopping platforms to capture every sales opportunity. Cache-based systems conflict with real-time revenue optimization, thus hindering airlines’ merchandising and personalization capabilities to make the right product offer to the right customer at the right time for the right price.

Given the challenge to maintain accurate and consistent airline offers across all distribution channels, how can we ensure high performance in dynamic content distribution for massive volumes?

With the help of Google Cloud Platform, Amadeus has developed a unique cloud-based solution, Amadeus Airline Cloud Availability. The solution offloads the processing of shopping transactions outside the airline reservation system, where the booking and payment are finally performed. This solution can be deployed in any public or private cloud, bringing airline offers closer to the shopping platform serving online travel agencies, meta searches or global distribution systems, while taking full advantage of more efficient solutions.
Figure 1: Amadeus Airline Cloud Availability architecture






This solution helps airlines efficiently manage the huge increase in search and shopping traffic.

We conducted a pilot of Amadeus Airline Cloud Availability in Google Cloud Platform from February to July 2015, together with Lufthansa. The objective of the pilot was two-fold:

  • Demonstrate the scalability and performance of flight availability requests using Google Compute Engine. Amadeus is currently handling requests for 4M+ flights per second in its private data center in Munich, for more than 140 airline carriers. This traffic increases by 50% every year.
  • Contain infrastructure cost of flight availability traffic.

The flight availability requests are handled by a farm of C++ backends accessing data through a Couchbase cluster, a distributed NoSQL store that hosts the airline flight and fare details. CPU consumption is a critical indicator for these kinds of large-scale applications; an increase in CPU consumption of 100 microseconds per transaction translates into several thousand dollars in extra hosting costs over a one-year period.

The initial deployment of our solution on Compute Engine was seamless thanks to the intuitive console and the vast set of pre-built Linux images (CentOS 7.1 in our case). The first flight availability backend instances were ready to accept traffic only two hours after our initial connection.


The 1,500 cores challenge


Amadeus and Google engineering teams worked hand in hand to get the most out of a pre-allocated capacity of 1,500 cores spread over three regions (Central US, Western Europe and East Asia), each region fed with airline data via the Couchbase Cross Datacenter Replication (XDCR) protocol.

Our mission was to increase the volume of flight availability requests processed per dollar. Several actions were undertaken:

  • Reducing the CPU path length per transaction through several low-level C++ optimizations and the use of Google’s tcmalloc memory allocator.
  • Increasing the I/O throughput toward the Couchbase data store to keep our application cores busy. We were quite impressed by the stability and very low latency of the internal Compute Engine network (stable sub-millisecond round trips to Couchbase cluster nodes).
  • Enabling the NOOP scheduler on VMs hosting our Couchbase cluster (the optimal I/O scheduling pattern for increasing throughput to SSD drives).
  • Adjusting the VM sizes (CPU/memory ratio) to ensure that our servers were consistently running at 85-90% CPU usage (n1-highcpu-16 for application servers and n1-highmem-4 for Couchbase cluster nodes).

Figure 2: GCP Console and Performance Reports

The results


Pilot objectives were achieved much faster than initially planned, thanks to the flexibility of GCP and the responsiveness of Google’s support teams.

The overall throughput of flight availability requests processed by 1,500 cores was doubled after only three months of joint effort.


Going further


We’re now engaging in the second phase of the pilot, which aims to dynamically adjust hardware capacity to fluctuating shopping demand, further tune the size of our VMs and leverage the benefits of Compute Engine Preemptible VMs (“low-cost VMs,” as we like to call them):

  • Dynamic capacity adjustment is being implemented with Kubernetes (Google’s container orchestration and cluster management solution), which is being rolled out in the pilot framework to dynamically spawn or shut down application VMs in line with flight availability traffic fluctuations. Kubernetes is shipped by our PaaS partner, Red Hat, as part of their OpenShift offering (we’re building our internal application platform, Amadeus Cloud Services, on top of these strategic products to ensure our independence from the underlying IaaS provider). Per-minute billing of instances further optimizes the hosting costs.
  • Preemptible VMs, released in May 2015, run at a much lower price than standard VMs (70% off) but might be terminated, or preempted, by Compute Engine if it requires access to those resources for other tasks. Our plan is to oversize the number of computation VMs by 10% and use exclusively preemptible instance types, assuming that a fraction of those VMs will be terminated on a daily basis while still keeping our overall processing power at the level required to handle the flight availability traffic. Significant cost savings are anticipated with this new feature as well.
  • Custom machine types, released in November 2015, are being set up to replace our standard instance types (n1-highcpu-16 and n1-highmem-4). Custom VMs will be sized with only the required number of cores and the minimal memory needed (priced per GB). The objective is to avoid any waste of CPU or memory.


Lessons learned


Our journey on GCP was very exciting and impressed us for the following reasons:

  • Performance: Network latency, throughput and stability were astonishing. Also, the ongoing migration of VMs to next-generation Intel architecture (Haswell) in many regions will bring even more CPU gains to flight availability request processing.
  • Stability: We faced very few VM outages over the six-month pilot. The maintenance notification process works well and live VM migration is truly transparent.
  • Monitoring: The Stackdriver framework is awesome for reporting both system metrics (CPU, memory, I/O) and user-defined KPIs (like the rate of airline flights processed per second). Coupled with an efficient alerting system and the “Cloud Console” mobile app, we rapidly ended up with a production-grade monitoring solution.
  • Pace of innovation: During the six-month pilot, three major announcements were made that helped our project: the introduction of Preemptible VMs, the rollout of custom machine types and, most importantly, a 15% price drop in May 2015.


Summary


The pilot on Google Cloud Platform changed our approach to performance optimization, from a pure CPU-cost angle to an infrastructure-driven approach (in the end, efficiency is what matters). GCP proved to be a very efficient sandbox environment for internal benchmarking, and we have no doubt that it will become a natural hosting solution for more Amadeus applications in the future.

Mapping your knowledge to Google Cloud Platform

Google Cloud Platform (GCP) is growing all the time and we love introducing new products and features and getting them into your hands. This rapid pace of innovation does mean that there is always something new to learn about and this can take up a lot of your time. We also know that GCP isn’t the only cloud platform you’re using or have used, and it’s important that we help you leverage that experience to get up to speed fast.

Our goal is to make it easier for you to stay on top of the services we offer and help you map your existing expertise to GCP at the same time. To that end, we are happy to release a new whitepaper that we have created to help you apply your existing knowledge and expertise to GCP.

This document is the first part of an ongoing series. We start with the basics of how to complete base-level operations, followed by a deep dive into the virtual compute platforms and the underlying networks. In the coming months we’ll add more information on how to work with storage, data, containers, and much more.

We hope you find this useful in learning about GCP. Please tell us what you think and what else you would like us to add. And don't forget to use our free trial to try out the things you've learned!

- Posted by Peter-Mark Verwoerd, Cloud Solutions Architect

Announcing NEXT 2016: Join us for what’s next for cloud

Last June, we kicked off Google Cloud Platform Next, and so many of you wanted to come that we had to move to a different location to accommodate everyone! Today, we’re excited to announce GCP NEXT 2016 in San Francisco: the event created specifically for you to learn about Google Cloud Platform directly from those who built it. You’ll also hear from developers and organizations that use Google Cloud Platform to build and run their businesses.

At GCP NEXT 2016, you’ll see how cutting-edge features in Google Cloud Platform will help you build powerful, reliable and intelligent applications at any scale.


Set your calendars (or your DeLorean) for March 23 - 24, 2016 to join us at GCP NEXT, where you’ll:
  • Hear about the latest in Google Cloud Platform product developments: Watch in-depth product demos and exclusive talks from SVPs Diane Greene and Urs Hölzle.
  • Try your hand at different parts of the platform in code labs: Get hands-on experience with our platform through immersive tutorials, led by Google engineers and advocates.
  • Enjoy the NEXT Playground: Play with fun demos powered by Google Cloud Platform, and find out how they were made to see what’s possible with cloud. Chat and exchange ideas with other technologists in the hallway track.
  • Learn the fundamentals of Google Cloud Platform: Need to learn the basics of our platform or want to refresh your skills? Join us before GCP NEXT for a full-day, instructor-led Bootcamp on March 22.
  • Join us for some fun at the after-party: Network with our community and attend our after-party, NEXT After Dark.

We’ll also host tracks that dive a little deeper into specific areas of cloud, so you can learn more about topics that interest you most. This year, we’re opening up a call for speakers, and we hope to see you submit a session proposal! Share your project or experiences using our platform in one of our featured tracks:

  • Data and Analytics: Data is key to intelligent applications and decision making. Learn how Google Cloud Platform can help you build more intelligent applications and make better, more timely decisions.
  • Infrastructure and Operations: Learn how Google’s infrastructure — including our networks, storage, security, data center operations and DevOps tools — gives you scale, security and reliability. Sessions in this track will cover popular tooling, common DevOps patterns and how to manage at any scale.
  • App and Services Development: Want to understand how different components of Google Cloud Platform can work together in a variety of configurations? In this track, we'll discuss topics such as app architecture, development, deployment and continuous integration.
  • Solutions Showcase: Listen to some of our customers talk about how they’re using Google Cloud Platform in production. From cloud-native startups to enterprises in the process of migrating to cloud, they'll tell you about their experiences powering everything from mobile applications to mission-critical deployments. Hear about practical solutions, patterns, and lessons that you can apply to your own applications.

We look forward to seeing you at GCP NEXT 2016. If you can’t make it in person, catch sessions via livestream. Registration opens today, and the call for speakers closes on January 15, 2016; make sure to get your proposal submitted in time!

cloud.google.com/Next2016

To keep up to date on GCP NEXT 2016, follow us on Google+, Twitter, and LinkedIn.

- Posted by Julia Ferraioli, Developer Advocate, Google Cloud Platform

Meeting the challenge of financial data transformation

Today’s guest post comes from Salvatore Sferrazza and Sebastian Just from FIS Global, an international provider of financial services and technology solutions. Salvatore and Sebastian tell us how Google Cloud Dataflow transforms fluctuating, large-scale financial services data so that it can be accurately captured and moved across systems.

Much software development in the capital markets (and enterprise systems in general) revolves around the transformation, enrichment and movement of data from one system to another. The unpredictable nature of financial market data volumes, often driven by volatility, exacerbates the pain of scaling and posting data when and where it’s needed for daily trade reconciliation, settlement and regulatory reporting. The implications of technology missteps within such crucial business processes range from missed business opportunities to undesired risk exposure to regulatory non-compliance. These activities must be relentlessly predictable, repeatable and measurable to yield maximum value to stakeholders.

While developers rely on the Extract, Transform and Load (ETL) activities that are so crucial to processing data, they now face limits in the speed and efficiency of ETL as the volume of transactions grows faster than they can process it. As shortened settlement durations and the Consolidated Audit Trail (CAT) loom on the horizon, financial services institutions need simple, fast and powerful approaches to quickly scale and ultimately mitigate time-sensitive risks and operational costs.

Traditionally, developers have considered the activities around ETL data an unglamorous yet necessary dimension of building software products for encapsulating functions that are core to every tier of computing. So when data-driven enterprises are tasked with harvesting insights from massive data sets, it’s quite likely that ETL, in one form or another, is lurking nearby. But in today’s world, data can come from anywhere and in any format, creating a series of labor, time and intellectual challenges. While there may be hundreds of ways to solve the problem, few provide the efficiency and effectiveness so needed in our “big data” world — until recently.

The Google Cloud Dataflow service and its associated software development kit (SDK) provide a series of powerful tools for a myriad of data transformation duties. Designed to perform data processing tasks of any size in a managed services environment, Google Cloud Dataflow simplifies the mechanics of large-scale transformation and supports both batch and stream processing using the same programming model. In our latest white paper, we introduce some of the main concepts behind building and running applications that use Dataflow, then get “hands on” with a job to transform and ingest options market symbol data before storing the transformations within a Google BigQuery data set.

In short, Google Cloud Dataflow allows you to focus on data processing tasks and not cluster management. Rather than asking you to guess the right cluster size, Dataflow automatically scales up or down horizontally as much as needed for your exact processing requirements. This includes scaling all the way down to zero when there is no work, so you’re never paying for an idle cluster. Dataflow also alleviates the pain of writing ETL jobs by standardizing the process of implementing application requirements. As a result, you’ll be able to focus on the data transformations you need to make rather than on the processing mechanics themselves. This not only provides greater flexibility, lower latency and enhanced control of ETL jobs; it offers built-in cost management and ties together other useful Google Cloud services. Beyond common ETL, Dataflow pipelines may also include inline computation ranging from simple counting to highly complex, multi-step analysis. In our experience with the service so far, it can potentially remove much of the work from engineers within financial institutions and regulatory organizations, while providing elasticity to the entire process and ensuring accuracy, scale, performance and cost efficiency.
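The white paper's sample project is written with the Dataflow Java SDK; purely as an illustration of the same pattern, a pipeline that reads raw records, transforms them and loads them into BigQuery might look like the sketch below in Python with Apache Beam. The bucket path, table name, schema and field names are assumptions for the example:

import json

import apache_beam as beam

TABLE = 'my-project:market_data.options_symbols'  # hypothetical destination table
SCHEMA = 'symbol:STRING,underlying:STRING,expiration:STRING'


def parse_symbol(line):
    """Transform one raw JSON record into the row shape BigQuery expects."""
    record = json.loads(line)
    return {'symbol': record['symbol'],
            'underlying': record['underlying'],
            'expiration': record['expiration']}


with beam.Pipeline() as pipeline:
    (pipeline
     | 'Read' >> beam.io.ReadFromText('gs://my-bucket/raw_symbols.json')
     | 'Transform' >> beam.Map(parse_symbol)
     | 'Load' >> beam.io.WriteToBigQuery(TABLE, schema=SCHEMA))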

As market volatility and reporting requirements drive the need for accuracy, low latency and risk reduction, transforming and interpreting market data in a big data world is imperative to trading efficiency and accessibility. Every second counts. With a more cost-effective, real-time and scalable method of processing an ever-increasing volume of data, financial institutions will be able to address specific requirements and volumes at hand while keeping up with the demands of a rapidly evolving global financial system. We hope our experience, as captured in the technical white paper, will prove useful to others in their quest for a more effective way to process data.

Please see this paper’s GitHub page for the complete and buildable project source code.

- Posted by Salvatore Sferrazza, Principal at FIS and Sebastian Just, Manager at FIS

Monitoring Container Engine with Google Cloud Monitoring

You’ve decided to adopt a microservice architecture and containerize your application. Congrats! But how will you monitor it? To solve that problem, we've worked to make Google Container Engine and Google Cloud Monitoring fit together like peas in a pod.

When you launch your Container Engine cluster, you can enable Cloud Monitoring with one click. Check it out!
Information is collected about the CPU, memory and disk usage of all the containers in your cluster. This information is annotated and stored in Cloud Monitoring, where you can access it either via the API or in the Cloud Monitoring UI. From Cloud Monitoring, you can easily examine not only container-level resource usage, but also how it aggregates across pods and clusters.

If you head over to the Cloud Monitoring dashboard and click on the Infrastructure dropdown, you can see a new option for Container Engine.




If you have more than one cluster with monitoring enabled, you'll see a page listing the clusters in your project along with how many pods and instances are in them. However, if you only have one cluster, you'll be directed straight to details about it, as shown below.
This page gives you a view of your cluster. It lists all the pods running in your cluster, recent events from the cluster, as well as resource usage aggregated across the nodes in your cluster. In this case, you can see that this cluster has the system components in it (DNS, UI, logging and monitoring) as well as the frontend and redis pods from the guestbook tutorial in the Container Engine documentation.

From here, you can easily drill down to the details of individual pods and containers, where you'll see metadata about the pod and its containers, such as how many times they've been restarted, along with metrics about the pod's resource usage.

But this is just the first piece. Since Cloud Monitoring makes heavy use of tags (the equivalent of Container Engine's labels), you can create groups based on how you've labeled your containers or pods. For example, if you're running a web app in a replication controller, you may have all of your frontend web containers labeled with “role=frontend.” In Cloud Monitoring, you can now create a group “Frontend” that matches all resources with the tag role and the value frontend.
You can also make queries that aggregate across pods without needing to create a group, making it possible to visualize the performance of an entire replication controller or service on a single graph. You can do this by creating a new dashboard from the top-level menu option named Dashboards, and adding a chart. In the example below, you can see the aggregated memory usage of all the php-redis frontend pods in the cluster.
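The same kind of aggregation can also be scripted against the Monitoring API rather than built in the UI. Here's a minimal sketch using the current Cloud Monitoring Python client; the metric type shown is the present-day GKE metric name and, like the project ID, is an assumption rather than something from this post:

import time

from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
project_name = 'projects/my-project'  # hypothetical project ID

now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {"end_time": {"seconds": now}, "start_time": {"seconds": now - 3600}})

# Pull one hour of per-container memory usage; filter further by pod or
# container labels to match a group such as "Frontend".
results = client.list_time_series(
    request={
        "name": project_name,
        "filter": 'metric.type = "kubernetes.io/container/memory/used_bytes"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    })
for series in results:
    if series.points:
        print(series.resource.labels["pod_name"],
              series.points[0].value.int64_value)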



With these tools, you can create powerful alerting policies that trigger when the aggregate across the group or any container within the group violates a threshold, for example, using too much memory. You can also tag your group as a cluster so that Cloud Monitoring's cluster insights detection will show outliers across the set of containers when they're detected, potentially helping you to pinpoint cases where your load isn't evenly distributed or nodes don't have even workloads.




And since this is all based on tags, it will update automatically, even as your containers move across the nodes of your cluster, even if you're auto-scaling and adding and removing nodes over time.

We have a lot more work planned to continue integrating Container Engine and Cloud Monitoring, making it easy to collect your application and service metrics in addition to the system metrics you can use today.

Do you have ideas of what we should do to make things better? Let us know by sending feedback through the Cloud Monitoring console or directly at [email protected]. You can find more information on the available metrics in our docs.

- Posted by Alex Robinson, Software Engineer, Google Container Engine and Jeremy Katz, Software Engineer, Google Cloud Monitoring

Cloud Audit Logs to help you with audit and compliance needs

Not having a full view of administrative actions in your Google Cloud Platform projects can make it challenging and slow to troubleshoot when an important application breaks or stops working. It can also make it difficult to monitor access to sensitive data and resources managed by your project. That’s why we created Google Cloud Audit Logs, and today they’re available in beta for App Engine and BigQuery. Cloud Audit Logs help you with your audit and compliance needs by enabling you to track the actions of administrators in your Google Cloud Platform projects. They consist of two log streams: Admin Activity and Data Access.

Admin Activity audit logs contain an entry for every administrative action or API call that modifies the configuration or metadata for the related application, service or resource, for example, adding a user to a project, deploying a new version in App Engine or creating a BigQuery dataset. You can inspect these actions across your projects on the Activity page in the Google Cloud Platform Console.


Data Access audit logs contain an entry for every one of the following events:
  • API calls that read the configuration or metadata of an application, service or resource
  • API calls that create, modify or read user-provided data managed by a service (e.g. inserting data into a dataset or launching a query in BigQuery)

Currently, only BigQuery generates a Data Access log as it manages user-provided data, but ultimately all Cloud Platform services will provide a Data Access log.

There are many additional uses of Audit Logs beyond audit and compliance needs. In particular, the BigQuery team has put together a collection of examples that show how you can use Audit Logs to better understand your utilization and spending on BigQuery. We’ll be sharing more examples in future posts.


Accessing the Logs
Both of these logs are available in Google Cloud Logging, which means you can view individual log entries in the Logs Viewer and take advantage of the many logs management capabilities available, including exporting the logs to Google Cloud Storage for long-term retention, streaming them to BigQuery for real-time analysis and publishing them to Google Cloud Pub/Sub for processing via Google Cloud Dataflow. The specific content and format of the logs can be found in the Cloud Logging documentation for Audit Logs.
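You can also read these entries programmatically with the Cloud Logging client libraries. Here's a minimal sketch in Python; the project ID is a placeholder, and the payload fields you can read depend on the client library version:

from google.cloud import logging

client = logging.Client(project='my-project')  # hypothetical project ID
log_filter = (
    'logName="projects/my-project/logs/cloudaudit.googleapis.com%2Factivity"')

# Print the ten most recent Admin Activity audit log entries.
for i, entry in enumerate(client.list_entries(filter_=log_filter,
                                               order_by=logging.DESCENDING)):
    if i >= 10:
        break
    payload = entry.payload  # the audit entry's protoPayload as a dict
    print(entry.timestamp, payload.get('methodName'), payload.get('resourceName'))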

Audit Logs are available to you at no additional charge. Applicable charges for using other Google Cloud Platform services (such as BigQuery and Cloud Storage) as well as streaming logs to BigQuery will still apply. As we find more ways to provide greater insight into administrative actions in GCP projects, we’d love to hear your feedback. Share it here: [email protected].

Posted by Joe Corkery, Product Manager, Google Cloud Platform