Category Archives: Google Cloud Platform Blog

Product updates, customer stories, and tips and tricks on Google Cloud Platform

How to run SAP Fiori Front-End Server (OpenUI5) on GCP in 20 mins



Who enjoys doing risky development on their SAP system? No one. But if you need to build enterprise apps that use your SAP backend, not doing development is a non-starter. One solution is to apply Gartner’s Bimodal IT, the practice of managing two separate but coherent styles of work: one focused on predictability, the other on exploration. This is an awesome strategy for creating frontend innovation with modern HTML5/JS applications that are loosely coupled to the backend core ERP system, reducing risk. And it turns out that Google Cloud Platform (GCP) can be a great way to do Bimodal IT in a highly cost-effective way.

This blog walks through setting up SAP OpenUI5 on a GCP instance running a local node.js webserver to run sample apps. These apps can be the building blocks to develop new enterprise apps in the cloud without impacting your SAP backend. Let’s take a deeper look.

Set up your GCP account:

Make sure that you have set up your GCP free trial ($300 credit):
https://cloud.google.com/free/

After signing up, you can access GCP at
https://console.cloud.google.com

Everything in GCP happens in a project so we need to create one and enable billing (this uses your $300 free credit).

From the GCP Console, select or create a project by clicking GO TO THE MANAGE RESOURCES PAGE.

Make sure that billing is enabled (using your $300 free credit):

Setting up SAP OpenUI5 in GCP


1. Create a compute instance (virtual machine):


In the top left corner, click on ‘Products and Services’, then select ‘Compute Engine → VM instances’:
  • Click ‘Create instance’
  • Give it the coolest name you can think of
  • Select the zone closest to where you are located
  • Under ‘Machine Type’, choose “micro (1 shared vCPU)”. Watch the cost per month drop like a stone!
  • Under ‘Firewall’, check ‘Allow HTTP traffic’

Keep everything else as default and click Create. Your Debian VM should start in about 5-10 seconds.
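If you prefer the command line, the same VM can be created with gcloud; a minimal sketch, assuming an instance name of ‘saptest-vm’ and the us-central1-a zone (adjust both to your own choices):

gcloud compute instances create saptest-vm \
    --zone us-central1-a \
    --machine-type f1-micro \
    --image-family debian-9 --image-project debian-cloud \
    --tags http-server

The http-server tag corresponds to the ‘Allow HTTP traffic’ checkbox in the Console.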


2. Set up OpenUI5 on the new image:

SAP has an open-source version of its SAPUI5 framework, called OpenUI5, that is the basis for its Fiori Front-End Server. OpenUI5 comes with a number of sample apps. Let’s deploy these to a local node.js webserver on the instance.
Install nodejs and npm (node package manager):
sudo apt-get update
curl -sL https://deb.nodesource.com/setup_10.x | sudo -E bash -
sudo apt-get install -y nodejs

SAP files are zipped so install unzip with:
sudo apt-get install unzip

Make a project directory and change to it (feel free to change the name):
mkdir saptest 
cd saptest
Download the latest stable OpenUI5 SDK from:
https://openui5.org/download.html
e.g.,
wget https://openui5.hana.ondemand.com/downloads/openui5-sdk-1.54.6.zip
Time to grab a coffee as the download may take about 5 to 10 minutes depending on your connection speed.
Extract the zip file to your project directory with:
unzip openui5-sdk-1.54.6.zip
Next we will set up a local static node.js http server to serve up requests running on port 8888. Download static_server.js and package.json from Github into your project folder:
curl -O https://raw.githubusercontent.com/htammen/static_server/master/static_server.js
curl -O https://raw.githubusercontent.com/htammen/static_server/master/package.json
(https://github.com/htammen/static_server)
Identify your primary working directory and create a symbolic link to your resources folder. This allows the demo apps to work out of the box without modification (adjust the path to match your own):
pwd
ln -s /home/<me>/saptest/resources resources 
Call the node package manager to install the http server:
npm install
Run the node.js static server to accept http requests:
node static_server.js
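Before exposing the server externally, you can optionally sanity-check it from a second SSH session; a quick probe, assuming the defaults above:

curl -I http://localhost:8888/index.html

A ‘200 OK’ response means the static server and the symlinked resources are in place.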
Your node server should now be running and able to serve up SAP OpenUI5 sample applications from localhost. However, we should make this testable from outside the VM (e.g., from mobile), so let’s set up a firewall rule to allow traffic to our new static server on port 8888.
In the GCP Console, click on ‘Products and Services’ (top left), then
Networking → VPC network → Firewall rules.
Click New to create a new firewall rule and enter the following settings:


  • Name: allow-nodeserver
  • Network: default
  • Priority: 1000
  • Direction: Ingress
  • Action on match: Allow
  • Targets: All instances on network
  • Source filter: IP ranges
  • Source IP ranges: 0.0.0.0/0
  • Specified protocols and ports: tcp:8888

Now, click ‘Create’.
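The same rule can also be created from the command line; a sketch with gcloud, using the rule name above:

gcloud compute firewall-rules create allow-nodeserver \
    --network default \
    --direction INGRESS \
    --allow tcp:8888 \
    --source-ranges 0.0.0.0/0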
Go to Products and Services → Compute Engine → VM instances and copy the External IP. Open up a browser and navigate to:
http://<External IP>:8888/index.html 
Congratulations! You are now running the OpenUI5 front-end on your GCP instance.

3. Explore the OpenUI5 demo apps

You can take a look at the sample applications offered in OpenUI5 by clicking on ‘Demo Apps’, or you can navigate directly to the shopping cart application with:
http://<External IP>:8888/test-resources/sap/m/demokit/cart/webapp/index.html
(Pro-Tip: email this link to yourself and open on your mobile device to see the adaptable UI in action. Really cool.)
These demo apps just connect to local sample data in XML files. In the real world, OData is often used; it is a great way of connecting your front-end systems to backend SAP systems, and can be activated on your SAP Gateway. Please consult your SAP documentation for setting this up.
SAPUI5 has even more capabilities than OpenUI5 (e.g., charts and micro graphs). It is available either in your SAP deployment or on the SAP Cloud Platform. In addition, you can also leverage it on top of GCP via Cloud Foundry. Learn more here.
Good luck in your coding adventures!

References and other links

This awesome blog was the baseline of this tutorial:
https://blogs.sap.com/2015/09/25/running-ui5-apps-on-local-nodejs-server/

Some other good links to check out:
https://openui5.org/getstarted.html
https://blogs.sap.com/2017/03/13/how-to-consume-an-odata-service-with-openui5-sapui5/
https://blogs.sap.com/2015/07/15/sapui5-vs-fiori/
https://blogs.sap.com/2015/05/11/s4-hana-delivers-the-netweaver-vision-of-a-bimodal-it/

GPUs as a service with Kubernetes Engine are now generally available



[Editor's note: This is one of many posts on enterprise features enabled by Kubernetes Engine 1.10. For the full coverage, follow along here.]

Today, we’re excited to announce the general availability of GPUs in Google Kubernetes Engine, which have become one of the platform’s fastest growing features since they entered beta earlier this year, with core-hours soaring by 10X since the end of 2017.

Together with the GA of Kubernetes Engine 1.10, GPUs make Kubernetes Engine a great fit for enterprise machine learning (ML) workloads. By using GPUs in Kubernetes Engine for your CUDA workloads, you benefit from the massive processing power of GPUs whenever you need, without having to manage hardware or even VMs. We recently introduced the latest and the fastest NVIDIA Tesla V100 to the portfolio, and the P100 is generally available. Last but not least, we also offer the entry-level K80, which is largely responsible for the popularity of GPUs. All our GPU models are available as Preemptible GPUs, as a way to reduce costs while benefiting from GPUs in Google Cloud. Check out the latest prices for GPUs here.

As the growth in GPU core-hours indicates, our users are excited about GPUs in Kubernetes Engine. Ocado, the world’s largest online-only grocery retailer, is always looking to apply state-of-the-art machine learning models for Ocado.com customers and Ocado Smart Platform retail partners, and runs the models on preemptible, GPU-accelerated instances on Kubernetes Engine.
“GPU-attached nodes combined with Kubernetes provide a powerful, cost-effective and flexible environment for enterprise-grade machine learning. Ocado chose Kubernetes for its scalability, portability, strong ecosystem and huge community support. It’s lighter, more flexible and easier to maintain compared to a cluster of traditional VMs. It also has great ease-of-use and the ability to attach hardware accelerators such as GPUs and TPUs, providing a huge boost over traditional CPUs.”
— Martin Nikolov, Research Software Engineer, Ocado
GPUs in Kubernetes Engine also have a number of unique features:
  • Node Pools allow your existing cluster to use GPUs whenever you need.
  • Cluster Autoscaler automatically creates nodes with GPUs whenever pods requesting GPUs are scheduled, and scale down to zero when GPUs are no longer consumed by any active pods.
  • Taint and toleration technology ensures that only pods that request GPUs will be scheduled on the nodes with GPUs, and prevents pods that do not require GPUs from running on them.
  • Resource quotas allow administrators to limit resource consumption per namespace in a large cluster shared by multiple users or teams.
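As an illustration of the node pool and autoscaling features, a dedicated GPU node pool that can scale down to zero might be created like this (cluster name, zone, GPU type, and node counts are all placeholders):

gcloud container node-pools create gpu-pool \
    --cluster my-cluster --zone us-central1-a \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --num-nodes 1 \
    --enable-autoscaling --min-nodes 0 --max-nodes 4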
We also heard from you that you need an easy way to understand how your GPU jobs are performing: how busy the GPUs are, how much memory is available, and how much memory is allocated. We are thrilled to announce that you can now monitor this information natively from the GCP Console. You can also visualize these metrics in Stackdriver.
Fig 1. GPU memory usage and duty cycle 

The general availability of GPUs in Kubernetes Engine represents a lot of hard work behind the scenes, polishing the internals for enterprise workloads. Jiaying Zhang, the technical lead for this general availability, led the Device Plugins effort in Kubernetes 1.10, working closely with the OSS community to understand its needs, identify common requirements, and come up with an execution plan to build a production-ready system.

Try them today

To get started using GPUs in Kubernetes Engine using our free-trial of $300 credits, you’ll need to upgrade your account and apply for a GPU quota for the credits to take effect. For a more detailed explanation of Kubernetes Engine with GPUs, for example how to install NVIDIA drivers and how to configure a pod to consume GPUs, check out the documentation.
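For a sense of what consuming a GPU looks like, a pod requests one through the nvidia.com/gpu resource limit; a minimal sketch (pod name and container image are illustrative):

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: cuda-test
spec:
  containers:
  - name: cuda-container
    image: nvidia/cuda:9.0-base
    command: ["sleep", "3600"]   # keep the pod alive so you can exec in and experiment
    resources:
      limits:
        nvidia.com/gpu: 1        # schedules the pod onto a GPU node and maps one device
EOF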

In addition to GPUs in Kubernetes Engine, Cloud TPUs are also now publicly available in Google Cloud. For example, RiseML uses Cloud TPUs in Kubernetes Engine for a hassle-free machine learning infrastructure that is easy-to-use, highly scalable, and cost-efficient. If you want to be among the first to access Cloud TPUs in Kubernetes Engine, join our early access program today.

Thanks for your feedback on how to shape our roadmap to better serve your needs. Keep the conversation going by connecting with us on the Kubernetes Engine Slack channel.

Cloud TPU now offers preemptible pricing and global availability




Deep neural networks have enabled breakthroughs across a variety of business and research challenges, including translating text between languages, transcribing speech, classifying image content, and mastering the game of Go. Because training and running deep learning models can be extremely computationally demanding, we rely on our custom-built Tensor Processing Units (TPUs) to power several of our major products, including Translate, Photos, Search, Assistant, and Gmail.

Cloud TPUs allow businesses everywhere to transform their own products and services with machine learning, and we’re working hard to make Cloud TPUs as widely available and as affordable as possible. As of today, Cloud TPUs are available in two new regions in Europe and Asia, and we are also introducing preemptible pricing for Cloud TPUs that is 70% lower than the normal price.

Cloud TPUs are available in the United States, Europe, and Asia, and you can get started in minutes via our Quickstart guide.
One Cloud TPU (v2-8) can deliver up to 180 teraflops and includes 64 GB of high-bandwidth memory. The colorful cables link multiple TPU devices together over a custom 2-D mesh network to form Cloud TPU Pods. These accelerators are programmed via TensorFlow and are widely available today on Google Cloud Platform.

Benchmarking Cloud TPU performance-per-dollar


Training a machine learning model is analogous to compiling code: ML training needs to happen fast for engineers, researchers, and data scientists to be productive, and ML training needs to be affordable for models to be trained over and over as a production application is built, deployed, and refined. Key metrics include time-to-accuracy and training cost.

Researchers at Stanford recently hosted an open benchmarking competition called DAWNBench that focused on time-to-accuracy and training cost, and Cloud TPUs won first place in the large-scale ImageNet Training Cost category. On a single Cloud TPU, our open-source AmoebaNet reference model cost only $49.30 to reach the target accuracy, and our open-source ResNet-50 model cost just $58.53. Our TPU Pods also won the ImageNet Training Time category: the same ResNet-50 code running on just half of a TPU pod was nearly six times faster than any non-TPU submission, reaching the target accuracy in approximately 30 minutes!

Although we restricted ourselves to standard algorithms and standard learning regimes for the competition, another DAWNBench submission from fast.ai (3rd place in ImageNet Training Cost, 4th place in ImageNet Training Time) altered the standard ResNet-50 training procedure in two clever ways to achieve faster convergence (GPU implementation here). After DAWNBench was over, we easily applied the same optimizations to our Cloud TPU ResNet-50 implementation. This reduced ResNet-50 training time on a single Cloud TPU from 8.9 hours to 3.5 hours, a 2.5X improvement, which made it possible to train ResNet-50 for just $25 with normal pricing.

Preemptible Cloud TPUs make the Cloud TPU platform even more affordable. You can now train ResNet-50 on ImageNet from scratch for just $7.50. Preemptible Cloud TPUs allow fault-tolerant workloads to run more cost-effectively than ever before; these TPUs behave similarly to Preemptible VMs. And because TensorFlow has built-in support for saving and restoring from checkpoints, deadline-insensitive workloads can easily take advantage of preemptible pricing. This means you can train cutting-edge deep learning models to achieve DAWNBench-level accuracy for less than you might pay for lunch!
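As a sketch, a preemptible Cloud TPU is requested by adding a single flag at creation time; names, zone, network range, and TensorFlow version below are placeholders, and depending on your SDK version the TPU commands may still be in the beta track:

gcloud compute tpus create my-tpu \
    --zone us-central1-f \
    --range 10.240.1.0/29 \
    --version 1.8 \
    --preemptible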




| Select open-source reference models | Normal training cost (TF 1.8) | Preemptible training cost (TF 1.8) |
| --- | --- | --- |
| ResNet-50 (with optimizations from fast.ai): image classification | ~$25 | ~$7.50 |
| ResNet-50 (original implementation): image classification | ~$59 | ~$18 |
| AmoebaNet: image classification (model architecture evolved from scratch on TPUs to maximize accuracy) | ~$49 | ~$15 |
| RetinaNet: object detection | ~$40 | ~$12 |
| Transformer: neural machine translation | ~$41 | ~$13 |
| ASR Transformer: speech recognition (transcribe speech to text) | ~$86 | ~$27 |

Start using Cloud TPUs today

We aim for Google Cloud to be the best place to run all of your machine learning workloads. Cloud TPUs offer great performance-per-dollar for training and batch inference across a variety of machine learning applications, and we also offer top-of-the-line GPUs with recently-improved preemptible pricing.

We’re excited to see what you build! To get started, please check out the Cloud TPU Quickstart, try our open source reference models, and be sure to sign up for a free trial to start with $300 in cloud credits. Finally, we encourage you to watch our Cloud-TPU-related sessions from Google I/O and the TensorFlow Dev Summit: “Effective machine learning with Cloud TPUs” and “Training Performance: A user’s guide to converge faster.”


A datacenter technician scoots past two rows of Cloud TPUs and supporting equipment.

Building scalable web applications with Cloud Datastore — new solution



If you manage database systems for large web applications, your job can be quite challenging. When unforeseen situations arise, making configuration changes can be complex and risky due to the stateful nature of those database systems. And before launching a new application, you have to do a lot of capacity planning: the number of virtual machines (VMs), the amount of disk storage, and the optimal network configuration, all while contending with unknown factors such as the volume and frequency of open database connections and evolving usage patterns. You also need to do regular maintenance work to upgrade database software and scale resources to meet growing demand.

All of this planning and maintenance takes time, money, and attention away from developing new application features, so it is important to find a balance between provisioning enough resources to handle heavy loads and overspending on unused resources.

Cloud Datastore can help minimize these challenges by providing a scalable, highly available, high-performance, and fully-managed NoSQL database system.

We recently published an article that presents an overview of how to build large web applications with Cloud Datastore. The article includes scenarios of full-fledged web applications that use Cloud Datastore jointly with other products in the Google Cloud Platform (GCP) ecosystem.

Check out the article for all the details and next steps for building your own scalable solutions using Cloud Datastore!

GCP arrives in the Nordics with a new region in Finland



Click here for the Finnish version, thank you!

Our sixteenth Google Cloud Platform (GCP) region, located in Finland, is now open for you to build applications and store your data.

The new Finland region, europe-north1, joins the Netherlands, Belgium, London, and Frankfurt in Europe and makes it easier to build highly available, performant applications using resources across those geographies.

Hosting applications in the new region can improve latencies by up to 65% for end-users in the Nordics and by up to 88% for end-users in Eastern Europe, compared to hosting them in the previously closest region. You can visit www.gcping.com to see for yourself how fast the Finland region is from your location.

Services


The new region has everything you need to build the next great application, including three zones, so you can distribute applications and storage across zones to protect against service disruptions.
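As a simple example, once your project is set up, launching a VM in the new region is just a matter of picking a europe-north1 zone (instance name is illustrative):

gcloud compute instances create my-nordic-app --zone europe-north1-a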

You can also access our Multi-Regional services in Europe (such as BigQuery) and all the other GCP services via the Google Network, the largest cloud network as measured by number of points of presence. Please visit our Service Specific Terms to get detailed information on our data storage capabilities.

Build sustainably


The new region is located in our existing data center in Hamina. This facility is one of the most advanced and efficient data centers in the Google fleet. Our high-tech cooling system, which uses sea water from the Gulf of Finland, reduces energy use and is the first of its kind anywhere in the world. This means that when you use this region to run your compute workloads, store your data, and develop your applications, you are doing so sustainably.

Hear from our customers


“The road to emission-free and sustainable shipping is a long and challenging one, but thanks to exciting innovation and strong partnerships, Rolls-Royce is well-prepared for the journey. For us being able to train machine learning models to deliver autonomous vessels in the most effective manner is key to success. We see the Google Cloud for Finland launch as a great advantage to speed up our delivery of the project.”
– Karno Tenovuo, Senior Vice President Ship Intelligence, Rolls-Royce

“Being the world's largest producer of renewable diesel refined from waste and residues, as well as being a technologically advanced refiner of high-quality oil products, requires us to take advantage of leading-edge technological possibilities. We have worked together with Google Cloud to accelerate our journey into the digital future. We share the same vision to leave a healthier planet for our children. Running services on an efficient and sustainably operated cloud is important for us. And even better that it is now also available physically in Finland.”
– Tommi Touvila, Chief Information Officer, Neste

“We believe that technology can enhance and improve the lives of billions of people around the world. To do this, we have joined forces with visionary industry leaders such as Google Cloud to provide a platform for our future innovation and growth. We’re seeing tremendous growth in the market for our operations, and it’s essential to select the right platform. The Google Cloud Platform cloud region in Finland stands for innovation.”
– Anssi Rönnemaa, Chief Finance and Commercial Officer, HMD Global

“Digital services are key growth drivers for our renewal of a 108-year old healthcare company. 27% of our revenue is driven by digital channels, where modern technology is essential. We are moving to a container-based architecture running on GCP at Hamina. Google has a unique position to provide services within Finland. We also highly appreciate the security and environmental values of Google’s cloud operations.”
– Kalle Alppi, Chief Information Officer, Mehiläinen

Partners in the Nordics


Our partners in the Nordics are available to help design and support your deployment, migration and maintenance needs.


"Public cloud services like those provided by Google Cloud help businesses of all sizes be more agile in meeting the changing needs of the digital era—from deploying the latest innovations in machine learning to cost savings in their infrastructure. Google Cloud Platform's new Finland region enables this business optimization and acceleration with the help of cloud-native partners like Nordcloud and we believe Nordic companies will appreciate the opportunity to deploy the value to their best benefit.”
– Jan Kritz, Chief Executive Officer, Nordcloud

Nordic partners include: Accenture, Adapty, AppsPeople, Atea, Avalan Solutions, Berge, Cap10, Cloud2, Cloudpoint, Computas, Crayon, DataCenterFinland, DNA, Devoteam, Doberman, Deloitte, Enfo, Evry, Gapps, Greenbird, Human IT Cloud, IIH Nordic, KnowIT, Koivu Solutions, Lamia, Netlight, Nordcloud, Online Partners, Outfox Intelligence AB, Pilvia, Precis Digital, PwC, Quality of Service IT-Support, Qvik, Skye, Softhouse, Solita, Symfoni Next, Soprasteria, Tieto, Unifoss, Vincit, Wizkids, and Webstep.

If you want to learn more or wish to become a partner, visit our partners page.

Getting started


For additional details on the region, please visit our Finland region page where you’ll get access to free resources, whitepapers, the "Cloud On-Air" on-demand video series and more. Our locations page provides updates on the availability of additional services and regions. Contact us to request access to new regions and help us prioritize what we build next.

Partner Interconnect now generally available



We are happy to announce that Partner Interconnect, launched in beta in April, is now generally available. Partner Interconnect lets you connect your on-premises resources to Google Cloud Platform (GCP) from the partner location of your choice, at a data rate that meets your needs.

With general availability, you can now receive an SLA for Partner Interconnect connections if you use one of the recommended topologies. If you were a beta user with one of those topologies, you will automatically be covered by the SLA. Charges for the service start with GA (see pricing).

Partner Interconnect is ideal if you want physical connectivity to your GCP resources but cannot connect at one of Google’s peering locations, or if you want to connect with an existing service provider. If you need help understanding the connection options, the information here can help.

In this blog we’ll walk through how to start using Partner Interconnect, from choosing the partner that works best for you through deploying and using your interconnect.


Choosing a partner


If you already have a service provider partner for network connectivity, you can check the list of supported service providers to see if they offer Partner Interconnect service. If not, you can select a partner from the list based on your data center location.

Some critical factors to consider are:
  • Make sure the partner can offer the availability and latency you need between your on-premises network and their network.
  • Check whether the partner offers Layer 2 connectivity, Layer 3 connectivity, or both. If you choose a Layer 2 Partner, you have to configure and establish a BGP session between your Cloud Routers and on-premises routers for each VLAN attachment that you create. If you choose a Layer 3 partner, they will take care of the BGP configuration.
  • Please review the recommended topologies for production-level and non-critical applications. Google provides a 99.99% (with Global Routing) or 99.9% availability SLA, which applies only to the connectivity between your VPC network and the partner’s network.

Bandwidth options and pricing


Partner Interconnect provides flexible options for bandwidth between 50 Mbps and 10 Gbps. Google charges on a monthly basis for VLAN attachments depending on capacity and egress traffic (see options and pricing).

Setting up Partner Interconnect VLAN attachments


Once you’ve established network connectivity with a partner, and they have set up interconnects with Google, you can set up and activate VLAN attachments using these steps:
  1. Create VLAN attachments.
  2. Request provisioning from the partner.
  3. If you have a Layer 2 partner, complete the BGP configuration and then activate the attachments for traffic to start. If you have a Layer 3 partner, simply activate the attachments, or use the pre-activation option.
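For steps 1 and 2, the VLAN attachment is created against your Cloud Router and the resulting pairing key is handed to the partner; a sketch with gcloud, where the attachment, router, region, and availability-domain values are placeholders:

gcloud compute interconnects attachments partner create my-attachment \
    --region us-central1 \
    --router my-router \
    --edge-availability-domain availability-domain-1

gcloud compute interconnects attachments describe my-attachment \
    --region us-central1

Copy the pairingKey value from the describe output and give it to your partner to request provisioning.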
With Partner Interconnect, you can connect to GCP where and how you want to. Follow these steps to easily access your GCP compute resources from your on-premises network.

Try full-stack monitoring with Stackdriver on us



In advance of the new simplified Stackdriver pricing that will go into effect on June 30, we want to make sure everyone gets a chance to try Stackdriver. That’s why we’ve decided to offer the full power of Stackdriver, including premium monitoring, logging and application performance management (APM), to all customers—new and existing—for free until the new pricing goes into effect. This offer will be available starting June 18.

Stackdriver, our full-stack logging and monitoring tool, collects logs and metrics, as well as other data from your cloud apps and other sources, then generates useful dashboards, charts and alerts to let you act on information as soon as you get it. Here’s what’s included when you try Stackdriver:
  • Out-of-the-box observability across all the Google Cloud Platform (GCP) and Amazon Web Services (AWS) services you use
  • Platform, system, application and custom metrics on demand with Metrics Explorer
  • Uptime checks to monitor the availability of the internet-facing endpoints you depend on
  • Alerting policies to let you know when something is wrong. Alerting and notification options, previously available only on the premium tier, are now available for free during this limited time
  • Access to logging and APM features like logs-based metrics, using Trace to understand application health, debugging live with debugger and more
Want to estimate your usage once the new pricing goes into effect? Check out our earlier blog post on viewing and managing your costs. You’ll see the various ways you can estimate usage to plan for the best use of Stackdriver monitoring in your environment. And if you are not already a Stackdriver user, you can sign up to try Stackdriver now!

Behind the scenes with the Dragon Ball Legends GCP backend



Dragon Ball Legends, a new mobile game from Bandai Namco Entertainment (BNE), is based on its popular Dragon Ball Z franchise, and is rolling out to gamers around the world as we speak. But planning the cloud infrastructure to power the game dates back to February 2017, when BNE approached Google Cloud to talk about the interesting challenges they were facing, and how we could help.

Based on their anticipated demand, BNE had three ambitious requirements for their game:
  1. Extreme scalability. The game would be launched globally, so it needed a backend that could scale to millions of players and still perform well.
  2. Global network. Because the game allows real-time player versus player battles, it needs a reliable and low-latency network across regions.
  3. Real-time data analytics. The game is designed to evolve with players in real time, so it was critical to have a data analytics pipeline that streams data to a data warehouse. The operations team can then measure and evaluate how people are playing the game and adjust it on-the-fly.
All three of these are areas where we have a lot of experience. Google has multiple global services with more than a billion users, and we use the data those services generate to improve them over time. And because Google Cloud Platform (GCP) runs on the same infrastructure as these Google services, GCP customers can take advantage of the same enabling technologies.

Let’s take a look at how BNE worked with Google Cloud to build the infrastructure for Dragon Ball Legends.


Challenge #1: Extreme scalability

MySQL is extensively used by gaming companies in Japan because engineers are used to working with relational databases with schemas, SQL queries, and strong consistency. This simplifies a lot on the application side, which doesn’t have to handle database limitations like eventual consistency or schema enforcement. MySQL is also widely used outside gaming, and most backend engineers already have strong experience with it.

While MySQL offers many advantages, it has one big limitation: scalability. As a scale-up database, if you want to increase MySQL performance, you need to add more CPU, RAM, or disk. And when a single instance of MySQL can’t handle the load anymore, you can divide the load by sharding: splitting users into groups and assigning them to multiple independent instances of MySQL. Sharding has a number of drawbacks, however. Most gaming developers calculate the number of shards they’ll need before the game launches, since resharding is labor-intensive and error-prone. This leads gaming companies to overprovision the database to handle more players than they actually expect. If the game is as popular as expected, everything is fine. But what if the game is a runaway hit and exceeds the anticipated demand? What about the long tail representing a gradual decrease in active players? What if it’s an out-and-out flop? MySQL sharding is not dynamically scalable, and adjusting its size requires maintenance as well as risk.

In an ideal world, databases can scale in and out without downtime while offering the advantages of a relational database. When we first heard that BNE was considering MySQL sharding to handle the massive anticipated traffic for Dragon Ball Legends, we suggested they consider Cloud Spanner instead.


Why Cloud Spanner?

Cloud Spanner is a fully managed relational database that offers horizontal scalability and high availability while keeping strong consistency with a schema that is similar to MySQL’s. Better yet, as a managed service, it’s looked after by Google SREs, removing database maintenance and minimizing the risk of downtime. We thought Cloud Spanner would be able to help BNE make their game global.


Evaluation to implementation

Before adopting a new technology, engineers should always test it to confirm its expected performance in a real world scenario. Before replacing MySQL, BNE created a new Cloud Spanner instance in GCP, including a few tables with a similar schema to what they used in MySQL. Since their backend developers were writing in Scala, they chose the Java client library for Cloud Spanner and wrote some sample code to load-test Cloud Spanner and see if it could keep up with their queries per second (QPS) requirements for writes—around 30,000 QPS at peak. Working with our customer engineer and the Cloud Spanner engineering team, they met this goal easily. They even developed their own DML (Data Manipulation Language) wrapper to write SQL commands like INSERT, UPDATE and DELETE.
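While the load test itself used the Java client library, standing up a test instance and database is a one-liner each; a sketch with gcloud, where the instance, config, and database names are placeholders:

gcloud spanner instances create game-test \
    --config regional-asia-northeast1 \
    --nodes 3 \
    --description "Load test"
gcloud spanner databases create game-db --instance game-test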


Game release

With the proof of concept behind them, they could start their implementation. Based on the expected daily active users (DAU), BNE calculated how many Cloud Spanner nodes they needed—enough for the 3 million pre-registered players they were expecting. To prepare the release, they organized two closed beta tests to validate their backend, and didn’t have a single issue with the database! In the end, over 3 million participants worldwide pre-registered for Dragon Ball Legends, and even with this huge number, the official game release went flawlessly.

Long story short, BNE can focus on improving the game rather than spending time operating their databases.


Challenge #2: Global network

Let’s now talk about BNE’s second challenge: building a global real-time player-vs-player (PvP) game. BNE’s goal for Dragon Ball Legends was to let all its players play against one another, anywhere in the world. If you know anything about networking, you understand the challenge around latency. Round-trip time (RTT) between Tokyo and San Francisco, for example, is on average around 100 ms. To address that, they decided to divide every game second into 250 ms intervals. So while the game looks like it’s real-time to users, it’s actually a really fast turn-based game at its core (you can read more about the architecture here). And while some might say that 250 ms offers plenty of room for latency, it’s extremely hard to predict latency when communicating across the internet.


Why Cloud Networking?

Here’s what it looks like for a game client to access the game server on GCP over the internet. Since the number of hops can vary from request to request, PvP play can sometimes feel fast and sometimes slow.

One of the main reasons BNE decided to use GCP for the Dragon Ball Legends backend was Google’s dedicated network. As you can see in the picture below, when using GCP, once the game client reaches one of the hundreds of GCP points of presence (POPs) around the world, it’s on the Google dedicated network. That means no unpredictable hops, for predictable and the lowest possible latency.


Taking advantage of the Google Cloud Network

Usually, gaming companies implement PvP by connecting two players directly or through a dedicated game server. Combat games that require low latency between players generally prefer P2P communication. When two players are geographically close, P2P works very well, but it’s often unreliable when trying to communicate across regions (some carriers even block P2P protocols). So players first try to communicate through P2P, and if that fails, they fail over to an open-source STUN/TURN server implementation called coturn, which acts as a relay between the two players across Google’s dedicated network. That way, cross-continent battles leverage the low-latency and reliable Google network as much as possible.


Challenge #3: Real-time data analytics

BNE’s last challenge was around real-time data analytics. BNE wanted to offer the best user experience to their fans, and one of the ways to do that is through live game operations, or LiveOps, in which operators make constant changes to the game so it always feels fresh. But to understand players’ needs, they needed data, usually logs of users’ actions. And if they could get this data in near real-time, they could make decisions on what changes to apply to the game to increase users’ satisfaction and engagement.

To gather this data, BNE used a combination of Cloud Pub/Sub and Cloud Dataflow to transform users’ data in real time and insert it into BigQuery.
  • Cloud Pub/Sub offers a globally reliable messaging system that buffers the logs until they can be handled by Cloud Dataflow.
  • Cloud Dataflow is a fully managed parallel processing service that lets you execute ETL in real-time and in parallel.
  • BigQuery is the fully managed data warehouse where all the game logs are stored. Since BigQuery offers petabyte-scale storage, scaling was not a concern. Thanks to heavy parallel processing when querying the logs, BNE can get a response to a query that scans terabytes of data in just a few seconds.
This system lets a game producer visualize players’ behavior in near real-time and decide what new features to bring to the game, or what to change inside it, to satisfy their fans.
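As a hypothetical illustration of that last step, an operator could ask BigQuery which player actions spiked over the past hour; the dataset and table names here are invented for the sketch:

bq query --use_legacy_sql=false '
SELECT action, COUNT(*) AS events
FROM `game_logs.player_actions`
WHERE event_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
GROUP BY action
ORDER BY events DESC'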


Takeaways

Using Cloud Spanner, BNE could focus on developing an amazing game instead of spending time on database capacity planning and scaling. Operations-wise, by using a fully managed scalable database, they drastically reduced risks related to human error, as well as operational overhead.

Using Cloud Networking, they leveraged Google’s dedicated network to offer the best user experience to their fans, even when fighting across regions.

And finally, using Google’s analytics stack (Cloud Pub/Sub, Cloud Dataflow and BigQuery), BNE was able to analyze players’ behaviors in near real-time and make decisions about how to adjust the game to make their fans even happier!

If you want to hear more details about how they evaluated and adopted Cloud Spanner for their game, please join them at their Google Cloud NEXT’18 session in San Francisco.

Introducing QUIC support for HTTPS load balancing



For four years now, Google has been using QUIC, a UDP-based encrypted transport protocol optimized for HTTPS, to deliver traffic for our products – from Google Web Search, to YouTube, to this very blog. If you’re reading this in Chrome, you’re probably using QUIC right now. QUIC makes the web faster, particularly for slow connections, and now your cloud services can enjoy that speed: today, we’re happy to be the first major public cloud to offer QUIC support for our HTTPS load balancers.

QUIC’s key features include establishing connections faster, stream-based multiplexing, improved loss recovery, and no head-of-line blocking. QUIC is designed with mobility in mind, and supports migrating connections from WiFi to Cellular and back.

Benefits of QUIC


If your service is sensitive to latency, QUIC will make it faster because of the way it establishes connections. When a web client uses TCP and TLS, it requires two to three round trips with a server to establish a secure connection before the browser can send a request. With QUIC, if a client has talked to a given server before, it can start sending data without any round trips, so your web pages will load faster. How much faster? On a well-optimized site like Google Search, connections are often pre-established, so QUIC’s faster connections can only speed up some requests—but QUIC still improves mean page load time by 8% globally, and up to 13% in regions where latency is higher.

Cedexis benchmarked our Cloud CDN performance using a Google Cloud project. Here’s what happened when we enabled QUIC.

Encryption is built into QUIC, using AEAD algorithms such as AES-GCM and ChaCha20 for both privacy and integrity. QUIC authenticates the parts of its headers that it doesn’t encrypt, so attackers can’t modify any part of a message.

Like HTTP/2, QUIC multiplexes multiple streams into one connection, so that a connection can serve several HTTP requests simultaneously. But HTTP/2 uses TCP as its transport, so all of its streams can be blocked when a single TCP packet is lost—a problem called head-of-line blocking. QUIC is different: Loss of a UDP packet within a QUIC connection only affects the streams contained within that packet. In other words, QUIC won’t let a problem with one request slow the others down, even on an unreliable connection.

Enabling QUIC

You can enable QUIC in your load balancer with a single setting in the GCP Console. Just edit the frontend configuration for your load balancer and enable QUIC negotiation for the IP and port you want to use, and you’re done.

You can also enable QUIC using gcloud:
gcloud compute target-https-proxies update proxy-name \
    --quic-override=ENABLE
Once you’ve enabled QUIC, your load balancer negotiates QUIC with clients that support it, like Google Chrome and Chromium. Clients that do not support QUIC continue to use HTTPS seamlessly. If you distribute your own mobile client, you can integrate Cronet to gain QUIC support. The load balancer translates QUIC to HTTP/1.1 for your backend servers, just like traffic with any other protocol, so you don’t need to make any changes to your backends—all you need to do is enable QUIC in your load balancer.
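To confirm QUIC is on offer, you can look for the Alt-Svc header the load balancer attaches to HTTPS responses once QUIC negotiation is enabled; a quick check, where the domain is illustrative and the exact header value will vary:

curl -sI https://www.example.com/ | grep -i alt-svc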

The Future of QUIC

We’re working to help QUIC become a standard for web communication, just as we did with HTTP/2. The IETF formed a QUIC working group in November 2016, which has seen intense engagement from IETF participants, and is scheduled to complete v1 drafts this November. QUIC v1 will support HTTP over QUIC, use TLS 1.3 as the cryptographic handshake, and support migration of client connections. At the working group’s most recent interop event, participants presented over ten independent implementations.

QUIC is designed to evolve over time. A client and server can negotiate which version of QUIC to use, and as the IETF QUIC specifications become more stable and members reach clear consensus on key decisions, we’ve used that version negotiation to keep pace with the current IETF drafts. Future planned versions will also include features such as partial reliability, multipath, and support for non-HTTP applications like WebRTC.

QUIC works across changing network connections. QUIC can migrate client connections between cellular and Wifi networks, so requests don’t time out and fail when the current network degrades. This migration reduces the number of failed requests and decreases tail latency, and our developers are working on making it even better. QUIC client connection migration will soon be available in Cronet.

Try it out today

Read more about QUIC in the HTTPS load balancing documentation and enable it for your project(s) by editing your HTTP(S) load balancer settings. We look forward to your feedback!

Labelling and grouping your Google Cloud Platform resources



Do you run and administer applications on Google Cloud Platform (GCP)? Do you need to group or classify your GCP resources to satisfy compliance demands? Do you need to manage traffic going to and from a VM, monitor specific resources, or see those resources by billing account? If you answered yes to any of these questions, you’ll be glad to know that GCP provides multiple ways to annotate your resources, to make them easier to track: security marks, labels and network tags.

While each annotation has different functionality and scope, they are not mutually exclusive and you will often use a combination of them to meet your requirements. To help you choose which annotation, when, take a look at this flowchart.

Let’s take a deeper look at each of these types of annotations.

Annotation type: Security marks

Security marks, or marks, provide a way to annotate assets and then search, select, or filter using the mark via Cloud Security Command Center (Cloud SCC).

Use cases:
Here are the main use cases for security marks:
  • classifying and organizing assets and findings independent of resource-level labelling mechanisms, including multi-parented groupings
  • enabling tracking of violation severity and priority
  • integrating with workflow systems for assignment and resolution of incidents
  • enabling differentiated policy enforcement on resources, projects or groups of projects
  • enhancing security focused insights into your resources, e.g., clarifying which publicly accessible buckets are within policy and which are not
Note that Cloud Labels and Tags for the associated supported resources also appear and are indexed by Cloud SCC, so that you can use automation logic to create/modify marks based on the values of existing resource labels and tags.

How to use:
Marks are key-value pairs that are supported by a number of resources. They provide a security-focused view of the supported resources and are only visible from Cloud SCC. Edit or view access to the inventory of resources in Cloud SCC and their associated marks requires the securityCenter.editor IAM role, independently of the roles and permissions on the underlying resource. You can set marks at the org level, project level, or for individual resources that support marks. To work with marks, you can use cURL, the REST API, the Cloud SCC Python library, or the Cloud SCC asset inventory page, by selecting the resources you wish to apply a mark to and then adding the key-value pair items.

What you can annotate:
A valid mark meets the following criteria:
  • While in Alpha, keys must have a minimum length of 1 character and a maximum length of 63 characters, and cannot be empty. In the Beta and GA releases, keys will be extended to support up to 256 characters, and values will support a maximum of 4096 characters.
  • Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and can include international characters.
You can find an up-to-date list of resources that can be annotated using marks here.

Annotation type: Labels

Labels are key-value pairs that are supported by a number of GCP resources. You can use labels to track your spend in exported billing data. You can also use labels to filter and group resources for other use cases, for example, to identify all those resources that are in a test environment, as opposed to those in production.

Here’s a list of all the things you can do with labels:
  • Identify resources used by individual teams or cost centers
  • Distinguish deployment environments (prod, stage, qa, test)
  • Identify owners and state labels
  • Use for cost allocation and billing breakdowns.
  • Monitor resource groups via Stackdriver, which can use labels accessible in the resource metadata
How to use:
A valid label meets the following criteria:
  • Each label must be a key-value pair.
  • Keys must have a minimum length of 1 character and a maximum length of 63 characters, and cannot be empty. Values can be empty, and have a maximum length of 63 characters.
  • Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and can include international characters.
  • The key portion of a label must be unique. However, you can use the same key with multiple resources. Keys must start with a lowercase letter or international character.
Check the supported resources to learn how to apply labels and to what you can apply them. For instance, BigQuery lets you add labels to your datasets, tables, and views, while Cloud Storage allows you to add labels to buckets. You can add labels to projects but not to folders.

The permissions you need to add labels to resources are determined on a product-by-product basis. For example, BigQuery requires the bigquery.datasets.update permission to modify labels on datasets. The owner of the dataset has this permission by default but you can also assign at the project level the predefined IAM roles bigquery.dataOwner and bigquery.admin, which include this permission. You can also add labels to tables and views; this action requires the bigquery.tables.update permission. Assigning the predefined IAM roles at project level bigquery.dataOwner, bigquery.dataEditor or bigquery.admin grants this permission. The owner of the dataset has full permissions over the tables and views that the dataset contains.
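As a sketch, here is how labels might be attached from the command line to a Compute Engine instance and a BigQuery dataset (all resource names, zones, and label values are illustrative):

# Add labels to a Compute Engine instance
gcloud compute instances add-labels my-instance \
    --zone us-central1-a \
    --labels env=test,team=payments

# Add a label to a BigQuery dataset
bq update --set_label env:test my_project:my_dataset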

What you can label:
An up-to-date list of GCP products that support labels can be found here. Then, drill down into each product’s documentation for more details.

Note that you can label instances, but if you are annotating these to manage network traffic, you should use tags instead (see below).

Annotation type: Network tags

Network tags apply to instances and are the means for controlling network traffic to and from a VM instance. On GCP networks, tags identify which VM instances are subject to firewall rules and network routes. You can use the tags as source and destination values in firewall rules. For routes, tags are used to identify to which instances a certain route applies.

How to use:
Using tags means you can create additional isolation between subnetworks by selectively allowing only certain instances to communicate. If you arrange for all instances in a subnetwork to share the same tag, you can specify that tag in firewall rules to simulate a per-subnetwork firewall. For example if you have a subnet called ‘subnet-a’, you can tag all instances in subnet-a with the tag ‘my-subnet-a’, and use that tag in firewall rules as a source or destination.

Tags can be added to or removed from an instance using gcloud commands, the Cloud Console, or API calls. The following gcloud command adds the tags ‘production’ and ‘web’ to an instance:
gcloud compute instances add-tags [INSTANCE_NAME] --tags production,web
You can set firewall rules using gcloud commands and the Console. The following gcloud command sets a firewall rule using tags in the source and destination. It allows traffic from instances tagged ‘web-production’ to instances tagged ‘log-data’ via TCP port 443:
gcloud compute firewall-rules create web-logdata \
    --network logging-network \
    --allow TCP:443 \
    --source-tags web-production \
    --target-tags log-data
For routes, tags are used to identify which instances a certain route applies to. For example, you might create a route that applies to all VM instances that have been tagged with the string vpn. You can set routes using gcloud commands or the Console. The following gcloud command creates a route called my-route in a network called my-network that restricts the route to only apply to instances tagged ‘web-prod’ or ‘api-gate-prod’.

gcloud compute routes create my-route --destination-range 10.0.0.0/16 \
--network my-network --tags=web-prod,api-gate-prod
Note, however, that network tags can be modified by anyone in your org who has the Compute InstanceAdmin role in the project the network was created in. You can create a custom role with more restricted permissions that removes the ability to set tags on instances by dropping the compute.instances.setTags permission from the Compute InstanceAdmin role. Alternatively, instead of using tags and custom roles to prevent developers from adjusting tags (and thus enabling a firewall rule on their instances), use service accounts: unless developers have access to the appropriate centrally managed service accounts, they will be unable to modify the rule. Refer to service accounts vs. tags to determine whether the restrictions when using service accounts for firewall rules are acceptable.

Tag values have to meet the following criteria:
  • Can be no longer than 63 characters each
  • Can only contain lowercase letters, numeric characters, and dashes
  • Must start and end with either a number or a lowercase character.

Marks, labels and network tags at a glance

To recap, here’s a table that summarizes common use cases and their associated annotations.

| Use case | Annotation(s) required | Notes |
| --- | --- | --- |
| Taking inventory of GCP resources | Security marks, Labels | Labels currently support a wider range of resources than security marks. The Cloud SCC view of your resources includes, as properties of the resource, the labels and tags you’ve applied; you can then further apply ACL’d security marks to control the organization and filtering of the superset of resources, tags and labels. |
| Classifying or grouping GCP resources and data into logical groups, such as dev or production environments, for non-ACL’d use cases | Labels | |
| Classifying or grouping GCP resources and data into logical groups, such as dev or production environments, for security use cases | Security marks | Use these when you want control of the groups to be at the organization level, or specifically not in the control of the resource owner. |
| Grouping and classifying sensitive resources and/or organizing resources for attribution in security use cases | Security marks | Reading and setting marks is restricted to the Cloud SCC-specific roles. |
| Billing breakdown and cost allocation | Labels | |
| Network traffic management to and from instances | Network tags | Network tags can be modified by anyone in your org who has the Compute InstanceAdmin role. |
| Monitoring groupings of related resources for operational tasks | Labels | Used with Stackdriver resource groups. |
| Monitoring groupings of related resources and/or findings for security risk assessments, vulnerability management and threat detection | Security marks | Used in Cloud SCC. |

On your mark, get set, go

If you manage a big, complex environment, you know how hard it can be to keep track of all your GCP resources. We hope that security marks, labels and network tags can make that task a little bit easier. For more information on tracking resources in GCP, check out this how-to guide on creating and managing labels. And watch this space for more insider tips on managing GCP resources like a pro.