Tag Archives: networking

5 must-see network sessions at Google Cloud NEXT 2018



Whether you’re moving data to or from Google Cloud, or are knee-deep in plumbing your cloud network architecture, there’s a lot to learn at Google Cloud Next 2018 next week in San Francisco. Here’s our shortlist of the five must-see networking breakout sessions at the show, in chronological order from Tuesday to Thursday.
Operations engineer Rebekah Roediger delivering cloud network capacity one link at a time in our Netherlands cloud region (europe-west4).

GCP Network and Security Telemetry
Speakers: Ines Envid, Senior Product Manager, Yuri Solodkin, Staff Software Engineer and Vineet Bhan, Head of Security Partnerships
Network and security telemetry is fundamental to operating your deployments in public clouds with confidence, providing the required visibility into the behavior of your network and access-control firewalls.
When: July 24th, 2018 12:35pm


A Year in GCP Networking
Speakers: Srinath Padmanabhan, Networking Product Marketing Manager, Google Cloud and Nick Jacques, Lead Cloud Engineer, Target
In this session, we will talk about the valuable advancements made in GCP Networking over the last year. We will introduce you to the GCP Network team and tell you what you can do to extract the most value from your GCP deployment.
When: July 24th, 2018 1:55pm


Cloud Load Balancing Deep Dive and Best Practices
Speakers: Prajakta Joshi, Sr. Product Manager and Mike Columbus, Networking Specialist Team Manager
Google Cloud Load Balancing lets enterprises and cloud-native companies deliver highly available, scalable, low-latency cloud services with a global footprint. You will see demos and learn how enterprise customers deploy Cloud Load Balancing and the best practices they use to deliver smart, secure, modern services across the globe.
When: July 25th, 2018 12:35pm


VPC Deep Dive and Best Practices
Speakers: Emanuele Mazza, Networking Product Specialist, Google, Neha Pattan, Software Engineer, Google and Kamal Congevaram Muralidharan, Senior Member Technical Staff, PayPal
This session will walk you through the unique operational advantages of GCP VPC for your enterprise cloud deployments. We’ll go through detailed use cases, how to seal and audit your VPC, how to extend your VPC to on-prem in hybrid scenarios, and how to deploy highly available services.
When: July 26th, 2018 9:00am


Hybrid Connectivity - Reliably Extending Your Enterprise Network to GCP
Speaker: John Veizades, Product Manager, Google Cloud
In this session, you will learn how to connect to GCP with highly reliable and secure networking to support extending your data center networks into the cloud. We will cover details of resilient routing techniques, access to Google APIs from on-premises networks, connection locations, and partners that support connectivity to GCP -- all designed to support mission-critical network connectivity to GCP.
When: July 26th, 2018 11:40am


Be sure to reserve your spot in these sessions today—space is filling up!

Our Los Angeles cloud region is open for business



Hey, LA — the day has arrived! The Los Angeles Google Cloud Platform region is officially open for business. You can now store data and build highly available, performant applications in Southern California.

Los Angeles Mayor Eric Garcetti said it best: “Los Angeles is a global hub for fashion, music, entertainment, aerospace, and more—and technology is essential to strengthening our status as a center of invention and creativity. We are excited that Google Cloud has chosen Los Angeles to provide infrastructure and technology solutions to our businesses and entrepreneurs.”

The LA cloud region, us-west2, is our seventeenth overall and our fifth in the United States.

Hosting applications in the new region can significantly improve latency for end users in Southern California, and by up to 80% for users across Northern California and the Southwest, compared to hosting them in the previously closest region, Oregon. You can visit www.gcping.com to see how fast the LA region is for you.

Services


The LA region has everything you need to build the next great application:

Of note, the LA region debuted with one of our newest products: Cloud Filestore (beta), our managed file storage service for applications that require a filesystem interface and a shared filesystem for data.

The region also has three zones, allowing you to distribute apps and storage across multiple zones to protect against service disruptions. You can also access our multi-regional services (such as BigQuery) in the United States and all the other GCP services via our Google Network, and combine any of the services you deploy in LA with other GCP services around the world. Please visit our Service Specific Terms for detailed information on our data storage capabilities.

Google Cloud Network

Google Cloud’s global networking infrastructure is the largest cloud network as measured by number of points of presence. This private network provides a high-bandwidth, highly reliable, low-latency link to each region across the world. With it, you can reach the LA region as easily as any region. In addition, the global Google Cloud Load Balancing makes it easy to deploy truly global applications.

Also, if you’d like to connect to the Los Angeles region privately, we offer Dedicated Interconnect at two locations: Equinix LA1 and CoreSite LA1.

LA region celebration

We celebrated the launch of the LA cloud region the best way we know how: with our customers. At the celebration, we announced new services to help content creators take advantage of the cloud: Filestore, Transfer Appliance and, of course, the new region itself, in the heart of media and entertainment country. The region’s proximity to content creators is critical for cloud-based visual effects and animation workloads. With proximity comes low latency, which lets you treat the cloud as if it were part of your on-premises infrastructure—or even migrate your entire studio to the cloud.
Paul-Henri Ferrand, President of Global Customer Operations, officially announces the opening of our Los Angeles cloud region.


What customers are saying


“Google Cloud makes the City of Los Angeles run more smoothly and efficiently to better serve Angelenos city-wide. We are very excited to have a cloud region of our own that enables businesses, big or small, to leverage the latest cloud technology and foster innovation.”
- Ted Ross, General Manager and Chief Information Officer for City of LA Information Technology Agency, City of LA

“Using Google Cloud for visual effects rendering enables our team to be fast, flexible and to work on multiple large projects simultaneously without fear of resource starvation. Cloud is at the heart of our IT strategy and Google provides us with the rendering power to create Oscar-winning graphics in post-production work.”
- Steve MacPherson, Chief Technology Officer, Framestore

“A lot of our short form projects pop up unexpectedly, so having extra capacity in region can help us quickly capitalize on these opportunities. The extra speed the LA region gives us will help us free up our artists to do more creative work. We’re also expanding internationally, and hiring more artists abroad, and we’ve found that Google Cloud has the best combination of global reach, high performance and cost to help us achieve our ambitions.”
- Tom Taylor, Head of Engineering, The Mill

What SoCal partners are saying


Our partners are available to help design and support your deployment, migration and maintenance needs.

“Cloud and data are the new equalizers, transforming the way organizations are built, work and create value. Our premier partnership with Google Cloud Platform enables us to help our clients digitally transform through efforts like app modernization, data analytics, ML and AI. Google’s new LA cloud region will enhance the deliverability of these solutions and help us better service the LA and Orange County markets - a destination where Neudesic has chosen to place its corporate home.”
- Tim Marshall, CTO and Co-Founder, Neudesic

“Enterprises everywhere are on a journey to harness the power of cloud to accelerate business objectives, implement disruptive features, and drive down costs. The Taos and Google Cloud partnership helps companies innovate and scale, and we are excited for the new Google Cloud LA region. The data center will bring a whole new level of uptime and service to our Southern California team and clients.”
- Hamilton Yu, President and COO, Taos

“As a launch partner for Google Cloud and multi-year recipient of Google’s Partner of the Year award, we are thrilled to have Google’s new cloud region in Los Angeles, our home base and where we have a strong customer footprint. SADA Systems has a track record of delivering industry expertise and innovative technical services to customers nationwide. We are excited to leverage the scale and power of Google Cloud along with SADA’s expertise for our clients in the Los Angeles area to continue their cloud transformation journey.”
- Tony Safoian, CEO & President, SADA Systems

Getting started


For additional details on the LA region, please visit our LA region page where you’ll get access to free resources, whitepapers, the "Cloud On-Air" on-demand video series and more. Our locations page provides updates on the availability of additional services and regions. Contact us to request early access to new regions and help us prioritize where we build next.

GCP arrives in the Nordics with a new region in Finland



Click here for the Finnish version of this post.

Our sixteenth Google Cloud Platform (GCP) region, located in Finland, is now open for you to build applications and store your data.

The new Finland region, europe-north1, joins the Netherlands, Belgium, London, and Frankfurt in Europe and makes it easier to build highly available, performant applications using resources across those geographies.

Hosting applications in the new region can improve latencies by up to 65% for end-users in the Nordics and by up to 88% for end-users in Eastern Europe, compared to hosting them in the previously closest region. You can visit www.gcping.com to see for yourself how fast the Finland region is from your location.

Services


The Nordic region has everything you need to build the next great application, and three zones that allow you to distribute applications and storage across multiple zones to protect against service disruptions.

You can also access our Multi-Regional services in Europe (such as BigQuery) and all the other GCP services via the Google Network, the largest cloud network as measured by number of points of presence. Please visit our Service Specific Terms to get detailed information on our data storage capabilities.

Build sustainably


The new region is located in our existing data center in Hamina. This facility is one of the most advanced and efficient data centers in the Google fleet. Our high-tech cooling system, which uses sea water from the Gulf of Finland, reduces energy use and is the first of its kind anywhere in the world. This means that when you use this region to run your compute workloads, store your data, and develop your applications, you are doing so sustainably.

Hear from our customers


“The road to emission-free and sustainable shipping is a long and challenging one, but thanks to exciting innovation and strong partnerships, Rolls-Royce is well-prepared for the journey. For us being able to train machine learning models to deliver autonomous vessels in the most effective manner is key to success. We see the Google Cloud for Finland launch as a great advantage to speed up our delivery of the project.”
– Karno Tenovuo, Senior Vice President Ship Intelligence, Rolls-Royce

“Being the world's largest producer of renewable diesel refined from waste and residues, as well as being a technologically advanced refiner of high-quality oil products, requires us to take advantage of leading-edge technological possibilities. We have worked together with Google Cloud to accelerate our journey into the digital future. We share the same vision to leave a healthier planet for our children. Running services on an efficient and sustainably operated cloud is important for us. And even better that it is now also available physically in Finland.”
– Tommi Touvila, Chief Information Officer, Neste

“We believe that technology can enhance and improve the lives of billions of people around the world. To do this, we have joined forces with visionary industry leaders such as Google Cloud to provide a platform for our future innovation and growth. We’re seeing tremendous growth in the market for our operations, and it’s essential to select the right platform. The Google Cloud Platform cloud region in Finland stands for innovation.”
– Anssi Rönnemaa, Chief Finance and Commercial Officer, HMD Global

“Digital services are key growth drivers for our renewal of a 108-year old healthcare company. 27% of our revenue is driven by digital channels, where modern technology is essential. We are moving to a container-based architecture running on GCP at Hamina. Google has a unique position to provide services within Finland. We also highly appreciate the security and environmental values of Google’s cloud operations.”
– Kalle Alppi, Chief Information Officer, Mehiläinen

Partners in the Nordics


Our partners in the Nordics are available to help design and support your deployment, migration and maintenance needs.


"Public cloud services like those provided by Google Cloud help businesses of all sizes be more agile in meeting the changing needs of the digital era—from deploying the latest innovations in machine learning to cost savings in their infrastructure. Google Cloud Platform's new Finland region enables this business optimization and acceleration with the help of cloud-native partners like Nordcloud and we believe Nordic companies will appreciate the opportunity to deploy the value to their best benefit.”
– Jan Kritz, Chief Executive Officer, Nordcloud

Nordic partners include: Accenture, Adapty, AppsPeople, Atea, Avalan Solutions, Berge, Cap10, Cloud2, Cloudpoint, Computas, Crayon, DataCenterFinland, DNA, Devoteam, Doberman, Deloitte, Enfo, Evry, Gapps, Greenbird, Human IT Cloud, IIH Nordic, KnowIT, Koivu Solutions, Lamia, Netlight, Nordcloud, Online Partners, Outfox Intelligence AB, Pilvia, Precis Digital, PwC, Quality of Service IT-Support, Qvik, Skye, Softhouse, Solita, Symfoni Next, Soprasteria, Tieto, Unifoss, Vincit, Wizkids, and Webstep.

If you want to learn more or wish to become a partner, visit our partners page.

Getting started


For additional details on the region, please visit our Finland region page where you’ll get access to free resources, whitepapers, the "Cloud On-Air" on-demand video series and more. Our locations page provides updates on the availability of additional services and regions. Contact us to request access to new regions and help us prioritize what we build next.

Behind the scenes with the Dragon Ball Legends GCP backend



Dragon Ball Legends, a new mobile game from Bandai Namco Entertainment (BNE), is based on its popular Dragon Ball Z franchise, and is rolling out to gamers around the world as we speak. But planning the cloud infrastructure to power the game dates back to February 2017, when BNE approached Google Cloud to talk about the interesting challenges they were facing, and how we could help.

Based on their anticipated demand, BNE had three ambitious requirements for their game:
  1. Extreme scalability. The game would be launched globally, so it needed a backend that could scale to millions of players and still perform well.
  2. Global network. Because the game allows real-time player versus player battles, it needs a reliable and low-latency network across regions.
  3. Real-time data analytics. The game is designed to evolve with players in real time, so it was critical to have a data analytics pipeline that streams data to a data warehouse, where the operations team can measure and evaluate how people are playing the game and adjust it on the fly.
All three of these are areas where we have a lot of experience. Google has multiple global services with more than a billion users, and we use the data those services generate to improve them over time. And because Google Cloud Platform (GCP) runs on the same infrastructure as these Google services, GCP customers can take advantage of the same enabling technologies.

Let’s take a look at how BNE worked with Google Cloud to build the infrastructure for Dragon Ball Legends.


Challenge #1: Extreme scalability

MySQL is extensively used by gaming companies in Japan because engineers are used to working with relational databases with schemas, SQL queries and strong consistency. This simplifies things on the application side, which doesn’t have to work around database limitations like eventual consistency or a lack of schema enforcement. MySQL is also widely used outside gaming, and most backend engineers already have strong experience with it.

While MySQL offers many advantages, it has one big limitation: scalability. MySQL is a scale-up database: to increase its performance, you add more CPU, RAM or disk. And when a single instance of MySQL can’t handle the load anymore, you can divide the load by sharding, splitting users into groups and assigning them to multiple independent instances of MySQL. Sharding has a number of drawbacks, however. Most gaming developers calculate the number of shards they’ll need before the game launches, since resharding is labor-intensive and error-prone. As a result, gaming companies tend to overprovision the database to handle more players than they expect. If the game is as popular as expected, everything is fine. But what if the game is a runaway hit and exceeds the anticipated demand? And what about the long tail, representing a gradual decrease in active players? What if it’s an out-and-out flop? MySQL sharding is not dynamically scalable, and adjusting its size requires maintenance and carries risk.
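To make that concrete, here’s a minimal, illustrative Python sketch of hash-based sharding (the shard count and key are our own example, not BNE’s actual scheme). Every user maps deterministically to one shard, and changing the shard count remaps almost every user, which is exactly why resharding is so labor-intensive:

import hashlib

NUM_SHARDS = 16  # chosen before launch; changing it forces a migration

def shard_for(user_id: str) -> int:
    """Deterministically map a user to one of NUM_SHARDS MySQL instances."""
    digest = hashlib.sha1(user_id.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

print(shard_for("player-42"))  # always the same shard for this user...
# ...but raise NUM_SHARDS to 24 and nearly every user lands on a new shard.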

In an ideal world, databases can scale in and out without downtime while offering the advantages of a relational database. When we first heard that BNE was considering MySQL sharding to handle the massive anticipated traffic for Dragon Ball Legends, we suggested they consider Cloud Spanner instead.


Why Cloud Spanner?

Cloud Spanner is a fully managed relational database that offers horizontal scalability and high availability while keeping strong consistency with a schema that is similar to MySQL’s. Better yet, as a managed service, it’s looked after by Google SREs, removing database maintenance and minimizing the risk of downtime. We thought Cloud Spanner would be able to help BNE make their game global.


Evaluation to implementation

Before adopting a new technology, engineers should always test it to confirm its expected performance in a real-world scenario. Before replacing MySQL, BNE created a new Cloud Spanner instance in GCP, including a few tables with a schema similar to what they used in MySQL. Since their backend developers were writing in Scala, they chose the Java client library for Cloud Spanner and wrote sample code to load-test it and see whether it could keep up with their queries-per-second (QPS) requirement for writes—around 30,000 QPS at peak. Working with our customer engineer and the Cloud Spanner engineering team, they met this goal easily. They even developed their own DML (Data Manipulation Language) wrapper to write SQL commands like INSERT, UPDATE and DELETE.
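BNE’s load-test code itself isn’t public (and their backend is Scala), but a minimal sketch of the same kind of write test using the Cloud Spanner Python client library looks like this; the instance, database, table and column names are hypothetical:

from google.cloud import spanner

client = spanner.Client()
instance = client.instance("game-instance")  # hypothetical instance ID
database = instance.database("game-db")      # hypothetical database ID

# Batched writes like this one are the unit you would measure write QPS against.
with database.batch() as batch:
    batch.insert(
        table="players",
        columns=("player_id", "name", "level"),
        values=[("p-0001", "hero", 1)],
    )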


Game release

With the proof of concept behind them, they could start their implementation. Based on the expected daily active users (DAU), BNE calculated how many Cloud Spanner nodes they needed—enough for the 3 million pre-registered players they were expecting. To prepare the release, they organized two closed beta tests to validate their backend, and didn’t have a single issue with the database! In the end, over 3 million participants worldwide pre-registered for Dragon Ball Legends, and even with this huge number, the official game release went flawlessly.

Long story short, BNE can focus on improving the game rather than spending time operating their databases.


Challenge #2: Global network

Let’s now talk about BNE’s second challenge: building a global real-time player-vs-player (PvP) game. BNE’s goal for Dragon Ball Legends was to let all its players play against one another, anywhere in the world. If you know anything about networking, you understand the challenge around latency. Round-trip time (RTT) between Tokyo and San Francisco, for example, is on average around 100 ms. To address that, they decided to divide every game second into 250 ms intervals. So while the game looks like it’s real-time to users, it’s actually a really fast turn-based game at its core (you can read more about the architecture here). And while some might say that 250 ms offers plenty of room for latency, it’s extremely hard to predict latency when communicating across the internet.
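To make the turn-based trick concrete, here’s a tiny illustrative sketch (our own, not BNE’s code) of quantizing wall-clock time into 250 ms turns:

TICK_MS = 250  # one game "turn" every 250 ms, as described above

def turn_number(elapsed_ms: int) -> int:
    """Bucket an action's timestamp into its turn."""
    return elapsed_ms // TICK_MS

# Two actions ~100 ms apart can still land in the same turn, which is
# what makes a ~100 ms Tokyo-San Francisco RTT tolerable:
print(turn_number(1000), turn_number(1100))  # 4 4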


Why Cloud Networking?

Here’s what it looks like for a game client to access the game server on GCP over the internet. Since the number of hops can vary from one connection to the next, PvP play can sometimes feel fast and sometimes slow.

One of the main reasons BNE decided to use GCP for the Dragon Ball Legends backend was Google’s dedicated network. As you can see in the picture below, when using GCP, once the game client reaches one of the hundreds of GCP points of presence (POPs) around the world, it’s on the Google dedicated network. That means no unpredictable hops, and the lowest possible latency.


Taking advantage of the Google Cloud Network

Gaming companies usually implement PvP by connecting two players directly or through a dedicated game server. Combat games that require low latency between players generally prefer P2P communication. When two players are geographically close, P2P works very well, but it’s often unreliable when communicating across regions (some carriers even block P2P protocols). In Dragon Ball Legends, two players first try to communicate via P2P, and if that fails, they fail over to coturn, an open source STUN/TURN server implementation that acts as a relay between them. That way, cross-continent battles leverage the low-latency, reliable Google network as much as possible.


Challenge #3: Real-time data analytics

BNE’s last challenge was around real-time data analytics. BNE wanted to offer the best user experience to their fans, and one of the ways to do that is through live game operations, or LiveOps, in which operators make constant changes to the game so it always feels fresh. But to understand players’ needs, they needed data, usually logs of user actions. And if they could get this data in near real time, they could then decide what changes to apply to the game to increase users’ satisfaction and engagement.

To gather this data, BNE used a combination of Cloud Pub/Sub and Cloud Dataflow to transform user data in real time and insert it into BigQuery.
  • Cloud Pub/Sub offers a globally reliable messaging system that buffers the logs until they can be handled by Cloud Dataflow.
  • Cloud Dataflow is a fully managed parallel processing service that lets you execute ETL in real-time and in parallel.
  • BigQuery is the fully managed data warehouse where all the game logs are stored. Since BigQuery offers petabyte-scale storage, scaling was not a concern. Thanks to heavy parallel processing when querying the logs, BNE can get a response to a query that scans terabytes of data in a few seconds.
This system lets a game producer visualize players’ behavior in near real time and make decisions about what new features to bring to the game, or what to change inside it, to satisfy their fans.
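BNE’s actual pipeline isn’t public, but a minimal streaming sketch of this Pub/Sub-to-Dataflow-to-BigQuery pattern, written with the Apache Beam Python SDK, might look like the following; the project, topic, table and schema names are placeholders:

import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadLogs" >> beam.io.ReadFromPubSub(
            topic="projects/my-game/topics/player-actions")
        | "Parse" >> beam.Map(json.loads)  # one JSON action log per message
        | "Write" >> beam.io.WriteToBigQuery(
            "my-game:analytics.player_actions",
            schema="player_id:STRING,action:STRING,ts:TIMESTAMP")
    )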


Takeaways

Using Cloud Spanner, BNE could focus on developing an amazing game instead of spending time on database capacity planning and scaling. Operations-wise, by using a fully managed, scalable database, they drastically reduced both the risk of human error and their operational overhead.

Using Cloud Networking, they leveraged Google’s dedicated network to offer the best user experience to their fans, even when fighting across regions.

And finally, using Google’s analytics stack (Cloud PubSub, Cloud Dataflow and BigQuery), BNE was able to analyze players’ behaviors in near real-time and make decisions about how to adjust the game to make their fans even happier!

If you want to hear more details about how they evaluated and adopted Cloud Spanner for their game, please join them at their Google Cloud NEXT’18 session in San Francisco.

Introducing QUIC support for HTTPS load balancing



For four years now, Google has been using QUIC, a UDP-based encrypted transport protocol optimized for HTTPS, to deliver traffic for our products – from Google Web Search, to YouTube, to this very blog. If you’re reading this in Chrome, you’re probably using QUIC right now. QUIC makes the web faster, particularly for slow connections, and now your cloud services can enjoy that speed: today, we’re happy to be the first major public cloud to offer QUIC support for our HTTPS load balancers.

QUIC’s key features include faster connection establishment, stream-based multiplexing, improved loss recovery, and no head-of-line blocking. QUIC is designed with mobility in mind, and supports migrating connections from Wi-Fi to cellular and back.

Benefits of QUIC


If your service is sensitive to latency, QUIC will make it faster because of the way it establishes connections. When a web client uses TCP and TLS, it requires two to three round trips with a server to establish a secure connection before the browser can send a request. With QUIC, if a client has talked to a given server before, it can start sending data without any round trips, so your web pages will load faster. How much faster? On a well-optimized site like Google Search, connections are often pre-established, so QUIC’s faster connections can only speed up some requests—but QUIC still improves mean page load time by 8% globally, and up to 13% in regions where latency is higher.

Cedexis benchmarked our Cloud CDN performance using a Google Cloud project. Here’s what happened when we enabled QUIC.

Encryption is built into QUIC, using AEAD algorithms such as AES-GCM and ChaCha20 for both privacy and integrity. QUIC authenticates the parts of its headers that it doesn’t encrypt, so attackers can’t modify any part of a message.

Like HTTP/2, QUIC multiplexes multiple streams into one connection, so that a connection can serve several HTTP requests simultaneously. But HTTP/2 uses TCP as its transport, so all of its streams can be blocked when a single TCP packet is lost—a problem called head-of-line blocking. QUIC is different: Loss of a UDP packet within a QUIC connection only affects the streams contained within that packet. In other words, QUIC won’t let a problem with one request slow the others down, even on an unreliable connection.

Enabling QUIC

You can enable QUIC in your load balancer with a single setting in the GCP Console. Just edit the frontend configuration for your load balancer and enable QUIC negotiation for the IP and port you want to use, and you’re done.

You can also enable QUIC using gcloud:
gcloud compute target-https-proxies update proxy-name \
    --quic-override=ENABLE
Once you’ve enabled QUIC, your load balancer negotiates QUIC with clients that support it, like Google Chrome and Chromium. Clients that do not support QUIC continue to use HTTPS seamlessly. If you distribute your own mobile client, you can integrate Cronet to gain QUIC support. The load balancer translates QUIC to HTTP/1.1 for your backend servers, just like traffic with any other protocol, so you don’t need to make any changes to your backends—all you need to do is enable QUIC in your load balancer.

The Future of QUIC

We’re working to help QUIC become a standard for web communication, just as we did with HTTP/2. The IETF formed a QUIC working group in November 2016, which has seen intense engagement from IETF participants, and is scheduled to complete v1 drafts this November. QUIC v1 will support HTTP over QUIC, use TLS 1.3 as the cryptographic handshake, and support migration of client connections. At the working group’s most recent interop event, participants presented over ten independent implementations.

QUIC is designed to evolve over time. A client and server can negotiate which version of QUIC to use, and as the IETF QUIC specifications have become more stable and members have reached clear consensus on key decisions, we’ve used that version negotiation to keep pace with the current IETF drafts. Future planned versions will also include features such as partial reliability, multipath, and support for non-HTTP applications like WebRTC.

QUIC works across changing network connections. QUIC can migrate client connections between cellular and Wi-Fi networks, so requests don’t time out and fail when the current network degrades. This migration reduces the number of failed requests and decreases tail latency, and our developers are working on making it even better. QUIC client connection migration will soon be available in Cronet.

Try it out today

Read more about QUIC in the HTTPS load balancing documentation and enable it for your project(s) by editing your HTTP(S) load balancer settings. We look forward to your feedback!

Google Cloud using P4Runtime to build smart networks



Data networks are difficult to design, build and manage, and often don’t work as well as we would like. Here at Google, we deploy and use a lot of network capacity in and between data centers to deliver our portfolio of services, and the costs and burdens of deploying and managing these networks have only grown with the scale and complexity of the networks themselves. Almost ten years ago, we took steps to address this by adopting software-defined networking (SDN) as the basis for our network architecture. SDN allowed us to program our networks with software running on standard servers and became a fundamental component of our largest systems. In that time, we’ve continued to develop and improve our SDN technology, and now it’s time to take the next step with P4Runtime.

We are excited to announce our collaboration with the Open Networking Foundation (ONF) on Stratum, an open source project to implement an open reference platform for a truly "software-defined" data plane, designed and built around P4Runtime from the beginning. P4Runtime allows the SDN control plane to establish a contract with the data plane about forwarding behavior, and then to program that behavior through simple RPCs. As part of the project, we’re working with network vendors to make this functionality available in networking products across the industry. As a small-but-complete SDN embedded software solution, Stratum will help bring P4Runtime to a variety of network devices.

But just what is it about P4Runtime that helps with the challenges of building large-scale and reliable networks? Network hardware is typically closed, runs proprietary software and is complex, thanks to the need to operate autonomously and run legacy protocols. Modern data centers and wide-area networks are large, must be fast and simple and are often built using commodity network switch chips interconnected into a large fabric. And despite high-quality whitebox switches and open SDN technology such as OpenFlow, there still aren’t a lot of good, portable options on the market to build these networks.

At Google, we designed our own hardware switches and switch software, but our goal has always been to leverage industry SDN solutions that interoperate with our data centers and wide-area networks. P4Runtime is a new way for control plane software to program the forwarding path of a switch and provides a well-defined API to specify the switch forwarding pipelines, as well as to configure these pipelines via simple RPCs. P4Runtime can be used to control any forwarding plane, from a fixed-function ASIC to a fully programmable network switch.

Google Cloud is looking to P4Runtime as the foundation for our next generation of data centers and wide area network control-plane programming, to drive industry adoption and to enable others to benefit from it. With P4Runtime we’ll be able to continue to build the larger, higher performance and smarter networks that you’ve come to expect.

Three ways to configure robust firewall rules



If you administer firewall rules for Google Cloud VPCs, you want to ensure that the firewall rules you create can only be associated with the correct VM instances by developers in your organization. Without that assurance, it can be difficult to manage access to sensitive content hosted on VMs in your VPCs or to allow those instances access to the internet, and you must carefully audit and monitor the instances to ensure that unintentional access is not granted through the use of tags. With Google VPC, there are now multiple ways to help achieve the required level of control, which we’ll describe here in detail.

As an example, imagine you want to create a firewall rule to restrict access to sensitive user billing information in a data store running on a set of VMs in your VPC. Further, you’d like to ensure that developers who can create VMs for applications other than the billing frontend cannot enable these VMs to be governed by firewall rules created to allow access to billing data.
Example topology of a VPC setup requiring secure firewall access.
The traditional approach here is to attach tags to VMs and create a firewall rule that allows access to specific tags, e.g., in the above example you could create a firewall rule that allows all VMs with the billing-frontend tag access to all VMs with the tag billing-data. The drawback of this approach is that any developer with Compute InstanceAdmin role for the project can now attach billing-frontend as a tag to their VM, and thus unintentionally gain access to sensitive data.

Configuring Firewall rules with Service Accounts


With the general availability of firewall rules using service accounts, instead of using tags, you can block developers from enabling a firewall rule on their instances unless they have access to the appropriate centrally managed service accounts. Service accounts are special Google accounts that belong to your application or service running on a VM and can be used to authenticate the application or service for resources it needs to access. In the above example, you can create a firewall rule to allow access to the billing-data@ service account only if the originating source service account of the traffic is billing-frontend@.
Firewall setup using source and target service accounts. (Service accounts names are abbreviated for simplicity.)
You can create this firewall rule using the following gcloud command:
gcloud compute firewall-rules create secure-billing-data \
    --network web-network \
    --allow TCP:443 \
    --source-service-accounts billing-frontend@web.iam.gserviceaccount.com \
    --target-service-accounts billing-data@web.iam.gserviceaccount.com
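If you manage rules programmatically, here’s a minimal sketch of the same rule created through the Compute Engine API with the google-api-python-client library; the project ID is a placeholder, and the sourceServiceAccounts and targetServiceAccounts fields mirror the gcloud flags above:

from googleapiclient import discovery

compute = discovery.build("compute", "v1")

firewall_body = {
    "name": "secure-billing-data",
    "network": "global/networks/web-network",
    "allowed": [{"IPProtocol": "tcp", "ports": ["443"]}],
    "sourceServiceAccounts": ["billing-frontend@web.iam.gserviceaccount.com"],
    "targetServiceAccounts": ["billing-data@web.iam.gserviceaccount.com"],
}

# Creates the same rule as the gcloud command above ("my-project" is a placeholder).
compute.firewalls().insert(project="my-project", body=firewall_body).execute()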
If, in the above example, the billing frontend and billing data applications are autoscaled, you can specify the service accounts for the corresponding applications in the InstanceTemplate configured for creating the VMs.

The advantage of using this approach is that once you set it up, the firewall rules may remain unchanged despite changes in underlying IAM permissions. However, you can currently only associate one service account with a VM and to change this service account, the instance must be in a stopped state.

Creating custom IAM role for InstanceAdmin


If you want the flexibility of tags but the limitations of service accounts are a concern, you can create a custom role with more restricted permissions that removes the ability to set tags on VMs; do this by omitting the compute.instances.setTags permission. This custom role can have the other permissions present in the InstanceAdmin role and can then be assigned to developers in the organization. With this custom role, you can create your firewall rules using tags:
gcloud compute firewall-rules create secure-billing-data \
    --network web-network \
    --allow TCP:443 \
    --source-tags billing-frontend \
    --target-tags billing-data
Note, however, that permissions assigned to a custom role are static in nature and must be updated with any new permissions that might be added to the InstanceAdmin role, as and when required.

Using subnetworks to partition workloads


You can also create firewall rules using source and destination IP CIDR ranges if the workloads can be partitioned into subnetworks of distinct ranges as shown in the example diagram below.
Firewall setup using source and destination ranges.
To restrict developers’ ability to create VMs in these subnetworks, you can grant the Compute Network User role selectively to developers on specific subnetworks, or use Shared VPC.

Here’s how to configure a firewall rule with source and destination ranges using gcloud:
gcloud compute firewall-rules create secure-billing-data \
    --network web-network \
    --allow TCP:443 \
    --source-ranges 10.20.0.0/16 \
    --destination-ranges 10.30.0.0/16
This method allows for better scalability with large VPCs and allows for changes in the underlying VMs as long as the network topology remains unchanged. Note, however, that if a VM instance has can_ip_forward enabled, it may send traffic using the above source range and thus gain access to sensitive workloads.

As you can see, there’s a lot to consider when configuring firewall rules for your VPCs. We hope these tips help you configure firewall rules in a more secure and efficient manner. To learn more about configuring firewall rules, check out the documentation.

Simplify Cloud VPC firewall management with service accounts



Firewalls provide the first line of network defense for any infrastructure. On Google Cloud Platform (GCP), Google Cloud VPC firewalls do just that—controlling network access to and between all the instances in your VPC. Firewall rules determine who's allowed to talk to whom and more importantly who isn’t. Today, configuring and maintaining IP-based firewall rules is a complex and manual process that can lead to unauthorized access if done incorrectly. That’s why we’re excited to announce a powerful new management feature for Cloud VPC firewall management: support for service accounts.

If you run a complex application on GCP, you’re probably already familiar with service accounts in Cloud Identity and Access Management (IAM) that provide an identity to applications running on virtual machine instances. Service accounts simplify the application management lifecycle by providing mechanisms to manage authentication and authorization of applications. They provide a flexible yet secure mechanism to group virtual machine instances with similar applications and functions with a common identity. Security and access control can subsequently be enforced at the service account level.


Using service accounts, when a cloud-based application scales up or down, new VMs are automatically created from an instance template and assigned the correct service account identity. This way, when a VM boots up, it gets the right set of permissions and lands within the relevant subnet, so firewall rules are automatically configured and applied.

Further, the ability to use Cloud IAM ACLs with service accounts allows application managers to express their firewall rules in the form of intent, for example, allow my “application x” servers to access my “database y.” This removes the need to manually manage server IP address lists while simultaneously reducing the likelihood of human error.
This process is leaps-and-bounds simpler and more manageable than maintaining IP address-based firewall rules, which can neither be automated nor templated for transient VMs with any semblance of ease.

Here at Google Cloud, we want you to deploy applications with the right access controls and permissions, right out of the gate. Click here to learn how to enable service accounts. And to learn more about Cloud IAM and service accounts, visit our documentation for using service accounts with firewalls.

What a year! Google Cloud Platform in 2017



The end of the year is a time for reflection . . . and making lists. As 2017 comes to a close, we thought we’d review some of the most memorable Google Cloud Platform (GCP) product announcements, white papers and how-tos, as judged by popularity with our readership.

As we pulled the data for this post, some definite themes emerged about your interests when it comes to GCP:
  1. You love to hear about advanced infrastructure: CPUs, GPUs, TPUs, better network plumbing and more regions.
  2. How we harden our infrastructure is endlessly interesting to you, as are tips about how to use our security services.
  3. Open source is always a crowd-pleaser, particularly if it presents a cloud-native solution to an age-old problem.
  4. You’re inspired by Google innovation — unique technologies that we developed to address internal, Google-scale problems.
So, without further ado, we present to you the most-read stories of 2017.

Cutting-edge infrastructure

If you subscribe to the “bigger is always better” theory of cloud infrastructure, then you were a happy camper this year. Early in 2017, we announced that GCP would be the first cloud provider to offer Intel Skylake architecture, GPUs for Compute Engine and Cloud Machine Learning became generally available, and Shazam talked about why cloud GPUs made sense for them. In the spring, you devoured a piece on the performance of TPUs, and another about the then-largest cloud-based compute cluster. We announced yet more new GPU models, and, topping it all off, Compute Engine began offering machine types with a whopping 96 vCPUs and 624GB of memory.

It wasn’t just our chip offerings that grabbed your attention — you were pretty jazzed about Google Cloud network infrastructure too. You read deep dives about Espresso, our peering-edge architecture, TCP BBR congestion control and improved Compute Engine latency with Andromeda 2.1. You also dug stories about new networking features: Dedicated Interconnect, Network Service Tiers and GCP’s unique take on sneakernet: Transfer Appliance.

What’s the use of great infrastructure without somewhere to put it? 2017 was also a year of major geographic expansion. We started out the year with six regions, and ended it with 13, adding Northern Virginia, Singapore, Sydney, London, Germany, São Paulo and Mumbai. This was also the year that we shed our Earthly shackles, and expanded to Mars ;)

Security above all


Google has historically gone to great lengths to secure our infrastructure, and this was the year we discussed some of those advanced techniques in our popular Security in plaintext series. Among them: 7 ways we harden our KVM hypervisor, Fuzzing PCI Express and Titan in depth.

You also grooved on new GCP security services: Cloud Key Management and managed SSL certificates for App Engine applications. Finally, you took heart in a white paper on how to implement BeyondCorp as a more secure alternative to VPN, and support for the European GDPR data protection laws across GCP.

Open, hybrid development


When you think about GCP and open source, Kubernetes springs to mind. We open-sourced the container management platform back in 2014, but this year we showed that GCP is an optimal place to run it. It’s consistently among the first cloud services to run the latest version (most recently, Kubernetes 1.8) and comes with advanced management features out of the box. And as of this fall, it’s certified as a conformant Kubernetes distribution, complete with a new name: Google Kubernetes Engine.

Part of Kubernetes’ draw is as a platform-agnostic stepping stone to the cloud. Accordingly, many of you flocked to stories about Kubernetes and containers in hybrid scenarios. Think Pivotal Container Service and Kubernetes’ role in our new partnership with Cisco. The developers among you were smitten with Cloud Container Builder, a stand-alone tool for building container images, regardless of where you deploy them.

But our open source efforts aren’t limited to Kubernetes — we also made significant contributions to Spinnaker 1.0, and helped launch the Istio and Grafeas projects. You ate up our "Partnering on open source" series, featuring the likes of HashiCorp, Chef, Ansible and Puppet. Availability-minded developers loved our Customer Reliability Engineering (CRE) team’s missive on release canaries, and with API design: Choosing between names and identifiers in URLs, our Apigee team showed them a nifty way to have their proverbial cake and eat it too.

Google innovation


In distributed database circles, Google’s Spanner is legendary, so many of you were delighted when we announced Cloud Spanner, along with a discussion of how it defies the CAP Theorem. Having a scalable database that offers strong consistency and great performance seemed to really change your conception of what’s possible — as did Cloud IoT Core, our platform for connecting and managing “things” at scale. CREs, meanwhile, showed you the Google way to handle an incident.

2017 was also the year machine learning became accessible. For those of you with large datasets, we showed you how to use Cloud Dataprep, Dataflow, and BigQuery to clean up and organize unstructured data. It turns out you don’t need a PhD to learn to use TensorFlow, and for visual learners, we explained how to visualize a variety of neural net architectures with TensorFlow Playground. One Google Developer Advocate even taught his middle-school son TensorFlow and basic linear algebra, as applied to a game of rock-paper-scissors.

Natural language processing also became a mainstay of machine learning-based applications; here, we highlighted it with a lighthearted and relatable example. We launched the Video Intelligence API and showed how Cloud Machine Learning Engine simplifies the process of training a custom object detector. And the makers among you really went for a post that shows you how to add machine learning to your IoT projects with the Google AIY Voice Kit. Talk about accessible!

Lastly, we want to thank all our customers, partners and readers for your continued loyalty and support this year, and wish you a peaceful, joyful, holiday season. And be sure to rest up and visit us again Next year. Because if you thought we had a lot to say in 2017, well, hold onto your hats.

One year of Cloud Performance Atlas



In March of this year, we kicked off a new content initiative called Cloud Performance Atlas, where we highlight best practices for GCP performance, and how to solve the most common performance issues that cloud developers come across.

Here are the top topics from 2017 that developers found most useful.


5. The bandwidth delay problem


Every now and again, I’ll get a question from a company that recently upgraded its connection bandwidth from its on-premises systems to Google Cloud but, for some reason, isn’t getting any better performance as a result. The issue, as we’ve seen multiple times, usually resides in an area of TCP known as the “bandwidth-delay problem.”

The TCP algorithm works by transferring data in packets between two endpoints: a packet is sent, and an acknowledgement packet comes back. To get maximum performance from this process, the connection between the two endpoints has to be tuned so that neither the sender nor the receiver is waiting around for acknowledgements of prior packets.

The most common way to address this problem is to adjust the TCP window size to match the bandwidth-delay product of the connection (bandwidth multiplied by round-trip time). This allows both sides to continue sending data until an ACK arrives back for an earlier packet, thereby creating no gaps and achieving maximum throughput. A low window size, by contrast, will limit your connection throughput, regardless of the available or advertised bandwidth between instances.
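To see why the window size matters so much, here’s a quick back-of-the-envelope calculation in Python; the link speed and RTT are illustrative figures, not a benchmark:

bandwidth_bps = 1_000_000_000  # a 1 Gbps link
rtt_s = 0.030                  # 30 ms round-trip time

# Bandwidth-delay product: how much data must be "in flight" to fill the link.
bdp_bytes = bandwidth_bps / 8 * rtt_s
print(f"window needed: {bdp_bytes / 1_048_576:.2f} MiB")  # ~3.58 MiB

# With a 64 KiB window, the sender stalls waiting for ACKs:
capped_bps = 64 * 1024 * 8 / rtt_s
print(f"throughput at 64 KiB window: {capped_bps / 1e6:.1f} Mbps")  # ~17.5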

Find out more by checking out the video, or article!

4. Improving CDN performance with custom keys


Google Cloud boasts an extremely powerful CDN that can leverage points-of-presence around the globe to get your data to users as fast as possible.

When setting up Cloud CDN for your site, one of the most important things is to ensure that you’re using the right custom cache keys to configure which assets get cached and which don’t. In most cases this isn’t an issue, but if you run a large site with content reused across protocols (e.g., HTTP and HTTPS), you can run into a problem where your cache-fill costs increase more than expected.

You can see how we helped a sports website get their CDN keys just right in the video, and article.


3. Google Cloud Storage and the sequential filename challenge


Google Cloud Storage is a one-stop-shop for all your content serving needs. However, one developer continued to run into a problem of slow upload speeds when pushing their content into the cloud.

The issue was that Cloud Storage uses the file path and name of the files being uploaded to segment and shard the connections across multiple frontends (improving performance). As we found out, if those file names are sequential, you can end up in a situation where multiple connections get squashed down to a single upload thread (thus hurting performance)!

As shown in the video and article, we were able to help a nursery camera company get past this issue with a few small fixes.
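One of the standard fixes, sketched below in Python as a general technique (not the company’s exact change), is to prepend a short hash to each object name so that sequential names spread across shards:

import hashlib

def sharded_name(path: str) -> str:
    """Prefix a hash of the path so uploads fan out across frontends."""
    prefix = hashlib.md5(path.encode()).hexdigest()[:6]
    return f"{prefix}/{path}"

# Sequential camera uploads like these would otherwise pile onto one shard:
for name in ("cam01/000001.jpg", "cam01/000002.jpg"):
    print(sharded_name(name))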

2. Improving Compute Engine boot time with custom images


Any cloud-based service needs to grow and shrink its resource allocations to respond to traffic load. Most of the time, this is a good thing, especially during the holiday season. ;) As traffic increases to your service/application, your backends will need to spin up more Compute Engine VMs to provide a consistent experience to your users.

However, if it takes too long for your VMs to start up, the quality and performance of the experience for your users can be negatively impacted, especially if your VM needs to do a lot of work in its startup script, like compiling code or installing large packages.

As we showed in the video (and article), you can pre-bake a lot of that work into a custom boot-disk image. When your VMs start, they simply boot from the custom image (with everything already installed), rather than doing everything from scratch.

If you’re looking to improve your GCE boot performance, custom images are worth checking out!

1. App Engine boot time


Modern managed languages (Java, Python, JavaScript, etc.) typically have a runtime dependency step that occurs at the init phase of the program, when code is imported and instantiated.

Before execution can begin, any global data, functions or state information are also set up. Most of the time, these systems are global in scope, since they need to be used by so many subsystems (for example, a logging system).

In the case of App Engine, this global initialization work can delay start time, since it must complete before a request can be serviced. And as we showed in the video and article, as your application responds to spikes in workload, this type of global-variable contention can put a hurt on your request response times.
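A common mitigation, shown here as a minimal sketch rather than the article’s specific example, is to defer expensive global setup until the first request that needs it, so instance startup stays fast:

import time

_heavy_resource = None  # module-global, but not built at import time

def _expensive_setup():
    time.sleep(2)  # stand-in for loading data, compiling templates, etc.
    return {"ready": True}

def get_resource():
    """Lazily initialize on first use instead of during instance startup."""
    global _heavy_resource
    if _heavy_resource is None:
        _heavy_resource = _expensive_setup()
    return _heavy_resource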


See you soon!


For the rest of 2017 our Cloud Performance team is enjoying a few hot cups of tea, relaxing with the holidays and counting down the days until the new year. In 2018, we’ve got a lot of awesome new topics to cover, including increased networking performance, Cloud Functions and Cloud Spanner!

Until then, make sure you check out the Cloud Performance Atlas videos on YouTube or our article series on Medium.

Thanks again for a great year everyone, and remember, every millisecond counts!