Category Archives: Google Cloud Platform Blog

Product updates, customer stories, and tips and tricks on Google Cloud Platform

How to do serverless pixel tracking with GCP



Whether they’re opening a newsletter or visiting a shopping cart page, how users interact with web content is of great interest to publishers. One way to understand user behavior is with pixels: small 1x1 transparent images embedded into a web property. When loaded, the pixel calls a web server, which records the request parameters passed in the URL so they can be processed later.

Adding a pixel is easy, but hosting it and processing the request can be challenging for various reasons:
  • You need to set up, manage and monitor your ad servers
  • Users are usually global, which means that you need ad servers around the world
  • User visits are spiky, so pixel servers must scale up to sustain the load and scale down to limit the spend.
Google Cloud Platform (GCP) services such as Container Engine and managed autoscaled instance groups can help with those challenges. But at Google Cloud, we think companies should avoid managing infrastructure whenever possible.

For example, we recently worked with GCP partner and professional services firm DoiT International to build a pixel tracking platform that relieves the administrator from setting up or managing any servers. Instead, this serverless pixel tracking solution leverages managed GCP services, including:
  • Google Cloud Storage: A global or regional object store that offers several storage classes, such as Standard, Nearline and Coldline, with different prices and SLAs depending on your needs. In our case, we used Standard, which offers low, millisecond-level latency
  • Google HTTP(S) Load Balancer: A global anycast IP load balancer service that can scale to millions of QPS, with integrated logging. It can also be combined with Cloud CDN, which caches pixels closer to users at Google's edge locations and avoids unnecessary requests to Cloud Storage
  • BigQuery: Google's fully managed, petabyte-scale, low-cost enterprise data warehouse for analytics
  • Stackdriver Logging: A logging system that allows you to store, search, analyze, monitor and alert on log data and events from GCP and Amazon Web Services (AWS). It supports Google load balancers and can export data to Cloud Storage, BigQuery or Pub/Sub
Tracking pixels with these services works as follows:
  1. A client calls a pixel URL that's served directly by Cloud Storage.
  2. A Google Cloud Load Balancer in front of Cloud Storage records the request to Stackdriver Logging, whether there was a cache hit or not.
  3. Stackdriver Logging exports each request to BigQuery as it comes in; BigQuery acts as the storage and querying engine for ad-hoc analytics that help business analysts better understand their users (see the example query below).
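For example, once the export is flowing, a business analyst can query the data directly. Here's a minimal sketch, assuming the Stackdriver Logging export lands the load balancer request logs in a hypothetical BigQuery dataset (my-project.pixel_tracking.requests_*) and that the pixel URL carries a "page" query parameter; adjust the names to match your own export.

# A minimal sketch: query the exported load balancer logs in BigQuery.
# The project, dataset, table and URL parameter names below are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

# Count today's pixel hits per page, extracting the "page" parameter
# from the request URL recorded by the load balancer.
sql = """
SELECT
  REGEXP_EXTRACT(httpRequest.requestUrl, r'[?&]page=([^&]*)') AS page,
  COUNT(*) AS hits
FROM `my-project.pixel_tracking.requests_*`
WHERE _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d', CURRENT_DATE())
GROUP BY page
ORDER BY hits DESC
LIMIT 10
"""

for row in client.query(sql).result():  # runs the query and waits for the rows
    print(row.page, row.hits)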


All these services are fully managed and do not require you to set up any instances or VMs.

Going forward, we look forward to building more serverless solutions on top of GCP managed offerings. Let us know in the comments if there's a solution you'd like us to build!

Introducing Google Cloud IoT Core: for securely connecting and managing IoT devices at scale



Today we're announcing a new fully-managed Google Cloud Platform (GCP) service called Google Cloud IoT Core. Cloud IoT Core makes it easy for you to securely connect your globally distributed devices to GCP, centrally manage them and build rich applications by integrating with our data analytics services. Furthermore, all data ingestion, scalability, availability and performance needs are automatically managed for you in GCP style.

When used as part of a broader Google Cloud IoT solution, Cloud IoT Core gives you access to new operational insights that can help your business react to, and optimize for, change in real time. This advantage has value across multiple industries; for example:
  • Utilities can monitor, analyze and predict consumer energy usage in real time
  • Transportation and logistics firms can proactively stage the right vehicles/vessels/aircraft in the right places at the right times
  • Oil and gas and manufacturing companies can enable intelligent scheduling of equipment maintenance to maximize production and minimize downtime

So, why is this the right time for Cloud IoT Core?


About all the things


Many enterprises that rely on industrial devices such as sensors, conveyor belts, farming equipment, medical equipment and pumps (particularly globally distributed ones) are struggling to monitor and manage those devices for several reasons:
  • Operational cost and complexity: The overhead of managing the deployment, maintenance and upgrades for exponentially more devices is stifling. And even with a custom solution in place, the resource investments required for necessary IT infrastructure are significant.
  • Patchwork security: Ensuring world-class, end-to-end security for globally distributed devices is out of reach or at least not a core competency for most organizations.
  • Data fragmentation: Despite the fact that machine-generated data is now an important data source for making good business decisions, the massive amount of data generated by these devices is often stored in silos with a short expiration date, and hence never reaches downstream analytic systems (nor decision makers).
Cloud IoT Core is designed to help resolve these problems by removing risk, complexity and data silos from the device monitoring and management process. Instead, it offers you the ability to more securely connect and manage all your devices as a single global system. Through a single pane of glass you can ingest data generated by all those devices into a responsive data pipeline and, when combined with other Cloud IoT services, analyze and react to that data in real time.

Key features and benefits


Several key Cloud IoT Core features help you meet these goals, including:

  • Fast and easy setup and management: Cloud IoT Core lets you connect up to millions of globally dispersed devices into a single system, with smooth and even data ingestion under any conditions. Devices are registered to your service quickly and easily via the industry-standard MQTT protocol (see the connection sketch after this list). For Android Things-based devices, firmware updates can be automatic.
  • Security out-of-the-box: Secure all device data via industry-standard security protocols. (Combine Cloud IoT Core with Android Things for device operating-system security, as well.) Apply Google Cloud IAM roles to devices to control user access in a fine-grained way.
  • Native integration with analytic services: Ingest all your IoT data so you can manage it as a single system and then easily connect it to our native analytic services (including Google Cloud Dataflow, Google BigQuery and Google Cloud Machine Learning Engine) and partner BI solutions (such as Looker, Qlik, Tableau and Zoomdata). Pinpoint potential problems and uncover solutions using interactive data visualizations, or build rich machine-learning models that reflect how your business works.
  • Auto-managed infrastructure: All this in the form of a fully-managed, pay-as-you-go GCP service, with no infrastructure for you to deploy, scale or manage.
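To make the MQTT flow more concrete, here's a minimal device-side sketch using the open-source paho-mqtt and PyJWT libraries; the project, region, registry and device identifiers are hypothetical, and the endpoint and topic conventions shown are assumptions about the MQTT bridge rather than confirmed details of the service.

# A minimal sketch of a device publishing telemetry over MQTT.
# All identifiers below (project, region, registry, device) are hypothetical,
# and the endpoint/topic conventions are assumptions about the MQTT bridge.
import datetime
import ssl
import time

import jwt                       # PyJWT, used to sign the device token
import paho.mqtt.client as mqtt

project_id = 'my-project'
cloud_region = 'us-central1'
registry_id = 'my-registry'
device_id = 'my-device'

# Devices authenticate with a JWT signed by their private key instead of a password.
def create_jwt():
    now = datetime.datetime.utcnow()
    claims = {'iat': now, 'exp': now + datetime.timedelta(minutes=60), 'aud': project_id}
    with open('rsa_private.pem', 'r') as f:
        private_key = f.read()
    return jwt.encode(claims, private_key, algorithm='RS256')

client = mqtt.Client(
    client_id='projects/{}/locations/{}/registries/{}/devices/{}'.format(
        project_id, cloud_region, registry_id, device_id))
client.username_pw_set(username='unused', password=create_jwt())
client.tls_set(ca_certs='roots.pem', tls_version=ssl.PROTOCOL_TLSv1_2)

client.connect('mqtt.googleapis.com', 8883)
client.loop_start()
time.sleep(1)  # give the TLS/MQTT handshake a moment to complete

# Publish a telemetry event; Cloud IoT Core forwards it to the ingestion pipeline.
client.publish('/devices/{}/events'.format(device_id), '{"temperature": 21.5}', qos=1)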
"With Google Cloud IoT Core, we have been able to connect large fleets of bicycles to the cloud and quickly build a smart transportation fleet management tool that provides operators with a real-time view of bicycle utilization, distribution and performance metrics, and it forecasts demand for our customers."
 Jose L. Ugia, VP Engineering, Noa Technologies

Next steps

Cloud IoT Core is currently available as a private beta, and we’re launching with these hardware and software partners:

Cloud IoT Device Partners
Cloud IoT Application Partners

When generally available, Cloud IoT Core will serve as an important, foundational tool for hardware partners and customers alike, offering scalability, flexibility and efficiency for a growing set of IoT use cases. In the meantime, we look forward to your feedback!

Cloud Spanner is now production-ready; let the migrations begin!



Cloud Spanner, the world’s first horizontally-scalable and strongly-consistent relational database service, is now generally available for your mission-critical OLTP applications.

We’ve carefully designed Cloud Spanner to meet customer requirements for enterprise databases — including ANSI 2011 SQL support, ACID transactions, 99.999% availability and strong consistency — without compromising latency. As a combined software/hardware solution that includes atomic clocks and GPS receivers across Google’s global network, Cloud Spanner also offers additional accuracy, reliability and performance in the form of a fully-managed cloud database service. Thanks to this unique combination of qualities, Cloud Spanner is already delivering long-term value for our customers with mission-critical applications in the cloud, including customer authentication systems, business-transaction and inventory-management systems, and high-volume media systems that require low latency and high throughput. For example, Snap uses Cloud Spanner to power part of its search infrastructure.
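To give a flavor of what this looks like from an application, here's a minimal sketch using the Cloud Spanner client library for Python; the instance, database, table and column names are hypothetical.

# A minimal sketch: a strongly consistent read and an atomic write with the
# google-cloud-spanner client library. Instance, database and table names
# below are hypothetical.
from google.cloud import spanner

client = spanner.Client()
instance = client.instance('my-instance')
database = instance.database('my-database')

# Strongly consistent read using a snapshot
with database.snapshot() as snapshot:
    results = snapshot.execute_sql(
        'SELECT CustomerId, Name FROM Customers ORDER BY CustomerId')
    for row in results:
        print(row)

# ACID write: all mutations in the batch commit together or not at all
with database.batch() as batch:
    batch.insert(
        table='Customers',
        columns=('CustomerId', 'Name'),
        values=[(1, 'Alice'), (2, 'Bob')])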

Looking toward migration


In preparation for general availability, we’ve been working closely with our partners to make adoption as smooth and easy as possible. Thus today, we're also announcing our initial data integration partners: Alooma, Informatica and Xplenty.

Now that these partners are in the early stages of Cloud Spanner “lift-and-shift” migration projects for customers, we asked a couple of them to pass along some of their insights about the customer value of Cloud Spanner, as well as any advice about planning for a successful migration:

From Alooma:

"Cloud Spanner is a game-changer because it offers horizontally scalable, strongly consistent, highly available OLTP infrastructure in the cloud for the first time. To accelerate migrations, we recommend that customers replicate their data continuously between the source OLTP database and Cloud Spanner, thereby maintaining both infrastructures in the same state — this allows them to migrate their workloads gradually in a predictable manner."

From Informatica:
“Informatica customers are stretching the limits of latency and data volumes, and need innovative enterprise-scale capabilities to help them outperform their competition. We are excited about Cloud Spanner because it provides a completely new way for our mutual customers to disrupt their markets. For integration, migration and other use cases, we are partnering with Google to help them ingest data into Cloud Spanner and integrate a variety of heterogeneous batch, real-time, and streaming data in a highly scalable, performant and secure way.”

From Xplenty:
"Cloud Spanner is one of those cloud-based technologies for which businesses have been waiting: With its horizontal scalability and ACID compliance, it’s ideal for those who seek the lower TCO of a fully managed cloud-based service without sacrificing the features of a legacy, on-premises database. In our experience with customers migrating to Cloud Spanner, important considerations include accounting for data types, embedded code and schema definitions, as well as understanding Cloud Spanner’s security model to efficiently migrate your current security and access-control implementation."

Next steps


We encourage you to dive into a no-cost trial to experience first-hand the value of a relational database service that offers strong consistency, mission-critical availability and global scale (contact us about multi-regional instances) with no workarounds — and with no infrastructure for you to deploy, scale or manage. (Read more about Spanner’s evolution inside Google in this new paper presented at the SIGMOD ‘17 conference today.) If you like what you see, a growing partner ecosystem is standing by for migration help, and to add further value to Cloud Spanner use cases via data analytics and visualization tooling.

Mapping your organization with the Google Cloud Platform resource hierarchy



As your cloud footprint grows, it becomes harder to answer questions like
"How do I best organize my resources?" "How do I separate departments, teams, environments and applications?" "How do I delegate administrative responsibilities in a way that maintains central visibility?" and "How do I manage billing and cost allocation?"

Google Cloud Platform (GCP) tools like Cloud Identity & Access Management, Cloud Resource Manager, and Organization policies let you tackle these problems in a way that best meets your organization’s requirements.

Specifically, the Organization resource, which represents a company in GCP and is the root of the resource hierarchy, provides centralized visibility and control over all its GCP resources.

Now, we're excited to announce the beta launch of Folders, an additional layer under Organization that provides greater flexibility in arranging GCP resources to match your organizational structure.

"As our work with GCP scaled, we started looking for ways to streamline our projects, Thanks to Cloud Resource Manager, we now centrally control and monitor how resources are created and billed in our domain. We use IAM and Folders to provide our departments with the autonomy and velocity they need, without losing visibility into resource access and usage. This has significantly reduced our management overhead, and had a direct positive effect on our ability to support our customers at scale.”  Marcin Kołda, Senior Software Engineer at Ocado Technology.

The Google Cloud resource hierarchy


Organization, Projects and now Folders comprise the GCP resource hierarchy. You can think of the hierarchy as the equivalent of the filesystem in traditional operating systems. It provides ownership, in that each GCP resource has exactly one parent that controls its lifecycle. It provides grouping, as resources can be assembled into Projects and Folders that logically represent services, applications or organizational entities, such as departments and teams in your organization. Furthermore, it provides the “scaffolding” for access control and configuration policies, which you can attach at any node and propagate down the hierarchy, simplifying management and improving security.

The diagram below shows an example of the GCP resource hierarchy.
Projects are the first level of ownership, grouping and policy attachment. At the other end of the spectrum, the Organization contains all the resources that belong to a company and provides the high-level scope for centralized visibility and control. A policy defined at the Organization level is inherited by all the resources in the hierarchy. In the middle, Folders can contain Projects or other Folders and provide the flexibility to organize resources and create the boundaries for your isolation requirements.

As the Organization Admin for your company, you can, for example, create first-level Folders under the Organization to map your departments: Engineering, IT, Operations, Marketing, etc. You can then delegate full control of each Folder to the lead of the corresponding department by assigning them the Folder Admin IAM role. Each department can organize their own resources by creating sub-folders for teams, or applications. You can define Organization-wide policies centrally at the Organization level, and they're inherited by all resources in the Organization, ensuring central visibility and control. Similarly, policies defined at the Folder level are propagated down the corresponding subtree, providing teams and departments with the appropriate level of autonomy.

What to consider when mapping your organization onto GCP


Each organization has a unique structure, culture, velocity and autonomy requirements. While there isn’t a predefined recipe that fits all scenarios, here are some criteria to consider as you organize your resources in GCP.

Isolation: Where do you want to establish trust boundaries: at the department and team level, at the application or service level, or between production, test and dev environments? Use Folders with their nested hierarchy and Projects to create isolation between your cloud resources. Set IAM policies at the different levels of the hierarchy to determine who has access to which resources.

Delegation: How do you balance autonomy with centralized control? Folders and IAM help you establish compartments where you can allow more freedom for developers to create and experiment, and reserve areas with stricter control. You can for example create a Development Folder where users are allowed to create Projects, spin up virtual machines (VMs) and enable services. You can also safeguard your production workflows by collecting them in dedicated Projects and Folders where least privilege is enforced through IAM.

Inheritance: How can inheritance optimize policy management? As we mentioned, you can define policies at every node of the hierarchy and propagate them down. IAM policies are additive. If, for example, bob@myorganization.com is granted Compute Engine instanceAdmin role for a Folder, he will be able to start VMs in each Project under that Folder.

Shared resources: Are there resources that need to be shared across your organization, like networks, VM images and service accounts? Use Projects and Folders to build central repositories for your shared resources, and limit administrative privileges over these resources to only selected users. Apply the principle of least privilege when granting other users access.

Managing the GCP resource hierarchy


As part of the Folders beta launch, we've redesigned the Cloud Console user interface to improve visibility and management of the resource hierarchy. You can now effortlessly browse the hierarchy, manage resources and define IAM policies via the new scope picker and the Manage Resources page shown below.
In this example, the Organization “myorganization.com” is structured in two top-level folders for the Engineering and IT departments. The Engineering department then creates two sub-folders for Product_A and Product_B, which in turn contain folders for the production, development and test environments. You can define IAM permissions for each Folder from within the same UI, by selecting the resources of interest and accessing the control pane on the right hand side, as shown below.
By leveraging IAM permissions, the Organization Admin can restrict visibility to users within portions of the tree, creating isolation and enforcing trust boundaries between departments, products or environments. In order to maximize security of the production environment for Product_A for example, only selected users may be granted access or visibility to the corresponding Folder. Developer bob@myorganization.com, for instance, is working on new features for Product_A, but in order to minimize risk of mistakes in the production environment, he's not given visibility to the Production Folder. You can see his visibility of the Organization hierarchy in the diagram below:


As with any other GCP component, alongside the UI, we've provided API and command line (gcloud) interfaces to programmatically manage the entire resource hierarchy, enabling automation and standardization of policies and environments.

The following script creates the resource hierarchy above programmatically using the gcloud command line tool.


# Find your Organization ID
 
me@cloudshell:~$ gcloud organizations list
DISPLAY_NAME        ID     DIRECTORY_CUSTOMER_ID
myorganization.com  358981462196  C03ryezon
 
# Create first level folder “Engineering” under the Organization node
 
me@cloudshell:~$ gcloud alpha resource-manager folders create \
--display-name=Engineering --organization=358981462196
Waiting for [operations/fc.2201898884439886347] to finish...done.
Created [<Folder
createTime: u'2017-04-16T22:49:10.144Z' 
displayName: u'Engineering' 
lifecycleState: LifecycleStateValueValuesEnum(ACTIVE, 1) 
name: u'folders/1000107035726' 
parent: u'organizations/358981462196'>].

 
# Add a Folder Admin role to the “Engineering” folder
 
me@cloudshell:~$ gcloud alpha resource-manager folders add-iam-policy-binding \
1000107035726 --member=user:bob@myorganization.com \
--role=roles/resourcemanager.folderAdmin
bindings: 
- members:  
- user:bob@myorganization.com  
- user:admin@myorganization.com  
role: roles/resourcemanager.folderAdmin
- members:  
- user:alice@myorganization.com  
role: roles/resourcemanager.folderEditor
etag: BwVNX61mPnc=
 
 
# Check the IAM policy set on the “Engineering” folder
 
me@cloudshell:~$ gcloud alpha resource-manager folders get-iam-policy \
1000107035726
bindings: 
- members:  
- user:bob@myorganization.com  
- user:admin@myorganization.com  
role: roles/resourcemanager.folderAdmin
- members:  
- user:alice@myorganization.com  
role: roles/resourcemanager.folderEditor
etag: BwVNX61mPnc=
 

 
# Create second level folder “Product_A” under folder “Engineering”
 
me@cloudshell:~$ gcloud alpha resource-manager folders create \
--display-name=Product_A --folder=1000107035726
Waiting for [operations/fc.2194220672620579778] to finish...done.
Created [].
 
# Create third level folder “Development” under folder “Product_A”
 
me@cloudshell:~$ gcloud alpha resource-manager folders create \
--display-name=Development --folder=732853632103
Waiting for [operations/fc.3497651884412259206] to finish...done.
Created [].
 
# List all the folders under the Organization
 
me@cloudshell:~$ gcloud alpha resource-manager folders list \
--organization=358981462196
DISPLAY_NAME  PARENT_NAME                 ID
IT            organizations/358981462196  575615098945
Engineering   organizations/358981462196  661646869517
Operations    organizations/358981462196  895951706304
 
# List all the folders under the “Engineering” folder
 
me@cloudshell:~$ gcloud alpha resource-manager folders list \
--folder=1000107035726
DISPLAY_NAME  PARENT_NAME           ID
Product_A     folders/1000107035726  732853632103
Product_B     folders/1000107035726  941564020040
 
 
# Create a new project in folder “Product_A”
 
me@cloudshell:~$ gcloud alpha projects create my-awesome-service-2 \
--folder 732853632103
Create in progress for [https://cloudresourcemanager.googleapis.com/v1/projects/my-awesome-service-2].
Waiting for [operations/pc.2821699584791562398] to finish...done.
 
 
 
# List projects under folder “Production”
 
me@cloudshell:~$ gcloud alpha projects list --filter 'parent.id=725271112613'
PROJECT_ID            NAME                  PROJECT_NUMBER
my-awesome-service-1  my-awesome-service-1  869942226409
my-awesome-service-2  my-awesome-service-2  177629658252


As you can see, Cloud Resource Manager is a powerful way to manage and organize GCP resources that belong to an organization. To learn more, check out the Quickstarts, and stay tuned as we add additional capabilities in the months to come.

Compute Engine machine types with up to 64 vCPUs now ready for your production workloads



Today, we're happy to announce general availability for our largest virtual machine shapes, including both predefined and custom machine types, with up to 64 virtual CPUs and 416 GB of memory.


64 vCPU machine types are available on our Haswell, Broadwell and Skylake (currently in Alpha) generation Intel processor host machines.

Tim Kelton, co-founder and Cloud Architect of Descartes Labs, an early adopter of our 64 vCPU machine types, had this to say:
"Recently we used the 64 vCPU instances during the building of both our global composite imagery layers and GeoVisual Search. In both cases, our parallel processing jobs needed tens of thousands of CPU hours to complete the task. The new 64 vCPU instances allow us to work across more satellite imagery scenes simultaneously on a single instance, dramatically speeding up our total processing times."
The new 64 core machines are available for use today. If you're new to GCP and want to give these larger virtual machines a try, it’s easy to get started with our $300 credit for 12 months.

Google Cloud Platform launches Northern Virginia region



Google Cloud Platform (GCP) continues to rapidly expand our global footprint, and we’re excited to announce the availability of our latest cloud region: Northern Virginia.
The launch of Northern Virginia (us-east4) brings the total number of GCP regions serving the Americas to four, joining Oregon, Iowa and South Carolina. We’ll continue to turn up new options for developers in this market, with future regions in São Paulo, Montreal and California.

Google Cloud customers benefit from our commitment to large-scale infrastructure investments. Each region gives developers additional choice on how to run their applications closest to their customers, while Google’s networking backbone transforms compute and storage infrastructure into a global-scale computer, giving developers around the world access to the same cloud infrastructure that Google engineers use every day.

We’ve launched Northern Virginia with three zones and the following services:
Incredible user experiences hinge on incredibly performant infrastructure. Developers who want to serve the Northeastern and Mid-Atlantic regions of the United States will see significant reductions in latency when they run their workloads in the Northern Virginia region. Our performance testing shows 25%-85% reductions in RTT latency when serving customers in Washington DC, New York, Boston, Montreal and Toronto compared to using our Iowa or South Carolina regions.
"We are a latency-sensitive business and the addition of the Northern Virginia region will allow us to expand our coverage area and reduce latency to our current users. This will also allow us to significantly increase the capability of our Data Lake platform, which we are looking at as a competitive advantage" — Linh Chung, CIO at Viant, a Time Inc. Company
We want to help you build what’s Next for you. Our locations page provides updates on the availability of additional services, and for guidance on how to build and create highly available applications, take a look at our zones and regions page. Give us a shout to request early access to new regions and help us prioritize what we build next.

Windows on the rise at GCP



It’s been a little over three months since we made our no-charge VM migration tool available for GCP in the Google Cloud Console, and customers have jumped at the chance to move their enterprise workloads to Google Cloud. While customers are moving applications using a variety of source operating systems to Google Cloud, we've been especially excited to see that almost half of the VM migrations to Google Cloud via this new service have been of Microsoft Windows workloads.

Why is this significant to you? Because our goal is to make Google Cloud the best place to run any application, from Windows workloads to new cloud-native applications. We believe that the significant number of Windows applications migrating to Google Cloud through this new service is indicative of strong demand to give enterprise Windows applications the agility, scale and security advantages of Google Cloud.
“We are leveraging Google Cloud to deliver the experiences our customers demand, and we want to make sure that all our workloads can take advantage of Google Cloud’s unique infrastructure and services. Using the free Google Cloud migration tools, we’ve been able to easily move our Windows servers to Google Cloud with near-zero downtime.”  Rob Wilson, CTO at Smyths Toys
We're happy to see customers take advantage of our first-class support for Windows, SQL Server and both .NET and .NET Core on GCP. We’ve made sure that those applications are well-supported by providing support for Windows Server 2016 within weeks of it reaching GA, by adding support for SQL Server Web, Standard and Enterprise editions (including support for High Availability), by integrating Visual Studio and PowerShell, by making all of Google’s APIs available via NuGet and by joining the .NET Foundation’s Technical Steering Committee. Further, with Stackdriver Logging, Error Reporting and Trace support for Windows and .NET, developers and administrators have the support they need to build, deploy and manage their applications. Finally, with the recent announcement of .NET Core support in all of our libraries and tooling, as well as in our App Engine and Container Engine products, you’re covered into the future as well.

Internally, we’ve seen other signs of more Windows and .NET workloads running on GCP, including a 57% increase in Windows CPU usage in the second half of 2016. Further, we know that sometimes you need help to take advantage of the full capabilities of GCP, which is why we announced the Windows Partner Program. These top-notch systems integrators will help you to not just “lift & shift,” but rather to “move & improve,” with cutting-edge capabilities such as big data processing, data analytics, machine learning and container management.

Learn more about Windows, SQL Server and .NET on GCP, and don’t hesitate to reach out with questions and suggestions. We’ve had lots of folks make the switch already; we’d love you to join them. Our migration service is offered at no charge, and you get $300 of GCP credits when you sign up, so you can migrate a few servers and see how easy it is to run your Windows apps on GCP. Click here to get started.

Use Google Cloud Client Libraries to store files, save entities, and log data



To develop a cloud application, you usually need access to an online object store, a scalable NoSQL database and a logging infrastructure. To that end, Google Cloud Platform (GCP) provides the Cloud Storage API, the Cloud Datastore API and the Stackdriver Logging API. Better yet, you can now access those APIs via the latest Google Cloud Client Libraries, which, we’re proud to announce, are now generally available (GA) in seven server-side languages: C#, Go, Java, Node.js, PHP, Python and Ruby.

Online object storage

For your object storage needs, the Cloud Storage API enables you, for instance, to upload blobs of data, such as pictures or movies, directly into buckets. To do so in Node.js, for example, you first need to install the Cloud Client Library:

npm install --save @google-cloud/storage

and then simply run the following code to upload a local file into a specific bucket:

const Storage = require('@google-cloud/storage');
 
// Instantiates a client
const storage = Storage();
 
// The bucket to reference and the local file to upload
const bucketName = 'my-bucket';
const fileName = './local/path/to/file.txt';
 
// References an existing bucket
const bucket = storage.bucket(bucketName);
 
// Uploads the local file to the bucket
return bucket.upload(fileName)
 .then((results) => {
  const file = results[0];
  console.log(`File ${file.name} uploaded`);
});


NoSQL Database

With Cloud Datastore, one of our NoSQL offerings, you can create entities, which are structured objects, and save them in GCP so that they can be retrieved or queried by your application at a later time. Here’s an example in Java, where you specify the maven dependency in the following manner:

<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-datastore</artifactId>
  <version>1.0.0</version>
</dependency>

followed by executing this code to create a task entity:

// Imports the Google Cloud Client Library
import com.google.cloud.datastore.Datastore;
import com.google.cloud.datastore.DatastoreOptions;
import com.google.cloud.datastore.Entity;
import com.google.cloud.datastore.Key;
 
public class QuickstartSample {
  public static void main(String... args) throws Exception {
 
  // Instantiates a client
  Datastore datastore = DatastoreOptions.getDefaultInstance().getService();
 
  // The kind for the new entity
  String kind = "Task";
 
  // The name/ID for the new entity
  String name = "sampletask1";
 
  // The Cloud Datastore key for the new entity
  Key taskKey = datastore.newKeyFactory().setKind(kind).newKey(name);
 
  // Prepares the new entity
  Entity task = Entity.newBuilder(taskKey)
  .set("description", "Buy milk")
  .build();
 
  // Saves the entity
  datastore.put(task);
  }
}

Logging framework

Our libraries also allow you to send log data and events very easily to the Stackdriver Logging API. As a Python developer for instance, the first step is to install the Cloud Client Library for Logging:

pip install --upgrade google-cloud-logging

Then add the following code to your project (e.g. your __init__.py file):

import logging
import google.cloud.logging
client = google.cloud.logging.Client()
# Attaches a Google Stackdriver logging handler to the root logger
client.setup_logging(logging.INFO)

Then, just use the standard Python logging module to directly report logs to Stackdriver Logging:

import logging
logging.error('This is an error')

We encourage you to visit the client libraries page for Cloud Storage, Cloud Datastore and Stackdriver Logging to learn more on how to get started programmatically with these APIs across all of the supported languages. To see the full list of APIs covered by the Cloud Client Libraries, or to give us feedback, you can also visit the respective GitHub repositories in the Google Cloud Platform organization.

Putting gRPC multi-language support to the test



gRPC is an RPC framework developed and open-sourced by Google. There are many benefits to gRPC, such as efficient connectivity with HTTP/2, efficient data serialization with Protobuf, bi-directional streaming and more, but one of the biggest benefits is often overlooked: multi-language support.

Out of the box, gRPC supports multiple programming languages: C#, Java, Go, Node.js, Python, PHP, Ruby and more. In the new microservices world, this multi-language support provides the flexibility you need to implement services in whatever language and framework you like, while gRPC handles the low-level connectivity and data transfer between microservices in an efficient and consistent way.
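As a quick illustration of that promise, here's a minimal Python client sketch for the canonical gRPC "Hello World" Greeter service; it assumes the helloworld_pb2 and helloworld_pb2_grpc modules were generated from the standard helloworld.proto, and that a compatible server, written in any supported language, is listening on localhost:8080.

# A minimal sketch of a gRPC client generated from the standard helloworld.proto.
# The generated modules and the server address are assumptions for illustration.
import grpc

import helloworld_pb2
import helloworld_pb2_grpc

# Open a channel to a Greeter server; it could be implemented in Java, C#, Go, etc.
channel = grpc.insecure_channel('localhost:8080')
stub = helloworld_pb2_grpc.GreeterStub(channel)

# Call the unary SayHello RPC defined in the shared .proto file
response = stub.SayHello(helloworld_pb2.HelloRequest(name='Mete'))
print('Received response: ' + response.message)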

This all sounds nice in theory but does it work in reality? As a long-time Java and C# developer, I wanted to see how well gRPC delivered on its multi-language promise. The plan was to run a couple of Java gRPC samples to see how gRPC worked in Java. Then, I wanted to see how easy it would be to port those samples into C#. Finally, I wanted to mix and match Java and C# clients/servers and see how well they worked together.

gRPC Java support

First, I wanted to figure out how well gRPC supports individual languages. Getting started with Java is pretty straightforward: just add the Maven or Gradle dependencies and plugins. Ray Tsang, a colleague and Java expert, has written and published some gRPC samples in Java in his GitHub repository, so I started exploring those.

I tried a simple gRPC client and a simple gRPC server written in Java. These are "Hello World"-type samples, where a client sends a request to the server and the server echoes it back. The samples are Maven projects, so I used my favorite Java editor (Eclipse) to import the projects into a new workspace. First, I started the server:

Starting server on port 8080
Server started!

Then, I started the client. As expected, the client sent a request and received a response from the server:

Sending request
Received response: Hello there, Mete

Ray also has a more interesting sample that uses a bi-directional streaming feature. He built a chat server and a chat client based on JavaFX to talk to that server. I was able to get the two chat clients talking to each other through the chat server with little effort.



Two JavaFX clients talking to each other via Java server

gRPC C# support

So far so good. Next, I wanted to see how easy it was to rewrite the same samples in C#. With a little help from the gRPC documentation samples, I was able to create a GreeterClient and a GreeterServer for the Hello World sample. The code is very similar to Java but it looks a little nicer. (Ok, I'm biased in favor of C# :-) )

One minor difference: with Java, you can use Maven or Gradle plugins to generate gRPC stub classes automatically. In the case of C#, you need to bring in the gRPC Tools NuGet package and generate the stub classes with it. Take a look at generate_protos.bat to see how I did that. The good news is that you can rely on the same service definition file to generate Java and C# stub clients, which makes it easy to write client and server apps in different languages.

I also implemented the bi-directional streaming chat example with ChatServer and ChatWindowsClient but instead of JavaFX, I used Windows Forms. As before, the code is quite similar to the Java version but gRPC takes advantage of the language to make sure developers are not missing out on language specific features.

For example, ChatServerImpl.java creates and returns a StreamObserver as a handler for client messages. This works but felt a little unintuitive. On the other hand, ChatServerImpl.cs uses the async/await pattern of C# and writes to the response stream asynchronously, which yields a cleaner implementation.

gRPC multi-language test

The real test for multi-language support is how well Java and C# implementations work together. To test that, I started the Java chat server. Then, I started a Java chat client and a C# chat client both talking to the same Java chat server. It was nice to see the two clients talking to each other through the chat server with no special configuration or effort on my part.
One Windows Form client and one JavaFX client talking to each other via Java server


Conclusion

Designing a framework is hard. Designing a framework that works across different languages while maintaining the unique benefits of each language is even harder. Whether I worked with gRPC in Java or C#, it never felt alien. It's obvious that a lot of thought and effort went into making sure that gRPC was idiomatic for each language. That's great to see from a framework trying to cover a wide range of languages.

If you want to run the samples yourself, take a look at Ray's Java gRPC samples and my C# gRPC samples. You can also watch a recording of Ray’s talk on Java gRPC.

Happy gRPCing! :-)

Building lean containers using Google Cloud Container Builder



Building a Java application requires a lot of files — source code, application libraries, build systems, build system dependencies and of course, the JDK. When you containerize an application, these files sometimes get left in, causing bloat. Over time, this bloat costs you both time and money by storing and moving unnecessary bits between your Docker registry and your container runtime.

A better way to help ensure your container is as small as possible is to separate the building of the application (and the tools needed to do so) from the assembly of the runtime container. Using Google Cloud Container Builder, we can do just that, allowing us to build significantly leaner containers. These lean containers load faster and save on storage costs.

Container layers

Each line in a Dockerfile adds a new layer to a container. Let’s look at an example:

FROM busybox
 
COPY ./lots-of-data /data
 
RUN rm -rf /data
 
CMD ["/bin/sh"]


In this example, we copy the local directory, "lots-of-data", to the "data" directory in the container, and then immediately delete it. You might assume such an operation is harmless, but that's not the case.

This is because of Docker’s “copy-on-write” strategy, which makes all previous layers read-only. If a command generates data that isn't needed at container runtime and isn't deleted in the same command, that space cannot be reclaimed.


Spinnaker containers

Spinnaker is an open source, cloud-focused continuous delivery tool created by Netflix. Spinnaker is actively maintained by a community of partners, including Netflix and Google. It has a microservice architecture, with each component written in Groovy and Java, and uses Gradle as its build tool.

Spinnaker publishes each microservice container on Quay.io. Each service has nearly identical Dockerfiles, so we’ll use the Gate service as the example. Previously, we had a Dockerfile that looked like this:

FROM java:8

COPY . workdir/

WORKDIR workdir

RUN GRADLE_USER_HOME=cache ./gradlew buildDeb -x test

RUN dpkg -i ./gate-web/build/distributions/*.deb

CMD ["/opt/gate/bin/gate"]

With Spinnaker, Gradle is used to do the build, which in this case builds a Debian package. Gradle is a great tool, but it downloads a large number of libraries in order to function. These libraries are essential to the building of the package, but aren’t needed at runtime. All of the runtime dependencies are bundled up in the package itself.

As discussed before, each command in the Dockerfile creates a new layer in the container. If data is generated in that layer and not deleted in the same command, that space cannot be recovered. In this case, Gradle is downloading hundreds of megabytes of libraries to the "cache" directory in order to perform the build, but we're not deleting those libraries.

A more efficient way to perform this build is to merge the two “RUN” commands, and remove all of the files (including the source code) when complete:

FROM java:8
 
COPY . workdir/
 
WORKDIR workdir
 
RUN GRADLE_USER_HOME=cache ./gradlew buildDeb -x test && \
  dpkg -i ./gate-web/build/distributions/*.deb && \
  cd .. && \
  rm -rf workdir
 
CMD ["/opt/gate/bin/gate"]

This took the final container size down from 652MB to 284MB, a savings of 56%. But can we do even better?

Enter Container Builder

Using Container Builder, we're able to further separate building the application from building its runtime container.

The Container Builder team publishes and maintains a series of Docker containers with common developer tools such as git, docker and the gcloud command line interface. Using these tools, we’ll define a "cloudbuild.yaml" file with one step to build the application, and another to assemble its final runtime environment.

Here's the "cloudbuild.yaml" file we'll use:
steps:
- name: 'java:8'
  env: ['GRADLE_USER_HOME=cache']
  entrypoint: 'bash'
  args: ['-c', './gradlew gate-web:installDist -x test']
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', 
         '-t', 'gcr.io/$PROJECT_ID/$REPO_NAME:$COMMIT_SHA', 
         '-t', 'gcr.io/$PROJECT_ID/$REPO_NAME:latest',
         '-f', 'Dockerfile.slim', '.']
 
images:
- 'gcr.io/$PROJECT_ID/$REPO_NAME:$COMMIT_SHA'
- 'gcr.io/$PROJECT_ID/$REPO_NAME:latest'

Let’s go through each step and explore what is happening.

Step 1: Build the application

- name: 'java:8'
  env: ['GRADLE_USER_HOME=cache']
  entrypoint: 'bash'
  args: ['-c', './gradlew gate-web:installDist -x test']

Our lean runtime container doesn’t contain "dpkg", so we won't use the "buildDeb" Gradle task. Instead, we use a different task, "installDist", which creates the same directory hierarchy for easy copying.

Step 2: Assemble the runtime container

- name: 'gcr.io/cloud-builders/docker'
  args: ['build', 
         '-t', 'gcr.io/$PROJECT_ID/$REPO_NAME:$COMMIT_SHA',
         '-t', 'gcr.io/$PROJECT_ID/$REPO_NAME:latest', 
         '-f', 'Dockerfile.slim', '.']


Next, we invoke the Docker build to assemble the runtime container. We'll use a different file to define the runtime container, named "Dockerfile.slim". Its contents are below:
FROM openjdk:8u111-jre-alpine
 
COPY ./gate-web/build/install/gate /opt/gate
 
RUN apk add --no-cache bash
 
CMD ["/opt/gate/bin/gate"]

The output of the "installDist" Gradle task from Step 1 already has the directory hierarchy we want (i.e. "gate/bin/", "gate/lib/", etc), so we can simply copy it into our target container.

One of the major savings is the choice of the Alpine Linux base layer, "openjdk:8u111-jre-alpine". Not only is this layer incredibly lean, but we also choose to only include the JRE, instead of the bulkier JDK that was necessary to build the application.


Step 3: Publish the image to the registry

images:
- 'gcr.io/$PROJECT_ID/$REPO_NAME:$COMMIT_SHA'
- 'gcr.io/$PROJECT_ID/$REPO_NAME:latest'

Lastly, we tag the container with the commit hash and also tag it as "latest". We then push this container to our Google Cloud Container Registry (gcr.io) with these tags.

Conclusion

In the end, using Container Builder resulted in a final container size of 91.6MB, which is 85% smaller than our initial Dockerfile and even 68% smaller than our improved version. The major savings come from separating the build and runtime environments, and from choosing a lean base layer for the final container.

Applying this approach across each microservice yielded similar results: our total container footprint shrank from almost 6GB to less than 1GB.