
Better cost control with Google Cloud Billing programmatic notifications



By running your workloads on Google Cloud Platform (GCP) you have access to the tools you need to build and scale your business. At the same time, it’s important to keep your costs under control by informing users and managing their spending.

Today, we’re adding programmatic budget notifications to Google Cloud Billing, a powerful feature that helps you stick to your budget and take automatic action when your costs get out of control.

Monitor your costs
You can use Cloud Billing budget notifications with third-party or homegrown cost-management solutions, as well as Google Cloud services. For example, as an engineering manager, you can set up budget notifications to alert your entire team through Slack every time you hit 80 percent of your budget.

Control your costs
You can also configure automated actions based on the notifications to control your costs, such as selectively turning off particular resources or terminating all resources for a project. For example, as a PhD student working at a lab with a fixed grant amount, you can use budget notifications to trigger a cap to terminate your project when you use up your grant. This way, you can be confident that you won’t go over budget.

Work with your existing workflow and tools
To make it easy to get started with budget notifications, we’ve included examples of reference architectures for a few common use cases in our documentation:
  • Monitoring - listen to your Cloud Pub/Sub notifications with Cloud Functions
  • Forward notifications to Slack - send custom billing alerts with the current spending for your budget to a Slack channel
  • Cap (disable) billing on a project - disable billing for a project and terminate all resources to make sure you don’t overspend
  • Selectively control resources - terminate expensive resources without disabling your whole environment
Get started
You can set up programmatic budget notifications in a few simple steps:

  1. Navigate to Billing in the Google Cloud Console and create your budget.
  2. Enable Cloud Pub/Sub, then set up a Cloud Pub/Sub topic for your budget.
  3. When creating your budget, you will see a new section, “Manage notifications,” where you can configure your Cloud Pub/Sub topic.

  4. Set up a Cloud Function to listen to budget notifications and trigger an action, as in the sketch after these steps.
  5. Cloud Billing sends budget notifications multiple times per day, so you will always have the most up-to-date information on your spending.
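For example, here’s a minimal sketch of steps 2 and 4 using the gcloud command line. The topic name, function name, runtime and source directory are hypothetical; the handler code you deploy would parse the budget notification payload and take whatever action you’ve chosen.

# Create the Cloud Pub/Sub topic that you select under “Manage notifications” (step 3).
gcloud pubsub topics create budget-alerts

# Deploy a Cloud Function that is triggered by every message published to that topic.
gcloud functions deploy budget-notification-handler \
    --runtime=nodejs8 \
    --trigger-topic=budget-alerts \
    --source=./budget-handler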
You can get started today by reading the Google Cloud Billing documentation. If you’ll be at Google Cloud Next ‘18, be sure to come by my session on Google Cloud Billing and cost control.

Apigee named a Leader in the Gartner Magic Quadrant for Full Life Cycle API Management for the third consecutive time



APIs are the de-facto standard for building and connecting modern applications. But securely delivering, managing and analyzing APIs, data and services, both inside and outside an organization, is complex. And it’s getting even more challenging as enterprise IT environments grow dependent on combinations of public, private and hybrid cloud infrastructures.

Choosing the right APIs can be critical to a platform’s success. Likewise, full lifecycle API management can be a key ingredient in running a successful API-based program. Tools like Gartner’s Magic Quadrant for Full Life Cycle API Management help enterprises evaluate these platforms so they can find the right one to fit their strategy and planning.

Today, we’re thrilled to share that Gartner has recognized Apigee as a Leader in the 2018 Magic Quadrant for Full Life Cycle API Management. This year, Apigee was not only positioned furthest on Gartner’s “completeness of vision” axis for the third time running, it was also positioned highest in “ability to execute.”

Ticketmaster, a leader in ticket sales and distribution, has used Apigee since 2013. The company uses the Apigee platform to enforce consistent security across its APIs, and to help reach new audiences by making it easier for partners and developers to build upon and integrate with Ticketmaster services.

"Apigee has played a key role in helping Ticketmaster build its API program and bring ‘moments of joy’ to fans everywhere, on any platform," said Ismail Elshareef, Ticketmaster's senior vice president of fan experience and open platform.

We’re excited that APIs and API management have become essential to how enterprises deliver applications in and across clouds, and we’re honored that Apigee continues to be recognized as a leader in its category. Most importantly, we look forward to continuing to help customers innovate and accelerate their businesses as part of Google Cloud.

The Gartner 2018 Magic Quadrant for Full Life Cycle API Management is available at no charge here.

To learn more about Apigee, please visit the Apigee website.

This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available from Apigee here.
Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

Announcing variable substitution in Stackdriver alerting notifications



When an outage occurs in your cloud application, having fast insight into what’s going on is crucial to resolving the issue quickly. If you use Google Stackdriver, you probably rely on alerting policies to detect these issues and notify you with relevant information. To improve the organization and readability of the information contained in these alerts, we’ve added some new features to make our alerting notifications more descriptive, useful and actionable. We’ll gradually roll out these updates over the next few weeks.

One of these new features is the ability to add variables to your alerting notifications. You can use this to include more metadata in your notifications, for example information on Kubernetes clusters and other resources. You can also use this to construct specific playbook information and links using the variable substitution.

In addition, we’re transitioning to HTML-formatted emails that are easier to read and more clearly organized. We’re also adding the documentation field to Slack and webhook notifications, so teams using these notification methods can take advantage of these new features.

New variable substitution in alerting policy documentation

You can now include variables in the documentation section of your alerting policies. The contents of this field are also now included in Slack and webhook notifications, in addition to email.

The following syntax:

${varname}


will be formatted by replacing the expression ${varname} with the value of varname. We support only simple variable substitutions; more complex expressions, for example ${varname1 + varname2}, are not supported. We also support $$ as an escape sequence, so the literal text "${" can be written as "$${".

The supported variables and their meanings are:
  • condition.name - The REST resource name of the condition (e.g. "projects/foo/alertPolicies/12345/conditions/5678")
  • condition.display_name - The display name for the triggering condition
  • metadata.user_label.key - The value of the metadata label "key" (replace "key" appropriately)
  • metric.type - The metric (e.g. "compute.googleapis.com/instance/cpu/utilization")
  • metric.display_name - The display name associated with this metric type
  • metric.label.key - The value of the metric label "key" (replace "key" appropriately)
  • policy.user_label.key - The value of the user label "key" (replace "key" appropriately)
  • policy.name - The REST resource name of the policy (e.g. "projects/foo/alertPolicies/12345")
  • policy.display_name - The display name associated with the alerting policy
  • project - The project ID of the Stackdriver host account
  • resource.project - The project ID of the monitored resource of the alerting policy
  • resource.type - The type of the resource (e.g. "gce_instance")
  • resource.display_name - The display name of the resource
  • resource.label.key - The value of the resource label "key" (replace "key" appropriately)


Note: You can only set policy user labels via the Monitoring API.
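For example, a documentation field like the following (the playbook URL is hypothetical) combines several of these variables to point the on-call engineer at the right runbook:

CPU utilization is high on ${resource.type} in project ${resource.project}.
Playbook: https://example.com/playbooks/${policy.display_name}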

@mentions for Slack

Slack notifications now include the alerting policy documentation. This means that you can include customized Slack formatting and control sequences for your alerts. For the various options, please refer to the Slack documentation.

One useful feature is linking to a user. For example, including this line in the documentation field

@backendoncall policy ${policy.display_name} triggered an incident


notifies the user backendoncall in addition to sending the message to the relevant Slack channel specified in the policy’s notification options.

Notification examples

Now, when you look at a Stackdriver notification, all notification methods (with the exception of SMS) include the following fields:

  • Incident ID/link: the incident that triggered the notification along with a link to the incident page 
  • Policy name: the name of the configured alerting policy
  • Condition name: the name of the alerting policy condition that is in violation

Email:


Slack:


Webhook:


{
  "incident": {
    "incident_id": "0.kmttg2it8kr0",
    "resource_id": "",
    "resource_name": "totally-new cassweb1",
    "started_at": 1514931579,
    "policy_name": "Backend processing utilization too high",
    "condition_name": "Metric Threshold on Instance (GCE) cassweb1",
    "url": "https://app.google.stackdriver.com/incidents/0.kmttg2it8kr0?project=totally-new",
    "documentation": {
      "content": "CPU utilization sample. This might affect our backend processing.\u000AFollowing playbook here: https://my.sample.playbook/cassweb1",
      "mime_type": "text/markdown"
    },
    "state": "open",
    "ended_at": null,
    "summary": "CPU utilization for totally-new cassweb1 is above the threshold of 0.8 with a value of 0.994."
  },
  "version": "1.2"
}


Next steps

We’ll be rolling out these new features in the coming weeks as part of the regular updating process. There’s no action needed on your part, and the changes will not affect the reliability or latency of your existing alerting notification pipeline. Of course, we encourage you to give meaningful names to your alerting policies and conditions, as well as add a “documentation” section to configured alerting policies to help oncall engineers understand the alert notification when they receive it. And as always, please send us your requests and feedback, and thank you for using Stackdriver!

Viewing trace spans and request logs in multi-project deployments



Google Cloud Platform (GCP) provides developers and operators with fine-grained billing and resource access management for separate applications through projects. But while isolating application services across projects is important for security and cost allocation, it can make debugging cross-service issues more difficult.

Stackdriver Trace, our tool for analyzing latency data from your applications, can now visualize traces and logs for requests that cross multiple projects, all in a single waterfall chart. This lets you see how requests propagate through services in separate projects and helps to identify sources of poor performance across your entire stack.

To view spans and log entries for cross-project traces, follow the instructions in the Viewing traces across projects documentation. Your projects will need to be part of a single organization, as explained in Best Practices for Enterprise Organizations; if they aren’t already, create an organization and then migrate your existing projects into it.

Once your projects are in an organization, you’re ready to view multi-project traces. First, select any one of the relevant projects in the GCP Console, and then navigate to the Trace List page and select a trace. You will see spans for all the projects in your organization for which you have “cloudtrace.traces.get” permission. The “Project” label in the span details panel on the right indicates which project the selected span is from.

You can also view log entries associated with the request from all projects that were part of the trace. This requires the “logging.logEntries.list” permission on the associated projects, and it requires that you set the LogEntry “trace” field in the format “projects/[PROJECT-ID]/traces/[TRACE-ID]” when you write your logs to Stackdriver Logging. You can also set the LogEntry “span_id” field to the 16-character hexadecimal-encoded span ID to associate logs with specific trace spans. See Viewing Trace Details > Log Entries for details.

If you use Google Kubernetes Engine or the Stackdriver Logging Agent via Fluentd, you can set the LogEntry “trace” and “span_id” fields by writing structured logs with the keys of “logging.googleapis.com/trace” and “logging.googleapis.com/span_id”. See Special fields in structured payloads for more information.
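For example, a structured log line written by an application running on Kubernetes Engine might look like the following (the message, trace ID and span ID are hypothetical):

{
  "severity": "INFO",
  "message": "Handled checkout request",
  "logging.googleapis.com/trace": "projects/my-project/traces/06796866738c859f2f19b7cfb3214824",
  "logging.googleapis.com/span_id": "000000000000004a"
}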

To view the associated log entries inline with trace spans, click “Show Logs.”




Automatic association of traces and logs

See the Stackdriver Trace documentation for the GCP languages and environments that support automatically associating traces and log entries.

Now, having applications in multiple projects is no longer a barrier to identifying the sources of poor performance in your stack. Click here to learn more about Stackdriver Trace.

New ways to manage and automate your Stackdriver alerting policies



If your organization uses Google Stackdriver, our hybrid monitoring, logging and diagnostics suite, you’re most likely familiar with Stackdriver alerting. DevOps teams use alerting to monitor and respond to incidents impacting their applications running in the cloud. We’ve received a lot of great feedback about Stackdriver alerting functionality, notably requests for a programmatic interface to manage alerting policies and a way to automate them across different cloud projects.

Today, we're pleased to announce the beta release of new endpoints in the Stackdriver Monitoring v3 API to manage alerting policies and notification channels. Now, it’s possible to create, read, write, and manage your Stackdriver alerting policies and notification channels. You can perform these operations using client libraries in one of the supported languages (Java or C#, with more to come later) or by directly invoking the API, which supports both gRPC and HTTP / JSON REST protocols. There's also command line support in the Google Cloud SDK via the gcloud alpha monitoring policies, gcloud alpha monitoring channel-descriptors, and gcloud alpha monitoring channels commands.

Providing programmatic access to alerting policies and notification channels can help automate common tasks such as:
  • Copying policies and notification channels between different projects, for example between test, dev and production 
  • Disabling and later re-enabling policies and notification channels in the event of alerting storms 
  • Utilizing user labels to organize and filter notification channels and policies 
  • Programmatically verifying SMS channels as new SMS numbers get added to the team

Organizing policies


If you have multiple alerting policies configured by various teams within a single Google Cloud project, navigating and organizing these policies can be challenging. With the Stackdriver Alerting API, you can add "user labels" to annotate policies with metadata, which then makes it easier to find and navigate these policies. For example, here’s how to list all your policies:

gcloud alpha monitoring policies list

Here’s how to tag a given policy with your team name:

gcloud alpha monitoring policies update \
        "projects/my-project/alertPolicies/12345" \
        --update-user-labels=team=myteamname

You can then easily find policies that have your team name:

gcloud alpha monitoring policies list --filter='user_label.team="myteamname"'

Updating channels


When someone new joins your DevOps team, it can be a very tedious process to update all your policies so that they receive all the relevant notifications. Now, with the Alerting API, you can quickly add your new teammate to all of the alerting policies that your team owns.

First, find the channels that belong to the team member:

gcloud alpha monitoring channels list

If they don't already have a notification channel, you can create one:

gcloud alpha monitoring channels create \
      --display-name="Anastasia Alertmaestro" \
      --type="email" \
      --channel-labels=email_address=[email protected]

Then, add a notification channel to a given policy:

gcloud alpha monitoring policies update \
     "projects/my-project/alertPolicies/12345" \
     --add-notification-channels="projects/my-project/notificationChannels/56789"

Combined with the policies list command, adding the notification channel to all of your team's policies is a matter of a simple BASH script, not tons of tedious point-and-click configuration.
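For example, a minimal Bash sketch might look like this (the project, team label and channel ID are hypothetical):

# Add one notification channel to every policy carrying your team's label.
CHANNEL="projects/my-project/notificationChannels/56789"
for POLICY in $(gcloud alpha monitoring policies list \
    --filter='user_label.team="myteamname"' --format="value(name)"); do
  gcloud alpha monitoring policies update "$POLICY" \
      --add-notification-channels="$CHANNEL"
done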

Disabling alerts to a given endpoint


If you're in the middle of a pagerstorm and getting endless alerts, it’s easy to disable notifications to a channel without removing that channel from all existing policies:

gcloud alpha monitoring channels update \
    "projects/my-project/notificationChannels/9817323" \
    --enabled=false

Conclusion


To summarize, the alerting policy and notification channel management features in the Monitoring v3 API will help you simplify and automate a number of tasks. We hope that this saves you time, and we look forward to your feedback!

Please send your feedback to google-stackdriver-discussion_AT_googlegroups.com.

How to export logs from Stackdriver Logging: new solution documentation



Stackdriver Logging is broadly integrated with Google Cloud Platform (GCP), offering rich logging information about GCP services and how you use them. The Stackdriver Logging Export functionality allows you to export your logs and use the information to suit your needs.

There are lots of reasons to export your logs: to retain them for long-term storage (months or years) to meet compliance requirements; to run data analytics against the metrics extracted from the logs; or simply to import them into another system. Stackdriver Logging can export to Cloud Storage, BigQuery and Cloud Pub/Sub.

How you set up Logging Export on GCP depends on the complexity of your GCP organization, the types of logs to export and how you want to use the logs.

We recently put together a three-part solution that explores best practices for three common logging export scenarios:
  1. Export to GCS for Compliance Requirements 
  2. Export to BigQuery for Security and Access Analytics
  3. Export to Pub/Sub for 3rd party (Splunk) integration
For each scenario, we provide examples of export requirements, detailed setup steps, best practices and tips on using the exported logs.
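As a rough sketch, the Pub/Sub scenario boils down to creating a log sink whose destination is a topic that your Splunk forwarder subscribes to. The project, topic and sink names below are hypothetical; you would normally also add a --log-filter to narrow what gets exported, and grant the sink’s writer identity permission to publish to the topic.

# Create the topic that will receive exported log entries.
gcloud pubsub topics create logs-export-topic

# Create a sink that exports log entries to that topic.
gcloud logging sinks create splunk-export-sink \
    pubsub.googleapis.com/projects/my-project/topics/logs-export-topic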

We’re always looking for more feedback and suggestions on how to improve Stackdriver Logging. Please keep sending us your requests and feedback.

Introducing Stackdriver APM and Stackdriver Profiler

Distributed tracing, debugging, and profiling for your performance-sensitive applications


Like all developers who care about their users, you’re probably obsessed with how your applications perform and how you can make them faster and more reliable. Monitoring and logging software like Stackdriver Monitoring and Logging provide a first line of defense, alerting you to potential infrastructure or security problems, but what if the performance problem lies deeper than that, in your code?

Here at Google, we’re developers too, and we know that tracking down performance problems in your code can be hard—particularly if the application is live. Today we’re announcing new products that offer the same Application Performance Management (APM) capabilities that we use internally to monitor and tune the performance of our own applications. These tools are powerful, can be used on applications running anywhere, and are priced so that virtually any developer can make use of them.

The foundation of our APM tooling is two existing products, Stackdriver Trace and Debugger, which give you the power to analyze and debug applications while they're running in production, without impacting user experience in any way.

On top of that, we’re introducing Stackdriver Profiler to our APM toolkit, which lets you profile and explore how your code actually executes in production, to optimize performance and reduce cost of computation.

We’re also announcing integrations between Stackdriver Debugger and GitHub Enterprise and GitLab, adding to our existing code mirroring functionality for GitHub, Bitbucket and Cloud Source Repositories, as well as locally-stored source code.

All of these tools work with code and applications that run on any cloud or even on-premises infrastructure, so no matter where you run your application, you now have a consistent, accessible APM toolkit to monitor and manage the performance of your applications.

Introducing Stackdriver Profiler


Production profiling is immensely powerful, and lets you gauge the impact of any function or line of code on your application’s overall performance. If you don’t analyze code execution in production, unexpectedly resource-intensive functions increase the latency and cost of web services every day, without anyone knowing or being able to do anything about it.

At Google, we continuously profile our applications to identify inefficiently written code, and these tools are used every day across the company. Outside of Google, however, these techniques haven’t been widely adopted by service developers, for a few reasons:
  1. While profiling client applications locally can yield useful results, inspecting service execution in development or test environments does not. 
  2. Profiling production service performance through traditional methods can be difficult and risks causing slowdowns for customers. 
  3. Existing production profiling tools can be expensive, and there’s always the option of simply scaling up a poorly performing service with more computing power (for a price).
Stackdriver Profiler addresses all of these concerns:
  1. It analyzes code execution across all environments. 
  2. It runs continually and uses statistical methods to minimize impact on targeted codebases.
  3. It makes it more cost-effective to identify and remediate your performance problems rather than scaling up and increasing your monthly bill. 
Stackdriver Profiler collects data via lightweight sampling-based instrumentation that runs across all of your application’s instances. It then displays this data on a flame chart, presenting the selected metric (CPU time, wall time, RAM used, contention, etc.) for each function on the horizontal axis, with the function call hierarchy on the vertical axis.
Early access customers have used Stackdriver Profiler to improve performance and reduce their costs.

"We used Stackdriver Profiler as part of an effort to improve the scalability of our services. It helped us to pinpoint areas we can optimize and reduce CPU time, which means a lot to us at our scale."
- Evan Yin, Software Engineer, Snap Inc.

"Profiler helped us identify very slow parts of our code which were hidden in the middle of large and complex batch processes. We run hundreds of batches every day, each with different data sets and configurations, which makes it hard to track down performance issues related to client-specific configurations. Stackdriver Profiler was super helpful."
- Nicolas Fonrose, CEO, Teevity

Stackdriver Profiler is now in public beta and available to everyone; see the Stackdriver Profiler documentation for the languages and environments it supports.

Unearth tricky code problems with Stackdriver Debugger

Stackdriver Debugger provides a familiar breakpoint-style debugging process for production applications, with no negative customer impact.


Additionally, Stackdriver Debugger’s logpoints feature allows you to add log statements to production apps, instantly, without having to redeploy them.
Debugger simplifies root-cause analysis for hard-to-find production code issues. Without Debugger, finding these kinds of problems usually requires manually adding new log statements to application code, redeploying any affected services, analyzing logs to determine what is actually going wrong, and finally, either discovering and fixing the issue or adding additional log statements and starting the cycle all over again. Debugger reduces this iteration cycle to zero.

Stackdriver Debugger is generally available; see the Stackdriver Debugger documentation for the languages and platforms it supports.

Reduce latency with Stackdriver Trace


Stackdriver Trace allows you to analyze how customer requests propagate through your application, and is immensely useful for reducing latency and performing root cause analysis. Trace continuously samples requests, automatically captures their propagation and latency, presents the results for display, and finds any latency-related trends. You can also add custom metadata to your traces for deeper analysis.
Trace is based on Google’s own Dapper, which pioneered the concept of distributed tracing and which we still use every day to make our services faster and more reliable.

We’re also adding multi-project support to Trace in the coming weeks, a long-requested feature that will let you view complete traces across multiple GCP projects at the same time. Expect to hear more about this very soon.

Stackdriver Trace is generally available; see the Stackdriver Trace documentation for the platforms and languages it supports.

Get started today with Stackdriver APM


Whether your application is just getting off the ground, or live and in production, using APM tools to monitor and tune its performance can be a game changer. To get started with Stackdriver APM, simply link the appropriate instrumentation library for each tool to your app and start gathering telemetry for analysis. Stackdriver Debugger is currently free, as is the beta of Stackdriver Profiler. Stackdriver Trace includes a large monthly quota of free trace submissions.

To learn more, see the Stackdriver Profiler, Debugger and Trace documentation.

Understand your spending at a glance with Google Cloud Billing reports beta



Whether you’re a developer working on a new project, an engineering manager checking your budget or a billing administrator keeping tabs on your monthly spending, you're probably asking yourself questions about your GCP bill such as:
  • Which project cost the most last month? 
  • What’s the trend for my GCP costs? 
  • Which GCP product costs the most?
Today, we’re excited to launch Cloud Billing reports in beta to help you quickly answer these questions, and others like them. Billing reports lets you view your GCP usage costs at a glance as well as discover and analyze trends.

With billing reports you can see data for all the projects linked to a billing account. You can adjust your views to uncover specific trends, including:
  • Costs grouped by project, product or SKU 
  • Different time aggregations, including daily and monthly views; you can even view hourly costs if you select a time range of one week or less. 
  • Costs with and without the application of service credits 
Let’s watch billing reports in action:

Billing reports will be available to all accounts in the coming weeks. Get started by navigating to your account’s billing page in the GCP console and opening the reports tab in the left-hand navigation bar.

You can learn more in the billing reports documentation. If you're interested in creating more visualizations of your billing data you can do so by exporting to BigQuery and visualizing your billing data with Data Studio.

Introducing the ability to connect to Cloud Shell from any terminal



If you develop or administer apps running on Google Cloud Platform (GCP), you’re probably familiar with Cloud Shell, an on-demand interactive shell environment that contains a wide variety of pre-installed developer tools. Up until now, you could only access Cloud Shell from your browser. Today, we're introducing the ability to connect to Cloud Shell directly from your terminal using the gcloud command-line tool.

Starting an SSH session is a single command:

erik@localhost:~$ ls
Desktop
erik@localhost:~$ gcloud alpha cloud-shell ssh
Welcome to Cloud Shell! Type "help" to get started.
erik@cloudshell:~$ ls
server.py  README-cloudshell.txt

You can also use gcloud to copy files between your Cloud Shell and your local machine:

erik@localhost:~$ gcloud alpha cloud-shell scp cloudshell:~/data.txt localhost:~
data.txt                                           100% 1897    28.6KB/s   00:00
erik@localhost:~$
If you're using Mac or Linux, you can even mount your Cloud Shell home directory onto your local file system after installing sshfs. This allows you to edit the files in your Cloud Shell home directory using whatever local tools you want! All the data in your remotely mounted file system is stored on a Persistent Disk, so it's fast, strongly consistent and retained across sessions and regions.

erik@localhost:~$ gcloud alpha cloud-shell get-mount-command ~/my-cloud-shell
sshfs [email protected]: /home/ekuefler/my-cloud-shell -p 6000 -oIdentityFile=/home/ekuefler/.ssh/google_compute_engine
erik@localhost:~$ sshfs [email protected]: /home/ekuefler/my-cloud-shell -p 6000 -oIdentityFile=/home/ekuefler/.ssh/google_compute_engine
erik@localhost:~$ cd my-cloud-shell
erik@localhost:~$ ls
server.py  README-cloudshell.txt
erik@localhost:~$ vscode server.py

We're sure you'll find plenty of uses for these features, but here are a few to get you started:
  • Use it as a playground — take advantage of the tools and language runtimes installed in Cloud Shell to do quick experiments without having to install software on your machine.
  • Use it as a sandbox — install or run untrusted programs in Cloud Shell without the risk of them damaging your local machine or reading your data, or to avoid polluting your machine with programs you rarely need to run.
  • Use it as a portable development environment — store your files in your Cloud Shell home directory and edit them using your favorite IDEs when you're at your desk, then keep working on the same files later from a Chromebook using the web terminal and editor.
The full documentation for the command-line interface is available here. The cloud-shell command group is currently in alpha, so we're still making changes to it and welcome your feedback and suggestions via the feedback link at the bottom of the documentation page.

Best practices for working with Google Cloud Audit Logging



As an auditor, you probably spend a lot of time reviewing logs. Google Cloud Audit Logging is an integral part of the Google Stackdriver suite of products, and understanding how it works and how to use it is a key skill you need to implement an auditing approach for systems deployed on Google Cloud Platform (GCP). In this post, we’ll discuss the key functionality of Cloud Audit Logging and call out some best practices.

The first thing to know about Cloud Audit Logging is that each project has two audit log streams: admin activity and data access. GCP services generate these logs to help you answer the question of "who did what, where, and when?" within your GCP projects. Further, these logs are distinct from your application logs.

Admin activity logs contain log entries for API calls or other administrative actions that modify the configuration or metadata of resources. Admin activity logs are always enabled. There's no charge for admin activity audit logs, and they're retained for 13 months/400 days.

Data access logs, on the other hand, record API calls that create, modify or read user-provided data. Data access audit logs are disabled by default because they can grow to be quite large.

For your reference, here’s the full list of GCP services that produce audit logs.


Configure and view audit logs


Getting started with Cloud Audit Logging is simple. Some services are on by default, and others are just a few clicks away from being operational. Here’s how to set up, configure and use various Cloud Audit Logging capabilities.

Configuring audit log collection 

Admin activity logs are enabled by default; you don’t need to do anything to start collecting them. With the exception of BigQuery, however, data access audit logs are disabled by default. Follow the guidance detailed in Configuring Data Access Logs to enable them.
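As a sketch, enabling data access logs for all services in a project amounts to adding an auditConfigs section to the project’s IAM policy. The project ID below is hypothetical, and you should adjust the services and log types to your needs:

# Download the current IAM policy.
gcloud projects get-iam-policy my-project --format=yaml > policy.yaml

# Add an auditConfigs block to policy.yaml, for example:
#   auditConfigs:
#   - service: allServices
#     auditLogConfigs:
#     - logType: ADMIN_READ
#     - logType: DATA_READ
#     - logType: DATA_WRITE

# Write the updated policy back to the project.
gcloud projects set-iam-policy my-project policy.yaml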

One best practice for data access logs is to use a test project to validate the configuration for your data access audit collection before you propagate it to developer and production projects. If you configure your IAM controls incorrectly, your projects may become inaccessible.

Viewing audit logs 

You can view audit logs from two places in the GCP Console: via the activity feed, which provides summary entries, and via the Stackdriver Logs viewer page, which gives full entries.

Permissions

You should consider access to audit log data as sensitive and configure appropriate access controls. You can do this by using IAM roles to apply access controls to logs.

To view logs, you need to grant the IAM role logging.viewer (Logs Viewer) for the admin activity logs, and logging.privateLogViewer (Private Logs viewer) for the data access logs.
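For example, granting a user access to the data access logs in a project might look like this (the project ID is hypothetical):

gcloud projects add-iam-policy-binding my-project \
    --member="user:[email protected]" \
    --role="roles/logging.privateLogViewer"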

This how-to guide describes some typical scenarios for configuring roles for Cloud Audit Logging and provides guidance on configuring IAM policies to control access to audit logs. One best practice is to ensure that you’ve applied appropriate IAM controls to restrict who can access the audit logs.

Viewing the activity feed

You can see a high-level overview of all your audit logs on the Cloud Console Activity page. Click on any entry to display a detailed view of that event, as shown below.

By default, this feed does not display data access logs. To enable them, go to the Filter configuration panel and select the “Data Access” field under Categories. (Note that you also need the Private Logs Viewer IAM role in order to see data access logs.)

Viewing audit logs via the Stackdriver Logs viewer 

You can view detailed log entries from the audit logs in the Stackdriver Logs Viewer. With Logs Viewer, you can filter or perform free text search on the logs, as well as select logs by resource type and log name (“activity” for the admin activity logs and “data_access” for the data access logs).

The example below displays some log entries in their JSON format, and highlights a few important fields.

Filtering Audit Logs 

Stackdriver provides both basic and advanced logs filters. Basic logs filters let you filter the results displayed in the feed by user, resource type and date/time.

An advanced logs filter is a Boolean expression that specifies a subset of all the log entries in your project. You can use it to choose log entries:
  • from specific logs or log services 
  • within a given time range
  • that satisfy conditions on metadata or user-defined fields 
  • that represent a sampling percentage of all log entries 
The following filter matches all calls made to the Cloud IAM API’s SetIamPolicy method.

resource.type="project"
logName="projects/a-project-id-here/logs/cloudaudit.googleapis.com%2Factivity"
protoPayload.methodName="SetIamPolicy"

Below is a snippet of the log entry showing that the SetIamPolicy call granted the BigQuery dataViewer IAM role to Alice.

resourceName: "projects/a-project-id-here"  
 response: {
  @type: "type.googleapis.com/google.iam.v1.Policy"   
  bindings: [
   0: {
    members: [
     0: "user:[email protected]"      
    ]
    role: "roles/bigquery.dataViewer"     
   }

Exporting logs

Log entries are held in Stackdriver Logging for a limited time known as the retention period. After that, the entries are deleted. To keep log entries longer, you need to export them outside of Stackdriver Logging by configuring log sinks.

A sink includes a destination and a filter that selects the log entries to export, and consists of the following properties:
  • Sink identifier: A name for the sink 
  • Parent resource: The resource in which you create the sink. This can be a project, folder, billing account, or an organization 
  • Logs filter: Selects which log entries to export through this sink, giving you the flexibility to export all logs or specific logs 
  • Destination: A single place to send the log entries matching your filter. Stackdriver Logging supports three destinations: Google Cloud Storage buckets, BigQuery datasets, and Cloud Pub/Sub topics. 
  • Writer identity: A service account that has been granted permissions to write to the destination.
You need to configure log sinks before you start receiving the logs you want to export; you can’t retroactively export log entries that were written before the sink was created.
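For example, a project-level sink that exports admin activity audit logs to a Cloud Storage bucket might look like this (the sink, project and bucket names are hypothetical; remember to grant the sink’s writer identity write access to the bucket):

gcloud logging sinks create my-audit-sink \
    storage.googleapis.com/my-audit-log-bucket \
    --log-filter='logName: "logs/cloudaudit.googleapis.com%2Factivity"'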

Another feature for working with logs is Aggregated Exports, which allows you to set up a sink at the Cloud IAM organization or folder level, and export logs from all the projects inside the organization or folder. For example, the following gcloud command sends all admin activity logs from your entire organization to a single BigQuery sink:

gcloud logging sinks create my-bq-sink \
  bigquery.googleapis.com/projects/my-project/datasets/my_dataset \
  --log-filter='logName: "logs/cloudaudit.googleapis.com%2Factivity"' \
  --organization=1234 --include-children

Be aware that an aggregated export sink sometimes exports very large numbers of log entries. When designing your aggregated export sink to export the data you need to store, here are some best practices to keep in mind:

  • Ensure that logs are exported for longer term retention 
  • Ensure that appropriate IAM controls are set against the export sink destination 
  • Design aggregated exports for your organization to filter and export the data that will be useful for future analysis 
  • Configure log sinks before you start receiving logs 
  • Follow the best practices for common logging export scenarios 

Managing exclusions



Stackdriver Logging provides exclusion filters to let you completely exclude certain log messages for a specific product or messages that match a certain query. You can also choose to sample certain messages so that only a percentage of the messages appear in Stackdriver Logs Viewer. Excluded log entries do not count against the Stackdriver Logging logs allotment provided to projects.

It’s also possible to export log entries before they're excluded. For more information, see Exporting Logs. Excluding this noise will not only make it easier to review the logs but will also allow you to minimize any charges for logs over your monthly allotment.

Best practices:

  • Ensure you’re using exclusion filters to exclude logging data that will not be useful. For example, you shouldn’t need to store data access logs in development projects. Storing data access logs is a paid service (see our log allotment and coverage charges), so recording superfluous data incurs unnecessary overhead.


Cloud Audit Logging best practices, recapped

Cloud Audit Logging is a powerful tool that can help you manage and troubleshoot your GCP environment, as well as demonstrate compliance. As you start to set up your logging environment, here are some best practices to keep in mind:

  • Use a test project to validate the configuration of your data-access audit collection before propagating to developer and production projects 
  • Be sure you’ve applied appropriate IAM controls to restrict who can access the audit logs 
  • Determine whether you need to export logs for longer-term retention 
  • Set appropriate IAM controls against the export sink destination 
  • Design aggregated exports for your organization to filter and export the data that will be useful for future analysis 
  • Configure log sinks before you start receiving logs 
  • Follow the best practices for common logging export scenarios 
  • Make sure to use exclusion filters to exclude logging data that isn’t useful.

We hope you find these best practices helpful when setting up your audit logging configuration. Please leave a comment if you have any best practice tips of your own.