
Monitor and manage your costs with Cloud Platform billing export to BigQuery



The flexibility and scalability of the cloud mean that your usage can fluctuate dramatically from day to day with demand. And while you always pay only for what you use, customers often ask us to help them better understand their bill.

A prerequisite for understanding your bill is better access to detailed usage and billing data. So today, we are excited to announce the general availability of billing export to BigQuery, our data warehouse service, enabling a more granular and timely view into your GCP costs than ever before.

Billing export to BigQuery is a new and improved version of our existing billing export to CSV/JSON files and, as the name implies, exports your cloud usage data directly into a BigQuery dataset. Once the data is there, you can write simple SQL queries in BigQuery, visualize your data in Data Studio, or programmatically export the data into other tools to analyze your spend.

New billing data is exported automatically into the dataset as it becomes available, usually multiple times per day. BigQuery billing export also contains a few new features to help you organize your data:
  • User labels to categorize and track costs 
  • Additional product metadata to organize by GCP services: 
    • Service description 
    • Service category 
    • SKU ID to uniquely identify each resource type 
  • Export time to help organize cost by invoice 

Getting started with billing export to BigQuery 


It’s easy to export billing data into BigQuery and start analyzing it. The first step is to enable the export by following these setup instructions, which begins building your billing dataset. Note that you need Billing Admin permissions in GCP to enable export, so check that you have the appropriate permissions or work with your Billing Admin.

Once you have billing export set up, the data will start populating automatically within a few hours, and your BigQuery dataset will continue to update automatically as new data becomes available.


NOTE: Your BigQuery dataset only reflects costs incurred from the date you set up billing export; we will not backfill billing data at this time. While our existing CSV and JSON export features remain available in their current format, we strongly encourage you to enable billing export to BigQuery as early as possible to build out your billing dataset and take advantage of the more granular cost analysis it allows.

Querying the billing export data


Now that you've populated your dataset, you can start the fun part: data analysis. You can export the full dataset, complete with new elements such as user labels, or write queries against the data to answer specific questions. Here are a couple of simple examples of how you might use BigQuery queries on exported billing data.

Query every row without grouping


For the most granular view of your billing costs, query every row without grouping. This assumes that all fields other than labels and resource type (project, product, and so on) are the same.

SELECT
     resource_type,
     TO_JSON_STRING(labels) as labels,
     cost as cost
FROM `project.dataset.table`;

Group by label map as a JSON string 

This is a quick and easy way to break down cost by each label combination.

SELECT
     TO_JSON_STRING(labels) as labels,
     sum(cost) as cost
FROM `project.dataset.table`
GROUP BY labels;

You can see more query examples or write your own.
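
As mentioned earlier, you can also pull this data programmatically into other tools. Here is a minimal Python sketch, assuming the google-cloud-bigquery client library, that runs the label-grouping query above and prints the results; the project.dataset.table path is a placeholder for your own export table.

# Minimal sketch: read exported billing data programmatically for further analysis.
# Replace `project.dataset.table` with the dataset and table created by your export.
from google.cloud import bigquery

client = bigquery.Client()  # uses your default GCP credentials and project

query = """
SELECT
  TO_JSON_STRING(labels) AS labels,
  SUM(cost) AS cost
FROM `project.dataset.table`
GROUP BY labels
ORDER BY cost DESC
"""

for row in client.query(query).result():
    print(f"{row.labels}: {row.cost:.2f}")

From there, the rows can just as easily be written to CSV, loaded into pandas, or handed to whatever reporting tool your team already uses.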

Visualize spend over time with Data Studio


Many business intelligence tools natively integrate with BigQuery as the backend datastore. With Data Studio, you can easily visualize your BigQuery billing data, and with a few clicks set up a dashboard and get up-to-date billing reports throughout the day, using labels to slice and dice your GCP bill.


You can find detailed instructions on how to copy and set up a Data Studio template here: Visualize spend over time with Data Studio

Here at Google Cloud, we’re all about making your cloud costs as transparent and predictable as possible. To learn more about billing export to BigQuery, check out the documentation, and let us know how else we can help you understand your bill by sending us feedback.

Demystifying ML: How machine learning is used for speech recognition


This is the second blog post in our series that looks at how machine learning can be used to solve common problems. In this article, we discuss how ML is used in speech recognition.

In our previous post, we talked about how email classification can be used to improve customer service. But emails are just one way customers interact with businesses. When it comes to difficult issues, 48% of customers opt to speak with a customer service representative over the phone rather than use text-based chat or email (source: American Express 2014 Global Customer Service Barometer). Additionally, in today’s business landscape, an increasing number of interactions are happening in real time.

Take the example of a commercial bank. For urgent matters — say, a customer reporting a stolen credit card — it doesn’t make sense to send an email. In these instances, a customer is more likely to call a customer service representative, and getting that customer to the right representative as fast as possible can be the difference between a minor inconvenience and a bigger problem. This means that speech recognition systems, with the ability to swiftly identify exact words and their context, are more important than ever.

Since speech recognition requires bridging the gap from the physical to the digital world, there are many layers of engineering that go into the process. In layman's terms, you start with the input: the audio waveform. This waveform is digitized and converted using a Fourier transform — which converts a signal from a function of time into a function of frequency — similar to the spectrum display grid on certain audio equipment. We then use machine learning to find the most likely phonemes (distinct units of sound) and probable sequences of words based on the sequence of converted frequency graphs. Finally, depending on the application, an output in the form of a textual answer or result is returned. In the case of a customer service call center, this textual output (or its binary equivalent) allows your call to be routed, typically in a matter of milliseconds.
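
To make the front of that pipeline a bit more concrete, here is a small illustrative Python sketch that slices a digitized waveform into short overlapping frames and applies a Fourier transform to each one, producing the frequency-over-time view described above. The 25 ms frame length and 10 ms hop are common but arbitrary choices for this example.

# Illustrative only: turn a digitized waveform into per-frame frequency
# magnitudes (a crude spectrogram), the representation an acoustic model sees.
import numpy as np

def spectrogram(waveform, sample_rate=16000, frame_ms=25, hop_ms=10):
    frame_len = int(sample_rate * frame_ms / 1000)   # samples per frame
    hop_len = int(sample_rate * hop_ms / 1000)       # samples between frame starts
    window = np.hanning(frame_len)                   # taper each frame to reduce spectral leakage
    frames = []
    for start in range(0, len(waveform) - frame_len + 1, hop_len):
        frame = waveform[start:start + frame_len] * window
        frames.append(np.abs(np.fft.rfft(frame)))    # magnitude spectrum of the frame
    return np.array(frames)                          # shape: (num_frames, num_frequency_bins)

# Toy example: one second of a 440 Hz tone standing in for real speech.
t = np.linspace(0, 1, 16000, endpoint=False)
print(spectrogram(np.sin(2 * np.pi * 440 * t)).shape)  # (98, 201)

An acoustic model then consumes frame sequences like these (in practice, usually mel-scaled features rather than raw magnitudes) and estimates which phonemes they most likely represent.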

Building your own speech recognition system is a complex process, and each layer involves its own interesting implementations and challenges. In this post, we’ll focus on phoneme modeling, i.e., isolated word recognition.

Phoneme modeling as described here is a simplified representation. The actual process contains all possible phonemes, and looks at matching not just discrete phonemes, but the beginning, middle, and end of those waveforms (a "k" sound, or the aspirated start of a certain vowel, for example).
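
As a concrete, if toy, sketch of isolated word recognition, the following Python example assumes the open-source hmmlearn library: one Gaussian HMM is trained per vocabulary word on frame-level acoustic features (such as the spectrogram frames above), and an unknown utterance is assigned to whichever word model gives it the highest likelihood. The features here are random stand-ins; a real system would train on labeled recordings and far richer features.

# Toy sketch of isolated word recognition: one HMM per vocabulary word,
# scored by log-likelihood. Assumes frame-level features such as the
# spectrogram frames above (or, more commonly, MFCCs).
import numpy as np
from hmmlearn import hmm

def train_word_model(feature_sequences, n_states=5):
    # Fit one Gaussian HMM on several recordings of the same word.
    X = np.vstack(feature_sequences)                    # stack all frames
    lengths = [len(seq) for seq in feature_sequences]   # frames per recording
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=20)
    model.fit(X, lengths)
    return model

def recognize(features, word_models):
    # Return the word whose model assigns the utterance the highest log-likelihood.
    return max(word_models, key=lambda word: word_models[word].score(features))

# Random "features" stand in for labeled recordings of each word.
rng = np.random.default_rng(0)
word_models = {
    word: train_word_model([rng.normal(loc=i, size=(60, 13)) for _ in range(5)])
    for i, word in enumerate(["card", "balance", "fraud"])
}
print(recognize(rng.normal(loc=2, size=(60, 13)), word_models))  # expected: "fraud"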

Returning to our example of the customer service call center for a moment, a hidden Markov model (HMM) can construct a graph linking phonemes, or sometimes even consecutive words, into a sequence, resulting in a histogram of possible outputs corresponding to the various support teams in your company. With a large dataset of recorded customer statements and their call center destinations, you can build a robust AI-based routing system that gets customers to the right help as fast as possible.
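
As a simplified stand-in for that routing step (separate from the recognition model itself), the sketch below trains an ordinary text classifier on already-transcribed customer statements and their destination teams, using the scikit-learn library. The statements and team names are invented for illustration.

# Hypothetical stand-in for the routing step: classify transcribed customer
# statements into destination teams with a simple bag-of-words model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented training data; a real system would use thousands of labeled calls.
statements = [
    "my credit card was stolen yesterday",
    "I want to report a fraudulent charge on my card",
    "what is my current account balance",
    "can you help me reset my online banking password",
]
destinations = ["fraud", "fraud", "accounts", "tech_support"]

router = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
router.fit(statements, destinations)

print(router.predict(["someone used my card without permission"])[0])  # likely "fraud"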

As we noted earlier, building your own speech recognition system is a major undertaking. It requires a large dataset to train your model, and this sample data must be labeled through a fairly manual and laborious process. At scale, the data is costly to store on-site, and converting all the stored data into a functioning model requires multiple iterative attempts, substantial computational resources, and sometimes days or weeks of training. If you’re interested in quickly deploying a speech-based application but want to avoid the ordeal of training your own model, you can always use a tool like Cloud Speech API.
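
Getting a transcript back from Cloud Speech API can be just a few lines of code. The sketch below assumes the google-cloud-speech Python client library; the file name, encoding, and sample rate are placeholder values for a short audio clip, and exact class names can vary between client library versions.

# Sketch: transcribe a short audio clip with the Cloud Speech API using the
# google-cloud-speech Python client. File name and audio settings are examples.
from google.cloud import speech

client = speech.SpeechClient()

with open("customer_call.wav", "rb") as audio_file:
    audio = speech.RecognitionAudio(content=audio_file.read())

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
)

response = client.recognize(config=config, audio=audio)
for result in response.results:
    print(result.alternatives[0].transcript)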

There are also many more ways speech recognition can be helpful — from closed captioning to real-time transcription. If you’re interested in learning more, you can check out our best practices and sample applications.