
Omnitone: Spatial audio on the web


Spatial audio is a key element for an immersive virtual reality (VR) experience. By bringing spatial audio to the web, the browser can be transformed into a complete VR media player with incredible reach and engagement. That's why the Chrome WebAudio team has created and is releasing the Omnitone project, an open source spatial audio renderer with cross-browser support.

Our challenge was to introduce the audio spatialization technique called ambisonics so that users can hear full-sphere surround sound in the browser. To achieve this, we implemented ambisonic decoding with binaural rendering using web technology. There are several paths for introducing a new feature into the web platform, but we chose to use only the Web Audio API. In doing so, we can reach a larger audience with this cross-browser technology, and we can also avoid the lengthy standardization process for introducing a new Web Audio component. This is possible because the Web Audio API provides all the necessary building blocks for this audio spatialization technique.



[Figure: Omnitone audio processing diagram]

The AmbiX format recording, which is the target of the Omnitone decoder, contains four channels of ambisonically encoded audio that can be decoded to an arbitrary speaker setup. Instead of an actual speaker array, Omnitone uses 8 virtual speakers based on head-related transfer function (HRTF) convolution to render the final audio stream binaurally. This binaurally rendered audio can convey a sense of space when it is heard through headphones.

The beauty of this mechanism lies in the sound-field rotation applied to the incoming spatial audio stream. The orientation sensor of a VR headset or a smartphone can be linked to Omnitone’s decoder to seamlessly rotate the entire sound field. The rest of the spatialization process will be handled automatically by Omnitone. A live demo can be found at the project landing page.
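
A minimal sketch of that wiring, assuming the decoder API described in the project README (constructor and method names here are illustrative and may differ from the current release):

    // Hypothetical sketch: route a 4-channel AmbiX <video> element through
    // Omnitone and rotate the sound field from head-orientation updates.
    const context = new AudioContext();
    const element = document.querySelector('video');  // AmbiX (4-channel) source

    const decoder = Omnitone.createFOADecoder(context, element);
    decoder.initialize().then(() => element.play());

    // On each sensor update, apply a 3x3 rotation matrix derived from the
    // VR headset or device orientation; Omnitone handles the rest.
    function onOrientationChange(rotationMatrix3) {
      decoder.setRotationMatrix(rotationMatrix3);
    }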

Throughout the project, we worked closely with the Google VR team for their VR audio expertise. Not only was their knowledge of spatial audio a tremendous help for the project, but the collaboration also ensured identical audio spatialization across all of Google's VR applications, both on the web and on Android (e.g. Google VR SDK, YouTube Android app). The Spatial Media Specification and HRTF sets are great examples of the Google VR team's efforts, and Omnitone is built on top of them.

With emerging web-based VR projects like WebVR, Omnitone's audio spatialization can play a critical role in a more immersive VR experience on the web. Web-based VR applications will also benefit from high-quality streaming spatial audio, as the Chrome Media team has recently added first-order ambisonics (FOA) compression to the open source audio codec Opus. More exciting things like VR View integration, higher-order ambisonics, and mobile web support will also be coming soon to Omnitone.

We look forward to seeing what people do with Omnitone now that it's open source. Feel free to reach out to us or leave a comment with your thoughts and feedback on the issue tracker on GitHub.

By Hongchan Choi and Raymond Toy, Chrome Team

Note: due to incomplete implementations of multichannel audio decoding in various browsers, Omnitone does not support the mobile web at the time of writing.

Kubernetes 1.3 is here!

With all of the excitement being generated around the Kubernetes 1.3 release and the first anniversary of Kubernetes 1.0 (#k8sbday), now is a great time to point out some of the features that enterprise users should be taking note of.

If you’re not familiar with Kubernetes, let me get you up to speed.

Kubernetes is an open source container automation framework that builds upon 15 years of experience running production workloads at Google. Once you declare a desired state, Kubernetes works to drive your system toward that state. For developers, this means less time handling trivial tasks that a computer can automate and more time focusing on developing applications that provide value to users.
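
As a sketch of what declaring a desired state looks like, here is a minimal, hypothetical Deployment manifest; Kubernetes continuously reconciles the cluster toward the three requested replicas, restarting or rescheduling containers as needed (API group shown as of the 1.3 era):

    apiVersion: extensions/v1beta1    # Deployment API group circa Kubernetes 1.3
    kind: Deployment
    metadata:
      name: hello-app                 # hypothetical application
    spec:
      replicas: 3                     # desired state: three running copies
      template:
        metadata:
          labels:
            app: hello-app
        spec:
          containers:
          - name: hello-app
            image: gcr.io/example/hello-app:1.0   # hypothetical image
            ports:
            - containerPort: 8080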

Additionally, Kubernetes aims to be a framework that you can operate at planetary scale, run anywhere, and never outgrow.

With the release of Kubernetes 1.3, Kubernetes is closer than ever to meeting those goals; the 1.3 release adds a number of exciting new features.

Aside from features, the coolest part about working with Kubernetes is hearing user stories. I'll soon be publishing an interview with Joseph Jacks, co-founder of Kismatic, the enterprise Kubernetes company, on the Kubernetes blog.

Joseph is very active in the Kubernetes community and has extensive experience with Kubernetes in production. In the interview I ask him why he bet his business on Kubernetes, what could be better, and how he sees Kubernetes growing in the near future.

Kubernetes has many, many features to offer that I didn't get to cover in this short write-up. If you know anyone who needs to ramp up on Kubernetes, the easiest way is the free course I created with Kelsey Hightower, Scalable Microservices with Kubernetes. The course covers the basic features of Kubernetes. If you want an overview of what's new in Kubernetes 1.3, feel free to look at the "What's new in Kubernetes 1.3" video or slides.

Finally, for a more in-depth look at the 1.3 release, make sure to check out the 5 Days of Kubernetes 1.3 blog series.

Want to learn more about container orchestration and cloud native platforms? The resources above are a good place to start.

By Carter Morgan, Developer Programs Engineer

GitHub on BigQuery: Analyze all the code



Google, in collaboration with GitHub, is releasing an incredible new open dataset on Google BigQuery. So far you've been able to monitor and analyze GitHub's pulse since 2011 (thanks GitHub Archive project!) and today we're adding the perfect complement to this. What could you do if you had access to analyze all the open source software in the world, with just one SQL command?

The Google BigQuery Public Datasets program now offers a full snapshot of the content of more than 2.8 million open source GitHub repositories in BigQuery. Thanks to our new collaboration with GitHub, you'll be able to analyze the source code of almost 2 billion files with a simple (or complex) SQL query. This will open the doors to all kinds of new insights and advances that we're just beginning to envision.

For example, let's say you're the author of a popular open source library. Now you'll be able to find every open source project on GitHub that's using it. Even better, you'll be able to guide the future of your project by analyzing how it's being used, and improve your APIs based on what your users are actually doing with it.
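
For instance, a query along these lines would surface the repositories that use your library most heavily. This is a sketch against the smaller sample_* extracts of the dataset; the table and column names follow the published schema, but check the current documentation before relying on them:

    -- Which repositories import a (hypothetical) library named "mylib"?
    SELECT f.repo_name, COUNT(*) AS matches
    FROM `bigquery-public-data.github_repos.sample_files` AS f
    JOIN `bigquery-public-data.github_repos.sample_contents` AS c
      ON f.id = c.id
    WHERE c.content LIKE '%import mylib%'
    GROUP BY f.repo_name
    ORDER BY matches DESC
    LIMIT 10;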

On the security side, we've seen how the most popular open source projects benefit from having multiple eyes and hands working on them. This visibility helps projects get hardened and buggy code cleaned up. What if you could search for errors with similar patterns in every other open source project? Would you notify their authors and send them pull requests? Well, now you can.

To learn more, read GitHub's announcement and try some sample queries. Share your queries and findings in our reddit.com/r/bigquery and Hacker News posts. The ideas are endless, and I'll start collecting tips and links to other articles on this post on Medium.

Stay curious!

Announcing SyntaxNet: The World’s Most Accurate Parser Goes Open Source

Originally posted on the Google Research Blog

By Slav Petrov, Senior Staff Research Scientist

At Google, we spend a lot of time thinking about how computer systems can read and understand human language in order to process it in intelligent ways. Today, we are excited to share the fruits of our research with the broader community by releasing SyntaxNet, an open-source neural network framework implemented in TensorFlow that provides a foundation for Natural Language Understanding (NLU) systems. Our release includes all the code needed to train new SyntaxNet models on your own data, as well as Parsey McParseface, an English parser that we have trained for you and that you can use to analyze English text.

Parsey McParseface is built on powerful machine learning algorithms that learn to analyze the linguistic structure of language, and that can explain the functional role of each word in a given sentence. Because Parsey McParseface is the most accurate such model in the world, we hope that it will be useful to developers and researchers interested in automatic extraction of information, translation, and other core applications of NLU.

How does SyntaxNet work?

SyntaxNet is a framework for what’s known in academic circles as a syntactic parser, which is a key first component in many NLU systems. Given a sentence as input, it tags each word with a part-of-speech (POS) tag that describes the word's syntactic function, and it determines the syntactic relationships between words in the sentence, represented in the dependency parse tree. These syntactic relationships are directly related to the underlying meaning of the sentence in question. To take a very simple example, consider the following dependency tree for Alice saw Bob:
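
Rendered as the kind of ASCII tree the SyntaxNet demo script prints (approximate; exact output formatting may differ):

    saw VBD ROOT
     +-- Alice NNP nsubj
     +-- Bob NNP dobj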


This structure encodes that Alice and Bob are nouns and saw is a verb. The main verb saw is the root of the sentence and Alice is the subject (nsubj) of saw, while Bob is its direct object (dobj). As expected, Parsey McParseface analyzes this sentence correctly, but also understands the following more complex example:
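
[Dependency parse of "Alice, who had been reading about SyntaxNet, saw Bob in the hallway yesterday"]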


This structure again encodes the fact that Alice and Bob are the subject and object respectively of saw; in addition, it encodes that Alice is modified by a relative clause with the verb reading, that saw is modified by the temporal modifier yesterday, and so on. The grammatical relationships encoded in dependency structures allow us to easily recover the answers to various questions, for example whom did Alice see?, who saw Bob?, what had Alice been reading about? or when did Alice see Bob?

Why is Parsing So Hard For Computers to Get Right?

One of the main problems that makes parsing so challenging is that human languages show remarkable levels of ambiguity. It is not uncommon for moderate-length sentences (say, 20 or 30 words) to have hundreds, thousands, or even tens of thousands of possible syntactic structures. A natural language parser must somehow search through all of these alternatives and find the most plausible structure given the context. As a very simple example, the sentence Alice drove down the street in her car has at least two possible dependency parses:
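
[Two dependency parses of "Alice drove down the street in her car": one attaching "in her car" to drove, the other attaching it to street]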


The first corresponds to the (correct) interpretation where Alice is driving in her car; the second corresponds to the (absurd, but possible) interpretation where the street is located in her car. The ambiguity arises because the preposition in can either modify drove or street; this example is an instance of what is called prepositional phrase attachment ambiguity.

Humans do a remarkable job of dealing with ambiguity, almost to the point where the problem is unnoticeable; the challenge is for computers to do the same. Multiple ambiguities such as these in longer sentences conspire to give a combinatorial explosion in the number of possible structures for a sentence. Usually the vast majority of these structures are wildly implausible, but are nevertheless possible and must be somehow discarded by a parser.

SyntaxNet applies neural networks to the ambiguity problem. An input sentence is processed from left to right, with dependencies between words being incrementally added as each word in the sentence is considered. At each point in processing many decisions may be possible—due to ambiguity—and a neural network gives scores for competing decisions based on their plausibility. For this reason, it is very important to use beam search in the model. Instead of simply taking the first-best decision at each point, multiple partial hypotheses are kept at each step, with hypotheses only being discarded when there are several other higher-ranked hypotheses under consideration. An example of a left-to-right sequence of decisions that produces a simple parse is shown below for the sentence I booked a ticket to Google.
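
[Animation: a left-to-right sequence of parser decisions for "I booked a ticket to Google"]

To make the mechanism concrete, here is a toy sketch of beam search, not SyntaxNet's actual code; expand and score stand in for the transition system and the neural network scorer:

    import heapq

    def beam_search(sentence, expand, score, beam_size=8):
        """Keep the beam_size highest-scoring partial parses at each step."""
        beam = [([], 0.0)]  # (sequence of decisions, cumulative score)
        for word in sentence:
            candidates = []
            for decisions, total in beam:
                for decision in expand(decisions, word):  # legal next decisions
                    candidates.append((decisions + [decision],
                                       total + score(decisions, decision)))
            # A hypothesis is discarded only when beam_size better ones exist.
            beam = heapq.nlargest(beam_size, candidates, key=lambda c: c[1])
        return max(beam, key=lambda c: c[1])[0]  # best-scoring decision sequence
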
Furthermore, as described in our paper, it is critical to tightly integrate learning and search in order to achieve the highest prediction accuracy. Parsey McParseface and other SyntaxNet models are some of the most complex networks that we have trained with the TensorFlow framework at Google. Given some data from the Google-supported Universal Treebanks project, you can train a parsing model on your own machine.

So How Accurate is Parsey McParseface?

On a standard benchmark consisting of randomly drawn English newswire sentences (the 20-year-old Penn Treebank), Parsey McParseface recovers individual dependencies between words with over 94% accuracy, beating our own previous state-of-the-art results, which were already better than any previous approach. While there are no explicit studies in the literature about human performance, we know from our in-house annotation projects that linguists trained for this task agree in 96-97% of the cases. This suggests that we are approaching human performance—but only on well-formed text. Sentences drawn from the web are a lot harder to analyze, as we learned from the Google WebTreebank (released in 2011). Parsey McParseface achieves just over 90% parse accuracy on this dataset.

While the accuracy is not perfect, it's certainly high enough to be useful in many applications. The major sources of errors at this point are examples such as the prepositional phrase attachment ambiguity described above, which require real-world knowledge (e.g. that a street is not likely to be located in a car) and deep contextual reasoning. Machine learning (and in particular, neural networks) has made significant progress in resolving these ambiguities, but our work is still cut out for us: we would like to develop methods that can learn world knowledge and enable equal understanding of natural language across all languages and contexts.

To get started, see the SyntaxNet code and download the Parsey McParseface parser model. Happy parsing from the main developers, Chris Alberti, David Weiss, Daniel Andor, Michael Collins & Slav Petrov.

XRay: a function call tracing system

At Google we spend a lot of time debugging and tuning the performance of our production systems. Standard practices for doing this involve using profilers, debuggers, and analysis of logs and execution traces. Doing this at scale, in production, is difficult. One way to get high-fidelity data from production systems is to build applications with instrumentation, and then reconstruct the instrumentation data into a form humans can consume (summary statistics, reports, etc.). Instrumentation comes at a cost, though, sometimes too high to make it feasible to deploy in production.

Getting this balance right is hard. This is why we've developed XRay, a function call tracing system that has very little overhead when not enabled, but can be dynamically turned on and only impose moderate costs. XRay works as a combination of compiler-inserted instrumentation points which functionally do nothing (called "nop sleds") and a library that can be enabled and disabled at runtime which replaces the nop sleds with the appropriate instrumentation instructions.
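
To illustrate the idea, here is a hypothetical sketch based on the design described above, not the released XRay interface; the attribute, flag, and runtime call named below are assumptions:

    #include <cstdio>

    // Ask the compiler to plant nop sleds at the entry and exit of this
    // function (assumed attribute; the build would also need a flag along
    // the lines of -fxray-instrument).
    [[clang::xray_always_instrument]]
    void HandleRequest(int id) {
      std::printf("handling request %d\n", id);
    }

    int main() {
      HandleRequest(1);  // tracing off: the sleds execute as inert nops

      // A runtime call such as __xray_patch() (assumed name) would overwrite
      // the sleds with jumps into the instrumentation library, so subsequent
      // calls emit function entry/exit records.
      HandleRequest(2);
    }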

We've been using XRay to debug internal systems, from core infrastructure services like Bigtable to ad serving systems. XRay's detailed function tracing has enabled several teams at Google to debug issues that would be really hard to solve without it.

We think XRay is an important piece of technology, not only at Google, but for developers around the world. That's why we're working on making XRay open source. To kick-start that process, we're releasing a white paper describing the technical details of XRay. In the coming weeks, we will be engaging the LLVM community, where we are committed to making XRay available for wide use and distribution.

We hope that by open-sourcing XRay we can contribute to the advancement of debugging real-world applications. We're looking forward to working with the LLVM community and other projects to make the data XRay generates useful for debugging a wide variety of applications.

By Dean Michael Berris, Google Engineering

CCTZ v2.0 — now with more civil time

Last September we announced an open source project called CCTZ, a C++ library that enables computing with arbitrary time zones. Today we're announcing CCTZ v2.0, which introduces a new civil time library. Civil time is a legally recognized representation of time used by humans (i.e., year, month, day, hour, minute and second). The most common example of a civil time is a time-zone-independent date. In version 2.0, CCTZ's time zone and new civil time libraries cooperate with the standard C++ <chrono> library to give programmers a complete (and simple!) framework in which to reason about and solve even the most complicated time programming problems.
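
A minimal sketch of the model, using the v2.0 API as documented on the project page (zone names and values are just examples):

    #include <chrono>
    #include <iostream>
    #include "cctz/civil_time.h"
    #include "cctz/time_zone.h"

    int main() {
      // A civil time: six fields, independent of any time zone.
      const cctz::civil_second cs(2016, 9, 22, 9, 35, 0);

      // Interpreting that civil time in a zone yields an absolute time_point...
      cctz::time_zone lax;
      if (!cctz::load_time_zone("America/Los_Angeles", &lax)) return 1;
      const auto tp = cctz::convert(cs, lax);

      // ...which can then be rendered in any other zone.
      cctz::time_zone nyc;
      if (!cctz::load_time_zone("America/New_York", &nyc)) return 1;
      std::cout << cctz::format("%Y-%m-%d %H:%M:%S %z", tp, nyc) << "\n";
    }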

To learn more, please check out the project page on GitHub. Pay particular attention to the fundamental concepts section which establishes a simple, cross-platform and language agnostic mental model that will help you reason about time programming challenges with ease and confidence. And don't forget to subscribe to the new CCTZ mailing list to ask questions and learn about future announcements.

By Greg Miller and Bradley White, Google Engineering

New algorithms may lower the cost of secure computing

Here at Google we strive to make computing not only more cost-efficient, faster, and easier, but also more secure. Hash functions are essential building blocks in computing, but they must be protected against specially crafted inputs that can trigger excessive collisions (hash flooding). Today, we are open-sourcing three new hash function implementations: faster, data-parallel versions of SipHash, a fast cryptographically strong pseudorandom function, and the entirely new HighwayHash, which reaches even higher speeds thanks to the data-parallel features of modern computers.

Our first hash function produces the same output as SipHash, but 1.5 times as quickly thanks to AVX2 instructions. The second improvement uses j-lanes tree hashing to process multiple inputs in parallel, which is 3 times as fast. This technique is known to be secure, but it produces different output than the original SipHash and is slightly slower for short inputs.
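
To give a flavor of the j-lanes idea, here is a toy, non-cryptographic sketch; the mixing function is a placeholder rather than SipHash, and real implementations process the lanes in parallel with SIMD instructions:

    #include <cstdint>
    #include <cstdio>
    #include <cstring>

    // Placeholder mixer (NOT cryptographically strong).
    static uint64_t Mix(uint64_t h, uint64_t v) {
      h ^= v;
      h *= 0x9E3779B97F4A7C15ull;
      return h ^ (h >> 29);
    }

    // j = 4 lanes: byte i is absorbed into lane i % 4, so the four lane
    // states can be updated independently (and thus in parallel).
    static uint64_t JLanesHash(const uint8_t* data, size_t size) {
      uint64_t lanes[4] = {1, 2, 3, 4};  // per-lane seeds (stand-ins for a key)
      for (size_t i = 0; i < size; ++i) {
        lanes[i % 4] = Mix(lanes[i % 4], data[i]);
      }
      uint64_t h = 0;  // combine the lane results into one digest
      for (uint64_t lane : lanes) h = Mix(h, lane);
      return h;
    }

    int main() {
      const char* msg = "The quick brown fox";
      std::printf("%016llx\n", static_cast<unsigned long long>(JLanesHash(
          reinterpret_cast<const uint8_t*>(msg), std::strlen(msg))));
    }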

HighwayHash is based on a new way of mixing inputs with just a few AVX2 multiply and permute instructions. We are hopeful that the result is a cryptographically strong pseudorandom function, but new cryptanalysis methods might be needed for analyzing this promising family of hash functions. HighwayHash is significantly faster than SipHash for all measured input sizes, with about 7 times higher throughput at 1 KiB.

We believe our efforts represent the current state of the art in high-speed attack-resistant hashing. These new functions can lower the cost of safe and secure computing. We invite everyone to use, study, and analyze the open-source implementations.

By Jan Wassenberg and Jyrki Alakuijala, Google Research

Seesaw: scalable and robust load balancing

Like all good projects, this one started out because we had an itch to scratch…


As a Site Reliability Engineer working on Google's corporate infrastructure, I look after a large number of internally used services that need to be load balanced, both for scalability and reliability. Back in 2012, load balancing was provided by two different platforms, and both presented different sets of management and stability challenges. In order to alleviate these issues, we set about looking for a replacement load balancing platform.

After evaluating a number of platforms, including existing open source projects, we were unable to find one that met all of our needs and decided to develop a robust and scalable load balancing platform ourselves. The requirements were not exactly complex: we needed the ability to handle traffic for unicast and anycast VIPs, perform load balancing with NAT and DSR (also known as DR), and perform adequate health checks against the backends. Above all, we wanted a platform that allowed for ease of management, including automated deployment of configuration changes.

One of the two existing platforms was built upon the Linux Virtual Server (LVS), which provided the necessary load balancing at the network level. This was known to work successfully and we opted to retain it for the new platform. Several design decisions were made early on in the project. The first of these was to use the Go programming language, since it provides an incredibly powerful way to implement concurrency (goroutines and channels), along with easy interprocess communication (net/rpc). The second was to implement a modular multi-process architecture. The third was to simply abort and terminate a process if we ended up in an unknown state, which would ideally allow for failover and/or self-recovery.
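
To give a flavor of why Go was a good fit, here is a toy sketch (not Seesaw's actual code; the backend addresses are hypothetical) in which each backend is health checked concurrently with a goroutine, reporting over a channel:

    package main

    import (
        "fmt"
        "net"
        "time"
    )

    // healthCheck probes one backend and reports the result on a channel.
    func healthCheck(addr string, results chan<- string) {
        conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
        if err != nil {
            results <- fmt.Sprintf("%s: DOWN (%v)", addr, err)
            return
        }
        conn.Close()
        results <- fmt.Sprintf("%s: UP", addr)
    }

    func main() {
        backends := []string{"10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80"}
        results := make(chan string)

        // One goroutine per backend; all checks run concurrently.
        for _, b := range backends {
            go healthCheck(b, results)
        }
        for range backends {
            fmt.Println(<-results)
        }
    }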

After a period of concentrated development effort, we completed and successfully deployed Seesaw v2 as a replacement for both existing platforms. Overall it allowed us to increase service availability and reduce management overhead. We're pleased to be able to make this platform available to the rest of the world and hope that other enterprises are able to benefit from this project. You can find the code at https://github.com/google/seesaw.

By Joel Sing, Google Site Reliability Engineer

J2ObjC 1.0 Release

We are pleased to announce the 1.0 release of J2ObjC, a Google-authored open-source compiler that lets iPhone/iPad applications use Java code. J2ObjC's goal is to support the sharing of an application's non-UI code (such as data access, or application logic) by writing it once in Java, then building it into the iOS application. This same code can be shared with the Android and web versions of the application (the latter using the GWT compiler), as well as with server-side code. J2ObjC is licensed under the Apache License, Version 2.0.

J2ObjC is not a Java emulator; instead, it translates Java to Objective-C classes that extend the iOS Foundation Framework. It supports the Java 8 language and runtime features required by client-side application developers. JUnit and Mockito test translation and execution are also supported. J2ObjC can be used with most build tools, including Xcode and Make, and there are Gradle and Maven plug-ins.

J2ObjC does not translate user interfaces, as world-class apps need world-class user interfaces that adhere closely to the different iOS and Android design standards. J2ObjC instead focuses on writing common abstractions once and verifying them with a common set of unit tests. This ensures that an app's features work the same across platforms, improving customer experiences. Teams developing multi-platform apps still need great engineers for each platform, but with J2ObjC they don't waste time rewriting each other's code.
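
For example, a piece of shared, UI-free logic might be written once in Java (a hypothetical class) and translated into an Objective-C class that the iOS app links against, while the Android and web builds consume the Java source directly:

    // Shared application logic, written once in Java.
    public class Stopwatch {
      private long startMillis;

      public void start() {
        startMillis = System.currentTimeMillis();
      }

      public long elapsedMillis() {
        return System.currentTimeMillis() - startMillis;
      }
    }

Running the j2objc command-line tool over Stopwatch.java emits a matching Stopwatch.h and Stopwatch.m (exact invocation details are in the project documentation).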

Using continuous integration, J2ObjC improves product velocity. As each feature is added or bug fix is made to the application's shared code, all platforms are automatically rebuilt and tested. And because common features are shared across platforms, a bug found on one platform is fixed once for all platforms.

Several of Google’s iOS applications use J2ObjC for these reasons, including Inbox by Gmail, Google Calendar, Google Docs, Google Sheets, Google Slides and Google My Business. Each team has dedicated iOS designers and engineers, but application logic common to all platforms is written once.

By Tom Ball, Google Engineering

Stay in touch with RememberMe

Occasionally on the open source blog we feature the personal projects of Googlers. Today we hear from Blair Kutzman, whose loss inspired the creation of RememberMe, an open source email reminder tool.


On June 2, 2014, my cousin Jeremy Monnett was killed in a plane crash. He left behind his two sons, Miles and Brooks, and his wife Kate. As much as we look to explain events like these, they unfortunately can happen at any moment and to anybody.

RememberMe is an open source project I created in memory of Jeremy that helps people keep in touch. The basic premise is simple: at a configurable interval (e.g. daily, weekly, or monthly) an email arrives in your inbox reminding you to contact someone. The reply-to field on the mail is set so that replying will send your response to your loved one.
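
The mechanism is easy to sketch; the following is a minimal illustration, not RememberMe's actual code, and the addresses and local mail relay are assumptions:

    # Send a reminder whose Reply-To points at the loved one, so hitting
    # "reply" writes to them rather than to the reminder service.
    import smtplib
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["Subject"] = "Time to say hello"
    msg["From"] = "reminders@example.com"       # hypothetical sender
    msg["To"] = "you@example.com"               # the person being reminded
    msg["Reply-To"] = "loved.one@example.com"   # where a reply actually goes
    msg.set_content("Reply to this email to send them a note.")

    with smtplib.SMTP("localhost") as server:   # assumes a local mail relay
        server.send_message(msg)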

The loss of Jeremy reminded me of just how important it is to keep in touch with the people you love the most. I hope that by making this project public, others will create email addresses for young children or spouses and journal their daily thoughts and activities. This provides a fabulous way both to chronicle their lives and to share your own thoughts with them.

Please check out the project on GitHub and enjoy!

By Blair Kutzman, Google Engineering