Tag Archives: releases

Announcing Google Radio PHY Test, aka “Graphyte”, as part of the Chromium Project

With many Chromebook models for sale from several OEMs, the Chrome OS Factory team interfaces with many Contract Manufacturers (CMs), Original Design Manufacturers (ODMs), and factory teams. The Google Chromium OS Factory Software Platform, a suite of factory tools provided to Chrome OS partners, allows any factory team to quickly bring up a Chrome OS manufacturing test line.

Today, we are announcing that this platform has been extended to remove the friction of bringing up wireless verification test systems with a component called Google Radio PHY Test or “Graphyte.” Graphyte is an open source software framework that can be used and extended by the wireless ecosystem of chipset companies, test solution providers, and wireless device manufacturers, as opposed to the traditional approach of vendor-specific solutions. It is developed in Python and capable of running on Linux and Chrome OS test stations with an initial focus on Wi-Fi, Bluetooth, and 802.15.4 physical layer verification.

Verifying that a wireless device is working properly requires chipset- and instrument-specific software that coordinates transmitting and measuring power and signal quality across channels, bandwidths, and data rates. Graphyte provides high-level API abstractions for controlling wireless chipsets and test instruments, allowing anyone to develop a “plugin” for a given chipset or instrument.
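As a rough illustration of the plugin idea, the sketch below models a chipset controller and a test instrument as abstract base classes, with a test routine that only talks to those abstractions. Every name in it (DeviceUnderTest, Instrument, FakeChip, FakeMeter, run_tx_power_test, the 'OFDM_54' rate label) is hypothetical and is not taken from the Graphyte code base; the real interfaces are described in the Graphyte User Manual.

```python
import abc


class DeviceUnderTest(abc.ABC):
    """Hypothetical chipset plugin interface (not Graphyte's real API)."""

    @abc.abstractmethod
    def start_tx(self, channel, rate):
        """Start transmitting on the given channel at the given data rate."""

    @abc.abstractmethod
    def stop_tx(self):
        """Stop transmitting."""


class Instrument(abc.ABC):
    """Hypothetical test-instrument plugin interface."""

    @abc.abstractmethod
    def measure_power_dbm(self, channel):
        """Return the measured transmit power in dBm."""


class FakeChip(DeviceUnderTest):
    """Stand-in chipset plugin that only records what it was asked to do."""

    def __init__(self):
        self.transmitting = None

    def start_tx(self, channel, rate):
        self.transmitting = (channel, rate)

    def stop_tx(self):
        self.transmitting = None


class FakeMeter(Instrument):
    """Stand-in instrument plugin that returns a fixed reading."""

    def measure_power_dbm(self, channel):
        return 15.0


def run_tx_power_test(dut, instrument, channel, rate, low, high):
    """Transmit, measure, and check the power against [low, high] dBm."""
    dut.start_tx(channel, rate)
    try:
        power = instrument.measure_power_dbm(channel)
    finally:
        # Always stop the transmitter, even if the measurement fails.
        dut.stop_tx()
    return low <= power <= high


print(run_tx_power_test(FakeChip(), FakeMeter(), 6, 'OFDM_54', 13.0, 17.0))
```

Because the test routine depends only on the two abstract interfaces, supporting a different chipset or instrument in this sketch means writing a new subclass rather than changing the test.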

Graphyte architecture.

We’ve worked closely with industry leaders like Intel and LitePoint to ensure the Graphyte APIs have the right level of abstraction, and Graphyte is already being used in production on multiple manufacturing lines for several different products.

To get started, use Git to clone our repository. You can learn more by reading the Graphyte User Manual and checking out the example of how to use Graphyte in a real test. You can get involved by joining our mailing list. If you’d like to contribute, please follow the Chromium OS Developer Guide.

To get started with the LitePoint Graphyte plugin, please contact LitePoint directly. To get started with the Intel Graphyte plugin, please contact Intel directly.

Happy testing!

By Kurt Williams, Technical Program Manager

Announcing Guetzli: A New Open Source JPEG Encoder

Crossposted on the Google Research Blog

At Google, we care about giving users the best possible online experience, both through our own services and products and by contributing new tools and industry standards for use by the online community. That’s why we’re excited to announce Guetzli, a new open source algorithm that creates high quality JPEG images with file sizes 35% smaller than currently available methods, enabling webmasters to create webpages that can load faster and use even less data.

Guetzli [guɛtsli] — cookie in Swiss German — is a JPEG encoder for digital images and web graphics that can enable faster online experiences by producing smaller JPEG files while still maintaining compatibility with existing browsers, image processing applications and the JPEG standard. From the practical viewpoint this is very similar to our Zopfli algorithm, which produces smaller PNG and gzip files without needing to introduce a new format; and different than the techniques used in RNN-based image compression, RAISR, and WebP, which all need client and ecosystem changes for compression gains at internet scale.

The visual quality of a JPEG image is directly correlated to its multi-stage compression process: color space transform, discrete cosine transform, and quantization. Guetzli specifically targets the quantization stage, in which the more visual quality loss is introduced, the smaller the resulting file. Guetzli strikes a balance between minimal loss and file size by employing a search algorithm that tries to overcome the difference between the psychovisual modeling of JPEG's format and Guetzli’s psychovisual model, which approximates color perception and visual masking in a more thorough and detailed way than what is achievable by simpler color transforms and the discrete cosine transform. However, while Guetzli creates smaller image file sizes, the tradeoff is that these search algorithms take significantly longer to create compressed images than currently available methods.
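To make the quantization trade-off concrete, here is a minimal sketch of plain uniform quantization. It is not Guetzli's search algorithm or its psychovisual model. Each DCT coefficient is divided by a step size and rounded, so a larger step zeroes out more coefficients (which compress well) at the cost of a larger reconstruction error. The coefficient values and step sizes are made up for illustration.

```python
def quantize(coeffs, step):
    """Divide each coefficient by the step size and round to an integer."""
    return [int(round(c / step)) for c in coeffs]


def dequantize(levels, step):
    """Reconstruct approximate coefficients from quantized levels."""
    return [q * step for q in levels]


# Made-up DCT coefficients for one row of an 8x8 block.
coeffs = [-415.0, -30.0, -61.0, 27.0, 56.0, -20.0, -2.0, 0.5]

for step in (4, 16, 64):
    levels = quantize(coeffs, step)
    restored = dequantize(levels, step)
    zeros = levels.count(0)
    max_error = max(abs(c - r) for c, r in zip(coeffs, restored))
    # Larger steps give more zeros (smaller files) and a larger error.
    print(step, zeros, max_error)
```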

Figure 1. 16x16 pixel synthetic example of a phone line hanging against a blue sky, traditionally a case where JPEG compression algorithms suffer from artifacts. Uncompressed original is on the left. Guetzli (on the right) shows fewer ringing artifacts than libjpeg (middle) and has a smaller file size.

And while Guetzli produces smaller image file sizes without sacrificing quality, we additionally found that in experiments where compressed image file sizes are kept constant, human raters consistently preferred the images Guetzli produced over libjpeg images, even when the libjpeg files were the same size or even slightly larger. We think this makes the slower compression a worthy tradeoff.

Figure 2. 20x24 pixel zoomed areas from a picture of a cat’s eye. Uncompressed original on the left. Guetzli (on the right) shows fewer ringing artifacts than libjpeg (middle) without requiring a larger file size.

It is our hope that webmasters and graphic designers will find Guetzli useful and apply it to their photographic content, making users’ experience smoother on image-heavy websites in addition to reducing load times and bandwidth costs for mobile users. Last, we hope that the new explicitly psychovisual approach in Guetzli will inspire further image and video compression research.

By Robert Obryk and Jyrki Alakuijala, Software Engineers, Google Research Europe

Another option for file sharing

Originally posted on the Google Security Blog

Existing mechanisms for file sharing are so fragmented that people waste time on multi-step copying and repackaging. With the new open source project Upspin, we aim to improve the situation by providing a global name space to name all your files. Given an Upspin name, a file can be shared securely, copied efficiently without "download" and "upload", and accessed by anyone with permission from anywhere with a network connection.

Our target audience is personal users, families, or groups of friends. Although Upspin might have application in enterprise environments, we think that focusing on the consumer case enables easy-to-understand and easy-to-use sharing.

File names begin with the user's email address followed by a slash-separated Unix-like path name:


Any user with appropriate permission can access the contents of this file by using Upspin services to evaluate the full path name, typically via a FUSE filesystem so that unmodified applications just work. Upspin names usually identify regular static files and directories, but may point to dynamic content generated by devices such as sensors or services.
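The naming scheme described above (an email address followed by a slash-separated path) can be sketched as a small parser. The name ann@example.com/dir/file is a hypothetical example, and split_upspin_name is an illustrative helper, not part of Upspin itself.

```python
def split_upspin_name(name):
    """Split 'user@domain/path/to/file' into (user, [path elements])."""
    user, _, path = name.partition('/')
    if '@' not in user:
        raise ValueError('name must begin with an email address: %r' % name)
    # Drop empty elements so a trailing slash does not create a blank entry.
    elements = [p for p in path.split('/') if p]
    return user, elements


print(split_upspin_name('ann@example.com/dir/file'))
```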

If the user wishes to share a directory (the unit at which sharing privileges are granted), she adds a file called Access to that directory. In that file she describes the rights she wishes to grant and the users she wishes to grant them to. For instance,


allows Joe and Mae to read any of the files in the directory holding the Access file, and also in its subdirectories. As well as limiting who can fetch bytes from the server, this access is enforced end-to-end cryptographically, so cleartext only resides on Upspin clients, and use of cloud storage does not extend the trust boundary.
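A permission check of this kind can be sketched as follows. The "right: user, user" line format and the addresses joe@here.com and mae@there.com are assumptions made for illustration; consult the Upspin documentation for the actual Access file syntax. The sketch models only the lookup of who may read, not the end-to-end cryptographic enforcement.

```python
def parse_access(text):
    """Parse lines of the assumed form 'right: user1, user2' into a dict."""
    grants = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        right, _, users = line.partition(':')
        names = [u.strip() for u in users.split(',') if u.strip()]
        grants.setdefault(right.strip().lower(), []).extend(names)
    return grants


def can_read(grants, user):
    """Return True if the user was granted the 'read' right."""
    return user in grants.get('read', [])


# Hypothetical Access file contents granting read to two users.
access = parse_access('read: joe@here.com, mae@there.com')
print(can_read(access, 'joe@here.com'), can_read(access, 'eve@elsewhere.com'))
```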

Upspin looks a bit like a global file system, but its real contribution is a set of interfaces, protocols, and components from which an information management system can be built, with properties such as security and access control suited to a modern, networked world. Upspin is not an "app" or a web service, but rather a suite of software components, intended to run in the network and on devices connected to it, that together provide a secure, modern information storage and sharing network. Upspin is a layer of infrastructure that other software and services can build on to facilitate secure access and sharing. This is an open source contribution, not a Google product. We have not yet integrated with the Key Transparency server, though we expect to eventually, and for now use a similar technique of securely publishing all key updates. File storage is inherently an archival medium without forward secrecy; loss of the user's encryption keys implies loss of content, though we do provide for key rotation.

It’s early days, but we’re encouraged by the progress and look forward to feedback and contributions. To learn more, see the GitHub repository at Upspin.

By Andrew Gerrand, Eric Grosse, Rob Pike, Eduardo Pinheiro and Dave Presotto, Google Software Engineers

Introducing Python Fire, a library for automatically generating command line interfaces

Today we are pleased to announce the open-sourcing of Python Fire. Python Fire generates command line interfaces (CLIs) from any Python code. Simply call the Fire function in any Python program to automatically turn that program into a CLI. The library is available from PyPI via `pip install fire`, and the source is available on GitHub.

Python Fire will automatically turn your code into a CLI without you needing to do any additional work. You don't have to define arguments, set up help information, or write a main function that defines how your code is run. Instead, you simply call the `Fire` function from your main module, and Python Fire takes care of the rest. It uses inspection to turn whatever Python object you give it -- whether it's a class, an object, a dictionary, a function, or even a whole module -- into a command line interface, complete with tab completion and documentation, and the CLI will stay up-to-date even as the code changes.

To illustrate this, let's look at a simple example.

#!/usr/bin/env python
import fire

class Example(object):
    def hello(self, name='world'):
        """Says hello to the specified name."""
        return 'Hello {name}!'.format(name=name)

def main():
    fire.Fire(Example)

if __name__ == '__main__':
    main()

When the Fire function is run, our command will be executed. Just by calling Fire, we can now use the Example class as if it were a command line utility.

$ ./example.py hello
Hello world!
$ ./example.py hello David
Hello David!
$ ./example.py hello --name=Google
Hello Google!

Of course, you can continue to use this module like an ordinary Python library, enabling you to use the exact same code both from Bash and Python. If you're writing a Python library, then you no longer need to update your main method or client when experimenting with it; instead you can simply run the piece of your library that you're experimenting with from the command line. Even as the library changes, the command line tool stays up to date.
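For comparison, here is the same class used as an ordinary Python library, with no command line and no call to Fire involved. The class is repeated so that the snippet is self-contained.

```python
class Example(object):
    def hello(self, name='world'):
        """Says hello to the specified name."""
        return 'Hello {name}!'.format(name=name)


# Ordinary library use: instantiate the class and call the method directly.
example = Example()
print(example.hello())
print(example.hello(name='Google'))
```

The method call `example.hello(name='Google')` corresponds to the command `./example.py hello --name=Google` shown earlier.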

At Google, engineers use Python Fire to generate command line tools from Python libraries. We have an image manipulation tool built by using Fire with the Python Imaging Library, PIL. In Google Brain, we use an experiment management tool built with Fire, allowing us to manage experiments equally well from Python or from Bash.

Every Fire CLI comes with an interactive mode. Run the CLI with the `--interactive` flag to launch an IPython REPL with the result of your command, as well as other useful variables already defined and ready to use. Be sure to check out Python Fire's documentation for more on this and the other useful features Fire provides.

Between Python Fire's simplicity, generality, and power, we hope you find it a useful library for your own projects.

By David Bieber, Software Engineer on Google Brain

Announcing TensorFlow 1.0

Originally posted on the Google Developer Blog

In just its first year, TensorFlow has helped researchers, engineers, artists, students, and many others make progress with everything from language translation to early detection of skin cancer and preventing blindness in diabetics. We're excited to see people using TensorFlow in over 6000 open source repositories online.

Today, as part of the first annual TensorFlow Developer Summit, hosted in Mountain View and livestreamed around the world, we're announcing TensorFlow 1.0:

It's faster: TensorFlow 1.0 is incredibly fast! XLA lays the groundwork for even more performance improvements in the future, and tensorflow.org now includes tips & tricks for tuning your models to achieve maximum speed. We'll soon publish updated implementations of several popular models to show how to take full advantage of TensorFlow 1.0 - including a 7.3x speedup on 8 GPUs for Inception v3 and 58x speedup for distributed Inception v3 training on 64 GPUs!

It's more flexible: TensorFlow 1.0 introduces a high-level API for TensorFlow, with tf.layers, tf.metrics, and tf.losses modules. We've also announced the inclusion of a new tf.keras module that provides full compatibility with Keras, another popular high-level neural networks library.

It's more production-ready than ever: TensorFlow 1.0 promises Python API stability (details here), making it easier to pick up new features without worrying about breaking your existing code.

Other highlights from TensorFlow 1.0:
  • Python APIs have been changed to resemble NumPy more closely. For this and other backwards-incompatible changes made to support API stability going forward, please use our handy migration guide and conversion script.
  • Experimental APIs for Java and Go
  • Higher-level API modules tf.layers, tf.metrics, and tf.losses - brought over from tf.contrib.learn after incorporating skflow and TF Slim
  • Experimental release of XLA, a domain-specific compiler for TensorFlow graphs, that targets CPUs and GPUs. XLA is rapidly evolving - expect to see more progress in upcoming releases.
  • Introduction of the TensorFlow Debugger (tfdbg), a command-line interface and API for debugging live TensorFlow programs.
  • New Android demos for object detection and localization, and camera-based image stylization.
  • Installation improvements: Python 3 docker images have been added, and TensorFlow's pip packages are now PyPI compliant. This means TensorFlow can now be installed with a simple invocation of pip install tensorflow.

We're thrilled to see the pace of development in the TensorFlow community around the world. To hear more about TensorFlow 1.0 and how it's being used, you can watch the TensorFlow Developer Summit talks on YouTube, covering recent updates from higher-level APIs to TensorFlow on mobile to our new XLA compiler, as well as the exciting ways that TensorFlow is being used:



Click here for a link to the livestream and video playlist (individual talks will be posted online later in the day).

The TensorFlow ecosystem continues to grow with new techniques like Fold for dynamic batching and tools like the Embedding Projector along with updates to our existing tools like TensorFlow Serving. We're incredibly grateful to the community of contributors, educators, and researchers who have made advances in deep learning available to everyone. We look forward to working with you on forums like GitHub issues, Stack Overflow, @TensorFlow, the [email protected] group, and at future events.

By Amy McDonald Sandjideh, Technical Program Manager, TensorFlow

Introducing Draco: compression for 3D graphics

3D graphics are a fundamental part of many applications, including gaming, design and data visualization. As graphics processors and creation tools continue to improve, larger and more complex 3D models will become commonplace and help fuel new applications in immersive virtual reality (VR) and augmented reality (AR). Because of this increased model complexity, storage and bandwidth requirements are forced to keep pace with the explosion of 3D data.

The Chrome Media team has created Draco, an open source compression library to improve the storage and transmission of 3D graphics. Draco can be used to compress meshes and point-cloud data. It also supports compressing points, connectivity information, texture coordinates, color information, normals and any other generic attributes associated with geometry.

With Draco, applications using 3D graphics can be significantly smaller without compromising visual fidelity. For users this means apps can now be downloaded faster, 3D graphics in the browser can load quicker, and VR and AR scenes can now be transmitted with a fraction of the bandwidth, rendered quickly and look fantastic.


Sample Draco compression ratios and encode/decode performance*

Transmitting 3D graphics for web-based applications is significantly faster using Draco’s JavaScript decoder, which can be tied to a 3D web viewer. The following video shows how efficient transmitting and decoding 3D objects in the browser can be - even over poor network connections.



Video and audio compression have shaped the internet over the past 10 years with streaming video and music on demand. With the emergence of VR and AR, on the web and on mobile (and the increasing proliferation of sensors like LIDAR) we will soon be swimming in a sea of geometric data. Compression technologies, like Draco, will play a critical role in ensuring these experiences are fast and accessible to anyone with an internet connection. More exciting developments are in store for Draco, including support for creating multiple levels of detail from a single model to further improve the speed of loading meshes.

We look forward to seeing what people do with Draco now that it's open source. Check out the code on GitHub and let us know what you think. Also available is a JavaScript decoder with examples of how to incorporate Draco into the three.js 3D viewer.

By Jamieson Brettle and Frank Galligan, Chrome Media Team

* Specifications: Tests were run with textures and positions quantized at 14-bit precision and normal vectors at 7-bit precision, on a single core of a 2013 MacBook Pro. JavaScript decoding was run in Chrome 54 on Mac OS X.
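To give a feel for what the 14-bit and 7-bit precisions in the note above mean, the sketch below applies a simple uniform scalar quantizer to a few made-up coordinates and reports the worst-case round-trip error at each bit depth. This is an illustration of quantization in general, under the assumption of a uniform quantizer; it is not Draco's implementation.

```python
def quantize(values, bits):
    """Map floats in [min, max] onto integers in [0, 2**bits - 1]."""
    lo, hi = min(values), max(values)
    max_level = (1 << bits) - 1
    scale = max_level / (hi - lo) if hi > lo else 0.0
    levels = [int(round((v - lo) * scale)) for v in values]
    return levels, lo, hi


def dequantize(levels, lo, hi, bits):
    """Map quantized integers back to approximate floats."""
    max_level = (1 << bits) - 1
    step = (hi - lo) / max_level
    return [lo + q * step for q in levels]


positions = [0.0, 0.1234567, 0.5, 0.9876543, 1.0]  # made-up coordinates

for bits in (7, 14):
    levels, lo, hi = quantize(positions, bits)
    restored = dequantize(levels, lo, hi, bits)
    worst = max(abs(a - b) for a, b in zip(positions, restored))
    # Fewer bits means fewer distinct levels and a larger worst-case error.
    print(bits, worst)
```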

JanusGraph connects the past and future of Titan

We are thrilled to collaborate with a group of individuals and companies, including Expero, GRAKN.AI, Hortonworks and IBM, in launching a new project — JanusGraph — under The Linux Foundation to advance the state-of-the-art in distributed graph computation.



JanusGraph is a fork of the popular open source project Titan, originally released in 2012 by Aurelius, which was subsequently acquired by DataStax. Titan has been widely adopted for large-scale distributed graph computation and many users have contributed to its ongoing development, which has slowed down as of late: there have been no Titan releases since the 1.0 release in September 2015, and the repository has seen no updates since June 2016.

This new project will reinvigorate development of the distributed graph system to add new functionality, improve performance and scalability, and maintain a variety of storage backends.

The name "Janus" comes from the name of a Roman god who looks simultaneously into the past to the Titans (divine beings from Greek mythology) as well as into the future.

All are welcome to participate in the JanusGraph project, whether by contributing features or bug fixes, filing feature requests and bugs, improving the documentation or helping shape the product roadmap through feature requests and use cases.

Get involved by taking a look at our website and browsing the code on GitHub.

We look forward to hearing from you!

By Misha Brukman, Google Cloud Platform

Apache Beam graduates to a top-level project

Please join me in extending a hearty digital “Huzzah!” to the Apache Beam community: as announced today, Apache Beam is an official graduate of the Apache Incubator and is now a full-fledged, top-level Apache project. This achievement is a direct reflection of the hard work the community has invested in transforming Beam into an open, professional and community-driven project.

11 months ago, Google and a number of partners donated a giant pile of code to the Apache Software Foundation, thus forming the incubating Beam project. The bulk of this code composed the Google Cloud Dataflow SDK: the libraries that developers used to write streaming and batch pipelines that ran on any supported execution engine. At the time, the main supported engine was Google’s Cloud Dataflow service (with support for Apache Spark and Apache Flink in development); as of today there are five officially supported runners. Though there were many motivations behind the creation of Apache Beam, the one at the heart of everything was a desire to build an open and thriving community and ecosystem around this powerful model for data processing that so many of us at Google spent years refining. But taking a project with over a decade of engineering momentum behind it from within a single company and opening it to the world is no small feat. That’s why I feel today’s announcement is so meaningful.

With that context in mind, let’s look at some statistics squirreled away in the graduation maturity model assessment:

  • Out of the ~22 large modules in the codebase, at least 10 modules have been developed from scratch by the community, with little to no contribution from Google.
  • Since September, no single organization has had more than ~50% of the unique contributors per month.
  • The majority of new committers added during incubation came from outside Google.

And for good measure, here’s a quote from the Vice President of the Apache Incubator, lifted from the public Apache incubator general discussions list where Beam’s graduation was first proposed:

“In my day job as well as part of my work at Apache, I have been very impressed at the way that Google really understands how to work with open source communities like Apache. The Apache Beam project is a great example of this and is a great example of how to build a community.” -- Ted Dunning, Vice President of Apache Incubator

The point I’m trying to make here is this: while Google’s commitment to Apache Beam remains as strong as it always has been, everyone involved (both within Google and without) has done an excellent job of building an open source project that’s truly open in the best sense of the word.

This is what makes open source software amazing: people coming together to build great, practical systems for everyone to use because the work is exciting, useful and relevant. This is the core reason I was so excited about us creating Apache Beam in the first place, the reason I’m proud to have played some small part in that journey, and the reason I’m so grateful for all the work the community has invested in making the project a reality.

Naturally, graduation is only one milestone in the lifetime of the project, and we have many more ahead of us, but becoming a top-level project is an indication that Apache Beam now has a development community that is ready for prime time.

That means we’re ready to continue pushing forward the state of the art in stream and batch processing. We’re ready to bring the promise of portability to programmatic data processing, much in the way SQL has done so for declarative data analysis. We’re ready to build the things that never would have gotten built had this project stayed confined within the walls of Google. And last but perhaps not least, we’re ready to recoup the vast quantities of text space previously consumed by the mandatory “(incubating)” moniker accompanying all of our initial mentions of Apache Beam!

But seriously, whatever your motivation, please consider joining us along the way. We have an exciting road ahead.

By Tyler Akidau, Apache Beam PMC and Staff Software Engineer at Google

Grumpy: Go running Python!

Google runs millions of lines of Python code. The front-end server that drives youtube.com and YouTube’s APIs is primarily written in Python, and it serves millions of requests per second! YouTube’s front-end runs on CPython 2.7, so we’ve put a ton of work into improving the runtime and adapting our application to work optimally within it. These efforts have borne a lot of fruit over the years, but we always run up against the same issue: it's very difficult to make concurrent workloads perform well on CPython.
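The kind of CPU-bound concurrent workload at issue can be sketched as follows. This is a hypothetical illustration, not YouTube's code or an actual benchmark: each thread computes the same naive Fibonacci number. On CPython the global interpreter lock lets only one thread execute Python bytecode at a time, so adding threads to a workload like this does not add throughput.

```python
import threading


def fib(n):
    """Naive recursive Fibonacci: deliberately CPU-bound."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def run_in_threads(n, num_threads):
    """Compute fib(n) once per thread and return all results."""
    results = [None] * num_threads

    def worker(index):
        # Each thread writes only to its own slot, so no lock is needed.
        results[index] = fib(n)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_in_threads(20, 4))
```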

To solve this problem, we investigated a number of other Python runtimes. Each had trade-offs and none solved the concurrency problem without introducing other issues.

So we asked ourselves a crazy question: What if we were to implement an alternative runtime optimized for real-time serving? Once we started going down the rabbit hole, Go seemed like an obvious choice of platform since its operational characteristics align well with our use case (e.g. lightweight threads). We wanted first-class language interoperability and Go’s powerful runtime type reflection system made this straightforward. Python in Go felt very natural, and so Grumpy was born.

Grumpy is an experimental Python runtime for Go. It translates Python code into Go programs, and those transpiled programs run seamlessly within the Go runtime. We needed to support a large existing Python codebase, so it was important to have a high degree of compatibility with CPython (quirks and all). The goal is for Grumpy to be a drop-in replacement runtime for any pure-Python project.

Two design choices we made had big consequences. First, we decided to forgo support for C extension modules. This means that Grumpy cannot leverage the wealth of existing Python C extensions but it gave us a lot of flexibility to design an API and object representation that scales for parallel workloads. In particular, Grumpy has no global interpreter lock, and it leverages Go’s garbage collection for object lifetime management instead of counting references. We think Grumpy has the potential to scale more gracefully than CPython for many real world workloads. Results from Grumpy’s synthetic Fibonacci benchmark demonstrate some of this potential:



Second, Grumpy is not an interpreter. Grumpy programs are compiled and linked just like any other Go program. The downside is less development and deployment flexibility, but it offers several advantages. For one, it creates optimization opportunities at compile time via static program analysis. But the biggest advantage is that interoperability with Go code becomes very powerful and straightforward: Grumpy programs can import Go packages just like Python modules! For example, the Python snippet below uses Go’s standard net/http package to start a simple server:

from __go__.net.http import ListenAndServe, RedirectHandler

handler = RedirectHandler('http://github.com/google/grumpy', 303)
ListenAndServe('127.0.0.1:8080', handler)

We’re excited about the prospects for Grumpy. Although it’s still alpha software, most of the language constructs and many core built-in types work like you’d expect. There are still holes to fill — many built-in types are missing methods and attributes, built-in functions are absent and the standard library is virtually empty. If you find things that you wish were working, file an issue so we know what to prioritize. Or better yet, submit a pull request.

Stay Grumpy!

By Dylan Trotter, YouTube Engineering

Open sourcing the Embedding Projector: a tool for visualizing high dimensional data

Originally posted on the Google Research Blog

Recent advances in machine learning (ML) have shown impressive results, with applications ranging from image recognition and language translation to medical diagnosis. With the widespread adoption of ML systems, it is increasingly important for research scientists to be able to explore how the data is being interpreted by the models. However, one of the main challenges in exploring this data is that it often has hundreds or even thousands of dimensions, requiring special tools to investigate the space.

To enable a more intuitive exploration process, we are open-sourcing the Embedding Projector, a web application for interactive visualization and analysis of high-dimensional data recently shown as an A.I. Experiment, as part of TensorFlow. We are also releasing a standalone version at projector.tensorflow.org, where users can visualize their high-dimensional data without the need to install and run TensorFlow.


Exploring Embeddings

The data needed to train machine learning systems comes in a form that computers don't immediately understand. To translate the things we understand naturally (e.g. words, sounds, or videos) to a form that the algorithms can process, we use embeddings, a mathematical vector representation that captures different facets (dimensions) of the data. For example, in this language embedding, similar words are mapped to points that are close to each other.

With the Embedding Projector, you can navigate through views of data in either a 2D or a 3D mode, zooming, rotating, and panning using natural click-and-drag gestures. Below is a figure showing the nearest points to the embedding for the word “important” after training a TensorFlow model using the word2vec tutorial. Clicking on any point (which represents the learned embedding for a given word) in this visualization brings up a list of nearest points and distances, showing which words the algorithm has learned to be semantically related. This type of interaction represents an important way in which one can explore how an algorithm is performing.
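A nearest-points lookup of this kind can be sketched in a few lines. The vectors below are made up, real embeddings have hundreds of dimensions, and cosine distance is used here as one common choice of metric rather than as a statement of what the Embedding Projector does internally.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(embeddings, word, k):
    """Return the k words closest to `word`, nearest first."""
    target = embeddings[word]
    others = [(cosine_distance(target, vec), w)
              for w, vec in embeddings.items() if w != word]
    return [w for _, w in sorted(others)[:k]]


# Made-up 3-dimensional vectors for illustration only.
embeddings = {
    'important': [0.9, 0.1, 0.0],
    'crucial': [0.8, 0.2, 0.1],
    'significant': [0.7, 0.3, 0.0],
    'banana': [0.0, 0.1, 0.9],
}

print(nearest(embeddings, 'important', 2))
```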


Methods of Dimensionality Reduction

The Embedding Projector offers three commonly used methods of data dimensionality reduction, which allow easier visualization of complex data: PCA, t-SNE and custom linear projections. PCA is often effective at exploring the internal structure of the embeddings, revealing the most influential dimensions in the data. t-SNE, on the other hand, is useful for exploring local neighborhoods and finding clusters, allowing developers to make sure that an embedding preserves the meaning in the data (e.g. in the MNIST dataset, seeing that the same digits are clustered together). Finally, custom linear projections can help discover meaningful "directions" in data sets - such as the distinction between a formal and casual tone in a language generation model - which would allow the design of more adaptable ML systems.
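A custom linear projection can be sketched as projecting every point onto the axis that runs between two chosen reference embeddings. The 2-dimensional vectors and phrase names below are made up, and the function names are illustrative; this shows the underlying arithmetic, not the Embedding Projector's own code.

```python
def subtract(a, b):
    """Return the element-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]


def dot(a, b):
    """Return the dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project(points, left, right):
    """Project each point onto the axis running from `left` to `right`.

    Returns a scalar per point: 0.0 at `left`, 1.0 at `right`.
    """
    axis = subtract(right, left)
    length_sq = dot(axis, axis)
    return {name: dot(subtract(vec, left), axis) / length_sq
            for name, vec in points.items()}


# Made-up 2-dimensional embeddings for illustration only.
yeah = [0.0, 1.0]
yes = [2.0, 1.0]
phrases = {
    'sounds good': [0.5, 3.0],
    'see attachments': [1.5, -2.0],
}

print(project(phrases, left=yeah, right=yes))
```

Points that land closer to 1.0 lie nearer the `right` reference along the chosen direction, and points closer to 0.0 lie nearer the `left` reference, regardless of how they differ in other directions.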

A custom linear projection of the 100 nearest points of "See attachments." onto the "yes" - "yeah" vector (“yes” is right, “yeah” is left) of a corpus of 35k frequently used phrases in emails

The Embedding Projector website includes a few datasets to play with. We’ve also made it easy for users to publish and share their embeddings with others (just click on the “Publish” button on the left pane). It is our hope that the Embedding Projector will be a useful tool to help the research community explore and refine their ML applications, as well as enable anyone to better understand how ML algorithms interpret data. If you'd like to get the full details on the Embedding Projector, you can read the paper here. Have fun exploring the world of embeddings!

By Daniel Smilkov and the Big Picture group