
Google and Binomial Partner to Open-Source Basis Universal Texture Format

Today, Google and Binomial are excited to announce that we have partnered to open source the Basis Universal texture codec to improve the performance of transmitting images on the web and within desktop and mobile applications, while maintaining GPU efficiency. This release fills an important gap in the graphics compression ecosystem and complements earlier work in Draco geometry compression.

The Basis Universal texture format is 6-8 times smaller than JPEG on the GPU, yet has a similar storage size to JPEG – making it a great alternative to current GPU compression methods that are inefficient and don’t operate cross-platform – and a more performant alternative to JPEG/PNG. It creates compressed textures that work well in a variety of use cases: games, virtual and augmented reality, maps, photos, small videos, and more!

Without a universal texture format, developers are left with two options:

  • Use GPU formats and take the storage size hit.
  • Use other formats that reduce storage size but can't compete with GPU performance.

Maintaining so many different GPU formats is a burden on the whole ecosystem, from GPU manufacturers to software developers to the end user who can’t get a great cross platform experience. We’re streamlining this with one solution that has built-in flexibility (like optional higher quality modes) but is much easier on everyone to improve and maintain.

How does it all work? Compress your image using the encoder, choosing the quality settings that make sense for your project (you can also submit multiple images for small videos or optimization purposes, just know they’ll share the same color palette). Insert the transcoder code before rendering, which will turn the intermediary format into the GPU format your computer can read. The image stays compressed throughout this process, even on your GPU!  Instead of needing to decode and read the whole image, the GPU will read only the parts it needs. Enjoy the performance benefits!
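
To make that flow concrete, here is a minimal sketch in Python. It shells out to the basisu command-line encoder for the offline step and hands the resulting .basis file to a transcoding step at load time. The flag names and the transcode_basis() helper are assumptions for illustration only; the real transcoders ship as C++ and WebAssembly, not Python.

    # Illustrative sketch only: the encoder flags and transcode_basis() are assumptions,
    # not the actual Basis Universal API (which is C++ / WebAssembly).
    import subprocess

    def encode_to_basis(png_path, basis_path, quality=128):
        # Offline step: compress the source image into the intermediate .basis format.
        # "-q" is assumed to select the quality level; check the encoder's help output.
        subprocess.run(["basisu", png_path, "-q", str(quality),
                        "-output_file", basis_path], check=True)

    def transcode_basis(data, target_format):
        # Placeholder for the real transcoder (C++ / WebAssembly in the actual release):
        # it reads the .basis container and emits blocks in the requested GPU format
        # without ever fully decompressing the image on the CPU.
        raise NotImplementedError("bind the Basis Universal transcoder here")

    def load_texture(basis_path, gpu_format):
        # Load-time step: transcode the intermediate format into whatever block-compressed
        # format the local GPU supports (e.g. BC1/BC7 on desktop, ETC/PVRTC on mobile).
        data = open(basis_path, "rb").read()
        return transcode_basis(data, target_format=gpu_format)

    # Usage: encode once offline, transcode on every client at load time.
    # encode_to_basis("albedo.png", "albedo.basis", quality=192)
    # texture = load_texture("albedo.basis", gpu_format="BC1")
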
Basis Universal can efficiently target the most common GPU formats
Google and Binomial will be working together to continue to support, maintain and add features, so check back frequently for the latest. This initial release of Basis Universal transcodes into the following GPU formats: PVRTC1 opaque, ETC1, ETC2 basic alpha, BC1-5, and BC7 opaque. Over the coming months more functionality will be added including BC7 transparent, ASTC opaque and alpha, PVRTC1 transparent, and higher quality BC7/ASTC.
Basis Universal reduces transmission size for textures while maintaining similar image quality.
See full benchmarking results
Basis Universal improves GPU memory usage over .jpeg and .png
With this partnership, we hope to see adoption of the transcoder in all major browsers to make performant cross-platform compressed textures accessible to everyone via the WebGL API and the forthcoming WebGPU API. In addition to opening up the possibility of seamless integration into pipelines, everyone now has access to the state-of-the-art compressor, which will also be open sourced.

We look forward to seeing what people do with Basis Universal now that it's open sourced. Check out the code and demo on GitHub, let us know what you think, and how you plan to use it! Currently, Basis Universal transcoders are available in C++ and WebAssembly.

By Stephanie Hurlburt, Binomial and Jamieson Brettle, Chrome Media

OpenTelemetry: The Merger of OpenCensus and OpenTracing

We’ve talked about OpenCensus a lot over the past few years, from the project’s initial announcement, roots at Google and partners (Microsoft, Dynatrace) joining the project, to new functionality that we’re continually adding. The project has grown beyond our expectations and now sports a mature ecosystem with Google, Microsoft, Omnition, Postmates, and Dynatrace making major investments, and a broad base of community contributors.

We recently announced that OpenCensus and OpenTracing are merging into a single project, now called OpenTelemetry, which brings together the best of both projects and has a frictionless migration experience. We’ve made a lot of progress so far: we’ve established a governance committee, a Java prototype API + implementation, workgroups for each language, and an aggressive implementation schedule.

Today we’re highlighting the combined project at the keynote of Kubecon and announcing that OpenTelemetry is now officially part of the Cloud Native Computing Foundation! Full details are available in the CNCF’s official blog post, which we’ve copied below:

A Brief History of OpenTelemetry (So Far)

After many months of planning, discussion, prototyping, more discussion, and more planning, OpenTracing and OpenCensus are merging to form OpenTelemetry, which is now a CNCF sandbox project. The seed governance committee is composed of representatives from Google, Lightstep, Microsoft, and Uber, and more organizations are getting involved every day.

And we couldn't be happier about it – here’s why.

Observability, Outputs, and High-Quality Telemetry

Observability is a fashionable word with some admirably nerdy and academic origins. In control theory, “observability” measures how well we can understand the internals of a given system using only its external outputs. If you’ve ever deployed or operated a modern, microservice-based software application, you have no doubt struggled to understand its performance and behavior, and that’s because those “outputs” are usually meager at best. We can’t understand a complex system if it’s a black box. And the only way to light up those black boxes is with high-quality telemetry: distributed traces, metrics, logs, and more.

So how can we get our hands – and our tools – on precise, low-overhead telemetry from the entirety of a modern software stack? One way would be to carefully instrument every microservice, piece by piece, and layer by layer. This would work, but it’s also a complete non-starter – we’d spend as much time on the measurement as we would on the software itself! We need telemetry as a built-in feature of our services.

The OpenTelemetry project is designed to make this vision a reality for our industry, but before we describe it in more detail, we should first cover the history and context around OpenTracing and OpenCensus.

OpenTracing and OpenCensus

In practice, there are several flavors (or “verticals” in the diagram) of telemetry data, and then several integration points (or “layers” in the diagram) available for each. Broadly, the cloud-native telemetry landscape is dominated by distributed traces, timeseries metrics, and logs; and end-users typically integrate with a thin instrumentation API or via straightforward structured data formats that describe those traces, metrics, or logs.
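
To make “a thin instrumentation API” concrete, the sketch below shows, in Python, the kind of minimal vendor-neutral tracing surface these projects expose: application code creates and annotates spans, while the exporter behind the API is swappable. The class and method names here are purely illustrative; this is not the OpenTracing, OpenCensus, or OpenTelemetry API.

    # Hypothetical, deliberately thin tracing API: application code depends only on this
    # surface, while the exporter (the vendor/backend-specific piece) plugs in behind it.
    import time
    import uuid
    from contextlib import contextmanager

    class Span:
        def __init__(self, name, parent_id=None):
            self.name = name
            self.span_id = uuid.uuid4().hex
            self.parent_id = parent_id
            self.start = time.time()
            self.end = None
            self.attributes = {}

    class Tracer:
        def __init__(self, exporter):
            self.exporter = exporter      # backend-specific: console, agent, vendor, ...
            self._stack = []              # tracks the currently active span

        @contextmanager
        def start_span(self, name):
            parent = self._stack[-1].span_id if self._stack else None
            span = Span(name, parent_id=parent)
            self._stack.append(span)
            try:
                yield span
            finally:
                self._stack.pop()
                span.end = time.time()
                self.exporter(span)       # hand the finished span to the backend

    # Application code touches only the thin API; swapping backends needs no code changes.
    tracer = Tracer(exporter=lambda s: print(s.name, round(s.end - s.start, 6), s.parent_id))
    with tracer.start_span("handle_request"):
        with tracer.start_span("query_database") as span:
            span.attributes["db.rows"] = 42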



For several years now, there has been a well-recognized need for industry-wide collaboration in order to amortize the shared cost of software instrumentation. OpenTracing and OpenCensus have led the way in that effort, and while each project made different architectural choices, the biggest problem with either project has been the fact that there were two of them. And, further, that the two projects weren’t working together and striving for mutual compatibility.

Having two similar-yet-not-identical projects out in the world created confusion and uncertainty for developers, and that made it harder for both efforts to realize their shared mission: built-in, high-quality telemetry for all.

Getting to One Project

If there’s a single thing to understand about OpenTelemetry, it’s that the leadership from OpenTracing and OpenCensus are co-committed to migrating their respective communities to this single and unified initiative. Although all of us have numerous ideas about how we could boil the ocean and start from scratch, we are resisting those impulses and focusing instead on preparing our communities for a successful transition; our priorities for the merger are clear:
  • Providing straightforward backwards compatibility with both OpenTracing and OpenCensus (via software bridges)
  • Minimizing the time during which OpenTelemetry, OpenTracing, and OpenCensus are being co-developed: we plan to put OpenTracing and OpenCensus into “read-only mode” before the end of 2019.
  • And, again, simplifying and standardizing the telemetry solutions available to developers.
In many ways, it’s most accurate to think of OpenTelemetry as the next major version of both OpenTracing and OpenCensus. Like any version upgrade, we will try to make it easy for both new and existing end-users, but we recognize that the main benefit to the ecosystem is the consolidation itself – not some specific and shiny new feature – and we are prioritizing our own efforts accordingly.

How you can help

OpenTelemetry’s timeline is an aggressive one. While we have many open-source and vendor-licensed observability solutions providing guidance, we will always want as many end-users involved as possible. The single most valuable thing any end-user can do is also one of the easiest: check out the actual work we’re doing and provide feedback, via GitHub, Gitter, email, or whatever channel feels easiest.

Of course we also welcome code contributions to OpenTelemetry itself, code contributions that add OpenTelemetry support to existing software projects, documentation, blog posts, and the rest of it. If you’re interested, you can sign up to join the integration effort by filling in this form.

By Ben Sigelman, co-creator of OpenTracing and member of the OpenTelemetry governing committee, and Morgan McLean, Product Manager for OpenCensus at Google since the project’s inception

Google fosters the open source hardware community

Open source silicon promises new challenges and opportunities for both industry and the open source community. To take full advantage of open silicon, we will need new design methodologies, new governance models, and increased collaboration between industry, academia, and not-for-profits. A vibrant free and open source software community has been vital to the success of both Google and our customers. We look forward to supporting the new domain of open source silicon so that it similarly benefits all participants.

Working through its Open Source Programs Office (OSPO), Google is actively engaged in helping seed the open silicon space, specifically by providing funding as well as strategic and legal support to key open hardware efforts, including lowRISC and the CHIPS Alliance.

lowRISC

lowRISC is a leader in open silicon community outreach and technical documentation, and in advancing the goal of a truly open source system on a chip. We have long supported lowRISC’s mission of transparently implemented silicon and robust engagement with the open source silicon community, providing funding and advice, and recognizing their open source community leadership by selecting them as a Google Summer of Code mentoring organization.

Just as with open source software, we believe our users will derive great benefit from open source silicon. Besides enabling and encouraging innovation, chip designs derived from a common, open baseline will provide the benefit of implementation choice while still guaranteeing software compatibility and a set of common interfaces. With regard to security, the transparency of an open source approach is critical both to finding bugs and to establishing implementation trustworthiness.

"Google has encouraged and supported lowRISC since the very start. They clearly share our optimism for what open source hardware can offer and our community-driven vision of the future. We are excited by the expanding open source RISC-V ecosystem and look forward to lowRISC community IP being deployed in the real world,” said Alex Bradbury, Co-founder and Director. “We believe lowRISC can act as an important catalyst for open source silicon, providing a shared engineering resource to ensure quality, provide support and help to maintain IP from a range of partners.”
lowRISC board members (L to R): Dominic Rizzo (Google), Alex Bradbury (lowRISC), Gavin Ferris (lowRISC), Dr Robert Mullins (University of Cambridge), Prof. Luca Benini (ETH Zürich), and Ron Minnich (Google, not pictured).
A first example of Google’s ongoing collaboration with ETH Zürich and lowRISC is the recently released “Ibex” RISC-V core. ETH Zürich donated their Zero-riscy core as a starting point and technical work to extend the core was done across all three organizations. You can learn more about Google’s collaboration with lowRISC on the RISC-V core here.

Furthermore, Google is excited to announce that it is joining the board of lowRISC, with the appointment of Dominic Rizzo and Ronald Minnich as corporate directors.

CHIPS Alliance 

Along with our increased funding, support and collaboration with lowRISC, we are also happy to announce our status as a founding member of the Linux Foundation’s CHIPS Alliance project. CHIPS Alliance features an industry-driven, collaborative model to release high-quality silicon IP and supporting technical collateral. Most recently, in collaboration with CHIPS Alliance, we released a Universal Verification Methodology (UVM) instruction stream generator to aid in the verification of RISC-V cores. We believe such open sourcing of verification tools will prove critical to the long-term success of the open source silicon community.

Google has been an early, strong supporter of the open silicon community. We believe deeply in a future where transparent, trustworthy open source chip designs are commonplace. To get there, we are committed to establishing a collaborative, community-focused, open source basis for free and open silicon design.

By Parthasarathy Ranganathan, Distinguished Engineer, Google and Dominic Rizzo, Open Silicon Tech Lead, Google 

Summer is here! Welcome to our 2019 GSoC Students!


After reviewing 7,555 student proposals, our 206 mentoring organizations have chosen the 1,276 students they will be working with during the 15th Google Summer of Code (GSoC). Congratulations to our 2019 GSoC students and a big thank you to everyone who applied. This year’s students come from 64 countries!

The next step for participating students is the Community Bonding period which runs from May 6 through May 27. During this time, students will get up to speed on the culture and code base of their new community. They’ll also get acquainted with their mentor(s) and learn more about the languages or tools that they will need to complete their projects. Coding begins May 27 and will continue throughout the summer until August 26.

If you were not selected for this year’s Summer of Code - don’t be discouraged! Many students apply more than once to GSoC before being accepted. You can improve your odds for next time by contributing to the open source project of your choice directly now; organizations are always eager for new contributors! Look around for a project that interests you and get started.

Happy coding, everyone!

By Stephanie Taylor, GSoC Program Lead

Announcing Google-Landmarks-v2: An Improved Dataset for Landmark Recognition & Retrieval



Last year we released Google-Landmarks, the largest world-wide landmark recognition dataset available at that time. In order to foster advancements in research on instance-level recognition (recognizing specific instances of objects, e.g. distinguishing Niagara Falls from just any waterfall) and image retrieval (matching a specific object in an input image to all other instances of that object in a catalog of reference images), we also hosted two Kaggle challenges, Landmark Recognition 2018 and Landmark Retrieval 2018, in which more than 500 teams of researchers and machine learning (ML) enthusiasts participated. However, both instance recognition and image retrieval methods require ever larger datasets in both the number of images and the variety of landmarks in order to train better and more robust systems.

In support of this goal, this year we are releasing Google-Landmarks-v2, a completely new, even larger landmark recognition dataset that includes over 5 million images (2x that of the first release) of more than 200 thousand different landmarks (an increase of 7x). Due to the difference in scale, this dataset is much more diverse and creates even greater challenges for state-of-the-art instance recognition approaches. Based on this new dataset, we are also announcing two new Kaggle challenges—Landmark Recognition 2019 and Landmark Retrieval 2019—and releasing the source code and model for Detect-to-Retrieve, a novel image representation suitable for retrieval of specific object instances.
Heatmap of the landmark locations in Google-Landmarks-v2, which demonstrates the increase in the scale of the dataset and the improved geographic coverage compared to last year’s dataset.
Creating the Dataset
A particular problem in preparing Google-Landmarks-v2 was the generation of instance labels for the landmarks represented, since it is virtually impossible for annotators to recognize all of the hundreds of thousands of landmarks that could potentially be present in a given photo. Our solution to this problem was to crowdsource the landmark labeling through the efforts of a world-spanning community of hobby photographers, each familiar with the landmarks in their region.
Selection of images from Google-Landmarks-v2. Landmarks include (left to right, top to bottom) Neuschwanstein Castle, Golden Gate Bridge, Kiyomizu-dera, Burj Khalifa, Great Sphinx of Giza, and Machu Picchu.
Another issue for research datasets is the requirement that images be shared freely and stored indefinitely, so that the dataset can be used to track the progress of research over a long period of time. As such, we sourced the Google-Landmarks-v2 images from Wikimedia Commons, capturing both world-famous and lesser-known local landmarks while ensuring broad geographic coverage (thanks in part to Wiki Loves Monuments), and including photos from public institutions, such as historical photographs that are valuable for testing instance recognition over time.

The Kaggle Challenges
The goal of the Landmark Recognition 2019 challenge is to recognize a landmark presented in a query image, while the goal of Landmark Retrieval 2019 is to find all images showing that landmark. The challenges include cash prizes totaling $50,000 and the winning teams will be invited to present their methods at the Second Landmark Recognition Workshop at CVPR 2019.

Open Sourcing our Model
To foster research reproducibility and help push the field of instance recognition forward, we are also releasing open-source code for our new technique, called Detect-to-Retrieve (which will be presented as a paper in CVPR 2019). This new method leverages bounding boxes from an object detection model to give extra weight to image regions containing the class of interest, which significantly improves accuracy. The model we are releasing is trained on a subset of 86k images from the original Google-Landmarks dataset that were annotated with landmark bounding boxes. We are making these annotations available along with the original dataset here.
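
As a rough sketch of the idea behind Detect-to-Retrieve (not the released code), the snippet below aggregates local image descriptors into a single retrieval descriptor while up-weighting descriptors that fall inside detected landmark bounding boxes. The weighting scheme and helper names are assumptions for illustration.

    # Conceptual sketch of box-weighted descriptor aggregation; an illustration of the
    # idea only, not the released Detect-to-Retrieve implementation.
    import numpy as np

    def in_box(point, box):
        # box = (xmin, ymin, xmax, ymax); point = (x, y)
        x, y = point
        return box[0] <= x <= box[2] and box[1] <= y <= box[3]

    def aggregate_descriptors(descriptors, keypoints, boxes, inside_weight=2.0):
        # descriptors: (N, D) local features; keypoints: (N, 2) image locations;
        # boxes: detected landmark boxes. Returns one L2-normalized global descriptor.
        weights = np.ones(len(descriptors))
        for i, kp in enumerate(keypoints):
            if any(in_box(kp, b) for b in boxes):
                weights[i] = inside_weight   # emphasize regions containing the landmark
        pooled = (weights[:, None] * descriptors).sum(axis=0)
        return pooled / (np.linalg.norm(pooled) + 1e-12)

    # Retrieval then compares query and index images by the similarity of these descriptors.
    descs = np.random.rand(100, 128)
    kps = np.random.rand(100, 2) * 640
    boxes = [(100.0, 50.0, 400.0, 300.0)]
    global_desc = aggregate_descriptors(descs, kps, boxes)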

We invite researchers and ML enthusiasts to participate in the Landmark Recognition 2019 and Landmark Retrieval 2019 Kaggle challenges and to join the Second Landmark Recognition Workshop at CVPR 2019. We hope that this dataset will help advance the state-of-the-art in instance recognition and image retrieval. The data is being made available via the Common Visual Data Foundation.

Acknowledgments
The core contributors to this project are Andre Araujo, Bingyi Cao, Jack Sim and Tobias Weyand. We would like to thank our team members Daniel Kim, Emily Manoogian, Nicole Maffeo, and Hartwig Adam for their kind help. Thanks also to Marvin Teichmann and Menglong Zhu for their contribution to collecting the landmark bounding boxes and developing the Detect-to-Retrieve technique. We would like to thank Will Cukierski and Maggie Demkin for their help organizing the Kaggle challenge, Elan Hourticolon-Retzler, Yuan Gao, Qin Guo, Gang Huang, Yan Wang, Zhicheng Zheng for their help with data collection, Tsung-Yi Lin for his support with CVDF hosting, as well as our CVPR workshop co-organizers Bohyung Han, Shih-Fu Chang, Ondrej Chum, Torsten Sattler, Giorgos Tolias, and Xu Zhang. We have great appreciation for the Wikimedia Commons Community and their volunteer contributions to an invaluable photographic archive of the world’s cultural heritage. And finally, we’d like to thank the Common Visual Data Foundation for hosting the dataset.

Source: Google AI Blog


Season of Docs announces participating organizations

Season of Docs has announced the 50 participating open source organizations! You can view the list of participating organizations on the website.

Technical writer applications open on May 29, 2019 at 18:00 UTC. 

During the technical writer exploration phase, which runs from now until May 28, 2019, technical writers can explore the list of participating organizations and their project ideas. They should reach out to the organizations to gain a better understanding of them and to discuss project ideas before applying to Season of Docs.

For more information about the technical writer exploration phase, visit the technical writer guide on the website.

What is Season of Docs?

Documentation is essential to the adoption of open source projects as well as to the success of their communities. Season of Docs brings together technical writers and open source projects to foster collaboration and improve documentation in the open source space. You can find out more about the program on the introduction page of the website.

During the program, technical writers spend a few months working closely with an open source community. They bring their technical writing expertise to the project's documentation and, at the same time, learn about the open source project and new technologies.

The open source projects work with the technical writers to improve the project's documentation and processes. Together, they may choose to build a new documentation set, redesign the existing docs, or improve and document the project's contribution procedures and onboarding experience.

How do I take part in Season of Docs as a technical writer?

First, take a look at the technical writer guide on the website. The guide includes information on eligibility and the application process.

The technical writer exploration phase runs from April 30 - May 28, 2019. During this period, you can explore the list of participating organizations and their project ideas. When you find one or more projects that interest you, you should approach the relevant open source organization directly to discuss project ideas.

Then, read the guide to creating a technical writer application and prepare your application materials. On May 29, 2019 at 18:00 UTC, Season of Docs will begin accepting technical writer applications and will publish a link to the application form on the website. The deadline for technical writer applications is June 28, 2019 at 18:00 UTC.

Is there a stipend for participating technical writers?

Yes. There is an optional stipend that technical writers can request as part of their application. The stipend amount is calculated based on the technical writer's home location. See the technical writer stipends page for more information.

If you have any questions about the program, please email us at season-of-docs-support@googlegroups.com.

General timeline

  • April 30 - May 28: Technical writers explore the list of participating organizations and project ideas.
  • May 29 - June 28: Technical writers submit their proposals to Season of Docs. 
  • July 30: Google announces the accepted technical writer projects.
  • August 1 - September 1: Community bonding: Technical writers get to know mentors and the open source community, and refine their projects in collaboration with their mentors.
  • September 2 - November 29: Technical writers work with open source mentors on the accepted projects, and submit their work at the end of the period.
  • December 10: Google publishes the list of successfully completed projects.
See the full timeline for details, including the provision for projects that run longer than three months.

Care to join us?

Explore the Season of Docs website at g.co/seasonofdocs to learn more about participating in the program. Use our logo and other promotional resources to spread the word. Examine the timeline, check out the FAQ, and apply now!

By Andrew Chen, Google Open Source and Sarah Maddox, Google Technical Writer

Google Open Source Peer Bonus winners are here!

At Google we’ve always used open source to innovate, build amazing products, and bring better technology to the world. We also enjoy being part of the community and are always looking for ways to give back.

In 2011 we launched the Google Open Source Peer Bonus program with the goal of supporting the ecosystem and sustainability of open source by rewarding external developers for their contributions to open source projects. Over the years the program has grown and expanded. Now we reward not just software developers but all types of contributors, including technical writers, user experience and graphic designers, community managers and marketers, mentors and educators, ops and security experts.

We are very pleased to announce the latest Google Open Source Peer Bonus Winners and their projects. We have a record number of 90 recipients this cycle representing 20 countries all over the world: Australia, Belgium, Canada, China, France, Germany, India, Italy, Japan, the Netherlands, Poland, Russia, Singapore, Slovakia, South Africa, Spain, Sweden, United Kingdom, Ukraine and USA.

Below is the list of projects and awardees who gave us permission to thank them publicly:
Name – Project | Name – Project
Cyril Tovena – Agones | Vincent Demeester – Knative Build Pipeline
Rebecca Close – AMPHTML | Nader Ziada – knative/build
Leon Tan – AMPHTML | Jim Angel – Kubernetes
Wassim Chegham – Angular | Zach Arnold – Kubernetes
Paul Gschwendtner – Angular Material | Serguei Bezverkhi – Kubernetes
Maxim Koretskyi – Angular-in-depth blog | Damini Satya Kammakomati – Kubernetes
Kaxil Naik – Apache Airflow | Jennifer Rondeau – Kubernetes
Kohei Sutou – Apache Arrow | Michael Fromberger – Kythe
Matthias Baetens – Apache Beam | Mark Brown – Linux kernel
Lukasz Gajowy – Apache Beam | Luis Chamberlain – Linux kernel
Suneel Marthi – Apache Beam | Tetsuo Handa – Linux kernel
Maximilian Michels – Apache Beam | Takashi Iwai – Linux kernel
Alex Van Boxel – Apache Beam | Heiko Stuebner – Linux kernel
Thomas Weise – Apache Beam | Cong Wang – Linux kernel
Julian Hyde – Apache Calcite | Richard Hughes – Linux Vendor Firmware Service
Lan Sun – Apache Groovy | Aaron Puchert – LLVM/Clang
Campion Fellin – Apps Script CLI (Clasp) | Orne Brocaar – LoRa Server
Nicolò Ribaudo – Babel | Graeme Rocher – Micronaut
Rong Jie Loo – Bazel | Anders F Björklund – minikube
Dave Mielke – BRLTTY | Iskren Chernev – Moment JS
Raphael Kubo da Costa – Chromium | Tim Deschryver – NgRx
Mike Banon – coreboot | Brandon Roberts – NgRx
Elyes Haouas – coreboot | Eelco Dolstra – NixOS
Angel Pons – coreboot | Guy Bedford – Node.js
Ansgar Burchardt – Debian | Yaw Anokwa – Open Data Kit
Chris Lamb – Debian's Reproducible Builds | Andreas Bartels – Open Location Code
Zach Leatherman – eleventy | Wes McKinney – pandas
Vladimir Glavnyy – FlatBuffers | Pradyun Gedam – pip
Alexandre Ardhuin – Flutter | Marvin Hagemeister – preact
Kyle Wong – Flutter | Andre Wiggins – preact
Duncan Lyall – Forseti Security | Chris Roche – protoc-gen-validate (PGV)
Ross Scroggs – GAM (Google Apps Manager) | Ernest Durbin – Python Package Index (PyPI)
Gert van Dijk – Gerrit | Ramon Santamaria – raylib
Luca Milanesio – Gerrit Code Review | Aleksa Sarai – runC
David Ostrovsky – Gerrit Code Review | Cornelius Weig – skaffold
David Pursehouse – Gerrit Code Review | Anton Lindqvist – syzkaller
Matthias Sohn – Gerrit Code Review | Zdenko Podobný – Tesseract
Derrick Stolee – Git | Keqiu Hu – TonY
Roman Lebedev – Google Benchmark | Basarat Ali Syed – TypeScript Deep Dive (book)
Florent Revest – googlecartographer/cartographer_ros | Peter Wong – V8
Kirill Katsnelson – gRPC | Kevin Murray – Verilog to Routing
Eddie Kohler – hotcrp | Darrell Commander – VirtualGL
Daniel-Constantin Mierla – Kamailio | Lin Clark – Wasi + Wasmtime
Philipp Crocoll – Keepass2Android Password Safe | Sébastien Helleu – Weechat
Shashwathi Reddy – Knative build | Wesley Shields – Yara
Congratulations to our recipients! We look forward to your continued support and contributions to open source!

By Maria Tabak, Google Open Source

Evaluating the Unsupervised Learning of Disentangled Representations



The ability to understand high-dimensional data, and to distill that knowledge into useful representations in an unsupervised manner, remains a key challenge in deep learning. One approach to solving these challenges is through disentangled representations: models that capture the independent features of a given scene in such a way that if one feature changes, the others remain unaffected. If done successfully, a machine learning system that is designed to navigate the real world, such as a self-driving car or a robot, can disentangle the different factors and properties of objects and their surroundings, enabling the generalization of knowledge to previously unobserved situations. While unsupervised disentanglement methods have already been used for curiosity-driven exploration, abstract reasoning, visual concept learning, and domain adaptation for reinforcement learning, recent progress in the field makes it difficult to know how well different approaches work and the extent of their limitations.

In "Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations" (to appear at ICML 2019), we perform a large-scale evaluation on recent unsupervised disentanglement methods, challenging some common assumptions in order to suggest several improvements to future work on disentanglement learning. This evaluation is the result of training more than 12,000 models covering most prominent methods and evaluation metrics in a reproducible large-scale experimental study on seven different data sets. Importantly, we have also released both the code used in this study as well as more than 10,000 pretrained disentanglement models. The resulting library, disentanglement_lib, allows researchers to bootstrap their own research in this field and to easily replicate and verify our empirical results.

Understanding Disentanglement
To better understand the ground-truth properties of an image that can be encoded in a disentangled representation, first consider the ground-truth factors of the data set Shapes3D. In this toy model, shown in the figure below, each panel represents one factor that could be encoded into a vector representation of the image. The model shown is defined by the shape of the object in the middle of the image, its size, the rotation of the camera and the color of the floor, the wall and the object.
Visualization of the ground-truth factors of the Shapes3D data set: Floor color (upper left), wall color (upper middle), object color (upper right), object size (bottom left), object shape (bottom middle), and camera angle (bottom right).
The goal of disentangled representations is to build models that can capture these explanatory factors in a vector. The figure below presents a model with a 10-dimensional representation vector. Each of the 10 panels visualizes what information is captured in one of the 10 different coordinates of the representation. From the top right and the top middle panel we see that the model has successfully disentangled floor color, while the two bottom left panels indicate that object color and size are still entangled.
Visualization of the latent dimensions learned by a FactorVAE model (see below). The ground-truth factors wall and floor color as well as rotation of the camera are disentangled (see top right, top center and bottom center panels), while the ground-truth factors object shape, size and color are entangled (see top left and the two bottom left images).
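
The panel-per-coordinate visualization above can be produced with a simple latent traversal: encode an image, vary one coordinate of the representation while holding the others fixed, and decode. The sketch below assumes a trained encoder/decoder pair with the generic interfaces shown; it is not the disentanglement_lib API.

    # Latent traversal sketch: vary one latent coordinate at a time and decode, to see
    # which ground-truth factor (if any) that coordinate controls. The encoder and
    # decoder callables are assumed to exist; this is not disentanglement_lib code.
    import numpy as np

    def latent_traversal(encoder, decoder, image, dim, values=np.linspace(-2.0, 2.0, 7)):
        z = encoder(image[None])[0]        # representation of one image, shape (latent_dim,)
        frames = []
        for v in values:
            z_mod = z.copy()
            z_mod[dim] = v                 # move along a single coordinate
            frames.append(decoder(z_mod[None])[0])
        return frames                      # inspect: does exactly one factor change?

    # Repeating this for every coordinate (e.g. the 10 above) reproduces the figure:
    # a disentangled coordinate changes one factor (say, floor color), while an
    # entangled one changes several (say, object color and size) at once.
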
Key Results of this Reproducible Large-scale Study
While the research community has proposed a variety of unsupervised approaches to learn disentangled representations based on variational autoencoders and has devised different metrics to quantify their level of disentanglement, to our knowledge no large-scale empirical study has evaluated these approaches in a unified manner. We propose a fair, reproducible experimental protocol to benchmark the state of unsupervised disentanglement learning by implementing six different state-of-the-art models (BetaVAE, AnnealedVAE, FactorVAE, DIP-VAE I/II and Beta-TCVAE) and six disentanglement metrics (BetaVAE score, FactorVAE score, MIG, SAP, Modularity and DCI Disentanglement). In total, we train and evaluate 12,800 such models on seven data sets. Key findings of our study include:
  • We do not find any empirical evidence that the considered models can be used to reliably learn disentangled representations in an unsupervised way, since random seeds and hyperparameters seem to matter more than the model choice. In other words, even if one trains a large number of models and some of them are disentangled, these disentangled representations seemingly cannot be identified without access to ground-truth labels. Furthermore, good hyperparameter values do not appear to consistently transfer across the data sets in our study. These results are consistent with the theorem we present in the paper, which states that the unsupervised learning of disentangled representations is impossible without inductive biases on both the data set and the models (i.e., one has to make assumptions about the data set and incorporate those assumptions into the model).
  • For the considered models and data sets, we cannot validate the assumption that disentanglement is useful for downstream tasks, e.g., that with disentangled representations it is possible to learn with fewer labeled observations.
The figure below demonstrates some of these findings. The choice of random seed across different runs has a larger impact on disentanglement scores than the model choice and the strength of regularization (while naively one might expect that more regularization should always lead to more disentanglement). A good run with a bad hyperparameter can easily beat a bad run with a good hyperparameter.
The violin plots show the distribution of FactorVAE scores attained by different models on the Cars3D data set. The left plot shows how the distribution changes as different disentanglement models are considered while the right plot displays the different distributions as the regularization strength in a FactorVAE model is varied. The key observation is that the violin plots substantially overlap which indicates that all methods strongly depend on the random seed.
Based on these results, we make four observations relevant to future research:
  1. Given the theoretical result that the unsupervised learning of disentangled representations without inductive biases is impossible, future work should clearly describe the imposed inductive biases and the role of both implicit and explicit supervision.
  2. Finding good inductive biases for unsupervised model selection that work across multiple data sets persists as a key open problem.
  3. The concrete practical benefits of enforcing a specific notion of disentanglement of the learned representations should be demonstrated. Promising directions include robotics, abstract reasoning and fairness.
  4. Experiments should be conducted in a reproducible experimental setup on a diverse selection of data sets.
Open Sourcing disentanglement_lib
In order for others to verify our results, we have released disentanglement_lib, the library we used to create the experimental study. It contains open-source implementations of the considered disentanglement methods and metrics, a standardized training and evaluation protocol, as well as visualization tools to better understand trained models.

The advantages of this library are three-fold. First, with less than four shell commands disentanglement_lib can be used to reproduce any of the models in our study. Second, researchers may easily modify our study to test additional hypotheses. Third, disentanglement_lib is easily extendible and can be used to bootstrap research into the learning of disentangled representations—it is easy to implement new models and compare them to our reference implementation using a fair, reproducible experimental setup.

Reproducing all the models in our study requires a computational effort of approximately 2.5 GPU years, which can be prohibitive. So, we have also released >10,000 pretrained disentanglement_lib models from our study that can be used together with disentanglement_lib.

We hope that this will accelerate research in this field by allowing other researchers to benchmark their new models against our pretrained models and to test new disentanglement metrics and visualization approaches on a diverse set of models.

Acknowledgments
This research was done in collaboration with Francesco Locatello, Mario Lucic, Stefan Bauer, Gunnar Rätsch, Sylvain Gelly and Bernhard Schölkopf at Google AI Zürich, ETH Zürich and the Max-Planck Institute for Intelligent Systems. We also wish to thank Josip Djolonga, Ilya Tolstikhin, Michael Tschannen, Sjoerd van Steenkiste, Joan Puigcerver, Marcin Michalski, Marvin Ritter, Irina Higgins and the rest of the Google Brain team for helpful discussions, comments, technical help and code contributions.

Source: Google AI Blog


MorphNet: Towards Faster and Smaller Neural Networks



Deep neural networks (DNNs) have demonstrated remarkable effectiveness in solving hard problems of practical relevance such as image classification, text recognition and speech transcription. However, designing a suitable DNN architecture for a given problem continues to be a challenging task. Given the large search space of possible architectures, designing a network from scratch for your specific application can be prohibitively expensive in terms of computational resources and time. Approaches such as Neural Architecture Search and AdaNet use machine learning to search the design space in order to find improved architectures. An alternative is to take an existing architecture for a similar problem and, in one shot, optimize it for the task at hand.

Here we describe MorphNet, a sophisticated technique for neural network model refinement, which takes the latter approach. Originally presented in our paper, “MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks”, MorphNet takes an existing neural network as input and produces a new neural network that is smaller, faster, and yields better performance tailored to a new problem. We've applied the technique to Google-scale problems to design production-serving networks that are both smaller and more accurate, and now we have open sourced the TensorFlow implementation of MorphNet to the community so that you can use it to make your models more efficient.

How it Works
MorphNet optimizes a neural network through a cycle of shrinking and expanding phases. In the shrinking phase, MorphNet identifies inefficient neurons and prunes them from the network by applying a sparsifying regularizer such that the total loss function of the network includes a cost for each neuron. However, rather than applying a uniform cost per neuron, MorphNet calculates a neuron cost with respect to the targeted resource. As training progresses, the optimizer is aware of the resource cost when calculating gradients, and thus learns which neurons are resource-efficient and which can be removed.

As an example, consider how MorphNet calculates the computation cost (e.g., FLOPs) of a neural network. For simplicity, let's think of a neural network layer represented as a matrix multiplication. In this case, the layer has 2 inputs (x1, x2), 6 weights (a, b, ..., f), and 3 outputs (y1, y2, y3; the neurons). Using the standard textbook method of multiplying rows and columns, you can work out that evaluating this layer requires 6 multiplications.
Computation cost of neurons.
MorphNet calculates this cost as the product of input count and output count. Note that although the example on the left shows weight sparsity, where two of the weights are 0, we still need to perform all the multiplications to evaluate this layer. However, the middle example shows structured sparsity, where all the weights in the row for one neuron are 0. MorphNet recognizes that the new output count for this layer is 2, and the number of multiplications for this layer drops from 6 to 4. Using this idea, MorphNet can determine the incremental cost of every neuron in the network to produce a more efficient model (right) where neuron y3 has been removed.
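
The counting rule above is easy to reproduce directly. The toy sketch below treats a fully connected layer as a weight matrix and computes its multiplication count as (active inputs) x (active outputs), so zeroing an entire row (one neuron) drops the cost from 6 to 4, while scattered zero weights do not. This only illustrates the bookkeeping, not the MorphNet regularizer itself.

    # Toy illustration of the FLOP bookkeeping for one fully connected layer:
    # cost = (active inputs) * (active outputs). Scattered zero weights do not reduce
    # the cost; only an all-zero row (a removable neuron) does.
    import numpy as np

    def layer_cost(W):
        # W has shape (outputs, inputs); a neuron is active if its row isn't all zero,
        # and an input is active if its column isn't all zero.
        active_outputs = np.any(W != 0, axis=1).sum()
        active_inputs = np.any(W != 0, axis=0).sum()
        return active_inputs * active_outputs

    dense        = np.array([[1., 2.], [3., 4.], [5., 6.]])  # 2 inputs, 3 neurons -> 6
    unstructured = np.array([[0., 2.], [3., 0.], [5., 6.]])  # two zero weights    -> still 6
    structured   = np.array([[1., 2.], [0., 0.], [5., 6.]])  # one all-zero row    -> 4

    print(layer_cost(dense), layer_cost(unstructured), layer_cost(structured))  # 6 6 4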

In the expanding phase, we use a width multiplier to uniformly expand all layer sizes. For example, if we expand by 50%, then an inefficient layer that started with 100 neurons and shrank to 10 would only expand back to 15, while an important layer that only shrank to 80 neurons might expand to 120 and have more resources with which to work. The net effect is re-allocation of computational resources from less efficient parts of the network to parts of the network where they might be more useful.
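
As a minimal sketch of the expansion arithmetic just described, the few lines below apply one uniform width multiplier to the shrunken layer sizes; the rounding choice is an assumption.

    # Expanding phase sketch: apply one uniform width multiplier to every shrunken layer.
    # Layers that kept more neurons during shrinking get more absolute capacity back,
    # which re-allocates resources toward the parts of the network that use them best.
    def expand(shrunken_widths, multiplier=1.5):
        return [max(1, round(w * multiplier)) for w in shrunken_widths]

    # A layer that shrank from 100 to 10 neurons grows back only to 15,
    # while one that shrank to 80 grows to 120.
    print(expand([10, 80]))   # [15, 120]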

One could halt MorphNet after the shrinking phase to simply cut back the network to meet a tighter resource budget. This results in a more efficient network in terms of the targeted cost, but can sometimes yield a degradation in accuracy. Alternatively, the user could also complete the expansion phase, which would match the original target resource cost but with improved accuracy. We'll cover an example of this full implementation later.

Why MorphNet?
There are four key value propositions offered by MorphNet:
  • Targeted Regularization: The approach that MorphNet takes towards regularization is more intentional than other sparsifying regularizers. In particular, the MorphNet approach to induce better sparsification is targeted at the reduction of a particular resource (such as FLOPs per inference or model size). This enables better control of the network structures induced by MorphNet, which can be markedly different depending on the application domain and associated constraints. For example, the left panel of the figure below presents a baseline network with the commonly used ResNet-101 architecture trained on JFT. The structures generated by MorphNet when targeting FLOPs (center, with 40% fewer FLOPs) or model size (right, with 43% fewer weights) are dramatically different. When optimizing for computation cost, higher-resolution neurons in the lower layers of the network tend to be pruned more than lower-resolution neurons in the upper layers. When targeting smaller model size, the pruning tradeoff is the opposite. 
    Targeted Regularization by MorphNet. Rectangle width is proportional to the number of channels in the layer. The purple bar at the bottom is the input layer. Left: Baseline network used as input to MorphNet. Center: Output applying FLOP regularizer. Right: Output applying size regularizer.
    MorphNet stands out as one of the few solutions available that can target a particular parameter for optimization. This enables it to target parameters for a specific implementation. For example, one could target latency as a first-order optimization parameter in a principled manner by incorporating device-specific compute-time and memory-time.
  • Topology Morphing: As MorphNet learns the number of neurons per layer, the algorithm could encounter a special case of sparsifying all the neurons in a layer. When a layer has 0 neurons, this effectively changes the topology of the network by cutting the affected branch from the network. For example, in the case of a ResNet architecture, MorphNet might keep the skip-connection but remove the residual block as shown below (left). For Inception-style architectures, MorphNet might remove entire parallel towers as shown on the right.
    Left: MorphNet can remove residual connections in ResNet-style networks. Right: It can also remove parallel towers in Inception-style networks.
  • Scalability: MorphNet learns the new structure in a single training run and is a great approach when your training budget is limited. MorphNet can also be applied directly to expensive networks and datasets. For example, in the comparison above, MorphNet was applied directly to ResNet-101, which was originally trained on JFT at a cost of 100s of GPU-months.
  • Portability: MorphNet produces networks that are "portable" in the sense that they are intended to be retrained from scratch and the weights are not tied to the architecture learning procedure. You don't have to worry about copying checkpoints or following special training recipes. Simply train your new network as you normally would!

Morphing Networks
As a demonstration, we applied MorphNet to Inception V2 trained on ImageNet by targeting FLOPs (see below). The baseline approach is to use a width multiplier to trade off accuracy and FLOPs by uniformly scaling down the number of outputs for each convolution (red). The MorphNet approach targets FLOPs directly and produces a better trade-off curve when shrinking the model (blue). In this case, FLOP cost is reduced 11% to 15% with the same accuracy as compared to the baseline.
MorphNet applied to Inception V2 on ImageNet. Applying the flop regularizer alone (blue) improves the performance relative to baseline (red) by 11-15%. A full cycle, including both the regularizer and width multiplier, yields an increase in accuracy for the same cost (“x1”; purple), with continued improvement from a second cycle (“x2”; cyan).
At this point, you could choose one of the MorphNet networks to meet a smaller FLOP budget. Alternatively, you could complete the cycle by expanding the network back to the original FLOP cost to achieve better accuracy for the same cost (purple). Repeating the MorphNet shrink/expand cycle again results in another accuracy increase (cyan), leading to a total accuracy gain of 1.1%.

Conclusion
We’ve applied MorphNet to several production-scale image processing models at Google. Using MorphNet resulted in significant reduction in model-size/FLOPs with little to no loss in quality. We invite you to try MorphNet—the open source TensorFlow implementation can be found here, and you can also read the MorphNet paper for more details.

Acknowledgements
This project is a joint effort of the core team including: Elad Eban, Ariel Gordon, Max Moroz, Yair Movshovitz-Attias, and Andrew Poon. We also extend a special thanks to our collaborators, residents and interns: Shraman Ray Chaudhuri, Bo Chen, Edward Choi, Jesse Dodge, Yonatan Geifman, Hernan Moraldo, Ofir Nachum, Hao Wu, and Tien-Ju Yang for their contributions to this project.

Source: Google AI Blog


Bringing the best of open source to Google Cloud customers


Google’s belief in an open cloud stems from our deep commitment to open source. We believe open source is an essential part of the public cloud: It’s the foundation of IT infrastructure worldwide and has been a part of Google’s foundation since day one. This is reflected in our contributions to projects like Kubernetes, TensorFlow, Go, and many more.

Today, we’re taking our commitment to open source to the next level by announcing strategic partnerships with leading open source-centric companies in the areas of data management and analytics, including:
  • Confluent
  • DataStax
  • Elastic
  • InfluxData
  • MongoDB
  • Neo4j
  • Redis Labs
We’ve always seen our friends in the open-source community as equal collaborators, and not simply a resource to be mined. With that in mind, we’ll be offering managed services operated by these partners that are tightly integrated into Google Cloud Platform (GCP), providing a seamless user experience across management, billing and support. This makes it easier for our enterprise customers to build on open source technologies, and it delivers on our commitment to continually support and grow these open source communities.

Making open source even more accessible with a cloud-native experience

The open-source database market is big, and growing fast. According to SearchDataManagement.com, “more than 70% of new applications developed by corporate users will run on an open source database management system, and half of the existing relational database installations built on commercial DBMS technologies will be converted to open source platforms or [are] in the process of being converted."

This mirrors what we hear from our customers—that you want to be able to use open-source technology easily and in a cloud-native way. The partnerships we are announcing today make this possible by offering an elevated experience similar to Google’s native services. It also means that you aren’t locked in or out when you are using these technologies—we think that’s important for our customers and our partners.

Here are some of the benefits these partnerships will offer:
  • Fully managed services running in the cloud, with best efforts made to optimize performance and latency between the service and application.
  • A single user interface to manage apps, which includes the ability to provision and manage the service from the Google Cloud Console.
  • Unified billing, so you get one invoice from Google Cloud that includes the partner’s service.
  • Google Cloud support for the majority of these partners, so you can manage and log support tickets in a single window and not have to deal with different providers.
To further our mission of making GCP the best destination for open source-based services, we will work with our partners to build integrations with native GCP services like Stackdriver for monitoring and IAM, validate these services for security, and optimize performance for users.

Partnering with leaders in open source

The partners we are announcing today include several of the top-ranked databases in their respective categories. We’re working alongside these creators and supporting the growth of these companies’ technologies to inspire strong customer experiences and adoption. These new partners include:

Confluent: Founded by the team that built Apache Kafka, Confluent builds an event streaming platform that lets companies easily access data as real-time streams. Learn more.

DataStax: DataStax powers enterprises with its always-on, distributed cloud database built on Apache Cassandra and designed for hybrid cloud. Learn more.

Elastic: As the creators of the Elastic Stack, Elastic builds self-managed and SaaS offerings that make data usable in real time and at scale for search use cases, like logging, security, and analytics. Learn more.

InfluxData: InfluxData’s time series platform can instrument, observe, learn and automate any system, application and business process across a variety of use cases. InfluxDB (developed by InfluxData) is an open-source time series database optimized for fast, high-availability storage and retrieval of time series data in fields such as operations monitoring, application metrics, IoT sensor data, and real-time analytics. Learn more.

MongoDB: MongoDB is a modern, general-purpose database platform that brings software and data to developers and the applications they build, with a flexible model and control over data location. Learn more.

Neo4j: Neo4j is a native graph database platform specifically optimized to map, store, and traverse networks of highly connected data to reveal invisible contexts and hidden relationships. By analyzing data points and the connections between them, Neo4j powers real-time applications. Learn more.

Redis Labs: Redis Labs is the home of Redis, the world’s most popular in-memory database, and commercial provider of Redis Enterprise. It offers performance, reliability, and flexibility for personalization, machine learning, IoT, search, e-commerce, social, and metering solutions worldwide. Learn more.

We’re pleased to bring these partner technologies to you. Partnering with the companies that invest in developing open-source technologies means you get benefits like expertise in operating these services at scale, additional enterprise features, and shorter cycles in bringing the latest innovation to the cloud. 

We look forward to seeing what you build with these open source technologies. Learn more here about open source on GCP.

By Chris DiBona, Director, Open Source and Kevin Ichhpurani, Corporate VP, Global Ecosystem and Business Development

Cross-posted from the Google Cloud Blog