Open source SystemVerilog tools in ASIC design

Open source hardware is undeniably undergoing a renaissance whose origin can be traced to the establishment of the RISC-V Foundation (later redubbed RISC-V International). The open ISA and ecosystem, in which Antmicro has participated since the beginning as a Founding member, has sparked many open source CPU implementations, new tooling, methodologies, and trends which allow for more collaborative and software-driven design.

Many of those broader open hardware activities have been finding a home in CHIPS Alliance, an open source organization we participate in as a Platinum member alongside Google, Intel, Western Digital, SiFive and others, whose goals explicitly encompass:
  • creating and maintaining open source ASIC and FPGA design tools (digital and analog)
  • open source core and uncore IP
  • interconnects, interoperability specs and more
This aligns closely with Antmicro’s mission: we have been heavily involved with many of the projects inside of and related to CHIPS, providing commercial support, engineering services, and assistance in practical adoption for enterprise deployments.

Today, a range of everyday design, development, testing, and verification tasks are already possible using open source tools and components, and they are part of our own and our customers’ everyday workflows. Other developments are within reach given a reasonable amount of development, which we can provide based on specific scenarios. Others still are much further away, but with dedicated efforts inside CHIPS in which we are involved together with partners like Google and Western Digital, there is a pathway towards a completely open hardware design and verification ecosystem. This will eventually unlock incredible potential in new design methodologies, vertical integration capabilities, and education and business opportunities. Until then, Antmicro can help you extract practical value in many scenarios such as simulation, linting, formatting, synthesis, continuous integration and more.

Building a SystemVerilog ecosystem in CHIPS

Some of the challenges towards practical adoption of open source in ASIC design have been related to the fact that a significant proportion of advanced ASIC design is done in SystemVerilog, a fairly complex and powerful language in its own right, which used to be poorly supported in the open source tooling ecosystem. Partial solutions like SystemVerilog to Verilog converters or paid plugins existed, but direct support lagged behind, making open source tools for SystemVerilog a difficult sell previously.

Fortunately, this has been changing rapidly thanks to a dedicated development effort spearheaded by Google and Antmicro. Projects in this space include Verible, Surelog, UHDM and sv-tests, which we have been developing and integrating with existing tools like Yosys and Verilator under the umbrella of the SymbiFlow open source FPGA project, and which are now officially being transferred into the CHIPS Alliance to increase awareness and build a broader SystemVerilog ecosystem.

In this note, we will walk you through the state of the art in new SystemVerilog capabilities in open source projects, and invite you to reach out to see how CHIPS Alliance’s SystemVerilog projects can be useful to you today or in the near future.

Verible

The Verible project originated at Google; its main mission is to make SystemVerilog easily and quickly parsable for a wide variety of applications mostly focusing on developer tools.

Verible is a set of tools based on a common SystemVerilog parsing engine, providing a command line interface which makes integration with other tools for daily usage or CI systems for automatic testing and deployment a breeze.

Antmicro has been involved in the development of Verible since its initial open source release and we now provide a significant portion of current development efforts, helping adapt it for use in various open source projects or commercial environments that use SystemVerilog. One notable user is the security-focused OpenTitan project, which has driven many interesting developments and provides a good showcase of Verible's capabilities, since it is completely open source, well documented, fairly complex, and used in real applications.

Linter

One of the most common use cases for Verible is linting. The linter analyzes code for patterns and constructs that are deemed undesirable according to the implemented lint rules. The rules follow authoritative style guides that can be enforced on a project or company level in various SystemVerilog projects.

The rules range from simple ones, like making sure the module name matches the file name, to more sophisticated ones, like checking variable naming conventions (all caps, snake case, specific prefix or suffix etc.) or making sure the labels after the begin and end statements match.
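To give a feel for what a naming rule checks, here is a minimal Python sketch that tests a name against one such convention: lower_snake_case with an `_if` suffix. The regular expression and the function name are assumptions made for this illustration; Verible implements its rules on top of its own syntax tree, not with a standalone regex like this.

```python
import re

# Hypothetical pattern for this sketch: lower_snake_case ending in "_if".
INTERFACE_NAME = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*_if")


def is_valid_interface_name(name: str) -> bool:
    """Return True if name is lower_snake_case and ends with _if."""
    return INTERFACE_NAME.fullmatch(name) is not None
```

With this pattern, a name such as `axi_stream_if` passes, while `AxiStream` or `axi_stream` would be flagged.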

A full list of rules can be found in the Verible lint documentation and is constantly growing. Usage is very simple:

$ verible-verilog-lint --ruleset all core.sv
core.sv:3:11: Interface names must use lower_snake_case naming convention and end with _if. [Style: interface-conventions] [interface-name-style]

The output of the linter is easy to understand, as the way issues are reported to the user is modeled after popular programming language compilers.
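Because the report follows the familiar file:line:column layout, it is straightforward to post-process, for example in a CI job. The following Python sketch parses one line in the format shown above. The `LintIssue` type and `parse_lint_line` helper are names invented for this example, and the pattern assumes the exact layout of the sample output, which may differ between Verible versions.

```python
import re
from typing import NamedTuple, Optional


class LintIssue(NamedTuple):
    path: str
    line: int
    column: int
    message: str  # includes the "[Style: ...]" reference, if present
    rule: str     # name from the last bracketed group on the line


# Assumed layout: path:line:column: message [rule-name]
_ISSUE = re.compile(
    r"^(?P<path>[^:]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<msg>.*) \[(?P<rule>[^\]]+)\]\s*$"
)


def parse_lint_line(text: str) -> Optional[LintIssue]:
    """Parse one linter report line, or return None if it does not match."""
    m = _ISSUE.match(text)
    if m is None:
        return None
    return LintIssue(m["path"], int(m["line"]), int(m["col"]), m["msg"], m["rule"])
```

A script could run the linter, feed each output line through `parse_lint_line`, and group the issues by `rule` or by `path` before reporting them.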

The linter is highly configurable. It is possible to select which rules are checked, and some rules allow for detailed configuration (e.g. max line length).

Rules can also be selectively waived in specific files or at specific lines or even by regex matching. In addition, some rules can be automatically fixed by the linter itself.

Formatter

The Verible formatter is a complementary tool for the linter. It is used to automatically detect various formatting issues like improper indentation or alignment. As opposed to the linter, it only detects and fixes issues that have no lexical impact on the source code.

The formatter also comes with useful helper scripts for selective and interactive reformatting (e.g. only format files that changed according to git, ask before applying changes to each chunk).

A toolset that consists of both the linter and the formatter can effectively remove all the discussions about styling, preferences and conventions from all pull requests. Developers can then focus solely on the technical aspects of the proposed changes.

$ cat sample.sv
typedef struct {
bit first;
        bit second;
bit
   third
        ;
  bit fourth;
bit fifth; bit sixth;
}
 foo_t;

$ verible-verilog-format sample.sv
typedef struct {
  bit first;
  bit second;
  bit third;
  bit fourth;
  bit fifth;
  bit sixth;
} foo_t;

Indexer

The Verible parser itself can be relatively easily used to perform many other tasks. One of the interesting use cases is generating a Kythe compatible indexing database.

Indexing a SystemVerilog project makes it very easy to collaborate on a project remotely. It is possible to navigate through the source code using nothing more than a web browser.

The Kythe integration can be served on an arbitrary server, can be deployed after every commit in a project, etc. A showcase of the indexing mechanism can be found in our GitHub repository. The demo downloads the latest version of the Ibex core, indexes it, and deploys it to be viewed on a remote machine. The results can be viewed on the example index webpage.

Indexing is widely adopted for many larger open source software projects.

Thanks to Verible, it is now possible to do the same in the world of open source HDL designs, and of course private, company-wide deployments like this are also possible.

Surelog and UHDM

SystemVerilog is a powerful language but also complex. So far no open source tools have been able to support it in full. Implementing it separately for each project such as the Yosys synthesis tool or the Verilator simulator would take a colossal amount of time, and that’s where Surelog and UHDM come in.

Surelog, originally created and led by Alain Dargelas, aims to be a fully-featured SystemVerilog 2017 preprocessor, parser, and elaborator. It’s a modern tool and thus follows the current version of the SV standard without unnecessary deviations or legacy baggage.

What’s interesting is that Surelog is only a language frontend designed to integrate well with other tools—it outputs an elaborated design in an intermediate format called UHDM.

UHDM stands for Universal Hardware Data Model, and it’s both a file format for storing hardware designs and a library able to manipulate this format. A client application can access the data using VPI, which is a standard programming interface for SystemVerilog.

What this means is that the work required to create a SystemVerilog parser only needs to be done once, and other tools can use that parser via UHDM. This is much easier than implementing a full SystemVerilog parser within each tool. What’s more, any improvements in the unified parser will provide benefits for all client applications. Finally, any other parser is free to emit UHDM as well, so in the future we might see e.g. a UHDM backend for Verible.

Just like in Verible’s case, both Surelog and UHDM have recently been contributed into CHIPS Alliance to drive a broader adoption. We are actively contributing to both projects, especially around the integrations with tooling such as Yosys and Verilator, and practical use in open source and customer projects.

Recent Antmicro contributions adding UHDM frontends for Yosys and Verilator enabled Ibex synthesis and simulation. The complete OpenTitan project is the next milestone.

The Surelog/UHDM/Yosys flow enabling SystemVerilog synthesis without the necessity of converting the HDL code to Verilog is a great improvement for open source ASIC build flows such as OpenROAD’s OpenLane flow (which we also support commercially). Removing the code conversion step enables the developers to perform e.g. circuit equivalence validation to check the correctness of the design.

More information about Surelog/UHDM and Verible can be found in a dedicated CHIPS Alliance presentation that was recently given by Henner Zeller, Google’s Verible lead.

UVM is in the picture

No open source ASIC design toolkit can be complete without support for Universal Verification Methodology, or UVM, which is one of the most widespread verification methodologies for large-scale ASIC design. This has also been an underrepresented area in open source tooling and changing that is an enormous undertaking, but working together with our customers, most notably Western Digital, we have been making progress on that front as well.

Across the ASIC development landscape, UVM verification is currently performed with proprietary simulators, but a more easily distributable, collaborative and open ecosystem is needed to close the feedback loop between (emerging) open source design approaches and verification. Verilator is an extremely popular choice for other system development use cases but it has historically not focused on UVM-style verification. Other styles of verification, such as the very interesting and popular Python-based cocotb framework maintained by FOSSi Foundation, have been enabled in Verilator. But support for UVM, partly due to the size and complexity of the methodology, has been notably absent.

One of the features missing from Verilator but needed for UVM is SystemVerilog stratified scheduling, which is a set of rules specified in the standard that govern the way time progresses in a simulation, as well as the order of operations. A SystemVerilog simulation is divided into smaller steps called time slots, and each time slot is further divided into multiple regions. Specific events can only happen in certain regions, and some regions can reoccur in a single time slot.
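To make the idea of regions more concrete, here is a deliberately simplified Python sketch of a single time slot. It models only three of the regions defined by the standard (Active, Inactive and NBA, the region where nonblocking assignment updates take effect) and always runs the earliest non-empty region next, so a later region can cause an earlier one to reoccur. The `TimeSlot` class and `swap_with_nonblocking` function are invented for this illustration and do not reflect how Verilator or any other simulator is implemented.

```python
from collections import deque

# Simplified subset of the regions in a time slot, in execution order.
REGIONS = ("active", "inactive", "nba")


class TimeSlot:
    def __init__(self):
        self.queues = {region: deque() for region in REGIONS}

    def schedule(self, region, event):
        self.queues[region].append(event)

    def run(self):
        # Repeatedly drain the earliest non-empty region. An event may
        # schedule more events, so earlier regions can run again.
        while True:
            region = next((r for r in REGIONS if self.queues[r]), None)
            if region is None:
                return
            pending = self.queues[region]
            self.queues[region] = deque()
            for event in pending:
                event(self)


def swap_with_nonblocking(signals):
    """Model `a <= b; b <= a;`: sample in active, update in nba."""
    slot = TimeSlot()

    def assign(target, source):
        def evaluate(s):
            value = signals[source]  # right-hand side is sampled now
            s.schedule("nba", lambda _s: signals.__setitem__(target, value))
        return evaluate

    slot.schedule("active", assign("a", "b"))
    slot.schedule("active", assign("b", "a"))
    slot.run()
    return signals
```

Because both right-hand sides are sampled in the active region and the updates are deferred to the NBA region, calling `swap_with_nonblocking({"a": 1, "b": 2})` exchanges the two values instead of copying one over the other.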

Until recently, Verilator had implemented only a small subset of these rules, as all scheduling was being done at compilation time. Spearheading a long-standing development effort within CHIPS Alliance, in collaboration with the maintainer of Verilator, Wilson Snyder, we have built a proof-of-concept version of Verilator with a dynamic scheduler, which manages the occurrence of certain events at runtime, extending the stratified scheduling support. More details can be found in Antmicro’s presentation for the inaugural CHIPS Alliance Deep Dive Cafe Talk.

Another feature required for UVM is constrained randomization, which allows generating random inputs to feed to a design in order to thoroughly test it. Unlike unconstrained randomization, which is already provided by Verilator, it allows the user to specify some rules for input generation, thus limiting the possible value space and making sure that the input makes sense. Work on adding this to Verilator has already started, although the feature is still in its infancy. There are many other features on the roadmap which will eventually enable practical UVM support—stay tuned with our CHIPS Alliance events to follow that development.
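The difference between unconstrained and constrained randomization can be sketched in a few lines of Python. The example below uses plain rejection sampling: it draws unconstrained values and retries until every constraint holds. This is only an illustration of the concept; the `randomize` helper and the field names are hypothetical, and production simulators typically rely on dedicated constraint solvers, not on retrying.

```python
import random


def randomize(fields, constraints, rng, max_tries=10_000):
    """Draw values until all constraints hold.

    fields: name -> (low, high), inclusive integer range
    constraints: list of predicates taking the candidate dict
    """
    for _ in range(max_tries):
        candidate = {
            name: rng.randint(low, high) for name, (low, high) in fields.items()
        }
        if all(check(candidate) for check in constraints):
            return candidate
    raise RuntimeError("no solution found within max_tries")


# Hypothetical bus transaction: word-aligned address, burst must fit in 256 bytes.
FIELDS = {"addr": (0, 255), "length": (1, 16)}
CONSTRAINTS = [
    lambda t: t["addr"] % 4 == 0,
    lambda t: t["addr"] + t["length"] <= 256,
]

transaction = randomize(FIELDS, CONSTRAINTS, random.Random(1))
```

Every transaction returned this way is random but still legal for the design under test, which is the property that UVM testbenches depend on.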

What next?

Support for SystemVerilog parsers, for the intermediate format, and for their respective backends and integrations with various tooling, as well as for UVM is now under heavy development. If you would like to see more effort put into a specific area, reach out to us at [email protected]. Antmicro offers commercial support services to extend the flows we’ve briefly presented here to various practical applications and designs, and to effectively integrate this approach into people’s workflows.

Adding to this our cloud expertise, Antmicro customers can benefit from a complete and industry-proven methodology scalable between teams and across on-premise and cloud installations, transforming chip design workflows to be more software-driven and collaborative. To take advantage of open source solutions with tools like Verilator, Yosys, OpenROAD and others, tell us about your use case and we will see what can be done today.

If you are interested in collaborating on the development of SystemVerilog-focused and other open hardware tooling, join CHIPS Alliance, participate in our workgroups, and help us push innovation in ASIC design forward.

Originally posted on the Antmicro blog.

By guest author Michael Gielda, Antmicro, and Tim Ansell, Software Engineer

Announcing the latest Open Source Peer Bonus winners

The Google Open Source Peer Bonus program is designed to reward external open source contributors nominated by Googlers for their exceptional contributions to open source. We are very excited to announce our latest round of 112 winners—a new record—from 33 countries! We’re also sharing some comments by Googlers about what the Open Source Peer Bonus program means to them.

“I've nominated a number of open source contributors for the Peer Bonus program. Since most people volunteer out of passion for a project and expect nothing in return, getting an email from Google thanking them for their contribution carries a lot of meaning.” — Jason Miller

The Open Source Peer Bonus program rewards open source enthusiasts for contributions across open source, including code contributions, community work, documentation, mentoring, and other types of open source contribution—if a Googler believes that someone has made a positive contribution to an open source project, that person can be nominated for an Open Source Peer Bonus.

“Open Source is core to work at Google—it's the very spirit of its community and users. The Open Source Peer Bonus represents the way we want to share the spirit with everyone who feels the same spirit and puts it into developing cool stuff out there!” — Cristina Conti

Collaboration and innovation lie at the core of open source, advancing modern technology and removing barriers. Google relies on open source for many of our products and services and we are thrilled to have an opportunity to give back to the community by rewarding open source contributors.

“I've been active in the open-source community for many years. I've often been amazed by some contributors who go out of their way to help me and others; fix bugs, implement features, provide support and do code reviews. Since I started working at Google, I've had the privilege of nominating a few of these contributors for the Open Source Peer Bonus. I'm happy to see their effort get support and recognition from the corporate world. I hope that other big tech companies follow Google's lead in this regard.” — Ram Rachum

“Developers that take the time to share their code and expertise with the larger developer community help empower us all to make better software. Android demos can help other devs get their apps working and also helps Google see gaps and room for improvements in APIs or documentation. Open-source developers are an invaluable part of the ecosystem! Thank you!” — Emilie Roberts

Below is the list of current winners who gave us permission to thank them publicly:

Winner | Open Source Project
Neil Pang | acmesh-official
Bryn Rhodes | Android FHIR SDK
Simon Marquis | Android Install Referrer
Alexey Knyazev | ANGLE
Mike Hardy | ankidroid
Jeff Geerling | Ansible, Drupal
Jan Lukavský | Apache Beam
Phil Sturgeon | APIs You Won't Hate
Joseph Kearney | autoimpute
Olek Wojnar | Bazel
Jesse Chan | Bazel Hardware Description Language Build Rules
Pierre Quentel | Brython
Elizabeth Barron | CHAOSS
Mathias Buus | chromecasts
Matthew Kotsenas | CIPD (Part of Chrome CI software)
Orta Therox | CocoaPods
Matt Godbolt | Compiler Explorer
Dmitry Safonov | CRIU
Adrian Reber | CRIU (Checkpoint/Restore in User-space)
Prerak Mann | Dart - package:ffigen
Alessandro Arzilli | delve
Derek Parker | delve
Sarthak Gupta | DRS-Filer (elixir-cloud-aai)
Eddie Jaoude | Eddiehub
Josh Holtz | fastlane
Eduardo Silva | Fluent Bit
Mike Rydstrom | Flutter
Balvinder Singh Gambhir | Flutter
James Clarke | Flutter
Jody Donetti | FusionCache
Jenny Bryan | gargle
Gennadii Donchyts | gee-community
Ævar Arnfjörð Bjarmason | Git
Joel Sing | Go
Sean Liao | Go
Cuong Manh Le | Go
Daniel Martí | gofumpt
Cristian Bote | Goober
Romulo Santos | Google Cloud Community
Jenn Viau | GoogleCloudPlatform / gke-poc-toolkit
Nikita Shoshin | gopls
Mulr Manders | gopls
Shirou Wakayama | gopsutil
Pontus Leitzler | govim
Paul Jolly | govim
Arsala Bangash | Grey Software
Santiago Torres-Arias | In-Toto
David Wu | KataGo
Alexey Odinokov | kpt, kpt-functions-catalog, and kustomize
Alvaro Aleman | Kubernetes
Manuel de Brito Fontes | Kubernetes
Arnaud Meukam | Kubernetes
Federico Gimenez | Kubernetes
Elana Hashman | Kubernetes
Katrina Verey | Kustomize
Max Kellermann | MusicPlayerDaemon/MPD
Kamil Myśliwiec | NestJS
Weyert de Boer | Node.js Pub/Sub Client Library
James McKinney | Open Civic Data Division Identifiers
Angelos Tzotsos | OSGeo-Live, pycsw, GeoNode, OSGeo Foundation board member (non-paid), and more ...
Daniel Axtens | Patchwork
Ero Carrera | pefile
Nathaniel Brough | Pigweed
Alex Hall | PySnooper
Loic Mathieu | Quarkus Google Cloud Services
Federico Brigante | Refined Github
Michael Long | Resolver
Bruno Levy | RISC-V Ecosystem on FPGAs
Mara Bos | Rust
Eddy B. | Rust
Aleksey Kladov | Rust Analyzer
Noel Power | Samba
David Barri | scalajs-react
Marco Vermeulen | SDKman
Naveen Srinivasan | Security Scorecards
Marina Moore | Sigstore
Feross Aboukhadijeh | simple-peer
Ajay Ramachandran | SponsorBlock
Eddú Meléndez Gonzales | Spring Cloud GCP
Dominik Honnef | staticcheck
Zoe Carver | Swift
Rodrigo Melo | SymbiFlow + Open Source FPGA Tooling Ecosystem
Carlos de Paula | SymbiFlow and RISC-V ecosystem
Naoya Hatta | System Verilog Test Suite
Mike Popoloski | System Verilog Test Suite
Soule Ba | Tekton
Priti Desai | Tekton
Joyce Er | TensorBoard
Vignesh Kothapalli | TensorFlow
Hyeyoon Lee | TensorFlow
Akhil Chinnakotla | TensorFlow
Stephen Wu | TensorFlow
Vishnu Banna | TensorFlow
Haidong Rong | TensorFlow
Sean Morgan | TensorFlow
Jason Zaman | TensorFlow
Yong Tang | TensorFlow
Mahamed Ali | Terraform Provider Google
Sayak Paul | tfhub.dev
Aidan Doherty | The Good Docs Project
Alyssa Rock | The Good Docs Project
Heinrich Schuchardt | U-Boot
Aditya Sharma | User Story (GSoC project)
Dan Clark | V8
Armin Brauns | Verilog to Routing & SymbiFlow
Marwan Sulaiman | vscode-go
Ryan Christian | WMR & Microbundle
Yaroslav Podorvanov | yaroslav-harakternik
Anirudh Vegesana | Yolo
Alistair Miles | zarr

Thank you for your contributions to open source! Congratulations!

By Erin McKean and Maria Tabak —Google Open Source Programs Office

Using Saliency in progressive JPEG XL images

At Google, we are working towards improving the web experience for users. Getting images delivered fast is a crucial part of the web experience, and progressive images can help deliver the salient parts, detected by machine learning, first. When you look at an image, you don’t immediately look at the entire image, but tend to gaze at the most interesting, or “salient”, parts of the image first. When delivering images over the web, it is now possible to organize the data in such a way that the most salient parts arrive first. Ideally you don’t even notice that some less salient parts have not yet arrived, because by the time you look at those parts they have already arrived and rendered.

We will explain how this works with the new open source image format JPEG XL, but we’ll start by taking a step back and describing how images are currently delivered and rendered on the web.

How partial images are displayed on the web

It’s important that web sites including images load quickly, because waiting for images to load causes frustration. Two techniques in particular are used to make images appear fast: One is showing an approximation of the image before all bytes of the image are transmitted, often known as “progressive image loading.” Another is making the byte size of the image smaller by using strong image compression.

What is progressive image loading?

Some image formats are implemented in a way that does not allow any kind of progressive image loading; all the bytes of the image have to be received before rendering can begin. The next simplest type of image loading is sometimes called “sequential image loading.” For these images, the data is organized in a way that pixels come in a particular order, typically in rows and from top to bottom.

Formats with this kind of image loading include PNG, WebP, and JPEG. The JPEG format also allows more sophisticated forms of progressive images. Here, we can organize the data so that it comes in multiple scans, with each scan showing more detail than the previous one.

For example, even if only approximately 15% of the data for an image is loaded, the result is often already decent. The following images compare the original with 15% of the bytes loaded using no progression, sequential loading, and progressive JPEG:

100% of bytes loaded, original image

15% of bytes loaded, no progressive image loading

15% of bytes loaded, sequential image loading

15% of bytes loaded, progressive JPEG

In the first scan, the progressive JPEG only has a small amount of information available for the image (e.g. only the average color of 8x8 blocks). This is known as the DC-only scan, because the average color of each 8x8 block is called the DC component in the discrete cosine transform, which is the basis of JPEG image compression. Check out this Computerphile video on JPEG DCT for a basic introduction. Instead of displaying an image that consists of 8x8 blocks, JPEG rendering in Chrome and Firefox chooses to render the preview with some smoothing, to provide a less distracting experience.
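As a rough illustration of what a DC-only preview contains, the Python sketch below averages each 8x8 block of a grayscale image stored as a list of rows. It is a simplification for intuition only: a real JPEG encoder works per color component on level-shifted samples and quantizes the DC coefficients, none of which is modeled here. The `block_means` name is made up for this example.

```python
def block_means(pixels, block=8):
    """Average each block x block tile of a 2-D grayscale image (list of rows)."""
    height, width = len(pixels), len(pixels[0])
    means = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            # Tiles at the right and bottom edges may be smaller than block x block.
            tile = [
                pixels[y][x]
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            ]
            row.append(sum(tile) / len(tile))
        means.append(row)
    return means
```

The result has one value per block, so it is 8 times smaller than the input in each dimension; a decoder then upsamples and smooths it to produce the preview.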

Progressive JPEG XLs

While the quality (and therefore byte-sizes) of the individual scans in a progressive JPEG image can be controlled, the order within a scan is still top to bottom, like in a sequential JPEG. JPEG XL goes beyond that by making it possible to send the data necessary to display all details of the most salient parts first, followed by the less salient parts. For example, in a portrait, we can decide to first send the bytes for the face, and then, for the out-of-focus background.

In general, progressive JPEG XL works in the following way:
  • There is always an 8x8 downsampled image available (similar to a DC-only scan in a progressive JPEG). The decoder can display that with a nice upsampling, which gives the impression of a smoothed version of the image.
  • The image is divided into square groups (typically of size 256 x 256) and it is possible to provide an order of these groups during encoding. In particular, we can order the groups by saliency and choose an order that anticipates where the viewer might look first, while not being disturbing.
While the format allows for a very flexible order of the groups, our current encoder chooses a starting group and then grows concentric squares around that group. This is because we expect that this will be less distracting to the user. To make successive updates even less noticeable, we smooth the boundary between groups for which all the data has arrived and those that still contain an incomplete approximation. One requirement of this technique is a good way of identifying where the salient parts of an image are, which is needed when encoding an image. This information is typically represented by a saliency map which can be visualized as a heatmap image, where the more salient parts are redder.
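The concentric-squares ordering can be sketched as follows in Python. The sketch assumes that the starting group is the one with the highest summed saliency, which is our choice for this illustration (the text above only says that the encoder chooses a starting group), and it orders the remaining groups by their Chebyshev distance to the start, so that each square ring is sent before the next one. The function name and the plain list-of-rows saliency map are likewise assumptions and do not reflect the actual encoder code.

```python
def group_order(saliency, group=256):
    """Return (row, col) group indices: starting group first, then square rings."""
    height, width = len(saliency), len(saliency[0])
    rows = (height + group - 1) // group
    cols = (width + group - 1) // group

    def total(r, c):
        # Sum of saliency values inside group (r, c), clipped at image edges.
        return sum(
            saliency[y][x]
            for y in range(r * group, min((r + 1) * group, height))
            for x in range(c * group, min((c + 1) * group, width))
        )

    groups = [(r, c) for r in range(rows) for c in range(cols)]
    start = max(groups, key=lambda g: total(*g))  # assumption: most salient group

    def ring(g):
        # Chebyshev distance: groups on the same square ring share a value.
        return max(abs(g[0] - start[0]), abs(g[1] - start[1]))

    # sorted() is stable, so groups within a ring keep raster order.
    return sorted(groups, key=ring)
```

For a portrait, the group covering the face would typically come first, followed by its eight neighbours, then the next ring, and so on until the whole image is covered.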

Original image (left) and saliency map (right).

Smooth DC-image (left) and image with group order (right).

Stay tuned for videos showing progressive JPEG XL in action.

How to find good saliency maps for images

Saliency prediction models (overview) aim at predicting which regions in an image will attract human attention. To predict saliency effectively, our model leverages the power of deep neural nets to consider both high level semantic signals like faces, objects, shapes etc., as well as low or medium level signals like color, intensity, texture, and so on. The model is trained on a large scale public gaze/saliency data set, to make sure the predicted saliency best mimics human gaze/fixation behaviour on each image. The model takes an image as the input and outputs a saliency map, which can serve as a visual importance map, and hence help determine the decoding order for each region in the image. Example images and their predicted saliency are as follows:

Example images and their predicted saliency

At the time of writing (July 2021), Chrome and Firefox did not yet support decoding JPEG XL images progressively in the way we describe, but the spec does allow encoding arbitrary group orders.

Different users have different experiences when it comes to looking at images loading on the web. We hope that this way of progressively delivering images will improve user experience, especially on lower-bandwidth connections.

By Moritz Firsching and Junfeng He – Google Research

Using Saliency in progressive JPEG XL images

At Google, we are working towards improving the web experience for users. Getting images delivered fast is a crucial part of the web experience and progressive images can help getting the salient parts, detected by machine learning, first. When you look at an image, you don’t immediately look at the entire image, but tend to gaze at the most interesting, or “salient”, parts of the image first. When delivering images over the web, it is now possible to organize the data in such a way that the most salient parts arrive first. Ideally you don’t even notice that some less salient parts have not yet arrived, because by the time you look at those parts they have already arrived and rendered.

We will explain how this works with the new open source image format JPEG XL, but we’ll start by taking a step back and describing how images are currently delivered and rendered on the web.

How partial images are displayed on the web

It’s important that web sites including images load quickly, because waiting for images to load causes frustration. Two techniques in particular are used to make images appear fast: One is showing an approximation of the image before all bytes of the image are transmitted, often known as “progressive image loading.” Another is making the byte size of the image smaller by using strong image compression.

What is progressive image loading?

Some image formats are implemented in a way that does not allow any kind of progressive image loading; all the bytes of the image have to be received before rendering can begin. The next, most simple, type of image loading is sometimes called “sequential image loading.” For these images, the data is organized in a way that pixels come in a particular order, typically in rows and from top to bottom.

Formats with this kind of image loading include PNG, webp, and JPEG. The JPEG format allows more sophisticated forms of progressive images. Here, we can organize the data so that it comes in multiple scans, with each scan showing more detail than the previous one.

For example, even if only approximately 15% of the data for an image is loaded, it often already has decent results. See the following images comparing no progression:

100% of bytes loaded, original image
100% of bytes loaded, original image

15% of bytes loaded, no progressive image loading
15% of bytes loaded, no progressive image loading

15% of bytes loaded, sequential image loading
15% of bytes loaded, sequential image loading

100% of bytes loaded, original image
15% of bytes loaded, progressive JPEG

In the first scan, the progressive JPEG only has a small amount of information available for the image, (e.g. only the average color of 8x8 blocks). Known as the DC-only scan, because the average color of each 8x8 block is called DC-component in the discrete cosine transform, it is the basis of JPEG image compression. Check out this computerphile video on JPEG DCT for a basic introduction. Instead of displaying an image that consists of 8x8 blocks, JPEG rendering in Chrome and Firefox choose to render the preview with some smoothing, to provide a less distracting experience.

Progressive JPEG XLs

While the quality (and therefore byte-sizes) of the individual scans in a progressive JPEG image can be controlled, the order within a scan is still top to bottom, like in a sequential JPEG. JPEG XL goes beyond that by making it possible to send the data necessary to display all details of the most salient parts first, followed by the less salient parts. For example, in a portrait, we can decide to first send the bytes for the face, and then, for the out-of-focus background.

In general, progressive JPEG XL works in the following way:
  • There is always an 8x8 downsampled image available (similar to a DC-only scan in a progressive JPEG). The decoder can display that with a nice upsampling, which gives the impression of a smoothed version of the image.
  • The image is divided into square groups (typically of size 256 x 256) and it is possible to provide an order of these groups during encoding. In particular, we can order the groups by saliency and choose an order that anticipates where the viewer might look first, while not being disturbing.
While the format allows for a very flexible order of the groups, our current encoder chooses a starting group and then grows concentric squares around that group, because we expect this to be less distracting to the user. To make successive updates even less noticeable, we smooth the boundary between groups for which all the data has arrived and those that still contain an incomplete approximation.

One requirement of this technique is a good way of identifying the salient parts of an image at encoding time. This information is typically represented by a saliency map, which can be visualized as a heatmap image in which the more salient parts are redder.
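
The concentric-square ordering described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual JPEG XL encoder: the function name `group_order` is hypothetical, the ring index is taken to be the Chebyshev distance from the starting group, and groups within the same ring are simply kept in raster order.

```python
import math


def group_order(width, height, start_x, start_y, group_size=256):
    """Order groups as concentric squares around the group containing (start_x, start_y)."""
    cols = math.ceil(width / group_size)
    rows = math.ceil(height / group_size)
    start_col = min(start_x // group_size, cols - 1)
    start_row = min(start_y // group_size, rows - 1)

    groups = [(row, col) for row in range(rows) for col in range(cols)]

    def ring(group):
        # Chebyshev distance: all groups on the same square ring share one value.
        row, col = group
        return max(abs(row - start_row), abs(col - start_col))

    # sorted() is stable, so groups on the same ring stay in raster order
    # (an assumption made here; the encoder may order them differently).
    return sorted(groups, key=ring)
```

For a 768 x 768 image with the starting point near the center, `group_order(768, 768, 400, 400)` returns the center group `(1, 1)` first, followed by the eight groups surrounding it.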

Original image (left) next to its saliency map (right).

Smooth DC-image (left) next to the image with group order (right).

Stay tuned for videos showing progressive JPEG XL in action.

How to find good saliency maps for images

Saliency prediction models (overview) aim to predict which regions of an image will attract human attention. To predict saliency effectively, our model uses deep neural nets to consider both high-level semantic signals such as faces, objects, and shapes, and low- or medium-level signals such as color, intensity, and texture. The model is trained on a large-scale public gaze/saliency data set so that the predicted saliency closely mimics human gaze/fixation behaviour on each image. The model takes an image as input and outputs a saliency map, which can serve as a visual importance map and hence help determine the decoding order for each region in the image. Example images and their predicted saliency are as follows:

Example images and their predicted saliency
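
As a rough illustration of how a saliency map could feed into the group order, the Python sketch below picks the group with the highest mean saliency as a starting group. The post does not say how the encoder actually chooses its starting group, so this selection rule, the function name `most_salient_group`, and the representation of the saliency map as a list of rows of numbers are all assumptions.

```python
def most_salient_group(saliency, group_size=256):
    """Return (row, col) of the group with the highest mean saliency."""
    height, width = len(saliency), len(saliency[0])
    best_group, best_score = None, float("-inf")
    for top in range(0, height, group_size):
        for left in range(0, width, group_size):
            # Collect the saliency values that fall inside this group.
            values = [saliency[y][x]
                      for y in range(top, min(top + group_size, height))
                      for x in range(left, min(left + group_size, width))]
            score = sum(values) / len(values)
            if score > best_score:
                best_group = (top // group_size, left // group_size)
                best_score = score
    return best_group
```

The returned group index could then serve as the center from which the concentric squares of groups are grown.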

At the time of writing (July 2021), Chrome and Firefox did not yet support decoding JPEG XL images progressively in the way we describe, but the spec does allow encoding arbitrary group orders.

Different users have different experiences when it comes to looking at images loading on the web. We hope that this way of progressively delivering images will improve the user experience, especially on lower-bandwidth connections.

By Moritz Firsching and Junfeng He – Google Research

El Carro extends the flexibility and choices for Oracle databases on Kubernetes

When we released El Carro, our goal was to provide the best possible experience for running Oracle databases on Kubernetes with the help of our operator. Today, we want to take a closer look at how that works. The diagram below shows the high-level architecture of a database that is managed by El Carro. At the core is the actual database instance with its background processes, which run in a single container that contains the Oracle installation.

So how does this container image get created and what goes into it? The image itself is essentially a snapshot of a filesystem that contains an operating system, packages and other software, and custom scripts. Specifically for El Carro, an image is made up of a base OS, required packages, and an Oracle database installation. The image must be stored in a container registry that is accessible by the Kubernetes cluster, and El Carro will expect Oracle binaries to be installed in certain paths, or create symbolic links to those locations.

Architecture Diagram showing the operator controlling the db container.

Initially, El Carro worked with 12c for Enterprise Edition and 18c for Express Edition. And while 12c is still popular with many users, the extended support ended this summer. So the first news is that we added support for 19c, Oracle’s long term release. The choice should be easy for any new database deployments, but the options don’t end there.

We know that DBAs have different preferences in how and where software gets installed, and we believe that making different options available will ultimately empower users. With the exception of Express Edition, Oracle licenses do not grant the right to redistribute the software, which prevents the community from providing a public container registry with usable images. Instead, each user has to build their own image from binaries they download from Oracle themselves, under their own license agreement with Oracle. All of the other containers used by the El Carro operator use open source software and are made available on our public registry, so that you do not have to build and host them yourself.

Option 1 - Use El Carro to build your own image with GCP

If you are using GCP, we have an easy way for you to create custom images: upload the Oracle binaries and patches to your own GCS bucket and start a Cloud Build job that creates the container image for you and uploads it to your own private container registry. A single build script and serverless cloud services take care of the whole process, so that you don’t have to worry about building locally and moving more images across the internet. In addition to creating seeded images (see below), this method also allows you to build containers with Oracle patches such as Release Update Revisions (RURs).

Diagram of container image build pipeline where a cloud build job reads installation files from GCS and writes finished images to GCR.

Option 2 - Use El Carro to build your own image locally

You can also use the same Dockerfile and build process from Option 1—but without Google Cloud. Download Oracle installers and patches locally or to a VM used for the builds—then start a script that invokes Docker and builds the image on that machine. Lastly, tag and push the container image to a container registry of your choice. You will have to do a few more steps yourself if you don’t use Cloud Build, but you get the same image and customization options as with Option 1.

Option 3 - Use Oracle build scripts to build your own image

Oracle also maintains an open source repository of scripts to build container images with their database. Maybe you are already using those images with either Docker or Kubernetes, or you prefer Oracle’s own build method over ours. We recently added functionality to El Carro to make sure that the resulting images work just as well as the ones that El Carro can build for you.

Option 4 - Use Oracle’s Container Registry directly

There is a way to avoid building your own images: The Oracle Container Registry contains pre-built images that can be used with our Kubernetes operator directly and without modification. But since Oracle’s registry can only be accessed by customers, it is protected with a password. After accepting Oracle’s license conditions, one can either copy images to their own registry, or configure OCR as a private repository in Kubernetes.

The Power of Seeding

Aside from the installation, creating the database takes the longest time in the initial provisioning process, and it is often a frustrating wait before you can log in and use your database for the first time. To reduce this wait time, the first two options allow you to build a pre-seeded database image that already contains a snapshot of a created and configured database. That way, this initialization step is moved to the container build process, which minimizes the startup time of new database instances.

Aside from reducing the wait time, relying on a seeded image (i.e. including an empty database in the image) can provide consistency in configuration options if the same image is to be used in multiple deployments.

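                         | Option 1 - El Carro on GCP | Option 2 - El Carro local build | Option 3 - Oracle local build | Option 4 - Oracle Container Registry
-------------------------+----------------------------+---------------------------------+-------------------------------+-------------------------------------
Versions                 | 12c, 18c, 19c              | 12c, 18c, 19c                   | 12c, 18c, 19c                 | 19c
Editions                 | XE, EE                     | XE, EE                          | EE                            | EE
Patch updates            | yes                        | yes                             | no                            | no
Seeded images            | yes                        | yes                             | no                            | no
Automatic build pipeline | yes                        | no                              | no                            | n/a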
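
To show how the comparison above can guide a choice, here is a small Python sketch that transcribes it into a dictionary and filters the options by requirement. The helper name `matching_options` and the dictionary layout are made up for illustration and are not part of El Carro; the values are copied from the comparison, with `None` standing in for "n/a".

```python
# Data transcribed from the comparison above; None stands for "n/a".
OPTIONS = {
    "Option 1 - El Carro on GCP": {
        "versions": {"12c", "18c", "19c"}, "editions": {"XE", "EE"},
        "patches": True, "seeded": True, "automatic_build": True,
    },
    "Option 2 - El Carro local build": {
        "versions": {"12c", "18c", "19c"}, "editions": {"XE", "EE"},
        "patches": True, "seeded": True, "automatic_build": False,
    },
    "Option 3 - Oracle local build": {
        "versions": {"12c", "18c", "19c"}, "editions": {"EE"},
        "patches": False, "seeded": False, "automatic_build": False,
    },
    "Option 4 - Oracle Container Registry": {
        "versions": {"19c"}, "editions": {"EE"},
        "patches": False, "seeded": False, "automatic_build": None,
    },
}


def matching_options(version, edition, need_patches=False, need_seeded=False):
    """Return the names of the options that satisfy the given requirements."""
    matches = []
    for name, features in OPTIONS.items():
        if version not in features["versions"]:
            continue
        if edition not in features["editions"]:
            continue
        if need_patches and not features["patches"]:
            continue
        if need_seeded and not features["seeded"]:
            continue
        matches.append(name)
    return matches
```

For example, `matching_options("18c", "XE")` returns only Options 1 and 2, since only the El Carro build paths cover Express Edition.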

Conclusion

We believe in an open cloud approach and in empowering users with choice and flexibility. In the context of running Oracle databases on Kubernetes, that means that you get to choose your database container images. El Carro provides build scripts that allow you not only to customize containers but also to increase security and robustness by baking patches and updates into the container image. Seeding container images with a database further reduces the deployment time by avoiding this step on first startup, which is especially useful in environments that create many databases, such as automatic test pipelines.

Other users, however, may feel more comfortable about receiving support when they use Oracle’s pre-built images from Oracle’s registry.

The choice is yours. Just know that El Carro is here to help you modernize your Oracle database workloads with Kubernetes. And if you have any other feature requests or choices that matter to you, let us know by filing an issue on GitHub.

By Bjoern Rost, Product Manager and Ash Gbadamassi, Software Engineer – Cloud Databases

Google Summer of Code 2021: Results announced!

In 2021, our global online program, Google Summer of Code (GSoC), focused on bringing more student developers into open source for 10 weeks from June to August. The program concluded yesterday, August 30th, with the final mentor evaluations of their students. We are pleased to announce that 1,205 students from 67 countries have successfully completed this year’s program. There were also 199 open source organizations and over 2,100 mentors, from 75 countries, that took part in the program. Congratulations to all students and mentors who completed GSoC 2021!

The final step of each GSoC program is the student and mentor evaluations. These help us gain valuable insights from our participants about the impact of the program. Here are some results from this year’s evaluations:
  • 96% of students think that GSoC helped their programming skills
  • 99% of students would recommend their GSoC mentors
  • 94% of students will continue working with their GSoC organization
  • 99% of students plan to continue working on open source
  • 36% of students said GSoC has already helped them get a job or internship
  • 72% of students said they would consider being a mentor
  • 88% of students said they would apply to GSoC again
Evaluations also give students and mentors the opportunity to give suggestions to GSoC program administrators. In past evaluations, a number of students have requested a ‘Student Summit’ in order to help connect their GSoC experience with the wider open source community.

We’re proud to announce that this year we held our first GSoC Student Summit on August 27th. Over 275 students attended the virtual summit! The goal of the Student Summit was to inspire and inform our 2021 students. We included talks from Googlers, GSoC mentors and former students who shared their personal and professional path to GSoC and open source. Students were also able to ask the presenters questions and even participate in trivia games to win prizes! More importantly, the summit was a place and time where students from around the world could come together and celebrate their GSoC accomplishments. Inspired by what they learned from the summit, the students know that while their GSoC time has ended, their open source journey has just begun.

By Romina Vicente, Project Coordinator for the Google Open Source Programs Office

schema-dts turns 1.0: Author valid Schema.org JSON-LD in TypeScript

Today, schema-dts turns 1.0 to properly reflect its current maturity. I started the project in November 2018 to improve the developer experience of writing Structured Data.

The project has continued to improve, validating a broader and more complex subset of Schema.org, improving type-checking performance, and eliminating the runtime bundle entirely. Many of these improvements were only fully understood due to feedback and reports from the community. Today, schema-dts receives more than 100k downloads/week on NPM. These users have helped validate and harden the library over the past few years.

Here are some of the highlighted improvements since the last announcement:

0kb Bundle Runtime Size

The library is now entirely type only. Previously, convenience enums were generated in .js files, but improved TypeScript completions mean that this is no longer necessary.