Category Archives: Open Source Blog

News about Google’s open source projects and programs

Google Summer of Code 2020 mentoring orgs announced!

We are delighted to announce the open source projects and organizations that have been accepted for Google Summer of Code (GSoC) 2020, the 16th year of the program!

After careful review, we have chosen 200 open source projects to be mentor organizations this year, 30 of which are new to the program. Please see the program website for a complete list of the accepted organizations.

Are you a student interested in participating in GSoC this year? We will begin accepting student applications on Monday, March 16, 2020 at 18:00 UTC and the deadline to apply is Tuesday, March 31, 2020 at 18:00 UTC.


The most successful applications come from students who start preparing now, well before the application period begins.
You can find more information on our website which includes a full timeline of important dates. We also highly recommend perusing the FAQ and Program Rules and watching some of our other videos with more details about GSoC for students and mentors.

A hearty congratulations—and thank you—to all of our mentor organizations! We look forward to working with all of you during Google Summer of Code 2020.

By Stephanie Taylor, Google Open Source

AutoFlip: An Open Source Framework for Intelligent Video Reframing

Originally posted on the AI Blog

Videos filmed and edited for television and desktop are typically created and viewed in landscape aspect ratios (16:9 or 4:3). However, with an increasing number of users creating and consuming content on mobile devices, historical aspect ratios don’t always fit the display being used for viewing. Traditional approaches for reframing video to different aspect ratios usually involve static cropping, i.e., specifying a camera viewport, then cropping visual contents that are outside. Unfortunately, these static cropping approaches often lead to unsatisfactory results due to the variety of composition and camera motion styles. More bespoke approaches, however, typically require video curators to manually identify salient contents on each frame, track their transitions from frame-to-frame, and adjust crop regions accordingly throughout the video. This process is often tedious, time-consuming, and error-prone.

To address this problem, we are happy to announce AutoFlip, an open source framework for intelligent video reframing. AutoFlip is built on top of the MediaPipe framework that enables the development of pipelines for processing time-series multimodal data. Taking a video (casually shot or professionally edited) and a target dimension (landscape, square, portrait, etc.) as inputs, AutoFlip analyzes the video content, develops optimal tracking and cropping strategies, and produces an output video with the same duration in the desired aspect ratio.
Left: Original video (16:9). Middle: Reframed using a standard central crop (9:16). Right: Reframed with AutoFlip (9:16). By detecting the subjects of interest, AutoFlip is able to avoid cropping off important visual content.

AutoFlip Overview

AutoFlip provides a fully automatic solution to smart video reframing, making use of state-of-the-art ML-enabled object detection and tracking technologies to intelligently understand video content. AutoFlip detects changes in the composition that signify scene changes in order to isolate scenes for processing. Within each shot, video analysis is used to identify salient content before the scene is reframed by selecting a camera mode and path optimized for the contents.

Shot (Scene) Detection

A scene or shot is a continuous sequence of video without cuts (or jumps). To detect the occurrence of a shot change, AutoFlip computes the color histogram of each frame and compares this with prior frames. If the distribution of frame colors changes at a different rate than a sliding historical window, a shot change is signaled. AutoFlip buffers the video until the scene is complete before making reframing decisions, in order to optimize the reframing for the entire scene.
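As a sketch of the idea (the distance metric, window size, and threshold here are illustrative assumptions, not AutoFlip's actual parameters), a shot change can be signaled when a frame's color histogram diverges from the average over a sliding window of recent frames:

```python
from collections import deque

def histogram_distance(h1, h2):
    """L1 distance between two normalized color histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def detect_shot_changes(frame_histograms, window_size=10, threshold=0.5):
    """Return frame indices where the color distribution diverges
    sharply from the sliding window of prior frames."""
    window = deque(maxlen=window_size)
    changes = []
    for i, hist in enumerate(frame_histograms):
        if window:
            # Average histogram over the historical window.
            avg = [sum(col) / len(window) for col in zip(*window)]
            if histogram_distance(hist, avg) > threshold:
                changes.append(i)
                window.clear()  # start accumulating the new scene
        window.append(hist)
    return changes
```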

Video Content Analysis

We utilize deep learning-based object detection models to find interesting, salient content in the frame. This content typically includes people and animals, but other elements may be identified, depending on the application, including text overlays and logos for commercials, or motion and ball detection for sports.

The face and object detection models are integrated into AutoFlip through MediaPipe, which uses TensorFlow Lite on CPU. This structure allows AutoFlip to be extensible, so developers may conveniently add new detection algorithms for different use cases and video content. Each object type is associated with a weight value, which defines its relative importance — the higher the weight, the more influence the feature will have when computing the camera path.
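The effect of the weights can be sketched as a weighted average of detection centers; the function, weight values, and detection format below are hypothetical, for illustration only:

```python
def focus_point(detections, weights):
    """Weighted average of detection centers: object types with higher
    weights pull the computed camera target toward themselves.

    detections: list of (object_type, center_x) pairs
    weights: mapping of object_type -> relative importance
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for obj_type, center_x in detections:
        w = weights.get(obj_type, 0.0)
        weighted_sum += w * center_x
        total_weight += w
    return weighted_sum / total_weight if total_weight else None
```

With weights {"face": 3.0, "logo": 1.0}, a face at x=100 and a logo at x=300 yield a target of 150: the higher-weight face dominates the camera path.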


Left: People detection on sports footage. Right: Two face boxes (‘core’ and ‘all’ face landmarks). In narrow portrait crop cases, often only the core landmark box can fit.

Reframing

After identifying the subjects of interest on each frame, logical decisions about how to reframe the content for a new view can be made. AutoFlip automatically chooses an optimal reframing strategy — stationary, panning or tracking — depending on the way objects behave during the scene (e.g., moving around or stationary). In stationary mode, the reframed camera viewport is fixed in a position where important content can be viewed throughout the majority of the scene. This mode can effectively mimic professional cinematography in which a camera is mounted on a stationary tripod or where post-processing stabilization is applied. In other cases, it is best to pan the camera, moving the viewport at a constant velocity. The tracking mode provides continuous and steady tracking of interesting objects as they move around within the frame.
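A rough sketch of such a strategy choice (the thresholds and logic are illustrative assumptions, not AutoFlip's actual decision rule): if salient content barely moves, hold the camera; if it drifts roughly linearly, pan; otherwise, track.

```python
def choose_camera_mode(centers, frame_width,
                       stationary_frac=0.1, pan_tolerance=0.05):
    """Pick a reframing mode from per-frame salient-content centers."""
    span = max(centers) - min(centers)
    if span <= stationary_frac * frame_width:
        return "stationary"  # content stays within a narrow band
    # Compare the motion against a constant-velocity (linear) path.
    n = len(centers)
    start, end = centers[0], centers[-1]
    linear = [start + (end - start) * i / (n - 1) for i in range(n)]
    max_deviation = max(abs(c - l) for c, l in zip(centers, linear))
    return "panning" if max_deviation <= pan_tolerance * frame_width else "tracking"
```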

Based on which of these three reframing strategies the algorithm selects, AutoFlip then determines an optimal cropping window for each frame, while best preserving the content of interest. While the bounding boxes track the objects of focus in the scene, they typically exhibit considerable jitter from frame-to-frame and, consequently, are not sufficient to define the cropping window. Instead, we adjust the viewport on each frame through the process of Euclidean-norm optimization, in which we minimize the residuals between a smooth (low-degree polynomial) camera path and the bounding boxes.
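The smoothing step can be sketched with an ordinary least-squares polynomial fit, a simplified stand-in for AutoFlip's per-scene optimization: numpy.polyfit minimizes the Euclidean norm of the residuals between a low-degree polynomial path and the raw bounding-box centers.

```python
import numpy as np

def smooth_camera_path(box_centers, degree=2):
    """Replace jittery per-frame bounding-box centers with a smooth
    low-degree polynomial path fit by least squares."""
    t = np.arange(len(box_centers))
    coeffs = np.polyfit(t, box_centers, degree)  # minimizes ||residuals||_2
    return np.polyval(coeffs, t)
```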

Top: Camera paths resulting from following the bounding boxes from frame-to-frame. Bottom: Final smoothed camera paths generated using Euclidean-norm path formation. Left: Scene in which objects are moving around, requiring a tracking camera path. Right: Scene where objects stay close to the same position; a stationary camera covers the content for the full duration of the scene.

AutoFlip’s configuration graph provides settings for either best-effort or required reframing. If it becomes infeasible to cover all the required regions (for example, when they are too spread out on the frame), the pipeline will automatically switch to a less aggressive strategy by applying a letterbox effect, padding the image to fill the frame. For cases where the background is detected as being a solid color, this color is used to create seamless padding; otherwise a blurred version of the original frame is used.
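The letterbox fallback amounts to scale-to-fit plus symmetric padding. A minimal sketch (the rounding and interface are illustrative):

```python
def letterbox_padding(src_w, src_h, target_w, target_h):
    """Scale the source to fit entirely inside the target frame and
    return the scaled size plus the symmetric padding that fills the rest."""
    scale = min(target_w / src_w, target_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (target_w - new_w) // 2  # left/right padding
    pad_y = (target_h - new_h) // 2  # top/bottom padding
    return new_w, new_h, pad_x, pad_y
```

Converting a 1920x1080 (16:9) frame to a 1080x1920 (9:16) portrait target, for example, yields a 1080x608 image with 656 pixels of padding above and below, which would be filled with the detected solid color or a blurred copy of the frame.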

AutoFlip Use Cases

We are excited to release this tool directly to developers and filmmakers, reducing the barriers to their design creativity and reach through the automation of video editing. The ability to adapt any video format to various aspect ratios is becoming increasingly important as the diversity of devices for video content consumption continues to rapidly increase. Whether your use case is portrait to landscape, landscape to portrait, or even small adjustments like 4:3 to 16:9, AutoFlip provides a solution for intelligent, automated and adaptive video reframing.


What’s Next?

Like any machine learning algorithm, AutoFlip can benefit from an improved ability to detect objects relevant to the intent of the video, such as speaker detection for interviews or animated face detection on cartoons. Additionally, a common issue arises when input video has important overlays on the edges of the screen (such as text or logos) as they will often be cropped from the view. By combining text/logo detection and image inpainting technology, we hope that future versions of AutoFlip can reposition foreground objects to better fit the new aspect ratios. Lastly, in situations where padding is required, deep uncrop technology could provide improved ability to expand beyond the original viewable area.

While we work to improve AutoFlip internally at Google, we encourage contributions from developers and filmmakers in the open source communities.

Acknowledgments

We would like to thank our colleagues who contributed to AutoFlip: Alexander Panagopoulos, Jenny Jin, Brian Mulford, Yuan Zhang, Alex Chen, Xue Yang, Mickey Wang, Justin Parra, Hartwig Adam, Jingbin Wang, and Weilong Yang. Thanks also to the MediaPipe team, who helped with open sourcing: Jiuqiang Tang, Tyler Mullen, Mogan Shieh, Ming Guang Yong, and Chuo-Ling Chang.

By Nathan Frey, Senior Software Engineer, Google Research, Los Angeles and Zheng Sun, Senior Software Engineer, Google Research, Mountain View

HarbourBridge: From PostgreSQL to Cloud Spanner

Would you like to try out Cloud Spanner with data from an existing PostgreSQL database? Maybe you’ve wanted to ‘kick the tires’ on Spanner, but have been discouraged by the effort involved?

Today, we’re announcing a tool that makes trying out Cloud Spanner using PostgreSQL data simple and easy.

HarbourBridge is a tool that loads Spanner with the contents of an existing PostgreSQL database. It requires zero configuration—no manifests or data maps to write. Instead, it ingests pg_dump output, automatically builds a Spanner schema, and creates a new Spanner database populated with data from pg_dump.

HarbourBridge is part of the Cloud Spanner Ecosystem, a collection of public, open source repositories contributed to, owned, and maintained by the Cloud Spanner user community. None of these repositories are officially supported by Google as part of Cloud Spanner.

Get up and running fast

HarbourBridge is designed to simplify Spanner evaluation, and in particular to bootstrap the process by getting moderate-size PostgreSQL datasets into Spanner (up to a few GB). Many features of PostgreSQL, especially those that don't map directly to Spanner features, are ignored, e.g. (non-primary) indexes, functions and sequences.

View HarbourBridge as a way to get up and running fast, so you can focus on critical things like tuning performance and getting the most out of Spanner. Expect that you'll need to tweak and enhance what HarbourBridge produces; more on this later.

Quick-start guide

The HarbourBridge README contains a step-by-step quick-start guide. We’ll quickly review the main steps. Before you begin, you'll need a Cloud Spanner instance, Cloud Spanner API enabled for your Google Cloud project, authentication credentials configured to use the Cloud API, and Go installed on your development machine.

To download HarbourBridge and install it, run
go get -u github.com/cloudspannerecosystem/harbourbridge
The tool should now be installed as $GOPATH/bin/harbourbridge. To use HarbourBridge on a PostgreSQL database called mydb, run
pg_dump mydb | $GOPATH/bin/harbourbridge
The tool will use the cloud project specified by the GCLOUD_PROJECT environment variable, automatically determine the Cloud Spanner instance associated with this project, convert the PostgreSQL schema for mydb to a Spanner schema, create a new Cloud Spanner database with this schema, and finally, populate this new database with the data from mydb. HarbourBridge also generates several files when it runs: a schema file, a report file (with details of the conversion), and a bad data file (if any data is dropped). See Files Generated by HarbourBridge.

Take care with ACLs

Note that PostgreSQL table-level and row-level ACLs are dropped during conversion, since Spanner does not support them (Spanner manages access control at the database level). All data written to Spanner will be visible to anyone who can access the database created by HarbourBridge, which inherits default permissions from your Cloud Spanner instance.

Next steps

The tables created by HarbourBridge provide a starting point for evaluation of Spanner. While they preserve much of the core structure of your PostgreSQL schema and data, many important PostgreSQL features have been dropped.

In particular, HarbourBridge preserves primary keys but drops all other indexes. This means that the out-of-the-box performance you get from the tables created by HarbourBridge can be significantly slower than PostgreSQL performance. If HarbourBridge has dropped indexes that are important to the performance of your SQL queries, consider adding Secondary Indexes to the tables created by HarbourBridge. Use the existing PostgreSQL indexes as a guide. In addition, Spanner's Interleaved Tables can provide a significant performance boost.

Other dropped features include functions, sequences, procedures, triggers, and views. In addition, types have been mapped based on the types supported by Spanner. Types such as integers, floats, char/text, bools, timestamps and (some) array types map fairly directly to Spanner, but many other types do not and instead are mapped to Spanner's STRING(MAX). See Schema Conversion for details of the type conversions and their tradeoffs.
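The flavor of this mapping can be sketched as follows. This is an illustrative table, not HarbourBridge's actual conversion code; see the Schema Conversion documentation for the authoritative mappings.

```python
# Hypothetical examples of PostgreSQL -> Spanner type mappings in the
# spirit described above; unrecognized types fall back to STRING(MAX).
PG_TO_SPANNER = {
    "bigint": "INT64",
    "integer": "INT64",
    "double precision": "FLOAT64",
    "text": "STRING(MAX)",
    "boolean": "BOOL",
    "timestamptz": "TIMESTAMP",
    "bytea": "BYTES(MAX)",
}

def to_spanner_type(pg_type):
    """Map a PostgreSQL type name to a Spanner type, defaulting to STRING(MAX)."""
    return PG_TO_SPANNER.get(pg_type, "STRING(MAX)")
```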

Recap

HarbourBridge automates much of the manual work of trying out Cloud Spanner using PostgreSQL data. The goal is to bootstrap your evaluation and help get you to the meaty issues as quickly as possible. The tables generated by HarbourBridge provide a starting point, but they will likely need to be tweaked and enhanced to support a full evaluation.

We encourage you to try out the tool, send feedback, file issues, fork and modify the codebase, and send PRs for fixes and new functionality. Our plans and aspirations for developing HarbourBridge further are outlined in the HarbourBridge Whitepaper. HarbourBridge is part of the Cloud Spanner Ecosystem, owned and maintained by the Cloud Spanner user community. It is not officially supported by Google as part of Cloud Spanner.

By Nevin Heintze, Cloud Spanner

Importing SA360 WebQuery reports to BigQuery

Context

Search Ads 360 (SA360) is an enterprise-class search campaign management platform used by marketers to manage global ad campaigns across multiple engines. It offers powerful reporting capabilities through WebQuery reports, the API, and BigQuery and Datastudio connectors.

Effective ad campaign management requires multi-dimensional analysis of campaign data alongside customers’ first-party data, building custom reports that combine dimensions from paid-search reports and business data.

Customers’ business data resides in a data warehouse designed for analysis, insights, and reporting. The usual approach to integrating ads data is to load the campaign data into the warehouse. To support this, SA360 offers several ways to retrieve paid-search data, each with unique capabilities.

Comparison of the four approaches (WebQuery / BQ Connector / Datastudio Connector / API):
  • Technical complexity: Low / Medium / Medium / High
  • Ease of report customization: High / Medium / Low / High
  • Reporting details: Complete / Limited / Limited / Limited (reports not supported by the API, e.g. location targets, remarketing targets, and audience reports, are not available)
  • Possible data warehouse: Any (the report is generic and is loaded using the DW's custom loading methods) / BigQuery only / None / Any
Comparing these approaches in terms of the technical knowledge required and the data-warehousing solutions supported, the easiest is the WebQuery report: a marketer can build a report by choosing the dimensions and metrics they want in the SA360 user interface.

The BigQuery data-transfer service is limited to importing data into BigQuery, and the Datastudio connector does not allow retrieving data.

WebQuery offers a simpler, more customizable method than the alternatives, and supports more kinds of data (unlike the BQ transfer service, which does not bring Business Data from SA360 to BigQuery). It was originally designed to give Microsoft Excel an updatable view of a report. In the era of cloud computing, a tool was needed to consume such reports and make them available on an analytical platform or a cloud data warehouse like BigQuery.

Solution Approach



This tool showcases how to bridge that gap in a generic fashion: the report is fetched from SA360 in XML format and converted into a CSV file using SAX parsers. The CSV file is then transferred to staging storage and finally ETLed into the data warehouse.
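The XML-to-CSV step can be sketched with Python's streaming SAX parser; the row/cell element names below are hypothetical, since the actual SA360 report layout differs:

```python
import csv
import io
import xml.sax

class ReportToCsvHandler(xml.sax.ContentHandler):
    """Stream XML report rows into CSV without holding the whole
    document in memory (important for arbitrarily large reports)."""

    def __init__(self, out):
        super().__init__()
        self.writer = csv.writer(out)
        self.row, self.text, self.in_cell = [], [], False

    def startElement(self, name, attrs):
        if name == "cell":
            self.in_cell, self.text = True, []

    def characters(self, content):
        if self.in_cell:
            self.text.append(content)

    def endElement(self, name):
        if name == "cell":
            self.row.append("".join(self.text))
            self.in_cell = False
        elif name == "row":
            self.writer.writerow(self.row)
            self.row = []

def xml_report_to_csv(xml_string):
    """Convert a small XML report string to CSV text."""
    out = io.StringIO()
    xml.sax.parseString(xml_string.encode("utf-8"), ReportToCsvHandler(out))
    return out.getvalue()
```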

As a concrete example, we chose to showcase a solution with BigQuery as the destination (cloud) data warehouse, though the solution architecture is flexible for any other system.

Conclusion

The tool helps marketers bring advertising data closer to their analytical systems, helping them derive better insights. If you use BigQuery as your data warehouse, you can use this tool as-is. You can also adapt it by adding components for the analytical/data-warehousing systems you use, and improve it for the larger community.

To get started, follow our step-by-step guide.
Notable features of the tool include:
  • Modular Authorization module
  • Handle arbitrarily large web-query reports
  • Batch mode to process multiple reports in a single call
  • Can be used as part of ETL workflow (Airflow compatible)
By Anant Damle, Solutions Architect and Meera Youn, Technical Partnership Lead

Announcing our Google Code-in 2019 Winners!

Google Code-in (GCI) 2019 was epic in every regard. Not only did we celebrate 10 years of the Google Code-in program, but we also broke all of our previous records for the program. It was a very, very busy seven weeks for everyone—we had 3,566 students from 76 countries complete 20,840 tasks with a record 29 open source organizations!

We want to congratulate all of the students who took part in this year’s 10th anniversary of Google Code-in. Great job!

Today we are excited to announce the Grand Prize Winners, Runners Up, and Finalists with each organization.

The 58 Grand Prize Winners completed an impressive 2,158 tasks while also helping other students.

Each of the Grand Prize Winners will be awarded a four-day trip to Google’s campus in northern California to meet with Google engineers, one of the mentors they worked with during the contest, and enjoy some fun in California with the other winners. We look forward to seeing these winners in a few months!

Grand Prize Winners

The Grand Prize Winners hail from 21 countries, listed by full name alphabetically below:
  • Aayushman Choudhary – JBoss Community (India)
  • Abdur-Raheem Idowu – Haiku (Norway)
  • Abhinav Kaushlya – The Julia Programming Language (India)
  • Aditya Vardhan Singh – The ns-3 Network Simulator project (India)
  • Anany Sachan – OpenWISP (India)
  • Andrea Gonzales – Sugar Labs (Malaysia)
  • Anmol Jhamb – Fedora Project (India)
  • Aria Vikram – Open Roberta (India)
  • Artur Grochal – Drupal (Poland)
  • Bartłomiej Pacia – Systers, An AnitaB.org Community (Poland)
  • Ben Houghton – Wikimedia (United Kingdom)
  • Benjamin Amos – The Terasology Foundation (United Kingdom)
  • Chamindu Amarasinghe – SCoRe Lab (Sri Lanka)
  • Danny Lin – CCExtractor Development (United States)
  • Diogo Fernandes – Apertium (Luxembourg)
  • Divyansh Agarwal – AOSSIE (India)
  • Duc Minh Nguyen – Metabrainz Foundation (Vietnam)
  • Dylan Iskandar – Liquid Galaxy (United States)
  • Emilie Ma – Liquid Galaxy (Canada)
  • Himanshu Sekhar Nayak – BRL-CAD (India)
  • Jayaike Ndu – CloudCV (Nigeria)
  • Jeffrey Liu – BRL-CAD (United States)
  • Joseph Semrai – SCoRe Lab (United States)
  • Josh Heng – Circuitverse.org (United Kingdom)
  • Kartik Agarwala – The ns-3 Network Simulator project (India)
  • Kartik Singhal – AOSSIE (India)
  • Kaustubh Maske Patil – CloudCV (India)
  • Kim Fung – The Julia Programming Language (United Kingdom)
  • Kumudtiha Karunarathna – FOSSASIA (Sri Lanka)
  • M.Anantha Vijay – Circuitverse.org (India)
  • Maathavan Nithiyananthan – Apertium (Sri Lanka)
  • Manuel Alcaraz Zambrano – Wikimedia (Spain)
  • Naman Modani – Copyleft Games (India)
  • Navya Garg – OSGeo (India)
  • Neel Gopaul – Drupal (Mauritius)
  • Nils André – CCExtractor Development (United Kingdom)
  • Paraxor – Fedora Project (United Arab Emirates)
  • Paweł Sadowski – OpenWISP (Poland)
  • Pola Łabędzka – Systers, An AnitaB.org Community (Poland)
  • Pranav Karthik – FOSSASIA (Canada)
  • Pranay Joshi – OSGeo (India)
  • Prathamesh Mutkure – OpenMRS (India)
  • Pratish Rai – R Project for Statistical Computing (India)
  • Pun Waiwitlikhit – The Mifos Initiative (Thailand)
  • Rachit Gupta – The Mifos Initiative (India)
  • Rafał Bernacki – Haiku (Poland)
  • Ray Ma – OpenMRS (New Zealand)
  • Rick Wierenga – TensorFlow (Netherlands)
  • Sayam Sawai – JBoss Community (India)
  • Sidaarth “Sid” Sabhnani – Copyleft Games (United States)
  • Srevin Saju – Sugar Labs (Bahrain)
  • Susan He – Open Roberta (Australia)
  • Swapneel Singh – The Terasology Foundation (India)
  • Sylvia Li – Metabrainz Foundation (New Zealand)
  • Umang Majumder – R Project for Statistical Computing (India)
  • Uzay Girit – Public Lab (France)
  • Vladimir Mikulic – Public Lab (Bosnia and Herzegovina)
  • William Zhang – TensorFlow (United States)

Runners Up

And a big kudos to our 58 Runners Up from 20 countries. They will each receive a GCI backpack, a jacket, and a GCI t-shirt. The Runners Up are listed alphabetically by first name below:
  • Adev Saputra – Drupal
  • Adrian Serapio – R Project for Statistical Computing
  • Alberto Navalón Lillo – Apertium
  • Alvii_07 – Liquid Galaxy
  • Amar Fadil – OpenWISP
  • Ananya Gangavarapu – TensorFlow
  • Andrey Shcherbakov – Wikimedia
  • Antara Bhattacharya – Metabrainz Foundation
  • Anthony Zhou – Public Lab
  • Bartosz Dokurno – Circuitverse.org
  • Ching Lam Choi – The Julia Programming Language
  • Chirag Bhansali – AOSSIE
  • Chiranjiv Singh Malhi – BRL-CAD
  • Daksha Aeer – Systers, An AnitaB.org Community
  • Devansh Khetan – OpenMRS
  • Dhanus SL – OSGeo
  • Dhhyey Desai – AOSSIE
  • Eric Xue – Copyleft Games
  • Eryk Mikołajek – BRL-CAD
  • Hannah Guo – The Terasology Foundation
  • Harsh Khandeparkar – Public Lab
  • Hirochika Matsumoto – CloudCV
  • Ilya Maier – Systers, An AnitaB.org Community
  • Irvan Ayush Chengadu – Drupal
  • Jakub Niklas – The Terasology Foundation
  • Jun Rong Lam – Circuitverse.org
  • Karol Ołtarzewski – OpenWISP
  • Kripa Kini – Liquid Galaxy
  • Krzysztof Krysiński – CCExtractor Development
  • Kunal Bhatia – SCoRe Lab
  • Laxya Pahuja – The Mifos Initiative
  • Łukasz Zbrzeski – SCoRe Lab
  • Madhav Mehndiratta – Fedora Project
  • Marcus Chong – Sugar Labs
  • Mateusz Samkiewicz – JBoss Community
  • Maya Farber Brodsky – CCExtractor Development
  • Michał Piechowiak – Fedora Project
  • Moodhunt – Metabrainz Foundation
  • Muhammad Wasif – FOSSASIA
  • name not shown – Haiku
  • Nathan Taylor – Sugar Labs
  • Nishanth Thumma – Open Roberta
  • Panagiotis Vasilopoulos – Haiku
  • Rachin Kalakheti – TensorFlow
  • Regan Iwadha – JBoss Community
  • Ribhav Sharma – OpenMRS
  • Richard Botez – Open Roberta
  • Rishabh Verma – The Mifos Initiative
  • Rishank Kanaparti – Copyleft Games
  • Rishi R – R Project for Statistical Computing
  • Sai Putravu – The ns-3 Network Simulator project
  • Samuel Sloniker – Apertium
  • Shivam Rai – OSGeo
  • Siddharth Sinha – FOSSASIA
  • Soumitra Shewale – The Julia Programming Language
  • Stanisław Howard – The ns-3 Network Simulator project
  • Suryansh Pathak – CloudCV
  • Taavi Väänänen – Wikimedia

Finalists

And a hearty congratulations to our 58 Finalists from 20 countries. The Finalists will win a special GCI jacket and a GCI t-shirt. They are listed alphabetically by first name below:
  • Abinav Chari – CloudCV
  • Andre Christoga Pramaditya – CloudCV
  • Anish Agnihotri – OSGeo
  • Aryan Gulati – FOSSASIA
  • Ayush Sharma – Fedora Project
  • Ayush Sharma – SCoRe Lab
  • Daniel Oluojomu – JBoss Community
  • Dhruv Baronia – TensorFlow
  • Diana Hernandez – Systers, An AnitaB.org Community
  • Gambali Seshasai Chaitanya – Apertium
  • Hao Liu – R Project for Statistical Computing
  • Hardik Jhalani – Systers, An AnitaB.org Community
  • Hrishikesh Patil – OpenMRS
  • Jackson Lewis – The ns-3 Network Simulator project
  • Jan Rosa – Wikimedia
  • Janiru Hettiarachchi – Liquid Galaxy
  • Janiru Wijekoon – Metabrainz Foundation
  • Joshua Yang – Apertium
  • Kevin Liu – Open Roberta
  • Krishna Rama Rao – AOSSIE
  • Li Chen – Fedora Project
  • Madhav Shekhar Sharma – The Julia Programming Language
  • Mbah Javis – TensorFlow
  • Merul Dhiman – Liquid Galaxy
  • Michelle (Wai Man) Lo – OpenMRS
  • Mihir Bhave – OpenWISP
  • Mohit S A – Circuitverse.org
  • Mokshit Jain – Drupal
  • Mudit Somani – The Julia Programming Language
  • Musab Kılıç – CCExtractor Development
  • Nail Anıl Örcün – The Terasology Foundation
  • Natalie Shapiro – Circuitverse.org
  • Nate Clark – The Terasology Foundation
  • Nicholas Gregory – Wikimedia
  • Nikita Ermishin – OpenWISP
  • Nishith P – FOSSASIA
  • Oliver Fogelin – R Project for Statistical Computing
  • Oussama Hassini – The Mifos Initiative
  • Param Nayar – Copyleft Games
  • Peter Terpstra – The ns-3 Network Simulator project
  • Piyush Sharma – The Mifos Initiative
  • Robert Chen – Public Lab
  • Rohan Cherivirala – Open Roberta
  • Ruixuan Tu – Haiku
  • Saptashwa Mandal – Drupal
  • Sashreek Magan – Sugar Labs
  • Sauhard Jain – AOSSIE
  • Sharman Maheshwari – SCoRe Lab
  • Sumagna Das – BRL-CAD
  • Tanvir Singh – OSGeo
  • Techno-Disaster – CCExtractor Development
  • Thusal Ranawaka – BRL-CAD
  • Vivek Mishra – Copyleft Games
  • Yu Fai Wong – JBoss Community
  • Yuqi Qiu – Metabrainz Foundation
  • Zakhar Vozmilov – Public Lab
  • Zakiyah Hasanah – Sugar Labs
  • Zoltán Szatmáry – Haiku

Our 794 mentors, the heart and soul of GCI, are the reason the contest thrives. Mentors volunteer their time to help these bright students become open source contributors. They spend hundreds of hours during their holiday breaks answering questions, reviewing submitted tasks, and welcoming the students to their communities. GCI would not be possible without their dedication, patience and tireless efforts.

We will post more numbers from GCI 2019 here on the Google Open Source Blog over the next few weeks, so please stay tuned.

Congratulations to our Grand Prize Winners, Runners Up, Finalists, and all of the students who spent the last couple of months learning about, and contributing to, open source. We hope they will continue their journey in open source!

By Stephanie Taylor, Google Open Source

BazelCon 2019

Cross-posted from the original BazelCon 2019 recap.

Last month the Google Bazel team hosted its largest ever Bazel user conference: BazelCon 2019, an annual gathering of the community surrounding the Bazel build system. This is the main Bazel event of the year which serves as an opportunity for Bazel contributors, maintainers, and users to meet and learn from each other, present Bazel migration stories, educate new users, and collaborate together on the future of Bazel.

BazelCon 2019 by the Numbers

  • 400+ attendees (2x increase over BazelCon 2018)
  • 125 organizations represented including Microsoft, Spotify, Uber, Apple, Cruise, EA, Lyft, Tesla, SpaceX, SAP, Bloomberg, Wix, Etsy, BMW and others
  • 26 full-length talks and 15 lightning talks by members of the external community and Googlers
  • 16 hours of Q&A during Office Hours with Bazel team members
  • 45 Bazel Bootcamp attendees
  • 5 Birds of a Feather sessions on iOS, Python, Java, C++ and Front-end Bazel rules
  • 182 users in the #bazelcon2019 Slack channel

BazelCon 2019 Full Length Talks

The full playlist also includes lightning talks.
  • Keynote: The Role of Catastrophic Failure in Software Design – Jeff Atwood (Stack Overflow/Discourse)
  • Bazel State of the Union – John Field and Dmitry Lomov (Google)
  • Building Self Driving Cars with Bazel – Axel Uhlig and Patrick Ziegler (BMW Group)
  • Moving to a Bazel-based CI system: 6 Learnings – Or Shachar (Wix)
  • Bazel Federation – Florian Weikert (Google)
  • Lessons from our First 100,000 Bazel Builds – Kevin Gessner (Etsy)
  • Migrating Lyft-iOS to Bazel – Keith Smiley and Dave Lee (Lyft)
  • Test Selection – Benjamin Peterson (Dropbox)
  • Porting iOS Apps to Bazel – Oscar Bonilla (LinkedIn)
  • Boosting Dev Box Performance with Remote Execution for Non-Hermetic Build Engines – Erik Mavrinac (Microsoft)
  • Building on Key - Keeping your Actions and Remote Executions in Tune – George Gensure (UberATG)
  • Bazel remote execution API vs Goma – Mostyn Bramley-Moore (Vewd Software)
  • Integrating with ease: leveraging BuildStream interaction with Bazel build for consistent results – Daniel Silverstone (Codethink)
  • Building Self-Driving Cars with Bazel – Michael Broll and Nico Valigi (Cruise)
  • Make local development (with Bazel) great again! – Ittai Zeidman (Wix)
  • Gradle to Bazel – Chip Dickson and Charles Walker (SUM Global Technology)
  • Bazel Bootcamp – Kyle Cordes (Oasis Digital)
  • Bazel migration patterns: how to prove business value with a small investment – Alex Eagle and Greg Magolan (Google)
  • Dynamic scheduling: Fastest clean and incremental builds – Julio Merino (Google)
  • Building a great CI with Bazel – Philipp Wollermann (Google)
By Misha Narinsky, Bazel Team

Google Summer of Code 2020 is now open for mentor organization applications!

We are looking for open source projects and organizations to participate in the 16th annual Google Summer of Code (GSoC)! GSoC is a global program that draws university student developers from around the world to contribute to open source projects. Each student will spend three months working on a coding project with the support of volunteer mentors from participating open source organizations, mid-May to mid-August.

Last year, 1,276 students worked with 206 open source organizations and over 2,000 mentors. Organizations include small and medium sized open source projects, as well as a number of umbrella organizations with many sub-projects under them (Apache Software Foundation, Python Software Foundation, etc.).

Our 2020 goal is to accept more organizations into their first GSoC than ever before! We ask that veteran organizations refer other organizations they think would be a good fit to participate in GSoC.

You can apply to be a mentoring organization for GSoC starting today. The deadline to apply is February 5 at 19:00 UTC. Organizations chosen for GSoC 2020 will be publicly announced on February 20.

Please visit the program site for more information on how to apply and review the detailed timeline of important deadlines. We also encourage you to check out the Mentor Guide and our short video on why open source projects apply to be a part of the program.

Best of luck to all of the open source mentoring organization applicants!

By Stephanie Taylor, Google Open Source

Securing open source: How Google supports the new Kubernetes bug bounty

At Google, we care deeply about the security of open-source projects, as they’re such a critical part of our infrastructure—and indeed everyone’s. Today, the Cloud-Native Computing Foundation (CNCF) announced a new bug bounty program for Kubernetes that we helped create and get up and running. Here’s a brief overview of the program, other ways we help secure open-source projects and information on how you can get involved.

Launching the Kubernetes bug bounty program

Kubernetes is a CNCF project. As part of its graduation criteria, the CNCF recently funded the project’s first security audit, to review its core areas and identify potential issues. The audit identified and addressed several previously unknown security issues. Thankfully, Kubernetes already had a Product Security Committee, including engineers from the Google Kubernetes Engine (GKE) security team, who respond to and patch any newly discovered bugs. But the job of securing an open-source project is never done. To increase awareness of Kubernetes’ security model, attract new security researchers, and reward ongoing efforts in the community, the Kubernetes Product Security Committee began discussions in 2018 about launching an official bug bounty program.

Find Kubernetes bugs, get paid

What kind of bugs does the bounty program recognize? Most of the content you'd think of as 'core' Kubernetes, hosted at https://github.com/kubernetes, is in scope. We're interested in common kinds of security issues like remote code execution, privilege escalation, and bugs in authentication or authorization. Because Kubernetes is a community project, we're also interested in the Kubernetes supply chain, including build and release processes that might allow a malicious individual to gain unauthorized access to commits, or otherwise affect build artifacts. This is a bit different from your standard bug bounty as there isn't a 'live' environment for you to test against—Kubernetes can be configured in many different ways, and we're looking for bugs that affect any of those (except when existing configuration options could mitigate the bug). Thanks to the CNCF's ongoing support and funding of this new program, depending on the bug, you can be rewarded with a bounty anywhere from $100 to $10,000.

The bug bounty program has been in a private release for several months, with invited researchers submitting bugs and helping us test the triage process. And today, the new Kubernetes bug bounty program is live! We're excited to see what kind of bugs you discover, and we're ready to respond to new reports. You can learn more about the program and how to get involved here.

Dedicated to Kubernetes security

Google has been involved in this new Kubernetes bug bounty from the get-go: proposing the program, completing vendor evaluations, defining the initial scope, testing the process, and onboarding HackerOne to implement the bug bounty solution. Though this is a big effort, it’s part of our ongoing commitment to securing Kubernetes. Google continues to be involved in every part of Kubernetes security, including responding to vulnerabilities as part of the Kubernetes Product Security Committee, chairing the sig-auth Kubernetes special interest group, and leading the aforementioned Kubernetes security audit. We realize that security is a critical part of any user’s decision to use an open-source tool, so we dedicate resources to help ensure we’re providing the best possible security for Kubernetes and GKE.

Although the Kubernetes bug bounty program is new, it isn’t a novel strategy for Google. We have enjoyed a close relationship with the security research community for many years and, in 2010, Google established our own Vulnerability Rewards Program (VRP). The VRP provides rewards for vulnerabilities reported in GKE and virtually all other Google Cloud services. (If you find a bug in GKE that isn’t specific to Kubernetes core, you should still report it to the Google VRP!) Nor is Kubernetes the only open-source project with a bug bounty program. In fact, we recently expanded our Patch Rewards program to provide financial rewards both upfront and after-the-fact for security improvements to open-source projects.

Help keep the world’s infrastructure safe. Report a bug to the Kubernetes bug bounty, or a GKE bug to the Google VRP.

By Maya Kaczorowski, Product Manager, Container Security; and Aaron Small, Product Manager, GKE On-Prem security

Wombat Dressing Room, an npm publication proxy on GCP

We're excited to announce that we're open sourcing Wombat Dressing Room, the service we use on the Google Cloud Client Libraries team for handling npm publications. Wombat Dressing Room provides features that help npm work better with automation, while maintaining good security practices.

A tradeoff is often made for automation

npm has top-notch security features: CIDR-range restricted tokens, publication notifications, and two-factor authentication, to name a few. Of these, a feature critical to protecting publications is two-factor authentication (2FA).

2FA requires that you provide two pieces of information when accessing a protected resource: "something you know" (for instance, a password) and "something you have" (for instance, a code from an authenticator app). With 2FA, if your password is exposed, an attacker still can't publish a malicious package (unless they also steal the "something you have").

On my team, a small number of developers manage over 75 Node.js libraries. We see automation as key to making this possible: we've written tools that automate releases, validate license headers, and ensure contributors have signed CLAs. We adhere to the philosophy: automate all the things!

It's difficult to automate the step of entering a code off a cellphone. As a result, folks often opt to turn off 2FA in their automation.

What if you could have both automation and the added security of 2FA? This is why we built the Wombat Dressing Room.

A different approach to authentication

With Wombat Dressing Room, rather than an individual configuring two-factor authentication in an authenticator app, 2FA is managed by a shared proxy server. Publications are then directed at the Wombat Dressing Room proxy, which provides the following security features:

Per-package publication tokens

Wombat Dressing Room can generate authentication tokens tied to repositories on GitHub. These tokens are tied to a single GitHub repository, which the user generating the token must have push permissions for.

If a per-package publication token is leaked, an attacker can only hijack the single package that the token is associated with.

Limited lifetime tokens

Wombat Dressing Room can also generate access tokens with a 24-hour lifespan. In this model, a leaked token is only usable until its 24-hour lifespan expires.

GitHub Releases as 2FA

In this authentication model, a package can only be published to npm if a GitHub release with a corresponding tag is found on GitHub.

This introduces a true "second factor", as users must prove they have access to both Wombat Dressing Room and the repository on GitHub.
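The check itself can be as simple as asking GitHub's public releases API whether a release with the expected tag exists before letting the publish proceed. A sketch of that gate—the endpoint is GitHub's documented REST API, but the function names and the injected fetch implementation are illustrative assumptions, not Wombat Dressing Room's actual code:

```javascript
// Returns true only if the GitHub repo has a release for this tag.
// fetchImpl is injected so the logic can be tested without the network.
async function hasRelease(fetchImpl, owner, repo, tag) {
  const url = `https://api.github.com/repos/${owner}/${repo}/releases/tags/${tag}`;
  const res = await fetchImpl(url, {
    headers: { accept: 'application/vnd.github+json' },
  });
  return res.status === 200; // 404 means no release with that tag
}

// Gate a publish of version X.Y.Z on the existence of a vX.Y.Z release.
async function guardPublish(fetchImpl, pkg) {
  if (!(await hasRelease(fetchImpl, pkg.owner, pkg.repo, `v${pkg.version}`))) {
    throw new Error(`no GitHub release tagged v${pkg.version}; refusing to publish`);
  }
}
```

An attacker holding only a stolen npm credential cannot satisfy this check, because creating the release requires separate access to the GitHub repository.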

Getting started with Wombat Dressing Room

We've been using Wombat Dressing Room to manage Google Cloud client libraries for over a year now in our fully automated library release process. As of today, the source is available for everyone on GitHub under an Apache 2.0 license.

Wombat Dressing Room runs on Google App Engine, and instructions on getting it up and running can be found in its README.md.
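Once an instance is deployed, pointing a project's publishes at it is just registry configuration. For example, a `.npmrc` along these lines—the hostname below is a hypothetical deployment, and the token is one minted by the proxy:

```ini
; .npmrc — route npm publishes through the proxy instead of registry.npmjs.org
; (hypothetical deployment URL; substitute your own instance)
registry=https://wombat-dressing-room.example.com
//wombat-dressing-room.example.com/:_authToken=<token-minted-by-the-proxy>
```

With this in place, a plain `npm publish` in CI goes through the proxy and picks up whichever of the security models above the token was minted under.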

It's my hope that this will help other folks in the community simplify and automate their release processes while minimizing the attack surface of their libraries.

By Benjamin Coe, who works on Node.js client libraries for the Google Cloud Platform, and was the third engineer at npm, Inc.

Season of Docs Announces Results of 2019 Program

Season of Docs has announced the 2019 program results for standard-length projects. You can view a list of successfully completed technical writing projects on the website along with their final project reports.

During the program, technical writers spent a few months working closely with an open source community. They brought their technical writing expertise to improve the project's documentation while the open source projects provided mentors to introduce the technical writers to open source tools, workflows, and the project's technology.

The technical writers and their mentors did a fantastic job with the inaugural year of Season of Docs! Participants represented countries across all continents except for Antarctica! 36 technical writers out of 41 successfully completed their standard-length technical writing projects, and there are eight long-running projects in progress that are expected to finish in February.

  • 91.7% of the mentors had a positive experience and want to mentor again in future Season of Docs cycles
  • 88% of the technical writers had a positive experience
  • 96% plan to continue contributing to open source projects
  • 100% of the technical writers said that Season of Docs helped improve their knowledge of code and/or open source

Technical writing projects ranged from beginners' guides and tutorials to API and reference documentation, all of which benefited a diverse set of open source projects that included programming languages, software, compiler infrastructure, operating systems, software libraries, hardware, science, healthcare, and more. Take a look at the list of successful projects to see the wide range of subjects covered!

What is next?

The long-running projects are still in progress and will finish in February 2020. Technical writers participating in these long-running projects submit their project reports by Feb. 25, and the writer and mentor evaluations are due by Feb. 28. Successfully completed long-running technical writing projects will then be published on the results page on March 6, 2020.

If you were excited about participating, please share your experience in social media posts. See the promotion and press page for images and other promotional materials you can include, and be sure to use the tag #SeasonOfDocs when promoting your posts on social media. To reach the tech writing and open source communities, add #WriteTheDocs, #techcomm, #TechnicalWriting, and #OpenSource to your posts.

Stay tuned for information about Season of Docs 2020—watch for posts in this blog and sign up for the announcements email list.

By Andrew Chen, Google Open Source and Sarah Maddox, Cloud Docs