Author Archives: Open Source Programs Office

Introducing Data Transfer Project: an open source platform promoting universal data portability

In 2007, a small group of engineers in our Chicago office formed the Data Liberation Front, a team that believed consumers should have better tools to put their data where they want, when they want, and even move it to a different service. This idea, called “data portability,” gives people greater control of their information, and pushes us to develop great products because we know people can pack up and leave at any time.

In 2011, we launched Takeout, a new way for Google users to download or transfer a copy of the data they store or create in a variety of industry-standard formats. Since then, we've continued to invest in Takeout—we now call it Download Your Data—and today, our users can download a machine-readable copy of the data they have stored in 50+ Google products, with more on the way.

Now, we’re taking our commitment to portability a step further. In tandem with Microsoft, Twitter, and Facebook, we’re announcing the Data Transfer Project, an open source initiative dedicated to developing tools that will enable consumers to transfer their data directly from one service to another, without needing to download and re-upload it. Download Your Data users can already do this; they can transfer their information directly to their Dropbox, Box, MS OneDrive, and Google Drive accounts today. With this project, the development of which we mentioned in our blog post about preparations for the GDPR, we’re looking forward to working with companies across the industry to bring this type of functionality to individuals across the web.

Our approach

The organizations involved with this project are developing tools that can convert any service's proprietary APIs to and from a small set of standardized data formats that can be used by anyone. This makes it possible to transfer data between any two providers using existing industry-standard infrastructure and authorization mechanisms, such as OAuth. So far, we have developed adapters for seven different service providers across five different types of consumer data; we think this demonstrates the viability of this approach to scale to a large number of use cases.
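
The adapter idea can be sketched in a few lines. This is a minimal illustration, not code from the Data Transfer Project: the `Photo` model, the two services, and every field name below are hypothetical, chosen only to show how an exporter and an importer can meet at a shared format.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Photo:
    """Hypothetical common data model shared by every adapter."""
    title: str
    url: str


class Exporter(Protocol):
    def export_photos(self) -> list[Photo]: ...


class Importer(Protocol):
    def import_photos(self, photos: list[Photo]) -> int: ...


class ServiceAExporter:
    """Converts made-up Service A records into the common model."""

    def __init__(self, records: list[dict]) -> None:
        self._records = records

    def export_photos(self) -> list[Photo]:
        return [Photo(title=r["name"], url=r["link"]) for r in self._records]


class ServiceBImporter:
    """Converts the common model into made-up Service B records."""

    def __init__(self) -> None:
        self.stored: list[dict] = []

    def import_photos(self, photos: list[Photo]) -> int:
        for photo in photos:
            self.stored.append({"caption": photo.title, "source": photo.url})
        return len(photos)


def transfer(source: Exporter, destination: Importer) -> int:
    """Move photos between any exporter/importer pair via the common model."""
    return destination.import_photos(source.export_photos())
```

With this shape, supporting a new provider means writing one exporter and one importer against the common model, rather than one converter per pair of services.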

Consumers will benefit from improved flexibility and control over their data. They will be able to import their information into any participating service that offers compelling features—even brand new ones that could rely on powerful, cloud-based infrastructure rather than the consumers’ potentially limited bandwidth and capability to transfer files. Services will benefit as well, as they will be able to compete for users that can move their data more easily.

Protecting users’ data and keeping them in control

Data security and privacy are foundational to the design of the Data Transfer Project. Services must first agree to allow data transfer between them, and then they will require that individuals authenticate each account independently. All credentials and user data will be encrypted both in transit and at rest. The protocol uses a form of perfect forward secrecy where a new unique key is generated for each transfer. Additionally, the framework allows partners to support any authorization mechanism they choose. This enables partners to leverage their existing security infrastructure when authorizing accounts.
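
To make the "new unique key for each transfer" idea concrete, here is a small sketch. It only shows key generation per job; it does not perform encryption and is not the project's actual protocol. `TransferJob` and `new_transfer_key` are hypothetical names.

```python
import secrets
from dataclasses import dataclass, field


def new_transfer_key() -> bytes:
    """Return a fresh 256-bit random key from the OS CSPRNG."""
    return secrets.token_bytes(32)


@dataclass
class TransferJob:
    """Hypothetical record for one transfer; each job gets its own key."""
    source: str
    destination: str
    # repr=False keeps the key out of logs and debug output.
    key: bytes = field(default_factory=new_transfer_key, repr=False)


job_a = TransferJob("service-a", "service-b")
job_b = TransferJob("service-a", "service-b")
```

Because every job draws its own key, learning the key for one transfer reveals nothing about data moved in any other transfer.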

As it is an open source product, anyone can inspect the code to verify that data isn't being collected or used for profiling purposes. Tech-savvy consumers are also free to download and run an instance of the framework themselves. Interested parties can learn more at the Data Transfer Project website, which explains the technical foundations behind the project and goes into greater detail on how it works.

How to get involved

It is very early days for the Data Transfer Project and we encourage the developer community to join us and help extend the platform to support many more data types, service providers, and hosting solutions.

The Data Transfer Project’s open source code can be found at datatransferproject.dev and you can learn more about Google’s approach to portability in our paper, where we describe our history with this topic and the values and principles that motivated us to invest in the Data Transfer Project. Our prototype already supports data transfer for several product verticals, including photos, mail, contacts, calendar, and tasks. These are enabled by existing, publicly available APIs from Google, Microsoft, Twitter, Flickr, Instagram, Remember the Milk, and SmugMug.

Data portability makes it easy for consumers to try new services and use the ones that they like best. We’re thrilled to help drive an initiative that incentivizes companies large and small to continue innovating across the internet. We’re just getting started and we’re looking forward to what comes next.

By Brian Willard, Software Engineer and Greg Fair, Product Manager

Googlers on the road: CLS and OSCON 2018

Next week a veritable who’s who of free and open source software luminaries, maintainers and developers will gather to celebrate the 20th annual OSCON and the 20th anniversary of the Open Source Definition. Naturally, the Google Open Source and Google Cloud teams will be there too!

Program chairs at OSCON 2017, left to right: Rachel Roumeliotis, Kelsey Hightower, Scott Hanselman. Photo used with permission from O'Reilly Media.

This year OSCON returns to Portland, Oregon and runs from July 16-19. As usual, it is preceded by the free-to-attend Community Leadership Summit on July 14-15.

If you’re curious about our outreach programs, our approach to open source, or any of the open source projects we’ve released, please find us! We’re eager to chat. You’ll find us and many other Googlers throughout the week on stage, in the expo hall, and at several special events that we’re running.

Here’s a rundown of the sessions we’re hosting this year:

Sunday, July 15th (Community Leadership Summit)

11:45am   Asking for time and/or money by Cat Allman

Monday, July 16th (Tutorials)

9:00am    Getting started with TensorFlow by Josh Gordon
1:30pm    Introduction to natural language processing with Python by Barbara Fusinska

Tuesday, July 17th (Tutorials)

9:00am    Istio Day opening remarks by Kelsey Hightower
9:00am    TensorFlow Day opening remarks by Edd Wilder-James
9:05am    Sailing to 1.0: Istio community update by April Nassi
9:05am    The state of TensorFlow by Sandeep Gupta
9:30am    Introduction to fairness in machine learning by Hallie Benjamin
9:55am    Farm to table: A TensorFlow story by Gunhan Gulsoy
11:00am  Hassle-free, scalable machine learning with Kubeflow by Barbara Fusinska
11:05am  Istio: Zero-trust communication security for production services by Samrat Ray, Tao Li, and Mak Ahmad
12:00pm  Project Magenta: Machine learning for music and art by Sherol Chen
1:35pm    Istio à la carte by Daniel Ciruli

Wednesday, July 18th (Sessions)

9:00am    Wednesday opening welcome by Kelsey Hightower
11:50am  Machine learning for continuous integration by Joseph Gregorio
1:45pm    Live-coding a beautiful, performant mobile app from scratch by Emily Fortuna and Matt Sullivan
2:35pm    Powering TensorFlow with big data using Apache Beam, Flink, and Spark by Holden Karau
5:25pm    Teaching the Next Generation to FLOSS by Josh Simmons

Thursday, July 19th (Sessions)

9:00am    Thursday opening welcome by Kelsey Hightower
9:40am    20 years later, open source is as important as ever by Sarah Novotny
11:50am  Google’s approach to distributed systems observability by Jaana B. Dogan
2:35pm    gRPC versus REST: Let the battle begin by Alex Borysov
5:05pm    Shenzhen Go: A visual Go environment for everybody, even professionals by Josh Deprez

We look forward to seeing you and the rest of the community there!

By Josh Simmons, Google Open Source

Contributing to the AMP Project

This is a guest post by Adam Silverstein who was recently recognized through the Google Open Source Peer Bonus Program for contributions to the AMP Project. We invited Adam to share about his work on our blog.

I started my web career building websites for small businesses on WordPress, so when I decided to begin contributing to open source, WordPress was a natural place to start.

Now I work at the digital agency 10up, where I am a part of our open source team. We build popular sites like FiveThirtyEight where having the best possible AMP experience is critical. However, bringing FiveThirtyEight’s AMP version up to parity with the site’s responsive mobile experience was challenging, in part because of advanced features that aren’t directly supported in AMP.

One of those unsupported features was MathML, a standard for displaying mathematical formulas on the web. To avoid a clumsy workaround (amp-iframe) and improve our presentation of formulas, I proposed a native `amp-mathml` component which could display formulas inline. Contributing improvements “upstream” to open source projects – especially as we encounter friction in real-world projects – is a core value at 10up and important to the health of the web. I expected that I could leverage the same open source MathJax library we used on the responsive website for an AMP implementation. Contributing this component would strengthen my understanding of AMP’s internals while simultaneously improving a client site and enabling the open MathML standard on any AMP page. Win, win, win!

I started by opening an issue on Google’s amphtml repository, describing MathML and proposing a native `amp-mathml` component. Justin Ridgewell from the AMP team immediately responded to the issue and asked Ali Ghassemi to track it. I offered to help write the code and received an enthusiastic response, encouraging me and assuring me that the team would be available on GitHub and in Slack to answer any questions.

This warm welcome gave me the confidence to dive in, but the ramp-up was daunting. The build tools and coding standards were quite different from other projects I work on, and setup required some editor reconfiguring and reflex retraining. Getting the unit tests to run on my system required tracking down and installing some missing dependencies.

Fortunately, AMP’s project documentation is thorough, and Ali guided me through the implementation, pointing me to existing, similar samples in the project. I already knew how to use JavaScript to render formulas with MathJax – my challenge was building an AMP component that ran this code and displayed it inline.

After a few days of concerted effort, I built a proof of concept and opened a pull request. The real fun began as I refined the approach and wrote documentation with help from the team. The team’s active engagement helped the process move along rapidly. Amazingly, the pull request was merged one week later, and today amp-mathml is live in the wild. FiveThirtyEight is already using the new, native implementation.

From opening the issue all the way to the merge of my pull request, I was impressed by the support and encouragement I received. Ali and honeybadgerdontcare provided regular reviews and thorough suggestions on the pull request when I pushed iterations. Their engagement throughout the process made me and my work feel valued, and helped me stay motivated to continue working on the feature.

Adding MathML to AMP reminded me why I find so much joy and professional growth in contributing to open source projects. I have a better understanding of AMP from the inside out, and I was welcomed into the project’s community with wide open arms. I'm proud of my contribution, and ready to tackle new challenges after seeing its success!
 
By Adam Silverstein, AMP Project contributor

Google Summer of Code 2018 statistics part 2

Now that Google Summer of Code (GSoC) 2018 is underway and students are wrapping up their first month of coding, we wanted to bring you some more statistics on the 2018 program. Lots and lots of numbers follow:

Organizations

Students are working with 206 organizations (the most we’ve ever had!), 41 of which are participating in GSoC for the first time.

Student Registrations

25,873 students from 147 countries registered for the program, which is a 25.3% increase over the previous high for the program back in 2017. There are 9 new countries with students registering for the first time: Angola, Bahamas, Burundi, Cape Verde, Chad, Equatorial Guinea, Kosovo, Maldives, and Mali.

Project Proposals

5,199 students from 101 countries submitted a total of 7,209 project proposals. 70.5% of the students submitted 1 proposal, 18.1% submitted 2 proposals, and 11.4% submitted 3 proposals (the max allowed).

Gender Breakdown

11.63% of accepted students are women, a 0.25% increase from last year. We are always working toward making our programs and open source more inclusive, and we collaborate with organizations and communities that help us improve every year.

Universities

The 1,268 students accepted into the GSoC 2018 program hailed from 613 universities, of which 216 have students participating for the first time in GSoC.

Schools with the most accepted students for GSoC 2018:
University | Country | Students
Indian Institute of Technology, Roorkee | India | 35
International Institute of Information Technology - Hyderabad | India | 32
Birla Institute of Technology and Science, Pilani (BITS Pilani) | India | 23
Indian Institute of Technology, Kharagpur | India | 22
Birla Institute of Technology and Science Pilani, Goa campus / BITS-Pilani - K.K.Birla Goa Campus | India | 18
Indian Institute of Technology, Kanpur | India | 16
University of Moratuwa | Sri Lanka | 16
Indian Institute of Technology, Patna | India | 14
Amrita University | India | 13
Indian Institute of Technology, Mandi | India | 11
Indraprastha Institute of Information and Technology, New Delhi | India | 11
University of Buea | Cameroon | 11
BITS Pilani, Hyderabad Campus | India | 11

Another post with stats on our awesome GSoC mentors will be coming soon!

By Stephanie Taylor, Google Open Source

31st anniversary of the GIF: give your terminal some personality with Tenor GIF for CLI

Creating ASCII art for your terminal isn’t new. Displaying animated ASCII GIFs in the CLI (command line interface), however, hasn’t been possible, or at least not easy to do, until now.

Shortly after Tenor was acquired by Google, I had an idea.

Many developers configure static ASCII art to appear when opening a terminal, but I imagined that ASCII art could animate like a GIF, and could easily be created from any GIF on Tenor.

After some tinkering, GIF for CLI was born.

Just in time for the 31st anniversary of the GIF, GIF for CLI is available today on GitHub. GIF for CLI takes in a GIF, short video, or a query to the Tenor GIF API and converts it to animated ASCII art. This means each time you log on to your programming workstation, your GIF is there to greet you in ASCII form. Animation and color support are performed using ANSI escape sequences.

Rob Delaney as “Peter” from Deadpool 2, in ASCII GIF form. See the original GIF on Tenor here.

For the more technically-minded, here are the details:

When the command line program is run, it takes the chosen .gif file (file, url, or Tenor query) and uses ffmpeg to split the animated gif or short video into static jpg frames. Those jpg frames are then converted to ASCII art (these ASCII art frames are cached in $HOME/.cache/gif-for-cli). The program then prints one frame at a time to the console, clearing the console using ANSI escape sequences between each frame.
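
The last two steps, converting a frame to ASCII and replaying frames with ANSI escape sequences, can be sketched as follows. This is an illustration of the technique rather than gif-for-cli's own code: it assumes frames have already been decoded into grids of 0-255 grayscale values, and the character ramp and function names are made up for the example. Color and caching are left out.

```python
import sys
import time

# Characters ordered from sparse to dense; an arbitrary ramp chosen for illustration.
ASCII_RAMP = " .:-=+*#%@"
CLEAR_SCREEN = "\x1b[2J\x1b[H"  # ANSI: erase display, then move cursor to top-left


def frame_to_ascii(pixels: list[list[int]]) -> str:
    """Map a grid of 0-255 grayscale values to one ASCII art frame."""
    last = len(ASCII_RAMP) - 1
    return "\n".join(
        "".join(ASCII_RAMP[value * last // 255] for value in row)
        for row in pixels
    )


def play(frames: list[str], delay: float = 0.1, loops: int = 1, out=sys.stdout) -> None:
    """Print each frame in turn, clearing the terminal before every frame."""
    for _ in range(loops):
        for frame in frames:
            out.write(CLEAR_SCREEN + frame + "\n")
            out.flush()
            time.sleep(delay)
```

Calling `play([frame_to_ascii(grid) for grid in grids], loops=3)` would replay a list of decoded grids three times in the terminal.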

You could also run gif-for-cli from your .bashrc or .profile to get an animated ASCII art image as your MOTD, or call it from Git hooks.

GIF for CLI integrates with the Tenor GIF API to source the GIFs. The Tenor API powers GIF search within many of today’s most popular messaging apps and social platforms on iOS and Android. 

Share screen captures of your ASCII GIFs with us on Twitter with #GIFforCLI. 

By Sean Hayes, Tenor

Google Summer of Code 2018 statistics part 1

Since 2005, Google Summer of Code (GSoC) has been bringing new developers into the open source community every year. This year we accepted 1,264 students from 62 countries into the 2018 GSoC program to work with a record 206 open source organizations this summer.

Students are currently participating in the Community Bonding phase of the program, where they become familiar with the open source projects they will be working with. They also spend time learning the codebase and the community’s best practices so they can start their 12-week coding projects on May 14th.

Each year we like to share program statistics about the GSoC program and the accepted students and mentors involved in the program. Here are a few stats:
  • 88.2% of the accepted students are participating in their first GSoC
  • 74.4% of the students are first-time applicants

Degrees

  • 76.18% of accepted students are undergraduates, 17.5% are master's students, and 6.3% are getting their PhDs.
  • 73% are Computer Science majors, 4.2% are mathematics majors, 17% are other engineering majors (electrical, mechanical, aerospace, etc.)
  • We have students in a variety of majors including neuroscience, linguistics, typography, and music technologies.

Countries

This year there are four students that are the first to be accepted into GSoC from their home countries of Kosovo (three students) and Senegal. A complete list of accepted students and their countries is below:
Country | Students | Country | Students | Country | Students
Argentina | 5 | Hungary | 7 | Russian Federation | 35
Australia | 10 | India | 605 | Senegal | 1
Austria | 14 | Indonesia | 3 | Serbia | 1
Bangladesh | 3 | Ireland | 1 | Singapore | 8
Belarus | 3 | Israel | 2 | Slovak Republic | 2
Belgium | 3 | Italy | 24 | South Africa | 1
Brazil | 19 | Japan | 7 | South Korea | 2
Bulgaria | 2 | Kosovo | 3 | Spain | 21
Cameroon | 14 | Latvia | 1 | Sri Lanka | 41
Canada | 31 | Lithuania | 5 | Sweden | 6
China | 52 | Malaysia | 2 | Switzerland | 5
Croatia | 3 | Mauritius | 1 | Taiwan | 3
Czech Republic | 4 | Mexico | 4 | Trinidad and Tobago | 1
Denmark | 1 | Morocco | 2 | Turkey | 8
Ecuador | 4 | Nepal | 1 | Uganda | 1
Egypt | 12 | Netherlands | 6 | Ukraine | 6
Finland | 3 | Nigeria | 6 | United Kingdom | 28
France | 22 | Pakistan | 5 | United States | 104
Germany | 53 | Poland | 3 | Venezuela | 1
Greece | 16 | Portugal | 10 | Vietnam | 4
Hong Kong | 3 | Romania | 10

There were a record number of students submitting proposals for the program this year -- 5,199 students from 101 countries.

In our next GSoC statistics post we will delve deeper into the schools, gender breakdown, mentors, and registration numbers for the 2018 program.

By Stephanie Taylor, Google Open Source

OpenCensus’s journey ahead: platforms and languages

We recently blogged about the value of OpenCensus and how Google uses Census internally. Today, we want to share more about our long-term vision for OpenCensus.

The goal of OpenCensus is to be a ubiquitous observability framework that allows developers to automatically collect, aggregate, and export traces, metrics, and other telemetry from their applications. We plan on getting there by building easy-to-use libraries and automatically integrating with as many technologies and frameworks as possible.

Our roadmap has two themes: increased language, framework, and platform coverage, and the addition of more powerful features. Today, we’ll discuss the first theme, increased coverage.

Increasing Coverage

More Language Coverage

In January, we released OpenCensus for Java, Go, and C++ as well as tracing support for Python, PHP, and Ruby. We’re about to start development of OpenCensus for Node.js and .NET, and you’ll see activity on these repositories ramp up in the coming quarter.

Integration with more Frameworks, Platforms, and Clients

We want to provide a great out-of-the-box experience, so we need to automatically capture traces and metrics with as little developer effort as possible. To achieve this, we’ll be creating integrations for popular web frameworks, RPC frameworks, and storage clients. This will enable automatic context propagation, span creation, and trace annotations, without requiring extra work on behalf of developers.

As a basic example, OpenCensus already integrates with Go’s default gRPC and HTTP handlers to generate spans (with relevant annotations) and to pass context.

More complex integrations will provide more information to developers. Here’s an example of a trace captured with our upcoming MongoDB instrumentation, shown on Stackdriver Trace and AWS X-Ray:
A MongoDB trace shown in Stackdriver Trace

The same trace captured in X-Ray

Istio

OpenCensus will soon have out-of-the-box tracing and metrics collection in Istio. We’re currently working through our initial designs and implementation for integrations with the Envoy Sidecar and Istio Mixer service. Our goal is to provide Istio users with a great out-of-the-box tracing and metrics collection experience.

Kubernetes

We have two primary use cases in mind for Kubernetes deployments: providing cluster-wide visibility via z-pages, and better labeling of traces, stats, and metrics. Cluster-wide z-pages will allow developers to view telemetry in real time across an entire Kubernetes deployment, independently of their back-end. This is incredibly useful when debugging immediate high-impact issues like service outages.

Client Application Support

OpenCensus currently provides observability into back-end services; however, this doesn’t tell the whole story about end-to-end application performance. Throughout 2018, we plan to add instrumentation for client and front-end web applications, so developers can get traces that begin from customers’ devices and reflect actual perceived latency, and metrics captured from client code.

We aim to add support for instrumenting Android, iOS, and front-end JavaScript, though this list may grow or change. Expect to hear more about this later in 2018.

Next Up

Next week we’ll discuss some of the new features that we’re looking to bring to OpenCensus, including notable enhancements to the trace sampling logic.

None of this is possible without the support and participation from the community. Please check out our repository and start contributing; we welcome contributions of any size -- however you want to take part. You can join other developers and users on the OpenCensus Gitter channel. We’d love to hear from you!

By Pritam Shah and Morgan McLean, Census team

Open sourcing Seurat: bringing high-fidelity scenes to mobile VR

Crossposted from the Google Developers Blog

Great VR experiences make you feel like you’re really somewhere else. To create deeply immersive experiences, there are a lot of factors that need to come together: amazing graphics, spatialized audio, and the ability to move around and feel like the world is responding to you.

Last year at I/O, we announced Seurat as a powerful tool to help developers and creators bring high-fidelity graphics to standalone VR headsets with full positional tracking, like the Lenovo Mirage Solo with Daydream. Seurat is a scene simplification technology designed to process very complex 3D scenes into a representation that renders efficiently on mobile hardware. Here’s how ILMxLAB was able to use Seurat to bring an incredibly detailed ‘Rogue One: A Star Wars Story’ scene to a standalone VR experience.

Today, we’re open sourcing Seurat to the developer community. You can now use Seurat to bring visually stunning scenes to your own VR applications and have the flexibility to customize the tool for your own workflows.

Behind the scenes: how Seurat works

Seurat works by taking advantage of the fact that VR scenes are typically viewed from within a limited viewing region, and leverages this to optimize the geometry and textures in your scene. It takes RGBD images (color and depth) as input and generates a textured mesh, targeting a configurable number of triangles, texture size, and fill rate, to simplify scenes beyond what traditional methods can achieve.


To demonstrate what Seurat can do, here’s a snippet from Blade Runner: Revelations, which launched today with the Lenovo Mirage Solo.

Blade Runner: Revelations by Alcon Interactive and Seismic Games

The Blade Runner universe is known for its stunning worlds, and in Revelations, you get to unravel a mystery around fugitive Replicants in the futuristic but gritty streets. To create the look and feel for Revelations, Seismic used Seurat to bring a scene of 46.6 million triangles down to only 307,000, improving performance by more than 100x with almost no loss in visual quality:

Original scene:

Seurat-processed scene: 

If you’re interested in learning more about Seurat or trying it out yourself, visit the Seurat GitHub page to access the documentation and source code. We’re looking forward to seeing what you build!

By Manfred Ernst, Software Engineer

Rolling out the red carpet for GSoC 2018 students!

Congratulations to our 2018 Google Summer of Code (GSoC) students and a big thank you to everyone who applied! Our 206 mentoring organizations have chosen the 1,264 students that they'll be working with during the 14th Google Summer of Code. This year’s students come from 62 different countries!

The next step for participating students is the Community Bonding period which runs from April 23rd through May 14th. During this time, students will get up to speed on the culture and code base of their new community. They’ll also get acquainted with their mentor(s) and learn more about the languages or tools they will need to complete their projects. Coding begins May 14th and will continue throughout the summer until August 14th.

To the more than 3,800 students who were not chosen this year - don’t be discouraged! Many students apply at least once to GSoC before being accepted. You can improve your odds for next time by contributing to the open source project of your choice directly; organizations are always eager for new contributors! Look around GitHub and elsewhere on the internet for a project that interests you and get started.

Happy coding, everyone!

By Stephanie Taylor, GSoC Program Lead

My first open source project and Google Code-in

This is a guest post from a mentor with coala, an open source tool for linting and fixing code in many different languages, which participated in Google Code-in 2017.

About two years ago, my friend Gyan and I built a small web app which checked whether or not a given username was available on a few popular social media websites. The idea was simple: judge availability of the username on the basis of an HTTP response. Here’s a pseudo-code example:
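website_url = form_website_url(website, username)
# E.g., form_website_url('github', 'manu-chroma') returns 'github.com/manu-chroma'
website_url_response = http_get(website_url)  # issue the request with any HTTP client

if website_url_response.http_code == 404:
    print('username available')
else:
    print('username taken')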
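
For readers who want something runnable, the same idea might look like this in Python using only the standard library. It is a sketch under assumptions, not the original app's code: the `PROFILE_URLS` template and the `fetch_status` and `is_available` helpers are hypothetical.

```python
import urllib.error
import urllib.request

# Hypothetical URL templates; the original app's list of sites is not shown here.
PROFILE_URLS = {
    "github": "https://github.com/{username}",
}


def form_website_url(website: str, username: str) -> str:
    """Build the profile URL for a username on the given site."""
    return PROFILE_URLS[website].format(username=username)


def fetch_status(url: str) -> int:
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the status code is still what we want.
        return err.code


def is_available(website: str, username: str, get_status=fetch_status) -> bool:
    """Treat a 404 on the profile page as 'username available'."""
    return get_status(form_website_url(website, username)) == 404
```

Passing `get_status` as a parameter lets the check be tested without network access. Some sites do not answer 404 for unknown users, so this heuristic is only a first approximation.
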
Much to our delight, it worked! Well, almost. It had a lot of bugs but we didn’t care much at the time. It was my first Python project and the first time I open sourced my work. I always look back on it as a cool idea, proud that I made it and learned a lot in the process.

But the project had been abandoned until John from coala approached me. John suggested we use it for Google Code-in because one of coala’s tasks for the students was to create accounts on a few common coding related websites. Students could use the username availability tool to find a good single username–people like their usernames to be consistent across websites–and coala could use it to verify that the accounts were created.

I had submitted a few patches to coala in the past, so this sounded good to me! The competition clashed with my vacation plans, but I wanted to get involved, so I took the opportunity to become a mentor.

Over the course of the program, students not only used the username availability tool but they also began making major improvements. We took the cue and began adding tasks specifically about the tool. Here are just a few of the things students added:
  • Regex to determine whether a given username was valid for any given website
  • More websites, bringing it to a total of 13
  • Tests (!)
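
The regex check from the first item above could be sketched like this. The site names and patterns are illustrative only; they are not the rules the students wrote, and each real site's username policy may differ.

```python
import re

# Illustrative rules only; real sites define their own username policies.
USERNAME_PATTERNS = {
    # Letters, digits, and underscores, 1 to 15 characters.
    "example-short": re.compile(r"[A-Za-z0-9_]{1,15}"),
    # Letters, digits, and hyphens, up to 39 characters, no leading or trailing hyphen.
    "example-hyphen": re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,37}[A-Za-z0-9])?"),
}


def is_valid_username(website: str, username: str) -> bool:
    """Return True if the whole username matches the site's pattern."""
    return USERNAME_PATTERNS[website].fullmatch(username) is not None
```

Validating first avoids sending a request for a name the site would never accept.
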
The web app is online so you can check username availability too!

I had such a fun time working with students in Google Code-in; their enthusiasm and energy were amazing. Special thanks to students Andrew, Nalin, Joshua, and biscuitsnake for all the time and effort you put into the project. You did really useful work and I hope you learned from the experience!

I want to thank John for approaching me in the first place and suggesting we use and improve the project. He was an unstoppable force throughout the competition, helping both students and fellow mentors. John even helped me with code reviews to really refine the work students submitted, and helped them improve based on the feedback.

Kudos to the Google Open Source team for organizing it so well and lowering the barriers of entry to open source for high school students around the world.

By Manvendra Singh, coala mentor