Monthly Archives: January 2021

Answering your top questions about the News Media Bargaining Code

We know many of you still have questions about the News Media Bargaining Code and its impact on the Google services you use after Mel Silva, Managing Director for Google Australia, appeared at a public hearing of the Senate Economics Legislation Committee last week. We want to address some key questions to help clarify our position. 


Question 1: What is Google’s position on this new law?

We are not against being regulated by a Code and we are willing to pay to support journalism—we are doing that around the world through News Showcase. But several aspects of the current version of this law are just unworkable for the services you use and our business in Australia. The Code, as it’s written, would break the way Google Search works and the fundamental principle of the internet, by forcing us to pay to provide links to news businesses’ sites. 


There are two other serious problems remaining with the law, but at the heart of it, it comes down to this: the Code’s rules would undermine a free and open service that’s been built to serve everyone, and replace it with one where a law would give a handful of news businesses an advantage over everybody else.


Question 2: What have others said about this new law? 

It’s not just Google that is concerned about key aspects of this Code. Twice since the first draft was released in July 2020, regulators and the Government asked the public to make submissions to provide feedback. Here’s an overview of these submissions:


  • On 23 December 2020, the Australian Competition and Consumer Commission released the submissions people made to the first draft of the News Media Bargaining Code. We found that more than 80% of these submissions flagged significant concerns––and they have come from various groups including businesses of all sizes, industry groups, small news publishers, YouTube creators, and hundreds of individual Australians.  More information in this blog. 

  • The Senate Committee that is currently reviewing the law also asked for feedback. Our analysis shows that 34 of the 55 submissions they received voiced concerns about the law––including about the provision that makes digital platforms pay just to link (see question 4). This includes Google’s own submission. The remaining 21 submissions were either supportive or neutral towards most aspects of the Code. You can read through all submissions here.  


Question 3: What’s so bad about paying for links?  

The ability to link freely between sites is a fundamental part of the internet. Just as you don’t pay to include a hyperlink in an email, websites and search engines do not pay to provide links to third-party websites. Requiring payment for links creates a damaging precedent and privileges one group of content, that of news publishers, over everyone else, which breaks Google Search. Read more about why linking freely is important for the open web here.

Question 4: What have others said about paying for links?

  •  “[The law] risks breaching a fundamental principle of the web by requiring payment for linking between certain content online.” - Tim Berners-Lee, the inventor of the world wide web.

  • “...the requirement for digital platforms to pay for providing a link to another website runs counter to one of the fundamental tenets of the internet: the ability to link freely between content. The ability to freely make these connections has underpinned the creativity and sharing of knowledge enabled by the internet. This legislation undercuts this fundamental principle that has, for decades, enabled the internet to deliver real benefits to all Australians.” - The Business Council of Australia

  • “The precedent of charging for links and snippets is a fundamental threat to the open internet, not just Google.” - Scott Farquhar, co-founder of Australian tech company Atlassian, as told to The Australian on January 15.

  • “In its current state [the bill] represents a fundamental challenge to the free and open Internet, to the functioning of the country’s digital economy, and to Australia’s economic future…” - Vint Cerf, chief internet evangelist at Google, also regarded as one of the ‘fathers of the internet’. 


Question 5: You say you're not against paying to support journalism, but the Code isn't workable. So what do you propose?

We are willing to pay to support journalism, but how we do that matters. Instead of requiring payment for linking to websites, we have proposed a model where Google could pay Australian news businesses under this new Code through Google News Showcase: our AU$1.38 billion (US$1 billion) commitment over three years to support the news industry worldwide. There are nearly 450 publications signed up already, including seven publishers with 25 titles here in Australia, and publications like Reuters, Germany’s Der Spiegel, France’s Le Monde, or piauí, in Brazil. 


Google News Showcase is a new product that will benefit both publishers and readers: Readers get more insight on the stories that matter to them with curated story panels across several Google services, and news publishers will increase their revenue through monthly licensing payment from Google as well as payment for paywalled content to provide users free access to select stories. In addition, news publishers have the opportunity to further grow their business through high-value traffic to their sites and deeper relationships with their audience.

News Showcase shows up as panels on Google News and Google Discover. In Germany and Brazil millions of users have already seen the panels publishers created there.

Google News Showcase would be subject to this new law. That means if a publisher is discussing a News Showcase deal with Google, and they’re not happy with the negotiation, they could go to an arbitrator to resolve any disagreements. 


Question 6: How does Google propose to change the law? 

We’re proposing reasonable amendments in three areas: 


  1. Instead of paying for links, we’re proposing to pay publishers through Google News Showcase, our AU$1.3 billion global investment in news partnerships over the next three years. We know that News Showcase works, because we’ve already signed News Showcase agreements with 450 publications, large and small, across a dozen countries, and they’re now getting paid. News Showcase would operate under this Code, with the option to go to an arbitrator if there are any disagreements. 

  2. To ensure that both publishers and platforms can negotiate fairly, we’ve proposed a standard commercial arbitration model for deals on News Showcase, one that would let arbitrators look at the comparable value of similar transactions, rather than an unpredictable process which  looks at only one side’s costs and discounts the value Google provides publishers. 

  3. Giving notice to certain news businesses about changes to our algorithm should be limited to significant actionable changes only, noting we make thousands of updates to Google Search every year.


Question 7: If this law passes as it stands, will Google Search still be available in Australia?

The ability to link freely between websites is fundamental to Search. This Code creates an unreasonable and unmanageable financial and operational risk to our business. As Mel Silva said during the Senate hearing last week, if the Code were to become law in its current form, we would have no real choice but to stop making Google Search available in Australia. 


After Mel said that, many media outlets reported that we had ‘threatened’ to leave Australia. Withdrawing Search is the last thing we want to happen, and it’s a worst case scenario that applies only if the Code remains unworkable. As we told Senators, we’re willing to pay publishers for value. We don’t object to a mandatory News Media Bargaining Code, and we believe there’s a clear path to make this Code work for everyone: publishers, digital platforms, and Australian businesses and consumers. 


Question 8: Why is making Search unavailable in Australia the worst case scenario? Why can't you just remove news from Search results?

This is not possible due to the extremely broad and vague definition of “news” in the Code—which includes any “content that reports, investigates or explains current issues or events of interest to Australians." This goes far beyond what most of us would consider “news.” And the content we’d need to remove could be on any website at any time, not just the websites of the news businesses registered under the Code.  


Question 9: What’s happening in France? I read that you’re paying publishers there...

We have offered (and signed) deals for News Showcase in France and a dozen other countries, the same as what we’re proposing in Australia. We believe that these new agreements demonstrate that News Showcase can work as a solution to pay publishers within a framework set by regulators, without breaking Google Search or the open web. 


Question 10: How does news content show up in Google?

Google does not show full news articles. We link you to news content, just like we link you to every other page on the web, such as Wikipedia entries, personal blogs or business websites. You can read more about how news shows up in Google Search, and how we’re supporting the news industry, in this blog.


Question 11: What does Google contribute to the Australian economy?

Each year, Google provides $53 billion in benefits to businesses and consumers. In 2002, Google Australia started with just one person in a lounge room; today, our team is 1,800 strong. We also support an additional 116,000 jobs across the country, and provide $39 billion in benefits to Australian businesses and $14 billion in benefits to consumers. In the 2019 calendar year, Google Australia paid AU$59 million of corporate income taxes, and Google’s presence in Australia contributed over AU$700 million in taxes to the Australian Government’s revenue base. 


Question 12: What’s the impact of the revised law on YouTube?

On 8 December 2020, the Government confirmed that YouTube will not be included as a designated service in the Code at this time, and we agree that this is the right approach. However, the way that the Code is written leaves the door open for additional digital platforms to be added at any time, and several businesses have advocated for YouTube’s inclusion in their Senate submissions. We will continue to make our case to the Australian Government on why YouTube should remain excluded from the Code.


You can read more about our proposal for a workable News Media Bargaining Code at g.co/afaircode and in these blog posts. You can hear Mel Silva’s full testimony at the Senate hearing on 22 January on this site.  

#AndroidDevJourney spotlight – January edition

Posted by Luli Perkins, Developer Relations Program Manager


Header image with text saying Android Dev Journey

We kicked off the #AndroidDevJourney to give members of our community the opportunity to share their stories through our social platforms. Each Saturday from January through June we’ll feature a new developer on our Twitter account. We have received an overwhelming number of inspirational stories and hope you enjoy reading through the ones we’ve selected below.

For a chance to be featured in our February spotlight series, tweet us your story using #AndroidDevJourney.

Head shot of Niharika Arora

Niharika Arora

Tell me about your journey in becoming an Android Developer and how you got started.

My journey started in the field of Android when I was in my 4th year of undergrad studies. I got an internship in a startup named GreenAppleSolutions. There I got a chance to work on an Android project from scratch and luckily my first project went live on the Play Store. During this whole internship, I found Android so interesting because everything you code, you can see the results live in front of you on your device. I started loving Android and decided to take Android as my career path.

What’s one shortcut, tip, or hack you can’t live without?

I am a big fan of Android Lint, which has saved me many times from manually finding deprecated calls/APIs. It has also helped me in following the best practices and making my code more optimized, secure, and highly performant.

What's the one piece of advice you wish someone would have given you when you started on your journey?

Actually, there are two:

  • Clearing up a small doubt is just as important, even if you think it is a stupid one. Ask as many questions as you need until you are satisfied with the answer.
  • Reading tutorials is good, but start exploring the documentation in depth. Initially, it might look like too much to start with, but it will build you up to be a good developer in the long run.

Head shot of Walmyr Carvalho

Walmyr Carvalho

Tell me about your journey in becoming an Android Developer and how you got started.

Funny thing! I started working with mobile on iOS, in 2010, but then in 2011 my college final project was an app for civil construction and nobody on the team had a Mac, so we did it for Android (We got a 10, btw!)! At that time I was teaching technology to some government people and wasn’t into coding that much, but then after that project in 2011 I got my first job as a Junior Android Developer and it got me so hooked on the platform that I couldn’t leave!

I was able to work with Java on Eclipse + ADT, Holo, ActionBarSherlock, the beginnings of Material Design and was attending Google I/O ’13 when Google announced Android Studio, which was a very humbling but insightful experience to me, not only because of the learning but also the people I met that helped me a lot as well!

Since then, I’ve been working with mobile and, mostly, with Android for more than 10 years now, helping a lot of Brazilian tech companies and unicorns with their Android projects, and since 2016 I’ve been one of the Google Developer Experts for Android around here.

Also, I love development and design communities, so I try to be involved with that as much as I can. I’m a former organizer of GDG São Paulo and the creator and organizer of Kotlin Meetup São Paulo and Android Dev BR - the biggest Brazilian/Lusophone Android community in the world, with more than 7,500 members!

Lastly, I’m also involved with the national startup community, as a mentor for ACE Startups and Google For Startups Accelerator programs in Brazil.

What’s one Android development shortcut, tip, or hack you can’t live without?

There’s a simple but powerful shortcut in Android Studio that I use a lot: multi-cursor occurrence selection. Use Ctrl + G (macOS) / Alt + J (Windows + Linux) to select occurrences incrementally, or Ctrl + Cmd + G / Shift + Ctrl + Alt + J to select all occurrences at once. It seems silly, but this shortcut helps me so much to get going on my code, especially when it comes to refactoring. I use it every day!

What's the one piece of advice you wish someone would have given you when you started on your journey?

I think I would sum up my advice in two words: learn and share.

Learn as much as you can, not only with the amazing content available on official documentation, and from the community, but also learn from your own mistakes through consistent practice. There’s a lot of content available for free on the internet, and also both Google and GDEs (Google Developer Experts) like me can get you going, so keep practicing and get your knowledge online!

And once you learn, share with other people! If I’m where I am today, it’s because I was able to share what I couldn’t find when I was learning, so please, share your knowledge! The Android community is amazing and super helpful; you can literally reach the creators of the APIs and libraries you use on Twitter, Reddit and many other places. Write an article, record a podcast or a video; there are many formats that you could use.

The internet is such a powerful tool for learning and sharing and I really recommend you to do that there, and I’m definitely here to help if needed! :)

Head shot of Nate Washington

Nate Washington

Tell me about your journey in becoming an Android Developer and how you got started.

I became an Android developer in 2015, while working on my first business idea. I couldn’t afford to go back to school, so I decided to try my hand at starting a business instead. I launched a web application, but my customers insisted on having a native app for their needs as well. I originally looked for someone with more experience, but ultimately decided to just teach myself how to build an Android app. Fast forward to 2017, and my cofounder Christian and I launched the Android app for our company, Qoins, on the Google Play Store. Since then, we’ve served tens of thousands of Android customers and raised a few rounds of funding.

What’s one Android development shortcut, tip, or hack you can’t live without?

Being able to test my Android builds on virtual devices is a lifesaver. There are a lot of different scenarios to account for when building Android apps for thousands of different devices. Tools such as Firebase Test Lab, as well as other virtual device services, allow me to create specific scenarios for hands-on testing that I can’t achieve with the physical Android devices that I own.

What's the one piece of advice you wish someone would have given you when you started on your journey?

Making mistakes is OK; it's all part of the process.

Headshot of Yuki Anzai

Yuki Anzai

Tell me about your journey in becoming an Android Developer and how you got started.

My journey began when I got my very first Android device, the HTC Magic, at Google Developer Day 2009. At that time, I was a college student and writing my personal application with JavaFX, so I had experience and familiarity with Java. Then I soon started to port my app to Android. After graduation I worked at a software company and wanted to develop Android apps as my job. But there seemed to be no opportunity at that company. So I created my own small company, an agency that develops Android apps.

What’s one Android development shortcut, tip, or hack you can’t live without?

There are many. If I had to pick one, it would be Android Studio. I always appreciate the awesomeness of Android Studio because I started Android app development with Eclipse. (Also I can't live without Kotlin, RecyclerView, ConstraintLayout ...)

The Android Studio shortcut that I can't live without is Command + B (Declaration or Usages). It lets us jump between a declaration and its usages. It's very useful for reading source code, including the Android platform and library code.

What's the one piece of advice you wish someone would have given you when you started on your journey?

Read the official documentation. Read the source code of the platform and libraries that you use. One way to accelerate learning is to build an app from start to finish (all the way to release on the market).

Don't rely on libraries too much, especially ones that affect the whole structure of your app. Your app might live longer than the libraries.

Head shot of Madona Syombua

Madona Syombua

Tell me about your journey in becoming an Android Developer and how you got started.

My Android Journey started back in early 2014; before that, I worked as a junior Java developer for a small firm building inventory systems. However, that did not interest me, and I kept looking for something great to do with my Java knowledge. I bought my first phone, a Nokia, and saw apps in the phone and wondered how they made those apps. I researched and learned that apps were actually written in Java, and that's how my journey began.

I recall building my first application, Simple Math, with only activities since fragments were not there; what an improvement we've had over the years. Simple Math had 500 downloads with a 4.5 rating, and this really motivated me to build more applications. I later won the Grow With Google Scholarship (2018), which boosted my career. During this one-year scholarship, I launched my second application, Budgeting Buddy, on the Google Play Store, where it has a 4.5 rating with over five thousand downloads. I currently work for Streem as an Android Engineer, and I indeed love how far Android has come and how the technology and maintenance have improved over the years, especially the Emulator.

What's one Android development shortcut, tip, or hack you can't live without?

A shortcut I can't live without is [Option + Command + L] and [Option + Command + O]; this really helps me during my pull request process. An amazing hack that I have learned to appreciate is the git local history option. WOW, what a lifesaver. Sometimes you might forget what you had changed, but this hack always saves my life.

What's the one piece of advice you wish someone would have given you when you started on your journey?

Actually, when I transitioned into mobile completely, I felt the learning curve was something I would have to accommodate in my life, which has really helped me a lot. That means always staying ahead of the game by learning what is new, what is being recommended, and why it is needed. For instance, having Room was an amazing advancement, now Dagger Hilt, and many more. So if I can turn this around and advise new developers: be ready to learn and you will enjoy Android development.


The Android Developer community prides itself in its inclusivity and welcomes developers from all backgrounds and stages of life. If you’re feeling inspired and want to learn more about how to become a part of our community, here are a few resources to help get you started.

Dive into developer.android.com

Follow us on Twitter

Subscribe to our YouTube channel

GDG logo

The Google Developer Groups program gives developers the opportunity to meet local developers with similar interests in technology. A GDG meetup event includes talks on a wide range of technical topics where you can learn new skills through hands-on workshops.

Join a chapter near you here.

Women Techmakers logo

Founded in 2014, Google’s Women Techmakers is dedicated to helping all women thrive in tech through community, visibility and resources. With a member base of over 100,000 women developers, we’re working with communities across the globe to build a world where all women can thrive in tech.

Become a member here.

GD Experts logo

The Google Developers Experts program is a global network of highly experienced technology experts, influencers and thought leaders who actively support developers, companies and tech communities by speaking at events, publishing content, and building innovative apps. Experts actively contribute to and support the developer and startup ecosystems around the world, helping them build and launch highly innovative apps.

Learn more about the program here.

Java is a registered trademark of Oracle and/or its affiliates.

Google Workspace Updates Weekly Recap – January 29, 2021

New updates

There are no new updates to share this week. Please see below for a recap of published announcements.


Previous announcements

Oops! You may have noticed we missed publishing last week’s recap. To make up for it, below we’re including announcements published on the Workspace Updates blog in the past two weeks. Please refer to the original blog posts for complete details.


Resize the Chat and Rooms sections in Gmail on the web
You can now resize the Chat and Rooms sections in the left-side navigation of Gmail on the web. | Learn more.


Quickly navigate to active cells and ranges with the new range name box in Google Sheets
We’re adding a range name box, located to the left of the formula bar, to improve navigation in Google Sheets. | Learn more.


Enable offline support for Google Calendar on web from your computer
You can now enable offline support for Google Calendar on Google Chrome from your computer. | Learn more.


Better privacy when screen sharing with muted web notifications
Now when you’re sharing your screen, Chrome will automatically hide the content of web pop-up notifications. This includes notifications from Google Chat, email notifications, and other third party websites. | Learn more.


Out of office information will now display when replying to or mentioning a user in a Google Docs comment
In Google Docs, you’ll now see out of office information when replying to or mentioning other users in a comment. | Learn more.


Control background replacement in Google Meet with a new admin setting
We’re adding the ability for admins to enable or disable the use of custom or preset backgrounds in Google Meet meetings at the organizational unit (OU) level. | Learn more.


Indirect membership visibility and membership hierarchy APIs now generally available
We’re making it easier to identify, audit, and understand indirect group membership via the Cloud Identity Groups API. Specifically, we’re making the membership visibility and membership hierarchy APIs generally available. | Available to Google Workspace Enterprise Standard and Enterprise Plus, as well as G Suite Enterprise for Education and Cloud Identity Premium customers. | Learn more.

Google Summer of Code 2021 is open for mentor organization applications!

GSoC logo
With the new year comes the start of our 17th edition of Google Summer of Code (GSoC)! Right now open source projects and organizations can apply to participate as mentoring organizations for the students in the 2021 program. GSoC is a global program that draws student developers (18 years old and over) from around the world to contribute to open source projects. This year, from June 7th to August 16th, each student will spend 10 weeks working on a coding project with the support of volunteer mentors from participating open source organizations.

Does your open source project want to learn more about becoming a mentoring organization? Visit the program site and read the mentor guide to learn what it means to be a mentor organization, how to prepare your community (hint: have plenty of enthusiastic mentors!), how to create appropriate project ideas (that will be ~175 hour projects for the student), and how to prepare your application.
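
For mentors sizing project ideas, a rough back-of-the-envelope check of the workload may help. The sketch below is in Python and uses only the June 7th to August 16th dates and the ~175-hour figure quoted in this post; the function and constant names are illustrative, and it assumes the hours are spread evenly across the coding period, which real projects need not do.

```python
from datetime import date

# Figures quoted in this post; the names are illustrative only.
CODING_START = date(2021, 6, 7)
CODING_END = date(2021, 8, 16)
PROJECT_HOURS = 175  # approximate project size


def coding_weeks(start: date, end: date) -> float:
    """Length of the coding period in weeks."""
    return (end - start).days / 7


def hours_per_week(total_hours: float, weeks: float) -> float:
    """Average weekly effort, assuming an even spread (an assumption)."""
    return total_hours / weeks


weeks = coding_weeks(CODING_START, CODING_END)
weekly = hours_per_week(PROJECT_HOURS, weeks)
print(f"{weeks:g} weeks, about {weekly:g} hours per week")
```

A project idea that clearly needs far more than that weekly average is probably too large for a single student.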

We welcome all types of organizations and are very eager to involve first-time organizations with a 2021 goal of accepting 40 new orgs. We encourage veteran organizations to refer other organizations they think would be a good fit to participate in GSoC as well.

Last year, 1,106 students completed the program under the guidance of over 2,000 mentors from 198 open source organizations. Many types of open source organizations are involved in GSoC, from small and medium sized open source organizations to larger, umbrella organizations with many sub-projects under them (Python Software Foundation, Apache Software Foundation, etc.). Some organizations are relatively young (less than 2 years old), while other organizations have been around for 20+ years.

You can apply to be a mentoring organization for GSoC starting today on the program site. The deadline to apply is February 19th at 19:00 UTC. We will publicly announce the organizations chosen for GSoC 2021 on March 9th.

Please visit the program site for more information on how to apply and review the detailed timeline of important deadlines. We also encourage you to check out the Mentor Guide and our short video on why open source projects want to be a part of the GSoC program.

Good luck to all open source mentoring organization applicants!

By Stephanie Taylor, Google Open Source

Google Summer of Code 2021 is open for mentor organization applications!

GSoC logo
With the new year comes the start of our 17th edition of Google Summer of Code (GSoC)! Right now open source projects and organizations can apply to participate as mentoring organizations for the students in the 2021 program. GSoC is a global program that draws student developers (18 years old and over) from around the world to contribute to open source projects. This year, from June 7th to August 16th, each student will spend 10 weeks working on a coding project with the support of volunteer mentors from participating open source organizations.

Does your open source project want to learn more about becoming a mentoring organization? Visit the program site and read the mentor guide to learn about what it means to be a mentor organization, how to prepare your community (hint: have plenty of enthusiastic mentors!), creating appropriate project ideas (that will be ~175 hour projects for the student), and tips for preparing your application.

We welcome all types of organizations and are very eager to involve first-time organizations with a 2021 goal of accepting 40 new orgs. We encourage veteran organizations to refer other organizations they think would be a good fit to participate in GSoC as well.

Last year, 1,106 students completed the program under the guidance of over 2,000 mentors from 198 open source organizations. Many types of open source organizations are involved in GSoC, from small and medium sized open source organizations to larger, umbrella organizations with many sub-projects under them (Python Software Foundation, Apache Software Foundation, etc.). Some organizations are relatively young (less than 2 years old), while other organizations have been around for 20+ years.

You can apply to be a mentoring organization for GSoC starting today on the program site. The deadline to apply is February 19th at 19:00 UTC. We will publicly announce the organizations chosen for GSoC 2021 on March 9th.

Please visit the program site for more information on how to apply and review the detailed timeline of important deadlines. We also encourage you to check out the Mentor Guide and our short video on why open source projects want to be a part of the GSoC program.

Good luck to all open source mentoring organization applicants!

By Stephanie Taylor, Google Open Source

Data Driven Security Hardening in Android

The Android platform team is committed to securing Android for every user across every device. In addition to monthly security updates to patch vulnerabilities reported to us through our Vulnerability Rewards Program (VRP), we also proactively architect Android to protect against undiscovered vulnerabilities through hardening measures such as applying compiler-based mitigations and improving sandboxing. This post focuses on the decision-making process that goes into these proactive measures: in particular, how we choose which hardening techniques to deploy and where they are deployed. As device capabilities vary widely within the Android ecosystem, these decisions must be made carefully, guided by data available to us to maximize the value to the ecosystem as a whole.

The overall approach to Android Security is multi-pronged and leverages several principles and techniques to arrive at data-guided solutions to make future exploitation more difficult. In particular, when it comes to hardening the platform, we try to answer the following questions:

  • What data are available and how can they guide security decisions?
  • What mitigations are available, how can they be improved, and where should they be enabled?
  • What are the deployment challenges of particular mitigations and what tradeoffs are there to consider?

By shedding some light on the process we use to choose security features for Android, we hope to provide a better understanding of Android's overall approach to protecting our users.

Data-driven security decision-making

We use a variety of sources to determine what areas of the platform would benefit the most from different types of security mitigations. The Android Vulnerability Rewards Program (VRP) is one very informative source: all vulnerabilities submitted through this program are analyzed by our security engineers to determine the root cause of each vulnerability and its overall severity (based on these guidelines). Other sources are internal and external bug-reports, which identify vulnerable components and reveal coding practices that commonly lead to errors. Knowledge of problematic code patterns combined with the prevalence and severity of the vulnerabilities they cause can help inform decisions about which mitigations are likely to be the most beneficial.



Types of Critical and High severity vulnerabilities fixed in Android Security Bulletins in 2019
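As a rough illustration of how prevalence and severity can be combined into a ranking, the sketch below tallies vulnerability reports by root cause and weights each one by severity. The severity weights and the sample records are invented for the example; they are not real Android data, and the post does not describe a specific scoring formula.

```python
from collections import Counter

# Hypothetical severity weights; the post does not define a scoring scheme.
SEVERITY_WEIGHT = {"critical": 4, "high": 3, "moderate": 2, "low": 1}


def rank_root_causes(reports):
    """Return (root_cause, score) pairs, highest severity-weighted score first."""
    scores = Counter()
    for report in reports:
        scores[report["root_cause"]] += SEVERITY_WEIGHT[report["severity"]]
    # Sort by score descending, then by name so that ties are deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up records for illustration only.
reports = [
    {"component": "media", "root_cause": "out-of-bounds", "severity": "critical"},
    {"component": "media", "root_cause": "integer-overflow", "severity": "high"},
    {"component": "bluetooth", "root_cause": "out-of-bounds", "severity": "high"},
    {"component": "nfc", "root_cause": "use-after-free", "severity": "high"},
    {"component": "media", "root_cause": "integer-overflow", "severity": "moderate"},
]

for cause, score in rank_root_causes(reports):
    print(f"{cause}: {score}")
```

A ranking like this is only a starting point: as the following paragraphs explain, report counts are biased toward wherever researchers happen to be looking.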

Relying purely on vulnerability reports is not sufficient, as the data are inherently biased: security researchers often flock to "hot" areas where other researchers have already found vulnerabilities (e.g. Stagefright). Or they may focus on areas where readily available tools make it easier to find bugs (for instance, if a security research tool is posted to GitHub, other researchers commonly use that tool to explore deeper).

To ensure that mitigation efforts are not biased only toward areas where bugs and vulnerabilities have been reported, internal Red Teams analyze less scrutinized or more complex parts of the platform. In addition, continuous automated fuzzers run at scale on both Android virtual machines and physical devices, which also ensures that bugs can be found and fixed early in the development lifecycle. Any vulnerabilities uncovered through this process are also analyzed for root cause and severity, which inform mitigation deployment decisions.

The Android VRP rewards submissions of full exploit-chains that demonstrate a full end-to-end attack. These exploit-chains, which generally utilize multiple vulnerabilities, are very informative in demonstrating techniques that attackers use to chain vulnerabilities together to accomplish their goals. Whenever a researcher submits a full exploit chain, a team of security engineers analyzes and documents the overall approach, each link in the chain, and any innovative attack strategies used. This analysis informs which exploit mitigation strategies could be employed to prevent pivoting directly from one vulnerability to another (some examples include Address Space Layout Randomization and Control-Flow Integrity) and whether the process’s attack surface could be reduced if it has unnecessary access to resources.

There are often multiple different ways to use a collection of vulnerabilities to create an exploit chain. Therefore a defense-in-depth approach is beneficial, with the goal of reducing the usefulness of some vulnerabilities and lengthening exploit chains so that successful exploitation requires more vulnerabilities. This increases the cost for an attacker to develop a full exploit chain.

Keeping up with developments in the wider security community helps us understand the current threat landscape, what techniques are currently used for exploitation, and what future trends look like. This involves but is not limited to:

  • Close collaboration with the external security research community
  • Reading journals and attending conferences
  • Monitoring techniques used by malware
  • Following security research trends in security communities
  • Participating in external efforts and projects such as KSPP, syzbot, LLVM, Rust, and more

All of these data sources provide feedback for the overall security hardening strategy, where new mitigations should be deployed, and what existing security mitigations should be improved.

Reasoning About Security Hardening

Hardening and Mitigations

Analyzing the data reveals areas where broader mitigations can eliminate entire classes of vulnerabilities. For instance, if parts of the platform show a large number of vulnerabilities due to integer overflow bugs, they are good candidates to enable Undefined Behavior Sanitizer (UBSan) mitigations such as the Integer Overflow Sanitizer. When common patterns in memory access vulnerabilities appear, they inform efforts to build hardened memory allocators (enabled by default in Android 11) and implement mitigations (such as CFI) against exploitation techniques that provide better resilience against memory overflows or Use-After-Free vulnerabilities.

Before discussing how the data can be used, it is important to understand how we classify our overall efforts in hardening the platform. There are a few broadly defined buckets that hardening techniques and mitigations fit into (though sometimes a particular mitigation may not fit cleanly into any single one):

  • Exploit mitigations
    • Deterministic runtime prevention of vulnerabilities detects undefined or unexpected behavior and aborts execution when the behavior is detected. This turns potential memory corruption vulnerabilities into less harmful crashes. Often these mitigations can be enabled selectively and still be effective because they impact individual bugs. Examples include Integer Sanitizer and Bounds Sanitizer.
    • Exploitation technique mitigations target the techniques used to pivot from one vulnerability to another or to gain code execution. These mitigations theoretically may render some vulnerabilities useless, but more often serve to constrain the actions available to attackers seeking to exploit vulnerabilities. This increases the difficulty of exploit development in terms of time and resources. These mitigations may need to be enabled across an entire process's memory space to be effective. Examples include Address Space Layout Randomization, Control Flow Integrity (CFI), Stack Canaries and Memory Tagging.
    • Compiler transformations that change undefined behavior to defined behavior at compile-time. This prevents attackers from taking advantage of undefined behavior such as uninitialized memory. An example of this is stack initialization.
  • Architectural decomposition
    • Splits larger, more privileged components into smaller pieces, each of which has fewer privileges than the original. After this decomposition, a vulnerability in one of the smaller components will have reduced severity by providing less access to the system, lengthening exploit chains, and making it harder for an attacker to gain access to sensitive data or additional privilege escalation paths.
  • Sandboxing/isolation
    • Related to architectural decomposition, enforces a minimal set of permissions/capabilities that a process needs to correctly function, often through mandatory and/or discretionary access control. Like architectural decomposition, this makes vulnerabilities in these processes less valuable as there are fewer things attackers can do in that execution context, by applying the principle of least privilege. Some examples are Android Permissions, Unix Permissions, Linux Capabilities, SELinux, and Seccomp.
  • Migrating to memory-safe languages
    • C and C++ do not provide memory safety the way that languages like Java, Kotlin, and Rust do. Given that the majority of security vulnerabilities reported to Android are memory safety issues, a two-pronged approach is applied: improving the safety of C/C++ while also encouraging the use of memory safe languages.
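To make the "deterministic runtime prevention" bucket above more concrete, here is a minimal Python model of what an integer or bounds check does: the unchecked operation silently produces a wrong value, while the checked one stops execution. This is only a simulation of the idea, not how the sanitizers are implemented in the compiler; `SanitizerAbort`, the helper names, and the 64-bit `size_t` width are assumptions made for the example.

```python
SIZE_MAX = 2**64 - 1  # assumes a 64-bit size_t


class SanitizerAbort(Exception):
    """Stands in for the process abort that a sanitizer would trigger."""


def wrapping_sub(a, b):
    """Unsigned subtraction without a check: the result wraps modulo 2**64."""
    return (a - b) & SIZE_MAX


def checked_sub(a, b):
    """Unsigned subtraction that aborts instead of wrapping."""
    if b > a:
        raise SanitizerAbort(f"unsigned integer overflow: {a} - {b}")
    return a - b


def checked_index(buffer, index):
    """Array access that aborts on an out-of-bounds index."""
    if not 0 <= index < len(buffer):
        raise SanitizerAbort(f"index {index} out of bounds for length {len(buffer)}")
    return buffer[index]


def aborts(operation, *args):
    """Return True if the checked operation aborted."""
    try:
        operation(*args)
    except SanitizerAbort:
        return True
    return False


print(wrapping_sub(0, 1))                      # a huge value that might later be used as a length
print(aborts(checked_sub, 0, 1))               # the checked version stops instead
print(aborts(checked_index, [10, 20, 30], 3))  # an off-by-one read is caught as well
```

The trade-off is visible even in this toy version: the check turns a potential memory corruption into a crash, which is safer but still disrupts whatever the process was doing.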

Enabling these mitigations

With the broad arsenal of mitigation techniques available, which of these to employ and where to apply them depends on the type of problem being solved. For instance, a monolithic process that handles a lot of untrusted data and does complex parsing would be a good candidate for all of these. The media frameworks provide an excellent historical example where an architectural decomposition enabled incrementally turning on more exploit mitigations and deprivileging.

Architectural decomposition and isolation of the Media Frameworks over time

Remotely reachable attack surfaces such as NFC, Bluetooth, WiFi, and media components have historically housed the most severe vulnerabilities, and as such these components are also prioritized for hardening. These components often contain some of the most common vulnerability root causes that are reported in the VRP, and we have recently enabled sanitizers in all of them.

Libraries and processes that enforce or sit at security boundaries, such as libbinder, and widely-used core libraries such as libui, libcore, and libcutils are good targets for exploit mitigations since these are not process-specific. However, due to performance and stability sensitivities around these core libraries, mitigations need to be supported by strong evidence of their security impact.

Finally, the kernel’s high level of privilege makes it an important target for hardening as well. Because different codebases have different characteristics and functionality, susceptibility to and prevalence of certain kinds of vulnerabilities will differ. Stability and performance of mitigations here are exceptionally important to avoid negatively impacting the user experience, and some mitigations that make sense to deploy in user space may not be applicable or effective. Therefore our considerations for which hardening strategies to employ in the kernel are based on a separate analysis of the available kernel-specific data.

This data-driven approach has led to tangible and measurable results. Starting in 2015 with Stagefright, a large number of Critical severity vulnerabilities were reported in Android's media framework. These were especially sensitive because many of these vulnerabilities were remotely reachable. This led to a large architectural decomposition effort in Android Nougat, followed by additional efforts to improve our ability to patch media vulnerabilities quickly. Thanks to these changes, in 2020 we had no internet-reachable Critical severity vulnerabilities reported to us in the media frameworks.

Deployment Considerations

Some of these mitigations provide more value than others, so it is important to focus engineering resources where they are most effective. This involves weighing the performance cost of each mitigation as well as how much work is required to deploy it and support it without negatively affecting device stability or user experience.

Performance

Understanding the performance impact of a mitigation is a critical step toward enabling it. Adding too much overhead to some components or the entire system can negatively impact user experience by reducing battery life and making the device less responsive. This is especially true for entry-level devices, which should benefit from hardening as well. We thus want to prioritize engineering efforts on impactful mitigations with acceptable overheads.

When investigating performance, important factors include not just CPU time but also memory increase, code size, battery life, and UI jank. These factors are especially important to consider for more constrained entry-level devices, to ensure that the mitigations perform well across the entire Android ecosystem.

The system-wide performance impact of a mitigation is also dependent on where that mitigation is enabled, as certain components are more performance-sensitive than others. For example, binder is one of the most used paths for interprocess communication, so even small additional overhead could significantly impact user experience on a device. On the other hand, video players only need to ensure that frames are rendered at the source framerate; if frames are rendered much faster than the rate at which they are displayed, additional overhead may be more acceptable.

Benchmarks, if available, can be extremely useful to evaluate the performance impact of a mitigation. If there are no benchmarks for a certain component, new ones should be created, for instance by calling impacted codec code to decode a media file. If this testing reveals unacceptable overhead, there are often a few options to address it:

  • Selectively disable the mitigation in performance-sensitive functions identified during benchmarks. A small number of functions are often responsible for a large part of the runtime overhead, so disabling the mitigation in those functions can maximize the security benefit while minimizing the performance cost. Here is an example of this in one of the media codecs. These exempted functions must be manually reviewed for bugs to reduce the risk of disabling the mitigation there.
  • Optimize the implementation of the mitigation to improve its performance. This often involves modifying the compiler. For example, our team has upstreamed optimizations to the Integer Overflow Sanitizer and the Bounds Sanitizer.
  • Certain mitigations, such as the Scudo allocator’s built-in robustness against heap-based vulnerabilities, have tunable parameters that can be tweaked to improve performance.
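The first option above, selectively disabling a mitigation in hot functions, can be sketched as a small selection problem: given the overhead a mitigation adds inside each function, exempt as few functions as possible while bringing the total under a budget. The function names, the overhead figures, and the budget below are hypothetical, and real decisions would also weigh how risky each exempted function is.

```python
def functions_to_exempt(overhead_by_function, budget):
    """Pick the fewest functions to exempt so the remaining overhead fits the budget.

    overhead_by_function maps a function name to the overhead (as a percentage
    of baseline runtime) that the mitigation adds inside that function.
    Returns (exempted_names, remaining_overhead).
    """
    remaining = sum(overhead_by_function.values())
    exempt = []
    # Exempt the most expensive functions first so that as few functions
    # as possible lose the protection of the mitigation.
    for name, cost in sorted(overhead_by_function.items(), key=lambda kv: -kv[1]):
        if remaining <= budget:
            break
        exempt.append(name)
        remaining -= cost
    return exempt, remaining


# Hypothetical benchmark results for a media codec.
measured = {"idct_block": 6.0, "parse_header": 0.5, "copy_row": 2.5, "read_bits": 1.0}

exempt, remaining = functions_to_exempt(measured, budget=2.0)
print(exempt, remaining)
```

Every name returned by such a selection still needs the manual review mentioned above, since the mitigation no longer covers those functions.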

Most of these improvements involve changes or contributions to the LLVM project. By working with upstream LLVM, these improvements have impact and benefit beyond Android. At the same time Android benefits from upstream improvements when others in the LLVM community make improvements as well.

Deployment and Support

There is more to consider when enabling a mitigation than its security benefit and performance cost, such as the cost of short-term deployment and long-term support.

Deployment Stability Considerations

One important issue is whether a mitigation can contain false positives. For example, if the Bounds Sanitizer produces an error, there is definitely an out-of-bounds access (although it might not be exploitable). But the Integer Overflow Sanitizer can produce false positives, as many integer overflows are harmless or even perfectly expected and correct.

It is thus important to consider the impact of a mitigation on the stability of the system. Whether a crash is due to a false positive or a legitimate security issue, it still disrupts the user experience and so is undesirable. This is another reason to carefully consider which components should have which mitigations, as crashes in some components are worse than others. If a mitigation causes a crash in a media codec, the user’s video playback will be stopped, but if netd crashes during an update, the phone could be bricked. For a mitigation like Bounds Sanitizer, where false positives are not an issue, we still need to perform extensive testing to ensure the device remains stable. Off-by-one errors, for example, may not crash during normal operation, but Bounds Sanitizer would abort execution and result in instability.

Another consideration is whether it is possible to enumerate everything a mitigation might break. For example, it is not easy to contain the risk of the Integer Overflow Sanitizer without extensive testing, as it is difficult to determine which overflows are intentional/benign (and thus should be allowed) and which could lead to vulnerabilities.

Support

We must consider not just issues caused by deploying mitigations but also how to support them long-term. This includes the developer time to integrate a mitigation into existing systems, enable and debug it, deploy it onto devices, and support it after launch. SELinux is a good example of this; it takes a significant amount of effort to write the policy for a new device, and even once enforcing mode is enabled, the policy must be supported for years as code changes and functionality is added or removed.

We try to make mitigations less disruptive and spread awareness of how they affect developers. This is done by making documentation available on source.android.com and by improving existing algorithms to reduce false positives. Making it easier to debug mitigations when something goes wrong reduces the developer maintenance burden that can accompany mitigations. For example, when developers found it difficult to identify UBSan errors, we enabled support for the UBSan Minimal Runtime by default in the Android build system. The minimal runtime itself was first upstreamed by others at Google specifically for this purpose. When the Integer Overflow Sanitizer aborts a program, the minimal runtime adds the following hint to the generic SIGABRT crash message:

    Abort message: 'ubsan: sub-overflow'

Developers who see this message then know to enable diagnostics mode, which prints out details about the crash:

    frameworks/native/services/surfaceflinger/SurfaceFlinger.cpp:2188:32: runtime error: unsigned integer overflow: 0 - 1 cannot be represented in type 'size_t' (aka 'unsigned long')

Similarly, upstream SELinux provides a tool called audit2allow that can be used to suggest rules to allow blocked behaviors:

    adb logcat -d | audit2allow -p policy

    #============= rmt ==============
    allow rmt kmem_device:chr_file { read write };

A debugging tool does not need to be perfect to be helpful; audit2allow does not always suggest the correct options, but for developers without detailed knowledge of SELinux it provides a strong starting point.
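To show the kind of mapping such a tool performs, here is a minimal Python sketch that turns one SELinux denial message into a candidate allow rule. It is not audit2allow's implementation, and it ignores everything the real tool handles beyond a single well-formed line. The sample denial (including its pid, inode, and file name) is hypothetical, shaped only to be consistent with the rule shown above.

```python
import re

# Matches the permission set, the source and target types, and the object class.
AVC_PATTERN = re.compile(
    r"avc:\s+denied\s+\{(?P<perms>[^}]*)\}.*?"
    r"scontext=\w+:\w+:(?P<source>\w+):\S+\s+"
    r"tcontext=\w+:\w+:(?P<target>\w+):\S+\s+"
    r"tclass=(?P<tclass>\w+)"
)


def suggest_rule(log_line):
    """Return a candidate allow rule for an avc denial, or None if the line does not match."""
    match = AVC_PATTERN.search(log_line)
    if match is None:
        return None
    perms = " ".join(match["perms"].split())
    return f"allow {match['source']} {match['target']}:{match['tclass']} {{ {perms} }};"


# Hypothetical denial line for illustration.
denial = (
    'avc: denied { read write } for pid=1234 comm="rmt" name="kmem" dev="tmpfs" '
    "ino=5678 scontext=u:r:rmt:s0 tcontext=u:object_r:kmem_device:s0 "
    "tclass=chr_file permissive=0"
)

print(suggest_rule(denial))
```

As with the real tool, a suggested rule is a starting point for review rather than something to apply blindly: granting the denied access is not always the right fix.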

Conclusion

With every Android release, our team works hard to balance security improvements that benefit the entire ecosystem with performance and stability, drawing heavily from the data that are available to us. We hope that this sheds some light on the particular challenges involved and the overall process that leads to mitigations introduced in each Android release.

Thank you to Jeff Vander Stoep for contributions to this blog post.

Learning to Reason Over Tables from Less Data

The task of recognizing textual entailment, also known as natural language inference, consists of determining whether a piece of text (the hypothesis) can be implied or contradicted (or neither) by another piece of text (the premise). While this problem is often considered an important test for the reasoning skills of machine learning (ML) systems and has been studied in depth for plain text inputs, much less effort has been put into applying such models to structured data, such as websites, tables, databases, etc. Yet, recognizing textual entailment is especially relevant whenever the contents of a table need to be accurately summarized and presented to a user, and is essential for high fidelity question answering systems and virtual assistants.

In "Understanding tables with intermediate pre-training", published in Findings of EMNLP 2020, we introduce the first pre-training tasks customized for table parsing, enabling models to learn better, faster and from less data. We build upon our earlier TAPAS model, which was an extension of the BERT bi-directional Transformer model with special embeddings to find answers in tables. Applying our new pre-training objectives to TAPAS yields a new state of the art on multiple datasets involving tables. On TabFact, for example, it reduces the gap between model and human performance by ~50%. We also systematically benchmark methods of selecting relevant input for higher efficiency, achieving 4x gains in speed and memory, while retaining 92% of the results. All the models for different tasks and sizes are released on our GitHub repo, where you can try them out yourself in a Colab notebook.

Textual Entailment
The task of textual entailment is more challenging when applied to tabular data than plain text. Consider, for example, a table from Wikipedia with some sentences derived from its associated table content. Assessing if the content of the table entails or contradicts the sentence may require looking over multiple columns and rows, and possibly performing simple numeric computations, like averaging, summing, differencing, etc.

A table together with some statements from TabFact. The content of the table can be used to support or contradict the statements.

Following the methods used by TAPAS, we encode the content of a statement and a table together, pass them through a Transformer model, and obtain a single number: the probability that the statement is entailed by the table (a low value means the statement is refuted).

The TAPAS model architecture uses a BERT model to encode the statement and the flattened table, read row by row. Special embeddings are used to encode the table structure. The vector output of the first token is used to predict the probability of entailment.
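The row-by-row flattening described above can be sketched as follows. This is a simplified illustration, not the TAPAS code: it splits on whitespace instead of using WordPiece tokenization, it produces only three of the extra id sequences, and the exact id conventions (0 for statement tokens, header cells in row 0, columns numbered from 1) are assumptions made for the example.

```python
def flatten_table(statement, header, rows):
    """Return parallel lists of tokens, segment ids, column ids, and row ids."""
    tokens, segments, columns, row_ids = [], [], [], []

    def add(token, segment, column, row):
        tokens.append(token)
        segments.append(segment)
        columns.append(column)
        row_ids.append(row)

    # The statement comes first, in segment 0 with column and row id 0.
    add("[CLS]", 0, 0, 0)
    for word in statement.split():
        add(word, 0, 0, 0)
    add("[SEP]", 0, 0, 0)

    # The table follows in segment 1, read row by row.
    # The header is treated as row 0; data rows start at 1.
    for row_index, row in enumerate([header] + rows):
        for column_index, cell in enumerate(row, start=1):
            for word in str(cell).split():
                add(word, 1, column_index, row_index)

    return tokens, segments, columns, row_ids


tokens, segments, columns, row_ids = flatten_table(
    "Norman ranks first",
    ["player", "rank"],
    [["Greg Norman", 1], ["Billy Mayfair", 2]],
)
for item in zip(tokens, segments, columns, row_ids):
    print(item)
```

In a model, each id sequence would be looked up in its own embedding table and the results added to the token embeddings, which is how the table structure survives the flattening.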

Because the only information in the training examples is a binary value (i.e., "correct" or "incorrect"), training a model to understand whether a statement is entailed or not is challenging and highlights the difficulty in achieving generalization in deep learning, especially when the provided training signal is scarce. Seeing isolated entailed or refuted examples, a model can easily pick up on spurious patterns in the data to make a prediction, for example the presence of the word "tie" in "Greg Norman and Billy Mayfair tie in rank", instead of truly comparing their ranks, which is what is needed to successfully apply the model beyond the original training data.

Pre-training Tasks
Pre-training tasks can be used to “warm-up” models by providing them with large amounts of readily available unlabeled data. However, pre-training typically includes primarily plain text and not tabular data. In fact, TAPAS was originally pre-trained using a simple masked language modelling objective that was not designed for tabular data applications. In order to improve the model performance on tabular data, we introduce two novel binary-classification pre-training tasks, called counterfactual and synthetic, which can be applied as a second stage of pre-training (often called intermediate pre-training).

In the counterfactual task, we source sentences from Wikipedia that mention an entity (person, place or thing) that also appears in a given table. Then, 50% of the time, we modify the statement by swapping the entity for another alternative. To make sure the statement is realistic, we choose a replacement among the entities in the same column in the table. The model is trained to recognize whether the statement was modified or not. This pre-training task includes millions of such examples, and although the reasoning they require is not complex, the statements typically still sound natural.
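A minimal sketch of how one counterfactual example could be built is shown below. It assumes the entity has already been located in both the sentence and the table, uses plain string replacement, and takes a random number generator as a parameter; the function name and the label convention (1 for unchanged, 0 for swapped) are choices made for this example rather than details from the paper.

```python
import random


def make_counterfactual(sentence, entity, column_values, rng):
    """Return (statement, label): label 1 means unchanged, 0 means the entity was swapped."""
    # Candidate replacements come from the same table column as the entity.
    alternatives = [value for value in column_values if value != entity]
    # Keep the original half of the time, or when there is nothing to swap in.
    if not alternatives or rng.random() < 0.5:
        return sentence, 1
    replacement = rng.choice(alternatives)
    return sentence.replace(entity, replacement), 0


sentence = "Greg Norman won the tournament by two strokes."
players = ["Greg Norman", "Billy Mayfair"]

for seed in range(4):
    print(make_counterfactual(sentence, "Greg Norman", players, random.Random(seed)))
```

Drawing the replacement from the same column is what keeps the altered statement plausible: a player is swapped for another player rather than for an arbitrary word.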

For the synthetic task, we follow a method similar to semantic parsing in which we generate statements using a simple set of grammar rules that require the model to understand basic mathematical operations, such as sums and averages (e.g., "the sum of earnings"), or to understand how to filter the elements in the table using some condition (e.g., "the country is Australia"). Although these statements are artificial, they help improve the numerical and logical reasoning skills of the model.
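The sketch below shows one way such statements could be generated and labeled automatically. The single sentence template, the three aggregations, and the way false statements are produced (adding a random offset to the true value) are simplifications assumed for this example; the grammar used in the paper is richer.

```python
import random

AGGREGATIONS = {
    "sum": sum,
    "average": lambda values: sum(values) / len(values),
    "count": len,
}


def evaluate(table, aggregation, number_column, filter_column, filter_value):
    """Apply the aggregation to number_column over rows that pass the filter."""
    selected = [row[number_column] for row in table if row[filter_column] == filter_value]
    return AGGREGATIONS[aggregation](selected)


def make_synthetic(table, number_column, filter_column, rng):
    """Return (statement, label): label 1 if the statement is true for the table."""
    aggregation = rng.choice(sorted(AGGREGATIONS))
    # The filter value is drawn from the table, so at least one row always matches.
    filter_value = rng.choice(sorted({row[filter_column] for row in table}))
    actual = evaluate(table, aggregation, number_column, filter_column, filter_value)
    # Half of the statements claim the true value, the rest a wrong one.
    if rng.random() < 0.5:
        claimed, label = actual, 1
    else:
        claimed, label = actual + rng.randint(1, 10), 0
    statement = (
        f"the {aggregation} of {number_column} when "
        f"{filter_column} is {filter_value} is {claimed}"
    )
    return statement, label


table = [
    {"country": "Australia", "earnings": 10},
    {"country": "Australia", "earnings": 30},
    {"country": "Spain", "earnings": 5},
]

for seed in range(3):
    print(make_synthetic(table, "earnings", "country", random.Random(seed)))
```

Because the label is computed from the table rather than annotated by hand, examples like these can be produced in large quantities for any table.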

Example instances for the two novel pre-training tasks. Counterfactual examples swap entities mentioned in a sentence that accompanies the input table for a plausible alternative. Synthetic statements use grammar rules to create new sentences that require combining the information of the table in complex ways.

Results
We evaluate the success of the counterfactual and synthetic pre-training objectives on the TabFact dataset by comparing to the baseline TAPAS model and to two prior models that have exhibited success in the textual entailment domain, LogicalFactChecker (LFC) and Structure Aware Transformer (SAT). The baseline TAPAS model exhibits improved performance relative to LFC and SAT, but the pre-trained model (TAPAS+CS) performs significantly better, achieving a new state of the art.

We also apply TAPAS+CS to question answering tasks on the SQA dataset, which requires that the model find answers from the content of tables in a dialog setting. The inclusion of CS objectives improves the previous best performance by more than 4 points, demonstrating that this approach also generalizes performance beyond just textual entailment.

Results on TabFact (left) and SQA (right). Using the synthetic and counterfactual datasets, we achieve new state-of-the-art results in both tasks by a large margin.

Data and Compute Efficiency
Another aspect of the counterfactual and synthetic pre-training tasks is that since the models are already tuned for binary classification, they can be applied without any fine-tuning to TabFact. We explore what happens to each of the models when trained only on a subset (or even none) of the data. Without looking at a single example, the TAPAS+CS model is competitive with a strong baseline, Table-BERT, and when only 10% of the data are included, the results are comparable to the previous state of the art.

Dev accuracy on TabFact relative to the fraction of the training data used.

A general concern when using large models such as this one on tables is that their high computational requirements make it difficult for them to parse very large tables. To address this, we investigate whether one can heuristically select subsets of the input to pass through the model in order to optimize its computational efficiency.

We conducted a systematic study of different approaches to filter the input and discovered that simple methods that select for word overlap between a full column and the subject statement give the best results. By dynamically selecting which tokens of the input to include, we can use fewer resources or work on larger inputs at the same cost. The challenge is doing so without losing important information and hurting accuracy. 
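A word-overlap heuristic of this kind can be sketched as follows: score each column by how many statement words appear anywhere in it, then keep the best-scoring columns that still fit in the token budget. The whitespace tokenization, the tie-breaking order, and the greedy budget handling are assumptions made for this example and may differ from the methods evaluated in the paper.

```python
def select_columns(statement, header, rows, max_tokens):
    """Return the indices of the columns to keep, in their original order."""
    statement_words = set(statement.lower().split())
    # Tokens left for the table once the statement itself is accounted for.
    budget = max_tokens - len(statement.split())

    scored = []
    for index, name in enumerate(header):
        cells = [name] + [str(row[index]) for row in rows]
        words = [word for cell in cells for word in cell.lower().split()]
        overlap = len(statement_words & set(words))
        scored.append((overlap, len(words), index))

    # Highest overlap first; ties go to cheaper columns, then to earlier ones.
    scored.sort(key=lambda item: (-item[0], item[1], item[2]))

    kept = []
    for overlap, cost, index in scored:
        if cost <= budget:
            kept.append(index)
            budget -= cost
    return sorted(kept)


statement = "greg norman earned more than billy mayfair"
header = ["player", "country", "earnings"]
rows = [["Greg Norman", "Australia", "120"], ["Billy Mayfair", "United States", "90"]]

print(select_columns(statement, header, rows, max_tokens=16))
print(select_columns(statement, header, rows, max_tokens=12))
```

The example also shows the risk noted above: with the tighter budget the earnings column is dropped, even though the statement cannot be verified without it.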

For instance, the models discussed above all use sequences of 512 tokens, which is around the normal limit for a transformer model (although recent efficiency methods like the Reformer or Performer are proving effective in scaling the input size). The column selection methods we propose here can allow for faster training while still achieving high accuracy on TabFact. For 256 input tokens we get a very small drop in accuracy, but the model can now be pre-trained, fine-tuned and make predictions up to two times faster. With 128 tokens the model still outperforms the previous state-of-the-art model, with an even more significant speed-up — 4x faster across the board.

Accuracy on TabFact using different sequence lengths, by shortening the input with our column selection method.

Using both the column selection method we proposed and the novel pre-training tasks, we can create table parsing models that need less data and less compute power to obtain better results.

We have made the new models and pre-training techniques available at our GitHub repo, where you can try them out yourself in Colab. In order to make this approach more accessible, we also shared models of varying sizes all the way down to “tiny”. It is our hope that these results will help spur development of table reasoning among the broader research community.

Acknowledgements
This work was carried out by Julian Martin Eisenschlos, Syrine Krichene and Thomas Müller from our Language Team in Zürich. We would like to thank Jordan Boyd-Graber, Yasemin Altun, Emily Pitler, Benjamin Boerschinger, Srini Narayanan, Slav Petrov, William Cohen and Jonathan Herzig for their useful comments and suggestions.

Source: Google AI Blog