Category Archives: Online Security Blog

The latest news and insights from Google on security and safety on the Internet

How the Atheris Python Fuzzer Works

On Friday, we announced that we’ve released the Atheris Python fuzzing engine as open source. In this post, we’ll briefly talk about its origins, and then go into lots more detail on how it works.

The Origin Story 

Every year since 2013, Google has held a “Fuzzit”, an internal event where Googlers write fuzzers for their code or open source software. By October 2019, however, we’d already written fuzzers for most of the open-source C/C++ code we use. So for that Fuzzit, the author of this post wrote a Python fuzzing engine based on libFuzzer. Since then, over 50 Python fuzzers have been written at Google, and countless bugs have been reported and fixed.

Originally, this fuzzing engine could only fuzz native extensions, as it did not support Python coverage. But over time, the fuzzer morphed into Atheris, a high-performance fuzzing engine that supports both native and pure-Python fuzzing.

A Bit of Background 

Atheris is a coverage-guided fuzzer. To use Atheris, you specify an “entry point” via atheris.Setup(). Atheris will then rapidly call this entry point with different inputs, hoping to produce a crash. While doing so, Atheris will monitor how the program execution changes based on the input, and will attempt to find new and interesting code paths. This allows Atheris to find unexpected and buggy behavior very effectively.

import atheris

import sys


def TestOneInput(data):  # Our entry point

  if data == b"bad":

    raise RuntimeError("Badness!")

    

atheris.Setup(sys.argv, TestOneInput)

atheris.Fuzz()


Atheris is a native Python extension, and uses libFuzzer to provide its code coverage and input generation capabilities. The entry point passed to atheris.Setup() is wrapped in the C++ entry point that’s actually passed to libFuzzer. This wrapper will then be invoked by libFuzzer repeatedly, with its data proxied back to Python.


Python Code Coverage 


Atheris is a native Python extension, and is typically compiled with libFuzzer linked in. When you initialize Atheris, it registers a tracer with CPython to collect information about Python code flow. This tracer can keep track of every line reached and every function executed.


We need to get this trace information to libFuzzer, which is responsible for generating code coverage information. There’s a problem, however: libFuzzer assumes that the amount of code is known at compile-time. The two primary code coverage mechanisms are __sanitizer_cov_pcs_init (which registers a set of program counters that might be visited) and __sanitizer_cov_8bit_counters_init (which registers an array of booleans that are to be incremented when a basic block is visited). Both of these need to know at initialization time how many program counters or basic blocks exist. But in Python, that isn’t possible, since code isn’t loaded until well after Python starts. We can’t even know it when we start the fuzzer: it’s possible to dynamically import code later, or even generate code on the fly.


Thankfully, libFuzzer supports fuzzing shared libraries loaded at runtime. Both __sanitizer_cov_pcs_init and __sanitizer_cov_8bit_counters_init are able to be safely called from a shared library in its constructor (called when the library is loaded). So, Atheris simulates loading shared libraries! When tracing is initialized, Atheris first calls those functions with an array of 8-bit counters and completely made-up program counters. Then, whenever a new Python line is reached, Atheris allocates a PC and 8-bit counter to that line; Atheris will always report that line the same way from then on. Once Atheris runs out of PCs and 8-bit counters, it simply loads a new “shared library” by calling those functions again. Of course, exponential growth is used to ensure that the number of shared libraries doesn’t become excessive.


What's Special about Python 3.8+?


In the README, we advise users to use Python 3.8+ where possible. This is because Python 3.8 added a new feature: opcode tracing. Not only can we monitor when every line is visited and every function is called, but we can actually monitor every operation that Python performs, and what arguments it uses. This allows Atheris to find its way through if statements much better.


When a COMPARE_OP opcode is encountered, indicating a boolean comparison between two values, Atheris inspects the types of the values. If the values are bytes or Unicode, Atheris is able to report the comparison to libFuzzer via __sanitizer_weak_hook_memcmp. For integer comparison, Atheris uses the appropriate function to report integer comparisons, such as __sanitizer_cov_trace_cmp8.


In recent Python versions, a Unicode string is actually represented as an array of 1-byte, 2-byte, or 4-byte characters, based on the size of the largest character in the string. The obvious solution for coverage is to:

  • first compare two strings for equivalent character size and report it as an integer comparison with __sanitizer_cov_trace_cmp8
  • Second, if they’re equal, call __sanitizer_weak_hook_memcmp to report the actual string comparison
However, performance measurements discovered that the surprising best strategy is to convert both strings to utf-8, then compare those with __sanitizer_weak_hook_memcmp. Even with the performance overhead of conversion, libFuzzer makes progress much faster.

Building Atheris

Most of the effort to release Atheris was simply making it build outside of Google’s environment. At Google, building a Python project builds its entire universe of dependencies, including the Python interpreter. This makes it trivial for us to use libFuzzer with our projects - we just compile it into our Python interpreter, along with Address Sanitizer or whatever other features we want.

Unfortunately, outside of Google, it’s not that simple. We had many false starts regarding how to link libFuzzer with Atheris, including making it a standalone shared object, preloading it, etc. We eventually settled on linking it into the Atheris shared object, as it provides the best experience for most users.

However, this strategy still required us to make minor changes to libFuzzer, to allow it to be called as a library. Since most users won’t have the latest Clang and it typically takes several years for distributions to update their Clang installation, actually getting this new libFuzzer version would be quite difficult for most people, making Atheris installation a hassle. To avoid this, we actually patch libFuzzer if it’s too old. Atheris’s setup.py will detect an out-of-date libFuzzer, make a copy of it, mark its fuzzer entry point as visible, and inject a small wrapper to allow it to be called via the name LLVMFuzzerRunDriver. If the libFuzzer is sufficiently new, we just call it using LLVMFuzzerRunDriver directly.

The true problem comes from fuzzing native extensions with sanitizers. In theory, fuzzing a native extension with Atheris should be trivial - just build it with -fsanitize=fuzzer-no-link, and make sure Atheris is loaded first. Those magic function calls that Clang injected will point to the libFuzzer symbols inside Atheris. When just fuzzing a native extension without sanitizers, it actually is that simple. Everything works. Unfortunately, sanitizers make everything more complex.

When using a sanitizer like Address Sanitizer with Atheris, it’s necessary to LD_PRELOAD the sanitizer’s shared object. ASan requires that it be loaded first, before anything else; it must either be preloaded, or statically linked into the executable (in this case, the Python interpreter). ASan and UBSan define many of the same code coverage symbols as libFuzzer. In typical libFuzzer usage, this isn’t an issue, since ASan/UBSan declare those symbols weak; the libFuzzer ones take precedence. But when libFuzzer is loaded in a shared object later, that doesn’t work. The symbols from ASan/UBSan have already been loaded via LD_PRELOAD, and coverage information therefore goes to those libraries, leaving libFuzzer very broken.

The only good way to solve this is to link libFuzzer into python itself, instead of Atheris. Since it’s therefore part of the proper executable rather than a shared object that’s dynamically loaded later, symbol resolution works correctly and libFuzzer symbols take precedence. This is nontrivial. We’ve provided documentation about this, and a script to build a modified CPython 3.8.6. These scripts will use the same possibly-patched libFuzzer as Atheris.

Why is it called Atheris? 

Atheris Hispida, or the “Hairy bush viper”, is the closest thing that exists to a fuzzy Python.

Announcing Bonus Rewards for V8 Exploits

Starting today, the Chrome Vulnerability Rewards Program is offering a new bonus for reports which demonstrate exploitability in V8, Chrome’s JavaScript engine. We have historically had many great V8 bugs reported (thank you to all of our reporters!) but we'd like to know more about the exploitability of different V8 bug classes, and what mechanisms are effective to go from an initial bug to a full exploit. That's why we're offering this additional reward for bugs that show how a V8 vulnerability could be used as part of a real world attack.

In the past, exploits had to be fully functional to be rewarded at our highest tier, high-quality report with functional exploit. Demonstration of how a bug might be exploited is one factor that the panel may use to determine that a report is high-quality, our second highest tier, but we want to encourage more of this type of analysis. This information is very useful for us when planning future mitigations, making release decisions, and fixing bugs faster. We also know it requires a bit more effort for our reporters, and that effort should be rewarded. For the time being this only applies to V8 bugs, but we’re curious to see what our reporters come up with!

The full details are available on the Chrome VRP rules page. At a high-level, we’re offering increased reward amounts, up to double, for qualifying V8 bugs.

The following table shows the updated reward amounts for reports qualifying for this new bonus. These new, higher values replace the normal reward. If a bug in V8 doesn’t fit into one of these categories, it may still qualify for an increased reward at the panel’s discretion.

[1] Baseline reports are unable to meet the requirements to qualify for this special reward.

So what does a report need to do to demonstrate that a bug is likely exploitable? Any V8 bug report which would have previously been rewarded at the high-quality report with functional exploit level will likely qualify with no additional effort from the reporter. By definition, these demonstrate that the issue was exploitable. V8 reports at the high-quality level may also qualify if they include evidence that the bug is exploitable as part of their analysis. See the rules page for more information about our reward levels.

The following are some examples of how a report could demonstrate that exploitation is likely, but any analysis or proof of concept will be considered by the panel:

  • Executing shellcode from the context of Chrome or d8 (V8’s developer shell)
  • Creating an exploit primitive that allows arbitrary reads from or writes to specific addresses or attacker-controlled offsets
  • Demonstrating instruction pointer control
  • Demonstrating an ASLR bypass by computing the memory address of an object in a way that’s exposed to script
  • Providing analysis of how a bug could lead to type confusion with a JSObject

For example reports, see issues 914736 and 1076708.

We’d like to thank all of our VRP reporters for helping us keep Chrome users safe! We look forward to seeing what you find.

-The Chrome Vulnerability Rewards Panel

OpenTitan at One Year: the Open Source Journey to Secure Silicon

During the past year, OpenTitan has grown tremendously as an open source project and is on track to provide transparent, trustworthy, and cost-free security to the broader silicon ecosystem. OpenTitan, the industry’s first open source silicon root of trust, has rapidly increased engineering contributions, added critical new partners, selected our first tapeout target, and published a comprehensive logical security model for the OpenTitan silicon, among other accomplishments.

OpenTitan by the Numbers 

OpenTitan has doubled many metrics in the year since our public launch: in design size, verification testing, software test suites, documentation, and unique collaborators at least. Crucially, this growth has been both in the design verification collateral required for high volume production-quality silicon, as well as the digital design itself, a first for any open source silicon project.
  • More than doubled the number of commits at launch: from 2,500 to over 6,100 (across OpenTitan and the Ibex RISC-V core sub-project).
  • Grew to over 141K lines of code (LOC) of System Verilog digital design and verification.
  • Added 13 new IP blocks to grow to a total to 29 distinct hardware units.
  • Implemented 14 Device Interface Functions (DIFs) for a total 15 KLOC of C11 source code and 8 KLOC of test software.
  • Increased our design verification suite to over 66,000 lines of test code for all IP blocks.
  • Expanded documentation to over 35,000 lines of Markdown.
  • Accepted contributions from 52 new unique contributors, bringing our total to 100.
  • Increased community presence as shown by an aggregate of over 1,200 Github stars between OpenTitan and Ibex.

One year of OpenTitan and Ibex growth on GitHub: the total number of commits grew from 2,500 to over 6,100.


High quality development is one of OpenTitan’s core principles. Besides our many style guides, we require thorough documentation and design verification for each IP block. Each piece of hardware starts with auto-generated documentation to ensure consistency between documentation and design, along with extensive, progressively improving, design verification as it advances through the OpenTitan hardware stages to reach tapeout readiness.

One year of growth in Design Verification: from 30,000 to over 65,000 lines of testing source code. Each color represents design verification for an individual IP block.


Innovating for Open Silicon Development

Besides writing code, we have made significant advances in developing processes and security framework for high quality, secure open source silicon development. Design success is not just measured by the hardware, highly functional software and a firm contract between the two, with well-defined interfaces and well-understood behavior, play an important role.

OpenTitan’s hardware-software contract is realized by our DIF methodology, yet another way in which we ensure hardware IP quality. DIFs are a form of hardware-software co-design and the basis of our chip-level design verification testing infrastructure. Each OpenTitan IP block requires a style guide-compliant DIF, and this year we implemented 14 DIFs for a total 15 KLOC of C11 source code and 8 KLOC of tests.

We also reached a major milestone by publishing an open Security Model for a silicon root of trust, an industry first. This comprehensive guidance demonstrates how OpenTitan provides the core security properties required of a secure root of trust. It covers provisioning, secure boot, device identity, and attestation, and our ownership transfer mechanism, among other topics.

Expanding the OpenTitan Ecosystem 

Besides engineering effort and methodology development, the OpenTitan coalition added two new Steering Committee members in support of lowRISC as an open source not-for-profit organization. Seagate, a leader in storage technology, and Giesecke and Devrient Mobile Security, a major producer of certified secure systems. We also chartered our Technical Committee to steer technical development of the project. Technical Committee members are drawn from across our organizational and individual contributors, approving 9 technical RFCs and adding 11 new project committers this past year. 

On the strength of the OpenTitan open source project’s engineering progress, we are excited to announce today that Nuvoton and Google are collaborating on the first discrete OpenTitan silicon product. Much like the Linux kernel is itself not a complete operating system, OpenTitan’s open source design must be instantiated in a larger, complete piece of silicon. We look forward to sharing more on the industry’s first open source root of trust silicon tapeout in the coming months.

Onward to 2021

OpenTitan’s future is bright, and as a project it fully demonstrates the potential for open source design to enable collaboration across disparate, geographically far flung teams and organizations, to enhance security through transparency, and enable innovation in the open. We could not do this without our committed project partners and supporters, to whom we owe all this progress: Giesecke and Devrient Mobile Security, Western Digital, Seagate, the lowRISC CIC, Nuvoton, ETH Zürich, and many independent contributors.

Interested in contributing to the industry's first open source silicon root of trust? Contact us here.

Improving open source security during the Google summer internship program

Every summer, Google’s Information Security Engineering (ISE) team hosts a number of interns who work on impactful projects to help improve security at Google. This year was no different—well, actually it was a little bit different because internships went virtual. But our dedication to security was still front and center as our intern team worked on improvements in open source software.

Open source software is the foundation of many modern software products. Over the years, developers increasingly have relied on reusable open source components for their applications. It is paramount that these open source components are secure and reliable. 

The focus of this year’s intern projects reflects ISE’s general approach of tackling security issues at scale, and can be split into three main areas: 
  • Vulnerability research: Finding new vulnerabilities, developing infrastructure to search for known bug classes at scale, and experimenting with new detection approaches.
  • Mitigation and hardening: Developing hardening approaches with the goal of fully eliminating specific vulnerability classes or mitigating their impact.
  • Security education: Sharing knowledge to increase awareness among developers and to help train security engineers.
Vulnerability research

Fuzzing is a highly effective method of uncovering memory-corruption vulnerabilities in C and C++ applications. With OSS-Fuzz, Google provides a platform for fuzzing open source software. One of this year’s intern projects ported internal fuzz targets to OSS-Fuzz, which led to the discovery of new bugs. In this context, our interns experimented with setting up fuzzing for difficult fuzz targets such as the state machines of Memcached and Redis. Additionally, they added new fuzzers for complicated targets like nginx, PostgreSQL, and Envoy, a widely used cloud-native high-performance proxy. 

State-of-the-art fuzzing frameworks like AFL, libFuzzer, and Honggfuzz leverage feedback such as code coverage to guide the fuzzer. Recent academic papers suggest that symbolic execution can complement existing fuzzing frameworks to find bugs that are difficult for random mutation-based fuzzers to find. Our interns evaluated the possibility of using KLEE to augment libFuzzer and AFL. In particular, they found that adding KLEE to existing fuzzing frameworks provides benefits for fuzz targets such as sqlite and lcms. However, at this point in time, there is still work to be done before symbolic execution can be performed at scale (e.g., in OSS-Fuzz).

In addition to finding memory-corruption vulnerabilities, fuzzing can help find logic vulnerabilities. This can be difficult as it requires understanding the semantics of the target application. One approach uses differential testing to find different behaviors in applications that are supposed to behave in the same way. One of our intern projects this summer looked into leveraging differential fuzzing to expose logic vulnerabilities and found a number of cases where YAML parsers handle edge cases differently.

Other intern projects this summer focused on the search for application-specific vulnerabilities. Our interns aimed to discover common Google Kubernetes Engine (GKE) misconfigurations. The recently launched GKE-Auditor, created by one of our interns, implements 18 detectors to find misconfigurations in Node isolation, role-based access control, and pod security policies. Another project implemented regression tests for the Google Compute Engine (GCE) metadata server

Finally, one intern project looked into improving Visual Studio Code (VSCode), a popular cross-platform code editor that is based on Electron which combines the Chromium rendering engine and the Node.js runtime. VSCode can be vulnerable to DOM cross-site scripting attacks. For this reason, our intern’s work centered on making VSCode Trusted Types-compliant by using and contributing to the static and dynamic analysis tools to find violations. This work not only led to an improvement of VSCode, but also of Chromium.

Hardening 

Because finding all vulnerabilities is an impossible task, we always look for ways to mitigate their impact or eliminate certain vulnerability classes completely. The main focus of this year’s hardening projects were to enable security enhancements for major web frameworks and to provide sandboxing for popular libraries written in memory-unsafe languages such as C and C++.

In an effort to make the web more secure, our intern team added security enhancements including Content Security Policy (CSP), Fetch Metadata Request Headers, Cross-Origin Opener Policy (COOP), and Cross-Origin Embedder Policy (COEP) to a number of existing web frameworks (our previous post provides a good overview of these mitigations).

As a result, these web security features were implemented in a number of common application frameworks, including Apache Struts [CSP, COOP/COEP], Apache Wicket [Fetch Metadata, COOP/COEP], .NET Core [CSP], Django [Trusted Types, COOP], and WordPress [Fetch Metadata, CSP]. We're looking forward to working with open source maintainers to further develop and integrate these defenses into more popular frameworks!

Sandboxing 

Executing native code that comes from untrusted origins or processes data from untrusted sources is risky because it may be malicious or contain vulnerabilities. Sandboxing mitigates these risks by executing code in a low-privileged environment.This process often requires modifying the interfaces of third-party libraries and setting up their execution environment. Sandboxed API is a framework to help with these tasks that is used at Google. 

Our interns also worked on providing reusable sandboxes for popular open source libraries such as curl, OpenJPEG, LoadPNG, LibUV, and libTIFF. Now, anyone who wants to use these libraries to process untrusted data can do so safely.

Education

Capture the flag (CTF) competitions are useful for transferring security knowledge and training security engineers. The kCTF project provides a Kubernetes-based infrastructure which offers a hardened environment to securely deploy CTF tasks and isolate them from each other. One intern project added a number of improvements to the documentation including enabling a version control to allow multiple authors to work on one challenge and simplifingkCTF’s usage.

We would like to thank all of our interns for their hard work this summer! For more information on the Google internship program and other student opportunities, check out careers.google.com/students.

Fostering research on new web security threats

The web is an ecosystem built on openness and composability. It is an excellent platform for building capable applications, and it powers thousands of services created and maintained by engineers at Google that are depended on by billions of users. However, the web's open design also allows unrelated applications to sometimes interact with each other in ways which may undermine the platform's security guarantees.

Increasingly, security issues discovered in modern web applications hinge upon the misuse of long-standing web platform behaviors, allowing unsavory sites to reveal information about the user or their data in other web applications. This class of issues, broadly referred to as cross-site leaks (XS-Leaks), poses interesting challenges for security engineers and web browser developers due to a diversity of attacks and the complexity of building comprehensive defenses.

To promote a better understanding of these issues and protect the web from them, today marks the launch of the XS-Leaks wiki—an open knowledge base to which the security community is invited to participate, and where researchers can share information about new attacks and defenses.

The XS-Leaks wiki 

Available at xsleaks.dev (code on GitHub), the wiki explains the principles behind cross-site leaks, discusses common attacks, and proposes defense mechanisms aimed at mitigating these attacks. The wiki is composed of smaller articles that showcase the details of each cross-site leak, their implications, proof-of-concept code to help demonstrate the issue, and effective defenses. 

To improve the state of web security, we're inviting the security community to work with us on expanding the XS-Leaks wiki with information about new offensive and defensive techniques.

Defenses 

An important goal of the wiki is to help web developers understand the defense mechanisms offered by web browsers that can comprehensively protect their web applications from various kinds of cross-site leaks. 

Each attack described in the wiki is accompanied by an overview of security features which can thwart or mitigate it; the wiki aims to provide actionable guidance to assist developers in the adoption of new browser security features such as Fetch Metadata Request Headers, Cross-Origin Opener Policy, Cross-Origin Resource Policy, and SameSite cookies.

The Security Team at Google has benefited from over a decade of productive collaboration with security experts and browser engineers to improve the security of the web platform. We hope this new resource encourages further research into creative attacks and robust defenses for a major class of web security threats. We're excited to work together with the community to continue making the web safer for all users.

Special thanks to Manuel Sousa for starting the wiki as part of his internship project at Google, and to the contributors to the xsleaks GitHub repository for their original research in this area.

Announcing our open source security key test suite

Security keys and your phone’s built-in security keys are reshaping the way users authenticate online. These technologies are trusted by a growing number of websites to provide phishing-resistant two-factor authentication (2FA). To help make sure that next generation authentication protocols work seamlessly across the internet, we are committed to partnering with the ecosystem and providing essential technologies to advance state-of-the-art authentication for everyone. So, today we are releasing a new open source security key test suite

The protocol powering security keys


Under the hood, roaming security keys are powered by the FIDO Alliance CTAP protocols, the part of FIDO2 that ensures a seamless integration between your browser and security key. Whereas the security-key user experience aims to be straightforward, the CTAP protocols themselves are fairly complex. This is due to the broad range of authentication use cases the specification addresses: including websites, operating systems, and enterprise credentials. As the protocol specification continues to evolve—there is already a draft of CTAP 2.1—corner cases that can cause interoperability problems are bound to appear.

Building a test suite  

We encountered many of those tricky corner cases while implementing our open-source security-key firmware OpenSK and decided to create a comprehensive test suite to ensure all our new firmware releases handle them correctly. Over the last two years, our test suite grew to include over 80 tests that cover all the CTAP2 features.

Strengthening the ecosystem 

A major strength of the security key ecosystem is that the FIDO Alliance is an industry consortium with many participating vendors providing a wide range of distinct security keys catering to all users' needs. The FIDO Alliance offers testing for conformance to the current specifications. Those tests are a prerequisite to passing the interoperability tests that are required for a security key to become FIDO Certified. Our test suite complements those official tools by covering additional scenarios and in-market corner cases that are outside the scope of the FIDO Alliance’s testing program.

Back in March 2020, we demonstrated our test suite to the FIDO Alliance members and offered to extend testing to all FIDO2 keys. We got an overwhelmingly positive response from the members and have been working with many security key vendors since then to help them make the best use of our test suite.

Overall, the initial round of the tests on several keys has yielded promising results and we are actively collaborating with many vendors on building on those results to improve future keys.

Open-sourcing our test suite 

Today we are making our test suite open source to allow security key vendors to directly integrate it into their testing infrastructure and benefit from increased testing coverage. Moving forward, we are excited to keep collaborating with the FIDO Alliance, its members, the hardware security key industry and the open source community to extend our test suite to improve its coverage and make it a comprehensive tool that the community can rely on to ensure key interoperability. In the long term, it is our hope that strengthening the community testing capabilities will ultimately benefit all security key users by helping ensure they have a consistent experience no matter which security keys they are using.

Acknowledgements 

We thank our collaborators: Adam Langley, Alexei Czeskis, Arnar Birgisson, Borbala Benko, Christiaan Brand, Dirk Balfanz, Guillaume Endignoux, Jeff Hodges, Julien Cretin, Mark Risher, Oxana Comanescu, Tadek Pietraszek and all the security key vendors that worked with us.

Privacy-preserving features in the Mobile Driving License

In the United States and other countries a Driver's License is not only used to convey driving privileges, it is also commonly used to prove identity or personal details.

Presenting a Driving License is simple, right? You hand over the card to the individual wishing to confirm your identity (the so-called “Relying Party” or “Verifier”); they check the security features of the plastic card (hologram, micro-printing, etc.) to ensure it’s not counterfeit; they check that it’s really your license, making sure you look like the portrait image printed on the card; and they read the data they’re interested in, typically your age, legal name, address etc. Finally, the verifier needs to hand back the plastic card.

Most people are so familiar with this process that they don’t think twice about it, or consider the privacy implications. In the following we’ll discuss how the new and soon-to-be-released ISO 18013-5 standard will improve on nearly every aspect of the process, and what it has to do with Android.

Mobile Driving License ISO Standard

The ISO 18013-5 “Mobile driving licence (mDL) application” standard has been written by a diverse group of people representing driving license issuers (e.g. state governments in the US), relying parties (federal and state governments, including law enforcement), academia, industry (including Google), and many others. This ISO standard allows for construction of Mobile Driving License (mDL) applications which users can carry in their phone and can use instead of the plastic card.

Instead of handing over your plastic card, you open the mDL application on your phone and press a button to share your mDL. The Verifier (aka “Relying Party”) has their own device with an mDL reader application and they either scan a QR code shown in your mDL app or do an NFC tap. The QR code (or NFC tap) conveys an ephemeral cryptographic public key and hardware address the mDL reader can connect to.

Once the mDL reader obtains the cryptographic key it creates its own ephemeral keypair and establishes an encrypted and authenticated, secure wireless channel (BLE, Wifi Aware or NFC)). The mDL reader uses this secure channel to request data, such as the portrait image or what kinds of vehicles you're allowed to drive, and can also be used to ask more abstract questions such as “is the holder older than 18?”

Crucially, the mDL application can ask the user to approve which data to release and may require the user to authenticate with fingerprint or face — none of which a passive plastic card could ever do.

With this explanation in mind, let’s see how presenting an mDL application compares with presenting a plastic-card driving license:

  • Your phone need not be handed to the verifier, unlike your plastic card. The first step, which requires closer contact to the Verifier to scan the QR code or tap the NFC reader, is safe from a data privacy point of view, and does not reveal any identifying information to the verifier. For additional protection, mDL apps will have the option of both requiring user authentication before releasing data and then immediately placing the phone in lockdown mode, to ensure that if the verifier takes the device they cannot easily get information from it.
  • All data is cryptographically signed by the Issuing Authority (for example the DMV who issued the mDL) and the verifier's app automatically validates the authenticity of the data transmitted by the mDL and refuses to display inauthentic data. This is far more secure than holograms and microprinting used in plastic cards where verification requires special training which most (human) verifiers don't receive. With most plastic cards, fake IDs are relatively easy to create, especially in an international context, putting everyone’s identity at risk.
    • The amount of data presented by the mDL is minimized — only data the user elects to release, either explicitly via prompts or implicitly via e.g. pre-approval and user settings, is released. This minimizes potential data abuse and increases the personal safety of users.

      For example, any bartender who checks your mDL for the sole purpose of verifying you’re old enough to buy a drink needs only a single piece of information which is whether the holder is e.g. older than 21, yes or no. Compared to the plastic card, this is a huge improvement; a plastic card shows all your data even if the verifier doesn’t need it.

      Additionally, all of this information is available via a 2D barcode on the back so if you use your plastic card driving license to buy beer, tobacco, or other restricted items at a store it’s common in some states for the cashier to scan your license. In some cases, this means you may get advertising in the mail but they may sell your identifying information to the highest bidder or, worst case, leak their whole database.

These are some of the reasons why we think mDL is a big win for end users in terms of privacy.

One commonality between plastic-card driving licences and the mDL is how the relying party verifies that the person presenting the license is the authorized holder. In both cases, the verifier manually compares the appearance of the individual against a portrait photo, either printed on the plastic or transmitted electronically and research has shown that it’s hard for individuals to match strangers to portrait images.

The initial version of ISO 18013-5 won’t improve on this but the ISO committee working on the standard is already investigating ways to utilize on-device biometrics sensors to perform this match in a secure and privacy-protecting way. The hope is that improved fidelity in the process helps reduce unauthorized use of identity documents.

mDL support in Android

Through facilities such as hardware-based Keystore, Android already offers excellent support for security and privacy-sensitive applications and in fact it’s already possible to implement the ISO 18013-5 standard on Android without further platform changes. Many organizations participating in the ISO committee have already implemented 18013-5 Android apps.

That said, with purpose-built support in the operating system it is possible to provide better security and privacy properties. Android 11 includes the Identity Credential APIs at the Framework level along with a Hardware Abstraction Layer interface which can be implemented by Android OEMs to enable identity credential support in Secure Hardware. Using the Identity Credential API, the Trusted Computing Base of mDL applications does not include the application or even Android itself. This will be particularly important for future versions where the verifier must trust the device to identify and authenticate the user, for example through fingerprint or face matching on the holder's own device. It’s likely such a solution will require certified hardware and/or software and certification is not practical if the TCB includes the hundreds of millions of lines of code in Android and the Linux kernel.

One advantage of plastic cards is that they don't require power or network communication to be useful. Putting all your licenses on your phone could seem inconvenient in cases where your device is low on battery, or does not have enough battery life to start. The Android Identity Credential HAL therefore provides support for a mode called Direct Access, where the license is still available through an NFC tap even when the phone's battery is too low to boot it up. Device makers can implement this mode, but it will require hardware support that will take several years to roll out.

For devices without the Identity Credential HAL, we have an Android Jetpack which implements the same API and works on nearly every Android device in the world (API level 24 or later). If the device has hardware-backed Identity Credential support then this Jetpack simply forwards calls to the platform API. Otherwise, an Android Keystore-backed implementation will be used. While the Android Keystore-backed implementation does not provide the same level of security and privacy, it is perfectly adequate for both holders and issuers in cases where all data is issuer-signed. Because of this, the Jetpack is the preferred way to use the Identity Credential APIs. We also made available sample open-source mDL and mDL Reader applications using the Identity Credential APIs.

Conclusion

Android now includes APIs for managing and presenting with identity documents in a more secure and privacy-focused way than was previously possible. These can be used to implement ISO 18013-5 mDLs but the APIs are generic enough to be usable for other kinds of electronic documents, from school ID or bonus program club cards to passports.

Additionally, the Android Security and Privacy team actively participates in the ISO committees where these standards are written and also works with civil liberties groups to ensure it has a positive impact on our end users.

Fuzzing internships for Open Source Software

Open source software is the foundation of many modern software products. Over the years, developers increasingly have relied on reusable open source components for their applications. It is paramount that these open source components are secure and reliable, as weaknesses impact those that build upon it.

Google cares deeply about the security of the open source ecosystem and recently launched the Open Source Security Foundation with other industry partners. Fuzzing is an automated testing technique to find bugs by feeding unexpected inputs to a target program. At Google, we leverage fuzzing at scale to find tens of thousands of security vulnerabilities and stability bugs. This summer, as part of Google’s OSS internship initiative, we hosted 50 interns to improve the state of fuzz testing in the open source ecosystem.

The fuzzing interns worked towards integrating new projects and improving existing ones in OSS-Fuzz, our continuous fuzzing service for the open source community (which has 350+ projects, 22,700 bugs, 89% fixed). Several widely used open source libraries including but not limited to nginx, postgresql, usrsctp, and openexr, now have continuous fuzzing coverage as a result of these efforts.

Another group of interns focused on improving the security of the Linux kernel. syzkaller, a kernel fuzzing tool from Google, has been instrumental in finding kernel vulnerabilities in various operating systems. The interns were tasked with improving the fuzzing coverage by adding new descriptions to syzkaller like ip tunnels, io_uring, and bpf_lsm for example, refining the interface description language, and advancing kernel fault injection capabilities.

Some interns chose to write fuzzers for Android and Chrome, which are open source projects that billions of internet users rely on. For Android, the interns contributed several new fuzzers for uncovered areas - network protocols such as pppd and dns, audio codecs like monoblend, g722, and android framework. On the Chrome side, interns improved existing blackbox fuzzers, particularly in the areas: DOM, IPC, media, extensions, and added new libprotobuf-based fuzzers for Mojo.

Our last set of interns researched quite a few under-explored areas of fuzzing, some of which were fuzzer benchmarking, ML based fuzzing, differential fuzzing, bazel rules for build simplification and made useful contributions.

Over the course of the internship, our interns have reported over 150 security vulnerabilities and 750 functional bugs. Given the overall success of these efforts, we plan to continue hosting fuzzing internships every year to help secure the open source ecosystem and teach incoming open source contributors about the importance of fuzzing. For more information on the Google internship program and other student opportunities, check out careers.google.com/students. We encourage you to apply.

Privacy-Preserving Smart Input with Gboard

Google Keyboard (a.k.a Gboard) has a critical mission to provide frictionless input on Android to empower users to communicate accurately and express themselves effortlessly. In order to accomplish this mission, Gboard must also protect users' private and sensitive data. Nothing users type is sent to Google servers. We recently launched privacy-preserving input by further advancing the latest federated technologies. In Android 11, Gboard also launched the contextual input suggestion experience by integrating on-device smarts into the user's daily communication in a privacy-preserving way.

Before Android 11, input suggestions were surfaced to users in several different places. In Android 11, Gboard launched a consistent and coordinated approach to access contextual input suggestions. For the first time, we've brought Smart Replies to the keyboard suggestions - powered by system intelligence running entirely on device. The smart input suggestions are rendered with a transparent layer on top of Gboard’s suggestion strip. This structure maintains the trust boundaries between the Android platform and Gboard, meaning sensitive personal content cannot be not accessed by Gboard. The suggestions are only sent to the app after the user taps to accept them.

For instance, when a user receives the message “Have a virtual coffee at 5pm?” in Whatsapp, on-device system intelligence predicts smart text and emoji replies “Sounds great!” and “👍”. Android system intelligence can see the incoming message but Gboard cannot. In Android 11, these Smart Replies are rendered by the Android platform on Gboard’s suggestion strip as a transparent layer. The suggested reply is generated by the system intelligence. When the user taps the suggestion, Android platform sends it to the input field directly. If the user doesn't tap the suggestion, gBoard and the app cannot see it. In this way, Android and Gboard surface the best of Google smarts whilst keeping users' data private: none of their data goes to any app, including the keyboard, unless they've tapped a suggestion.

Additionally, federated learning has enabled Gboard to train intelligent input models across many devices while keeping everything individual users type on their device. Today, the emoji is as common as punctuation - and have become the way for our users to express themselves in messaging. Our users want a way to have fresh and diversified emojis to better express their thoughts in messaging apps. Recently, we launched new on-device transformer models that are fine-tuned with federated learning in Gboard, to produce more contextual emoji predictions for English, Spanish and Portuguese.

Furthermore, following the success of privacy-preserving machine learning techniques, Gboard continues to leverage federated analytics to understand how Gboard is used from decentralized data. What we've learned from privacy-preserving analysis has let us make better decisions in our product.

When a user shares an emoji in a conversation, their phone keeps an ongoing count of which emojis are used. Later, when the phone is idle, plugged in, and connected to WiFi, Google’s federated analytics server invites the device to join a “round” of federated analytics data computation with hundreds of other participating phones. Every device involved in one round will compute the emoji share frequency, encrypt the result and send it a federated analytics server. Although the server can’t decrypt the data individually, the final tally of total emoji counts can be decrypted when combining encrypted data across devices. The aggregated data shows that the most popular emoji is 😂 in Whatsapp, 😭 in Roblox(gaming), and ✔ in Google Docs. Emoji 😷 moved up from 119th to 42nd in terms of frequency during COVID-19.

Gboard always has a strong commitment to Google’s Privacy Principles. Gboard strives to build privacy-preserving effortless input products for users to freely express their thoughts in 900+ languages while safeguarding user data. We will keep pushing the state of the art in smart input technologies on Android while safeguarding user data. Stay tuned!

New Password Protections (and more!) in Chrome

Passwords are often the first line of defense for our digital lives. Today, we’re improving password security on both Android and iOS devices by telling you if the passwords you’ve asked Chrome to remember have been compromised, and if so, how to fix them.

To check whether you have any compromised passwords, Chrome sends a copy of your usernames and passwords to Google using a special form of encryption. This lets Google check them against lists of credentials known to be compromised, but Google cannot derive your username or password from this encrypted copy.

We notify you when you have compromised passwords on websites, but it can be time-consuming to go find the relevant form to change your password. To help, we’re adding support for ".well-known/change-password" URLs that let Chrome take users directly to the right “change password” form after they’ve been alerted that their password has been compromised.

Along with these improvements, Chrome is also bringing Safety Check to mobile. In our next release, we will launch Safety Check on iOS and Android, which includes checking for compromised passwords, telling you if Safe Browsing is enabled, and whether the version of Chrome you are running is updated with the latest security protections. You will also be able to use Chrome on iOS to autofill saved login details into other apps or browsers.

In Chrome 86 we’ll also be launching a number of additional features to improve user security, including:

Enhanced Safe Browsing for Android

Earlier this year, we launched Enhanced Safe Browsing for desktop, which gives Chrome users the option of more advanced security protections.

When you turn on Enhanced Safe Browsing, Chrome can proactively protect you against phishing, malware, and other dangerous sites by sharing real-time data with Google’s Safe Browsing service. Among our users who have enabled checking websites and downloads in real time, our predictive phishing protections see a roughly 20% drop in users typing their passwords into phishing sites.

Improvements to password filling on iOS

We recently launched Touch-to-fill for passwords on Android to prevent phishing attacks. To improve security on iOS too, we’re introducing a biometric authentication step before autofilling passwords. On iOS, you’ll now be able to authenticate using Face ID, Touch ID, or your phone passcode. Additionally, Chrome Password Manager allows you to autofill saved passwords into iOS apps or browsers if you enable Chrome autofill in Settings.


Mixed form warnings and download blocking

Secure HTTPS pages may sometimes still have non-secure features. Earlier this year, Chrome began securing and blocking what’s known as “mixed content”, when secure pages incorporate insecure content. But there are still other ways that HTTPS pages can create security risks for users, such as offering downloads over non-secure links, or using forms that don’t submit data securely.

To better protect users from these threats, Chrome 86 is introducing mixed form warnings on desktop and Android to alert and warn users before submitting a non-secure form that’s embedded in an HTTPS page.

Additionally, Chrome 86 will block or warn on some insecure downloads initiated by secure pages. Currently, this change affects commonly abused file types, but eventually secure pages will only be able to initiate secure downloads of any type. For more details, see Chrome’s plan to gradually block mixed downloads altogether

We encourage developers to update their forms and downloads to use secure connections for the safety and privacy of their users.