Author Archives: Kimberly Samra

AI-Powered Fuzzing: Breaking the Bug Hunting Barrier



Since 2016, OSS-Fuzz has been at the forefront of automated vulnerability discovery for open source projects. Vulnerability discovery is an important part of keeping software supply chains secure, so our team is constantly working to improve OSS-Fuzz. For the last few months, we’ve tested whether we could boost OSS-Fuzz’s performance using Google’s Large Language Models (LLMs). 




This blog post shares our experience of successfully applying the generative power of LLMs to improve the automated vulnerability detection technique known as fuzz testing (“fuzzing”). By using LLMs, we’re able to increase the code coverage for critical projects using our OSS-Fuzz service without manually writing additional code. Using LLMs is a promising new way to scale security improvements across the over 1,000 projects currently fuzzed by OSS-Fuzz and to remove barriers to future projects adopting fuzzing. 




LLM-aided fuzzing

We created the OSS-Fuzz service to help open source developers find bugs in their code at scale—especially bugs that indicate security vulnerabilities. After more than six years of running OSS-Fuzz, we now support over 1,000 open source projects with continuous fuzzing, free of charge. As the Heartbleed vulnerability showed us, bugs that could be easily found with automated fuzzing can have devastating effects. For most open source developers, setting up their own fuzzing solution could cost time and resources. With OSS-Fuzz, developers are able to integrate their project for free, automated bug discovery at scale.  




Since 2016, we’ve found and verified a fix for over 10,000 security vulnerabilities. We also believe that OSS-Fuzz could likely find even more bugs with increased code coverage. The fuzzing service covers only around 30% of an open source project’s code on average, meaning that a large portion of our users’ code remains untouched by fuzzing. Recent research suggests that the most effective way to increase this is by adding additional fuzz targets for every project—one of the few parts of the fuzzing workflow that isn’t yet automated.




When an open source project onboards to OSS-Fuzz, maintainers make an initial time investment to integrate their projects into the infrastructure and then add fuzz targets. The fuzz targets are functions that use randomized input to test the targeted code. Writing fuzz targets is a project-specific and manual process that is similar to writing unit tests. The ongoing security benefits from fuzzing make this initial investment of time worth it for maintainers, but writing a comprehensive set of fuzz targets is a tough expectation for project maintainers, who are often volunteers. 
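For readers unfamiliar with fuzz targets, below is a minimal sketch of what one might look like for a C++ XML parsing library such as tinyxml2, using the standard libFuzzer entry point. It is an illustration only, not one of the targets from this work.

```cpp
// Hedged sketch: a minimal libFuzzer-style fuzz target for an XML parser.
#include <cstddef>
#include <cstdint>
#include <string>

#include "tinyxml2.h"

// libFuzzer calls this entry point repeatedly with randomized input and
// watches for crashes, sanitizer reports, and newly covered code.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  std::string input(reinterpret_cast<const char*>(data), size);
  tinyxml2::XMLDocument doc;
  doc.Parse(input.c_str(), input.size());  // feed the fuzz input to the parser
  return 0;  // non-zero return values are reserved by libFuzzer
}
```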




But what if LLMs could write additional fuzz targets for maintainers?



“Hey LLM, fuzz this project for me”

To discover whether an LLM could successfully write new fuzz targets, we built an evaluation framework that connects OSS-Fuzz to the LLM, conducts the experiment, and evaluates the results. The steps look like this:  




  1. OSS-Fuzz’s Fuzz Introspector tool identifies an under-fuzzed, high-potential portion of the sample project’s code and passes the code to the evaluation framework. 

  2. The evaluation framework creates a prompt that the LLM will use to write the new fuzz target. The prompt includes project-specific information.

  3. The evaluation framework takes the fuzz target generated by the LLM and runs the new target. 

  4. The evaluation framework observes the run for any change in code coverage.

  5. In the event that the fuzz target fails to compile, the evaluation framework prompts the LLM to write a revised fuzz target that addresses the compilation errors.





Experiment overview: The experiment pictured above is a fully automated process, from identifying target code to evaluating the change in code coverage.
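For concreteness, here is a rough, hypothetical sketch of that loop. None of the names below come from the actual framework; the callables stand in for the real prompt construction, LLM query, and OSS-Fuzz build-and-run infrastructure.

```cpp
// Hedged sketch of the evaluation loop described above -- not the real framework.
#include <functional>
#include <optional>
#include <string>

struct RunResult {
  bool compiled = false;
  std::string compile_errors;   // fed back to the LLM on failure (step 5)
  double coverage_delta = 0.0;  // change in code coverage from the new target (step 4)
};

// Returns the coverage change achieved by a target that compiled, if any.
std::optional<double> evaluate_target(
    const std::string& under_fuzzed_code,  // step 1: code picked by Fuzz Introspector
    const std::function<std::string(const std::string&)>& build_prompt,   // step 2
    const std::function<std::string(const std::string&)>& ask_llm,        // steps 2 and 5
    const std::function<RunResult(const std::string&)>& build_and_run,    // steps 3 and 4
    int max_fix_attempts = 3) {
  const std::string prompt = build_prompt(under_fuzzed_code);
  std::string target = ask_llm(prompt);
  for (int attempt = 0; attempt <= max_fix_attempts; ++attempt) {
    const RunResult result = build_and_run(target);
    if (result.compiled) {
      return result.coverage_delta;  // observe the change in coverage
    }
    // Ask the LLM for a revised target that addresses the compilation errors.
    target = ask_llm(prompt + "\n" + target +
                     "\nFix these compilation errors:\n" + result.compile_errors);
  }
  return std::nullopt;  // the target never compiled
}
```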






At first, the code generated from our prompts wouldn’t compile; however, after several rounds of prompt engineering and trying out the new fuzz targets, we saw projects gain between 1.5% and 31% code coverage. One of our sample projects, tinyxml2, went from 38% line coverage to 69% without any interventions from our team. The tinyxml2 case showed us that with the LLM-generated fuzz targets added, the majority of the project’s code is now covered. 









Example fuzz targets for tinyxml2: Each of the five fuzz targets shown is associated with a different part of the code and adds to the overall coverage improvement. 






To replicate tinyxml2’s results manually would have required at least a day’s worth of work—which would mean several years of work to manually cover all OSS-Fuzz projects. Given tinyxml2’s promising results, we want to bring LLM-generated fuzz targets into production and extend similar, automatic coverage to other OSS-Fuzz projects. 




Additionally, in the OpenSSL project, our LLM was able to automatically generate a working target that rediscovered CVE-2022-3602, which was in an area of code that previously did not have fuzzing coverage. Though this is not a new vulnerability, it suggests that as code coverage increases, we will find more vulnerabilities that are currently missed by fuzzing. 




Learn more about our results through our example prompts and outputs or through our experiment report. 




The goal: fully automated fuzzing

In the next few months, we’ll open source our evaluation framework to allow researchers to test their own automatic fuzz target generation. We’ll continue to optimize our use of LLMs for fuzzing target generation through more model finetuning, prompt engineering, and improvements to our infrastructure. We’re also collaborating closely with the Assured OSS team on this research in order to secure even more open source software used by Google Cloud customers.   




Our longer term goals include:



  • Adding LLM fuzz target generation as a fully integrated feature in OSS-Fuzz, with continuous generation of new targets for OSS-Fuzz projects and zero manual involvement.

  • Extending support from C/C++ projects to additional language ecosystems, like Python and Java. 

  • Automating the process of onboarding a project into OSS-Fuzz to eliminate any need to write even initial fuzz targets. 




We’re working towards a future of personalized vulnerability detection with little manual effort from developers. With the addition of LLM-generated fuzz targets, OSS-Fuzz can help improve open source security for everyone. 

Toward Quantum Resilient Security Keys



As part of our effort to deploy quantum resistant cryptography, we are happy to announce the release of the first quantum resilient FIDO2 security key implementation as part of OpenSK, our open source security key firmware. This open-source, hardware optimized implementation uses a novel ECC/Dilithium hybrid signature scheme that benefits from the security of ECC against standard attacks and Dilithium’s resilience against quantum attacks. This scheme was co-developed in partnership with ETH Zürich and won the best paper award at the ACNS Secure Cryptographic Implementation workshop.




Quantum processor




As progress toward practical quantum computers is accelerating, preparing for their advent is becoming a more pressing issue as time passes. In particular, standard public key cryptography, which was designed to protect against traditional computers, will not be able to withstand quantum attacks. Fortunately, with the recent standardization of public key quantum resilient cryptography including the Dilithium algorithm, we now have a clear path to secure security keys against quantum attacks.




While quantum attacks are still in the distant future, deploying cryptography at Internet scale is a massive undertaking, which is why doing it as early as possible is vital. In particular, for security keys this process is expected to be gradual, as users will have to acquire new ones once FIDO has standardized post-quantum resilient cryptography and this new standard is supported by major browser vendors.



Hybrid signature scheme

Hybrid signature: Strong nesting with classical and PQC scheme




Our proposed implementation relies on a hybrid approach that combines the battle tested ECDSA signature algorithm and the recently standardized quantum resistant signature algorithm, Dilithium. In collaboration with ETH, we developed this novel hybrid signature scheme that offers the best of both worlds. Relying on a hybrid signature is critical because the security of Dilithium and other recently standardized quantum resistant algorithms hasn’t yet stood the test of time, and recent attacks on Rainbow (another quantum resilient algorithm) demonstrate the need for caution. This caution is particularly warranted for security keys as most can’t be upgraded – although we are working toward it for OpenSK. The hybrid approach is also used in other post-quantum efforts like Chrome’s support for TLS.
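Conceptually, a hybrid signature pairs a classical and a post-quantum signature over the same message and accepts it only if both verify. The sketch below illustrates that idea in a simple concatenation style with placeholder signing and verification callables; the actual OpenSK scheme uses a strongly nested construction and real ECDSA/Dilithium implementations, so this is not its API.

```cpp
// Hedged conceptual sketch of a hybrid signature; the callables are
// placeholders, not a real ECDSA or Dilithium implementation.
#include <functional>
#include <string>

struct HybridSignature {
  std::string ecdsa_sig;      // classical component
  std::string dilithium_sig;  // post-quantum component
};

using Signer = std::function<std::string(const std::string& msg)>;
using Verifier = std::function<bool(const std::string& msg, const std::string& sig)>;

HybridSignature hybrid_sign(const std::string& msg, const Signer& sign_ecdsa,
                            const Signer& sign_dilithium) {
  // Sign the same message independently with both schemes.
  return {sign_ecdsa(msg), sign_dilithium(msg)};
}

bool hybrid_verify(const std::string& msg, const HybridSignature& sig,
                   const Verifier& verify_ecdsa, const Verifier& verify_dilithium) {
  // Valid only if BOTH components verify, so a forger must break ECC and Dilithium.
  return verify_ecdsa(msg, sig.ecdsa_sig) && verify_dilithium(msg, sig.dilithium_sig);
}
```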




On the technical side, a large challenge was to create a Dilithium implementation small enough to run on security keys’ constrained hardware. Through careful optimization, we were able to develop a memory optimized Rust implementation that requires only 20 KB of memory, which is sufficiently small. We also spent time ensuring that our implementation’s signature speed was well within the expected security key specification. That said, we believe improving signature speed further by leveraging hardware acceleration would allow keys to be more responsive.




Moving forward, we are hoping to see this implementation (or a variant of it) being standardized as part of the FIDO2 key specification and supported by major web browsers so that users' credentials can be protected against quantum attacks. If you are interested in testing this algorithm or contributing to security key research, head to our open source implementation OpenSK.

Downfall and Zenbleed: Googlers helping secure the ecosystem



Finding and mitigating security vulnerabilities is critical to keeping Internet users safe.  However, the more complex a system becomes, the harder it is to secure—and that is also the case with computing hardware and processors, which have developed highly advanced capabilities over the years. This post will detail this trend by exploring Downfall and Zenbleed, two new security vulnerabilities (one of which was disclosed today) that prior to mitigation had the potential to affect billions of personal and cloud computers, signifying the importance of vulnerability research and cross-industry collaboration. Had these vulnerabilities not been discovered by Google researchers, and instead by adversaries, they would have enabled attackers to compromise Internet users. For both vulnerabilities, Google worked closely with our partners in the industry to develop fixes, deploy mitigations and gather details to share widely and better secure the ecosystem.

What are Downfall and Zenbleed?

Downfall (CVE-2022-40982) and Zenbleed (CVE-2023-20593) are two different vulnerabilities affecting CPUs - Intel Core (6th - 11th generation) and AMD Zen2, respectively. They allow an attacker to violate the software-hardware boundary established in modern processors. This could allow an attacker to access data in internal hardware registers that hold information belonging to other users of the system (both across different virtual machines and different processes). 


These vulnerabilities arise from complex optimizations in modern CPUs that speed up applications: 

  1. Preemptive multitasking and simultaneous multithreading enable users and applications to share CPU cores, while the CPU enforces security boundaries at the architecture level to stop a malicious user from accessing data from other users. 

  2. Speculative execution allows the CPU core to execute instructions from a single execution thread without waiting for prior instructions to be completed.

  3. SIMD enables data-level parallelism where an instruction computes the same function multiple times with different data.


Downfall, affecting Intel CPUs, exploits the speculative forwarding of data from the SIMD Gather instruction. The Gather instruction helps the software access scattered data in memory quickly, which is crucial for high-performance computing workloads performing data encoding and processing. Downfall shows that this instruction forwards stale data from the internal physical hardware registers to succeeding instructions. Although this data is not directly exposed to software registers, it can trivially be extracted via similar exploitation techniques as Meltdown. Since these physical hardware register files are shared across multiple users sharing the same CPU core, an attacker can ultimately extract data from other users. 
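To illustrate what the Gather instruction does (not the Downfall attack itself), the snippet below uses the AVX2 gather intrinsic to load eight scattered array elements with a single instruction; compile with AVX2 enabled (e.g., -mavx2).

```cpp
// Illustration of SIMD Gather: one instruction loads from several scattered
// (indexed) memory locations at once.
#include <immintrin.h>
#include <cstdio>

int main() {
  int table[32];
  for (int i = 0; i < 32; ++i) table[i] = i * 10;

  // Eight scattered indices into `table`.
  const __m256i idx = _mm256_setr_epi32(0, 3, 7, 12, 18, 21, 27, 31);

  // vpgatherdd: gather table[idx[0..7]] in a single instruction (scale = 4 bytes).
  const __m256i vals = _mm256_i32gather_epi32(table, idx, 4);

  int out[8];
  _mm256_storeu_si256(reinterpret_cast<__m256i*>(out), vals);
  for (int i = 0; i < 8; ++i) std::printf("%d ", out[i]);
  std::printf("\n");
  return 0;
}
```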


Zenbleed, affecting AMD CPUs, shows that incorrectly implemented speculative execution of the SIMD Zeroupper instruction leaks stale data from physical hardware registers to software registers. Zeroupper instructions should clear the data in the upper half of SIMD registers (e.g., the 256-bit register YMM), which on Zen2 processors is done by just setting a flag that marks the upper half of the register as zero. However, if the Zeroupper instruction is mis-speculated on the same cycle as a register-to-register move, the zero flag doesn’t get rolled back properly, leaving the upper half of the YMM register holding stale data rather than the value of zero. Similar to Downfall, leaking stale data from physical hardware registers exposes the data of other users who share the same CPU core and its internal physical registers. 
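For context, the snippet below shows the intended, architectural use of Zeroupper via its intrinsic: clearing the upper halves of the YMM registers when transitioning from 256-bit AVX code to legacy 128-bit SSE code (compile with -mavx). It demonstrates normal usage only, not the Zenbleed mis-speculation.

```cpp
// Normal use of VZEROUPPER: after 256-bit AVX work, architecturally zero the
// upper 128 bits of all YMM registers before legacy 128-bit code runs.
#include <immintrin.h>

float sum_then_transition(const float* a, const float* b) {
  // 256-bit AVX work: the upper halves of YMM registers now hold live data.
  const __m256 vsum = _mm256_add_ps(_mm256_loadu_ps(a), _mm256_loadu_ps(b));
  float tmp[8];
  _mm256_storeu_ps(tmp, vsum);

  // Architecturally, this must zero the upper 128 bits of every YMM register.
  _mm256_zeroupper();

  // Legacy 128-bit (SSE) code path.
  const __m128 s = _mm_add_ps(_mm_loadu_ps(tmp), _mm_loadu_ps(tmp + 4));
  float out[4];
  _mm_storeu_ps(out, s);
  return out[0] + out[1] + out[2] + out[3];
}
```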


Comparison

| | Downfall | Zenbleed |
|---|---|---|
| Affects | Intel Core (6th-11th Gen) | AMD Zen 2 |
| Leaks | Entire XMM/YMM/ZMM Register | Upper half of 256-bit YMM Registers |
| Exploit | Gather Data Sampling | Architectural Data Leak |
| Discovered by | Microarchitectural Analysis | Fuzzing |
| Fix | Microcode blocking speculative forwarding from Gather | Microcode properly wiping out YMM register on Zeroupper |
| Mitigation overhead | 0-50% depending on the workload | Statistically insignificant |
| Reported on | August 24, 2022 | May 15, 2023 |
| Fixed on | August 8, 2023 | July 19, 2023 |


How did we protect our users?

Vulnerability research continues to be at the heart of our security work at Google. We invest in not only vulnerability research, but in the community as a whole in order to encourage further research that keeps all users safe. These vulnerabilities were no exception, and we worked closely with our industry partners to make them aware of the vulnerabilities, coordinate on mitigations, align on disclosure timelines and a plan to get details out to the ecosystem. 


Upon disclosure, we immediately published Security Bulletins for both Downfall and Zenbleed that detailed how Google responded to each vulnerability and provided guidance for the industry. In addition to our bulletins, we posted technical details for insights on both Downfall and Zenbleed. It’s imperative that vulnerability research continues to be supported by the industry, and we’re dedicated to doing our part to help protect those who do this important work.

Lessons learned 

These long existing vulnerabilities, their discovery and the mitigations that followed have provided several lessons learned that will help the industry move forward in vulnerability research, including: 

  • There are fundamental challenges in designing secure hardware that require further research and understanding.

  • There are gaps in automated testing and verification of hardware for vulnerabilities. 

  • Optimization features that are supposed to make computation faster are closely related to security and can introduce new vulnerabilities, if not implemented properly.


As Downfall and Zenbleed suggest, computer hardware is only becoming more complex every day, and so we will see more vulnerabilities, which is why Google is investing in CPU/hardware security research. We look forward to continuing to share our insights and encourage the wider industry to join us in helping to expand on this work. 


Want to learn more?

Downfall will be presented at Black Hat USA 2023 on August 9 at 1:30pm. You can also read more about Zenbleed in this advisory.

Pixel Binary Transparency: verifiable security for Pixel devices



Pixel Binary Transparency

With Android powering billions of devices, we’ve long put security first. There are the more visible security features you might interact with regularly, like spam and phishing protection, as well as less obvious integrated security features, like daily scans for malware. For example, Android Verified Boot strives to ensure all executed code comes from a trusted source, rather than from an attacker or corruption. And with attacks on software and mobile devices constantly evolving, we’re continually strengthening these features and adding transparency into how Google protects users. This blog post peeks under the hood of Pixel Binary Transparency, a recent addition to Pixel security that puts you in control of checking if your Pixel is running a trusted installation of its operating system. 



Supply Chain Attacks & Binary Transparency

Pixel Binary Transparency responds to a new wave of attacks targeting the software supply chain—that is, attacks on software while in transit to users. These attacks have been on the rise in recent years, likely in part because of the enormous impact they can have. Tens of thousands of software users, from Fortune 500 companies to branches of the US government, have been affected by supply chain attacks that targeted the systems that create software to install a backdoor into the code, allowing attackers to access and steal customer data. 




One way Google protects against these types of attacks is by auditing Pixel phone  firmware (also called “factory images”) before release, during which the software is thoroughly checked for backdoors. Upon boot, Android Verified Boot runs a check on your device to be sure that it’s still running the audited code that was officially released by Google. Pixel Binary Transparency now expands on that function, allowing you to personally confirm that the image running on your device is the official factory image—meaning that attackers haven’t inserted themselves somewhere in the source code, build process, or release aspects of the software supply chain. Additionally, this means that even if a signing key were compromised, binary transparency would flag the unofficially signed images, deterring attackers by making their compromises more detectable.



How it works


Pixel Binary Transparency is a public, cryptographic log that records metadata about official factory images. With this log, Pixel users can mathematically prove that their Pixels are running factory images that match what Google released and haven’t been tampered with.




The Pixel Binary Transparency log is cryptographically guaranteed to be append-only, which means entries can be added to the log, but never changed or deleted. Being append-only provides resilience against attacks on Pixel images, as attackers know that it’s more difficult to insert malicious code without being caught, since an image that’s been altered will no longer match the metadata Google added to the log. There’s no way to change the information in the log to match the tampered version of the software without detection. (Ideally the metadata represents the entirety of the software, but it cannot attest to the integrity of the build and release processes.)




For those who want to understand more about how this works, the Pixel Binary Transparency log is append-only thanks to a data structure called a Merkle tree, which is also used in blockchain, Git, Bittorrent, and certain NoSQL databases. The append-only property is derived from the single root hash of the Merkle tree—the top level cryptographic value in the tree. The root hash is computed by hashing each leaf node containing data (for example, metadata that confirms the security of your Pixel’s software), and recursively hashing intermediate nodes. 
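Here is a rough sketch of that recursive construction. The placeholder hash function below stands in for the cryptographic hash (and domain separation) that a production transparency log would use, so this is illustrative rather than the exact Pixel log encoding.

```cpp
// Hedged sketch: compute a Merkle root by hashing leaves, then repeatedly
// hashing pairs of nodes until a single root hash remains.
#include <functional>
#include <string>
#include <utility>
#include <vector>

std::string h(const std::string& s) {
  // Placeholder: a real log would use a cryptographic hash such as SHA-256.
  return std::to_string(std::hash<std::string>{}(s));
}

std::string merkle_root(std::vector<std::string> level) {
  if (level.empty()) return h("");
  for (auto& leaf : level) leaf = h("leaf:" + leaf);  // hash each leaf node
  while (level.size() > 1) {
    std::vector<std::string> next;
    for (size_t i = 0; i + 1 < level.size(); i += 2)
      next.push_back(h("node:" + level[i] + level[i + 1]));  // hash sibling pairs
    if (level.size() % 2 == 1) next.push_back(level.back());  // odd node carried up
    level = std::move(next);
  }
  return level[0];  // changing any leaf changes this root
}
```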



The root hash of a Merkle tree changes if, and only if, the leaf nodes change. By keeping track of the most recent root hash, you also keep track of all the previous leaves. You can read more about the details in the Pixel Binary Transparency documentation.




Merkle Tree Proofs

There are two important computations that can be performed on a Merkle tree: the consistency proof and inclusion proof. These two proofs together allow you to check whether an entry is included in a transparency log and to trust that the log has not been tampered with.




Before you trust the contents of the log, you should use the consistency proof to check the integrity of the append-only property of the tree. The consistency proof is a set of hashes that show that when the tree grows, the root hash changes only because of the addition of new entries and not because previous entries were modified.




Once you have established that the tree has not been tampered with, you can use the inclusion proof to check whether a particular entry is in the tree. In the case of Pixel Binary Transparency, you can check that a certain version of firmware is published in the log (and thus, an official image released by Google) before trusting it.
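As a sketch, verifying an inclusion proof amounts to recomputing the root from the leaf and the sibling hashes along its path, then comparing the result with the trusted root. The code below reuses the placeholder hashing convention from the sketch above and is not the actual Pixel verification tooling.

```cpp
// Hedged sketch of Merkle inclusion proof verification.
#include <functional>
#include <string>
#include <vector>

struct ProofStep {
  std::string sibling_hash;
  bool sibling_is_left;  // does the sibling sit to the left of our node?
};

bool verify_inclusion(const std::string& leaf, const std::vector<ProofStep>& proof,
                      const std::string& trusted_root,
                      const std::function<std::string(const std::string&)>& h) {
  std::string node = h("leaf:" + leaf);
  for (const auto& step : proof) {
    // Climb one level by combining our running hash with the sibling hash.
    node = step.sibling_is_left ? h("node:" + step.sibling_hash + node)
                                : h("node:" + node + step.sibling_hash);
  }
  return node == trusted_root;  // must match the root published by the log
}
```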




You can learn more about Merkle trees on Google’s transparency.dev site, which goes deeper into the same concepts in the context of our Trillian transparency log implementation. 



Try It Out

Most Pixel owners won’t ever need to perform the consistency and inclusion proofs to check their Pixel’s image—Android Verified Boot already has multiple safeguards in place, including verifying the hash of the code and data contents and checking the validity of the cryptographic signature. However, we’ve made the process available to anyone who wants to check themselves—the Pixel Binary Transparency Log Technical Detail Page will walk you through extracting the metadata from your phone and then running the inclusion and consistency proofs to compare against the log.



More Security to Come

The first iteration of Pixel Binary Transparency lays the groundwork for more security checks. For example, building on Pixel Binary Transparency, it will be possible to make even more security data transparent for users, allowing proactive assurance for a device’s other executed code beyond its factory image. We look forward to building further on Pixel Binary Transparency and continually increasing resilience against software supply chain attacks.

The Ups and Downs of 0-days: A Year in Review of 0-days Exploited In-the-Wild in 2022


This is Google’s fourth annual year-in-review of 0-days exploited in-the-wild [2021, 2020, 2019] and builds off of the mid-year 2022 review. The goal of this report is not to detail each individual exploit, but instead to analyze the exploits from the year as a whole, looking for trends, gaps, lessons learned, and successes. 

Executive Summary

41 in-the-wild 0-days were detected and disclosed in 2022, the second-most ever recorded since we began tracking in mid-2014, but down from the 69 detected in 2021.  Although a 40% drop might seem like a clear-cut win for improving security, the reality is more complicated. Some of our key takeaways from 2022 include:




N-days function like 0-days on Android due to long patching times. Across the Android ecosystem there were multiple cases where patches were not available to users for a significant time. Attackers didn’t need 0-day exploits and instead were able to use n-days that functioned as 0-days.




0-click exploits and new browser mitigations drive down browser 0-days. Many attackers have been moving towards 0-click rather than 1-click exploits. 0-clicks usually target components other than the browser. In addition, all major browsers also implemented new defenses that make exploiting a vulnerability more difficult and could have influenced attackers moving to other attack surfaces. 




Over 40% of the 0-days discovered were variants of previously reported vulnerabilities. 17 out of the 41 in-the-wild 0-days from 2022 are variants of previously reported vulnerabilities. This continues the unpleasant trend that we’ve discussed previously in both the 2020 Year in Review report and the mid-way through 2022 report. More than 20% are variants of previous in-the-wild 0-days from 2021 and 2020.




Bug collisions are high. 2022 brought more frequent reports of attackers using the same vulnerabilities as each other, as well as security researchers reporting vulnerabilities that were later discovered to be used by attackers. When an in-the-wild 0-day targeting a popular consumer platform is found and fixed, it's increasingly likely to be breaking another attacker's exploit as well.




Based on our analysis of 2022 0-days we hope to see the continued focus in the following areas across the industry:



  1. More comprehensive and timely patching to address the use of variants and n-days as 0-days.

  2. More platforms following browsers’ lead in releasing broader mitigations to make whole classes of vulnerabilities less exploitable. 

  3. Continued growth of transparency and collaboration between vendors and security defenders to share technical details and work together to detect exploit chains that cross multiple products.

By the Numbers

For the 41 vulnerabilities detected and disclosed in 2022, no single find accounted for a large percentage of all the detected 0-days. We saw them spread relatively evenly across the year: 20 in the first half and 21 in the second half. The combination of these two data points suggests more frequent and regular detections. We also saw the number of organizations credited with in-the-wild 0-day discoveries stay high. Across the 69 detected 0-days from 2021 there were 20 organizations credited. In 2022 across the 41 in-the-wild 0-days there were 18 organizations credited. It’s promising to see the number of organizations working on 0-day detection staying high because we need as many people working on this problem as possible. 




2022 included the detection and disclosure of 41 in-the-wild 0-days, down from the 69 in 2021. While a significant drop from 2021, 2022 is still solidly in second place. All of the 0-days that we’re using for our analysis are tracked in this spreadsheet.  





Limits of Number of 0-days as a Security Metric

The number of 0-days detected and disclosed in-the-wild can’t tell us much about the state of security. Instead we use it as one indicator of many. For 2022, we believe that a combination of security improvements and regressions influenced the approximately 40% drop in the number of detected and disclosed 0-days from 2021 to 2022 and the continued higher than average number of 0-days that we saw in 2022. 




Both positive and negative changes can influence the number of in-the-wild 0-days to both rise and fall. We therefore can’t use this number alone to signify whether or not we’re progressing in the fight to keep users safe. Instead we use the number to analyze what factors could have contributed to it and then review whether or not those factors are areas of success or places that need to be addressed.




Example factors that would cause the number of detected and disclosed in-the-wild 0-days to rise:




Security Improvements - Attackers require more 0-days to maintain the same capability

  • Discovering and fixing 0-days more quickly

  • More entities publicly disclosing when a 0-day is known to be in-the-wild 

  • Adding security boundaries to platforms

Security Regressions - 0-days are easier to find and exploit 

  • Variant analysis is not performed on reported vulnerabilities

  • Exploit techniques are not mitigated

  • More exploitable vulnerabilities are added to code than fixed





Example factors that would cause the number of detected and disclosed in-the-wild 0-days to decline:


Security Improvements - 0-days take more time, money, and expertise to develop for use

  • Fewer exploitable 0-day vulnerabilities exist

  • Each new 0-day requires the creation of a new exploitation technique

  • New vulnerabilities require researching new attack surfaces

Security Regressions - Attackers need fewer 0-days to maintain the same capability

  • Slower to detect in-the-wild 0-days so a bug has a longer lifetime

  • Extended time until users are able to install a patch

  • Less sophisticated attack methods: phishing, malware, n-day exploits are sufficient




Brainstorming the different factors that could lead to this number rising and declining allows us to understand what’s happening behind the numbers and draw conclusions from there. Two key factors contributed to the higher than average number of in-the-wild 0-days for 2022: vendor transparency & variants. The continued work on detection and transparency from vendors is a clear win, but the high percentage of variants that were able to be used in-the-wild as 0-days is not great. We discuss these variants in more depth in the “Déjà vu of Déjà vu-lnerability” section. 




In the same vein, we assess that a few key factors likely led to the drop in the number of in-the-wild 0-days from 2021 to 2022: positives such as fewer exploitable bugs (to the point that many attackers are using the same bugs as each other), and negatives like less sophisticated attack methods working just as well as 0-day exploits and slower detection of 0-days. The number of in-the-wild 0-days alone doesn’t tell us much about the state of in-the-wild exploitation; it’s instead the variety of factors that influenced this number where the real lessons lie. We dive into these in the following sections.

Are 0-days needed on Android?

In 2022, across the Android ecosystem we saw a series of cases where the upstream vendor had released a patch for the issue, but the downstream manufacturer had not taken the patch and released the fix for users to apply. Project Zero wrote about one of these cases in November 2022 in their “Mind the Gap” blog post.




These gaps between upstream vendors and downstream manufacturers allow n-days - vulnerabilities that are publicly known - to function as 0-days because no patch is readily available to the user and their only defense is to stop using the device. While these gaps exist in most upstream/downstream relationships, they are more prevalent and longer in Android. 




This is a great case for attackers. Attackers can use the known n-day bug, but have it operationally function as a 0-day since it will work on all affected devices. An example of how this happened in 2022 on Android is CVE-2022-38181, a vulnerability in the ARM Mali GPU. The bug was originally reported to the Android security team in July 2022 by security researcher Man Yue Mo of the GitHub Security Lab. The Android security team decided to treat the issue as a “Won’t Fix” because it was “device-specific”. However, Android Security referred the issue to ARM. In October 2022, ARM released the new driver version that fixed the vulnerability. In November 2022, TAG discovered the bug being used in-the-wild. While ARM had released the fixed driver version in October 2022, the vulnerability was not fixed by Android until April 2023, 6 months after the initial release by ARM, 9 months after the initial report by Man Yue Mo, and 5 months after it was first found being actively exploited in-the-wild.




  • July 2022: Reported to Android Security team

  • Aug 2022: Android Security labels “Won’t Fix” and sends to ARM

  • Oct 2022: Bug fixed by ARM

  • Nov 2022: In-the-wild exploit discovered

  • April 2023: Included in Android Security Bulletin




In December 2022, TAG discovered another exploit chain targeting the latest version of the Samsung Internet browser. At that time, the latest version of the Samsung Internet browser was running on Chromium 102, which had been released 7 months prior in May 2022. As a part of this chain, the attackers were able to use two n-day vulnerabilities which were able to function as 0-days: CVE-2022-3038, which had been patched in Chrome 105 in June 2022, and CVE-2022-22706 in the ARM Mali GPU kernel driver. ARM had released the patch for CVE-2022-22706 in January 2022 and even though it had been marked as exploited in-the-wild, attackers were still able to use it 11 months later as a 0-day. Although this vulnerability was known to be exploited in the wild in January 2022, it was not included in the Android Security Bulletin until June 2023, 17 months after the patch was released and it was publicly known to be actively exploited in-the-wild.




These n-days that function as 0-days fall into a gray area of whether or not to track them as 0-days. In the past we have sometimes counted them as 0-days: CVE-2019-2215 and CVE-2021-1048. In the cases of these two vulnerabilities, the bugs had been fixed in the upstream Linux kernel, but without assigning a CVE, as is Linux’s standard. We included them because they had not been identified as security issues needing to be patched in Android prior to their in-the-wild discovery. Whereas in the case of CVE-2022-38181, the bug was initially reported to Android, and ARM published security advisories for the issue indicating that downstream users needed to apply those patches. We will continue trying to decipher this “gray area” of bugs, but welcome input on how they ought to be tracked. 



Browsers Are So 2021

Similar to the overall numbers, there was a 42% drop in the number of detected in-the-wild 0-days targeting browsers from 2021 to 2022, dropping from 26 to 15. We assess this reflects browsers’ efforts to make exploitation more difficult overall as well as a shift in attacker behavior away from browsers towards 0-click exploits that target other components on the device. 







Advances in the defenses of the top browsers are likely influencing the push to other components as the initial vector in an exploit chain. Throughout 2022 we saw more browsers launching and improving additional defenses against exploitation. For Chrome that’s MiraclePtr, the v8 Sandbox, and libc++ hardening. Safari launched Lockdown Mode and Firefox launched more fine-grained sandboxing. In his April 2023 keynote at Zer0Con, Ki Chan Ahn, a vulnerability researcher and exploit developer at the offensive security vendor Dataflow Security, commented on how these types of mitigations are making browser exploitation more difficult and are an incentive for moving to other attack surfaces.




Browsers becoming more difficult to exploit pairs with an evolution in exploit delivery over the past few years to explain the drop in browser bugs in 2022. In 2019 and 2020, a decent percentage of the detected in-the-wild 0-days were delivered via watering hole attacks. A watering hole attack is where an attacker is targeting a group that they believe will visit a certain website. Anyone who visits that site is then exploited and delivered the final payload (usually spyware). In 2021, we generally saw a move to 1-click links as the initial attack vector. Both watering hole attacks and 1-click links use the browser as the initial vector onto the device. In 2022, more attackers began moving to using 0-click exploits instead, exploits that require no user interaction to trigger. 0-clicks tend to target device components other than browsers.




At the end of 2021, Citizen Lab captured a 0-click exploit targeting iMessage, CVE-2021-30860, used by NSO in their Pegasus spyware. Project Zero detailed the exploit in this 2-part blog post series. While no in-the-wild 0-clicks were publicly detected and disclosed in 2022, this does not signal a lack of use. We know that multiple attackers have and are using 0-click exploit chains.

0-clicks are difficult to detect because:



  • They are short lived

  • Often have no visible indicator of their presence

  • Can target many different components and vendors don’t even always realize all the components that are remotely accessible

  • Delivered directly to the target rather than broadly available like in a watering hole attack

  • Often not hosted on a navigable website or server




With 1-click exploits, there is a visible link that has to be clicked by the target to deliver the exploit. This means that the target or security tools may detect the link. The exploits are then hosted on a navigable server at that link.




0-clicks on the other hand often target the code that processes incoming calls or messages, meaning that they can often run prior to an indicator of an incoming message or call ever being shown. This also dramatically shortens their lifetime and the window in which they can be detected “live”. It’s likely that attackers will continue to move towards 0-click exploits and thus we as defenders need to be focused on how we can detect and protect users from these exploits. 



Déjà vu-lnerability: Complete patching remains one of the biggest opportunities

17 out of 41 of the 0-days discovered in-the-wild in 2022 are variants of previously public vulnerabilities. We first published about this in the 2020 Year in Review report, “Deja vu-lnerability,” identifying that 25% of the in-the-wild 0-days from 2020 were variants of previously public bugs. That number has continued to rise, which could be due to:



  • Defenders getting better at identifying variants, 

  • Defenders improving at detecting in-the-wild 0-days that are variants, 

  • Attackers exploiting more variants, or

  • Vulnerabilities being fixed less comprehensively, leaving more variants available to exploit.




The answer is likely a combination of all of the above, but we know that the number of variants that can be exploited against users as 0-days is not decreasing. Reducing the number of exploitable variants is one of the biggest opportunities for the tech and security industries to force attackers to work harder to field functional 0-day exploits.




Not only were over 40% of the 2022 in-the-wild 0-days variants, but more than 20% of the bugs are variants of previous in-the-wild 0-days: 7 from 2021 and 1 from 2020. When a 0-day is caught in the wild, it’s a gift. Attackers don’t want us to know what vulnerabilities they have and what exploit techniques they’re using. Defenders need to take as much advantage of this gift as we can and make it as hard as possible for attackers to come back with another 0-day exploit. This involves: 


  • Analyzing the bug to find the true root cause, not just the way that the attackers chose to exploit it in this case

  • Looking for other locations that the same bug may exist

  • Evaluating any additional paths that could be used to exploit the bug

  • Comparing the patch to the true root cause and determining if there are any ways around it




We consider a patch to be complete only when it is both correct and comprehensive. A correct patch fixes the bug with complete accuracy, meaning it no longer allows any exploitation of the vulnerability. A comprehensive patch applies that fix everywhere it needs to be applied, covering all of the variants. For a single vulnerability, there are often multiple ways to trigger it and multiple paths to reach it. Many times we see vendors block only the path shown in the proof-of-concept or exploit sample rather than fixing the vulnerability as a whole. Similarly, security researchers often report bugs without following up on how the patch works or exploring related attack paths.




While the idea that incomplete patches are making it easier for attackers to exploit 0-days may be uncomfortable, the converse of this conclusion can give us hope. We have a clear path toward making 0-days harder. If more vulnerabilities are patched correctly and comprehensively, it will be harder for attackers to exploit 0-days.




We’ve included all identified vulnerabilities that are variants in the table below. For more thorough walk-throughs of how the in-the-wild 0-day is a variant, check out the presentation from the FIRST conference [video, slides], the slides from Zer0Con, the presentation from OffensiveCon [video, slides] on CVE-2022-41073, and this blog post on CVE-2022-22620.




Product | 2022 ITW CVE | Variant
Windows win32k | CVE-2022-21882 | CVE-2021-1732 (2021 ITW)
iOS IOMobileFrameBuffer | CVE-2022-22587 | CVE-2021-30983 (2021 ITW)
WebKit “Zombie” | CVE-2022-22620 | Bug originally fixed in 2013; patch regressed in 2016
Firefox WebGPU IPC | CVE-2022-26485 | Fuzzing crash fixed in 2021
Android in ARM Mali GPU | CVE-2021-39793, CVE-2022-22706 | CVE-2021-28664 (2021 ITW)
Sophos Firewall | CVE-2022-1040 | CVE-2020-12271 (2020 ITW)
Chromium v8 | CVE-2022-1096 | CVE-2021-30551 (2021 ITW)
Chromium | CVE-2022-1364 | CVE-2021-21195
Windows “PetitPotam” | CVE-2022-26925 | CVE-2021-36942 (patch was regressed)
Windows “Follina” | CVE-2022-30190 | CVE-2021-40444 (2021 ITW)
Atlassian Confluence | CVE-2022-26134 | CVE-2021-26084 (2021 ITW)
Chromium Intents | CVE-2022-2856 | CVE-2021-38000 (2021 ITW)
Exchange SSRF “ProxyNotShell” | CVE-2022-41040 | CVE-2021-34473 (“ProxyShell”)
Exchange RCE “ProxyNotShell” | CVE-2022-41082 | CVE-2021-34523 (“ProxyShell”)
Internet Explorer JScript9 | CVE-2022-41128 | CVE-2021-34480
Windows “Print Spooler” | CVE-2022-41073 | CVE-2022-37987
WebKit JSC | CVE-2022-42856 | 2016 bug discovered due to a test failure




No Copyrights in Exploits

Unlike many commodities in the world, a 0-day itself is not finite. The fact that one person has discovered a 0-day vulnerability and developed it into an exploit doesn’t prevent other people from independently finding it too and using it in their own exploits. Most attackers who do their own vulnerability research and exploit development don’t want anyone else to find the same bugs, since collisions lower an exploit’s value and make it more likely to be detected and fixed quickly.




Over the last couple of years we’ve become aware of a trend of a high number of bug collisions, where more than one researcher finds the same vulnerability. This is happening among both attackers and security researchers who report bugs to vendors. Bug collisions have always occurred, and we can’t measure the exact rate at which they happen, but several signals suggest they are happening more often: different entities independently credited for the same vulnerability in security advisories, the same 0-day found in two different exploits, and even conversations with researchers who work on both sides of the fence.




A higher number of bug collisions is a win for defense, because it means attackers overall have fewer 0-days to use. Limiting attack surfaces and making fewer bug classes exploitable can certainly contribute to researchers finding the same bugs, but the growing amount of published security research likely contributes as well: many people read the same research, and it sparks similar ideas for their next projects. Platforms and attack surfaces are also becoming increasingly complex, so it takes a significant investment of time to build up expertise in a new component or target.




Security researchers and their vulnerability reports are helping to fix the same 0-days that attackers are using, even if those specific 0-days haven’t yet been detected in the wild, thus breaking the attackers’ exploits. We hope vendors continue supporting researchers and investing in their bug bounty programs, because this work is helping fix the same vulnerabilities that are likely being used against users. It also highlights why thorough patching of both known in-the-wild bugs and vulnerabilities reported by security researchers is important.



What now?

Looking back on 2022, our overall takeaway is that as an industry we are on the right path, but there are also plenty of areas of opportunity, the largest being the industry’s response to reported vulnerabilities. 



  • We must get fixes and mitigations to users quickly so that they can protect themselves.

  • We must perform detailed analyses to ensure the root cause of the vulnerability is addressed.

  • We must share as many technical details as possible.

  • We must capitalize on reported vulnerabilities to learn and fix as much as we can from them.



None of this is easy, nor is any of it a surprise to security teams who operate in this space. It requires investment, prioritization, and a patching process that balances protecting users quickly with ensuring the fix is comprehensive, two goals that can at times be in tension. The required investments depend on each unique situation, but we see some common themes around staffing and resourcing, incentive structures, process maturity, automation and testing, release cadence, and partnerships. 




In this post we’ve detailed some efforts that can help ensure bugs are correctly and comprehensively fixed, including root cause, patch, variant, and exploit technique analyses. We will continue to help with these analyses, and we encourage platform security teams and other independent security researchers to invest in these efforts as well.



Final Thoughts: TAG’s New Exploits Team

Looking into the second half of 2023, we’re excited for what’s to come. You may notice that our previous reports were published on the Project Zero blog. Our 0-days in-the-wild program has moved from Project Zero to TAG in order to combine the vulnerability analysis, detection, and threat actor tracking expertise in one team and benefit from more resources, ultimately forming TAG Exploits! More to come on that, but we’re really excited about what this means for protecting users from 0-days and making 0-day exploitation hard. 




One of the intentions of our Year in Review is to make our conclusions and findings “peer-reviewable”. If we want to best protect users from the harms of 0-days and make 0-day exploitation hard, we need all the eyes and brains we can get tackling this problem. We welcome critiques, feedback, and other ideas on our work in this area. Please reach out at 0day-in-the-wild <at> google.com.

Supply chain security for Go, Part 3: Shifting left


Previously in our Supply chain security for Go series, we covered dependency and vulnerability management tools and how Go ensures package integrity and availability as part of the commitment to countering the rise in supply chain attacks in recent years.

In this final installment, we’ll discuss how “shift left” security can help make sure you have the security information you need, when you need it, to avoid unwelcome surprises. 

Shifting left


The software development life cycle (SDLC) refers to the series of steps that a software project goes through, from planning all the way through operation. It’s a cycle because once code has been released, the process continues and repeats through actions like coding new features, addressing bugs, and more. 

Shifting left involves implementing security practices earlier in the SDLC. For example, consider scanning dependencies for known vulnerabilities: many organizations do this as part of continuous integration (CI), which ensures that code has passed security scans before it is released. However, if a vulnerability is first found during CI, significant time has already been invested building code on top of an insecure dependency. Shifting left in this case means letting developers run vulnerability scans locally, well before the CI-time scan, so they learn about issues with their dependencies before investing time and effort into new code built on vulnerable dependencies or functions.

Shifting left with Go

Go provides several features that help you address security early in your process, including govulncheck and pkg.go.dev, discussed in Supply chain security for Go, Part 1. Today’s post covers two more features of special interest to supply chain security: the Go extension for Visual Studio Code and built-in fuzz testing. 


VS Code Go extension

The VS Code Go extension helps developers shift left by surfacing problems directly in their code editor. The plugin is loaded with features, including built-in testing and debugging and vulnerability information, right in your IDE. Having these features at your fingertips while coding means good security practices are incorporated into your project as early as possible. For example, by running the govulncheck integration early and often, you'll know whether you are invoking a vulnerable function before that code becomes difficult to extract. Check out the tutorial to get started today. 

Fuzz testing in Go

In 2022, Go became the first major programming language to include fuzz testing in its standard toolset with the release of Go 1.18. Fuzzing is a type of automated testing that continuously alters program inputs to find bugs. It plays a huge role in keeping the Go project itself secure: OSS-Fuzz has discovered eight vulnerabilities in the Go standard library since 2020. 

Fuzz testing can find security exploits and vulnerabilities in edge cases that humans often miss, not only in your code, but also in your dependencies, which means more insight into your supply chain. With fuzzing included in the standard Go toolset, developers can more easily shift left and fuzz earlier in their development process. Our tutorial walks you through how to set up and run your fuzzing tests, and a minimal example is sketched below. 
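To make the shape of a native fuzz target concrete, here is a minimal, self-contained sketch in the same spirit as the official tutorial. The Reverse function, package name, and seed values are illustrative choices for this post, not part of any real project; the testing.F API, however, is the one that shipped with Go 1.18.

```go
package reverse

import (
	"testing"
	"unicode/utf8"
)

// Reverse returns s with its runes in reverse order.
// It is only an example function under test.
func Reverse(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

// FuzzReverse is a native Go fuzz target: the toolchain mutates the
// seed corpus and reports any input that makes the callback fail.
func FuzzReverse(f *testing.F) {
	for _, seed := range []string{"", "hello", "世界"} {
		f.Add(seed) // seed corpus entries
	}
	f.Fuzz(func(t *testing.T, s string) {
		// Only check invariants for valid UTF-8 inputs, since the
		// rune-based Reverse is defined on valid UTF-8 strings.
		if !utf8.ValidString(s) {
			return
		}
		rev := Reverse(s)
		if got := Reverse(rev); got != s {
			t.Errorf("double reverse mismatch: %q != %q", got, s)
		}
		if !utf8.ValidString(rev) {
			t.Errorf("Reverse produced invalid UTF-8 from %q", s)
		}
	})
}
```

Placed in a _test.go file, this runs as a regular unit test with go test and switches to fuzzing with go test -fuzz=FuzzReverse (optionally bounded with -fuzztime); any failing input is saved under testdata/fuzz so it becomes a regression test.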

If you maintain a Go package, your project may be eligible for free and continuous fuzzing provided by OSS-Fuzz, which supports native Go fuzzing. Fuzzing your project, whether on demand through the standard toolset or continuously through OSS-Fuzz, is a great way to help protect the people and projects who will use your module. 

Security at the ecosystem level

In the same way that we’re working toward "secure Go practices" becoming "standard Go practices," the future of software will be more secure for everyone when they’re simply “standard development practices.” Supply chain security threats are real and complex, but we can contribute to solving them by building solutions directly into open source ecosystems.

If you’ve enjoyed this series, come meet the Go team at Gophercon this September! And check out our closing keynote—all about how Go’s vulnerability management can help you write more secure and reliable software.


Gmail client-side encryption: A deep dive



In February, we expanded Google Workspace client-side encryption (CSE) capabilities to include Gmail and Calendar in addition to Drive, Docs, Slides, Sheets, and Meet.

CSE in Gmail was designed to provide commercial and public sector organizations an additional layer of confidentiality and data integrity protection beyond the existing encryption offered by default in Workspace. When CSE is enabled, email messages are protected using encryption keys that are fully under the customer’s control. The data is encrypted on the client device before it’s sent to Google servers that do not have access to the encryption keys, which means the data is indecipherable to us–we have no technical ability to access it. The entire process happens in the browser on the client device, without the need to install desktop applications or browser extensions, which means that users get the same intuitive productivity and collaboration experiences that they enjoy with Gmail today. Let’s take a deeper look into how it works.

How we built Client-side Encryption for Workspace

We invented and designed a new service, the Key Access Control List Service (KACLS), that is used across all essential Workspace applications. Then we worked directly with customers and partners to make it secure, reliable, and simple to deploy. KACLS performs cryptographic operations with encryption keys after validating end-user authentication and authorization. It runs in a customer-controlled environment and provides the key management API called by CSE-enabled Workspace clients. We have multiple partners providing software implementations of the KACLS API that our customers can use. 


At a high level, Workspace client code takes advantage of envelope encryption: it encrypts and decrypts the user content on the client with a Data Encryption Key (DEK) and leverages the KACLS to encrypt and decrypt the DEK. In order to provide separation of duties, we use the customer's OpenID Connect (OIDC) IdP to authenticate end users and provide a JSON Web Token assertion with a claim identifying the user (3P_JWT). For every encryption/decryption request sent to KACLS, the application (e.g. Gmail) provides a JSON Web Token assertion with a claim authorizing the current end-user operation (G_JWT). KACLS validates these authentication and authorization tokens before returning, for example, a decrypted DEK to the user’s client device.
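As a rough illustration of that split (a sketch for this post, not Workspace client code), the Go snippet below encrypts content locally with a freshly generated DEK using AES-256-GCM and then hands only the DEK to a key-access service to be wrapped. The wrapDEK function, its JWT parameters, and its placeholder body stand in for the real KACLS HTTP call defined by the CSE reference API.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// encryptWithDEK encrypts plaintext locally with a freshly generated
// 256-bit Data Encryption Key (DEK) using AES-GCM. The content never
// leaves the client unencrypted; only the DEK is sent out for wrapping.
func encryptWithDEK(plaintext []byte) (dek, nonce, ciphertext []byte, err error) {
	dek = make([]byte, 32)
	if _, err = rand.Read(dek); err != nil {
		return nil, nil, nil, err
	}
	block, err := aes.NewCipher(dek)
	if err != nil {
		return nil, nil, nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, nil, nil, err
	}
	nonce = make([]byte, aead.NonceSize())
	if _, err = rand.Read(nonce); err != nil {
		return nil, nil, nil, err
	}
	ciphertext = aead.Seal(nil, nonce, plaintext, nil)
	return dek, nonce, ciphertext, nil
}

// wrapDEK is a placeholder for the key-access service call: the DEK is
// sent together with the identity-provider token (3P_JWT) and the
// authorization token (G_JWT), and an opaque wrapped key comes back.
// The real call is an HTTPS request to the customer's KACLS.
func wrapDEK(dek []byte, idpJWT, authzJWT string) ([]byte, error) {
	return append([]byte("wrapped:"), dek...), nil // illustration only
}

func main() {
	dek, nonce, ct, err := encryptWithDEK([]byte("sensitive draft"))
	if err != nil {
		panic(err)
	}
	wrapped, _ := wrapDEK(dek, "<3P_JWT>", "<G_JWT>")
	fmt.Printf("ciphertext=%d bytes, nonce=%d bytes, wrapped DEK=%d bytes\n",
		len(ct), len(nonce), len(wrapped))
}
```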

More details on KACLS are available in the Google Workspace Encryption Whitepaper and the CSE reference API.

How we built CSE into Gmail

Google Workspace engineering teams have been hard at work over multiple years to deliver to our customers the ability to have their data protected with client-side encryption. This journey required us to work closely with customers and partners to provide a capability that is secure, easy to use, intuitive, and easily deployable. It was also important for CSE to work seamlessly across the Workspace products: you can create a Meet CSE scheduled meeting in Calendar CSE and follow up with Gmail CSE emails containing links to Drive CSE files.

Client-side encryption in Gmail was built with openness and interoperability in mind. The underlying technology is S/MIME, an open standard for sending encrypted messages over email. S/MIME is already supported in most enterprise email clients, so users are able to communicate securely outside of their domain, regardless of what provider the recipient is using to read their mail, without forcing the recipients to log into a proprietary portal. S/MIME uses asymmetric encryption. The public key and the email address of each user are included in the user's S/MIME certificate. Similar to TLS as used for HTTPS, each certificate is digitally signed by a chain of certificate authorities up to a broadly trusted root certificate authority. The certificate acts as a virtual business card, enabling anyone who has it to encrypt emails for that user. The user's private keys are kept secure under customer control and are used to decrypt incoming emails and digitally sign outgoing emails.
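As a small illustration of the certificate acting as a virtual business card, the sketch below (illustrative only; the file path is hypothetical) parses a PEM-encoded recipient certificate with Go's crypto/x509 package and extracts the email addresses and RSA public key it binds together.

```go
package csesketch

import (
	"crypto/rsa"
	"crypto/x509"
	"encoding/pem"
	"fmt"
	"os"
)

// loadRecipientCert reads a PEM-encoded S/MIME certificate and returns
// the email addresses and RSA public key it binds together, the two
// pieces needed to encrypt mail for that recipient.
func loadRecipientCert(path string) ([]string, *rsa.PublicKey, error) {
	data, err := os.ReadFile(path) // e.g. "alice.pem" (hypothetical)
	if err != nil {
		return nil, nil, err
	}
	block, _ := pem.Decode(data)
	if block == nil {
		return nil, nil, fmt.Errorf("no PEM block found in %s", path)
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return nil, nil, err
	}
	pub, ok := cert.PublicKey.(*rsa.PublicKey)
	if !ok {
		return nil, nil, fmt.Errorf("certificate does not carry an RSA key")
	}
	return cert.EmailAddresses, pub, nil
}
```

A real client would also verify the certificate chain up to a trusted root before using the key; that step is omitted here for brevity.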

We decided to leverage the CSE paradigm used for Drive CSE and not keep private keys on end-user devices, in order to keep them as safe as possible. Instead, we extended our KACLS API to support asymmetric encryption and signature operations. This enables our customers to centrally provision and enable S/MIME, on the KACLS, for all their users without having to deploy certificates individually to each user device.

CSE in Gmail uses the cryptographic functionality already available on the end user's client (the Web Crypto API in web browsers, for instance) to perform local encryption operations, and runs client-side code to perform all S/MIME message generation.

Now let's cover the detailed user flows:

When sending an email, the Gmail client generates a MIME message, encrypts the message with a random Data Encryption Key (DEK), and then uses the recipients' public keys to encrypt the DEK. It calls KACLS (with the user authenticated by the customer's IdP and authorized by Google) to digitally sign the content, and finally sends the authenticated and encrypted S/MIME message, which contains both the encrypted email and the encrypted DEK, to Google servers for delivery to the recipients.
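To picture the multi-recipient part of that flow, here is a simplified sketch. It is illustrative only and substitutes a bare RSA-OAEP operation for the CMS key-transport structure that real S/MIME messages use: the message body is encrypted once with the DEK, and the DEK is then encrypted separately under each recipient's certificate public key.

```go
package csesketch

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
)

// wrapDEKForRecipients encrypts the same DEK under each recipient's
// public key, so every recipient can later recover the DEK with their
// own private key (held under customer control, behind KACLS).
// RSA-OAEP is used here purely for illustration; S/MIME wraps the key
// inside a CMS EnvelopedData structure instead.
func wrapDEKForRecipients(dek []byte, recipients []*rsa.PublicKey) ([][]byte, error) {
	wrapped := make([][]byte, 0, len(recipients))
	for _, pub := range recipients {
		w, err := rsa.EncryptOAEP(sha256.New(), rand.Reader, pub, dek, nil)
		if err != nil {
			return nil, err
		}
		wrapped = append(wrapped, w)
	}
	return wrapped, nil
}
```

Each wrapped copy of the DEK travels with the encrypted message, so the message is stored and delivered once while remaining decryptable by every intended recipient.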


When receiving an email, Gmail verifies that the digital signature of the email is valid and matches the sender's identity, which protects the email against tampering. Gmail trusts digital identities signed by the root CA PKI as well as custom domain configurations. The Gmail client then calls KACLS (with the authentication and authorization JWTs) to decrypt the email encryption key, after which it can decrypt the email and render it to the end user.
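Continuing the same illustrative sketch (again, not Gmail's actual S/MIME processing), once the recipient's client has had the DEK decrypted on its behalf, the body is decrypted locally with the same AES-GCM envelope used on the sending side:

```go
package csesketch

import (
	"crypto/aes"
	"crypto/cipher"
)

// decryptWithDEK reverses the client-side envelope encryption step:
// once the plaintext DEK has been recovered (in Gmail's case via a
// KACLS call), the message body is decrypted locally with AES-GCM.
func decryptWithDEK(dek, nonce, ciphertext []byte) ([]byte, error) {
	block, err := aes.NewCipher(dek)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	// Open authenticates the ciphertext before returning the plaintext,
	// so tampering with the encrypted body is detected.
	return aead.Open(nil, nonce, ciphertext, nil)
}
```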

How we protect the application

Workspace already uses the latest cryptographic standards to encrypt all data at rest and in transit between its facilities for all services. Additionally, Gmail uses Transport Layer Security (TLS) by default for communication with other email service providers. CSE in Gmail, however, provides an additional layer of protection for sensitive content. The security of Gmail CSE is paramount to us, and we developed new mechanisms to ensure CSE content is locked into a secure container. On the web, we leverage iframe origin isolation, a strict postMessage API, and Content Security Policy to protect the user's sensitive data. Those security controls provide multiple layers of safety to ensure that CSE content stays isolated from the rest of the application. See this simplified diagram covering the isolation protecting CSE emails during composition or display.


What’s next for Client-side encryption and why it’s important 

CSE in Gmail uses S/MIME to encrypt and digitally sign emails with public keys supplied by customers, which adds an additional level of confidentiality and integrity to emails. This is done with extensive security controls to protect user data confidentiality, yet it is transparently integrated into the Gmail UI to delight our users. Our work is not done, however: we are actively partnering with Google Research to further develop client-side capabilities. You can see some of our progress in this field in our presentation at the RSA Security Conference last year, where we provided insight into the challenges and practical strategies for providing advanced capabilities, such as AI-driven phishing protection for CSE.