Introducing a new spam policy for “back button hijacking”

Today, we are expanding our spam policies to address a deceptive practice known as "back button hijacking". It will become an explicit violation under the "malicious practices" part of our spam policies and can lead to spam actions.

Source: Google Search Central Blog

Google Workspace Updates Weekly Recap – April 10, 2026

Book Google Workspace resources from third-party calendars

Organizations that use both Google Workspace and other calendaring systems, like Microsoft Outlook, can now more easily coordinate shared resources, such as rooms, projectors, or company cars. | Learn more about how to book Google Workspace resources from third-party calendars.

Greater control and error visibility for Google Sheets formulas

We are introducing updates to the logic and parameter support for a targeted set of Google Sheets functions. These changes are designed to improve error visibility, offer more control over data, and ensure seamless compatibility when importing files. | Learn more about greater control and error visibility for Google Sheets formulas.

Migration update on restricted access items

With this update, all items with legacy restricted access will be automatically migrated to use the limited access setting instead. There will be no change to who can see or access the files. | Learn more about Migration update on restricted access items.

Speech translation in Google Meet is now rolling out to mobile devices

Following our recent general availability launch for web, we are excited to announce that speech translation is now rolling out to the Meet Android and iOS apps. The feature allows audio to be translated to other languages in near-real-time, helping global teams communicate more naturally and removing language barriers. | Learn more about speech translation in Google Meet rolling out to mobile devices.

Edit your AI-generated scripts when you convert Slides to Vids

Now when you import Slides into Google Vids with Gemini enabled, you can see and edit your AI-generated scripts for each slide before completing the import, generating voiceovers, and applying animations. | Learn more about how to edit your AI-generated scripts when you convert Slides to Vids.

Gmail end-to-end encryption now available on mobile devices

We’re expanding Gmail end-to-end encryption (E2EE) to Android and iOS devices for Gmail client-side encryption (CSE) users. With Gmail E2EE, your users can confidentially engage with your organization's most sensitive data from anywhere on their mobile devices while ensuring data remains compliant with your organization's sovereignty and compliance requirements. | Learn more about Gmail end-to-end encryption now available on mobile devices.

Expanding access to longer musical tracks in the Gemini app

Last month, we introduced Lyria 3 Pro in the Gemini app for select Business, Enterprise, and Education editions. Now, more users can create longer tracks with Lyria 3 Pro in the Gemini app. | Learn more about expanded access to longer musical tracks in the Gemini app.

The announcements above were published on the Workspace Updates blog over the last week. Please refer to the original blog posts for complete details.

Source: Google Workspace Updates

Leveraging CPU memory for faster, cost-efficient TPU LLM training

by Keyur Ruganathbhai Ranipa, Qinglan Xiang, Vrushabh Sanghavi, Ramesh AG & Weilin Wang, Intel
and Penporn Koanantakool, Google

Host offloading with JAX on Intel® Xeon® processors

As Large Language Models (LLMs) continue to scale into the hundreds of billions of parameters, device memory capacity has become a big limiting factor in training, as intermediate activations from every layer in the forward pass are needed in the backward pass. To reduce device memory pressure, these activations can be rematerialized during the backward pass, trading memory for recomputation. While rematerialization enables larger models to fit within limited device memory, it significantly increases training time and cost.

Intel® Xeon® processors (5th and 6th Gen) with Advanced Matrix Extensions (AMX) enable practical host offloading of selected memory- and compute-intensive components in JAX training workflows. This approach can help teams train larger models, relieve accelerator memory pressure, improve end-to-end throughput, and reduce total cost of ownership—particularly on TPU-based Google Cloud instances.

By publishing these results and implementation details, Google and Intel aim to promote transparency and share practical guidance with the community. This post describes how to enable activation offloading for JAX on TPU platforms and outlines considerations for building scalable, cost-aware hybrid CPU–accelerator training workflows.

**Figure 1:** Google Cloud TPU Pod commonly used in LLM training.

Host offloading

Traditional LLM training is usually done on device accelerators alone. However, modern host machines have much more memory than accelerators (512GB or more) and can offer extra compute power, e.g., TFLOPS-scale compute in the case of Intel® Xeon® Scalable processors with AMX. Leveraging host resources can be a good alternative to rematerialization. Host offloading selectively moves computation or data between host and device to optimize performance and memory usage.

Host memory offloading keeps frequently-accessed tensors on the device and spills the rest to CPU memory as an extra level of cache. Activation offloading transfers activations computed on-device in the forward pass to the host, stores them in the host memory, and brings them back to the device in the backward pass for gradient computation. This unlocks the ability to train larger models, use bigger batch sizes, and improve throughput.

**Figure 2:** Memory offloading during forward and backward pass

In this blog post, we provide a practical guide to offloading activations with JAX to efficiently train larger models on TPUs with an Intel® Xeon® Scalable Processor.

Enabling memory offloading in JAX

JAX offers multiple strategies for offloading activations, model parameters, and optimizer states to the host. Users can call checkpoint_name() to attach a name to a tensor so that a checkpoint policy can refer to it. The snippet below labels the tensor x with the name "x":

from jax.ad_checkpoint import checkpoint_name

def layer_name(x, w):
  w1, w2 = w
  x = checkpoint_name(x, "x")  # Label this activation "x" so a policy can refer to it
  y = x @ w1
  return y @ w2, None

Users can pass a policy from jax.checkpoint_policies to select the appropriate memory optimization strategy for intermediate values. There are three strategies:

Recomputing during backward pass (default behavior)
Storing on device
Offloading to host memory after forward pass and loading back during backward pass

The code below moves x from device memory to pinned host memory after the forward pass.

from jax import checkpoint_policies as cp

policy = cp.save_and_offload_only_these_names( 
  names_which_can_be_saved=[],         # No values stored on device 
  names_which_can_be_offloaded=["x"],  # Offload activations labeled "x" 
  offload_src="device",                # Move from device memory 
  offload_dst="pinned_host"            # To pinned host memory 
)
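A policy only takes effect once it is attached to a function with jax.checkpoint (also exposed as jax.remat). The sketch below is one way to wire the two snippets above together. The scan over stacked layers and the `loss` function are assumptions added for illustration, not code from the experiments in this post. The offload itself happens when gradients are taken, and it needs a backend that supports the pinned_host memory space, such as TPU.

```python
import jax
import jax.numpy as jnp
from jax import checkpoint_policies as cp
from jax.ad_checkpoint import checkpoint_name


def layer_name(x, w):
    w1, w2 = w
    x = checkpoint_name(x, "x")  # label the layer input so the policy can refer to it
    y = x @ w1
    return y @ w2, None  # (carry, per-step output) pair expected by lax.scan


policy = cp.save_and_offload_only_these_names(
    names_which_can_be_saved=[],         # nothing kept on device
    names_which_can_be_offloaded=["x"],  # activations labeled "x" go to the host
    offload_src="device",
    offload_dst="pinned_host",
)


def loss(w, x):
    # Hypothetical model: the same layer applied once per stacked weight pair.
    remat_layer = jax.checkpoint(layer_name, policy=policy)
    out, _ = jax.lax.scan(remat_layer, x, w)
    return jnp.sum(out)


# On a TPU host, differentiating under jit is what triggers the offload:
#   grads = jax.jit(jax.grad(loss))(w, x)
```

Here `w` is assumed to be a pair of arrays with a leading layer dimension, so that scan hands one `(w1, w2)` pair to each layer call.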

Measuring Host Offloading Benefits on TPU v5p

We evaluated host offloading with JAX on TPUs for both fine-tuning and training workloads. All our experiments were run on Google Cloud Platform, using a single v5p-8 TPU instance with a single-host 4th Gen Intel® Xeon® Scalable Processor.

Fine-tuning PaliGemma2: Using the base PaliGemma2 28B model for vision-language tasks, we fine-tuned the attention layers of the language model (Gemma2 27B) while keeping all other parameters frozen. During fine-tuning, we set the LLM sequence length to 256 and the batch size to 256.

The default checkpoint policy is nothing_saveable, which does not keep any activations on-device during the forward pass. The activations are rematerialized during the backward pass for gradient computation. While this approach reduces memory pressure on the TPU, it increases compute time. To apply host offloading, we offload the Q, K, and V projection activations using save_and_offload_only_these_names. These activations are transferred to host memory (D2H) during the forward pass and fetched back during the backward pass (H2D), so the device neither stores nor recomputes them. Figure 3 shows a 10% reduction in training time from host offloading. This translates directly into a similar reduction in TPU core-hours, yielding meaningful cost savings. The complete fine-tuning recipe is available at [JAX host offloading].

**Figure 3:** (Top) Training time comparison between full rematerialization and host offloading.
(Bottom) Memory analysis with and without host offloading.

Training Llama2-13B using MaxText: MaxText offers several rematerialization strategies that can be specified in the training configuration file. We used the policy remat_policy: 'qkv_proj_offloaded' to offload the Q, K, and V projection activations. Figure 4 shows a ~5% reduction in per-step training time compared to fully rematerializing all activations (remat_policy: 'full').

**Figure 4:** MaxText Llama2-13B training statistics with and without host offloading.
The step time was 5% faster with host offloading.

When to offload activations

Activation offloading is beneficial when the time to transfer activations between host and device is lower than the time to recompute them. The timing depends on multiple factors such as PCIe bandwidth, model size, batch size, sequence length, activation tensor sizes, and the compute capabilities of the device. An additional factor is how much of the data movement can be overlapped with computation to keep the device busy. Figure 5 demonstrates an efficient overlap of the host-to-device transfer with compute during the backward pass in PaliGemma2 28B training.

**Figure 5:** A JAX trace of PaliGemma2 training viewed on Perfetto.
Host-to-device transfers overlap effectively with compute during the backward pass.

Smaller model variants such as PaliGemma2 3B and 9B did not see benefits from host offloading because it is faster to rematerialize all tensors than to transfer them to and from the host. Therefore, identifying the appropriate workload and offloading policy is crucial to realizing performance gains from host offloading.
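The trade-off described in this section can be written as a rough back-of-the-envelope check. The sketch below is a deliberately simplified model, not the methodology used for the measurements above: it assumes the activations cross the host-device link twice, that a fixed fraction of that transfer is hidden behind compute, and that recomputation runs at a constant rate. All function and parameter names are hypothetical.

```python
def exposed_transfer_seconds(activation_bytes, link_bytes_per_s, overlap_fraction):
    """Transfer time that is not hidden behind compute.

    The activations cross the link twice: device-to-host in the forward pass
    and host-to-device in the backward pass.
    """
    round_trip = 2 * activation_bytes / link_bytes_per_s
    return round_trip * (1.0 - overlap_fraction)


def recompute_seconds(recompute_flops, device_flops_per_s):
    """Time to rematerialize the same activations on the device."""
    return recompute_flops / device_flops_per_s


def offload_is_beneficial(activation_bytes, link_bytes_per_s, overlap_fraction,
                          recompute_flops, device_flops_per_s):
    """True when the exposed transfer time is lower than the recompute time."""
    exposed = exposed_transfer_seconds(
        activation_bytes, link_bytes_per_s, overlap_fraction)
    return exposed < recompute_seconds(recompute_flops, device_flops_per_s)
```

In practice the inputs would come from a profiler trace rather than from datasheet numbers, since the achievable overlap depends on the workload.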

Call to Action

If you train on TPUs and are limited by device memory, consider evaluating activation offloading. Start by labeling candidate activations (for example, Q/K/V projections) and compare step time, memory headroom, and overall cost across representative workloads.
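As a starting point for such a comparison, the sketch below labels Q/K/V projection outputs and builds the same loss under two policies, so that step time and memory can be compared between them. The label names `q_proj`, `k_proj`, and `v_proj`, the toy attention block, and the `make_loss` helper are all hypothetical and are not taken from PaliGemma2 or MaxText.

```python
import jax
import jax.numpy as jnp
from jax import checkpoint_policies as cp
from jax.ad_checkpoint import checkpoint_name


def attention_block(x, wq, wk, wv):
    # Label each projection output so a policy can select it by name.
    q = checkpoint_name(x @ wq, "q_proj")
    k = checkpoint_name(x @ wk, "k_proj")
    v = checkpoint_name(x @ wv, "v_proj")
    scores = jax.nn.softmax(q @ k.T / jnp.sqrt(q.shape[-1]), axis=-1)
    return scores @ v


# Candidate policy: offload only the labeled projections to pinned host memory.
qkv_offload_policy = cp.save_and_offload_only_these_names(
    names_which_can_be_saved=[],
    names_which_can_be_offloaded=["q_proj", "k_proj", "v_proj"],
    offload_src="device",
    offload_dst="pinned_host",
)


def make_loss(policy):
    """Build the same scalar loss under a given checkpoint policy."""
    block = jax.checkpoint(attention_block, policy=policy)

    def loss(x, wq, wk, wv):
        return jnp.sum(block(x, wq, wk, wv))

    return loss


offload_loss = make_loss(qkv_offload_policy)
remat_loss = make_loss(cp.nothing_saveable)  # baseline: recompute everything
```

Timing `jax.jit(jax.grad(offload_loss))` against `jax.jit(jax.grad(remat_loss))` on the target TPU is one way to see whether offloading pays off for a given shape.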

In our experiments, we observed up to ~10% improvement in end-to-end training time for larger workloads, which can reduce total cost of ownership (TCO) by shortening time-to-train or enabling the same workload on smaller instances.

Acknowledgments

Thanks to Emilio Cota and Karlo Basioli from Google, and Eugene Zhulenev (formerly at Google).

Source: Google Open Source Blog

Chrome Beta for iOS Update

Hi everyone! We've just released Chrome Beta 148 (148.0.7778.8) for iOS; it'll become available on App Store in the next few days.

You can see a partial list of the changes in the Git log. If you find a new issue, please let us know by filing a bug.

Chrome Release Team
Google Chrome

Source: Google Chrome Releases

Bringing Rust to the Pixel Baseband

Posted by Jiacheng Lu, Software Engineer, Google Pixel Team

Google is continuously advancing the security of Pixel devices. We have been focusing on hardening the cellular baseband modem against exploitation. Recognizing the risks associated with the complex modem firmware, Pixel 9 shipped with mitigations against a range of memory-safety vulnerabilities. For Pixel 10, Google is advancing its proactive security measures further. Following our previous discussion on "Deploying Rust in Existing Firmware Codebases", this post shares a concrete application: integrating a memory-safe Rust DNS (Domain Name System) parser into the modem firmware. The new Rust-based DNS parser significantly reduces our security risk by mitigating an entire class of vulnerabilities in a risky area, while also laying the foundation for broader adoption of memory-safe code in other areas.

Here we share our experience of working on it, in the hope that it inspires wider use of memory-safe languages in low-level environments.

Why Modem Memory Safety Can’t Wait

In recent years, we have seen increasing interest in the cellular modem from attackers and security researchers. For example, Google's Project Zero gained remote code execution on Pixel modems over the Internet. The Pixel modem has tens of megabytes of executable code. Given the complexity and remote attack surface of the modem, other critical memory safety vulnerabilities may remain in the predominantly memory-unsafe firmware code.

Why DNS?

The DNS protocol is most commonly known in the context of browsers finding websites. With the evolution of cellular technology, modern cellular communications have migrated to digital data networks; consequently, even basic operations such as call forwarding rely on DNS services.

DNS is a complex protocol and requires parsing of untrusted data, which can lead to vulnerabilities, particularly when implemented in a memory-unsafe language (example: CVE-2024-27227). Implementing the DNS parser in Rust offers value by decreasing the attack surfaces associated with memory unsafety.

Picking a DNS library

DNS already has a level of support in the open-source Rust community. We evaluated multiple open source crates that implement DNS. Based on criteria shared in earlier posts, we identified hickory-proto as the best candidate. It has excellent maintenance, over 75% test coverage, and widespread adoption in the Rust community. Its pervasiveness shows its potential as the de facto DNS choice with long-term support. Although hickory-proto initially lacked no_std support, which is needed for bare-metal environments (see our previous post on this topic), we were able to add support to it and its dependencies.

Adding `no_std` support

The work to enable no_std for hickory-proto is mostly mechanical. We shared the process in a previous post. We modified hickory-proto and its dependencies to enable no_std support. The upstream no_std work also resulted in a no_std URL parser, which is beneficial to other projects.

The above PRs are great examples of how to extend no_std support to existing std-only crates.

Code size study

Code size is one of the factors that we evaluated when picking the DNS library to use.

Code size by category

| Component | Size |
| --- | --- |
| Rust-implemented shim that calls hickory-proto on receiving a DNS response | 4KB |
| core, alloc, compiler_builtins (reusable, one-time cost) | 17KB |
| hickory-proto library and dependencies | 350KB |
| Sum | 371KB |

We built prototypes and measured size with size-optimized settings. As expected, hickory-proto is not designed with embedded use in mind and is not optimized for size. As the Pixel modem is not tightly memory constrained, we prioritized community support and code quality, leaving code size optimizations as future work.

However, the additional code size may be a blocker for other embedded systems. This could be addressed in the future by adding feature flags to conditionally compile only the required functionality. Implementing this modularity would be valuable future work.

Hook-up Rust to modem firmware

Before building the Rust DNS library, we defined several Rust unit tests to cover basic arithmetic, dynamic allocations, and FFI to verify the integration of Rust with the existing modem firmware code base.

Compile Rust code to staticlib

While using cargo is the default choice for compilation in the Rust ecosystem, it presents challenges when integrating it into existing build systems. We evaluated two options:

1. Use cargo to build a staticlib before the modem builds, then add the produced staticlib to the linking step.
2. Work directly with rustc and integrate the Rust compilation steps into the existing modem build system.

Option #1 does not scale if we are going to add more Rust components in the future, as linking multiple staticlibs may cause duplicated symbol errors. We chose option #2 as it scales more easily and allows tighter integration into our existing build system. Our existing C/C++ codebase uses Pigweed to drive the primary build system. Pigweed supports Rust targets (example) with direct calls to rustc through rust tools defined in GN.

We compiled all the Rust crates, including hickory-proto, its dependencies, and core, compiler_builtins, and alloc, to rlib. Then, we created a staticlib target with a single lib.rs file that references all the rlib crates using extern crate declarations.

Build core, alloc, and compiler_builtins

Android’s Rust Toolchain distributes source code of core, alloc, and compiler_builtins, and we leveraged this for the modem. They can be included in the build graph by adding a GN target with crate_root pointing to the root lib.rs of each crate.

Pixel modem firmware already has a well-tested and specialized global memory allocation system to support some dynamic memory allocations. alloc support was added by implementing the GlobalAlloc trait with FFI calls to the allocator's C APIs:

use core::alloc::{GlobalAlloc, Layout};

extern "C" {
    fn mem_malloc(size: usize, alignment: usize) -> *mut u8;
    fn mem_free(ptr: *mut u8, alignment: usize);
}

struct MemAllocator;

unsafe impl GlobalAlloc for MemAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        mem_malloc(layout.size(), layout.align())
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        mem_free(ptr, layout.align());
    }
}

#[global_allocator]
static ALLOCATOR: MemAllocator = MemAllocator;

Pixel modem firmware already implements a backend for the Pigweed crash facade as the global crash handler. Exposing it into Rust panic_handler through FFI unifies the crash handling for both Rust and C/C++ code.

#![no_std]
use core::panic::PanicInfo;

extern "C" {
    pub fn PwCrashBackend(signature: *const i8, file_name: *const i8, line: u32);
}

#[panic_handler]
fn panic(panic_info: &PanicInfo) -> ! {
    let mut filename = "";
    let mut line_number: u32 = 0;

    if let Some(location) = panic_info.location() {
        filename = location.file();
        line_number = location.line();
    }

    let mut cstr_buffer = [0u8; 128];
    // Never writes to the last byte to make sure `cstr_buffer` is always zero
    // terminated.
    let (_, writer) = cstr_buffer.split_last_mut().unwrap();
    for (place, ch) in writer.iter_mut().zip(filename.bytes()) {
        *place = ch;
    }

    unsafe {
        PwCrashBackend(
            "Rust panic\0".as_ptr() as *const i8,
            cstr_buffer.as_ptr() as *const i8,
            line_number,
        );
    }

    loop {}
}

Link Rust staticlib

The Pixel modem firmware linking has a step that calls the linker to link all the objects generated from C/C++ code. By using llvm-ar -x to extract object files from the Rust combined staticlib and supplying them to the linker, the Rust code appears in the final modem image.

We experienced a performance issue caused by weak symbols during linking. The inclusion of Rust core and compiler_builtins caused unexpected power and performance regressions on various tests. Upon analysis, we realized that the modem-optimized implementations of memset and memcpy provided by the modem firmware were accidentally replaced by those defined in compiler_builtins. This appears to happen because both the compiler_builtins crate and the existing codebase define these symbols as weak, so the linker has no way to decide which one to prefer. We fixed the regression by stripping the compiler_builtins crate before linking using a one-line shell script.

llvm-ar -t <rust staticlib> | grep compiler_builtins | xargs llvm-ar -d <rust staticlib>

Integrating hickory-proto

Expose Rust API and calling back to C++

For the DNS parser, we declared the DNS response parsing API in C and then implemented the same API in Rust.

int32_t process_dns_response(uint8_t*, int32_t);

The Rust function returns an integer error code. The DNS answers received in the response must be written to in-memory data structures that are coupled with the original C implementation. Therefore, we use the existing C functions to do so, dispatching them from the Rust implementation.

// `#[no_mangle]` keeps the symbol name matching the C declaration above.
#[no_mangle]
pub unsafe extern "C" fn process_dns_response(
    dns_response: *const u8,
    response_len: i32,
) -> i32 {
    //... validate inputs `dns_response` and `response_len`.

    // SAFETY:
    // It is safe because `dns_response` is null checked above. `response_len`
    // is passed in, safe as long as it is set correctly by vendor code. The
    // cast to `usize` relies on the validation above rejecting negative lengths.
    match process_response(unsafe {
        slice::from_raw_parts(dns_response, response_len as usize)
    }) {
         Ok(()) => 0,
         Err(err) => err.into(),
    }
}

fn process_response(response: &[u8]) -> Result<()> {
    let response = hickory_proto::op::Message::from_bytes(response)?;
    let response = hickory_proto::xfer::DnsResponse::from_message(response)?;

   
    for answer in response.answers() {  
        match answer.record_type() {
            hickory_proto::RecordType:... => {
                // SAFETY:
                // It is safe because the callback function does not store
                // reference of the inputs or their members.
                unsafe {
                    callback_to_c_function(...)?;
                }
            }
            
            // ... more match arms omitted.
        }    
    }

    Ok(())
}

In our case, the DNS response parsing function API is simple enough for us to write by hand, while the callbacks to C functions for handling the response have complex data type conversions. Therefore, we leveraged bindgen to generate FFI code for the callbacks.

Build third-party crates

Even with all features disabled, hickory-proto introduces more than 30 dependent crates. With manually written build rules, it is difficult to ensure correctness, and they scale poorly when upgrading dependencies to new versions.

Fuchsia has developed cargo-gnaw to support building their third-party Rust crates. Cargo-gnaw works by invoking cargo metadata to resolve dependencies, then parsing the output and generating GN build rules. This ensures correctness and eases maintenance.
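To make the idea concrete, the sketch below turns the dependency graph reported by cargo metadata into GN-style targets. It is not cargo-gnaw's implementation or output format: the `gn_rules_from_metadata` helper, the shape of the emitted targets, and the sample package names are made up for illustration. Only the `packages` and `resolve.nodes` fields of the `cargo metadata --format-version 1` JSON are assumed.

```python
import json


def gn_rules_from_metadata(metadata):
    """Emit one GN-style target per resolved package in `cargo metadata` output."""
    # Map package IDs to crate names so dependencies can be referenced by name.
    names = {pkg["id"]: pkg["name"] for pkg in metadata["packages"]}
    rules = []
    for node in metadata["resolve"]["nodes"]:
        name = names[node["id"]]
        deps = sorted(names[dep_id] for dep_id in node["dependencies"])
        dep_lines = "".join(f'    ":{dep}",\n' for dep in deps)
        rules.append(
            f'rust_library("{name}") {{\n'
            f"  deps = [\n{dep_lines}  ]\n"
            "}\n"
        )
    return "\n".join(rules)


# Made-up stand-in for the JSON printed by `cargo metadata --format-version 1`.
SAMPLE = json.loads("""
{
  "packages": [
    {"id": "shim 0.1.0", "name": "shim"},
    {"id": "parser 0.2.0", "name": "parser"}
  ],
  "resolve": {
    "nodes": [
      {"id": "shim 0.1.0", "dependencies": ["parser 0.2.0"]},
      {"id": "parser 0.2.0", "dependencies": []}
    ]
  }
}
""")

print(gn_rules_from_metadata(SAMPLE))
```

Generating the rules from the resolved graph, rather than writing them by hand, means a dependency upgrade only requires rerunning the generator.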

Conclusion

The Pixel 10 series of phones marks a pivotal moment, being the first Pixel device to integrate a memory-safe language into its modem.

While replacing one piece of risky attack surface is itself valuable, this project lays the foundation for future integration of memory-safe parsers and code into the cellular baseband, ensuring the baseband’s security posture will continue to improve as development continues.

Special thanks to Armando Montanez, Bjorn Mellem, Boky Chen, Cheng-Yu Tsai, Dominik Maier, Erik Gilling, Ever Rosales, Hungyen Weng, Ivan Lozano, James Farrell, Jeffrey Vander Stoep, Jiacheng Lu, Jingjing Bu, Min Xu, Murphy Stein, Ray Weng, Shawn Yang, Sherk Chung, Stephan Chen, Stephen Hines.

Source: Google Online Security Blog

6 easy ways to study for finals with Gemini

Learn how to use Gemini as your personal study partner — from turning messy lecture notes into podcasts to testing your knowledge with custom quizzes.

Source: The Official Google Blog

Booking restaurants in the UK just got easier with AI in Search

We’re bringing new agentic capabilities to AI Mode in Search to help you book restaurant reservations.

Source: The Official Google Blog

Dev Channel Update for ChromeOS / ChromeOS Flex

The Dev channel is being updated to OS version 16640.2.0 (Browser version 148.0.7778.6) for most ChromeOS devices.

If you find new issues, please let us know in one of the following ways:

File a bug
Visit our ChromeOS communities

General: Chromebook Help Community
Beta Specific: ChromeOS Beta Help Community

Report an issue or send feedback on Chrome

Interested in switching channels? Find out how.

Andy Wu,
Google ChromeOS

Source: Google Chrome Releases

Expanding access to longer musical tracks in the Gemini app

Last month, we introduced Lyria 3 Pro in the Gemini app for select Business, Enterprise, and Education editions. Starting today, more users can create longer tracks with Lyria 3 Pro in the Gemini app.

Lyria 3 Pro allows users to create tracks up to three minutes long, with customization and creative control. Lyria 3 Pro better understands musical composition, so users can now prompt for specific elements like intros, verses, choruses, and bridges. It’s great for experimenting with different styles or adding custom tracks to projects, presentations, or assets.

Note: Lyria 3 Pro music generation in Gemini is currently available globally supporting languages such as English, Japanese, Korean, Hindi, Spanish, Portuguese, German, and French for users over the age of 18.

Getting started

Admins: The Gemini app and related in-app tools are controlled by the Generative AI settings in the Workspace Admin console. Music generation in Gemini is subject to these existing controls. Visit the Help Center to learn more about turning the Gemini app on or off.
End users: End users will receive access to full-length songs automatically. To get started, select “Create music” from the tools menu. Visit the Help Center to learn more about limits.

Rollout pace

Rapid Release and Scheduled Release domains: Full rollout (1–3 days for feature visibility) started on April 8, 2026

Availability

Lyria 3 Pro is now available to the following Google Workspace customers and users with personal accounts who are 18 years or older and signed in to the Gemini app:

Business: Business Starter
Enterprise: Enterprise Starter
Education: Education Fundamentals, Standard, and Plus
Consumer: Google AI Plus
Other Editions: Frontline Starter, Standard, and Plus; Nonprofits

Lyria 3 Pro is already available to the following Google Workspace customers and users with personal accounts who are 18 years or older and signed in to the Gemini app:

Business: Business Standard and Plus
Enterprise: Enterprise Standard and Plus
AI Add-ons: Google AI Pro for Education, AI Expanded Access, AI Ultra Access
Consumer: Google AI Pro and Ultra

Resources

Keyword Blog: Lyria 3 Pro: Create longer tracks in more Google products
Google Workspace Updates: Create longer musical tracks in the Gemini app with Lyria 3 Pro
Google Workspace Admin Help: Turn the Gemini app on or off
Google Help: Generate music with Gemini Apps

Source: Google Workspace Updates