Hi everyone! We've just released Chrome Stable 143 (143.0.7499.151) for iOS; it'll become available on the App Store in the next few hours.
This release includes stability and performance improvements. You can see a full list of the changes in the Git log. If you find a new issue, please let us know by filing a bug.
Presenters will now be able to share stereo sound when presenting content with stereo audio in Google Meet. During virtual meetings, presenters often share content with audio, such as music before a meeting starts, videos for review or discussion during the meeting, and more. Now, if the audio is originally in stereo (with separate left and right audio channels), the stereo sound will apply to the audio presented via Meet as well.
This can help create a more natural and immersive listening experience, improving the quality of the sound for all attendees.
Additional details:
Only users on the web will be able to send stereo audio.
Only Chrome and Firefox browsers will be able to receive stereo audio.
Getting started
Admins: This feature will be on by default. There is no admin control for it.
End users: This feature will be on by default when applicable content is shared via screen sharing. Visit the Help Center to learn more about presenting in Google Meet.
We are introducing Silent Test mode, a new offering that lets you run a large-scale eCDN (Enterprise Content Delivery Network) test with your users and devices, across your entire network, while minimizing any risk of impacting the viewer experience.
Google Meet eCDN provides peer-assisted media delivery for Meet live streams, saving up to 95% of the original bandwidth. To optimize bandwidth savings, administrators may want to fine-tune peering policies and custom rules to match their network topology. Silent Test is a risk-minimizing mode that helps admins validate those configurations by running large-scale eCDN tests with real user profiles and devices across large or global networks.
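As a rough illustration of what "up to 95%" can mean in practice, the sketch below estimates the traffic pulled from Google's servers for one live stream with and without peer-assisted delivery. The viewer count and bitrate are hypothetical, the function name is made up for this example, and 95% is the stated upper bound rather than a guaranteed result.

```python
from typing import Tuple


def ecdn_bandwidth_estimate(
    viewers: int, bitrate_mbps: float, savings: float = 0.95
) -> Tuple[float, float]:
    """Return (direct_mbps, with_ecdn_mbps) for one live stream.

    `savings` is the fraction of server traffic replaced by peer delivery.
    0.95 is the best case quoted above; real savings depend on topology.
    """
    if not 0.0 <= savings <= 1.0:
        raise ValueError("savings must be between 0 and 1")
    direct = viewers * bitrate_mbps          # every viewer streams from the server
    with_ecdn = direct * (1.0 - savings)     # only the remainder leaves the server
    return direct, with_ecdn


# Hypothetical site: 2,000 viewers watching a 2.5 Mbps stream.
direct, peered = ecdn_bandwidth_estimate(viewers=2000, bitrate_mbps=2.5)
print(f"direct: {direct:.0f} Mbps, with eCDN (best case): {peered:.0f} Mbps")
```

Lowering `savings` to a more conservative value shows how sensitive the estimate is to how well peering policies match the network.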
When Silent Test mode is turned on, Meet eCDN will run in a full simulation mode during large meetings and live streams. Live stream clients collect and report real-world data and statistics on how peer-based delivery through eCDN would perform, while showing viewers media that is directly served from Google's servers. This lets admins test various configuration options quickly and with low risk.
In Silent Test mode, clients will:
Stream media directly from Google's servers and use it for viewer playback
Discover and connect to peers to form Peering Groups
Operate in their client role (Root, Leaf or Branch) in a full P2P topology
Exchange actual media for simulation purposes and to generate real-world network load
Report back any connectivity or bandwidth issues between peers
Collect all statistics in Meet Quality Tool and clearly mark metrics from Silent Tests
Advanced operation
In addition to simulating eCDN for regular live streams, administrators can now also perform large-scale network tests by scheduling* workload scripts on users' devices to run transparently in the background. Since no live stream needs to be arranged for actual users to join, those tests can run as often as needed or during non-peak hours. This is a powerful way for admins to validate iterative changes faster.
*Using an existing endpoint management system that allows remote script execution.
Getting started
Admins: This feature will be OFF by default. Visit the Help Center to learn more about how to turn on Silent Test mode. Complete the initial setup for Meet eCDN before turning on Silent Test mode. Learn more about how to set up Meet eCDN.
End users: There is no end user setting for this feature.
As 2025 comes to an end, we’re revisiting some of the biggest updates and key developments in Google Play this year. It all reflects our commitment to making Google Play…
The Stable channel has been updated to 143.0.7499.146/.147 for Windows/Mac and 143.0.7499.146 for Linux, which will roll out over the coming days/weeks. A full list of changes in this build is available in the Log.
2025-12-12: Updated to include more details for bug number 466192044
Security Fixes and Rewards
Note: Access to bug details and links may be kept restricted until a majority of users are updated with a fix. We will also retain restrictions if the bug exists in a third party library that other projects similarly depend on, but haven’t yet fixed.
This update includes 2 security fixes. Below, we highlight fixes that were contributed by external researchers. Please see the Chrome Security Page for more information.
[$10000][448294721] High CVE-2025-14765: Use after free in WebGPU. Reported by Anonymous on 2025-09-30
[TBD][466786677] High CVE-2025-14766: Out of bounds read and write in V8. Reported by Shaheen Fazim on 2025-12-08
We would also like to thank all security researchers that worked with us during the development cycle to prevent security bugs from ever reaching the stable channel.
Interested in switching release channels? Find out how here. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.
by The GRL Team, UC San Diego and Lin Chai & Srikanth Kilaru, Google ML Frameworks
Introduction
JAX is widely recognized for its power in training large-scale AI models. However, a primary bottleneck in the next phase of AI development—LLM post-training with Reinforcement Learning (RL)—is the scarcity of environments with verifiable rewards.
Today, we are highlighting the work of the GRL (Game Reinforcement Learning) team at UC San Diego. To solve the data bottleneck, they have built a pipeline to turn video games into rigorous reasoning benchmarks. They utilized Tunix, a JAX-native research-friendly RL framework that supports multi-host, multi-turn capabilities, and leveraged the Google TPU Research Cloud (TRC) to scale their experiments. The results are promising: this approach has yielded significant improvements in model quality, particularly in planning and reasoning tasks, proving that games can be a viable substrate for serious AI capability training.
In this post, the GRL team explains how they are combining game environments, the modular Tunix library for RL post-training, and TPU compute to train the next generation of agents.
Why Verifiable Games for LLM Post-Training?
Current RL post-training has shown strong gains in domains like math and coding because success can be auto-checked. However, these settings are often narrow and short-term. We are effectively overfitting RL to clean problems, while the next generation of agents must operate in messy, multi-step worlds.
To unlock RL as a systematic method for reasoning, we need a diverse pool of environments where rewards are grounded in explicit, machine-checkable rules. Games are this missing, underused substrate.
The Performance Gap: LLMs still perform surprisingly poorly on many strategy games, revealing a clear gap between model behavior and human-level interactive competence.
Verifiable Signals: Games come with built-in verifiable signals—wins, scores, puzzle completion—meaning outcomes are automatically and unambiguously graded without human labeling.
Long-Horizon Reasoning: Unlike short QA tasks, games force models to plan, explore, and reason over many steps.
Abundance: Decades of RL research have produced a standardized ecosystem of diverse environments ready to be recycled.
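To make "verifiable signal" concrete, here is a minimal sketch of a rule-based reward for a Sokoban-like puzzle. It is not GRL's reward code: the function name, the reward of 1.0 for solving, and the 0.01 step penalty are all assumptions chosen for illustration. The point is only that the outcome is checked by an explicit rule, with no human labeling.

```python
from typing import FrozenSet, Tuple

Cell = Tuple[int, int]  # (row, column) on the puzzle grid


def sokoban_reward(
    boxes: FrozenSet[Cell],
    goals: FrozenSet[Cell],
    steps_taken: int,
    step_penalty: float = 0.01,  # hypothetical value
) -> float:
    """Machine-checkable reward for a Sokoban-like state.

    Assumes the level has as many boxes as goals, so the puzzle is
    solved exactly when the two sets of cells are equal.
    """
    solved = boxes == goals
    return (1.0 if solved else 0.0) - step_penalty * steps_taken


# A one-box level: solved in the first call, unsolved in the second.
goal_cells = frozenset({(1, 2)})
print(sokoban_reward(frozenset({(1, 2)}), goal_cells, steps_taken=10))
print(sokoban_reward(frozenset({(1, 1)}), goal_cells, steps_taken=10))
```

Because the check is a pure function of the game state, the same grading can be rerun on any logged trajectory.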
Game Reinforcement Learning (GRL): A Unified Game-to-Post-Training Pipeline
To harness this ecosystem, we built GRL, a comprehensive suite designed to recycle diverse game environments into a reusable post-training resource. Our mission is to prioritize environments with executable success checks—ranging from text-based puzzles to embodied 3D worlds and web/GUI workflows. Our code and ecosystem live under the LM Games organization (lmgame.org).
GRL provides three key capabilities:
A Unified Pipeline: We standardize the conversion of games into RL-ready environments with structured states and consistent metrics. This makes results comparable across models and research groups.
Versatile Configuration: Researchers can tailor interaction styles (e.g., max_turns, natural language feedback) while mixing training data from different tasks seamlessly. This allows for training on puzzles, math, and web tasks within a single run.
Algorithm-Agnostic Interface: GRL works with any agentic training algorithm. While we frequently use PPO, the system serves as a robust testbed for developing new RL techniques.
The Engine: Plugging into the Tunix RL Framework
Designed for Research Flexibility and Multi-Turn Agents
In practice, plugging a GRL game agent into Tunix is seamless thanks to its modular design. Tunix is built specifically to support multi-turn agentic tasks, allowing researchers to leverage native one-turn inference APIs to achieve complex multi-turn rollouts, then batch those outputs directly back into the training flow. This research flexibility is key; the framework is lightweight enough for quick iteration and benchmarking, yet modular enough to allow fine-grained adjustments to reward functions, algorithms, and hardware-aware settings like mesh sizes.
We first define an agent_cfg (see picture above) that tells the system which game to play (e.g., Sokoban or Tetris), how the LLM should talk (chat template + reasoning style), and its budgets (max turns, tokens per turn, action format). On the Tunix side, we then load a pre-trained model into three roles (actor, critic, and reference), build a ClusterConfig to specify rollout and training configs, and build a PpoConfig to specify RL hyperparameters. The glue is minimal, and the layout is clear and research-friendly: once agent_cfg, ppo_cfg, and cluster_cfg are defined, we construct an RLCluster and pass everything into PpoLearner, which gives us a complete multi-turn PPO trainer in JAX.
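For readers who want a feel for the kind of information an agent config carries, here is a small stand-alone sketch. It is not the GRL or Tunix schema: the class name, field names, and all values are hypothetical, chosen only to mirror the three groups described above (game, how the LLM talks, budgets).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfigSketch:
    """Hypothetical stand-in for agent_cfg; not the real GRL schema."""

    game: str                 # which environment to play
    chat_template: str        # how prompts are formatted for the LLM
    reasoning_style: str      # how the model is asked to reason
    max_turns: int            # budget: conversational turns per episode
    max_tokens_per_turn: int  # budget: generated tokens per turn
    action_format: str        # how actions are written in the reply

    def token_budget(self) -> int:
        """Upper bound on generated tokens for one episode."""
        return self.max_turns * self.max_tokens_per_turn


# All values below are placeholders for illustration.
cfg = AgentConfigSketch(
    game="sokoban",
    chat_template="example-chat-template",
    reasoning_style="think-then-act",
    max_turns=5,
    max_tokens_per_turn=128,
    action_format="comma-separated moves",
)
print(cfg.token_budget())
```

Keeping the budgets in one immutable object makes it easy to compare runs that differ only in turn or token limits.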
Our multi-turn RL workflow is equally lightweight from the user's point of view. For example, with a 5-turn budget, the trainer repeatedly lets the LLM "play" the game for up to five conversational turns: at each turn it sees the current grid or state, reasons in language using the chat template, outputs a series of actions, and receives the next state and a verifiable reward signal (win/loss/score/step penalty). GRL's agent + env configs handle all the orchestration: they log observations, actions, and rewards into structured trajectories, which Tunix then turns into token-level advantages and returns for PPO updates. You don't manually build datasets or rollouts; the trainer owns the loop: interact -> log -> compute rewards -> update policy -> repeat.
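The interact -> log -> compute rewards part of that loop can be sketched in plain Python. This is a toy illustration, not Tunix or GRL code: the corridor environment, the scripted policy standing in for the LLM, and every name below are hypothetical. It also stops at per-turn discounted returns, whereas the real trainer computes token-level advantages and performs the PPO update.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Step = Tuple[str, List[str], float]  # (observation, actions, reward)


@dataclass
class CorridorEnv:
    """Toy game: walk from position 0 to `goal` along a line."""

    goal: int = 3
    pos: int = 0

    def observe(self) -> str:
        return f"pos={self.pos} goal={self.goal}"

    def step(self, actions: List[str]) -> Tuple[float, bool]:
        """Apply one turn's actions; return (reward, done)."""
        reward = 0.0
        for action in actions:
            self.pos += 1 if action == "right" else -1
            reward -= 0.1  # step penalty
            if self.pos == self.goal:
                return reward + 1.0, True  # win bonus
        return reward, False


def rollout(env: CorridorEnv, policy: Callable[[str], List[str]],
            max_turns: int = 5) -> List[Step]:
    """Interact for up to `max_turns` turns and log a structured trajectory."""
    trajectory: List[Step] = []
    for _ in range(max_turns):
        obs = env.observe()
        actions = policy(obs)  # an LLM call in the real pipeline
        reward, done = env.step(actions)
        trajectory.append((obs, actions, reward))
        if done:
            break
    return trajectory


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Reward-to-go for each turn, computed back to front."""
    out: List[float] = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        out.append(running)
    return out[::-1]


def always_right(obs: str) -> List[str]:
    """Scripted stand-in for the model: two moves to the right per turn."""
    return ["right", "right"]


traj = rollout(CorridorEnv(goal=3), always_right, max_turns=5)
returns = discounted_returns([reward for _, _, reward in traj])
print(len(traj), returns)
```

In this toy run the episode ends on the second turn, well inside the 5-turn budget, which is why the logged trajectory is shorter than `max_turns`.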
In our preliminary experiments using this setup, training Qwen2.5-7B-Instruct on Sokoban and Tetris yielded strong in-domain gains (+2-56% across game variants). We also observed modest generalization to out-of-domain tasks, with consistent improvements in planning tasks (Blocksworld: +3-7%) and positive but unstable signals in computer use (Webshop: ~+6%). All scripts and configs are available in the GRL repo: https://github.com/lmgame-org/GRL/tree/main. To reproduce the end-to-end Tunix + GRL training example (including our Sokoban/Tetris runs), you can simply clone the repo and run one line: bash tunix_quick_training_example.sh.
Google TRC & TPUs: Accelerating Game-Based RL at Scale
A critical component of our research was the Google TPU Research Cloud (TRC) program. Access to Cloud TPUs allowed us to move from small-scale prototypes to production-grade training runs with minimal friction.
TPUs and JAX directly attacked our two biggest bottlenecks:
Rollout Throughput: Using the vLLM-TPU path via tpu-inference, we could serve multiple model families on the same TPU v5p backend. This boosted sampling throughput, making the data-collection loop tighter and multi-environment concurrency cheaper.
Multi-Host Scale for 7B Models: Tunix's lightweight design combined with JAX's mesh-based sharding allowed us to scale the same code from a single host to multi-host setups declaratively. This capability was essential for our experiments with 7B-parameter models (such as Qwen2.5-7B), where we used two v5p-8 hosts with minimal code change (in fact, only an environment variable setting). The scale-up was seamless, showing that the infrastructure can handle the heavy computation required for modern LLM post-training without complex engineering overhauls.
Hardware Advantage: At the hardware level, the performance gains were significant. Each TPU v5p chip delivers around 459 BF16 TFLOPs, compared to roughly 312 on an NVIDIA A100. This raw power, combined with the TRC program's support, meant that large-N studies—involving more seeds, longer horizons, and more environments—became routine experiments rather than "special ops" engineering challenges.
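For a quick sense of scale, the two per-chip figures quoted above can be compared directly. The snippet uses only those two numbers; they are peak BF16 figures, so the ratio is a ceiling on the per-chip speedup and not a measured training throughput.

```python
# Peak BF16 figures as quoted in the text (TFLOPs per chip).
TPU_V5P_BF16_TFLOPS = 459.0
A100_BF16_TFLOPS = 312.0

# Ratio of peak compute per chip; realized speedups depend on the workload.
peak_ratio = TPU_V5P_BF16_TFLOPS / A100_BF16_TFLOPS
print(f"TPU v5p peak is about {peak_ratio:.2f}x an A100 per chip")
```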
This combination of Tunix's flexible abstraction and TRC's massive compute resources allowed us to iterate quickly on ideas while benefiting from production-grade infrastructure.
Get Started
GRL and Tunix are open for the community to explore. You can reproduce our end-to-end training example (including the Sokoban/Tetris runs) by cloning the repo, following the installation instructions, and then running a single command: bash tunix_quick_training_example.sh
The Extended Stable channel has been updated to 142.0.7499.243 for Windows and Mac, which will roll out over the coming days/weeks.
A full list of changes in this build is available in the log. Interested in switching release channels? Find out how here. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.
CC is our new experimental AI productivity agent from Google Labs, built with Gemini to help you stay organized and get things done. When you sign up, it connects your G…