Chrome Beta for Desktop Update

The Beta channel has been updated to 121.0.6167.85 for Windows, Mac and Linux.

A partial list of changes is available in the Git log. Interested in switching release channels? Find out how. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.

Daniel Yip
Google Chrome

Beta Channel Update for ChromeOS/ChromeOS Flex

The Beta channel is being updated to OS version: 15699.40.0, Browser version: 121.0.6167.82 for most ChromeOS devices.

If you find new issues, please let us know in one of the following ways:

  1. File a bug
  2. Visit our ChromeOS communities
    1. General: Chromebook Help Community
    2. Beta Specific: ChromeOS Beta Help Community
  3. Report an issue or send feedback on Chrome

Interested in switching channels? Find out how.

Matt Nelson,
Google ChromeOS


Google Workspace Updates Weekly Recap – January 19, 2024

2 New updates

Unless otherwise indicated, the features below are available to all Google Workspace customers, and are fully launched or in the process of rolling out. Rollouts should take no more than 15 business days to complete if launching to both Rapid and Scheduled Release at the same time. If not, each stage of rollout should take no more than 15 business days to complete.


Unsubscribe from emails on Gmail more easily 
We know managing unwanted emails is a source of frustration for many users. That’s why we announced new guidelines for bulk senders a few months ago to ensure users stay safe. Now, we’re introducing new ways to make it even easier to unsubscribe from unwanted emails in Gmail on web and mobile by:
  • Adding the unsubscribe button to the hover actions in the thread list on web. When the unsubscribe button is clicked, Gmail sends an HTTP request or an email to the sender to remove your email address from their mailing list. 
  • Moving the unsubscribe button from the three-dot menu so it appears more prominently in your email on your Android and iOS devices.
Additionally, we know it is common for people to receive unwanted messages, despite initially signing up to receive them from brands or organizations. These messages often originate from legitimate senders, and marking them as spam can negatively impact the sender's email reputation and can potentially affect the deliverability of future emails. For this reason, we're changing the text of the buttons to make it clearer for users to choose between unsubscribing and reporting a message as spam. 

These features are now available to all Google Workspace customers and users with personal Google Accounts on web and iOS devices, and are rolling out now on Android devices at an extended pace (potentially longer than 15 days for feature visibility). | Learn more about unsubscribing from emails and reporting spam in Gmail.


Updating the mobile experience on Android tablets and foldable devices 
Last year, we announced enhancements to the Google Drive mobile experience on Android tablets. This included several modernizations, such as shifting the navigation bar to the side of the Drive app. Similarly, we’ve moved the navigation bar for Gmail to the side to optimize the experience on tablets and foldable devices. This migration will make it easier for users to switch tabs in Gmail. | Available now to all Google Workspace customers and users with personal Google Accounts. 




Previous announcements

The announcements below were published on the Workspace Updates blog earlier this week. Please refer to the original blog posts for complete details.


Launch the Lucidspark whiteboarding app directly from Google Meet Series One Board 65 and Desk 27 devices 
By the end of the month, Lucidspark by Lucid Software can be launched directly from Google Meet Series One Board 65 and Desk 27 devices. With this integration, users will be able to share and participate in a Lucidspark whiteboard session in a Meet call, either initiated from the Series One Board 65 and Desk 27 or a remote participant on the call. | Learn more about the Lucidspark whiteboarding app. 

Use comments & action items on your client-side encrypted Google Docs 
You can now collaborate with others on client-side encrypted Google Docs to add, edit, reply, filter, or delete comments. You can also assign action items to yourself or others. This added functionality helps bring parity to unencrypted docs while also ensuring your data is behind encryption keys you control, including the identity provider used to access those keys. | Available to Google Workspace Enterprise Plus, Education Standard and Education Plus customers only. | Learn more about CSE comments & action items in Docs. 

Combine multiple video effects, improve lighting and audio in Google Meet 
We’re launching three new features to personalize your appearance in Google Meet. | Learn more about combining multiple video effects on web and mobile, studio lighting on web and studio sound. 

Join meetings as a guest without a Google account on mobile devices 
You can now quickly join a meeting as a guest without signing into a personal or work Google account or creating a new Google account. This functionality already exists for meetings on the web and, by expanding to mobile, guests now have greater flexibility for joining Meet meetings on the go. | Learn more about joining meetings as a guest.

Star messages in Google Chat on mobile
Last November, we introduced starred on web, an additional shortcut in the redesigned Google Chat navigation panel that helps you stay on top of your most important messages. We’re excited to announce this is now available on Android and iOS mobile devices. | Learn more about starring messages.

Completed rollouts

The features below completed their rollouts to Rapid Release domains, Scheduled Release domains, or both. Please refer to the original blog posts for additional details.

Rapid Release Domains: 

Rapid and Scheduled Release Domains: 

For a recap of announcements in the past six months, check out What’s new in Google Workspace (recent releases). 

Star messages in Google Chat on mobile

What’s changing

Last November, we introduced starred on web, an additional shortcut in the redesigned Google Chat navigation panel that helps you stay on top of your most important messages. Today, we’re excited to announce this is now available on Android and iOS mobile devices. 

Getting started 

Rollout pace 

Android: 
iOS: 

Availability 

  • Available to all Google Workspace customers and users with personal Google Accounts

Resources 

Chrome Dev for Android Update

Hi everyone! We've just released Chrome Dev 122 (122.0.6250.2) for Android. It's now available on Google Play.

You can see a partial list of the changes in the Git log. For details on new features, check out the Chromium blog, and for details on web platform updates, check here.

If you find a new issue, please let us know by filing a bug.

Erhu Akpobaro
Google Chrome

Performance Max Guide Enhancements

Today, we are pleased to announce several guide enhancements to improve the experience of creating, managing and reporting on Performance Max campaigns with the Google Ads API.

Improving Performance Max integrations Blog Series

This article is part of a series that discusses new and upcoming features that you have been asking for. Keep an eye out for further updates and improvements on our developer blog, continue providing feedback on Performance Max integrations with the Google Ads API, and as always, contact our team if you need support.

Introducing ASPIRE for selective prediction in LLMs

In the fast-evolving landscape of artificial intelligence, large language models (LLMs) have revolutionized the way we interact with machines, pushing the boundaries of natural language understanding and generation to unprecedented heights. Yet, the leap into high-stakes decision-making applications remains a chasm too wide, primarily due to the inherent uncertainty of model predictions. Traditional LLMs generate responses autoregressively, token by token, yet they lack an intrinsic mechanism to assign a confidence score to those responses. Although one can derive a confidence score from the sequence likelihood (i.e., by summing the log probabilities of the individual tokens), such traditional approaches typically fall short in reliably distinguishing between correct and incorrect answers. But what if LLMs could gauge their own confidence and only make predictions when they're sure?

Selective prediction aims to do this by enabling LLMs to output an answer along with a selection score, which indicates the probability that the answer is correct. With selective prediction, one can better understand the reliability of LLMs deployed in a variety of applications. Prior research, such as semantic uncertainty and self-evaluation, has attempted to enable selective prediction in LLMs. A typical approach is to use heuristic prompts like “Is the proposed answer True or False?” to trigger self-evaluation in LLMs. However, this approach may not work well on challenging question answering (QA) tasks.

The OPT-2.7B model incorrectly answers a question from the TriviaQA dataset, “Which vitamin helps regulate blood clotting?”, with “Vitamin C”. Without selective prediction, the LLM may output the wrong answer, which, in this case, could lead users to take the wrong vitamin. With selective prediction, the LLM outputs an answer along with a selection score. If the selection score is low (e.g., 0.1), the LLM further outputs “I don’t know!” to warn users not to trust the answer, or to verify it using other sources.
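
As a concrete illustration of this abstention behavior, here is a minimal sketch of a selective prediction wrapper. The helper function, the canned answer, and the 0.5 threshold are illustrative assumptions, not part of ASPIRE itself.

```python
# Minimal sketch of selective prediction: abstain when the selection score is low.
# The helper below is a placeholder; names and the threshold are assumptions.
from typing import Tuple

def generate_answer_with_score(question: str) -> Tuple[str, float]:
    """Stand-in for an LLM that returns (answer, selection score in [0, 1])."""
    # In ASPIRE this would combine the answer likelihood with a learned
    # self-evaluation score; here we just return a canned example.
    return "Vitamin C", 0.1

def selective_predict(question: str, threshold: float = 0.5) -> str:
    answer, score = generate_answer_with_score(question)
    if score < threshold:
        return "I don't know!"  # abstain instead of risking a wrong answer
    return answer

print(selective_predict("Which vitamin helps regulate blood clotting?"))
# -> "I don't know!"  (the score 0.1 falls below the 0.5 threshold)
```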

In "Adaptation with Self-Evaluation to Improve Selective Prediction in LLMs", presented at Findings of EMNLP 2023, we introduce ASPIRE — a novel framework meticulously designed to enhance the selective prediction capabilities of LLMs. ASPIRE fine-tunes LLMs on QA tasks via parameter-efficient fine-tuning, and trains them to evaluate whether their generated answers are correct. ASPIRE allows LLMs to output an answer along with a confidence score for that answer. Our experimental results demonstrate that ASPIRE significantly outperforms state-of-the-art selective prediction methods on a variety of QA datasets, such as the CoQA benchmark.


The mechanics of ASPIRE

Imagine teaching an LLM to not only answer questions but also evaluate those answers — akin to a student verifying their answers in the back of the textbook. That's the essence of ASPIRE, which involves three stages: (1) task-specific tuning, (2) answer sampling, and (3) self-evaluation learning.

Task-specific tuning: ASPIRE performs task-specific tuning to train adaptable parameters (θp) while freezing the LLM. Given a training dataset for a generative task, it fine-tunes the pre-trained LLM to improve its prediction performance. Towards this end, parameter-efficient tuning techniques (e.g., soft prompt tuning and LoRA) might be employed to adapt the pre-trained LLM on the task, given their effectiveness in obtaining strong generalization with small amounts of target task data. Specifically, the LLM parameters (θ) are frozen and adaptable parameters (θp) are added for fine-tuning. Only θp are updated to minimize the standard LLM training loss (e.g., cross-entropy). Such fine-tuning can improve selective prediction performance because it not only improves the prediction accuracy, but also enhances the likelihood of correct output sequences.
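
The sketch below shows what this stage could look like in practice, assuming the Hugging Face transformers and peft libraries are available; the model checkpoint, prompt length, learning rate, and training example are illustrative assumptions rather than the paper's settings.

```python
# Sketch of stage 1 (task-specific tuning) via soft prompt tuning: the base LLM
# parameters (theta) stay frozen and only the soft prompt (theta_p) is trained.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PromptTuningConfig, TaskType, get_peft_model

model_name = "facebook/opt-350m"  # small stand-in for the LLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
base = AutoModelForCausalLM.from_pretrained(model_name)

# Adaptable parameters theta_p are added as learnable "virtual tokens".
peft_config = PromptTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=20)
model = get_peft_model(base, peft_config)

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)

# One illustrative QA training example, formatted as question -> answer.
batch = tokenizer(
    "Q: Which vitamin helps regulate blood clotting?\nA: Vitamin K",
    return_tensors="pt",
)
labels = batch["input_ids"].clone()

model.train()
out = model(input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
            labels=labels)
out.loss.backward()   # standard cross-entropy loss; gradients reach only theta_p
optimizer.step()
```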

Answer sampling: After task-specific tuning, ASPIRE uses the LLM with the learned θp to generate different answers for each training question and create a dataset for self-evaluation learning. We aim to generate output sequences that have a high likelihood. We use beam search as the decoding algorithm to generate high-likelihood output sequences and the Rouge-L metric to determine if the generated output sequence is correct.
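
A hedged sketch of this sampling-and-labeling step is shown below, assuming a decoder-only Hugging Face model (whose generate() output contains the prompt followed by the continuation) and the rouge_score package; the beam width, generation length, and Rouge-L threshold are illustrative choices, not values from the paper.

```python
# Sketch of stage 2 (answer sampling): beam search produces several high-likelihood
# answers per question, and Rouge-L against the reference answer labels each one
# as correct or incorrect for the later self-evaluation stage.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

def sample_and_label(model, tokenizer, question, reference,
                     num_beams=5, rouge_threshold=0.7):
    inputs = tokenizer(f"Q: {question}\nA:", return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]
    sequences = model.generate(
        **inputs,
        num_beams=num_beams,
        num_return_sequences=num_beams,
        max_new_tokens=16,
    )
    examples = []
    for seq in sequences:
        answer = tokenizer.decode(seq[prompt_len:], skip_special_tokens=True).strip()
        rouge_l = scorer.score(reference, answer)["rougeL"].fmeasure
        examples.append({"question": question,
                         "answer": answer,
                         "correct": rouge_l >= rouge_threshold})
    return examples  # training data for self-evaluation learning
```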

Self-evaluation learning: After sampling high-likelihood outputs for each query, ASPIRE adds adaptable parameters (θs) and only fine-tunes θs for learning self-evaluation. Since the output sequence generation only depends on θ and θp, freezing θ and the learned θp can avoid changing the prediction behaviors of the LLM when learning self-evaluation. We optimize θs such that the adapted LLM can distinguish between correct and incorrect answers on their own.
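
Conceptually, the training signal for θs looks like the sketch below. It assumes the model has been wrapped so that only a second soft prompt (θs) is trainable; the target phrasing ("correct" / "wrong") and the unmasked sequence loss are simplifications for illustration, not the paper's exact setup.

```python
# Sketch of stage 3 (self-evaluation learning): for each sampled answer from stage 2,
# the model is trained to emit a correctness judgment. Only theta_s is assumed to be
# trainable; theta and theta_p are frozen so the prediction behavior is unchanged.
def self_eval_loss(model, tokenizer, example):
    target = " correct" if example["correct"] else " wrong"
    text = (f"Q: {example['question']}\n"
            f"Proposed answer: {example['answer']}\n"
            f"The proposed answer is{target}")
    batch = tokenizer(text, return_tensors="pt")
    labels = batch["input_ids"].clone()
    # Cross-entropy over the sequence; in a fuller implementation the loss would
    # typically be masked so that only the judgment tokens contribute.
    return model(input_ids=batch["input_ids"], labels=labels).loss
```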

The three stages of the ASPIRE framework.

In the proposed framework, θp and θs can be trained using any parameter-efficient tuning approach. In this work, we use soft prompt tuning, a simple yet effective mechanism for learning “soft prompts” to condition frozen language models to perform specific downstream tasks more effectively than traditional discrete text prompts. The driving force behind this approach lies in the recognition that if we can develop prompts that effectively stimulate self-evaluation, it should be possible to discover these prompts through soft prompt tuning in conjunction with targeted training objectives.

Implementation of the ASPIRE framework via soft prompt tuning. We first generate the answer to the question with the first soft prompt and then compute the learned self-evaluation score with the second soft prompt.

After training θp and θs, we obtain the prediction for the query via beam search decoding. We then define a selection score that combines the likelihood of the generated answer with the learned self-evaluation score (i.e., the likelihood of the prediction being correct for the query) to make selective predictions.
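
One simple way to combine the two signals, shown purely as an illustration (the paper defines its own weighting of these terms), is a convex combination in log space:

```python
# Illustrative selection score: blend the (length-normalized) answer log-likelihood
# with the learned self-evaluation probability. The weighting is an assumption.
import math

def selection_score(answer_log_likelihood: float,
                    self_eval_prob: float,
                    alpha: float = 0.5) -> float:
    """answer_log_likelihood: length-normalized log P(answer | query) from beam search.
    self_eval_prob: learned probability (from theta_s) that the answer is correct."""
    return alpha * answer_log_likelihood + (1.0 - alpha) * math.log(max(self_eval_prob, 1e-12))
```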


Results

To demonstrate ASPIRE’s efficacy, we evaluate it across three question-answering datasets — CoQA, TriviaQA, and SQuAD — using various open pre-trained transformer (OPT) models. By training θp with soft prompt tuning, we observed a substantial hike in the LLMs' accuracy. For example, the OPT-2.7B model adapted with ASPIRE demonstrated improved performance over the larger, pre-trained OPT-30B model using the CoQA and SQuAD datasets. These results suggest that with suitable adaptations, smaller LLMs might have the capability to match or potentially surpass the accuracy of larger models in some scenarios.

When delving into the computation of selection scores with fixed model predictions, ASPIRE received a higher AUROC score (the probability that a randomly chosen correct output sequence has a higher selection score than a randomly chosen incorrect output sequence) than baseline methods across all datasets. For example, on the CoQA benchmark, ASPIRE improves the AUROC from 51.3% to 80.3% compared to the baselines.
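
For reference, an AUROC value like the ones above can be computed with scikit-learn; the correctness labels and selection scores below are made-up values purely for illustration.

```python
# Computing AUROC for selective prediction: the probability that a randomly chosen
# correct answer receives a higher selection score than a randomly chosen incorrect one.
from sklearn.metrics import roc_auc_score

is_correct = [1, 0, 1, 1, 0, 0, 1, 0]                          # was each answer correct?
selection_scores = [0.9, 0.2, 0.7, 0.8, 0.75, 0.6, 0.95, 0.1]  # one score per answer

print(roc_auc_score(is_correct, selection_scores))  # -> 0.9375 on this toy data
```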

An intriguing pattern emerged from the TriviaQA dataset evaluations. While the pre-trained OPT-30B model demonstrated higher baseline accuracy, its performance in selective prediction did not improve significantly when traditional self-evaluation methods (Self-eval and P(True)) were applied. In contrast, the smaller OPT-2.7B model, when enhanced with ASPIRE, outperformed it in this respect. This discrepancy underscores a vital insight: larger LLMs utilizing conventional self-evaluation techniques may not be as effective at selective prediction as smaller, ASPIRE-enhanced models.

Our experimental journey with ASPIRE underscores a pivotal shift in the landscape of LLMs: The capacity of a language model is not the be-all and end-all of its performance. Instead, the effectiveness of models can be drastically improved through strategic adaptations, allowing for more precise, confident predictions even in smaller models. As a result, ASPIRE stands as a testament to the potential of LLMs that can judiciously ascertain their own certainty and decisively outperform larger counterparts in selective prediction tasks.


Conclusion

In conclusion, ASPIRE is not just another framework; it's a vision of a future where LLMs can be trusted partners in decision-making. By honing the selective prediction performance, we're inching closer to realizing the full potential of AI in critical applications.

Our research has opened new doors, and we invite the community to build upon this foundation. We're excited to see how ASPIRE will inspire the next generation of LLMs and beyond. To learn more about our findings, we encourage you to read our paper and join us in this thrilling journey towards creating a more reliable and self-aware AI.


Acknowledgments

We gratefully acknowledge the contributions of Sayna Ebrahimi, Sercan O Arik, Tomas Pfister, and Somesh Jha.

Source: Google AI Blog