How we fought bad apps and bad actors in 2023

A safe and trusted Google Play experience is our top priority. Our SAFE principles (see below) provide the framework for creating that experience for both users and developers. Here's what these principles mean in practice:

  • (S)afeguard our Users. Help them discover quality apps that they can trust.
  • (A)dvocate for Developer Protection. Build platform safeguards to enable developers to focus on growth.
  • (F)oster Responsible Innovation. Thoughtfully unlock value for all without compromising on user safety.
  • (E)volve Platform Defenses. Stay ahead of emerging threats by evolving our policies, tools and technology.

With those principles in mind, we’ve made recent improvements and introduced new measures to continue to keep Google Play’s users safe, even as the threat landscape continues to evolve. In 2023, we prevented 2.28 million policy-violating apps from being published on Google Play¹, in part thanks to our investment in new and improved security features, policy updates, and advanced machine learning and app review processes. We have also strengthened our developer onboarding and review processes, requiring more identity information when developers first establish their Play accounts. Together with investments in our review tooling and processes, we identified bad actors and fraud rings more effectively and banned 333K bad accounts from Play for violations like confirmed malware and repeated severe policy violations.

Additionally, almost 200K app submissions were rejected or remediated to ensure proper use of sensitive permissions such as background location or SMS access. To help safeguard user privacy at scale, we partnered with SDK providers to limit sensitive data access and sharing, enhancing the privacy posture for over 31 SDKs impacting 790K+ apps. We also significantly expanded the Google Play SDK Index, which now covers the SDKs used in almost 6 million apps across the Android ecosystem. This valuable resource helps developers make better SDK choices, boosts app quality and minimizes integration risks.

Protecting the Android Ecosystem

Building on our success with the App Defense Alliance (ADA), we partnered with Microsoft and Meta as steering committee members in the newly restructured ADA under the Joint Development Foundation, part of the Linux Foundation family. The Alliance will support industry-wide adoption of app security best practices and guidelines, as well as countermeasures against emerging security risks.

Additionally, we announced new Play Store transparency labeling to highlight VPN apps that have completed an independent security review through App Defense Alliance’s Mobile App Security Assessment (MASA). When a user searches for VPN apps, they will now see a banner at the top of Google Play that educates them about the “Independent security review” badge in the Data safety section. This helps users see at a glance that a developer has prioritized security and privacy best practices and is committed to user safety.

To better protect our customers who install apps outside of the Play Store, we made Google Play Protect’s security capabilities even more powerful with real-time scanning at the code level to combat novel malicious apps. Our security protections and machine learning algorithms learn from each app submitted to Google for review, drawing on thousands of signals and comparing app behavior. This new capability has already detected over 5 million new, malicious off-Play apps, which helps protect Android users worldwide.

More Stringent Developer Requirements and Guidelines

Last year we updated Play policies around Generative AI apps, disruptive notifications, and expanded privacy protections. We are also raising the bar for new personal developer accounts by requiring developers to test their apps before making them available on Google Play. By testing their apps, getting feedback, and ensuring everything is ready before they launch, developers are able to bring more high-quality content to Play users. In order to increase trust and transparency, we’ve introduced expanded developer verification requirements, including D-U-N-S numbers for organizations and a new “About the developer” section.

To give users more control over their personal data, apps that enable account creation now need to provide an option to initiate account and data deletion from within the app and online. This web requirement is especially important so that a user can request account and data deletion without having to reinstall an app. To simplify the user experience, we have also incorporated this as a feature within the Data safety section of the Play Store.

Each iteration of the Android operating system (and its robust set of APIs) introduces enhancements that elevate the user experience, bolster security, and optimize the overall performance of the Android platform. To further safeguard our customers, approximately 1.5 million applications that do not target the most recent APIs are no longer available in the Play Store to new users who have updated their devices to the latest Android version.

Looking Ahead

Protecting users and developers on Google Play is paramount and ever-evolving. We're launching new security initiatives in 2024, including removing apps from Play that are not transparent about their privacy practices.

We also recently filed a lawsuit in federal court against two fraudsters who made multiple misrepresentations to upload fraudulent investment and crypto exchange apps on Play to scam users. This lawsuit is a critical step in holding these bad actors accountable and sending a clear message that we will aggressively pursue those who seek to take advantage of our users.

We're constantly working on new ways to protect your experience on Google Play and across the entire Android ecosystem, and we look forward to sharing more.

Notes


  1. In accordance with the EU's Digital Services Act (DSA) reporting requirements, Google Play now calculates policy violations based on developer communications sent. 

Google Workspace Updates Weekly Recap – April 26, 2024

3 New updates

Unless otherwise indicated, the features below are available to all Google Workspace customers, and are fully launched or in the process of rolling out. Rollouts should take no more than 15 business days to complete if launching to both Rapid and Scheduled Release at the same time. If not, each stage of rollout should take no more than 15 business days to complete.


Customizable Home tab for Google Chat apps 
Recently, we announced the availability of the "Home" tab for Google Chat apps through the Google Workspace Developer Preview Program. This feature allows developers to create a new tab in their Chat apps, known as “App Home”. App Home can be customized to display user-specific dashboards, a list of open items and tasks, and more. We’re excited to announce this is now generally available for Google Workspace developers. | Rollout to Rapid Release domains and Scheduled Release domains is complete. | Available to all Google Workspace customers. | Learn more about sending an app home card message for a Google Chat app.
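For developers wiring this up, the linked documentation covers the exact App Home event handling; purely as a hedged illustration, a card body of the kind an App Home might display could be expressed as a Python dict along these lines (field names follow the Chat cards format; the content is hypothetical):

  # Illustrative only: a Chat card body as a Python dict. The exact
  # response wrapper for App Home events is described in the linked docs.
  app_home_card = {
      "header": {
          "title": "My Tasks",                      # hypothetical dashboard title
          "subtitle": "Open items assigned to you",
      },
      "sections": [
          {
              "widgets": [
                  {"textParagraph": {"text": "You have <b>3</b> open reviews."}},
                  {"textParagraph": {"text": "Next due date: Friday"}},
              ]
          }
      ],
  }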

Create Looker Studio reports from Google Sheets 
Looker Studio enables users to quickly build interactive reports and dashboards, and starting today these can be created directly from Google Sheets. More specifically, users can: 
  • Pick which sheet or cell range to use in the generated report on Looker Studio. 
  • Transform the data in Sheets to an automatically generated Looker Studio report in a single click, and save and share the report with an individual or a team. 
The Looker Studio report remains connected to the Sheet, and can be refreshed to reflect data updates. | Rolling out to Rapid Release domains and Scheduled Release domains now. | Available to all Google Workspace customers, Google Workspace Individual subscribers, and users with personal Google accounts. | Learn how to create a Looker Studio report from Google Sheets


Export Gemini data for users in your organization 
Super admins can now export all of their users’ Gemini data, including prompts and Gemini’s responses to those prompts. Expanding Takeout to include Gemini data continues to ensure that our customers have control over their organization’s data in order to manage their data privacy and compliance needs. | Rollout to Rapid Release domains and Scheduled Release domains is complete. | Available to Google Workspace customers with the Gemini Enterprise and Gemini Business add-on, as well as those customers with Gemini (gemini.google.com) enabled for their users. | Learn more about exporting Gemini data, exporting all of your organization’s data, and exporting data by organizational unit, group, or user. Additionally, you can use the Help Center to learn more about allowing your users to download their data.





Previous announcements

The announcements below were published on the Workspace Updates blog earlier this week. Please refer to the original blog posts for complete details.


External participants can now join Google Meet client-side encrypted calls 
We’re enhancing the experience for client-side encrypted Google Meet calls to include support for inviting external participants, including users without a Google account. | Learn more about external participants joining CSE Meet calls. 


Client-side encryption can now be selected as a data loss prevention condition 
You can now use client-side encryption as a condition for a data loss prevention (DLP) rule. | Learn more about the DLP rule. 


Seamlessly transfer between devices during a Google Meet call 
You can now smoothly transfer between devices while on a Google Meet call without hanging up and rejoining. | Learn more about transferring between devices during a Google Meet call. 


Import data from Slack to Google Chat using CloudFuze 
With the CloudFuze integration, you can move messages and memberships from Slack channels into Chat spaces. CloudFuze also imports data while maintaining historical timestamps to ensure users can start using spaces right where they left off. | Learn more about Google Chat and CloudFuze.


Get notified about application load failures for your Google Meet Hardware devices 
You can now opt in to receive email or text message notifications when application load failures occur. Subscribing to alerts can help you stay on top of what’s happening across your hardware fleet and quickly take action to resolve these issues. | Learn more about application load failure notifications.


Workspace Data Protection rules are now available for Gmail in Beta
Launching first to beta, we’re introducing data loss prevention rules for Gmail. Data protection rules help admins and security experts build a stronger framework around sensitive data to prevent personal or proprietary information from ending up in the wrong hands. | Learn more about Data Protection rules.




For a recap of announcements in the past six months, check out What’s new in Google Workspace (recent releases).   

Chrome Dev for Desktop Update

The Dev channel has been updated to 126.0.6439.0 for Windows, Mac and Linux.

A partial list of changes is available in the Git log. Interested in switching release channels? Find out how. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.

Srinivas Sista
Google Chrome


Accelerating incident response using generative AI


Introduction

As security professionals, we're constantly looking for ways to reduce risk and improve our workflow's efficiency. We've made great strides in using AI to identify malicious content, block threats, and discover and fix vulnerabilities. We also published the Secure AI Framework (SAIF), a conceptual framework for secure AI systems to ensure we are deploying AI in a responsible manner. 




Today we are highlighting another way we use generative AI to help defenders gain the advantage: leveraging LLMs (Large Language Models) to speed up our security and privacy incident workflows.




Incident management is a team sport. We have to summarize security and privacy incidents for different audiences including executives, leads, and partner teams. This can be a tedious and time-consuming process that heavily depends on the target group and the complexity of the incident. We estimate that writing a thorough summary can take nearly an hour and more complex communications can take multiple hours. But we hypothesized that we could use generative AI to digest information much faster, freeing up our incident responders to focus on other more critical tasks - and it proved true. Using generative AI, we could write summaries 51% faster while also improving their quality. 



Our incident response approach

When we suspect a potential data incident, for example, we follow a rigorous process to manage it: from identification of the problem, through coordination of experts and tools, to resolution and closure. At Google, when an incident is reported, our Detection & Response teams work to restore normal service as quickly as possible, while meeting both regulatory and contractual compliance requirements. They do this by following the five main steps in the Google incident response program:



  1. Identification: Monitoring security events to detect and report on potential data incidents using advanced detection tools, signals, and alert mechanisms to provide early indication of potential incidents.

  2. Coordination: Triaging the reports by gathering facts and assessing the severity of the incident based on factors such as potential harm to customers, nature of the incident, type of data that might be affected, and the impact of the incident on customers. A communication plan with appropriate leads is then determined.

  3. Resolution: Gathering key facts about the incident such as root cause and impact, and integrating additional resources as needed to implement necessary fixes as part of remediation.

  4. Closure: After the remediation efforts conclude, and after a data incident is resolved, reviewing the incident and response to identify key areas for improvement.

  5. Continuous improvement: Crucial for the development and maintenance of incident response programs. Teams work to improve the program based on lessons learned, ensuring that necessary teams, training, processes, resources, and tools are maintained.




Google’s incident response process flow diagram



Leveraging generative AI 

Our detection and response processes are critical in protecting our billions of global users from the growing threat landscape, which is why we’re continuously looking for ways to improve them with the latest technologies and techniques. The growth of generative AI has brought with it incredible potential in this area, and we were eager to explore how it could help us improve parts of the incident response process. We started by leveraging LLMs to not only pioneer modern approaches to incident response, but also to ensure that our processes are efficient and effective at scale. 




Managing incidents can be a complex process, and an additional factor is effective internal communication to leads, executives, and stakeholders about the threats and status of incidents. Effective communication is critical: it properly informs executives so that they can take any necessary actions, and it helps meet regulatory requirements. Leveraging LLMs for this type of communication can save significant time for incident commanders while improving quality at the same time.



Humans vs. LLMs

Given that LLMs have summarization capabilities, we wanted to explore whether they could generate summaries on par with those written by humans. We ran an experiment that took 50 human-written summaries from native and non-native English speakers, and 50 LLM-written ones with our finest (and final) prompt, and presented them to security teams without revealing the author.




We learned that the LLM-written summaries covered all of the key points, were rated 10% higher than their human-written equivalents, and cut the time necessary to draft a summary in half. 




Comparison of human vs LLM content completeness




Comparison of human vs LLM writing styles

Managing risks and protecting privacy

Leveraging generative AI is not without risks. To mitigate the risks around potential hallucinations and errors, any LLM-generated draft must be reviewed by a human. But not all risks come from the LLM: human misinterpretation of a fact or statement generated by the LLM can also happen. That is why it’s important to ensure there is human accountability, as well as to monitor quality and feedback over time. 




Given that our incidents can contain a mixture of confidential, sensitive, and privileged data, we had to ensure we built an infrastructure that does not store any data. Every component of this pipeline - from the user interface to the LLM to output processing - has logging turned off. And, the LLM itself does not use any input or output for re-training. Instead, we use metrics and indicators to ensure it is working properly. 



Input processing

The type of data we process during incidents can be messy and often unstructured: free-form text, logs, images, links, impact stats, timelines, and code snippets. We needed to structure all of that data so the LLM “knew” which part of the information serves what purpose. For that, we first replaced long and noisy sections of code/logs with self-closing tags (<Code Section/> and <Logs/>), both to keep the structure while saving tokens for more important facts and to reduce the risk of hallucinations.




During prompt engineering, we refined this approach and added additional tags such as <Title>, <Actions Taken>, <Impact>, <Mitigation History>, and <Comment>, so the input’s structure closely mirrors our incident communication templates. The use of self-explanatory tags allowed us to convey implicit information to the model and gave us aliases to use in the prompt for guidelines or tasks, for example by stating “Summarize the <Security Incident>”.



Sample {incident} input
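As a minimal sketch of this preprocessing step, assuming hypothetical field names and regexes (only the tag names follow the text above):

  import re

  def build_llm_input(incident: dict) -> str:
      """Structure messy incident data into the tagged format described above.

      Long code/log sections are replaced with self-closing tags to preserve
      structure while saving tokens and reducing hallucination risk; the
      remaining fields are wrapped in self-explanatory tags that mirror the
      incident communication template. Field names and regexes are
      illustrative, not the production implementation.
      """
      body = incident["raw_text"]
      # Replace fenced code blocks and timestamped log lines with markers.
      body = re.sub(r"```.*?```", "<Code Section/>", body, flags=re.DOTALL)
      body = re.sub(r"(?m)^.*\d{2}:\d{2}:\d{2}.*$", "<Logs/>", body)
      # Collapse runs of consecutive log markers into one.
      body = re.sub(r"(?:<Logs/>\n?)+", "<Logs/>\n", body)

      return "\n".join([
          f"<Title>{incident['title']}</Title>",
          f"<Impact>{incident['impact']}</Impact>",
          f"<Actions Taken>{incident['actions_taken']}</Actions Taken>",
          f"<Mitigation History>{incident['mitigation']}</Mitigation History>",
          f"<Comment>{body}</Comment>",
      ])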

Prompt engineering

Once we added structure to the input, it was time to engineer the prompt. We started simple by exploring how LLMs can view and summarize all of the current incident facts with a short task:


First prompt version




Limits of this prompt:

  • The summary was too long, especially for executives trying to understand the risk and impact of the incident

  • Some important facts were not covered, such as the incident’s impact and its mitigation

  • The writing was inconsistent and did not follow our best practices around voice, tense, terminology, and format

  • Some irrelevant incident data was being integrated into the summary from email threads

  • The model struggled to understand what the most relevant and up-to-date information was




For version 2, we tried a more elaborate prompt that would address the problems above: we told the model to be concise and explained what a well-written summary should cover, namely the main incident response steps (coordination and resolution).


Second prompt version




Limits of this prompt:

  • The summaries still did not always succinctly and accurately address the incident in the format we were expecting

  • At times, the model lost sight of the task or did not take all the guidelines into account

  • The model still struggled to stick to the latest updates

  • We noticed a tendency to draw conclusions on hypotheses with some minor hallucinations




For the final prompt, we inserted two human-crafted summary examples and introduced a <Good Summary> tag, both to highlight high-quality summaries and to tell the model to begin immediately with the summary rather than first repeating the task at hand (as LLMs usually do).



Final prompt


This produced outstanding summaries, in the structure we wanted, with all key points covered, and almost without any hallucinations.
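Based only on the elements described above (two exemplar summaries wrapped in a <Good Summary> tag, plus an instruction to start directly with the summary), a sketch of the shape this few-shot prompt might take is below; the placeholder names are hypothetical, not the production wording:

  # Illustrative template only; the real prompt is shown as a figure above.
  final_prompt_template = """\
  Summarize the <Security Incident> below for the intended audience.
  Match the structure and writing style of the <Good Summary> examples,
  and begin your answer directly with the summary itself.

  <Good Summary>{human_example_1}</Good Summary>

  <Good Summary>{human_example_2}</Good Summary>

  <Security Incident>
  {tagged_incident_input}
  </Security Incident>
  """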



Workflow integration

In integrating the prompt into our workflow, we wanted to ensure it was complementing the work of our teams rather than solely writing communications. We designed the tooling so that the UI has a ‘Generate Summary’ button, which pre-populates a text field with the summary the LLM proposed. A human user can then either accept the summary and have it added to the incident, make manual changes to the summary and accept it, or discard the draft and start again. 


UI showing the ‘generate draft’ button and LLM proposed summary around a fake incident 
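A minimal sketch of that accept/edit/discard flow, with hypothetical helper names:

  from enum import Enum

  class ReviewAction(Enum):
      ACCEPT = "accept"    # attach the LLM draft as-is
      EDIT = "edit"        # attach the human-edited version
      DISCARD = "discard"  # throw the draft away and start over

  def review_draft(draft: str, action: ReviewAction,
                   edited: str | None = None) -> str | None:
      """Human-in-the-loop review of an LLM-proposed summary (sketch).

      The LLM only pre-populates a text field; nothing is added to the
      incident until a person accepts the draft, possibly after edits.
      """
      if action is ReviewAction.ACCEPT:
          return draft
      if action is ReviewAction.EDIT:
          return edited
      return None  # DISCARD: no summary is attached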



Quantitative wins

Our newly built tool produced well-written and accurate summaries, resulting in a 51% time saving per incident summary drafted by an LLM versus a human.




Time savings using LLM-generated summaries (sample size: 300)



The only edge cases we have seen were around hallucinations when the input size was small in relation to the prompt size. In these cases, the LLM made up most of the summary and key points were incorrect. We fixed this programmatically: If the input size is smaller than 200 tokens, we won’t call the LLM for a summary and let the humans write it. 
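A minimal sketch of that guard, assuming hypothetical tokenizer and LLM helpers:

  MIN_INPUT_TOKENS = 200  # threshold from the text above

  def maybe_generate_summary(llm_input: str, count_tokens, call_llm) -> str | None:
      """Skip LLM summarization when the input is too small (sketch).

      With very little input relative to the prompt, the model tended to
      hallucinate, so below the threshold we return None and let a human
      write the summary. `count_tokens` and `call_llm` are assumed helpers
      supplied by the serving stack.
      """
      if count_tokens(llm_input) < MIN_INPUT_TOKENS:
          return None  # too little signal: humans write this one
      return call_llm(llm_input)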



Evolving to more complex use cases: Executive updates

Given these results, we explored other ways to build upon the summarization success and apply it to more complex communications. We improved upon the initial summary prompt and ran an experiment to draft executive communications on behalf of the Incident Commander (IC). The goal of this experiment was to ensure executives and stakeholders quickly understand the incident facts, as well as to allow ICs to relay important information about incidents. These communications are complex because they go beyond just a summary: they include different sections (such as summary, root cause, impact, and mitigation), follow a specific structure and format, and adhere to writing best practices (such as neutral tone, active voice instead of passive voice, and minimal acronyms).




This experiment showed that generative AI can evolve beyond high-level summarization and help draft complex communications. Moreover, LLM-generated drafts reduced the time ICs spent writing executive summaries by 53%, while delivering at least on-par content quality in terms of factual accuracy and adherence to writing best practices. 



What’s next

We're constantly exploring new ways to use generative AI to protect our users more efficiently and look forward to tapping into its potential as cyber defenders. For example, we are exploring using generative AI as an enabler of ambitious memory safety projects like teaching an LLM to rewrite C++ code to memory-safe Rust, as well as more incremental improvements to everyday security workflows, such as getting generative AI to read design documents and issue security recommendations based on their content.

Workspace Data Protection rules are now available for Gmail in Beta

What’s changing 

Launching first to beta, we’re introducing data loss prevention rules for Gmail. Data protection rules help admins and security experts build a stronger framework around sensitive data to prevent personal or proprietary information from ending up in the wrong hands. This functionality is already available in Google Chat and Google Drive, and in Gmail you’ll be able to create, implement, and investigate rules in the same manner. 


Admins can create data protection rules to prevent sensitive information from leaving your organization. These rules are applied to outgoing messages sent internally or externally, and admins can choose whether all content (including attached files and images), the body of the email, email headers, or subject lines should be scanned. You can configure your rules to look for sensitive text strings, use custom detectors, or select predefined detectors; a brief regex illustration follows the list below. If a message violates a rule, admins can choose to:

  • Block message — the sender will receive a notification about message delivery failure and more information about the policy they violated.
  • Quarantine message — the message will require review and approval by an admin before delivery. If the message is rejected by an admin, the user may receive a notification about it.
  • Audit only — the message is delivered, but it is captured in rule log events for further analysis. This is particularly advantageous because it allows admins to assess the impact of rules before introducing them to your end users.
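Purely as an illustration of the kind of pattern a custom detector can match (standard regular-expression syntax, not the Admin console's configuration format), a detector for US SSN-like strings might behave as follows:

  import re

  # Illustrative only; where possible, prefer the predefined detectors,
  # which handle validation and context beyond a bare pattern match.
  SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

  assert SSN_LIKE.search("Ref: 123-45-6789")        # would trigger the rule
  assert not SSN_LIKE.search("Order 1234-56-789")   # would not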

Data loss prevention for Gmail is available for select Google Workspace customers (see the “Availability” section below) — no additional sign-up is required to use the feature. 

Create data protection policies for Gmail alongside Drive and Chat

Build flexible conditions with selection of predefined and custom detectors of sensitive information

Set up a rule with the Audit Only action applied to messages sent outside of the organization. The severity level for event logging is set to ‘Medium’ and alerting via the Alert Center is turned on 

Detailed information about the event in the Alert Center

Overview of DLP incidents in the Security Dashboard with further option to investigate audit logs in detail

Who’s impacted

Admins and end users



Why it’s important


In addition to detecting sensitive content, DLP in Gmail offers additional benefits such as:

  • Simplified deployment and data protection policies management with rules for Gmail, Drive and Google Chat unified into the same area and workflow.
  • Advanced detection policies with flexible conditions, a wide selection of predefined detectors for global and regional information types, custom detectors (regular expressions and word lists), and targeting of specific parts of a message (header, subject, body). 
  • Granular configuration of policy scope, defining sender audiences (at domain, OU, and group levels) and recipient audiences (internal, external, both).
  • Actions with various levels of restriction such as block delivery of message (Block), quarantine message for review (Quarantine), and log event for future audit (Audit only).
  • Tools for incident management and investigation such as the Alert Center, Security Dashboard and Security Investigation Tool.



Additional details

How does DLP in Gmail compare to Content Compliance rules?
Content compliance in Gmail does offer similar functionality in that you can create rules to prevent messages that contain specific content from being sent. However, unlike DLP in Gmail, admins have no way to preview the impact of these rules before deploying them broadly.


Further, content compliance offers a variety of features that are better suited for filtering content. For example, you can:
  • Set up a metadata match on a range of IP addresses, and quarantine messages from IP addresses outside of the range.
  • Route messages with content that matches specific text strings or patterns to the department best suited to process that information.


Getting started

  • Admins: 
    • Data loss prevention rules can be configured at the domain, OU, or group level. DLP rules can be enabled in Gmail in the Admin console under Security > Access and data control > Data protection. Visit the Help Center to learn more about controlling sensitive data shared in Gmail.
      • Note that you can modify existing DLP rules for Drive and Chat to also apply to Gmail. 
    • DLP events can be reviewed in the Security Investigation Tool or Security > Alert Center, if alerts are configured in rules.


    • We recommend selecting “Audit only” when you’re setting up a rule. When selected, messages that match the conditions of a rule will be delivered, with the detection being logged. This allows you to test new rules and monitor their performance, or to passively monitor the environment without interrupting email flow for your users.

    • Note on asynchronous and synchronous scanning: With DLP for Gmail, data protection rules are scanned asynchronously, which means that the message is blocked or quarantined after it leaves the sender’s mailbox and before being dispatched to the recipient. We’re working on the ability to scan data protection rules synchronously when a user hits “Send” in order to notify users about sensitive content before the message leaves their mailbox. 


    • Please share your feedback on this feature with us — this will help us continue to improve the experience as we move through beta and toward general availability. You can share your feedback by selecting the “Send feedback” button located in the bottom left corner of your screen of any data protection related page in the Admin console.


  • End users: When configured by your admins, you’ll be notified if your message contains information that violates a DLP rule.

Rollout pace


Availability

Available to Google Workspace:
  • Enterprise Standard, Enterprise Plus
  • Education Fundamentals, Standard, Plus, and the Teaching & Learning Upgrade
  • Frontline Standard
  • Cloud Identity Premium customers

Get notified about application load failures for your Google Meet Hardware devices

What’s changing 

As part of an ongoing series of improvements for managing Google Meet hardware devices, we recently announced that we would begin capturing application load failures across Meet hardware devices. Beginning today, you can opt in to receive email or text message notifications when these failures occur. Subscribing to alerts can help you stay on top of what’s happening across your hardware fleet and quickly take action to resolve these issues.


Getting started


Rollout pace

  • Rapid and Scheduled Release domains: Extended rollout (potentially longer than 15 days for feature visibility) starting on April 25, 2024. We anticipate rollout to take around six weeks to complete.

Availability