Stable Channel Update for Desktop

The Stable channel has been updated to 117.0.5938.88 for Mac and Linux and 117.0.5938.88/.89 for Windows, which will roll out over the coming days/weeks. A full list of changes in this build is available in the log.


Interested in switching release channels?  Find out how here. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.


Prudhvikumar Bommana
Google Chrome

Google Workspace Updates Weekly Recap – September 15, 2023

4 New updates 

Unless otherwise indicated, the features below are available to all Google Workspace customers, and are fully launched or in the process of rolling out. Rollouts should take no more than 15 business days to complete if launching to both Rapid and Scheduled Release at the same time. If not, each stage of rollout should take no more than 15 business days to complete.


Insert links in the Google Sheets app on iOS devices 
We’re adding the ability to insert a hyperlink into a cell in the Google Sheets iOS app by selecting a cell > clicking “+” in the top left corner > Insert > Link. If a cell contains a link, you’ll see options to edit or remove the link. | Available to all Google Workspace customers and users with personal Google Accounts. | Visit the Help Center to learn more about working with links & bookmarks


Enhanced spam protection through automatic labeling of suspected spam messages in Google Voice 
If you're using Google Voice, you're familiar with our suspected spam caller warnings. We're extending this feature to SMS messages on Android and iOS devices. You'll see these labels within the message, and you can either: 
  • Confirm a suspected spam message, which causes future messages from that number to go directly into the spam folder. 
  • Mark a labeled message as not spam, after which the suspected spam label is never displayed for that number again. 
Available to Voice Starter, Standard, and Premier customers, as well as users with personal accounts in the US. | Rolling out now to Rapid Release and Scheduled Release domains at an extended pace (potentially longer than 15 days for feature visibility). 



Birthday decorations for people cards 
In Google Contacts and across various Google Workspace products, you’ll begin to notice birthday decorations when hovering over another user’s people card. This small change can have a big impact on building deeper connections with your colleagues and stakeholders. Birthday decorations will be displayed on your birthday if you’ve added your birthday to your Google Account profile and you’ve set the information to be visible to others. | Available now for all Google Workspace customers and users with personal Google Accounts. | Visit the Help Center to learn more about your Google Account profile and what information others can see.
Add an organizational unit as an attribute in your external directory 
When using Directory Sync, you can now place users from Azure Active Directory or Active Directory into a specific organizational unit on the Google Workspace side. To do so, you’ll need to add an organizational unit as an attribute in your external directory. This makes it easier to sync users who will be mapped to different organizational units on the Google Workspace side. | Directory Sync is available as an open beta to all Google Workspace customers. | Visit the Help Center to learn more about setting up users to sync based on an organizational unit attribute.


Previous announcements

The announcements below were published on the Workspace Updates blog earlier this week. Please refer to the original blog posts for complete details.


Dual Display on Poly Studio X Series Makes Video Meetings More Productive 
We are excited to announce dual display support for Google Meet on the Poly Studio X Series to help make video meetings more productive. With dual displays, you can see more meeting participants, presentations, and documents at once, which can help you stay focused and engaged in meetings. | Available to Google Workspace customers with Poly Studio X50, X52, and X70 devices only. | Learn more about Dual Display on Poly Studio X Series.

Completed rollouts

The features below completed their rollouts to Rapid Release domains, Scheduled Release domains, or both. Please refer to the original blog posts for additional details.


Rapid Release Domains:

Capslock: What is your code really capable of?




When you import a third party library, do you review every line of code? Most software packages depend on external libraries, trusting that those packages aren’t doing anything unexpected. If that trust is violated, the consequences can be huge—regardless of whether the package is malicious, or well-intended but using overly broad permissions, such as with Log4j in 2021. Supply chain security is a growing issue, and we hope that greater transparency into package capabilities will help make secure coding easier for everyone.




Avoiding bad dependencies can be hard without appropriate information on what the dependency’s code actually does, and reviewing every line of that code is an immense task. Every dependency also brings its own dependencies, compounding the need for review across an expanding web of transitive dependencies. But what if there were an easy way to know the capabilities (the privileged operations accessed by the code) of your dependencies?




Capslock is a capability analysis CLI tool that informs users of privileged operations (like network access and arbitrary code execution) in a given package and its dependencies. Last month we published the alpha version of Capslock for the Go language, which can analyze and report on the capabilities that are used beneath the surface of open source software. 




This CLI tool will provide deeper insights into the behavior of dependencies by reporting code paths that access privileged operations in the standard libraries. In upcoming versions we will add support for open source maintainers to prescribe and sandbox the capabilities required for their packages, highlighting to users what capabilities are present and alerting them if they change.




Capabilities vs Vulnerabilities


Vulnerability management is an important part of your supply chain security, but it doesn’t give you a full picture of whether your dependencies are safe to use. Adding capability analysis to your security posture gives you a better idea of the types of behavior you can expect from your dependencies, identifies potential weak points, and allows you to make a more informed choice about using a given dependency.




Capslock is motivated by the belief that the principle of least privilege—the idea that access should be limited to the minimal set that is feasible and practical—should be a first-class design concept for secure and usable software. Applied to software development, this means that a package should be allowed access only to the capabilities that it requires as part of its core behaviors. For example, you wouldn’t expect a data analysis package to need access to the network or a logging library to include remote code execution capabilities. 




Capslock is initially rolling out for Go, a language with a strong security commitment and fantastic tooling for finding known vulnerabilities in package dependencies. When Capslock is used alongside Go’s vulnerability management tools, developers can use the additional, complementary signals to inform how they interpret vulnerabilities in their dependencies. 




These capability signals can be used to:


  • Find code with the highest levels of access to prioritize audits, code reviews and vulnerability patches

  • Compare potential dependencies, or look for alternative packages when an existing dependency is no longer appropriate

  • Surface unwanted capability usage in packages to uncover new vulnerabilities or identify supply chain attacks in progress

  • Monitor for unexpected emerging capabilities due to package version or dependency changes, and even integrate capability monitoring into CI/CD pipelines 

  • Filter vulnerability data to respond to the most relevant cases, such as finding packages with network access during a network-specific vulnerability alert  





Using Capslock
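Getting started is a two-step flow: install the CLI with the Go toolchain, then run it from a module directory. A minimal sketch follows; the install path and flags are taken from the Capslock README as we understand it, so treat the exact flag names as assumptions and check `capslock -help` for your version.

```shell
# Install the Capslock CLI (requires a Go toolchain).
go install github.com/google/capslock/cmd/capslock@latest

# From the root of a Go module, summarize the capabilities
# of a package and its transitive dependencies.
cd path/to/your/module
capslock

# Machine-readable output for tooling or CI pipelines.
capslock -output=json > capabilities.json
```

Saving the JSON output in CI lets you diff capability sets across dependency updates, so an unexpected new capability (say, network access appearing in a logging library) can be flagged before it merges.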






We are looking forward to adding new features in future releases, such as better support for declaring the expected capabilities of a package, and extending to other programming languages. We are working to apply Capslock at scale and make capability information for open source packages broadly available in various community tools like deps.dev.




You can try Capslock now, and we hope you find it useful for auditing your external dependencies and making informed decisions on your code’s capabilities.




We’ll be at Gophercon in San Diego on Sept 27th, 2023—come and chat with us! 




MediaPipe FaceStylizer: On-device real-time few-shot face stylization

In recent years, we have witnessed rising interest across consumers and researchers in integrated augmented reality (AR) experiences using real-time face feature generation and editing functions in mobile applications, including short videos, virtual reality, and gaming. As a result, there is a growing demand for lightweight, yet high-quality face generation and editing models, which are often based on generative adversarial network (GAN) techniques. However, the majority of GAN models suffer from high computational complexity and the need for a large training dataset. In addition, it is also important to employ GAN models responsibly.

In this post, we introduce MediaPipe FaceStylizer, an efficient design for few-shot face stylization that addresses the aforementioned model complexity and data efficiency challenges while being guided by Google’s responsible AI Principles. The model consists of a face generator and a face encoder used as GAN inversion to map the image into latent code for the generator. We introduce a mobile-friendly synthesis network for the face generator with an auxiliary head that converts features to RGB at each level of the generator to generate high quality images from coarse to fine granularities. We also carefully designed the loss functions for the aforementioned auxiliary heads and combined them with the common GAN loss functions to distill the student generator from the teacher StyleGAN model, resulting in a lightweight model that maintains high generation quality. The proposed solution is available in open source through MediaPipe. Users can fine-tune the generator to learn a style from one or a few images using MediaPipe Model Maker, and deploy to on-device face stylization applications with the customized model using MediaPipe FaceStylizer.


Few-shot on-device face stylization


An end-to-end pipeline

Our goal is to build a pipeline that lets users adapt the MediaPipe FaceStylizer to different styles by fine-tuning the model with a few examples. To enable such a face stylization pipeline, we built the pipeline with a GAN inversion encoder and an efficient face generator model (see below). The encoder and generator pipeline can then be adapted to different styles via a few-shot learning process. The user first sends one or a few similar samples of the style images to MediaPipe Model Maker to fine-tune the model. The fine-tuning process freezes the encoder module and only fine-tunes the generator. The training process samples multiple latent codes close to the encoding output of the input style images as the input to the generator. The generator is then trained to reconstruct an image of a person’s face in the style of the input style image by optimizing a joint adversarial loss function that also accounts for style and content. With such a fine-tuning process, the MediaPipe FaceStylizer can adapt to the customized style, which approximates the user’s input. It can then be applied to stylize test images of real human faces.
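The fine-tuning flow above can be sketched in a few lines of Python. This is a hedged sketch only: the module, class, and option names follow the MediaPipe Model Maker face stylizer guide as we recall it, and the style image path and hyperparameter values are illustrative assumptions, so verify every name against the current documentation before use.

```python
# Hedged sketch: fine-tuning the face stylizer with MediaPipe Model Maker.
# API names per the Model Maker face stylizer guide (verify against current
# docs); "style.jpg" is a placeholder for your style image.
from mediapipe_model_maker import face_stylizer

# One (or a few) style images define the target style.
data = face_stylizer.Dataset.from_image(path="style.jpg")

# Fine-tuning freezes the encoder and trains only the generator.
options = face_stylizer.FaceStylizerOptions(
    model=face_stylizer.SupportedModels.BLAZE_FACE_STYLIZER_256,
    hparams=face_stylizer.HParams(epochs=100, batch_size=2, learning_rate=5e-5),
)
model = face_stylizer.FaceStylizer.create(train_data=data, options=options)

# Export a .task bundle for on-device deployment with MediaPipe Tasks.
model.export_model()
```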


Generator: BlazeStyleGAN

The StyleGAN model family has been widely adopted for face generation and various face editing tasks. To support efficient on-device face generation, we based the design of our generator on StyleGAN. This generator, which we call BlazeStyleGAN, is similar to StyleGAN in that it also contains a mapping network and synthesis network. However, since the synthesis network of StyleGAN is the major contributor to the model’s high computation complexity, we designed and employed a more efficient synthesis network. The improved efficiency and generation quality is achieved by:

  1. Reducing the latent feature dimension in the synthesis network to a quarter of the resolution of the counterpart layers in the teacher StyleGAN,
  2. Designing multiple auxiliary heads to transform the downscaled feature to the image domain to form a coarse-to-fine image pyramid to evaluate the perceptual quality of the reconstruction, and
  3. Skipping all but the final auxiliary head at inference time.

With the newly designed architecture, we train the BlazeStyleGAN model by distilling it from a teacher StyleGAN model. We use a multi-scale perceptual loss and adversarial loss in the distillation to transfer the high fidelity generation capability from the teacher model to the student BlazeStyleGAN model and also to mitigate the artifacts from the teacher model.
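The shape of this multi-scale objective can be illustrated with a toy sketch. Plain L2 stands in here for the perceptual and adversarial terms (which require trained networks), and the array sizes are illustrative; the point is only the structure of comparing each auxiliary-head output against a matched-scale teacher image.

```python
# Toy sketch of a multi-scale distillation objective (not the actual
# training code): the student's auxiliary heads emit images at several
# resolutions, each compared against a downscaled teacher output.
import numpy as np

def downscale(img, factor):
    """Naive average-pool downscale by an integer factor."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def multi_scale_l2(teacher_img, student_pyramid):
    """Sum L2 losses between student outputs and matched-scale teacher images."""
    total = 0.0
    for student_img in student_pyramid:
        factor = teacher_img.shape[0] // student_img.shape[0]
        total += float(np.mean((downscale(teacher_img, factor) - student_img) ** 2))
    return total

rng = np.random.default_rng(0)
teacher = rng.standard_normal((64, 64))
# Student auxiliary heads at 16x16, 32x32, and the final 64x64 resolution:
pyramid = [downscale(teacher, 4), downscale(teacher, 2), teacher.copy()]
assert multi_scale_l2(teacher, pyramid) == 0.0  # a perfect student has zero loss
```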

More details of the model architecture and training scheme can be found in our paper.

Visual comparison between face samples generated by StyleGAN and BlazeStyleGAN. The images on the first row are generated by the teacher StyleGAN. The images on the second row are generated by the student BlazeStyleGAN. The face generated by BlazeStyleGAN has similar visual quality to the image generated by the teacher model. Some results demonstrate the student BlazeStyleGAN suppresses the artifacts from the teacher model in the distillation.

In the above figure, we demonstrate some sample results of our BlazeStyleGAN. By comparing with the face image generated by the teacher StyleGAN model (top row), the images generated by the student BlazeStyleGAN (bottom row) maintain high visual quality and further reduce artifacts produced by the teacher due to the loss function design in our distillation.


An encoder for efficient GAN inversion

To support image-to-image stylization, we also introduced an efficient GAN inversion as the encoder to map input images to the latent space of the generator. The encoder is defined by a MobileNet V2 backbone and trained with natural face images. The loss is defined as a combination of image perceptual quality loss, which measures the content difference, style similarity and embedding distance, as well as the L1 loss between the input images and reconstructed images.


On-device performance

We documented model complexities in terms of parameter numbers and computing FLOPs in the following table. Compared to the teacher StyleGAN (33.2M parameters), BlazeStyleGAN (generator) significantly reduces the model complexity, with only 2.01M parameters and 1.28G FLOPs for output resolution 256x256. Compared to StyleGAN-1024 (generating image size of 1024x1024), the BlazeStyleGAN-1024 can reduce both model size and computation complexity by 95% with no notable quality difference and can even suppress the artifacts from the teacher StyleGAN model.

Model     Image Size     #Params (M)     FLOPs (G)
StyleGAN     1024     33.17     74.3
BlazeStyleGAN     1024     2.07     4.70
BlazeStyleGAN     512     2.05     1.57
BlazeStyleGAN     256     2.01     1.28
Encoder     256     1.44     0.60

Model complexity measured by parameter numbers and FLOPs.

We benchmarked the inference time of the MediaPipe FaceStylizer on various high-end mobile devices and present the results in the table below. Both BlazeStyleGAN-256 and BlazeStyleGAN-512 achieved real-time performance on all GPU devices, running in less than 10 ms on a high-end phone’s GPU. BlazeStyleGAN-256 can also achieve real-time performance on the iOS devices’ CPU.

Model     BlazeStyleGAN-256 (ms)     Encoder-256 (ms)
iPhone 11     12.14     11.48
iPhone 12     11.99     12.25
iPhone 13 Pro     7.22     5.41
Pixel 6     12.24     11.23
Samsung Galaxy S10     17.01     12.70
Samsung Galaxy S20     8.95     8.20

Latency benchmark of the BlazeStyleGAN, face encoder, and the end-to-end pipeline on various mobile devices.

Fairness evaluation

The model was trained on a highly diverse dataset of human faces and is expected to perform fairly across different faces. Our fairness evaluation demonstrates that the model performs well and is balanced across gender, skin tone, and age.


Face stylization visualization

Some face stylization results are shown in the following figure. The images in the top row (in orange boxes) are the style images used to fine-tune the model. The images in the left column (in green boxes) are the natural face images used for testing. The 2x4 matrix of images represents the output of the MediaPipe FaceStylizer, blending the natural faces in the left-most column with the corresponding face styles in the top row. The results demonstrate that our solution can achieve high-quality face stylization for several popular styles.

Sample results of our MediaPipe FaceStylizer.

MediaPipe Solutions

The MediaPipe FaceStylizer is going to be released to public users in MediaPipe Solutions. Users can leverage MediaPipe Model Maker to train a customized face stylization model using their own style images. After training, the exported bundle of TFLite model files can be deployed to applications across platforms (Android, iOS, Web, Python, etc.) using the MediaPipe Tasks FaceStylizer API in just a few lines of code.
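As a sketch of those few lines of code: the class and option names below follow the MediaPipe Tasks face stylizer guide as we understand it, and the model bundle and image paths are placeholders, so check the current Tasks documentation before relying on the exact API surface.

```python
# Hedged sketch: on-device inference with the MediaPipe Tasks FaceStylizer
# API in Python. "face_stylizer.task" is a placeholder for the bundle
# exported by Model Maker; "portrait.jpg" is a placeholder test image.
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

options = vision.FaceStylizerOptions(
    base_options=python.BaseOptions(model_asset_path="face_stylizer.task"))

with vision.FaceStylizer.create_from_options(options) as stylizer:
    image = mp.Image.create_from_file("portrait.jpg")
    result = stylizer.stylize(image)  # the stylized face image
```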


Acknowledgements

This work is made possible through a collaboration spanning several teams across Google. We’d like to acknowledge contributions from Omer Tov, Yang Zhao, Andrey Vakunov, Fei Deng, Ariel Ephrat, Inbar Mosseri, Lu Wang, Chuo-Ling Chang, Tingbo Hou, and Matthias Grundmann.

Source: Google AI Blog


Global Hispanic and Latino Developers Share How They Use Google Tools

Posted by Lyanne Alfaro, DevRel Program Manager, Google Developer Studio

Developer Journey is a monthly series highlighting diverse and global developers sharing relatable challenges, opportunities, and wins in their journey. Every month, we will spotlight developers around the world, the Google tools they leverage, and the kind of products they are building.

In celebration of Hispanic and Latin Heritage Month, we spoke with developers from Mexico and Spain.


Estela Franco

Headshot of Estela Franco, smiling
Barcelona, Spain
Google Developer Expert, Web Technologies
Web Performance Specialist

What unique perspectives do you believe you bring to the tech industry as a Hispanic developer? How do your cultural experiences influence your approach to problem-solving and innovation?

We Spanish people love talking and interacting with other people. We tend to speak a lot, and we bring that to all areas of our lives, including development. I enjoy discussing projects, understanding user needs and use cases, challenging peers, and providing other ideas that weren't initially considered. Every developer has their own background and experiences, and that's something that any project can leverage, so having a space where the team can safely have this kind of discussion can be very beneficial.

What Google tools have you used to build?

As a Web Performance specialist, I use Chrome, PageSpeed Insights API, Big Query, CrUX API, and Looker Studio. With these tools, I create microsites and dashboards to monitor and analyze web performance.

Which tool has been your favorite to use? Why?

I love the CrUX API and all the information you can get from it. It's super helpful to understand how your users experience your website and how your competitors are performing. Providing a great user experience to your users is as important as understanding how your website is performing in the market versus competitors' websites.

The CrUX API documentation provides enough information and examples to create your request and get valuable data that you will convert into insights to identify issues/bottlenecks and improve your website.
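For reference, a minimal sketch of what a CrUX API request looks like. The endpoint and body shape follow the public CrUX API reference (`records:queryRecord`); `API_KEY` and the origin are placeholders, and the request itself is left commented out since it needs a valid key and network access.

```python
# Sketch: building a Chrome UX Report (CrUX) API query.
import json
from urllib import request

CRUX_ENDPOINT = "https://chromeuxreport.googleapis.com/v1/records:queryRecord"

def build_crux_query(origin, api_key, form_factor="PHONE"):
    """Return the URL and JSON body for a records:queryRecord call."""
    url = f"{CRUX_ENDPOINT}?key={api_key}"
    body = {"origin": origin, "formFactor": form_factor}
    return url, json.dumps(body).encode()

url, body = build_crux_query("https://web.dev", "API_KEY")
# To actually send the request (requires a real key):
# req = request.Request(url, data=body,
#                       headers={"Content-Type": "application/json"})
# with request.urlopen(req) as resp:
#     record = json.load(resp)["record"]  # field-level metrics live here
```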

Tell us about something you've built in the past using Google tools.

I created a basic and simple CrUX data explorer. It uses the CrUX API (for getting the data) and Firebase (for the authentication). This tool, which is still a work in progress, allows you to visually get the Core Web Vitals' values for any website or web page you want to check, based on CrUX data.

What will you create with Google Bard?

Google Bard is an excellent tool which you can use to create dev projects. It won't develop them for you, but you can save a lot of time thanks to it. Currently, I'm not planning to create any specific project using Google Bard, but any project I create will probably get some help from it.

What advice would you give someone starting in their developer journey?

  • Start from the beginning. You first need to understand the fundamentals before learning a framework or a specific technology. Being proficient on the fundamentals will make the rest easier.
  • Don't walk this journey alone. Get support from a community. Luckily, there are hundreds of tech communities you can get support from! You will probably find some difficulties during this journey and having this support will help you to go through them and solve them faster.
  • Don't be afraid to ask. You can ask your questions in a community or you can also ask them to Google (and Google Bard). Trust me, you're not the first person to have that question. Asking is the best way to get an answer.

What technological advancements or trends do you believe have the potential to positively impact Hispanic communities, both locally and globally?

Real-time translators using AI can be a game-changer. Although Spanish is one of the most frequently spoken languages in the world, English is needed in many fields and the tech industry is one of them.

Currently, it can be harder to have a successful developer journey if you don't speak English, and not all Hispanic/Latinx communities speak English. Having the tools to properly communicate with tech people even if you don't speak English could open new opportunities to these communities.


Alba Silvente Fuentes

Headshot of Alba Silvente Fuentes, smiling
Amsterdam, Netherlands by way of Alicante, Spain
Google Developer Expert, Web Technologies
Women Techmakers Ambassador
Developer Relations Engineer at Storyblok (a Headless CMS)

What unique perspectives do you believe you bring to the tech industry as a Hispanic developer? How do your cultural experiences influence your approach to problem-solving and innovation?

When coding or solving problems, one quality that has always been present in my culture is passion. While passion is not exclusive to Latinx or Hispanic people, it is a part of our culture to approach tasks with dedication, effort, and care.

To prevent myself from giving in to a very difficult bug, I rely on my sense of humor and open communication. Whether at work or at home, I communicate openly about what is happening to me, seeking help or collaborating on a solution. I often use humor to diffuse tension and find the funny side of frustrating situations. This helps me to clear my mind of thoughts that block the search for a solution.

When it comes to innovation, I focus on small, everyday things that can improve my daily life. This is because I have been taught to value small details over bigger ones.

What Google tools have you used to build?

One of my first experiences with Google technologies was when I was studying at university and decided to learn Android development in my free time, back when Java was Android’s official language. However, after starting my career and discovering my passion for frontend development, my first full-time job as a frontend developer involved using Angular. Throughout my professional journey, I have relied heavily on Google’s essential tools such as DevTools, Lighthouse, and PageSpeed Insights. These tools have become an integral part of my daily routine. And over the past two years, I have actively participated in developing Chrome Extensions and conducting numerous Flutter workshops.

Which tool has been your favorite to use? Why?

In my opinion, the most helpful tools are DevTools and Web Vitals. However, if I had to choose a favorite, I would say it is the Chrome Extensions Manifest. I had a great time developing extensions and exploring all the different possibilities. Whether I was scraping websites for useful information or extending functionality, it was always a fun and rewarding experience.

Tell us about something you've built in the past using Google tools.

The question should be what have you developed without the use of Google tools, because I think there hasn't been a project where I haven't used DevTools or Web Vitals since I focus on the frontend. If you were to check my GitHub, nearly 90% of my projects have undergone testing with Lighthouse in order to ensure basic performance improvements.

Please share a memorable project where you incorporated elements of your heritage into the design or functionality? How did this enrich the user experience?

During my first job, I was part of a campaign project for wines from the Alicante region called “#EnamórateDeUnAlicantino” (translating to “#FallInLoveWithAnAlicantino”). The campaign had a website featuring a love form to help people find the wine that best suited their taste. Each wine was associated with a person, and every question in the form had a local food item linked to it, such as Valencia oranges. I had a lot of fun working on it.

What will you create with Google Bard?

Up until now, I've utilized generative AI to assist me in refining my content. It's been especially helpful in synthesizing detailed information for my podcasts, articles, and talks. On one occasion, I even used it to create the basis of an extension, and while the outcome was decent, I had to make a few adjustments. Nevertheless, it was a valuable experiment. Moving forward, I plan to further explore the potential of AI and perhaps even use it to generate tests for my code or troubleshoot bugs out of pure curiosity.

What advice would you give someone starting in their developer journey?

My recommendation for beginners is to start by focusing on one thing that they enjoy, taking the time to understand the basics and explore their limits without rushing through the process. It is important to remain calm and enjoy the journey.

What technological advancements or trends do you believe have the potential to positively impact Hispanic communities, both locally and globally?

Owing to historical limitations with languages other than Spanish, there are still many people who face a language barrier and cannot access all the information they need. However, thanks to advancements in AI, chatbots like Bard, and technologies like VR glasses, we can now overcome this hurdle. These tools allow us to translate in real time as a speaker shares their story, or to improve automatic subtitles, enabling us to reach a wider audience than ever before.


Juan Guillermo Gómez

Headshot of Juan Guillermo Gómez, smiling
Mexico City, Mexico
Google Developer Expert, Firebase, ML, Google Cloud Platform, Kotlin
Tech Lead

What unique perspectives do you believe you bring to the tech industry as a Latino developer? How do your cultural experiences influence your approach to problem-solving and innovation?

The developer community is strong and very united in Latin America. We also have relationships with other communities around the world, which allows us to grow in our professional careers. In some cases there is a shortage of resources, but rather than being a barrier, this is a motivation. We can learn a lot about technology by visiting places and networking.

What Google tools have you used to build?

I have used a lot of tools. I have used several tools for Android applications, and a lot of services via Google Cloud Platform, Firebase, Go, TensorFlow, and more.

Which tool has been your favorite to use? Why?

I love two tools: Firebase and GCP. They offer a host of services that allow you to build apps and track their performance, user behavior, growth, and more. You can create applications with the support of Google.

Tell us about something you've built in the past using Google tools.

I have created mobile applications for health services and applications for a security services company. In the last four years, I have created an app called "Wordbox English" with a great team. Wordbox is an application that allows you to learn English via television series and movies in an entertaining way.

Please share a memorable project where you incorporated elements of your heritage into the design or functionality? How did this enrich the user experience?

Wordbox English is a great application that helps the user learn another language in an entertaining way. To create new features and modules, we often work with our users, which yields great results. Because of this, our users love to learn.

What advice would you give someone starting in their developer journey?

Learn and practice every day. There are many tools, videos, and educational platforms where you can learn. Learn to love problems and challenges. You can belong to a community with other people with whom you can grow.

What technological advancements or trends do you believe have the potential to positively impact Latin communities, both locally and globally?

AI and machine learning. These accelerated advances allow you to build apps and learn faster. You can innovate and add more value to users.

Chrome Dev for Android Update

Hi everyone! We've just released Chrome Dev 119 (119.0.6006.3) for Android. It's now available on Google Play.

You can see a partial list of the changes in the Git log. For details on new features, check out the Chromium blog, and for details on web platform updates, check here.

If you find a new issue, please let us know by filing a bug.

Erhu Akpobaro
Google Chrome

On-device content distillation with graph neural networks

In today's digital age, smartphones and desktop web browsers serve as the primary tools for accessing news and information. However, the proliferation of website clutter — encompassing complex layouts, navigation elements, and extraneous links — significantly impairs both the reading experience and article navigation. This issue is particularly acute for individuals with accessibility requirements.

To improve the user experience and make reading more accessible, Android and Chrome users may leverage the Reading Mode feature, which enhances accessibility by processing webpages to allow customizable contrast, adjustable text size, more legible fonts, and to enable text-to-speech utilities. Additionally, Android's Reading Mode is equipped to distill content from apps. Expanding Reading Mode to encompass a wide array of content and improving its performance, while still operating locally on the user's device without transmitting data externally, poses a unique challenge.

To broaden Reading Mode capabilities without compromising privacy, we have developed a novel on-device content distillation model. Unlike early attempts using DOM Distiller — a heuristic approach limited to news articles — our model excels in both quality and versatility across various types of content. We ensure that article content doesn't leave the confines of the local environment. Our on-device content distillation model smoothly transforms long-form content into a simple and customizable layout for a more pleasant reading journey while also outperforming the leading alternative approaches. Here we explore details of this research highlighting our approach, methodology, and results.


Graph neural networks

Instead of relying on complicated heuristics that are difficult to maintain and scale to a variety of article layouts, we approach this task as a fully supervised learning problem. This data-driven approach allows the model to generalize better across different layouts, without the constraints and fragility of heuristics. Previous work for optimizing the reading experience relied on parsing, filtering, and modeling of the HTML or the document object model (DOM), a programming interface automatically generated by the user’s web browser from site HTML that represents the structure of a document and allows it to be manipulated.

The new Reading Mode model relies on accessibility trees, which provide a streamlined and more accessible representation of the DOM. Accessibility trees are automatically generated from the DOM tree and are utilized by assistive technologies to allow people with disabilities to interact with web content. These are available in the Chrome web browser and on Android through AccessibilityNodeInfo objects, which are provided for both WebView and native application content.
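To make the input representation concrete, here is a minimal sketch of what an accessibility-tree node might look like. The field names (`role`, `text`, `bounds`, `children`) are illustrative assumptions, not the actual Chrome or Android APIs; real trees come from the browser or from AccessibilityNodeInfo objects.

```python
from dataclasses import dataclass, field

# Hypothetical accessibility-tree node; field names are illustrative only.
@dataclass
class AccessibilityNode:
    role: str                      # e.g. "heading", "paragraph", "link"
    text: str = ""                 # text content for leaf nodes
    bounds: tuple = (0, 0, 0, 0)   # (left, top, width, height) on screen
    children: list = field(default_factory=list)

# A tiny article: a heading, a paragraph, and a navigation link.
article = AccessibilityNode("rootWebArea", children=[
    AccessibilityNode("heading", "On-device distillation"),
    AccessibilityNode("paragraph", "Reading Mode processes webpages..."),
    AccessibilityNode("link", "Subscribe"),
])

def leaf_texts(node):
    """Collect (role, text) pairs from leaf nodes, depth first."""
    if not node.children:
        return [(node.role, node.text)]
    out = []
    for child in node.children:
        out.extend(leaf_texts(child))
    return out

print(leaf_texts(article))
```

The distillation task is then to decide, per leaf node, whether its text belongs to the essential article content.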

We started by manually collecting and annotating accessibility trees. The Android dataset used for this project comprises on the order of 10k labeled examples, while the Chrome dataset contains approximately 100k labeled examples. We developed a novel tool that uses graph neural networks (GNNs) to distill essential content from the accessibility trees using a multi-class supervised learning approach. The datasets consist of long-form articles sampled from the web and labeled with classes such as headline, paragraph, images, publication date, etc.

GNNs are a natural choice for dealing with tree-like data structures, because unlike traditional models that often demand detailed, hand-crafted features to understand the layout and links within such trees, GNNs learn these connections naturally. To illustrate this, consider the analogy of a family tree. In such a tree, each node represents a family member and the connections denote familial relationships. If one were to predict certain traits using conventional models, features like the "number of immediate family members with a trait" might be needed. However, with GNNs, such manual feature crafting becomes redundant. By directly feeding the tree structure into the model, GNNs utilize a message-passing mechanism where each node communicates with its neighbors. Over time, information gets shared and accumulated across the network, enabling the model to naturally discern intricate relationships.
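The family-tree analogy can be made concrete with a toy message-passing loop: each node starts with a scalar feature and repeatedly averages in its neighbors' states, so information about relatives spreads through the tree without any hand-crafted feature like "number of immediate family members with a trait". The update rule and values below are made up for illustration.

```python
# Toy message passing on an undirected family tree (illustrative only).
edges = {  # adjacency list
    "grandparent": ["parent"],
    "parent": ["grandparent", "child_a", "child_b"],
    "child_a": ["parent"],
    "child_b": ["parent"],
}
# Only the grandparent starts with the trait of interest.
state = {"grandparent": 1.0, "parent": 0.0, "child_a": 0.0, "child_b": 0.0}

for step in range(3):  # a fixed number of message-passing rounds
    new_state = {}
    for node, neighbors in edges.items():
        incoming = sum(state[n] for n in neighbors) / len(neighbors)
        new_state[node] = 0.5 * state[node] + 0.5 * incoming  # mix self + message
    state = new_state

# After a few rounds, even the grandchildren's states reflect the trait.
print(state)
```

Note that `child_a` has a nonzero state after three rounds even though it is two hops from the grandparent: the information arrived purely through message passing.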

Returning to the context of accessibility trees, this means that GNNs can efficiently distill content by understanding and leveraging the inherent structure and relationships within the tree. This capability allows them to identify and possibly omit non-essential sections based on the information flow within the tree, ensuring more accurate content distillation.

Our architecture heavily follows the encode-process-decode paradigm using a message-passing neural network to classify text nodes. The overall design is illustrated in the figure below. The tree representation of the article is the input to the model. We compute lightweight features based on bounding box information, text information, and accessibility roles. The GNN then propagates each node's latent representation through the edges of the tree using a message-passing neural network. This propagation process allows nearby nodes, containers, and text elements to share contextual information with each other, enhancing the model's understanding of the page's structure and content. Each node then updates its current state based on the message received, providing a more informed basis for classifying the nodes. After a fixed number of message-passing steps, the now contextualized latent representations of the nodes are decoded into essential or non-essential classes. This approach enables the model to leverage both the inherent relationships in the tree and the hand-crafted features representing each node, thereby enriching the final classification.
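The encode-process-decode pipeline above can be sketched end to end in a few lines. Everything here is a simplification under stated assumptions: a scalar latent state per node, hand-picked weights, and invented features; the production model's parameters and feature set are not public.

```python
import math

# Minimal encode-process-decode sketch (illustrative weights and features).
def encode(features, w_enc):
    # Linear encoder: latent_i = sum_k w_k * feature_ik
    return [sum(w * f for w, f in zip(w_enc, feats)) for feats in features]

def process(latent, edges, steps=2):
    # Message passing: each node mixes its neighbors' average latent into
    # its own state for a fixed number of steps.
    for _ in range(steps):
        new = []
        for i, h in enumerate(latent):
            nbrs = edges[i]
            msg = sum(latent[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
            new.append(0.5 * h + 0.5 * msg)
        latent = new
    return latent

def decode(latent):
    # Sigmoid score per node; > 0.5 counts as "essential" in this toy setup.
    return [1 / (1 + math.exp(-h)) for h in latent]

# Per-node features (text length, is-heading, is-link) for a heading,
# a paragraph, and a navigation link, with hand-picked weights.
features = [(0.9, 1.0, 0.0), (1.0, 0.0, 0.0), (0.1, 0.0, 1.0)]
edges = {0: [1], 1: [0, 2], 2: [1]}  # tiny tree: heading - paragraph - link
scores = decode(process(encode(features, w_enc=(2.0, 1.0, -4.0)), edges))
labels = ["essential" if s > 0.5 else "non-essential" for s in scores]
print(labels)
```

Even in this toy version, the navigation link is pushed below the decision threshold while the heading and paragraph stay above it, which is the qualitative behavior the classifier needs.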

A visual demonstration of the algorithm in action, processing an article on a mobile device. A graph neural network (GNN) is used to distill essential content from an article. 1. A tree representation of the article is extracted from the application. 2. Lightweight features are computed for each node, represented as vectors. 3. A message-passing neural network propagates information through the edges of the tree and updates each node representation. 4. Leaf nodes containing text content are classified as essential or non-essential content. 5. A decluttered version of the application is composed based on the GNN output.

We deliberately restrict the feature set used by the model to improve its generalization across languages and to reduce inference latency on user devices. This was a unique challenge, as we needed to create a lightweight on-device model that could preserve privacy.

Our final lightweight Android model has 64k parameters and is 334kB in size with a median latency of 800ms, while the Chrome model has 241k parameters, is 928kB in size, and has a 378ms median latency. By employing such on-device processing, we ensure that user data never leaves the device, reinforcing our responsible approach and commitment to user privacy. The features used in the model can be grouped into intermediate node features, leaf-node text features, and element position features. We performed feature engineering and feature selection to optimize the set of features for model performance and model size. The final model was transformed into TensorFlow Lite format to deploy as an on-device model on Android or Chrome.
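The three feature groups mentioned above (element position, leaf-node text, and role-based features) might be computed along the following lines. The specific features and normalizations here are hypothetical illustrations; the actual engineered feature set is not described in detail in the post.

```python
import re

# Hypothetical lightweight per-node features; names and choices are
# illustrative, not the production feature set.
def node_features(text, bounds, role, screen_w=1080, screen_h=1920):
    left, top, width, height = bounds
    words = re.findall(r"\w+", text)
    return {
        # Element-position features, normalized to transfer across devices.
        "rel_x": left / screen_w,
        "rel_y": top / screen_h,
        "rel_area": (width * height) / (screen_w * screen_h),
        # Leaf-node text features chosen to be language-agnostic.
        "num_words": len(words),
        "avg_word_len": sum(map(len, words)) / len(words) if words else 0.0,
        "ends_with_period": float(text.rstrip().endswith(".")),
        # Role feature as a simple binary flag.
        "is_heading_role": float(role == "heading"),
    }

feats = node_features("Reading Mode processes webpages.", (40, 300, 1000, 120),
                      role="paragraph")
print(feats)
```

Keeping the features numeric, normalized, and language-agnostic like this is what allows a model of only a few hundred kilobytes to run in under a second on device.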


Results

We trained the GNN for about 50 epochs on a single GPU. The performance of the Android model on webpages and native application test sets is presented below:

The table presents the content distillation metrics in Android for webpages and native apps. We report precision, recall and F1-score for three classes: non-essential content, headline, and main body text, including macro average and weighted average by number of instances in each class. Node metrics assess the classification performance at the granularity of the accessibility tree node, which is analogous to a paragraph level. In contrast, word metrics evaluate classification at an individual word level, meaning each word within a node gets the same classification.
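For readers unfamiliar with the averaging schemes in the table, here is a self-contained computation of per-class precision, recall, and F1 together with their macro and support-weighted averages, on a small set of toy labels (the labels are invented for illustration).

```python
from collections import Counter

# Per-class precision/recall/F1 plus macro and weighted averages.
def f1_report(y_true, y_pred, classes):
    support = Counter(y_true)
    per_class = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        per_class[c] = (prec, rec, f1)
    # Macro: unweighted mean over classes; weighted: mean weighted by support.
    macro_f1 = sum(f1 for _, _, f1 in per_class.values()) / len(classes)
    weighted_f1 = sum(per_class[c][2] * support[c] for c in classes) / len(y_true)
    return per_class, macro_f1, weighted_f1

classes = ["non-essential", "headline", "main-text"]
y_true = ["headline", "main-text", "main-text", "non-essential", "main-text"]
y_pred = ["headline", "main-text", "non-essential", "non-essential", "main-text"]
per_class, macro_f1, weighted_f1 = f1_report(y_true, y_pred, classes)
print(per_class, macro_f1, weighted_f1)
```

The same computation applies at either granularity: node metrics treat each accessibility-tree node as one sample, while word metrics expand each node into one sample per word.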

In assessing the results' quality on commonly visited webpage articles, an F1-score exceeding 0.9 for main-text (essentially paragraphs) corresponds to 88% of these articles being processed without missing any paragraphs. Furthermore, in over 95% of cases, the distillation proves to be valuable for readers. Put simply, the vast majority of readers will perceive the distilled content as both pertinent and precise, with errors or omissions being an infrequent occurrence.

The comparison of Chrome content distillation with other models, such as DOM Distiller or Mozilla Readability, on a set of English-language pages is presented in the table below. We reuse metrics from machine translation to compare the quality of these models, using the ground-truth main content as the reference text and the text produced by each model as the hypothesis text. The results show the excellent performance of our models in comparison to other DOM-based approaches.

The table presents the comparison between DOM Distiller, Mozilla Readability, and the new Chrome model. We report text-based metrics, such as BLEU, chrF, and ROUGE, by comparing the main body text distilled from each model to a ground-truth text manually labeled by raters using our annotation policy.
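To illustrate how such reference-vs-hypothesis scoring works, here is a ROUGE-1-style unigram precision/recall/F1 in pure Python. This is a simplified stand-in: the actual evaluation would use standard implementations of BLEU, chrF, and ROUGE rather than this sketch, and the example strings are invented.

```python
from collections import Counter

# ROUGE-1-style unigram overlap between a reference and a hypothesis text.
def rouge1(reference, hypothesis):
    ref = Counter(reference.lower().split())
    hyp = Counter(hypothesis.lower().split())
    overlap = sum((ref & hyp).values())  # clipped unigram matches
    prec = overlap / sum(hyp.values()) if hyp else 0.0
    rec = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1

reference = "reading mode distills the main article text"   # ground truth
hypothesis = "reading mode distills the article text and the sidebar"  # model output
print(rouge1(reference, hypothesis))
```

Low precision here would indicate the model kept clutter (extra words not in the ground truth), while low recall would indicate it dropped parts of the article.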

The F1-score of the Chrome content distillation model for headline and main text content on the test sets of different widely spoken languages demonstrates that the Chrome model, in particular, is able to support a wide range of languages.

The table presents the per-language F1-scores of the Chrome model for the headline and main text classes. The language codes correspond to the following languages: German, English, Spanish, French, Italian, Persian, Japanese, Korean, Portuguese, Vietnamese, simplified Chinese and traditional Chinese.

Conclusion

The digital age demands both streamlined content presentation and an unwavering commitment to user privacy. Our research highlights the effectiveness of Reading Mode in platforms like Android and Chrome, offering an innovative, data-driven approach to content parsing through Graph Neural Networks. Crucially, our lightweight on-device model ensures that content distillation occurs without compromising user data, with all processes executed locally. This not only enhances the reading experience but also reinforces our dedication to user privacy. As we navigate the evolving landscape of digital content consumption, our findings underscore the paramount importance of prioritizing the user in both experience and security.


Acknowledgements

This project is the result of joint work with Manuel Tragut, Mihai Popa, Abodunrinwa Toki, Abhanshu Sharma, Matt Sharifi, David Petrou and Blaise Aguera y Arcas. We sincerely thank our collaborators Gang Li and Yang Li. We are very grateful to Tom Small for assisting us in preparing the post.

Source: Google AI Blog


Notes from Google Play: Keeping our platform safe

Posted by Jacqueline Hart, Director, Trusted Experiences, Developer Enablement

Hi there,

With millions of Android apps to choose from, users are increasingly focused on the privacy and security of the titles they download. That’s why it’s so important to build user trust with delightful, high-quality app experiences built on a secure foundation.

I’m Jacqueline Hart and I lead the team that helps developers navigate our policies. We’re also responsible for reviewing apps on Google Play to make sure they are safe for users.

In this edition of Notes from Google Play, I’d like to share how we’re working to improve your policy experience and how we’re helping strengthen user trust by highlighting your app’s approach to privacy and security.

Over the past few months, we’ve shared updates on our key privacy and security initiatives to help you prepare for changes and use new tools and resources, including enhanced account data transparency and controls in your app’s Data safety section and new Android 14 functionality. Now, I’m pleased to share the next phase of features, tools, and updates that we’ve been working on to help keep our platform safe and trustworthy.

Giving you a better policy experience

A few months ago, we announced that we’re redesigning the App content page in Google Play Console to make your outstanding tasks clearer, and now we’re adding more information to help you:

  • Spot deadlines with a new timeline view for new and updated declarations
  • Understand why your app is in-scope for a particular declaration
  • Find relevant policy issues alongside each declaration, helping you identify and fix issues more quickly

Later this year, we plan to show not just existing declarations, but also upcoming declaration requirements and deadlines to give you more time to plan.

Clearly see outstanding declaration-related tasks in our redesigned App content page. 
Example is illustrative and subject to change

We’re also getting you critical information about third-party SDKs, including a new notice on Google Play SDK Index to help you make more informed decisions about which version of an SDK may cause your app to violate Google Play policies.

And now we’re bringing more critical information right into Play Console. Previously, you could only learn about SDK-related policy issues affecting your apps through an Inbox message or email. Later this year, we’ll bring this information to you right on the Policy status page so you can see any issues in one place and stay on top of your app’s policy status.

Soon, you can learn about SDK-related policy issues on the Policy status page in Play Console.
Example is illustrative and subject to change

We’re also making it easier to find out if your app is impacted by our Target API requirements, which require you to build for the latest versions of Android so you can make use of our latest security updates and platform enhancements. Since early August, you may have seen information outlining any potential impact on your app on the Policy status page, including resources to help you learn what to do to stay compliant.

To give you further support, we’re launching more ways to improve the experience. These include the new Developer Help Community, where you can ask your peers about everything from Play Console to the latest policy changes, and the Google Play Strike Removal program, which helps eligible developers get certain enforcement strikes removed after passing a related Play Academy training course. We launched the program as a pilot last year and have seen a successful reduction in repeat violations, so now we’re making it available to all developers.

Building user trust with the Data safety section

Security plays an important role in helping users decide whether an app is right for them. Recently, we announced account data transparency and controls in the Data safety section to help build user trust. If you haven’t completed your updated Data safety form yet, watch our two-minute video to learn how before the December 7, 2023 deadline. Users will begin to see your new information in your store listing in early 2024.
Provide data deletion options in Play Console by Dec 7, 2023

To help users feel confident about their downloads in sensitive app categories, we’re soon adding a new Play Store banner for the VPN app category to emphasize the importance of reviewing an app’s Data safety section before installing.

When users search for “VPN” apps in Google Play, they’ll see a banner that encourages them to look for a shield icon in the app’s Data safety section, which indicates that the app has completed an independent security review. VPN developers such as NordVPN, Google One and ExpressVPN are early adopters of this program. We encourage additional VPN app developers to undergo independent security testing, and we anticipate that more will, bringing even more transparency to users. Users can learn more about the independent security validation process and see VPN apps that have been independently validated by tapping “Learn more” to go to the App Validation Directory.
Independent Security Review - VPN apps with this badge in the Data safety section have been independently validated against a global security standard.
We're rolling out a Play Store new banner to build user confidence in the VPN app category. 
Example is illustrative and subject to change.

If you are a VPN developer and interested in learning more about this feature, please submit this form.

Looking ahead

Our team at Google is prioritizing new ways to give users even more confidence in the quality and security of the apps and games they download, establishing Google Play as the most trusted app marketplace. This includes efforts like our new developer verification process for new Play Console accounts.

We’ve got a lot more to come, but I’m excited to share these updates with you now, and I hope they help you continue to thrive on our platform. As always, thanks for partnering with us to make Google Play a safe, trustworthy platform.

Jacqueline Hart

Planned Content API maintenance from 15:00 UTC to 17:00 UTC on September 28, 2023

The Content API for Shopping will undergo planned maintenance on September 28, 2023, from 15:00 to 17:00 UTC.

During this time, you will not be able to make any changes to your account such as updates to users, business information, feeds, shipping details, or linking your Google Ads accounts.

You can still upload products to your existing feeds or data sources and run ads as usual.

If you have any questions or concerns, please don't hesitate to contact us via the forum.