How we build with and for people with disabilities

Editor’s note: Today is Global Accessibility Awareness Day. We’re also sharing how we’re making education more accessible and launching a new Android accessibility feature.

Over the past nine years, my job has focused on building accessible products and supporting Googlers with disabilities. Along the way, I’ve been constantly reminded of how vast and diverse the disability community is, and how important it is to continue working alongside this community to build technology and solutions that are truly helpful.

Before delving into some of the accessibility features our teams have been building, I want to share how we’re working to be more inclusive of people with disabilities to create more accessible tools overall.

Nothing about us, without us

In the disability community, people often say “nothing about us without us.” It’s a sentiment that I find sums up what disability inclusion means. The types of barriers that people with disabilities face in society vary depending on who they are, where they live and what resources they have access to. No one’s experience is universal. That’s why it’s essential to include a wide array of people with disabilities at every stage of the development process for any of our accessibility products, initiatives or programs.

We need to work to make sure our teams at Google are reflective of the people we’re building for. To do so, last year we launched our hiring site geared toward people with disabilities — including our Autism Career Program to further grow and strengthen our autistic community. Most recently, we helped launch the Neurodiversity Career Connector along with other companies to create a job portal that connects neurodiverse candidates to companies that are committed to hiring more inclusively.

Beyond our internal communities, we also must partner with communities outside of Google so we can learn what is truly useful to different groups and parlay that understanding into the improvement of current products or the creation of new ones. Those partnerships have resulted in the creation of Project Relate, a communication tool for people with speech impairments, the development of a completely new TalkBack, Android’s built-in screen reader, and the improvement of Select-to-Speak, a Chromebook tool that lets you hear selected text on your screen spoken out loud.

Equitable experiences for everyone

Engaging with and listening to these communities — inside and outside of Google — makes it possible to create tools and features like the ones we’re sharing today.

Starting today, we’re rolling out the ability to add alt text — a short description of an image that screen readers read aloud — directly to images sent through Gmail. With this update, people who use screen readers will know what’s being sent to them, whether it’s a GIF celebrating the end of the week or a screenshot of an important graph.

Communication tools that are inclusive of everyone are especially important as teams have shifted to fully virtual or hybrid meetings. Again, everyone experiences these changes differently. We’ve heard from some people who are deaf or hard of hearing that this shift has made it easier to identify who is speaking — something that is often more difficult in person. But people who use ASL tell us it can be difficult to be in a virtual meeting and simultaneously watch their interpreter and the person speaking to them.

Multi-pin, a new feature in Google Meet, helps solve this. Now you can pin multiple video tiles at once, for example, the presenter’s screen and the interpreter’s screen. And like many accessibility features, the usefulness extends beyond people with disabilities. The next time someone is watching a panel and wants to pin multiple people to the screen, this feature makes that possible.

We've also been working to make video content more accessible to those who are blind or low-vision through audio descriptions that describe verbally what is on the screen visually. All of our English language YouTube Originals content from the past year — and moving forward — will now have English audio descriptions available globally. To turn on the audio description track, at the bottom right of the video player, click on “Settings”, select “Audio track”, and choose “English descriptive”.

For many people with speech impairments, being understood by the technology that powers tools like voice typing or virtual assistants can be difficult. In 2019, we started work to change that through Project Euphonia, a research initiative that works with community organizations and people with speech impairments to create more inclusive speech recognition models. Today, we’re expanding Project Euphonia’s research to include four more languages: French, Hindi, Japanese and Spanish. With this expansion, we can create even more helpful technology for more people — no matter where they are or what language they speak.

I’ve learned so much in my time working in this space, and chief among those lessons is the absolute importance of building right alongside the very people who will use these tools. We’ll continue to do that as we work to create a more inclusive and accessible world, both physically and digitally.

Milan Cathedral, up close and beautiful

There is a particular shade of pink in the marble that makes Milan Cathedral unique. It is this marble, from Candoglia quarries, that inspired Milan Cathedral Remixed to take a fresh look at the iconic Duomo.

The heart of the city

The Duomo has stood in the center of Milan for 635 years — a proud spiritual and architectural reference point for a city in constant evolution. The dream of Gian Galeazzo Visconti, Lord of Milan, the Cathedral was begun in 1386 under the oversight of the Veneranda Fabbrica del Duomo, which has cared for its conservation and enhancement ever since.

Looking at the Cathedral today, it’s as though it is in dialogue with the surrounding square and the city beyond. The large stained glass windows, with their finely inlaid Biblia pauperum (literally the Bible of the poor — for those who couldn’t read), herald modern media in their use of images to represent scripture.

The power of technology

Milan Cathedral Remixed was made possible by Google Arts & Culture technology, in partnership with the Veneranda Fabbrica. This ambitious digitization project led to the capture of more than 50 stained glass windows in high resolution, bringing the Google Art Camera to a dizzying height of 30 meters. This captured the details of more than 2,000 stained glass window panels, many of which can’t be seen from ground level. With Street View, we can now see every corner of the Cathedral in 360°, from the highest peak, the Madonnina, down to the Crypt — an underground place of meditation and prayer.

Discover, learn and play with Milan Cathedral Remixed

Read the Biblical stories and discover the Cathedral’s modern and contemporary art through the 80 narratives that link the ancient with the contemporary inside the Duomo.

One of these narratives, “Lux fuit” (literally, there was light) takes a close up look at the Cathedral’s windows, the stories they depict and the light flooding through.

These extraordinary stained-glass windows have aroused wonder across the centuries. Many celebrated poets and authors have written of them, and they inspired the creation of the Google Arts & Culture Coloring Book and Puzzle Party. It is this heritage that Veneranda Fabbrica preserves for us all, and for our descendants.

Visit g.co/milancathedral or download Google Arts & Culture’s Android or iOS app to continue learning and having fun.

Chrome Beta for Android Update

Hi everyone! We've just released Chrome Beta 102 (102.0.5005.58) for Android. It's now available on Google Play.

You can see a partial list of the changes in the Git log. For details on new features, check out the Chromium blog, and for details on web platform updates, check here.

If you find a new issue, please let us know by filing a bug.

Erhu Akpobaro
Google Chrome

Augmented reality brings fine art to life for International Museum Day

Have you ever dreamt of having your portrait taken by a world-famous artist? Or wished a painting would come to life before your eyes? This International Museum Day, we’re unveiling three new Art Filter options via the Google Arts & Culture app so that you can immerse yourself in iconic paintings by Vincent van Gogh, Grant Wood, and Fernando Botero.

Our 3D-modeled augmented reality filter for Starry Night is a creative new twist on our previous Art Filter options and reflects how we continue to innovate with technology. Responding to the evocative atmosphere of Van Gogh’s masterpiece, it lets you set the night sky’s swirling winds and dazzling stars in motion. These filters are possible thanks to our partners in New York, Bogotá, and around the world who make their astonishing collections available online via Google Arts & Culture.

In another first for Art Filter, we’ve introduced face-mirroring effects to Grant Wood’s definitive depiction of midwestern America. See the figures of this celebrated double-portrait in a new light by interacting with both simultaneously. Perhaps you’ll put a smile on their famously long faces? Fernando Botero’s La primera dama, by contrast, needs no cheering up. This voluminous figure captures the Colombian artist’s inimitable Boterismo style in all its vibrancy and humor. Each of our three new Art Filter options draws inspiration from the paintings themselves to make these extraordinary artworks fun and educational for everyone.

Museums exist to preserve and celebrate art and culture. Using immersive, interactive technology, we aim to make these vital institutions more accessible. More than 60 museums from over 15 countries have come aboard Google Arts & Culture in 2022, joining more than 2,000 existing partners to share their new collections and stories.

You can flick through the history of manga, tune into Bob Marley’s positive vibrations, tour an Argentinian palace, and hear powerful oral histories from Black Britain. In addition to art-inspired Art Filter options, you can also explore space, air, and sea with Neil Armstrong’s space suit, Amelia Earhart’s Lockheed Vega 5B, or a deep-sea diving helmet.

The Google Arts & Culture app is available to download for Android or iOS. Tap the Camera icon to immerse yourself in Art Filter (g.co/artfilter), get creative with Art Transfer, find a pawfect match for your animal companion, and more. From the beauty of India’s celebrated crafts to terracotta toys for Greco-Roman children, we hope it will inspire you to explore and interact with incredible artifacts from around the globe and across history.

Beta Channel Update for Desktop

The Beta channel has been updated to 102.0.5005.61 for Windows, Mac and Linux.

A full list of changes in this build is available in the log. Interested in switching release channels? Find out how here. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.


Prudhvikumar Bommana
Google Chrome

AppSheet Enterprise Standard and Enterprise Plus available as add-ons to Google Workspace editions

Quick summary 

Google Workspace customers can now purchase AppSheet Enterprise Standard and Enterprise Plus as add-ons by contacting their Google Cloud sales representative or through the Google Workspace Partner network. 


AppSheet allows users to get the most out of Google Workspace by building custom applications on top of Google Workspace and other services in their environment, all without writing any code. With AppSheet Enterprise, customers can enable advanced scenarios with additional connectivity, greater scale, and strengthened governance.


Getting started



Availability

  • Available to Google Workspace Essentials, Business Starter, Business Standard, Business Plus, Enterprise Essentials, Enterprise Starter, Enterprise Standard, Enterprise Plus, Education Fundamentals, Education Plus, the Teaching and Learning Upgrade, Frontline, and Nonprofits, as well as legacy G Suite Basic and Business customers.

Resources

How to use App Engine Memcache in Flask apps (Module 12)

Posted by Wesley Chun

Background

In our ongoing Serverless Migration Station series aimed at helping developers modernize their serverless applications, one of the key objectives for Google App Engine developers is to upgrade to the latest language runtimes, such as from Python 2 to 3 or Java 8 to 17. Another objective is to help developers learn how to move away from App Engine legacy APIs (now called "bundled services") to standalone Cloud equivalents. Once this has been accomplished, apps become much more portable and flexible.

In today's Module 12 video, we're going to start our journey by implementing App Engine's Memcache bundled service, setting us up for our next move to a more complete in-cloud caching service, Cloud Memorystore. Most apps typically rely on some database, and in many situations, they can benefit from a caching layer to reduce the number of queries and improve response latency. In the video, we add use of Memcache to a Python 2 app that has already migrated web frameworks from webapp2 to Flask, providing greater portability and execution options. More importantly, it paves the way for an eventual 3.x upgrade because the Python 3 App Engine runtime does not support webapp2. We'll cover both the 3.x and Cloud Memorystore ports next in Module 13.

Got an older app needing an update? We can help with that.

Adding use of Memcache

The sample application registers individual web page "visits," storing visitor information such as the IP address and user agent. In the original app, these values are stored immediately, and then the most recent visits are queried to display in the browser. If the same user continuously refreshes their browser, each refresh constitutes a new visit. To discourage this type of abuse, we cache the same user's visit for an hour, returning the same cached list of most recent visits unless a new visitor arrives or an hour has elapsed since their initial visit.

Below is pseudocode representing the core part of the app that saves new visits and queries for the most recent ones. In the "before" version, each visit is registered immediately. After the update, the app first attempts to fetch the visits from the cache: if cached results are available and "fresh" (within the hour), they're used immediately; if the cache is empty or a new visitor arrives, the current visit is stored as before and the latest collection of visits is cached for an hour. The bolded lines represent the new code that manages the cached data.

Adding App Engine Memcache usage to sample app
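To make the pattern concrete, here is a rough, simplified sketch (not the actual Module 12 code) of how that caching logic might look in a Python 2 Flask app using the ndb and memcache bundled-service APIs; the cache key, model fields, and limits are illustrative assumptions.

```python
# Simplified sketch of the caching pattern described above (not the actual
# Module 12 sample): cache the most recent visits in Memcache for one hour.
from flask import Flask, render_template, request
from google.appengine.api import memcache
from google.appengine.ext import ndb

app = Flask(__name__)
HOUR = 3600      # cache lifetime in seconds
LIMIT = 10       # number of most recent visits to display

class Visit(ndb.Model):
    'Visit entity: visitor string plus auto-set timestamp'
    visitor = ndb.StringProperty()
    timestamp = ndb.DateTimeProperty(auto_now_add=True)

def store_visit(remote_addr, user_agent):
    'register one page visit in Datastore'
    Visit(visitor='{}: {}'.format(remote_addr, user_agent)).put()

@app.route('/')
def root():
    'only register a visit and query Datastore when the cache needs refreshing'
    visits = memcache.get('most_recent_visits')      # hypothetical cache key
    if visits is None:                               # cache empty or hour elapsed
        store_visit(request.remote_addr, request.user_agent)
        visits = Visit.query().order(-Visit.timestamp).fetch(LIMIT)
        memcache.set('most_recent_visits', visits, time=HOUR)
    return render_template('index.html', visits=visits)
```

Because memcache.set() takes an expiration in seconds, the hour-long freshness window described above falls out of the time=HOUR argument rather than any extra bookkeeping.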

Wrap-up

Today's "migration" began with the Module 1 sample app. We added a Memcache-based caching layer and arrived at the finish line with the Module 12 sample app. To practice this on your own, follow the codelab, doing it by hand while following along with the video. The Module 12 app will then be ready to upgrade to Cloud Memorystore should you choose to do so.

In Fall 2021, the App Engine team extended support of many of the bundled services to next-generation runtimes, meaning you are no longer required to migrate to Cloud Memorystore when porting your app to Python 3. You can continue using Memcache in your Python 3 app so long as you retrofit the code to access bundled services from next-generation runtimes.
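For reference, that retrofit is small. A minimal sketch, assuming a Flask app on the Python 3 runtime with the appengine-python-standard package installed and app_engine_apis enabled in app.yaml, looks roughly like this:

```python
# Rough sketch: enabling bundled services (including Memcache) in a Python 3
# Flask app. Assumes the appengine-python-standard package is installed and
# "app_engine_apis: true" is set in app.yaml.
from flask import Flask
from google.appengine.api import memcache, wrap_wsgi_app

app = Flask(__name__)
app.wsgi_app = wrap_wsgi_app(app.wsgi_app)   # route bundled-service calls

@app.route('/cached-greeting')
def cached_greeting():
    greeting = memcache.get('greeting')      # same Memcache API as before
    if greeting is None:
        greeting = 'hello, cached world'
        memcache.set('greeting', greeting, time=3600)
    return greeting
```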

If you do want to move to Cloud Memorystore, stay tuned for the Module 13 video or try its codelab to get a sneak peek. All Serverless Migration Station content (codelabs, videos, source code [when available]) can be accessed at its open source repo. While our content initially focuses on Python users, we hope to one day cover other language runtimes, so stay tuned. For additional video content, check out our broader Serverless Expeditions series.

Vector-Quantized Image Modeling with Improved VQGAN

In recent years, natural language processing models have dramatically improved their ability to learn general-purpose representations, which has resulted in significant performance gains for a wide range of natural language generation and natural language understanding tasks. In large part, this has been accomplished through pre-training language models on extensive unlabeled text corpora.

This pre-training formulation does not make assumptions about input signal modality, which can be language, vision, or audio, among others. Several recent papers have exploited this formulation to dramatically improve image generation results through pre-quantizing images into discrete integer codes (represented as natural numbers), and modeling them autoregressively (i.e., predicting sequences one token at a time). In these approaches, a convolutional neural network (CNN) is trained to encode an image into discrete tokens, each corresponding to a small patch of the image. A second stage CNN or Transformer is then trained to model the distribution of encoded latent variables. The second stage can also be applied to autoregressively generate an image after the training. But while such models have achieved strong performance for image generation, few studies have evaluated the learned representation for downstream discriminative tasks (such as image classification).

In “Vector-Quantized Image Modeling with Improved VQGAN”, we propose a two-stage model that reconceives traditional image quantization techniques to yield improved performance on image generation and image understanding tasks. In the first stage, an image quantization model, called VQGAN, encodes an image into lower-dimensional discrete latent codes. Then a Transformer model is trained to model the quantized latent codes of an image. This approach, which we call Vector-quantized Image Modeling (VIM), can be used for both image generation and unsupervised image representation learning. We describe multiple improvements to the image quantizer and show that training a stronger image quantizer is a key component for improving both image generation and image understanding.

Vector-Quantized Image Modeling with ViT-VQGAN
One recent, commonly used model that quantizes images into integer tokens is the Vector-quantized Variational AutoEncoder (VQVAE), a CNN-based auto-encoder whose latent space is a matrix of discrete learnable variables, trained end-to-end. VQGAN is an improved version of this that introduces an adversarial loss to promote high quality reconstruction. VQGAN uses transformer-like elements in the form of non-local attention blocks, which allows it to capture distant interactions using fewer layers.

In our work, we propose taking this approach one step further by replacing both the CNN encoder and decoder with ViT. In addition, we introduce a linear projection from the output of the encoder to a low-dimensional latent variable space for lookup of the integer tokens. Specifically, we reduced the encoder output from a 768-dimension vector to a 32- or 8-dimension vector per code, which we found encourages the decoder to better utilize the token outputs, improving model capacity and efficiency.
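As a toy illustration of that factorized lookup (not the actual model code), the sketch below projects 768-dimensional encoder outputs down to a 32-dimensional code space and picks the nearest codebook entry per patch; the codebook size and random parameters are stand-ins for learned ones.

```python
# Toy sketch of factorized code lookup: project 768-d encoder outputs down to
# a small code dimension, then find the nearest codebook entry for each patch.
import numpy as np

num_codes, enc_dim, code_dim = 8192, 768, 32     # illustrative sizes
rng = np.random.default_rng(0)

proj = rng.normal(size=(enc_dim, code_dim)) / np.sqrt(enc_dim)  # linear projection
codebook = rng.normal(size=(num_codes, code_dim))               # learned in practice

def quantize(encoder_out):
    """encoder_out: (num_patches, enc_dim) -> integer token ids (num_patches,)"""
    z = encoder_out @ proj                       # (num_patches, code_dim)
    # squared L2 distance to every codebook entry, computed without loops
    d = (z**2).sum(-1, keepdims=True) - 2 * z @ codebook.T + (codebook**2).sum(-1)
    return d.argmin(axis=-1)                     # index of the nearest code

tokens = quantize(rng.normal(size=(32 * 32, enc_dim)))   # 32x32 = 1024 patches
print(tokens.shape, tokens[:8])
```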

Overview of the proposed ViT-VQGAN (left) and VIM (right), which, when working together, is capable of both image generation and image understanding. In the first stage, ViT-VQGAN converts images into discrete integers, which the autoregressive Transformer (Stage 2) then learns to model. Finally, the Stage 1 decoder is applied to these tokens to enable generation of high quality images from scratch.

With our trained ViT-VQGAN, images are encoded into discrete tokens represented by integers, each of which encompasses an 8x8 patch of the input image. Using these tokens, we train a decoder-only Transformer to predict a sequence of image tokens autoregressively. This two-stage model, VIM, is able to perform unconditioned image generation by simply sampling token-by-token from the output softmax distribution of the Transformer model.

VIM is also capable of performing class-conditioned generation, such as synthesizing a specific image of a given class (e.g., a dog or a cat). We extend the unconditional generation to class-conditioned generation by prepending a class-ID token before the image tokens during both training and sampling.
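Conceptually, sampling then works like a decoder-only language model: seed the sequence with the class-ID token, repeatedly sample the next image token from the softmax output, and hand the finished token grid to the Stage 1 decoder. A schematic sketch, with transformer_logits and decode_tokens as hypothetical stand-ins for the trained models:

```python
# Schematic sketch of class-conditioned, token-by-token sampling; the model
# functions below are hypothetical stand-ins for the trained Stage 2
# Transformer and the Stage 1 ViT-VQGAN decoder.
import numpy as np

NUM_IMAGE_TOKENS = 32 * 32        # 256x256 image, one token per 8x8 patch
rng = np.random.default_rng(0)

def transformer_logits(tokens):
    """Stand-in: next-token logits given the sequence so far."""
    return rng.normal(size=8192)  # vocabulary of 8192 image tokens (illustrative)

def decode_tokens(tokens):
    """Stand-in for the Stage 1 decoder that renders tokens into pixels."""
    return np.zeros((256, 256, 3))

def sample_image(class_id):
    tokens = [class_id]                           # condition by prepending the class ID
    for _ in range(NUM_IMAGE_TOKENS):
        logits = transformer_logits(tokens)
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()                      # softmax over the token vocabulary
        tokens.append(rng.choice(len(probs), p=probs))
    return decode_tokens(tokens[1:])              # drop the class token before decoding

image = sample_image(class_id=207)                # e.g., an ImageNet class index
```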

Uncurated set of dog samples from class-conditioned image generation trained on ImageNet. Conditioned classes: Irish terrier, Norfolk terrier, Norwich terrier, Yorkshire terrier, wire-haired fox terrier, Lakeland terrier.

To test the image understanding capabilities of VIM, we also fine-tune a linear projection layer to perform ImageNet classification, a standard benchmark for measuring image understanding abilities. Similar to ImageGPT, we take a layer output at a specific block, average over the sequence of token features (frozen) and insert a softmax layer (learnable) projecting averaged features to class logits. This allows us to capture intermediate features that provide more information useful for representation learning.
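A minimal sketch of that probing setup, with frozen_block_features as a hypothetical stand-in for the frozen intermediate block, could look like this:

```python
# Toy sketch of the linear-probe setup: pool frozen token features from an
# intermediate Transformer block, then train only a linear softmax head.
import numpy as np

rng = np.random.default_rng(0)
num_classes, feat_dim, seq_len = 1000, 1024, 1024   # illustrative sizes

def frozen_block_features(image_tokens):
    """Stand-in for a frozen intermediate Transformer block's outputs."""
    return rng.normal(size=(seq_len, feat_dim))

# learnable parameters: just the linear head
W = np.zeros((feat_dim, num_classes))
b = np.zeros(num_classes)

def probe_logits(image_tokens):
    feats = frozen_block_features(image_tokens)   # (seq_len, feat_dim), frozen
    pooled = feats.mean(axis=0)                   # average over the token sequence
    return pooled @ W + b                         # class logits from the linear head

logits = probe_logits(image_tokens=None)          # placeholder input
print(logits.shape)                               # (1000,)
```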

Experimental Results
We train all ViT-VQGAN models with a training batch size of 256, distributed across 128 Cloud TPU v4 cores. All models are trained at an input image resolution of 256x256. On top of the pre-learned ViT-VQGAN image quantizer, we train Transformer models for unconditional and class-conditioned image synthesis and compare with previous work.

We measure the performance of our proposed methods for class-conditioned image synthesis and unsupervised representation learning on the widely used ImageNet benchmark. The table below shows class-conditioned image synthesis performance measured by the Fréchet Inception Distance (FID). Compared to prior work, VIM improves the FID to 3.04 (lower is better), a relative improvement of 58.6% over the VQGAN model (FID 7.35). VIM also improves the Inception Score (IS), which goes from 188.6 to 227.4, a 20.6% relative improvement over VQGAN.

Model                 Acceptance Rate    FID     IS
Validation data       1.0                1.62    235.0
DCTransformer         1.0                36.5    N/A
BigGAN                1.0                7.53    168.6
BigGAN-deep           1.0                6.84    203.6
IDDPM                 1.0                12.3    N/A
ADM-G, 1.0 guid.      1.0                4.59    186.7
VQVAE-2               1.0                ~31     ~45
VQGAN                 1.0                17.04   70.6
VQGAN                 0.5                10.26   125.5
VQGAN                 0.25               7.35    188.6
ViT-VQGAN (Ours)      1.0                4.17    175.1
ViT-VQGAN (Ours)      0.5                3.04    227.4

Fréchet Inception Distance (FID) and Inception Score (IS) comparison between different models for class-conditional image synthesis on ImageNet at 256x256 resolution. The acceptance rate shows results filtered by a ResNet-101 classification model, similar to the process used in VQGAN.

After training a generative model, we test the learned image representations by fine-tuning a linear layer to perform ImageNet classification, a standard benchmark for measuring image understanding abilities. Our model outperforms previous generative models on the image understanding task, improving classification accuracy through linear probing (i.e., training a single linear classification layer, while keeping the rest of the model frozen) from 60.3% (iGPT-L) to 73.2%. These results showcase VIM’s strong generation results as well as image representation learning abilities.

Conclusion
We propose Vector-quantized Image Modeling (VIM), which pretrains a Transformer to predict image tokens autoregressively, where discrete image tokens are produced from improved ViT-VQGAN image quantizers. With our proposed improvements on image quantization, we demonstrate superior results on both image generation and understanding. We hope our results can inspire future work towards more unified approaches for image generation and understanding.

Acknowledgements
We would like to thank Xin Li, Han Zhang, Ruoming Pang, James Qin, Alexander Ku, Yuanzhong Xu, Jason Baldridge, Yonghui Wu for the preparation of the VIM paper. We thank Wei Han, Yuan Cao, Jiquan Ngiam‎, Vijay Vasudevan, Zhifeng Chen and Claire Cui for helpful discussions and feedback, and others on the Google Research and Brain Team for support throughout this project.

Source: Google AI Blog


Export log data in near-real time to BigQuery

Quick summary 

Currently, you can export Google Workspace logs to Google BigQuery for customized and scalable reporting. Exports take place as a daily sync and return log data that can be up to three days old. With this launch, exported log data will stream in near-real time (under 10 minutes), giving you fresh data for your export. This helps you stay on top of security threats and analysis with the most up-to-date activity log data. 
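As a rough illustration of how you might poll for the freshest events once the export is streaming, here is a small example using the BigQuery Python client; the project, dataset, table, and column names are placeholders to adapt to your own export's schema.

```python
# Hedged example: query recently exported Workspace activity events with the
# google-cloud-bigquery client. The table and column names below are
# placeholders; substitute the dataset you configured for the export.
from google.cloud import bigquery

client = bigquery.Client()
query = """
    SELECT time_usec, email, event_name
    FROM `my-project.workspace_logs.activity`     -- placeholder table name
    WHERE TIMESTAMP_MICROS(time_usec) >
          TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
    ORDER BY time_usec DESC
    LIMIT 100
"""
for row in client.query(query).result():          # run the query and iterate rows
    print(row.time_usec, row.email, row.event_name)
```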



Stream activity log data in near-real time when using BigQuery export




Getting started 

  • Admins: This feature works automatically if you have set up service log exports to BigQuery. There is no additional admin control for this feature. 
  • End users: There is no end user impact. 

Rollout pace 


Availability 

  • Available to Google Workspace Enterprise Standard, Enterprise Plus, Education Standard, and Education Plus 
  • Not available to Google Workspace Essentials, Business Starter, Business Standard, Business Plus, Enterprise Essentials, Education Fundamentals, Frontline, and Nonprofits, as well as legacy G Suite Basic and Business customers 
  • Not available to users with personal Google Accounts 

Resources 

New and updated third-party DevOps integrations for Google Chat, including PagerDuty

What’s changing

We’re introducing and updating a variety of additional DevOps integrations, which will allow you to take action on common workflows directly in Google Chat: 

  • Apps such as Google Cloud Build, Asana, GitHub, Jenkins, and more have been updated with new functionality: 
    • Slash commands for quick actions, such as creating a new Asana task or triggering a build in Jenkins or Google Cloud Build (see the sketch after this list). 
    • Dialogs for important flows, such as setting up the app or entering detailed information when creating a GitHub issue. 
  • Operations and incident response professionals can use the new PagerDuty integration to take action on PagerDuty incidents directly from Chat. You’ll be able to: 
    • Receive notifications of PagerDuty incidents right in Google Chat. 
    • Take action, including acknowledging and resolving incidents, without leaving the conversation. 
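To give a sense of how such integrations hook into Chat, here is a minimal, hypothetical sketch of an HTTP Chat app responding to a slash command; the command ID and reply are illustrative, and a production app would also verify the request's authenticity.

```python
# Minimal sketch of a Google Chat app that responds to a slash command over
# HTTP. The command ID and reply text are illustrative only.
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/', methods=['POST'])
def on_chat_event():
    event = request.get_json()                      # Chat event payload
    if event.get('type') == 'MESSAGE':
        slash = event.get('message', {}).get('slashCommand')
        if slash and slash.get('commandId') == '1':  # e.g., a /create-task command
            return jsonify({'text': 'Creating a new task for you...'})
        return jsonify({'text': 'Try a slash command like /create-task.'})
    return jsonify({})                               # ignore other event types
```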



You can find these integrations and a complete list of other Google-developed Chat apps here. 


Who’s impacted 

Admins and end users 


Why you’d use them 

We hope these additional third-party integrations within Chat help you collaborate and get work done faster by eliminating the need to switch between various apps and browser tabs. 


Additional details 

We plan to introduce the ability to create dedicated spaces to collaborate with teammates on important incidents to resolve them quickly, with the right people. We will provide an update on the Workspace Updates blog when that functionality becomes available. 


Getting started 


Rollout pace 


Availability 

  • Available to Google Workspace customers, as well as legacy G Suite Basic and Business customers 

Resources