New cookie choices in Europe

If you’ve visited a website in Europe, chances are you’ve seen a cookie consent banner. Cookies help sites remember information about your visit, so they can do things like display text in your preferred language, make sure you’re a real user and not a pesky bot, or estimate whether or not an ad campaign is working.

In the past year, regulators who interpret European laws requiring these banners, including data protection authorities in France, Germany, Ireland, Italy, Spain and the U.K., have updated their guidance for compliance. We’re committed to meeting the standards of that updated guidance and have been working with a number of these authorities.

Based on these conversations and specific direction from France’s Commission Nationale de l’Informatique et des Libertés (CNIL), we have now completed a full redesign of our approach, including changes to the infrastructure we use to handle cookies.

A box that reads, “Before you continue to YouTube,” explains cookies, and asks you to “Reject all” or “Accept all” with one click

Our new cookie banners began rolling out earlier this month on YouTube in France and will soon be coming to all Google users in Europe.

Soon, anyone visiting Search and YouTube in Europe while signed out or in Incognito Mode will see a new cookie consent choice. This update, which began rolling out earlier this month on YouTube, will provide you with equal “Reject all” and “Accept all” buttons on the first screen in your preferred language. (You can also still choose to customize your choice in more detail with “More options.”)

We’ve kicked off the launch in France and will be extending this experience across the rest of the European Economic Area, the U.K. and Switzerland. Before long, users in the region will have a new cookie choice — one that can be accepted or rejected with a single click.

Not just a new button

This update meant we needed to re-engineer the way cookies work on Google sites, and to make deep, coordinated changes to critical Google infrastructure. Moreover, we knew that these changes would impact not only Search and YouTube, but also the sites and content creators who use them to help grow their businesses and make a living.

We believe this update responds to updated regulatory guidance and is aligned with our broader goal of helping build a more sustainable future for the web. We’ve committed to building new privacy-preserving technologies in the Privacy Sandbox for the same reason. We believe it is possible both to protect people’s privacy online and to give companies and developers tools to build thriving digital businesses.

Equiano’s next stop is in Nigeria

Last month, we announced that the Equiano undersea cable successfully landed in Togo. This was the first in a series of landings on the continent for the subsea cable, which will run from Portugal along Africa's west coast to South Africa. Today, we're thrilled to announce our second Africa landing in Lagos, Nigeria. While many subsea cables are named after historical luminaries, the Equiano cable has special resonance for Nigeria. It’s named after Olaudah Equiano, a Nigerian-born writer and abolitionist, so its landing in Lagos today is like a homecoming.




Nigeria is sub-Saharan Africa’s largest economy. Yet as of 2020, only about 35% of its people used the internet, double the share in 2012. Across much of the country, people lack affordable, reliable, high-quality access, which limits their ability to benefit from, and contribute to, the digital economy.



Since 2017, the Nigerian government has been actively working on its digital transformation programs as part of plans to grow its domestic sectors. These initiatives have proven pivotal to the success of many industries in the country, especially the startup space. In the last five years, startups in Nigeria have produced five unicorns (startups valued at over a billion dollars). Businesses are also benefiting significantly from the use of internet platforms, with total annual e-commerce spending projected to climb from roughly $12 billion today to $75 billion by 2025.





Though a great deal of progress has been made, studies suggest that faster internet connections, better user experiences, and reduced internet costs will help accelerate these benefits.



The landing in Lagos is one of the critical stages leading up to the cable’s deployment later this year, and Equiano is expected to deliver up to 20 times more capacity than the region’s prior cables. We’ve worked with established partners and in-country experts to ensure that Equiano has the greatest possible impact in Nigeria and throughout Africa.



A recent economic impact assessment conducted by Africa Practice and Genesis Analytics states that Equiano's arrival in Nigeria is expected to result in faster internet speeds and significantly improved online experiences. Internet speeds in Nigeria are expected to grow almost sixfold by 2025, and retail internet prices are forecast to decline by 21% over the same period. The same study found that by 2025, real GDP in Nigeria is forecast to be $10.1 billion higher than it otherwise would have been without Equiano, and that the cable would indirectly generate roughly 1.6 million new jobs between 2022 and 2025.




We are partnering with multiple key telecom players, including the West Indian Ocean Cable Company (WIOCC), our landing partner where Equiano comes ashore, to ensure that the cable can reach more businesses and end users across Nigeria and the African continent more broadly.




With Equiano, we look forward to being an even more integral part of the digital transformation journey in Nigeria.



Posted by Juliet Ehimuan, Director, West Africa



The city using Google tools for environmental education

Since launching Google’s Environmental Insights Explorer (EIE) in 2018, my team and I have seen how data can help local governments develop relevant climate plans.

EIE is a free tool designed to help measure emission sources and identify strategies to reduce emissions. In Pune City, India, the local government has used data from EIE to better analyze trip emissions. In Australia, Ironbark Sustainability and Beyond Zero Emissions have developed Snapshot Climate, a community climate tool that incorporates EIE transportation and emissions data — and shares it with local councils and other organizations across the country.

So far, over 320 cities worldwide have made their data available for the public to view through the platform — including West Nusa Tenggara, in Indonesia, the first place in Southeast Asia to adopt EIE.

While we have seen how EIE has helped cities shape their efforts to reduce emissions using data, that’s not the only benefit that the tool offers. Cities like Yokohama in Japan are also using it to educate their citizens.

I wanted to learn more about this initiative — so in the lead-up to Earth Day this week, I sat down with Hiroki Miyajima, the Executive Director of the General Affairs Department in the International Affairs Bureau of the City of Yokohama.

Hiroki-san, it’s wonderful to know that Yokohama City uses Google’s Environmental Insights Explorer (EIE). What motivated the city to use this tool?

I was introduced to EIE back in 2020 and found it to be an excellent tool with visual capabilities and accessible simulation features for us to understand our city better. As we already had data on greenhouse gas emissions, I saw the tool as a great way to build awareness around sustainability among our citizens.

Households in Yokohama generate about 25% of our current CO2 emissions. With our mayor having announced a goal of reducing emissions by 50% by 2030, we need to encourage our citizens to change their behavior as we work towards decarbonization. That starts with education, in particular for children and young people: our next generation. We’ve begun incorporating EIE into education programs from junior high school to universities. By exploring EIE, these students can visualize and better understand the impacts of CO2 emissions.

A male student with a mask on, looking at the Environmental Insights Explorer on his computer.

A student using the Environmental Insights Explorer in class.

What impact have you seen since the education programs have rolled out?

I’ve heard several anecdotal stories from teachers. After attending one class, a junior high school student commented that he would make sure to turn off unnecessary electricity if he saw no one using the classroom. Another student said he plans to incorporate energy-saving ideas at home and share what he learns with his parents.

At universities, we see student teams incorporating EIE data into their projects. For instance, one group created a report on promoting the use of electric vehicles and shared their presentation at an international conference held by the Ministry of Foreign Affairs.

I’m incredibly encouraged knowing that our younger generation cares about their city and this planet. Through education, we can motivate them to take practical action, no matter how big or small those actions are. We look forward to bringing EIE to more institutions.

Why should other cities consider getting on board in using EIE with city planning?

We’ve been collaborating and supporting urban development projects with emerging cities in Southeast Asia. We’ve noticed that many of these cities have not had the chance to calculate the amount of GHG emissions they generate. One reason for this is that calculating emissions can be time-consuming and requires significant funding. However, using EIE, it’s possible to get insightful data efficiently and effectively.

If you’re part of a local government and interested in what EIE can do for your community, fill out this form to get in touch with our team, or visit our website.

Chrome Beta for Android Update

Hi everyone! We've just released Chrome Beta 101 (101.0.4951.41) for Android. It's now available on Google Play.

You can see a partial list of the changes in the Git log. For details on new features, check out the Chromium blog, and for details on web platform updates, check here.

If you find a new issue, please let us know by filing a bug.

Ben Mason
Google Chrome

Set up host controls and assign co-hosts ahead of meetings in Google Calendar

Quick summary

In addition to setting up Google Meet breakout rooms in advance in Google Calendar, meeting organizers can also: 
  • Turn meeting safety features on or off, such as chat lock, present lock, and more. 
  • Designate co-hosts before the meeting. 

We hope that by allowing meeting hosts to pre-configure additional settings and assign co-hosts, meetings can flow more smoothly. 

Getting started 

  • Admins: Visit the Help Center to learn more about managing Meet safety settings and Host Management
  • End users: To configure host control and co-host options when scheduling a meeting in Google Calendar, select “Add Google Meet video conferencing” > “Video call options” (gear icon) > Host controls or Co-hosts

Rollout pace 


Availability 

Moderation Settings 
  • Available to all Google Workspace customers, as well as legacy G Suite Basic and Business customers. 
  • Available to users with a personal Google account 

Co-host settings 
  • Available to Google Workspace Essentials, Enterprise Essentials, Enterprise Standard, Enterprise Plus, Business Standard, Business Plus, Education Fundamentals, Education Standard, Education Plus, and Teaching and Learning Upgrade customers 
  • Not available to Google Workspace Business Starter, Frontline, and Nonprofits, as well as legacy G Suite Basic and Business customers 

Resources 

FormNet: Beyond Sequential Modeling for Form-Based Document Understanding

Form-based document understanding is a growing research topic because of its practical potential for automatically converting unstructured text data into structured information to gain insight about a document’s contents. Recent sequence models built on self-attention, a mechanism that directly models relationships between all words in a selection of text, have demonstrated state-of-the-art performance on natural language tasks. A natural approach to handling form document understanding tasks is to first serialize the form documents (usually in a left-to-right, top-to-bottom fashion) and then apply state-of-the-art sequence models to them.

However, form documents often have more complex layouts that contain structured objects, such as tables, columns, and text blocks. Their variety of layout patterns makes serialization difficult, substantially limiting the performance of strict serialization approaches. These unique challenges in form document structural modeling have been largely underexplored in the literature.

An illustration of the form document information extraction task using an example from the FUNSD dataset.

In “FormNet: Structural Encoding Beyond Sequential Modeling in Form Document Information Extraction”, presented at ACL 2022, we propose a structure-aware sequence model, called FormNet, to mitigate the sub-optimal serialization of forms for document information extraction. First, we design a Rich Attention (RichAtt) mechanism that leverages the 2D spatial relationship between word tokens for more accurate attention weight calculation. Then, we construct Super-Tokens (tokens that aggregate semantically meaningful information from neighboring tokens) for each word by embedding representations from their neighboring tokens through a graph convolutional network (GCN). Finally, we demonstrate that FormNet outperforms existing methods, while using less pre-training data, and achieves state-of-the-art performance on the CORD, FUNSD, and Payment benchmarks.

FormNet for Information Extraction
Given a form document, we first use the BERT-multilingual vocabulary and optical character recognition (OCR) engine to identify and tokenize words. We then feed the tokens and their corresponding 2D coordinates into a GCN for graph construction and message passing. Next, we use Extended Transformer Construction (ETC) layers with the proposed RichAtt mechanism to continue to process the GCN-encoded structure-aware tokens for schema learning (i.e., semantic entity extraction). Finally, we use the Viterbi algorithm, which finds a sequence that maximizes the posterior probability, to decode and obtain the final entities for output.
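The final Viterbi decoding step can be illustrated with a short sketch. The emission and transition scores below are hypothetical placeholders for FormNet's output logits, not the model's actual values:

```python
import numpy as np

def viterbi(emissions, transitions):
    """Return the label sequence that maximizes the total path score.

    emissions:   (T, K) per-token label scores
    transitions: (K, K) score for moving from label i to label j
    """
    T, K = emissions.shape
    score = emissions[0].copy()           # best score ending in each label
    backptr = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # candidate[i, j]: best path ending in label i, then label j at step t
        candidate = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = candidate.argmax(axis=0)
        score = candidate.max(axis=0)
    # Trace back the highest-scoring path
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]
```

In FormNet the emission scores would come from the RichAtt ETC network and the transitions would be learned; here both are free inputs.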

Extended Transformer Construction (ETC)
We adopt ETC as the FormNet model backbone. ETC scales to relatively long inputs by replacing standard attention, which has quadratic complexity, with a sparse global-local attention mechanism that distinguishes between global and long input tokens. The global tokens attend to and are attended by all tokens, but the long tokens attend only locally to other long tokens within a specified local radius, reducing the complexity so that it is more manageable for long sequences.
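A toy illustration of how such a global-local attention pattern can be expressed as a boolean mask (the function name and layout are our own, not from the ETC codebase):

```python
import numpy as np

def global_local_mask(n_global, n_long, radius):
    """Boolean attention mask for an ETC-style sparse attention pattern.

    Row i may attend to column j iff mask[i, j] is True.
    Global tokens come first, followed by the long input tokens.
    """
    n = n_global + n_long
    mask = np.zeros((n, n), dtype=bool)
    mask[:n_global, :] = True            # global tokens attend to everything
    mask[:, :n_global] = True            # every token attends to global tokens
    for i in range(n_long):              # long tokens: local window only
        lo = max(0, i - radius)
        hi = min(n_long, i + radius + 1)
        mask[n_global + i, n_global + lo:n_global + hi] = True
    return mask
```

Because each long token touches only O(radius) neighbors, the cost grows linearly with sequence length instead of quadratically.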

Rich Attention
Our novel architecture, RichAtt, avoids the deficiencies of absolute and relative embeddings by avoiding embeddings entirely. Instead, it computes the order of and log distance between pairs of tokens with respect to the x and y axes on the layout grid, and adjusts the pre-softmax attention scores of each pair as a direct function of these values.

In a traditional attention layer, each token representation is linearly transformed into a Query vector, a Key vector, and a Value vector. A token “looks” for other tokens from which it might want to absorb information (i.e., attend to) by finding the ones with Key vectors that create relatively high scores when matrix-multiplied (called Matmul) by its Query vector and then softmax-normalized. The token then sums together the Value vectors of all other tokens in the sentence, weighted by their score, and passes this up the network, where it will normally be added to the token’s original input vector.
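This standard attention computation can be sketched in a few lines for a single head (a simplified illustration, omitting multi-head projections and masking):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(X, Wq, Wk, Wv):
    """Single-head dot-product attention over token representations X (T, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # pre-softmax attention scores
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V, weights              # weighted sum of Value vectors
```

The returned `weights` row for a token is exactly the "how strongly do I attend to each other token" distribution described above.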

However, other features beyond the Query and Key vectors are often relevant to the decision of how strongly a token should attend to another given token, such as the order they’re in, how many other tokens separate them, or how many pixels apart they are. In order to incorporate these features into the system, we use a trainable parametric function paired with an error network, which takes the observed feature and the output of the parametric function and returns a penalty that reduces the dot product attention score.

The network uses the Query and Key vectors to consider what value some low-level feature (e.g., distance) should take if the tokens are related, and penalizes the attention score based on the error.

At a high level, for each attention head at each layer, FormNet examines each pair of token representations, determines the ideal features the tokens should have if there is a meaningful relationship between them, and penalizes the attention score according to how different the actual features are from the ideal ones. This allows the model to learn constraints on attention using logical implication.
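A heavily simplified sketch of this idea, using a squared-error penalty on observed log distances along one layout axis (the paper's actual parametric and error functions are more elaborate; everything here is illustrative):

```python
import numpy as np

def rich_attention_scores(Q, K, xpos, pred_dist_fn):
    """Simplified RichAtt-style score adjustment (illustrative only).

    Q, K:         (T, d) query and key vectors
    xpos:         (T,) token x-coordinates on the layout grid
    pred_dist_fn: maps (q_i, k_j) -> predicted log-distance if related
    """
    T = Q.shape[0]
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Observed low-level feature: log distance along the x axis
    log_dist = np.log1p(np.abs(xpos[:, None] - xpos[None, :]))
    for i in range(T):
        for j in range(T):
            predicted = pred_dist_fn(Q[i], K[j])
            # Squared-error penalty: the further the observed distance is
            # from the predicted one, the lower the attention score
            scores[i, j] -= (log_dist[i, j] - predicted) ** 2
    return scores
```

With this penalty, token pairs whose observed layout feature matches the network's prediction keep their score, while mismatched pairs are suppressed before the softmax.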

A visualization of how RichAtt might act on a sentence. There are three adjectives that the word “crow” might attend to. “Lazy” is to the right, so it probably does not modify “crow” and its attention edge is penalized. “Sly” is many tokens away, so its attention edge is also penalized. “Cunning” receives no significant penalties, so by process of elimination, it is the best candidate for attention.

Furthermore, if one assumes that the softmax-normalized attention scores represent a probability distribution, and the distributions for the observed features are known, then this algorithm — including the exact choice of parametric functions and error functions — falls out algebraically, meaning FormNet has a mathematical correctness to it that is lacking from many alternatives (including relative embeddings).

Super-Tokens by Graph Learning
The key to sparsifying attention mechanisms in ETC for long sequence modeling is to have every token only attend to tokens that are nearby in the serialized sequence. Although the RichAtt mechanism empowers the transformers by taking the spatial layout structures into account, poor serialization can still block significant attention weight calculation between related word tokens.

To further mitigate the issue, we construct a graph to connect nearby tokens in a form document. We design the edges of the graph based on strong inductive biases so that they have higher probabilities of belonging to the same entity type. For each token, we obtain its Super-Token embedding by applying graph convolutions along these edges to aggregate semantically relevant information from neighboring tokens. We then use these Super-Tokens as an input to the RichAtt ETC architecture. This means that even though an entity may get broken up into multiple segments due to poor serialization, the Super-Tokens learned by the GCN will have retained much of the context of the entity phrase.
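Neighborhood aggregation of this kind can be sketched with a single mean-aggregation graph-convolution step (the real FormNet GCN is trained end to end with learned edge designs; the names here are ours):

```python
import numpy as np

def super_tokens(X, edges, W):
    """One mean-aggregation graph-convolution step.

    X:     (T, d) token embeddings
    edges: list of undirected (i, j) pairs between nearby tokens
    W:     (d, d) projection matrix (learned in the real model)
    """
    T = X.shape[0]
    A = np.eye(T)                        # self-loops keep each token's own features
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    A /= A.sum(axis=1, keepdims=True)    # row-normalize: mean over neighbors
    return np.maximum(A @ X @ W, 0.0)    # ReLU(Â X W)
```

Each output row mixes a token's embedding with those of its graph neighbors, which is what lets a Super-Token retain entity context even when serialization splits the entity apart.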

An illustration of the word-level graph, with blue edges between tokens, of a FUNSD document.

Key Results
The figure below shows model size vs. F1 score (the harmonic mean of the precision and recall) for recent approaches on the CORD benchmark. FormNet-A2 outperforms the most recent DocFormer while using a model that is 2.5x smaller. FormNet-A3 achieves state-of-the-art performance with a 97.28% F1 score. For more experimental results, please refer to the paper.

Model Size vs. Entity Extraction F1 Score on CORD benchmark. FormNet significantly outperforms other recent approaches in absolute F1 performance and parameter efficiency.
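For reference, the F1 score used throughout these comparisons is simply the harmonic mean of precision and recall:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both in [0, 1])."""
    return 2 * precision * recall / (precision + recall)
```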

We study the importance of RichAtt and Super-Token by GCN on the large-scale masked language modeling (MLM) pre-training task across three FormNets. Both RichAtt and GCN components improve upon the ETC baseline on reconstructing the masked tokens by a large margin, showing the effectiveness of their structural encoding capability on form documents. The best performance is obtained when incorporating both RichAtt and GCN.

Performance of the Masked-Language Modeling (MLM) pre-training. Both the proposed RichAtt and Super-Token by GCN components improve upon ETC baseline by a large margin, showing the effectiveness of their structural encoding capability on large-scale form documents.

Using BertViz, we visualize the local-to-local attention scores for specific examples from the CORD dataset for the standard ETC and FormNet models. Qualitatively, we confirm that for FormNet the tokens attend primarily to other tokens within the same visual block. Moreover, for that model, specific attention heads attend to tokens aligned horizontally, which is a strong signal of meaning for form documents. No clear attention pattern emerges for the ETC model, suggesting that RichAtt and Super-Token by GCN enable the model to learn structural cues and leverage layout information effectively.

The attention scores for ETC and FormNet (ETC+RichAtt+GCN) models. Unlike the ETC model, the FormNet model makes tokens attend to other tokens within the same visual blocks, along with tokens aligned horizontally, thus strongly leveraging structural cues.

Conclusion
We present FormNet, a novel model architecture for form-based document understanding. We determine that the novel RichAtt mechanism and Super-Token components help the ETC transformer excel at form understanding in spite of sub-optimal, noisy serialization. We demonstrate that FormNet recovers local syntactic information that may have been lost during text serialization and achieves state-of-the-art performance on three benchmarks.

Acknowledgements
This research was conducted by Chen-Yu Lee, Chun-Liang Li, Timothy Dozat, Vincent Perot, Guolong Su, Nan Hua, Joshua Ainslie, Renshen Wang, Yasuhisa Fujii, and Tomas Pfister. Thanks to Evan Huang, Shengyang Dai, and Salem Elie Haykal for their valuable feedback, and Tom Small for creating the animation in this post.

Source: Google AI Blog


Beta Channel Update for Desktop

The Beta channel has been updated to 101.0.4951.41 for Windows, Mac and Linux.

A full list of changes in this build is available in the log. Interested in switching release channels? Find out how here. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.


Prudhvikumar Bommana
Google Chrome

Find great extensions with new Chrome Web Store badges

Since 2009, publishers have been hard at work building extensions that make Chrome more powerful, useful and customizable for users. It has always been our mission to make it easy for users to find great extensions while recognizing the publishers who create them. Today, we’re announcing two new extension badges to help us deliver on our goal: the Featured badge and the Established Publisher badge. Both badges are live on the Chrome Web Store today.

Featured badge

Picture featuring UI of Featured badge

The Featured badge is assigned to extensions that follow our technical best practices and meet a high standard of user experience and design. Chrome team members manually evaluate each extension before it receives the badge, paying special attention to the following:

  1. Adherence to Chrome Web Store’s best practices guidelines, including providing an enjoyable and intuitive experience, using the latest platform APIs and respecting the privacy of end-users.
  2. A store listing page that is clear and helpful for users, with quality images and a detailed description.

Established Publisher badge

Picture featuring UI of Established Publisher badge

The Established Publisher badge showcases publishers who have verified their identity and demonstrated compliance with the developer program policies. This badge is granted to publishers who meet the following two conditions:

  1. The publisher's identity has been verified.
  2. The publisher has established a consistent positive track record with Google services and compliance with the Developer Program Policy.

As our goal is to help users find great extensions, publishers cannot pay to receive either badge. They can, however, submit a request for their extension to be reviewed for the Featured badge via the one-stop support page (under My item → I want to nominate my extension…).

If you’re a publisher, learn more about badging and discovery on Chrome Web Store.