Tag Archives: bigquery

Google Workspace Updates Weekly Recap – May 16, 2025

6 New updates

Unless otherwise indicated, the features below are available to all Google Workspace customers, and are fully launched or in the process of rolling out. Rollouts should take no more than 15 business days to complete if launching to both Rapid and Scheduled Release at the same time. If not, each stage of rollout should take no more than 15 business days to complete.

Easily connect to BigQuery data using Saved Queries in Connected Sheets

We’re launching direct access to the BigQuery saved queries feature directly from Connected Sheets. Users can now quickly and easily create a new Connected Sheet by selecting a previously saved query under any project they have access to. | Rolling out now to Rapid Release and Scheduled Release domains. | Available to all Google Workspace customers, Workspace Individual Subscribers, and users with personal Google accounts. | Learn more about analyzing & refreshing BigQuery data in Google Sheets using Connected Sheets, getting started with BigQuery data in Google Sheets and writing & editing a query.

Scan documents easier using Google Drive on Android devices

To improve upon the scanning experience in the Google Drive app on Android devices, we’re increasing the size and updating the style of the scan buttons. To scan something, open the Drive app on your Android device > scan a document > edit scan (if desired) > upload to Drive. | Rollout to Rapid Release domains is complete; Rolling out now to Scheduled Release domains, with expected completion by June 30, 2025. | Available to Google Workspace customers, Google Workspace Individual subscribers, and users with personal Google accounts. | Visit the Help Center to learn more about scanning documents with Google Drive.

Chat apps can now create spaces with their own identity using the Google Chat API

Last year, we introduced a feature through the Google Workspace Developer Preview Program that enables Chat apps to create spaces and memberships - using application identity - via the Google Chat API. This update further enhances Chat apps for real-time collaboration and allows for even more sophisticated and autonomous Chat apps by eliminating the need to create spaces and memberships on behalf of a user. This week, we’re excited to announce that this is now generally available for Google Workspace developers. | Rolling out now to Rapid Release and Scheduled Release domains. | Available to all Google Workspace customers. | Visit the Developer blog to see an example and learn more about what this means for developers. You can also visit our developer documentation to explore how to authenticate and authorize Chat apps, and learn how to create spaces and add space members using application identity.

Export Gemini responses to a Google Doc or Sheet from Gemini in the side panel of Google Drive

Users can collaborate with Gemini in Google Drive to perform various tasks, such as extracting data, drafting proposals, creating study guides and more. We’re excited to build upon this by giving users the ability to convert Gemini’s response directly into a Sheet or a Doc. After asking Gemini a question, click the ‘Export to Doc’ or ‘Export to Sheet’ button, and you’ll have an artifact that you can continue working on or share with others. | Rolling out now to Rapid Release and Scheduled Release domains. | Available to Business Standard and Plus; Enterprise Standard and Plus; Google One AI Premium; Customers with the Gemini Education or Gemini Education Premium add-on, and anyone who previously purchased a Gemini Business or Gemini Enterprise add-on. | Visit the Help Center to learn more about collaborating with Gemini in Google Drive.

Export Gemini responses to a Google Doc or Sheet from the Gemini side panel in Google Drive

Create even more interactive YouTube assignments in Google Classroom

In 2024, we added AI-suggested questions that educators can easily attach to the video based on its content to save educators time. Today, we’re excited to announce that we’re expanding the collection of YouTube videos that have AI-suggested questions available. As a result, educators can create even more interactive YouTube assignments in Classroom more quickly and efficiently, with the help of Gemini. Before assigning the interactive video activity, educators will be able to edit the suggested questions ahead of attaching the YouTube video to the assignment. | Rollout to Rapid Release and Scheduled Release domains is complete. | Available for Google Workspace Education Plus and the Teaching and Learning add-on. | Visit the Help Center to learn more about interactive questions for YouTube videos in Google Classroom.

Your users can now express interest in upgrading their Google Workspace subscription

We’re introducing a simple way for Workspace end users to request an upgraded edition of Google Workspace from their admins. To send this request, Workspace end users will see a path to request an upgraded Workspace edition including additional Gemini features when they interact with Gemini, such as with the Gemini icon displayed across Workspace apps. This interaction will provide an end user a path to complete a request form for an upgraded Workspace edition. Super admins will receive this request by email, and from there they can choose to upgrade their Workspace subscription directly in the Admin console. | Admins can disable these requests at any time in the Admin console by going to Account > Account Settings > Gemini for Google Workspace, and deselecting the ‘User-requested upgrades’ setting — use this article in our Help Center for more information. | Please note this is only available for customers in the US and India with fewer than 300 licenses. | Rolling out now to Rapid Release and Scheduled Release domains. | Learn more about Gemini for Workspace, including our recent Gemini for Workspace announcements.

Previous announcements

The announcements below were published on the Workspace Updates blog earlier this week. Please refer to the original blog posts for complete details.

Use Gemini in Drive to interact with folders in 20+ additional languages

Last year, we added folder support to Gemini in the side panel of Google Drive. This week, we’re excited to announce this experience is now available in 20+ languages. | Learn more about language availability for Gemini in Drive.

Use Gemini in Google Sheets to quickly add dropdowns, pivot tables, filters, and more

Analyzing and manipulating data in spreadsheets can be complex and time-consuming, even for experienced users. This week, we're introducing powerful new editing options within Gemini in Sheets that are designed to help everyone accomplish more, faster. | Learn more about Gemini in Sheets.

Google Workspace apps for Gmail, Google Drive, Google Docs, Calendar, Keep, and Tasks are now generally available for the Gemini app

Earlier this year, we launched Google Workspace apps (formerly known as "extensions") in Gemini in open beta. When enabled, Gemini can reference and incorporate data from these apps to generate even more informed and relevant responses, bringing Gemini’s capabilities more seamlessly into your daily workflows, helping enhance productivity. Workspace apps are available for: Calendar, Docs, Drive, Gmail, Keep, Tasks. | Learn more about Workspace apps in the Gemini app.

Extract and categorize data in AppSheet with the power of Gemini

At Google Cloud Next 2025, we introduced Gemini in AppSheet solutions for AppSheet Enterprise Plus users. Now you can automatically extract key information from uploaded photos, parse through complex PDFs, or even categorize, route and prioritize incoming requests based on their content – all seamlessly within your existing AppSheet apps. The new AI Task (Preview) feature, powered by Gemini, makes this a reality. | Learn more about Gemini in AppSheet.

Upcoming Changes to Microsoft Exchange Migrations

Beginning May 31, 2025, the legacy Data Migration Service will no longer support migrating email, calendar, and contact data from Microsoft Exchange on-premises servers (e.g., Exchange 2010, 2013, or 2016). Admins will not be able to start new migrations after May 31, 2025 – migrations that are in progress will continue until completion. | Learn more about changes to Microsoft Exchange Migrations.

Completed rollouts

The features below completed their rollouts to Rapid Release domains, Scheduled Release domains, or both. Please refer to the original blog posts for additional details.

Rapid Release Domains:

Start a conversation with Gemini in the side panel of Google Drive about specific files

Scheduled Release Domains:

Rapid and Scheduled Release Domains:

For a recap of announcements in the past six months, check out What’s new in Google Workspace (recent releases).

Source: Google Workspace Updates

Producer java library for Data Lineage is now open source

Integrating OpenLineage producers with GCP Lineage just got a lot easier

What is Data Lineage

Data Lineage is a GCP feature that allows tracking data movement. This tool helps data owners and analysts detect anomalies in data flows, find connections between data sources and verify the potential consequences of planned changes in data pipelines.

Lineage is injected automatically for some Google Cloud products (BigQuery, Cloud Data Fusion, Cloud Composer, Dataproc, Vertex AI). That means, if Lineage integration with any of those products is enabled in the projects, data movements coming from executing jobs by these products will be reported to GCP Lineage.

For custom integrations, the API can be used to report and fetch lineage.

After injecting, lineage can be viewed in the Google Cloud console (available from DataCatalog UI, BigQuery UI, Vertex UI). There are two representations: graph view, with data sources as nodes and data movements as edges, and list view, a tabular representation. Lineage information can also be fetched from the API.

More information is available in the documentation.

GCP Lineage information model

We describe data flows using the following concepts:

Process is a definition of some data transformation. For example, a SQL or Spark script.

Run is an execution of a Process.

Lineage Event is a data transformation event. It is reported in context of a Run.

A Link represents a connection between two data sources, when data in the link’s Target depends on its Source. A Lineage Event contains a list of Links.

OpenLineage support

OpenLineage is an open standard for reporting lineage information. It unifies lineage reporting between systems, which means the events generated in this format can be consumed by any product supporting it. This leads to more flexibility: adding or replacing a lineage producer does not imply changing the consumer, and vice versa.

OpenLineage format is adopted by a number of lineage producers and consumers, meaning there is already tooling available to report lineage from/to those systems. GCP Lineage is one of those consumers: users can report events in OpenLineage format, see the resulting lineage on the UI, and query it via the API.

OpenLineage is the preferred method for reporting lineage in GCP Lineage. It is used by the Dataproc lineage integration. To find out more about sending OpenLineage events to GCP Lineage refer to the documentation.

After injecting lineage in OpenLineage format, it can be accessed in the same way as if it was injected via other API methods or automatically: from the Google Cloud console or the API.

Why producer library

The GCP Lineage producer library is an extension of the client library. Client libraries are recommended for calling Cloud APIs programmatically. They handle low level API call details, leaving the necessary user code simpler and shorter.

The producer library further simplifies integration by providing ready to use code needed to call the API from Java. It adds additional functionality such as synchronous and asynchronous clients, translating OpenLineage JSON messages to the API friendly format, error handling etc.

Using the producer library, all the code needed to send a request to GCP Lineage API is:

SyncLineageProducerClient client = SyncLineageProducerClient.create();
ProcessOpenLineageRunEventRequest request =
        ProcessOpenLineageRunEventRequest.newBuilder()
            .setParent(parent)
            .setOpenLineage(openLineageMessage)
            .build();
client.processOpenLineageRunEvent(request);

The field openLineageMessage here is a protobuf Struct that includes information about job execution, inputs and outputs and other metadata. The object model is described in the documentation. An example message is:

{
  "eventType": "START",
  "eventTime": "2023-04-04T13:21:16.098Z",
  "run": {
    "runId": "502483d6-3e3d-474f-9380-da565eaa7516",
    "facets": {
       "spark_properties": {
        "_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.22.0/integration/spark",
        "_schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunFacet",
        "properties": {
          "spark.master": "yarn",
          "spark.app.name": "sparkJobTest.py"
        }
      }
    }
  },
  "job": {
    "namespace": "project-name",
    "name": "cluster-name",
    "facets": {
    "jobType": {
        "_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.22.0/integration/spark",
        "_schemaURL": "https://openlineage.io/spec/facets/2-0-3/JobTypeJobFacet.json#/$defs/JobTypeJobFacet",
        "processingType": "BATCH",
        "integration": "SPARK",
        "jobType": "SQL_JOB"
      },

    }
  },
  "inputs": [
    {
      "namespace": "bigquery",
      "name": "project.dataset.input_table",
    }],
  "outputs": [
   {
      "namespace": "bigquery",
      "name": "project.dataset.output_table",
    }],
  "producer": "https://github.com/OpenLineage/OpenLineage/tree/0.18.0/integration/spark",
  "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/$defs/RunEvent"
}

Learn more about building an OpenLineage message.

Best Practices for Constructing OpenLineage Messages

The openLineageMessage should follow the OpenLineage format. The fields that are required for correct parsing by the GCP Lineage API are:

job	mapped to Process
job.namespace	used to construct Process name
job.name	used to construct Process name
run	mapped to Run
run.runId	used to construct Run name
producer	URI identifying the producer of this metadata
eventTime	time of the data movement
schemaURL	URL pointing to the schema definition for this message

In addition to those, the fields used to create lineage are:

eventType	corresponds to the status of the Run
inputs	mapped to sources of links. Must be specified according to the naming conventions
outputs	mapped to targets of links. Must be specified according to the naming conventions

The GCP Lineage API supports OpenLineage major versions 1 and 2. For more information please refer to the documentation.

How to access GCP Lineage?

The code is now publicly available on GitHub. The library is also published to Maven.

GcpLineageTransport

To simplify integration with GCP Lineage, we offer GcpLineageTransport. It is available on the OpenLineage GitHub repository and is built to a separate maven artifact. It is built on top of the producer library mentioned above.

Using the transport minimises the code for sending events to GCP Lineage. The GcpLineageTransport can be configured as the event sink for any existing OpenLineage producer such as Airflow, Spark, and Flink. Find more information and examples on GCP Lineage.

By Mary Idamkina – Data Lineage

Source: Google Open Source Blog

Fluent Bit WriteAPI Connector: Lowering the barrier to streaming data

Automating ingestion processes is crucial for modern businesses that handle vast amounts of data daily. In today's fast-paced digital landscape, the ability to seamlessly collect, process, and analyze data can make the difference between staying ahead of the competition and falling behind. To simplify ingestion, tools such as Fluent Bit enable customers to route data between pluggable sources and sinks without needing to write a single line of code. Instead, data routing is managed via a config file. The Fluent Bit WriteAPI Connector is a pluggable sink built on top of the BigQuery Storage Write API that enables organizations to rapidly develop a data ingestion pipeline.

What are the BigQuery Storage Write API and Fluent Bit?

The BigQuery Storage Write API is a high-performance data-ingestion API for BigQuery. It leverages both batching and streaming methods to ingest records into BigQuery in real-time. The WriteAPI offers features such as ability to scale and provides exactly-once delivery to guarantee that data is not duplicated. Using the Write API directly typically requires technical expertise, as users must navigate one of the client SDKs. This can create a high barrier to entry for some customers to stream data into BigQuery.

Fluent Bit is a widely-used open-source observability agent known for its lightweight design, speed, and flexibility. It operates by collecting logs, traces and metrics through various inputs such as local or network files, filtering and buffering them, and then routing them to designated outputs. Fluent Bit's high-performance parsing capabilities allow for data to be processed according to user specifications. The output component is a configurable plugin that directs data to different destinations, such as various tables in BigQuery. There can be multiple WriteAPI outputs and each output can be independently configured to use a specific write mode, enabling seamless data streaming into BigQuery based on tag/match pairs.

Why Use the Fluent Bit WriteAPI Connector?

Our solution to the technical challenges posed by using the WriteAPI is the Fluent Bit WriteAPI Connector. This connector automates the data ingestion process, eliminating the need for customers to write any code. The entire pipeline is managed through a single configuration file, making it easy to use. The flow of data is depicted in the diagram below.

Example Use Case

Say we wish to monitor a log file containing JSON data, and we would like to ingest this data into a BigQuery table that has a single column titled “Text” of type String. A line from the log file looks like this:

{"Text": "Hello, World"}

Setup Process

Setting Up Fluent Bit:

install

Cloning the Google Git Repository:

Google Git Repository

/usr/local/fluentbit-bigquery-writeapi-sink

plugins.conf

writeapi

[PLUGINS]

Path /usr/local/fluentbit-bigquery-writeapi-sink/out_writeapi.so

Setting Up BigQuery Tables:

BigQuery tables

Text

STRING

myProject.myDataset.myTable.

click to enlarge

Prepare the input file:

/usr/local/logfile.log

touch /usr/local/logfile.log

Configuring the Plugin:

/usr/local

demo.conf

format

This routes the data from /usr/local/logfile.log to the BigQuery table at myProject.myDataset.myTable. There are additional configurable fields that control the stream, such as chunking, asynchronous response queue, and also the type of stream. These fields let you control how your data is streamed.

To run the pipeline, use the command:

fluent-bit -c /usr/local/demo.conf

As the log file is updated new lines will automatically appear in the BigQuery table. For example, to populate the log file you can run the following command:

echo "{\"Text\": \"Hello, world\"}" >> /usr/local/logfile.log

Note that the default flush interval in Fluent Bit is 1 minute, so it might take a minute before the log file is flushed. The BigQuery table will now be updated as follows:

click to enlarge

Key Features

The connector supports a wide variety of features including multi-instancing, dynamic scaling, exactly-once delivery, and automatic retry.

1. Multi-Instancing

The multi-instancing feature of the Fluent Bit WriteAPI Connector is designed to offer flexibility in routing data. Specifically, users can configure the connector to handle multiple data inputs and outputs in various combinations. This feature also supports more complex configurations, such as multiple inputs feeding into multiple outputs, allowing data to be aggregated or distributed as needed. An input connector is labeled with a tag field. In our example, this has value log1. Data is routed to an output connector based on the value of its match field. In our example, this also has value log1, meaning there is a 1-to-1 correspondence between the input and output connector. The match field is a regex so it can be used to connect with multiple inputs. For example, if this was set to * then data from all inputs would flow to this output.

2. Dynamic Scaling

Handling large volumes of data efficiently is crucial for modern pipelines. The dynamic scaling feature addresses the issue of potential overloads in the Write API. As data is streamed into BigQuery, there may be times when the API queue becomes full—by default, it can hold up to 1000 pending responses. When this limit is reached, no new data can be appended until some of the pending responses are processed, which can create back pressure in the system. To manage this, the connector automatically scales up its capacity by creating an additional network connection when it detects that the number of pending responses has reached the threshold.

3. Exactly-Once

The "exactly-once" feature ensures that each piece of data is sent and recorded in BigQuery exactly once. This feature ensures no data is duplicated. If the connector encounters an intermittent issue while sending a specific piece of data, it will synchronously retry sending it until it is successful. This ensures data is delivered correctly.

4. Retry Functionality

The retry functionality allows the connector to handle temporary failures gracefully. The retry mechanism is configurable, meaning users can set how many times the system should attempt to resend the data before giving up. By default, the connector will retry sending failed data up to four times. In the default stream mode, if a row of data fails to send, it is retried while other rows continue to be processed. However, in the "exactly once" mode, the retry process is synchronous, meaning the system will wait for the failed row to be successfully sent before moving on to subsequent rows.

5. Error Handling

Error handling in the connector is designed to catch and manage issues that may arise during data transmission. The connector will continue processing incoming data even if earlier data had a failure. Any permanent issues that are encountered are logged to the console.

Conclusion

The ability to efficiently collect, process, and analyze data is a critical factor for business success. The Fluent Bit WriteAPI Connector stands out as a powerful solution that simplifies and automates the data ingestion process, bridging the gap between Fluent Bit's versatile data collection capabilities and Google BigQuery's robust analytics platform.

By eliminating the need for complex coding and manual data management, the Fluent Bit WriteAPI Connector lowers the barrier to entry for businesses of all sizes. Whether you're a small startup or a large enterprise, this tool allows you to effortlessly set up and manage your data pipelines with a single configuration file. Its features like multi-instancing, dynamic scaling, exactly-once delivery, and error handling ensure that your data is ingested accurately, reliably, and in real-time.

The straightforward setup process, combined with the flexibility and scalability of the connector, make it a valuable asset for any organization looking to harness the power of their data. By automating the ingestion process, businesses can focus on what truly matters: deriving actionable insights from their data to drive growth and innovation.

Source: Google Open Source Blog

Workspace audit log exports in BigQuery now enriched with Drive label metadata

What’s changing

Google Workspace audit logs enable admins to have visibility into activity on their data, such as file shares and downloads, when it occurred, and who within the organization performed the action. The Google Drive audit events include activity on content your users create in Google Docs, Sheets, Slides, as well as the files that your users upload to Drive, such as PDFs and Microsoft Word files.

Today, we’re excited to announce that for admins who analyze these logs in BigQuery, these events are now enriched with Drive labels metadata. Admins leverage Drive labels to apply descriptive metadata, such as file sensitivity, to Drive items. With the enrichment of label metadata on log events, admins can now focus their analysis on activity occurring on their most important files by filtering on label conditions.

Getting started

Admins: Visit the Help Center to learn how to set up service log exports to BigQuery.
End users: There is no end user setting for this feature.

Rollout pace

This feature is available now.

Availability

Available to Enterprise Essentials Plus, Enterprise Standard and Enterprise Plus, Education Standard and Education Plus

Resources

Source: Google Workspace Updates

Google Workspace Updates Weekly Recap – February 2, 2024

3 New updates

Unless otherwise indicated, the features below are available to all Google Workspace customers, and are fully launched or in the process of rolling out. Rollouts should take no more than 15 business days to complete if launching to both Rapid and Scheduled Release at the same time. If not, each stage of rollout should take no more than 15 business days to complete.

New ways to work with filters in Connected Sheets for Looker and Connected Sheets for BigQuery

With Looker: Connected Sheets for Looker users will now be able to see whether Looker-defined filters, always_filters and conditionally_filters, are applied in their Google Sheets. | Rolling out to Rapid Release domains and Scheduled Release domains now. | Available to all Google Workspace customers, Google Workspace Individual subscribers, and users with personal Google accounts. | Learn more about always_filter and conditionally_filter in Looker.
With BigQuery: Connected Sheets for BigQuery users can now right-click any value in the Connected Sheets preview and filter rows in the preview on that specific value. As a result, a new query will be sent to BigQuery to re-fetch the preview. This new option provides an additional way to easily explore data in the Connected Sheets preview for BigQuery. | This is now available to all Google Workspace customers, Google Workspace Individual subscribers, and users with personal Google accounts. | Learn more about sorting & filtering BigQuery data in Google Sheets.

Latest improvement for page setup in Google Docs

We’re introducing a new menu item within the Format option that enables you to easily switch between pageless and pages mode in Google Docs. | Rollout to Rapid Release domains is complete; launch to Scheduled Release domains planned for February 5, 2024. | Available to all Google Workspace customers, Google Workspace Individual subscribers, and users with personal Google accounts. | Learn more about changing a document’s page setup.

Developers can now choose which calendar events to be notified about

Developers using the Calendar API can now subscribe to specific changes in users’ calendars — this includes event types such as working location, out-of-office, or focus time. This update reduces the amount of notifications down to those that are most relevant to the developer. | Use our Developer Documentation for more information about subscribing to alerts for specific calendar event changes. | Rolling out now to Rapid Release domains and Scheduled Release domains. | The Calendar API is available to all Google Workspace customers, however Calendar features may vary across Google Workspace editions.

Previous announcements

The announcements below were published on the Workspace Updates blog earlier this week. Please refer to the original blog posts for complete details.

For Google Meet Hardware, ‘Auto-Update Expiration (AUE)’ will now be referred to as 'end of Meet support'

We’re making changes to the terminology that refers to when a Meet hardware device is no longer supported. | This update impacts all Google Workspace customers with Meet hardware devices. | Learn more about 'end of Meet support'.

Local data storage exports your organization's Workspace data into the geographic location of your choice, launching in beta

We’re introducing Google Workspace’s new feature, local data storage. This feature allows admins to export their organization’s Workspace data into the geographic location or locations of their choice. | Available to Google Workspace Enterprise Plus customers with Assured Controls add-on only. | Learn more about the local data storage exports beta.

Improving the Google Calendar appointment scheduling experience with new features

We’re improving upon the current appointment scheduling experience with four new features. | Co-hosts, secondary calendars and delegate access features are available to Google Workspace Business Standard, Business Plus, Enterprise Standard, Enterprise Plus, Education Fundamentals, Education Standard, Education Plus, the Teaching and Learning Upgrade, and Nonprofits only. Check Calendars for availability feature is available to Google Workspace Business Standard, Business Plus, Enterprise Standard, Enterprise Plus, Education Fundamentals, Education Standard, Education Plus, the Teaching and Learning Upgrade, Nonprofits, Google Workspace Individual subscribers, and users with personal Google accounts only. | Learn more about the appointment scheduling features.

Shareable class templates and classwork in Google Classroom are now generally available

Last year, we announced a Google Classroom beta that simplifies lesson planning with shareable class templates and classwork. We’re excited to announce that this is now globally available for Google Workspace for Education Plus customers in English, Spanish, Portuguese and Japanese. | Available to Education Plus only. | Learn more about shareable class materials.

Pin chat messages in Google Meet

You can now pin messages to the top of the meeting chat. Pinned messages will appear for all users, even if they join the meeting after the message was initially sent. | Learn more about messaging pinning.

Completed rollouts

The features below completed their rollouts to Rapid Release domains, Scheduled Release domains, or both. Please refer to the original blog posts for additional details.

Rapid Release Domains:

Improving the Google Calendar appointment scheduling experience with new features

Scheduled Release Domains:

Adding timezone support for existing smart chips in Google Docs
Combining multiple video effects (Web & Mobile availability)
Studio lighting (Web availability)

Rapid and Scheduled Release Domains:

For a recap of announcements in the past six months, check out What’s new in Google Workspace (recent releases).

Source: Google Workspace Updates

Google Workspace Updates Weekly Recap – January 12, 2024

4 New updates

Unless otherwise indicated, the features below are available to all Google Workspace customers, and are fully launched or in the process of rolling out. Rollouts should take no more than 15 business days to complete if launching to both Rapid and Scheduled Release at the same time. If not, each stage of rollout should take no more than 15 business days to complete.

Version history limits for Apps Script projects

For all new scripts, you’ll be able to create and save up to 200 versions of your script. If needed, you can permanently delete a script version from the project history page. | This is available now to all Google Workspace customers. | Learn more using our developer documentation on working with Apps Script versions.

Share a link to a specific time in a Google Drive video

We’re adding new functionality to the Drive sharing button that lets you share timestamped links to specific parts of a video. On web, simply navigate to drive.google.com > find and open a video file > play the video (you can pause the playback before performing the following steps) > select the dropdown on the “Share” button in the top-right corner > select “Copy link to this time” > send the link. | Rolling out now to Rapid Release and Scheduled Release domains. | Available to all Google Workspace customers and users with personal Google Accounts. | Learn more about copying a specific time in the video.

Introducing dropdown options on the sharing button in Google Docs, Sheets, Slides and Drawings We’re adding a new feature that ensures a seamless sharing experience across Workspace. In Google Docs, Sheets, Slides and Drawings, you will now see a dropdown on the Share button that surfaces quick actions, such as pending access requests and the “Copy link” option. | Rolling out now to Rapid Release and Scheduled Release domains. | Available to all Google Workspace customers and users with personal Google Accounts.

Using functions in Connected Sheets for BigQuery

Today, Connected Sheets for BigQuery supports 23 Sheets functions, such as AVERAGE and XLOOKUP. However, all of these functions behave somewhat differently than their native counterparts. Thus, to help Connected Sheets users write better functions, we now display context-aware Help Center content in Sheets. The ‘formula help’ shows descriptions for Connected Sheets functions when writing a formula that would query BigQuery, and otherwise shows descriptions of native Sheets functions. | This is available now to all Google Workspace customers and users with personal Google Accounts. | Learn more about the XLOOKUP function.

Previous announcements

The announcements below were published on the Workspace Updates blog earlier this week. Please refer to the original blog posts for complete details.

Updates to metrics in Google Drive Apps Reports and Reports API

We’re making some updates to the Google Drive metrics in the Admin Console Apps reports and the Reports API. As a result of these improvements, admins who analyze metrics will have more reporting clarity and can better understand activity trends within their domain. | Learn more about metrics in Drive Apps Reports and Reports API.

Easily share Google Drive files to Google Calendar meeting attendees

We’re introducing the option to share any file with all meeting participants on a Google Calendar invite via the sharing dialog within a file. | Learn more about sharing Drive files to Calendar.

Google Meet is now available on Logitech Android appliances

Google Meet is now supported on Logitech’s Rally Bar and Rally Bar Mini Android-based appliances for collaboration rooms and spaces of just about any size. After initial setup, admins can easily enroll, manage, and monitor these devices using the Google admin console. Google Meet on Logitech Android-based devices is supported on CollabOS v1.11 as a video conferencing provider | Learn more about Meet on Logitech Android appliances.

Google Meet hardware devices from Poly now support interoperability with Cisco Webex and Zoom

We’re expanding the existing interoperability between Google Meet, Cisco Webex, and Zoom to include Android-based Meet hardware devices from Poly. Specifically, these devices include: Poly Studio X30, X50, X52, and X70. | Learn more about support interoperability with Cisco Webex and Zoom.

Extending Trusted Types to Gmail

We’re excited to announce the expansion of Trusted Types to Gmail. This will provide a defense against DOM XSS and further enhances our advanced data protection controls to keep users and data safe across more of the apps they use everyday. | Learn more about Trusted Types.

Completed rollouts

The features below completed their rollouts to Rapid Release domains, Scheduled Release domains, or both. Please refer to the original blog posts for additional details.

Rapid Release Domains:

Scheduled Release Domains:

Rapid and Scheduled Release Domains:

For a recap of announcements in the past six months, check out What’s new in Google Workspace (recent releases).

Source: Google Workspace Updates

Improving consistency of Gmail data across the Security Investigation Tool and BigQuery

What’s changing

In August 2022, we announced the movement of the existing Gmail logs in BigQuery to Google Workspace logs and reports in BigQuery, creating a single space to access all of your Google Workspace audit events.

To further improve this experience, you’ll see all Gmail log events from the Security Investigation Tool (SIT) in your Google Workspace logs and reports in BigQuery. This includes device types and post-delivery actions (for example open, delete, clicked links, and attachment downloads). This change will give admins a more complete picture of Gmail activity happening in their domain, and bring consistency across reporting tools.

Use the table below to familiarize yourself with the newly added columns — complete details can be found in our Help Center.

List of newly added columns	Description
event_info.mail_event_type	The event type corresponds to the Event attribute in Gmail log events in Security Investigation Tool.
event_info.client_context.client_type	The type of client or device where the action occurred, including: WEB, IOS, ANDROID, IMAP, POP3, and API.
event_info.client_context.session_context.delegate_user_email	Email address of the delegated user who performed the action on the account owner's behalf.
message_info.attachment.file_name	File attachment name.
message_info.post_delivery_info	Information about the post-delivery event.
message_info.post_delivery_info.action_type	Post-delivery action type.
message_info.post_delivery_info.interaction	Information about the user's interaction with message links, Drive items, or attachments.
message_info.post_delivery_info.interaction.link_url	The URL associated with the interaction, which is set only for link click interactions.
message_info.post_delivery_info.interaction.drive_id	The unique ID of the Google Drive item associated with the interaction. This ID is used to access the item in Drive.
message_info.post_delivery_info.interaction.attachment	The target attachments of the interaction, which are set only for attachment interactions.
message_info.post_delivery_info.interaction.attachment.file_extension_type	File extension (not MIME part type), not including the period.
message_info.post_delivery_info.interaction.attachment.malware_family	Malware type, if malware is detected during message handling.
message_info.post_delivery_info.interaction.attachment.file_name	Attachment file name.
message_info.post_delivery_info.interaction.attachment.sha256	SHA256 hash of the attachment.

Getting started

Admins:

Visit the Help Center to learn more about Google Workspace logs and reports in BigQuery, schemas for Gmail logs in BigQuery and Gmail log events.
This update adheres to the standard BigQuery pricing — see here for more information.

End users: There is no end user impact.

Rollout pace

Rapid Release and Scheduled Release domains: Available now

Availability

Available to Google Workspace Enterprise Essentials, Enterprise Standard, Enterprise Plus, Education Plus, Education Standard
Not available to Google Workspace Essentials, Business Starter, Business Standard, Business Plus, Frontline, and Nonprofits, as well as legacy G Suite Basic and Business customers

Resources

Google Workspace Admin Help: Google Workspace logs and reports in BigQuery

Source: Google Workspace Updates

Unified experience for Gmail logs in BigQuery, configure your existing Gmail logs to route to Workspace logs

What’s changing

In the coming months, we will move the location of the existing Gmail logs in BigQuery to Google Workspace logs and reports in BigQuery. This change will only impact existing Google Workspace customers, who have BigQuery Export enabled. New customers will be guided through setting up a Workspace BigQuery project for Gmail logs.

While this change is happening, for a brief period of time, Gmail data will be exported to both the Gmail logs and the Workspace logs — existing Google Workspace customers who are impacted will receive an email notification with more information in the coming weeks.

Gmail logs in BigQuery

Migrating to Unified Workspace logs in BigQuery

Who’s impacted

Admins

Why it’s important

This change will create a single space to access all of your Google Workspace audit events and eliminate the need to switch between various sources of truth.

Once this merger is completed, you will not be able to use Gmail logs in BigQuery in the Admin console. Visit the Help Center for more information, including detailed step-by-step instructions on handling this change.

Additional details

Temporary dual export period and BigQuery storage requirements

Once Gmail logs are set to be exported to Workspace Logs, this data will temporarily exported to two destinations:

Your previous, Gmail-only BigQuery project
Your new or existing Workspace BigQuery project, which stores data for all Google Workspace applications

This will also impact your BigQuery quotas limits and how often they have to refresh, as well as the cost associated with exporting logs to BigQuery. To avoid impact on your quotas and duplicative data, we strongly recommend turning off Gmail log exports to your Gmail-only BigQuery projects.

Getting started

Admins: Visit the Help Center to learn more about Google Workspace logs and reports in BigQuery and moving your views, queries, and scripts from Gmail logs in BigQuery to Workspace logs in BigQuery.
End users: There is no end user impact.

Rollout pace

Rapid and Scheduled Release domains: Extended rollout (potentially longer than 15 days for feature visibility) starting on August 10, 2022

Availability

Available to Google Workspace Enterprise Essentials, Enterprise Standard, Enterprise Plus, Education Plus, Education Standard
Not available to Google Workspace Essentials, Business Starter, Business Standard, Business Plus, Frontline, and Nonprofits, as well as legacy G Suite Basic and Business customers

Resources

Source: Google Workspace Updates

Logica: organizing your data queries, making them universally reusable and fun

We present Logica, a novel open source Logic Programming language. A successor to Yedalog (a language developed at Google earlier) it is a Datalog-like logic programming language. Logica code compiles to SQL and runs on Google BigQuery (with experimental support for PostgreSQL and SQLite), but it is much more concise and supports the clean and reusable abstraction mechanisms that SQL lacks. It supports modules and imports, it can be used from an interactive Python notebook and it even makes testing your queries natural and easy.

“Data is the new oil”, they say, and SQL is so far the lingua franca for working with data. When SQL (or “Structured English Query Language”, as it was first named) was invented in the 1970s, its authors might not have imagined the popularity that it would reach half a century later. Today, systems ranging from tiny smart watch applications to enterprise IT solutions, read and write their data using SQL. Even the browser that you are using to read this post now might have a working built-in SQL database in it.

Despite the widespread adoption, SQL is not flawless. Constructing statements from long chains of English words (which are often capitalized to keep the old-fashioned COBOL spirit of the 70s alive!) can be very verbose—a single query spanning hundreds of lines is a routine occurrence. The main flaw of SQL, however, lies in its very limited support for abstraction.

Good programming is about creating small, understandable, reusable pieces of logic that can be tested, given names, and organized into packages which can later be used to construct more useful pieces of logic. SQL resists this workflow. Although you can encapsulate certain repeated computations into views and functions, the syntax and support for these can vary among implementations, the notions of packages and imports are generally nonexistent, and higher-level constructions (e.g. passing a function to a function) are impossible.

This inherent resistance to decomposition of logic into bite-sized pieces is what leads into the contrived, lengthy queries, the copy-pasted chunks of code and, eventually, unmaintainable, unstructured (note the irony) SQL codebases. To make things worse, SQL code is rarely tested, because “testing SQL queries” sounds rather esoteric to most engineers, at best. Because of that, a number of alternative query languages and libraries have been developed. Of those, systems based on logic programming perhaps come the closest to addressing SQL’s limitations.

Logic programming languages solve problems of SQL by using syntax of mathematical propositional logic rather than natural English language. The language of formal logic was designed by mathematicians specifically to make expression of complex statements easier and suits this purpose much better than natural language. Logica extends classical Logic programming syntax further, most notably with aggregation, hence the name, which stands for

Logica = Logic + Aggregation.

Let us see how it all works. SQL operates with relations, which are sets of rows. In logic programming the analog of a relation is a predicate. While a predicate is a set of rows, we think of it as a logical condition, which describes the rows of a relation. Here is, for example, the definition of a simple predicate:

MagicNumber(x: 2);

MagicNumber(x: 3);

MagicNumber(x: 5);

The definition claims that the condition MagicNumber(x) must hold when X is precisely either 2, 3, or 5. That means, if we were to query this predicate (i.e. request all values of X that satisfy it), the output should be a “relation” with a single column X and rows 2, 3, and 5. The SQL equivalent would be:

SELECT 2 AS x

UNION ALL

SELECT 3 AS x

UNION ALL

SELECT 5 AS x;

Rather than listing the individual values, we could have defined the predicate by encoding a logical condition upon X as follows:

MagicNumber(x:) :-

x in [2, 3, 5];

Now, here is where the magic starts. Firstly, any table in your database is itself already a predicate, so the following definition:

MagicComment(comment_text:) :-

`comments`(user_id:, comment_text:),

user_id == 5;

Defines a predicate MagicComment, which includes precisely those comment_text values, which are present in the comments table where user_id == 5. In SQL this would read:

SELECT comment_text FROM comments WHERE user_id = 5;

Observe what happens if we replace the condition “user_id == 5” in our predicate with MagicNumber(x: user_id):

MagicComment(comment_text:) :-

`comments`(user_id:, comment_text:),

MagicNumber(x: user_id);

Here, we are querying for comments of users whose ID is one of the “magic numbers” we just defined above. Note how easily we could reuse a previously defined piece of code without having to copy anything around. We could now even extract the MagicNumber to a common module and import it in wherever it is needed:

import my_project.magic.MagicNumber;

As a final example, let us mock the comments table, in a unittest of a query.

import my_project.magic.MagicComment;

MockComments(user_id: 1, comment_text: "Hello");

MockComments(user_id: 2, comment_text: "Logic");

MockComments(user_id: 3, comment_text: "Programming");

MagicCommentTest := MagicComment(`comments`: MockComments);

If we query the MagicComment predicate here, it will not try to read the comments table in the database. Instead, it will use the predicate we just defined, thus letting us verify its correctness by testing the output (it must include two rows “Logic” and “Programming”). Observe how natural and frictionless many of the good programming practices become with Logica, and compare that to what you would have to do to achieve the same using bare SQL.

There is much more to Logica, so make sure you give it a try—chances are, you will love it! Start with this tutorial to learn Logica. Even if you do not end up using it in your next project, learning a new powerful language may open your mind to new ideas and perspectives on data processing and computing in general.

The simple examples above are only a small sample of how concise Logica code can be over SQL for complex queries. In particular, we did not even touch the topic of aggregations in this article. For all of this see examples section of the Logica open source repository.

We also hope that some of the readers consider contributing to Logica development. That’s what open source is all about!

By Konstantin Tretyakov and Evgeny Skvortsov – Logica Open Source Project

Source: Google Open Source Blog

Code that final mile: from big data analysis to slide presentation

Posted by Wesley Chun (@wescpy), Developer Advocate, Google Cloud

Google Cloud Platform (GCP) provides infrastructure, serverless products, and APIs that help you build, innovate, and scale. G Suite provides a collection of productivity tools, developer APIs, extensibility frameworks and low-code platforms that let you integrate with G Suite applications, data, and users. While each solution is compelling on its own, users can get more power and flexibility by leveraging both together.

In the latest episode of the G Suite Dev Show, I'll show you one example of how you can take advantage of powerful GCP tools right from G Suite applications. BigQuery, for example, can help you surface valuable insight from massive amounts of data. However, regardless of "the tech" you use, you still have to justify and present your findings to management, right? You've already completed the big data analysis part, so why not go that final mile and tap into G Suite for its strengths? In the sample app covered in the video, we show you how to go from big data analysis all the way to an "exec-ready" presentation.

The sample application is meant to give you an idea of what's possible. While the video walks through the code a bit more, let's give all of you a high-level overview here. Google Apps Script is a G Suite serverless development platform that provides straightforward access to G Suite APIs as well as some GCP tools such as BigQuery. The first part of our app, the runQuery() function, issues a query to BigQuery from Apps Script then connects to Google Sheets to store the results into a new Sheet (note we left out CONSTANT variable definitions for brevity):

function runQuery() {
  // make BigQuery request
  var request = {query: BQ_QUERY};
  var queryResults = BigQuery.Jobs.query(request, PROJECT_ID);
  var jobId = queryResults.jobReference.jobId;
  queryResults = BigQuery.Jobs.getQueryResults(PROJECT_ID, jobId);
  var rows = queryResults.rows;

  // put results into a 2D array
  var data = new Array(rows.length);
  for (var i = 0; i < rows.length; i++) {
    var cols = rows[i].f;
    data[i] = new Array(cols.length);
    for (var j = 0; j < cols.length; j++) {
      data[i][j] = cols[j].v;
    }
  }

  // put array data into new Sheet
  var spreadsheet = SpreadsheetApp.create(QUERY_NAME);
  var sheet = spreadsheet.getActiveSheet();
  var headers = queryResults.schema.fields;
  sheet.appendRow(headers); // header row
  sheet.getRange(START_ROW, START_COL,
      rows.length, headers.length).setValues(data);

  // return Sheet object for later use
  return spreadsheet;
}

It returns a handle to the new Google Sheet which we can then pass on to the next component: using Google Sheets to generate a Chart from the BigQuery data. Again leaving out the CONSTANTs, we have the 2nd part of our app, the createColumnChart() function:

function createColumnChart(spreadsheet) {
  // create & put chart on 1st Sheet
  var sheet = spreadsheet.getSheets()[0];
  var chart = sheet.newChart()
     .setChartType(Charts.ChartType.COLUMN)
     .addRange(sheet.getRange(START_CELL + ':' + END_CELL))
     .setPosition(START_ROW, START_COL, OFFSET, OFFSET)
     .build();
  sheet.insertChart(chart);

  // return Chart object for later use
  return chart;
}

The chart is returned by createColumnChart() so we can use that plus the Sheets object to build the desired slide presentation from Apps Script with Google Slides in the 3rd part of our app, the createSlidePresentation() function:

function createSlidePresentation(spreadsheet, chart) {
  // create new deck & add title+subtitle
  var deck = SlidesApp.create(QUERY_NAME);
  var [title, subtitle] = deck.getSlides()[0].getPageElements();
  title.asShape().getText().setText(QUERY_NAME);
  subtitle.asShape().getText().setText('via GCP and G Suite APIs:\n' +
    'Google Apps Script, BigQuery, Sheets, Slides');

  // add new slide and insert empty table
  var tableSlide = deck.appendSlide(SlidesApp.PredefinedLayout.BLANK);
  var sheetValues = spreadsheet.getSheets()[0].getRange(
      START_CELL + ':' + END_CELL).getValues();
  var table = tableSlide.insertTable(sheetValues.length, sheetValues[0].length);

  // populate table with data in Sheets
  for (var i = 0; i < sheetValues.length; i++) {
    for (var j = 0; j < sheetValues[0].length; j++) {
      table.getCell(i, j).getText().setText(String(sheetValues[i][j]));
    }
  }

  // add new slide and add Sheets chart to it
  var chartSlide = deck.appendSlide(SlidesApp.PredefinedLayout.BLANK);
  chartSlide.insertSheetsChart(chart);

  // return Presentation object for later use
  return deck;
}

Finally, we need a driver application that calls all three one after another, the createColumnChart() function:

function createBigQueryPresentation() {
  var spreadsheet = runQuery();
  var chart = createColumnChart(spreadsheet);
  var deck = createSlidePresentation(spreadsheet, chart);
}

We left out some detail in the code above but hope this pseudocode helps kickstart your own project. Seeking a guided tutorial to building this app one step-at-a-time? Do our codelab at g.co/codelabs/bigquery-sheets-slides. Alternatively, go see all the code by hitting our GitHub repo at github.com/googlecodelabs/bigquery-sheets-slides. After executing the app successfully, you'll see the fruits of your big data analysis captured in a presentable way in a Google Slides deck:

This isn't the end of the story as this is just one example of how you can leverage both platforms from Google Cloud. In fact, this was one of two sample apps featured in our Cloud NEXT '18 session this summer exploring interoperability between GCP & G Suite which you can watch here:

Stay tuned as more examples are coming. We hope these videos plus the codelab inspire you to build on your own ideas.

6 New updates

Previous announcements

Completed rollouts

The features below completed their rollouts to Rapid Release domains, Scheduled Release domains, or both. Please refer to the original blog posts for additional details.

Source: Google Workspace Updates

Integrating OpenLineage producers with GCP Lineage just got a lot easier

What is Data Lineage

GCP Lineage information model

OpenLineage support

Why producer library

Best Practices for Constructing OpenLineage Messages

How to access GCP Lineage?

GcpLineageTransport

Source: Google Open Source Blog

What are the BigQuery Storage Write API and Fluent Bit?

Why Use the Fluent Bit WriteAPI Connector?

Example Use Case

Setup Process

Key Features

1. Multi-Instancing

2. Dynamic Scaling

3. Exactly-Once

4. Retry Functionality

5. Error Handling

Conclusion

Source: Google Open Source Blog

What’s changing

Getting started

Rollout pace

Availability

Resources

Source: Google Workspace Updates

3 New updates

Previous announcements

Completed rollouts

The features below completed their rollouts to Rapid Release domains, Scheduled Release domains, or both. Please refer to the original blog posts for additional details.

Source: Google Workspace Updates

4 New updates

Previous announcements

Completed rollouts

The features below completed their rollouts to Rapid Release domains, Scheduled Release domains, or both. Please refer to the original blog posts for additional details.

Source: Google Workspace Updates

What’s changing

Getting started

Rollout pace

Availability

Resources

Source: Google Workspace Updates

What’s changing

Who’s impacted

Why it’s important

Additional details

Getting started

Rollout pace

Availability

Resources

Source: Google Workspace Updates

Source: Google Open Source Blog

Source: Google Developers Blog