

Machine learning (ML) practitioners looking to reuse existing datasets to train an ML model often spend a lot of time understanding the data, making sense of its organization, or figuring out what subset to use as features. So much time, in fact, that progress in the field of ML is hampered by a fundamental obstacle: the wide variety of data representations.
ML datasets cover a broad range of content types, from text and structured data to images, audio, and video. Even within datasets that cover the same types of content, every dataset has a unique ad hoc arrangement of files and data formats. This challenge reduces productivity throughout the entire ML development process, from finding the data to training the model. It also impedes development of badly needed tooling for working with datasets.
There are general-purpose metadata formats for datasets, such as schema.org and DCAT. However, these formats were designed for data discovery rather than for the specific needs of ML data, such as the ability to extract and combine data from structured and unstructured sources, to include metadata that enables responsible use of the data, or to describe ML usage characteristics such as defining training, test, and validation sets.
Today, we're introducing Croissant, a new metadata format for ML-ready datasets. Croissant was developed collaboratively by a community from industry and academia, as part of the MLCommons effort. The Croissant format doesn't change how the actual data is represented (e.g., image or text file formats) — it provides a standard way to describe and organize it. Croissant builds upon schema.org, the de facto standard for publishing structured data on the Web, which is already used by over 40M datasets. Croissant augments it with comprehensive layers for ML relevant metadata, data resources, data organization, and default ML semantics.
In addition, we are announcing support from major tools and repositories: Today, three widely used collections of ML datasets — Kaggle, Hugging Face, and OpenML — will begin supporting the Croissant format for the datasets they host; the Dataset Search tool lets users search for Croissant datasets across the Web; and popular ML frameworks, including TensorFlow, PyTorch, and JAX, can load Croissant datasets easily using the TensorFlow Datasets (TFDS) package.
This 1.0 release of Croissant includes a complete specification of the format, a set of example datasets, an open source Python library to validate, consume and generate Croissant metadata, and an open source visual editor to load, inspect and create Croissant dataset descriptions in an intuitive way.
Supporting Responsible AI (RAI) was a key goal of the Croissant effort from the start. We are also releasing the first version of the Croissant RAI vocabulary extension, which augments Croissant with key properties needed to describe important RAI use cases such as data life cycle management, data labeling, participatory data, ML safety and fairness evaluation, explainability, and compliance.
The majority of ML work is actually data work. The training data is the “code” that determines the behavior of a model. Datasets can vary from a collection of text used to train a large language model (LLM) to a collection of driving scenarios (annotated videos) used to train a car’s collision avoidance system. However, the steps to develop an ML model typically follow the same iterative data-centric process: (1) find or collect data, (2) clean and refine the data, (3) train the model on the data, (4) test the model on more data, (5) discover the model does not work, (6) analyze the data to find out why, (7) repeat until a workable model is achieved. Many steps are made harder by the lack of a common format. This “data development burden” is especially heavy for resource-limited research and early-stage entrepreneurial efforts.
The goal of a format like Croissant is to make this entire process easier. For instance, the metadata can be leveraged by search engines and dataset repositories to make it easier to find the right dataset. The data resources and organization information make it easier to develop tools for cleaning, refining, and analyzing data. This information and the default ML semantics make it possible for ML frameworks to use the data to train and test models with a minimum of code. Together, these improvements substantially reduce the data development burden.
Additionally, dataset authors care about the discoverability and ease of use of their datasets. Adopting Croissant improves the value of their datasets, while only requiring a minimal effort, thanks to the available creation tools and support from ML data platforms.
Today, users can find Croissant datasets at Kaggle, Hugging Face, and OpenML, and can search for Croissant datasets across the Web using the Dataset Search tool.
With a Croissant dataset, it is possible to load the data into ML frameworks such as TensorFlow, PyTorch, and JAX with minimal code via the TensorFlow Datasets (TFDS) package, and to validate, inspect, and edit its metadata using the open source Python library and visual editor.
To publish a Croissant dataset, users can generate the metadata with the open source visual editor or Python library, host the dataset on a supporting repository such as Kaggle, Hugging Face, or OpenML, or embed the Croissant metadata in their dataset's Web page so it can be discovered by dataset search engines.
We are excited about Croissant's potential to help ML practitioners, but making this format truly useful requires the support of the community. We encourage dataset creators to consider providing Croissant metadata. We encourage platforms hosting datasets to provide Croissant files for download and embed Croissant metadata in dataset Web pages so that they can be made discoverable by dataset search engines. Tools that help users work with ML datasets, such as labeling or data analysis tools, should also consider supporting Croissant datasets. Together, we can reduce the data development burden and enable a richer ecosystem of ML research and development.
We encourage the community to join us in contributing to the effort.
Croissant was developed by the Dataset Search, Kaggle and TensorFlow Datasets teams from Google, as part of an MLCommons community working group, which also includes contributors from these organizations: Bayer, cTuning Foundation, DANS-KNAW, Dotphoton, Harvard, Hugging Face, Kings College London, LIST, Meta, NASA, North Carolina State University, Open Data Institute, Open University of Catalonia, Sage Bionetworks, and TU Eindhoven.
With millions of developers relying on our platform, Google Play is committed to keeping our ecosystem safe for everyone. That’s why, in addition to our ongoing investments in app privacy and security, we also continuously update our policies to respond to new challenges and user expectations.
For example, we recently introduced a new account deletion policy with required disclosures within the Data Safety section on the Play Store. Deleting an account should be as easy as creating one, so the new policy requires developers to provide information and web resources that help users to manage their data and understand an app's deletion practices.
To help you build trust and design a user-friendly experience that helps meet our policy requirements, consider these 5 best practices when implementing your account deletion solution.
Users prefer a simple and straightforward account deletion flow. Although users know that more steps may follow (such as authentication), navigating multiple screens before reaching the deletion page can be a significant barrier and create negative feelings for the user. Consider providing your account deletion option on an account settings page, or placing a prominent button on the home screen. Design the flow with discoverability in mind by taking the user directly to the deletion process.
Users feel that if they can create an account without talking to a customer service agent, they should be able to delete their account online, too. If automation is not on your roadmap just yet, consider a step-by-step deletion request form or a dedicated page to connect users with customer support.
Users delete their accounts for various reasons, some of which may be better resolved another way. Early in your deletion flow, point your users toward a Help Center article that explains how your deletion process works in simple terms, including any potential consequences. For example, make it clear if your users will need to pause their payment method before deleting the account, or download any account data they want to keep. Helping your users understand the process in advance can prevent accidental deletions. For those who do change their minds, consider offering a way to recover their accounts within a reasonable timeframe.
Here's an example of how Play Store developer Canva has designed its in-app deletion flow to explain the potential consequences of account deletion:
“User data privacy has always been important for us. We’ve always been intentional about our approach in optimizing the Canva app so our users can have more transparency and control over their data. We’re welcoming these new requirements from the Play store as we know this new flow will elevate users’ trust in our app and benefit our business in the long term.” - Will Currie, Software Engineer, Canva
Sometimes users misunderstand whether the account itself or just data collected by the app was deleted in the deletion process. Users often think that the data your app stored in the cloud will automatically be deleted at the same time as account deletion. Since it may take time to remove account data from internal company systems or comply with data retention requirements in different regions, transparency about the process can help you maintain trust in your brand and make it more likely for users to return in the future.
Here's how SYBO Games has designed their in-app and web deletion flows:
“We are always striving to ensure that our games provide a fun user experience, built on a solid data protection foundation. When we learned about the new account deletion update on Google Play, we thought this was a great step forward to ensure that the entire developer ecosystem optimizes for user safety. We encourage developers to carve out time to embrace these improvements and prioritize regular privacy check-ins.” - Elizabeth Ann Schweitzer, Games Compliance Manager, SYBO Games
This is a great opportunity to connect with your users at a critical moment. Make sure users who have uninstalled your app can easily remove their accounts through a web resource without needing to reinstall the app. You can also invite them to complete a survey or provide feedback on their decision.
Protecting users' data is essential for building trust and loyalty. By updating the Data Safety section on Google Play and continuing to optimize user experience for account deletion, you can strengthen trust in your company while striving for the highest level of user data protection.
Thank you for your continued collaboration and feedback in developing this data transparency feature and in helping make Google Play safe for all.
Starting May 1, 2024, requests to retrieve, create, or run Full Path and Path Attribution reports through the Bid Manager API will return an error. We deprecated both report types in February 2024. We announced this deprecation last November.
After deprecation, running a query using the ReportType FULL_PATH or PATH_ATTRIBUTION generates an empty report. Existing Query and Report resources of these types are still retrievable, and report files generated previously will still be available.

Starting on May 1, 2024, the ReportType values FULL_PATH and PATH_ATTRIBUTION and the pathQueryOptions field will sunset. As a result:

- queries.create requests using these values or fields will return an INVALID_ARGUMENT error,
- queries.get, queries.delete, and queries.reports.get requests retrieving resources using these types will return a NOT_FOUND error,
- queries.list and queries.reports.list responses will not include resources using these types.

We've added these details to our change log. To avoid an interruption of service, we recommend that you stop creating, retrieving, or running any reports using these values before the applicable sunset date.
If you have questions regarding these changes, please contact us using our support contact form.
Today, we're excited to announce the release of a new Text-to-Speech (TTS) engine for Wear OS that is performant and reliable. Text-to-speech turns text into natural-sounding speech across more than 50 languages, powered by Google's machine learning (ML) technology. The new text-to-speech engine on Wear OS uses smaller prosody ML models to bring faster synthesis to Wear OS devices.
Use cases for Wear OS's text-to-speech range from accessibility services and coaching cues for exercise apps to navigation cues and reading incoming alerts aloud through the watch speaker or Bluetooth-connected headphones. The engine is meant for brief interactions, so it shouldn't be used for reading aloud a long article or a long summary of a podcast.
Text-to-speech has long been supported on Android. Wear OS's new TTS engine has been tuned to be performant and reliable on low-memory devices. All the Android APIs are still the same, so developers use the same process to integrate it into a Wear OS app; for example, TextToSpeech#speak can be used to speak specific text. The engine is available on devices that run Wear OS 4 or higher.
When the user interacts with the Wear OS TTS for the first time following a device boot, the synthesis engine is ready in about 10 seconds. For special cases where developers want the watch to speak immediately after opening an app or launching an experience, the following code can be used to pre-warm the TTS engine before any synthesis requests come in.
private fun initTtsEngine() {
    // Callback when TextToSpeech connection is set up
    val callback = TextToSpeech.OnInitListener { status ->
        if (status == TextToSpeech.SUCCESS) {
            Log.i(TAG, "tts Client Initialized successfully")

            // Get default TTS locale
            val defaultVoice = tts.voice
            if (defaultVoice == null) {
                Log.w(TAG, "defaultVoice == null")
                return@OnInitListener
            }

            // Set TTS engine to use default locale
            tts.language = defaultVoice.locale

            try {
                // Create a temporary file to synthesize sample text
                val tempFile =
                    File.createTempFile("tmpsynthesize", null, applicationContext.cacheDir)

                // Synthesize sample text to our file
                tts.synthesizeToFile(
                    /* text= */ "1 2 3", // Some sample text
                    /* params= */ null, // No params necessary for a sample request
                    /* file= */ tempFile,
                    /* utteranceId= */ "sampletext"
                )

                // And clean up the file
                tempFile.deleteOnExit()
            } catch (e: Exception) {
                Log.e(TAG, "Unhandled exception: ", e)
            }
        }
    }

    tts = TextToSpeech(applicationContext, callback)
}
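One possible way to wire this up, assuming tts is declared as a lateinit property on your activity (as the snippet above already implies), is to trigger the pre-warm from onCreate():

private lateinit var tts: TextToSpeech

override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    // Pre-warm the engine so the first synthesis request responds quickly
    initTtsEngine()
}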
When you are done using TTS, you can release the engine by calling tts.shutdown() in your activity's onDestroy() method; the engine should also be released in the same way when an app that uses TTS is closed.
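A minimal sketch of that cleanup, assuming the same tts property as above:

override fun onDestroy() {
    // Free the synthesis engine's resources when the activity is destroyed
    tts.shutdown()
    super.onDestroy()
}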
By default, Wear OS TTS includes 7 pre-loaded languages in the system image: English, Spanish, French, Italian, German, Japanese, and Mandarin Chinese. OEMs may choose to preload a different set of languages. You can check what languages are available by using TextToSpeech#getAvailableLanguages(). During watch setup, if the user selects a system language whose voice file is not pre-loaded, the watch automatically downloads the corresponding voice file the first time the user connects to Wi-Fi while charging their watch.
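For illustration only (assuming a tts instance that has finished initializing), checking the preloaded voices and whether the user's current locale is covered could look roughly like this:

// Languages the engine can currently synthesize (returned as a Set<Locale>)
val available = tts.availableLanguages
Log.i(TAG, "Available TTS languages: $available")

// See whether the device's current language already has a voice on the watch
val userLocale = Locale.getDefault()
if (available.none { it.language == userLocale.language }) {
    Log.w(TAG, "No voice for ${userLocale.toLanguageTag()} yet; it may be downloaded later over Wi-Fi")
}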
There are limited cases where the speech output may differ from the user's system language. For example, in a scenario where a safety app uses TTS to call emergency responders, developers might want to synthesize speech in the language of the locale the user is in, not in the language the user has their watch set to. To synthesize text in a different language from system settings, use TextToSpeech#setLanguage(java.util.Locale).
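A hedged sketch of that override (the target locale and sample phrase here are purely illustrative):

// Speak in the locale of the user's current region rather than the watch's system language
val targetLocale = Locale("es", "ES") // illustrative: Spanish (Spain)
when (tts.setLanguage(targetLocale)) {
    TextToSpeech.LANG_MISSING_DATA, TextToSpeech.LANG_NOT_SUPPORTED ->
        Log.w(TAG, "No voice data for $targetLocale; keeping the system language")
    else ->
        tts.speak("Texto de ejemplo", TextToSpeech.QUEUE_FLUSH, null, "regionalUtterance")
}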
Your Wear OS apps now have the power to talk, either directly from the watch’s speakers or through Bluetooth connected headphones. Learn more about using TTS.
We look forward to seeing how you use the text-to-speech engine to create more helpful and engaging experiences for your users on Wear OS!
Copyright 2023 Google LLC. SPDX-License-Identifier: Apache-2.0