Tag Archives: Announcements

The Power of Open Source


At the day 1 keynote of Open Source Summit North America, Timothy Jordan, Director of Developer Relations and Open Source at Google, will talk about the landscape of open source and AI, the importance of a responsible approach, and the transformative impact of community collaboration. In anticipation of this talk, let’s break down the AI open source ecosystem, and how Google approaches it.

Google believes in the power of open technology to drive innovation and benefit everyone. It fosters creativity and collaboration, while ensuring technology access for developers and allowing customization to fit unique use cases. Open source licenses give developers full creative autonomy without restriction. It is this ecosystem of open source and open technology, shaped by ML frameworks like TensorFlow, Keras, and JAX, that has enabled so many incredible advances in AI in recent years.

The open source community has been discussing how to apply the Open Source Definition (OSD) to AI, carrying forward the OSD's open principles while addressing concepts like derived works and author attribution. During Timothy’s keynote, he’ll speak to his own philosophy on Open Source and AI, and share how his assumptions about how we apply open source to AI have evolved. The immediate availability of AI models, powered by the open source ecosystem of ML frameworks, means it’s more important than ever that we establish a shared definition for open source and AI.

While that definition is in development, at Google we’re using precise language to describe our openly available models like Gemma. The definition and license are only one part of this open ML/AI future; advancements in safety tooling, policies, and developer knowledge are all part of creating a responsible and open future for AI. Those advancements are all fueled by a dedication to collaboration. Whether sharing innovations and improvements with the community, or having conversations with policymakers and open source leaders, collaboration is key to a responsible approach to AI in the open ecosystem. AI can only be safe and responsible if everyone’s experiences and perspectives are brought to the forefront as it’s built.

To demonstrate how open source has made AI readily available, Timothy will also take the audience through a “low code” demo of how to run large language models in-browser for web applications. Using MediaPipe, the LLM Inference API, and Gemma, users can quickly add genAI capabilities like document summarization and text generation.
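The keynote demo itself runs in the browser with the LLM Inference API for web. For readers who want a feel for the same API on Android, here is a minimal Kotlin sketch; the model path is a placeholder you supply after downloading a Gemma model, and the option names follow the MediaPipe GenAI tasks library, so treat this as a hedged illustration rather than the keynote code.

import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Minimal sketch: run Gemma on-device with the MediaPipe LLM Inference API (Android).
// The model path below is a placeholder; push a downloaded Gemma model to the device first.
fun summarizeOnDevice(context: android.content.Context, document: String): String {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/gemma.bin") // placeholder path
        .setMaxTokens(512)
        .build()
    val llm = LlmInference.createFromOptions(context, options)
    return llm.generateResponse("Summarize the following document:\n$document")
}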

Join us at Open Source Summit North America for this keynote, and visit opensource.google to learn more.

By the Google Open Source team

Gemini 1.5 Pro Now Available in 180+ Countries; With Native Audio Understanding, System Instructions, JSON Mode and More

Posted by Jaclyn Konzelmann and Megan Li - Google Labs

Grab an API key in Google AI Studio, and get started with the Gemini API Cookbook

Less than two months ago, we made our next-generation Gemini 1.5 Pro model available in Google AI Studio for developers to try out. We’ve been amazed by what the community has been able to debug, create, and learn using our groundbreaking 1 million token context window.

Today, we’re making Gemini 1.5 Pro available in 180+ countries via the Gemini API in public preview, with a first-ever native audio (speech) understanding capability and a new File API to make it easy to handle files. We’re also launching new features like system instructions and JSON mode to give developers more control over the model’s output. Lastly, we’re releasing our next generation text embedding model that outperforms comparable models. Go to Google AI Studio to create or access your API key, and start building.


Unlock new use cases with audio and video modalities

We’re expanding the input modalities for Gemini 1.5 Pro to include audio (speech) understanding in both the Gemini API and Google AI Studio. Additionally, Gemini 1.5 Pro is now able to reason across both image (frames) and audio (speech) for videos uploaded in Google AI Studio, and we look forward to adding API support for this soon.


screen grab of a college professor using Gemini 1.5 Pro to create a quiz based on their latest lecture video in Google AI Studio
You can upload a recording of a lecture, like this 117,000+ token lecture from Jeff Dean, and Gemini 1.5 Pro can turn it into a quiz with an answer key. Video sped up for demo purposes.

Gemini API Improvements

Today, we’re addressing a number of top developer requests:

1. System instructions: Guide the model’s responses with system instructions, now available in Google AI Studio and the Gemini API. Define roles, formats, goals, and rules to steer the model's behavior for your specific use case.

image showing where System Instructions is located in Google AI Studio
Set System Instructions easily in Google AI Studio

2. JSON mode: Instruct the model to only output JSON objects. This mode enables structured data extraction from text or images. You can get started with cURL, and Python SDK support is coming soon; a hedged Kotlin sketch after item 3 below shows JSON mode together with system instructions.

3. Improvements to function calling: You can now select modes to limit the model’s outputs, improving reliability. Choose text, function call, or just the function itself.
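To make items 1 and 2 concrete, here is a minimal Kotlin sketch. It assumes a recent version of the Google AI client SDK for Android that exposes a systemInstruction parameter and a responseMimeType field; at launch, JSON mode is documented via cURL, so treat the SDK surface below as an assumption rather than a reference.

import com.google.ai.client.generativeai.GenerativeModel
import com.google.ai.client.generativeai.type.content
import com.google.ai.client.generativeai.type.generationConfig

// Sketch only: combines a system instruction (item 1) with JSON mode (item 2).
// GEMINI_API_KEY is assumed to hold a key created in Google AI Studio.
suspend fun extractIngredientsAsJson(recipeText: String): String? {
    val model = GenerativeModel(
        modelName = "gemini-1.5-pro-latest",
        apiKey = System.getenv("GEMINI_API_KEY") ?: "",
        // Item 1: define the role, format, and rules once for every request.
        systemInstruction = content { text("You are a cooking assistant. Be concise.") },
        // Item 2: ask the model to emit only JSON objects.
        generationConfig = generationConfig { responseMimeType = "application/json" }
    )
    val response = model.generateContent("List the ingredients in this recipe: $recipeText")
    return response.text // expected to be a JSON object because of responseMimeType
}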


A new embedding model with improved performance

Starting today, developers can access our next generation text embedding model via the Gemini API. The new model, text-embedding-004 (text-embedding-preview-0409 in Vertex AI), achieves stronger retrieval performance and outperforms existing models with comparable dimensions on the MTEB benchmarks.

table showing Gecko: Versatile Text Embeddings Distilled from Large Language Models
'text-embedding-004' (aka Gecko), using a 256-dimensional output, outperforms all larger 768-dimensional output models on the MTEB benchmarks
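As a rough illustration of calling the new model, the sketch below posts a request to the public v1beta REST endpoint for text-embedding-004 from Kotlin. The endpoint, field names, and the optional outputDimensionality parameter follow the Gemini API REST reference, and GEMINI_API_KEY is assumed to be set in the environment.

import java.net.HttpURLConnection
import java.net.URL

// Sketch: request an embedding from text-embedding-004 via the Gemini API REST endpoint.
// outputDimensionality trims the returned vector to 256 values.
fun embedText(text: String): String {
    val key = System.getenv("GEMINI_API_KEY") ?: error("Set GEMINI_API_KEY first")
    val url = URL(
        "https://generativelanguage.googleapis.com/v1beta/models/text-embedding-004:embedContent?key=$key"
    )
    val escaped = text.replace("\"", "\\\"")
    val body = """{"model": "models/text-embedding-004",
        "content": {"parts": [{"text": "$escaped"}]},
        "outputDimensionality": 256}"""
    val conn = (url.openConnection() as HttpURLConnection).apply {
        requestMethod = "POST"
        setRequestProperty("Content-Type", "application/json")
        doOutput = true
    }
    conn.outputStream.use { it.write(body.toByteArray()) }
    // The response JSON contains an "embedding" object with a "values" array.
    return conn.inputStream.bufferedReader().readText()
}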

These are just the first of many improvements coming to the Gemini API and Google AI Studio in the next few weeks. We’re continuing to work on making Google AI Studio and the Gemini API the easiest way to build with Gemini. Get started today in Google AI Studio with Gemini 1.5 Pro, explore code examples and quickstarts in our new Gemini API Cookbook, and join our community channel on Discord.

Meet the inaugural cohort of our Google for Startups Accelerator: AI First North America

Posted by Matt Ridenour, Head of Startup Developer Ecosystem - USA

Startups are at the forefront of developing solutions for some of humanity's most pressing challenges by using AI, driving breakthroughs across industries from healthcare to cybersecurity.

To help AI-focused startups scale quickly while building responsibly, we’re thrilled to introduce the inaugural class of the Google for Startups Accelerator: AI-First program in North America. This new program is for startups building AI solutions based in the U.S. and Canada. This is the first of several AI-focused programs we'll offer throughout the year in Europe, India and Brazil.

This equity-free program provides 10 weeks of hands-on mentorship and technical project support to startups using AI in their core service or product. Selected startups will collaborate with a cohort of top peer founders and engage with leaders across Google. The curriculum will give founders access to the latest AI tools (including Google’s own Gemini), and will also include workshops on tech and infrastructure, UX and product, growth, sales, leadership and OKRs.

Meet the inaugural class of Google for Startups Accelerator: AI-First, North America

We’re thrilled to introduce the 15 AI startups selected for this accelerator:

Aptori, San Jose, CA. Aptori assists developers and security engineers to build secure, high-quality software.

Augmend, Seattle, WA. Augmend is an AI native Loom made for developers, making it possible to share expertise, not just videos.

Backpack Healthcare, Elkridge, MD. Backpack Healthcare is a pediatric mental health company utilizing proprietary AI technology, an engagement platform, and live therapists to offer personalized care to patients.

BrainLogic AI, Menlo Park, CA. BrainLogic AI has built a localized AI agent that connects users and businesses through WhatsApp.

Cicerai, The Woodlands, TX. Cicerai is an AI-native Legal Practice Management Platform, boosting productivity and enhancing quality.

CLIKA, San Jose, CA. CLIKA simplifies deploying AI models on diverse hardware by offering automated model compression and format compilation.

Easel AI, Inc., Los Angeles, CA. Easel AI is an AI avatar-based social chat app that runs on iMessage.

Findly, San Francisco, CA. Findly is a data visualization integrator using a natural language chat interface.

Glass Health, San Francisco, CA. Glass Health empowers clinicians with the best-in-class AI platform for clinical decision support.

Kodif, Sunnyvale, CA. Kodif is a low-code AI-powered automation platform for support agent workflows to resolve customer issues.

Liminal, Indianapolis, IN. Liminal empowers regulated enterprises to securely deploy and use generative AI, horizontally covering every interaction and use case.

Mbue, Austin, TX. Mbue leverages AI to instantly review architectural drawings, catching errors earlier and streamlining the process.

Modulo Bio, San Diego, CA. Modulo Bio is building a platform to discover therapeutics that prevent or reverse neurodegenerative diseases.

Rocket Doctor, Toronto, ON, Canada. Rocket Doctor is a digital health platform and marketplace that intelligently matches patients and clinicians in a telemedicine 2.0 approach.

Sibli, Montreal, QC, Canada. Sibli is a fintech platform that processes unstructured data and identifies key insights for financial analysts.

The program kicks off at Cloud Next 2024 and culminates with a high-profile Demo Day in June for potential partners, customers and investors.

After graduation, startups join the dynamic Google for Startups accelerator community, where they receive ongoing support and have the opportunity to build lasting connections with like-minded founders, mentors and investors.

We are honored to partner with this cohort of companies through this accelerator and beyond, to advance their AI technologies. Register your interest to get updates on the program, and join us in celebrating these exceptional startups!

Gemma Family Expands with Models Tailored for Developers and Researchers

Posted by Tris Warkentin – Director, Product Management, and Jane Fine – Senior Product Manager

In February we announced Gemma, our family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. The community's incredible response – including impressive fine-tuned variants, Kaggle notebooks, integration into tools and services, recipes for RAG using databases like MongoDB, and lots more – has been truly inspiring.

Today, we're excited to announce our first round of additions to the Gemma family, expanding the possibilities for ML developers to innovate responsibly: CodeGemma for code completion and generation tasks as well as instruction following, and RecurrentGemma, an efficiency-optimized architecture for research experimentation. Plus, we're sharing some updates to Gemma and our terms, informed by invaluable feedback we've heard from the community and our partners.


Introducing the first two Gemma variants


CodeGemma: Code completion, generation, and chat for developers and businesses

Harnessing the foundation of our Gemma models, CodeGemma brings powerful yet lightweight coding capabilities to the community. CodeGemma models are available as a 7B pretrained variant that specializes in code completion and code generation tasks, a 7B instruction-tuned variant for code chat and instruction-following, and a 2B pretrained variant for fast code completion that fits on your local computer. CodeGemma models have several advantages:

  • Intelligent code completion and generation: Complete lines, functions, and even generate entire blocks of code – whether you're working locally or leveraging cloud resources. 
  • Enhanced accuracy: Trained on 500 billion tokens of primarily English language data from web documents, mathematics, and code, CodeGemma models generate code that's not only more syntactically correct but also semantically meaningful, helping reduce errors and debugging time. 
  • Multi-language proficiency: Your invaluable coding assistant for Python, JavaScript, Java, and other popular languages. 
  • Streamlined workflows: Integrate a CodeGemma model into your development environment to write less boilerplate, and focus on interesting and differentiated code that matters – faster.
image of streamlined workflows within an existing AI dev project with CodeGemma integrated
This table compares the performance of CodeGemma with other similar models on both single and multi-line code completion tasks. Learn more in the technical report.

Learn more about CodeGemma in our report or try it in this quickstart guide.
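For a sense of how the pretrained variants are prompted for completion, the CodeGemma report describes a fill-in-the-middle (FIM) format built from special control tokens. The helper below is a hypothetical sketch that only assembles such a prompt; how you run the model (a local runtime, a hosted endpoint, or one of the frameworks listed later in this post) is up to you.

// Hypothetical helper: build a fill-in-the-middle (FIM) prompt using the control
// tokens described in the CodeGemma report. The model is expected to generate the
// code that belongs between the prefix and the suffix.
fun codeGemmaFimPrompt(prefix: String, suffix: String): String =
    "<|fim_prefix|>$prefix<|fim_suffix|>$suffix<|fim_middle|>"

fun main() {
    val prompt = codeGemmaFimPrompt(
        prefix = "fun factorial(n: Int): Int {\n    return ",
        suffix = "\n}"
    )
    println(prompt)
}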


RecurrentGemma: Efficient, faster inference at higher batch sizes for researchers

RecurrentGemma is a technically distinct model that leverages recurrent neural networks and local attention to improve memory efficiency. While achieving similar benchmark score performance to the Gemma 2B model, RecurrentGemma's unique architecture results in several advantages:

  • Reduced memory usage: Lower memory requirements allow for the generation of longer samples on devices with limited memory, such as single GPUs or CPUs. 
  • Higher throughput: Because of its reduced memory usage, RecurrentGemma can perform inference at significantly higher batch sizes, thus generating substantially more tokens per second (especially when generating long sequences). 
  • Research innovation: RecurrentGemma showcases a non-transformer model that achieves high performance, highlighting advancements in deep learning research. 
graph showing maximum throughput when sampling from a prompt of 2k tokens on TPUv5e
This chart reveals how RecurrentGemma maintains its sampling speed regardless of sequence length, while Transformer-based models like Gemma slow down as sequences get longer.

To understand the underlying technology, check out our paper. For practical exploration, try the notebook, which demonstrates how to finetune the model.


Built upon Gemma foundations, expanding capabilities

Guided by the same principles of the original Gemma models, the new model variants offer:

  • Open availability: Encourages innovation and collaboration with its availability to everyone and flexible terms of use. 
  • High-performance and efficient capabilities: Advances the capabilities of open models with code-specific domain expertise and optimized design for exceptionally fast completion and generation. 
  • Responsible design: Our commitment to responsible AI helps ensure the models deliver safe and reliable results. 
  • Flexibility for diverse software and hardware:  
    • Both CodeGemma and RecurrentGemma: Built with JAX and compatible with JAX, PyTorch, Hugging Face Transformers, and Gemma.cpp. They enable local experimentation and cost-effective deployment across various hardware, including laptops, desktops, NVIDIA GPUs, and Google Cloud TPUs.  
    • CodeGemma: Additionally compatible with Keras, NVIDIA NeMo, TensorRT-LLM, Optimum-NVIDIA, MediaPipe, and availability on Vertex AI. 
    • RecurrentGemma: Support for all the aforementioned products will be available in the coming weeks.

Gemma 1.1 update

Alongside the new model variants, we're releasing Gemma 1.1, which includes performance improvements. Additionally, we've listened to developer feedback, fixed bugs, and updated our terms to provide more flexibility.


Get started today

These first Gemma model variants are available worldwide, starting today on Kaggle, Hugging Face, and Vertex AI Model Garden.

We invite you to try the CodeGemma and RecurrentGemma models and share your feedback on Kaggle. Together, let's shape the future of AI-powered content creation and understanding.

ML Olympiad 2024: Globally Distributed ML Competitions by Google ML Community

Posted by Bitnoori Keum – DevRel Community Manager

The ML Olympiad consists of Kaggle Community Competitions organized by ML GDEs, TFUGs, and other ML communities, aiming to provide developers with opportunities to learn and practice machine learning. Following successful rounds in 2022 and 2023, the third round has now launched with support from Google for Developers for each competition host. Over the last two rounds, 605 teams participated in 32 competitions, generating 105 discussions and 170 notebooks. We encourage you to join this round to gain hands-on experience with machine learning and tackle real-world challenges.


ML Olympiad Community Competitions

Over 20 ML Olympiad community competitions are currently open. Visit the ML Olympiad page to participate.

Smoking Detection in Patients

Predict smoking status with bio-signal ML models
Host: Rishiraj Acharya (AI/ML GDE) / TFUG Kolkata

TurtleVision Challenge

Develop a classification model to distinguish between jellyfish and plastic pollution in ocean imagery
Host: Anas Lahdhiri / MLAct

Detect hallucinations in LLMs

Detect which answers provided by a Mistral 7B instruct model are most likely hallucinations
Host: Luca Massaron (AI/ML GDE)

ZeroWasteEats

Find ML solutions to reduce food wastage
Host: Anushka Raj / TFUG Hajipur

Predicting Wellness

Predict the percentage of body fat in men using multiple regression methods
Host: Ankit Kumar Verma / TFUG Prayagraj

Offbeats Edition

Build a regression model to predict the age of the crab
Host: Ayush Morbar / Offbeats Byte Labs

Nashik Weather

Predict the condition of weather in Nashik, India
Host: TFUG Nashik

Predicting Earthquake Damage

Predict the level of damage to buildings caused by earthquakes, based on aspects of building location and construction
Host: Usha Rengaraju

Forecasting Bangladesh's Weather

Predict rainy days, the amount of rainfall, and the average temperature for a particular day.
Host: TFUG Bangladesh (Dhaka)

CO2 Emissions Prediction Challenge

Predict CO2 emissions per capita for 2030 using global development indicators
Host: Md Shahriar Azad Evan, Shuvro Pal / TFUG North Bengal

AI & ML Malaysia

Predict loan approval status
Host: Kuan Hoong (AI/ML GDE) / Artificial Intelligence & Machine Learning Malaysia User Group

Sustainable Urban Living

Predict the habitability score of properties
Host: Ashwin Raj / BeyondML

Toxic Language (PTBR) Detection

(in local language)
Classify Brazilian Portuguese tweets into one of two classes: toxic or non-toxic.
Host: Mikaeri Ohana, Pedro Gengo, Vinicius F. Caridá (AI/ML GDE)

Improving disaster response

Predict humanitarian aid contributions in response to disasters occurring around the world
Host: Yara Armel Desire / TFUG Abidjan

Urban Traffic Density

Develop predictive models to estimate the traffic density in urban areas
Host: Kartikey Rawat / TFUG Durg

Know Your Customer Opinion

Classify each customer opinion on a Likert scale
Host: TFUG Surabaya

Forecasting India's Weather

Predict the temperature for a particular month
Host: Mohammed Moinuddin / TFUG Hyderabad

Classification Champ

Develop classification models to predict tumor malignancy
Host: TFUG Bhopal

AI-Powered Job Description Generator

Build a system that employs Generative AI and a chatbot interface to automatically generate job descriptions
Host: Akaash Tripathi / TFUG Ghaziabad

Machine Translation French-Wolof

Develop robust algorithms or models capable of accurately translating French sentences into Wolof.
Host: GalsenAI

Water Mapping using Satellite Imagery

Water mapping using satellite imagery and deep learning for dam drought detection
Host: Taha Bouhsine / ML Nomads


Navigating ML Olympiad

To see all the community competitions around the ML Olympiad, search "ML Olympiad" on Kaggle and look for further related posts on social media using #MLOlympiad. Browse through the available competitions and participate in those that interest you!

Google for Games is coming to GDC 2024

Posted by Aurash Mahbod – General Manager, Games on Google Play

Google for Games is coming to GDC in San Francisco! Join us on March 19 for the Game Developers Conference (GDC) at the Moscone Center, where game developers from across the world will gather to learn, network, problem-solve, and help shape the future of the industry. From March 18 to March 22, experience our comprehensive suite of multi-platform game development tools and explore the new features from Play Pass at the West Hall, Level 2 Lobby.

This year, we’re proud to host eight sessions for developers, designers, business and marketing teams, and everyone else in the gaming community with an interest in growing their game business. Take a look at this year’s sessions below, and if you’re interested in learning more about topics from Google Play and Android, check out key product updates from the Google for Games Developer Summit.


Scaling your game development

We’re hosting three sessions designed to help scale your game development using tools from Firebase, Android, and Google Cloud. Learn more about building high quality games with case studies from industry experts.


Beyond "Set and Forget": Advanced Debugging with Firebase Crashlytics

Tuesday, March 19, 9:30 am - 10:00 am 

Speaker: Joe Spiro (Developer Relations Engineer, Google) 

Crashlytics has added a number of features that make detecting, tracking, and understanding bugs even easier, from high-level to native code. Take your fixes to another level with native stack traces, memory debugging, issue annotation, and the ability to log uncaught exceptions as fatal.


Enhancing Game Performance: Vulkan and Android Adaptability Technology

Tuesday, March 19, 10:50 am - 11:50 am 

Speakers: Dohyun Kim (Developer Relations Engineer, Android Games, Google), Hak Matsuda (Developer Relations Engineer, Android Games, Google), Jungwoo Kim (Principal Engineer, Samsung), Syed Farhan Hassan (Software Engineer, ARM) 

Learn how to leverage the Vulkan graphics API to improve your graphics quality or performance, including performance tuning with dynamic upscaling. Find out how the Android Dynamic Performance Framework (ADPF) can enhance game performance and power in Unity and native C++, with easy integration through the Unreal Engine plugin. We're also sharing how NCSoft's Lineage W improved thermal status and performance using ADPF.


Creating a global-scale game with Google Cloud

Tuesday, March 19, 4:40 pm - 5:10 pm 

Speaker: Mark Mandel (Developer Advocate, Google) 

This session will cover the best of Google Cloud's open source projects (Agones, Open Match, and more) and products (GKE, Spanner, Anthos Service Mesh, Cloud Build, Cloud Deploy, and more) to teach you how to build, deploy, and scale world-scale multiplayer games with Google Cloud.


Increasing user engagement

We’re hosting two sessions designed to help you increase engagement by creating dynamic gameplay experiences using generative AI and expanding opportunities on Google Play to grow your community of players with exclusive rewards.

Reimagine the Future of Gaming with Google AI

Tuesday, March 19, 10:50 am - 11:50 am 

Speakers: Gus Martins (Developer Advocate, Google), Dan Zaratsian (AI/ML Solutions Architect, Google), Lei Zhang (Director, Play Partnerships, Global GenAI & Greater China Play Partnerships, Google), Jack Buser (Director, Game Industry Solutions), Simon Tokumine (Director of Product Management, Google AI), Giovane Moura Jr. (App Modernization Specialist, Google), Moonlit Beshinov (Head of Google for Games Partnerships and Industry Strategy, Google) 

In our keynote session, senior executives from Google Cloud, Google Play, and Labs will share their unique perspectives on generative AI in the gaming landscape. Learn more about cutting-edge AI solutions from Google Cloud, Android, Google Play, and Labs designed to simplify game development, publishing, and business operations, plus actionable strategies to leverage AI for faster development, better player experiences, and sustainable growth.

Grow your community of loyal gamers with Google Play

Tuesday, March 19, 1:20 pm - 1:50 pm 

Speaker: Tom Grinsted (Group Product Manager, Google Play Games, Google) 

In this session, we’ll cover new features and insights from Google Play to create rewarding experiences for gamers using Play Pass, Play Points, and Play Games Services. Get a behind-the-scenes look at how Google Play rewards a growing community of passionate gamers, and how to use this to super-charge your business.


Maximizing reach across screens

These sessions, from Google Play, Android, and Flutter, introduce ways to expand your mobile games to PC. Learn about the latest tools that will help you accelerate growth across large screens.

Bringing more users to your Google Play Games on PC game

Tuesday, March 19, 2:10 pm - 2:40 pm 

Speakers: Aly Hung (Developer Relations Engineer, Android and Google Play, Google), Dara Monasch (Product Manager, Google), Justin Gardner (Partner Program Manager, App Attribution, Google) 

Join us for an overview of Google Play Games on PC, how it has grown in the past year, and a walkthrough of how to optimize and attribute your PC advertisements for your Google Play Games on PC titles. Learn how to use Google Play Games to increase your reach and acquisition of PC users for your mobile game, as well as how to effectively use the Google Play Install Referrer API to attribute and optimize your ads across mobile and PC.

Android input on desktop: How to delight your users

Tuesday, March 19, 3:00 pm - 3:30 pm 

Speakers: Shenshen Cui (Staff Developer Relations Engineer, Google), Patrick Martin (Developer Relations Engineer, Google) 

Give your players a first-class gaming experience with our best practices for handling input between mobile and PC games, including technical details on how to implement these best practices across mobile, tablets, Chromebooks and Windows PCs1. Learn how Android handles keyboard, mouse, and controller input across different form factors, with case studies for designing for both touch and hardware input.

Building Multiplatform Games with Flutter

Tuesday, March 19, 3:50 pm - 4:20 pm 

Speakers: Zoey Fan (Senior Product Manager, Flutter, Google), Brett Morgan (Developer Relations Engineer, Google) 

Learn why game developers are choosing Flutter to build casual games on mobile, desktop, and web browsers. We’ll cover the Casual Games Toolkit, a collection of free, open-source tools, templates, and resources that make game development with Flutter more productive.

Learn more about all of our sessions coming to you on March 19 at GDC in San Francisco.


________________

1Windows is a trademark of the Microsoft group of companies.

Introducing Play Install Referrer for Google Play Games on PC

Posted by Arjun Dayal, Director – Google Play Games

Making informed marketing decisions relies on identifying your most valuable user acquisition channels for your games. By tracking referral data, you can understand which traffic sources send the most users to download your app from the Google Play store. These insights can help you make the most of your advertising spend and maximize ROI. That’s why in 2017, we launched the Play Install Referrer API, which gives you an easy, reliable way to track your apps’ referral information directly from the Play store.

Until now, this feature was only available for your games in the mobile Play store. Today, we’re pleased to announce support for Google Play Games on PC, allowing you to attribute conversions from your marketing activities on the Web1. If you use the Google Play Install Referrer API to track your referral sources, you can now attribute conversions to specific campaigns directly from Google Play by manually retrieving referral information, or using third-party analytics tools from Google’s App Attribution Partners.

Getting started is easy. First, generate a Google Play URL for your game’s Google Play store listing page and add a referrer query parameter for your Web campaign. Then, when a PC user clicks the link, they will be redirected to your game’s listing page on the Google Play Web store, which will give them the option to “Install on Windows.” Once the user launches your game, you’ll be able to track the referral using the Google Play Install Referrer library.
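Once an install comes from that link, reading the referrer on-device uses the same Install Referrer library as on mobile. A minimal Kotlin sketch, assuming the com.android.installreferrer:installreferrer dependency is already on the classpath:

import android.content.Context
import com.android.installreferrer.api.InstallReferrerClient
import com.android.installreferrer.api.InstallReferrerStateListener

// Connect to the Play Store and read the referrer string recorded at install time.
fun fetchInstallReferrer(context: Context) {
    val client = InstallReferrerClient.newBuilder(context).build()
    client.startConnection(object : InstallReferrerStateListener {
        override fun onInstallReferrerSetupFinished(responseCode: Int) {
            if (responseCode == InstallReferrerClient.InstallReferrerResponse.OK) {
                // e.g. "utm_source=spring_campaign&utm_medium=web"
                val referrer = client.installReferrer.installReferrer
                // Forward the referrer string to your analytics or attribution pipeline here.
            }
            client.endConnection()
        }
        override fun onInstallReferrerServiceDisconnected() {
            // Retry the connection if needed.
        }
    })
}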

“With integration support from Adjust, developers can quickly and efficiently measure marketing campaigns for their games on Google Play Games on PC. We’re excited about the opportunity this brings for developers to broaden their games’ reach and strengthen cross-platform measurement.” 

- Gijsbert Pols, Director of Connected TV & New Channels, Adjust

Learn more about third-party referral codes for Google Play Games on PC and start optimizing your marketing performance today.


______________

1Subject to device compatibility with Google Play Games on PC.

Introducing a new Text-To-Speech engine on Wear OS

Posted by Ouiam Koubaa – Product Manager and Yingzhe Li – Software Engineer

Today, we’re excited to announce the release of a new Text-To-Speech (TTS) engine that is performant and reliable. Text-to-speech turns text into natural-sounding speech across more than 50 languages, powered by Google’s machine learning (ML) technology. The new text-to-speech engine on Wear OS uses reduced-size prosody ML models to bring faster synthesis to Wear OS devices.

Use cases for Wear OS’s text-to-speech include accessibility services, coaching cues for exercise apps, navigation cues, and reading incoming alerts aloud through the watch speaker or Bluetooth-connected headphones. The engine is meant for brief interactions, so it shouldn’t be used for reading aloud a long article or a long summary of a podcast.

How to use Wear OS’s TTS

Text-to-speech has long been supported on Android. Wear OS’s new TTS has been tuned to be performant and reliable on low-memory devices. All the Android APIs are still the same, so developers use the same process to integrate it into a Wear OS app; for example, TextToSpeech#speak can be used to speak specific text. This is available on devices that run Wear OS 4 or higher.

When the user interacts with the Wear OS TTS for the first time following a device boot, the synthesis engine is ready in about 10 seconds. For special cases where developers want the watch to speak immediately after opening an app or launching an experience, the following code can be used to pre-warm the TTS engine before any synthesis requests come in.

private fun initTtsEngine() {
    // Callback when TextToSpeech connection is set up
    val callback = TextToSpeech.OnInitListener { status ->
        if (status == TextToSpeech.SUCCESS) {
            Log.i(TAG, "tts Client Initialized successfully")

            // Get default TTS locale
            val defaultVoice = tts.voice
            if (defaultVoice == null) {
                Log.w(TAG, "defaultVoice == null")
                return@OnInitListener
            }

            // Set TTS engine to use default locale
            tts.language = defaultVoice.locale

            try {
                // Create a temporary file to synthesize sample text
                val tempFile =
                    File.createTempFile("tmpsynthesize", null, applicationContext.cacheDir)

                // Synthesize sample text to our file
                tts.synthesizeToFile(
                    /* text= */ "1 2 3", // Some sample text
                    /* params= */ null, // No params necessary for a sample request
                    /* file= */ tempFile,
                    /* utteranceId= */ "sampletext"
                )

                // And clean up the file
                tempFile.deleteOnExit()
            } catch (e: Exception) {
                Log.e(TAG, "Unhandled exception: ", e)
            }
        }
    }

    tts = TextToSpeech(applicationContext, callback)
}

When you are done using TTS, you can release the engine by calling tts.shutdown() in your activity’s onDestroy() method. You should also release the engine when the user closes an app that uses TTS.
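For example, reusing the tts property initialized in the snippet above, a coaching app might speak a short cue and then release the engine when the activity goes away (a minimal sketch):

private fun speakCue() {
    // Speak a short coaching cue; the last argument is an utterance ID used in callbacks.
    tts.speak("Halfway there, keep going!", TextToSpeech.QUEUE_FLUSH, null, "coaching_cue")
}

override fun onDestroy() {
    // Release the engine when the activity goes away.
    tts.shutdown()
    super.onDestroy()
}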

Languages and Locales

By default, Wear OS TTS includes 7 pre-loaded languages in the system image: English, Spanish, French, Italian, German, Japanese, and Mandarin Chinese. OEMs may choose to preload a different set of languages. You can check what languages are available by using TextToSpeech#getAvailableLanguages(). During watch setup, if the user selects a system language whose voice file is not pre-loaded, the watch automatically downloads the corresponding voice file the first time the user connects to Wi-Fi while charging their watch.

There are limited cases where the speech output may differ from the user’s system language. For example, in a scenario where a safety app uses TTS to call emergency responders, developers might want to synthesize speech in the language of the locale the user is in, not in the language the user has their watch set to. To synthesize text in a different language from system settings, use TextToSpeech#setLanguage(java.util.Locale).
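For instance, a safety app could check the available voices and switch to the user’s physical locale just for that utterance (a minimal sketch, continuing with the tts property from above):

import java.util.Locale

private fun speakInLocale(phrase: String, locale: Locale) {
    // TextToSpeech#getAvailableLanguages() lists the voices present on this watch.
    if (locale in tts.availableLanguages) {
        tts.language = locale // TextToSpeech#setLanguage(java.util.Locale)
        tts.speak(phrase, TextToSpeech.QUEUE_FLUSH, null, "localized_phrase")
    }
}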

Conclusion

Your Wear OS apps now have the power to talk, either directly from the watch’s speakers or through Bluetooth connected headphones. Learn more about using TTS.

We look forward to seeing how you use the text-to-speech engine to create more helpful and engaging experiences for your users on Wear OS!


Copyright 2023 Google LLC.
SPDX-License-Identifier: Apache-2.0

New goodies from Android, Wearables at Mobile World Congress + tune in to a new episode of #TheAndroidShow next week!

Posted by Anirudh Dewani, Director of Android Developer Relations

Earlier today, at Mobile World Congress (MWC), an annual conference showcasing the latest in mobile, Android and our partners unveiled a range of new goodies, including new wearables and foldables, as well as a number of new features for Android users. Keep reading below to see how you, as developers, can take advantage of these new features and devices. And in just over a week, on Thursday March 7 at 10AM PT, we’ll be kicking off another episode of #TheAndroidShow, our quarterly live show on YouTube and on developer.android.com, where we’ll dive more into these topics.


Meet the new watch from OnePlus and how we’re boosting power with the Wear OS hybrid interface

Wearables are on display across MWC this week, and one of our favorites is the OnePlus Watch 2, powered by the latest version of Wear OS (Wear OS 4). As part of our ongoing work to improve the Wear OS by Google user experience, we’ve made fundamental changes to the platform and substantially expanded the capabilities of the Wear OS hybrid interface, improving two key areas: power and performance. As a developer, you can leverage existing Wear OS APIs to get the underlying optimizations without any added effort – no code changes required! You can read more about the updates here.

Images of three people wearing the OnePlus Watch 2

A few new features for Android users

Google released 9 new features Android users can take advantage of across Google apps; you can read more about those features here. For developers, we wanted to highlight a few ways you can take advantage of this news across experiences you build into your apps:

    • More places for users to see their Health Connect data, now in the Fitbit app: With permission from your users, Health Connect is a central way to connect and sync their favorite health and fitness apps, see all their data in one place, and stay in control of their privacy. By setting up Health Connect in the Fitbit mobile app for Android, users will have an overview of their health and fitness data from across their apps in one place. You can join developers like Peloton, ŌURA, and Lifesum who are using Health Connect to provide their users with deeper health and fitness insights. Get started now, and see the sketch below for what a basic read looks like.
Image that reads 'New updates on Android' with pictures of a smart watch, laptop, and Android Auto
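As a rough sketch of reading that data with the androidx.health.connect Jetpack client (permission handling is omitted and assumed to have already been granted):

import androidx.health.connect.client.HealthConnectClient
import androidx.health.connect.client.records.StepsRecord
import androidx.health.connect.client.request.ReadRecordsRequest
import androidx.health.connect.client.time.TimeRangeFilter
import java.time.Instant
import java.time.temporal.ChronoUnit

// Sum the user's step count over the last 24 hours from Health Connect.
// Assumes the read permission for StepsRecord has already been granted.
suspend fun stepsInLastDay(client: HealthConnectClient): Long {
    val now = Instant.now()
    val response = client.readRecords(
        ReadRecordsRequest(
            recordType = StepsRecord::class,
            timeRangeFilter = TimeRangeFilter.between(now.minus(24, ChronoUnit.HOURS), now)
        )
    )
    return response.records.sumOf { it.count }
}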

A new episode of #TheAndroidShow, live on March 7 at 10AM PT. Send us your #AskAndroid questions now!

You can join us on March 7 at 10AM PT for a new episode of #TheAndroidShow. In this quarterly show, we’ll unpack the latest Android foldables and large screens for you to get building on, plus a behind-the-scenes look at Gemini Nano and AICore.

We’ll have a live #AskAndroid Q&A with the team about building Android; you can ask us about building excellent apps across devices, Android 15, Compose, Gemini and more, using #AskAndroid on X or on YouTube. Our experts are ready to answer your questions live!

#TheAndroidShow: March 7 at 10AM PT, broadcast live on YouTube and d.android.com/events/show!

The First Developer Preview of Android 15

Posted by Dave Burke, VP of Engineering
Android 14 logo

We're releasing the first Developer Preview of Android 15 today so you, our developers, can collaborate with us to build a better Android.

Android 15 continues our work to build a platform that helps improve your productivity while giving you new capabilities to produce superior media experiences, minimize battery impact, maximize smooth app performance, and protect user privacy and security all on the most diverse lineup of devices out there.

Android enables your apps to take advantage of premium device hardware, including high-end camera capabilities, powerful GPUs, dazzling displays, and AI processing. The demand for large-screen devices, including tablets, foldables and flippables, continues to grow, offering an opportunity to reach high-value users. Also, Android is committed to providing tooling and libraries to help your apps take advantage of the latest advances in AI.

Your feedback on the Android 15 Developer Preview and QPR beta program plays a key role in helping Android continuously improve. The Android 15 developer site has more information about the preview, including downloads for Pixel and detailed documentation about changes. This preview is just the beginning, and we’ll have lots more to share as we move through the release cycle. Thank you in advance for your help in making Android a platform that works for everyone.

Protecting user privacy and security

Android is constantly working to create solutions that maximize user privacy and security.

Privacy Sandbox on Android

Android 15 brings Android AdServices up to extension level 10, incorporating the latest version of the Privacy Sandbox on Android, part of our work to develop new technologies that improve user privacy and enable effective, personalized advertising experiences for mobile apps. Our website has more about the Privacy Sandbox on Android developer preview and beta programs to help you get started.

Health Connect

Android 15 integrates Android 14 extensions 10 around Health Connect by Android, a secure and centralized platform to manage and share app-collected health and fitness data. This update adds support for new data types across fitness, nutrition, and more.

File integrity

Android 15's FileIntegrityManager includes new APIs that tap into the power of the fs-verity feature in the Linux kernel. With fs-verity, files can be protected by custom cryptographic signatures, helping you ensure they haven't been tampered with or corrupted. This leads to enhanced security, protecting against potential malware or unauthorized file modifications that could compromise your app's functionality or data.
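A hedged sketch of how this might look, assuming the Android 15 preview's FileIntegrityManager methods setupFsVerity() and getFsVerityDigest() (names taken from the preview documentation and subject to change):

import android.content.Context
import android.security.FileIntegrityManager
import java.io.File

// Sketch: protect a downloaded asset with fs-verity, then read back its digest so it
// can be checked against a trusted signature. Uses assumed Android 15 preview APIs.
fun protectAsset(context: Context, asset: File) {
    val fim = context.getSystemService(FileIntegrityManager::class.java) ?: return
    fim.setupFsVerity(asset)                  // enable fs-verity protection (assumed API)
    val digest = fim.getFsVerityDigest(asset) // per-file digest (assumed API)
    // Compare the digest against a signature you ship with the app or fetch from your server.
}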

Partial screen sharing

Android 15 supports partial screen sharing so users can share or record just an app window rather than the entire device screen. This feature, enabled first in Android 14 QPR2, includes MediaProjection callbacks that allow your app to customize the partial screen sharing experience. Note that user consent is now required for each MediaProjection capture session.
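For example, the MediaProjection.Callback methods introduced with this feature let a capture app react when the shared app window is resized or hidden (a minimal sketch):

import android.media.projection.MediaProjection
import android.os.Handler
import android.os.Looper

// Register callbacks so the app can react when the user resizes or hides the
// captured app window during partial screen sharing.
fun observeCapturedContent(projection: MediaProjection) {
    projection.registerCallback(object : MediaProjection.Callback() {
        override fun onCapturedContentResize(width: Int, height: Int) {
            // e.g., resize your VirtualDisplay and encoder to match the new window size
        }
        override fun onCapturedContentVisibilityChanged(isVisible: Boolean) {
            // e.g., pause the stream or show a placeholder while the window is hidden
        }
        override fun onStop() {
            // Clean up when the user ends the capture session
        }
    }, Handler(Looper.getMainLooper()))
}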

Supporting creators

Android continues its work to give you access to tools and hardware to support creators to bring their vision to life on Android.

In-app Camera Controls

Android 15 adds new extensions for more control over the camera hardware and its algorithms on supported devices.

Virtual MIDI 2.0 Devices

Android 13 added support for connecting to MIDI 2.0 devices via USB, which communicate using Universal MIDI Packets (UMP). Android 15 extends UMP support to virtual MIDI apps, enabling composition apps to control synthesizer apps as a virtual MIDI 2.0 device just like they would with a USB MIDI 2.0 device.

Performance and quality

Android continues its focus on helping you improve the quality of your apps. Much of this focus is around tooling and libraries, including Jetpack Compose, Android Studio, and more.

Dynamic Performance

Android 15 continues our investment in the Android Dynamic Performance Framework (ADPF), a set of APIs that allow games and performance intensive apps to interact more directly with power and thermal systems of Android devices. On supported devices, Android 15 will add new ADPF capabilities:

    • A power-efficiency mode for hint sessions to indicate that their associated threads should prefer power saving over performance, great for long-running background workloads.
    • GPU and CPU work durations can both be reported in hint sessions, allowing the system to adjust CPU and GPU frequencies together to best meet workload demands.

To learn more about how to use ADPF in your apps and games, head over to the documentation.
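As a rough sketch of the flow: the baseline hint-session calls below have existed since Android 12, while the power-efficiency preference is an Android 15 addition whose Java-side name is an assumption based on the preview docs.

import android.content.Context
import android.os.PerformanceHintManager

// Create a hint session for the threads doing the work, give the system a target
// duration (a 60 fps frame budget here), and report what each iteration actually took.
fun createRenderHintSession(context: Context, workerThreadIds: IntArray) {
    val manager = context.getSystemService(PerformanceHintManager::class.java) ?: return
    val session = manager.createHintSession(workerThreadIds, 16_600_000L) ?: return

    // Android 15 (assumed API name): prefer power savings over raw performance,
    // appropriate for long-running background workloads.
    // session.setPreferPowerEfficiency(true)

    val measuredNanos = 14_200_000L // example value; measure your real work duration instead
    session.reportActualWorkDuration(measuredNanos)
}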

Developer Productivity

Android 15 continues to add OpenJDK APIs, including quality-of-life improvements around NIO buffers, streams, security, and more. These APIs are updated on over a billion devices running Android 12+ through Google Play System updates, so you can target the latest programming features.

App compatibility

Image of Android 15 Development timeline, indicating we are on time with Developer Previews in February

To give you more time to plan for app compatibility work, we’re letting you know our Platform Stability milestone well in advance.

At this milestone, we’ll deliver final SDK/NDK APIs and also final internal APIs and app-facing system behaviors. We’re expecting to reach Platform Stability in June 2024, and from that time you’ll have several months before the official release to do your final testing. The release timeline details are here.

Get started with Android 15

The Developer Preview has everything you need to try the Android 15 features, test your apps, and give us feedback. You can get started today by flashing a system image onto a Pixel 6, 7, or 8 series device, along with the Pixel Fold and Pixel Tablet. If you don’t have a Pixel device, you can use the 64-bit system images with the Android Emulator in Android Studio.

For the best development experience with Android 15, we recommend that you use the latest preview of Android Studio Jellyfish (or more recent Jellyfish+ versions). Once you’re set up, here are some of the things you should do:

    • Try the new features and APIs – your feedback is critical during the early part of the developer preview. Report issues in our tracker on the feedback page.
    • Test your current app for compatibility – learn whether your app is affected by changes in Android 15; install your app onto a device or emulator running Android 15 and extensively test it.

We’ll update the preview system images and SDK regularly throughout the Android 15 release cycle. This initial preview release is for developers only and not intended for daily or consumer use, so we're making it available by manual download only. Once you’ve manually installed a preview build, you’ll automatically get future updates over-the-air for all later previews and Betas. Read more here.

If you intend to move from the Android 14 QPR Beta program to the Android 15 Developer Preview program and don't want to have to wipe your device, we recommend that you move to Developer Preview 1 now. Otherwise, you may run into time periods where the Android 14 Beta will have a more recent build date, which will prevent you from going directly to the Android 15 Developer Preview without doing a data wipe.

As we reach our Beta releases, we'll be inviting consumers to try Android 15 as well, and we'll open up enrollment for the Android Beta program at that time. For now, please note that the Android Beta program is not yet available for Android 15.

For complete information, visit the Android 15 developer site.


Java and OpenJDK are trademarks or registered trademarks of Oracle and/or its affiliates.