Tag Archives: Text-to-Speech

Introducing a new Text-To-Speech engine on Wear OS

Posted by Ouiam Koubaa – Product Manager and Yingzhe Li – Software Engineer

Today, we’re excited to announce the release of a new Text-To-Speech (TTS) engine that is performant and reliable. Text-to-speech turns text into natural-sounding speech across more than 50 languages powered by Google’s machine learning (ML) technology. The new text-to-speech engine on Wear OS uses decreased prosody ML models to bring faster synthesis on Wear OS devices.

Use cases for Wear OS’s text-to-speech can range from accessibility services, coaching cues for exercise apps, navigation cues, and reading aloud incoming alerts through the watch speaker or Bluetooth connected headphones. The engine is meant for brief interactions, so it shouldn’t be used for reading aloud a long article, or a long summary of a podcast.

How to use Wear OS’s TTS

Text-to-speech has long been supported on Android. Wear OS’s new TTS has been tuned to be performant and reliable on low-memory devices. All the Android APIs are still the same, so developers use the same process to integrate it into a Wear OS app, for example, TextToSpeech#speak can be used to speak specific text. This is available on devices that run Wear OS 4 or higher.

When the user interacts with the Wear OS TTS for the first time following a device boot, the synthesis engine is ready in about 10 seconds. For special cases where developers want the watch to speak immediately after opening an app or launching an experience, the following code can be used to pre-warm the TTS engine before any synthesis requests come in.

private fun initTtsEngine() {
    // Callback when TextToSpeech connection is set up
    val callback = TextToSpeech.OnInitListener { status ->
        if (status == TextToSpeech.SUCCESS) {
            Log.i(TAG, "tts Client Initialized successfully")


            // Get default TTS locale
            val defaultVoice = tts.voice
            if (defaultVoice == null) {
                Log.w(TAG, "defaultVoice == null")
                return@OnInitListener
            }


            // Set TTS engine to use default locale
            tts.language = defaultVoice.locale




            try {
                // Create a temporary file to synthesize sample text
                val tempFile =
                        File.createTempFile("tmpsynthesize", null, applicationContext.cacheDir)


                // Synthesize sample text to our file
                tts.synthesizeToFile(
                        /* text= */ "1 2 3", // Some sample text
                        /* params= */ null, // No params necessary for a sample request
                        /* file= */ tempFile,
                        /* utteranceId= */ "sampletext"
                )


                // And clean up the file
                tempFile.deleteOnExit()
            } catch (e: Exception) {
                Log.e(TAG, "Unhandled exception: ", e)
            }
        }
    }


    tts = TextToSpeech(applicationContext, callback)
}

When you are done using TTS, you can release the engine by calling tts.shutdown() in your activity’s onDestroy() method. This command should also be used when closing an app that TTS is used for.

Languages and Locales

By default, Wear OS TTS includes 7 pre-loaded languages in the system image: English, Spanish, French, Italian, German, Japanese, and Mandarin Chinese. OEMs may choose to preload a different set of languages. You can check what languages are available by using TextToSpeech#getAvailableLanguages(). During watch setup, if the user selects a system language that is not a pre-loaded voice file, the watch automatically downloads the corresponding voice file the first time the user connects to Wi-Fi while charging their watch.

There are limited cases where the speech output may differ from the user’s system language. For example, in a scenario where a safety app uses TTS to call emergency responders, developers might want to synthesize speech in the language of the locale the user is in, not in the language the user has their watch set to. To synthesize text in a different language from system settings, use TextToSpeech#setLanguage(java.util.Locale)

Conclusion

Your Wear OS apps now have the power to talk, either directly from the watch’s speakers or through Bluetooth connected headphones. Learn more about using TTS.

We look forward to seeing how you use Text-to-speech engine to create more helpful and engaging experiences for your users on Wear OS!


Copyright 2023 Google LLC.
SPDX-License-Identifier: Apache-2.0

Listen to our major Text to Speech upgrades for 64 bit devices.

Posted by Rakesh Iyer, Staff Software Engineer and Leland Rechis, Group Product Manager

We are upgrading the Speech Services by Google speech engine in a big way, providing clearer, more natural voices. All 421 voices in 67 languages have been upgraded with a new voice model and synthesizer.

If you already use TTS and the Speech Services by Google engine, there is nothing to do – everything will happen behind the scenes as your users will have automatically downloaded the latest update. We’ve seen a significant side by side quality increase with this change, particularly in respects to clarity and naturalness.

With this upgrade we will also be changing the default voice in en-US to one that is built using fresher speaker data, which alongside our new stack, results in a drastic improvement. If your users have not selected a system voice, and you rely on system defaults, they will hear a slightly different speaker. You can hear the difference below

Speaker change and upgrade for EN-US

Sample Current Speaker

Sample Upgraded Speaker


Speaker upgrades in a few other languages

Current

Upgraded

HI-IN

HI-IN

PT-BR

PT-BR

ES-US

ES-US


This update will be rolling out to all 64 bit Android devices via the Google Play Store over the next few weeks as a part of the Speech Services by Google apk. If you are concerned your users have not updated this yet, you can check for the minimum version code ,210390644 on the package com.google.android.tts.

If you haven't used TTS in your projects yet, or haven’t given your users the ability to choose a voice within your app, it's fairly straightforward, and easy to experiment with. We’ve included some sample code to get you started. 

Here’s an example of how to set up voice synthesis, get a list of voices, and set a specific voice. We finally send a simple utterance to the synthesizer.

public class MainActivity extends AppCompatActivity {
  private static final String TAG = "TextToSpeechSample";

  private TextToSpeech tts;

  private final UtteranceProgressListener progressListener =
new UtteranceProgressListener() {
    @Override
    public void onStart(String utteranceId) {
      Log.d(TAG, "Started utterance " + utteranceId);
    }

    @Override
    public void onDone(String utteranceId) {
      Log.d(TAG, "Done with utterance " + utteranceId);
    }

    @Override
    public void onError(String s) { }
  };

  private final TextToSpeech.OnInitListener initListener = new OnInitListener() {
    @Override
    public void onInit(int status) {
      tts.setOnUtteranceProgressListener(progressListener);
      for (Voice voice : tts.getVoices()) {
        if (voice.getName().equals("en-us-x-iog-local")) {
          tts.setVoice(voice);
          tts.speak("1 2 3", TextToSpeech.QUEUE_ADD, new Bundle(), "utteranceId");
          break;
        }
      }
    }
  };

  @Override
  protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_main);
    tts = new TextToSpeech(this, initListener);
  }

  @Override
  protected void onDestroy() {
    if (tts != null) {
      tts.shutdown();
    }
    super.onDestroy();
  }
}


We are excited to see this upgraded experience in your app!

Listen to our major Text to Speech upgrades for 64 bit devices.

Posted by Rakesh Iyer, Staff Software Engineer and Leland Rechis, Group Product Manager

We are upgrading the Speech Services by Google speech engine in a big way, providing clearer, more natural voices. All 421 voices in 67 languages have been upgraded with a new voice model and synthesizer.

If you already use TTS and the Speech Services by Google engine, there is nothing to do – everything will happen behind the scenes as your users will have automatically downloaded the latest update. We’ve seen a significant side by side quality increase with this change, particularly in respects to clarity and naturalness.

With this upgrade we will also be changing the default voice in en-US to one that is built using fresher speaker data, which alongside our new stack, results in a drastic improvement. If your users have not selected a system voice, and you rely on system defaults, they will hear a slightly different speaker. You can hear the difference below

Speaker change and upgrade for EN-US

Sample Current Speaker

Sample Upgraded Speaker


Speaker upgrades in a few other languages

Current

Upgraded

HI-IN

HI-IN

PT-BR

PT-BR

ES-US

ES-US


This update will be rolling out to all 64 bit Android devices via the Google Play Store over the next few weeks as a part of the Speech Services by Google apk. If you are concerned your users have not updated this yet, you can check for the minimum version code ,210390644 on the package com.google.android.tts.

If you haven't used TTS in your projects yet, or haven’t given your users the ability to choose a voice within your app, it's fairly straightforward, and easy to experiment with. We’ve included some sample code to get you started. 

Here’s an example of how to set up voice synthesis, get a list of voices, and set a specific voice. We finally send a simple utterance to the synthesizer.

public class MainActivity extends AppCompatActivity {
  private static final String TAG = "TextToSpeechSample";

  private TextToSpeech tts;

  private final UtteranceProgressListener progressListener =
new UtteranceProgressListener() {
    @Override
    public void onStart(String utteranceId) {
      Log.d(TAG, "Started utterance " + utteranceId);
    }

    @Override
    public void onDone(String utteranceId) {
      Log.d(TAG, "Done with utterance " + utteranceId);
    }

    @Override
    public void onError(String s) { }
  };

  private final TextToSpeech.OnInitListener initListener = new OnInitListener() {
    @Override
    public void onInit(int status) {
      tts.setOnUtteranceProgressListener(progressListener);
      for (Voice voice : tts.getVoices()) {
        if (voice.getName().equals("en-us-x-iog-local")) {
          tts.setVoice(voice);
          tts.speak("1 2 3", TextToSpeech.QUEUE_ADD, new Bundle(), "utteranceId");
          break;
        }
      }
    }
  };

  @Override
  protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_main);
    tts = new TextToSpeech(this, initListener);
  }

  @Override
  protected void onDestroy() {
    if (tts != null) {
      tts.shutdown();
    }
    super.onDestroy();
  }
}


We are excited to see this upgraded experience in your app!

Exercise or Games? Why Not Both!

Posted by Alice Ching, Google Engineer

We are pleased to announce the release of Games in Motion, an open source game sample to demonstrate how developers can make fun games using Google Fit and Android Wear. Do you ever go on a jog and feel like there is a lack of incentive to help you run better? What if you were a secret agent and had to use your speed and your nifty gadget to complete missions?

With Games in Motion, you can enhance your exercise with missions and actions on your Android Wear device, while logging your jogs to the cloud.

Games in Motion is written in Java programming language using Android Studio. It demonstrates multiple Android technologies.

  • Android Wear bridges notifications from a phone or tablet to a paired Android Wear device. The notifications are stacked so we can show multiple stats at the same time.
  • Google Fit API collects and processes fitness data and sessions. This allows us to use the fitness data to show user progress. All exercise sessions done in Games in Motion will be recorded to Google Fit as well.
  • Google Play Games Services is used to create and unlock achievements.
  • Several different Android audio APIs are integrated.
  • JUnit tests are present for the data-driven parser, which demonstrates how unit testing can be done within Android Studio.

You can download the latest open source release from GitHub. We hope to inspire similar Android games, where multiple different form factors are combined for a fun experience.