Tag Archives: Audio

Omnitone: Spatial audio on the web


Spatial audio is a key element of an immersive virtual reality (VR) experience. By bringing spatial audio to the web, the browser can be transformed into a complete VR media player with incredible reach and engagement. That’s why the Chrome WebAudio team has created and is releasing the Omnitone project, an open source spatial audio renderer with cross-browser support.

Our challenge was to introduce the audio spatialization technique called ambisonics so that users can hear full-sphere surround sound in the browser. To achieve this, we implemented ambisonic decoding with binaural rendering using web technology. There are several paths for introducing a new feature into the web platform, but we chose to use only the Web Audio API. In doing so, we can reach a larger audience with this cross-browser technology, and we can also avoid the lengthy standardization process required to introduce a new Web Audio component. This is possible because the Web Audio API provides all the necessary building blocks for this audio spatialization technique.



Omnitone Audio Processing Diagram

An AmbiX-format recording, the input to the Omnitone decoder, contains 4 channels of audio encoded using ambisonics, which can then be decoded into an arbitrary speaker setup. Instead of an actual speaker array, Omnitone uses 8 virtual speakers and head-related transfer function (HRTF) convolution to render the final audio stream binaurally. This binaurally rendered audio can convey a sense of space when heard through headphones.
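Omnitone itself is JavaScript built on the Web Audio API, but the heart of the technique is easy to sketch. The following is a minimal illustration, not Omnitone’s actual code: each virtual speaker’s feed is convolved with the left and right HRTFs measured for that speaker’s direction, and the eight results are summed into a stereo pair.

    // Illustrative sketch of HRTF-based binaural rendering.
    // speakerFeeds[s] is the decoded signal for virtual speaker s;
    // hrtfLeft[s]/hrtfRight[s] are the HRTF impulse responses for
    // that speaker's direction.
    static float[][] renderBinaural(float[][] speakerFeeds,
                                    float[][] hrtfLeft,
                                    float[][] hrtfRight) {
      int numSamples = speakerFeeds[0].length;
      float[] left = new float[numSamples];
      float[] right = new float[numSamples];
      for (int s = 0; s < speakerFeeds.length; s++) {
        accumulateConvolution(left, speakerFeeds[s], hrtfLeft[s]);
        accumulateConvolution(right, speakerFeeds[s], hrtfRight[s]);
      }
      return new float[][] { left, right };
    }

    // out[i] += sum over k of x[i - k] * h[k], truncated to out.length.
    static void accumulateConvolution(float[] out, float[] x, float[] h) {
      for (int i = 0; i < out.length; i++) {
        for (int k = 0; k < h.length && k <= i; k++) {
          out[i] += x[i - k] * h[k];
        }
      }
    }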

The beauty of this mechanism lies in the sound-field rotation applied to the incoming spatial audio stream. The orientation sensor of a VR headset or a smartphone can be linked to Omnitone’s decoder to seamlessly rotate the entire sound field. The rest of the spatialization process will be handled automatically by Omnitone. A live demo can be found at the project landing page.
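For first-order ambisonics, rotating the sound field is inexpensive: the omnidirectional W channel is left untouched, and the three directional components are multiplied by a 3x3 rotation matrix derived from the listener’s head orientation. Here is a minimal per-frame sketch (illustrative only; for clarity it assumes a W/X/Y/Z channel order, whereas AmbiX actually stores the channels in W/Y/Z/X order):

    // Rotate one frame of first-order ambisonic audio in place.
    // wxyz holds the four channel samples; rot is a 3x3 rotation
    // matrix from the orientation sensor. W (index 0) is
    // omnidirectional and needs no rotation.
    static void rotateFoaFrame(float[] wxyz, float[][] rot) {
      float x = wxyz[1], y = wxyz[2], z = wxyz[3];
      wxyz[1] = rot[0][0] * x + rot[0][1] * y + rot[0][2] * z;
      wxyz[2] = rot[1][0] * x + rot[1][1] * y + rot[1][2] * z;
      wxyz[3] = rot[2][0] * x + rot[2][1] * y + rot[2][2] * z;
    }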

Throughout the project, we worked closely with the Google VR team for their VR audio expertise. Not only was their knowledge of spatial audio a tremendous help for the project, but the collaboration also ensured identical audio spatialization across all of Google’s VR applications, both on the web and on Android (e.g. the Google VR SDK and the YouTube Android app). The Spatial Media Specification and HRTF sets are great examples of the Google VR team’s efforts, and Omnitone is built on top of both.

With emerging web-based VR projects like WebVR, Omnitone’s audio spatialization can play a critical role in a more immersive VR experience on the web. Web-based VR applications will also benefit from high-quality streaming spatial audio, as the Chrome Media team has recently added first-order ambisonics (FOA) compression to the open source audio codec Opus. More exciting things like VR view integration, higher-order ambisonics, and mobile web support are coming soon to Omnitone.

We look forward to seeing what people do with Omnitone now that it's open source. Feel free to reach out to us or leave a comment with your thoughts and feedback on the issue tracker on GitHub.

By Hongchan Choi and Raymond Toy, Chrome Team

Note: Due to incomplete implementations of multichannel audio decoding in various browsers, Omnitone does not support the mobile web at the time of writing.

A new method to measure touch and audio latency

Posted by Mark Koudritsky, software engineer

There is a new addition in the arsenal of instruments used by Android and ChromeOS teams in the battle to measure and minimize touch and audio latency: the WALT Latency Timer.

When you use a mobile device, you expect it to respond instantly to your touch or voice: the more immediate the response, the more you feel directly connected to the device. Over the past few years, we have been trying to measure, understand, and reduce latency in our Chromebook and Android products.

Before we can reduce latency, we must first understand where it comes from. In the case of tapping a touchscreen, the time for a response includes the touch-sensing hardware and driver, the application, and the display and graphics output. For a voice command, there is time spent in sampling input audio, the application, and in audio output. Sometimes we have a mixture of these (for example, a piano app would include touch input and audio output).

Most previous work on studying latency has focused on measuring a single round-trip latency number. For example, to measure audio latency, an app would measure the time from app to speaker/mic and back to the app using the Dr. Rick O’Rang loopback audio dongle together with an appropriate app such as the Dr. Rick O’Rang Loopback app or the Superpowered Mobile Audio Latency Test App. Similarly, the TouchBot uses a fast camera to measure the round-trip delay from physical touch until a change on the screen is visible. While valuable, such setups make it very difficult to break the latency down into input and output components.

An important innovation in WALT (a descendant of QuickStep) is that it synchronizes an external hardware clock with the Android device or Chromebook to within a millisecond. This allows it to measure input and output latencies separately as opposed to measuring a round-trip latency.
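In other words, once the device and the external hardware agree on a time base to within about a millisecond, each leg of the pipeline can be timed on its own. The arithmetic is simple (a toy illustration with hypothetical names, not the actual WALT API):

    // Toy illustration: both timestamps are assumed to already be on
    // the shared time base established during clock synchronization.
    static long inputLatencyMillis(long physicalTouchMillis,  // external clock
                                   long kernelEventMillis) {  // device clock
      // With a shared time base, the input leg is isolated directly;
      // no round trip is needed.
      return kernelEventMillis - physicalTouchMillis;
    }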

WALT is simple. The parts cost less than $50 and with some basic hobby electronics skills, you can build it yourself.

We’ve been using WALT within Google for Nexus and Chromebook development. We’re now opening this tool to app developers and anyone who wants to precisely measure real-world latencies. We hope that having readily accessible tools will help the industry as a whole improve and make all our devices more responsive to touch and voice.

Exercise or Games? Why Not Both!

Posted by Alice Ching, Google Engineer

We are pleased to announce the release of Games in Motion, an open source game sample to demonstrate how developers can make fun games using Google Fit and Android Wear. Do you ever go on a jog and feel like there is a lack of incentive to help you run better? What if you were a secret agent and had to use your speed and your nifty gadget to complete missions?

With Games in Motion, you can enhance your exercise with missions and actions on your Android Wear device, while logging your jogs to the cloud.

Games in Motion is written in the Java programming language using Android Studio. It demonstrates multiple Android technologies:

  • Android Wear bridges notifications from a phone or tablet to a paired Android Wear device. The notifications are stacked so we can show multiple stats at the same time (see the sketch after this list).
  • Google Fit API collects and processes fitness data and sessions. This allows us to use the fitness data to show user progress. All exercise sessions done in Games in Motion will be recorded to Google Fit as well.
  • Google Play Games Services is used to create and unlock achievements.
  • Several different Android audio APIs are integrated.
  • JUnit tests are present for the data-driven parser, which demonstrates how unit testing can be done within Android Studio.
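As an example of the first item, the stacked notifications on the watch can be produced with the standard support-library notification APIs. The sketch below is illustrative rather than the actual Games in Motion code; the class name, group key, and placeholder icon are all arbitrary choices:

    import android.content.Context;
    import android.support.v4.app.NotificationCompat;
    import android.support.v4.app.NotificationManagerCompat;

    public class StatsNotifier {
      private static final String GROUP_KEY_STATS = "games_in_motion_stats";

      // Post one stat as a notification. Notifications that share a
      // group key are stacked together on a paired Android Wear device.
      public static void postStat(Context context, int id, String title,
                                  String text) {
        NotificationCompat.Builder builder =
            new NotificationCompat.Builder(context)
                .setSmallIcon(android.R.drawable.stat_notify_chat)
                .setContentTitle(title)
                .setContentText(text)
                .setGroup(GROUP_KEY_STATS);
        NotificationManagerCompat.from(context).notify(id, builder.build());
      }
    }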

You can download the latest open source release from GitHub. We hope to inspire similar Android games, where multiple different form factors are combined for a fun experience.

Respecting Audio Focus

Posted by Kristan Uccello, Google Developer Relations

It’s rude to talk during a presentation: it disrespects the speaker and annoys the audience. If your application doesn’t respect the rules of audio focus, then it’s disrespecting other applications and annoying the user. If you have never heard of audio focus, you should take a look at the Android developer training material.

With multiple apps potentially playing audio it's important to think about how they should interact. To avoid every music app playing at the same time, Android uses audio focus to moderate audio playback—your app should only play audio when it holds audio focus. This post provides some tips on how to handle changes in audio focus properly, to ensure the best possible experience for the user.

Requesting audio focus

Audio focus should not be requested when your application starts (don’t get greedy); instead, delay requesting it until your application is about to do something with an audio stream. By requesting audio focus through the AudioManager system service, an application can use one of the AUDIOFOCUS_GAIN* constants (see Table 1) to indicate the desired level of focus.

Listing 1. Requesting audio focus.

1. AudioManager am = (AudioManager) mContext.getSystemService(Context.AUDIO_SERVICE);
2.     
3.  int result = am.requestAudioFocus(mOnAudioFocusChangeListener,
4.    // Hint: the music stream.
5.    AudioManager.STREAM_MUSIC,
6.    // Request permanent focus.
7.    AudioManager.AUDIOFOCUS_GAIN);
8.  if (result == AudioManager.AUDIOFOCUS_REQUEST_GRANTED) {
9.    mState.audioFocusGranted = true;
10. } else if (result == AudioManager.AUDIOFOCUS_REQUEST_FAILED) {
11.   mState.audioFocusGranted = false;
12. }

In line 7 above, you can see that we have requested permanent audio focus. An application could instead request transient focus using AUDIOFOCUS_GAIN_TRANSIENT, which is appropriate when using the audio system for less than 45 seconds.

Alternatively, the app could use AUDIOFOCUS_GAIN_TRANSIENT_MAY_DUCK, which is appropriate when the use of the audio system may be shared with another application that is currently playing audio (e.g. for playing a "keep it up" prompt in a fitness application and expecting background music to duck during the prompt). The app requesting AUDIOFOCUS_GAIN_TRANSIENT_MAY_DUCK should not use the audio system for more than 15 seconds before releasing focus.
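For example, a fitness app about to speak a short prompt might request focus as follows. This is a minimal sketch that reuses the am and mOnAudioFocusChangeListener objects from Listing 1; playPrompt() is a hypothetical helper:

    // Request transient focus with ducking for a short spoken prompt.
    int result = am.requestAudioFocus(mOnAudioFocusChangeListener,
        AudioManager.STREAM_MUSIC,
        AudioManager.AUDIOFOCUS_GAIN_TRANSIENT_MAY_DUCK);
    if (result == AudioManager.AUDIOFOCUS_REQUEST_GRANTED) {
      playPrompt(); // hypothetical helper that plays the short clip
    }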

Handling audio focus changes

In order to handle audio focus change events, an application should create an instance of OnAudioFocusChangeListener. In the listener, the application will need to handle the AUDIOFOCUS_GAIN* and AUDIOFOCUS_LOSS* events (see Table 1). It should be noted that AUDIOFOCUS_GAIN has some nuances, which are highlighted in Listing 2, below.

Listing 2. Handling audio focus changes.

1.  mOnAudioFocusChangeListener = new AudioManager.OnAudioFocusChangeListener() {
2.
3.    @Override
4.    public void onAudioFocusChange(int focusChange) {
5.      switch (focusChange) {
6.        case AudioManager.AUDIOFOCUS_GAIN:
7.          mState.audioFocusGranted = true;
8.
9.          if (mState.released) {
10.           initializeMediaPlayer();
11.         }
12.
13.         switch (mState.lastKnownAudioFocusState) {
14.           case UNKNOWN:
15.             if (mState.state == PlayState.PLAY && !mPlayer.isPlaying()) {
16.               mPlayer.start();
17.             }
18.             break;
19.           case AudioManager.AUDIOFOCUS_LOSS_TRANSIENT:
20.             if (mState.wasPlayingWhenTransientLoss) {
21.               mPlayer.start();
22.             }
23.             break;
24.           case AudioManager.AUDIOFOCUS_LOSS_TRANSIENT_CAN_DUCK:
25.             restoreVolume();
26.             break;
27.         }
28.
29.         break;
30.       case AudioManager.AUDIOFOCUS_LOSS:
31.         mState.userInitiatedState = false;
32.         mState.audioFocusGranted = false;
33.         teardown();
34.         break;
35.       case AudioManager.AUDIOFOCUS_LOSS_TRANSIENT:
36.         mState.userInitiatedState = false;
37.         mState.audioFocusGranted = false;
38.         mState.wasPlayingWhenTransientLoss = mPlayer.isPlaying();
39.         mPlayer.pause();
40.         break;
41.       case AudioManager.AUDIOFOCUS_LOSS_TRANSIENT_CAN_DUCK:
42.         mState.userInitiatedState = false;
43.         mState.audioFocusGranted = false;
44.         lowerVolume();
45.         break;
46.     }
47.     mState.lastKnownAudioFocusState = focusChange;
48.   }
49. };

AUDIOFOCUS_GAIN is used in two distinct scopes of an application’s code. First, it can be used when requesting audio focus, as shown in Listing 1. This does NOT translate to an event for the registered OnAudioFocusChangeListener; on a successful audio focus request, the listener will NOT receive an AUDIOFOCUS_GAIN event for the registration.

AUDIOFOCUS_GAIN is also used in the implementation of an OnAudioFocusChangeListener as an event condition. As stated above, the AUDIOFOCUS_GAIN event will not be triggered on audio focus requests. Instead, the AUDIOFOCUS_GAIN event will occur only after an AUDIOFOCUS_LOSS* event has occurred. This is the only constant in the set shown in Table 1 that is used in both scopes.

There are four cases that need to be handled by the focus change listener. When the application receives an AUDIOFOCUS_LOSS, this usually means it will not be getting its focus back. In this case the app should release assets associated with the audio system and stop playback. As an example, imagine a user is playing music using an app and then launches a game which takes audio focus away from the music app. There is no predictable time for when the user will exit the game. More likely, the user will navigate to the home launcher (leaving the game in the background) and launch yet another application or return to the music app, causing a resume, which would then request audio focus again.

However, another case exists that warrants some discussion. There is a difference between losing audio focus permanently (as described above) and temporarily. When an application receives an AUDIOFOCUS_LOSS_TRANSIENT, it should suspend its use of the audio system until it receives an AUDIOFOCUS_GAIN event. When the AUDIOFOCUS_LOSS_TRANSIENT occurs, the application should make a note that the loss is temporary; that way, on audio focus gain, it can reason about what the correct behavior should be (see lines 13-27 of Listing 2).

Sometimes an app loses audio focus (receives an AUDIOFOCUS_LOSS) and the interrupting application terminates or otherwise abandons audio focus. In this case the last application that had audio focus may receive an AUDIOFOCUS_GAIN event. On the subsequent AUDIOFOCUS_GAIN event, the app should check whether it is receiving the gain after a temporary loss, in which case it can resume use of the audio system, or whether it is recovering from a permanent loss, in which case it should set up for playback again.

If an application will only be using the audio capabilities for a short time (less than 45 seconds), it should use an AUDIOFOCUS_GAIN_TRANSIENT focus request and abandon focus after it has completed its playback or capture. Audio focus is handled as a stack by the system; as such, the last process to request audio focus wins.
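Abandoning focus is a single call on the AudioManager, made with the same listener that requested it. For a transient request, it belongs right after the short clip finishes; a sketch, reusing the objects from Listing 1:

    // Give focus back as soon as the transient playback completes so
    // the previous focus holder can resume or un-duck.
    mPlayer.setOnCompletionListener(new MediaPlayer.OnCompletionListener() {
      @Override
      public void onCompletion(MediaPlayer mp) {
        am.abandonAudioFocus(mOnAudioFocusChangeListener);
      }
    });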

When audio focus has been gained, that is the appropriate time to create a MediaPlayer or MediaRecorder instance and allocate resources. Likewise, when an app receives AUDIOFOCUS_LOSS, it is good practice to clean up any resources it has allocated. Gaining audio focus has three possibilities, which correspond to the three audio focus loss cases in Table 1. It is good practice to always explicitly handle all the loss cases in the OnAudioFocusChangeListener.

Table 1. Audio focus gain and loss implication.

GAIN                                  LOSS
AUDIOFOCUS_GAIN                       AUDIOFOCUS_LOSS
AUDIOFOCUS_GAIN_TRANSIENT             AUDIOFOCUS_LOSS_TRANSIENT
AUDIOFOCUS_GAIN_TRANSIENT_MAY_DUCK    AUDIOFOCUS_LOSS_TRANSIENT_CAN_DUCK

Note: AUDIOFOCUS_GAIN is used in two places. When requesting audio focus, it is passed in as a hint to the AudioManager, and it is used as an event case in the OnAudioFocusChangeListener. The gain constants in the left column are only used when requesting audio focus. The loss events are only used in the OnAudioFocusChangeListener.

Table 2. Audio stream types.

Stream Type           Description
STREAM_ALARM          The audio stream for alarms
STREAM_DTMF           The audio stream for DTMF tones
STREAM_MUSIC          The audio stream for "media" (music, podcast, videos) playback
STREAM_NOTIFICATION   The audio stream for notifications
STREAM_RING           The audio stream for the phone ring
STREAM_SYSTEM         The audio stream for system sounds

An app requests audio focus (see an example in the sample source code linked below) from the AudioManager (Listing 1, lines 1-7). The three arguments it provides are an audio focus change listener object (optional), a hint as to which audio stream type to use (see Table 2; most apps should use STREAM_MUSIC), and the type of audio focus from Table 1, column 1. Only if audio focus is granted by the system (AUDIOFOCUS_REQUEST_GRANTED) should the app handle any initialization (see Listing 1, line 9).

Note: The system will not grant audio focus (AUDIOFOCUS_REQUEST_FAILED) if there is a phone call currently in progress, and the application will not receive AUDIOFOCUS_GAIN after the call ends.

Table 3 summarizes what an application should do when its OnAudioFocusChangeListener receives an onAudioFocusChange() event.

In the cases of losing audio focus be sure to check that the loss is in fact final. If the app receives an AUDIOFOCUS_LOSS_TRANSIENT or AUDIOFOCUS_LOSS_TRANSIENT_CAN_DUCK it can hold onto the media resources it has created (don’t call release()) as there will likely be another audio focus change event very soon thereafter. The app should take note that it has received a transient loss using some sort of state flag or simple state machine.

If an application requests permanent audio focus with AUDIOFOCUS_GAIN and then receives an AUDIOFOCUS_LOSS_TRANSIENT_CAN_DUCK, an appropriate action is to lower its stream volume (making sure to store the original volume state somewhere) and then raise the volume again upon receiving an AUDIOFOCUS_GAIN event (see Figure 1, below).
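The lowerVolume() and restoreVolume() helpers called in Listing 2 could be implemented along these lines (a sketch; the 0.1 duck factor is an arbitrary choice, and mPlayer is the MediaPlayer used elsewhere in the listing):

    private static final float DUCK_VOLUME = 0.1f; // arbitrary duck factor

    private void lowerVolume() {
      if (mPlayer != null) {
        mPlayer.setVolume(DUCK_VOLUME, DUCK_VOLUME); // left, right
      }
    }

    private void restoreVolume() {
      if (mPlayer != null) {
        mPlayer.setVolume(1.0f, 1.0f); // back to full volume
      }
    }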

Table 3. Appropriate actions by focus change type.

Focus Change Type                     Appropriate Action
AUDIOFOCUS_GAIN                       Gain event after a loss event: resume playback of media unless other state flags set by the application indicate otherwise (for example, the user paused the media prior to the loss event).
AUDIOFOCUS_LOSS                       Stop playback. Release assets.
AUDIOFOCUS_LOSS_TRANSIENT             Pause playback and keep a state flag noting that the loss is transient, so that when the AUDIOFOCUS_GAIN event occurs you can resume playback if appropriate. Do not release assets.
AUDIOFOCUS_LOSS_TRANSIENT_CAN_DUCK    Lower the volume or pause playback, keeping track of state as with AUDIOFOCUS_LOSS_TRANSIENT. Do not release assets.

Conclusion and further reading

Understanding how to be a good audio citizen on an Android device means respecting the system’s audio focus rules and handling each case appropriately. Try to make your application behave in a consistent manner and not negatively surprise the user. There is much more that could be said about the audio system on Android; the material below offers some additional reading.

Example source code is available here:

https://android.googlesource.com/platform/development/+/master/samples/RandomMusicPlayer