
Building a Mixed-Reality Tour Guide with Android XR, the Geospatial API, and Gemini

Posted by Coco Fatus, UX Designer; Alon Hetzroni, UX Engineer; Azin Mehrnoosh, Product Manager, Android XR




At this year's Google I/O, we announced an update for spatial experiences: the Geospatial API is now available as a preview in ARCore for Jetpack XR. By bringing Google's Visual Positioning System (VPS) to Android XR, the API lets you anchor digital content to the physical world with sub-meter accuracy and precise orientation in supported areas.* To explore what the Geospatial API could unlock, our team built a demo: the XR Geospatial Tour.

Imagine walking into a new city, putting on a pair of wired XR glasses (like the upcoming XREAL Project Aura), and instantly having a knowledgeable local guide show you around. You don't need to stare down at a 2D map. Instead, 3D models gently guide your path, and an intelligent voice tells you about the historical landmarks right in front of you. We combined the Geospatial API, the Gemini API through Firebase AI Logic, Google Maps Grounding, and the Jetpack XR SDK to create a hands-free, immersive walking tour experience.

*Disclaimer: Video and Tour Guide application are for demonstration purposes only. Some sequences have been shortened. Any hardware depicted may be under development; final product details may differ.

Let’s walk through the implementation details and show how we tied these APIs together to build a world-scale spatial experience.

1. Pinpointing the User with ARCore Geospatial API (VPS)

Navigation on XR improves when you combine GPS with the precision of VPS. The positional accuracy and precise orientation that VPS provides allow 3D waypoints to align with the physical world.

The Geospatial API on Android XR exposes this capability to your app. Using computer vision, VPS aims to provide a GeospatialPose (including latitude, longitude, and heading) that is more accurate than GPS alone.

Here's how we retrieve the user's geospatial pose by converting the device's current pose into geospatial coordinates:

// Retrieve the current geospatial pose from the ARCore session
val result = geospatial.createGeospatialPoseFromPose(arDevice.state.value.devicePose)
if (result is CreateGeospatialPoseFromPoseSuccess) {
    val pose = result.pose
    Log.d("VPS", "Accurate Location: ${pose.latitude}, ${pose.longitude}")
}

Because the entire experience relies on this accuracy, we monitor the horizontalAccuracy and orientationYawAccuracy until they meet our thresholds. If the user is indoors or in an unrecognized area, we prompt them to "walk to an outdoor public space and look around".
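
The threshold check described above could be sketched roughly as follows. This is a minimal illustration, not the demo's actual code: the PoseAccuracy type, the threshold values, and the assumption that the two accuracies are reported in meters and degrees are all hypothetical and should be checked against the Geospatial API reference.

```kotlin
// Hypothetical container for the two accuracy values named in the text.
// Units (meters, degrees) are an assumption for this sketch.
data class PoseAccuracy(
    val horizontalAccuracyMeters: Double,
    val orientationYawAccuracyDegrees: Double,
)

// Illustrative thresholds only; the demo's real values are not published.
const val MAX_HORIZONTAL_ERROR_METERS = 1.0
const val MAX_YAW_ERROR_DEGREES = 10.0

enum class LocalizationState { READY, NEEDS_BETTER_FIX }

// Returns READY only when both accuracy values are within the thresholds.
fun evaluate(accuracy: PoseAccuracy): LocalizationState =
    if (accuracy.horizontalAccuracyMeters <= MAX_HORIZONTAL_ERROR_METERS &&
        accuracy.orientationYawAccuracyDegrees <= MAX_YAW_ERROR_DEGREES
    ) {
        LocalizationState.READY
    } else {
        LocalizationState.NEEDS_BETTER_FIX
    }

// Returns the prompt to show the user, or null when no prompt is needed.
fun promptFor(state: LocalizationState): String? = when (state) {
    LocalizationState.READY -> null
    LocalizationState.NEEDS_BETTER_FIX -> "Walk to an outdoor public space and look around"
}
```

In an app, evaluate would be called each time a new pose arrives, and the tour would start only once it returns READY.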

2. Crafting the Itinerary with Gemini API & Google Maps Grounding

Once we have a location, we call the Gemini API through Firebase AI Logic and prompt the Gemini model to act as a local tour guide. We pass the user's coordinates to the model and ask it to return a structured JSON response containing nearby walking tours:

// Give the Google Maps tool the user's location and language as retrieval context
val configForTools = ToolConfig(
    functionCallingConfig = null,
    retrievalConfig = retrievalConfig {
        latLng = FirebaseLatLng(pose.latitude, pose.longitude)
        languageCode = "en"
    }
)
// Describe the JSON shape we expect: an intro plus a list of tours, each with stops
val responseJsonSchema = Schema.obj(
    mapOf(
        "locationIntro" to Schema.string(),
        "tours" to Schema.array(
            Schema.obj(
                mapOf(
                    "title" to Schema.string(),
                    "description" to Schema.string(),
                    "stops" to Schema.array(
                        Schema.obj(
                            mapOf(
                                "name" to Schema.string(),
                                "detailedName" to Schema.string(),
                                "description" to Schema.string()
                            )
                        )
                    )
                )
            )
        )
    )
)
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-3.5-flash",
    tools = listOf(Tool.googleMaps()),
    // Pass the tool config so the retrieval context defined above is actually used
    toolConfig = configForTools,
    generationConfig = generationConfig {
        responseMimeType = "application/json"
        responseSchema = responseJsonSchema
    }
)
val result = model.generateContent("The user is at latitude ${pose.latitude} and longitude ${pose.longitude}. Generate exactly 3 diverse tours near this location (e.g., historical, food, nature). All tour ideas should be walking distance only.")

Large Language Models are great at generating rich descriptions, but they can hallucinate place details such as exact latitude/longitude coordinates. To mitigate this, we used Google Maps Grounding, which grounds the model's answers in Google Maps data.
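
Once the JSON arrives, it helps to have typed models that mirror the response schema, plus a sanity check before the tours are shown. The sketch below is one possible shape, not the demo's code: the class names and the validate helper are hypothetical, and the JSON decoding step (for example with a serialization library) is deliberately left out.

```kotlin
// Hypothetical data classes mirroring the response schema fields.
data class TourStop(val name: String, val detailedName: String, val description: String)
data class Tour(val title: String, val description: String, val stops: List<TourStop>)
data class TourItinerary(val locationIntro: String, val tours: List<Tour>)

// Returns a list of human-readable problems; an empty list means the itinerary looks usable.
// The default of 3 tours matches the "exactly 3 diverse tours" request in the prompt.
fun validate(itinerary: TourItinerary, expectedTours: Int = 3): List<String> {
    val problems = mutableListOf<String>()
    if (itinerary.locationIntro.isBlank()) problems += "locationIntro is blank"
    if (itinerary.tours.size != expectedTours) {
        problems += "expected $expectedTours tours but got ${itinerary.tours.size}"
    }
    itinerary.tours.forEachIndexed { i, tour ->
        if (tour.title.isBlank()) problems += "tour $i has no title"
        if (tour.stops.isEmpty()) problems += "tour $i has no stops"
        tour.stops.forEachIndexed { j, stop ->
            if (stop.name.isBlank()) problems += "tour $i stop $j has no name"
        }
    }
    return problems
}
```

A schema constrains the shape of the output, but a check like this can still catch content-level issues such as an empty stop list before the UI renders it.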

3. A Voice to Guide You: Gemini 2.5 TTS

To make the tour guide feel truly present, we implemented dynamic voiceovers.

Using the gemini-2.5-flash-tts model, we can set the generation config so that the model natively returns audio data instead of text. Here’s how to request ResponseModality.AUDIO:

val ttsModel = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel(
        modelName = "gemini-2.5-flash-tts",
        generationConfig = generationConfig {
            // Instruct the model to return Audio
            responseModalities = listOf(ResponseModality.AUDIO)
        }
    )
val response = ttsModel.generateContent("Say in a neutral but positive voice:\n$prompt")
// Extract the raw audio bytes from the response
val audioBytes = response.candidates.firstOrNull()?.content?.parts
    ?.filterIsInstance<InlineDataPart>()
    ?.firstOrNull { it.mimeType.contains("audio") }?.inlineData
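
The snippet above stops at the raw audio bytes. Assuming the bytes are headerless 16-bit little-endian PCM at 24 kHz mono (an assumption to verify against the mimeType the model actually returns), one way to make them playable by standard audio tools is to prepend a WAV header. The pcmToWav helper below is hypothetical and not part of any SDK.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Wraps raw PCM samples in a minimal 44-byte WAV (RIFF) header.
// Defaults assume 24 kHz, mono, 16-bit audio; adjust if the response differs.
fun pcmToWav(
    pcm: ByteArray,
    sampleRate: Int = 24_000,
    channels: Int = 1,
    bitsPerSample: Int = 16,
): ByteArray {
    val blockAlign = channels * bitsPerSample / 8   // bytes per sample frame
    val byteRate = sampleRate * blockAlign          // bytes per second
    val header = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN)
    header.put("RIFF".toByteArray(Charsets.US_ASCII))
    header.putInt(36 + pcm.size)                    // remaining file size after this field
    header.put("WAVE".toByteArray(Charsets.US_ASCII))
    header.put("fmt ".toByteArray(Charsets.US_ASCII))
    header.putInt(16)                               // size of the fmt chunk for PCM
    header.putShort(1.toShort())                    // audio format 1 = uncompressed PCM
    header.putShort(channels.toShort())
    header.putInt(sampleRate)
    header.putInt(byteRate)
    header.putShort(blockAlign.toShort())
    header.putShort(bitsPerSample.toShort())
    header.put("data".toByteArray(Charsets.US_ASCII))
    header.putInt(pcm.size)                         // size of the sample data
    return header.array() + pcm
}
```

On Android, the raw PCM could alternatively be streamed straight to an audio player configured with the same sample rate and encoding, which avoids building the header at all.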

4. Bringing it to Life in 3D with Jetpack XR

The final piece of the puzzle is rendering this data in the user's field of view. The Jetpack XR SDK makes it intuitive to transition from a 2D Android UI to spatial computing.

We used Jetpack Compose for XR to build spatial components. To represent points of interest along the tour, we built a Composable called InfoSphere, which contains a GltfModel of a 3D orb that floats in space and can be interacted with to reveal information.

Using the Jetpack XR SDK, we can place 3D models alongside the Compose UI using SpatialBox and SceneCoreEntity. We also used InteractableComponent to respond to user taps.

@Composable
fun InfoSphere(
    content: InfoBubbleContent,
    session: Session,
    sphereModel: GltfModel,
    isSelected: Boolean,
    onClick: () -> Unit
) {
    // SpatialBox lets us arrange 3D components and SpatialPanels together
    SpatialBox(
        SubspaceModifier
            .offset(x = 2.dp, y = 1.dp, z = (-3).dp) // Positioned in 3D space
    ) {
        // Smoothly animate the visibility of our 2D Compose UI Panel
        AnimatedSpatialVisibility(visible = isSelected) {
            SpatialPanel {
                InfoBubble(content) // Regular 2D Compose UI
            }
        }
        // Render our interactive 3D sphere
        SceneCoreEntity(
            factory = {
                GltfModelEntity.create(session, sphereModel).also { entity ->
                    // Make the 3D model respond to user taps
                    entity.addComponent(InteractableComponent.create(session) { inputEvent ->
                        if (inputEvent.action == InputEvent.Action.UP) {
                            onClick()
                        }
                    })
                }
            }
        )
    }
}

By combining AnimatedSpatialVisibility for traditional Compose UI surfaces with SceneCoreEntity 3D elements, we're able to seamlessly blend data into the physical world.
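
The InfoSphere snippet takes isSelected and onClick from its caller but does not show how they are driven. One possible approach, sketched here as plain Kotlin with a hypothetical nextSelection helper, is to keep a single selected stop index in the parent and toggle it on each tap.

```kotlin
// Returns the new selection after a tap.
// Tapping the already selected sphere closes its panel (null);
// tapping any other sphere selects that one instead.
fun nextSelection(currentlySelected: Int?, tappedIndex: Int): Int? =
    if (currentlySelected == tappedIndex) null else tappedIndex
```

A parent composable could hold the index in its state, pass isSelected = (selected == index) to each InfoSphere, and update the state with nextSelection inside onClick, so at most one info panel is open at a time.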

Explore what’s possible with Android XR today

Building the XR Geospatial Tour app showed us that the barrier to entry for world-scale spatial experiences is lower than ever for Android developers. With the Geospatial API now available in preview on Android XR, your apps can seamlessly understand the physical world around them. By combining Compose for XR’s APIs with the high-precision location data of VPS and the generative capabilities of Gemini, we can create experiences that understand both where the user is and what they are looking at.

To help you get hands-on with Android XR, we are thrilled to open applications for the Android XR Developer Catalyst Program, which includes XREAL Project Aura. Starting today, you can apply to get access to an XREAL Project Aura devkit or our display glasses devkit over the coming months!

*Disclaimer: Available on select devices. Internet connection required. Works on compatible apps and surfaces. Results may vary.


Custom event colors in Google Calendar

Google Calendar is expanding its event coloring options beyond the current limit of 11 predefined colors. Going forward, an expanded color palette lets users personalize events and visually organize their calendars, with up to 200 custom colors per user for individual events. The colors are available in the web and mobile apps as well as through the Calendar API. This fulfills a long-standing feature request from both business and personal users for more customizable options.

Users will be able to select from 24 default colors. On Calendar on the web (or via API), users can define additional colors by using a full RGB color picker.

UI showing how to select an event color in Calendar

Getting started

  • Admins: There is no admin control for this feature.
  • End users: This feature will be ON by default and can be customized by the user. Visit the Help Center to learn more.

Rollout pace

  • Rapid Release domains: Extended rollout (potentially longer than 15 days for feature visibility) starting on June 17, 2026
  • Scheduled Release domains: Extended rollout (potentially longer than 15 days for feature visibility) starting on June 29, 2026

Availability

  • Available to all Google Workspace customers, Workspace Individual subscribers, and users with personal Google accounts

Resources

Early Stable Update for Desktop

The Stable channel has been updated to 150.0.7871.24/.25 for Windows and Mac as part of our early stable release to a small percentage of users. A full list of changes in this build is available in the log.

You can find more details about early Stable releases here.

Interested in switching release channels? Find out how here. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.


Daniel Yip

Google Chrome

Chrome Beta for Desktop Update

The Beta channel has been updated to 150.0.7871.24 for Windows, Mac and Linux.

A partial list of changes is available in the Git log. Interested in switching release channels? Find out how. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.

Chrome Release Team
Google Chrome

Enhanced AI avatar features and capabilities in Google Vids

With the integration of Gemini 3.1 Flash Text-To-Speech (TTS) and the latest capabilities in Veo 3.1, AI avatars in Google Vids have become more realistic and expressive than ever. We’re excited to announce expanded language support, a new collection of avatar defaults, and the ability to direct your custom avatars to take action in any generated video.

Expanded preset avatars with more expressive speaking


Several examples of realistic avatars

Several examples of 3D cartoon avatars

Samples of new default avatars

We’ve expanded avatar options from 23 to 53 default presets, spanning photorealistic, 3D cartoon, and graphic novel styles. This broader gallery includes avatars like Sofia, Jack, Charlie, Finley, and Eleanor, all powered by Gemini Audio to speak with greater expression, realism, and a more conversational tone.

Expanded speaker support to 24 languages


List of supported languages

Full list of language support in Google Vids UI

We’ve added support for 16 new languages, including Hindi, Bengali, Marathi, Tamil, Telugu, Arabic, Indonesian, Russian, Dutch, Polish, Thai, Turkish, Vietnamese, Romanian, and Ukrainian. These languages join our existing set (English, Spanish, Portuguese, Japanese, Korean, French, Italian, and German), bringing the total to 24 supported languages for AI avatars and voiceovers.

Create custom avatars with the latest Gemini voice model


UI for creating a custom avatar, including box to add a name
User experience of assigning new voices to custom avatars

With custom avatars in Vids, you can design your avatar using Nano Banana Pro. Starting today, you can choose from 30+ voices powered by Gemini Audio that offer increased expression, language support, and steering control over how they speak.

Direct your custom avatars to take action in addition to speaking



Generated sample of a directed custom avatar

Previously, users could only direct default avatars. Now, you can add custom avatars as an ingredient in generated video clips, unlocking new ways to have your customized spokesperson tell your story. Every generation preserves your custom avatar's appearance and voice, so your customizations carry through to each new clip.

  • Control custom actions: Instruct your avatar to walk, talk, and use objects simply by typing a text prompt describing their actions.
  • Use image references: Upload additional images to direct your avatar in customized locations or with branded logos.

Getting started

Rollout pace

Availability

  • Business: Business Starter, Standard, and Plus
  • Enterprise: Enterprise Starter, Standard, and Plus
  • Education: Education Plus
  • Consumer: All users with personal Google accounts, including Google AI Pro and Ultra
  • Other Editions: Enterprise Essentials and Enterprise Essentials Plus; Nonprofits; Individual
  • Education Add-ons: Teaching and Learning; Google AI Pro for Education
  • Other Add-ons: AI Expanded Access*
*Users with AI Expanded Access add-on licenses have higher limits on usage of AI avatars in Vids.

Resources

Create longer Veo videos and generate multiple at once in Google Vids

Starting today, we’re introducing powerful new ways to create and iterate on video content in Google Vids using Veo. These updates provide all Vids users with the ability to create longer videos with consistent characters and generate multiple videos in parallel, enabling you to bring your vision to life faster than ever before.

  • Longer Veo videos: You can now extend existing video clips using Veo to create longer, more immersive content while ensuring perfect storytelling continuity across your scenes.
  • Generate multiple clips at once: Increase your productivity by kicking off multiple video generation requests at once, allowing you to explore different styles and prompts simultaneously to save time.


Extend video clips using Veo

Getting started

Rollout pace

Availability

  • Business: Business Starter, Standard, and Plus
  • Enterprise: Enterprise Starter, Standard, and Plus
  • Education: Education Plus
  • Consumer: All users with personal Google accounts, including Google AI Pro and Ultra
  • Other Editions: Enterprise Essentials and Enterprise Essentials Plus; Nonprofits; Individual
  • Education Add-ons: Teaching and Learning; Google AI Pro for Education
  • Other Add-ons: AI Expanded Access*
*Users with AI Expanded Access add-on licenses have higher limits on video generation using Veo in Vids.

Resources