
Agentic AI takes Gemini in Android Studio to the next level

Posted by Sandhya Mohan – Product Manager, and Jose Alcérreca – Developer Relations Engineer

Software development is undergoing a significant evolution, moving beyond reactive assistants to intelligent agents. These agents don't just offer suggestions; they can create execution plans, utilize external tools, and make complex, multi-file changes. This results in a more capable AI that can iteratively solve challenging problems, fundamentally changing how developers work.

At Google I/O 2025, we offered a glimpse into our work on agentic AI in Android Studio, the integrated development environment (IDE) focused on Android development. We showcased how combining agentic AI with the built-in portfolio of tools inside Android Studio lets the IDE assist you in developing Android apps in ways that were never possible before. We are now incredibly excited to announce the next frontier in Android development with the availability of 'Agent Mode' for Gemini in Android Studio.

These features are available in the latest Android Studio Narwhal Feature Drop Canary release, and will be rolled out to business tier subscribers in the coming days. As with all new Android Studio features, we invite developers to provide feedback to direct our development efforts and ensure we are creating the tools you need to build better apps, faster.

Agent Mode

Gemini in Android Studio's Agent Mode is a new experimental capability designed to handle complex development tasks that go beyond what you can experience by just chatting with Gemini.

With Agent Mode, you can describe a complex goal in natural language — from generating unit tests to complex refactors — and the agent formulates an execution plan that can span multiple files in your project and executes under your direction. Agent Mode uses a range of IDE tools for reading and modifying code, building the project, searching the codebase and more to help Gemini complete complex tasks from start to finish with minimal oversight from you.

To use Agent Mode, click Gemini in the sidebar, then select the Agent tab, and describe a task you'd like the agent to perform. Some examples of tasks you can try in Agent Mode include:

    • Build my project and fix any errors
    • Extract any hardcoded strings used across my project and migrate to strings.xml
    • Add support for dark mode to my application
    • Given an attached screenshot, implement a new screen in my application using Material 3

The agent then suggests edits and iteratively fixes bugs to complete tasks. You can review, accept, or reject the proposed changes along the way, and ask the agent to iterate on your feedback.

Gemini breaks tasks down into a plan with simple steps. It also shows the list of IDE tools it needs to complete each step.

While Agent Mode is powerful, you remain firmly in control, with the ability to review, refine, and guide the agent's output at every step. When the agent proposes code changes, you can choose to accept or reject them.

The Agent waits for the developer to approve or reject a change.

Additionally, you can enable “Auto-approve” if you are feeling lucky 😎 — especially useful when you want to iterate on ideas as rapidly as possible.

You can delegate routine, time-consuming work to the agent, freeing up your time for more creative, high-value work. Try out Agent Mode in the latest preview version of Android Studio – we look forward to seeing what you build! We are investing in building more agentic experiences for Gemini in Android Studio to make your development even more intuitive, so you can expect to see more agentic functionality over the next several releases.

Gemini is capable of understanding the context of your app.

Supercharge Agent Mode with your Gemini API key

screenshot of Gemini API key prompt in Android Studio

The default Gemini model has a generous no-cost daily quota with a limited context window. However, you can now add your own Gemini API key to expand Agent Mode's context window to a massive 1 million tokens with Gemini 2.5 Pro.

A larger context window lets you send more instructions, code and attachments to Gemini, leading to even higher quality responses. This is especially useful when working with agents, as the larger context provides Gemini 2.5 Pro with the ability to reason about complex or long-running tasks.

Add your API key in the Gemini settings

To enable this feature, get a Gemini API key from Google AI Studio: sign in and click the “Get API key” button. Then, back in Android Studio, open the settings via File > Settings > Tools > Gemini (Android Studio > Settings on macOS) and enter your Gemini API key. Relaunch Gemini in Android Studio to get even better responses from Agent Mode.

Be sure to safeguard your Gemini API key, as additional charges apply for Gemini API usage associated with a personal API key. You can monitor your Gemini API key usage by navigating to AI Studio and selecting Get API key > Usage & Billing.

Note that business tier subscribers already get access to Gemini 2.5 Pro and the expanded context window automatically with their Gemini Code Assist license, so these developers will not see an API key option.

Model Context Protocol (MCP)

Gemini in Android Studio's Agent Mode can now interact with external tools via the Model Context Protocol (MCP). This feature provides a standardized way for Agent Mode to use external tools, extending its knowledge and capabilities beyond the IDE.

There are many tools you can connect to the MCP Host in Android Studio. For example, you could integrate with the GitHub MCP Server to create pull requests directly from Android Studio.

In this initial release of MCP support in the IDE, you configure your MCP servers through an mcp.json file placed in Android Studio's configuration directory, using the following format:

{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-memory"
      ]
    },
    "sequential-thinking": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-sequential-thinking"
      ]
    },
    "github": {
      "command": "docker",
      "args": [
        "run",
        "-i",
        "--rm",
        "-e",
        "GITHUB_PERSONAL_ACCESS_TOKEN",
        "ghcr.io/github/github-mcp-server"
      ],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "<YOUR_TOKEN>"
      }
    }
  }  
}
Example mcp.json configuration with three MCP servers

For this initial release, we support interacting with external tools via the stdio transport as defined in the MCP specification. We plan to support the full suite of MCP features in upcoming Android Studio releases, including the Streamable HTTP transport, external context resources, and prompt templates.

For more information on how to use MCP in Studio, including the mcp.json configuration file format, please refer to the Android Studio MCP Host documentation.

By delegating routine tasks to Gemini through Agent Mode, you’ll be able to focus on more innovative and enjoyable aspects of app development. Download the latest preview version of Android Studio on the canary release channel today to try it out, and let us know how much faster app development is for you!

As always, your feedback is important to us – check known issues, report bugs, suggest improvements, and be part of our vibrant community on LinkedIn, Medium, YouTube, or X. Let's build the future of Android apps together!

Top 3 things to know for AI on Android at Google I/O ‘25

Posted by Kateryna Semenova – Sr. Developer Relations Engineer

AI is reshaping how users interact with their favorite apps, opening new avenues for developers to create intelligent experiences. At Google I/O, we showcased how Android is making it easier than ever for you to build smart, personalized and creative apps. And we’re committed to providing you with the tools needed to innovate across the full development stack in this evolving landscape.

This year, we focused on making AI accessible across the spectrum, from on-device processing to cloud-powered capabilities. Here are the top 3 announcements you need to know for building with AI on Android from Google I/O ‘25:

#1 Leverage the efficiency of Gemini Nano for on-device AI experiences

For on-device AI, we announced a new set of ML Kit GenAI APIs powered by Gemini Nano, our most efficient and compact model designed and optimized for running directly on mobile devices. These APIs provide high-level, easy integration for common tasks including text summarization, proofreading, rewriting content in different styles, and generating image descriptions. Building on-device offers significant benefits such as local data processing and offline availability at no additional cost for inference. To start integrating these solutions, explore the ML Kit GenAI documentation, the sample on GitHub, and watch the "Gemini Nano on Android: Building with on-device GenAI" talk.

#2 Seamlessly integrate on-device ML/AI with your own custom models

The Google AI Edge platform enables building and deploying a wide range of pretrained and custom models on edge devices and supports various frameworks like TensorFlow, PyTorch, Keras, and JAX, allowing for more customization in apps. The platform now also offers improved support for on-device hardware accelerators and a new AI Edge Portal service for broad coverage of on-device benchmarking and evals. If you are looking for GenAI language models on devices where Gemini Nano is not available, you can use other open models via the MediaPipe LLM Inference API.
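
If you want to try an open model through the MediaPipe LLM Inference API, the general shape of the integration looks like the sketch below. This is a minimal outline, not a drop-in snippet: the model path is a placeholder, you have to supply a compatible model file yourself, and option names can shift between releases, so verify against the current MediaPipe documentation.

import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Minimal sketch: load an on-device open model and run a single prompt.
// The model file path is a placeholder; bundle or download a compatible
// model yourself before calling this.
fun runLocalLlm(context: Context, prompt: String): String {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/model.task")
        .setMaxTokens(512)
        .build()

    val llmInference = LlmInference.createFromOptions(context, options)
    return llmInference.generateResponse(prompt)
}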

Serving your own custom models on-device can pose challenges related to handling large model downloads and updates, impacting the user experience. To improve this, we’ve launched Play for On-Device AI in beta. This service is designed to help developers manage custom model downloads efficiently, ensuring the right model size and speed are delivered to each Android device precisely when needed.

For more information, watch the “Small language models with Google AI Edge” talk.

#3 Power your Android apps with Gemini Flash, Pro and Imagen using Firebase AI Logic

For more advanced generative AI use cases, such as complex reasoning tasks, analyzing large amounts of data, processing audio or video, or generating images, you can use larger models from the Gemini Flash and Gemini Pro families, and Imagen, running in the cloud. These models are well suited for scenarios requiring advanced capabilities or multimodal inputs and outputs. And since the AI inference runs in the cloud, any Android device with an internet connection is supported. They are easy to integrate into your Android app by using Firebase AI Logic, which provides a simplified, secure way to access these capabilities without managing your own backend. Its SDK also includes support for conversational AI experiences using the Gemini Live API or generating custom contextual visual assets with Imagen. To learn more, check out our sample on GitHub and watch the "Enhance your Android app with Gemini Pro and Flash, and Imagen" session.
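
To give a sense of what the integration looks like, here is a minimal text-generation sketch using the Vertex AI in Firebase Kotlin SDK, which Firebase AI Logic builds on. The model name and prompt are examples, and entry points may evolve, so check the current Firebase documentation for the exact package and API names.

import com.google.firebase.Firebase
import com.google.firebase.vertexai.vertexAI

// Minimal sketch: call a cloud-hosted Gemini model from an Android app.
// The model name is an example; pick one from the Firebase documentation.
suspend fun summarizeReviews(reviews: List<String>): String? {
    val model = Firebase.vertexAI.generativeModel("gemini-1.5-flash")
    val response = model.generateContent(
        "Summarize these user reviews in two sentences:\n" + reviews.joinToString("\n")
    )
    return response.text
}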

These powerful AI capabilities can also be brought to life in immersive Android XR experiences. You can find corresponding documentation, samples and the technical session: "The future is now, with Compose and AI on Android XR".

Figure 1: Firebase AI Logic integration architecture

Get inspired and start building with AI on Android today

We released a new open source app, Androidify, to help developers build AI-driven Android experiences using Gemini APIs, ML Kit, Jetpack Compose, CameraX, Navigation 3, and adaptive design. Users can create a personalized Android bot with Gemini and Imagen via the Firebase AI Logic SDK. Additionally, it incorporates ML Kit pose detection to detect a person in the camera viewfinder. The full code sample is available on GitHub for exploration and inspiration. Discover additional AI examples in our Android AI Sample Catalog.

The original image and the Androidified image

Choosing the right Gemini model depends on understanding your specific needs and the model's capabilities, including modality, complexity, context window, offline capability, cost, and device reach. To explore these considerations further and see all our announcements in action, check out the AI on Android at I/O ‘25 playlist on YouTube and check out our documentation.

We are excited to see what you will build with the power of Gemini!

On-device GenAI APIs as part of ML Kit help you easily build with Gemini Nano

Posted by Caren Chang - Developer Relations Engineer, Chengji Yan - Software Engineer, Taj Darra - Product Manager

We are excited to announce a set of on-device GenAI APIs, as part of ML Kit, to help you integrate Gemini Nano in your Android apps.

To start, we are releasing 4 new APIs:

    • Summarization: to summarize articles and conversations
    • Proofreading: to polish short text
    • Rewriting: to reword text in different styles
    • Image Description: to provide short descriptions for images

Key benefits of GenAI APIs

GenAI APIs are high level APIs that allow for easy integration, similar to existing ML Kit APIs. This means you can expect quality results out of the box without extra effort for prompt engineering or fine tuning for specific use cases.

GenAI APIs run on-device and thus provide the following benefits:

    • Input, inference, and output data is processed locally
    • Functionality remains the same without reliable internet connection
    • No additional cost incurred for each API call

To prevent misuse, we also added safety protection in various layers, including base model training, safety-aware LoRA fine-tuning, input and output classifiers and safety evaluations.

How GenAI APIs are built

There are 4 main components that make up each of the GenAI APIs.

  1. Gemini Nano is the base model, serving as the foundation shared by all APIs.
  2. Small API-specific LoRA adapter models are trained and deployed on top of the base model to further improve the quality for each API.
  3. Optimized inference parameters (e.g. prompt, temperature, topK, batch size) are tuned for each API to guide the model in returning the best results.
  4. An evaluation pipeline ensures quality in various datasets and attributes. This pipeline consists of: LLM raters, statistical metrics and human raters.

Together, these components make up the high-level GenAI APIs that simplify the effort needed to integrate Gemini Nano in your Android app.

Evaluating quality of GenAI APIs

For each API, we formulate a benchmark score based on the evaluation pipeline mentioned above. This score is based on attributes specific to a task. For example, when evaluating the summarization task, one of the attributes we look at is “grounding” (i.e., factual consistency of the generated summary with the source content).

To provide out-of-box quality for GenAI APIs, we applied feature specific fine-tuning on top of the Gemini Nano base model. This resulted in an increase for the benchmark score of each API as shown below:

Use case (English)       Gemini Nano base model    ML Kit GenAI API
Summarization            77.2                      92.1
Proofreading             84.3                      90.2
Rewriting                79.5                      84.1
Image Description        86.9                      92.3

In addition, this is a quick reference of how the APIs perform on a Pixel 9 Pro:

                  Prefix speed (input processing rate)                   Decode speed (output generation rate)
Text-to-text      510 tokens/second                                      11 tokens/second
Image-to-text     510 tokens/second + 0.8 seconds for image encoding     11 tokens/second

Sample usage

This is an example of implementing the GenAI Summarization API to get a one-bullet summary of an article:

val articleToSummarize = "We are excited to announce a set of on-device generative AI APIs..."

// Define task with desired input and output format
val summarizerOptions = SummarizerOptions.builder(context)
    .setInputType(InputType.ARTICLE)
    .setOutputType(OutputType.ONE_BULLET)
    .setLanguage(Language.ENGLISH)
    .build()
val summarizer = Summarization.getClient(summarizerOptions)

suspend fun prepareAndStartSummarization(context: Context) {
    // Check feature availability. Status will be one of the following: 
    // UNAVAILABLE, DOWNLOADABLE, DOWNLOADING, AVAILABLE
    val featureStatus = summarizer.checkFeatureStatus().await()

    if (featureStatus == FeatureStatus.DOWNLOADABLE) {
        // Download feature if necessary.
        // If downloadFeature is not called, the first inference request will 
        // also trigger the feature to be downloaded if it's not already
        // downloaded.
        summarizer.downloadFeature(object : DownloadCallback {
            override fun onDownloadStarted(bytesToDownload: Long) { }

            override fun onDownloadFailed(e: GenAiException) { }

            override fun onDownloadProgress(totalBytesDownloaded: Long) {}

            override fun onDownloadCompleted() {
                startSummarizationRequest(articleToSummarize, summarizer)
            }
        })    
    } else if (featureStatus == FeatureStatus.DOWNLOADING) {
        // Inference request will automatically run once feature is      
        // downloaded.
        // If Gemini Nano is already downloaded on the device, the   
        // feature-specific LoRA adapter model will be downloaded very  
        // quickly. However, if Gemini Nano is not already downloaded, 
        // the download process may take longer.
        startSummarizationRequest(articleToSummarize, summarizer)
    } else if (featureStatus == FeatureStatus.AVAILABLE) {
        startSummarizationRequest(articleToSummarize, summarizer)
    } 
}

fun startSummarizationRequest(text: String, summarizer: Summarizer) {
    // Create task request  
    val summarizationRequest = SummarizationRequest.builder(text).build()

    // Start summarization request with streaming response
    summarizer.runInference(summarizationRequest) { newText -> 
        // Show new text in UI
    }

    // You can also get a non-streaming response from the request
    // val summarizationResult = summarizer.runInference(summarizationRequest)
    // val summary = summarizationResult.get().summary
}

// Be sure to release the resource when no longer needed
// For example, on viewModel.onCleared() or activity.onDestroy()
summarizer.close()

For more examples of implementing the GenAI APIs, check out the official documentation and samples on GitHub.

Use cases

Here is some guidance on how to best use the current GenAI APIs:

For Summarization, consider:

    • Conversation messages or transcripts that involve 2 or more users
    • Articles or documents less than 4000 tokens (or about 3000 English words). Using the first few paragraphs for summarization is usually good enough to capture the most important information.

For Proofreading and Rewriting APIs, consider utilizing them during the content creation process for short content below 256 tokens to help with tasks such as:

    • Refining messages in a particular tone, such as more formal or more casual
    • Polishing personal notes for easier consumption later

For the Image Description API, consider it for:

    • Generating titles of images
    • Generating metadata for image search
    • Utilizing descriptions of images in use cases where the images themselves cannot be displayed, such as within a list of chat messages
    • Generating alternative text to help visually impaired users better understand content as a whole

GenAI API in production

Envision is an app that verbalizes the visual world to help people who are blind or have low vision lead more independent lives. A common use case in the app is for users to take a picture to have a document read out loud. Utilizing the GenAI Summarization API, Envision is now able to get a concise summary of a captured document. This significantly enhances the user experience by allowing them to quickly grasp the main points of documents and determine if a more detailed reading is desired, saving them time and effort.

A captured document (left) and the scanned results in Envision, summarizing the what, when, and where from the document (right)

Supported devices

GenAI APIs are available on Android devices using optimized MediaTek Dimensity, Qualcomm Snapdragon, and Google Tensor platforms through AICore. For a comprehensive list of devices that support GenAI APIs, refer to our official documentation.

Learn more

Start implementing GenAI APIs in your Android apps today with guidance from our official documentation and samples on GitHub: the AI Catalog GenAI API Samples with Compose and the ML Kit GenAI APIs Quickstart.

Get ready for Google I/O: Program lineup revealed

Posted by the Google I/O team

The Google I/O agenda is live. We're excited to share Google’s biggest announcements across AI, Android, Web, and Cloud May 20-21. Tune in to learn how we’re making development easier so you can build faster.

We'll kick things off with the Google Keynote at 10:00 AM PT on May 20th, followed by the Developer Keynote at 1:30 PM PT. This year, we're livestreaming two days of sessions directly from Mountain View, bringing more of the I/O experience to you, wherever you are.

Here’s a sneak peek of what we’ll cover:

    • AI advancements: Learn how Gemini models enable you to build new applications and unlock new levels of productivity. Explore the flexibility offered by options like our Gemma open models and on-device capabilities.
    • Build excellent apps, across devices with Android: Crafting exceptional app experiences across devices is now even easier with Android. Dive into sessions focused on building intelligent apps with Google AI and boosting your productivity, alongside creating adaptive user experiences and leveraging the power of Google Play.
    • Powerful web, made easier: Exciting new features continue to accelerate web development, helping you to build richer, more reliable web experiences. We’ll share the latest innovations in web UI, Baseline progress, new multimodal built-in AI APIs using Gemini Nano, and how AI in DevTools streamlines building innovative web experiences.

Plan your I/O

Join us online for livestreams May 20-21, followed by on-demand sessions and codelabs on May 22. Register today and explore the full program.

We're excited to share what's next and see what you build!

Generate Stunning Visuals in Your Android Apps with Imagen 3 via Vertex AI in Firebase

Posted by Thomas Ezan – Sr. Android Developer Relations Engineer (@lethargicpanda)

Imagen 3, our most advanced image generation model, is now available through Vertex AI in Firebase, making it even easier to integrate into your Android apps.

Designed to generate well-composed images with exceptional details, reduced artifacts, and rich lighting, Imagen 3 represents a significant leap forward in image generation capabilities.

Image generated by Imagen 3 with prompt: “Shot in the style of DSLR camera with the polarizing filter. A photo of two hot air balloons over the unique rock formations in Cappadocia, Turkey. The colors and patterns on these balloons contrast beautifully against the earthy tones of the landscape below. This shot captures the sense of adventure that comes with enjoying such an experience.”

Image generated by Imagen 3 with prompt: A weathered, wooden mech robot covered in flowering vines stands peacefully in a field of tall wildflowers, with a small blue bird resting on its outstretched hand. Digital cartoon, with warm colors and soft lines. A large cliff with a waterfall looms behind.

Imagen 3 unlocks exciting new possibilities for Android developers. Generated visuals can adapt to the content of your app, creating a more engaging user experience. For instance, your users can generate custom artwork to enhance their in-app profile. Imagen can also improve your app's storytelling by bringing its narratives to life with delightful personalized illustrations.

You can experiment with image prompts in Vertex AI Studio, and learn how to improve your prompts by reviewing the prompt and image attribute guide.

Get started with Imagen 3

The integration of Imagen 3 is similar to adding Gemini access via Vertex AI in Firebase. Start by adding the gradle dependencies to your Android project:

dependencies {
    implementation(platform("com.google.firebase:firebase-bom:33.10.0"))

    implementation("com.google.firebase:firebase-vertexai")
}

Then, in your Kotlin code, create an ImageModel instance by passing the model name and optionally, a model configuration and safety settings:

val imageModel = Firebase.vertexAI.imagenModel(
  modelName = "imagen-3.0-generate-001",
  generationConfig = ImagenGenerationConfig(
    imageFormat = ImagenImageFormat.jpeg(compressionQuality = 75),
    addWatermark = true,
    numberOfImages = 1,
    aspectRatio = ImagenAspectRatio.SQUARE_1x1
  ),
  safetySettings = ImagenSafetySettings(
    safetyFilterLevel = ImagenSafetyFilterLevel.BLOCK_LOW_AND_ABOVE,
    personFilterLevel = ImagenPersonFilterLevel.ALLOW_ADULT
  )
)

Finally, generate the image by calling generateImages:

val imageResponse = imageModel.generateImages(
  prompt = "An astronaut riding a horse"
)

Retrieve the generated image from the imageResponse and display it as a bitmap as follows:

val image = imageResponse.images.first()
val uiImage = image.asBitmap()
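
If your UI is built with Compose, one way to display the result is to convert the Bitmap to an ImageBitmap, as in this minimal sketch (the composable name is ours, and it assumes you pass in the bitmap retrieved above):

import android.graphics.Bitmap
import androidx.compose.foundation.Image
import androidx.compose.runtime.Composable
import androidx.compose.ui.graphics.asImageBitmap

// Minimal sketch: render the generated bitmap in a composable.
@Composable
fun GeneratedImage(bitmap: Bitmap) {
    Image(
        bitmap = bitmap.asImageBitmap(),
        contentDescription = "Image generated by Imagen 3"
    )
}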

Next steps

Explore the comprehensive Firebase documentation for detailed API information.

Access to Imagen 3 using Vertex AI in Firebase is currently in Public Preview, giving you an early opportunity to experiment and innovate. For pricing details, please refer to the Vertex AI in Firebase pricing page.

Start experimenting with Imagen 3 today! We're looking forward to seeing how you’ll leverage Imagen 3's capabilities to create truly unique, immersive and personalized Android experiences.

Production-ready generative AI on Android with Vertex AI in Firebase

Posted by Thomas Ezan – Sr. Developer Relations Engineer (@lethargicpanda)

Gemini can help you build and launch new user features that will boost engagement and create personalized experiences for your users.

The Vertex AI in Firebase SDK lets you access Google’s Gemini Cloud models (like Gemini 1.5 Flash and Gemini 1.5 Pro) and add GenAI capabilities to your Android app. It became generally available last October, which means it's now ready for production, and it is already used by many apps on Google Play.

Here are tips for a successful deployment to production.

Implement App Check to prevent API abuse

When using the Vertex AI in Firebase API, it is crucial to implement robust security measures to prevent unauthorized access and misuse.

Firebase App Check helps protect backend resources (like Vertex AI in Firebase, Cloud Functions for Firebase, or even your own custom backend) from abuse. It does this by attesting that incoming traffic is coming from your authentic app running on an authentic and untampered Android device.

Firebase App Check ensures that only legitimate users access your backend resources

To get started, add Firebase to your Android project and enable the Play Integrity API for your app in the Google Play console. Back in the Firebase console, go to the App Check section of your Firebase project to register your app by providing its SHA-256 fingerprint.

Then, update your Android project’s Gradle dependencies with the App Check library for Android:

dependencies {
    // BoM for the Firebase platform
    implementation(platform("com.google.firebase:firebase-bom:33.7.0"))

    // Dependency for App Check
    implementation("com.google.firebase:firebase-appcheck-playintegrity")
}

Finally, in your Kotlin code, initialize App Check before using any other Firebase SDK:

Firebase.initialize(context)
Firebase.appCheck.installAppCheckProviderFactory(
    PlayIntegrityAppCheckProviderFactory.getInstance(),
)

To enhance the security of your generative AI feature, you should implement and enforce App Check before releasing your app to production. Additionally, if your app utilizes other Firebase services like Firebase Authentication, Firestore, or Cloud Functions, App Check provides an extra layer of protection for those resources as well.

Once App Check is enforced, you’ll be able to monitor your app’s requests in the Firebase console.

App Check metrics page in the Firebase console, showing the percentages of verified and unverified requests

You can learn more about App Check on Android in the Firebase documentation.

Use Remote Config for server-controlled configuration

The generative AI landscape evolves quickly. Every few months, new Gemini model iterations become available and some models are removed. See the Vertex AI in Firebase Gemini models page for details.

Because of this, instead of hardcoding the model name in your app, we recommend using a server-controlled variable using Firebase Remote Config. This allows you to dynamically update the model your app uses without having to deploy a new version of your app or require your users to pick up a new version.

You define parameters that you want to control (like model name) using the Firebase console. Then, you add these parameters into your app, along with default "fallback" values for each parameter. Back in the Firebase console, you can change the value of these parameters at any time. Your app will automatically fetch the new value.

Here's how to implement Remote Config in your app:

// Initialize the remote configuration by defining the refresh time
val remoteConfig: FirebaseRemoteConfig = Firebase.remoteConfig
val configSettings = remoteConfigSettings {
    minimumFetchIntervalInSeconds = 3600
}
remoteConfig.setConfigSettingsAsync(configSettings)

// Set default values defined in your app resources
remoteConfig.setDefaultsAsync(R.xml.remote_config_defaults)

// Fetch the latest values from the Remote Config backend and activate them
remoteConfig.fetchAndActivate()

// Load the model name
val modelName = remoteConfig.getString("model_name")
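
You can then pass the fetched value straight to the SDK when you create your model client. Here is a minimal sketch, assuming the Vertex AI in Firebase SDK discussed in this post; if the fetch fails, the default from remote_config_defaults is used instead:

import com.google.firebase.Firebase
import com.google.firebase.vertexai.GenerativeModel
import com.google.firebase.vertexai.vertexAI

// Minimal sketch: build the model client from the server-controlled name,
// so you can switch models from the Firebase console without an app update.
fun buildModel(modelName: String): GenerativeModel =
    Firebase.vertexAI.generativeModel(modelName)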

Read more about using Remote Config with Vertex AI in Firebase.

Gather user feedback to evaluate impact

As you roll out your AI-enabled feature to production, it's critical to build feedback mechanisms into your product and allow users to easily signal whether the AI output was helpful, accurate, or relevant. For example, you can incorporate interactive elements such as thumbs-up and thumbs-down buttons and detailed feedback forms within the user interface. The Material Icons in Compose package provides ready-to-use icons to help you implement it.

You can easily track user interaction with these elements as custom analytics events by using the Google Analytics logEvent() function:

Row {
    Button(
        onClick = {
            firebaseAnalytics.logEvent("model_response_feedback") {
                param("feedback", "thumb_up")
            }
        }
    ) {
        Icon(Icons.Default.ThumbUp, contentDescription = "Thumb up")
    }
    Button(
        onClick = {
            firebaseAnalytics.logEvent("model_response_feedback") {
                param("feedback", "thumb_down")
            }
        }
    ) {
        Icon(Icons.Default.ThumbDown, contentDescription = "Thumb down")
    }
}

Learn more about Google Analytics and its event logging capabilities in the Firebase documentation.

User privacy and responsible AI

When you use Vertex AI in Firebase for inference, you have the guarantee that the data sent to Google won’t be used by Google to train AI models (see Vertex AI documentation for details).

It's also important to be transparent with your users when they're engaging with generative AI technology. You should highlight the possibility of unexpected model behavior.

Finally, users should have control within your app over how their activity related to AI model interactions is stored and deleted.

You can learn more about how Google is approaching Generative AI responsibly in the Google Cloud documentation.

Gaze Link Wins Best Android App in Gemini API Developer Competition

Posted by Thomas Ezan – Sr. Developer Relations Engineer (@lethargicpanda)

We're excited to announce Gaze Link as the winner of the Best Android App for our Gemini API Developer Competition!

This innovative app demonstrates the potential of the Gemini API in providing a communication system for individuals with amyotrophic lateral sclerosis (ALS) who develop severe motor and verbal disabilities, enabling them to type sentences with only their eyes.

About Gaze Link

Gaze Link uses Google’s Gemini 1.5 Flash model to predict the user’s intended sentence based on a few key words and the context of the conversation.

For example, if the context is “Is the room temperature ok?” and the user replies “hot AC two”, the app will leverage Gemini to generate the full sentence “I am hot, can you turn the AC down by two degrees?”.
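
We don't have access to the team's code, but a prompt that combines the conversation context with the user's key words could look roughly like the sketch below, using the Gemini API Kotlin client SDK. The model name, prompt wording, and function are illustrative assumptions, not Gaze Link's actual implementation.

import com.google.ai.client.generativeai.GenerativeModel

// Illustrative sketch only: not Gaze Link's actual implementation.
suspend fun expandKeywords(
    apiKey: String,
    conversationContext: String,
    keywords: String
): String? {
    val model = GenerativeModel(
        modelName = "gemini-1.5-flash",
        apiKey = apiKey
    )
    val prompt = """
        Conversation context: "$conversationContext"
        Key words typed with eye gaze: "$keywords"
        Write the full sentence the user most likely intends to say.
    """.trimIndent()
    return model.generateContent(prompt).text
}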

The Gaze Link team took advantage of Gemini 1.5 Flash's multilingual capabilities to let the app generate sentences in English, Spanish, and Chinese, the three languages currently supported by the app.

We were truly impressed by the Gaze Link app. The team used the Gemini API combined with ML Kit Face Detection to empower individuals with ALS, providing them with a powerful communication system that is both accessible and affordable.

With Gemini 1.5 Flash currently supporting 38 languages, it is possible for Gaze Link to add support for more languages in the future. In addition, the model’s multimodal abilities could enable the team to enhance the user experience by integrating image, audio and video to augment the context of the conversation.

Build with the Gemini API

The result of the integration of the Gemini API in Gaze Link is inspiring. If you are working on an Android app today, we encourage you to learn about the Gemini API capabilities to see how you can successfully add generative AI to your app and delight your users.

To get started, go to the Android AI documentation!

Gemini API in action: showcase of innovative Android apps

Posted by Thomas Ezan – Sr. Developer Relations Engineer

With the advent of Generative AI, Android developers now have access to capabilities that were previously out of reach. For instance, you can now easily add image captioning to your app without any computer vision knowledge.

With the upcoming launch of the stable version of Vertex AI in Firebase in a few weeks (available in beta since Google I/O), you'll be able to effortlessly incorporate the capabilities of Gemini 1.5 Flash and Gemini 1.5 Pro into your app. The inference runs on Google's servers, making these capabilities accessible to any device with an internet connection.

Several Android developers have already begun leveraging this technology. Let's explore some of their creations.


Generate your meals for the week

The team behind Meal Planner, a meal planner and shopping list management app, is leveraging Gemini 1.5 Flash to create original meal plans. Based on the user’s diet, the number of people they are cooking for, and any food allergies or intolerances, the app automatically creates a meal plan for the selected week.

For each dish, the model lists ingredients and quantities, taking into account the number of portions. It also provides instructions on how to prepare it. The app automatically generates a shopping list for the week based on the ingredient list for each meal.

moving image of Meal Planner app user experience

To enable reliable processing of the model’s response and to integrate it in the app, the team leveraged Gemini's JSON mode. They specified responseMimeType = "application/json" in the model configuration and defined the expected JSON response schema in the prompt (see API documentation).

Following the launch of the meal generation feature, Meal Planner received overwhelmingly positive feedback. The feature simplified meal planning for users with dietary restrictions and helped reduce food waste. In the months after its introduction, Meal Planner experienced a 17% surge in premium users.


Journaling while chatting with Leo

A few months ago, the team behind the journal app Life wanted to provide an innovative way to let their users log entries. They created "Leo", an AI diary assistant chatting with users and converting conversations into a journal entry.

To modify the behavior of the model and the tone of its responses, the team used system instructions to define the chatbot persona. This allows the user to set the behavior and tone of the assistant: Pick “Professional and formal” and the model will keep the conversation strict, select “Friendly and cheerful” and it will lighten up the dialogue with lots of emojis!
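
To illustrate the mechanism (this is not Life's actual code), setting a persona with system instructions in the Vertex AI in Firebase Kotlin SDK looks roughly like this; the persona text and model name are examples:

import com.google.firebase.Firebase
import com.google.firebase.vertexai.type.content
import com.google.firebase.vertexai.vertexAI

// Illustrative sketch: define the assistant's persona once with a system
// instruction; every reply then follows that tone.
suspend fun chatWithDiaryAssistant(userMessage: String): String? {
    val model = Firebase.vertexAI.generativeModel(
        modelName = "gemini-1.5-flash",
        systemInstruction = content {
            text("You are Leo, a friendly and cheerful diary assistant. Keep the tone light and use plenty of emojis.")
        }
    )
    return model.generateContent(userMessage).text
}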

moving image of Leo app user experience

The team saw an increase in user engagement following the launch of the feature.

And if you want to know more about how the Life developers used the Gemini API in their app, we had a great conversation with Jomin from the team. This conversation is part of a new Android podcast series called Android Build Time, which you can also watch on YouTube.


Create a nickname on the fly

The HiiKER app provides offline hiking maps. The app also fosters a community by letting users rate trails and leave comments. But users signing up don’t always add a username to their profile. To avoid the risk of lowering the conversion rate by making username selection mandatory at signup, the team decided to use the Gemini API to suggest unique usernames based on the user’s country or area.

moving image of HiiKER app user experience

To generate original usernames, the team set a high temperature value and played with the top-K and top-P values to increase the creativity of the model.
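
For reference, these sampling parameters live in the generation configuration. The sketch below shows where they go with the Vertex AI in Firebase Kotlin SDK; the specific values and prompt are examples, not HiiKER's production settings:

import com.google.firebase.Firebase
import com.google.firebase.vertexai.type.generationConfig
import com.google.firebase.vertexai.vertexAI

// Illustrative sketch: a higher temperature and wider top-K / top-P sampling
// make the suggested usernames more varied and creative.
val usernameModel = Firebase.vertexAI.generativeModel(
    modelName = "gemini-1.5-flash",
    generationConfig = generationConfig {
        temperature = 1.0f
        topK = 64
        topP = 0.95f
    }
)

suspend fun suggestUsername(region: String): String? =
    usernameModel.generateContent(
        "Suggest one unique, friendly hiking username for a hiker from $region."
    ).text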

This AI-assisted feature led to a significant lift in the percentage of users with "complete" profiles, contributing to a positive impact on engagement and retention.

It’s time to build!

Generative AI is still a very new space, and we are just starting to get easy access to these capabilities on Android. Whether it's enabling advanced personalization, creating delightful interactive experiences, or simplifying signup, you might have unique challenges that you are trying to solve as an Android app developer. It is a great time to start looking at these challenges as opportunities that generative AI can help you tackle!


You can learn more about the advanced features of the Gemini Cloud models, find an introduction to generative AI for Android developers, and get started with Vertex AI in Firebase documentation.

To learn more about AI on Android, check out other resources we have available during AI in Android Spotlight Week.

Use #AndroidAI hashtag to share your creations or feedback on social media, and join us at the forefront of the AI revolution!