Tag Archives: Gemini Nano

Gemini in Android Studio, now helping you across the development lifecycle

Posted by Sandhya Mohan – Product Manager, Android Studio

This is Our Biggest Feature Release Since Launch!

AI can accelerate your development experience and help you become more productive. That's why we introduced Gemini in Android Studio, your AI-powered coding companion. It’s designed to help you build high-quality Android apps, faster. Today, we're releasing the biggest set of updates to Gemini in Android Studio since launch: Gemini now brings the power of AI to every stage of the development lifecycle, directly within the Android Studio IDE experience. And for more updates on how to grow your apps and games businesses, check out the latest updates from Google Play.

Download the latest version of Android Studio in the canary channel to take advantage of all these new features, and read on to unpack what's new.



Gemini Can Now Write, Refactor, and Document Android Code

Gemini goes beyond just guidance. It can edit your code, helping you quickly move from prototype to implementation, implement common design patterns, and refactor your code. Gemini also streamlines your workflow with features like documentation and commit message generation, allowing you to focus more time on writing code.

Moving image demonstrating Gemini writing code for an Android Composable in real time in Android Studio

Coding features we are launching include:

    • Gemini Code Transforms - modify and refactor code using custom prompts.

      using Gemini to modify code in Android Studio

    • Commit message generation - analyze changes and propose VCS commit messages to streamline version control operations.

      using Gemini to analyze changes and propose VCS commit messages in Android Studio

    • Rethink and Rename - generate intuitive names for your classes, methods, and variables. This can be invoked while you’re coding, or as a larger refactor action applied to existing code.

      using Gemini to generate intuitive names for variables while you're coding in Android Studio

    • Prompt library - save and manage your most frequently used prompts. You can quickly recall them when you need them.

      save your frequently used prompts for future use with Gemini in Android Studio

    • Generate documentation - get documentation for selected code snippets with a simple right click.

      generating code documentation in Android Studio

Integrating AI into UI Tools

It’s never been easier to build with Compose now that we have integrated AI into Compose workflows. Composable previews help you visualize your composables at design time in Android Studio. We understand that manually crafting mock data for preview parameters can be time-consuming. Gemini can now help auto-generate Composable previews with relevant context, simplifying the process of visualizing your UI during development.

Visualize your composables during design time in Android Studio
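
To give a sense of what this looks like in practice, here is a minimal sketch of the kind of preview Gemini might generate. The UserCard composable, its User data class, and the sample values are illustrative assumptions, not actual output from the feature.

    import androidx.compose.material3.Text
    import androidx.compose.runtime.Composable
    import androidx.compose.ui.tooling.preview.Preview

    data class User(val name: String, val followers: Int)

    @Composable
    fun UserCard(user: User) {
        Text(text = "${user.name} · ${user.followers} followers")
    }

    // A generated preview supplies plausible, context-relevant mock data
    // so the composable can be rendered at design time without real data.
    @Preview(showBackground = true)
    @Composable
    fun UserCardPreview() {
        UserCard(user = User(name = "Ada Lovelace", followers = 1842))
    }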

We are continuing to experiment with multimodal support to speed up your UI development cycle. Coming soon, we will allow image attachments as context, using Gemini's multimodal understanding to make it easier to create beautiful and engaging user interfaces.

Deploy with Confidence

Gemini's intelligence can help you release higher quality apps with greater confidence. Gemini can analyze your code, generate test scenarios, and suggest fixes — and we are continuing to integrate AI into the IDE’s App Quality Insights tool window by helping you analyze crashes reported by the Google Play Console and Firebase Crashlytics. Now, with the Ladybug Feature Drop, you can generate deeper insights by using your local code context. This means you can fix bugs faster and your users will see fewer crashes.

Generate insights using the IDE's App Quality Insights tool window

Some of the features we are launching include:

    • Unit test scenario generation - generate unit test scenarios based on local code context (see the sketch after this list).

    generate unit test scenarios based on local code context in Android Studio

    • Build / sync error insights - now provides improved coverage for build and sync errors.

      build and sync error insights are now available in Android Studio

    • App Quality Insights - explains and suggests fixes for observed crashes from Android Vitals and Firebase Crashlytics, and now allows you to use local code context for improved insights.

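
    As a sketch of what unit test scenario generation might produce, consider a hypothetical isValidEmail utility; the function and the proposed test cases below are illustrative assumptions, not actual Gemini output.

    import org.junit.Assert.assertFalse
    import org.junit.Assert.assertTrue
    import org.junit.Test

    // Hypothetical utility under test; in practice this would live in your app code.
    fun isValidEmail(value: String): Boolean =
        Regex("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$").matches(value)

    // The kind of scenarios Gemini might propose from local code context,
    // written out as JUnit tests.
    class EmailValidatorTest {
        @Test fun `valid address is accepted`() = assertTrue(isValidEmail("dev@example.com"))
        @Test fun `missing at sign is rejected`() = assertFalse(isValidEmail("dev.example.com"))
        @Test fun `blank input is rejected`() = assertFalse(isValidEmail(" "))
    }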

    A better Gemini in Android Studio for you

    We recently surveyed many of you to see how AI-powered code completion has impacted your productivity, and 86% of respondents said they felt more productive. Please continue to provide feedback as you use Gemini in your day-to-day workflows. In fact, a few of you wanted to share some of your tips and tricks for how to get the most out of Gemini in Android Studio.



    Along with the Gemini Nano APIs that you can integrate with your own app, Android developers now have access to Google's leading edge AI technologies across every step of their development journey — with Gemini in Android Studio central to that developer experience.

    Get these new features in the latest versions of Android Studio

    These features are all available to try today in the Android Studio canary channel. We expect to release many of these features in the upcoming Ladybug Feature Drop, to be released in the stable channel in late December — with the rest to follow shortly after.

      • Gemini Code Transforms - Modify and refactor your code within the editor
      • Commit message generation - Automatically generate commit messages with Gemini
      • Rethink and Rename - Get help renaming your classes, methods, and variables
      • Prompt library - Save and recall your most commonly used prompts
      • Compose Preview Generation - Generate previews for your composables with Gemini
      • Generate documentation - Have Gemini help you document your code
      • Unit test scenario generation - Generate unit test scenarios
      • Build / sync error insights - Ask Gemini for help in troubleshooting build and sync errors
      • App Quality Insights - Insights on how you can fix crashes from Android Vitals and Firebase Crashlytics

    As always, Google is committed to the responsible use of AI. Android Studio won't send any of your source code to servers without your consent — which means you'll need to opt in to enable Gemini's developer assistance features in Android Studio. You can read more on Gemini in Android Studio's commitment to privacy.

    Try enabling Gemini in your project and tell us what you think on social media with #AndroidGeminiEra. We're excited to see how these enhancements help you build amazing apps!

    Gemini Nano is now available on Android via experimental access

    Posted by Taj Darra – Product Manager

    Gemini, introduced last year, is Google’s most capable family of models yet; designed for flexibility, it can run on everything from data centers to mobile devices. Since announcing Gemini Nano, our most efficient model built for on-device tasks, we've been working with a limited set of partners to support a range of use cases for their apps.

    Today, we’re opening up access to experiment with Gemini Nano to all Android developers with the AI Edge SDK via AICore. Developers will initially have access to experiment with text-to-text prompts on Pixel 9 series devices. Support for more devices and modalities will be added in the future. Check out our documentation and video to get started. Note that experimental access is for development purposes, and is not for production usage at this time.


    Fast, private and cost-effective on-device AI

    On-device generative AI processes prompts directly on your device without server calls. It offers many benefits: sensitive user data is processed locally on the device, apps retain full functionality without internet connectivity, and there is no additional monetary cost for each inference.

    Since on-device generative AI models run on devices with less computational power than cloud servers, they are significantly smaller and less generalized than their cloud-based equivalents. As a result, the model works best for tasks where the requests can be clearly specified rather than open-ended use cases such as chatbots. Here are some use cases you can try:

      • Rephrasing - Rephrasing and rewriting text to change the tone to be more casual or formal.
      • Smart reply - Given several chat messages in a thread, suggest the next likely response.
      • Proofreading - Removing spelling or grammatical errors from text.
      • Summarization - Generating a summary of a long document, either as a paragraph or as bullet points.

    Check out our prompting strategies to achieve best results when experimenting with the above use-cases. If you want to test your own use case, you can download our sample app for an easy way to start experimenting with Gemini Nano.


    Gemini Nano performance and usage

    Compared to its predecessor, the model being made available to developers today (referred to in the academic paper as “Nano 2”) delivers a substantial improvement in quality. At nearly twice the size of the predecessor (“Nano 1”), it excels in both academic benchmarks and real-world applications, offering capabilities that rival much larger models.


                  MMLU (5-shot)*    MATH (4-shot)*    Paraphrasing**    Smart Reply**
    Nano 1        46%               14%               44%               44%
    Nano 2        56%               23%               90%               82%

    * As reported in Gemini: A Family of Highly Capable Multimodal Models. Note that both these models are a part of our Gemini 1.0 series.
    ** Percentage of good answers measured on public datasets via an autorater powered by Gemini 1.5 Pro.

    Gemini Nano is already in use by Google apps. Pixel Screenshots, TalkBack, Recorder and many more have leveraged Gemini Nano’s text and image understanding to deliver new experiences:

      • TalkBack - Android’s accessibility app leverages Gemini Nano’s multimodal capabilities to improve image descriptions for blind and low vision users.
      moving image of Talkback app UI highlighting improved image descriptions with multimodality model for users with low vision

      • Pixel Recorder - Gemini Nano with multimodality enables support for longer recordings and higher quality summaries.

    Seamless model integration with AI Edge SDK using AICore

    Integrating generative AI models directly into mobile apps is challenging due to the significant computational resources and storage space they require. To address this challenge, we developed AICore, a new system service in Android. AICore allows you to benefit from AI running directly on the device without needing to distribute runtimes, models and other components yourself.

    To run inference with Gemini Nano in AICore, you use the AI Edge SDK. The AI Edge SDK lets you customize prompts and inference parameters to your specific needs, giving you greater control over each inference.

    To experiment with the AI Edge SDK, add the following to your app's dependencies:

    implementation("com.google.ai.edge.aicore:aicore:0.0.1-exp01")
    

    The AI Edge SDK allows you to customize inference parameters; a configuration sketch follows the list. Some of the more commonly used parameters include:

      • Temperature, which controls randomness. Higher values increase the diversity and creativity of the output.
      • Top K, which specifies how many of the highest-ranking tokens are considered when sampling each output token.
      • Candidate count, which sets the maximum number of responses to return.
      • Max output tokens, which caps the length of the response.
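
    As a rough sketch of how these parameters come together, the snippet below builds a GenerativeModel from a generation config. It assumes the experimental AI Edge SDK exposes a generationConfig builder and a GenerativeModel constructor along these lines, so treat the exact names and signatures as illustrative rather than definitive.

    import android.content.Context
    import com.google.ai.edge.aicore.GenerativeModel
    import com.google.ai.edge.aicore.generationConfig

    // Names and signatures follow the experimental AI Edge SDK and may change.
    fun buildModel(appContext: Context): GenerativeModel {
        val config = generationConfig {
            context = appContext        // AICore needs an application context
            temperature = 0.2f          // low randomness for more predictable output
            topK = 16                   // sample from the 16 highest-ranking tokens
            candidateCount = 1          // return a single response
            maxOutputTokens = 256       // cap the response length
        }
        return GenerativeModel(generationConfig = config)
    }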

    When you are ready to run the inference with your model, the AI Edge SDK offers an easy way to pass in multiple strings as input to accommodate long inference data.

    Here’s an example:

    scope.launch {
        // Single string input prompt
        val input = "I want you to act as an English proofreader. I will " +
            "provide you texts, and I would like you to review them for any " +
            "spelling, grammar, or punctuation errors. Once you have finished " +
            "reviewing the text, provide me with any necessary corrections or " +
            "suggestions for improving the text: " +
            "These arent the droids your looking for."
        val response = generativeModel.generateContent(input)
        print(response.text)

        // Or pass multiple strings as input
        val multiPartResponse = generativeModel.generateContent(
            content {
                text(
                    "I want you to act as an English proofreader. I will " +
                        "provide you texts and I would like you to review them for " +
                        "any spelling, grammar, or punctuation errors."
                )
                text(
                    "Once you have finished reviewing the text, " +
                        "provide me with any necessary corrections or suggestions " +
                        "for improving the text:"
                )
                text("These arent the droids your looking for.")
            }
        )
        print(multiPartResponse.text)
    }
    

    Our integration guide has more information on the AI Edge SDK as well as detailed instructions to start your experimentation with Gemini Nano. To learn more about prompting, check out the Gemini prompting strategies.


    Get Started

    Learn more about Gemini Nano for app development by watching our video walkthrough, and try out Gemini Nano experimental access in your own app today.

    We are excited to see what you build and welcome your input as you evaluate this new technology for your use cases! Post your creations on social media and include the hashtag #AndroidAI to share what you build. To share your ideas and feedback for on-device GenAI and help shape our APIs, you can file a ticket.

    There’s a lot more that we’re covering this week for you to build great AI experiences on Android, so be sure to check out the rest of the AI on Android Spotlight Week content!

    TalkBack uses Gemini Nano to increase image accessibility for users with low vision

    Posted by Terence Zhang – Developer Relations Engineer and Lisie Lillianfeld - Product Manager

    TalkBack is Android’s screen reader in the Android Accessibility Suite that describes text and images for Android users who are blind or have low vision. The TalkBack team is always working to make Android more accessible. Today, thanks to Gemini Nano with multimodality, TalkBack automatically provides users who are blind or have low vision with more vivid and detailed image descriptions to better understand the images on their screen.

    Increasing accessibility using Gemini Nano with multimodality

    Advancing accessibility is a core part of Google’s mission to build for everyone. That’s why TalkBack has a feature to describe images when developers didn’t include descriptive alt text. This feature was powered by a small ML model called Garcon. However, Garcon produced short, generic responses and couldn’t specify relevant details like landmarks or products.

    The development of Gemini Nano with multimodality was the perfect opportunity to use the latest AI technology to increase accessibility with TalkBack. Now, when TalkBack users opt in on eligible devices, the screen reader uses Gemini Nano’s new multimodal capabilities to automatically provide users with clear, detailed image descriptions in apps including Google Photos and Chrome, even if the device is offline or has an unstable network connection.

    “Gemini Nano helps fill in missing information,” said Lisie Lillianfeld, product manager at Google. “Whether it’s more details about what’s in a photo a friend sent or the style and cut of clothing when shopping online.”

    Going beyond basic image descriptions

    Here’s an example that illustrates how Gemini Nano improves image descriptions: When Garcon is presented with a panorama of the Sydney, Australia shoreline at night, it might read: “Full moon over the ocean.” Gemini Nano with multimodality can paint a richer picture, with a description like: “A panoramic view of Sydney Opera House and the Sydney Harbour Bridge from the north shore of Sydney, New South Wales, Australia.”

    “It's amazing how Nano can recognize something specific. For instance, the model will recognize not just a tower, but the Eiffel Tower,” said Lisie. “This kind of context takes advantage of the unique strengths of LLMs to deliver a helpful experience for our users.”

    Using an on-device model like Gemini Nano was the only feasible solution for TalkBack to automatically generate detailed image descriptions, even while the device is offline.

    “The average TalkBack user comes across 90 unlabeled images per day, and those images weren't as accessible before this new feature,” said Lisie. The feature has gained positive user feedback, with early testers writing that the new image descriptions are a “game changer” and that it’s “wonderful” to have detailed image descriptions built into TalkBack.


    Gemini Nano with multimodality was critical to improving the experience for users with low vision. Providing detailed on-device image descriptions wouldn’t have been possible without it. — Lisie Lillianfeld, Product Manager at Google

    Balancing inference verbosity and speed

    One important tradeoff the Android accessibility team weighed when implementing Gemini Nano with multimodality was between inference verbosity and speed, which is partially determined by image resolution. Gemini Nano with multimodality currently accepts images at either 512-pixel or 768-pixel resolution.

    “The 512-pixel resolution emitted its first token almost two seconds faster than 768 pixels, but the output wasn't as detailed,” said Tyler Freeman, a senior software engineer at Google. “For our users, we decided a longer, richer description was worth the increased latency. We were able to hide the perceived latency a bit by streaming the tokens directly to the text-to-speech system, so users don’t have to wait for the full text to be generated before hearing a response.”
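
    As a conceptual sketch of that pattern (not TalkBack's actual implementation), the snippet below feeds streamed chunks straight into Android's TextToSpeech as they arrive. It assumes a streaming call like generateContentStream returning a Kotlin Flow, and uses a plain text prompt for simplicity since the public experimental SDK is currently text-to-text; treat the names as illustrative.

    import android.speech.tts.TextToSpeech
    import com.google.ai.edge.aicore.GenerativeModel

    // Speak each partial result as soon as it arrives instead of waiting for
    // the full description; QUEUE_ADD appends chunks without cutting speech off.
    suspend fun speakAsItStreams(
        model: GenerativeModel,
        prompt: String,
        tts: TextToSpeech
    ) {
        model.generateContentStream(prompt).collect { chunk ->
            chunk.text?.let { partial ->
                tts.speak(partial, TextToSpeech.QUEUE_ADD, null, "gemini-chunk")
            }
        }
    }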

    A hybrid solution using Gemini Nano and Gemini 1.5 Flash

    TalkBack developers also implemented a hybrid AI solution using Gemini 1.5 Flash. With this server-based AI model, TalkBack can provide the best of on-device and server-based generative AI features to make the screen reader even more powerful.

    When users want more details after hearing an automatically generated image description from Gemini Nano, TalkBack gives the user an option to listen to more by running the image through Gemini Flash. When users focus on an image, they can use a three-finger tap to open the TalkBack menu and select the “Describe Image” option to send the image to Gemini 1.5 Flash on the server and get even more details.
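
    The overall shape of that hybrid flow can be sketched like this; the interfaces are hypothetical stand-ins for TalkBack's internal on-device and server-backed description paths, not real APIs.

    import android.graphics.Bitmap

    // Hypothetical abstraction over an image-description backend.
    interface ImageDescriber {
        suspend fun describe(image: Bitmap): String
    }

    class HybridImageDescriber(
        private val onDevice: ImageDescriber, // Gemini Nano via AICore: private, works offline
        private val onServer: ImageDescriber  // Gemini 1.5 Flash: richer detail, needs network
    ) {
        // Default path: describe every unlabeled image automatically on device.
        suspend fun automaticDescription(image: Bitmap): String =
            onDevice.describe(image)

        // "Describe Image" menu action: the user explicitly asks for more detail.
        suspend fun detailedDescription(image: Bitmap): String =
            onServer.describe(image)
    }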

    By combining the unique advantages of Gemini Nano's on-device processing with the full power of cloud-based Gemini 1.5 Flash, TalkBack provides blind and low-vision Android users a helpful and informative experience with images. The “Describe Image” feature powered by Gemini 1.5 Flash has launched to TalkBack users on more Android devices, so even more users can get detailed image descriptions.


    Animated UI example of TalkBack in action, describing a photo of a sunny view of Sydney Harbor, Australia, with the Sydney Opera House and Sydney Harbour Bridge in the frame.

    Compact model, big impact

    The Android accessibility team recommends that developers looking to use Gemini Nano with multimodality first prototype and test on a powerful, server-side model. There, developers can understand the UX faster, iterate on prompt engineering, and get a better idea of the highest quality possible using the most capable model available.

    While Gemini Nano with multimodality can include missing context to improve image descriptions, it’s still best practice for developers to provide detailed alt text for all images in their apps or websites. If alt text is not provided, TalkBack can help fill in the gaps.

    The Android accessibility team’s goal is to create inclusive and accessible features, and leveraging Gemini Nano with multimodality to provide vivid and detailed image descriptions automatically is a big step towards that. Furthermore, their hybrid approach to AI, combining the strengths of Gemini Nano on device and Gemini 1.5 Flash on the server, showcases the transformative potential of AI in promoting inclusivity and accessibility and highlights Google's ongoing commitment to building for everyone.

    Get started

    Learn more about Gemini Nano for app development.


    This blog post is part of our series: Spotlight Week on Android 15, where we provide resources — blog posts, videos, sample code, and more — all designed to help you prepare your apps and take advantage of the latest features in Android 15. You can read more in the overview of Spotlight Week: Android 15, which will be updated throughout the week.

    Our first Spotlight Week: diving into Android 15

    Posted by Aaron Labiaga - Android Developer Relations Engineer

    By now, you’ve probably heard the news: Android 15 was just released earlier today to AOSP. To celebrate, we’re kicking off a new series called “Spotlight Week” where we’ll shine a light on technical areas across Android development and equip you with the tools you need to take advantage of each area.

    The Android 15 "Spotlight Week" will provide resources — blog posts, videos, sample code, and more — all designed to help you prepare your apps and take advantage of the latest features. These changes strive to improve the Android ecosystem, but updating the OS comes with potential app compatibility implications and integrations that require detailed guidance.

    Here’s what we’re covering this week in our Spotlight Week on Android 15:


      The Android 15 summary page outlines what a developer needs to know about what is new in the release, behavioral changes affecting all apps, and changes applicable only when targeting the new SDK level 35.


    • Building for the future of Android, an in-depth video (Wednesday, Sept 4)

    • Foreground services and a live Android 15 Q&A (Thursday, September 5): Foreground service changes are coming in Android 15: we’re introducing a new foreground service type, updating the exemption scenarios that allow a foreground service to start from the background, and updating the maximum duration of certain foreground service types. These changes are intended to improve the user experience by preventing apps from misusing foreground services in ways that may drain a user’s battery. Plus, we’ll have a live Q&A.

    • Passkeys and Picture-in-Picture (Friday, September 6): Passkeys enable a more streamlined and secure means of authenticating your users. Learn more about passkeys through our sample code and about the updates made to further simplify the login process in Android 15. Plus, we're highlighting Picture-in-Picture sample code that is applicable to apps with video functionality.

    That’s just a taste of what we’re covering in our Spotlight Week on Android 15. Keep checking back to this blog post for updates, where we’ll be adding links and more throughout the week. Plus, follow Android Developers on X and Android by Google on LinkedIn throughout the week to hear even more about Android 15.


    #TheAndroidShow: diving into the latest from Made by Google, including wearables, foldables, Gemini and more!

    Posted by Anirudh Dewani, Director – Android Developer Relations

    We just dropped our summer episode of #TheAndroidShow, on YouTube and on developer.android.com, where we unpacked all of the goodies coming out of this month’s Made by Google event and what you as Android developers need to know. With two new Wear OS 5 watches, we show you how to get building for the wrist. And with the latest foldable from Google, the Pixel 9 Pro Fold, we show how you can leverage out of the box APIs and multi-window experiences to make your apps adaptive for this new form factor.

    Building for Pixel 9 Pro Fold with Adaptive UIs

    With foldables like the Pixel 9 Pro Fold, users have options for how to engage and multitask based on the display they are using and the folded state of their device. Building apps that adapt based on screen size and device postures allows you to scale your UI for mobile, foldables, tablets and beyond. You can read more about how to get started building for devices like the Pixel 9 Pro Fold, or learn more about building for large screens.
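
    As a minimal sketch of that idea using the Material 3 window size class APIs, the snippet below switches between a single-pane and a two-pane layout based on the available width; TwoPaneLayout and SinglePaneLayout are hypothetical placeholders for your own composables.

    import android.os.Bundle
    import androidx.activity.ComponentActivity
    import androidx.activity.compose.setContent
    import androidx.compose.material3.Text
    import androidx.compose.material3.windowsizeclass.ExperimentalMaterial3WindowSizeClassApi
    import androidx.compose.material3.windowsizeclass.WindowWidthSizeClass
    import androidx.compose.material3.windowsizeclass.calculateWindowSizeClass
    import androidx.compose.runtime.Composable

    // Placeholder layouts; replace with your own list/detail composables.
    @Composable fun TwoPaneLayout() = Text("List and detail side by side")
    @Composable fun SinglePaneLayout() = Text("Single column")

    class MainActivity : ComponentActivity() {
        @OptIn(ExperimentalMaterial3WindowSizeClassApi::class)
        override fun onCreate(savedInstanceState: Bundle?) {
            super.onCreate(savedInstanceState)
            setContent {
                val sizeClass = calculateWindowSizeClass(this@MainActivity)
                when (sizeClass.widthSizeClass) {
                    // The unfolded Pixel 9 Pro Fold and most tablets report Expanded width.
                    WindowWidthSizeClass.Expanded -> TwoPaneLayout()
                    else -> SinglePaneLayout()
                }
            }
        }
    }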

    Preparing for Pixel Watch 3: Wear OS 5 and Larger Displays

    With Pixel Watch 3 ringing in the stable release of Wear OS 5, there’s never been a better time to prepare your app for the behavior changes from Wear OS 5 and larger screen sizes from Pixel. We covered how to get started building for wearables like Pixel Watch 3, and you can learn more about building for Wear OS.

    Gemini Nano, with multimodality

    We also took you behind the scenes with Gemini Nano with multimodality, Google’s latest model for on-device AI. Gemini Nano, the smallest version of the Gemini model family, can be executed on-device on capable Android devices including the latest Pixel 9. We caught up with the team to hear more about how the Pixel Recorder team used Gemini Nano to summarize users’ transcripts of audio recordings, with data remaining on-device.

    And some voices from Android devs like you!

    Across the show, we heard from some amazing developers building excellent apps across devices. Like Rex Jin and Bismark Ito, Android Developers at Meta: they told us how the team at Instagram was able to add Ultra HDR in less than three months, dramatically improving the user experience. Later, SAP told us how, within 5 minutes, they integrated NavigationSuiteScaffold, swiftly adapting their navigation UI to different window sizes. And AllTrails told us they are seeing 60% higher monthly retention from Wear OS users… pretty impressive!


    Have an idea for our next episode of #TheAndroidShow? It’s your conversation with the broader Android developer community, this time hosted by Huyen Tue Dao and John Zoeller. You'll hear the latest from the developers and engineers who build Android. You can watch the full show on YouTube and on developer.android.com/events/show!

    Tune in for our summer episode of #TheAndroidShow on August 27!

    Posted by Anirudh Dewani – Director, Android Developer Relations

    In just a few days, on Tuesday, August 27 at 10AM PT, we’ll be dropping our summer episode of #TheAndroidShow, on YouTube and on developer.android.com. In this quarterly show, we’ll be unpacking all of the goodies coming out of this month’s Made by Google event and what you as Android developers need to know!



    With two new Wear OS 5 watches, we’ll show you how to get building for the wrist. And with the latest foldable from Google, the Pixel 9 Pro Fold, we’ll show how you can leverage out of the box APIs and multi-window experiences to make your apps adaptive for this new form factor.

    Plus, Gemini Nano now has multimodality, and we’ll be going behind the scenes to show you how teams at Google are using the latest model for on-device AI.

    #TheAndroidShow is your conversation with the Android developer community, this time hosted by Huyen Tue Dao and John Zoeller. You'll hear the latest from the developers and engineers who build Android.

    Don’t forget to tune in on August 27 at 10AM PT, live on YouTube and on developer.android.com/events/show!