Tag Archives: GenerativeAI

Top 3 Updates for Building with AI on Android at Google I/O ‘24

Posted by Terence Zhang – Developer Relations Engineer

At Google I/O, we unveiled a vision of Android reimagined with AI at its core. As Android developers, you're at the forefront of this exciting shift. By embracing generative AI (Gen AI), you'll craft a new breed of Android apps that offer your users unparalleled experiences and delightful features.

Gemini models are powering new generative AI apps both over the cloud and directly on-device. You can now build with Gen AI using our most capable models over the Cloud with the Google AI client SDK or Vertex AI for Firebase in your Android apps. For on-device, Gemini Nano is our recommended model. We have also integrated Gen AI into developer tools - Gemini in Android Studio supercharges your developer productivity.

Let’s walk through the major announcements for AI on Android from this year's I/O sessions in more detail!

#1: Build AI apps leveraging cloud-based Gemini models

To kickstart your Gen AI journey, design the prompts for your use case with Google AI Studio. Once you are satisfied with your prompts, leverage the Gemini API directly into your app to access Google’s latest models such as Gemini 1.5 Pro and 1.5 Flash, both with one million token context windows (with two million available via waitlist for Gemini 1.5 Pro).

If you want to learn more about and experiment with the Gemini API, the Google AI SDK for Android is a great starting point. For integrating Gemini into your production app, consider using Vertex AI for Firebase (currently in Preview, with a full release planned for Fall 2024). This platform offers a streamlined way to build and deploy generative AI features.

We are also launching the first Gemini API Developer competition (terms and conditions apply). Now is the best time to build an app integrating the Gemini API and win incredible prizes! A custom Delorean, anyone?

#2: Use Gemini Nano for on-device Gen AI

While cloud-based models are highly capable, on-device inference enables offline inference, low latency responses, and ensures that data won’t leave the device.

At I/O, we announced that Gemini Nano will be getting multimodal capabilities, enabling devices to understand context beyond text – like sights, sounds, and spoken language. This will help power experiences like Talkback, helping people who are blind or have low vision interact with their devices via touch and spoken feedback. Gemini Nano with Multimodality will be available later this year, starting with Google Pixel devices.

We also shared more about AICore, a system service managing on-device foundation models, enabling Gemini Nano to run on-device inference. AICore provides developers with a streamlined API for running Gen AI workloads with almost no impact on the binary size while centralizing runtime, delivery, and critical safety components for Gemini Nano. This frees developers from having to maintain their own models, and allows many applications to share access to Gemini Nano on the same device.

Gemini Nano is already transforming key Google apps, including Messages and Recorder to enable Smart Compose and recording summarization capabilities respectively. Outside of Google apps, we're actively collaborating with developers who have compelling on-device Gen AI use cases and signed up for our Early Access Program (EAP), including Patreon, Grammarly, and Adobe.

Moving image of Gemini Nano operating in Adobe

Adobe is one of these trailblazers, and they are exploring Gemini Nano to enable on-device processing for part of its AI assistant in Acrobat, providing one-click summaries and allowing users to converse with documents. By strategically combining on-device and cloud-based Gen AI models, Adobe optimizes for performance, cost, and accessibility. Simpler tasks like summarization and suggesting initial questions are handled on-device, enabling offline access and cost savings. More complex tasks such as answering user queries are processed in the cloud, ensuring an efficient and seamless user experience.

This is just the beginning - later this year, we'll be investing heavily to enable and aim to launch with even more developers.

To learn more about building with Gen AI, check out the I/O talks Android on-device GenAI under the hood and Add Generative AI to your Android app with the Gemini API, along with our new documentation.

#3: Use Gemini in Android Studio to help you be more productive

Besides powering features directly in your app, we’ve also integrated Gemini into developer tools. Gemini in Android Studio is your Android coding companion, bringing the power of Gemini to your developer workflow. Thanks to your feedback since its preview as Studio Bot at last year’s Google I/O, we’ve evolved our models, expanded to over 200 countries and territories, and now include this experience in stable builds of Android Studio.

At Google I/O, we previewed a number of features available to try in the Android Studio Koala preview release, like natural-language code suggestions and AI-assisted analysis for App Quality Insights. We also shared an early preview of multimodal input using Gemini 1.5 Pro, allowing you to upload images as part of your AI queries — enabling Gemini to help you build fully functional compose UIs from a wireframe sketch.

You can read more about the updates here, and make sure to check out What’s new in Android development tools.

Get ready for Google I/O: Program lineup revealed

Posted by Timothy Jordan – Director, Developer Relations and Open Source

Developers, get ready! Google I/O is just around the corner, kicking off live from Mountain View with the Google keynote on Tuesday, May 14 at 10 am PT, followed by the Developer keynote at 1:30 pm PT.

But the learning doesn’t stop there. Mark your calendars for May 16 at 8 am PT when we’ll be releasing over 150 technical deep dives, demos, codelabs, and more on-demand. If you register online, you can start building your 'My I/O' agenda today.

Here's a sneak peek at some of the exciting highlights from the I/O program preview:

Unlocking the power of AI: The Gemini era unlocks a new frontier for developers. We'll showcase the newest features in the Gemini API, Google AI Studio, and Gemma. Discover cutting-edge pre-trained models from Kaggle, and delve into Google's open-source libraries like Keras and JAX.

Android: A developer's playground: Get the latest updates on everything Android! We'll cover groundbreaking advancements in generative AI, the highly anticipated Android 15, innovative form factors, and the latest tools and libraries in the Jetpack and Compose ecosystem. Plus, discover how to optimize performance and streamline your development workflow.

Building beautiful and functional web experiences: We’ll cover Baseline updates, a revolutionary tool that empowers developers with a clear understanding of web features and API interoperability. With Baseline, you'll have access to real-time information on popular developer resource sites like MDN, Can I Use, and web.dev.

The future of ChromeOS: Get a glimpse into the exciting future of ChromeOS. We'll discuss the developer-centric investments we're making in distribution, app capabilities, and operating system integrations. Discover how our partners are shaping the future of Chromebooks and delivering world-class user experiences.

This is just a taste of what's in store at Google I/O. Stay tuned for more updates, and get ready to be a part of the future.

Don't forget to mark your calendars and register for Google I/O today!

Gemini 1.5 Pro Now Available in 180+ Countries; With Native Audio Understanding, System Instructions, JSON Mode and More

Posted by Jaclyn Konzelmann and Megan Li - Google Labs

Grab an API key in Google AI Studio, and get started with the Gemini API Cookbook

Less than two months ago, we made our next-generation Gemini 1.5 Pro model available in Google AI Studio for developers to try out. We’ve been amazed by what the community has been able to debug, create and learn using our groundbreaking 1 million context window.

Today, we’re making Gemini 1.5 Pro available in 180+ countries via the Gemini API in public preview, with a first-ever native audio (speech) understanding capability and a new File API to make it easy to handle files. We’re also launching new features like system instructions and JSON mode to give developers more control over the model’s output. Lastly, we’re releasing our next generation text embedding model that outperforms comparable models. Go to Google AI Studio to create or access your API key, and start building.

Unlock new use cases with audio and video modalities

We’re expanding the input modalities for Gemini 1.5 Pro to include audio (speech) understanding in both the Gemini API and Google AI Studio. Additionally, Gemini 1.5 Pro is now able to reason across both image (frames) and audio (speech) for videos uploaded in Google AI Studio, and we look forward to adding API support for this soon.

screen grab of a clooege professor using Gemini 1.5 Pro to create a quiz based on their latest lecture video in Google AI Studio
You can upload a recording of a lecture, like this 117,000+ token lecture from Jeff Dean, and Gemini 1.5 Pro can turn it into a quiz with an answer key. Video sped up for demo purposes.

Gemini API Improvements

Today, we’re addressing a number of top developer requests:

1. System instructions: Guide the model’s responses with system instructions, now available in Google AI Studio and the Gemini API. Define roles, formats, goals, and rules to steer the model's behavior for your specific use case.

image showing where System Instructions is located in Google AI Studio
Set System Instructions easily in Google AI Studio

2. JSON mode: Instruct the model to only output JSON objects. This mode enables structured data extraction from text or images. You can get started with cURL, and Python SDK support is coming soon.

3. Improvements to function calling: You can now select modes to limit the model’s outputs, improving reliability. Choose text, function call, or just the function itself.

A new embedding model with improved performance

Starting today, developers will be able to access our next generation text embedding model via the Gemini API. The new model, text-embedding-004, (text-embedding-preview-0409 in Vertex AI), achieves a stronger retrieval performance and outperforms existing models with comparable dimensions, on the MTEB benchmarks.

table showing Gecko: Versativel Text Embeddings Distilled from Large Language Models
'Text-embedding-004' (aka Gecko) using 256 dims output outperforms all larger 768 dim output models on MTEB benchmarks

These are just the first of many improvements coming to the Gemini API and Google AI Studio in the next few weeks. We’re continuing to work on making Google AI Studio and the Gemini API the easiest way to build with Gemini. Get started today in Google AI Studio with Gemini 1.5 Pro, explore code examples and quickstarts in our new Gemini API Cookbook, and join our community channel on Discord.

Android Studio uses Gemini Pro to make Android development faster and easier

Posted by Sandhya Mohan – Product Manager, Android Studio

As part of the next chapter of our Gemini era, we announced we were bringing Gemini to more products. Today we’re excited to announce that Android Studio is using the Gemini 1.0 Pro model to make Android development faster and easier, and we’ve seen significant improvements in response quality over the last several months through our internal testing. In addition, we are making this transition more apparent by announcing that Studio Bot is now called Gemini in Android Studio.

Gemini in Android Studio is an AI-powered coding assistant which can be accessed directly in the IDE. It can accelerate your ability to develop high-quality Android apps faster by helping generate code for your app, providing complex code completions, answering your questions, finding relevant resources, adding code comments and more — all without ever having to leave Android Studio. It is available in 180+ countries and territories in Android Studio Jellyfish.

If you were already using Studio Bot in the canary channel, you’ll continue experiencing the same helpful and powerful features, but you’ll notice improved quality in responses compared to earlier versions.

Ask Gemini your Android development questions

Gemini in Android Studio can understand natural language, so you can ask development questions in your own words. You can enter your questions in the chat window ranging from very simple and open-ended ones to specific problems that you need help with.

Here are some examples of the types of queries it can answer:

    • How do I add camera support to my app?
    • Using Compose, I need a login screen with the following: a username field, a password field, a 'Sign In' button, a 'Forgot Password?' link. I want the password field to obscure the input.
    • What's the best way to get location on Android?
    • I have an 'orders' table with columns like 'order_id', 'customer_id', 'product_id', 'price', and 'order_date'. Can you help me write a query that calculates the average order value per customer over the last month?
Moving image demonstrating a conversation in Android Studio

Gemini in Android Studio remembers the context of the conversation, so you can also ask follow-up questions, such as “Can you give me the code for this in Kotlin?” or “Can you show me how to do it in Compose?”

Code faster with AI powered Code Completions

Gemini in Android Studio can help you be more productive by providing you with powerful AI code completions. You can receive suggestions of multi-line code completions, suggestions for how to do comments for your code, or how to add documentation to your code.

Moving image demonstrating code completion in Android Studio

Designed with privacy in mind

Gemini in Android Studio was designed with privacy in mind. Gemini is only available after you log in and enable it. You don’t need to send your code context to take advantage of most features. By default, Gemini in Android Studio’s chat responses are purely based on conversation history, and you control whether you want to share additional context for customized responses. You can update this anytime in Android Studio > Settings at a granular project level. We also have a custom way for you to opt out certain files and folders through an .aiexclude file. Much like our work on other AI projects, we stick to a set of AI Principles that hold us accountable. Learn more here.

image of share settings in Android Studio

Build a Generative AI app using the Gemini API starter template

Not only does Android Studio use Gemini to help you be more productive, it can also help you take advantage of Gemini models to create AI-powered features in your applications. Get started in minutes using the Gemini API starter template available in the canary release – channel for Android Studio – under File > New Project > Gemini API Starter. You can also use the code sample available at File > Import Sample > Google Generative AI sample.

The Gemini API is multimodal, meaning it can support image and text inputs. For example, it can support conversational chat, summarization, translation, caption generation etc. using both text and image inputs.

image of starter templates in Android Studio

Try Gemini in Android Studio

Gemini in Android Studio is still in preview, but we have added many feature improvements — and now a major model update — since we released the experience in May 2023. It is currently no-cost for developers to try out. Now is a great time to test it and let us know what you think, before we release this experience to stable.

Stay updated on the latest by following us on LinkedIn, Medium, YouTube, or X. Let's build the future of Android apps together!