How to bring your AI Model to Android devices

Posted by Kateryna Semenova – Senior Developer Relations Engineer and Mark Sherwood – Senior Product Manager

During AI on Android Spotlight Week, we're diving into how you can bring your own AI model to Android-powered devices such as phones, tablets, and beyond. By leveraging the tools and technologies available from Google and other sources, you can run sophisticated AI models directly on these devices, opening up exciting possibilities for better performance, privacy, and usability.

Understanding on-device AI

On-device AI involves deploying and executing machine learning or generative AI models directly on hardware devices, instead of relying on cloud-based servers. This approach offers several advantages, such as reduced latency, enhanced privacy, cost savings, and less dependence on internet connectivity.

For generative text use cases, explore Gemini Nano, which is now available in experimental access through its SDK. For many on-device AI use cases, you might want to package your own models in your app. Today we will walk through how to do so on Android.

Key resources for on-device AI

The Google AI Edge platform provides a comprehensive ecosystem for building and deploying AI models on edge devices. It supports various frameworks and tools, enabling developers to integrate AI capabilities seamlessly into their applications. The Google AI Edge platform consists of:

    • MediaPipe Tasks - Cross-platform low-code APIs to tackle common generative AI, vision, text, and audio tasks
    • LiteRT (formerly known as TensorFlow Lite) - Lightweight runtime for deploying custom machine learning models on Android
    • MediaPipe Framework - Pipeline framework for chaining multiple ML models along with pre and post processing logic



How to build custom AI features on Android

    1. Define your use case: Before diving into technical details, it's crucial to clearly define what you want your AI feature to achieve. Whether you're aiming for image classification, natural language processing, or another application, having a well-defined goal will guide your development process.

    2. Choose the right tools and frameworks: Depending on your use case, you might be able to use an out-of-the-box solution, or you might need to create or source your own model. Look through MediaPipe Tasks for common solutions such as gesture recognition, image segmentation, or face landmark detection. If you find a solution that aligns with your needs, you can proceed directly to the testing and deployment step.



    If you need to create or source a custom model for your use case, you will need an on-device ML framework such as LiteRT (formerly TensorFlow Lite). LiteRT is designed specifically for mobile and edge devices and provides a lightweight runtime for deploying machine learning models. Simply follow these substeps:

        a. Develop and train your model: Develop your AI model using your chosen framework. Training can be performed on a powerful machine or cloud environment, but the model should be optimized for deployment on a device. Techniques like quantization and pruning can help reduce the model size and improve inference speed. Model Explorer can help understand and explore your model as you're working with it.

        b. Convert and optimize the model: Once your model is trained, convert it to a format suitable for on-device deployment. LiteRT, for example, requires conversion to its specific format. Optimization tools can help reduce the model’s footprint and enhance performance. AI Edge Torch allows you to convert PyTorch models to run locally on Android and other platforms, using Google AI Edge LiteRT and MediaPipe Tasks libraries.

        c. Accelerate your model: You can speed up model inference on Android by using the GPU and NPU. LiteRT’s GPU delegate allows you to run your model on the GPU today. We’re working hard on building the next generation of GPU and NPU delegates that will make your models run even faster, and enable more models to run on the GPU and NPU. We’d like to invite you to participate in our early access program to try out this new GPU and NPU infrastructure. We will select participants on a rolling basis, so don’t wait to reach out. A minimal sketch of loading a converted model and enabling the GPU delegate appears after this list.

    3. Test and deploy: To ensure that your model delivers the expected performance across various devices, rigorous testing is crucial. Deploy your app to users after completing the testing phase, offering them a seamless and efficient AI experience. We're working on bringing the benefits of Google Play and Android App Bundles to delivering custom ML models for on-device AI features. Play for On-device AI takes the complexity out of launching, targeting, versioning, downloading, and updating on-device models so that you can offer your users a better user experience without compromising your app's size and at no additional cost. Complete this form to express interest in joining the Play for On-device AI early access program.
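
To make the LiteRT substeps above more concrete, here is a minimal Kotlin sketch that loads a converted .tflite model bundled in the app's assets and runs a single inference, preferring the GPU delegate when the device supports it. The model file name, tensor shapes, and NUM_CLASSES constant are placeholders for your own model, and the snippet assumes the TensorFlow Lite interpreter and GPU delegate APIs that LiteRT remains compatible with.

import android.content.Context
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.gpu.CompatibilityList
import org.tensorflow.lite.gpu.GpuDelegate
import org.tensorflow.lite.support.common.FileUtil

// Placeholder: depends on your model's output tensor shape.
private const val NUM_CLASSES = 1000

// Loads a bundled .tflite model and runs a single inference, preferring the
// GPU delegate when the device supports it.
fun runInference(context: Context, input: FloatArray): FloatArray {
    val modelBuffer = FileUtil.loadMappedFile(context, "model.tflite")

    val options = Interpreter.Options()
    val compatibility = CompatibilityList()
    if (compatibility.isDelegateSupportedOnThisDevice) {
        options.addDelegate(GpuDelegate(compatibility.bestOptionsForThisDevice))
    }

    val interpreter = Interpreter(modelBuffer, options)
    val output = Array(1) { FloatArray(NUM_CLASSES) }
    interpreter.run(arrayOf(input), output)
    interpreter.close()
    return output[0]
}

In a real app you would create the interpreter once and reuse it across inferences, and close the interpreter and delegate only when you are done with them.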

Build trust in AI through privacy and transparency

With the growing role of AI in everyday life, ensuring models run as intended on devices is crucial. We're emphasizing a "zero trust" approach, providing developers with tools to verify device integrity and user control over their data. In the zero trust approach, developers need the ability to make informed decisions about the device's trustworthiness.

The Play Integrity API is recommended for developers looking to verify their app, server requests, and the device environment (and, soon, the recency of security updates on the device). You can call the API at important moments before your app’s backend decides to download and run your models. You can also consider turning on integrity checks for installing your app to reduce your app’s distribution to unknown and untrusted environments.

Play Integrity API makes use of Android Platform Key Attestation to verify hardware components and generate integrity verdicts across the fleet, eliminating the need for most developers to directly integrate different attestation tools and reducing device ecosystem complexity. Developers can use one or both of these tools to assess device security and software integrity before deciding whether to trust a device to run AI models.
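
As a sketch of how such a check might look in practice, the snippet below uses the Play Integrity classic request flow to fetch an integrity token before the app asks its server for a model download. The requestHash value and the sendTokenToServer call are placeholders for your own request binding and backend integration.

import android.content.Context
import com.google.android.play.core.integrity.IntegrityManagerFactory
import com.google.android.play.core.integrity.IntegrityTokenRequest

// Placeholder: forward the signed token to your backend, which verifies the
// verdict before returning a model download URL.
private fun sendTokenToServer(token: String) { /* ... */ }

fun requestIntegrityVerdict(context: Context, requestHash: String) {
    val integrityManager = IntegrityManagerFactory.create(context)

    integrityManager
        .requestIntegrityToken(
            IntegrityTokenRequest.builder()
                .setNonce(requestHash) // binds the token to this request
                .build()
        )
        .addOnSuccessListener { response -> sendTokenToServer(response.token()) }
        .addOnFailureListener { _ ->
            // Fall back conservatively, for example by skipping the model download.
        }
}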

Conclusion

Bringing your own AI model to a device involves several steps, from defining your use case to deploying and testing the model. With resources like Google AI Edge, developers have access to powerful tools and insights to make this process smoother and more effective. As on-device AI continues to evolve, leveraging these resources will enable you to create cutting-edge applications that offer enhanced performance, privacy, and user experience. We are currently seeking early access partners to try out some of our latest tools and APIs at Google AI Edge. Simply fill in this form to connect and explore how we can work together to make your vision a reality.

Dive into these resources and start exploring the potential of on-device AI—your next big innovation could be just a model away!

Use #AndroidAI hashtag to share your feedback or what you've built on social media and catch up with the rest of the updates being shared during Spotlight Week: AI on Android.

An introduction to privacy and safety for Gemini Nano

Posted by Terence Zhang – Developer Relations Engineer, and Adrien Couque – Software Engineer

AI can enhance the user experience and productivity of Android apps. If you're looking to build GenAI features that benefit from additional data privacy or offline inference, on-device GenAI is a good choice as it processes prompts directly on your device without any server calls.

Gemini Nano is the most efficient model in Google's Gemini family, and Android’s foundational model for running on-device GenAI. It's supported by AICore, a system service that works behind the scenes to centralize the model’s runtime, ensure its safe execution, and protect your privacy. With Gemini Nano, apps can offer more personalized and reliable AI experiences without sending your data off the device.

In this blog post, we'll provide an introductory look into how Gemini Nano and AICore work together to deliver powerful on-device AI capabilities while prioritizing users’ privacy and safety.

Private Compute Core (PCC) compliance

At Google I/O 2021, we introduced Private Compute Core (PCC), a secure environment designed to keep your data private. At I/O in 2024, we shared that AICore is PCC compliant, meaning that it operates under strict privacy rules. It can only interact with a limited set of other system packages that are also PCC compliant, and it cannot directly access the internet. Any requests to download models or other information are routed through a separate, open-source companion APK called Private Compute Services.

This framework helps protect your privacy while still allowing apps to benefit from the power of Gemini Nano. Consider a keyboard application using Gemini Nano for a reply suggestion feature. Without PCC, the keyboard would require direct access to the conversation context. With PCC, the code that has access to the conversation runs in a secure sandbox and interacts directly with Gemini Nano to generate suggestions on behalf of the keyboard. This allows the keyboard app to benefit from Gemini Nano's capabilities without directly accessing or storing sensitive conversation data. You can find out more about how this works in the PCC Whitepaper.

Protecting your privacy through data isolation

AICore is built to isolate each request to protect your privacy. This prevents apps from accessing data that does not belong to them. Requests are handled independently and processed from a single app at a time to mitigate the risk of data being exposed to other apps.

Additionally, AICore doesn't store any record of the input data or the resulting outputs after processing each request. This design, combined with the fact that Gemini Nano’s inference happens directly on your device, helps ensure your app’s data stays private and secure.

Prioritizing Safety in Gemini Nano

A flow chart illustrating the architecture of an AI system, highlighting the flow of data and processing steps from the 'Client app' to the 'Service' component, including 'Input safety signals', 'Output safety signals', 'Weights' and 'Runtime'

We're committed to building AI responsibly, and that includes making sure Gemini Nano is safe. We've implemented multiple layers of protection to limit harmful or unintended results:

    • Native model safety: All Gemini models, including Gemini Nano, are trained to be safety-aware out of the box. This means safety considerations are built into the core of the model, not just added as an afterthought.
    • Safety-aware fine-tuning: We use a LoRA fine-tuning block to adapt Gemini Nano for the needs of specific apps. When we train the LoRA block, we incorporate safety data specific to the app’s use case to preserve and even enhance the model's safety features during fine-tuning where applicable.
    • Safety filters on input and output: As a final safeguard, both the input prompt and results generated by the Gemini Nano runtime are evaluated against our safety filters before providing the results to the app. This helps prevent unsafe content from slipping through, without any loss in quality.

These layers of protection work together to ensure that Gemini Nano provides a safe and helpful experience for everyone.


Get started

Learn more about Gemini Nano for app development, and try it out in your own app!

Be sure to check out the other amazing AI on Android Spotlight week content!

Gemini Nano is now available on Android via experimental access

Posted by Taj Darra – Product Manager

Gemini, introduced last year, is Google’s most capable family of models yet; designed for flexibility, it can run on everything from data centers to mobile devices. Since announcing Gemini Nano, our most efficient model built for on-device tasks, we've been working with a limited set of partners to support a range of use cases for their apps.

Today, we’re opening up access to experiment with Gemini Nano to all Android developers with the AI Edge SDK via AICore. Developers will initially have access to experiment with text-to-text prompts on Pixel 9 series devices. Support for more devices and modalities will be added in the future. Check out our documentation and video to get started. Note that experimental access is for development purposes, and is not for production usage at this time.


Fast, private and cost-effective on-device AI

On-device generative AI processes prompts directly on your device without server calls. It offers many benefits: sensitive user data is processed locally on the device, the feature works fully without internet connectivity, and there is no additional monetary cost for each inference.

Since on-device generative AI models run on devices with less computational power than cloud servers, they are significantly smaller and less generalized than their cloud-based equivalents. As a result, the model works best for tasks where the requests can be clearly specified rather than open-ended use cases such as chatbots. Here are some use cases you can try:

    • Rephrasing - Rephrasing and rewriting text to change the tone to be more casual or formal.
    • Smart reply - Given several chat messages in a thread, suggest the next likely response.
    • Proofreading - Removing spelling or grammatical errors from text.
    • Summarization - Generating a summary of a long document, either as a paragraph or as bullet points.

Check out our prompting strategies to achieve best results when experimenting with the above use-cases. If you want to test your own use case, you can download our sample app for an easy way to start experimenting with Gemini Nano.


Gemini Nano performance and usage

Compared to its predecessor, the model being made available to developers today (referred to in the academic paper as “Nano 2”) delivers a substantial improvement in quality. At nearly twice the size of the predecessor (“Nano 1”), it excels in both academic benchmarks and real-world applications, offering capabilities that rival much larger models.


              MMLU (5-shot)*    MATH (4-shot)*    Paraphrasing**    Smart Reply**
Nano 1        46%               14%               44%               44%
Nano 2        56%               23%               90%               82%

* As reported in Gemini: A Family of Highly Capable Multimodal Models. Note that both these models are a part of our Gemini 1.0 series.
** Percentage of good answers measured on public datasets via an autorater powered by Gemini 1.5 Pro.

Gemini Nano is already in use by Google apps. Pixel Screenshots, Talkback, Recorder and many more have leveraged Gemini Nano’s text and image understanding to deliver new experiences:

    • Talkback - Android’s accessibility app leverages Gemini Nano’s multimodal capabilities to improve image descriptions for blind and low vision users.

    • Pixel Recorder - Gemini Nano with Multimodality enables support for longer recordings and higher quality summaries.

Seamless model integration with AI Edge SDK using AICore

Integrating generative AI models directly into mobile apps is challenging due to the significant computational resources and storage space they require. To address this challenge, we developed AICore, a new system service in Android. AICore allows you to benefit from AI running directly on the device without needing to distribute runtimes, models and other components yourself.

To run inference with Gemini Nano in AICore, you use the AI Edge SDK. The AI Edge SDK lets developers customize prompts and inference parameters for their specific needs, giving them greater control over each inference.

To experiment with the AI Edge SDK, add the following dependency to your app:

implementation("com.google.ai.edge.aicore:aicore:0.0.1-exp01")

The AI Edge SDK allows you to customize inference parameters. Some of the more commonly-used parameters include (a configuration sketch follows the list):

    • Temperature, which controls randomness. Higher values increase diversity and creativity of output.
    • Top K, which specifies how many of the highest-ranking tokens are considered when sampling.
    • Candidate count, which describes the maximum number of responses to return.
    • Max output tokens, which caps the length of the generated response.
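
As an illustration, these parameters are set when constructing the model. The sketch below assumes the Kotlin configuration builder exposed by the experimental AI Edge SDK; parameter and builder names can differ between SDK versions, so treat it as a sketch and confirm against the integration guide for the release you depend on.

import android.content.Context
import com.google.ai.edge.aicore.GenerativeModel
import com.google.ai.edge.aicore.generationConfig

// Builds a Gemini Nano model handle with explicit inference parameters.
// Names follow the experimental SDK's documented Kotlin API; verify against
// the version you use.
fun buildModel(appContext: Context): GenerativeModel {
    val config = generationConfig {
        context = appContext       // application context required by AICore
        temperature = 0.2f         // lower values give more deterministic output
        topK = 16                  // sample from the 16 highest-ranking tokens
        candidateCount = 1         // number of responses to return
        maxOutputTokens = 256      // cap on the length of the response
    }
    return GenerativeModel(generationConfig = config)
}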

When you are ready to run the inference with your model, the AI Edge SDK offers an easy way to pass in multiple strings as input to accommodate long inference data.

Here’s an example:

scope.launch {
    // Single string input prompt
    val input = "I want you to act as an English proofreader. I will " +
        "provide you texts, and I would like you to review them for any " +
        "spelling, grammar, or punctuation errors. Once you have finished " +
        "reviewing the text, provide me with any necessary corrections or " +
        "suggestions for improving the text: " +
        "These arent the droids your looking for."
    val singleResponse = generativeModel.generateContent(input)
    print(singleResponse.text)

    // Or multiple strings as input
    val multiResponse = generativeModel.generateContent(
        content {
            text(
                "I want you to act as an English proofreader. I will " +
                "provide you texts and I would like you to review them for " +
                "any spelling, grammar, or punctuation errors."
            )
            text(
                "Once you have finished reviewing the text, " +
                "provide me with any necessary corrections or suggestions " +
                "for improving the text:"
            )
            text("These arent the droids your looking for.")
        }
    )
    print(multiResponse.text)
}

Our integration guide has more information on the AI Edge SDK as well as detailed instructions to start your experimentation with Gemini Nano. To learn more about prompting, check out the Gemini prompting strategies.


Get Started

Learn more about Gemini Nano for app development by watching our video walkthrough, and try out Gemini Nano experimental access in your own app today.

We are excited to see what you build and welcome your input as you evaluate this new technology for your use cases! Post your creations on social media and include the hashtag #AndroidAI to share what you build. To share your ideas and feedback for on-device GenAI and help shape our APIs, you can file a ticket.

There’s a lot more that we’re covering this week for you to build great AI experiences on Android so be sure to check out the rest of the AI on Android Spotlight Week content!

Welcome to AI on Android Spotlight Week

Posted by Joseph Lewis – Technical Writer, Android AI

AI on Android Spotlight Week this year runs September 30th to October 4th! As part of the Android “Spotlight Weeks” series, this week’s content and updates are your gateway to understanding how to integrate cutting-edge AI into your Android apps. Whether you're a seasoned Android developer, an AI enthusiast, or just starting out on your development journey, get ready for a week filled with insightful sessions, practical demos, and inspiring success stories that'll help you build intuitive and powerful AI integrations.

Throughout the week, we'll dive into the core technologies driving AI experiences on Android. This blog will be updated throughout the week with links to new announcements and resources, so check back here daily for updates!


Monday: Getting started with AI

September 30, 2024

Learn how to begin with AI on Android development. Understand which AI models and versions you can work with. Learn about developer tools to help you start building features empowered with AI.

We'll guide you through the differences between traditional programming and machine learning, and contrast traditional machine learning with generative AI. The post explains large language models (LLMs), the transformer architecture, and key concepts like context windows and embeddings. It also touches on fine-tuning and the future of LLMs on Android.

Read the blog post: A quick introduction to large language models for Android Developers

We'll then provide a look behind the scenes at our work improving developer productivity with Gemini in Android Studio. We'll discuss Studio's new AI code completion feature, how we've been working to improve the accuracy and quality of suggested code, and how this feature can benefit your workflow.

Read the blog post: Gemini in Android Studio: Code Completion gains powerful model improvements


Tuesday: On-device AI capabilities with Gemini Nano

October 1, 2024

Discover how Gemini Nano empowers Android developers to unlock the full potential of generative AI, offering personalization and privacy benefits for next-generation apps. We'll share how you can begin integrating your Android apps with on-device LLMs. Look for more information and announcements here on Tuesday!


Wednesday: On-device AI with custom models

October 2, 2024

On Wednesday, we'll help you understand how to bring your own AI model to Android devices, and how you can integrate tools and technologies from Google and other sources. The ability to run sophisticated AI models directly on devices – whether it's a smartphone, tablet, or embedded system – opens up exciting possibilities for better performance, privacy, usability, and cost efficiency.

We'll also give you a detailed walkthrough of how Android developers can leverage Google AI Edge Torch to convert PyTorch machine learning models for on-device execution, using the LiteRT and MediaPipe Tasks libraries. This walkthrough includes code examples and explanations for converting a MobileViT model for image classification and a DIS model for segmentation, and highlights the steps involved in preparing these models for seamless integration into Android applications. By following this guide, developers can harness PyTorch models to enhance their Android apps with advanced machine learning capabilities.


Thursday: Access cloud models with Android SDKs

October 3, 2024

Tap into the boundless potential of Gemini 1.5 Pro and Gemini 1.5 Flash, the revolutionary generative AI models that are redefining the capabilities of Android apps. With Gemini 1.5 Pro and 1.5 Flash, you'll have the tools you need to create apps that are truly intelligent and interactive.

On Thursday, we'll give you a codelab that'll help you understand how to integrate the Gemini API capabilities into your Android projects. We'll guide you through crafting effective prompts and integrating Vertex AI in Firebase. By the end of this hands-on tutorial, you'll be able to implement features like text summarization in your own app, all powered by the cutting-edge Gemini models.

Next we'll publish a blog post exploring the potential of the Gemini API with case studies. We'll delve into how Android developers are leveraging generative AI capabilities in innovative ways, showcasing real-world examples of apps that have successfully integrated the Gemini API. From meal planning to journaling and personalized user experiences, the article highlights examples of how Android developers are already taking advantage of Gemini's transformative capabilities in their apps.

We'll also share with you examples of advanced features of the Gemini API to go beyond simple text prompting. You'll learn how system instructions can shape the model behavior, how JSON support streamlines development, and how multimodal capabilities and function calling can unlock exciting new use cases for your apps.


Friday: Build with AI on Android and beyond

October 4, 2024

As the capstone for AI on Android Spotlight Week, we'll host a discussion with Kateryna Semenova, Oli Gaymond, Miguel Ramos, and Khanh LeViet to talk about building with AI on Android. We'll explore the latest AI advancements tailored for Android engineers, showcasing how these technologies can elevate your app development game. Through engaging discussions and real-world examples, we will unveil the potential of AI, from fast, private on-device solutions using Gemini Nano to the powerful capabilities of Gemini 1.5 Flash and Pro. We'll discuss building generative AI solutions rapidly using Vertex AI in Firebase. And we'll dive into harnessing the power of AI with safety and privacy in mind.


Work with Gemini beyond Android

As we wrap things up for AI on Android Spotlight Week, know that we're striving to provide comprehensive AI solutions for cross-platform Gemini development. The AI capabilities showcased during Android AI Week can extend to other platforms, such as built-in AI in Chrome. Web developers can leverage similar tools and techniques to create web experiences enhanced by AI. Developers can run Gemini Pro in the cloud for natural language processing and other complex user journeys. Or, you can explore the benefits of performing AI inference client-side, with Gemini Nano in Chrome.

Build with usability and privacy in mind

As you embark on your AI development journey, we want you to keep in mind a few important considerations:

    • Privacy: Prioritize user privacy and data security when implementing AI features, especially when handling sensitive user information. Where available, opt for on-device AI solutions like Gemini Nano to minimize data exposure.
    • Seamless user experience: Ensure that AI features seamlessly integrate into your app's overall user experience. AI should enhance the user experience, not disrupt it.
    • Ethical considerations: Ensure AI technologies are developed and deployed in a way that benefits society while minimizing potential harm. By considering fairness, transparency, privacy, accountability, and societal impact, developers can play a vital role in creating a future where AI serves humanity's best interests. Be mindful of the ethical implications of AI, such as potential biases in your AI models. Strive to create AI-powered features that are fair and inclusive.

AI on Android Spotlight Week is an opportunity to explore the latest in AI and its potential for Android app development. We encourage you to delve into the wealth of resources shared during the week and begin experimenting with AI in your own projects. The future of Android is rooted in AI and machine learning, and with the tools and knowledge shared during Android AI Week, developers are well-equipped to build the next generation of AI-powered apps.


What's next

Come back to this blog post for updates; we’ll add links to blog and video content and more throughout the week. Follow Android Developers on X and LinkedIn, and remember to use the hashtag #AndroidAI to share your AI-powered Android creations and join the vibrant community of developers pushing the boundaries of mobile AI.

Gemini in Android Studio: Code Completion Gains Powerful Model Improvements

Posted by Sandhya Mohan – Product Manager, Android Studio and Sarmad Hashmi – Software Engineer, Labs

The Android team believes AI has the potential to revolutionize coding, drive unprecedented innovation in software development, and supercharge your development productivity. AI code completion is a key part of this effort within Gemini in Android Studio.

Since launching in May 2024, we've been hard at work improving this feature to provide the best possible experience for all Android developers. In this post, we want to take you “under the hood” on how we achieved a 40% relative increase in acceptance rate since release, and share some of our excitement for how we have seen Android developers use this feature. We hope you'll give it a try and let us know what you think.


An AI coding companion for every developer

Our vision for Gemini in Android Studio is to empower developers to build high quality Android apps — making it easy for developers to quickly write correct code aligned with Android's best practices. Launched last year, the first version of Studio Bot provided a chat experience where developers could access Android-specific guidance, powered by Google's latest AI models. Developers are able to ask Gemini in Android Studio to provide developer guidance, summarize technical documentation, and critique their Android code. But in all these cases the feedback is reactive, responding to a user's question.

AI code completion takes these capabilities a step further by providing real-time feedback as you work, thinking ahead and suggesting the next few lines of code that you are likely to type based on the context from the surrounding file and what was just typed. You can think of AI Code Completion as a partner in your work — a coding companion waiting to offer guidance when you need it.

This feature is particularly well suited for tasks like defining business logic, creating database schemas, making network requests, or even writing tests — tasks that are often time-consuming and distract from building the core experience for your app. Many developers have told us how much they enjoy the speed AI completions brings to their app development workflow.

A moving image demonstrating AI autocomplete in Android Studio

Bringing more intelligent code completion to Android development

While we are excited to see how AI Code Completions have improved developers’ workflows, we know there's still more we can do to improve developer productivity. Development of Gemini in Android Studio is an ongoing, large-scale collaborative effort by many teams across Google. Earlier this year, we switched to Gemini 1.5 models and saw a significant improvement in the quality of code completions, resulting in a 2x increase in our developer productivity metrics, including overall acceptance rate for suggestions.

Once we started running A/B test experiments to improve AI code completion, we identified several improvements around model quality, context, and heuristics. This overall effort led to a 40% relative increase in acceptance rate — how often users accept the AI's proposed code suggestions — since we launched. Since then, we've been exploring several improvements like:

    • Retrieval augmentation: With your opt-in consent, we use the files and dependencies most relevant to your current coding context to enhance the accuracy of suggestions. This is just the first step and we're continuing to experiment with adding even more context from the IDE as part of each request.
    • Filtering out low-confidence completions: Prioritize showing high quality suggestions where they are most relevant, and therefore most likely to be accepted. We do this by using a combination of the probabilities returned by the model and using a classifier trained to identify high-quality completions based on developer feedback.
    • Smarter post-processing: The LLM's output for AI Code Completion is fundamentally different from the output users expect in a chat session. Responses need to be tightly scoped in order to quickly output useful code, without surrounding expository text. We apply additional heuristics on the model output to ensure responses are concise and accurate, as well as making sure that the generated code is valid within the context of the user's codebase.
    • Improved models: We use opt-in feedback from Android Studio users, such as noting when a code suggestion is accepted or rejected, to adapt the code completion model to their coding style and preferences over time. We regularly ship new models with higher quality data based on your feedback.

We are also exploring metrics beyond acceptance rate to better measure AI impact on developer velocity, such as the percentage of total code written by AI.


Try it out!

We are rolling out these successful experiments and others as quickly as possible.

If you haven't tried AI code completions yet, you can enable this feature by clicking on the Gemini button (the spark icon) in your editor window and signing in to your Google account.

Figure 1. Launching Gemini in Android Studio for the first time

After doing so, navigate to Settings > Tools > Gemini and select "Enable AI-based inline code completions".

Figure 2. Enabling "AI-based inline code completions"

As always, Google is committed to the responsible use of AI. Android Studio won't send any of your source code to servers without your consent — which means you'll need to opt-in to enable Gemini's developer assistance features in Android Studio. You can read more on Gemini in Android Studio's commitment to privacy.

Try enabling AI Code Completions in your project and tell us what you think on social media with #AndroidGeminiEra. We're excited to see how these enhancements help you build amazing apps!


This blog post is part of our series: AI on Android Spotlight Week, where we provide resources — blog posts, videos, sample code, and more — all designed to explore the latest in AI and its potential for Android app development.

Quick introduction to Large Language Models for Android developers

Posted by Thomas Ezan – Sr. Developer Relations Engineer

Android has supported traditional machine learning models for years. Frameworks and SDKs like LiteRT (formerly known as TensorFlow Lite), ML Kit and MediaPipe enabled developers to easily implement tasks like image classification and object detection.

In recent years, generative AI (gen AI) and large language models (LLMs) have opened up new possibilities for language understanding and text generation. We have lowered the barriers for integrating gen AI features into your apps, and this blog post will provide you with the necessary high-level knowledge to get started.

Before we dive into the specificities of generative AI models, let’s take a high-level look at how machine learning (ML) differs from traditional programming.


Machine learning as a new programming paradigm

A key difference between traditional programming and ML lies in how solutions are implemented.

In traditional programming, developers write explicit algorithms that take input and produce a desired output.

A flow chart showing the process of machine learning model training. Input data is fed into the training process, resulting in a trained ML model

Machine learning takes a different approach: developers provide a large set of previously collected input data and the corresponding output, and the ML model is trained to learn how to map the input to the output.

Figure: step 1, train the model with a large set of input and output data. 'Input' and 'Output' feed into 'ML Model Training', which produces the 'ML Model'.

Then, the model is deployed on the Cloud or on-device to process input data. This step is called inference.

Figure: step 2, deploy the model to run inferences on input data. 'Input' feeds into 'Run ML Inference', which produces 'Output'.

This paradigm enables developers to tackle problems that were previously difficult or impossible to solve with rule-based programming.


Traditional machine learning vs. generative AI on Android

Traditional ML on Android includes tasks such as image classification, which can be implemented using MobileNet and LiteRT, or pose estimation, which can be easily added to your Android app with the ML Kit SDK. These models are often trained on specific datasets and perform extremely well on well-defined, narrow tasks.
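
As a concrete example of the kind of narrow, well-defined task described above, the sketch below labels the contents of a bitmap with ML Kit's default on-device image labeler. It is only an illustration of a traditional ML task on Android; where the bitmap comes from and how you display the labels are up to your app.

import android.graphics.Bitmap
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.label.ImageLabeling
import com.google.mlkit.vision.label.defaults.ImageLabelerOptions

// Classifies the content of a bitmap using ML Kit's bundled on-device model.
fun labelImage(bitmap: Bitmap) {
    val image = InputImage.fromBitmap(bitmap, /* rotationDegrees = */ 0)
    val labeler = ImageLabeling.getClient(ImageLabelerOptions.DEFAULT_OPTIONS)

    labeler.process(image)
        .addOnSuccessListener { labels ->
            for (label in labels) {
                println("${label.text}: ${label.confidence}")
            }
        }
        .addOnFailureListener { e ->
            // Handle the failure, for example by logging it.
        }
}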

Generative AI introduces the capability to understand inputs such as text, images, audio and video and generate human-like responses. This enables applications like chatbots, language translation, text summarization, image captioning, image or code generation, creative writing assistance, and much more.

Most state-of-the-art generative AI models, like the Gemini models, are built on the transformer architecture. To generate images, diffusion models are often used.


Understanding large language models

At its core, an LLM is a neural network model trained on massive amounts of text data. It learns patterns, grammar, and semantic relationships between words and phrases, enabling it to predict and generate text that mimics human language.

As mentioned earlier, most recent LLMs use the transformer architecture. It breaks down input into tokens, assigns numerical representations called “embeddings” (see Key concepts below) to these tokens, and then processes these embeddings through multiple layers of the neural network to understand the context and meaning.

LLMs typically go through two main phases of training:

      1. Pre-training phase: The model is exposed to vast amounts of text from different sources to learn general language patterns and knowledge.

      2. Fine-tuning phase: The model is trained on specific tasks and datasets to refine its performance for particular applications.


Classes of models and their capabilities

Gen AI models come in various sizes, from smaller models like Gemini Nano or Gemma 2 2B, to massive models like Gemini 1.5 Pro that run on Google Cloud. The size of a model generally correlates with the capabilities and compute power required to run it.

Models are constantly evolving, with new research pushing the boundaries of their capabilities. These models are being evaluated on tasks like question answering, code generation, and creative writing, demonstrating impressive results.

In addition, some models are multimodal, which means they are designed to process and understand information from multiple modalities, such as images, audio, and video, alongside text. This allows them to tackle a wider range of tasks, including image captioning, visual question answering, and audio transcription. Several Google generative AI models, such as Gemini 1.5 Flash, Gemini 1.5 Pro, Gemini Nano with Multimodality, and PaliGemma, are multimodal.


Key concepts

Context Window

Context window refers to the number of tokens (converted from text, image, audio or video) the model considers when generating a response. For chat use cases, it includes both the current input and a history of past interactions. For reference, 100 tokens is equal to about 60-80 English words. Gemini 1.5 Pro currently supports 2M input tokens, which is enough to fit the seven Harry Potter books… and more!

Embeddings

Embeddings are multidimensional numerical representations of tokens that accurately encode their semantic meaning and relationships within a given vector space. Words with similar meanings are closer together, while words with opposite meanings are farther apart.

The embedding process is a key component of an LLM. You can try it independently using MediaPipe Text Embedder for Android. It can be used to identify relations between words and sentences and implement a simplified semantic search directly on-device.
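
For instance, a minimal sketch of such an on-device semantic comparison with the MediaPipe Text Embedder might look like the following. The model file name is a placeholder for the embedder model you bundle with your app, and error handling is omitted for brevity.

import android.content.Context
import com.google.mediapipe.tasks.text.textembedder.TextEmbedder

// Returns the cosine similarity of two sentences; values close to 1.0 mean
// the sentences are semantically similar.
fun semanticSimilarity(context: Context, first: String, second: String): Double {
    val embedder = TextEmbedder.createFromFile(context, "text_embedder.tflite")

    val firstEmbedding = embedder.embed(first).embeddingResult().embeddings()[0]
    val secondEmbedding = embedder.embed(second).embeddingResult().embeddings()[0]

    return TextEmbedder.cosineSimilarity(firstEmbedding, secondEmbedding)
}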

A (very) simplified representation of the embeddings for the words “king”, “queen”, “man” and “woman”: in the vector space, the arrow from “man” to “woman” is parallel to the arrow from “king” to “queen”.

Top-K, Top-P and Temperature

Parameters like Top-K, Top-P and Temperature enable you to control the creativity of the model and the randomness of its output.

Top-K filters the tokens considered for output. For example, a Top-K of 3 keeps the three most probable tokens. Increasing the Top-K value will increase the randomness of the model response (learn about the Top-K parameter).

Then, defining the Top-P value adds another step of filtering. Tokens with the highest probabilities are selected until their sum equals the Top-P value. Lower Top-P values result in less random responses, and higher values result in more random responses (learn about Top-P parameter).

Finally, the Temperature defines the randomness used to select among the remaining tokens. Lower temperatures are good for prompts that require a more deterministic and less open-ended or creative response, while higher temperatures can lead to more diverse or creative results (learn about Temperature).
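
To make the interplay between these three parameters concrete, here is a small, self-contained Kotlin sketch of one common sampling order: temperature scaling, then Top-K, then Top-P, then a random draw. It is purely illustrative and is not how any particular runtime implements sampling.

import kotlin.math.exp
import kotlin.random.Random

fun sampleNextToken(
    logits: Map<String, Double>,   // raw model scores per candidate token
    temperature: Double = 0.7,
    topK: Int = 3,
    topP: Double = 0.9
): String {
    // 1. Temperature: rescale the scores, then convert them to probabilities.
    val scaled = logits.mapValues { exp(it.value / temperature) }
    val total = scaled.values.sum()
    val probabilities = scaled.mapValues { it.value / total }

    // 2. Top-K: keep only the K most probable tokens.
    val topKTokens = probabilities.entries.sortedByDescending { it.value }.take(topK)

    // 3. Top-P: keep tokens until their cumulative probability reaches topP.
    val kept = mutableListOf<Map.Entry<String, Double>>()
    var cumulative = 0.0
    for (entry in topKTokens) {
        kept.add(entry)
        cumulative += entry.value
        if (cumulative >= topP) break
    }

    // 4. Sample from the remaining tokens, proportionally to their probability.
    var draw = Random.nextDouble() * kept.sumOf { it.value }
    for (entry in kept) {
        draw -= entry.value
        if (draw <= 0) return entry.key
    }
    return kept.last().key
}

Lower temperature values sharpen the distribution before Top-K and Top-P prune it, which is why they produce more deterministic output.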

Fine-tuning

Iterating over several versions of a prompt to achieve an optimal response from the model for your use-case isn’t always enough. The next step is to fine-tune the model by re-training it with data specific to your use-case. You will then obtain a model customized to your application.

More specifically, Low rank adaptation (LoRA) is a fine-tuning technique that makes LLM training much faster and more memory-efficient while maintaining the quality of the model outputs. The process to fine-tune open models via LoRA is well documented. See, for example, how you can fine-tune Gemini models through Google AI Studio without advanced ML expertise. You can also fine-tune Gemma models using the KerasNLP library.


The future of generative AI on Android

With ongoing research and optimization of LLMs for mobile devices, we can expect even more innovative gen AI enabled features coming to Android soon. In the meantime check out other AI on Android Spotlight Week blog posts, and go to the Android AI documentation to learn more about how to power your apps with gen AI capabilities!

AllTrails gains over 1 million downloads after implementing its Wear OS app

Posted by Kseniia Shumelchyk – Developer Relations Engineer

With more than 65 million global users, AllTrails is one of the world’s most popular and trusted platforms for outdoor exploration. The app is designed to be the ultimate adventure companion, so the AllTrails team always works to improve users’ outdoor experience using the latest technology. Recently, its developers created a new Wear OS application. Now, users can access their favorite AllTrails features using their favorite Android wearables.

Growing the AllTrails ecosystem

AllTrails has had a great deal of growth from its Android users, and the app’s developers wanted to meet the needs of this growing segment by delivering new ways to get outside. That meant creating an ecosystem of connected experiences, and Wear OS was the perfect starting point. The team started by building essential functions for controlling the app, like pausing, resuming, and finishing hikes, straight from wearables.

“We know that the last thing you want as you’re pulling into the trailhead is to fumble with your phone and look for the trail, so we wanted to bring the trails to your fingertips,” said Sydney Cho, director of product management at AllTrails. “There’s so much cool stuff we want to do with our Wear OS app, but we decided to start by focusing on the fundamentals.”

After implementing core controls, AllTrails developers added more features to take advantage of the watch screen, like a circular progress ring to show users how far they are on their current route. Implementing new user interfaces is efficient since Compose for Wear OS provides built-in Material components for developers, like a CircularProgressIndicator.
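
The post doesn't share AllTrails' code, but a minimal sketch of such a progress ring using the built-in Wear Compose Material component might look like this; the progress value, between 0f and 1f, would come from your own route-tracking state.

import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.runtime.Composable
import androidx.compose.ui.Modifier
import androidx.compose.ui.unit.dp
import androidx.wear.compose.material.CircularProgressIndicator

@Composable
fun RouteProgressRing(progress: Float) {
    // Draws a circular ring around the watch face showing how much of the
    // current route has been completed.
    CircularProgressIndicator(
        progress = progress,
        modifier = Modifier.fillMaxSize(),
        strokeWidth = 8.dp
    )
}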

AllTrails’ mobile app warns users when they start to wander off-trail with wrong-turn alerts. AllTrails developers incorporated these alerts into the new Wear OS app, so users can get notified straight from their wrists and keep their phones in their pockets.

The new AllTrails Wear OS application has been super popular among its user base, and the team has received substantial positive feedback on the new wearable experience. AllTrails has seen over 1 million downloads since implementing the Wear OS app.

'We’re seeing a lot of growth from Android users, and we want to provide them an ecosystem of connected experiences. Wearables are a core part of that experience.'— Sydney Cho, Director of product management at AllTrails

Streamlined development with Compose for Wear OS

To build the new wearable experience, AllTrails developers used Jetpack Compose for Wear OS. The modern declarative toolkit simplifies UI development by letting developers create reusable code blocks for basic functions, allowing for fast and efficient wearable app development.

“Compose for Wear OS definitely sped up development,” said Sydney. “It also gave our dev team exposure to the toolkit, which we’re obviously huge fans of and use for the majority of our new development.”

This was the first app AllTrails developers created entirely using Jetpack Compose, even though they currently use it for parts of the mobile app. Even with their brief experience using the toolkit, they knew it would greatly improve development, so it was an obvious choice for the Wear OS integration.

“Jetpack Compose allowed us to iterate much more quickly,” said Sydney. “It’s incredibly simple to create composables, and the simplicity of previewing the app in various states is extremely helpful.”



Connecting health and fitness via Health Connect

AllTrails developers saw another opportunity to improve the user experience while building the new Wear OS application by integrating Health Connect. Health Connect is one of Android’s latest API offerings that gives users a simpler way to consolidate and share their health and fitness data across applications.

When users opt-in for Health Connect, they can share their various health and fitness data between applications, giving them a more comprehensive understanding of their activity regardless of the apps tracking it.

“Health Connect allows our users to sync their AllTrails activity recordings, like hiking, biking, running, and so on, directly on their phone,” said Sydney. “This activity can then be viewed within Health Connect or from other apps, giving users more freedom to see all their physical activity data, regardless of which app it was recorded on.”

Health Connect streamlines health data management using simple APIs and a straightforward data model. It acts as a centralized repository, consolidating health and fitness data from various apps, simply by having each app write its data to Health Connect. This means that even partial adoption of the API can yield benefits.
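
A minimal sketch of that write path, assuming the Jetpack Health Connect client library, might look like the snippet below. Exact constructor parameters vary between library versions, the session times and title are placeholders, and the required write permissions must already be granted.

import android.content.Context
import androidx.health.connect.client.HealthConnectClient
import androidx.health.connect.client.records.ExerciseSessionRecord
import java.time.Instant
import java.time.ZoneOffset

// Writes a completed hike to Health Connect so other authorized apps can read it.
suspend fun writeHikeSession(context: Context, start: Instant, end: Instant) {
    val client = HealthConnectClient.getOrCreate(context)

    val session = ExerciseSessionRecord(
        startTime = start,
        startZoneOffset = ZoneOffset.UTC,
        endTime = end,
        endZoneOffset = ZoneOffset.UTC,
        exerciseType = ExerciseSessionRecord.EXERCISE_TYPE_HIKING,
        title = "Afternoon hike"
    )

    client.insertRecords(listOf(session))
}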

AllTrails developers enjoyed how easy it was to integrate Health Connect, thanks to its straightforward and well-documented APIs that were “very simple but extremely powerful.”


What’s ahead with Wear OS

Implementing a new Wear OS application did more than give AllTrails’ users a new way to interact with the app. It lets them put their phones back in their pockets so they can enjoy more of what’s on the trail. By prioritizing core functionalities like nearby trail access, recording control, and real-time alerts, AllTrails delivered a seamless and intuitive wearable experience, enriching UX with impressive user adoption and retention rates.

Get started

Learn more about building wearable apps with design and developer guidance for Wear OS.

Attestation format change for the Android FIDO2 API

Posted by Christiaan Brand – Group Product Manager

In 2019 we introduced a FIDO2 API, adopted by many leading developers, which allows users to generate an attested, device-bound FIDO2 credential on Android devices.

Since this launch, Android has generated an attestation statement based on the SafetyNet API. As the underlying SafetyNet API is being deprecated, the FIDO2 API must move to a new attestation scheme based on hardware-backed key attestation. This change will require action from developers using the FIDO2 API to ensure a smooth transition.

The FIDO2 API is closely related to, but distinct from, the passkeys API and is invoked by setting the residentKey parameter to discouraged. While our goal is to migrate developers to the passkeys API over time, we understand that not all developers who are currently using the FIDO2 API are ready for that move, and we continue working on ways to converge these two APIs.

We will update the FIDO2 API on Android to produce attestation statements based on hardware-backed key attestation. As of November 2024, developers can opt in to this attestation scheme with controls for individual requests. This should be useful for testing and incremental rollouts, while also allowing developers full control over the timing of the switch over the next 6 months.

We will begin returning hardware-backed key attestation by default for all developers in early April 2025. From that point, SafetyNet certificates will no longer be granted. It is important to implement support for the new attestation statement, or move to the passkey API, before the cutover date; otherwise, your applications might not be able to parse the new attestation statements.

For web apps, requesting hardware-backed key attestation requires Chrome 130 or higher to enroll in the WebAuthn attestationFormats origin trial. (Learn more about origin trials.) Once these conditions are met, you can specify the attestationFormats parameter in your navigator.credentials.create call with the value ["android-key"].

If you're using the FIDO2 Play Services API in an Android app, switching to hardware-backed key attestation requires Play Services version 22.0.0 on the device. Developers can then specify android-key as the attestation format in the PublicKeyCredentialCreationOptions. You must update your Play Services dependencies to see this new option.

We will continue to evolve FIDO APIs. Please continue to provide feedback using [email protected] to connect with the team and developer community.

AI on Android Spotlight Week begins September 30th

Posted by Joseph Lewis – Technical Writer, Android AI

AI on Android Spotlight Week is our latest installment of the Spotlight Weeks series. We'll have a full week exploring the latest advancements in AI for Android developers. We’ll feature a variety of exciting activities, including an AMA with Google AI experts, technical talks, early access to our new tools and APIs, and demos of the latest Android generative AI technologies. AI on Android Spotlight Week kicks off next week, running September 30th through October 4th, and will feature information and activities for developers, researchers, and enthusiasts interested in the future of generative AI app development on Android-powered devices.

Get the latest on Android AI developer strategies

During our Spotlight Week: AI on Android, we’ll feature a number of new and exciting opportunities to learn more about how to work with generative AI and machine learning for Android app development, including:

    • Conversations about on-device and cloud-based GenAI solutions with Gemini Nano, Vertex AI in Firebase, and LiteRT (formerly known as TensorFlow Lite)
    • Partner demos and deep dives into the latest AI technologies and how to integrate them in Android apps
    • Discussions around model capabilities, developer tools and integration strategies from web to mobile
    • Answers to top questions from the developer community about AI on Android

How to participate

Our Spotlight Week: AI on Android will happen entirely online, across the Android Developers channels: YouTube, X, LinkedIn, and d.android.com. Check the Android AI developer page on Monday, September 30, 2024 to read our next blog post with full details!

Follow @AndroidDev on X for the latest updates, and help spread the word about AI on Android Spotlight Week, and use #AndroidAI on your favorite social media platforms to ask questions and share your AI projects with the community. We’re excited for you to join us!

Tools, not Rules: become a better Android developer with Compiler Explorer

Posted by Shai Barack – Android Platform Performance lead

Introducing Android support in Compiler Explorer

In a previous blog post you learned how Android engineers continuously improve the Android Runtime (ART) in ways that boost app performance on user devices. These changes to the compiler make system and app code faster or smaller. Developers don’t need to change their code and rebuild their apps to benefit from new optimizations, and users get a better experience. In this blog post I’ll take you inside the compiler with a tool called Compiler Explorer and witness some of these optimizations in action.

Compiler Explorer is an interactive website for studying how compilers work. It is an open source project that anyone can contribute to. This year, our engineers added support to Compiler Explorer for the Java and Kotlin programming languages on Android.

You can use Compiler Explorer to understand how your source code is translated to assembly language, and how high-level programming language constructs in a language like Kotlin become low-level instructions that run on the processor.

At Google our engineers use this tool to study different coding patterns for efficiency, to see how existing compiler optimizations work, to share new optimization opportunities, and to teach and learn. Learning is best when it’s done through tools, not rules. Instead of teaching developers to memorize different rules for how to write efficient code or what the compiler might or might not optimize, give the engineers the tools to find out for themselves what happens when they write their code in different ways, and let them experiment and learn. Let’s learn together!

Start by going to godbolt.org. By default we see C++ sample code, so click the dropdown that says C++ and select Android Java. You should see this sample code:

class Square {
   static int square(int num) {
       return num * num;
   }
}

On the left you’ll see a very simple program. You might say that this is a one line program. But this is not a meaningful statement in terms of performance - how many lines of code there are doesn’t tell us how long this program will take to run, or how much memory will be occupied by the code when the program is loaded.

On the right you’ll see a disassembly of the compiler output. This is expressed in terms of assembly language for the target architecture, where every line is a CPU instruction. Looking at the instructions, we can say that the implementation of the square(int num) method consists of 2 instructions in the target architecture. The number and type of instructions give us a better idea for how fast the program is than the number of lines of source code. Since the target architecture is AArch64 aka ARM64, every instruction is 4 bytes, which means that our program’s code occupies 8 bytes in RAM when the program is compiled and loaded.

Let’s take a brief detour and introduce some Android toolchain concepts.


The Android build toolchain (in brief)

a flow diagram of the Android build toolchain

When you write your Android app, you’re typically writing source code in the Java or Kotlin programming languages. When you build your app in Android Studio, it’s initially compiled by a language-specific compiler into language-agnostic JVM bytecode in a .jar. Then the Android build tools transform the .jar into Dalvik bytecode in .dex files, which is what the Android Runtime executes on Android devices. Typically developers use d8 in their Debug builds, and r8 for optimized Release builds. The .dex files go in the .apk that you push to test devices or upload to an app store. Once the .apk is installed on the user’s device, an on-device compiler which knows the specific target device architecture can convert the bytecode to instructions for the device’s CPU.
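
For reference, opting into R8 for Release builds is a standard Android Gradle configuration; a minimal sketch of the relevant app-module build.gradle.kts fragment (the file names and rules shown are the usual defaults, adjust for your project) looks like this:

// build.gradle.kts (app module): Release builds run through R8, which shrinks and
// optimizes the Dalvik bytecode that dex2oat later compiles on the device.
android {
    buildTypes {
        release {
            isMinifyEnabled = true   // enable R8 shrinking and optimization
            proguardFiles(
                getDefaultProguardFile("proguard-android-optimize.txt"),
                "proguard-rules.pro"
            )
        }
    }
}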

We can use Compiler Explorer to learn how all these tools come together, and to experiment with different inputs and see how they affect the outputs.

Going back to our default view for Android Java, on the left is Java source code and on the right is the disassembly for the on-device compiler dex2oat, the very last step in our toolchain diagram. The target architecture is ARM64 as this is the most common CPU architecture in use today by Android devices.

The ARM64 Instruction Set Architecture offers many instructions and extensions, but as you read disassemblies you will find that you only need to memorize a few key instructions. You can look for ARM64 Quick Reference cards online to help you read disassemblies.

At Google we study the output of dex2oat in Compiler Explorer for different reasons, such as:

    • Gaining intuition for what optimizations the compiler performs in order to think about how to write more efficient code.
    • Estimating how much memory will be required when a program with this snippet of code is loaded into memory.
    • Identifying optimization opportunities in the compiler - ways to generate instructions for the same code that are more efficient, resulting in faster execution or in lower memory usage without requiring app developers to change and rebuild their code.
    • Troubleshooting compiler bugs! 🐞

Compiler optimizations demystified

Let’s look at a real example of compiler optimizations in practice. In the previous blog post you can read about compiler optimizations that the ART team recently added, such as coalescing returns. Now you can see the optimization, with Compiler Explorer!

Let’s load this example:

class CoalescingReturnsDemo {
   String intToString(int num) {
       switch (num) {
           case 1:
               return "1";
           case 2:
               return "2";
           case 3:
               return "3";           
           default:
               return "other";
       }
   }
}
screenshot of sample code in Compiler Explorer

How would a compiler implement this code in CPU instructions? Every case would be a branch target, with a case body that has some unique instructions (such as referencing the specific string) and some common instructions (such as assigning the string reference to a register and returning to the caller). Coalescing returns means that some instructions at the tail of each case body can be shared across all cases. The benefits grow for larger switches, proportional to the number of cases.
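As a source-level analogy (this is hand-written Java to illustrate the idea, not what the compiler literally emits), the coalesced version behaves as if each case only picked its string and a single shared return handled the rest:

class CoalescedAnalogy {
    String intToString(int num) {
        String result;
        switch (num) {
            case 1:  result = "1"; break;
            case 2:  result = "2"; break;
            case 3:  result = "3"; break;
            default: result = "other"; break;
        }
        return result; // the single, shared ("coalesced") return
    }
}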

You can see the optimization in action! Simply create two compiler windows, one for dex2oat from the October 2022 release (the last release before the optimization was added), and another for dex2oat from the November 2023 release (the first release after the optimization was added). You should see that before the optimization, the size of the method body for intToString was 124 bytes. After the optimization, it’s down to just 76 bytes.

This is of course a contrived example for simplicity’s sake, but the pattern is very common in Android code. For instance, consider an implementation of Handler.handleMessage(Message), where you might write a switch statement over the value of Message#what.
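A hypothetical Handler along these lines has exactly this shape; the message codes and helper methods below are made up for illustration.

import android.os.Handler;
import android.os.Looper;
import android.os.Message;

// Hypothetical example: a switch over Message#what, where the tail of each
// case (returning to the caller) can be shared by the compiler.
class DownloadHandler extends Handler {
    static final int MSG_START = 1;
    static final int MSG_PROGRESS = 2;
    static final int MSG_DONE = 3;

    DownloadHandler(Looper looper) {
        super(looper);
    }

    @Override
    public void handleMessage(Message msg) {
        switch (msg.what) {
            case MSG_START:
                startDownload();
                break;
            case MSG_PROGRESS:
                updateProgress(msg.arg1);
                break;
            case MSG_DONE:
                showResult();
                break;
            default:
                super.handleMessage(msg);
        }
    }

    private void startDownload() { /* ... */ }
    private void updateProgress(int percent) { /* ... */ }
    private void showResult() { /* ... */ }
}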

How does the compiler implement optimizations such as this? Compiler Explorer lets us look inside the compiler’s pipeline of optimization passes. In a compiler window, click Add New > Opt Pipeline. A new window will open, showing the High-level Internal Representation (HIR) that the compiler uses for the program, and how it’s transformed at every step.

screenshot of the high-level internal representation (HIR) the compiler uses for the program in Compiler Explorer

If you look at the code_sinking pass you will see that the November 2023 compiler replaces Return HIR instructions with Goto instructions.

Most of the passes are hidden when Filters > Hide Inconsequential Passes is checked. You can uncheck this option and see all optimization passes, including ones that did not change the HIR (i.e. have no “diff” over the HIR).

Let’s study another simple optimization, and look inside the optimization pipeline to see it in action. Consider this code:

class ConstantFoldingDemo {
   static int demo(int num) {
       int result = num;
       if (num == 2) {
           result = num + 2;
       }
       return result;
   }
}

The above is functionally equivalent to the below:

class ConstantFoldingDemo {
   static int demo(int num) {
       int result = num;
       if (num == 2) {
           result = 4;
       }
       return result;
   }
}

Can the compiler make this optimization for us? Let’s load it in Compiler Explorer and turn to the Opt Pipeline Viewer for answers.

screenshot of Opt Pipeline Viewer in Compiler Explorer

The disassembly shows us that the compiler never bothers with “two plus two”: it knows that if num is 2, then result must be 4. This optimization is called constant folding. Inside the conditional block, where we know that num == 2, we propagate the constant 2 into the symbolic name num, then fold num + 2 into the constant 4.

You can see this optimization happening over the compiler’s IR by selecting the constant_folding pass in the Opt Pipeline Viewer.

Kotlin and Java, side by side

Now that we’ve seen the instructions for Java code, try changing the language to Android Kotlin. You should see this sample code, the Kotlin equivalent of the basic Java sample we’ve seen before:

fun square(num: Int): Int = num * num
screenshot of sample code in Kotlin in Compiler Explorer

You will notice that the source code is different but the sample program is functionally identical, and so is the output from dex2oat. Finding the square of a number results in the same instructions, whether you write your source code in Java or in Kotlin.

You can take this opportunity to study interesting language features and discover how they work. For instance, let’s compare Java String concatenation with Kotlin String interpolation.

In Java, you might write your code as follows:

class StringConcatenationDemo {
   void stringConcatenationDemo(String myVal) {
       System.out.println("The value of myVal is " + myVal);
   }
}

Let’s find out how Java String concatenation actually works by trying this example in Compiler Explorer.

screenshot of sample code in Java in Compiler Explorer

First you will notice that we changed the output compiler from dex2oat to d8. Reading Dalvik bytecode, which is the output from d8, is usually easier than reading the ARM64 instructions that dex2oat outputs. This is because Dalvik bytecode uses higher level concepts. Indeed you can see the names of types and methods from the source code on the left side reflected in the bytecode on the right side. Try changing the compiler to dex2oat and back to see the difference.

As you read the d8 output you may realize that Java String concatenation is actually implemented by rewriting your source code to use a StringBuilder. The source code above is rewritten internally by the Java compiler as follows:

class StringConcatenationDemo {
   void stringConcatenationDemo(String myVal) {
       StringBuilder sb = new StringBuilder();
       sb.append("The value of myVal is ");
       sb.append(myVal);
       System.out.println(sb.toString());
  }
}

In Kotlin, we can use String interpolation:

fun stringInterpolationDemo(myVal: String) {
   System.out.println("The value of myVal is $myVal")
}

The Kotlin syntax is easier to read and write, but does this convenience come at a cost? If you try this example in Compiler Explorer, you may find that the Dalvik bytecode output is roughly the same! In this case we see that Kotlin offers an improved syntax, while the compiler emits similar bytecode.

At Google we study examples like these in Compiler Explorer to learn how high-level language features are implemented in lower-level terms, and to better understand the tradeoffs involved in choosing whether and how to adopt them. Recall our learning principle: tools, not rules. Rather than memorizing rules for how you should write your code, use tools that help you understand the upsides and downsides of the different alternatives, and then make an informed decision.

What happens when you minify your app?

Speaking of making informed decisions as an app developer, you should be minifying your apps with R8 when building your Release APK. Minification generally does three things to make your app smaller and faster (a sketch of the kinds of code these steps act on follows the list):

      1. Dead code elimination: find all the live code (code that is reachable from well-known program entry points); everything else is unused and can therefore be removed.

      2. Bytecode optimization: various specialized optimizations that rewrite your app’s bytecode to make it functionally identical but faster and/or smaller.

      3. Obfuscation: renaming all types, methods, and fields in your program that are not accessed by reflection (and therefore can be safely renamed) from their names in source code (com.example.MyVeryLongFooFactorySingleton) to shorter names that fit in less memory (a.b.c).
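Here is a hypothetical Java sketch (the names are illustrative, and the shared Compiler Explorer view below uses Kotlin) of the kinds of patterns these three steps act on: logging guarded by a flag that is always false, thin wrappers that only forward a hardcoded argument, and an internal helper whose name can be shortened.

import android.content.Context;
import android.content.Intent;
import android.net.Uri;
import android.util.Log;

class MinifySketch {
    // Parsed at runtime so javac keeps the guarded code below; R8 can typically
    // still determine it is always false and strip the logging as dead code.
    private static final boolean DEBUG = Boolean.parseBoolean("false");

    // Public API we intend to keep: a keep rule (or keep annotation) stops
    // the minifier from removing or renaming it.
    public static void goToSite(Context context, String site) {
        if (DEBUG) {
            Log.d("MinifySketch", "Navigating to " + site);
        }
        goToUrl(context, "https://" + site); // internal helper: free to rename
    }

    // Thin wrappers that forward a hardcoded argument are inlining candidates.
    static void goToGodbolt(Context context) {
        goToUrl(context, "https://godbolt.org");
    }

    private static void goToUrl(Context context, String url) {
        context.startActivity(new Intent(Intent.ACTION_VIEW, Uri.parse(url)));
    }
}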

Let’s see an example of all three benefits! Start by loading this view in Compiler Explorer.

screenshot of sample code in Kotlin in Compiler Explorer

First you will notice that we are referencing types from the Android SDK. You can do this in Compiler Explorer by clicking Libraries and adding Android API stubs.

Second, you will notice that this view has multiple source files open. The Kotlin source code is in example.kt, but there is another file called proguard.cfg:

-keep class MinifyDemo {
   public void goToSite(...);
}

Looking inside this file, you’ll see directives in the format of Proguard configuration flags, which is the legacy format for configuring what to keep when minifying your app. You can see that we are asking to keep a certain method of MinifyDemo. “Keeping” in this context means don’t shrink (we tell the minifier that this code is live). Let’s say we’re developing a library and we’d like to offer our customer a prebuilt .jar where they can call this method, so we’re keeping this as part of our API contract.
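For example, the customer’s code might call the kept method like this (a hypothetical usage snippet); if R8 removed or renamed goToSite, this call would no longer work against our .jar.

import android.content.Context;

// Hypothetical consumer code built against our prebuilt .jar.
class CustomerCode {
    void openSite(Context context) {
        MinifyDemo.goToSite(context, "godbolt");
    }
}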

We’ve set up a view that lets us see the benefits of minification. On one side you’ll see d8, showing the dex code without minification, and on the other side r8, showing the dex code with minification. By comparing the two outputs, we can see minification in action:

      1. Dead code elimination: R8 removed all the logging code, since it never executes (DEBUG is always false). It removed not just the calls to android.util.Log, but also the associated strings.

      2. Bytecode optimization: since the specialized methods goToGodbolt, goToAndroidDevelopers, and goToGoogleIo just call goToUrl with a hardcoded parameter, R8 inlined the calls to goToUrl into the call sites in goToSite. This inlining saves us the overhead of defining a method, invoking the method, and returning from the method.

      3. Obfuscation: we told R8 to keep the public method goToSite, and it did. R8 also decided to keep the method goToUrl as it’s used by goToSite, but you’ll notice that R8 renamed that method to a. This method’s name is an internal implementation detail, so obfuscating its name saved us a few precious bytes.

You can use R8 in Compiler Explorer to understand how minification affects your app, and to experiment with different ways to configure R8.

At Google our engineers use R8 in Compiler Explorer to study how minification works on small samples. The authoritative tool for studying how a real app compiles is the APK Analyzer in Android Studio, as optimization is a whole-program problem and a snippet might not capture every nuance. But iterating on release builds of a real app is slow, so studying sample code in Compiler Explorer helps our engineers quickly learn and iterate.

Google engineers build very large apps that are used by billions of people on different devices, so they care deeply about these kinds of optimizations and strive to get the most out of optimizing tools. But because those apps are so large, changing the configuration and rebuilding takes a very long time. Our engineers can now use Compiler Explorer to experiment with minification under different configurations and see results in seconds, not minutes.

You may wonder what would happen if we renamed goToSite in our code. Unfortunately, our build would break unless we also renamed the reference to that method in the Proguard flags. Fortunately, R8 now natively supports Keep Annotations as an alternative to Proguard flags. We can modify our program to use Keep Annotations:

@UsedByReflection(kind = KeepItemKind.CLASS_AND_METHODS)
public static void goToSite(Context context, String site) {
    ...
}

Here is the complete example. You’ll notice that we removed the proguard.cfg file, and under Libraries we added “R8 keep-annotations”, which is how we’re importing @UsedByReflection.

At Google our engineers prefer annotations over flags. Here we’ve seen one benefit of annotations: keeping the information about the code in one place rather than two makes refactoring easier. Another is that annotations are self-documenting. For instance, if this method were actually kept because it’s called from native code, we would annotate it as @UsedByNative instead.
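A minimal sketch of what that might look like, assuming @UsedByNative from the same keep-annotations library accepts the same kind parameter as @UsedByReflection above (the method itself is hypothetical):

@UsedByNative(kind = KeepItemKind.CLASS_AND_METHODS)
public static void onNativeEvent(Context context, int eventCode) {
    ...
}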

Baseline profiles and you

Lastly, let’s touch on baseline profiles. So far you’ve seen some demos where we looked at dex code, and others where we looked at ARM64 instructions. If you toggle between the two formats, you will notice that the high-level dex bytecode is much more compact than the low-level CPU instructions. There is an interesting tradeoff to explore here: whether, and when, to compile bytecode to CPU instructions.

For any program method, the Android Runtime has three compilation options:

      1. Compile the method Just in Time (JIT).

      2. Compile the method Ahead of Time (AOT).

      3. Don’t compile the method at all, instead use a bytecode interpreter.

Running code in an interpreter is an order of magnitude slower, but it doesn’t incur the cost of loading the method’s representation as CPU instructions, which, as we’ve seen, is more verbose. Interpretation is best suited for “cold” code: code that runs only once and is not critical to user interactions.

When ART detects that a method is “hot”, it will be JIT-compiled if it hasn’t already been AOT-compiled. JIT compilation accelerates execution, but it pays the one-time cost of compilation during app runtime. This is where baseline profiles come in. Using baseline profiles, you as the app developer can give ART a hint as to which methods are going to be hot or otherwise worth compiling. ART uses that hint before runtime, compiling the code AOT (usually at install time, or when the device is idle) rather than at runtime. This is why apps that use Baseline Profiles see faster startup times.

With Compiler Explorer we can see Baseline Profiles in action.

Let’s open this example.

screenshot of sample code in Compiler Explorer

The Java source code has two method definitions, factorial and fibonacci. This example is set up with a manual baseline profile, listed in the file profile.prof.txt. You will notice that the profile references only the factorial method. Consequently, the dex2oat output shows compiled code only for factorial, while fibonacci appears in the output with no instructions and a size of 0 bytes.
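The exact code in the shared example may differ, but the shape is roughly this hypothetical Java sketch, where only the first method is listed in the profile:

// Hypothetical sketch: with a baseline profile that lists only factorial,
// dex2oat AOT-compiles factorial and leaves fibonacci to the interpreter or JIT.
class ProfileDemo {
    static long factorial(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    static long fibonacci(int n) {
        if (n <= 1) return n;
        return fibonacci(n - 1) + fibonacci(n - 2);
    }
}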

In the context of compilation modes, this means that factorial is compiled AOT, while fibonacci will be JIT-compiled or interpreted. This is because this example applies a different compiler filter. It is reflected in the dex2oat output, which reads “Compiler filter: speed-profile” (AOT-compile only the code in the profile), whereas previous examples read “Compiler filter: speed” (AOT-compile everything).

Conclusion

Compiler Explorer is a great tool for understanding what happens after you write your source code but before it can run on a target device. The tool is easy to use, interactive, and shareable. Compiler Explorer is best used with sample code, but it goes through the same procedures as building a real app, so you can see the impact of all steps in the toolchain.

By learning to use tools like this to discover how the compiler works under the hood, rather than memorizing a list of optimization best practices, you can make more informed decisions.

Now that you've seen how to use the Java and Kotlin programming languages and the Android toolchain in Compiler Explorer, you can level up your Android development skills.

Lastly, don't forget that Compiler Explorer is an open source project on GitHub. If there is a feature you'd like to see then it's just a Pull Request away.


Java and OpenJDK are trademarks or registered trademarks of Oracle and/or its affiliates.