Optimize for Android (Go edition): Lessons from Google apps Part 3

Posted by Niharika Arora, Developer Relations Engineer

In Part 1 and Part 2 of our “Optimizing for Android Go” blog series, we discussed why you should consider building for Android Go and how to optimize your app to perform well on Go devices. In this post, we cover the tools that helped Google optimize the performance of its own apps.


Monitoring Memory

Analyze Memory Footprint 

 1.    To determine the memory footprint for an application, any of the following metrics may be used:
    • Resident Set Size (RSS): The number of shared and non-shared pages used by the app
    • Proportional Set Size (PSS): The number of non-shared pages used by the app and an even distribution of the shared pages (for example, if three processes are sharing 3MB, each process gets 1MB in PSS)
      • Note: Proportional Set Size (PSS) = Private memory + (shared memory / the number of processes sharing).
    • Unique Set Size (USS): The number of non-shared pages used by the app (shared pages are not included)

PSS is useful for the operating system when it wants to know how much memory is used by all processes since pages don’t get counted multiple times. PSS takes a long time to calculate because the system needs to determine which pages are shared and by how many processes. RSS doesn't distinguish between shared and non-shared pages (making it faster to calculate) and is better for tracking changes in memory allocation.
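The three metrics can be sketched with simple arithmetic. The function names below are hypothetical helpers written for this illustration, all sizes are in KB, and shell integer division truncates any remainder.

```shell
# Sketch only: function names are hypothetical; all sizes are in KB.

# RSS counts every resident page, shared or not.
rss_kb() {   # usage: rss_kb <private_kb> <shared_kb>
  echo $(( $1 + $2 ))
}

# PSS = private memory + (shared memory / number of processes sharing it).
pss_kb() {   # usage: pss_kb <private_kb> <shared_kb> <sharing_processes>
  echo $(( $1 + $2 / $3 ))
}

# USS counts private pages only; shared pages are not included.
uss_kb() {   # usage: uss_kb <private_kb>
  echo $(( $1 ))
}

# Example from the text: 3 MB (3072 KB) shared by three processes,
# with no private memory, gives each process 1 MB of PSS.
# pss_kb 0 3072 3    # prints 1024
```

For a process with 2048 KB of private memory that shares the same 3072 KB with two other processes, rss_kb 2048 3072 prints 5120, pss_kb 2048 3072 3 prints 3072, and uss_kb 2048 prints 2048.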

So, which method should you choose? The choice depends on the usage of shared memory.

For example, if the shared memory is used only by the application, use the RSS approach. If the shared memory is taken by Google Play Services, use the USS approach. For more understanding, please read here

2.    Take a heap dump and analyze how much memory is used by the running processes. Follow these steps:
    • Review the Prerequisites. 
      • Developer options Don't keep activities must be OFF. 
      • Use recent release builds for testing.
    • Execute the user journey you desire to measure.
    • Run the following command:
adb shell am dumpheap <Your Android App Process ID> <output-file-name>
    • In a second terminal, run the following command and wait for a message indicating that the heap dump has completed:
adb logcat | grep hprof
    • Run:
adb pull <output-file-name> 
This will pull the generated file to your machine for analysis.
To get info on the native heap, read here:

To learn about the Java heap, read here:

3.    Understand low-memory killer

Android has a process called the low memory killer. When the device is running low on RAM, it picks a process on the device and kills it, which frees all the memory that process was using. The thresholds can be tuned by OEMs.
But what if the low memory killer kills the process that the user cares about?
Android maintains a priority list of applications, and the low memory killer uses that list to decide which app to remove. Read more here.

You can run this command to see how processes are currently prioritized:
adb shell dumpsys activity oom

To check low memory killer stats:
adb shell dumpsys activity lmk

For more information, please check Perfetto documentation for Memory.


 1.    Debug Memory usage using Perfetto
This is one of the best tools for finding where your app's memory is consumed. Use Perfetto to get information about memory management events from the kernel. Deep dive and understand how to profile native and Java heap here
2.    Inspect your memory usage using Memory Profiler
The Memory Profiler is a component in the Android Profiler that helps you identify memory leaks and memory churn that can lead to stutter, freezes, and even app crashes. It shows a real time graph of your app's memory use and lets you capture a heap dump, force garbage collections, and track memory allocations. To learn more about inspecting performance, please check MAD skills videos here
3.    Utilize meminfo
You may want to observe how your app's memory is divided between different types of RAM allocation with the following adb command: 

adb shell dumpsys meminfo <package_name|pid> [-d]

You can view the following seven memory categories with Meminfo:
    • Java heap – memory allocated by Java code
    • Native heap – memory allocated by native code, that is, allocations made by the application from C or C++ code using malloc or new. These are best understood using malloc debug.
    • Code – memory used for Java, native code and some resources, including dex bytecode, shared libraries and fonts
    • Stack – memory used for both native and Java stacks. This usually has to do with how many threads your application is running.
    • Graphics – memory used for graphics buffer queues to display pixels to the screen, GL surfaces, textures, and so on.
    • Private Other – uncategorized private memory
    • System – memory shared with the system or otherwise under the control of the system.
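To track these seven categories over time, the summary values can be pulled out of the meminfo output with a small filter. This is a minimal sketch: meminfo_summary is a hypothetical helper, and it assumes the App Summary section prints each category as "Name: <PSS in KB>", which may vary between Android versions.

```shell
# Sketch only: meminfo_summary is a hypothetical helper name.
# Reads `dumpsys meminfo` output on stdin and prints "Category=KB"
# for each of the seven categories listed above.
meminfo_summary() {
  awk '
    /^ *(Java Heap|Native Heap|Code|Stack|Graphics|Private Other|System):/ {
      split($0, parts, ":")           # parts[1] = name, parts[2] = values
      name = parts[1]
      sub(/^ +/, "", name)            # trim leading padding from the name
      split(parts[2], vals, " ")      # first value after the colon is PSS
      printf "%s=%s\n", name, vals[1]
    }'
}

# Usage (run against a connected device):
#   adb shell dumpsys meminfo <package_name|pid> | meminfo_summary
```

Only the first number after each category name is kept, so an additional column on newer releases is ignored rather than misread.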

Key memory terms:

    • Private – Memory used only by the process.
    • Shared – System memory shared with other processes.
    • Clean – Memory-mapped pages that can be reclaimed when under memory pressure.
    • Dirty – Memory-mapped pages modified by a process. These pages may be written to file/swap and then reclaimed when under memory pressure.

Note:

    • The Debug class is super useful and provides different methods for Android applications, including tracing and allocation counts. You can read about usage here.
    • For a deeper understanding and to track allocations for each page, read about page owner here.
4.    Detailed analysis using showmap
The showmap command provides a much more detailed breakdown of memory than the friendlier meminfo. It lists the names and sizes of the memory maps used by a process. This is a summary of the information available at /proc/<pid>/smaps, which is the primary source of information used in dumpsys meminfo, except for some graphics memory.
$ adb root
$ adb shell pgrep <process>
Output - process id
$ adb shell showmap <process id>

Sample Output :

 virtual                     shared   shared  private  private
    size      RSS      PSS    clean    dirty    clean    dirty object
-------- -------- -------- -------- -------- -------- -------- ------------------------------
    3048      948      516      864        0       84        0 /data/app/……
    2484     2088     2088        0        0     2084        4 /data/app/……..
     144       72        2       68        4        0        0 /data/dalvik-cache/arm64/system@framework@<...>.art
     216      180        5      176        4        0        0 /data/dalvik-cache/arm64/system@framework@<...>.art
     168      164        8      136       24        0        4 /data/dalvik-cache/arm64/system@framework@<...>.art
      12        8        0        4        4        0        0 /data/dalvik-cache/arm64/system@framework@<...>.art
    1380     1300       73     1100      164        0       36 /data/dalvik-cache/arm64/system@framework@<...>.art

Common memory mappings are:

    • [anon:libc_malloc] - Allocations made from C/C++ code using malloc or new.
    • *boot*.art - The boot image. A Java heap that is pre-initialized by loading and running static initializers where possible for common frameworks classes.
    • /dev/ashmem/dalvik-main space N - The main Java heap.
    • /dev/ashmem/dalvik-zygote space - The main Java heap of the zygote before forking a child process. Also known as the zygote heap.
    • /dev/ashmem/dalvik-[free list ] large object space - Heap used for Java objects larger than ~12KB. This tends to be filled with bitmap pixel data and other large primitive arrays.
    • *.so - Executable code from shared native libraries loaded into memory.
    • *.{oat,dex,odex,vdex} - Compiled dex bytecode, including optimized dex bytecode and metadata, native machine code, or a mix of both.
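Because showmap prints one row per mapping, it can help to total the columns. The sketch below assumes the column order shown in the sample output (virtual size, RSS, PSS, shared clean, shared dirty, private clean, private dirty, object); showmap_totals is a hypothetical helper name, and USS is taken here as private clean plus private dirty.

```shell
# Sketch only: showmap_totals is a hypothetical helper name.
# Reads showmap output on stdin and prints the summed RSS, PSS and USS in KB.
showmap_totals() {
  awk '
    # Keep data rows only: the first field must be a number, and a
    # trailing TOTAL row (if present) is skipped to avoid double counting.
    $1 ~ /^[0-9]+$/ && $NF != "TOTAL" {
      rss += $2
      pss += $3
      uss += $6 + $7          # private clean + private dirty
    }
    END { printf "RSS=%d PSS=%d USS=%d\n", rss, pss, uss }'
}

# Usage (run against a connected device):
#   adb shell showmap <process id> | showmap_totals
```

Fed the seven rows of the sample output above, this prints RSS=4760 PSS=2692 USS=2212.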
5.    Analyze native memory allocations using malloc debug
Malloc debug is a method of debugging native memory problems. It can help detect memory corruption, memory leaks, and use after free issues. You can check this documentation for more understanding and usage.  
6.    Use Address Sanitizer to detect memory errors in C/C++ code
Beginning with API level 27 (Android 8.1), the Android NDK supports Address Sanitizer (ASan), a fast compiler-based tool for detecting memory bugs in native code. ASan detects:
      • Stack and heap buffer overflow/underflow
      • Heap use after free
      • Stack use outside scope
      • Double free/wild free

            ASan runs on both 32-bit and 64-bit ARM, plus x86 and x86-64. ASan's CPU overhead is roughly 2x, code size overhead is between 50% and 2x, and the memory overhead is large (dependent on your allocation patterns, but on the order of 2x). To learn more, read here.

            The Google Camera team used ASan and automated a process that runs it and reports any ASan issues back to them as alerts. They found it a convenient way to fix memory issues missed during code authoring and review.

Monitoring Startup

Analyze Startup

            1.    Measure and analyze time spent in major operations
            Once you have a complete app startup trace, look at the trace and measure the time taken by major operations such as bindApplication and activityStart.

            Look at overall time spent to

              • Identify which operations occupy large time frames and can be optimized.
              • Identify which operations take longer than expected.
              • Identify which operations cause the main thread to be blocked.
            2.    Analyze and identify different time consuming operations and their possible solutions
              • Identify all time consuming operations.
              • Identify any operations that are not supposed to run during startup. (Often there are many legacy code operations that we are not aware of and that are not easily visible when reviewing app code for performance.)
              • Identify which operations are absolutely needed and which could be delayed until your first frame is drawn.
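Once slice durations have been extracted from a startup trace, a short filter can surface the operations that hold the main thread longest. This is only a sketch: it assumes you have already exported the slices yourself as CSV lines of the form name,thread,dur_ms (a hypothetical format, not something Perfetto emits by default), that the main thread is labeled "main", and slow_main_thread_slices is a hypothetical helper name.

```shell
# Sketch only: helper name and CSV layout (name,thread,dur_ms) are assumptions.
# Prints main-thread slices that last at least <threshold_ms>, longest first.
slow_main_thread_slices() {   # usage: slow_main_thread_slices <threshold_ms>
  awk -F, -v t="$1" '
    # Field 2 is the thread label, field 3 the duration in milliseconds.
    $2 == "main" && $3 + 0 >= t + 0 { print }
  ' | sort -t, -k3,3nr
}

# Usage:
#   slow_main_thread_slices 100 < startup_slices.csv
```

Slices on other threads are dropped on purpose, since the goal here is to find work that blocks the main thread and could be deferred until after the first frame.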
            3.    Check Home activity load time
            This is your app’s home page, and performance often depends on how this page loads. For most apps, a lot of data is displayed on this page, spanning multiple layouts and processes running in the background. Check the home activity layout and specifically look at the Choreographer.onDraw method of the home activity.
              • Measure the time taken for the overall operations of measure, draw, inflate, animate, etc.
              • Look at frame drops.
              • Identify layouts taking high time to render or measure.
              • Identify assets taking a long time to load.
              • Identify layouts not needed but still getting inflated.


             1.    Perfetto
              • Perfetto is the best tool for examining CPU usage, thread activity, and frame rendering time.
              • Record a trace using either the command line or UI tools like Perfetto. Add the app package name with the -a flag to filter data for your app. Some ways to capture trace :
              • Produces a report combining data from the Android kernel, such as the CPU scheduler, disk activity, and app threads.
              • Works best when combined with custom tracing, which shows how long each method or part of the code takes so that developers can dig deeper accordingly.
              • Understand atrace and ftrace while analyzing traces through Perfetto.
            2.    App Startup library
            The App Startup library provides a straightforward, performant way to initialize components at application startup. Both library developers and app developers can use App Startup to streamline startup sequences and explicitly set the order of initialization. Instead of defining separate content providers for each component you need to initialize, App Startup allows you to define component initializers that share a single content provider. This can significantly improve app startup time. To find how to use it in your app, refer here
            3.    Baseline Profiles
            Baseline Profiles are a list of classes and methods included in an APK used by Android Runtime (ART) during installation to pre-compile critical paths to machine code. This is a form of profile guided optimization (PGO) that lets apps optimize startup, reduce jank, and improve performance for end users. Profile rules are compiled into a binary form in the APK, in assets/dexopt/baseline.prof.

            During installation, ART performs Ahead-of-time (AOT) compilation of methods in the profile, resulting in those methods executing faster. If the profile contains methods used in app launch or during frame rendering, the user experiences faster launch times and/or reduced jank. For more information on usage and advantages, refer here.  

            4.    Android CPU Profiler
            You can use the CPU Profiler to inspect your app’s CPU usage and thread activity in real time while interacting with your app, or you can inspect the details in recorded method traces, function traces, and system traces. The detailed information that the CPU Profiler records and shows is determined by which recording configuration you choose:
              • System Trace: Captures fine-grained details that allow you to inspect how your app interacts with system resources.
              • Method and function traces: For each thread in your app process, you can find out which methods (Java) or functions (C/C++) are executed over a period, and the CPU resources each method or function consumes during its execution.
            5.    Debug API + CPU Profiler
            The Debug API lets an app start and stop CPU profiling recordings itself; the recordings can then be inspected in the CPU Profiler. It provides information about tracing and allocation counts through startMethodTracing() and stopMethodTracing().

            Debug.startMethodTracing("sample") - Starts recording a trace log with the name you provide.

            Debug.stopMethodTracing() - Stops the recording. The system buffers the generated trace data until the application calls this method.


              • The Debug API is designed for short intervals or scenarios where it is hard to start and stop recording manually. (For example, it was used once to find lock contention caused by a library.)
              • To generate a method trace of an app's execution, we can instrument the app using the Debug class. This way developers get more control over exactly when the device starts and stops recording tracing information.
            6.    MacroBenchmark
              • Measures Scrolling / Animation rendering time.
              • Use a UiAutomator to trigger a scroll or animation. (It captures frame timing / janks for whatever the app is doing. Scroll and animations are just the easiest ways to produce frames where jank is noticeable)
              • Requires Android 10 or higher to run the tests.
              • Traces can be viewed in Systrace or Perfetto.
              • FrameTimingMetric is the API reporting frame time in ms.
            This sample can be used for app instrumentation. 
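Frame times reported in milliseconds are usually summarized as percentiles and as a count of frames that miss the frame budget. The sketch below assumes one frame duration in ms per input line (a hypothetical format) and uses the nearest-rank percentile method, which may differ from how Macrobenchmark itself computes its percentiles. The helper names and the 16.67 ms budget for a 60 Hz display are assumptions.

```shell
# Sketch only: helper names and input format (one duration in ms per line)
# are assumptions made for this illustration.

# Nearest-rank percentile: the value at position ceil(p/100 * N)
# in the sorted list of N frame durations.
frame_percentile() {   # usage: frame_percentile <p as an integer 1..100>
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      idx = int((p * NR + 99) / 100)   # integer form of ceil(p * NR / 100)
      if (idx < 1) idx = 1
      print v[idx]
    }'
}

# Counts frames that took longer than the given budget in ms.
janky_frames() {       # usage: janky_frames <budget_ms>
  awk -v b="$1" '$1 + 0 > b + 0 { n++ } END { print n + 0 }'
}

# Usage:
#   frame_percentile 90 < frame_times_ms.txt
#   janky_frames 16.67 < frame_times_ms.txt
```

For the ten durations 8, 12, 9, 30, 10, 11, 9, 10, 16 and 40 ms, frame_percentile 50 prints 10, frame_percentile 90 prints 30, and janky_frames 16.67 prints 2.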

              • Added in API level 30 and supported in the latest Studio Bumblebee preview (2021.1)
              • Uses simpleperf with customized build scripts for profiling.
              • Simpleperf supports profiling Java code on Android >M.
              • Profiling a release build requires one of the following:
                • Device to be rooted
                • Android >=O, use a script wrap.sh and set android:debuggable="true" to enable profiling.
                • Android >=Q, add the profileable flag in the manifest.

            <profileable android:shell=["true" | "false"] android:enabled=["true" | "false"] />

              • Helpful in app instrumentation with Macrobenchmark.
            8.    MicroBenchmark
            The Jetpack Microbenchmark library allows you to quickly benchmark your Android native code (Kotlin or Java) from within Android Studio. The library handles warmup, measures your code performance and allocation counts, and outputs benchmarking results to both the Android Studio console and a JSON file with more detail. Read more here.

Monitoring App size

            No user wants to download a large APK that might consume most of their network or Wi-Fi bandwidth and, more importantly, storage space on their device.

            The size of your APK has an impact on how fast your app loads, how much memory it uses, and how much power it consumes. Reducing your app's download size enables more users to download your app.


            • Use the Android Size Analyzer
            The Android Size Analyzer tool is an easy way to identify and implement many strategies for reducing the size of your app. It is available both as an Android Studio plugin and as a standalone JAR.
            • Remove unused resources using Lint
            The lint tool, a static code analyzer included in Android Studio, detects resources in your res/ folder that your code doesn't reference. When the lint tool discovers a potentially unused resource in your project, it prints a warning message identifying that resource.

            Note : Libraries that you add to your code may include unused resources. Gradle can automatically remove resources on your behalf if you enable shrinkResources in your app's build.gradle file.

            • Native animated image decoding
            In Android 12 (API level 31), the NDK ImageDecoder API has been expanded to decode all frames and timing data from images that use the animated GIF and animated WebP file formats. When it was introduced in Android 11, this API decoded only the first image from animations in these formats. 
            Use ImageDecoder instead of third-party libraries to further decrease APK size and benefit from future updates related to security and performance. 
            For more details on the API, refer to the API reference and the sample on GitHub.
            • Crunch PNG files using aapt
            The aapt tool can optimize the image resources placed in res/drawable/ with lossless compression during the build process. For example, the aapt tool can convert a true-color PNG that does not require more than 256 colors to an 8-bit PNG with a color palette. Doing so results in an image of equal quality but a smaller memory footprint. Read more here.

            Note: Please check the Android developer documentation for all the useful tools that can help you identify and fix such performance issues.


            This part of the blog captures the tools used by Google to identify and fix performance issues in their apps. They saw great improvements in their metrics. Most Android Go apps could benefit from applying the strategies described above. Optimize and make your app delightful and fast for your users!