Tag Archives: Art

Tools, not Rules: become a better Android developer with Compiler Explorer

Posted by Shai Barack – Android Platform Performance lead

Introducing Android support in Compiler Explorer

In a previous blog post you learned how Android engineers continuously improve the Android Runtime (ART) in ways that boost app performance on user devices. These changes to the compiler make system and app code faster or smaller. Developers don’t need to change their code and rebuild their apps to benefit from new optimizations, and users get a better experience. In this blog post I’ll take you inside the compiler with a tool called Compiler Explorer and witness some of these optimizations in action.

Compiler Explorer is an interactive website for studying how compilers work. It is an open source project that anyone can contribute to. This year, our engineers added support to Compiler Explorer for the Java and Kotlin programming languages on Android.

You can use Compiler Explorer to understand how your source code is translated to assembly language, and how high-level programming language constructs in a language like Kotlin become low-level instructions that run on the processor.

At Google our engineers use this tool to study different coding patterns for efficiency, to see how existing compiler optimizations work, to share new optimization opportunities, and to teach and learn. Learning is best when it’s done through tools, not rules. Instead of teaching developers to memorize different rules for how to write efficient code or what the compiler might or might not optimize, give the engineers the tools to find out for themselves what happens when they write their code in different ways, and let them experiment and learn. Let’s learn together!

Start by going to godbolt.org. By default we see C++ sample code, so click the dropdown that says C++ and select Android Java. You should see this sample code:

class Square {
   static int square(int num) {
       return num * num;
   }
}

screenshot of sample code in Compiler Explorer

click to enlarge

On the left you’ll see a very simple program. You might say that this is a one line program. But this is not a meaningful statement in terms of performance - how many lines of code there are doesn’t tell us how long this program will take to run, or how much memory will be occupied by the code when the program is loaded.

On the right you’ll see a disassembly of the compiler output. This is expressed in terms of assembly language for the target architecture, where every line is a CPU instruction. Looking at the instructions, we can say that the implementation of the square(int num) method consists of 2 instructions in the target architecture. The number and type of instructions give us a better idea for how fast the program is than the number of lines of source code. Since the target architecture is AArch64 aka ARM64, every instruction is 4 bytes, which means that our program’s code occupies 8 bytes in RAM when the program is compiled and loaded.

Let’s take a brief detour and introduce some Android toolchain concepts.

The Android build toolchain (in brief)

a flow diagram of the Android build toolchain

When you write your Android app, you’re typically writing source code in the Java or Kotlin programming languages. When you build your app in Android Studio, it’s initially compiled by a language-specific compiler into language-agnostic JVM bytecode in a .jar. Then the Android build tools transform the .jar into Dalvik bytecode in .dex files, which is what the Android Runtime executes on Android devices. Typically developers use d8 in their Debug builds, and r8 for optimized Release builds. The .dex files go in the .apk that you push to test devices or upload to an app store. Once the .apk is installed on the user’s device, an on-device compiler which knows the specific target device architecture can convert the bytecode to instructions for the device’s CPU.

We can use Compiler Explorer to learn how all these tools come together, and to experiment with different inputs and see how they affect the outputs.

Going back to our default view for Android Java, on the left is Java source code and on the right is the disassembly for the on-device compiler dex2oat, the very last step in our toolchain diagram. The target architecture is ARM64 as this is the most common CPU architecture in use today by Android devices.

The ARM64 Instruction Set Architecture offers many instructions and extensions, but as you read disassemblies you will find that you only need to memorize a few key instructions. You can look for ARM64 Quick Reference cards online to help you read disassemblies.

At Google we study the output of dex2oat in Compiler Explorer for different reasons, such as:

Gaining intuition for what optimizations the compiler performs in order to think about how to write more efficient code.

Estimating how much memory will be required when a program with this snippet of code is loaded into memory.

Identifying optimization opportunities in the compiler - ways to generate instructions for the same code that are more efficient, resulting in faster execution or in lower memory usage without requiring app developers to change and rebuild their code.

Troubleshooting compiler bugs! 🐞

Compiler optimizations demystified

Let’s look at a real example of compiler optimizations in practice. In the previous blog post you can read about compiler optimizations that the ART team recently added, such as coalescing returns. Now you can see the optimization, with Compiler Explorer!

Let’s load this example:

class CoalescingReturnsDemo {
   String intToString(int num) {
       switch (num) {
           case 1:
               return "1";
           case 2:
               return "2";
           case 3:
               return "3";           
           default:
               return "other";
       }
   }
}

click to enlarge

How would a compiler implement this code in CPU instructions? Every case would be a branch target, with a case body that has some unique instructions (such as referencing the specific string) and some common instructions (such as assigning the string reference to a register and returning to the caller). Coalescing returns means that some instructions at the tail of each case body can be shared across all cases. The benefits grow for larger switches, proportional to the number of the cases.

You can see the optimization in action! Simply create two compiler windows, one for dex2oat from the October 2022 release (the last release before the optimization was added), and another for dex2oat from the November 2023 release (the first release after the optimization was added). You should see that before the optimization, the size of the method body for intToString was 124 bytes. After the optimization, it’s down to just 76 bytes.

This is of course a contrived example for simplicity’s sake. But this pattern is very common in Android code. For instance consider an implementation of Handler.handleMessage(Message), where you might implement a switch statement over the value of Message#what.

How does the compiler implement optimizations such as this? Compiler Explorer lets us look inside the compiler’s pipeline of optimization passes. In a compiler window, click Add New > Opt Pipeline. A new window will open, showing the High-level Internal Representation (HIR) that the compiler uses for the program, and how it’s transformed at every step.

screenshot of the high-level internal representation (HIR) the compiler uses for the program in Compiler Explorer

click to enlarge

If you look at the code_sinking pass you will see that the November 2023 compiler replaces Return HIR instructions with Goto instructions.

Most of the passes are hidden when Filters > Hide Inconsequential Passes is checked. You can uncheck this option and see all optimization passes, including ones that did not change the HIR (i.e. have no “diff” over the HIR).

Let’s study another simple optimization, and look inside the optimization pipeline to see it in action. Consider this code:

class ConstantFoldingDemo {
   static int demo(int num) {
       int result = num;
       if (num == 2) {
           result = num + 2;
       }
       return result;
   }
}

The above is functionally equivalent to the below:

class ConstantFoldingDemo {
   static int demo(int num) {
       int result = num;
       if (num == 2) {
           result = 4;
       }
       return result;
   }
}

Can the compiler make this optimization for us? Let’s load it in Compiler Explorer and turn to the Opt Pipeline Viewer for answers.

screenshot of Opt Pipeline Viewer in Compiler Explorer

click to enlarge

The disassembly shows us that the compiler never bothers with “two plus two”, it knows that if num is 2 then result needs to be 4. This optimization is called constant folding. Inside the conditional block where we know that num == 2 we propagate the constant 2 into the symbolic name num, then fold num + 2 into the constant 4.

You can see this optimization happening over the compiler’s IR by selecting the constant_folding pass in the Opt Pipeline Viewer.

Kotlin and Java, side by side

Now that we’ve seen the instructions for Java code, try changing the language to Android Kotlin. You should see this sample code, the Kotlin equivalent of the basic Java sample we’ve seen before:

fun square(num: Int): Int = num * num

screenshot of sample code in Kotlin in Compiler Explorer

click to enlarge

You will notice that the source code is different but the sample program is functionally identical, and so is the output from dex2oat. Finding the square of a number results in the same instructions, whether you write your source code in Java or in Kotlin.

You can take this opportunity to study interesting language features and discover how they work. For instance, let’s compare Java String concatenation with Kotlin String interpolation.

In Java, you might write your code as follows:

class StringConcatenationDemo {
   void stringConcatenationDemo(String myVal) {
       System.out.println("The value of myVal is " + myVal);
   }
}

Let’s find out how Java String concatenation actually works by trying this example in Compiler Explorer.

click to enlarge

First you will notice that we changed the output compiler from dex2oat to d8. Reading Dalvik bytecode, which is the output from d8, is usually easier than reading the ARM64 instructions that dex2oat outputs. This is because Dalvik bytecode uses higher level concepts. Indeed you can see the names of types and methods from the source code on the left side reflected in the bytecode on the right side. Try changing the compiler to dex2oat and back to see the difference.

As you read the d8 output you may realize that Java String concatenation is actually implemented by rewriting your source code to use a StringBuilder. The source code above is rewritten internally by the Java compiler as follows:

class StringConcatenationDemo {
   void stringConcatenationDemo(String myVal) {
       StringBuilder sb = new StringBuilder();
       sb.append("The value of myVal is ");
       sb.append(myVal);
       System.out.println(sb.toString());
  }
}

In Kotlin, we can use String interpolation:

fun stringInterpolationDemo(myVal: String) {
   System.out.println("The value of myVal is $myVal");
}

The Kotlin syntax is easier to read and write, but does this convenience come at a cost? If you try this example in Compiler Explorer, you may find that the Dalvik bytecode output is roughly the same! In this case we see that Kotlin offers an improved syntax, while the compiler emits similar bytecode.

At Google we study examples of language features in Compiler Explorer to learn about how high-level language features are implemented in lower-level terms, and to better inform ourselves on the different tradeoffs that we might make in choosing whether and how to adopt these language features. Recall our learning principle: tools, not rules. Rather than memorizing rules for how you should write your code, use the tools that will help you understand the upsides and downsides of different alternatives, and then make an informed decision.

What happens when you minify your app?

Speaking of making informed decisions as an app developer, you should be minifying your apps with R8 when building your Release APK. Minifying generally does three things to optimize your app to make it smaller and faster:

1. Dead code elimination: find all the live code (code that is reachable from well-known program entry points), which tells us that the remaining code is not used, and therefore can be removed.

2. Bytecode optimization: various specialized optimizations that rewrite your app’s bytecode to make it functionally identical but faster and/or smaller.

3. Obfuscation: renaming all types, methods, and fields in your program that are not accessed by reflection (and therefore can be safely renamed) from their names in source code (com.example.MyVeryLongFooFactorySingleton) to shorter names that fit in less memory (a.b.c).

Let’s see an example of all three benefits! Start by loading this view in Compiler Explorer.

click to enlarge

First you will notice that we are referencing types from the Android SDK. You can do this in Compiler Explorer by clicking Libraries and adding Android API stubs.

Second, you will notice that this view has multiple source files open. The Kotlin source code is in example.kt, but there is another file called proguard.cfg.

-keep class MinifyDemo {
   public void goToSite(...);
}

Looking inside this file, you’ll see directives in the format of Proguard configuration flags, which is the legacy format for configuring what to keep when minifying your app. You can see that we are asking to keep a certain method of MinifyDemo. “Keeping” in this context means don’t shrink (we tell the minifier that this code is live). Let’s say we’re developing a library and we’d like to offer our customer a prebuilt .jar where they can call this method, so we’re keeping this as part of our API contract.

We set up a view that will let us see the benefits of minifying. On one side you’ll see d8, showing the dex code without minification, and on the other side r8, showing the dex code with minification. By comparing the two outputs, we can see minification in action:

1. Dead code elimination: R8 removed all the logging code, since it never executes (as DEBUG is always false). We removed not just the calls to android.util.Log, but also the associated strings.

2. Bytecode optimization: since the specialized methods goToGodbolt, goToAndroidDevelopers, and goToGoogleIo just call goToUrl with a hardcoded parameter, R8 inlined the calls to goToUrl into the call sites in goToSite. This inlining saves us the overhead of defining a method, invoking the method, and returning from the method.

3. Obfuscation: we told R8 to keep the public method goToSite, and it did. R8 also decided to keep the method goToUrl as it’s used by goToSite, but you’ll notice that R8 renamed that method to a. This method’s name is an internal implementation detail, so obfuscating its name saved us a few precious bytes.

You can use R8 in Compiler Explorer to understand how minification affects your app, and to experiment with different ways to configure R8.

At Google our engineers use R8 in Compiler Explorer to study how minification works on small samples. The authoritative tool for studying how a real app compiles is the APK Analyzer in Android Studio, as optimization is a whole-program problem and a snippet might not capture every nuance. But iterating on release builds of a real app is slow, so studying sample code in Compiler Explorer helps our engineers quickly learn and iterate.

Google engineers build very large apps that are used by billions of people on different devices, so they care deeply about these kinds of optimizations, and strive to make the most use out of optimizing tools. But many of our apps are also very large, and so changing the configuration and rebuilding takes a very long time. Our engineers can now use Compiler Explorer to experiment with minification under different configurations and see results in seconds, not minutes.

You may wonder what would happen if we changed our code to rename goToSite? Unfortunately our build would break, unless we also renamed the reference to that method in the Proguard flags. Fortunately, R8 now natively supports Keep Annotations as an alternative to Proguard flags. We can modify our program to use Keep Annotations:

@UsedByReflection(kind = KeepItemKind.CLASS_AND_METHODS)
public static void goToSite(Context context, String site) {
    ...
}

Here is the complete example. You’ll notice that we removed the proguard.cfg file, and under Libraries we added “R8 keep-annotations”, which is how we’re importing @UsedByReflection.

At Google our engineers prefer annotations over flags. Here we’ve seen one benefit of annotations - keeping the information about the code in one place rather than two makes refactors easier. Another is that the annotations have a self-documenting aspect to them. For instance if this method was kept actually because it’s called from native code, we would annotate it as @UsedByNative instead.

Baseline profiles and you

Lastly, let’s touch on baseline profiles. So far you saw some demos where we looked at dex code, and others where we looked at ARM64 instructions. If you toggle between the different formats you will notice that the high-level dex bytecode is much more compact than low-level CPU instructions. There is an interesting tradeoff to explore here - whether, and when, to compile bytecode to CPU instructions?

For any program method, the Android Runtime has three compilation options:

1. Compile the method Just in Time (JIT).

2. Compile the method Ahead of Time (AOT).

3. Don’t compile the method at all, instead use a bytecode interpreter.

Running code in an interpreter is an order of magnitude slower, but doesn’t incur the cost of loading the representation of the method as CPU instructions which as we’ve seen is more verbose. This is best used for “cold” code - code that runs only once, and is not critical to user interactions.

When ART detects that a method is “hot”, it will be JIT-compiled if it’s not already been AOT compiled. JIT compilation accelerates execution times, but pays the one-time cost of compilation during app runtime. This is where baseline profiles come in. Using baseline profiles, you as the app developer can give ART a hint as to which methods are going to be hot or otherwise worth compiling. ART will use that hint before runtime, compiling the code AOT (usually at install time, or when the device is idle) rather than at runtime. This is why apps that use Baseline Profiles see faster startup times.

With Compiler Explorer we can see Baseline Profiles in action.

Let’s open this example.

click to enlarge

The Java source code has two method definitions, factorial and fibonacci. This example is set up with a manual baseline profile, listed in the file profile.prof.txt. You will notice that the profile only references the factorial method. Consequently, the dex2oat output will only show compiled code for factorial, while fibonacci shows in the output with no instructions and a size of 0 bytes.

In the context of compilation modes, this means that factorial is compiled AOT, and fibonacci will be compiled JIT or interpreted. This is because we applied a different compiler filter in the profile sample. This is reflected in the dex2oat output, which reads: “Compiler filter: speed-profile” (AOT compile only profile code), where previous examples read “Compiler filter: speed” (AOT compile everything).

Conclusion

Compiler Explorer is a great tool for understanding what happens after you write your source code but before it can run on a target device. The tool is easy to use, interactive, and shareable. Compiler Explorer is best used with sample code, but it goes through the same procedures as building a real app, so you can see the impact of all steps in the toolchain.

By learning how to use tools like this to discover how the compiler works under the hood, rather than memorizing a bunch of rules of optimization best practices, you can make more informed decisions.

Now that you've seen how to use the Java and Kotlin programming languages and the Android toolchain in Compiler Explorer, you can level up your Android development skills.

Lastly, don't forget that Compiler Explorer is an open source project on GitHub. If there is a feature you'd like to see then it's just a Pull Request away.

Java and OpenJDK are trademarks or registered trademarks of Oracle and/or its affiliates.

Source: Android Developers Blog

The secret to Android’s improved memory on 1B+ Devices: The latest Android Runtime update

Posted by Santiago Aboy Solanes - Software Engineer

The Android Runtime (ART) executes Dalvik bytecode produced from apps and system services written in the Java or Kotlin languages. We constantly improve ART to generate smaller and more performant code. Improving ART makes the system and user-experience better as a whole, as it is the common denominator in Android apps. In this blog post we will talk about optimizations that reduce code size without impacting performance.

Code size is one of the key metrics we look at, since smaller generated files are better for memory (both RAM and storage). With the new release of ART, we estimate saving users about 50-100MB per device. This could be just the thing you need to be able to update your favorite app, or download a new one. Since ART has been updateable from Android 12, these optimizations reach 1B+ devices for whom we are saving 47-95 petabytes (47-95 millions of GB!) globally!

All the improvements mentioned in this blog post are open source. They are available through the ART mainline update so you don’t even need a full OS update to reap the benefits. We can have our upside-down cake and eat it too!

Optimizing compiler 101

ART compiles applications from the DEX format to native code using the on-device dex2oat tool. The first step is to parse the DEX code and generate an Intermediate Representation (IR). Using the IR, dex2oat performs a number of code optimizations. The last step of the pipeline is a code generation phase where dex2oat converts the IR into native code (for example, AArch64 assembly).

The optimization pipeline has phases that execute in order that each concentrate on a particular set of optimizations. For example, Constant Folding is an optimization that tries to replace instructions with constant values like folding the addition operation 2 + 3 into a 5.

ART's optimization pipeline overview with an example showing we can combine the addition of 2 plus 3 into a 5

The IR can be printed and visualized, but is very verbose compared to Kotlin language code. For the purposes of this blog post, we will show what the optimizations do using Kotlin language code, but know that they are happening to IR code.

Code size improvements

For all code size optimizations, we tested our optimizations on over half a million APKs present in the Google Play Store and aggregated the results.

Eliminating write barriers

We have a new optimization pass that we named Write Barrier Elimination. Write barriers track modified objects since the last time they were examined by the Garbage Collector (GC) so that the GC can revisit them. For example, if we have:

Example showing that we can eliminate redundant write barriers if there a GC cannot happen between set instructions

Previously, we would emit a write barrier for each object modification but we only need a single write barrier because: 1) the mark will be set in o itself (and not in the inner objects), and 2) a garbage collection can't have interacted with the thread between those sets.

If an instruction may trigger a GC (for example, Invokes and SuspendChecks), we wouldn't be able to eliminate write barriers. In the following example, we can't guarantee a GC won't need to examine or modify the tracking information between the modifications:

Example showing that we can't eliminate redundant write barriers because a GC may happen between set instructions Implementing this new pass contributes to 0.8% code size reduction.

Implicit suspend checks

Let's assume we have several threads running. Suspend checks are safepoints (represented by the houses in the image below) where we can pause the thread execution. Safepoints are used for many reasons, the most important of them being Garbage Collection. When a safepoint call is issued, the threads must go into a safepoint and are blocked until they are released.

The previous implementation was an explicit boolean check. We would load the value, test it, and branch into the safepoint if needed.

Shows the explicit suspend check (load + test + branch) when multiple threads are running

Implicit suspend checks is an optimization that eliminates the need for the test and branch instructions. Instead, we only have one load: if the thread needs to suspend, that load will trap and the signal handler will redirect the code to a suspend check handler as if the method made a call.

Shows the implicit suspend check (two loads: the first one loads null and the second one traps) when multiple threads are running

Going into a bit more detail, a reserved register rX is pre-loaded with an address within the thread where we have a pointer pointing to itself. As long as we don't need to do a suspend check, we keep that self-pointing pointer. When we need to do a suspend check, we clear the pointer and once it becomes visible to the thread the first LDR rX, [rX] will load null and the second will segfault.

The suspend request is essentially asking the thread to suspend some time soon, so the minor delay of having to wait for the second load is okay.

This optimization reduces code size by 1.8%.

Coalescing returns

It is common for compiled methods to have an entry frame. If they have it, those methods have to deconstruct it when they return, which is also known as an exit frame. If a method has several return instructions, it will generate several exit frames, one per return instruction.

By coalescing the return instructions into one, we are able to have one return point and are able to remove the extra exit frames. This is especially useful for switch cases with multiple return statements. Switch case optimized by having one return instead of multiple return instructions

Coalescing returns reduces code size by 1%.

Other optimization improvements

We improved a lot of our existing optimization passes. For this blog post, we will group them up in the same section, but they are independent of each other. All the optimizations in the following section contribute to a 5.7% code size reduction.

Code Sinking

Code sinking is an optimization pass that pushes down instructions to uncommon branches like paths that end with a throw. This is done to reduce wasted cycles on instructions that are likely not going to be used.

We improved code sinking in graphs with try catches: we now allow sinking code as long as we don't sink it inside of a different try than the one it started in (or inside of any try if it wasn't in one to begin with).

Code sinking optimizations in the presence of a try catch

In the first example, we can sink the Object creation since it will only be used in the if(flag) path and not the other and it is within the same try. With this change, at runtime it will only be run if flag is true. Without getting into too much technical detail, what we can sink is the actual object creation, but loading the Object class still remains before the if. This is hard to show with Kotlin code, as the same Kotlin line turns into several instructions at the ART compiler level.

In the second example, we cannot sink the code as we would be moving an instance creation (which may throw) inside of another try.

Code Sinking is mostly a runtime performance optimization, but it can help reduce the register pressure. By moving instructions closer to their uses, we can use fewer registers in some cases. Using fewer registers means fewer move instructions, which ends up helping code size.

Loop optimization

Loop optimization helps eliminate loops at compile time. In the following example, the loop in foo will multiply a by 10, 10 times. This is the same as multiplying by 100. We enabled loop optimization to work in graphs with try catches.

In foo, we can optimize the loop since the try catch is unrelated.

In bar or baz, however, we don't optimize it. It is not trivial to know the path the loop will take if we have a try in the loop, or if the loop as a whole is inside of a try.

Dead code elimination – Remove unneeded try blocks

We improved our dead code elimination phase by implementing an optimization that removes try blocks that don't contain throwing instructions. We are also able to remove some catch blocks, as long as no live try blocks point to it.

In the following example, we inline bar into foo. After that, we know that the division cannot throw. Later optimization passes can leverage this and improve the code.

We can remove tries which contain no throwing instructions

Just removing the dead code from the try catch is good enough, but even better is the fact that in some cases we allow other optimizations to take place. If you recall, we don't do loop optimization when the loop has a try, or it's inside of one. By eliminating this redundant try/catch, we can loop optimize producing smaller and faster code.

Another example of removing tries which contain no throwing instructions

Dead code elimination – SimplifyAlwaysThrows

During the dead code elimination phase, we have an optimization we call SimplifyAlwaysThrows. If we detect that an invoke will always throw, we can safely discard whatever code we have after that method call since it will never be executed.

We also updated SimplifyAlwaysThrows to work in graphs with try catches, as long as the invoke itself is not inside of a try. If it is inside of a try, we might jump to a catch block, and it gets harder to figure out the exact path that will be executed.

We can use the SimplifyAlwaysThrows optimization as long as the invoke itself is not inside of a try

We also improved:

Detection when an invoke will always throw by looking at their parameters. On the left, we will mark divide(1, 0) as always throwing even when the generic method doesn't always throw.
SimplifyAlwaysThrows to work on all invokes. Previously we had restrictions for example don't do it for invokes leading to an if, but we could remove all of the restrictions.

We improved detection, and removed some of the restrictions from this optimization

Load Store Elimination – Working with try catch blocks

Load store elimination (LSE) is an optimization pass that removes redundant loads and stores.

We improved this pass to work with try catches in the graph. In foo, we can see that we can do LSE normally if the stores/loads are not directly interacting with the try. In bar, we can see an example where we either go through the normal path and don't throw, in which case we return 1; or we throw, catch it and return 2. Since the value is known for every path, we can remove the redundant load.

Examples showing we can perform Load Store Elimination in graphs with try catches, as long as the instructions are not inside of a try

Load Store Elimination – Working with release/acquire operations

We improved our load store elimination pass to work in graphs with release/acquire operations. These are volatile loads, stores, and monitor operations. To clarify, this means that we allow LSE to work in graphs that have those operations, but we don't remove said operations.

In the example, i and j are regular ints, and vi is a volatile int. In foo, we can skip loading the values since there's not a release/acquire operation between the sets and the loads. In bar, the volatile operation happens between them so we can't eliminate the normal loads. Note that it doesn't matter that the volatile load result is not used—we cannot eliminate the acquire operation.

Examples showing that we can perform LSE in graphs with release/acquire operations. Note that the release/acquire operations themselves are not removed since they are needed for synchronization.

This optimization works similarly with volatile stores and monitor operations (which are synchronized blocks in Kotlin).

New inliner heuristic

Our inliner pass has a wide range of heuristics. Sometimes we decide not to inline a method because it is too big, or sometimes to force inlining of a method because it is too small (for example, empty methods like Object initialization).

We implemented a new inliner heuristic: Don't inline invokes leading to a throw. If we know we are going to throw we will skip inlining those methods, as throwing itself is costly enough that inlining that code path is not worth it.

We had three families of methods that we are skipping to inline:

Calculating and printing debug information before a throw.
Inlining the error constructor itself.
Finally blocks are duplicated in our optimizing compiler. We have one for the normal case (i.e. the try doesn't throw), and one for the exceptional case. We do this because in the exceptional case we have to: catch, execute the finally block, and rethrow. The methods in the exceptional case will now not be inlined, but the ones in the normal case will.

Examples showing:
* calculating and printing debug information before a throw
* inlining the error constructor itself
* methods in finally blocks

Constant folding

Constant folding is the optimization pass that changes operations into constants if possible. We implemented an optimization that propagates variables known to be constant when used in if guards. Having more constants in the graph allows us to perform more optimizations later.

In foo, we know that a has the value 2 in the if guard. We can propagate that information and deduce that b must be 4. In a similar vein, in bar we know that cond must be true in the if case and false in the else case (simplifying the graphs).

Example showing that if we know that a variable has a constant value within the scope of an
`if`, we will propagate that example within said scope

Putting it all together

If we take into account all code size optimizations in this blog post we achieved a 9.3% code size reduction!

To put things into perspective, an average phone can have 500MB-1GB in optimized code (The actual number can be higher or lower, depending on how many apps you have installed, and which particular apps you installed), so these optimizations save about 50-100MB per device. Since these optimizations reach 1B+ devices, we are saving 47-95 petabytes globally!

Source: Android Developers Blog

Latest ARTwork on hundreds of millions of devices

Posted by Serban Constantinescu, Product Manager

Wouldn’t it be great if each update improved start-up times, execution speed, and memory usage of your apps? Google Play system updates for the Android Runtime (ART) do just that. These updates deliver performance improvements, the latest security fixes, and unify the core OpenJDK APIs across hundreds of millions of devices, including all Android 12+ devices and soon Android Go.

ART is the engine behind the Android operating system (OS). It provides the runtime and core APIs that all apps and most OS services rely on. Both Java and Kotlin are compiled down to bytecode executed by ART. Improvements in the runtime, compiler and core API benefit all developers making app execution faster and bytecode compilation more efficient.

While parts of Android are customizable by device manufacturers, ART is the same for all devices and Google Play system updates enable a path to modular updates.

Modularizing the OS

Android was originally designed for monolithic updates, which meant that OS components did not need to have clear API boundaries. This is because all dependent software would be built together. However, this made it difficult to update ART independently of the rest of the OS. Our first challenge was to untangle ART's dependencies and create clear, well-defined, and tested API boundaries. This allowed us to modularize ART and make it independently updatable.

Illustration of a racecar with an engine part hovering above the hood. A curved arrow points to where this part should go

As a core part of the OS, ART had to blaze new trails and engineer new OS boundaries. These new boundaries were so extensive that manually adding and updating them would be too time-consuming. Therefore, we implemented automatic generation of those through introspection in the build system.

Another example is stack unwinding, which reports the functions last executed when an issue is detected. Before modularizing the OS, all stack unwinding code was built together and could change across Android versions. This made the transition even more challenging, since there is only one version of ART that is delivered to many versions of Android, we had to create a new API boundary as well as design it to be forward-compatible with newer versions of the ART APEX module on devices that are no longer getting full OS updates.

Recently, for Android 14, we refactored the interface between the Package Manager, the service that determines how to install and update apps, and ART. This moves the OS boundary from the ART dex2oat command line to a well-defined interface that enables future optimizations, such as finer-grained control over the compilation mode.

ART updatability also introduced new challenges. For example, the collection of Java libraries, referred to as the Boot Classpath, had to be securely recompiled to ensure good performance. This required introducing a new secure state for compilation during boot as well as a fallback JIT compilation mode.

On older devices, the secure compilation happens on the first reboot after an ART update. On newer devices that support the Android Virtualization Framework, the compilation happens while the device is idle, in an enclave called Isolated Compilation – saving up to 20 seconds of boot-time.

Testing the ART APEX module

The ART APEX module is a complex piece of software with an order of magnitude more APIs than any other APEX module. It also backs a quarter of the developer APIs available in the Android SDK. In addition, ART has a compiler that aims to make the most of the underlying hardware by generating chipset-specific instructions, such as Arm SVE. This, together with the multiple OS versions on which the ART APEX module has to run, makes testing challenging.

We first modularized the testing framework from per-platform release (e.g. Android CTS) to per module. We did this by introducing an ART-specific Mainline Test Suite (MTS), which tests both compiler and runtime, as well as core OpenJDK APIs, while collecting code coverage statistics.

Our target is 100% API coverage and high line coverage, especially for new APIs. Together with HWASan and fuzzing, all of the tests described above contribute to a massive test load that needs to be sharded across multiple devices to ensure that it completes in a reasonable amount of time.

Illustration of modularized testing framework

We test the upcoming ART release every day by compiling over 18 million APKs and running app compatibility tests, and startup, performance, and memory benchmarks on a variety of Android devices that replicate the diversity of our ecosystem as closely as possible. Once tests pass with all possible compilation modes, all Garbage Collector algorithms, and supported OS versions, we begin gradually rolling out the next ART release.

Benefits of ART Google Play system updates

By updating ART independently of OS updates, users get the latest performance optimizations and security fixes as quickly as possible, while developers get OpenJDK improvements and compiler optimisations that benefit both Java and Kotlin.

As shown in the graph below, the runtime and compiler optimizations in the ART 13 update delivered real-world app start-up improvements of up to 30% on some devices.

Graph of average app startup time showing startup time in milliseconds with improvement up to 30% across 12 weeks on devices running the latest ART Google Play system update

ART updates allow us to frequently deploy fixes with little additional effort from our ecosystem partners. They include propagating upstream OpenJDK fixes to Android devices as quickly as possible, as well as runtime and compiler security fixes, such as CVE-2022-20502, which was detected by our automated fuzzing tests.

For developers, ART updates mean that you can now target the latest programming features. ART 13 delivered OpenJDK 11 core language features, which was the fastest-ever adoption of a new OpenJDK release on Android devices.

What’s next

In the coming months, we'll be releasing ART 14 to all compatible devices. ART 14 includes OpenJDK 17 support along with new compiler and runtime optimizations that improve performance while reducing code size. Stay tuned for more details on ART 14!

Java and OpenJDK are trademarks or registered trademarks of Oracle and/or its affiliates.

Source: Android Developers Blog

Bringing artworks to life with AR

Posted by Richard Adem, UX Engineer at Google Arts & Culture

What is Art Filter?

One of the best ways to learn about global culture is by trying on famous art pieces using Google’s Augmented Reality technology on your mobile device. What does it feel like to wear a three thousand year old necklace, put on a sixteenth century Japanese helmet or don pearl earrings and pose in a Vermeer?

Google Arts & Culture have created a new feature called Art Filter allowing everyone to learn about culturally significant art pieces from around the world and put themselves inside famous paintings, normally safely displayed in a museum.

We teamed up with the MediaPipe team, which offers cross-platform, customizable ML solutions to combine ML with rendering to generate stunning visuals.

Working closely with the MediaPipe team to utilize their face mesh and 3D face transform allowed us to create custom effects for each of the artifacts we had chosen, and easily display them on as part of the Google Arts & Culture iOS and Android app.

Figure 1. The Art Filter feature.

The Challenges

We selected five iconic cultural treasures from around the world:

Given their diverse formats or textures each artwork or object required special approaches to bring it to life in AR.

gif of user wearing jewelry on art filter feature

Figure 2. User wearing the jewelry from Johannes Vermeer's "Girl with a Pearl Earring" - Mauritshuis museum, Hague.

Creating 3D objects that can be viewed from all sides, using 2D references.

Some of the artwork we selected are 2D paintings and we wanted everyone to immerse themselves in the paintings. Our team of 3D artists and designers took high resolution gigapixel images from Google Arts & Culture and projected them onto 3D meshes to texture them. We also extended the 2D textures all the way around the 3D meshes while maintaining the style of the original artist. This means that when you turn your head the previously hidden parts of the piece are viewable from every angle, mimicking how the object would look in real-life.

Gif of the Van Gogh Self-Portrait filter

Figure 3. The Van Gogh Self-Portrait filter - Musée d’Orsay, Paris.

Our cultural partners were immensely helpful during the creation of Art Filter. They have sourced a huge amount of reference images allowing us to reproduce the pieces accurately using photographs from different angles, to help them appear to fit into the “real world” in AR (using size comparisons).

Layering elements of the effect along with the image of the user.

Art Filter takes an image of the user from their device’s camera and uses that to generate a 3D mesh of the user’s face. All processing of user images or video feeds is run entirely on device. We do not use this feature to identify or collect any personal biometric data; the feature cannot be used to identify an individual.

The image is then reused to texture the face mesh, generated in real-time on-device with MediaPipe Face Mesh, representing it in the virtual 3D world within the device. We then add virtual 2D and 3D layers around the face to complete the effect. The Tengu Helmet, for example, sits on top of the face mesh in 3D and is “attached” to the face mesh so it moves around when the user moves their head around. The Vermeer earrings with a headscarf and Frida Kahlo’s necklace are attached to the user’s image in a similar way. The Van Gogh effect works slightly differently since we still use a mesh of the user’s face but this time we apply a texture from the painting.

We use 2D elements to complete the scene as well, such as the backgrounds in the Kahlo and Van Gogh paintings. These were created by carefully separating the painting subjects from the background then placing them behind the user in 3D. You may notice that Van Gogh’s body is also 2D, shown as a “billboard” so that it always faces the camera.

Figure 4. Creating the 3D mesh showing layers and masks.

Using shaders for different materials such as the metal helmet.

To create a realistic looking material we used “Physically Based” Rendering shaders. You can see this on the Tengu helmet, it has a bumpy surface that is affected by the real life light captured by the device. This requires creating extra textures, texture maps, for the effect that uses colors to represent how bumpy or shiny the 3D object should appear. Texture maps look like bright pink and blue images but tell the renderer about tiny details on the surface of the object without creating any extra polygons, which can slow down the frame rate of the feature.

Figure 5. User wearing Helmet with Tengu Mask and Crows - The Metropolitan Museum of Art.

Conclusion

We hope you enjoy the collection we have created in Art Filter. Please visit and try for yourself! You can also explore more amazing ML features with Google Arts & Culture such as Art Selfie and Art Transfer.

We hope to bring many more filters to the feature and are looking forward to new features from MediaPipe.

Source: Google Developers Blog

Monster Mash: A Sketch-Based Tool for Casual 3D Modeling and Animation

Posted by Cassidy Curtis, Visual Designer and David Salesin, Principal Scientist, Google Research

3D computer animation is a time-consuming and highly technical medium — to complete even a single animated scene requires numerous steps, like modeling, rigging and animating, each of which is itself a sub-discipline that can take years to master. Because of its complexity, 3D animation is generally practiced by teams of skilled specialists and is inaccessible to almost everyone else, despite decades of advances in technology and tools. With the recent development of tools that facilitate game character creation and game balance, a natural question arises: is it possible to democratize the 3D animation process so it’s accessible to everyone?

To explore this concept, we start with the observation that most forms of artistic expression have a casual mode: a classical guitarist might jam without any written music, a trained actor could ad-lib a line or two while rehearsing, and an oil painter can jot down a quick gesture drawing. What these casual modes have in common is that they allow an artist to express a complete thought quickly and intuitively without fear of making a mistake. This turns out to be essential to the creative process — when each sketch is nearly effortless, it is possible to iteratively explore the space of possibilities far more effectively.

In this post, we describe Monster Mash, an open source tool presented at SIGGRAPH Asia 2020 that allows experts and amateurs alike to create rich, expressive, deformable 3D models from scratch — and to animate them — all in a casual mode, without ever having to leave the 2D plane. With Monster Mash, the user sketches out a character, and the software automatically converts it to a soft, deformable 3D model that the user can immediately animate by grabbing parts of it and moving them around in real time. There is also an online demo, where you can try it out for yourself.

Creating a walk cycle using Monster Mash. Step 1: Draw a character. Step 2: Animate it.

Creating a 2D Sketch
The insight that makes this casual sketching approach possible is that many 3D models, particularly those of organic forms, can be described by an ordered set of overlapping 2D regions. This abstraction makes the complex task of 3D modeling much easier: the user creates 2D regions by drawing their outlines, then the algorithm creates a 3D model by stitching the regions together and inflating them. The result is a simple and intuitive user interface for sketching 3D figures.

For example, suppose the user wants to create a 3D model of an elephant. The first step is to draw the body as a closed stroke (a). Then the user adds strokes to depict other body parts such as legs (b). Drawing those additional strokes as open curves provides a hint to the system that they are meant to be smoothly connected with the regions they overlap. The user can also specify that some new parts should go behind the existing ones by drawing them with the right mouse button (c), and mark other parts as symmetrical by double-clicking on them (d). The result is an ordered list of 2D regions.

Steps in creating a 2D sketch of an elephant.

Stitching and Inflation
To understand how a 3D model is created from these 2D regions, let’s look more closely at one part of the elephant. First, the system identifies where the leg must be connected to the body (a) by finding the segment (red) that completes the open curve. The system cuts the body’s front surface along that segment, and then stitches the front of the leg together with the body (b). It then inflates the model into 3D by solving a modified form of Poisson’s equation to produce a surface with a rounded cross-section (c). The resulting model (d) is smooth and well-shaped, but because all of the 3D parts are rooted in the drawing plane, they may intersect each other, resulting in a somewhat odd-looking “elephant”. These intersections will be resolved by the deformation system.

Illustration of the details of the stitching and inflation process. The schematic illustrations (b, c) are cross-sections viewed from the elephant’s front.

Layered Deformation
At this point we just have a static model — we need to give the user an easy way to pose the model, and also separate the intersecting parts somehow. Monster Mash’s layered deformation system, based on the well-known smooth deformation method as-rigid-as-possible (ARAP), solves both of these problems at once. What’s novel about our layered “ARAP-L” approach is that it combines deformation and other constraints into a single optimization framework, allowing these processes to run in parallel at interactive speed, so that the user can manipulate the model in real time.

The framework incorporates a set of layering and equality constraints, which move body parts along the z axis to prevent them from visibly intersecting each other. These constraints are applied only at the silhouettes of overlapping parts, and are dynamically updated each frame.

In steps (d) through (h) above, ARAP-L transforms a model from one with intersecting 3D parts to one with the depth ordering specified by the user. The layering constraints force the leg’s silhouette to stay in front of the body (green), and the body’s silhouette to stay behind the leg (yellow). Equality constraints (red) seal together the loose boundaries between the leg and the body.

Meanwhile, in a separate thread of the framework, we satisfy point constraints to make the model follow user-defined control points (described in the section below) in the xy-plane. This ARAP-L method allows us to combine modeling, rigging, deformation, and animation all into a single process that is much more approachable to the non-specialist user.

The model deforms to match the point constraints (red dots) while the layering constraints prevent the parts from visibly intersecting.

Animation
To pose the model, the user can create control points anywhere on the model’s surface and move them. The deformation system converges over multiple frames, which gives the model’s movement a soft and floppy quality, allowing the user to intuitively grasp its dynamic properties — an essential prerequisite for kinesthetic learning.

Because the effect of deformations converges over multiple frames, our system lends 3D models a soft and dynamic quality.

To create animation, the system records the user’s movements in real time. The user can animate one control point, then play back that movement while recording additional control points. In this way, the user can build up a complex action like a walk by layering animation, one body part at a time. At every stage of the animation process, the only task required of the user is to move points around in 2D, a low-risk workflow meant to encourage experimentation and play.

Conclusion
We believe this new way of creating animation is intuitive and can thus help democratize the field of computer animation, encouraging novices who would normally be unable to try it on their own as well as experts who often require fast iteration under tight deadlines. Here you can see a few of the animated characters that have been created using Monster Mash. Most of these were created in a matter of minutes.

A selection of animated characters created using Monster Mash. The original hand-drawn outline used to create each 3D model is visible as an inset above each character.

All of the code for Monster Mash is available as open source, and you can watch our presentation and read our paper from SIGGRAPH Asia 2020 to learn more. We hope this software will make creating 3D animations more broadly accessible. Try out the online demo and see for yourself!

Acknowledgements
Monster Mash is the result of a collaboration between Google Research, Czech Technical University in Prague, ETH Zürich, and the University of Washington. Key contributors include Marek Dvorožňák, Daniel Sýkora, Cassidy Curtis, Brian Curless, Olga Sorkine-Hornung, and David Salesin. We are also grateful to Hélène Leroux, Neth Nom, David Murphy, Samuel Leather, Pavla Sýkorová, and Jakub Javora for participating in the early interactive sessions.

Source: Google AI Blog

Using GANs to Create Fantastical Creatures

Posted by Andeep Singh Toor, Stadia Software Engineer and Fred Bertsch, Software Engineer, Google Research, Brain Team

Creating art for digital video games takes a high degree of artistic creativity and technical knowledge, while also requiring game artists to quickly iterate on ideas and produce a high volume of assets, often in the face of tight deadlines. What if artists had a paintbrush that acted less like a tool and more like an assistant? A machine learning model acting as such a paintbrush could reduce the amount of time necessary to create high-quality art without sacrificing artistic choices, perhaps even enhancing creativity.

Today, we present Chimera Painter, a trained machine learning (ML) model that automatically creates a fully fleshed out rendering from a user-supplied creature outline. Employed as a demo application, Chimera Painter adds features and textures to a creature outline segmented with body part labels, such as “wings” or “claws”, when the user clicks the “transform” button. Below is an example using the demo with one of the preset creature outlines.

Using an image imported to Chimera Painter or generated with the tools provided, an artist can iteratively construct or modify a creature outline and use the ML model to generate realistic looking surface textures. In this example, an artist (Lee Dotson) customizes one of the creature designs that comes pre-loaded in the Chimera Painter demo.

In this post, we describe some of the challenges in creating the ML model behind Chimera Painter and demonstrate how one might use the tool for the creation of video game-ready assets.

Prototyping for a New Type of Model
In developing an ML model to produce video-game ready creature images, we created a digital card game prototype around the concept of combining creatures into new hybrids that can then battle each other. In this game, a player would begin with cards of real-world animals (e.g., an axolotl or a whale) and could make them more powerful by combining them (making the dreaded Axolotl-Whale chimera). This provided a creative environment for demonstrating an image-generating model, as the number of possible chimeras necessitated a method for quickly designing large volumes of artistic assets that could be combined naturally, while still retaining identifiable visual characteristics of the original creatures.

Since our goal was to create high-quality creature card images guided by artist input, we experimented with generative adversarial networks (GANs), informed by artist feedback, to create creature images that would be appropriate for our fantasy card game prototype. GANs pair two convolutional neural networks against each other: a generator network to create new images and a discriminator network to determine if these images are samples from the training dataset (in this case, artist-created images) or not. We used a variant called a conditional GAN, where the generator takes a separate input to guide the image generation process. Interestingly, our approach was a strict departure from other GAN efforts, which typically focus on photorealism.

To train the GANs, we created a dataset of full color images with single-species creature outlines adapted from 3D creature models. The creature outlines characterized the shape and size of each creature, and provided a segmentation map that identified individual body parts. After model training, the model was tasked with generating multi-species chimeras, based on outlines provided by artists. The best performing model was then incorporated into Chimera Painter. Below we show some sample assets generated using the model, including single-species creatures, as well as the more complex multi-species chimeras.

Generated card art integrated into the card game prototype showing basic creatures (bottom row) and chimeras from multiple creatures, including an Antlion-Porcupine, Axolotl-Whale, and a Crab-Antion-Moth (top row). More info about the game itself is detailed in this Stadia Research presentation.

Learning to Generate Creatures with Structure
An issue with using GANs for generating creatures was the potential for loss of anatomical and spatial coherence when rendering subtle or low-contrast parts of images, despite these being of high perceptual importance to humans. Examples of this can include eyes, fingers, or even distinguishing between overlapping body parts with similar textures (see the affectionately named BoggleDog below).

GAN-generated image showing mismatched body parts.

Generating chimeras required a new non-photographic fantasy-styled dataset with unique characteristics, such as dramatic perspective, composition, and lighting. Existing repositories of illustrations were not appropriate to use as datasets for training an ML model, because they may be subject to licensing restrictions, have conflicting styles, or simply lack the variety needed for this task.

To solve this, we developed a new artist-led, semi-automated approach for creating an ML training dataset from 3D creature models, which allowed us to work at scale and rapidly iterate as needed. In this process, artists would create or obtain a set of 3D creature models, one for each creature type needed (such as hyenas or lions). Artists then produced two sets of textures that were overlaid on the 3D model using the Unreal Engine — one with the full color texture (left image, below) and the other with flat colors for each body part (e.g., head, ears, neck, etc), called a “segmentation map” (right image, below). This second set of body part segments was given to the model at training to ensure that the GAN learned about body part-specific structure, shapes, textures, and proportions for a variety of creatures.

Example dataset training image and its paired segmentation map.

The 3D creature models were all placed in a simple 3D scene, again using the Unreal Engine. A set of automated scripts would then take this 3D scene and interpolate between different poses, viewpoints, and zoom levels for each of the 3D creature models, creating the full color images and segmentation maps that formed the training dataset for the GAN. Using this approach, we generated 10,000+ image + segmentation map pairs per 3D creature model, saving the artists millions of hours of time compared to creating such data manually (at approximately 20 minutes per image).

Fine Tuning
The GAN had many different hyper-parameters that could be adjusted, leading to different qualities in the output images. In order to better understand which versions of the model were better than others, artists were provided samples for different creature types generated by these models and asked to cull them down to a few best examples. We gathered feedback about desired characteristics present in these examples, such as a feeling of depth, style with regard to creature textures, and realism of faces and eyes. This information was used both to train new versions of the model and, after the model had generated hundreds of thousands of creature images, to select the best image from each creature category (e.g., gazelle, lynx, gorilla, etc).

We tuned the GAN for this task by focusing on the perceptual loss. This loss function component (also used in Stadia’s Style Transfer ML) computes a difference between two images using extracted features from a separate convolutional neural network (CNN) that was previously trained on millions of photographs from the ImageNet dataset. The features are extracted from different layers of the CNN and a weight is applied to each, which affects their contribution to the final loss value. We discovered that these weights were critically important in determining what a final generated image would look like. Below are some examples from the GAN trained with different perceptual loss weights.

Dino-Bat Chimeras generated using varying perceptual loss weights.

Some of the variation in the images above is due to the fact that the dataset includes multiple textures for each creature (for example, a reddish or grayish version of the bat). However, ignoring the coloration, many differences are directly tied to changes in perceptual loss values. In particular, we found that certain values brought out sharper facial features (e.g., bottom right vs. top right) or “smooth” versus “patterned” (top right vs. bottom left) that made generated creatures feel more real.

Here are some creatures generated from the GAN trained with different perceptual loss weights, showing off a small sample of the outputs and poses that the model can handle.

Creatures generated using different models.

A generated chimera (Dino-Bat-Hyena, to be exact) created using the conditional GAN. Output from the GAN (left) and the post-processed / composited card (right).

Chimera Painter
The trained GAN is now available in the Chimera Painter demo, allowing artists to work iteratively with the model, rather than drawing dozens of similar creatures from scratch. An artist can select a starting point and then adjust the shape, type, or placement of creature parts, enabling rapid exploration and for the creation of a large volume of images. The demo also allows for uploading a creature outline created in an external program, like Photoshop. Simply download one of the preset creature outlines to get the colors needed for each creature part and use this as a template for drawing one outside of Chimera Painter, and then use the “Load’ button on the demo to use this outline to flesh out your creation.

It is our hope that these GAN models and the Chimera Painter demonstration tool might inspire others to think differently about their art pipeline. What can one create when using machine learning as a paintbrush?

Acknowledgments
This project is conducted in collaboration with many people. Thanks to Ryan Poplin, Lee Dotson, Trung Le, Monica Dinculescu, Marc Destefano, Aaron Cammarata, Maggie Oh, Richard Wu, Ji Hun Kim, Erin Hoffman-John, and Colin Boswell. Thanks to everyone who pitched in to give hours of art direction, technical feedback, and drawings of fantastic creatures.

Source: Google AI Blog

India’s mini-masterpieces brought to life with AI and AR

Miniature paintings are among the most beautiful, most technically-advanced and most sophisticated art forms in Indian culture. Though compact (about the same size as a small book), they typically tackle profound themes such as love, power and faith. Using technologies like machine learning, augmented reality and high-definition robotic cameras, Google Arts & Culture has partnered with the National Museum in New Delhi to showcase these special works of art in a magical new way.

Virtually wander the halls of a special ‘pocket gallery’

Inspired by the domes and doorways that punctuate Indian homes and public spaces, this is the first AR-powered art gallery designed with traditional Indian architecture. Using your smartphone, you can open up a life-size virtual space, walk around at your leisure and zoom into your favorite pieces—you have this beautiful museum to yourself!

The first AR-powered art gallery inspired by the domes and doorways of India.

Art meets AI, with Magnify Miniatures

Miniatures are rich in detailed representations of topics that have shaped Indian culture. Thanks to machine learning, you can now discover these attributes across a collection of miniature paintings. Select from tags like ‘face’, ‘animal’, or even ‘moustache’, and see where these features occur!

Take a closer look with immersive in-painting tours

Art Camera, our ultra-high-resolution robotic camera, was deployed to produce the most vivid images of masterpieces ever seen. Using these images, we’ve created over 75 in-painting tours to help you stop and appreciate details like wisps of smoke from firecrackers, or see how finesse and variety of every person’s attire in this royal procession—flourishes that you wouldn’t be able to see well with the naked eye.

You can zoom in to see the wisps of smoke in this miniature titled "Lady Holding a Sparkler"

Explore thousands of rich stories and images

The virtual collection includes 1,200 high resolution images from 25 collections all around the world and more than 75 stories, depicting scenes that include legendary marriage processions, the joy of being among nature, or epic battles. Curious minds, students and families will find playful and educational ways to enjoy the world of Indian miniatures, such as an interactive coloring book.

We’re glad that through the power of technology, people all over the world can engage with these miniature masterpieces like never before.

Posted by Simon Rein, Program Manager, Google Arts & Culture

Source: Official Google India Blog

A digital exhibit to elevate Indigenous art

In March 2020, the 22nd Biennale of Sydney opened to wide acclaim—only to close after 10 days because of COVID-19. The Biennale has since physically reopened to limited audiences, but now, through a virtual exhibit on Google Arts & Culture, people all over the world can experience it.
This year’s Biennale is led by First Nations artists, and showcases work from marginalised communities around the world, under the artistic direction of the Indigenous Australian artist, Brook Andrew. It’s titled NIRIN—meaning “edge”—a word of Brook’s mother’s Nation, the Wiradjuri people of western New South Wales.

To commemorate the opening of this unique exhibition, and learn more about its origins and purpose, we spoke with Jodie Polutele, Head of Communications and Community Engagement at the Biennale of Sydney.

Tell us about the theme of this year’s exhibition.
NIRIN is historic in its focus on the unresolved nature of Australian and global colonial history. It presents the work of artists and communities that are often relegated to the edge and whose practices challenge dominant narratives.
As a community, we’re at a critical point in time where the voices, histories and spheres of knowledge that have been historically pushed to “the edge” are being heard and shared. The recent Black Lives Matter protests in the United States and in other parts of the world have triggered a belated awakening in many people—particularly in Australia—about the real-life impacts of systemic racism and inequality. But we have a long way to go, and the art and ideas presented in NIRIN are one way to start (or continue) the conversation.
What does this offer audiences, both in Australia, and all over the world, particularly during this time?
Many of the artworks ask audiences to be critical of dominant historical narratives, and our own perspective and privilege; we are forced to recognise and question our own discomfort. In doing so, they also present an opportunity to inspire truly meaningful action.
What are some of the highlights of the exhibition?
Some highlights include Healing Land, Remembering Country by Tony Albert, a sustainable greenhouse which raises awareness of the Stolen Generations and poses important questions about how we remember, give justice to and rewrite complex and traumatic histories. Latai Taumoepeau’s endurance performance installation on Cockatoo Island explores the fragility of Pacific Island nations and the struggle of rising sea levels and displacement. Zanele Muholi’s three bodies of work at the Museum of Contemporary Art look at the politics of race, gender and sexuality. Wiradjuri artist Karla Dickens’ installation A Dickensian Circus presents a dramatic collection of objects inside the Art Gallery of New South Wales’ grand vestibule, reclaiming the space to share the hidden stories and histories of Indigenous people.

Tony Albert's sustainable greenhouse posing important questions about historical and intergenerational trauma

This virtual exhibit was not what you originally imagined. Can you tell us what hurdles you have had to overcome?
The Biennale of Sydney takes more than two years to produce with a team of dedicated people. Closing the exhibitions and cancelling or postponing a program of more than 600 events was devastating. But with the enormous support of the Google Arts & Culture team, we have delivered a virtual exhibition that is respectful of artists’ works and conveys the true vision of NIRIN—inspiring conversation and action through a meaningful arts experience. We hope that NIRIN on Google Arts & Culture will be an enduring legacy for the exhibition, and also for the talented team who made it happen.

Watch Latal Taumoepeau's endurance performance, The Last Resort

Posted Elisabeth Callot, Program Manager, Google Arts & Culture

Source: Official Google Australia Blog

Exploring and Visualizing an Open Global Dataset

Posted by Reena Jana, Creative Lead, Business Inclusion, and Josh Lovejoy, UX Designer, Google Research

Machine learning systems are increasingly influencing many aspects of everyday life, and are used by both the hardware and software products that serve people globally. As such, researchers and designers seeking to create products that are useful and accessible for everyone often face the challenge of finding data sets that reflect the variety and backgrounds of users around the world. In order to train these machine learning systems, open, global — and growing — datasets are needed.

Over the last six months, we’ve seen such a dataset emerge from users of Quick, Draw!, Google’s latest approach to helping wide, international audiences understand how neural networks work. A group of Googlers designed Quick, Draw! as a way for anyone to interact with a machine learning system in a fun way, drawing everyday objects like trees and mugs. The system will try to guess what their drawing depicts, within 20 seconds. While the goal of Quick, Draw! was simply to create a fun game that runs on machine learning, it has resulted in 800 million drawings from twenty million people in 100 nations, from Brazil to Japan to the U.S. to South Africa.

And now we are releasing an open dataset based on these drawings so that people around the world can contribute to, analyze, and inform product design with this data. The dataset currently includes 50 million drawings Quick Draw! players have generated (we will continue to release more of the 800 million drawings over time).

It’s a considerable amount of data; and it’s also a fascinating lens into how to engage a wide variety of people to participate in (1) training machine learning systems, no matter what their technical background; and (2) the creation of open data sets that reflect a wide spectrum of cultures and points of view.

Seeing national — and global — patterns in one glance
To understand visual patterns within the dataset quickly and efficiently, we worked with artist Kyle McDonald to overlay thousands of drawings from around the world. This helped us create composite images and identify trends in each nation, as well across all nations. We made animations of 1000 layered international drawings of cats and chairs, below, to share how we searched for visual trends with this data:

Cats, made from 1000 drawings from around the world:

Chairs, made from 1,000 drawings around the world:

Doodles of naturally recurring objects, like cats (or trees, rainbows, or skulls) often look alike across cultures:

However, for objects that might be familiar to some cultures, but not others, we saw notable differences. Sandwiches took defined forms or were a jumbled set of lines; mug handles pointed in opposite directions; and chairs were drawn facing forward or sideways, depending on the nation or region of the world:

One size doesn’t fit all
These composite drawings, we realized, could reveal how perspectives and preferences differ between audiences from different regions, from the type of bread used in sandwiches to the shape of a coffee cup, to the aesthetic of how to depict objects so they are visually appealing. For example, a more straightforward, head-on view was more consistent in some nations; side angles in others.

Overlaying the images also revealed how to improve how we train neural networks when we lack a variety of data — even within a large, open, and international data set. For example, when we analyzed 115,000+ drawings of shoes in the Quick, Draw! dataset, we discovered that a single style of shoe, which resembles a sneaker, was overwhelmingly represented. Because it was so frequently drawn, the neural network learned to recognize only this style as a “shoe.”

But just as in the physical world, in the realm of training data, one size does not fit all. We asked, how can we consistently and efficiently analyze datasets for clues that could point toward latent bias? And what would happen if a team built a classifier based on a non-varied set of data?

Diagnosing data for inclusion
With the open source tool Facets, released last month as part of Google’s PAIR initiative, one can see patterns across a large dataset quickly. The goal is to efficiently, and visually, diagnose how representative large datasets, like the Quick, Draw! Dataset, may be.

Here’s a screenshot from the Quick,Draw! dataset within the Facets tool. The tool helped us position thousands of drawings by "faceting" them in multiple dimensions by their feature values, such as country, up to 100 countries. You, too, can filter for for features such as “random faces” in a 10-country view, which can then be expanded to 100 countries. At a glance, you can see proportions of country representations. You can also zoom in and see details of each individual drawing, allowing you to dive deeper into single data points. This is especially helpful when working with a large visual data set like Quick, Draw!, allowing researchers to explore for subtle differences or anomalies, or to begin flagging small-scale visual trends that might emerge later as patterns within the larger data set.

Here’s the same Quick, Draw! data for “random faces,” faceted for 94 countries and seen from another view. It’s clear in the few seconds that Facets loads the drawings in this new visualization that the data is overwhelmingly representative of the United States and European countries. This is logical given that the Quick, Draw! game is currently only available in English. We plan to add more languages over time. However, the visualization shows us that Brazil and Thailand seem to be non-English-speaking nations that are relatively well-represented within the data. This suggested to us that designers could potentially research what elements of the interface design may have worked well in these countries. Then, we could use that information to improve Quick,Draw! in its next iteration for other global, non-English-speaking audiences. We’re also using the faceted data to help us figure out how prioritize local languages for future translations.

Another outcome of using Facets to diagnose the Quick, Draw! data for inclusion was to identify concrete ways that anyone can improve the variety of data, as well as check for potential biases. Improvements could include:

Changing protocols for human rating of data or content generation, so that the data is more accurately representative of local or global populations
Analyzing subgroups of data and identify the database equivalent of "intersectionality" surfaced within visual patterns
Augmenting and reweighting data so that it is more inclusive

By releasing this dataset, and tools like Facets, we hope to facilitate the exploration of more inclusive approaches to machine learning, and to turn those observations into opportunities for innovation. We’re just beginning to draw insights from both Quick, Draw! and Facets. And we invite you to draw more with us, too.

Acknowledgements
Jonas Jongejan, Henry Rowley, Takashi Kawashima, Jongmin Kim, Nick Fox-Gieg, built Quick, Draw! in collaboration with Google Creative Lab and Google’s Data Arts Team. The video about fairness in machine learning was created by Teo Soares, Alexander Chen, Bridget Prophet, Lisa Steinman, and JR Schmidt from Google Creative Lab. James Wexler, Jimbo Wilson, and Mahima Pushkarna, of PAIR, designed Facets, a project led by Martin Wattenberg and Fernanda Viégas, Senior Staff Research Scientists on the Google Brain team, and UX Researcher Jess Holbrook. Ian Johnson from the Google Cloud team contributed to the visualizations of overlaid drawings.

Source: Google Research Blog

Neural Network-Generated Illustrations in Allo

Posted by Jennifer Daniel, Expressions Creative Director, Allo

Taking, sharing, and viewing selfies has become a daily habit for many — the car selfie, the cute-outfit selfie, the travel selfie, the I-woke-up-like-this selfie. Apart from a social capacity, self-portraiture has long served as a means for self and identity exploration. For some, it’s about figuring out who they are. For others it’s about projecting how they want to be perceived. Sometimes it’s both.

Photography in the form of a selfie is a very direct form of expression. It comes with a set of rules bounded by reality. Illustration, on the other hand, empowers people to define themselves - it’s warmer and less fraught than reality.

Today, Google is introducing a feature in Allo that uses a combination of neural networks and the work of artists to turn your selfie into a personalized sticker pack. Simply snap a selfie, and it’ll return an automatically generated illustrated version of you, on the fly, with customization options to help you personalize the stickers even further.

What makes you, you?
The traditional computer vision approach to mapping selfies to art would be to analyze the pixels of an image and algorithmically determine attribute values by looking at pixel values to measure color, shape, or texture. However, people today take selfies in all types of lighting conditions and poses. And while people can easily pick out and recognize qualitative features, like eye color, regardless of the lighting condition, this is a very complex task for computers. When people look at eye color, they don’t just interpret the pixel values of blue or green, but take into account the surrounding visual context.

In order to account for this, we explored how we could enable an algorithm to pick out qualitative features in a manner similar to the way people do, rather than the traditional approach of hand coding how to interpret every permutation of lighting condition, eye color, etc. While we could have trained a large convolutional neural network from scratch to attempt to accomplish this, we wondered if there was a more efficient way to get results, since we expected that learning to interpret a face into an illustration would be a very iterative process.

That led us to run some experiments, similar to DeepDream, on some of Google's existing more general-purpose computer vision neural networks. We discovered that a few neurons among the millions in these networks were good at focusing on things they weren’t explicitly trained to look at that seemed useful for creating personalized stickers. Additionally, by virtue of being large general-purpose neural networks they had already figured out how to abstract away things they didn’t need. All that was left to do was to provide a much smaller number of human labeled examples to teach the classifiers to isolate out the qualities that the neural network already knew about the image.

To create an illustration of you that captures the qualities that would make it recognizable to your friends, we worked alongside an artistic team to create illustrations that represented a wide variety of features. Artists initially designed a set of hairstyles, for example, that they thought would be representative, and with the help of human raters we used these hairstyles to train the network to match the right illustration to the right selfie. We then asked human raters to judge the sticker output against the input image to see how well it did. In some instances, they determined that some styles were not well represented, so the artists created more that the neural network could learn to identify as well.

Raters were asked to classify hairstyles that the icon on the left resembled closest. Then, once consensus was reached, resident artist Lamar Abrams drew a representation of what they had in common.

Avoiding the uncanny valley
In the study of aesthetics, a well-known problem is the uncanny valley - the hypothesis that human replicas which appear almost, but not exactly, like real human beings can feel repulsive. In machine learning, this could be compounded if were confronted by a computer’s perception of you, versus how you may think of yourself, which can be at odds.

Rather than aim to replicate a person’s appearance exactly, pursuing a lower resolution model, like emojis and stickers, allows the team to explore expressive representation by returning an image that is less about reproducing reality and more about breaking the rules of representation.

The team worked with artist Lamar Abrams to design the features that make up more than 563 quadrillion combinations.

Translating pixels to artistic illustrations
Reconciling how the computer perceives you with how you perceive yourself and what you want to project is truly an artistic exercise. This makes a customization feature that includes different hairstyles, skin tones, and nose shapes, essential. After all, illustration by its very nature can be subjective. Aesthetics are defined by race, culture, and class which can lead to creating zones of exclusion without consciously trying. As such, we strove to create a space for a range of race, age, masculinity, femininity, and/or androgyny. Our teams continue to evaluate the research results to help prevent against incorporating biases while training the system.

Creating a broad palette for identity and sentiment
There is no such thing as a ‘universal aesthetic’ or ‘a singular you’. The way people talk to their parents is different than how they talk to their friends which is different than how they talk to their colleagues. It’s not enough to make an avatar that is a literal representation of yourself when there are many versions of you. To address that, the Allo team is working with a range of artistic voices to help others extend their own voice. This first style that launched today speaks to your sarcastic side but the next pack might be more cute for those sincere moments. Then after that, maybe they’ll turn you into a dog. If emojis broadened the world of communication it’s not hard to imagine how this technology and language evolves. What will be most exciting is listening to what people say with it.

This feature is starting to roll out in Allo today for Android, and will come soon to Allo on iOS.

Acknowledgements
This work was made possible through a collaboration of the Allo Team and Machine Perception researchers at Google. We additionally thank Lamar Abrams, Koji Ashida, Forrester Cole, Jennifer Daniel, Shiraz Fuman, Dilip Krishnan, Inbar Mosseri, Aaron Sarna, and Bhavik Singh.

Introducing Android support in Compiler Explorer

The Android build toolchain (in brief)

Compiler optimizations demystified

Kotlin and Java, side by side

What happens when you minify your app?

Baseline profiles and you

Conclusion

Source: Android Developers Blog

Optimizing compiler 101

Code size improvements

Putting it all together

Further reading

Source: Android Developers Blog

Modularizing the OS

Testing the ART APEX module

Benefits of ART Google Play system updates

What’s next

Source: Android Developers Blog

What is Art Filter?

The Challenges

Creating 3D objects that can be viewed from all sides, using 2D references.

Layering elements of the effect along with the image of the user.

Using shaders for different materials such as the metal helmet.

Conclusion

Source: Google Developers Blog

Source: Google AI Blog

Source: Google AI Blog

Source: Official Google India Blog

Source: Official Google Australia Blog

Source: Google Research Blog

Source: Google Research Blog