Tag Archives: vulkan

Android GPU Compute Going Forward

Posted by Dan Galpin, Developer Advocate

Illustration of image rendering with character

We introduced RenderScript in Android 3.0 as a way for applications to run computationally-intensive code on the CPU or GPU without having to make use of the NDK or GPU-specific APIs. As Android has evolved, the NDK tooling and APIs for GPU compute using OpenGL have dramatically improved. In Android 7.0 (API level 24), we added the Vulkan API, which allows for low-level access to GPU hardware features. In Android 10.0 (API level 29) we added the capability to easily share Bitmap hardware buffers between Android SDK and NDK code to speed image processing.

We no longer recommend RenderScript as the optimal way to accomplish these performance-critical tasks, and will be deprecating the APIs in Android 12. We want you to have confidence that your high-performance workloads will run on GPU hardware, and many devices are already shipping with only CPU support for RenderScript. The APIs will continue to function, but compiling RenderScript code when targeting Android 12 will give a warning.

How we’re addressing RenderScript intrinsics

The RenderScript subsystem was also used to implement a number of useful image-processing intrinsics. We're providing an open-source library that contains the highly-tuned CPU implementations for all but the Basic Linear Algebra Subprograms (BLAS) intrinsics. For the cases that we have measured, intrinsics execute faster (often substantially so) on the CPU using our library than running within RenderScript, even on devices where GPU support was possible.

For BLAS, we suggest using one of the many libraries that already provide this functionality, such as netlib’s libblas.

Please file issues with the library here.

If you use RenderScript scripts:

To take full advantage of GPU acceleration, we recommend migrating RenderScript scripts to the cross-platform Vulkan API. To help get you started with this transition, we have provided a sample app that demonstrates two RenderScript scripts with Vulkan equivalents.

  • The first script is the “hello world” of RenderScript; it rotates the hue of the image.
  • The second script implements blur, using multiple cells of the input allocation to figure out one cell of the output allocation. (Note that the highly-optimized blur in our static library has broader device compatibility and faster performance than this sample script in many cases.)

Because Vulkan is not available on older devices, you may need to manage two code paths: RenderScript on older devices and Vulkan on newer ones. Our documentation has more details on how to migrate.

You can file issues with the sample here.

Thank you

We on the RenderScript team thank you for your support over the years. We understand that transitions are never easy; our focus on cross-platform APIs such as Vulkan will mean even better tools and support for your GPU-accelerated applications.

Handling Device Orientation Efficiently in Vulkan With Pre-Rotation

Illustration of prerotation

By Omar El Sheikh, Android Engineer

Francesco Carucci, Developer Advocate

Vulkan provides developers with the power to specify much more information to devices about rendering state compared to OpenGL. With that power though comes some new responsibilities; developers are expected to explicitly implement things that in OpenGL were handled by the driver. One of these things is device orientation and its relationship to render surface orientation. Currently, there are 3 ways that Android can handle reconciling the render surface of the device with the orientation of the device :

  1. The device has a Display Processing Unit (DPU) that can efficiently handle surface rotation in hardware to match the device orientation (requires a device that supports this)
  2. The Android OS can handle surface rotation by adding a compositor pass that will have a performance cost depending on how the compositor has to deal with rotating the output image
  3. The application itself can handle the surface rotation by rendering a rotated image onto a render surface that matches the current orientation of the display

What Does This Mean For Your Apps?

There currently isn't a way for an application to know whether surface rotation handled outside of the application will be free. Even if there is a DPU to take care of this for us, there will still likely be a measurable performance penalty to pay. If your application is CPU bound, this becomes a huge power issue due to the increased GPU usage by the Android Compositor which usually is running at a boosted frequency as well; and if your application is GPU bound, then it becomes a potentially large performance issue on top of that as the Android Compositor will preempt your application's GPU work causing your application to drop its frame rate.

On Pixel 4XL, we have seen on shipping titles that SurfaceFlinger (the higher-priority task that drives the Android Compositor) both regularly preempts the application’s work causing 1-3ms hits to frametimes, as well as puts increased pressure on the GPU’s vertex/texture memory as the Compositor has to read the frame to do its composition work.

Handling orientation properly stops GPU preemption by SurfaceFlinger almost entirely, as well as sees the GPU frequency drop 40% as the boosted frequency used by the Android Compositor is no longer needed.

To ensure surface rotations are handled properly with as little overhead as possible (as seen in the case above) we recommend implementing method 3 - this is known as pre-rotation. The primary mechanism with which this works is by telling the Android OS that we are handling the surface rotation by specifying in the orientation during swapchain creation through the passed in surface transform flags, which stops the Android Compositor from doing the rotation itself.

Knowing how to set the surface transform flag is important for every Vulkan application, since applications tend to either support multiple orientations, or support a single orientation where its render surface is in a different orientation to what the device considers its identity orientation; For example a landscape-only application on a portrait-identity phone, or a portrait-only application on a landscape-identity tablet.

In this post we will describe in detail how to implement pre-rotation and handle device rotation in your Vulkan application.

Modify AndroidManifest.xml

To handle device rotation in your app, change the application’s AndroidManifest.xml to tell Android that your app will handle orientation and screen size changes. This prevents Android from destroying and recreating the Android activity and calling the onDestroy() function on the existing window surface when an orientation change occurs. This is done by adding the orientation (to support API level <13) and screenSize attributes to the activity’s configChanges section:

<activity android:name="android.app.NativeActivity"
          android:configChanges="orientation|screenSize">

If your application fixes its screen orientation using the screenOrientation attribute you do not need to do this. Also if your application uses a fixed orientation then it will only need to setup the swapchain once on application startup/resume.

Get the Identity Screen Resolution and Camera Parameters

The next thing that needs to be done is to detect the device’s screen resolution that is associated with the VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR as this resolution is the one that the Swapchain will always need to be set to. The most reliable way to get this is to make a call to vkGetPhysicalDeviceSurfaceCapabilitiesKHR() at application startup and storing the returned extent - swapping the width and height based on the currentTransform that is also returned in order to ensure we are storing the identity screen resolution:

    VkSurfaceCapabilitiesKHR capabilities;
    vkGetPhysicalDeviceSurfaceCapabilitiesKHR(physDevice, surface, &capabilities);

    uint32_t width = capabilities.currentExtent.width;
    uint32_t height = capabilities.currentExtent.height;
    if (capabilities.currentTransform & VK_SURFACE_TRANSFORM_ROTATE_90_BIT_KHR ||
        capabilities.currentTransform & VK_SURFACE_TRANSFORM_ROTATE_270_BIT_KHR) {
      // Swap to get identity width and height
      capabilities.currentExtent.height = width;
      capabilities.currentExtent.width = height;
    }

    displaySizeIdentity = capabilities.currentExtent;

displaySizeIdentity is a VkExtent2D that we use to store said identity resolution of the app's window surface in the display’s natural orientation.

Detect Device Orientation Changes (Android 10+)

To know when the application has encountered an orientation change, the most reliable means of tracking this change involves checking the return value of vkQueuePresentKHR() and seeing if it returned VK_SUBOPTIMAL_KHR

  auto res = vkQueuePresentKHR(queue_, &present_info);
  if (res == VK_SUBOPTIMAL_KHR){
    orientationChanged = true;
  }

One thing to note about this solution is that it only works on devices running Android Q and above as that is when Android started to return VK_SUBOPTIMAL_KHR from vkQueuePresentKHR()

orientationChanged is a boolean stored somewhere accessible from the applications main rendering loop

Detecting Device Orientation Changes (Pre-Android 10)

For devices running older versions of Android below 10, a different implementation is needed since we do not have access to VK_SUBOPTIMAL_KHR.

Using Polling

On pre-10 devices we can poll the current device transform every pollingInterval frames, where pollingInterval is a granularity decided on by the programmer. The way we do this is by calling vkGetPhysicalDeviceSurfaceCapabilitiesKHR() and then comparing the returned currentTransform field with that of the currently stored surface transformation (in this code example stored in pretransformFlag)

  currFrameCount++;
  if (currFrameCount >= pollInterval){
    VkSurfaceCapabilitiesKHR capabilities;
    vkGetPhysicalDeviceSurfaceCapabilitiesKHR(physDevice, surface, &capabilities);

    if (pretransformFlag != capabilities.currentTransform) {
      window_resized = true;
    }
    currFrameCount = 0;
  }

On a Pixel 4 running Android Q, polling vkGetPhysicalDeviceSurfaceCapabilitiesKHR() took between .120-.250ms and on a Pixel 1XL running Android O polling took .110-.350ms

Using Callbacks

A second option for devices running below Android 10 is to register an onNativeWindowResized() callback to call a function that sets the orientationChanged flag to signal to the application an orientation change has occurred:

void android_main(struct android_app *app) {
  ...
  app->activity->callbacks->onNativeWindowResized = ResizeCallback;
}

Where ResizeCallback is defined as:

void ResizeCallback(ANativeActivity *activity, ANativeWindow *window){
  orientationChanged = true;
}

The drawback to this solution is that onNativeWindowResized() only ever gets called on 90 degree orientation changes (going from landscape to portrait or vice versa), so for example an orientation change from landscape to reverse landscape will not trigger the swapchain recreation, requiring the Android compositor to do the flip for your application.

Handling the Orientation Change

To actually handle the orientation change, we first have a check at the top of the main rendering loop for whether the orientationChanged variable has been set to true, and if so we'll go into the orientation change routine:

bool VulkanDrawFrame() {
 if (orientationChanged) {
   OnOrientationChange();
 }

And within the OnOrientationChange() function we will do all the work necessary to recreate the swapchain. This involves destroying any existing Framebuffers and ImageViews; recreating the swapchain while destroying the old swapchain (which will be discussed next); and then recreating the Framebuffers with the new swapchain’s DisplayImages. Note that attachment images (depth/stencil images for example) usually do not need to be recreated as they are based on the identity resolution of the pre-rotated swapchain images.

void OnOrientationChange() {
 vkDeviceWaitIdle(getDevice());

 for (int i = 0; i < getSwapchainLength(); ++i) {
   vkDestroyImageView(getDevice(), displayViews_[i], nullptr);
   vkDestroyFramebuffer(getDevice(), framebuffers_[i], nullptr);
 }

 createSwapChain(getSwapchain());
 createFrameBuffers(render_pass, depthBuffer.image_view);
 orientationChanged = false;
}

And at the end of the function we reset the orientationChanged flag to false to show that we have handled the orientation change.

Swapchain Recreation

In the previous section we mention having to recreate the swapchain. The first steps to doing so involves getting the new characteristics of the rendering surface:

void createSwapChain(VkSwapchainKHR oldSwapchain) {
   VkSurfaceCapabilitiesKHR capabilities;
   vkGetPhysicalDeviceSurfaceCapabilitiesKHR(physDevice, surface, &capabilities);
   pretransformFlag = capabilities.currentTransform;

With the VkSurfaceCapabilities struct populated with the new information, we can now check to see whether an orientation change has occurred by checking the currentTransform field and store it for later in the pretransformFlag field as we will be needing it for later when we make adjustments to the MVP matrix.

In order to do so we must make sure that we properly specify some attributes within the VkSwapchainCreateInfo struct:

VkSwapchainCreateInfoKHR swapchainCreateInfo{
  ...                        
  .sType = VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR,
  .imageExtent = displaySizeIdentity,
  .preTransform = pretransformFlag,
  .oldSwapchain = oldSwapchain,
};

vkCreateSwapchainKHR(device_, &swapchainCreateInfo, nullptr, &swapchain_));

if (oldSwapchain != VK_NULL_HANDLE) {
  vkDestroySwapchainKHR(device_, oldSwapchain, nullptr);
}

The imageExtent field will be populated with the displaySizeIdentity extent that we stored at application startup. The preTransform field will be populated with our pretransformFlag variable (which is set to the currentTransform field of the surfaceCapabilities). We also set the oldSwapchain field to the swapchain that we are about to destroy.

It is important that the surfaceCapabilities.currentTransform field and the swapchainCreateInfo.preTransform field match because this lets the Android OS know that we are handling the orientation change ourselves, thus avoiding the Android Compositor.

MVP Matrix Adjustment

The last thing that needs to be done is actually apply the pre-transformation. This is done by applying a rotation matrix to your MVP matrix. What this essentially does is apply the rotation in clip space so that the resulting image is rotated to the device current orientation. You can then simply pass this updated MVP matrix into your vertex shader and use it as normal without the need to modify your shaders.

glm::mat4 pre_rotate_mat = glm::mat4(1.0f);
glm::vec3 rotation_axis = glm::vec3(0.0f, 0.0f, 1.0f);

if (pretransformFlag & VK_SURFACE_TRANSFORM_ROTATE_90_BIT_KHR) {
 pre_rotate_mat = glm::rotate(pre_rotate_mat, glm::radians(90.0f), rotation_axis);
}

else if (pretransformFlag & VK_SURFACE_TRANSFORM_ROTATE_270_BIT_KHR) {
 pre_rotate_mat = glm::rotate(pre_rotate_mat, glm::radians(270.0f), rotation_axis);
}

else if (pretransformFlag & VK_SURFACE_TRANSFORM_ROTATE_180_BIT_KHR) {
 pre_rotate_mat = glm::rotate(pre_rotate_mat, glm::radians(180.0f), rotation_axis);
}

MVP = pre_rotate_mat * MVP;

Consideration - Non-Full Screen Viewport and Scissor

If your application is using a non-full screen viewport/scissor region, they will need to be updated according to the orientation of the device. This requires that we enable the dynamic Viewport and Scissor options during Vulkan’s pipeline creation:

VkDynamicState dynamicStates[2] = {
  VK_DYNAMIC_STATE_VIEWPORT,
  VK_DYNAMIC_STATE_SCISSOR,
};

VkPipelineDynamicStateCreateInfo dynamicInfo = {
  .sType = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO,
  .pNext = nullptr,
  .flags = 0,
  .dynamicStateCount = 2,
  .pDynamicStates = dynamicStates,
};

VkGraphicsPipelineCreateInfo pipelineCreateInfo = {
  .sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
  ...
  .pDynamicState = &dynamicInfo,
  ...
};

VkCreateGraphicsPipelines(device, VK_NULL_HANDLE, 1, &pipelineCreateInfo, nullptr, &mPipeline);

The actual computation of the viewport extent during command buffer recording looks like this:

int x = 0, y = 0, w = 500, h = 400;
glm::vec4 viewportData;

switch (device->GetPretransformFlag()) {
  case VK_SURFACE_TRANSFORM_ROTATE_90_BIT_KHR:
    viewportData = {bufferWidth - h - y, x, h, w};
    break;
  case VK_SURFACE_TRANSFORM_ROTATE_180_BIT_KHR:
    viewportData = {bufferWidth - w - x, bufferHeight - h - y, w, h};
    break;
  case VK_SURFACE_TRANSFORM_ROTATE_270_BIT_KHR:
    viewportData = {y, bufferHeight - w - x, h, w};
    break;
  default:
    viewportData = {x, y, w, h};
    break;
}

const VkViewport viewport = {
    .x = viewportData.x,
    .y = viewportData.y,
    .width = viewportData.z,
    .height = viewportData.w,
    .minDepth = 0.0F,
    .maxDepth = 1.0F,
};

vkCmdSetViewport(renderer->GetCurrentCommandBuffer(), 0, 1, &viewport);

Where x and y define the coordinates of the top left corner of the viewport, and w and h define the width and height of the viewport respectively.

The same computation can be used to also set the scissor test, and is included below for completeness:

int x = 0, y = 0, w = 500, h = 400;
glm::vec4 scissorData;

switch (device->GetPretransformFlag()) {
  case VK_SURFACE_TRANSFORM_ROTATE_90_BIT_KHR:
    scissorData = {bufferWidth - h - y, x, h, w};
    break;
  case VK_SURFACE_TRANSFORM_ROTATE_180_BIT_KHR:
    scissorData = {bufferWidth - w - x, bufferHeight - h - y, w, h};
    break;
  case VK_SURFACE_TRANSFORM_ROTATE_270_BIT_KHR:
    scissorData = {y, bufferHeight - w - x, h, w};
    break;
  default:
    scissorData = {x, y, w, h};
    break;
}

const VkRect2D scissor = {
    .offset =
        {
            .x = (int32_t)viewportData.x,
            .y = (int32_t)viewportData.y,
        },
    .extent =
        {
            .width = (uint32_t)viewportData.z,
            .height = (uint32_t)viewportData.w,
        },
};

vkCmdSetScissor(renderer->GetCurrentCommandBuffer(), 0, 1, &scissor);

Consideration - Fragment Shader Derivatives

If your application is using derivative computations such as dFdx and dFdy, additional transformations may be needed to account for the rotated coordinate system as these computations are executed in pixel space. This requires the app to pass some indication of the preTransform into the fragment shader (such as an integer representing the current device orientation) and use that to map the derivative computations properly:

  • For a 90 degree pre-rotated frame
    • dFdx needs to be mapped to dFdy
    • dFdy needs to be mapped to dFdx
  • For a 270 degree pre-rotated frame
    • dFdx needs to be mapped to -dFdy
    • dFdy needs to be mapped to -dFdx
  • For a 180 degree pre-rotated frame,
    • dFdx needs to be mapped to -dFdx
    • dFdy needs to be mapped to -dFdy

Conclusion

In order for your application to get the most out of Vulkan on Android, implementing pre-rotation is a must. The most important takeaways from this blogpost are:

  1. Ensure that during swapchain creation or recreation, the pretransform flag is set so that it matches the flag returned by the Android operating system. This will avoid the compositor overhead
  2. Keep the swapchain size fixed to the identity resolution of the app's window surface in the display’s natural orientation
  3. Rotate the MVP matrix in clip space to account for the devices orientation since the swapchain resolution/extent no longer update with the orientation of the display
  4. Update viewport and scissor rectangles as needed by your application

Here is a link to a minimal example of pre-rotation being implemented for an android application:

https://github.com/google/vulkan-pre-rotation-demo

Optimize, Develop, and Debug with Vulkan Developer Tools

Posted by Shannon Woods, Technical Program Manager

Today we're pleased to bring you a preview of Android development tools for Vulkan™. Vulkan is a new 3D rendering API which we’ve helped to develop as a member of Khronos, geared at providing explicit, low-overhead GPU (Graphics Processor Unit) control to developers. Vulkan’s reduction of CPU overhead allows some synthetic benchmarks to see as much as 10 times the draw call throughput on a single core as compared to OpenGL ES. Combined with a threading-friendly API design which allows multiple cores to be used in parallel with high efficiency, this offers a significant boost in performance for draw-call heavy applications.

Vulkan support is available now via the Android N Preview on devices which support it, including Nexus 5X and Nexus 6P. (Of course, you will still be able to use OpenGL ES as well!)

To help developers start coding quickly, we’ve put together a set of samples and guides that illustrate how to use Vulkan effectively.

You can see Vulkan in action running on an Android device with Robert Hodgin’s Fish Tornado demo, ported by Google’s Art, Copy, and Code team:


Optimization: The Vulkan API

There are many similarities between OpenGL ES and Vulkan, but Vulkan offers new features for developers who need to make every millisecond count.


  • Application control of memory allocation. Vulkan provides mechanisms for fine-grained control of how and when memory is allocated on the GPU. This allows developers to use their own allocation and recycling policies to fit their application, ultimately reducing execution and memory overhead and allowing applications to control when expensive allocations occur.
  • Asynchronous command generation. In OpenGL ES, draw calls are issued to the GPU as soon as the application calls them. In Vulkan, the application instead submits draw calls to command buffers, which allows the work of forming and recording the draw call to be separated from the act of issuing it to the GPU. By spreading command generation across several threads, applications can more effectively make use of multiple CPU cores. These command buffers can also be reused, reducing the overhead involved in command creation and issuance.
  • No hidden work. One OpenGL ES pitfall is that some commands may trigger work at points which are not explicitly spelled out in the API specification or made obvious to the developer. Vulkan makes performance more predictable and consistent by specifying which commands will explicitly trigger work and which will not.
  • Multithreaded design, from the ground up. All OpenGL ES applications must issue commands for a context only from a single thread in order to render predictably and correctly. By contrast, Vulkan doesn’t have this requirement, allowing applications to do work like command buffer generation in parallel— but at the same time, it doesn’t make implicit guarantees about the safety of modifying and reading data from multiple threads at the same time. The power and responsibility of managing thread synchronization is in the hands of the application.
  • Mobile-friendly features. Vulkan includes features particularly helpful for achieving high performance on tiling GPUs, used by many mobile devices. Applications can provide information about the interaction between separate rendering passes, allowing tiling GPUs to make effective use of limited memory bandwidth, and avoid performing off-chip reads.
  • Offline shader compilation. Vulkan mandates support for SPIR-V, an intermediate language for shaders. This allows developers to compile shaders ahead of time, and ship SPIR-V binaries with their applications. These binaries are simpler to parse than high-level languages like GLSL, which means less variance in how drivers perform this parsing. SPIR-V also opens the door for third parties to provide compilers for specialized or cross-platform shading languages.
  • Optional validation. OpenGL ES validates every command you call, checking that arguments are within expected ranges, and objects are in the correct state to be operated upon. Vulkan doesn’t perform any of this validation itself. Instead, developers can use optional debug tools to ensure their calls are correct, incurring no run-time overhead in the final product.

Debugging: Validation Layers


As noted above, Vulkan’s lack of implicit validation requires developers to make use of tools outside the API in order to validate their code. Vulkan’s layer mechanism allows validation code and other developer tools to inspect every API call during development, without incurring any overhead in the shipping version. Our guides show you how to build the validation layers for use with the Android NDK, giving you the tools necessary to build bug-free Vulkan code from start to finish.

Develop: Shader toolchain


The Shaderc collection of tools provides developers with build-time and run-time tools for compiling GLSL into SPIR-V. Shaders can be compiled at build time using glslc, a command-line compiler, for easy integration into existing build systems. Or, for shaders which are generated or edited during execution, developers can use the Shaderc library to compile GLSL shaders to SPIR-V via a C interface. Both tools are built on top of Khronos’s reference compiler.

Additional Resources


The Vulkan ecosystem is a broad one, and the resources to get you started don’t end here. There is a wealth of material to explore, including:

Low-overhead rendering with Vulkan

Posted by Shannon Woods, Technical Program Manager

Developers of games and 3D graphics applications have one key challenge to meet: How complex a scene can they draw in a small fraction of a second? Much of the work in graphics development goes into organizing data so it can be efficiently consumed by the GPU for rendering. But even the most careful developers can hit unforeseen bottlenecks, in part because the drivers for some graphics processors may reorganize all of that data before it can actually be processed. The APIs used to control these drivers are also not designed for multi-threaded use, requiring synchronization with locks around calls that could be more efficiently done in parallel. All of this results in CPU overhead, which consumes time and power that you’d probably prefer to spend drawing your scene.

Lowering overhead and handing control to developers

In order to address some of the sources of CPU overhead and provide developers with more explicit control over rendering, we’ve been working to bring a new 3D rendering API, Vulkan™, to Android. Like OpenGL™ ES, Vulkan is an open standard for 3D graphics and rendering maintained by Khronos. Vulkan is being designed from the ground up to minimize CPU overhead in the driver, and allow your application to control GPU operation more directly. Vulkan also enables better parallelization by allowing multiple threads to perform work such as command buffer construction at once.

An API is only useful if it does what you expect

To make it easier to write an application once that works across a variety of devices, Android 5.0 Lollipop significantly expanded the Android Compatibility Test Suite (CTS) with over fifty thousand new tests for OpenGL ES, and many more have been added since. This provides an extensive open source test suite for identifying problems in drivers so that they can be fixed, creating a more robust and reliable experience for both developers and end users. For Vulkan, we’ll not only develop similar tests for use in the Android CTS, but we’ll also contribute them to Khronos for use in Vulkan’s own open source Conformance Test Suite. This will enable Khronos to test Vulkan drivers across platforms and hardware, and improve the 3D graphics ecosystem as a whole.

It’s all about developer choice

We’ll be working hard to help create, test, and ship Vulkan, but at the same time, we’re also going to contribute to and support OpenGL ES. As a developer, you’ll be able to choose which API is right for you: the simplicity of OpenGL ES, or the explicit control of Vulkan. We’re committed to providing an excellent developer experience, no matter which API you choose.

Vulkan is still under development, but you’ll be able to find specifications, tests, and tools once they are released at http://www.khronos.org/vulkan.