The cpu_features library

Originally posted by Guillaume Chatelet from the Google Compiler Research Team on the Google Open Source Blog

"Write Once, Run Anywhere." That was the promise of Java back in the 1990s. You could write your Java code on one platform, and it would run on any CPU implementing a Java Virtual Machine.

Photo by Andrew Dunn, licensed CC-BY-SA-2.0.

But for developers who need to squeeze every bit of performance out of their applications, that's not enough. Since the dawn of computing, performance-minded programmers have used insights about hardware to fine-tune their code.

Let's say you're working on code for which speed is paramount, perhaps a new video codec or a library to process tensors. Individual instructions such as fused multiply-add, and entire instruction set extensions such as SSE2 and AVX, can give the critical portions of your code a dramatic speed boost.

Here's the problem: you can't know ahead of time which instructions the CPU running your code will support. Identifying the CPU manufacturer isn't sufficient. For instance, Intel's Haswell architecture supports the AVX2 instruction set, while Intel's earlier Sandy Bridge doesn't. Some developers resort to desperate measures like reading /proc/cpuinfo to identify the CPU and then consulting hardcoded mappings of CPU IDs to supported instructions.
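To make the /proc/cpuinfo workaround concrete, here is a minimal sketch of scanning the file for a feature name. It assumes Linux on x86, where the features appear on a line beginning with "flags" (other architectures use different keys, such as "Features" on ARM). The function names list_has_flag and cpuinfo_has_flag are hypothetical, chosen for illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Returns true if `flag` appears as a whole whitespace-separated token in
   `list`. Whole-token matching matters: "avx" must not match inside "avx2". */
bool list_has_flag(const char *list, const char *flag) {
  size_t n = strlen(flag);
  if (n == 0) return false;
  const char *p = list;
  while ((p = strstr(p, flag)) != NULL) {
    bool starts = (p == list) || p[-1] == ' ' || p[-1] == '\t';
    char end = p[n];
    bool ends = (end == '\0' || end == ' ' || end == '\t' || end == '\n');
    if (starts && ends) return true;
    p += 1; /* keep searching after a partial match */
  }
  return false;
}

/* Scans a cpuinfo-style file for the first line starting with "flags" and
   checks the value list after the colon. Assumption: the line fits in the
   buffer; a longer line would be truncated and a flag could be missed. */
bool cpuinfo_has_flag(const char *path, const char *flag) {
  FILE *f = fopen(path, "r");
  if (f == NULL) return false; /* not Linux, or /proc is unavailable */
  char line[8192];
  bool found = false;
  while (fgets(line, sizeof line, f) != NULL) {
    if (strncmp(line, "flags", 5) == 0) {
      const char *colon = strchr(line, ':');
      if (colon != NULL) found = list_has_flag(colon + 1, flag);
      break;
    }
  }
  fclose(f);
  return found;
}
```

A call such as cpuinfo_has_flag("/proc/cpuinfo", "avx2") silently returns false wherever the file is missing, unreadable (for example in a sandbox), or laid out differently, which is part of why this approach is fragile.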

Enter cpu_features, a small, fast, and simple open source library to report CPU features at runtime. Written in C99 for maximum portability, it allocates no memory and is suitable for implementing fundamental functions and running in sandboxed environments.
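Runtime feature reporting is typically used for dispatch: query the CPU once, then pick the fastest implementation it can run. The sketch below illustrates that pattern only. The CpuFlags struct, select_dot, and both kernels are hypothetical stand-ins and are not the cpu_features API; in real code the flags would be filled in by the library (see the project's README for its actual headers and calls).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the flags a feature-detection library reports. */
typedef struct {
  bool fma;
  bool avx2;
} CpuFlags;

typedef float (*DotFn)(const float *a, const float *b, size_t n);

/* Portable baseline that runs on any CPU. */
float dot_generic(const float *a, const float *b, size_t n) {
  float sum = 0.0f;
  for (size_t i = 0; i < n; ++i) sum += a[i] * b[i];
  return sum;
}

/* Placeholder for a tuned kernel. A real one would use FMA/AVX2 intrinsics
   and be compiled with matching target flags; this version is plain C,
   unrolled by four, so the sketch compiles and runs anywhere. */
float dot_tuned(const float *a, const float *b, size_t n) {
  float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
  size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    s0 += a[i] * b[i];
    s1 += a[i + 1] * b[i + 1];
    s2 += a[i + 2] * b[i + 2];
    s3 += a[i + 3] * b[i + 3];
  }
  float sum = (s0 + s1) + (s2 + s3);
  for (; i < n; ++i) sum += a[i] * b[i]; /* leftover elements */
  return sum;
}

/* Chooses an implementation once, based on what the CPU reports. */
DotFn select_dot(CpuFlags flags) {
  return (flags.fma && flags.avx2) ? dot_tuned : dot_generic;
}
```

Selecting the function pointer once at startup keeps the feature check out of the hot loop, and the generic path guarantees the program still runs on CPUs that lack the newer instructions.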

The library currently supports x86, ARM/AArch64, and MIPS processors, and we'll be adding to it as the need arises. We also welcome contributions from others interested in making programs "write once, run fast everywhere."

By Guillaume Chatelet, Google Compiler Research Team