Tag Archives: rust

Rewarding Rust contributors with Google Open Source Peer Bonuses

logo showing a trophy cup in the Google colors, with the text “Google Open Source Peer Bonus” below it
We are very excited to reward 25 open source contributors, specifically, for their work on Rust projects! 

At Google, Open Source lies at the core of not only our processes but also many of our products. While we try to directly contribute upstream as much as possible, the Google Open Source Peer Bonus program is designed to reward external open source contributors for their outstanding contributions to FOSS, whether the contribution benefits Google in some way or not.

The Rust programming language is an open source systems programming language with a strong focus on memory safety. The language has a caring community and a robust package ecosystem, which have heavily contributed to its growing popularity. The Rust community shows dedication to maintaining quality packages and tooling, which we at Google are quite thankful for.

Among many other things, Google uses Rust in some open source projects, including Android, Fuchsia, and ICU4X; and has been participating in the efforts to evaluate Rust in the Linux Kernel. Google is also a founding member of the Rust Foundation.

Below is the list of winners who gave us permission to thank them publicly:

Winner

Project

antoyo

For work on rustc_codegen_gcc

Asherah Connor

For maintaining comrak

David Hewitt

For maintaining PyO3

Dirkjan Ochtman

For maintaining rustls and quinn

Frank Denis

For maintaining rust-ed25519-compact

Gary Guo

For maintaining Rust for Linux

Jack Grigg

For integrating RustCrypto into Fuchsia

Jack Huey

For highly involved rust compiler work fixing a large number of crashes around higher-kinded types.

Joe Birr-Pixton

For building rustls

Joshua Nelson

For improving the developer workflow for contributing to Rust itself

Lokathor

For creating tinyvec and bytemuck

Mara Bos

For work on the Rust Libraries Team and the 2021 Rust Edition

Nikita Popov

For maintaining the Rust compiler’s LLVM backend

Pietro Albini

For maintaining crucial Rust infrastructure and working on the Rust core team

Ricky Hosfelt

For maintaining cargo-outdated

Sébastien Crozet

For creating dimforge

Simonas Kazlauskas

For maintaining the Rust compiler’s LLVM backend


Thank you for your contributions to the Rust projects and ecosystem! Congratulations!

By Maria Tabak, Google Open Source and Manish Goregaokar, Rustacean Googler

Rust/C++ interop in the Android Platform

One of the main challenges of evaluating Rust for use within the Android platform was ensuring we could provide sufficient interoperability with our existing codebase. If Rust is to meet its goals of improving security, stability, and quality Android-wide, we need to be able to use Rust anywhere in the codebase that native code is required. To accomplish this, we need to provide the majority of functionality platform developers use. As we discussed previously, we have too much C++ to consider ignoring it, rewriting all of it is infeasible, and rewriting older code would likely be counterproductive as the bugs in that code have largely been fixed. This means interoperability is the most practical way forward.

Before introducing Rust into the Android Open Source Project (AOSP), we needed to demonstrate that Rust interoperability with C and C++ is sufficient for practical, convenient, and safe use within Android. Adding a new language has costs; we needed to demonstrate that Rust would be able to scale across the codebase and meet its potential in order to justify those costs. This post will cover the analysis we did more than a year ago while we evaluated Rust for use in Android. We also present a follow-up analysis with some insights into how the original analysis has held up as Android projects have adopted Rust.

Language interoperability in Android

Existing language interoperability in Android focuses on well defined foreign-function interface (FFI) boundaries, which is where code written in one programming language calls into code written in a different language. Rust support will likewise focus on the FFI boundary as this is consistent with how AOSP projects are developed, how code is shared, and how dependencies are managed. For Rust interoperability with C, the C application binary interface (ABI) is already sufficient.

Interoperability with C++ is more challenging and is the focus of this post. While both Rust and C++ support using the C ABI, it is not sufficient for idiomatic usage of either language. Simply enumerating the features of each language results in an unsurprising conclusion: many concepts are not easily translatable, nor do we necessarily want them to be. After all, we’re introducing Rust because many features and characteristics of C++ make it difficult to write safe and correct code. Therefore, our goal is not to consider all language features, but rather to analyze how Android uses C++ and ensure that interop is convenient for the vast majority of our use cases.

We analyzed code and interfaces in the Android platform specifically, not codebases in general. While this means our specific conclusions may not be accurate for other codebases, we hope the methodology can help others to make a more informed decision about introducing Rust into their large codebase. Our colleagues on the Chrome browser team have done a similar analysis, which you can find here.

This analysis was not originally intended to be published outside of Google: our goal was to make a data-driven decision on whether or not Rust was a good choice for systems development in Android. While the analysis is intended to be accurate and actionable, it was never intended to be comprehensive, and we’ve pointed out a couple of areas where it could be more complete. However, we also note that initial investigations into these areas showed that they would not significantly impact the results, which is why we decided to not invest the additional effort.

Methodology

Exported functions from Rust and C++ libraries are where we consider interop to be essential. Our goals are simple:

  • Rust must be able to call functions from C++ libraries and vice versa.
  • FFI should require a minimum of boilerplate.
  • FFI should not require deep expertise.

While making Rust functions callable from C++ is a goal, this analysis focuses on making C++ functions available to Rust so that new Rust code can be added while taking advantage of existing implementations in C++. To that end, we look at exported C++ functions and consider existing and planned compatibility with Rust via the C ABI and compatibility libraries. Types are extracted by running objdump on shared libraries to find external C++ functions they use1 and running c++filt to parse the C++ types. This gives functions and their arguments. It does not consider return values, but a preliminary analysis2 of those revealed that they would not significantly affect the results.

We then classify each of these types into one of the following buckets:

Supported by bindgen

These are generally simple types involving primitives (including pointers and references to them). For these types, Rust’s existing FFI will handle them correctly, and Android’s build system will auto-generate the bindings.

Supported by cxx compat crate

These are handled by the cxx crate. This currently includes std::string, std::vector, and C++ methods (including pointers/references to these types). Users simply have to define the types and functions they want to share across languages and cxx will generate the code to do that safely.

Native support

These types are not directly supported, but the interfaces that use them have been manually reworked to add Rust support. Specifically, this includes types used by AIDL and protobufs.

We have also implemented a native interface for StatsD as the existing C++ interface relies on method overloading, which is not well supported by bindgen and cxx3. Usage of this system does not show up in the analysis because the C++ API does not use any unique types.

Potential addition to cxx

This is currently common data structures such as std::optional and std::chrono::duration and custom string and vector implementations.

These can either be supported natively by a future contribution to cxx, or by using its ExternType facilities. We have only included types in this category that we believe are relatively straightforward to implement and have a reasonable chance of being accepted into the cxx project.

We don't need/intend to support

Some types are exposed in today’s C++ APIs that are either an implicit part of the API, not an API we expect to want to use from Rust, or are language specific. Examples of types we do not intend to support include:

  • Mutexes - we expect that locking will take place in one language or the other, rather than needing to pass mutexes between languages, as per our coarse-grained philosophy.
  • native_handle - this is a JNI interface type, so it is inappropriate for use in Rust/C++ communication.
  • std::locale& - Android uses a separate locale system from C++ locales. This type primarily appears in output due to e.g., cout usage, which would be inappropriate to use in Rust.

Overall, this category represents types that we do not believe a Rust developer should be using.

HIDL

Android is in the process of deprecating HIDL and migrating to AIDL for HALs for new services.We’re also migrating some existing implementations to stable AIDL. Our current plan is to not support HIDL, preferring to migrate to stable AIDL instead. These types thus currently fall into the “We don't need/intend to support'' bucket above, but we break them out to be more specific. If there is sufficient demand for HIDL support, we may revisit this decision later.

Other

This contains all types that do not fit into any of the above buckets. It is currently mostly std::string being passed by value, which is not supported by cxx.

Top C++ libraries

One of the primary reasons for supporting interop is to allow reuse of existing code. With this in mind, we determined the most commonly used C++ libraries in Android: liblog, libbase, libutils, libcutils, libhidlbase, libbinder, libhardware, libz, libcrypto, and libui. We then analyzed all of the external C++ functions used by these libraries and their arguments to determine how well they would interoperate with Rust.

Overall, 81% of types are in the first three categories (which we currently fully support) and 87% are in the first four categories (which includes those we believe we can easily support). Almost all of the remaining types are those we believe we do not need to support.

Mainline modules

In addition to analyzing popular C++ libraries, we also examined Mainline modules. Supporting this context is critical as Android is migrating some of its core functionality to Mainline, including much of the native code we hope to augment with Rust. Additionally, their modularity presents an opportunity for interop support.

We analyzed 64 binaries and libraries in 21 modules. For each analyzed library we examined their used C++ functions and analyzed the types of their arguments to determine how well they would interoperate with Rust in the same way we did above for the top 10 libraries.

Here 88% of types are in the first three categories and 90% in the first four, with almost all of the remaining being types we do not need to handle.

Analysis of Rust/C++ Interop in AOSP

With almost a year of Rust development in AOSP behind us, and more than a hundred thousand lines of code written in Rust, we can now examine how our original analysis has held up based on how C/C++ code is currently called from Rust in AOSP.4

The results largely match what we expected from our analysis with bindgen handling the majority of interop needs. Extensive use of AIDL by the new Keystore2 service results in the primary difference between our original analysis and actual Rust usage in the “Native Support” category.

A few current examples of interop are:

  • Cxx in Bluetooth - While Rust is intended to be the primary language for Bluetooth, migrating from the existing C/C++ implementation will happen in stages. Using cxx allows the Bluetooth team to more easily serve legacy protocols like HIDL until they are phased out by using the existing C++ support to incrementally migrate their service.
  • AIDL in keystore - Keystore implements AIDL services and interacts with apps and other services over AIDL. Providing this functionality would be difficult to support with tools like cxx or bindgen, but the native AIDL support is simple and ergonomic to use.
  • Manually-written wrappers in profcollectd - While our goal is to provide seamless interop for most use cases, we also want to demonstrate that, even when auto-generated interop solutions are not an option, manually creating them can be simple and straightforward. Profcollectd is a small daemon that only exists on non-production engineering builds. Instead of using cxx it uses some small manually-written C wrappers around C++ libraries that it then passes to bindgen.

Conclusion

Bindgen and cxx provide the vast majority of Rust/C++ interoperability needed by Android. For some of the exceptions, such as AIDL, the native version provides convenient interop between Rust and other languages. Manually written wrappers can be used to handle the few remaining types and functions not supported by other options as well as to create ergonomic Rust APIs. Overall, we believe interoperability between Rust and C++ is already largely sufficient for convenient use of Rust within Android.

If you are considering how Rust could integrate into your C++ project, we recommend doing a similar analysis of your codebase. When addressing interop gaps, we recommend that you consider upstreaming support to existing compat libraries like cxx.

Acknowledgements

Our first attempt at quantifying Rust/C++ interop involved analyzing the potential mismatches between the languages. This led to a lot of interesting information, but was difficult to draw actionable conclusions from. Rather than enumerating all the potential places where interop could occur, Stephen Hines suggested that we instead consider how code is currently shared between C/C++ projects as a reasonable proxy for where we’ll also likely want interop for Rust. This provided us with actionable information that was straightforward to prioritize and implement. Looking back, the data from our real-world Rust usage has reinforced that the initial methodology was sound. Thanks Stephen!

Also, thanks to:

  • Andrei Homescu and Stephen Crane for contributing AIDL support to AOSP.
  • Ivan Lozano for contributing protobuf support to AOSP.
  • David Tolnay for publishing cxx and accepting our contributions.
  • The many authors and contributors to bindgen.
  • Jeff Vander Stoep and Adrian Taylor for contributions to this post.


  1. We used undefined symbols of function type as reported by objdump to perform this analysis. This means that any header-only functions will be absent from our analysis, and internal (non-API) functions which are called by header-only functions may appear in it. 

  2. We extracted return values by parsing DWARF symbols, which give the return types of functions. 

  3. Even without automated binding generation, manually implementing the bindings is straightforward. 

  4. In the case of handwritten C/C++ wrappers, we analyzed the functions they call, not the wrappers themselves. For all uses of our native AIDL library, we analyzed the types used in the C++ version of the library. 

Rust in the Linux kernel

In our previous post, we announced that Android now supports the Rust programming language for developing the OS itself. Related to this, we are also participating in the effort to evaluate the use of Rust as a supported language for developing the Linux kernel. In this post, we discuss some technical aspects of this work using a few simple examples.

C has been the language of choice for writing kernels for almost half a century because it offers the level of control and predictable performance required by such a critical component. Density of memory safety bugs in the Linux kernel is generally quite low due to high code quality, high standards of code review, and carefully implemented safeguards. However, memory safety bugs do still regularly occur. On Android, vulnerabilities in the kernel are generally considered high-severity because they can result in a security model bypass due to the privileged mode that the kernel runs in.

We feel that Rust is now ready to join C as a practical language for implementing the kernel. It can help us reduce the number of potential bugs and security vulnerabilities in privileged code while playing nicely with the core kernel and preserving its performance characteristics.

Supporting Rust

We developed an initial prototype of the Binder driver to allow us to make meaningful comparisons between the safety and performance characteristics of the existing C version and its Rust counterpart. The Linux kernel has over 30 million lines of code, so naturally our goal is not to convert it all to Rust but rather to allow new code to be written in Rust. We believe this incremental approach allows us to benefit from the kernel’s existing high-performance implementation while providing kernel developers with new tools to improve memory safety and maintain performance going forward.

We joined the Rust for Linux organization, where the community had already done and continues to do great work toward adding Rust support to the Linux kernel build system. We also need designs that allow code in the two languages to interact with each other: we're particularly interested in safe, zero-cost abstractions that allow Rust code to use kernel functionality written in C, and how to implement functionality in idiomatic Rust that can be called seamlessly from the C portions of the kernel.

Since Rust is a new language for the kernel, we also have the opportunity to enforce best practices in terms of documentation and uniformity. For example, we have specific machine-checked requirements around the usage of unsafe code: for every unsafe function, the developer must document the requirements that need to be satisfied by callers to ensure that its usage is safe; additionally, for every call to unsafe functions (or usage of unsafe constructs like dereferencing a raw pointer), the developer must document the justification for why it is safe to do so.

Just as important as safety, Rust support needs to be convenient and helpful for developers to use. Let’s get into a few examples of how Rust can assist kernel developers in writing drivers that are safe and correct.

Example driver

We'll use an implementation of a semaphore character device. Each device has a current value; writes of n bytes result in the device value being incremented by n; reads decrement the value by 1 unless the value is 0, in which case they will block until they can decrement the count without going below 0.

Suppose semaphore is a file representing our device. We can interact with it from the shell as follows:

> cat semaphore

When semaphore is a newly initialized device, the command above will block because the device's current value is 0. It will be unblocked if we run the following command from another shell because it increments the value by 1, which allows the original read to complete:

> echo -n a > semaphore

We could also increment the count by more than 1 if we write more data, for example:

> echo -n abc > semaphore

increments the count by 3, so the next 3 reads won't block.

To allow us to show a few more aspects of Rust, we'll add the following features to our driver: remember what the maximum value was throughout the lifetime of a device, and remember how many reads each file issued on the device.

We'll now show how such a driver would be implemented in Rust, contrasting it with a C implementation. We note, however, we are still early on so this is all subject to change in the future. How Rust can assist the developer is the aspect that we'd like to emphasize. For example, at compile time it allows us to eliminate or greatly reduce the chances of introducing classes of bugs, while at the same time remaining flexible and having minimal overhead.

Character devices

A developer needs to do the following to implement a driver for a new character device in Rust:

  1. Implement the FileOperations trait: all associated functions are optional, so the developer only needs to implement the relevant ones for their scenario. They relate to the fields in C's struct file_operations.
  2. Implement the FileOpener trait: it is a type-safe equivalent to C's open field of struct file_operations.
  3. Register the new device type with the kernel: this lets the kernel know what functions need to be called in response to files of this new type being operated on.

The following outlines how the first two steps of our example compare in Rust and C:

impl FileOpener<Arc<Semaphore>> for FileState {
fn open(
shared: &Arc<Semaphore>
) -> KernelResult<Box<Self>> {
[...]
}
}

impl FileOperations for FileState {
type Wrapper = Box<Self>;

fn read(
&self,
_: &File,
data: &mut UserSlicePtrWriter,
offset: u64
) -> KernelResult<usize> {
[...]
}

fn write(
&self,
data: &mut UserSlicePtrReader,
_offset: u64
) -> KernelResult<usize> {
[...]
}

fn ioctl(
&self,
file: &File,
cmd: &mut IoctlCommand
) -> KernelResult<i32> {
[...]
}

fn release(_obj: Box<Self>, _file: &File) {
[...]
}

declare_file_operations!(read, write, ioctl);
}
static 
int semaphore_open(struct inode *nodp,
struct file *filp)

{
struct semaphore_state *shared =
container_of(filp->private_data,
struct semaphore_state,
miscdev);
[...]
}

static
ssize_t semaphore_write(struct file *filp,
const char __user *buffer,
size_t count, loff_t *ppos)

{
struct file_state *state = filp->private_data;
[...]
}

static
ssize_t semaphore_read(struct file *filp,
char __user *buffer,
size_t count, loff_t *ppos)

{
struct file_state *state = filp->private_data;
[...]
}

static
long semaphore_ioctl(struct file *filp,
unsigned int cmd,
unsigned long arg)

{
struct file_state *state = filp->private_data;
[...]
}

static
int semaphore_release(struct inode *nodp,
struct file *filp)

{
struct file_state *state = filp->private_data;
[...]
}

static const struct file_operations semaphore_fops = {
.owner = THIS_MODULE,
.open = semaphore_open,
.read = semaphore_read,
.write = semaphore_write,
.compat_ioctl = semaphore_ioctl,
.release = semaphore_release,
};

Character devices in Rust benefit from a number of safety features:

  • Per-file state lifetime management: FileOpener::open returns an object whose lifetime is owned by the caller from then on. Any object that implements the PointerWrapper trait can be returned, and we provide implementations for Box<T> and Arc<T>, so developers that use Rust's idiomatic heap-allocated or reference-counted pointers have no additional requirements.

    All associated functions in FileOperations receive non-mutable references to self (more about this below), except the release function, which is the last function to be called and receives the plain object back (and its ownership with it). The release implementation can then defer the object destruction by transferring its ownership elsewhere, or destroy it then; in the case of a reference-counted object, 'destruction' means decrementing the reference count (and actual object destruction if the count goes to zero).

    That is, we use Rust's ownership discipline when interacting with C code by handing the C portion ownership of a Rust object, allowing it to call functions implemented in Rust, then eventually giving ownership back. So as long as the C code is correct, the lifetime of Rust file objects work seamlessly as well, with the compiler enforcing correct lifetime management on the Rust side, for example: open cannot return stack-allocated pointers or heap-allocated objects containing pointers to the stack, ioctl/read/write cannot free (or modify without synchronization) the contents of the object stored in filp->private_data, etc.

  • Non-mutable references: the associated functions called between open and release all receive non-mutable references to self because they can be called concurrently by multiple threads and Rust aliasing rules prohibit more than one mutable reference to an object at any given time.

    If a developer needs to modify some state (and they generally do), they can do so via interior mutability: mutable state can be wrapped in a Mutex<T> or SpinLock<T> (or atomics) and safely modified through them.

    This prevents, at compile-time, bugs where a developer fails to acquire the appropriate lock when accessing a field (the field is inaccessible), or when a developer fails to wrap a field with a lock (the field is read-only).

  • Per-device state: when file instances need to share per-device state, which is a very common occurrence in drivers, they can do so safely in Rust. When a device is registered, a typed object can be provided and a non-mutable reference to it is provided when FileOperation::open is called. In our example, the shared object is wrapped in Arc<T>, so files can safely clone and hold on to a reference to them.

    The reason FileOperation is its own trait (as opposed to, for example, open being part of the FileOperations trait) is to allow a single file implementation to be registered in different ways.

    This eliminates opportunities for developers to get the wrong data when trying to retrieve shared state. For example, in C when a miscdevice is registered, a pointer to it is available in filp->private_data; when a cdev is registered, a pointer to it is available in inode->i_cdev. These structs are usually embedded in an outer struct that contains the shared state, so developers usually use the container_of macro to recover the shared state. Rust encapsulates all of this and the potentially troublesome pointer casts in a safe abstraction.

  • Static typing: we take advantage of Rust's support for generics to implement all of the above functions and types with static types. So there are no opportunities for a developer to convert an untyped variable or field to the wrong type. The C code in the table above has casts from an essentially untyped (void *) pointer to the desired type at the start of each function: this is likely to work fine when first written, but may lead to bugs as the code evolves and assumptions change. Rust would catch any such mistakes at compile time.

  • File operations: as we mentioned before, a developer needs to implement the FileOperations trait to customize the behavior of their device. They do this with a block starting with impl FileOperations for Device, where Device is the type implementing the file behavior (FileState in our example). Once inside this block, tools know that only a limited number of functions can be defined, so they can automatically insert the prototypes. (Personally, I use neovim and the rust-analyzer LSP server.)

    While we use this trait in Rust, the C portion of the kernel still requires an instance of struct file_operations. The kernel crate automatically generates one from the trait implementation (and optionally the declare_file_operations macro): although it has code to generate the correct struct, it is all const, so evaluated at compile-time with zero runtime cost.

Ioctl handling

For a driver to provide a custom ioctl handler, it needs to implement the ioctl function that is part of the FileOperations trait, as exemplified in the table below.

fn ioctl(
&self,
file: &File,
cmd: &mut IoctlCommand
) -> KernelResult<i32> {
cmd.dispatch(self, file)
}

impl IoctlHandler for FileState {
fn read(
&self,
_file: &File,
cmd: u32,
writer: &mut UserSlicePtrWriter
) -> KernelResult<i32> {
match cmd {
IOCTL_GET_READ_COUNT => {
writer.write(
&self
.read_count
.load(Ordering::Relaxed))?;
Ok(0)
}
_ => Err(Error::EINVAL),
}
}

fn write(
&self,
_file: &File,
cmd: u32,
reader: &mut UserSlicePtrReader
) -> KernelResult<i32> {
match cmd {
IOCTL_SET_READ_COUNT => {
self
.read_count
.store(reader.read()?,
Ordering::Relaxed);
Ok(0)
}
_ => Err(Error::EINVAL),
}
}
}
#define IOCTL_GET_READ_COUNT _IOR('c', 1, u64)
#define IOCTL_SET_READ_COUNT _IOW('c', 1, u64)

static
long semaphore_ioctl(struct file *filp,
unsigned int cmd,
unsigned long arg)

{
struct file_state *state = filp->private_data;
void __user *buffer = (void __user *)arg;
u64 value;

switch (cmd) {
case IOCTL_GET_READ_COUNT:
value = atomic64_read(&state->read_count);
if (copy_to_user(buffer, &value, sizeof(value)))
return -EFAULT;
return 0;
case IOCTL_SET_READ_COUNT:
if (copy_from_user(&value, buffer, sizeof(value)))
return -EFAULT;
atomic64_set(&state->read_count, value);
return 0;
default:
return -EINVAL;
}
}

Ioctl commands are standardized such that, given a command, we know whether a user buffer is provided, its intended use (read, write, both, none), and its size. In Rust, we provide a dispatcher (accessible by calling cmd.dispatch) that uses this information to automatically create user memory access helpers and pass them to the caller.

A driver is not required to use this though. If, for example, it doesn't use the standard ioctl encoding, Rust offers the flexibility of simply calling cmd.raw to extract the raw arguments and using them to handle the ioctl (potentially with unsafe code, which will need to be justified).

However, if a driver implementation does use the standard dispatcher, it will benefit from not having to implement any unsafe code, and:

  • The pointer to user memory is never a native pointer, so the developer cannot accidentally dereference it.
  • The types that allow the driver to read from user space only allow data to be read once, so we eliminate the risk of time-of-check to time-of-use (TOCTOU) bugs because when a driver needs to access data twice, it needs to copy it to kernel memory, where an attacker is not allowed to modify it. Excluding unsafe blocks, there is no way to introduce this class of bugs in Rust.
  • No accidental overflow of the user buffer: we'll never read or write past the end of the user buffer because this is enforced automatically based on the size encoded in the ioctl command. In our example above, the implementation of IOCTL_GET_READ_COUNT only has access to an instance of UserSlicePtrWriter, which limits the number of writable bytes to sizeof(u64) as encoded in the ioctl command.
  • No mixing of reads and writes: we'll never write buffers for ioctls that are only meant to read and never read buffers for ioctls that are only meant to write. This is enforced by read and write handlers only getting instances of UserSlicePtrWriter and UserSlicePtrReader respectively.

All of the above could potentially also be done in C, but it's very easy for developers to (likely unintentionally) break contracts that lead to unsafety; Rust requires unsafe blocks for this, which should only be used in rare cases and brings additional scrutiny. Additionally, Rust offers the following:

  • The types used to read and write user memory do not implement the Send and Sync traits, which means that they (and pointers to them) are not safe to be used in another thread context. In Rust, if a driver developer attempted to write code that passed one of these objects to another thread (where it wouldn't be safe to use them because it isn't necessarily in the right memory manager context), they would get a compilation error.
  • When calling IoctlCommand::dispatch, one might understandably think that we need dynamic dispatching to reach the actual handler implementation (which would incur additional cost in comparison to C), but we don't. Our usage of generics will lead the compiler to monomorphize the function, which will result in static function calls that can even be inlined if the optimizer so chooses.

Locking and condition variables

We allow developers to use mutexes and spinlocks to provide interior mutability. In our example, we use a mutex to protect mutable data; in the tables below we show the data structures we use in C and Rust, and how we implement a wait until the count is nonzero so that we can satisfy a read:

struct SemaphoreInner {
count: usize,
max_seen: usize,
}

struct Semaphore {
changed: CondVar,
inner: Mutex<SemaphoreInner>,
}

struct FileState {
read_count: AtomicU64,
shared: Arc<Semaphore>,
}
struct semaphore_state {
struct kref ref;
struct miscdevice miscdev;
wait_queue_head_t changed;
struct mutex mutex;
size_t count;
size_t max_seen;
};

struct file_state {
atomic64_t read_count;
struct semaphore_state *shared;
};

fn consume(&self) -> KernelResult {
let mut inner = self.shared.inner.lock();
while inner.count == 0 {
if self.shared.changed.wait(&mut inner) {
return Err(Error::EINTR);
}
}
inner.count -= 1;
Ok(())
}
static int semaphore_consume(
struct semaphore_state *state)

{
DEFINE_WAIT(wait);

mutex_lock(&state->mutex);
while (state->count == 0) {
prepare_to_wait(&state->changed, &wait,
TASK_INTERRUPTIBLE);
mutex_unlock(&state->mutex);
schedule();
finish_wait(&state->changed, &wait);
if (signal_pending(current))
return -EINTR;
mutex_lock(&state->mutex);
}

state->count--;
mutex_unlock(&state->mutex);

return 0;
}

We note that such waits are not uncommon in the existing C code, for example, a pipe waiting for a "partner" to write, a unix-domain socket waiting for data, an inode search waiting for completion of a delete, or a user-mode helper waiting for state change.

The following are benefits from the Rust implementation:

  • The Semaphore::inner field is only accessible when the lock is held, through the guard returned by the lock function. So developers cannot accidentally read or write protected data without locking it first. In the C example above, count and max_seen in semaphore_state are protected by mutex, but there is no enforcement that the lock is held while they're accessed.
  • Resource Acquisition Is Initialization (RAII): the lock is unlocked automatically when the guard (inner in this case) goes out of scope. This ensures that locks are always unlocked: if the developer needs to keep a lock locked, they can keep the guard alive, for example, by returning the guard itself; conversely, if they need to unlock before the end of the scope, they can explicitly do it by calling the drop function.
  • Developers can use any lock that implements the Lock trait, which includes Mutex and SpinLock, at no additional runtime cost when compared to a C implementation. Other synchronization constructs, including condition variables, also work transparently and with zero additional run-time cost.
  • Rust implements condition variables using kernel wait queues. This allows developers to benefit from atomic release of the lock and putting the thread to sleep without having to reason about low-level kernel scheduler functions. In the C example above, semaphore_consume is a mix of semaphore logic and subtle Linux scheduling: for example, the code is incorrect if mutex_unlock is called before prepare_to_wait because it may result in a wake up being missed.
  • No unsynchronized access: as we mentioned before, variables shared by multiple threads/CPUs must be read-only, with interior mutability being the solution for cases when mutability is needed. In addition to the example with locks above, the ioctl example in the previous section also has an example of using an atomic variable; Rust also requires developers to specify how memory is to be synchronized by atomic accesses. In the C part of the example, we happen to use atomic64_t, but the compiler won't alert a developer to this need.

Error handling and control flow

In the tables below, we show how open, read, and write are implemented in our example driver:

fn read(
&self,
_: &File,
data: &mut UserSlicePtrWriter,
offset: u64
) -> KernelResult<usize> {
if data.is_empty() || offset > 0 {
return Ok(0);
}

self.consume()?;
data.write_slice(&[0u8; 1])?;
self.read_count.fetch_add(1, Ordering::Relaxed);
Ok(1)
}

static
ssize_t semaphore_read(struct file *filp,
char __user *buffer,
size_t count, loff_t *ppos)

{
struct file_state *state = filp->private_data;
char c = 0;
int ret;

if (count == 0 || *ppos > 0)
return 0;

ret = semaphore_consume(state->shared);
if (ret)
return ret;

if (copy_to_user(buffer, &c, sizeof(c)))
return -EFAULT;

atomic64_add(1, &state->read_count);
*ppos += 1;
return 1;
}

fn write(
&self,
data: &mut UserSlicePtrReader,
_offset: u64
) -> KernelResult<usize> {
{
let mut inner = self.shared.inner.lock();
inner.count = inner.count.saturating_add(data.len());
if inner.count > inner.max_seen {
inner.max_seen = inner.count;
}
}

self.shared.changed.notify_all();
Ok(data.len())
}
static
ssize_t semaphore_write(struct file *filp,
const char __user *buffer,
size_t count, loff_t *ppos)

{
struct file_state *state = filp->private_data;
struct semaphore_state *shared = state->shared;

mutex_lock(&shared->mutex);
shared->count += count;
if (shared->count < count)
shared->count = SIZE_MAX;

if (shared->count > shared->max_seen)
shared->max_seen = shared->count;

mutex_unlock(&shared->mutex);

wake_up_all(&shared->changed);
return count;
}

fn open(
shared: &Arc<Semaphore>
) -> KernelResult<Box<Self>> {
Ok(Box::try_new(Self {
read_count: AtomicU64::new(0),
shared: shared.clone(),
})?)
}
static 
int semaphore_open(struct inode *nodp,
struct file *filp)

{
struct semaphore_state *shared =
container_of(filp->private_data,
struct semaphore_state,
miscdev);
struct file_state *state;

state = kzalloc(sizeof(*state), GFP_KERNEL);
if (!state)
return -ENOMEM;

kref_get(&shared->ref);
state->shared = shared;
atomic64_set(&state->read_count, 0);

filp->private_data = state;

return 0;
}

They illustrate other benefits brought by Rust:

  • The ? operator: it is used by the Rust open and read implementations to do error handling implicitly; the developer can focus on the semaphore logic, the resulting code being quite small and readable. The C versions have error-handling noise that can make them less readable.
  • Required initialization: Rust requires all fields of a struct to be initialized on construction, so the developer can never accidentally fail to initialize a field; C offers no such facility. In our open example above, the developer of the C version could easily fail to call kref_get (even though all fields would have been initialized); in Rust, the user is required to call clone (which increments the ref count), otherwise they get a compilation error.
  • RAII scoping: the Rust write implementation uses a statement block to control when inner goes out of scope and therefore the lock is released.
  • Integer overflow behavior: Rust encourages developers to always consider how overflows should be handled. In our write example, we want a saturating one so that we don't end up with a zero value when adding to our semaphore. In C, we need to manually check for overflows, there is no additional support from the compiler.

What's next

The examples above are only a small part of the whole project. We hope it gives readers a glimpse of the kinds of benefits that Rust brings. At the moment we have nearly all generic kernel functionality needed by Binder neatly wrapped in safe Rust abstractions, so we are in the process of gathering feedback from the broader Linux kernel community with the intent of upstreaming the existing Rust support.

We also continue to make progress on our Binder prototype, implement additional abstractions, and smooth out some rough edges. This is an exciting time and a rare opportunity to potentially influence how the Linux kernel is developed, as well as inform the evolution of the Rust language. We invite those interested to join us in Rust for Linux and attend our planned talk at Linux Plumbers Conference 2021!


Thanks Nick Desaulniers, Kees Cook, and Adrian Taylor for contributions to this post. Special thanks to Jeff Vander Stoep for contributions and editing, and to Greg Kroah-Hartman for reviewing and contributing to the code examples.