Insights from Writing an Embedded IoT App (Almost) Entirely in Rust

Today we have a guest post from Louis Thiery. Louis has spent many years building hardware and software for IoT devices ranging from agriculture to consumer electronics. He now develops IoT infrastructure at Helium. Helium is the largest public LPWAN in the US and is shipping now (Sept 2020) in Europe. You can contact him by email.


Embedded development practices are in many ways a time-capsule for general software development practices. Automated testing is rarely part of projects; automated on-target testing even less so. Continuous integration and deployment are nearly a fantasy. C still reigns supreme despite being nearly 50 years old.

In my opinion, this is not because embedded developers are inherently conservative and have no interest in evolving practices. It is rather because deviating from C and/or implementing these testing practices is harder in embedded development. Build systems are often bespoke Makefile architectures. A lack of standard RTOS interfaces makes unit testing project specific, or at least, RTOS specific. In the end, adopting modern software best practices requires additional tooling and project-specific overhead.

Enter Rust, a “systems programming language” developed starting in 2010 at Mozilla Research. Nearly 10 years later, the language is consistently rated “most loved” on Stack Overflow’s annual developer survey. While originally developed for Mozilla’s browser engine, the language is gaining immense traction in applications ranging from back-end services to blockchain development.

The key features of Rust that make it so appreciated and rapidly growing in these domains make it equally relevant in embedded development. In my experience, Rust as a language provides structure and discipline that cannot be matched in C without add-on tooling and developer discipline.

So everybody should do everything in Rust! Well, not quite (although the “rewrite it in Rust” movement is well on its way). Instead, I would encourage select usage of it in new projects that don’t deal with a lot of legacy architecture or frameworks and to get comfortable early mixing C/C++ and Rust in the same codebase. Interoperability of C/C++ and Rust is fairly well done, but you will always find greater success in minimizing the seams between the two languages. For example, perhaps that C crypto library does not exist in Rust, but other than that, you could do the project entirely in Rust; creating a binding to that one library won’t be too much work. Or perhaps that one module in C is buggy and difficult, so maybe try a Rust rewrite within that larger C project.

Rather than go in-depth on all of Rust’s distinguishing features in an abstract view (many articles will cover this better than I would), I’ll provide a practical view with embedded development in particular. The examples will build in three stages:

  1. Chip-level Rust best practices, embodied by embedded-hal
  2. A real-time event framework, featuring Real-Time Interrupt-driven Concurrency (RTIC)
  3. A device-side LoRaWAN implementation, featuring a Rust-to-C binding to vendor drivers

These examples culminate in an IoT application written almost entirely in Rust: it connects via LoRaWAN to the Helium Network, the biggest public LPWAN in North America, and sends a count of button presses.

Embedded HAL Best Practices

Embedded HAL is a Rust community effort to create a standard API for different MCUs. Such abstraction layers have sometimes been done in different C-based RTOS projects, but this effort distinguishes itself in that it is an abstraction effort for its own sake and is not bound to an RTOS or runtime.

As an embedded developer learning Rust, I was impressed by the way the community adapted Rust’s features to the embedded domain so well. I’d like to show a few examples.

Static Strong Types

Have you ever tried to read a pin but it was accidentally configured as an output? You keep toggling a button, but the value won’t change! Only after 30 minutes of debugging do you realize it was never configured as an input… woopsy.

In embedded Rust, it is best practice to transform the type of the pin when it goes through these state changes. At compile-time, the misuse of the pin will be identified and flagged by the compiler; no debugging is necessary as the type system protects you with static analysis.

For example, the following code snippet would work fine:

// Configure PA0 as input.
let mut pin = gpioa.pa0.into_pull_up_input();
if pin.is_high() {
    //do something
}

First, note the type-inference: the compiler knows that into_pull_up_input() returns a type gpio::Input so explicit declaration is unnecessary; let is sufficient for declaring the variable. Type-inference is nice, but the real value of Rust shines when you try to do something bad:

pin.set_high()

If you do that, the compiler would complain:

error[E0599]: no method named `set_high` found for struct `stm32l0xx_hal::gpio::gpioa::PA0<stm32l0xx_hal::gpio::Input<stm32l0xx_hal::gpio::PullUp>>` in the current scope

On the other hand, if you did this:

let mut output_pin = pin.into_push_pull_output();
output_pin.set_high();

Everything is okay and the code compiles! This might seem silly here, but it ends up being a solid foundation as complexity builds. For example, let’s say I’m making a library for an RGB LED Driver. Is it the responsibility of the user to initialize the pins or is this something that my library graciously does for the user? Often, one might initialize the pin before passing it to the library, but then inside the library initialize it again, out of paranoia, just in case somebody changes the pin configuration outside the library. With the type system, the signature of my constructor clarifies this and we can all relax a little.

Memory Safety

Memory safety in Rust results in fantastic constructs such as the core Option type. It means the return type of a function might be Option<Response>, which tells you maybe there’s a Some(Response) or maybe there’s None. Combined with the match syntax, quality of life is through the roof:

match mylibrary.handle(request) {
    Some(response) => {
        // Handle the response
    }
    None => {
        // Do something else? Up to you
    }
}

The same thing in C just doesn’t roll off the tongue as easily:

response_t *response = library(&lib_handle, request);
if (response == NULL) {
    // Do something else? Up to you
} else {
    // Handle the response
}

Also, did I mention array indices are automatically checked? This does have a run-time cost, but if I had a dollar for every time I corrupted my stack by writing too far in an array in C…

Ownership

To carry on with the pin examples, suppose you are using an RGB LED library. Your code may look something like this in Rust:

let dp = pac::Peripherals::take().unwrap();
let gpioa = dp.GPIOA.split(&mut rcc);
let rgb = Rgb::new(gpioa.pa0, gpioa.pa1, gpioa.pa2);

First we declared a local variable to hold the “Peripheral Access Crate” of the MCU. It’s worth noting the ‘unwrap’ here, which returns the result of the ‘take’ operation if successful but panics if there was an error. I will not go more in depth on error handling than this, but it is yet another well thought-out Rust feature.

Concerning ownership, it has become standard practice to create a base struct that “holds” all of the device pins, peripherals (SPI, I2C, Serial), and other hardware components. From there, we’ve split the GPIOA bank of pins and put them into gpioa. And from there, we move pins pa0, pa1, and pa2 into an Rgb structure. All of this becomes useful if I do something like this later:

let rgb2 = Rgb::new(gpioa.pa2, gpioa.pa3, gpioa.pa4);

While constructing my second Rgb structure, I accidentally shifted my pin assignments by one! Again, the compiler saves me from myself, and I cannot possibly give the pin gpioa.pa2 to two Rgb structures by accident:

let rgb2 = Rgb::new(gpioa.pa2, gpioa.pa3, gpioa.pa4);
                    ^^^^^^^^^ value used here after move

The Borrow Checker

Sometimes hardware is shared. The I2C bus is a prime example in embedded systems where multiple device libraries will need access to the same bus in turn. So something like this may work:

i2c_device1.read_value(&mut i2c_bus);
i2c_device2.read_value(&mut i2c_bus);

Before I explain why this may or may not work, it’s worth noting the keyword mut which indicates that the function here may “mutate” the i2c_bus which it is getting a reference to. Indeed, it will be doing reads and writes to the bus, so it will most likely be twiddling the registers “owned” by this i2c_bus structure.

Now, the thorny details of whether this works depend on what actually happens in the definition of ‘read_value’. If the i2c_device library kept this mutable reference somewhere in its own memory, the second call would upset the compiler. This constrains the library to use the i2c_bus only within the function body, not to store the reference somewhere.

This type of thinking came from Rust’s concepts of “fearless concurrency” for multithreaded and multicore systems, but it is also valuable for interrupt driven applications in embedded systems. I can safely have access to these i2c_device structs in interrupt context now, because I am sure they aren’t both holding onto pointers to the same I2C bus and potentially pre-empting another mid-operation on that bus!

With this use-case in mind, we will go into the next section on Real-Time Interrupt-driven Concurrency.

Real-Time Interrupt-driven Concurrency

Real-Time Interrupt-driven Concurrency (RTIC) is an event framework written in Rust. The design was a shock to me initially as, contrary to common embedded development wisdom, it proposes to do almost everything within interrupt vectors.

It’s a radical idea that was originally implemented in C by Per Lindgren’s group at Luleå University of Technology in Luleå, Sweden. Recently, however, it was implemented entirely in Rust by a student, Jorge Aparicio. By leveraging the language’s features, the design reaches its full potential.

The basic idea is that the Cortex-M NVIC priorities are used for scheduling tasks. Pre-emption and stack management are implemented almost entirely in hardware, minimizing processing overhead and providing astonishingly fast scheduling. In addition, the framework’s tools for sharing data guarantee no deadlocks.

I will provide a few examples of how its Rust implementation provides this extraordinary tool in a uniquely safe way and with convenient and modern syntax.

Resources

A big part of RTIC’s utility is in its streamlined way of initializing and sharing global variables, something which is central to most embedded applications.

For example, let’s say I have an app that toggles an LED if a button OR a key is pressed. The declaration of these three resources happens at the top of the application definition:

#[app(device = stm32l0xx_hal::pac, peripherals = true)]
const APP: () = {
    struct Resources {
        button: gpiob::PB0<Input<PullUp>>,
        led: gpiob::PB6<Output<PushPull>>,
        rx: serial::Rx<DebugUsart>,
    }
    // ... init and tasks go here ...
};

You’ll notice the first section, #[app(…)], which is a Rust attribute. Attributes are a powerful feature that RTIC leverages ubiquitously to accomplish some wonderful things. In this case, it connects the RTIC event framework to a specific device, the stm32l0xx, and declares that our app takes ownership of the peripherals.

The next part is the declaration of Resources, which in this case is a promise that we will move the appropriate structs into the Resources at the end of the initialization routine. It looks something like this (details omitted for brevity):

#[init]
fn init(ctx: init::Context) -> init::LateResources {
   let button = gpiob.pb0.into_pull_up_input();
   let led = gpiob.pb6.into_push_pull_output();
   let (_tx, rx) = serial_peripheral
            .usart(tx_pin, rx_pin, serial::Config::default(), &mut rcc)
            .unwrap()
            .split();

   // Return the initialized resources.
   init::LateResources {
       button,
       led,
       rx,
   }
}

You’ll notice a little syntactical quality-of-life niceness here as the assignment of the LateResources fields happens implicitly as the local variables share the same name as the definition.

In addition, since I do nothing with the tx part of the serial struct, I assign it to _tx, a variable with the _ prefix, which is used to suppress compiler warnings about an unused variable.

Anyway, the initialization routine is complete! Let’s see how these Resources are used in tasks.

Tasks

The rest of the application is defined by hardware and software tasks. It would look something like this:

#[task(binds = USART1, priority = 1, resources = [rx, led])]
fn USART1(ctx: USART1::Context) {
   // Clear the received byte.
   ctx.resources.rx.read();
   ctx.resources.led.lock(|led| led.toggle());
}

#[task(binds = EXTI0_1, priority = 2, resources = [button, led])]
fn EXTI0_1(ctx: EXTI0_1::Context) {
   // Clear the interrupt flag.
   Exti::unpend(
      GpioLine::from_raw_line(ctx.resources.button.pin_number()).unwrap()
   );
   ctx.resources.led.toggle();
}

With these simple task definitions, we have our app defined: when a byte is received over UART we toggle the LED and when a button is pressed, we also toggle. What is really neat here is that we have used the attribute assignments to do a few things:

  1. Bind this task to an interrupt vector
  2. Make the button interrupt vector (EXTI0_1) higher priority than the USART1 vector (it may preempt it)
  3. Provide access to the LED resource to both interrupt vectors (as well as their own specific structs for button and USART receive)

What is compelling about RTIC’s implementation in Rust, other than these very concise and clean attribute definitions, is the fact that the type of the led resource changes depending on the priority of the task. That is to say, the highest priority task knows it cannot be preempted when accessing the ‘led’ resource, so it does not need to lock the resource; meanwhile, the “not-highest” priority task(s) must lock the resource to prevent preemption when manipulating this piece of shared memory.

This type of fancy footwork is sure to trip you up in C, as somebody might come along and create yet another higher priority task that uses the LED resource, making this direct access in the formerly “highest priority” task unsafe. You might not notice for quite some time before it causes problems. As such, most developers would avoid this altogether despite the efficiency gains. But with this clever use of Rust’s type-system, this efficiency may be fearlessly adopted.

Queues & Software Tasks

Let’s do a slight variant of this app where the LED is turned off by the button and on by serial bytes:

#[task(binds = USART1, priority = 2, resources = [rx], spawn = [led_on, led_off])]
fn USART1(ctx: USART1::Context) {
   // clear the byte
   ctx.resources.rx.read();
   ctx.spawn.led_on();
}

#[task(binds = EXTI0_1, priority = 2, resources = [button], spawn = [led_on, led_off])]
fn EXTI0_1(ctx: EXTI0_1::Context) {
   // Clear the interrupt flag.
   Exti::unpend(
      GpioLine::from_raw_line(ctx.resources.button.pin_number()).unwrap()
   ); 
   ctx.spawn.led_off();
}

#[task(capacity = 4, priority = 1, resources = [led])]
fn led_on(ctx: led_on::Context) {
   ctx.resources.led.on();
}

#[task(capacity = 4, priority = 1, resources = [led])]
fn led_off(ctx: led_off::Context) {
   ctx.resources.led.off();
}

// Interrupt handlers used to dispatch software tasks
extern "C" {
   fn USART4_USART5();
}

In this example, we essentially did away with resource sharing altogether in favor of “spawning tasks” to do our LED work. The LED work is done at lower priority, while capturing button presses and serial bytes is done at higher priority.

Also notice that the led_on/led_off tasks are pure software tasks. They are still scheduled by hijacking an interrupt vector; it could be any unused one, but we selected USART4_USART5 in this case.

Another noteworthy aspect is that we can declare queues dedicated for each of these software tasks; events may be captured and enqueued from higher priority and handled eventually by the software task when higher priority events are done running. All of this is scheduled with no overhead thanks to RTIC’s use of the NVIC hardware.

LoRaWAN in Rust

Rust has a fantastic built-in package manager which makes using external libraries very systematic and painless. Compared to C, where you are sure to spend some time making build systems agree, reaching for a third-party library, or “crate”, is dangerously easy.

That being said, sometimes the code doesn’t exist yet in Rust. In my opinion, that is the biggest challenge in embedded Rust today. In the embedded industry, it’s not uncommon to be forced to link against a bespoke binary blob provided by a silicon vendor; so we’re quite some time away from finding silicon drivers as Rust source code, and we’d better get used to binding to their C code.

To this end, I would encourage embedded developers who are curious about Rust to explore the interoperability of Rust and C. In this example, I show a crate of mine that wraps Semtech’s radio driver written in C such that it is accessible in Rust.

SX12xx in Rust

Writing a Rust binding to C code is a long topic, but I’d like to highlight some of the things that are remarkably easy and some of the core challenges I had.

The biggest challenge I had with the SX12xx drivers was that they rely heavily on globally linked functions and callbacks, which fits poorly with Rust’s notion of ownership.

For example, the radio driver would like access to a SPI struct so that it may read from and write to the radio. Typically, this is done by passing function pointers upon library initialization, but in this case, the driver requires certain functions to be declared globally. A function called SpiInOut that writes and reads SPI must be defined so it can be found at link time.

Another example is that the library expected a mutual exchange of callback pointers, so that the application could signal board events (such as a pin interrupt) to the library and, inversely, so that the driver could signal radio events (such as transmit complete) to the application.

To simplify the interface and try to converge on something more Rust-friendly, I created a layer of C to invert these models. In addition, I massaged the code slightly so that drivers for multiple radios could be built at the same time and everything could be compiled in C using CMake. By doing this work in C and CMake, I made generating Rust bindings for C quite simple. There is an optional build layer in Rust projects, where you declare a build.rs file, and the meat of it looks like this:

use std::env;
use std::path::PathBuf;
use cmake::Config;

fn main() {
  // Build the C library with CMake.
  let dst = Config::new("sx12xx")
               .define("BUILD_TESTING", "OFF")
               .define("CMAKE_C_COMPILER_WORKS", "1")
               .define("CMAKE_CXX_COMPILER_WORKS", "1")
               .pic(false)
               .build();

  println!("cargo:rustc-link-search=native={}/lib", dst.display());
  println!("cargo:rustc-link-lib=static=sx12xx");

  // Generate the Rust bindings.
  let bindings = bindgen::Builder::default()
     .raw_line("use cty;")
     .use_core()
     .ctypes_prefix("cty")
     .detect_include_paths(true)
     .header("sx12xx/sx12xx.h")
     .generate()
     .expect("Unable to generate bindings");

  // Write them where the crate can include them.
  let out_path = PathBuf::from(env::var("OUT_DIR").unwrap());
  bindings
     .write_to_file(out_path.join("bindings.rs"))
     .expect("Couldn't write bindings!");
}

This ends up being all it takes to build the C module and generate Rust bindings to it! In addition to the generated bindings, it’s generally recommended to create a hand-written Rust layer that does the final adjustments to make things friendly and “safe” for library clients. The goal is to make the generated “unsafe” calls to C functions safe for the client. Rust’s notion of unsafe is a topic of its own, but I basically want users of my Rust binding to be able to use my crate without concern for memory corruption or ownership (despite it calling into a language that does not verify such things at compile time). In other words, I’ve audited the underlying code, and my APIs cannot be abused to cause unexpected crashes.

A LoRaWAN Device Stack in Rust

The final piece that convinced me this project was possible was that a crate already existed with much of the LoRaWAN heavy-lifting accomplished. That is to say, packet serialization/encryption and deserialization/decryption were completed. In addition, there is ample testing AND it is no_std compatible (a requirement for bare-metal projects).

I had the opportunity to test-drive the crate with a simple Linux utility for sniffing LoRaWAN packets and had some very positive interactions with the rust-lorawan crate author, Ivaylo Petrov. As seems typical of Rust contributors, he was passionate about the work and enthusiastic about adopting feedback.

With a low-level radio driver in place and the foundation of a LoRaWAN stack, I was just a few days of hacking away from sending what I believe are the first Rust LoRaWAN packets from a bare-metal device. The LoRaWAN Stack is minimal but accomplishes an OTAA JoinRequest/Accept sequence to create a Session and sends encrypted packets. I worked with Ivaylo to find a way to merge my work with his crate, so that rust-lorawan initiatives can live under one repo and continue to grow!

The example is available here. This project works with the readily available STM32 LoRa Discovery Board. With no additional hardware, you can connect the device to the Helium Network for free (assuming coverage is available in your area). Follow this simple guide for creating free credentials for your device.

Final Notes

I hope this article has shown some compelling examples about how Rust fits well in an embedded workflow. If it’s made you interested in dipping your toes into Rust, I would recommend two things:

  • Start by writing a simple Linux or Windows application; allowing yourself to rely on the standard library leads to more mainstream and idiomatic Rust which will make it a lot easier to get started. Once you understand the general mechanics of the language, switching to writing bare metal Rust is a little less overwhelming
  • When it comes to starting with bare metal Rust, start with the book, which includes a fantastic primer to Rust. Also, try to use a well-supported MCU such as the nRF52; there is even a Rust BLE library which is primarily tested on that platform

For those that fancy a challenge, there are even RISC-V projects out there.

Finally, feel free to drop me a line! thiery.louis@gmail.com

6 Replies to “Insights from Writing an Embedded IoT App (Almost) Entirely in Rust”

  1. Thank you for the insights. I’ve been interested in starting learning Rust for a while and this post got me going to finally do it. Just wanted to let you know that the “RISC-V projects” link seems to be broken.

  2. I’m a little confused by the two-priority LED toggling example. If I think about this in the context of C, there’s nothing that would prevent EXTI0_1 from firing while USART1 is toggling the LED. Does that mean that ctx.resources.led.lock() is disabling the EXTI0_1 interrupt? I don’t see how else it would work. The implication here seems to be that Rust knows to do this from the list of bound resources in the task attributes. This implies knowledge of all tasks that declare binding to a resource in the implementation of that resource’s lock function. Is my understanding here correct?

  3. I don’t understand the i2c sharing example under “The Borrow Checker”. Only one mutable reference to an object can exist at a time. How is the Rust compiler guaranteeing this for interrupt service routines?

    The explanation at https://doc.rust-lang.org/book/ch04-02-references-and-borrowing.html describes the compile-time check in the context of a single, linearly-executed scope. That’s all well and good, but it doesn’t explain sharing between interrupt contexts. Both of the i2c device objects still need mutable access to the bus, and they’re both still operating in separate interrupt contexts. So, how does the compiler guarantee that one ISR will not preempt the other ISR in the midst of the read_value() method?

  4. If I think about this in the context of C, there’s nothing that would prevent EXTI0_1 from firing while USART1 is toggling the LED. Does that mean that ctx.resources.led.lock() is disabling the EXTI0_1 interrupt?

    What’s happening here is that we’ve yielded control of the shared resources and interrupts to the RTIC framework. Using the attributes in Rust (eg: #[app...] or #[task...]), we tell RTIC about priority levels and shared resources, and it indeed masks higher-priority interrupts while we access a shared resource that a higher-priority task (or IRQ) could otherwise touch mid-operation.

    That is to say, you’re absolutely right: ctx.resources.led.lock() disables the EXTI0_1 interrupt.

  5. So, how does the compiler guarantee that one ISR will not preempt the other ISR in the midst of the read_value() method?

    In the borrow checker section, I hadn’t touched on ISRs yet and was assuming that these functions are called sequentially in the same thread. What I emphasize there is that the read function “drops” the reference after executing rather than storing it in memory owned by that object.

    With an Interrupt-driven program, I would refer to the RTIC examples of how to do it “safely” in Rust. Rather than an LED, we would declare the I2C bus as a shared resource and RTIC would help us arbitrate the sharing of that resource.

    You can do something similar without using RTIC, but reading/writing to shared global (static) memory is inherently considered unsafe in Rust and you would need to add in the logic to make it safe. RTIC pretty much does that for you.

    Hope that answers your question… Let me know if you have more!
