crect: A C++14 Library for Generating a Stack Resource Policy Scheduler at Compile Time

The crect project (pronounced correct) is a C++14 library for generating a scheduler for Cortex-M microcontrollers at compile-time. crect uses the Cortex-M's Nested Vector Interrupt Controller (NVIC) to implement a Stack Resource Policy (SRP) scheduler which guarantees deadlock-free and data-race-free execution.

crect is built upon the Kvasir Meta-programming Library, which is also the foundation of the Kvasir framework. Use of C++ metaprogramming and C++14 features allows priority ceilings and interrupt masks to be calculated at compile time. Resource locks are handled through RAII, and resource access is handled using a monitor pattern.

The most impressive thing about this framework is its minimal resource requirements:

  • ~400 bytes of memory for the static elements (linked list, SysTick, time implementation)
  • 4-5 instructions per job for initializing the NVIC
  • 2-3 instructions per queue element for initializing the asynchronous queue
  • 3-4 instructions + 4 bytes of stack space for a lock
  • 1-3 instructions for an unlock
  • 2-4 instructions for a pend/clear
  • 20-30 instructions per item in the queue for async

If you are working on a bare-metal ARM program with real-time concerns, crect is an RTOS alternative that provides protection against common multithreading problems like priority inversion, deadlocks, and race conditions.

Further Reading

For more on crect:

Related Posts

FreeRTOS Task Notifications: A Lightweight Method for Waking Threads

I was recently implementing a FreeRTOS-based system and needed a simple way to wake my thread from an ISR. I was poking around the FreeRTOS API manual looking at semaphores when I discovered a new feature: task notifications.

FreeRTOS claims that waking up a task using the new notification system is ~45% faster and uses less RAM than using a binary semaphore.

The following APIs are used to interact with task notifications: xTaskNotifyGive(), vTaskNotifyGiveFromISR(), ulTaskNotifyTake(), xTaskNotify(), xTaskNotifyFromISR(), and xTaskNotifyWait().

The notification system utilizes the 32-bit task handle (TaskHandle_t) that is set during the task creation process:

BaseType_t xTaskCreate(    TaskFunction_t pvTaskCode,
                            const char * const pcName,
                            unsigned short usStackDepth,
                            void *pvParameters,
                            UBaseType_t uxPriority,
                            TaskHandle_t *pxCreatedTask );
Task notifications can be used to emulate mailboxes, binary semaphores, counting semaphores, and event groups.

While task notifications have speed and RAM advantages over other FreeRTOS structures, they are limited in that they can only be used to wake a single task with an event. If multiple tasks must be woken by an event, you must utilize the traditional structures.

Real World Example

I took advantage of the task notifications to implement a simple thread-waking scheme.

I have a button which can be used to request a system power state change. The power state changes can take quite a bit of time, and we don't want to be locked up inside of our interrupt handler for 1-10s.

Instead, I created a simple thread which sleeps until a notification is received. When the thread wakes, we check to see if there is an action we need to take:

void powerTaskEntry(__unused void const* argument)
{
    static uint32_t thread_notification;

    while(1)
    {
        /* Sleep until we are notified of a state change by an
        * interrupt handler. Note the first parameter is pdTRUE,
        * which has the effect of clearing the task's notification
        * value back to 0, making the notification value act like
        * a binary (rather than a counting) semaphore.  */
        thread_notification = ulTaskNotifyTake(pdTRUE, portMAX_DELAY);

        if(thread_notification)
        {
            // Check for a pending power state change and handle it
        }
    }
}


In order to wake the thread, we send a notification from the interrupt handler:

void system_power_interrupt_handler(uint32_t time)
{
    BaseType_t xHigherPriorityTaskWoken = pdFALSE;

    if(/* condition elided in the original excerpt */)
    {
        // Perform a reset of computers
        computer_reset_request_ = RESET_ALL_COMPUTERS;
    }
    else if((time >= SYS_RESET_SHUTDOWN_THRESHOLD_MS) &&
            /* second condition elided in the original excerpt */)
    {
        // ...
    }

    // Notify the thread so it will wake up when the ISR is complete
    // (task handle name illustrative)
    vTaskNotifyGiveFromISR(power_task_handle_, &xHigherPriorityTaskWoken);
    portYIELD_FROM_ISR(xHigherPriorityTaskWoken);
}
Note the use of portYIELD_FROM_ISR(). This is required when waking a task from an interrupt handler. If vTaskNotifyGiveFromISR indicates that a higher priority task is being woken, the portYIELD_FROM_ISR() routine will context switch to that task after returning from the ISR.

Failure to use this function will result in execution resuming at the previous point rather than switching to the new context.

Further Reading

Refactoring the ThreadX Dispatch Queue To Use std::mutex

Now that we've implemented std::mutex for an RTOS, let's refactor a library that uses RTOS-specific calls so that it uses std::mutex instead.

Since we have a ThreadX implementation for std::mutex, let's update our ThreadX-based dispatch queue. Moving to std::mutex will result in a simpler code structure. We still need to port std::thread and std::condition_variable to achieve true portability, but utilizing std::mutex is still a step in the right direction.

For a quick refresher on dispatch queues, refer to the following articles:

Table of Contents

  1. How std::mutex Helps Us
    1. C++ Mutex Wrappers
  2. Refactoring the Asynchronous Dispatch Queue
    1. Class Definition
    2. Constructor
    3. Destructor
    4. Dispatch
    5. Thread Handler
  3. Putting It All Together
  4. Further Reading

How std::mutex Helps Us

Even though we can't yet make our dispatch queue fully portable, we still benefit from using std::mutex in the following ways:

  1. We no longer have to worry about initializing or deleting our mutex since the std::mutex constructor and destructor handles that for us
  2. We can take advantage of RAII to lock whenever we enter a scope, and to automatically unlock when we leave that scope
  3. We can utilize standard calls (with no arguments!), reducing the burden of remembering the exact ThreadX functions and arguments

These arguments might not seem to have a real impact until you compare the code. Take a look at the ThreadX native calls:

uint8_t status = tx_mutex_get(&mutex_, TX_WAIT_FOREVER);

// do some stuff

status = tx_mutex_put(&mutex_);

And here's the std::mutex equivalent:

mutex_.lock();

// do some stuff

mutex_.unlock();

Don't you prefer the std::mutex version?

C++ Mutex Wrappers

While we could manually call lock() and unlock() on our mutex object, we'll utilize two helpful C++ mutex constructs: std::lock_guard and std::unique_lock.

The std::lock_guard wrapper provides an RAII mechanism for our mutex. When we construct a std::lock_guard, the mutex starts in a locked state (or waits to grab the lock). Whenever we leave that scope the mutex will be released automatically. A std::lock_guard is especially useful in functions that can return at multiple points. No longer do you have to worry about releasing the mutex at each exit point: the destructor has your back.

We'll also take advantage of the std::unique_lock wrapper. Using std::unique_lock provides similar benefits to std::lock_guard: the mutex is locked when the std::unique_lock is constructed, and unlocked automatically during destruction. However, it provides much more flexibility than std::lock_guard, as we can manually call lock() and unlock(), transfer ownership of the lock, and use it with condition variables.

Refactoring the Asynchronous Dispatch Queue

We will utilize both std::lock_guard and std::unique_lock to simplify our ThreadX dispatch queue.

Our starting point for this refactor will be the dispatch_threadx.cpp file in the embedded-resources repository.

Class Definition

In our dispatch class, we need to change the mutex definition from TX_MUTEX to std::mutex:

std::mutex mutex_;


Mutex initialization is handled for us by the std::mutex constructor. We can remove the tx_mutex_create call from our dispatch queue constructor:

// Initialize the Mutex
uint8_t status = tx_mutex_create(&mutex_, "Dispatch Mutex", TX_INHERIT);
assert(status == TX_SUCCESS && "Failed to create mutex!");


Mutex deletion is handled for us by the std::mutex destructor. We can remove the tx_mutex_delete call from the dispatch queue destructor:

status = tx_mutex_delete(&mutex_);
assert(status == TX_SUCCESS && "Failed to delete mutex!");


By using std::lock_guard, we can remove both the mutex get and put calls. RAII will ensure that the mutex is unlocked when we leave the function.

Here's the dispatch() implementation using std::lock_guard:

void dispatch_queue::dispatch(const fp_t& op)
{
    std::lock_guard<std::mutex> lock(mutex_);
    q_.push(op);

    // Notifies threads that new work has been added to the queue
    tx_event_flags_set(&notify_flags_, DISPATCH_WAKE_EVT, TX_OR);
}

If you want to unlock the mutex before setting the event flag, use std::unique_lock instead of std::lock_guard. Using std::unique_lock allows you to call unlock() manually:

void dispatch_queue::dispatch(const fp_t& op)
{
    std::unique_lock<std::mutex> lock(mutex_);
    q_.push(op);
    lock.unlock();

    // Notifies threads that new work has been added to the queue
    tx_event_flags_set(&notify_flags_, DISPATCH_WAKE_EVT, TX_OR);
}

Either approach is acceptable and looks much cleaner than the native calls.

Why might you care about calling unlock() early? If you use std::lock_guard, setting the event flag can wake a waiting thread, which then tries to grab the mutex and immediately goes back to sleep until dispatch() exits. Once dispatch() releases the mutex, the waiting thread wakes a second time and resumes operation. Unlocking before setting the flag avoids that extra wake/sleep round trip.

Thread Handler

We need to manually lock and unlock around specific points in our thread handler. Instead of std::lock_guard, we will use std::unique_lock so we can call unlock().

Here's our simplified thread handler:

void dispatch_queue::dispatch_thread_handler(void)
{
    ULONG flags;
    uint8_t status;

    std::unique_lock<std::mutex> lock(mutex_);

    do {
        //after wait, we own the lock
        if(q_.size() && !quit_)
        {
            auto op = std::move(q_.front());
            q_.pop();

            //unlock now that we're done messing with the queue
            lock.unlock();

            op();

            lock.lock();
        }
        else if(!quit_)
        {
            lock.unlock();

            // Wait for new work
            status = tx_event_flags_get(&notify_flags_, DISPATCH_WAKE_EVT,
                TX_OR_CLEAR, &flags, TX_WAIT_FOREVER);
            assert(status == TX_SUCCESS && 
                "Failed to get event flags!");

            lock.lock();
        }
    } while (!quit_);

    // We were holding the mutex after we woke up
    lock.unlock();

    // Set a signal to indicate a thread exited
    // (exit flag name illustrative)
    status = tx_event_flags_set(&notify_flags_, DISPATCH_EXIT_EVT, TX_OR);
    assert(status == TX_SUCCESS && "Failed to set event flags!");
}

Looks a bit saner already!

Putting It All Together

Example source code can be found in the embedded-resources GitHub repository. The original ThreadX dispatch queue implementation can also be found in embedded-resources.

To build the example, run make at the top-level or inside of the examples/cpp folder.

The example is built as a static library. ThreadX headers are provided in the repository, but not binaries or source.

As we implement std::thread and std::condition_variable in the future, we will simplify our RTOS-based dispatch queue even further.

Further Reading