Migrating from C to C++

Refactoring the ThreadX Dispatch Queue To Use std::mutex

Now that we've implemented std::mutex for an RTOS, let's refactor a library that uses RTOS-specific calls so that it uses std::mutex instead.

Since we have a ThreadX implementation for std::mutex, let's update our ThreadX-based dispatch queue. Moving to std::mutex will result in a simpler code structure. We still need to port std::thread and std::condition_variable to achieve true portability, but utilizing std::mutex is a step in the right direction.

For a quick refresher on dispatch queues, refer to the earlier dispatch queue articles.

Table of Contents

  1. How std::mutex Helps Us
    1. C++ Mutex Wrappers
  2. Refactoring the Asynchronous Dispatch Queue
    1. Class Definition
    2. Constructor
    3. Destructor
    4. Dispatch
    5. Thread Handler
  3. Putting It All Together
  4. Further Reading

How std::mutex Helps Us

Even though we can't yet make our dispatch queue fully portable, we still benefit from using std::mutex in the following ways:

  1. We no longer have to worry about initializing or deleting our mutex since the std::mutex constructor and destructor handle that for us
  2. We can take advantage of RAII to lock whenever we enter a scope, and to automatically unlock when we leave that scope
  3. We can utilize standard calls (with no arguments!), reducing the burden of remembering the exact ThreadX functions and arguments

If these arguments don't seem compelling on their own, just compare the code. Here are the native ThreadX calls:

uint8_t status = tx_mutex_get(&mutex_, TX_WAIT_FOREVER);

// do some stuff

status = tx_mutex_put(&mutex_);

And here's the std::mutex equivalent:

mutex_.lock();

// do some stuff

mutex_.unlock();

Don't you prefer the std::mutex version?

C++ Mutex Wrappers

While we could manually call lock() and unlock() on our mutex object, we'll utilize two helpful C++ mutex constructs: std::lock_guard and std::unique_lock.

The std::lock_guard wrapper provides an RAII mechanism for our mutex. When we construct a std::lock_guard, the mutex starts in a locked state (or waits to grab the lock). Whenever we leave that scope the mutex will be released automatically. A std::lock_guard is especially useful in functions that can return at multiple points. No longer do you have to worry about releasing the mutex at each exit point: the destructor has your back.

We'll also take advantage of the std::unique_lock wrapper. Using std::unique_lock provides similar benefits to std::lock_guard: the mutex is locked when the std::unique_lock is constructed, and unlocked automatically during destruction. However, it provides much more flexibility than std::lock_guard, as we can manually call lock() and unlock(), transfer ownership of the lock, and use it with condition variables.
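
Here's a minimal sketch of both wrappers in action; the shared counter and limit are hypothetical, purely for illustration:

#include <mutex>

std::mutex m;
int shared_count = 0;

// std::lock_guard releases the mutex at every return point automatically
bool increment_if_below(int limit)
{
    std::lock_guard<std::mutex> lock(m);

    if(shared_count >= limit)
    {
        return false; // mutex released here...
    }

    shared_count++;
    return true; // ...and here
}

// std::unique_lock lets us unlock early and re-lock as needed
void increment_twice_with_gap()
{
    std::unique_lock<std::mutex> lock(m);
    shared_count++;
    lock.unlock();

    // do work that doesn't touch shared state

    lock.lock();
    shared_count++;
} // the unique_lock destructor releases the re-acquired mutex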

Refactoring the Asynchronous Dispatch Queue

We will utilize both std::lock_guard and std::unique_lock to simplify our ThreadX dispatch queue.

Our starting point for this refactor will be the dispatch_threadx.cpp file in the embedded-resources repository.

Class Definition

In our dispatch class, we need to change the mutex definition from TX_MUTEX to std::mutex:

std::mutex mutex_;

Constructor

Mutex initialization is handled for us by the std::mutex constructor. We can remove the tx_mutex_create call from our dispatch queue constructor:

// Initialize the Mutex
uint8_t status = tx_mutex_create(&mutex_, "Dispatch Mutex", TX_INHERIT);
assert(status == TX_SUCCESS && "Failed to create mutex!");

Destructor

Mutex deletion is handled for us by the std::mutex destructor. We can remove the tx_mutex_delete call from the dispatch queue destructor:

status = tx_mutex_delete(&mutex_);
assert(status == TX_SUCCESS && "Failed to delete mutex!");

Dispatch

By using std::lock_guard, we can remove both the mutex get and put calls. RAII will ensure that the mutex is unlocked when we leave the function.

Here's the dispatch() implementation using std::lock_guard:

void dispatch_queue::dispatch(const fp_t& op)
{
    std::lock_guard<std::mutex> lock(mutex_);

    q_.push(op);

    // Notifies threads that new work has been added to the queue
    tx_event_flags_set(&notify_flags_, DISPATCH_WAKE_EVT, TX_OR);
}

If you want to unlock the mutex before setting the event flag, use std::unique_lock instead of std::lock_guard. Using std::unique_lock allows you to call unlock() manually:

void dispatch_queue::dispatch(const fp_t& op)
{
    std::unique_lock<std::mutex> lock(mutex_);

    q_.push(op);

    lock.unlock();

    // Notifies threads that new work has been added to the queue
    tx_event_flags_set(&notify_flags_, DISPATCH_WAKE_EVT, TX_OR);
}

Either approach is acceptable and looks much cleaner than the native calls.

Why would you care about unlocking early? With std::lock_guard, setting the event flags can wake a waiting thread, which will then try to grab the mutex, fail, and go back to sleep until dispatch() exits and releases the mutex. Unlocking before setting the flags avoids that extra wake-block-sleep cycle: by the time a waiting thread wakes, the mutex is already free.

Thread Handler

We need to manually lock and unlock around specific points in our thread handler. Instead of std::lock_guard, we will use std::unique_lock so we can call unlock().

Here's our simplified thread handler:

void dispatch_queue::dispatch_thread_handler(void)
{
    ULONG flags;
    uint8_t status;

    std::unique_lock<std::mutex> lock(mutex_);

    do {
        //after wait, we own the lock
        if(q_.size() && !quit_)
        {
            auto op = std::move(q_.front());
            q_.pop();

            //unlock now that we're done messing with the queue
            lock.unlock();

            op();

            lock.lock();
        }
        else if(!quit_)
        {
            lock.unlock();

            // Wait for new work
            status = tx_event_flags_get(&notify_flags_, 
                DISPATCH_WAKE_EVT, 
                TX_OR_CLEAR,
                &flags, 
                TX_WAIT_FOREVER);
            assert(status == TX_SUCCESS && 
                "Failed to get event flags!");

            lock.lock();
        }
    } while (!quit_);

    // We were holding the mutex after we woke up
    lock.unlock();

    // Set a signal to indicate a thread exited
    status = tx_event_flags_set(&notify_flags_, 
        DISPATCH_EXIT_EVT, TX_OR);
    assert(status == TX_SUCCESS && "Failed to set event flags!");
}

Looks a bit saner already!

Putting It All Together

Example source code can be found in the embedded-resources GitHub repository, alongside the original ThreadX dispatch queue implementation.

To build the example, run make at the top-level or inside of the examples/cpp folder.

The example is built as a static library. ThreadX headers are provided in the repository, but not binaries or source.

As we implement std::thread and std::condition_variable in the future, we will simplify our RTOS-based dispatch queue even further.

Further Reading

Choosing the Right Container: Sequential Containers

In a previous article, I provided an overview of the C++ standard container types, general rules of thumb for selecting a container, and a more detailed look at the associative containers. Let's take a look at guidelines we can use to select the right sequential container.

Sequential Container Review

SequenceContainers should be used when you care that your memory is stored sequentially or when you want to access the data sequentially. Here's a quick summary of the sequential containers:

  • std::array - static contiguous array, providing fast access but with a fixed number of elements
  • std::vector - dynamic contiguous array, providing fast access but costly insertions/deletions
  • std::deque - double-ended queue providing efficient insertion/deletion at the front and back of a sequence
  • std::list and std::forward_list - linked list data structures, allowing for efficient insertion/deletion into the middle of a sequence

Container Adapters

Container adapters are a special type of container class. They are not full container classes on their own, but wrappers around other container types (such as a vector, deque, or list). The container adapters encapsulate the underlying container type and limit the user interfaces accordingly.

The standard container adapters are:

  • stack - adapter providing a LIFO data structure
  • queue - adapter providing a FIFO data structure
  • priority_queue - adapter providing a priority queue, which allows for constant-time lookup of the largest element (by default)

General Rules of Thumb

There are some general rules of thumb that will guide you through most situations:

  • Use sequential containers when you need to access elements by position
  • Use std::vector as your default sequential container, especially as an alternative to built-in arrays
  • If you add or remove elements frequently at both the front and back of a container, use std::deque
  • Use a std::list (not std::deque) if you need to insert/remove elements in the middle of the sequence
  • Do not use std::list if you need random access to objects
  • Prefer std::vector over std::list if your system uses a cache
  • std::string is almost always better than a C-string
  • If you need to limit the interfaces, use a container adapter

Memory allocation may also be a factor in your decision. Here are the general rules of thumb for how the different sequential containers store memory:

  • std::vector, std::array, and std::string store memory contiguously and are compatible with C-style APIs (see the sketch after this list)
  • std::deque allocates memory in chunks
  • std::list allocates memory by node
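
For example, contiguous storage means you can pass a buffer straight to a C-style API. This is a minimal sketch; read_sensor_data() is a hypothetical C function:

#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical C-style API that fills a raw buffer
extern "C" int read_sensor_data(std::uint8_t* buffer, std::size_t length);

void fill_from_c_api()
{
    std::vector<std::uint8_t> buffer(128);

    // data() exposes the contiguous underlying storage
    read_sensor_data(buffer.data(), buffer.size());
}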

Container Adapters

Container adapters take a container type and limit the interfaces to provide a specific data type. These containers will exhibit the characteristics of their underlying data structures, so you may not always want to utilize the default underlying container.

Generally you will want to:

  • Use std::stack when you need a LIFO data structure
  • Use std::queue when you need a FIFO data structure

std::priority_queue is a bit more complex. The container utilizes a Compare function (std::less by default) and provides constant-time lookup of the largest (by default) element in the container. This structure is useful for implementing things like an ISR handler queue, where you always want to be working through the highest-priority interrupt in the queue. This feature comes with an increased cost for element insertion.
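
Here's a sketch of that ISR-handler idea; the irq_event type and its fields are hypothetical:

#include <queue>
#include <vector>

struct irq_event
{
    int priority;   // higher value = more urgent
    int irq_number;
};

// Order events so that top() always yields the most urgent one
struct compare_priority
{
    bool operator()(const irq_event& a, const irq_event& b) const
    {
        return a.priority < b.priority;
    }
};

std::priority_queue<irq_event, std::vector<irq_event>, compare_priority> irq_queue;

void process_pending_interrupts()
{
    while(!irq_queue.empty())
    {
        irq_event evt = irq_queue.top(); // highest-priority event
        irq_queue.pop();
        // ...dispatch the handler for evt.irq_number...
    }
}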

Container Analysis

Let's take a look at the various sequential containers. Since the container adapters are simply add-ons to the primary sequential containers, we will not be evaluating those below. Select the underlying container appropriately for your use case, and use the adapter to implement the specific interface you need.

std::vector

Your default sequential container should be std::vector. Generally, std::vector will provide you with the right balance of performance and flexibility. The std::vector container is similar to a C-style array that can grow or shrink during runtime. The underlying buffer is stored contiguously and is guaranteed to be compatible with C-style arrays.

Consider using a std::vector if:

  • You need your data to be stored contiguously in memory
    • Especially useful for C-style API compatibility
  • You do not know the size at compile time
  • You need efficient random access to your elements (O(1))
  • You will be adding and removing elements from the end
  • You want to iterate over the elements in any order

Avoid using a std::vector if:

  • You will frequently add or remove elements to the front or middle of the sequence
  • The size of your buffer is constant and known in advance (prefer std::array)

Be aware of the std::vector<bool> specialization: since C++98, std::vector<bool> has been specialized so that each element occupies only a single bit. As a consequence, element access does not return a bool&: the accessors return a proxy object (or a copy of the bool value), so code that expects references to elements may not compile.
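
A minimal sketch of the consequence:

#include <vector>

void vector_bool_example()
{
    std::vector<bool> flags(8, false);

    // bool& ref = flags[0];  // will not compile: operator[] returns a proxy
    flags[0] = true;          // writes through the proxy into the packed bit

    bool copy = flags[0];     // an independent copy of the value
    copy = false;             // flags[0] is still true
}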

std::array

The std::array container is the most like a built-in array, but it offers extra features such as bounds-checked access (via at()) and knowledge of its own size. Unlike std::vector, the size of a std::array is fixed and cannot change during runtime (see the sketch at the end of this section).

Consider using a std::array if:

  • You need your data to be stored contiguously in memory
    • Especially useful for C-style API compatibility
  • The size of your array is known in advance
  • You need efficient random access to your elements (O(1))
  • You want to iterate over the elements in any order

Avoid using a std::array if:

  • You need to insert or remove elements
  • You don't know the size of your array at compile time
  • You need to be able to resize your array dynamically
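
A short sketch of std::array in use (the coefficient data is arbitrary):

#include <array>

// The size is part of the type; storage lives inline with no heap allocation
std::array<int, 4> coefficients = {1, 2, 3, 4};

void array_example()
{
    // at() provides bounds-checked access and throws std::out_of_range
    int first = coefficients.at(0);

    // data() exposes contiguous storage for C-style APIs
    int* raw = coefficients.data();

    (void)first;
    (void)raw;
}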

std::deque

The std::deque container gets its name from a shortening of "double ended queue". The std::deque container is most efficient when appending items to the front or back of a queue. Unlike std::vector, std::deque does not provide a mechanism to reserve a buffer. The underlying buffer is also not guaranteed to be compatible with C-style array APIs.

Consider using std::deque if:

  • You need to insert new elements at both the front and back of a sequence (e.g. in a scheduler)
  • You need efficient random access to your elements (O(1))
  • You want the internal buffer to automatically shrink when elements are removed
  • You want to iterate over the elements in any order

Avoid using std::deque if:

  • You need to maintain compatibility with C-style APIs
  • You need to reserve memory ahead of time
  • You need to frequently insert or remove elements from the middle of the sequence
    • Calling insert in the middle of a std::deque invalidates all iterators and references to its elements

std::list

The std::list and std::forward_list containers implement linked list data structures. Where std::list provides a doubly-linked list, the std::forward_list only contains a pointer to the next object. Unlike the other sequential containers, the list types do not provide efficient random access to elements. Each element must be traversed in order.

Consider using std::list if:

  • You need to store many items but the number is unknown
  • You need to insert or remove new elements from any position in the sequence
  • You do not need efficient access to random elements
  • You want the ability to move elements or sets of elements within the container or between different containers (see the splice() sketch after this section)
  • You want to implement a node-wise memory allocation scheme

Avoid using std::list if:

  • You need to maintain compatibility with C-style APIs
  • You need efficient access to random elements
  • Your system utilizes a cache (prefer std::vector for reduced cache misses)
  • The size of your data is known in advance and can be managed by a std::vector
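
For example, splice() moves nodes between lists without copying or reallocating elements. A minimal sketch:

#include <list>

void splice_example()
{
    std::list<int> pending = {1, 2, 3};
    std::list<int> completed;

    // Move the first pending node into completed; nothing is copied,
    // and iterators/references to the moved element remain valid
    completed.splice(completed.end(), pending, pending.begin());

    // pending now holds {2, 3}; completed holds {1}
}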

Up Next

That wraps up my general overview of C++ STL containers. In future articles, I will be using these STL containers to implement embedded systems programming constructs.

Further Reading

Choosing the Right Container: Associative Containers

I recently provided an overview of the C++ standard container types as well as high-level guidelines for selecting the right container. Today we will be taking a look at the tradeoffs of the STL associative containers.

In general, there are two types of associative containers:

  1. AssociativeContainer, which represents an ordered container utilizing key-based lookup
  2. UnorderedAssociativeContainer, which utilizes hashing for key-based lookup

The ordered associative containers are:

  • std::set, which implements a collection of unique keys, sorted by keys
  • std::map, which implements a collection of key-value pairs, sorted by unique keys
  • std::multiset, which implements a set that allows multiple entries for each key
  • std::multimap, which implements a map that allows multiple entries for each key

The unordered associative containers are:

  • std::unordered_set, which implements a collection of unique keys, hashed by keys
  • std::unordered_map, which implements a collection of key-value pairs, hashed by unique keys
  • std::unordered_multiset, which implements a set that allows multiple entries for each key, hashed by keys
  • std::unordered_multimap, which implements a map that allows multiple entries for each key, hashed by keys

Selecting between Ordered and Unordered Associative Containers

One of the biggest questions I've had is "when should I use a map vs. an unordered_map"? There are certainly plenty of opinions around the internet, and I haven't achieved full clarity yet. However, I have distilled some general details and rules of thumb to help you decide between ordered and unordered associative containers.

If you are only worried about performance, I'll save you some time: it seems that unordered_map has better performance in general.

Ordered Containers

Each of the ordered associative containers keeps its keys sorted, allowing for O(log n) search complexity.

The ordered associative containers use a node-based allocation scheme. Pointers and references to all elements are stable, and in C++17 you can even move nodes from one map to another (via extract() and merge()) without invalidating pointers or references to the moved elements. Node-based memory allocation keeps ordered container function timings more consistent, since they never need to make large allocations.

By using an ordered container, you are able to implement efficient range searches or iterate over subsets (e.g. operate on all elements where key > 100).
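
For instance, here's a minimal sketch of a range query using std::map::upper_bound:

#include <map>
#include <string>

void range_query_example()
{
    std::map<int, std::string> records = {
        {50, "low"}, {100, "mid"}, {150, "high"}
    };

    // upper_bound(100) returns an iterator to the first element whose
    // key compares greater than 100, so this loop visits keys > 100 only
    for(auto it = records.upper_bound(100); it != records.end(); ++it)
    {
        // operate on it->first / it->second
    }
}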

If you want to use a user-defined object for a key, you'll need to define a less-than operator or implement a new comparison function. This can be much simpler than implementing a hashing function for your custom data type.
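
Here's a sketch using a hypothetical device_id key type:

#include <map>
#include <string>
#include <tuple>

struct device_id
{
    int bus;
    int address;
};

// std::map uses std::less by default, which relies on operator<
bool operator<(const device_id& lhs, const device_id& rhs)
{
    return std::tie(lhs.bus, lhs.address) < std::tie(rhs.bus, rhs.address);
}

std::map<device_id, std::string> device_names;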

You should use an ordered associative container if:

  • You prefer a tree structure
  • You don't want the memory overhead of storing the hash table
  • Ordering or ordered traversal of your data set matters
  • A good hash of your key data is not possible or is too slow
  • You prefer to write a comparison operator for your custom data type instead of a hashing operator
  • You need guaranteed performance (e.g. embedded systems)
    • Unordered containers have O(n) worst-case complexity in collision cases

Unordered Containers

Each of the unordered associative containers uses hashed data structures that can be searched in amortized constant time (O(1) amortized). This means that most operations take constant time, though collisions can increase the operational time to O(n) in the worst case.

Unordered containers require additional memory to store the hash table.

If you want to use a user-defined object for a key, you'll need to provide a std::hash<T> specialization for your key type T and overload the == operator.
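
Continuing with the hypothetical device_id key type from above, here's a sketch of the unordered equivalent:

#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

struct device_id
{
    int bus;
    int address;
};

// operator== is required so the container can resolve hash collisions
bool operator==(const device_id& lhs, const device_id& rhs)
{
    return lhs.bus == rhs.bus && lhs.address == rhs.address;
}

// Provide a std::hash specialization for the key type
namespace std
{
template <>
struct hash<device_id>
{
    size_t operator()(const device_id& id) const
    {
        // Simple hash combination; a production implementation may
        // want a stronger mixing function
        return (hash<int>()(id.bus) << 1) ^ hash<int>()(id.address);
    }
};
} // namespace std

std::unordered_map<device_id, std::string> device_names;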

You should use an unordered associative container if:

  • You have the memory for a hash table
  • You can absorb the occasional long operation
  • You have a good hashing function to reduce collisions
  • You are using string data as a key

Differentiating the Containers

Most differences between the various types of associative containers are connected to the AssociativeContainer and UnorderedAssociativeContainer concepts. Since we've already reviewed these two major categories, let's briefly review the subtypes: set, multiset, map, and multimap. The notes below apply to both ordered and unordered associative containers.

The map containers support key-value pairs. If you're just looking for a simple lookup table implementation (or a Perl-like associative array), the map subtype is the container for you. The map containers support unique keys: writing to a key which already exists in the structure will overwrite the stored value.

The set containers support key-only entries. Like map, set only supports unique keys, though writing to a key which already exists is a wasted operation. A set is useful for storing a collection of unique things, such as "topics we are subscribed to" or "types of food that our restaurant sells". If you need to store a value with your key, you need a map instead.
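
A quick sketch of both behaviors:

#include <map>
#include <set>
#include <string>

void unique_key_example()
{
    std::map<std::string, int> inventory;
    inventory["apples"] = 5;
    inventory["apples"] = 7;  // overwrites: inventory["apples"] == 7

    std::set<std::string> topics;
    topics.insert("sensors");
    topics.insert("sensors"); // wasted operation: the key already exists

    // topics.size() == 1
}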

The multimap and multiset containers operate the same as map and set but relax the requirement that keys must be unique.

The multimap containers are useful for grouping objects that are related together by key. For example:

  • ZIP code for the key, and people who reside at that ZIP code as the value
  • Word as the key, and definition(s) as the value

A multiset is similar in concept to a key with an associated counter: you can query how many times an entry appears in your multiset.
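
A minimal sketch of that counting behavior:

#include <set>
#include <string>

void multiset_count_example()
{
    std::multiset<std::string> orders;
    orders.insert("coffee");
    orders.insert("coffee");
    orders.insert("tea");

    // count() returns the number of entries matching the key
    auto coffee_orders = orders.count("coffee"); // == 2
    (void)coffee_orders;
}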

In general, you will stick to the map and set types in most cases.
