Code Cleanup: Splitting Up git Commits In the Middle of a Branch

I tend to follow a fairly traditional git development flow. When I am working on a new feature or a bug fix, I will make a branch and commit changes as I progress. I often just quickly commit incremental changes and make brief notes on my logic for that chunk.

I like to clean up these commits prior to merging the changes or submitting a pull request. I take the time to rebase the branch and squash related changes together until I am left with a set of atomic commits.

Sometimes I end up in a sticky situation where I have a single commit that contains changes destined for multiple atomic commits. Perhaps I added multiple files accidentally by using git commit -am, or perhaps I didn't originally notice that changes within a file would eventually be separated. If you catch this early you can revert a commit and split up the changes, but the situation is a little trickier if the offending commit is buried in the middle of a branch with other changes built on top of it.

After handling commit splits multiple times in the past two weeks, I figured I'd document a workflow.

Workflow for Splitting git Commits

Here's how I approach splitting up a git commit buried in the middle of a branch:

  1. Checkout the branch that you want to modify (e.g. pj/feature)
  2. Start an interactive rebase which includes your commit.
    • At a minimum, git rebase -i commit^ will start the rebase at the commit you want to split
    • You can also rebase the whole branch, which I usually do to split all target commits in one go
  3. Mark the commit(s) that you want to split with edit
  4. When git presents the commit that you want to split:
    1. Reset state to the previous commit using git reset HEAD^
    2. Use git add to carefully and incrementally add changes to the index
    3. Run git commit when you have a set of atomic changes that you are ready to commit
    4. Repeat the git add and git commit process until you have processed each set of changes represented by the commit
  5. Run git rebase --continue to resume or finish the rebase

That's all there is to it!

Further Reading

Refactoring the ThreadX Dispatch Queue To Use std::mutex

Now that we've implemented std::mutex for an RTOS, let's refactor a library using RTOS-specific calls so that it uses std:mutex instead.

Since we have a ThreadX implementation for std::mutex, let's update our ThreadX-based dispatch queue. Moving to std::mutex will result in a simpler code structure. We still need to port std::thread and std::condition_variable to achieve true portability, but utilizing std::mutex is still a step in the right direction.

For a quick refresher on dispatch queues, refer to following articles:

Table of Contents

  1. How std::mutex Helps Us
    1. C++ Mutex Wrappers
  2. Refactoring the Asynchronous Dispatch Queue
    1. Class Definition
    2. Constructor
    3. Destructor
    4. Dispatch
    5. Thread Handler
  3. Putting It All Together
  4. Further Reading

How std::mutex Helps Us

Even though we can't yet make our dispatch queue fully portable, we still benefit from using std::mutex in the following ways:

  1. We no longer have to worry about initializing or deleting our mutex since the std::mutex constructor and destructor handles that for us
  2. We can take advantage of RAII to lock whenever we enter a scope, and to automatically unlock when we leave that scope
  3. We can utilize standard calls (with no arguments!), reducing the burden of remembering the exact ThreadX functions and arguments

But these arguments might not have real impact. Just take a look at the ThreadX native calls:

uint8_t status = tx_mutex_get(&mutex_, TX_WAIT_FOREVER);

// do some stuff

status = tx_mutex_put(&mutex_);

And here's the std::mutex equivalent:


//do some stuff


Don't you prefer the std::mutex version?

C++ Mutex Wrappers

While we could manually call lock() and unlock() on our mutex object, we'll utilize two helpful C++ mutex constructs: std::lock_guard and std::unique_lock.

The std::lock_guard wrapper provides an RAII mechanism for our mutex. When we construct a std::lock_guard, the mutex starts in a locked state (or waits to grab the lock). Whenever we leave that scope the mutex will be released automatically. A std::lock_guard is especially useful in functions that can return at multiple points. No longer do you have to worry about releasing the mutex at each exit point: the destructor has your back.

We'll also take advantage of the std::unique_lock wrapper. Using std::unique_lock provides similar benefits to std::lock_guard: the mutex is locked when the std::unique_lock is constructed, and unlocked automatically during destruction. However, it provides much more flexibility than std::lock_guard, as we can manually call lock() and unlock(), transfer ownership of the lock, and use it with condition variables.

Refactoring the Asynchronous Dispatch Queue

We will utilize both std::lock_guard and std::unique_lock to simplify our ThreadX dispatch queue.

Our starting point for this refactor will be the dispatch_threadx.cpp file in the embedded-resources repository.

Class Definition

In our dispatch class, we need to change the mutex definition from TX_MUTEX to std::mutex:

std::mutex mutex_;


Mutex initialization is handled for us by the std::mutex constructor. We can remove the tx_mutex_create call from our dispatch queue constructor:

// Initialize the Mutex
uint8_t status = tx_mutex_create(&mutex_, "Dispatch Mutex", TX_INHERIT);
assert(status == TX_SUCCESS && "Failed to create mutex!");


Mutex deletion is handled for us by the std::mutex destructor. We can remove the tx_mutex_delete call from the dispatch queue destructor:

status = tx_mutex_delete(&mutex_);
assert(status == TX_SUCCESS && "Failed to delete mutex!");


By using std::lock_guard, we can remove both the mutex get and put calls. RAII will ensure that the mutex is unlocked when we leave the function.

Here's the dispatch() implementation using std::lock_guard:

void dispatch_queue::dispatch(const fp_t& op)
    std::lock_guard<std::mutex> lock(mutex_);


    // Notifies threads that new work has been added to the queue
    tx_event_flags_set(&notify_flags_, DISPATCH_WAKE_EVT, TX_OR);

If you still wanted to unlock before setting the event flag, use std::unique_lock instead of std::lock_guard. Using std::unique_lock allows you to call unlock().

void dispatch_queue::dispatch(const fp_t& op)
    std::unique_lock<std::mutex> lock(mutex_);



    // Notifies threads that new work has been added to the queue
    tx_event_flags_set(&notify_flags_, DISPATCH_WAKE_EVT, TX_OR);

Either approach is acceptable and looks much cleaner than the native calls.

Why would you potentially care about calling unlock()? If you are using std::lock_guard, it is possible that the event flags will wake a thread, go to grab the mutex, and then sleep again until the dispatch() function exits. However, the dispatch() function will just release the mutex and the thread that is waiting will wake up and resume operation.

Thread Handler

We need to manually lock and unlock around specific points in our thread handler. Instead of std::lock_guard, we will use std::unique_lock so we can call unlock().

Here's our simplified thread handler:

void dispatch_queue::dispatch_thread_handler(void)
    ULONG flags;
    uint8_t status;

    std::unique_lock<std::mutex> lock(mutex_);

    do {
        //after wait, we own the lock
        if(q_.size() && !quit_)
            auto op = std::move(q_.front());

            //unlock now that we're done messing with the queue


        else if(!quit_)

            // Wait for new work
            status = tx_event_flags_get(&notify_flags_, 
            assert(status == TX_SUCCESS && 
                "Failed to get event flags!");

    } while (!quit_);

    // We were holding the mutex after we woke up

    // Set a signal to indicate a thread exited
    status = tx_event_flags_set(&notify_flags_, 
    assert(status == TX_SUCCESS && "Failed to set event flags!");

Looks a bit saner already!

Putting It All Together

Example source code can be found in the embedded-resources GitHub repository. The original ThreadX dispatch queue implementation can also be found in embedded-resources.

To build the example, run make at the top-level or inside of the examples/cpp folder.

The example is built as a static library. ThreadX headers are provided in the repository, but not binaries or source.

As we implement std::thread and std::condition_variable in the future, we will simplify our RTOS-based dispatch queue even further.

Further Reading

Implementing std::mutex with FreeRTOS

Last week we looked at an implementation of std::mutex using the ThreadX RTOS. We will build upon the work in the previous article and add support for FreeRTOS.

In this edition, we will only be creating __external_threading definitions for FreeRTOS. If you are interested in the underlying work, please refer to the ThreadX RTOS port for more information.

Table of Contents

  1. A Review of FreeRTOS Mutex Support
  2. Implementing std::mutex with FreeRTOS
    1. Populating __external_threading_freertos
    2. Implementing the Recursive Mutex Shims
    3. Implementing the Mutex Shims
    4. Breaking with the Standard: Initializing std::mutex
  3. Building Our Custom std::mutex
  4. Putting It All Together
  5. Further Reading

A Review of FreeRTOS Mutex Support

Unlike ThreadX, FreeRTOS differentiates between recursive and non-recursive mutexes. However, FreeRTOS uses the SemaphoreHandle_t as a shared handle type for semaphores, mutexes, and recursive mutexes.

FreeRTOS uses the following APIs to interact with a non-recursive mutex:

SemaphoreHandle_t xSemaphoreCreateMutex( void );
SemaphoreHandle_t xSemaphoreCreateMutexStatic(
        StaticSemaphore_t *pxMutexBuffer );
xSemaphoreTake( SemaphoreHandle_t xSemaphore,
                 TickType_t xTicksToWait );
xSemaphoreGive( SemaphoreHandle_t xSemaphore );

FreeRTOS uses the following APIs to interact with a recursive mutex:

SemaphoreHandle_t xSemaphoreCreateRecursiveMutex( void );
SemaphoreHandle_t xSemaphoreCreateRecursiveMutexStatic(
                     StaticSemaphore_t *pxMutexBuffer );
xSemaphoreTakeRecursive( SemaphoreHandle_t xMutex,
        TickType_t xTicksToWait );
xSemaphoreGiveRecursive( SemaphoreHandle_t xMutex );

Both recursive and non-recursive mutexes use the same deleter function:

void vSemaphoreDelete( SemaphoreHandle_t xSemaphore );

When attempting to claim the mutex, both the recursive and non-recursive "Take" functions allow you to specify a number of ticks to wait before failing. A value of 0 indicates no waiting. Unlike ThreadX, there is no "wait forever", but FreeRTOS defines portMAX_DELAY to represent the longest timeout available on the system. Note that our "Take" calls can timeout over extremely long periods of time.

All of the mutex functions shown above are available when including the semphr.h header.

Implementing std::mutex with FreeRTOS

Now that we've familiarized ourselves with the FreeRTOS mutex APIs, let's get started with our std::mutex port.

Thanks to a well-designed shim layer, we can build off of the ThreadX std::mutex implementation and focus only on the FreeRTOS shims.

We need to adjust our __external_threading file for FreeRTOS:


#include <__external_threading_threadx>
#include <__external_threading_freertos>


An alternative approach is to create an __external_threading file inside of separate include folders. When compiling for different RTOSes, you simply change the include path so that each RTOS implementation is paired with the correct __external_threading header.

Populating __external_threading_freertos

We'll create the __external_threading_freertos file for storing our custom threading definitions.

First we'll include the FreeRTOS and semaphore headers:

#include <freertos/FreeRTOS.h>
#include <freertos/semphr.h>

We'll include some boilerplate which is also defined in __threading_support:

#pragma GCC system_header

#include <__undef_macros>

#if defined(__FreeBSD__) && defined(__clang__) && __has_attribute(no_thread_safety_analysis)
#define _LIBCPP_NO_THREAD_SAFETY_ANALYSIS __attribute__((no_thread_safety_analysis))



Supplying the mutex definitions for FreeRTOS is straightforward:

// Mutex
typedef SemaphoreHandle_t __libcpp_mutex_t;


// FreeRTOS Mutex requires a function to initialize

typedef SemaphoreHandle_t __libcpp_recursive_mutex_t;

Like the ThreadX std::mutex initialization, we'll need to use a non-constexpr constructor for std::mutex. We define _MUTEX_REQUIRES_INITIALIZATION to enable this support.

We're not quite ready to port std::condition_variable or std::thread yet, so we'll import the generic type definitions from the "no pthread API" case in __threading_support:

// Condition Variable
typedef void* __libcpp_condvar_t;

// Execute Once
typedef void* __libcpp_exec_once_flag;

// Thread ID
typedef long __libcpp_thread_id;

// Thread

typedef void* __libcpp_thread_t;

// Thread Local Storage
typedef long __libcpp_tls_key;

#define _LIBCPP_TLS_DESTRUCTOR_CC __stdcall

In coming articles we'll work on porting std::thread and possibly std::condition_variable, so these definitions will be updated. For now I'm just focused on std::mutex.

I also imported the shim function prototypes from __threading_support, including our custom _libcpp_mutex_init function:

// Mutex
int __libcpp_recursive_mutex_init(__libcpp_recursive_mutex_t *__m);

int __libcpp_recursive_mutex_lock(__libcpp_recursive_mutex_t *__m);

bool __libcpp_recursive_mutex_trylock(__libcpp_recursive_mutex_t *__m);

int __libcpp_recursive_mutex_unlock(__libcpp_recursive_mutex_t *__m);

int __libcpp_recursive_mutex_destroy(__libcpp_recursive_mutex_t *__m);

int __libcpp_mutex_init(__libcpp_mutex_t *__m);

int __libcpp_mutex_lock(__libcpp_mutex_t *__m);

bool __libcpp_mutex_trylock(__libcpp_mutex_t *__m);

int __libcpp_mutex_unlock(__libcpp_mutex_t *__m);

int __libcpp_mutex_destroy(__libcpp_mutex_t *__m);

I also imported the std::thread and std::condition_variable shims so the compiler will be happy:

// Condition variable
int __libcpp_condvar_signal(__libcpp_condvar_t* __cv);

int __libcpp_condvar_broadcast(__libcpp_condvar_t* __cv);

int __libcpp_condvar_wait(__libcpp_condvar_t* __cv, __libcpp_mutex_t* __m);

int __libcpp_condvar_timedwait(__libcpp_condvar_t *__cv, __libcpp_mutex_t *__m,
                               timespec *__ts);

int __libcpp_condvar_destroy(__libcpp_condvar_t* __cv);

// Execute once
int __libcpp_execute_once(__libcpp_exec_once_flag *flag,
                          void (*init_routine)(void));

// Thread id
bool __libcpp_thread_id_equal(__libcpp_thread_id t1, __libcpp_thread_id t2);

bool __libcpp_thread_id_less(__libcpp_thread_id t1, __libcpp_thread_id t2);

// Thread
bool __libcpp_thread_isnull(const __libcpp_thread_t *__t);

int __libcpp_thread_create(__libcpp_thread_t *__t, void *(*__func)(void *),
                           void *__arg);

__libcpp_thread_id __libcpp_thread_get_current_id();

__libcpp_thread_id __libcpp_thread_get_id(const __libcpp_thread_t *__t);

int __libcpp_thread_join(__libcpp_thread_t *__t);

int __libcpp_thread_detach(__libcpp_thread_t *__t);

void __libcpp_thread_yield();

void __libcpp_thread_sleep_for(const chrono::nanoseconds& __ns);

// Thread local storage
int __libcpp_tls_create(__libcpp_tls_key* __key,
                        void(_LIBCPP_TLS_DESTRUCTOR_CC* __at_exit)(void*));

void *__libcpp_tls_get(__libcpp_tls_key __key);

int __libcpp_tls_set(__libcpp_tls_key __key, void *__p);

Implementing the Recursive Mutex Shims

The recursive mutex shim functions are just wrappers around our underlying FreeRTOS recursive mutex calls:

int __libcpp_recursive_mutex_init(__libcpp_recursive_mutex_t *__m)
  *__m = xSemaphoreCreateRecursiveMutex();

  return (*__m == NULL);

int __libcpp_recursive_mutex_lock(__libcpp_recursive_mutex_t *__m)
  return xSemaphoreTakeRecursive(*__m, portMAX_DELAY);

bool __libcpp_recursive_mutex_trylock(__libcpp_recursive_mutex_t *__m)
  //intentional no wait for try_lock
  return xSemaphoreTakeRecursive(*__m, 0);

int __libcpp_recursive_mutex_unlock(__libcpp_mutex_t *__m)
  return xSemaphoreGiveRecursive(*__m);

int __libcpp_recursive_mutex_destroy(__libcpp_recursive_mutex_t *__m)

  return 0;

Implementing the Mutex Shims

The basic mutex shim functions are just wrappers around our underlying FreeRTOS non-recursive mutex calls:

int __libcpp_mutex_init(__libcpp_mutex_t *__m)
  *__m = xSemaphoreCreateMutex();

  return (*__m == NULL);

int __libcpp_mutex_lock(__libcpp_mutex_t *__m)
  return xSemaphoreTake(*__m, portMAX_DELAY);

bool __libcpp_mutex_trylock(__libcpp_mutex_t *__m)
  //intentional no wait for try_lock
 return xSemaphoreTake(*__m, 0);

int __libcpp_mutex_unlock(__libcpp_mutex_t *__m)
  return xSemaphoreGive(*__m);

int __libcpp_mutex_destroy(__libcpp_mutex_t *__m)

  return 0;

Building Our Custom std::mutex

A few compilation options need to be set in order to build std::mutex with support for FreeRTOS. Since a variety of build systems are in use, I am only providing general build strategies.

First, we'll need to set _LIBCPP_HAS_THREAD_API_EXTERNAL so that the compiler looks for the __external_threading header.


If you're using the __external_threading implementation that will support multiple RTOSes, you'll also need to set a FREERTOS definition:


You'll also want to set the following compiler flags so that the default C++ libraries are not linked:

-fno-builtin -nodefaultlibs

As well as the following link options:


Because we didn't include all the cpp headers, we need to point our compiler to the include location for other C++ headers. Make sure your local path is placed ahead of the mainstream path so that your compiler picks it up first.

Here's an example if you're compiling with clang on OSX:

-I/path/to/src/include -I/usr/local/opt/llvm/include/c++/v1/

Putting It All Together

I've included my example std::mutex implementation in the embedded-resources GitHub repository. The implementation can be found in examples/libcpp.

The example is currently built as a static library. Only FreeRTOS headers are included in the repository, so the current example is only runnable if you have a FreeRTOS implementation.

To compile the example, simply run:

cd examples/libcpp

Further Reading