Due to the resource constrained nature of embedded systems, circular buffer data structures can be found in most projects.
Circular buffers (also known as ring buffers) are fixed-size buffers that work as if the memory is contiguous & circular in nature. As memory is generated and consumed, data does not need to be reshuffled – rather, the head/tail pointers are adjusted. When data is added, the head pointer advances. When data is consumed, the tail pointer advances. If you reach the end of the buffer, the pointers simply wrap around to the beginning.
For a more detailed summary of circular buffer operation, please refer to the Wikipedia article. The rest of the article assumes you have an understanding of how circular buffers work.
Table of Contents:
- Why Use a Circular Buffer?
- C Implementation
- C++ Implementation
- Putting It All Together
- Further Reading
Why Use A Circular Buffer?
Circular buffers are often used as fixed-sized queues. The fixed size is beneficial for embedded systems, as developers often try to use static data storage methods rather than dynamic allocations.
Circular buffers are also useful structures for situations where data production and consumption happen at different rates: the most recent data is always available. If the consumer cannot keep up with production, the stale data will be overwritten with more recent data. By using a circular buffer, we can ensure that we are always consuming the most recent data.
For additional use cases, check out Ring Buffer Basics on Embedded.com.
C Implementation
We will start with a C implementation, as this exposes us to some of the design challenges and tradeoffs when creating a circular buffer library.
Using Encapsulation
Since we are creating a circular buffer library, we want to make sure users work with our library APIs instead of modifying the structure directly. We also want to keep the implementation contained within our library so we can change it as needed, without requiring end users to update their code. The user doesn’t need to know any details about our structure, only that it exists.
In of our library header, we will forward declare the structure:
// Opaque circular buffer structure
typedef struct circular_buf_t circular_buf_t;
We don’t want users to work with a circular_buf_t
pointer directly, as they might get the impression that they can dereference the value. We will create a handle type that they can use instead.
The simplest approach for our handle is to typedef
the cbuf_handle_t
as a pointer to the circular buffer. This will prevent us from needing to cast the pointer within our function implementation.
// Handle type, the way users interact with the API
typedef circular_buf_t* cbuf_handle_t;
An alternative approach would be to make the handle a uintptr_t
or void*
value. Inside of our interface, we would handle the translation to the appropriate pointer type. We keep the circular buffer type hidden from users, and the only way to interact with the data is through the handle.
We’re going to stick with the simple handle implementation to keep our example code simple and straightforward.
API Design
First, we should think about how users will interact with a circular buffer:
- They need to initialize the circular buffer container with a buffer and size
- They need to destroy a circular buffer container
- They need to reset the circular buffer container
- They need to be able to add data to the buffer
- They need to be able to get the next value from the buffer
- They need to know whether the buffer is full or empty
- They need to know the current number of elements in the buffer
- They need to know the max capacity of the buffer
Using this list, we can put together an API for our library. Users will interact with the circular buffer library using our opaque handle type, which is created during initialization.
I have chosen uint8_t
as the underlying data type in this implementation. You can use any particular type that you like – just be careful to handle the underlying buffer and number of bytes appropriately.
/// Pass in a storage buffer and size
/// Returns a circular buffer handle
cbuf_handle_t circular_buf_init(uint8_t* buffer, size_t size);
/// Free a circular buffer structure.
/// Does not free data buffer; owner is responsible for that
void circular_buf_free(cbuf_handle_t me);
/// Reset the circular buffer to empty, head == tail
void circular_buf_reset(cbuf_handle_t me);
/// Put version 1 continues to add data if the buffer is full
/// Old data is overwritten
void circular_buf_put(cbuf_handle_t me, uint8_t data);
/// Put Version 2 rejects new data if the buffer is full
/// Returns 0 on success, -1 if buffer is full
int circular_buf_put2(cbuf_handle_t me, uint8_t data);
/// Retrieve a value from the buffer
/// Returns 0 on success, -1 if the buffer is empty
int circular_buf_get(cbuf_handle_t me, uint8_t * data);
/// Returns true if the buffer is empty
bool circular_buf_empty(cbuf_handle_t me);
/// Returns true if the buffer is full
bool circular_buf_full(cbuf_handle_t me);
/// Returns the maximum capacity of the buffer
size_t circular_buf_capacity(cbuf_handle_t me);
/// Returns the current number of elements in the buffer
size_t circular_buf_size(cbuf_handle_t me);
Determining if a Buffer is Full
Before we proceed, we should take a moment to discuss the method we will use to determine whether or buffer is full or empty.
Both the “full” and “empty” cases of the circular buffer look the same: head
and tail
pointer are equal. There are two approaches to differentiating between full and empty:
- “Waste” a slot in the buffer:
- Full state is
head + 1 == tail
- Empty state is
head == tail
- Full state is
- Use a
bool
flag and additional logic to differentiate states::- Full state is
full
- Empty state is
(head == tail) && !full
- Full state is
We should also consider thread safety. By using a single empty cell to detect the “full” case, we can support a single producer and single consumer without a lock (as long as put
and get
don’t modify the same variables). The queue is thread-safe because the producer will only modify the head
index, and the consumer will only modify the tail
index. While either index might be slightly out-of-date in a given context, this will not impact the thread safety of the queue. Using the full
flag, however, creates a requirement for mutual exclusion. This is because the full
flag is shared by both the producer and consumer.
Of course, the decision has its tradeoffs. If your buffer element has a large memory footprint (such as a buffer that is sized for a camera i-frame), wasting a slot may not be reasonable on your system. If you have multiple producers/consumers interacting with a queue, you will need a lock anyway, so wasting a slot doesn’t make sense. If you do not have mutual exclusion available (e.g., because you’re not using an OS) but you are using interrupts, then you will want to use the version without the full flag. The memory model used on your system may also have an impact on your decision to go without a lock.
The implementation below uses the bool
flag. Using the flag requires additional logic in the get
and put
routines to update the flag. I will also describe how to make the modifications for a single producer/consumer that does not use the full
flag.
Circular Buffer Container Type
Now that we have a grasp on the operations we’ll need to support, we can design our circular buffer container.
We use the container structure for managing the state of the buffer. To preserve encapsulation, the container structure is defined inside of our library .c
file, rather than in the header.
We will need to keep track of:
- The underlying data buffer
- The maximum size of the buffer
- The current “head” position (incremented when elements are added)
- The current “tail” (incremented when elements are removed)
- A flag indicating whether the buffer is full or not
// The hidden definition of our circular buffer structure
struct circular_buf_t {
uint8_t * buffer;
size_t head;
size_t tail;
size_t max; //of the buffer
bool full;
};
Now that our container is designed, we are ready to implement the library functions.
Implementation
One important detail to note is that each of our APIs requires an initialized buffer handle. Rather than litter our code with conditional statements, we will utilize assertions to enforce our API requirements in the “Design by Contract” style.
If the interfaces are improperly used, the program will fail immediately rather than requiring the user to check and handle the error code.
For example:
circular_buf_reset(NULL);
Produces:
=== C Circular Buffer Check ===
Assertion failed: (me), function circular_buf_reset, file ../../circular_buffer.c, line 35.
Abort trap: 6
Another important note is that the implementation shown below is not thread-safe. No locks have been added to the underlying circular buffer library.
Initialize and Reset
Let’s start at the beginning: initializing a circular buffer. Our API has clients provide the underlying buffer and buffer size, and we return a circular buffer handle to them. The reason we want our users to provide the buffer is that this allows for a statically allocated buffer. If our API created the buffer under the hood, we would need to make use of dynamic memory allocation, which is often disallowed in embedded systems programs.
We are required to provide a circular buffer structure instance within the library so that we can return a pointer to the user. I have used malloc
for simplicity. Systems which cannot use dynamic memory simply need to modify the init
function to use a different method, such as allocation from a static pool of pre-allocated circular buffer structures.
Another approach would be to break encapsulation, allowing users to statically declare circular buffer container structures outside of the library. In this case, circular_buf_init
needs to be updated to take a struct pointer. We could also have our init
function create a container structure on the stack and return it wholesale. However, since encapsulation is broken, users will be able to modify the structure without using the library routines. If you want to preserve encapsulation, you need to work with pointers instead of concrete structure instances.
// User provides struct
void circular_buf_init(circular_buf_t* me, uint8_t* buffer,
size_t size);
// Return a concrete struct
circular_buf_t circular_buf_init(uint8_t* buffer, size_t size);
// Return a pointer to a struct instance - preferred approach
cbuf_handle_t circular_buf_init(uint8_t* buffer, size_t size);
We will be returning a handle to a structure that is allocated inside of the library. Once we’ve created our container, we need populate the values and call reset
on it. Before we return from init
, we ensure that the buffer container has been created in an empty state.
cbuf_handle_t circular_buf_init(uint8_t* buffer, size_t size)
{
assert(buffer && size);
cbuf_handle_t cbuf = malloc(sizeof(circular_buf_t));
assert(cbuf);
cbuf->buffer = buffer;
cbuf->max = size;
circular_buf_reset(cbuf);
assert(circular_buf_empty(cbuf));
return cbuf;
}
The purpose of the reset function is to put the buffer into an “empty” state, which requires updating head
, tail
, and full
:
void circular_buf_reset(cbuf_handle_t me)
{
assert(me);
me->head = 0;
me->tail = 0;
me->full = false;
}
Since we have a method to create a circular buffer container, we need an equivalent method for destroying the container. In this case, we call free
on our container. We do not attempt to free the underlying buffer, since we do not own it.
void circular_buf_free(cbuf_handle_t me)
{
assert(me);
free(me);
}
State Checks
Next, we’ll implement the functions related to the state of the buffer container.
The full function is the easiest to implement, since we have a flag representing the state:
bool circular_buf_full(cbuf_handle_t me)
{
assert(me);
return me->full;
}
Since we have the full
flag to differentiate between full or empty state, we combine the flag with a check that head == tail
:
bool circular_buf_empty(cbuf_handle_t me)
{
assert(me);
return (!me->full && (me->head == me->tail));
}
The capacity of our buffer was supplied during initialization, so we just return that value to the user:
size_t circular_buf_capacity(cbuf_handle_t me)
{
assert(me);
return me->max;
}
Calculating the number of elements in the buffer was a trickier problem than I expected. Many proposed size calculations use modulo, but I ran into strange corner cases when testing that out. I opted for a simplified calculation using conditional statements.
If the buffer is full, we know that our capacity is at the maximum. If head
is greater-than-or-equal-to the tail
, we simply subtract the two values to get our size. If tail
is greater than head
, we need to offset the difference with max
to get the correct size.
size_t circular_buf_size(cbuf_handle_t me)
{
assert(me);
size_t size = me->max;
if(!me->full)
{
if(me->head >= me->tail)
{
size = (me->head - me->tail);
}
else
{
size = (me->max + me->head - me->tail);
}
}
return size;
}
Adding and Removing Data
With the bookkeeping functions out of the way, it’s time to dig into the meat: adding and removing data from the queue.
Adding and removing data from a circular buffer requires manipulation of the head
and tail
pointers. When adding data to the buffer, we insert the new value at the current head
location, then we advance head
. When we remove data from the buffer, we retrieve the value of the current tail
pointer and then advance tail
.
Adding data to the buffer requires a bit more thought, however. If the buffer is full
, we need to advance our tail
pointer as well as head
. We also need to check whether inserting a value triggers the full
condition.
We are going to implement two versions of the put
function, so let’s extract our pointer advancement logic into a helper function. If our buffer is already full, we advance tail
. We always advance head
by one. After the pointer has been advanced, we populate the full
flag by checking whether head == tail
.
Note the use of the modulo operator (%
) below. Modulo will cause the head
and tail
values to reset to 0 when the maximum size is reached. This ensures that head
and tail
are always valid indices of the underlying data buffer.
static void advance_pointer(cbuf_handle_t me)
{
assert(me);
if(me->full)
{
me->tail = (me->tail + 1) % me->max;
}
me->head = (me->head + 1) % me->max;
me->full = (me->head == me->tail);
}
As Miro Samek helpfully pointed out, this is an expensive computational operation. Instead, we can use conditional logic to reduce the total number of instructions. Miro’s recommended approach is:
if (++(me->head) == me->max)
{
me->head = 0;
}
Now, advance_pointer
will look like this:
static void advance_pointer(cbuf_handle_t me)
{
assert(me);
if(me->full)
{
if (++(me->tail) == me->max)
{
me->tail = 0;
}
}
if (++(me->head) == me->max)
{
me->head = 0;
}
me->full = (me->head == me->tail);
}
We can make a similar helper function which is called when removing a value from the buffer. When we remove a value, the full
flag is set to false
, and the tail pointer is advanced.
static void retreat_pointer(cbuf_handle_t me)
{
assert(me);
me->full = false;
if (++(me->tail) == me->max)
{
me->tail = 0;
}
}
We’ll create two versions of the put
function. The first version inserts a value into the buffer and advances the pointer. If the buffer is full, the oldest value will be overwritten. This is the standard use case for a circular buffer
void circular_buf_put(cbuf_handle_t me, uint8_t data)
{
assert(me && me->buffer);
me->buffer[me->head] = data;
advance_pointer(me);
}
The second version of the put
function returns an error if the buffer is full. This is provided for demonstration purposes, but we do not use this variant in our systems.
int circular_buf_put2(cbuf_handle_t me, uint8_t data)
{
int r = -1;
assert(me && me->buffer);
if(!circular_buf_full(me))
{
me->buffer[me->head] = data;
advance_pointer(me);
r = 0;
}
return r;
}
To remove data from the buffer, we access the value at the tail
and then update the tail
pointer. If the buffer is empty we do not return a value or modify the pointer. Instead, we return an error to the user.
int circular_buf_get(cbuf_handle_t me, uint8_t * data)
{
assert(me && data && me->buffer);
int r = -1;
if(!circular_buf_empty(me))
{
*data = me->buffer[me->tail];
retreat_pointer(me);
r = 0;
}
return r;
}
That completes the implementation of our circular buffer library.
Usage
When using the library, the client is responsible for creating the underlying data buffer to circular_buf_init
, and a cbuf_handle_t
is returned:
uint8_t * buffer = malloc(EXAMPLE_BUFFER_SIZE * sizeof(uint8_t));
cbuf_handle_t me = circular_buf_init(buffer,
EXAMPLE_BUFFER_SIZE);
This handle is used to interact with all remaining library functions:
bool full = circular_buf_full(me);
bool empty = circular_buf_empty(me);
printf("Current buffer size: %zu\n", circular_buf_size(me);
Don’t forget to free both the underlying data buffer and the container when you are done:
free(buffer);
circular_buf_free(me);
A test program which uses the circular buffer library can be found in the embedded-resources repository.
Modifications for Removing the full
flag
If you wanted to ditch the full
flag, you would instead check that the head
is one position behind the tail to determine if the buffer is full:
bool circular_buf_full(circular_buf_t* me)
{
// We determine "full" case by head being one position behind the tail
// Note that this means we are wasting one space in the buffer
return ((me->head + 1) % me->size) == me->tail;
}
Now, if we wanted to avoid the modulo operation, we can use conditional logic instead:
bool circular_buf_full(circular_buf_t* me)
{
// We need to handle the wraparound case
size_t head = me->head + 1;
if(head == me->max)
{
head = 0;
}
return head == me->tail;
}
The empty case is then that head
and tail
are the same:
bool circular_buf_empty(circular_buf_t* me)
{
// We define empty as head == tail
return (me->head == me->tail);
}
When getting data from the buffer, we will advance the tail pointer, wrapping around if necessary:
int circular_buf_get(circular_buf_t * me, uint8_t * data)
{
int r = -1;
if(me && data && !circular_buf_empty(me))
{
*data = me->buffer[me->tail];
me->tail = (me->tail + 1) % me->size;
r = 0;
}
return r;
}
When adding data to the buffer, we will store the data and advance the head pointer, wrapping around if necessary:
int circular_buf_put(circular_buf_t * me, uint8_t data)
{
int r = -1;
if(me && !circular_buf_full(me))
{
me->buffer[me->head] = data;
me->head = (me->head + 1) % me->size;
r = 0;
}
return r;
}
Other references to full
can be eliminated.
C++
C++ lends itself to a cleaner circular buffer implementation than C.
Class Definition
We’ll start off by defining our C++ class. We want our C++ implementation to support any type of data, so we are going to make it a templated class.
Our APIs are going to be similar to the C implementation. Our class will provide interfaces for:
- Resetting the buffer to empty
- Adding data
- Removing data
- Checking full/empty state
- Checking the current number of elements in the buffer
- Checking the total capacity of the buffer
We will also utilize C++ smart pointers to ensure sure we don’t leave any data around once our buffer is destroyed. This means we can manage the buffer for the user.
Another benefit of C++ is the triviality of making this class thread-safe: we can rely on the std::mutex
type (assuming this is defined for your platform).
Here’s our class definition:
template <class T>
class circular_buffer {
public:
explicit circular_buffer(size_t size) :
buf_(std::unique_ptr<T[]>(new T[size])),
max_size_(size)
{ // empty }
void put(T item);
T get();
void reset();
bool empty() const;
bool full() const;
size_t capacity() const;
size_t size() const;
private:
std::mutex mutex_;
std::unique_ptr<T[]> buf_;
size_t head_ = 0;
size_t tail_ = 0;
const size_t max_size_;
bool full_ = 0;
};
C++ Implementation
Our C++ circular buffer mimics much of the logic from the C implementation, but results in a much cleaner and more reusable design. Also, the C++ buffer utilizes std::mutex
to provide a thread-safe implementation.
Note: The same logic changes can be made in the C++ implementation to support thread-safety with a single producer and consumer by “wasting” a slot. See the adjustments in the C implementation for more information.
Initialization
When constructing our class, we allocate the data for our underlying buffer and set the buffer size. This removes the overhead required with the C implementation.
Unlike the C implementation, the C++ constructor does not call reset
. Because we specify initial values for our member variables, our circular buffer starts out in the correct state.
explicit circular_buffer(size_t size) :
buf_(std::unique_ptr<T[]>(new T[size])),
max_size_(size)
{
//empty constructor
}
Our reset behavior puts the buffer back to an empty state (head == tail && !full_
).
void reset()
{
std::lock_guard<std::mutex> lock(mutex_);
head_ = tail_;
full_ = false;
}
State Tracking
The logic of the empty
and full
cases is the same as the C example:
bool empty() const
{
//if head and tail are equal, we are empty
return (!full_ && (head_ == tail_));
}
bool full() const
{
//If tail is ahead the head by 1, we are full
return full_;
}
In the C++ circular buffer implementation, size
and capacity
report the number of elements in the queue rather than the size in bytes. This allows us to be agnostic to the underlying details of the type.
size_t capacity() const
{
return max_size_;
}
size_t size() const
{
size_t size = max_size_;
if(!full_)
{
if(head_ >= tail_)
{
size = head_ - tail_;
}
else
{
size = max_size_ + head_ - tail_;
}
}
return size;
}
Adding Data
The logic for put
matches the C implementation. This implementation uses the “overwrite the oldest value” behavioral pattern.
void put(T item)
{
std::lock_guard<std::mutex> lock(mutex_);
buf_[head_] = item;
if(full_)
{
tail_ = (tail_ + 1) % max_size_;
}
head_ = (head_ + 1) % max_size_;
full_ = head_ == tail_;
}
Note: For simplicity, I have left out the option that avoids modulo operations. You can find that logic in the C section.
Retrieving Data
The logic behind get
matches the C implementation. Unlike the C implementation, an empty value is returned if the buffer is empty.
T get()
{
std::lock_guard<std::mutex> lock(mutex_);
if(empty())
{
return T();
}
//Read data and advance the tail (we now have a free space)
auto val = buf_[tail_];
full_ = false;
tail_ = (tail_ + 1) % max_size_;
return val;
}
Note:
return T()
will return a default constructed value fo ra given type. The actual value produced depends on the type or the constructor. Also, for simplicity, I have left out the option that avoids modulo operations. You can find that logic in the C section.
Usage
The C++ circular buffer is much simpler to use than the C implementation.
To instantiate a circular buffer, we just declare an object and specify the templated type for our buffer. Here’s an example using a buffer of 10 uint32_t
entries:
circular_buffer<uint32_t> circle(10);
Adding data is easy:
uint32_t x = 100;
circle.put(x);
And getting data is equally easy:
x = circle.get()
Remember that since this is a templated class, you can create a circular buffer of any type that you need.
Updates for C++17
In C++17, we have access to std::optional
, which enables us to represent a value that may or may not be present. Our get
function would return a std::optional<T>
. We would also return std::nullopt
instead of a default-constructed T
if the queue is empty.
std::optional<T> get()
{
std::lock_guard<std::mutex> lock(mutex_);
if(empty())
{
return std::nullopt;
}
//Read data and advance the tail (we now have a free space)
auto val = buf_[tail_];
full_ = false;
tail_ = (tail_ + 1) % max_size_;
return val;
}
Note: For simplicity, I have left out the option that avoids modulo operations. You can find that logic in the C section.
In the calling code, you could check for a valid value using the boolean operator or the has_value
member function. If a valid value is present, it can be accessed using the ->
or *
operators, ur using the value()
member function.
// Returns an optional
auto item = cbuf.get();
// Check if the optional is valid
if(item)
{
process_data(*item); // access the value
}
Putting it All Together
Example implementations can be found in the embedded-resources
Github repository.
If you are looking to extend this library, a useful exercise is to add additional APIs to enable users to add/remove multiple elements with a single operation. You can also make the C implementation thread-safe.
Thread Safety with the Lookahead Method
One approach for thread-safety without a mutex is the “lookahead” method. This method supports a single producer thread and single consumer thread; multiple producers or consumers will require a lock.
Instead of using the boolean flag to differentiate between the full and empty cases, we will always leave one cell empty. By using a single empty cell to detect the “full” case, we can support a single producer and single consumer without a lock (as long as put
and get
don’t modify the same variables).
You may be concerned about wasting a slot, but this tradeoff is often much cheaper than the cost of using an OS lock primitive.
Further Reading
Here are other circular buffer implementations:
For more information on circular buffers:
- Embedded.com: Ring buffer basics
- Wikipedia: Circular buffer
- Ferrous Systems: Lock Free Ring Buffer
- Boost Circular Buffer
- C++: Performance of a Circular Buffer vs Vector, Deque, and List
- Lock-Free Single-Producer – Single Consumer Circular Queue
There is a proposal for adding a circular buffer type to the C++ standard library:
Using C++ Without the Heap
Want to use C++, but worried about how much it relies on dynamic memory allocations? Our course provides a hands-on approach for learning a diverse set of patterns, tools, and techniques for writing C++ code that never uses the heap.
Learn More on the Course Page
Major Revisions
- 20210321
- Fixed error with
max_size_
whereme->max
should have been used - Added clarifying text regarding why the buffer is passed in by the user
- Fixed error with
- 20210301
- Addressed feedback from Miro regarding avoiding modulo operations
- 20210213
- Added additional links
- Added further discussion about tradeoff between full flag vs using a “wasted” slot
- Show modifications for thread safety with a single producer/consumer
- Add notes on
std::optional
use with C++17
- 20191016
- Updated Major Revisions section formatting for consistency across the site
- Demoted headers for consistency across the site
- Fixed broken Table of Contents links
- Removed Major Revisions from Table of Contents
- 20190627
- Added link to Ferrous Systems’ Lock Free Ring Buffer article.
- 20190604
- Fixed a typo (thanks Chris Svec!) and changed some wording related to the opaque type.
- 20181219
- Added note about avoiding concurrency problems with a single producer and single consumer using an empty slot.
- 20180804
- The article was restructured and rewritten.Thanks to everyone who provided feedback along the way. The examples have been updated to:
- Remove defensive programming
- Use assertions
- Create a standalone library using an opaque structure
- Expand the APIs, including a calculation for the current circular buffer size
- Update the library so it didn’t waste a slot
Thanks for nice explanation. One thing to correct
1. In circular_buf_get (C)
if(cbuf && data && !circular_buf_empty(*cbuf))
should be
if(cbuf && data && !circular_buf_empty(cbuf))
cbuf in that context is a pointer to a circular buffer, and the circular_buf_empty API does not take a pointer, so we need to dereference.
Reflecting on this with the benefit of time, I do think that a better API choice for embedded systems would be to have the circular_buf_empty() function take a pointer to a structure, which would allow you to skip the dereference.
That would also apply to circular_buf_full()
I have one doubt .. How to retrieve the data in main function…?
can you explain with example?
There is a functional example in the embedded-resources repository using a circular buffer built for uint8_t values: https://github.com/embeddedartistry/embedded-resources/blob/master/examples/c/circular_buffer.c
In short, values are added to the queue with the
circular_buf_put()
API.circular_buf_put(&cbuf, i);
Values can be retrieved by the
circular_buffer_get()
API:uint8_t data; circular_buf_get(&cbuf, &data);
Values will be retrieved based on the next value in line in the queue, until the current values in the queue are depleted.
Hi, thank you very much for this tutorial, I’ve been trying for so long to implement a circular buffer in C and this helped a lot! However, as I’m new to C programming I’m struggling with the next bit, which I feel should be the easiest part. I’m trying to fill the buffer with sensor readings and take an average of them, creating a moving average filter where the average updates with each new data point inputted into the buffer. Is this possible with this model or am I just missing something? I currently have the buffer filling with 9 of the same sensor readings, it doesn’t change, and then it resets the buffer and refills, this isn’t right for a moving average filter. Has anyone got any recommendations on where i’m going wrong?
Hard to say without example source code. Have you referenced the example circular buffer code in the embedded-resources project to see API usage?
https://github.com/embeddedartistry/embedded-resources/blob/master/examples/c/circular_buffer.c
I’ve implemented the code on your GitHub and it performs the correct function. Where your code inputs a new value with each increment of ‘i’, I want to add a new sensor reading. Then when the buffer is full, i want to add a new value to the end of it and take an average value of the buffer each time a value is added.
Great post! A concise ring buffer explanation for both C and C++. In the put() function/method you assume overwrite as default instead of discarding data. Is there any reason for that behavior or should the reader implement the behavior that fits the requirements?
Thanks.
Hey Gustavo,
That’s a great question, and something I should have touched on in the original article. Ultimately, the desired behavior should be selected for the application use case.
The reason I opted for overwriting data if the buffer is full is that I’ve primarily worked with teams who prioritize new data over old data. If the buffer is full, they would rather lose the oldest (now stale) item rather than the most recent.
In either case, I think that a full circular buffer likely indicates some other problem on the system, as your consumer is not able to keep up with the data being produced. Perhaps there is a thread prioritization issue, the buffer isn’t large enough, an interrupt storm is blocking the primary threads from running, etc.
Any ideas for how to go about building circular buffers to store 12 bit samples from an adc in c? Can I just replace uint8_t with uint16_t and waste 2 bits per sample?
Yes, that is what I would do.
Hello Phillip, thank you for the great tutorial. When using a circular buffer to have a more efficient FIFO, would the following still make sense?
if(cbuf->head == cbuf->tail)
{
cbuf->tail = (cbuf->tail + 1) % cbuf->size;
}
It seems, I would be skipping retrieving one sample of data whenever the rate of retrieving data is slower than of inserting data. What work around would you recommend? I was thinking of having a bufferFull flag as well
A buffer full flag is the easiest way to prevent wasting a sample. In retrospect, I should have just done that from the start. I’ll update this article in the coming weeks to reflect that.
Hi Jesus, the article has been rewritten with support for the full flag.
Hi Phillip
NIce clear explanation – thanks
I wonder if there is a problem though in your C++ example on github.
On line 35 you define:
T get(void) const
however, line 46:
tail_ = (tail_ + 1) % size_
then attempts to modify a member variable (tail_)
My compiler complains that:
… error: cannot assign to non-static data member within const member function ‘get’
which makes sense.
Is this a mistake in the github code? – the const keyword does not appear in the function declaration in your code in the blog.
Hi Steve,
Thanks for writing. That is an error in the code – I got const happy :). Interestingly my compiler doesn’t complain, which worries me. What compiler version are you using so I can dig into it?
Hello,
Firstly, thanks for the nice explanation. I am creating my own circular library using yours as reference. I will share it with you.
However, I think you should not call "circular_buf_empty" function using "cbuf" as parameter when you did not know if it exists.
In my opinion, "if(cbuf && data && !circular_buf_empty(*cbuf))" should be like:
if(cbuf && data)
{
if (!circular_buf_empty(cbuf))
{
/ Instructions */
}
}
Thanks!
Diego
Hi Diego,
You are right about the problem with the code. I’m working on a revision for this article, as it’s currently out of date with the example code.
The example code uses pointers, and it checks that cbuf is not null before using the values.
One thing to note is that in this statement:
if(cbuf && data && !circular_buf_empty(*cbuf))
Empty will not actually execute unless cbuf is valid (due to the && operator). Not that it makes the API any better 🙂
You should see an updated article next week.
As promised, the post has been completely updated!
Hi Phillip,
thank you for taking your time and creating a nice tutorial. I wanted to ask you, why not to use std::deque as an underlying container for circular buffer?
The ring buffer is useful when you have a pre-allocated buffer of a fixed size that you want to use for dynamically adding/removing objects. I used a std::unique_ptr in the example, but a C-style array or std::array could also be used (and are more common in my implementations).
Other containers, such as std::vector or std::deque, utilize dynamic memory allocation whenever a new element is added (unless you reserve the buffer size up front). If you used one of these containers, you would not need to use the ring buffer structure at all.
What does it mean
tail_ = (tail_ + 1) % size;
Modulo is: Divide by the size, and take the remainder.
Thanks for the post, one question:
in size(), why do you return max_size_ if the buffer is full (i.e. if full_ == true) ?
regards
This is a holdover from a different implementation. I think it is no longer needed.
I know you mentioned the modulo option in your size implementation. I think you can (I can) solve all corner cases like this:
circular_buf_size(cbuf_handle_t cbuf) {
if(full) { return max-capacity }
return (head + max-capacity – tail) % max-capacity
}
works for me and has no conditionals. But that is more a question of personal style I guess.
Nonetheless, thanx and great write-up, helped me to rinse out the index calculation errors I had in my javascript implementation of this,
}
I remember trying that at one point and running into a problem. Not sure what was going on there, but I agree your version is superior. I’ll work it in during the next update.
Hi,
Thank you.
Just want to point out that this comment “//If tail is ahead the head by 1, we are full” is not correct. Did I misunderstand it? If tail is ahead the head that means that the buffer was full before the ongoing pop operation, and it will become empty when tail will == head.
Hi Dmitry,
I will rework the code and article to remove the comment. That comes from an old implementation which kept an empty value in the queue (which made it thread-safe, but people requested the current implementation style).
Great post !
I needed a circular buffer example, as I am new to cpp. But I have no idea how to change this to use (Arduino) String. Do you have any advice ? Thank you !
There are a few options:
1) Use the circular buffer to manage type
char
. You could add an API to return the pointer to the underlying buffer, which would allow you to treat the full buffer contents as a C-string.2) Declare an Arduino String, call
reserve
for the maximum size of the buffer, and then manage the buffer using the ring-span-lite library: https://github.com/martinmoene/ring-span-liteHi, thanks for the tutorial.
I can’t use the malloc so I need a different approach.
I don’t understant why I need to create a buffer on my application and then pass it to the circular_buf_init where it allocate another buffer.
Can’t I simply allocate a static buffer somewhere in my application or in the circular_buffer.c file and use that?
In your example you use the malloc for the application buffer and then you pass it to the init where you use again a malloc to create a new buffer:
uint8_t * buffer = malloc(EXAMPLE_BUFFER_SIZE * sizeof(uint8_t));
cbuf_handle_t cbuf = circular_buf_init(buffer, EXAMPLE_BUFFER_SIZE);
Can you please clarify this?
Thanks in advance.
Hi Manuel,
malloc was used at the time for ease of implementation; the main goal was to show how a circular buffer works and how to hide information in C (i.e., hiding the internal implementation of the circular buffer.
The problem is that the circular buffer structure definition is not available in the C header file. That means you can only declare a pointer to it in your code. There are two solutions:
CIRCULAR_BUFFER_CONTAINER_POOL_SIZE
. You need a way of tracking structs that are currently allocated (such asbool inUse
flag in the struct), and a lock if using in multi-threaded code. Instead ofmalloc
ing the container struct, you would assign the next available struct from the pool to the user. Instead offree
, you would return this struct to the pool.Does that help?
Never mind: once I read again the code I got it: the first malloc is for the actual buffer, the second one is for the circualr buffer struct. Both of them are needed.
I ended up using a global variable for the struc and rearranging the init to use that and return its address.
That will work. See my response to your previous comment in case you need multiple circular buffers. I’ll draft another article on making this static-memory friendly.
Thank for the very clear demonstration and kudos.
Is there a way to read the values from the buffer without popping them out?
Not in this design. But you could easily add a front() or peek() API which returns the next valid value without popping it.
How would I go about redsigning this buffer to hold a custom struct instead of uint8_t?
You state:
"I have chosen uint8_t as the underlying data type in this implementation. You can use any particular type that you like – just be careful to handle the underlying buffer and number of bytes appropriately."
I have created custom struct, buffer for that struct and changed all the functions to either accept or return the new buffer type and new struct type.
This seems to work up to a point. Sometimes I get hard fault, at the function circular_buf_put() (Same result for circular_buf_put2()), I can not figure out for life of me what causes it.
Last snapshot from the buffer handle shows me the buffer->max value to be 65000 or some other ridiculous value, it obviously was not meant to be modified after the initialisation. That value seems to imply it got set to minus one maybe?
I thought since the struct now is composed of multiple types, pointer would need to be incremented additional number of times, but that doesn’t seem like the solution either.
This is running on a microcontroller, but with only a single interrupt which populates the buffer with structs and a single thread that depopulates the buffer.
Could you expand on your "Thread Safety with the Lookahead Method" ?
Would implementation of that involve checking whether the buffer is full – 1 before deciding whether new data can be inserted or rejected?
Thank you for doing this write up, it has helped me tremendously.
Hello! Can I use your code without changes in a commercial product? In other words, could you specify the license?
Yes. The circular buffer code is published in the embedded-resources GitHub repo and licensed as CC0 1.0 Universal: https://github.com/embeddedartistry/embedded-resources/blob/master/LICENSE
Is there any way to modify this circular buffer in c++ so that each element represents an array of a certain size. So that if you declare
circular_buffer<uint32_t> circle(10);
this represents a circular buffer of 10 arrays that contain X amount of uint32_t values.
Can anyone advise me on how to go about this?
You’d store uint32_t* in that case, not uint32_t. You’d need to update put to take an argument representing the array size, and you’d need to return the array size and pointer with get(). I’d say that the caller should be responsible for allocating and freeing memory.
Thanks for the post.
I have a question on determining buffer full
Waste a slot in the buffer:
Full state is tail + 1 == head
Isn’t it head+1 == tail?
Head advances when data is added. So, for full if head is one step lagging from tail then it is full.
Yes, you are correct. Updated the post to fix that error.
I worked through a couple of simplified size implementations on compiler explorer
https://godbolt.org/z/e7boGY
tldr; the modulo implementation is roughly the same as checking if head is greater than tail as long as you use full_. For a size-1 implementation that doesn’t use full_ using the modulo implementation is fastest.
Also of note, if checking that head_ is greater than tail_, dropping the check for full gives the same answer, but is faster on average if the buffer is rarely full.
Very cool analysis! I’ll get your insights incorporated into the article. Perhaps a “performance tradeoffs” section is in order, since there are other tradeoffs to consider for choosing the implementation. (e.g., size-1 wastes a slot, but it does make the buffer thread-safe for a single producer and single consumer; with full_ you need a lock to make sure it’s thread-safe).
More of a question. What does “return T(); ” actually do? Is this the atomic operator T and if so, how does this actually work?
C++ newb.
If the queue is empty, it will return a default constructed value for the given type. The actual value depends on the type and the constructor. For an integer, this would be zero.
One improvement I might make now is to return an optional, which means we could have a valid flag as part of the value. If the queue is empty, we will not return a valid value, and the reader can know that by checking the optional state.
Hello, good morning I am new to C programming, congratulating you for your excellent work and explanation, in the work that I am doing I need three producers, two circular buffers and two consumers, the producers generate tasks, each task has identifiers that are grouped with the struct function, my question is how do I enter these tasks (struct) into the circular buffer?
You would need to initialize two circular buffer structures and reference the appropriate structure with the buffer APIs at the appropriate location. To change the type of data that is stored, you can adjust the buffer type internally within the circular buffer (i.e., struct task_info instead of uint8_t).
The design decision to use the “full” flag saves one entry in the buffer, but has a very unfortunate consequence that the circular buffer must necessarily use some form of mutual exclusion (locking). This is because the “full” flag needs to be modified by both the producer of the data into the buffer and by the consumer of the data from the buffer.
In contrast, the alternative design without the “full” flag, but with the full state indicated as (head+1 == tail) CAN be lock-free. By this, I mean that under specific scenario you don’t need any locks to safely use the buffer to transfer data. The specific scenario is a single producer thread/ISR and single consumer thread/ISR. In that case you can get away without locking, because the producer only modifies the head index, and the consumer only modifies the tail index. Moreover, the modifications are such that both head and tail increment (and wrap-around). This means that using a slightly “outdated” head or tail is safe.
Anyway, if anybody is interested, I’ve placed a much simplified version of the ring buffer on GitHub:
https://github.com/QuantumLeaps/lock-free-ring-buffer
Another recommended resource is the BipBuffer, also available on GitHub at:
https://github.com/willemt/bipbuffer
Thanks Miro. I briefly mention this and keep meaning to expand the article to discuss both in more detail. I’ll include your links as well.
The published implementation uses the modulo (%) to wrap the head_/tail_ indexes around (head_ = (head_ + 1) % max_size_;). But modulo involves division, which is EXPENSIVE. For example, on ARM Cortex-M0+ integer division is a library call which takes at least 45 CPU cycles.
A much better implementationis simply: (if (++head_ == max_size_) { head_ = 0;}. This code takes only 3-4 CPU cycles when the branch is not taken and 4-5 CPU cycles when the branch is taken. This is far superior to the modulo alternative, even if hardware division is supported.
Again, this faster circular buffer code is posted see GitHub: https://github.com/QuantumLeaps/lock-free-ring-buffer
Hey, for the C version of the non-modulo advance_pointer function, I think you’ve got max_size_ where you mean to have cbuf->max.
Thank you, fixed.
No problem. Is there any reason why you couldn’t set this up to have the underlying data buffer created inside the init function?
Passing in a buffer allows for it to be allocated using static memory. If the init function created the buffer, you’d need dynamic memory. Can’t always make use of that in embedded software. I will discuss this point further in the article.
Given that this implementation overwrites old data and uses locks, I believe that size() needs a lock, otherwise it can cause a race condition.
Here’s an example race condition using the C++ implementation. It’s difficult to communicate this clearly without diagrams, but I did my best below. Note that this won’t compile of course as I had to show line by line executions of the functions to show the race condition.
// setup
circular_buffer buf(3);
buf.push(0);
buf.push(1);
buf.push(2); //full_ = true, head_=0, tail_=0
buf.push(3); //full_ = true, head_=1, tail_=1
// buffer values are now: [3,1,2]
uint32_t val = buf.read(); //val=1, full_ = false, head_ = 2, tail_ = 1
// buffer values are: [3,_,2], (of course 1 wasn’t actually overwritten)
// thread A:
// now let’s call size in thread A:
size_t size = buf.size();
// in the size method we execute;
if(!full_) // !full = true
if(m_head >= m_tail) // true: head_ = 2 and tail_=1
// WHILE we are calling size, thread B does a push()
// and completes the execution BEFORE we do the next line
// thread B:
buf.push(4);
// buf values: [3,4,2]
// full_ = true
// head_ = 2
// tail_ = 2
// now it’s full again and head == tail
// push completes, now back to thread A
// thread A:
// let’s continue to execute size()
// remember we previously evaluated the line if(m head>=m_tail) and found it was true
// we set size
size = head_ – tail_; // size = 0 since head_ == tail_
return 0;
// but m_full =true! size is actually 3
In the above example size returns 0 when it is full, because there is a race condition. head_, tail_ and full_ are updated between executions of lines of code in size, producing an incorrect answer.
If you are using an implementation a full flag, which requires locks, you should add a lock to the size() function. You can preserve the const-ness of the function by making the mutex mutable.
Thanks for the report – I will get this updated!
Great. I also think empty() needs a lock, but I didn’t have time to proof it out to show possible race conditions.
A couple of notes on static allocation and arrays:
For the C++ implementation, if you add a template parameter for the size you can allow the user to statically allocate the data structure at compile time. It would look like std::array’s class definition.
Given that the buffer is private, I’m not sure that there’s an advantage to using a C-array, since the underlying buffer data structure would not be shared with any pure C code. If I recall correctly, an std::array and a C array have the same run-time performance.
You could use the template parameter and an std::array as follows:
template
class circular_buffer {
/…/
circular_buffer() = default;
~circular_buffer() = default;
private:
std::array<T,N> buffer_;
/…/
}
Example declaration using the new template param:
CircularBuffer<uint32_t, 10> buffer;
Optionally, with static allocation:
static CircularBuffer<uint32_t, 10> buffer;
That way you don’t have to call the “new” operator in the constructor which does dynamic allocation. In fact you could just use a default constructor and destructor using the example shown above. Your capacity method would be as follows:
/** @returns the maximum size of the buffer */ constexpr size_t capacity() const { return N; };
Lastly, I believe the push function should pass a const reference:
void push(const T& value);
Please let me know your thoughts!
Thanks Veronica! I agree that empty() likely needs a lock, and that a template parameter is ideal. I’m not sure why I didn’t use a template parameter when I wrote the article, but I agree it is a better approach. I’ll include that in the next round of updates.
I need a peek function so I can use cbuf for parsing packets. Would the following implementation work? Btw, I am using only put2 and get.
Please review and advise. Thank you!
//This function allows the ability to look ahead in the buffer without
//altering the buffer. The argument look_ahead_counter determines how far
// to peek into. That is the element pointed by tail + look_ahead_counter
// will be returned.
int circular_buf_peek(cbuf_handle_t cbuf, uint8_t* data, unsigned int look_ahead_counter) {
int r = -1;
size_t pos;
assert(cbuf && data && cbuf->buffer); //We can't look beyond the current buffer size if(circular_buf_empty(cbuf) || look_ahead_counter > circular_buf_size(cbuf)) { return r; } pos = cbuf->tail; for (unsigned int i=0; i < look_ahead_counter; i++) { if(++(pos) == cbuf->max) pos = 0; else pos++; } *data = cbuf->buffer[pos]; return 0;
}
One question regarding thread safety: would the thread-safety be impacted since both the put and get functions are calling the advance_pointer function that is manipulating the tail pointer?
One question: would the thread safety be impacted by put and get functions calling advance_pointer function that in turn manipulates the tail pointer?
I am talking about the C implementation with only one consumer and one producer thread.
Thank you, I learned interesting technique of hiding structures from user by defining them in source file instead of header file, with only typedef in header file.
I would rename put2 function to try_put or something like that.
This is a good point – I believe so (well, they don’t both call advance but they do both modify the tail). I am currently working on a rewrite and will look into this.
Glad about the lessons. For some reason, it was a challenge to find the header in Linux that also implements this data structure. Thank you for sharing. I used it as a layer between the an application and the serial port of Onion Omega2. I was practically porting the SoftwareSerial library to work on a BSD distro.
Thank!
In the circular_buf_init implementation, I think assert(me) is problematic, because “me” is not defined. Maybe it should be assert(cbuf)?
I have the same question. Since both modify the tail, it wont be thread safe unless we use locks.
The version that can be threadsafe (with a single consumer and single producer) without locks is the version presented in the section “Modifications for Removing the full flag”. That version does not modify tail in both locations.
Thanks for an elaborate description.
Footnotes:
* Simple typo: “circular_nuf_full”.
* Also, the sentence “you will want to use the non-full flag version” could be rephrased (it could be read as “non” applies to the flag, not the version). So “you will want to use the version without the full-flag”.
Thanks Marty, I made those changes.
One question: why are you using an std::unique_ptr<T[]> and not a std::vector?
With a std::vector you get the same continuous buffer and automatic deletion of std::unique_ptr and is compatible with all c++ compiler version. Have I missed an advantage of std::unique_ptr?
Thanks
Hi. How this library can be modified in order to handle a generic uint8_t as queue data structure. I would like to have some like this:
. void circular_buf_put(cbuf_handle_t me, const uint8_t* data, uint32_t size);
I’m sorry I did not finish..
Hi. How this library can be modified in order to handle a generic uint8_t as queue data structure (not using loop of course). I would like to have some like this:
. void circular_buf_put(cbuf_handle_t me, const uint8_t* data, uint32_t size);
. int circular_buf_get(cbuf_handle_t me, uint32_t requestedSize, uint8_t * data);
Thank you so much