An Introduction to std::vector

I was working on an article comparing C++ containers to builtin arrays, which quickly grew out of hand due to the required prerequisite information. Instead I will be spending some time covering the std::vector,std::array, and std::string containers. After an initial introduction, I will compare these containers to the builtin C-style arrays and show how they can improve your programs and reduce

Today I will start the series by introducing you to std::vector, the C++ container type used for dynamic arrays (those that can grow or shrink during runtime). std::vector is worth adding to your toolbox because it provides many extra features over builtin arrays.

std::vector Overview

First thing's first: In order to utilize std::vector containers, you will need to include the vector header:

#include <vector>

std::vector is a header-only implementation, which means that once you have a C++ runtime set up for your target system you will get this feature for free.

As mentioned above, std::vector is a templated class that represents dynamic arrays. std::vector typically allocates memory on the heap (unless you override this behavior with your own allocator). The std::vector class mostly abstracts memory management, as it grows and shrinks automatically if elements are added or removed. However, std::vector is not magic, as the class still suffers from the runtime costs associated with dynamic memory allocation.

The std::vector class simplifies operations which are cumbersome with builtin arrays. For example:

  • Current array size is tracked
    • With C-style arrays, you would need to manually track this
  • Current amount of allocated memory is tracked
    • With C-style arrays, you would need to manually track this
  • Growing and shrinking the array is a simple function call
  • Inserting elements into the middle of an array is simplified into a function call.
    • C-style arrays require manually moving memory around when inserting elements into the middle of the array

Unlike C-style arrays, you can use strategies such as SBRM (scope-bound resource management). If you declare a std::vector in a specific scope, when that scope is no longer valid the std::vector memory will be destructed and freed automatically. Strategies like SBRM reduce the chance of introducing memory leaks.

In summary: std::vector simplifies much of the overhead code required to manage dynamic arrays. This eliminates boilerplate code that developers are required to write for basic program functionality. You are also relying on tested code and eliminating extra boilerplate code that is no longer required. These factors help to improve your code's reliability and reduce the potential areas where bugs can be introduced.

Creating a std::vector

Now that we know a little bit about std::vector, let's see how we work with one.

std::vector is a templated container class. When you are declaring a std::vector, you need to template the class on the type of data that needs to be stored:

//Declare an empty `std::vector` that will hold `int`s
std::vector<int> v2;

You can use an initializer list when creating your std::vector to assign initial values:

//v1 will be sized to the length of the initializer list
std::vector<int> v1 = {-1, 3, 5, -8, 0};

Copying vectors is quite simple! You don't need to worry about remembering the correct arguments for memcpy, simply use the equals operator (=):

v2 = v1; //copy

Or declare a new std::vector and use the vector you want copied as a constructor argument:

auto v3(v1);

Accessing Data

std::vector provides many interfaces for accessing the underlying data:

  • front()
  • back()
  • operator []
    • no bounds checks are performed
  • at()
    • bounds checks are performed - generates an exception
  • data()

The first element of the vector can be accessed using front(). Likewise, the last element can be accessed using v1.back().

There are two methods used to access specific elements: the at() function and the [] operator. The primary difference between the two methods is that the at() member function provides bounds checking. If the element you are trying to access is out-of-bounds, at() throws an exception. Using the familiar [] operator will not provide any bounds checking, providing for faster access while increasing the risk of a segmentation fault.

std::cout << "v1.front() is the first element: " << v1.front() << std::endl;
std::cout << "v1.back() is the last element: " << v1.back() << std::endl;
std::cout << "v1[0]: " << v1[0] << std::endl;
std::cout << "v1.at(4): " << v1.at(4) << std::endl;

// Bounds checking will generate exceptions. Try:
//auto b = v2.at(10);

//However, operator [] is not bounds checked!
//This may or may not seg fault
//std::cout << "v2[6]: " << v2[6] << std::endl;

Before assigning to elements using the [] operator, make sure your std::vector memory allocation is sufficient, else you risk causing a segmentation fault by accessing memory that you don't own.

data()

You can get the pointer to the underlying std::vector buffer using the data() member function. This is useful for interoperating with code that utilizes raw pointers to buffers.

For example, let's assume you are using an API with a C-style buffer interface:

void carr_func(int * vec, size_t size)
{
    std::cout << "carr_func - vec: " << vec << std::endl;
}

If you tried to pass a std::vector for the first argument, you would generate a compiler error: std::vector, unlike builtin arrays, will not decay into a pointer. Instead you need to use the data() member:

//Error:
//carr_func(v1, v1.size());

//OK:
carr_func(v1.data(), v1.size());

Adding and Removing Elements

There are a variety of functions used to add or remove elements from a std::vector:

  • push_back()
  • emplace_back()
  • pop_back()
  • clear()
  • resize()
  • emplace()
    • requires an iterator
  • insert()
    • requires an iterator
  • erase()
    • requires an iterator

To add new elements to the end of a std::vector, use push_back() and emplace_back(). These member functions are quite similar, with one critical distinction: emplace_back() allows for constructor arguments to be forwarded so that the new element can be constructed in place. If you have an existing object (or don't mind a temporary object being created), then you would use push_back().

For an integer the distinction is trivial:

int x = 0;
v2.push_back(x); //adds element to end
v2.emplace_back(10); //constructs an element in place at the end

However, for more complex types with complicated constructors, emplace_back() becomes very useful:

//Constructor: circular_buffer(size_t size)
std::vector<circular_buffer<int>> vb;
vb.emplace_back(10); //forwards the arg to the circular_buffer constructor to make a buffer of size 10

For another emplace_back() example, check out the cppreference entry.

You are not limited to inserting elements at the end of a std::vector: insert() and emplace() play similar roles and allow you to insert elements anywhere in the list. Simply supply an iterator to the location where the new element should be inserted:

v2.insert(v2.begin(), -1); //insert an element - you need an iterator
v2.emplace(v2.end(), 1000); //construct and place an element at the iterator

To erase a specific element, call the erase() function and supply an iterator to the element you want removed.

v2.erase(v2.begin()); //erase our new element - also needs an iterator

You can also remove the last element from the std::vector using pop_back():

v2.pop_back(); //removes last element

However, note that this function does not return a value! The element is simply removed from the vector. Read the element using end() prior to removal.

You can clear all elements in a std::vector by using the clear() function. This will set the size() to 0, but note that capacity() will remain at the previous allocation level.

Managing space

You need to be at least somewhat aware of the underlying memory implications of using a std::vector. Your vector often ends up occupying more space than a builtin array, as memory can be allocated to handle future growth.

Extra memory can be allocated to prevent a std::vector from needing to reallocate memory every time a new element is inserted. Additionally, if you remove elements from the std::vector, that memory remains allocated to the std::vector.

Luckily the std::vector class provides us a few functions for managing memory:

  • reserve()
  • shrink_to_fit()
  • resize()

Reallocations are costly in terms of performance. Often, you are aware of how many elements you need to store in your std::vector (at least as a maximum). You can use the reserve() function to make a single large allocation, reducing the runtime hit that may occur from frequent reallocations:

v2.reserve(10) //increase vector capacity to 10 elements

reserve() can only increase the vector size. If the requested capacity is smaller than the current capacity, nothing happens.

If you want to shrink your vector's current capacity to match the current size, you can use the shrink_to_fit() function:

//If you have reserved space greater than your current needs, you can shrink the buffer
v2.shrink_to_fit();

You can also use the resize() function to manually increase or decrease the size of your std::vector at will. If you are increasing the size using resize(), the new elements will be 0-initialized by default. You can also supply an initialization value for the new elements. If you are resizing to a smaller size, the elements at the end of the list will be removed.

v2.resize(7); //resize to 7. The new elements will be 0-initialized
v2.resize(10, -1); //resize to 10. New elements initialized with -1
v2.resize(4); //shrink and strip off extra elements

If you want to add space but don't want to add elements to the vector, use the reserve() function instead of resize().

Other Useful Interfaces

std::vector keeps track of useful details for you and provides the following functions:

  • size()
  • capacity()
  • empty()
  • max_size()

size() represents the current number of elements in the array. If there are no elements in the std::vector, empty() will return true. If there are elements, it will return false.

if(v1.empty())
{
    std::cout << "v1 is empty!" << std::endl;
}
else
{
    std::cout << "v1 is not empty. # elements: " << v1.size() << std::endl;
}

The current number of elements stored in the std::vector is not necessarily the same as the amount of memory currently allocated. To find the number of elements that can be stored based on the current memory allocation, you can use the capacity() function:

if(v1.capacity() == v1.size())
{
    std::cout << "v1 has used all of the current memory allocation. Adding new elements will trigger a realloc." << std::endl;
}

In order to know the maximum size of a std::vector on your system, you can use the max_size() member function. On my host machine this number is huge. You can tune this capacity on an embedded system to limit the max vector size on your system.

if(v1.size() == v1.max_size())
{
    std::cout << "v1 is at maximum size. no more memory for you!" << std::endl;
}

Container Operations

I just want to briefly note: since std::vector is a container class, any operations meant to be used on containers will work:

//Container operations work
std::sort(v2.begin(), v2.end());

I will be discussing container operations further in a future article.

For Loop

One benefit of using modern C++ is that you can use more convenient for loops! Instead of specifying the format we are all used to:

for(int j = 0; j < v2.size(); j++)

We can simply use this format to iterate through all elements of a vector:

for(auto &t: v2)

Here's a trivial example which prints every element using the new for loop construct:

std::cout << std::endl << "v2: " << std::endl;
for (const auto & t : v2)
{
    std::cout << t << " ";
}
std::cout << std::endl;

std::vector Overhead

Using std::vector does come with minor overhead costs (which, to me, are worth the benefits). These costs are similar to manually managing memory allocations on your own.

  • The std::vector container requires 24 bytes of overhead
  • Construction and destruction take time
  • Allocation time is required - tor the initial storage allocation, as well as any resizing that is required.

cppreference lists the complexity of vector operations as follows:

  • Random access - constant O(1)
  • Insertion or removal of elements at the end - amortized constant O(1)
  • Insertion or removal of elements - linear in distance to the end of the vector O(n)

Putting it All Together

Example code for std::vector can be found in the embedded-resources Github repository.

Further Reading

I have just provided a cursory overview of the std::vector container. Many great resources on the web exist to help explain this feature, so don't be afraid to search around or ask questions!