I recently provided an overview of the C++ standard container types as well as high-level guidelines for selecting the right container. Today we will be taking a look at the tradeoffs of the STL associative containers.
- Overview
- Selecting between Ordered and Unordered Associative Containers
- Differentiating the Containers
- Further Reading
Overview
In general, there are two types of associative containers:
AssociativeContainer
, which represents an ordered container utilizing key-based lookupUnorderedAssociativeContainer
, which utilizes hashing for key-based lookup
The ordered associative containers are:
std::set
, which implements a collection of unique keys, sorted by keysstd::map
, which implements a collection of key-value pairs, sorted by unique keysstd::multiset
, which implements a set that allows multiple entries each keystd::multimap
, which implements a map that allows multiple entries for each key
The unordered associative containers are:
std::unordered_set
, which implements a collection of unique keys, hashed by keystd::unordered_map
, which implements a collection of key-value pairs, hashed by unique keysstd::unordered_multiset
, which implements an unordered set that allows multiple entries for each keystd::unordered_multimap
, which implements an unordered map that allows multiple entries for each key
Selecting between Ordered and Unordered Associative Containers
One of the biggest questions I’ve had is “when should I use a map
vs. a unordered_map
”? There are certainly plenty of opinions around the internet, and I haven’t achieved full clarity yet. However, I have distilled some general details and rules of thumb to help you decide between ordered and unordered associative containers.
If you are only worried about performance, I’ll save you some time: it seems that unordered_map
has better performance in general.
Ordered Containers
Each of the associative containers sorts keys upon insertion, allowing for O(log n) search complexity.
The ordered associative containers use a node-based allocation scheme. Iterators to all elements are stable, and in C++17 you can even move elements from one map to another without invalidating their iterators. Node-based memory allocation keeps ordered container function timings more consistent, since they never need to make large allocations.
By using an ordered container, you are able to implement efficient range searches or iterate over subsets (e.g. operate on all elements where key > 100
).
If you want to use a user-defined object for a key, you’ll need to define a less-than operator or implement a new comparison function. This can be much simpler than implementing a hashing function for your custom data type.
You should use an ordered associative container if:
- You prefer a tree structure
- You don’t want the memory overhead of storing the hash table
- Ordering or ordered traversal of your data set matters
- A good hash of your key data is not possible or is to slow
- You prefer to write a comparison operator for your custom data type instead of a hashing operator
- You need guaranteed performance (e.g. embedded systems)
- Unordered containers have O(n) worst-case complexity in collision cases
Unordered Containers
Each of the unordered associative containers uses hashed data structures that can be searched in amortized constant time (O(1) amortized). This means that most operations take constant time, though collisions can increase the operational time to O(n) in the worst case.
Unordered containers require additional memory to store the hash table.
If you want to use a user-defined object for a key, you’ll need to implement a definition of std::hash<T>
for your key type T
and override the ==
operator.
You should use an unordered associative container if:
- You have the memory for a hash table
- Can absorb the occasional long operation
- Have a good hashing function to reduce collisions
- You are using string data as a key
Differentiating the Containers
Most differences between the various types of associative containers are connected to the OrderedAssociative
and UnorderedAssociative
container concepts. Since we’ve already reviewed these two major categories, let’s briefly review the subtypes: set
, multiset
, map
, and multimap
. The notes below apply to both ordered and unordered associative containers.
The map
containers support key-value pairs. If you’re just looking for a simple lookup table implementation (or a Perl-like associative array), the map
subtype is the container for you. The map
containers support unique keys: writing to a key which already exists in the structure will overwrite the stored value.
The set
containers support key-only entries. Like map
, set
only supports unique keys, though writing to a key which already exists is a wasted operation. A set
is useful for storing a collection of unique things, such as “topics we are subscribed to” or “types of food that our restaurant sells”. If you need to store a value with your key, you need a map
instead.
The multimap
and multiset
containers operate the same as map
and set
but relax the requirement that keys must be unique.
The multimap
containers are useful for grouping objects that are related together by key. For example:
- ZIP code for the key, and people who reside at that ZIP code as the value
- Word as the key, and definition(s) as the value
A multiset
is similar in concept to a key with an associated integer count. You can get the count for each time that an entry appears in your multiset
.
In general, you will stick to the map
and set
types in most cases.
Further Reading
- Overview of C++ Containers
- ISO C++ FAQ: Containers
AssociativeContainer
ConceptUnorderedAssociativeContainer
Concept- Ordered Map vs Unorderd Map: A Performance Study
- StackOverflow: In Which Scenario Do I Use a Particular STL Container?
- StackOverflow: Is there any advantage of using map over unordered map?
Using C++ Without the Heap
Want to use C++, but worried about how much it relies on dynamic memory allocations? Our course provides a hands-on approach for learning a diverse set of patterns, tools, and techniques for writing C++ code that never uses the heap.
Learn More on the Course Page