I recently provided an overview of the C++ standard container types as well as high-level guidelines for selecting the right container. Today we will be taking a look at the tradeoffs of the STL associative containers.
In general, there are two types of associative containers:
AssociativeContainer, which represents an ordered container utilizing key-based lookup
UnorderedAssociativeContainer, which utilizes hashing for key-based lookup
The ordered associative containers are:
std::set, which implements a collection of unique keys, sorted by keys
std::map, which implements a collection of key-value pairs, sorted by unique keys
std::multiset, which implements a set that allows multiple entries each key
std::multimap, which implements a map that allows multiple entries for each key
The unordered associative containers are:
std::unordered_set, which implements a collection of unique keys, hashed by key
std::unordered_map, which implements a collection of key-value pairs, hashed by unique keys
std::unordered_multiset, which implements an unordered set that allows multiple entries for each key
std::unordered_multimap, which implements an unordered map that allows multiple entries for each key
Selecting between Ordered and Unordered Associative Containers
One of the biggest questions I've had is "when should I use a
map vs. a
unordered_map"? There are certainly plenty of opinions around the internet, and I haven't achieved full clarity yet. However, I have distilled some general details and rules of thumb to help you decide between ordered and unordered associative containers.
If you are only worried about performance, I'll save you some time: it seems that
unordered_map has better performance in general.
Each of the associative containers sorts keys upon insertion, allowing for O(log n) search complexity.
The ordered associative containers use a node-based allocation scheme. Iterators to all elements are stable, and in C++17 you can even move elements from one map to another without invalidating their iterators. Node-based memory allocation keeps ordered container function timings more consistent, since they never need to make large allocations.
By using an ordered container, you are able to implement efficient range searches or iterate over subsets (e.g. operate on all elements where
key > 100).
If you want to use a user-defined object for a key, you'll need to define a less-than operator or implement a new comparison function. This can be much simpler than implementing a hashing function for your custom data type.
You should use an ordered associative container if:
- You prefer a tree structure
- You don't want the memory overhead of storing the hash table
- Ordering or ordered traversal of your data set matters
- A good hash of your key data is not possible or is to slow
- You prefer to write a comparison operator for your custom data type instead of a hashing operator
- You need guaranteed performance (e.g. embedded systems)
- Unordered containers have O(n) worst-case complexity in collision cases
Each of the unordered associative containers uses hashed data structures that can be searched in amortized constant time (O(1) amortized). This means that most operations take constant time, though collisions can increase the operational time to O(n) in the worst case.
Unordered containers require additional memory to store the hash table.
If you want to use a user-defined object for a key, you'll need to implement a definition of
std::hash<T> for your key type
T and override the
You should use an unordered associative container if:
- You have the memory for a hash table
- Can absorb the occasional long operation
- Have a good hashing function to reduce collisions
- You are using string data as a key
Differentiating the Containers
Most differences between the various types of associative containers are connected to the
UnorderedAssociative container concepts. Since we've already reviewed these two major categories, let's briefly review the subtypes:
multimap. The notes below apply to both ordered and unordered associative containers.
map containers support key-value pairs. If you're just looking for a simple lookup table implementation (or a Perl-like associative array), the
map subtype is the container for you. The
map containers support unique keys: writing to a key which already exists in the structure will overwrite the stored value.
set containers support key-only entries. Like
set only supports unique keys, though writing to a key which already exists is a wasted operation. A
set is useful for storing a collection of unique things, such as "topics we are subscribed to" or "types of food that our restaurant sells". If you need to store a value with your key, you need a
multiset containers operate the same as
set but relax the requirement that keys must be unique.
multimap containers are useful for grouping objects that are related together by key. For example:
- ZIP code for the key, and people who reside at that ZIP code as the value
- Word as the key, and definition(s) as the value
multiset is similar in concept to a key with an associated integer count. You can get the count for each time that an entry appears in your
In general, you will stick to the
set types in most cases.
- Overview of C++ Containers
- ISO C++ FAQ: Containers
- Ordered Map vs Unorderd Map: A Performance Study
- StackOverflow: In Which Scenario Do I Use a Particular STL Container?
- StackOverflow: Is there any advantage of using map over unordered map?