map
and
unordered_map
Document #: | P3091R3 |
Date: | 2024-10-14 19:11 EDT |
Project: | Programming Language C++ |
Audience: |
LEWGI |
Reply-to: |
Pablo Halpern <phalpern@halpernwightsoftware.com> |
The most convenient way to look up an element of a
map
or
unordered_map
is to use the index
operator, i.e., theMap[key]
.
This operator cannot be used, however, when 1) the container is
const
, 2)
the mapped type is not default constructible, 3) the default-constructed
value is inappropriate for the context, or 4) the container should not
be modified. These limitations often force the user to resort to the
find
member function, which returns
an iterator that points to a pair
and typically leads to more complex code having at least one
if
statement
and/or duplicate lookup operations. Taking inspiration from other
languages, especially Python, this paper proposes a
get
member function that returns
optional<T&>
and leverages the observers and monadic operations of
optional
to simplify lookups for
map
and
unordered_map
. Note that this
proposal is built on optional<T&>
,
which is described in [P2988R7] and is not yet in the C++
Working Draft. It’s usefulness is also enhanced by the
maybe
facilities described in [P1255].
R3 (2024-10-15 pre-Wroclaw mailing)
optional<T>::value_or
that will be handled in a separate paper, [P1255], by Steve Downey.R2 (2024-05-23 pre-St. Louis mailing)
value_or
should return an rvalue for
both optional<T>
and optional<T&>
.
This change is reflected in the wording for this paper. Rather than a
separate optional<T>::or_construct
member, the functionality of
or_construct
is folded into an
enhanced value_or
both optional<T>
and optional<T&>
.R1 (post 2024-02-27 LEWGI teleconference)
get
was changed from
mapped_type
to optional<mapped_type&>
,
and the functionality of get_ref
and
get_as
were delegated to
optional
. Note that optional<T&>
is not valid in C++23 but is proposed in [P2988R4] for C++26.or_construct
member to optional
optional<T&>::value_or
over that in [P2988R4]R0 (2024-02-15 pre-Tokyo mailing)
Just about every modern computer language has, as part of the
language or its standard library, one or more associative containers
that uniquely map a key to a value, variously called
map
,
hash_map
,
dictionary
, or something similar. In
C++, we have
std::map
,
std::unordered_map
,
and eventually (per [P0429R9] and [P2767R2]) std::flat_map
. The
easy-to-write and easy-to-read expression for getting a value for an
associated key is simply theMap[key]
;
in other words, a mapped value is retrieved (and often set) using the
index operator, which returns a reference to the value associated with
the key. Unfortunately, the index operator in the C++ associative
containers has a number of shortcomings compared to many other
languages.
const
containers.Consider a std::unordered_map
named theMap
that maps an integer
key to a floating-point value, modeling a sparse array of
double
. If
we want to find the largest of the
double
values mapped to the integer keys in the range 1 to 100, we might be
tempted to write the following loop:
double largest = -std::numeric_limits<double>::infinity();
for (int i = 1; i <= 100; ++i)
= std::max(largest, theMap[i]); largest
If theMap
is
const
, the
loop will not compile. If any of the keys in the range [1, 100]
are absent from the map, then theMap[i]
will return a default-constructed
double
having value 0.0, which might or might not be desirable, depending on
whether we want to treat missing values as truly having value 0.0 or to
ignore them (or, equivalently, to treat them as having value \(-\infty\)). Finally if, before executing
this loop, theMap
contains only,
say, five entries, then at the end of the loop, it will contain at least
100 entries, most of whose values will be zero.
There are alternatives that avoid all these shortcomings but are
significantly less elegant and therefore more error prone. For example,
the at
member function looks a lot
like operator[]
and has none of the above shortcomings, but missing keys are handled by
throwing exceptions, making this option impractical for situations other
than when the key is almost certain to exist. Moreover, a
try
-catch
block within a loop is rarely a clean way to structure code:
double largest = -std::numeric_limits<double>::infinity();
for (int i = 1; i <= 100; ++i)
{
try {
= std::max(largest, theMap.at(i));
largest } catch (const std::out_of_range&) { }
}
The above code would work with a
const
map,
would ignore missing elements (rather than treating them as zeros), and
would not populate the map with useless entries, but many programmers
would argue that the loop is inelegant at best. In most C++
implementations, this code would be extremely inefficient unless key
misses are rare.
The other obvious alternative uses the
find
member function:
double largest = -std::numeric_limits<double>::infinity();
for (int i = 1; i <= 100; ++i)
{
auto iter = theMap.find(i);
if (iter != theMap.end())
= std::max(largest, iter->second);
largest }
This version of the loop is only slightly more verbose than our
starting point and avoids all the issues, but using iterators, needing
to call two member functions
(find
and
end
), and having to extract the
second
member of the element
pair
for what should be a
simple operation increases the cognitive load on both the
programmer and the reader. Moreover, a generic use of
find
can yield a subtle bug.
Consider the following function template,
f
:
template <class Key, class Value>
void f(const Key& k, const std::map<Key, Value>& aMap)
{
= some-default-obj-value-expression;
Value obj auto iter = aMap.find(k);
if (iter != aMap.end())
= iter->second;
obj // code that uses `obj` ...
}
An instantiation of f
will not
compile unless Value
is copy
assignable. Worse, unless tested with a nonassignable type, the bug
could go undetected for a long time. One fix would be initialize
obj
in a single expression:
= aMap.count(k) ? aMap.at(k) : some-default-obj-value-expression; Value obj
While correct, this solution involves two lookup operations: one for
the count
and one for the
at
. A better fix requires a bit more
code:
auto iter = aMap.find(k);
= iter != aMap.end() ? iter->second : some-default-obj-value-expression; Value obj
Note that the last solution again involves
iterator
,
pair
, and a conditional expression,
which is a far cry from the simplicity of operator[]
.
Let’s contrast these less-than-ideal map lookups to dictionary lookups in another language and consider one way to write the largest-value computation in Python:
= float("inf")
inf = -inf
largest for i in range(1, 101):
= max(largest, theMap.get(i, -inf)) largest
The get
member of Python’s
dictionary type looks up the supplied key (first argument). If the key
exists in the dictionary, get
returns the corresponding value; otherwise,
get
returns the specified
alternative value (second argument).
In this paper, I propose a get
member function for
std::map
and
std::unordered_map
similar to the get
member in Python
dictionaries. Because C++ does not have a
None
value like Python’s,
get
returns an
optional
, instead, and delegates the
construction of an alternative return value to the
optional
observer members.
What’s desired is a simple expression that, given a key, returns the
mapped value if the key exists in a specific map and a user-supplied
alternative value if the key does not exist. The proposed
feature is a get
member function for
map
and
unordered_map
that returns optional<T&>
(or, for a
const
map,
optional<const T&>
):
template<class Key, class T, class Compare = less<Key>,
class Allocator = allocator<pair<const Key, T>>>
class map {
// ...
< mapped_type&> get(const key_type& k);
optional<const mapped_type&> get(const key_type& k) const;
optional//...
};
These functions depend on having an
optional
template that can be
instantiated with reference types, as proposed in [P2988R7].
Using this feature, the earlier example could be written almost as simply as the Python version:
constexpr double inf = std::numeric_limits<double>::infinity();
double largest = -inf;
for (int i = 1; i <= 100; ++i)
= std::max(largest, theMap.get(i).value_or(-inf)); largest
optional::value_or
To maximize the usefulness and convenience of the proposed
get
function, earlier revisions of
this paper also proposed enhancements to optional<T>::value_or
and optional<T&>::value_or
,
allowing (but not requiring) the caller to specify a return type other
than T
, and giving it a variadic
argument list comprising constructor arguments for the return value.
Although such an extension would be useful, it can also be provided
through a non-member function that is expected to be in the next
revision of [P1255] by
Steve Downey. Since that change does not directly relate to the subject
of this proposal, there is no reason to combine it with this paper. You
can see the proposed extensions in a previous revision of this paper,
[P3091R2].
The following table shows how operations are simplified using the
proposed new member functions. In these examples,
K
,
T
, and
U
are object types;
m
is an object of (possibly
const
) std::map<K, T>
,
unordered_map<K, T>
,
or flat_map<K, T>
;
k
is a value compatible with
K
;
v
is an lvalue of type (possibly
const
)
T
; and
a1..aN
are
arguments that can used to construct an object of type
T
.
Before
|
After
|
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
get
The name
map::get
is
borrowed from the Python dictionary member of the same name. Other names
considered were get_optional
,
lookup
, and
lookup_optional
. The
get
name was selected for brevity
and familiarity, but lookup
is also
concise and is a reasonable alternative.
map
Version R0 of this paper proposed
get
,
get_ref
member functions that would
look up a key and return the corresponding value (or a reference to the
corresponding value) or else construct an alternative from the nonkey
arguments without, involving optional<T&>
:
// return a value
template <class U = remove_cvref_t<T>, class... Args>
(const key_type& key, Args&&... args);
U gettemplate <class U = remove_cvref_t<T>, class... Args>
(const key_type& key, Args&&... args) const;
U get
// return a reference
template <class Arg>
auto get_ref(const key_type& key, Arg&& ref) -> common_reference_t< mapped_type&, Arg>;
template <class Arg>
auto get_ref(const key_type& key, Arg&& ref) const -> common_reference_t<const mapped_type&, Arg>;
The following table shows the usage of the R0 design compared to the currently proposed design.
R0 Design ( get Returns
T )
|
Current Design ( get Returns
optional<T&> )
|
---|---|
|
|
|
|
|
|
|
|
|
|
Advantages of the R0 Design Over the Current Design
get
and
get_ref
in many cases.map
and
unordered_map
specification changes
would be independent of optional<T&>
,
thus decoupling this paper from [P2988R7] and its subsequent
revisions.Advantages of the Current Design Over the R0 Design
get
provides a
direct indication, via a disengaged
optional
return value, that the key
was not found.optional
, the
current design can leverage all of the functionality of
optional
, including
value_or
and monadic operations. Any
future improvements to optional
could be accessed by users of
map::get
without modifying map
. In
particular, all of the features in [P1255] could be applied to the return
value of
map::get
,
including a std::reference_or
function.get
) compared to two observers for
the R0 design (get
and
get_ref
). Since each observer
requires four overloads
(const
and
non-const
,
each having a key_type
or templated
K
parameter), the interface
simplification is notable.get
is simple to specify and implement, making
get
easy to add to nonstandard
map-like containers and new standard containers such as
flat_map
.vector
and
deque
, should the committee choose
to do that.During the 2024-02-27
LEWGI (SG18) telecon, unanimous consent was achieved for
get
returning optional<T&>
(known as the Alternative Design in [P3091R0]):
POLL: The alternative design with a smaller container API with extensions to
std::optional
should be pursued.
SF
|
WF
|
N
|
WA
|
SA
|
---|---|---|---|---|
6 | 4 | 0 | 0 | 0 |
Providing the functionality described in this paper is possible using
namespace-scope functions, without modifying
std::map
and
std::unordered_map
:
template <class Map, class K, class... Args>
typename optional<typename Map::mapped_type&> get(Map& m, const K& k);
template <class Map, class K, class... Args>
typename optional<const typename Map::mapped_type&> get(const Map& m, const K& k);
auto x = get(m, k).value_or(v);
One benefit to this approach is that the namespace-scoped
get
template can handle any
map-like dictionary type (i.e., a type having a
mapped_type
and a
find
method that returns an
iterator
pointing to a key-value
pair
). Such an approach, however,
has disadvantages.
get
outside of the map
interface.An implementation, with tests and usage examples, can be found at https://github.com/phalpern/WG21-halpern/tree/P3091/P3091-map_lookup.
Some of the functionality can be found in Meta’s [Folly] library.
This wording is relative to the July 2024 working draft, [N4986].
To the list in 17.3.2 [version.syn]1, add:
#define __cpp_lib_map_get yyyymmL // also in <map>, <multimap>
In 24.4.4.1
[map.overview]/2,
insert the get
element-access
members:
//
24.4.4.3 [map.access], element access
mapped_type& operator[](const key_type& x);
mapped_type& operator[](key_type&& x);
template<class K> mapped_type& operator[](K&& x);
mapped_type& at(const key_type& x);
const mapped_type& at(const key_type& x) const;
template<class K> mapped_type& at(const K& x);
template<class K> const mapped_type& at(const K& x) const;
optional<mapped_type&> get(const key_type& x) noexcept;
optional<const mapped_type&> get(const key_type& x) const noexcept;
template<class K> optional<mapped_type&> get(const K& x) noexcept;
template<class K> optional<const mapped_type&> get(const K& x) const noexcept;
At the end of 24.4.4.3 [map.access], add these descriptions:
optional<mapped_type&> get(const key_type& x) noexcept;
optional<const mapped_type&> get(const key_type& x) const noexcept;
template<class K> optional<mapped_type&> get(const K& x) noexcept;
template<class K> optional<const mapped_type&> get(const K& x) const noexcept;
Constraints: For the third and fourth overloads, the qualified-id
Compare::is_transparent
is valid and denotes a type.
Preconditions: The expression
find(x)
is well formed and has well-defined behavior.
Returns:
find(x)->second
iffind(x) == end()
isfalse
, otherwisenullopt
.
Complexity: Logarithmic
In 24.5.4.1
[unord.map.overview]/3,
insert the get
element-access
members:
//
24.5.4.3 [unord.map.elem], element access
mapped_type& operator[](const key_type& x);
mapped_type& operator[](key_type&& x);
template<class K> mapped_type& operator[](K&& x);
mapped_type& at(const key_type& x);
const mapped_type& at(const key_type& x) const;
template<class K> mapped_type& at(const K& x);
template<class K> const mapped_type& at(const K& x) const;
optional<mapped_type&> get(const key_type& x) noexcept;
optional<const mapped_type&> get(const key_type& x) const noexcept;
template<class K> optional<mapped_type&> get(const K& x) noexcept;
template<class K> optional<const mapped_type&> get(const K& x) const noexcept;
At the end of 24.5.4.3 [unord.map.elem], add these descriptions:
optional<mapped_type&> get(const key_type& x) noexcept;
optional<const mapped_type&> get(const key_type& x) const noexcept;
template<class K> optional<mapped_type&> get(const K& x) noexcept;
template<class K> optional<const mapped_type&> get(const K& x) const noexcept;
Constraints: For the third and fourth overloads, the qualified-ids
Hash::is_transparent
andPred::is_transparent
are valid and denote types.
Preconditions: The expression
find(x)
is well formed and has well-defined behavior.
Returns:
find(x)->second
iffind(x) == end()
isfalse
, otherwisenullopt
.
Thanks to Tomasz Kamiński for pushing me on the optional<T&>
approach.
Thanks to Steve Downey for working with me to harmonize P2988 and P1255 with this paper.
Thanks to Lori Hughes for editing support.
views::nullable
And a concept to constrain maybe
s.
All citations to the Standard are to working draft N4986 unless otherwise specified.↩︎