The existing interface of unique-keyed map containers (std::map
,
std::unordered_map
) is slightly underspecified, which makes certain
container mutations more complicated to write than necessary. This proposal is to
add new specialized algorithms:
emplace_stable()
: if the key already exists, does not
insert and does not modify the arguments.emplace_or_update()
: inserts or updates the mapped element if
the key already exists.The existing insert()
and emplace()
interfaces are underspecified. Consider the following example:
What is the value of p
? It is currently unspecified whether
p
has been moved-from. (The answer is that it depends on the
library implementation.)
The remainder of this proposal is motivated by the problem of inserting
p
so that it does not get moved-from when the key already exists,
and of overwriting it unconditionally. We will see that all possible solutions
are less than perfect: All of them require boilerplate, some are inefficient,
and none are elegant. Under the proposal, we could write:
This is purely an extension of the standard library. Two new functions have to be added to [map.special], and also to a new section “specialized algorithms” [unord.maps.special] for unordered maps. There is no interference with existing code.
First, let us consider some pieces of code that may look like a solution, but in fact are not.
Both cases #1
and #2
typically create value-pair
objects which unconditionally take ownership, all before looking up the key. The
case #3
and #4
are more insidious, because it looks like they might
work. But it is in fact perfectly valid for the implementation to construct a new, intermediate
value_type
object internally, which unconditionally takes ownership, before
comparing keys.
The signature of insert
is template <typename P> insert(P &&)
,
which suggests to the user that it is possible to pass strictly by reference (as in cases #3
and #4
above). The problem is that an implementation is not prohibited from constructing
further objects, which may move-from the original arguments..
(Concretely, the behaviour of an implementation may come down to whether an internal
auxiliary member template is of the form template <typename _Pair> void __internal_insert(_Pair
&&)
, in which case there is no ownership transfer, or of the form void
__internal_insert(value_type &&)
, in which case a temporary value-pair is constructed which takes
ownership. These differences have been observed between GCC 4.6 and GCC 4.8.)
Let us consider how we can program the desired algorithms with the existing library, and examine the relative merits and shortcomings of each case. We continue using the code of the initial example. In each case, we use the term lookup to refer either to the tree traversal in the ordered map or the hashing and bucket lookup in the unordered map.
Stable insertion problem: insert the unique pointer and guarantee that it still owns the pointee if the key already exists.
Insert-or-update problem: Set the map entry for a given key, unconditionally on whether they key already exists.
find()
This solution performs the lookup twice when the key does not exist. It is the most general in terms of the various type requirements. Information on whether the key already existed is implicit in the branching.
lower_bound()
and hintThis solution is only available for the ordered map. It is near-perfect for that case, though, except perhaps for the very insignificant fact that slightly more key comparisons than strictly necessary are being performed. Information on whether the key already existed is implicit in the branching.
operator[]
For the insert-or-update problem only, this code is perhaps the most natural-looking one.
It requires that the mapped_type
be default-constructible, and it always default-constructs
an element if the key does not exist before assigning the given value. There is no information
on whether the key already existed.
With the introduction of move semantics and move-only types in C++11, the standard library containers have evolved naturally to support such types, and by and large this evolution has been seamless. The insertion interface for unique-key maps is one of the few which do not extend naturally. To achieve the desired functionality, a programmer has to wrap the interface, and each of the possible ways of doing so has limitations, costs, or both.
The acceptance and sucess of C++ and its standard library is largely due to the belief of its users that they could not generally do a better job by writing code “by hand” (say, in C). It achieves this by being unconstrained, by not imposing arbitrary restrictions and limitations on what a programmer can do where they are not necessary, and by not imposing unwarranted cost (don't pay for what you don't use). The existing map insertion interface is something that can be done better, and in this spirit this proposal is to plug this hole. The proposed interface allows seamless use of move-only types in maps, can easily be implemented so as to not require any redundant lookup or comparison operations, and returns information which already comes out for free (the iterator and the boolean). In fact, the last two points apply equally well to pre-C++11 copyable types, and the new interface offers a previously unavailable guarantee to not make unneeded copies.
This proposal introduces one idiosyncrasy: Traditionally, we teach that we should universally
regard a (potentially) moved-from object as being in an indeterminate, unusable state. The proposed
emplace_stable
interface has different semantics. However, we feel that this
peculiarity is justified. The unique-key maps are already peculiar containers, in the sense that
they are the only containers whose value_type
has a non-trivial microstructure.
Alternatives which would preserve the traditional notion of moving-from objects might consist of
giving the object back optionally if it was not used, but this is too far a departure from the
existing library.
We do not consider the “satellite-less” associative containers
std::{unordered_,}{multi,}set
, just as we do not consider movable key
types. The notion of a key that only exists as a unique object is rather more unusual; for
example, one cannot easily search for something by value which is unique. (Rather, heterogeneous
and transparent comparators seem to be the more appropriate notion in that case.) Operations on
move-only keys will already require specialized code, and thus present less of an opportunity
for improving the existing interface. We are however perfectly open to suggestions to extend the
proposal to cover move-only keys, should this turn out to be desirable.
unordered_map
specialized algorithms [unord.maps.special]”:Effects: If the key k
already exists in the map, there is no effect, no
dynamic allocations are performed and no exceptions are thrown. Otherwise, inserts the element
constructed from the arguments as value_type(k, std::forward<M>(obj))
into
the map. The bool
part of the return value is true if and only if the insertion
took place, and the iterator
part points to the element of the map whose key
is equivalent to k
. The hint
iterator provides an insertion hint.
Complexity: The same as emplace
and emplace_hint
,
respectively.
Effects. If the key comparing equal to k
does not exist in the map,
inserts the element constructed as value_type(k, std::forward<M>(obj))
.
If the key already exists, the mapped part of the value is assigned std::forward<M>(obj)
.
The bool
part of the return value is true if and only if the key already existed
prior to the operation, and the iterator
part points to the inserted or updated
element. The hint
iterator provides an insertion hint.
Complexity: The same as emplace
and emplace_hint
,
respectively.
emplace_or_update
is optionalStrictly speaking, once a reliable emplace_stable
exists, emplace_or_update
could be written generically and efficiently as follows:
We propose the addition of emplace_or_update
regardless, so as to provide
a readable solution for both problems, without making one inferior to the other.
The implementation of the stable interface is straight-forward: construction of the element node must simply be deferred until after the key lookup. Until such time, the arguments may be passed along by reference.
For emplace_or_update
, the implementation should first look up the key and then
either construct the full value_type
or forward-assign the mapped element.
insert
-like interfaceWe considered a function like “insert_stable
” with a signature like value_type
&&
, or like the existing template signature of insert
. All we
require for stable insertion is the ability to pass the arguments by reference, so
that we can guarantee that they will not be consumed if the element cannot be inserted.
However, this would be awkward to use. The common-place
construction insert_stable(std::make_pair("foo", std::move(p)))
would do the wrong
thing and construct a temporary. The correct idiom would
be insert_stable(std::forward_as_tuple("foo", std::move(p)))
, but there would be no
diagnostic if this were done wrong. By separating the key and mapped parts, we embrace the peculiar
nature of the map's value type and make it an explicit, deliberate part of the interface. We believe
that this aids readability and teachability, as well as making the intended use easy to get right.
It is worth considering whether less intrusive changes to the standard can achieve the goals
of this proposal. This raises the issue that the current specification is unclear, both the
“Associative containers” requirements ([associative.reqmts, unord.req]) and the
“{unordered_,}map
modifiers” ([map.modifiers, unord.map.modifiers]).
For example, [unord.map.modifiers] says:
Effects: Inserts obj
converted to value_type
if and only if there is no element in the container with key equivalent to the key
of value_type(obj)
.
This does not place any conditions on whether obj
is
converted or not, only on whether the converted value is inserted. The corresponding
text for the ordered map is very different; here is only one excerpt (from [map.modifiers]):
Otherwise [if P
is not a reference] x
is considered
to be an rvalue as it is converted to value_type
and inserted into the map.
The problem with this formulation is that it is unclear what “is converted and inserted into the map” means – does the conversion happen regardless of the success of the insertion or not?
This only leaves us with emplace
. This function's semantics are not specified as
part of the containers, but only as part of the container requirements [associative.reqmts,
unord.req]. The language is again unclear; in both cases the wording is:
Effects: Inserts a T
object t
constructed with
std::forward<Args>(args)...
if and only if there is no element in the container
with key equivalent to the key of t
.
Once again, the conflation of “construct” and “insert” make the specification too imprecise.
In all three cases, the standard leaves it unspecified whether further conversions and object constructions are allowed to happen. We believe that the standard wording could benefit from a clarification, regardless of this proposal.
It is perhaps possible, although we do not have a good suggestion for how to do it, to
clarify the existing language and at the same time achieve the goal of this proposal.
The fact that the map insert signature has a universal reference,
template <typename P> insert(P &&)
(as opposed to, say, the
corresponding vector interface insert(value_type &&)
) suggests that there
is some intention in the existing standard to allow the user to pass arguments strictly by
reference: the function call expression itself does not construct any objects as a side effect
(unlike in the vector interface, where a temporary may be constructed). If some kind of rule
were added that implementations must not construct value objects internally before checking the
key, then perhaps insert
could be given the semantics of the
proposed emplace_stable
, and emplace_or_update
and
“insert_or_update
” could be derived from it as described above.
However, we do not currently have a suggestion for such a change that we would be comfortable with.
Changes would have to be made to the container-specific description of insert
, as well as
to the general associative or unordered container requirements. More importantly yet, if we merely
clarified insert
and/or emplace
to have stable semantics (bearing in mind that
for insert
to behave stably requires forward_as_tuple
and not make_pair
!),
it would still be difficult for users to rely on the new behaviour, since it would not be easy to detect whether
an implementation was conforming, and a non-conforming implementation would silently produce different
behaviour.
Many thanks to Geoffrey Romer, Jeffrey Yasskin, Billy Donahue and Stuart Taylor for providing invaluable feedback and discussion.