Document Number: P1190R0 Date: 2018-08-06 Author: David Stone (davidmstone@google.com, david@doublewise.net) Audience: LEWG, EWG
There are many types in the standard library that have comparison operators that defer to the comparison operators of some template parameter (for instance, vector
, tuple
, and variant
, henceforth "wrappers"). This paper explores the issues surrounding operator<=>
for such wrappers. In particular, for some types, there is an important reason to define operator==
and operator!=
even in a world with operator<=>
: it is sometimes cheaper to determine that two values are not equal than it is to determine in which way they are unequal. The canonical example here is differently-sized containers. Comparing their sizes is cheap (single integer comparison that does not need to follow a pointer), but comparing a value could be arbitrarily expensive. The current behavior of containers like vector
is that no comparisons are performed on the contained objects if the sizes are not equal. It would be an unfortunate pessimization if operator==
on a vector<int>
, vector<vector<int>>
, tuple<vector<int>>
, or struct { vector<int>; }
suddenly got slower in C++20. This paper explores several potential solutions to decide how to specify operator<=>
for wrapper types.
= default
on all comparison operators and had no rules for implicit generation. It got wide support, but was eventually withdrawn in favor of P0221R2. The primary objection to this paper was the verbosity of enabling comparisons.operator<
should not be generated by default, certain relational operators were not as performant as they could be, and there was no way to opt-out of the behavior.operator==
and operator!=
by default if the copy constructor is implicitly generated, with explicit syntax of = default
and = delete
supported when necessary. This paper got a reasonable level of support in terms of the direction proposed. Ultimately, we abandoned this approach in favor of operator<=>
. The primary objection to this paper was that it did not attempt to completely solve the comparison problem, it solved it only for equality.= default
for relational operators (<
, <=
, >
, and >=
). "Something kind of like this" was well supported. This paper was withdrawn in favor of operator<=>
. The primary objections to this paper were that it did not help with the code duplication between comparison operators and it led to less efficient relational operators.operator<=>
), Herb Sutter: This paper was a synthesis of many of the previously mentioned paper (plus several others), in addition to new material. This paper was voted into the standard.operator<=>
.operator<=>
to the standard library.The obvious specifications for operator<=>
on variable-sized containers is significantly slower than operator==
is today because operator<=>
cannot short circuit based on size (you know the two containers are unequal, but you do not know which is less), but operator==
can. This ends up having far reaching consequences.
When we add operator<=>
to wrapper types, what do we do when the wrappers are given types that do not implement operator<=>
? If we define the wrapper's operator<=>
to call the wrapped type's operator<=>
, we do not generate operator<=>
for such a type, and if we unconditionally remove operator==
, operator!=
, operator<
, operator<=
, operator>
, and operator>=
from the wrapper types, the wrappers suddenly become non-comparable when instantiated with such types. If we unconditionally use compare_3way
to implement these types so that we fall back on operator==
and operator<
for the underlying type, we end up with a silent performance change that makes such types up to twice as expensive to compare.
Prior to operator<=>
, how do you implement operator==
for a SequenceContainer
, such as vector
? (note: this paper intentially inlines the call to algorithms like equal
and lexicographical_compare
, and names the functions something other than operator==
to give them a name for discussion)
template<typename T>
bool equal1(vector<T> const & lhs, vector<T> const & rhs) {
if (lhs.size() != rhs.size()) {
return false;
}
auto lhs_first = lhs.begin();
auto const lhs_last = lhs.end();
auto rhs_first = rhs.begin();
while (lhs_first != lhs_last) {
if (!(*lhs_first == *rhs_first)) {
return false;
}
++lhs_first;
++rhs_first;
}
return true;
}
This ensures that two differently-sized containers will immediately return false. It performs a single comparison of a data member within the vector
, then each iteration performs a single iterator comparison. It is important to note that the early size check lets us avoid following the internal data pointer in the container and lets us avoid a loop / function call. If we try to implement this without having the size check (note that it is currently mandated that implementations perform this size check), we have two possible solutions:
template<typename T>
bool equal2(vector<T> const & lhs, vector<T> const & rhs) {
auto lhs_first = lhs.begin();
auto const lhs_last = lhs.end();
auto rhs_first = rhs.begin();
auto const rhs_last = rhs.end();
while (lhs_first != lhs_last && rhs_first != rhs_last) {
if (!(*lhs_first == *rhs_first)) {
return false;
}
++lhs_first;
++rhs_first;
}
return lhs.size() == rhs.size();
}
template<typename T>
bool equal3(vector<T> const & lhs, vector<T> const & rhs) {
auto impl = [](vector<T> const & smaller, vector<T> const & larger) {
auto smaller_first = smaller.begin();
auto const smaller_last = smaller.end();
auto larger_first = larger.begin();
while (smaller_first != smaller_last) {
if (!(*smaller_first == *larger_first)) {
return false;
}
++smaller_first;
++larger_first;
}
return larger_first == larger.end();
};
return lhs.size() <= rhs.size() ? impl(lhs, rhs) : impl(rhs, lhs);
}
You can see the generated code for these versions on Godbolt.
equal2
is the straightforward approach, which ignores the problem. It means that for each element in the smaller container, we have two iterator comparisons instead of just one. For the common case of vector<Integer>
or string
, this changes our per iteration cost from two increments and two compare + branch to two increments, one compare + branch, and one compare + compare + and + branch. Using a very rough estimation, this adds an extra two operations per loop.
equal3
tries to be smarter. It is optimized for longer sequences, because it pays a small fixed cost for the whole comparison to keep the loop costs the same as the first version. Something much like this is essentially what gcc's libstdc++ did for comparing string
with char const *
. Compared with equal1
, this was benchmarked to dramatically increase the costs of comparisons (depending on the benchmark, anywhere from 0 to 66% of the total cost was removed when moving from something like equal3
to something like equal1
, but with strlen
and memcmp
mixed in).
Neither of these two options are satisfying. They all have at least as much in fixed costs as equal1
, and equal2
has much more in per-loop costs. Unfortunately, straightforward application of operator<=>
to the containers gives us a generated operator==
that is the moral equivalent to equal2
or equal3
.
operator<=>
for containers must support ordering containers if the contained type can be ordered. If we want to maintain current ordering (more on that later), operator<=>
would look something one of these two solutions:
// Does something like `equal2` above
template<typename T>
auto compare2(vector<T> const & lhs, vector<T> const & rhs) {
auto lhs_first = lhs.begin();
auto const lhs_last = lhs.end();
auto rhs_first = rhs.begin();
auto const rhs_last = rhs.end();
while (lhs_first != lhs_last && rhs_first != rhs_last) {
if (auto const cmp = compare_3way(*lhs_first, *rhs_first); cmp != 0) {
return cmp;
}
++lhs_first;
++rhs_first;
}
return lhs.size() <=> rhs.size();
}
// Does something like `equal3` above
template<typename T>
auto compare3(vector<T> const & lhs, vector<T> const & rhs) {
auto impl = [](vector<T> const & smaller, vector<T> const & larger) {
auto smaller_first = smaller.begin();
auto const smaller_last = smaller.end();
auto larger_first = larger.begin();
while (smaller_first != smaller_last) {
if (auto const cmp = compare_3way(*smaller_first, *larger_first); cmp != 0) {
return cmp;
}
++smaller_first;
++larger_first;
}
return larger_first == larger.end() ? strong_ordering::equal : strong_ordering::less;
};
auto reverse_ordering = [](auto order) {
return 0 <=> order;
};
return lhs.size() <= rhs.size() ? impl(lhs, rhs) : reverse_ordering(impl(rhs, lhs));
}
These types are all variable-size, sized, ordered ranges, which are all directly affected by this problem.
basic_string
basic_string_view
deque
list
vector
(including vector<bool>
)map
set
multimap
multiset
queue
stack
array
pair
tuple
reverse_iterator
move_iterator
optional
, including heterogeneous optional<T> <=> optional<U>
, optional<T> <=> U
, optional<T> <=> nullopt_t
, and nullopt_t <=> nullopt_t
variant
forward_list
All of these are either fixed-size (in many cases, size == 1) or, in the case of forward_list
, unsized, and thus would not benefit from an early exit based on different sizes. If we can define a maximally efficient operator<=>
for all types, these types don't need to do anything special with regards to performance (operator<=> = default
would be sufficient).
unordered_map
unordered_set
unordered_multimap
unordered_multiset
The containers on this list provide only operator==
and operator!=
. Their operator<=>
will return, at best, strong_equality
. As long as a valid implementation of operator<=>
is to call a == b
for each element, these containers are unaffected by the performance problem outlined in this paper.
operator==
and operator!=
for rangesThis is a tempting solution. The author of the range type merely defines their own operator==
(and operator!=
in terms of it) that does the size short circuiting. Unfortunately, this solution just pushes the problem up. If we define operator==
and operator!=
for ranges, we run into the same problem with the obvious specification of operator<=>
for types like tuple
, so now any type that calls another type's operator<=>
also has to define its own operator==
and operator!=
. Even if we specify types like tuple
to have all three of operator<=>
, operator==
, and operator!=
, current wording for operator<=>
would make user-defined types like
struct S {
string str;
auto operator<=>(S const &) const = default;
};
slower than string
. If we accept this library-only solution, we must sacrifice maintainability and respecting the programmer's time.
operator==
and operator!=
for ranges, and synthesize those operators in generalThis takes the previous approach, but does not simply define defaulted operator==
and operator!=
in terms of operator<=>
. Instead, we create a slightly more complicated set of rules to pick up user-defined operator==
.
a != b
, call !(a == b)
instead of a <=> b != 0
. For types that define only operator<=>
this has the same effect as current rules.auto operator<=>(T const &) const = default;
, this gives a memberwise operator<=>
and a memberwise operator==
. In other words, rather than creating a rewrite rule for operator==
, we define an actual operator. For types that contain only data members that do not have an explicitly-defined operator==
, this has the same effect as the current rules. If, however, some of the data members do have an explicitly-defined operator==
, this will ensure that the generated operator==
and operator!=
for the wrapper respect the definition of each contained type. This will ensure that tuple<string>
and the struct S
example above naturally are as efficient as possible when compared for equality.There are three main issues with this approach:
operator==
(presumably because it was necessary), so this is an example of paying for what you do use.operator<=>
is also defaulting operator==
. It's a bit strange that a user would have to define both operator<=>
and operator==
to get the behavior they would get from auto operator<=>(...) const = default
. We could make this behavior more explicit and require users to type bool operator==(T const &) const = default
to opt-in to this behavior, but I do not recommend this approach because forgetting to do so leads to a silent performance problem, rather than a compiler error. We could make it a compiler error to use operator==
on a type that has auto operator<=>(...) const = default
if any of the data members have an explicitly-defined operator==
. At this point, we are making the user type more code to get the behavior that they almost certainly want (what would the error message from your compiler look like? "You meant to put bool operator==(T const &) const = default;
somewhere in your class."). The model of operator<=>
is already that it is rewriting calls to other operators, so from that perspective it might not be that strange.a != b
, call !(a == b)
". There are two options here: 1) If a == b
is valid (changes code without operator<=>
) or 2) Only when operator<=>
is defined (we change the specification of operator<=>
to rewrite a != b
as !(a == b)
instead of a <=> b != 0
).operator<=>
create only operator<
, operator<=
, operator>
, and operator>=
In other words, operator<=>
would be for ordering only, not equality. If we take this approach, we would also want to allow bool operator==(T const &) const = default;
and bool operator!=(T const &) const = default;
. Where the above approach causes a silent performance issue when we try to make this behavior explicit, this approach simply does not give you equality unless you explicitly ask for it.
There are two main issues with this approach:
1) Compared to the previous solution, this requires the user to type even more to opt-in to behavior that they almost always want (if you have defaulted relational operators, you probably want the equality operators). Because operator<=>
is a new feature, we do not have any concerns of legacy code, so if the feature starts out as giving users all six comparison operators, it would be better if they must type only one line rather than having to type three.
2) It is a natural side-effect of computing less than vs. greater than that you compute equal to. It is strange that we define an operator that can tell us whether things are equal, but we use it to generate all comparisons other than equal and not equal. For the large set of types for which operator<=>
alone is sufficient, it also means that users who are not using the default (they are explicitly defining the comparisons) must define two operators that encode much of the same logic of comparison. This mandatory duplication invites bugs as the code is changed under maintenance.
operator<
, operator<=
, operator>
, and operator>=
An interesting modification to this idea would be to create one more comparison category that, when used as the return type of operator<=>
, leads to only the relational operators being synthesized. When calling operator<=>
directly, there would still be a value that tells whether the two objects are equal (as this is a natural side-effect), it would be only the generation that is affected.
This has most of the same drawbacks as the previous approach, except that it allows the user to say "No, this type is different, operator<=>
is not good enough for operator==
and operator!=
".
The previous solutions accept that operator<=>
is slower than operator==
and provide ways to have both. The ideal solution would make lhs <=> rhs == 0
be just as fast as lhs == rhs
, and have lhs == rhs
be just as fast in C++20 as it is in C++17. We can have such a solution if we are willing to sacrifice either backward compatibility or consistency. In C++17 and earlier, operator<
gives a lexicographical order for containers, which is an extension of alphabetical order. It essentially sorts first by elements, then by size. "Pascalexicographical order" sorts first by size, then by elements.
Lexicographical (what we currently have) | Pascalexicographical |
---|---|
"a"s == "a"s |
"a"s == "a"s |
"a"s < "z"s |
"a"s < "z"s |
"a"s < "aa"s |
"a"s < "aa"s |
"aa"s < "z"s |
"z"s < "aa"s |
What does an operator<=>
that gives this ordering look like?
template<typename T>
auto compare1(vector<T> const & lhs, vector<T> const & rhs) {
if (auto const cmp = lhs.size() <=> rhs.size(); cmp != 0) {
return cmp;
}
auto lhs_first = lhs.begin();
auto const lhs_last = lhs.end();
auto rhs_first = rhs.begin();
while (lhs_first != lhs_last) {
if (auto const cmp = compare_3way(*lhs_first, *rhs_first); cmp != 0) {
return cmp;
}
++lhs_first;
++rhs_first;
}
return strong_ordering_equal;
}
Now we are back to essentially the same fast code that we had for equal1
.
This operator fully embraces the idea that for many uses, "any order" is sufficient, so it provides the fastest strong ordering possible. There are certainly users that depend on the order of these wrappers (that is to say, they need the specific order given by a lexicographical comparison, they do not just need any ordering). This means that the straightforward application of this change would be a major, silent breaking change. But is there some clever way we can do this such that it does not break any users?
One thought could be to define operator<=>
with this new order only for those types that, themselves, define operator<=>
. We know that no user code is doing this today, so no existing code will change behavior. This immediately runs into one major stumbling block, the foremost example of which is basic_string<char>
. char
has operator<=>
, and changing the default sorting order of strings is likely the biggest breaking change in this category. If we try to exclude basic_string<char>
from this change in ordering, we also lose out on most of the performance benefits.
The other option would be to make this change for operator<=>
only. The long-term plan could be for operator==
and operator!=
to not be defined explicitly for these wrapper types (they just defer to operator<=>
), but the relational operators remain and provide a lexicographical ordering. This maintains backward compatibility and ensures that we generate code that is fast by default, but has a huge loss in consistency (for all of the most common standard library types, a <=> b < 0
is not the same as a < b
, despite that being the rewrite rule).
In theory, there could exist a tool that (as part of the C++20 upgrade process) rewrote all existing uses of relational operators like operator<
to be lexicographical_compare
for affected containers...
Any solution we decide on (other than the non-solution of not providing operator<=>
for any of these types) will have at least one thing in common: the return type will be the common comparison result type for comparing each of the elements, similar to lexicographical_compare_3way
.
When given a type that does not have operator<=>
, there are two distinct possibilities: either we provide operator<=>
or we do not. The standard library has compare_3way
, which calls operator<=>
if possible, otherwise it synthesizes it from operator==
and operator<
. If we require all calls in wrapper types to go through compare_3way
rather than operator<=>
on all wrapped types, we get operator<=>
automatically for free. This would also work with the default lexicographical_compare_3way
as an implementation strategy. It seems better to always provide operator<=>
to give a consistent user experience ("vector<T>
always has operator<=>
") and it can give better performance in situations that require 3-way comparisons -- in the case of vector<vector<T>>
, for instance, doing all necessary comparisons on the first vector
element, even if that implementation is synthesized, means that you iterate over the outer vector only once, which is slightly less traffic on your memory bus.
However, we would not want to remove all of the old comparison operators for types that do not define operator<=>
, otherwise we would have a silent performance degredation from C++17 to C++20. Here, we have a choice again. We can define the 6 comparison operators only for those types that do not define operator<=>
, or we can define them always. Here, it probably makes sense to define them only for types that do not have operator<=>
, unless we go for a solution to the performance problem that involves generating operator==
and operator!=
, in which case we should define those as well.
operator==
when the user defaults operator<=>
operator!=
as being rewritten to !(lhs == rhs)
(following the same precedence / name hiding rules as operator<=>
) for all typesoperator!=
as being rewritten to !(lhs == rhs)
(following the same precedence / name hiding rules as operator<=>
) only when operator<=>
is definedoperator<=>
create only operator<
, operator<=
, operator>
, and operator>=
, not operator==
or operator!=
operator<
, operator<=
, operator>
, and operator>=
, not operator==
or operator!=
operator<=>
only to be implementation-definedoperator<=>
only to be "Pascalexicographical", thus not comparing any contained elements if the sizes are unequaloperator<=>
, even when given a type that does not have operator<=>
operator<=>
operator<=>