Document #: | P2830R4 |
Date: | 2024-05-22 |
Project: | Programming Language C++ |
Audience: |
EWG |
Reply-to: |
Nate Nichols <natenichols@cox.net> Gašper Ažman <gasper.azman@gmail.com> |
std::type_order_v<T, U>
std::entity_ordering<X, Y>
constexpr std::type_identity::operator<=>
constexpr std::__lift<arg>::operator<=>
constexpr bool std::type_info::before()
apply_canonicalized
__PRETTY_FUNCTION__
instabilityAs of C++23, std::type_info
provides a stable but unspecified and
non-constexpr
order on
types.
This paper explores a standardized ordering of types in C++.
This paper is split into two parts:
TYPE_ORDER(x, y)
, which deals
with how such an order should be defined,std::type_info::before
TYPE_ORDER(X, Y)
definition
constexpr
and
implementation-defined, as suggested by EWGi and BSI
reviewers.__PRETTY_FUNCTION__
differences
from Barry Revzin (thanks!)SORT_KEY(x, y)
There is currently no way in portable C++ to sort types at compile-time.
Various libraries hack it together, mostly by utilizing
__PRETTY_FUNCTION__
, but all
such methods are non-portable and error-prone (consider forward-declared
enums - example in Annex). There are multiple stack-overflow questions
on the topic.
One such implementation is part of the monadic functional library by Bronek Kozicki et al.
Fundamentally, we need a way to canonicalize sets and multisets of
types. This is necessary when building any kind of compositional library
in C++, from std::execution
[P2300R7], mixin libraries, libraries of
monadic components and operators, policy-based libraries, etc. The
inability to sort and unique types leads to the instantiation of an
combinatorial number of templates instead of just one per typeset, which
leads to code-bloat and untenable compile-times. The problem is
fundamentally that without typeset canonicalization, we generate
different template specializations but equivalent behavior, and there is
no way to avoid this.
The goal here is to provide a flexible language mechanism to
let both Foo<A, B, C>
and
Foo<C, B, A>
produce the
same underlying
Foo_impl<A, B, C>
.
The reason we start with
TYPE_ORDER()
and not “just give
me typesets” is that we need flexibility; consider
Foo<pair<A, X>, pair<B, Y>, pair<A, Z>>
might want to be the same as Foo_impl<pair<A, X>, pair<B, Y>>
or Foo_impl<pair<A, Z>, pair<B, Y>>
Foo<A, B, A, A, C>
should
perhaps be
Foo<A, A, A, B, C>
Matrix<float, policy1, policy2, policy3>
should only deduplicate policies.We must provide TYPE_ORDER
in
order to sort
and
unique
; they are required
building blocks for any set
primitive. Put another way, even if we standardized a
set
, we’d need to somehow
canonicalize the order (due to mangling and debug info), leading us back
here.
Without such canonicalization, it is infeasible to enumerate the set of function templates to instantiate in a separate compilation unit. For the same reason, it’s utterly impossible to type-erase them in a fixed-size virtual function table.
This section introduces the kind of code we would like to write,
regardless of how this feature ends up being spelled. To not prejudice
the reader in this section, please assume the existence of an
exposition-only consteval
macro
TYPE_ORDER(x, y) -> std::strong_ordering
whose arguments can be any cv-ref qualified types.
Note: while
TYPE_ORDER(x, y)
is defined on
types, we can define a set of class templates that take an arbitrary
template argument, and
TYPE_ORDER
those; this induces
an order on all entities that can be template arguments.
Crucially, also consider the interactions with [P1985R3] and [P2841R1], which introduce
concept
and
variable-template
arguments.
(Kept short because further examples give concrete needs).
struct A {};
struct B {};
// we want some way to enable
static_assert(std::is_same_v<typeset<A, B>, typeset<B, A, A>>);
Furthermore, we need the order to be the same in different compilation units.
Consider the needs of a library for type-erasure; we’d like the user to specify the capabilities to erase. Fundamentally, these capabilities are a set.
Observe:
using T1 = lib::dyn<ostreamable, json_serializable, iterable<int>, scalar_multiplicable<int>>;
Different users of the library will likely provide these capabilities in different orders; this becomes especially problematic when writing functions:
using T2 = lib::dyn<json_serializable, ostreamable, iterable<int>, scalar_multiplicable<int>>;
~~~~~~ flipped ~~~~~~~~~~~~~~~
void f(T2 const&);
We can solve this by having type aliases, but if these sets are computed, we are left without recourse:
int sum = 0;
auto pipeline1 =
(std::cerr) | sum_into(&sum) | imul(sum) | json_dump(std::cout);
log// dyn<ostreamable, iterable<int>, scalar_multiplicable<int>, json_serializable>
auto pipeline2 =
(&sum) | imul(sum) | json_dump(std::cout) | log(std::cerr);
sum_into// dyn<iterable<int>, scalar_multiplicable<int>, json_serializable, ostreamable>
The above pipelines need the same type-erased interface for its
input, and neither is either T1
or T2
.
The most obvious example is a canonicalized
std::variant
, that is, something
like
template <typename... Ts>
using variant_for = apply_canonicalized<std::variant, Ts...>;
Building apply_canonicalized
is not terribly difficult, as long as we have
TYPE_ORDER
. Please see the
appendices on how to do it.
It would be nice if
apply_canonicalized
was a
language built-in, but to do that, we need to first define
TYPE_ORDER(x, y)
. After we
define TYPE_ORDER
, putting
apply_canonicalized
into
type_traits
is a far simpler
proposition.
Note:
apply_canonicalized
is roughly
mp11::mp_sort
+
mp11::unique
with the order
derived from TYPE_ORDER
.
Effectively any time a variant is type-indexed instead of integer-indexed, we would prefer the compiler generate just one type per set, instead of type-per-list.
Examples follow, but they are by no means exhaustive.
std::expected
Consider a std::expected
for
an algorithm that can fail in several ways:
auto read_packet(socket& sock)
-> expected<message, variant<io_error, decode_error>>;
Such an algorithm doesn’t care about whether the error type is variant<io_error, decode_error>
or variant<decode_error, io_error>
.
This is not as much of a problem in cases where the programmer writes
the types by hand, but consider the generic situation where several
errors must be aggregated generically:
template <typename Decoder>
auto read_packet(socket& sock)
-> expected<Decoder::message_type,
</*uniqued and sorted io_error + Decoder::error_types*/>>; variant
If we had member packs and pack aliases, and a nice built-in
set
, we could o the above as
template <typename... Ts>
...using set = __builtin_uniqued_sorted<Ts...>;
<...set<io_error, Decoder::...error_types...>...>; variant
std::execution
(and every
other async library) needs a variant-type to store the current result at
a suspension point. In general, since it models a “suspended” function
call, it’s analogous to some kind of
<
variant<set_value, Arg1_1, Arg1_2, Arg1_3>, // first possible entry point
closure_tuple<set_value, Arg2_1>, // second possible entry point
closure_tuple<set_error, ...>,
closure_tuple...
>
The venerable rxcpp
library
by Kirk Shoop calls this type a notification.
Consider the
transfer(scheduler)
algorithm;
it receives all set_value(PACK)
,
set_error(PACK)
and possibly
set_stopped()
parameter lists,
stores them in a variant<tuple<CHANNEL, PACK>...>
,
and resumes with the values on a different scheduler once the compute
resource becomes available.
The code using it looks like this:
| transfer(scheduler) | then([](auto&&...args){/*...*/})... some_computation
All the possible closure tuples are fundamentally unsorted, and generating the assembly, debug information, and everything else for this type more than once is a waste of compile-time and executable space (as well as instruction cache).
Furthermore, stamping out this type multiple times also duplicates the calling code, and given the different indices, COMDAT folding cannot deduplicate those code-paths.
The states of most state-machines are fundamentally unordered, as are the message types they receive. If a state machine is generated (such as in a compile-time regex library), canonicalizing identically-behaving components would lead to smaller state-machines and faster compile- and execution times.
Similarly to a canonicalized
variant
, a
canonicalized_tuple
could be
implemented as
template <typename...Ts>
using canonical_tuple = apply_canonicalized<tuple, Ts...>;
// or
using canonical_tuple = tuple<...set<Ts...>...>;
This is far less useful on its own than the
variant
, since initializing that
type will be quite the chore. Instead, canonicalized tuples appear as a
result of the environment
monads.
std::execution
mixes in an
environment, which consists at least of a
stoppable_token
, a
scheduler
, and possibly other
things like a domain
, and, at
least in the authors’ implementation, arbitrary other things. An
allocator
is also optionally
part of the environment.
The environment<pair<Tags, Ts>...>
is a product type that is a map from
Tag
to a value of type
T
, say
std::stop_token
to
never_stop_token{}
and
scheduler
to
thread_pool_scheduler
.
One can push and pop things off the environment in different parts of the pipeline; if the pushes come in the wrong order, the environment is difficult to keep canonicalized.
auto ss = std::stop_source();
auto tp = std::static_thread_pool(5);
auto env = empty_env{}
| push_env<stop_token>(ss.get_token())
| push_env<scheduler_t>(tp.get_scheduler())
| push_env<mylib::numa_domain_t>(2);
If you imagine those three calls to happen in different algorithm transformers, it’s quite likely they’ll be in a different order, manifesting a different type, which doesn’t make much sense - the type behaves identically, this just leads to code bloat.
The authors have observed a necessity to sometimes split parts of such pipeline compositions into different compilation units due to excessive compile times and repetitive compilation.
This has proven difficult due to the brittle nature of type ordering of the names of explicit template specializations involved. By far, the main culprit has been the lack of a sensible canonicalization of names, as the order changes far more frequently than the set of types.
Consider:
do_something()
| error_if_odd() // there is no semantic change
| error_if_we_too_long() // if we flip these lines | ...
Flipping the lines will flip the two exit error types in the error list, however.
Having a defined, stable order of types proved invaluable (even if we did hack the type order manually).
The first two revisions of this proposal tried to construct a full type ordering from first principles; EWGi then requested we change to an implementation-defined order (use the mangler), but EWG subsequently reversed that decision.
The overwhelming response of people working on implementations, as well as people sitting on the Itanium ABI standards committee, was that this is a fool’s errand, and we need only look at the bugreports for the Itanium ABI to see why.
As a taster, consider the following conundrum:
using what_type = decltype(
[](this auto&& self,
decltype([](decltype(self)&){}) x = {}){ return x; }());
// ^^ outer ^^ inner
The inner lambda’s “name” is clearly dependent on the outer’s, but it also goes the other way! The ABI standards already have to deal with such mangling conundrums, and duplicating their work in the C++ standard itself seems highly counterproductive, especially since the ABI standards already accomplish all the stability guarantees we would want.
Another problem is that within the C++ standard, we cannot legislate
what happens with orderings between compilation units. Granted, we do
not need this for constexpr
, but
we do need it to be stable and make sense, so a certain amount of common
sense will be expected from implementations anyway.
The recommendation was overwhelmingly to just let the order be implementation-defined, and let the implementations do the sensible thing.
“punting it off to implementations” saves the C++ standardization process from having to figure this out for every new proposal that touches types; implementations, however, already have representation on the committee, and will veto any truly unimplementable things in this space.
TYPE_ORDER(X, Y)
already has
public-facing API implicationsThe “normalized” versions of types will inevitably show up in function signatures; if different compilers on the same platform produced different orderings, compilation units from different compilers would refuse to link, despite both “signatures” beings spelled identically.
Letting the compiler vendors synchronize based on the mangling scheme or something equivalently useful is the same forum as the current ABI discussion forums; it seems the appropriate venue to standardize the ordering anyay, without making WG21 duplicate the work.
This proved decisive. Since static analyzers and compilers that produce intermediate representations before choosing a backend are part of the ecosystem, it is important to be consistent and not rely on a particular mangler.
consteval
should be as portable
as possibleThe abstract machine used at constant evaluation time has far fewer parameters than the runtime one. We should keep them to a minimum, and type ordering is not a necessary parameter. We should keep implementations consistent.
TYPE_ORDER(x, y)
The order should be the same across compilation units. This is key for generating ABI-compatible vtables, for instance.
It also should be stable through time. An order based on an ABI-mangling scheme satisfies this notion, at least.
The order should be available on freestanding implementations. One
crucial use-case is replacing the exception mechanism on those platforms
with return values of
std::expected
or similar, and
compromising that is undesirable.
type_info::before()
Ideally, whenever typeid(x).before(typeid(y)) == true
,
then TYPE_ORDER(x, y) == std::strong_ordering::less
.
The converse obviously cannot be true, since
TYPE_ORDER(x, y)
is finer than
the order induced by
type_info::before()
.
However, the standard currently says
The names, encoding rule, and collating sequence for types are all unspecified and may differ between programs.
Since this paper requires that the order not differ between programs, exposing this as a normative requirement is impossible without tightening this wording, which is outside the scope of this paper.
At least at present, some implementations decide this order at program startup time, so this is not in general implementable.
The ordering should be self-consistent, that is, for all possible
template arguments T
,
U
, and any unary template
some_template
:
TYPE_ORDER(T, U) == TYPE_ORDER(some_template<T>, some_template<U>)
.
Any operator<=>(std::meta::info, std::meta::info)
should be consistent with this one.
While std::meta::info
[P2320R0] objects can reflect more
entities than just types and values (mainly: expressions), any ordering
defined on them should be finer than this one, and specifically
consistent with it. We should do something that reflection can
subsume.
However, it doesn’t seem like that proposal will define an ordering
on info
objects, and we need
this solved sooner than that.
Regardless of the syntax chosen, the semantics would be the following.
Let X
and
Y
be (possibly cv-ref) qualified
types.
TYPE_ORDER(X, Y)
is an
constant expression of type
std::strong_ordering
.
std::same_as<X, Y> == true
if and only if TYPE_ORDER(X, Y) == std::strong_ordering::equal
.TYPE_ORDER(X, Y)
is either
std::strong_ordering::less
or
std::strong_ordering::greater
.
Which of those is implementation-defined, subject to the following
semantic constraints.Implementations must define
TYPE_ORDER(X, Y)
such that it is
an ordering, that is, it is transitive and antisymmetric; that is
TYPE_ORDER(X, Y) == std::strong_ordering::less
,
then
TYPE_ORDER(Y, X) ==
std::strong_ordering::greater`
and vice versaTYPE_ORDER(X, Y) == std::strong_ordering::less
and TYPE_ORDER(Y, Z) == std::strong_ordering::less
,
then TYPE_ORDER(X, Z)
is also
std::strong_ordering::less
.It is implementation-defined whether the order
TYPE_ORDER(X, Y)
is finer than
the one implied by
std::type_info::before
. That
is,
typeid(X).before(typeid(Y))
,
then TYPE_ORDER(X, Y) == std::strong_ordering::less
.Note: the converse is not possible -
TYPE_ORDER(X, Y)
does not strip
cv-ref qualifiers.
Please also note that X
and
Y
are types (not expressions),
and therefore typeid
does not
have dynamic typing semantics in this case.
Note: Implementations are encouraged to do this if possible, and making it “implementation-defined” places a requirement on implementations to document whether they did so.
Any syntax will do for the use-case, but some better than others. This section explores the trade-offs in syntax choice.
The authors recommend the first option.
std::type_order_v<T, U>
Specifally:
template <typename T, typename U>
inline constexpr std::strong_ordering type_order_v = TYPE_ORDER(T, U); /* see below */
template <typename T, typename U>
struct type_order : integral_constant<strong_ordering, type_order_v<T, U>> {};
// as a separate library proposal, once member packs make it
template <typename... Ts>
using ...typemultiset = /* pack of Ts, sorted by type_order_v */;
template <typename... Ts>
using ...typeset = /* uniqued ...typemultiset<Ts...>... */;
This seems like a pretty good choice. It does not need a new keyword,
only depends on <compare>
,
and the name seems relatively discoverable.
We could, in fact, put it into
<compare>
if we didn’t
want <type_traits>
to
depend on it.
It’s also freestanding, since it doesn’t depend on
<typeinfo>
.
We should also allow, as a special provision, the arguments to the above metafunctions to be incomplete.
std::entity_ordering<X, Y>
Specifically:
template <universal template X, universal template Y>
inline constexpr strong_ordering entity_order_v = TYPE_ORDER(X, Y); /* see below */
template <universal template X, universal template Y>
struct entity_order : integral_constant<strong_ordering, entity_order_v<X, Y>> {};
This is a better option than Option 1 if we get universal template parameters, as we really want to also order class templates, not just types.
However, without universal template parameters, we really don’t have much of a choice but to reach for Option 1.
The name entity_order
is also
slightly less obvious than
type_order
, but metaprogrammers
shouldn’t have trouble finding either.
Specifically:
consteval std::partial_ordering partial_order(std::meta::info x, std::meta::info y) {
return __comparable(x, y) ? TYPE_ORDER(x, y) : std::partial_order::unordered;
}
We could standardize a type order as a function on
std::meta::info
objects in
std::meta
. However, once we’re
in std::meta::info
space, it’s
more difficult to know which reflections are comparable and which
aren’t, so such a function would need to return a
std::partial_order
, which seems
decidedly less desirable.
It also means we’d need to pass the correct kind of reflection into
the ordering function, which is a bit less intuitive than just
decltype
or just the template
parameter that we already have.
constexpr std::type_identity::operator<=>
Specifically:
template <typename T, typename U>
constexpr std::strong_ordering operator<=>(std::type_identity<T>, std::type_identity<U>);
Pros:
Cons:
type_identity<T>
,
type_identity<U>
, as well
as operator<=>
overload
resolution and substitution, which is quite expensive compared to a
single type_order_v<T, U>
direct substitution and lookup.operator<=>
overload in the std
namespace is
a drag on overload resolution<compare>
to
<type_traits>
, since
that’s where type_identity
isconstexpr std::__lift<arg>::operator<=>
This option means we add template <universal template> struct __lift {};
into <type_traits>
and
define operator<=>
for
it.
Pros:
Cons:
type_identity
constexpr bool std::type_info::before()
(Included because it’s a common question.)
It would be nice, but alas, operates on cv-unqualified versions of the referenced type, so it’s not sufficient.
constexpr std::strong_order(std::type_info, std::type_info)
has similar issues.
Because we have no way to reliably order types across compilation units at compile-time.
It’s a good question. However, reflection will do nothing
for this problem by itself; the user will still have to implement
ordering using consteval
functions, which have no hope of being as fast as a compiler-provided
built-in.
User-programmed functions also won’t adapt to language evolution; this feature will.
Finally, sorting is arbitrary; having it be consistent throughout the software ecosystem is potentially a great enabler of interoperability.
No; Peter Dimov shares an interesting anecdote.
I have in Mp11 the algorithm
mp_unique
, which takes a list of types and removes the duplicates. In the course of writing the reflection papers, their authors occasionally took Mp11 code examples and tried to show how they are elegantly implemented using value-based reflection metaprogramming.
So, you take a
vector<info>
that contains types, and then you simply apply the existing algorithmstd::unique
to it, et voila… oh wait.
std::unique
wants a sorted range, and you can’tstd::sort
the info vector, because info objects aren’t ordered, even when they refer to types.
The proposal has no questions on whether it can be implemented - the question is about the definition of the order.
The concerns raised by the implementers so far have been mostly around having to bring the name mangler from the backend and make it accessible to the compiler front-end, because the obvious implementation is to define the comparison result on the mangled type strings; this option has been rejected by EWG.
Making this order match up with
type_info::before
is a matter of
bringing the name mangler for the correct platform to the frontend.
It is the opinion of the authors that this doesn’t seem to be a layering violation, as the name mangling for a given platform is analogous to other platform properties, such as the size and alignment of pointers.
If a platform does not have a name mangling strategy, any name mangling scheme will still result in a standards-conforming implementation; however, after much discussion, it seems a better direction to completely define the order in the standard itself.
This section sets out an approach for defining an ordering of all
compile-time entities; that is, the definition of
SORT_KEY(X)
for a given
cv-qualified type X
;
and the comparison function between these sort-key-tuples.
Note to reviewers: any well-defined order will satisfy the design requirements; the chief design element here is whether we have achieved a well-ordering, not what exactly it is. The only other considration is whether we can evolve the order as we introduce new entities into the language.
The entities we must order are:
Every sort-key-tuple is of the form
(element...)
.
where an element is one of:
These tuples are then ordered lexicographically (ties broken in favor of shorter tuple), atoms first, then names, then constants, then other sort-key-tuples.
Let us name this transformation as
sort_key(entity)
.
The rest of the design is concerned with defining this transformation.
A sort-key for most entities is of the form
sort_key(x) = (sort_key(sort-scope(x)), x-dependent-elements...)
This section deals with the sort-scope of an entity.
To deal with anonymous types and templated entities like hidden friends, we have to observe more than just the namespace and class the entity is declared in; we must also observe anything at all that could contribute to the declaration, such as template parameters and arguments of partial and full template specializations of both class and function templates, alias templates, concepts, their parameter numbers (in case of default arguments), and so on. After that, we must make special allowances for entities that cannot be influenced by such surroundings, such as forward declaratios of named classes.
This section deals with namespaces and their sort-keys.
Let x be the global namespace.
sort_key(x) := (_namespace_, _global-namespace_) sort_scope(x) := ()
Every namespace (except for the global namespace) has a parent namespace, which is its immediately enclosing namespace.
let x be some named namespace with simple-name “namespace_name”.
sort_key(x) := (sort_key(parent_namespace), _namespace_, "namespace_name") sort_scope(x) := sort_key(parent_namespace)
The anonymous namespace sorts after all the other namespaces.
Let x be some anonymous namespace, with parent namespace
parent_namespace
.
sort_key(x) := (sort_key(parent_namespace), _namespace_, _anonymous_) sort_scope(x) := sort_key(parent_namespace)
TODO
Qualifiers are each assigned a score
&: 1
&&: 2
const: 3 volatile: 6
and ordering lowest-first after summing them.
Therefore, for an unqualified type
T
, the order of all possible
qualified types would be:
0 T
1 T &
2 T &&
3 T const
4 T const &
5 T const &&
6 T volatile
7 T volatile &
8 T volatile &&
9 T const volatile
10 T const volatile &
11 T const volatile &&
This is accomplished by putting the qualifier sequence into the tuple just after the typename.
For a given type T
, we
define
qualifier_sequence(T const volatile &) := const volatile & qualifier_sequence(T const volatile &&) := const volatile &&
so that we can put it into the
sort_key
tuple later.
All scalar types are built-in types, except for user-defined enumerations, which should be ordered as if they were class types.
Simple scalar types (everything in this section but function types and pointer types) are not ordered by “name” - they are ordered using the rules in the table. All atoms are still ordered before them, any name is ordered after them, as are all sort-key-tuples.
This is because some of the built-in types do not have names, only
type aliases (such as
decltype(nullptr)
), and we do
not order types by their aliases.
This causes any built-in scalar types to be ordered before any compound types.
In case of ties, built-in types with simple names shall be ordered before any nameless types.
In particular, scalar types shall be ordered as follows:
void
comes first because
it’s not reifiable,std::nullptr_t
as the first monostatebool
as the first
bi-statechar
,
signed char
,
unsigned char
) (std::byte is an
enumeration in std
so it falls
under different rules)short
,
unsigned short
,
int
,
unsigned int
,
long
,
unsigned long
,
long long
,
unsigned long long
, followed by
any implementation-defined wider integral types like
__int128_t
etc.). Intersperse
any implementation-defined built-in integral types as needed between the
above following those rules.float
,
double
and
long double
come before any
other floating point types of the same size. Any decimal-floating-point
types come after binary-floating-point types; if multiple floating point
types of the same bit-length exist, break ties by the bit-size of the
exponent, lower first.u8x16
from gcc’s
documentation).TODO: this section is not complete.
Array types shall be ordered after scalar types but before class types.
Given a non-array type T
sort_key(T[]) := (sort_key(T), [])
sort_key(T[n]) := (sort_key(T), [n]) sort_key(T[][n]) := (sort_key(T), [], [n])
class foo
is declared in
struct bar
:struct bar { class foo; }
sort_key(foo) = (sort_key(bar), sort_key(foo)) = ((type, bar), (type, foo, ))
This shall hold for any of the above named scopes.
Given
namespace foo::bar {
struct i;
}
namespace baz {
struct j;
}
Then:
sort_key(foo::bar::i)
is
((namespace, foo), (namespace, bar), (type, i, ))
.sort_key(baz::j)
is
((namespace, baz), (type, j, ))
When compared, the result is that
baz::j
<
foo::bar::i
, since
namespace baz
precedes
namespace foo
.
The atoms of key-tuples are ordered as follows:
[]
(array of unknown
bound)[n]
(array of known bound n)
(ordered by n
internally)*
(pointer)...
in
f(...)
)...
in
typename...
)These are simple names (identifiers).
Examples: “pair”, “string”, “unique_ptr”, “std”, “hashmap”, “absl”.
There are the following kind tokens that can appear in key-tuples.
Note: everything but “values” is pretty simple, but we haven’t dealt
with values extensively yet with the R1 of this paper, though we should
just defer to <=>
and
require a default strong structural ordering for values that may be
template arguments.
Most names are strings that are valid (atomic) identifiers. Those are just themselves:
namespace foo::bar { struct baz; }
foo
,
bar
and
baz
are such atomic
identifiers.
Unnamed entities are all given a name that is not an identifier (but is, in fact, a tuple), and are then numbered consecutively, starting with zero, based on their name-scope.
Name-scopes are namespaces, classes, unions, functions, and enumerations.
Function declarations are name-scoped to the function itself.
Consider a lambda that appears as a default argument of a function template:
template <typename T>
void f(T x = []{ return T{0}; }());
// ^^^^^^^^^^^^^^^^^^ this one
The key-tuple for
f<int>
is:
(function, (f, (type, int)), (type, void), ((type, int)))
The key-tuple for the lambda is:
((function, (f, (type, int)), (type, void), ((type, int))), (type, (lambda, 0), ))
.
Note: because of the regular structure of key-tuples, such anonymous classes will compare greater than any entity that has a simple identifier, due to tuples comparing greater than atoms (which simple names are).
Types of lambda objects are ordered first by where they are declared, then by declaration order.
In effect, we assign them the name
(lambda, #)
where
#
is the count of other unnamed
entities in the name scope.
namespace Banana {
auto i = [](int) -> void {}; // 0th lambda instantiated in Banana
}
namespace Apple {
auto i = [](float) -> int {}; // 0th lambda instantiated in Apple
auto j = []() -> std::string {}; // 1st lambda instantiated in Apple
}
These would produce the following tuples:
sort_key(decltype(Banana::i)) = ((namespace, Banana), (type, (lambda, 0), ));
sort_key(decltype(Apple::i)) = ((namespace, Apple), (type, (lambda, 0), )); sort_key(decltype(Apple::j)) = ((namespace, Apple), (type, (lambda, 1), ));
Note: the empty bit after the identifier is the empty qualifier pack.
struct
and
union
typesThey are named, respectively,
(class, #)
and
(union, #)
.
The sort_key(namespace-name)
is (namespace, identifier)
.
This means that namespaces are ordered alphabetically by comparing namespace names at the same rank. A namespace comes before any of its subnamespaces.
Example:
namespace outer1 {
struct i;
}
namespace outer2 {
namespace inner1 {
struct i;
}
namespace inner2 {
struct i;
}
}
The order of the three structs w/ type
i
types shall be
sort_key(outer1::i) < sort_key(outer2::inner1::i) < sort_key(outer2::inner2::i)
.
The sort_key
of a type is
(type, <identifier>, <qualifiers>)
.
The <identifier>
bit is
a bit complicated, so let’s deal with the qualifiers first.
Note: any name-scopes the
type
is declared in are part of
the parent key-tuple. The
identifier
portion is
complicated because of possible template arguments for types that are
template specializations.
Class types shall be greater than scalar types.
Since we cannot redeclare two types with the same name, class types shall be ordered alphabetically.
struct Apple {};
class Banana {};
struct Carrot {};
Would be ordered as
Apple < Banana < Carrot
As such, we define sort key as:
sort_key(Apple) = (type, Apple, )
sort_key(Banana) = (type, Banana, )
sort_key(Carrot) = (type, Carrot, )
NTTPs shall first be ordered by their type, then their value.
Given:
template <auto T>
struct s {
decltype(T) i = T;
};
<1u> a;
s<1.0f> b; s
sort_key(s<1u>) = ((type, (s, sort_key(1u))))
We can define sort_key of 1u
as: sort_key(1u) = ( sort_key(decltype(1u)), 1)
s<1u>
shall be ordered
before s<1.0f>
, as
integral types come before floating point types.
NTTPs of the same type shall be lexicographically ordered by their scalar subobjects. Meaning
struct F final {
struct G final {
int h;
int i;
} g;
int j;
};
{{0,1}, 2};
F f{{1,2}, 3}; F f2
sort_key(s<f>) < sort_key(s<f2>)
;
NTTPs of the same pointer or reference type shall be ordered by instantiation order.
Class templates shall be ordered by:
For example, given:
template <typename T, typename U>
struct Apple;
struct Banana;
struct Carrot;
<Banana, Carrot>;
Apple<Banana, Banana>;
Apple<Carrot, Carrot>; Apple
Note,
sort_key(<parameter>)...
will be used to denote a tuple where
sort_key
has been applied to all
parameters.
For void f(Foo, Bar)
sort_key(<parameter>)...
would mean
(sort_key(Foo), sort_key(Bar))
sort_key
of a class template
shall be defined as:
sort_key(<class template>) = (type, (<name>, (sort_key(<parameter>)...)))
So
sort_key(Apple<Banana, Carrot> = (type, (Apple, (sort_key(Banana), sort_key(Carrot)), )
sort_key(Apple<Banana, Carrot> = (type, (Apple, ((type, Banana, ), (type, Carrot, )), )
Note: the empty bit after the identifier is the empty qualifier pack.
The above would be ordered sort_key(Apple<Banana, Banana>)
,
sort_key(Apple<Banana, Carrot>)
,
sort_key(Apple<Carrot, Carrot>
.
Function types shall be ordered by
The sort_key
of a function
shall be defined as:
sort_key(<function>) = (function, <name>, (sort_key(<parameter>)...), sort_key(<return type>))
void foo(int i);
This function can be represented by: (function, (foo, (type, void), ((type, int))))
void foo(int)
void foo(int, double)
sort_key(void foo(int)) = (function, foo, (type, void), ((type, int)))
sort_key(void foo(int, double)) = (function, foo, (type, void), ((type, int), (type, double)))
So, the type of void foo(int)
would precede the type of
void foo(int, double)
Function types shall be ordered by
The sort key of a member function shall be defined as:
sort_key(<member function>) =
(function, (<name>, sort_key(<class>)), sort_key(<return type>), (sort_key(<parameter>)...))))
struct Foo {
void bar(int i, float j);
};
sort_key(Foo::bar) =
(type, Foo, ), (function, (bar, (type, Foo, )), (type, void), ((type, int, ), (type, float, ))))
Variadic function shall be ordered in a similar way. In a variadic function, the last argument is a variadic argument. A variadic argument shall be ordered immediately after its underlying type.
Given:
void foo(Foo);
void foo(Foo...);
In this case, the type of
void foo(Foo...)
is ordered
immediately after the type of
void foo(Foo)
.
We can represent these as:
(function (type, void) (type, Foo, ))
(function (type, void) (type, Foo, ...))
Parameter are ordered as class templates.
Given:
template<class... Types>
struct Tuple {};
class Foo {};
class Bar {};
<> t0;
Tuple<int> t1;
Tuple<Foo> t2;
Tuple<Bar> t3;
Tuple<Foo, Bar> t4; Tuple
would be ordered:
Tuple<>
<
Tuple<int>
<
Tuple<Bar>
<
Tuple<Foo>
<
Tuple<Foo, Bar>
Kinds of templates are ordered first by name, then by template arguments.
Given:
template <template <template<typename> class> class Template>
struct two{};
template <template <typename> class> struct one{};
template <typename> struct zero{};
<int> value0;
zero<zero> value1;
one<one> value2; two
These are represented by tuples:
sort_key(zero<int>) = (type, (zero, (type, int)))
sort_key(one<zero>) = (type, (one, (class_template, zero))))
sort_key(two<one>) = (type, (two, (class_template, one))))
Variable templates are ordered by name, then by template parameter.
sort_key(<variable_template>) = (variable_template, (<name>, (sort_key(<template_parameter>)...)))
template <typename F, typename S>
constexpr std::pair<F, S> pair_one_two = {1, 2};
the type of
pair_one_two<int, double>
can be represented as:
sort_key(pair_one_two<int, double>) = (variable_template, (pair_one_two, (type, int), (type, double)))
Alias templates are ordered alphabetically by name.
sort_key(<alias_template>) = (alias_template, <name>)
Given
template< class T >
using remove_cvref_t = typename remove_cvref<T>::type;
sort_key(remove_cvref_t) = (alias_template, remove_cvref_t)
Concepts are ordered in a similar manner to variable templates.
sort_key(<concept>) = (concept, (<name>, (sort_key(<template_parameter>)...)))
template <typename T, typename F = decltype([](T){})>
concept f = requires (T i, F f = [](T){}) {
{f(i)} -> std::convertible_to<void>;
};
In order to order the type of the lambda declared in
concept f
,
concept f
must be comparable
with other types.
Concepts shall be ordered first by name, then by template arguments.
sort_key(f<int>) = (concept, (f, (type, int), (lambda, 0)))
(The wording is not complete; the design will be transcribed here once approved by EWG)
In 17.11.1 [compare.syn], add
template <class T, class U>
struct type_order : integral_constant<strong_ordering, see below> {};
template <class T, class U> inline constexpr strong_ordering type_order_v = type_order<T, U>::value;
At the end of 17.11 [cmp], just before 17.12 [support.coroutine], add:
17.11.7: Type Ordering
1
For (possibly cv-qualified) types X and Y,
the expression
TYPE_ORDER(X, Y)
is a
constant expression 7.7
[expr.const] whose
value is the value of an enumerator of
strong_ordering
, subject to the
following constraints:
TYPE_ORDER(X, Y)
is
strong_ordering::equal
if and
only if X and Y are the same type2
The name type_order
denotes a
Cpp17BinaryTypeTrait (20.15.2) with a base characteristic of
integral_constant<strong_ordering, TYPE_ORDER(X, Y)>
.
3
SORT_KEY(X)_
, for a
given cv-qualified type
X
is a sort-key-tuple
where every element is one of: - [3.1] sort-key-atom, one of -
[3.1.1] “namespace” - [3.1.2] “scalar” - [3.1.3] “enum” - [3.1.4]
“union” - [3.1.5] “struct” - [3.1.7] “template” - [3.1.8] “concept” -
[3.1.9] “requirement” - [3.1.10] “noexcept” - [3.1.11] “qual” - [3.1.12]
“constant” - [3.1.13] “reference” - [3.1.14] “pointer” - [3.1.15]
“anonymous” - [3.1.16] “expression” - [3.1.17] “default_argument” -
[3.2] name - [3.3] a constant - [3.3.1] signed integral
value - [3.3.2] floating point value - [3.3.3]
lambda-literal - [3.5] a sort-key-tuple
Thanks to all of the following:
apply_canonicalized
We will need a small metaprogramming library; a filter is difficult to do otherwise.
struct undefined;
template <typename... Ts> struct list {};
// apply<F, list<Ts...>> -> F<Ts...>
template <template <typename...> typename, typename> extern undefined _apply;
template <template <typename...> typename F, template <typename...> typename L,
typename... Ts>
<Ts...> _apply<F, L<Ts...>>;
Ftemplate <template <typename...> typename F, typename List>
using apply = decltype(_apply<F, List>);
// concatenate<list<Ts...>, list<Us...>, list<Vs...>> -> list<Ts..., Us..., Vs...>
template <typename...> extern undefined _concatenate;
template <typename... Ts> list<Ts...> _concatenate<list<Ts...>>;
template <typename... Ts, typename... Us, typename... Lists>
decltype(_concatenate<list<Ts..., Us...>, Lists...>)
<list<Ts...>, list<Us...>, Lists...>;
_concatenatetemplate <typename... Ts>
using concatenate = decltype(_concatenate<Ts...>);
// select: list<T> if true, list<> if false
template <bool v, typename T> extern list<> _select;
template <typename T> list<T> _select<true, T>;
template <bool v, typename T>
using select = decltype(_select<v, T>);
Canonicalization is now just a basic not-in-place quicksort-ish thing:
template <typename.../*empty*/> extern list<> _canon;
template <typename... Ts>
using canonicalized = decltype(_canon<Ts...>);
// a canonicalized T is just T
template <typename T>
<T> _canon<T>;
list
template <typename T, typename... Ts>
<
concatenate// canonicalized things less than T
<canonicalized, concatenate<select<(TYPE_ORDER(Ts, T) < 0), Ts>...>>,
apply<T> /*T*/, // ~~~~~~~~~~~~~~~~
list// canonicalized things greater than T
<canonicalized, concatenate<select<(TYPE_ORDER(Ts, T) > 0), Ts>... >>
apply> // ~~~~~~~~~~~~~~~~
<T, Ts...>; _canon
We now have
canonicalized<Ts...>
- but
this still leaves list
as a
special type which we’d rather not expose to the user. Onto
apply_canonicalized
:
template <template <typename...> typename F, typename... Ts>
using apply_canonicalized = apply<F, canonicalized<Ts...>>;
Here for completeness, feel free to skip.
#include <compare>
#include <type_traits>
struct undefined;
#define TYPE_ORDER(x, y) type_order_v<x, y>
template <typename X, typename Y>
constexpr inline std::strong_ordering type_order_v;
template <template <typename...> typename, typename>
extern undefined _apply;
template <template <typename...> typename F, template <typename...> typename L,
typename... Ts>
<Ts...> _apply<F, L<Ts...>>;
F
template <template <typename...> typename F, typename List>
using apply = decltype(_apply<F, List>);
// some user-type
template <auto x>
struct value_t : std::integral_constant<decltype(x), x> {};
template <auto x>
inline constexpr value_t<x> value_v{};
// built-in
template <auto x, auto y>
constexpr inline std::strong_ordering type_order_v<value_t<x>, value_t<y>> =
<=> y;
x
template <typename... Ts>
struct list {};
template <typename...>
extern undefined _concatenate;
template <typename... Ts>
<Ts...> _concatenate<list<Ts...>>;
listtemplate <typename... Ts, typename... Us, typename... Lists>
decltype(_concatenate<list<Ts..., Us...>, Lists...>)
<list<Ts...>, list<Us...>, Lists...>;
_concatenate
template <typename... Ts>
using concatenate = decltype(_concatenate<Ts...>);
template <bool v, typename T>
extern list<> _select;
template <typename T>
<T> _select<true, T>;
list
template <bool v, typename T>
using select = decltype(_select<v, T>);
template <typename...>
extern list<> _canon;
template <typename... Ts>
using canonicalized = decltype(_canon<Ts...>);
template <typename T>
<T> _canon<T>;
list
template <typename T, typename... Ts>
<
concatenate<canonicalized, concatenate<select<(TYPE_ORDER(Ts, T) < 0), Ts>...>>,
apply<T>,
list<canonicalized, concatenate<select<(TYPE_ORDER(Ts, T) > 0), Ts>... >>
apply>
<T, Ts...>;
_canon
static_assert(std::same_as<canonicalized<value_t<0>, value_t<-1>, value_t<-1>, value_t<1>>, list<value_t<-1>, value_t<0>, value_t<1>>>);
__PRETTY_FUNCTION__
instabilityThis example is available at https://godbolt.org/z/ojb9TnE99 .
Consider the following program, contributed by Barry Revzin:
#include <print>
enum class E;
template <E> struct C;
#ifdef DEFINED
enum class E { hi, gašper };
#endif
template <class T>
void show() {
::print("{}\n", __PRETTY_FUNCTION__);
std}
int main() {
<C<E(0)>>();
show}
When compiled with
-std=c++23
, it yields the
following output:
void show() [with T = C<(E)0>]
However, with
-std=c++23 -DDEFINED
, it
produces a different one:
void show() [with T = C<E::hi>]
This makes external names of types incorporating enums as non-type template arguments have inconsistent between translation units.