1. Changelog
- R5 (post-Prague):
  - Polymorphic types (those with any vptr at all) inhibit "natural" trivial relocatability. David Stone gave the motivating example. See "Polymorphic downcast effectively relies on offset-into-self."
  - Updated concept relocatable. Added semantic requirements on assignment and some discussion of the issue with vector::insert.
  - Added many direct links to sections of my C++Now 2019 talk covering particular motivations, design decisions, and implementation details.
- R3 (post-Kona):
  - User-provided copy/move assignment operators inhibit "natural" trivial relocatability, just like user-provided move constructors and destructors. This is intended to permit us to optimize vector::insert.
  - User-provided copy constructors inhibit "natural" trivial relocatability, just like user-provided move constructors and destructors.
  - Adopted [[trivially_relocatable(bool)]]. The attribute-argument is optional.
  - Added std::relocate_at(source, dest). Note that this is not a customization point.
  - Removed the core-language blanket wording that permitted "the implementation" to eliminate moves and destroys of trivially relocatable types. Instead, the existence of std::relocate_at and handwaviness of e.g. [vector.capacity] combine to strongly imply (but still not to mandate) that library vendors will use std::relocate_at wherever it helps with performance. We no longer propose to permit eliminating moves and destroys in e.g. return statements, except as already permitted under the as-if rule.
2. Introduction and motivation
C++17 knows the verbs "move," "copy," "destroy," and "swap," where "swap" is a higher-level operation
composed of several lower-level operations. To this list we propose to add the verb "relocate,"
which is a higher-level operation composed of exactly two lower-level operations.
Given an object type T and memory addresses src and dst, the phrase "relocate a T from src to dst" means no more and no less than "move-construct dst from src, and then immediately destroy the object at src."
Any type which is both move-constructible and destructible is relocatable.
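The two-step definition above can be sketched in today's C++. This is only an illustration: relocate_sketch and relocate_demo are hypothetical names, not the proposed std::relocate_at, and the sketch contains no trivial-relocation fast path.

```cpp
#include <memory>
#include <new>
#include <string>
#include <utility>

// Hypothetical helper (not part of the proposal's wording):
// relocate = move-construct at dst, then immediately destroy the object at src.
template<class T>
T *relocate_sketch(T *src, T *dst) {
    T *result = ::new (static_cast<void*>(dst)) T(std::move(*src));
    std::destroy_at(src);
    return result;
}

// Relocates a std::string from one raw buffer to another and returns its value.
inline std::string relocate_demo() {
    alignas(std::string) unsigned char a[sizeof(std::string)];
    alignas(std::string) unsigned char b[sizeof(std::string)];
    std::string *src = ::new (static_cast<void*>(a)) std::string("hello, relocation");
    std::string *dst = relocate_sketch(src, reinterpret_cast<std::string*>(b));
    std::string copy = *dst;   // only the object at dst is alive at this point
    std::destroy_at(dst);
    return copy;
}
```

After the call, the object at src no longer exists; touching it (even to destroy it) would be a bug, which is the key difference between "relocate" and a plain move.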
The notion can be modified by adverbs: we say that a type T is nothrow relocatable if its relocation operation is noexcept, and we say that a type T is trivially relocatable if its relocation operation is trivial (which, just like trivial move-construction and trivial copy-construction, means "tantamount to memcpy").
In practice, almost all relocatable types are trivially relocatable. Non-trivially relocatable types exist but are rare. See Appendix C: Examples of non-trivially relocatable class types.
P1144 provides a way for the programmer to warrant to the compiler that a resource-managing type is trivially relocatable, and provides an algorithm by which the compiler can recursively infer that a Rule-of-Zero type is trivially relocatable.
The most important thing about P1144 relocation is that it is backward compatible and does not break either API or ABI. My intention is simply to legalize the well-understood tricks that every industry codebase is already doing in practice (see [BSL], [EASTL], [Folly]). P1144 is not intended to change the behavior of any existing source code (except to speed it up). P1144 is not intended to require any major work from standard library vendors.
As observed in [CppChat] (@21:57): Just as with C++11 move semantics, you can write benchmarks to show whatever speedup you like. The more complicated your types' move-constructors and destructors, the more time you save by eliminating calls to them.
2.1. Optimizations enabled by trivial relocatability
2.1.1. Vector resize
If we have a reliable way of detecting "trivial relocatability," we can optimize any routine that performs the moral equivalent of realloc, including

    std::vector<R>::resize
    std::vector<R>::reserve
    std::vector<R>::emplace_back
    std::vector<R>::push_back

[Bench] (presented at C++Now 2018) shows a 3x speedup on vector resize. This Reddit thread demonstrates a similar 3x speedup using the online tool Quick-Bench.
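A minimal sketch of the reallocation step that these vector operations share is shown below. The names grow_buffer, grow_demo, and memcpy_relocatable_v are hypothetical. Because no standard detection trait exists today, the sketch assumes std::is_trivially_copyable_v as a conservative stand-in for the proposed is_trivially_relocatable; it also ignores over-aligned types and exception safety.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Assumption: stand-in for the proposed trait; true for far fewer types
// than is_trivially_relocatable would be.
template<class T>
inline constexpr bool memcpy_relocatable_v = std::is_trivially_copyable_v<T>;

// Relocates n elements from old_buf into new storage of new_cap elements.
// The elements of old_buf are dead afterwards; the caller frees both buffers.
template<class T>
T *grow_buffer(T *old_buf, std::size_t n, std::size_t new_cap) {
    static_assert(std::is_nothrow_move_constructible_v<T>,
                  "this sketch ignores exception safety");
    T *new_buf = static_cast<T*>(::operator new(new_cap * sizeof(T)));
    if constexpr (memcpy_relocatable_v<T>) {
        // Fast path: one bulk copy, no per-element constructor or destructor calls.
        if (n != 0) std::memcpy(new_buf, old_buf, n * sizeof(T));
    } else {
        // Slow path: move-construct, then destroy, element by element.
        for (std::size_t i = 0; i < n; ++i) {
            ::new (static_cast<void*>(new_buf + i)) T(std::move(old_buf[i]));
            std::destroy_at(old_buf + i);
        }
    }
    return new_buf;
}

// Exercises the slow path with std::string elements.
inline std::string grow_demo() {
    auto *old_buf = static_cast<std::string*>(::operator new(2 * sizeof(std::string)));
    ::new (static_cast<void*>(old_buf)) std::string("re");
    ::new (static_cast<void*>(old_buf + 1)) std::string("located");
    std::string *new_buf = grow_buffer(old_buf, 2, 4);
    ::operator delete(old_buf);   // the old elements were already destroyed
    std::string joined = new_buf[0] + new_buf[1];
    std::destroy_n(new_buf, 2);
    ::operator delete(new_buf);
    return joined;
}
```

With the proposed trait in place of the stand-in, std::string and similar types could take the fast path as well; that substitution is the whole optimization.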
2.1.2. Swap
Given a reliable way of detecting trivial relocatability, we can optimize any routine that uses the moral equivalent of swap, such as

    std::swap
    std::vector<R>::insert

Optimizing std::swap produces massive code-size improvements for all swap-based algorithms. See @19:56–21:22 in my C++Now 2019 talk.
However, see § 6.7 Confusing interactions with std::pmr and vector::insert for further discussion of vector::insert.
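To make the swap optimization concrete, here is a hedged sketch of a swap that exchanges object representations directly. swap_sketch and Point are hypothetical names, and the sketch again assumes std::is_trivially_copyable_v as a stand-in for the proposed trait; it does not address potentially overlapping subobjects.

```cpp
#include <cstring>
#include <memory>
#include <type_traits>
#include <utility>

struct Point { int x; int y; };   // hypothetical trivially copyable example type

template<class T>
void swap_sketch(T& a, T& b) {
    if (std::addressof(a) == std::addressof(b)) return;   // memcpy must not overlap
    if constexpr (std::is_trivially_copyable_v<T>) {
        // Three byte copies; the same machine code serves every T of this size.
        alignas(T) unsigned char tmp[sizeof(T)];
        std::memcpy(tmp, std::addressof(a), sizeof(T));
        std::memcpy(std::addressof(a), std::addressof(b), sizeof(T));
        std::memcpy(std::addressof(b), tmp, sizeof(T));
    } else {
        // Generic path: one move-construction and two move-assignments.
        T tmp(std::move(a));
        a = std::move(b);
        b = std::move(tmp);
    }
}
```

The code-size benefit comes from the first branch: its body depends only on sizeof(T), not on the move operations of T.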
2.1.3. More efficient small-buffer type-erasure
Given a reliable way of detecting trivial relocatability, we can de-duplicate the code generated by small-buffer-optimized (SBO) type-erasing wrappers such as std::function and std::any.
For these types, a move of the wrapper object is implemented in terms of a relocation of the contained object. (See for example libc++'s std::any, where the function that performs the relocation operation is confusingly named.)
In general, the relocate operation for a contained type T must be uniquely codegenned for each different T, leading to code bloat. But a single instantiation suffices to relocate every trivially relocatable T in the program. A smaller number of instantiations means faster compile times, a smaller text section, and "hotter" code (because a relatively higher proportion of your code now fits in icache).
2.1.4. More efficient fixed-capacity containers
Given a reliable way of detecting trivial relocatability, we can optimize the move-constructor of a fixed-capacity vector, which can be implemented naïvely as an element-by-element move (leaving the source vector’s elements in their moved-from state), or can be implemented efficiently as an element-by-element relocate (leaving the source vector empty).
For a detailed analysis of this case, see [FixedCapacityVector].
Note:
currently implements the
naïve element-by-element-move strategy.
2.1.5. Assertions, not assumptions
Some concurrent data structures might reasonably assert the trivial relocatability of their elements, just as they sometimes assert the stronger property of trivial copyability today.
2.2. The most important benefit
Many real-world codebases already contain templates which require trivial relocatability of their template parameters, but currently have no way to verify trivial relocatability. For example, [Folly] requires the programmer to warrant the trivial relocatability of any type stored in a folly::fbvector:

    class Widget {
        std::vector<int> lst_;
    };

    folly::fbvector<Widget> vec;  // FAILS AT COMPILE TIME for lack of warrant
But this merely encourages the programmer to add the warrant and continue. An incorrect warrant will be discovered only at runtime, via undefined behavior. See Allocated memory contains pointer to self, [FollyIssue889], and (most importantly) @27:26–31:47 in my C++Now 2019 talk.
    class Gadget {
        std::list<int> lst_;
    };
    // sigh, add the warrant on autopilot
    template<> struct folly::IsRelocatable<Gadget> : std::true_type {};

    folly::fbvector<Gadget> vec;  // CRASHES AT RUNTIME due to fraudulent warrant
If this proposal is adopted, then Folly can start using std::is_trivially_relocatable in the implementation of folly::fbvector, and the programmer can stop writing explicit warrants.
Finally, the programmer can start writing assertions of correctness, which aids maintainability and
can even find real bugs. Example:
    class Widget {
        std::vector<int> lst_;
    };
    static_assert(std::is_trivially_relocatable_v<Widget>);  // correctly SUCCEEDS

    class Gadget {
        std::list<int> lst_;
    };
    static_assert(std::is_trivially_relocatable_v<Gadget>);  // correctly ERRORS OUT
The improvement in user experience for real-world codebases (such as [Folly], [EASTL], BDE, Qt, etc.) is the most important benefit to be gained by this proposal.
3. Design goals
Every C++ type already is or is not trivially relocatable. This proposal does not require any library vendor to make any library type trivially relocatable. (We assume that quality implementations will do so on their own.)
The optimizations discussed above are purely in the domain of library writers. If you’re writing a vector, and you detect that your element type T is trivially relocatable, then whether you do any special optimization in that case is up to you. This proposal does not require any library vendor to guarantee that any particular optimization happens. (But we assume that quality implementations will do so on their own.)
What C++ lacks is a standard way for library writers to detect the (existing) trivial relocatability of a type T, so that they can reliably apply their (existing) optimizations. All we really need is to add detection, and then all the optimizations described above will naturally emerge without any further special effort by WG21.
There are three kinds of object types that we want to make sure are correctly detected as trivially relocatable. These three cases are important for improving the performance of the standard library, and for improving the correctness of programs using libraries such as [Folly]'s fbvector.
3.1. Standard library types such as std::string
In order to optimize std::vector<std::string>::resize, we must come up with a way to achieve

    #include <string>
    static_assert(is_trivially_relocatable<std::string>::value);

This could be done unilaterally by the library vendor: via a non-standard attribute ([[clang::trivially_relocatable]]), or a member typedef with a reserved name, or simply a vendor-provided specialization of std::is_trivially_relocatable<std::string>.
That is, we can in principle solve §3.1 while confining our "magic" to the headers of the implementation itself. The programmer doesn’t have to learn anything new, so far.
3.2. Program-defined types that follow the Rule of Zero
In order to optimize the SBO std::function in any meaningful sense, we must come up with a way to achieve

    #include <string>
    auto lam2 = [x = std::string("hello")]{};
    static_assert(is_trivially_relocatable<decltype(lam2)>::value);

Lambdas are not a special case in C++; they are simply class types with all their special members defaulted. Therefore, presumably we should be able to use the same solution for lambdas as for

    #include <string>
    struct A {
        std::string s;
    };
    static_assert(is_trivially_relocatable<A>::value);

Here struct A follows the Rule of Zero: its move-constructor and destructor are both defaulted. If they were also trivial, then we’d be done. In fact they are non-trivial; and yet, because the type’s bases and members are all of trivially relocatable types, the type as a whole is trivially relocatable.
§3.2 asks that we make the static_assert succeed without breaking the "Rule of Zero." We do not want to require the programmer to annotate struct A with a special attribute, or a special member typedef, or anything like that. We want it to Just Work. Even for lambda types. This is a much harder problem than §3.1; it requires standard support in the core language. But it still does not require any new syntax.
3.3. Program-defined types with non-defaulted special members
In order to optimize std::vector<boost::shared_ptr<T>>::resize, we must come up with a way to achieve

    struct B {
        B(B&&);  // non-trivial
        ~B();    // non-trivial
    };
    static_assert(is_trivially_relocatable<B>::value);

via some standard annotation applied to class type B (which in this example is standing in for boost::shared_ptr).
Note: We cannot possibly do it without annotation, because there exist examples of types that look just like B and are trivially relocatable (for example, libstdc++'s std::function) and there exist types that look just like B and are not trivially relocatable (for example, libc++'s std::function). The compiler cannot "crack open" the definitions of B(B&&) and ~B() to see if they combine to form a trivial operation: the definitions of B(B&&) and ~B() might not even be available in the current translation unit. So, without some kind of opt-in annotation, we cannot achieve our goal.
This use-case is the only one that requires us to design the "opt-in" syntax. In §3.1, any special syntax is hidden inside the implementation’s own headers. In §3.2, our design goal is to avoid special syntax. In §3.3, WG21 must actually design user-facing syntax.
Therefore, I believe it would be acceptable to punt on §3.3 and come back to it later. We say, "Sure, that would be nice, but there’s no syntax for it. Be glad that it works for core-language and library types. Ask again in three years." As long as we leave the design space open, I believe we wouldn’t lose much by delaying a solution to §3.3.
This paper does propose a standard syntax for §3.3 (an attribute), which in turn provides a simple and portable way for library vendors to implement §3.1.
4. Proposed language and library features
This paper proposes five separate additions to the C++ Standard. These additions introduce "relocate" as a well-supported C++ notion on par with "swap," and furthermore, successfully communicate trivial relocatability in each of the three use-cases above.
- New standard algorithms, including relocate_at(source, dest) and uninitialized_relocate(first, last, d_first), in the <memory> header.
- Additional type traits, is_relocatable<T> and is_nothrow_relocatable<T>, in the <type_traits> header.
- New type traits, including is_trivially_relocatable<T>, in the <type_traits> header. This is the detection mechanism.
- A new core-language rule by which a class type’s "trivial relocatability" is inherited according to the Rule of Zero.
- A new attribute, [[trivially_relocatable]], in the core language. This is the opt-in mechanism for program-defined types.
These five bullet points are severable to some degree. For example, if the [[trivially_relocatable]] attribute (point 5) is adopted, library vendors will certainly use it in their implementations; but if the attribute is rejected, library vendors could still indicate the trivial relocatability of certain standard library types by providing library specializations of is_trivially_relocatable (point 3).
Points 1 and 2 are completely severable from points 3, 4, and 5; but we believe these algorithms should be provided for symmetry with the other uninitialized-memory algorithms in the <memory> header (e.g. uninitialized_move) and the other trios of type-traits in the <type_traits> header (e.g. is_destructible, is_nothrow_destructible, is_trivially_destructible). I do not expect these templates to be frequently useful, but I believe they should be provided, so as not to surprise the programmer by their absence.
Points 3 and 4 together motivate point 5. In order to achieve the goal of § 3.2 Program-defined types that follow the Rule of Zero, we must define a core-language mechanism by which we can "inherit" trivial relocatability. This is especially important for the template case.

    template<class T>
    struct D {
        T t;
    };

    // class C comes in from outside, already marked, via whatever mechanism
    constexpr bool c = is_trivially_relocatable<C>::value;
    constexpr bool dc = is_trivially_relocatable<D<C>>::value;
    static_assert(dc == c);

We strongly believe that std::is_trivially_relocatable<T> should be just a plain old class template, exactly like std::is_trivially_destructible<T> and so on. The core language should not know or care that the class template is_trivially_relocatable exists, any more than it knows that the class template is_trivially_destructible exists.
We expect that the library vendor will implement std::is_trivially_relocatable, just like std::is_trivially_destructible, in terms of a non-standard compiler builtin whose natural spelling is __is_trivially_relocatable(T). This builtin has been implemented in my fork of Clang; see [D50119].
The compiler computes the value of __is_trivially_relocatable(T) by inspecting the definition of T (and the definitions of its base classes and members, recursively). This recursive process "bottoms out" at primitive types, or at any type with a user-provided move or destroy operation. For safety, classes with user-provided move or destroy operations (e.g. Appendix C: Examples of non-trivially relocatable class types) must be assumed not to be trivially relocatable. To achieve the goal of § 3.3 Program-defined types with non-defaulted special members, we must provide a way that such a class can "opt in" and warrant to the implementation that it is in fact trivially relocatable (despite being non-trivially move-constructible and/or non-trivially destructible).
In point 5 we propose that the opt-in mechanism should be an attribute. The programmer of a trivially relocatable but non-trivially destructible class C will mark it for the compiler using the attribute:

    struct [[trivially_relocatable]] C {
        C(C&&);  // defined elsewhere
        ~C();    // defined elsewhere
    };
    static_assert(is_trivially_relocatable<C>::value);

The attribute overrides the compiler’s usual computation.
An example of a "conditionally" trivially relocatable class is shown in Conditionally trivial relocation.
The attribute is severable; WG21 could adopt all the rest of this proposal and leave vendors to implement [[clang::trivially_relocatable]] and similar vendor-specific attributes as non-standard extension mechanisms. In that case, we would strike §5.6 and one bullet point from §5.5; the rest of this proposal would remain exactly the same.
5. Proposed wording for C++2b
The wording in this section is relative to WG21 draft N4800.
5.1. Relocation operation
Add a new section in [definitions]:
[definitions] is probably the wrong place for the core-language definition of "relocation operation"

    relocation operation
    the homogeneous binary operation performed by std::relocate_at, consisting of a move-construction immediately followed by a destruction of the source object

this definition of "relocation operation" is not good
5.2. Algorithm relocate_at
Add a new section after [uninitialized.move]:
    template<class T>
    T *relocate_at(T *source, T *dest);

    namespace ranges {
      template<relocatable T>
      T *relocate_at(T *source, T *dest);
    }

Mandates: T shall be a complete non-array object type.
Effects: Equivalent to:

    uninitialized_move(source, source + 1, dest);
    destroy_at(source);
    return std::launder(dest);

except that if T is trivially relocatable [basic.types], side effects associated with the relocation of the object’s value might not happen.

Note: The "as-if-by-memcpy" codepath hidden inside relocate_at cannot be implemented by literally calling memcpy (at least not without undefined behavior). At Kona, EWG discussion of Richard Smith’s [P0593R3] suggests to me, and Richard confirmed, that a typical library vendor could implement this codepath by a call to a hypothetical function whose implementation is invisible to the compiler’s optimizer.
See @45:23–48:39 in my C++Now 2019 talk.
5.3. Algorithm uninitialized_relocate
Add a new section after [uninitialized.move]:
    template<class ForwardIterator1, class ForwardIterator2>
    ForwardIterator2 uninitialized_relocate(ForwardIterator1 first, ForwardIterator1 last,
                                            ForwardIterator2 result);

Effects: Equivalent to:

    result = uninitialized_move(first, last, result);
    destroy(first, last);
    return result;

except that if the iterators' common value type is trivially relocatable, side effects associated with the relocation of the object’s value might not happen.
Remarks: If an exception is thrown, some objects in the range [first, last) are left in a valid but unspecified state.

Note: The "Remark" implies that uninitialized_relocate has an exception-cleanup clause similar to that of uninitialized_move: if an exception is thrown, the whole destination range is destroyed before the exception is propagated. If the value type is non-trivially relocatable and its move-constructor might throw, then the implementation must be "move-in-a-loop followed by destroy-in-a-loop." See the implementation in Appendix B: Sample code.
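The "Equivalent to" wording can be transcribed almost directly into C++17. The sketch below is not the sample code of Appendix B; uninitialized_relocate_sketch and relocate_range_demo are hypothetical names, and the trivial-relocation fast path is deliberately omitted so that only the move-then-destroy ordering is shown.

```cpp
#include <memory>
#include <string>

// Hypothetical name; mirrors the generic (non-trivial) codepath only.
template<class FwdIt1, class FwdIt2>
FwdIt2 uninitialized_relocate_sketch(FwdIt1 first, FwdIt1 last, FwdIt2 result) {
    // If a move constructor throws, std::uninitialized_move destroys the
    // destination objects constructed so far and rethrows; the source objects
    // are then left alive in a valid but unspecified state.
    result = std::uninitialized_move(first, last, result);
    // Only after every move has succeeded do we end the source lifetimes.
    std::destroy(first, last);
    return result;
}

// Relocates three strings between two raw allocations and joins them.
inline std::string relocate_range_demo() {
    std::allocator<std::string> alloc;
    std::string *src = alloc.allocate(3);
    std::string *dst = alloc.allocate(3);
    std::uninitialized_fill_n(src, 3, std::string("x"));
    uninitialized_relocate_sketch(src, src + 3, dst);
    std::string joined = dst[0] + dst[1] + dst[2];
    std::destroy(dst, dst + 3);
    alloc.deallocate(src, 3);   // src holds no live objects anymore
    alloc.deallocate(dst, 3);
    return joined;
}
```

Interleaving the move and the destroy per element would be cheaper, but it would leave already-destroyed source elements behind if a later move threw; that is why the throwing case needs two loops.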
Note: We are guided by [P0884R0] to make
unconditionally
.
This is consistent with
and
, both of which are unconditionally
.
5.4. Algorithm uninitialized_relocate_n
    template<class ForwardIterator1, class Size, class ForwardIterator2>
    pair<ForwardIterator1, ForwardIterator2>
    uninitialized_relocate_n(ForwardIterator1 first, Size n, ForwardIterator2 result);

Effects: Equivalent to:

    auto pair = uninitialized_move_n(first, n, result);
    destroy_n(first, n);
    return pair;

except that if the iterators' common value type is trivially relocatable, side effects associated with the relocation of the object’s value might not happen.
Remarks: If an exception is thrown, some objects in the range [first, std::next(first, n)) are left in a valid but unspecified state.
5.5. Trivially relocatable type
Add a new section in [basic.types]:
A move-constructible, destructible object type T is a trivially relocatable type if it is:
- a trivially copyable type, or
- an array of trivially relocatable type, or
- a (possibly cv-qualified) class type declared with a [[trivially_relocatable]] attribute with value true, or
- a (possibly cv-qualified) class type which:
  - has no user-provided move constructors or move assignment operators,
  - has no user-provided copy constructors or copy assignment operators,
  - has no user-provided destructors,
  - has no virtual member functions,
  - has no virtual base classes,
  - all of whose members are either of reference type or of trivially relocatable type, and
  - all of whose base classes are trivially relocatable.
[Note: For a trivially relocatable type, the relocation operation (such as the relocation operations performed by the library functions std::swap and std::vector::resize) is tantamount to a simple copy of the underlying bytes. -- end note]
[Note: It is intended that most standard library types be trivially relocatable types. -- end note]
Note: As of P1144R5, polymorphic types are not "naturally" trivially relocatable. See Appendix C, example 5.
Note: A class type declared [[trivially_relocatable(false)]] may still be trivially relocatable, if it satisfies the criteria above. For example, it is impossible to create a trivially copyable type which is not also trivially relocatable.
Note: There is no special treatment for volatile subobjects (see [Subobjects]) nor for possibly overlapping subobjects (see [Subobjects]).
The relevant move constructor, copy constructor, and/or destructor must be public and unambiguous. We imply this via the words "A move-constructible, destructible object type". However, "move-constructible" and "destructible" are library concepts, not core language concepts, so it is inappropriate to use them here.

    struct A {
        struct MA {
            MA(MA&);
            MA(const MA&) = default;
            MA(MA&&) = default;
        };
        mutable MA ma;
        A(const A&) = default;
    };
    static_assert(not std::is_trivially_relocatable_v<A>);

    struct B {
        struct MB {
            MB(const volatile MB&);
            MB(const MB&) = default;
            MB(MB&&) = default;
        };
        volatile MB mb;
        B(const B&) = default;
    };
    static_assert(not std::is_trivially_relocatable_v<B>);

    struct H {
        H(H&&);
    };
    struct [[trivially_relocatable]] I {
        I(I&&);
    };
    template<bool Cond>
    struct J : std::conditional_t<Cond, H, I> {
        J(const J&);
        J(J&&) = default;
    };
    static_assert(std::is_trivially_relocatable_v<J<false>>);
    static_assert(not std::is_trivially_relocatable_v<J<true>>);

We must find a rule that makes neither A nor B trivially relocatable, because the move-construction A(std::move(a)) invokes user-provided copy constructor MA(MA&) and the move-construction B(std::move(b)) invokes user-provided copy constructor MB(const volatile MB&).
Prior to P1144R3, the rule was that mutable and volatile data members inhibited "natural" trivial relocatability. As of P1144R3, the rule is that any user-provided copy constructor inhibits trivial relocatability (even in the presence of a defaulted move constructor).
In P1144R2, I tried to find a rule that makes J<false> trivially relocatable, because the std::conditional_t base-class pattern was used to implement "conditionally trivial relocatability" for all allocator-aware containers in my libc++ reference implementation. However, this paper adopts the [[trivially_relocatable(bool)]] syntax, which means that we don’t need to care about that pattern anymore. The new way to write J is simply

    template<bool Cond>
    struct [[trivially_relocatable(!Cond)]] J {
        J(J&&);
        J(const J&);
    };
    static_assert(std::is_trivially_relocatable_v<J<false>>);
    static_assert(not std::is_trivially_relocatable_v<J<true>>);
5.6. [[trivially_relocatable]] attribute
Add a new section after [dcl.attr.nouniqueattr]:
The attribute-token trivially_relocatable specifies that a class type’s relocation operation has no visible side-effects other than a copy of the underlying bytes, as if by the library function std::memcpy. It may be applied to the definition of a class. It shall appear at most once in each attribute-list. An attribute-argument-clause may be present and, if present, shall have the form

    ( constant-expression )

The constant-expression shall be an integral constant expression of type bool. If no attribute-argument-clause is present, it has the same effect as an attribute-argument-clause of (true).
If any definition of a class type has a trivially_relocatable attribute with value V, then each definition of the same class type shall have a trivially_relocatable attribute with value V. No diagnostic is required if definitions in different translation units have mismatched trivially_relocatable attributes.
If a type T is declared with the trivially_relocatable attribute, and T is either not move-constructible or not destructible, the program is ill-formed.
If a class type is declared with the trivially_relocatable attribute, and the program relies on observable side-effects of relocation other than a copy of the underlying bytes, the behavior is undefined.

"If a type T is declared with the trivially_relocatable attribute, and T is either not move-constructible or not destructible, the program is ill-formed." We might want to replace this wording with a mere "Note" encouraging implementations to diagnose.
See this example where a diagnostic might be unwanted.
5.7. Type traits is_relocatable etc.
Add new entries to Table 47 in [meta.unary.prop]:

Template: template<class T> struct is_relocatable;
Condition: is_move_constructible_v<T> is true and is_destructible_v<T> is true
Preconditions: T shall be a complete type, cv void, or an array of unknown bound.

Template: template<class T> struct is_nothrow_relocatable;
Condition: is_relocatable_v<T> is true and both the indicated move-constructor and the destructor are known not to throw any exceptions.
Preconditions: T shall be a complete type, cv void, or an array of unknown bound.

Template: template<class T> struct is_trivially_relocatable;
Condition: T is a trivially relocatable type.
Preconditions: T shall be a complete type, cv void, or an array of unknown bound.
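The first two table entries can be approximated with existing traits, as in the sketch below. The _sketch suffixes and the example types Pinned and ThrowingMove are hypothetical. The third trait, is_trivially_relocatable, is intentionally absent: it cannot be written in portable library code and needs the compiler support discussed in section 4.

```cpp
#include <string>
#include <type_traits>

// Relocatable: move-constructible and destructible.
template<class T>
struct is_relocatable_sketch
    : std::bool_constant<std::is_move_constructible_v<T> &&
                         std::is_destructible_v<T>> {};

// Nothrow relocatable: neither the move-construction nor the destruction may throw.
template<class T>
struct is_nothrow_relocatable_sketch
    : std::bool_constant<std::is_nothrow_move_constructible_v<T> &&
                         std::is_nothrow_destructible_v<T>> {};

// Hypothetical example types.
struct Pinned {
    Pinned(Pinned&&) = delete;                    // cannot be relocated at all
};
struct ThrowingMove {
    ThrowingMove(ThrowingMove&&) noexcept(false) {}   // relocatable, but may throw
};

static_assert(is_relocatable_sketch<std::string>::value);
static_assert(!is_relocatable_sketch<Pinned>::value);
static_assert(is_relocatable_sketch<ThrowingMove>::value);
static_assert(!is_nothrow_relocatable_sketch<ThrowingMove>::value);
```

Note that a type like ThrowingMove could still be warranted trivially relocatable via the attribute; § 6.5 discusses that combination.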
5.8. Relocatable concept
Add a new section after [concept.moveconstructible]:

    template<class T>
    concept relocatable = move_constructible<T>;

If T is an object type, then let rv be an rvalue of type T, lv an lvalue of type T equal to rv, and u2 a distinct object of type T equal to rv. T models relocatable only if
- After the definition T u = rv;, u is equal to u2.
- T(rv) is equal to u2.
- If the expression u2 = rv is well-formed, then the expression has the same semantics as u2.~T(); ::new ((void*)std::addressof(u2)) T(rv);
- If the definition T u = lv; is well-formed, then after the definition u is equal to u2.
- If the expression T(lv) is well-formed, then the expression’s result is equal to u2.
- If the expression u2 = lv is well-formed, then the expression has the same semantics as u2.~T(); ::new ((void*)std::addressof(u2)) T(lv);

The semantic requirements of this concept are poorly worded. We intend that a type may be relocatable regardless of whether it is copy-constructible; but, if it is copy-constructible then copy-and-destroy must have the same semantics as move-and-destroy. We intend that a type may be relocatable regardless of whether it is assignable; but, if it is assignable then assignment must have the same semantics as destroy-and-copy or destroy-and-move.
Note: The semantic requirements on assignment help us optimize operations such as vector::insert.
The type
satisfies
, but it models
only if all relevant objects have equal allocators.
Note: I do not propose to add a concept named
.
There is currently no concept named
nor
.
6. Rationale and alternatives
6.1. Why not destructive move?
As discussed in EWGI at San Diego, this proposal does not give us a general user-provided "destructive move" facility.
- Denis Bider’s [P0023R0] and Pablo Halpern’s [N4158] went in that direction and did not succeed. People have been chasing "destructive move" for decades; maybe it’s time to try something different.
- We get the performance benefit only when the library (e.g. std::vector::resize) can detect that "relocate/destructive move" is tantamount to memcpy. If we permit a user-provided "destructive move" operation, we must also design a way for the user to warrant that their "destructive move" is tantamount to memcpy. No previous proposal has shown how to do this.
- P1144’s approach is explicitly based on existing industry practice: [Folly], [EASTL], and [BSL] all use this exact idea in practice and it seems to work for them. Marc Glisse has been integrating the same idea into GNU libstdc++; see [Deque]. The term "relocation" is due to [EASTL] (has_trivial_relocate) and [Folly] (IsRelocatable). The same concept appears in pre-C++11 libraries under the name "movable": Qt (Q_MOVABLE_TYPE) and [BSL] (IsBitwiseMoveable). P1144’s sole innovation is to give a consistent set of core-language rules by which the compiler can deduce the trivial relocatability of some class types which follow the Rule of Zero.
6.2. Why [[trivially_relocatable(bool)]]?
It was suggested by numerous reviewers that [[trivially_relocatable]] should take an optional boolean parameter, as in [[trivially_relocatable(bool)]]. This allows us to write complicated conditions directly inline, instead of using metaprogramming to inherit the right behavior from a conditional base class. See Conditionally trivial relocation for an example of how this can be used.
There is no technical obstacle to adding an arbitrary C++ expression as the parameter to an attribute. The grammar for balanced tokens in attribute parameters has been there since C++11. There is already prior art for arbitrary expressions in attributes in Clang. EWG has also previously considered an attribute for assumed alignment; it was rejected in favor of [P1007R3]'s non-attribute std::assume_aligned approach, but not because of the expression-argument part specifically.
The major downside I see to [[trivially_relocatable(bool)]] is that it could lead to an arbitrarily complicated C++ expression appearing in an awkward position. But this is better than having to do the metaprogramming tricks shown in Conditionally trivial relocation.
6.3. Attribute [[maybe_trivially_relocatable]]
The Clang patch currently available on Godbolt Compiler Explorer supports both [[trivially_relocatable]] and another attribute called [[maybe_trivially_relocatable]], which John McCall requested that I explore. See P1144R4 section 6.2 for discussion of the [[maybe_trivially_relocatable]] attribute, including the reasons I do not propose it for standardization.
6.4. Should relocate_at be a customization point?
No. See P1144R2 section 5.4 for discussion of this approach, which was taken by Pablo Halpern’s [N4158]. See also "Trivially Relocatable versus Destructive Movable" (2018-09-28).
N4158’s customization-point approach has a very high cost-to-benefit ratio. I am satisfied with P1144’s avoiding that approach.
6.5. Unintuitive is_nothrow_relocatable
Consider a type such as
struct [[ trivially_relocatable ]] Widget { int i ; Widget ( Widget && ) : i ( rhs . i ) {} }; static_assert ( not std :: is_nothrow_move_constructible_v < Widget > ); static_assert ( not std :: is_nothrow_relocatable_v < Widget > ); static_assert ( std :: is_trivially_relocatable_v < Widget > );
Since
is non-nothrow move-constructible, P1144 calls it non-nothrow relocatable.
So, looking at how
interacts with the type-traits, we are in the awkward position
that
simultaneously claims "My relocation operation might throw" and "My relocation
operation is trivial." These claims seem inconsistent.
This is a real-world concern because GNU libstdc++'s
works like
: its move-constructor is
(it must allocate)
but it is trivially relocatable. As of 2019-01-18, libstdc++ marks its
as
trivially relocatable (see [Deque]).
However, I believe that it would be incorrect and unsafe for the library to claim that
std::deque was "nothrow relocatable." "Nothrow relocatable" should imply that
a generic algorithm could relocate it (as if by std::relocate_at)
without worrying about catching exceptions. "T is trivially relocatable" means that
T is relocatable as if by memcpy; it does not mean that every relocation of T
must be performed literally by memcpy.
I believe P1144’s proposed behavior is the best behavior. However, another plausible option
would be simply to eliminate the is_nothrow_relocatable type-trait from the standard library.
If we don’t provide is_nothrow_relocatable, then we don’t have to defend its
mildly unintuitive behavior.
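The tension can be reproduced in standard C++17. The following is only a sketch: the proposed attribute and traits are not available in shipping toolchains, so the namespace sketch, the opt-in variable template, and the helper function are all assumptions made for illustration, not the proposed wording.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// ASSUMPTION: hand-written stand-in for the proposed trait. A type "opts in"
// by specializing this variable template, which emulates the attribute.
template <class T>
inline constexpr bool is_trivially_relocatable_v = std::is_trivially_copyable_v<T>;

// ASSUMPTION: stand-in for is_nothrow_relocatable_v. Relocation is
// move-construct plus destroy, so it is nothrow only if both steps are.
template <class T>
inline constexpr bool is_nothrow_relocatable_v =
    std::is_nothrow_move_constructible_v<T> && std::is_nothrow_destructible_v<T>;

struct Widget {
    int i = 0;
    Widget() = default;
    Widget(Widget&& rhs) : i(rhs.i) {}  // user-provided and not noexcept
};

// Emulates writing [[trivially_relocatable]] on Widget.
template <>
inline constexpr bool is_trivially_relocatable_v<Widget> = true;

// True when the two traits give the "inconsistent-looking" answers.
inline bool widget_traits_disagree() {
    return is_trivially_relocatable_v<Widget> && !is_nothrow_relocatable_v<Widget>;
}

}  // namespace sketch
```

Under these assumptions both answers hold at once for Widget: its relocation may be performed trivially, yet a generic algorithm that relocates it through the move constructor must still be prepared for an exception.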
6.6. Provide T relocate(T*)?
Note: This section is new in P1144R5.
The current proposal provides std::relocate_at(source, dest), which ends the lifetime of
*source and starts the lifetime of *dest. Anton Zhilin suggested that a more general primitive
would actually be T relocate(T* source), which ends the lifetime of *source and starts the
lifetime of a prvalue T wherever the caller wants.
    // Before P1144:
    ::new (dst_ptr) C{ std::move(*static_cast<C*>(src_ptr)) };
    static_cast<C*>(src_ptr)->~C();

    // After P1144R5:
    std::relocate_at(static_cast<C*>(src_ptr), static_cast<C*>(dst_ptr));

    // After Anton Zhilin's suggestion:
    ::new (dst_ptr) C{ std::relocate(static_cast<C*>(src_ptr)) };
With the relocate primitive, we could implement operations such as pilfer_back:
    // Before Anton Zhilin's suggestion:
    T pilfer_back() {
        T result = std::move(data_[size_ - 1]);
        data_[--size_].~T();
        return result;
    }

    // After Anton Zhilin's suggestion:
    T pilfer_back() {
        T result = std::relocate(&data_[size_ - 1]);  // relocate takes a pointer to the last element
        --size_;
        return result;
    }
relocate_at can be implemented with today’s core language. Anton’s relocate cannot. There is
no way to "memcpy into the return slot" of a function, because the return slot is unnamed;
relocate would have to use additional implementation magic, for example an attribute
such as [[clang::unconstructed]].
That attribute (or whatever) would not need to be standardized nor exposed to users.
This also assumes a compiler with Clang-level smarts about when to do NRVO. [P2025R0], currently in EWG, proposes that all compilers be required to have such smarts.
    template<class T>
    T relocate(T *source) {
        if constexpr (std::is_trivially_relocatable_v<T>) {
            [[clang::unconstructed]] T dest;   // do not run dest's constructor
            memcpy(&dest, source, sizeof(T));  // do not run source's destructor
            return dest;                       // copy-elision
        } else {
            T dest(std::move(*source));        // run dest's constructor
            source->~T();                      // run source's destructor
            return dest;                       // copy-elision
        }
    }

    template<class T>
    T *relocate_at(T *source, T *dest) {
        return ::new (dest) T(std::relocate(source));
    }
P1144R5 does not propose relocate for inclusion in the standard library, because
it requires this additional compiler support, which has not yet been implemented anywhere.
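For readers who want to experiment with the shape of the API today, here is a minimal sketch of the portable half only. It assumes no compiler support, so there is no memcpy fast path: sketch::relocate always move-constructs and then destroys. The small_stack container and every name in namespace sketch are hypothetical and exist only to host pilfer_back.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <utility>

namespace sketch {

// ASSUMPTION: portable fallback, not the proposed primitive. It cannot skip
// the move constructor or the destructor the way the "magic" version could.
template <class T>
T relocate(T* source) {
    T dest(std::move(*source));  // start the lifetime of the result
    std::destroy_at(source);     // end the lifetime of *source
    return dest;                 // NRVO, or a move if NRVO is not applied
}

// Hypothetical fixed-capacity stack over raw storage.
template <class T, std::size_t N>
class small_stack {
    alignas(T) unsigned char storage_[N * sizeof(T)];
    std::size_t size_ = 0;

    void* raw(std::size_t i) { return storage_ + i * sizeof(T); }
    T* slot(std::size_t i) { return std::launder(static_cast<T*>(raw(i))); }

public:
    small_stack() = default;
    small_stack(const small_stack&) = delete;
    small_stack& operator=(const small_stack&) = delete;
    ~small_stack() {
        while (size_ != 0) std::destroy_at(slot(--size_));
    }

    std::size_t size() const { return size_; }

    void push_back(T value) {
        assert(size_ < N);
        ::new (raw(size_)) T(std::move(value));
        ++size_;
    }

    // Removes the last element and hands it to the caller by value.
    // size_ is decremented only after relocation has succeeded.
    T pilfer_back() {
        assert(size_ != 0);
        T result = sketch::relocate(slot(size_ - 1));
        --size_;
        return result;
    }
};

inline bool pilfer_back_returns_last_element() {
    small_stack<std::string, 4> s;
    s.push_back("alpha");
    s.push_back("beta");
    std::string last = s.pilfer_back();
    return last == "beta" && s.size() == 1;
}

}  // namespace sketch
```

The container never calls the element's destructor for a pilfered slot, because relocate has already ended that element's lifetime.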
6.7. Confusing interactions with std::pmr and vector::insert
Note: This section is new in P1144R5.
The main thing P1144 does for std::vector is efficient reallocation of the whole buffer.
However, a secondary goal is to permit efficient insertion and erasure. Example:
    std::vector<std::string> vs(1000);
    vs.erase(vs.begin() + 500);
This erase requires us to shift down 499 std::string objects, which can be done either
by 499 calls to the move-assignment operator followed by one call to the destructor, or by
one call to the destructor followed by a memmove. We want to permit the latter. That’s why
as of P1144R3, the definition of "trivially relocatable" in § 5.5 Trivially relocatable type places requirements on
T's assignment operators as well as on T's constructors and destructors.
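The two strategies can be put side by side in a minimal sketch. The trait below is a conservative stand-in (it simply forwards to std::is_trivially_copyable_v) and erase_one is a hypothetical helper, not a proposed library function; both are assumptions made so that the example compiles with an unmodified C++17 toolchain.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>
#include <string>
#include <type_traits>

namespace sketch {

// ASSUMPTION: conservative stand-in for the proposed trait.
template <class T>
inline constexpr bool is_trivially_relocatable_v = std::is_trivially_copyable_v<T>;

// Erases data[pos] from the live range [data, data + size) and returns the
// new size. The slot at index size - 1 is logically vacated afterwards.
template <class T>
std::size_t erase_one(T* data, std::size_t size, std::size_t pos) {
    assert(pos < size);
    if constexpr (is_trivially_relocatable_v<T>) {
        // One destructor call, then a single bytewise shift of the tail.
        std::destroy_at(data + pos);
        std::memmove(static_cast<void*>(data + pos),
                     static_cast<const void*>(data + pos + 1),
                     (size - pos - 1) * sizeof(T));
    } else {
        // One move-assignment per tail element, then one destructor call.
        std::move(data + pos + 1, data + size, data + pos);
        std::destroy_at(data + size - 1);
    }
    return size - 1;
}

inline bool erase_one_shifts_elements_down() {
    int a[4] = {10, 20, 30, 40};
    std::size_t n = erase_one(a, 4, 1);  // memmove path
    bool ints_ok = (n == 3 && a[0] == 10 && a[1] == 30 && a[2] == 40);

    std::string s[3] = {"x", "y", "z"};
    std::size_t m = erase_one(s, 3, 0);  // assignment path under the stand-in trait
    bool strings_ok = (m == 2 && s[0] == "y" && s[1] == "z");
    // Re-create the vacated slot so the array's own destructor stays valid.
    ::new (static_cast<void*>(&s[2])) std::string();
    return ints_ok && strings_ok;
}

}  // namespace sketch
```

With the real trait, std::string would be expected to take the first branch on implementations that warrant it; under the stand-in it takes the second, which is why the definition of "trivially relocatable" has to guarantee that the two branches are indistinguishable.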
However, consider std::pmr::vector<int>. Its move-constructor and destructor follow the
rules for trivial relocatability, but its assignment operator does not.
    std::vector<std::pmr::vector<int>> vv;
    vv.emplace_back(1, 1, &mr1);
    vv.emplace_back(2, 2, &mr2);
    vv.emplace_back(3, 3, &mr3);
    vv.reserve(1000);      // A
    vv.erase(vv.begin());  // B
On line "A", we would like the implementation to use trivial relocation (memcpy) because it
is fast and correct. But on line "B", if the implementation today uses move-assignment and tomorrow
uses memmove, the user will be able to detect the change by inspecting vv[0].get_allocator().
See this example completely worked out @76:17–84:20 in my C++Now 2019 talk.
For now, P1144R5 implies that std::pmr::vector<int> shall not be trivially relocatable,
which means that line "A" will not get the memcpy speedup unless the vendor special-cases
a whitelist of std::pmr containers. However, I think this area deserves more discussion.
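The observable difference on line "B" can be checked against an unmodified C++17 standard library. This is a sketch: the observation struct and the function name are hypothetical, and the values it reports describe today's move-assignment-based erase, not a P1144-enabled implementation.

```cpp
#include <cassert>
#include <memory_resource>
#include <vector>

namespace sketch {

struct observation {
    bool after_reserve_slot0_uses_mr1;
    bool after_erase_slot0_uses_mr1;
    bool after_erase_slot0_holds_twos;
};

inline observation observe_pmr_erase() {
    std::pmr::monotonic_buffer_resource mr1, mr2;
    std::vector<std::pmr::vector<int>> vv;
    vv.emplace_back(1, 1, &mr1);  // {1}, allocated from mr1
    vv.emplace_back(2, 2, &mr2);  // {2, 2}, allocated from mr2

    observation o{};

    // Line "A": reallocation move-constructs, and the move constructor
    // carries each element's allocator along with it.
    vv.reserve(1000);
    o.after_reserve_slot0_uses_mr1 = (vv[0].get_allocator().resource() == &mr1);

    // Line "B": erase move-assigns vv[1] into vv[0]. polymorphic_allocator
    // does not propagate on move-assignment, so vv[0] keeps mr1 and only
    // receives the values.
    vv.erase(vv.begin());
    o.after_erase_slot0_uses_mr1 = (vv[0].get_allocator().resource() == &mr1);
    o.after_erase_slot0_holds_twos = (vv[0] == std::pmr::vector<int>{2, 2});
    return o;
}

}  // namespace sketch
```

If erase were instead implemented as destroy-then-memmove, vv[0] would end up holding the bytes of the old vv[1], including its pointer to mr2, so the second field would flip to false.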
Note: Regardless,
is a trivially relocatable type.
The problem above arises from
etc., which affect the behavior of
only for containers such as
. The author of the container makes
the choice whether to respect POCMA/POCCA/POCS, and the author of the container also makes the choice
when to warrant trivial relocatability. These choices are correlated, and so it is natural that
they should be made by the same person, at the same place in the source code.
7. Acknowledgements
Thanks to Elias Kosunen, Niall Douglas, and John Bandela for their feedback on early drafts of this paper.
Many thanks to Matt Godbolt for allowing me to install the prototype Clang implementation on Compiler Explorer (godbolt.org). See also [Announcing].
Thanks to Nicolas Lesser for his relentless feedback on drafts of P1144R0, and for his helpful review comments on the Clang implementation [D50119].
Thanks to Howard Hinnant for appearing with me on [CppChat], and to Jon Kalb and Phil Nash for hosting us.
Thanks to Pablo Halpern for [N4158], to which this paper bears a striking and coincidental resemblance —
Significantly different approaches to this problem have previously appeared in Rodrigo Castro Campos’s [N2754], Denis Bider’s [P0023R0] (introducing a core-language "relocation" operator), and Niall Douglas’s [P1029R3] (treating trivial relocatability as an aspect of move-construction in isolation, rather than an aspect of the class type as a whole).
Thanks to John McCall for his thought-provoking review comments on the Clang implementation [D50119].
Thanks to Marc Glisse for his work integrating a "trivially relocatable" trait into GNU libstdc++ and for answering my questions on GCC bug 87106.
Thanks to Jens Maurer for his feedback on this paper at Kona 2019, and to Corentin Jabot for championing P1144R4 at Prague 2020.
Appendix A: Straw polls
The next time this paper is seen, Anton Zhilin would like us to take a straw poll on whether it should be possible to create a type which is "trivially relocatable, but not move-constructible." Currently P1144 does not permit such a thing; relocatability here is defined as a superset of move-constructibility.
Polls taken at EWGI at Prague on 2020-02-13
Corentin Jabot championed P1144R4. EWGI discussed P1144R4 and Niall Douglas’s [P1029R3] consecutively, then took the following straw polls. The final poll was interpreted by EWGI assistant chair Erich Keane as "weak consensus" to forward P1144 to EWG.
Poll | SF | F | N | A | SA |
---|---|---|---|---|---|
We believe that P1029 and P1144 are sufficiently different that they should be advanced separately. | 7 | 3 | 2 | 0 | 0 |
EWGI is ok to have the spelling as an attribute with an expression argument. | 3 | 5 | 1 | 1 | 0 |
EWGI would prefer a contextual keyword. | 0 | 0 | 6 | 3 | 0 |
EWGI thinks the author should explore P1144 as a customizable type trait. | 0 | 0 | 0 | 9 | 2 |
Forward P1144 to EWG. | 1 | 3 | 4 | 1 | 0 |
Polls taken of the Internet between 2018-11-12 and 2018-11-21
Poll | SF | F | N | A | SA |
---|---|---|---|---|---|
We approve of the general idea that user-defined classes should be able to warrant their own trivial relocatability via a standard mechanism. | 6 | 1 | 0 | 0 | 1 |
We approve of the general idea that user-defined classes which follow the Rule of Zero should inherit the trivial relocatability of their bases and members. | 7 | 1 | 0 | 0 | 0 |
Nobody should be able to warrant the trivial relocatability of except for itself (i.e., we do not want to see a customization point analogous to ). | 4 | 2 | 2 | 0 | 0 |
A class should be able to warrant its own trivial relocatability via the attribute , as proposed in this paper (P1144R0). | 3 | 0 | 3 | 1 | 0 |
A class should be able to warrant its own trivial relocatability via some attribute, but not necessarily under that exact name. | 2 | 0 | 4 | 1 | 0 |
A class should be able to warrant its own trivial relocatability as proposed in this paper (P1144R0), but via a contextual keyword rather than an attribute. | 0 | 2 | 3 | 3 | 0 |
If a trait with the semantics of is added to the header, the programmer should be permitted to specialize it for program-defined types (i.e., we want to see that trait itself become a customization point analogous to ). | 0 | 1 | 0 | 1 | 5 |
Trivial relocatability should be assumed by default. Classes such as those in Appendix C should indicate their non-trivial relocatability via an opt-in mechanism. | 0 | 0 | 0 | 3 | 5 |
To simplify Conditionally trivial relocation, if an attribute with the semantics of is added, it should take a boolean argument. | 1 | 1 | 3 | 2 | 0 |
The algorithm should be added to the header, as proposed in this paper (P1144R0). | 0 | 4 | 1 | 1 | 0 |
The type trait (and its version) should be added to the header, as proposed in this paper (P1144R0). | 0 | 2 | 3 | 0 | 1 |
If is added, then we should also add (and its version), as proposed in this paper (P1144R0). | 1 | 4 | 2 | 0 | 0 |
The type trait (and its version) should be added to the header, under that exact name, as proposed in this paper (P1144R0). | 3 | 3 | 1 | 0 | 0 |
We approve of a trait with the semantics of , but not necessarily under that exact name. (For example, .) | 3 | 3 | 0 | 1 | 0 |
If is added, under that exact name, then the type trait (and its version) should also be added to the header. | 0 | 3 | 3 | 0 | 0 |
The "Strongly Against" vote on poll 1 was due to concerns that P1144 permits a class to warrant its own trivial relocatability, overruling the compiler’s assumptions, not only when the compiler’s assumptions are based on the presence of special members, but also when the compiler’s assumptions are based partly or wholly on the non-triviality of member or base subobjects. See further discussion under § 6.3 Attribute [[maybe_trivially_relocatable]].
The "Against" vote on poll 10,
, was due to its exception guarantee, which was
weaker in P1144R0. P1144R1 strengthened the guarantee (and tightened the constraint on the source
iterator from
to
) to better match the other
algorithms.
The "Strongly Against" vote on poll 11,
, was from a desire to save the name
for something different, such as a built-in destructive-move operation.
Poll taken of EWGI at San Diego on 2018-11-07
Poll | SF | F | N | A | SA |
---|---|---|---|---|---|
Should we commit additional committee time to solving the problem P1144R0 is trying to solve, knowing it will leave less time to other work? | 8 | 3 | 0 | 0 | 0 |
Polls taken of SG14 at CppCon on 2018-09-26
Poll | SF | F | N | A | SA |
---|---|---|---|---|---|
The type trait (and its version) should be added to the header, under that exact name, as proposed in this paper. | 1 | 20 | 7 | 1 | 0 |
We approve of a trait with the semantics of , but not necessarily under that exact name. (For example, .) | 15 | 12 | 1 | 0 | 0 |
We approve of the general idea that user-defined classes should be able to warrant their own trivial relocatability. | 25 | 5 | 2 | 0 | 0 |
Appendix B: Sample code
Reference implementation of relocate_at and uninitialized_relocate
    template<class T>
    T *relocate_at(T *source, T *dest)
    {
        if constexpr (std::is_trivially_relocatable_v<T>) {
            std::memmove(dest, source, sizeof(T));
            return std::launder(dest);
        } else {
            T *result = ::new (dest) T(std::move(*source));
            source->~T();
            return result;
        }
    }

    template<class ForwardIterator1, class ForwardIterator2>
    ForwardIterator2 uninitialized_relocate(ForwardIterator1 first, ForwardIterator1 last,
                                            ForwardIterator2 result)
    {
        using T = typename std::iterator_traits<ForwardIterator2>::value_type;
        using U = decltype(std::move(*first));
        constexpr bool memcpyable = (std::is_same_v<T, std::remove_reference_t<U>> &&
                                     std::is_trivially_relocatable_v<T>);
        constexpr bool both_contiguous = (std::is_pointer_v<ForwardIterator1> &&
                                          std::is_pointer_v<ForwardIterator2>);
        constexpr bool nothrow_relocatable = std::is_nothrow_constructible_v<T, U>;

        if constexpr (memcpyable && both_contiguous) {
            // Contiguous ranges of a trivially relocatable type: one bytewise copy.
            std::size_t nbytes = (char *)last - (char *)first;
            if (nbytes != 0) {
                std::memmove(std::addressof(*result), std::addressof(*first), nbytes);
                result += (last - first);
            }
        } else if constexpr (nothrow_relocatable) {
            // Nothing can throw, so each element can be moved and destroyed in turn.
            for (; first != last; (void)++result, ++first) {
                ::new (static_cast<void*>(std::addressof(*result))) T(std::move(*first));
                std::destroy_at(std::addressof(*first));
            }
        } else {
            // A move may throw: move everything first, destroy the source only on success.
            result = std::uninitialized_move(first, last, result);
            std::destroy(first, last);
        }
        return result;
    }
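As a usage illustration, the following sketch shows how a container's buffer growth might relocate its elements. It is written against standard C++17 only, so the trait is a conservative stand-in and relocate_n and grow_preserves_elements are hypothetical helpers; none of these names come from the proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {

// ASSUMPTION: conservative stand-in for the proposed trait.
template <class T>
inline constexpr bool is_trivially_relocatable_v = std::is_trivially_copyable_v<T>;

// Relocates n objects from old_data into uninitialized storage at new_data.
// Afterwards no object is alive in [old_data, old_data + n).
template <class T>
void relocate_n(T* old_data, std::size_t n, T* new_data) {
    if constexpr (is_trivially_relocatable_v<T>) {
        if (n != 0) {
            std::memcpy(static_cast<void*>(new_data),
                        static_cast<const void*>(old_data), n * sizeof(T));
        }
    } else {
        static_assert(std::is_nothrow_move_constructible_v<T>,
                      "this sketch handles only nothrow moves");
        for (std::size_t i = 0; i != n; ++i) {
            ::new (static_cast<void*>(new_data + i)) T(std::move(old_data[i]));
            std::destroy_at(old_data + i);
        }
    }
}

// Builds a two-element buffer, grows it, and checks the elements survived.
template <class T>
bool grow_preserves_elements(const T& a, const T& b) {
    std::allocator<T> alloc;
    T* small = alloc.allocate(2);
    ::new (static_cast<void*>(small)) T(a);
    ::new (static_cast<void*>(small + 1)) T(b);

    T* big = alloc.allocate(8);
    relocate_n(small, 2, big);
    alloc.deallocate(small, 2);  // no destructor calls: those lifetimes already ended

    bool ok = (big[0] == a && big[1] == b);
    std::destroy_n(big, 2);
    alloc.deallocate(big, 8);
    return ok;
}

}  // namespace sketch
```

The point of the design is visible in the deallocate call: after relocation the old buffer is released without running any destructors, whichever branch was taken.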
Conditionally trivial relocation
We expect, but do not require, that std::optional<T> should be trivially relocatable
if and only if T itself is trivially relocatable.
The following abbreviated implementation shows how to achieve an optional<T> which
has the same trivial-move-constructibility as T, the same trivial-destructibility
as T, and the same trivial-relocatability as T.
The primitives of move-construction and destruction are provided by four specializations
of optional_base<T>; then the public optional<T> extends the appropriate specialization of
optional_base<T> and applies a conditional [[trivially_relocatable]] attribute.
    // Declared first so that optional<T> can name it as a base class.
    template<class T,
             bool D = is_trivially_destructible_v<T>,
             bool M = is_trivially_move_constructible_v<T>>
    class optional_base {
        // NOT SHOWN
    };

    template<class T>
    class [[trivially_relocatable(is_trivially_relocatable_v<T>)]] optional : optional_base<T> {
        using optional_base<T>::optional_base;
    };
I have implemented the entire Standard Library using the proposed [[trivially_relocatable]]
attribute; you can find the source code on my GitHub and explore the resulting codegen on Godbolt Compiler Explorer.
I have also implemented case studies for
and
, in each of the
alternative styles:
Style | Size of diff (lines) | Size of diff (lines) |
---|---|---|
| -2, +14 | -18, +52 |
| problematic | -5, +5 |
| -1, +1 | -1, +17 |
For why one entry in this table is "problematic," see § 6.3 Attribute [[maybe_trivially_relocatable]].
Appendix C: Examples of non-trivially relocatable class types
Class contains pointer to self
This fictional short_string illustrates a mechanism that can apply
to any small-buffer-optimized class. libc++'s std::function uses this mechanism (on a 24-byte buffer) and is thus not trivially relocatable.
However, different mechanisms for small-buffer optimization exist. libc++'s std::any also achieves small-buffer optimization on a 24-byte buffer, without (necessarily) sacrificing trivial relocatability.
    struct short_string {
        char *data_ = buffer_;
        size_t size_ = 0;
        char buffer_[8] = {};

        const char *data() const { return data_; }

        short_string() = default;
        short_string(const char *s) : size_(strlen(s)) {
            if (size_ < sizeof buffer_)
                strcpy(buffer_, s);
            else
                data_ = strdup(s);
        }
        short_string(short_string&& s) {
            memcpy(this, &s, sizeof(*this));
            if (s.data_ == s.buffer_)
                data_ = buffer_;
            else
                s.data_ = nullptr;
        }
        ~short_string() {
            if (data_ != buffer_)
                free(data_);
        }
    };
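To see concretely why a bytewise copy is wrong for such a class, the following sketch copies the object representation of a small string into a second buffer and then inspects the copied bytes. The type is a trimmed, hypothetical variant of the class above (malloc instead of strdup, no move constructor), and the helper name is invented for illustration. The sketch deliberately never treats the second buffer as a live object; it only reads the pointer value out of it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

namespace sketch {

// Trimmed variant of the small-buffer idea: data_ points at buffer_ whenever
// the text (plus terminator) fits in the small buffer.
struct short_string {
    char* data_ = buffer_;
    std::size_t size_ = 0;
    char buffer_[8] = {};

    explicit short_string(const char* s) : size_(std::strlen(s)) {
        if (size_ >= sizeof buffer_) {
            data_ = static_cast<char*>(std::malloc(size_ + 1));
            if (data_ == nullptr) throw std::bad_alloc();
        }
        std::memcpy(data_, s, size_ + 1);
    }
    short_string(const short_string&) = delete;
    short_string& operator=(const short_string&) = delete;
    ~short_string() {
        if (data_ != buffer_) std::free(data_);
    }
};

inline bool bytewise_copy_points_at_old_buffer() {
    alignas(short_string) unsigned char a[sizeof(short_string)];
    alignas(short_string) unsigned char b[sizeof(short_string)];

    short_string* src = ::new (static_cast<void*>(a)) short_string("hi");
    std::memcpy(b, a, sizeof(short_string));  // what a naive memcpy-relocation would do

    // Read the copied data_ member out of b's bytes.
    char* copied_data = nullptr;
    std::memcpy(&copied_data, b + offsetof(short_string, data_), sizeof copied_data);
    const char* new_buffer =
        reinterpret_cast<const char*>(b + offsetof(short_string, buffer_));

    // The copied pointer still refers to the source's buffer, not to the
    // buffer that now sits next to it in b.
    bool dangles = (copied_data == src->buffer_) && (copied_data != new_buffer);
    src->~short_string();
    return dangles;
}

}  // namespace sketch
```

A correct relocation must redirect data_ to the destination's own buffer_, which is exactly the fixup the user-provided move constructor performs and memcpy cannot.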
Allocated memory contains pointer to self
std::list needs somewhere to store its "past-the-end" node, commonly referred to
as the "sentinel node," whose prev pointer points to the list’s last node.
If the sentinel node is allocated on the heap, then std::list can be trivially
relocatable; but if the sentinel node is placed within the list object itself
(as happens on libc++ and libstdc++), then relocating the list object requires
fixing up the list’s last node’s next pointer so that it points to the
new sentinel node inside the destination list object. This fixup of an arbitrary
heap object cannot be simulated by memcpy.
Traditional implementations of std::set and std::map also store a "past-the-end"
node inside themselves and thus also fall into this category.
    struct node {
        node *prev_ = nullptr;
        node *next_ = nullptr;
    };
    struct list {
        node n_;
        iterator begin() { return iterator(n_.next_); }
        iterator end() { return iterator(&n_); }
        list(list&& l) {
            if (l.n_.next_) l.n_.next_->prev_ = &n_;  // fixup
            if (l.n_.prev_) l.n_.prev_->next_ = &n_;  // fixup
            n_ = l.n_;
            l.n_ = node{};
        }
        // ...
    };
Class invariant depends on this
The offset_ptr provided by [Boost.Interprocess] is an example of this category.
    struct offset_ptr {
        uintptr_t value_;

        uintptr_t here() const { return uintptr_t(this); }
        uintptr_t distance_to(void *p) const { return uintptr_t(p) - here(); }
        void *get() const { return (void*)(here() + value_); }

        offset_ptr() : value_(distance_to(nullptr)) {}
        offset_ptr(void *p) : value_(distance_to(p)) {}
        offset_ptr(const offset_ptr& rhs) : value_(distance_to(rhs.get())) {}
        offset_ptr& operator=(const offset_ptr& rhs) {
            value_ = distance_to(rhs.get());
            return *this;
        }
        ~offset_ptr() = default;
    };
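A short sketch makes the dependence on this observable. The type below is a reduced, hypothetical version of the class above, and copying value_ by hand stands in for what memcpy would do to the object representation.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Reduced version of the idea above: value_ is the distance from the
// pointer object itself to its target.
struct offset_ptr {
    std::uintptr_t value_;

    std::uintptr_t here() const { return reinterpret_cast<std::uintptr_t>(this); }
    void* get() const { return reinterpret_cast<void*>(here() + value_); }

    explicit offset_ptr(void* p)
        : value_(reinterpret_cast<std::uintptr_t>(p) - here()) {}
    offset_ptr(const offset_ptr& rhs)
        : value_(reinterpret_cast<std::uintptr_t>(rhs.get()) - here()) {}
};

inline bool bytewise_copy_retargets() {
    int target = 0;
    offset_ptr original(&target);
    offset_ptr proper_copy(original);  // copy constructor recomputes the offset

    offset_ptr bytewise_copy(nullptr);
    bytewise_copy.value_ = original.value_;  // same effect as memcpy of the representation

    // The proper copy still reaches target. The bytewise copy applies the
    // old offset from a new address and therefore lands somewhere else.
    return original.get() == &target &&
           proper_copy.get() == &target &&
           bytewise_copy.get() != &target;
}

}  // namespace sketch
```

The bytewise copy is off by exactly the distance between the two offset_ptr objects, which is why relocation of such a type has to go through its copy or move constructor.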
Program invariant depends on this
In the following snippet, struct Widget is relocatable, but not
trivially relocatable, because the relocation operation of destroying a Widget at point A
and constructing a new Widget at point B has behavior that is observably different
from a simple memcpy.
    std::set<void*> registry;

    struct registered_object {
        registered_object() { registry.insert(this); }
        registered_object(registered_object&&) = default;
        registered_object(const registered_object&) = default;
        registered_object& operator=(registered_object&&) = default;
        registered_object& operator=(const registered_object&) = default;
        ~registered_object() { registry.erase(this); }
    };

    struct Widget : registered_object {};
Polymorphic downcast effectively relies on offset-into-self
Note: This section is new in P1144R5.
Thanks to David Stone for this example.
In the following snippet, Base is relocatable, but not trivially relocatable,
because its copy constructor and assignment operator do not copy the entire state of the
right-hand object. (Notice that pf is initialized with f, not with a copy of o.pf.)
    struct Base {
        static int f(Base*) { return 21; }
        int (*pf)(Base*);
        Base(int (*pf)(Base*) = f) : pf(pf) {}
        Base(const Base& o) : pf(f) {}
        Base& operator=(const Base&) { return *this; }
    };
    struct Derived : Base {
        static int f(Base *self) { return ((Derived*)self)->x; }
        Derived() : Base(f) {}
        Derived(const Derived&) = default;
        Derived& operator=(const Derived& o) { x = o.x; return *this; }
        int x = 42;
    };

    int main() {
        Base&& d = Derived();
        Base&& b = Base();
        std::swap(b, d);
        printf("%d\n", b.pf(&b));
    }
The above snippet is isomorphic to a classically polymorphic hierarchy
with virtual methods. Here is the same snippet using virtual:
    struct Base {
        virtual int f() { return 21; }
    };
    struct Derived : Base {
        int f() override { return x; }
        int x = 42;
    };

    int main() {
        Base&& b = Base();
        Base&& d = Derived();
        std::swap(b, d);
        printf("%d\n", b.f());
    }
This is why (as of P1144R5) the compiler will not consider types with virtual methods to be "naturally" trivially relocatable.
Appendix D: Implementation experience
A prototype Clang/libc++ implementation is at
-
github.com/Quuxplusone/llvm-project/tree/trivially-relocatable
-
godbolt.org, under the name "x86-64 clang (experimental P1144)"
Side-by-side case studies of
,
,
and
are given in Conditionally trivial relocation.
As of November 2018, libstdc++ performs the
optimization
for any type which has manually specialized
.
(See it on godbolt.org here.)
Manual specialization is also the approach used by [EASTL], [Folly], and [BSL].
As of 2020-03-01, the only libstdc++ library type for which
has
been specialized is
; see [Deque] and § 6.5 Unintuitive is_nothrow_relocatable.