1. Changelog
-
R7 (post-Issaquah 2023):
-
[P2786R0] debuted at Issaquah; it raised several open questions (see Appendix E: Open questions). The authors of P1144 and P2786 plan a joint paper to resolve some of these design issues.
-
Added
, the building block ofstd :: uninitialized_relocate_backward
.vector :: insert -
Removed "move-constructible, destructible" from the requirements in basic.types.general. Now trivially relocatable types, like trivially copyable types, can be non-relocatable.
-
Added table §2.1 directly comparing this proposal to Folly, BSL, and Qt.
-
Added straw poll results from Issaquah; removed much background and discussion (leaving only references for the interested reader to consult [P1144R6]).
-
Jonathan Hopkins points out that (contrary to the English description in [N2271]) EASTL defines
as a synonym forhas_trivial_relocate
, thus it’s not prior art for this notion. (In particular, EASTL considersis_trivially_move_assignable
non-trivially relocatable.) Remove mentions of EASTL.unique_ptr < int > -
Added Appendix E.
-
-
R6:
-
Added
.T std :: relocate ( T * source ) -
Changed the exception guarantee of
anduninitialized_relocate
in accord with feedback from David Stone.uninitialized_relocate_n -
Removed
(but keptstd :: ranges :: relocate_at
).std :: relocate_at
-
-
R5 (post-Prague 2020):
-
Polymorphic types (those with any vptr at all) inhibit "natural" trivial relocatability. David Stone gave the motivating example. See Polymorphic downcast effectively relies on offset-into-self.
-
Updated
. Added semantic requirements on assignment and some discussion of the issue withconcept relocatable
.vector :: insert
-
-
R3 (post-Kona 2019):
-
User-provided copy constructors and copy/move assignment operators inhibit "natural" trivial relocatability. This permits optimizing
andvector :: insert
.vector :: erase -
Adopted
.[[ trivially_relocatable ( bool )]] -
Added
.std :: relocate_at ( source , dest ) -
Removed the core-language blanket wording that permitted "the implementation" to eliminate moves and destroys of trivially relocatable types. Instead, the existence of
and handwaviness of e.g. [vector.capacity] combine to strongly imply (but still not to mandate) that library vendors will usestd :: relocate_at
wherever it helps with performance. We no longer propose to permit eliminating moves and destroys in e.g.std :: relocate_at
statements, except as already permitted under the as-if rule.return
-
2. Introduction and motivation
Given an object type
and memory addresses
and
,
the phrase "relocate a
from
to
" means
"move-construct
from
, and then immediately destroy the object at
."
Any type which is both move-constructible and destructible is relocatable.
A type is nothrow relocatable if its relocation operation is noexcept, and a type
is trivially relocatable if its relocation operation is trivial (which,
just like trivial move-construction and trivial copy-construction, means
"tantamount to
").
In practice, almost all relocatable types are trivially relocatable:
,
,
(on libc++),
(on MSVC).
Examples of non-trivially relocatable types include
(on libstdc++ and libc++),
(on libstdc++ and MSVC),
(everywhere). See Appendix C: Examples of non-trivially relocatable class types.
P1144 allows the library programmer to warrant to the compiler that a resource-management type is trivially relocatable. Explicit warrants are rarely needed because the compiler can infer trivial relocatability for Rule-of-Zero types. See § 4.5 Trivially relocatable type [basic.types.general].
The most important thing about P1144 relocation is that it is backward compatible and does not break either API or ABI. My intention is simply to legalize the well-understood tricks that many industry codebases ([BSL], [Folly], [Deque]) are already doing in practice. P1144 is not intended to change the behavior of any existing source code (except to speed it up), nor does it require major work from standard library vendors.
2.1. Optimizations enabled by trivial relocatability
The following optimizations are possible according to P1144R7’s notion of trivial relocatability. Here’s who does these optimizations in practice:
vector realloc | type erasure | fixed-cap move | vector
| vector
|
|
| |
Arthur’s libc++
|
| no | N/A | not yet |
| yes, uses
|
|
[BSL]
|
| no | ? |
|
| , unused by
| no |
[Folly]
|
| no | proposed for
|
|
| N/A | N/A |
[Qt]
|
|
|
|
|
|
| N/A |
2.1.1. Contiguous reallocation
Trivial relocation can be used anywhere we do the moral equivalent of
,
such as in
.
[Bench] (presented at C++Now 2018) shows a 3x speedup on
. This Reddit thread demonstrates a similar 3x speedup using the online tool Quick-Bench.
2.1.2. Moving in-place/SBO type-erased types like any
and function
Trivial relocation can be used to de-duplicate the code generated by type-erasing wrappers
like
,
, and
.
For these types, a move of the wrapper object is implemented in terms of a relocation of the contained object. (See for example libc++'s
.)
In general, the relocate operation must be uniquely codegenned for each
different contained type
, leading to code bloat. But a single instantiation suffices to relocate every trivially relocatable
in the program.
2.1.3. Moving fixed-capacity containers like static_vector
and small_vector
The move-constructor of
,
can be implemented naïvely as an element-by-element move (leaving the source vector’s elements in their moved-from state),
or efficiently as an element-by-element relocate (leaving the source vector empty).
For a detailed analysis of this case, see [FixedCapacityVector].
currently implements the
naïve element-by-element-move strategy, but after LEWG feedback, [P0843R5]
does permit the faster relocation strategy.
2.1.4. Contiguous insert and erase
Trivial relocation can be used anywhere we shift a contiguous range left or right,
such as in
(i.e., destroy the erased elements and "close the window"
by memmoving left) and
(i.e., "make a window" by memmoving right and
then construct the new elements in place). [Folly]'s
is prior art
for this optimization; see
.
§ 5.2 Confusing interactions with std::pmr determines whether this optimization will be possible in Standard C++.
2.1.5. Swap
Given a reliable way of detecting trivial relocatability,
we can optimize any routine that uses the moral equivalent of
, such as
,
, or
.
Optimizing
produces massive code-size improvements for all swap-based
algorithms, including
and
.
See @19:56–21:22 in my C++Now 2019 talk,
and see this Godbolt. This Quick-Bench shows a 25% speedup
in
when it is allowed to use bitwise swap on a Rule-of-Zero type.
But § 5.2 Confusing interactions with std::pmr determines whether this optimization will be possible in Standard C++.
2.2. Assertions, not assumptions
Some data structures might reasonably assert the trivial relocatability of their elements, just as they sometimes assert the stronger property of trivial copyability today. This might help them guarantee reasonable performance, or guarantee exception-safety.
2.3. Eliminate manual warrants, improve safety
Many real-world codebases contain templates which require
trivial relocatability of their template parameters, but cannot today verify trivial relocatability. For example, [Folly] requires the programmer to warrant the trivial
relocatability of any type stored in a
:
class Widget { std :: vector < int > lst_ ; }; folly :: fbvector < Widget > vec ; // FAILS AT COMPILE TIME for lack of warrant
This merely encourages the programmer to add the warrant and continue. An incorrect warrant will be discovered only at runtime, via undefined behavior. See Allocated memory contains pointer to self, [FollyIssue889], and (most importantly) @27:26–31:47 in my C++Now 2019 talk.
class Gadget { std :: list < int > lst_ ; }; // sigh, add the warrant on autopilot template <> struct folly :: IsRelocatable < Gadget > : std :: true_type {}; folly :: fbvector < Gadget > vec ; // CRASHES AT RUNTIME due to fraudulent warrant
If this proposal is adopted, then Folly can start using
in the implementation of
, and the programmer can stop writing explicit warrants.
Finally, the programmer can start writing assertions of correctness, which aids maintainability and
can even find real bugs. Example:
class Widget { std :: vector < int > lst_ ; }; static_assert ( std :: is_trivially_relocatable_v < Widget > ); // correctly SUCCEEDS class Gadget { std :: list < int > lst_ ; }; static_assert ( std :: is_trivially_relocatable_v < Gadget > ); // correctly ERRORS
The improvement in user experience and safety in real-world codebases ([Folly], [BSL], [Qt], etc.) is the most important benefit to be gained by this proposal.
3. Design goals
Every C++ type already is or is not trivially relocatable. This proposal does not require any library vendor to make any library type trivially relocatable. (We assume that quality implementations will do so on their own.)
The optimizations above are in the domain of library writers. If you’re writing
a vector, and you detect that your element type
is trivially relocatable, then
whether you optimize in that case is up to you.
This proposal does not require any library vendor to guarantee that any particular optimization
happens. (We assume that quality implementations will do so on their own.)
What C++ lacks is a standard way for library writers to detect the (existing) trivial relocatability
of a type
, so that they can reliably apply their (existing) optimizations.
All we really need is to add detection, and then all the optimizations described above will naturally
emerge without any further special effort by WG21.
There are three kinds of object types that we want to make sure are correctly detected as
trivially relocatable. These three cases are important for improving the performance of
the standard library, and for improving the correctness of libraries such as [Folly]'s
.
3.1. Standard library types such as std :: string
In order to optimize
, we must come up with a way to achieve
This could be done unilaterally by the library vendor — via a non-standard attribute (#include <string>static_assert ( std :: is_trivially_relocatable_v < std :: string > );
[[ clang :: trivially_relocatable ]]
), or a member typedef with a reserved name,
or simply a vendor-provided specialization of std :: is_trivially_relocatable < std :: string >
.
That is, we can in principle solve §3.1 while confining our "magic" to the headers of the implementation itself. The programmer doesn’t have to learn anything new, so far.
3.2. Program-defined types that follow the Rule of Zero
In order to optimize the SBO
in any meaningful sense,
we must come up with a way to achieve
Lambdas are not a special case in C++; they are simply class types with all their special members defaulted. Therefore, lambdas can use the same solution as#include <string>auto lam2 = [ x = std :: string ( "hello" )]{}; static_assert ( std :: is_trivially_relocatable_v < decltype ( lam2 ) > );
Here#include <string>struct A { std :: string s ; }; static_assert ( std :: is_trivially_relocatable_v < A > );
struct A
follows the Rule of Zero: its move-constructor and destructor are both defaulted.
If they were also trivial, then we’d be done. In fact they are non-trivial; and yet, because the type’s
bases and members are all of trivially relocatable types, the type as a whole is trivially relocatable.
§3.2 asks that we make the
succeed without breaking the "Rule of Zero."
We do not want to require the programmer to annotate
with a special attribute, or
a special member typedef, or anything like that. We want it to Just Work.
This is a harder problem than §3.1; it requires standard support in the core language.
But it still does not require any new syntax.
3.3. Program-defined types with non-defaulted special members
In order to optimize
,
we must come up with a way to achieve
via some standard annotation applied to class typestruct B { B ( B && ); // non-trivial ~ B (); // non-trivial }; static_assert ( std :: is_trivially_relocatable_v < B > );
B
(which in this example is standing in for boost :: shared_ptr
).
We cannot possibly do it without annotation, because there exist
examples of types that look just like B
and are trivially relocatable
(e.g. libstdc++'s std :: function
)
and there exist types that look just like B
and are not trivially relocatable
(e.g. libc++'s std :: function
).
So, without some kind of opt-in annotation, we cannot achieve our goal.
This paper proposes a standard syntax for §3.3, which in turn provides a simple and portable way for library vendors to implement §3.1.
4. Proposed wording for C++2b
The wording in this section is relative to WG21 draft N4910.
Note: There is no difficulty in changing the attribute syntax to a contextual-keyword syntax; the only downsides are aesthetic. We can defer that decision to the last minute, according to CWG’s feedback on the final wording.
4.1. Nothrow bidirectional iterator [algorithms.requirements]
Modify [algorithms.requirements] as follows:
If an algorithm’s template parameter is named
,
ForwardIterator ,
ForwardIterator1 , or
ForwardIterator2 , the template argument shall meet the Cpp17ForwardIterator requirements ([forward.iterators]) if it is required to be a mutable iterator, or model
NoThrowForwardIterator ([iterator.concept.forward]) otherwise.
forward_iterator If an algorithm’s template parameter is named
, the template argument is also required to have the property that no exceptions are thrown from increment, assignment, or comparison of, or indirection through, valid iterators.
NoThrowForwardIterator If an algorithm’s template parameter is named
,
BidirectionalIterator ,
BidirectionalIterator1 or, or
BidirectionalIterator2 , the template argument shall meet the Cpp17BidirectionalIterator requirements ([bidirectional.iterators]) if it is required to be a mutable iterator, or model
NoThrowBidirectionalIterator ([iterator.concept.bidir]) otherwise.
bidirectional_iterator - If an algorithm’s template parameter is named
, the template argument is also required to have the property that no exceptions are thrown from increment, decrement, assignment, or comparison of, or indirection through, valid iterators.
NoThrowBidirectionalIterator
4.2. Relocation operation [defns.relocation]
Add a new section in [intro.defs]:
- relocation operation
the homogeneous binary operation performed by
, consisting of a move construction immediately followed by a destruction of the source object
std :: relocate_at
4.3. relocate_at
and relocate
[specialized.relocate]
Add a new section after [specialized.destroy]:
template < class T > T * relocate_at ( T * source , T * dest ); Mandates:
shall be a complete non-array object type.
T Effects: Equivalent to:
struct guard { T * t ; ~ guard () { destroy_at ( t ); } } g ( source ); return :: new ( voidify ( * dest )) T ( std :: move ( * source )); except that if
is trivially relocatable [basic.types], side effects associated with the relocation of the value of
T might not happen.
* source template < class T > T relocate ( T * source ); Mandates:
shall be a complete non-array object type.
T Effects: Equivalent to:
T t = move ( source ); destroy_at ( source ); return t ; except that if
is trivially relocatable [basic.types], side effects associated with the relocation of the object’s value might not happen.
T
Note: These functions have both been implemented in my libc++ fork; for
, see godbolt.org/z/cqPP4oeE9 and [StdRelocateIsCute]. My implementation of their "as-if-by-memcpy" codepaths relies on
Clang’s
; vendors can use any vendor-specific means to implement them.
The wording also deliberately permits a low-quality implementation with no such codepath at all.
See @45:23–48:39 in my C++Now 2019 talk.
4.4. uninitialized_relocate
, uninitialized_relocate_n
, uninitialized_relocate_backward
[uninitialized.relocate]
Add a new section after [uninitialized.move]:
template < class InputIterator , class NoThrowForwardIterator > NoThrowForwardIterator uninitialized_relocate ( InputIterator first , InputIterator last , NoThrowForwardIterator result ); Effects: Equivalent to:
try { for (; first != last ; ++ result , ( void ) ++ first ) { :: new ( voidify ( * result )) typename iterator_traits < NoThrowForwardIterator >:: value_type ( std :: move ( * first )); destroy_at ( addressof ( * first )); } return result ; } catch (...) { destroy ( ++ first , last ); throw ; } except that if the iterators' common value type is trivially relocatable, side effects associated with the relocation of the object’s value might not happen.
Remarks: If an exception is thrown, all objects in both the source and destination ranges are destroyed.
template < class InputIterator , class Size , class NoThrowForwardIterator > pair < InputIterator , NoThrowForwardIterator > uninitialized_relocate_n ( InputIterator first , Size n , NoThrowForwardIterator result ); Effects: Equivalent to:
try { for (; n > 0 ; ++ result , ( void ) ++ first , -- n ) { :: new ( voidify ( * result )) typename iterator_traits < NoThrowForwardIterator >:: value_type ( std :: move ( * first )); destroy_at ( addressof ( * first )); } return { first , result }; } catch (...) { destroy_n ( ++ first , -- n ); throw ; } except that if the iterators' common value type is trivially relocatable, side effects associated with the relocation of the object’s value might not happen.
Remarks: If an exception is thrown, all objects in both the source and destination ranges are destroyed.
template < class BidirectionalIterator , class NoThrowBidirectionalIterator > NoThrowBidirectionalIterator uninitialized_relocate_backward ( BidirectionalIterator first , BidirectionalIterator last , NoThrowBidirectionalIterator result ); Effects: Equivalent to:
try { for (; last != first ; ) { -- last ; -- result ; :: new ( voidify ( * result )) typename iterator_traits < NoThrowBidirectionalIterator >:: value_type ( std :: move ( * last )); destroy_at ( addressof ( * last )); } return result ; } catch (...) { destroy ( first , ++ last ); throw ; } except that if the iterators' common value type is trivially relocatable, side effects associated with the relocation of the object’s value might not happen.
Remarks: If an exception is thrown, all objects in both the source and destination ranges are destroyed.
Note: The Remarks allude to blanket wording in [specialized.algorithms.general]/2.
4.5. Trivially relocatable type [basic.types.general]
Add a new section in [basic.types.general]:
An object typeis a trivially relocatable type if it is:
T
a trivially copyable type, or
an array of trivially relocatable type, or
a (possibly cv-qualified) class type declared with a
attribute with value
trivially_relocatable true
[dcl.attr.trivreloc], ora (possibly cv-qualified) class type which:
has no user-provided move constructors or move assignment operators,
has no user-provided copy constructors or copy assignment operators,
has no user-provided destructors,
has no virtual member functions,
has no virtual base classes,
all of whose members are either of reference type or of trivially relocatable type, and
all of whose base classes are of trivially relocatable type.
[Note: For a trivially relocatable type, the relocation operation (as performed by, for example, the library functions
and
std :: swap ) is tantamount to a simple copy of the underlying bytes. —end note]
std :: vector :: resize [Note: It is intended that most standard library types be trivially relocatable types. —end note]
Note: Polymorphic types are disallowed from "natural" trivial relocatability. See Appendix C, example 5. Volatile members are not disallowed. See [Subobjects].
4.6. [[ trivially_relocatable ]]
attribute [dcl.attr.trivreloc]
Add a new section after [dcl.attr.nouniqueaddr]:
The attribute-tokenspecifies that a class type’s relocation operation has no visible side-effects other than a copy of the underlying bytes, as if by the library function
trivially_relocatable . It may be applied to the definition of a class. It shall appear at most once in each attribute-list. An attribute-argument-clause may be present and, if present, shall have the form
std :: memcpy The constant-expression shall be an integral constant expression of type( constant - expression ) . If no attribute-argument-clause is present, it has the same effect as an attribute-argument-clause of
bool .
( true) If any definition of a class type has a
attribute with value V, then each definition of the same class type shall have a
trivially_relocatable attribute with value V. No diagnostic is required if definitions in different translation units have mismatched
trivially_relocatable attributes.
trivially_relocatable If a type
is declared with the
T attribute, and
trivially_relocatable is either not move-constructible or not destructible, the program is ill-formed.
T If a class type is declared with the
attribute, and the program relies on observable side-effects of its relocation other than a copy of the underlying bytes, the behavior is undefined.
trivially_relocatable
"If a type
is declared with the
attribute, and
is either not move-constructible
or not destructible, the program is ill-formed." We might want to replace this wording with
a mere "Note" encouraging implementations to diagnose.
See this example where a diagnostic might be unwanted.
4.7. Type traits is_relocatable
etc. [meta.unary.prop]
Add new entries to Table 47 in [meta.unary.prop]:
Template Condition Preconditions
template < class T > struct is_relocatable ; is
is_move_constructible_v < T > true
andis
is_destructible_v < T > true
T shall be a complete type, cv , or an array of unknown bound.
void
template < class T > struct is_nothrow_relocatable ; is
is_relocatable_v < T > true
and both the indicated move-constructor and the destructor are known not to throw any exceptions.T shall be a complete type, cv , or an array of unknown bound.
void
template < class T > struct is_trivially_relocatable ; is a trivially relocatable type.
T T shall be a complete type, cv , or an array of unknown bound.
void
4.8. relocatable
concept [concept.relocatable]
Add a new section after [concept.copyconstructible]:
template < class T > concept relocatable = move_constructible < T > ; If
is an object type, then let
T be an rvalue of type
rv ,
T an lvalue of type
lv equal to
T , and
rv a distinct object of type
u2 equal to
T .
rv models
T only if
relocatable
After the definition
,
T u = rv ; is equal to
u .
u2
is equal to
T ( rv ) .
u2 If the expression
is well-formed, then the expression has the same semantics as
u2 = rv
u2 . ~ T (); :: new (( void * ) std :: addressof ( u2 )) T ( rv ); If the definition
is well-formed, then after the definition
T u = lv ; is equal to
u .
u2 If the expression
is well-formed, then the expression’s result is equal to
T ( lv ) .
u2 If the expression
is well-formed, then the expression has the same semantics as
u2 = lv
u2 . ~ T (); :: new (( void * ) std :: addressof ( u2 )) T ( lv );
Note: We intend that a type may be relocatable
regardless of whether it is copy-constructible; but, if it is copy-constructible then copy-and-destroy
must have the same semantics as move-and-destroy. We intend that a type may be relocatable regardless of
whether it is assignable; but, if it is assignable then assignment must have the same semantics as
destroy-and-copy or destroy-and-move.
The semantic requirements on assignment help us optimize
and
.
satisfies
, but it models
only when all relevant objects have equal allocators.
5. Rationale and alternatives
5.1. Attribute [[ maybe_trivially_relocatable ]]
The Clang patch currently available on Godbolt Compiler Explorer supports both
and another attribute called
,
which John McCall requested that I explore.
See P1144R4 section 6.2 for discussion of the
attribute, including the reasons
I do not propose it for standardization.
In Issaquah (February 2023), [P2786R0] suggested a very similar design to
.
EWGI took a three-way straw poll on the design of
versus
,
with an inconclusive 7–5–6 vote (the author of P1144 voting "For" and the three representatives of P2786
presumably voting "Against."
5.2. Confusing interactions with std :: pmr
Note: This section was added in P1144R5 and revised in P1144R7.
Here I assume libc++, where
is trivially relocatable.
Feel free to substitute your favorite trivially relocatable container type;
I’m sticking with
for these examples because it is short
and easy to spell.
P1144 aims to permit efficient insertion and erasure in
. Example:
std :: vector < std :: string > vs ( 501 ); vs . erase ( vs . begin ());
This
requires us to shift down 500
objects, which can be done either
by 500 calls to
followed by one call to
, or by
one call to
followed by a
(as seen for example in [Folly] and BSL;
see table §2.1). We want to permit BSL’s implementation.
That’s why since P1144R5, the definition of
in §4.4 places semantic requirements on
's assignment operators (if they exist) as well as on
's constructors and destructors.
is a trivially relocatable type:
std :: vector < std :: pmr :: polymorphic_allocator < int >> va ; va . emplace_back ( & mr1 ); va . emplace_back ( & mr2 ); va . emplace_back ( & mr3 ); va . erase ( va . begin ()); // A
Line "A" can use trivial relocation (
) because it is fast and correct:
after line "A" we will have
.
But an STL container using a
suddenly becomes problematic:
std :: vector < std :: pmr :: string > vp ; vv . emplace_back ( "1" , & mr1 ); vv . emplace_back ( "2" , & mr2 ); vv . emplace_back ( "3" , & mr3 ); vv . reserve ( 1000 ); // B vv . erase ( vv . begin ()); // C
Line "B" can certainly use trivial relocation (
) because it is fast and correct.
But on line "C", the difference between
and
is detectable:
the former makes
and the latter would make
. Finally, hypothetically, if
were
implemented in terms of a single
followed by a call to
, we’d have UB here
because it is UB to
strings with unequal allocators.
For now, P1144R7’s proposed wording implies that
must not be marked as trivially relocatable.
But if we add some blanket wording setting appropriate preconditions on
(for example, permitting
and
to be implemented in terms of
,
or forbidding STL containers from containing PMR types with unequal allocators), then
vendors could make
trivially relocatable.
What new wording is needed to achieve this?
Note: Regardless,
itself is trivially relocatable.
The problem above arises from
etc., which affect the behavior of
the container’s
. The author of the container makes
the choice whether to respect POCMA/POCCA/POCS, and also makes the choice
when to warrant trivial relocatability. These choices are correlated, and so it is natural that
they should be made by the same person, at the same place in the source code.
5.3. Overlapping base-class subobjects
Note: This section was added in P1144R7.
In the following snippet,
fails to be trivially relocatable in practice
because its assignment operator doesn’t overwrite all
bytes of the
destination object. The tail padding of
is allowed to contain the value of
. (Godbolt.)
struct Dangerous { Dangerous ( int i , int j ) : i ( i ), j ( j ) {} int i ; short j ; }; static_assert ( std :: is_standard_layout_v < Dangerous > ); static_assert ( std :: is_trivially_relocatable_v < Dangerous > ); struct Derived : Dangerous { Derived ( int i , int j , int k ) : Dangerous ( i , j ), k ( k ) {} short k ; }; static_assert ( ! std :: is_standard_layout_v < Derived > ); static_assert ( ! std :: is_trivially_relocatable_v < Derived > ); int main () { Derived a = Derived ( 1 , 2 , 3 ); Derived b = Derived ( 4 , 5 , 6 ); Dangerous & ra = a , & rb = b ; std :: swap ( ra , rb ); assert ( a . k == 3 && b . k == 6 ); }
[Subspace] presented a version of this snippet, which led to the suggestion to consider only
standard-layout types "naturally" trivially relocatable. Unfortunately, that criterion affects only
, whereas it’s
that we can’t trivially swap.
Possible solution #1 is due to [Subspace]: Bring the ABI notion of "data size" into standard C++,
so that we can say that the "data size" of
is less than its "size." Types with "data size"
less than "size" aren’t naturally trivially relocatable. But this is painful, because it makes
false.
Possible solution #2: Don’t expose the ABI notion of "data size," but do take ABI into account
when deciding whether
is naturally trivially relocatable. This makes
implementation-defined in theory (false in practice) —
Possible solution #3: Bring the ABI notion of "data size" into standard C++, and make the trivial
version of
swap only
bytes of memory instead of
bytes.
This makes it difficult for the vendor to benefit from trivial relocation in
(because
just calling
won’t swap all the bytes, which makes it impossible for the optimizer
to coalesce adjacent memcpy operations into one big memcpy).
This makes
true, makes
trivial,
and makes
non-trivial.
Possible solution #4: Disallow
from using trivial relocation unless
it can prove that it’s operating on complete objects. This would also disallow e.g.
from using trivial relocation unless it’s operating on complete objects (e.g.
because it received a contiguous range). This creates fewer surprises, but makes it difficult
for the vendor to benefit from trivial relocation in
(because they can’t just get
it for free by calling
; they’d have to create a second entrypoint
and figure out how to call that instead of
from algorithms that would benefit from it).
This makes
true, makes
non-trivial,
and makes
trivial with heroics.
Possible solution #5 is what [P1144R6] implicitly assumed: Put a precondition on
(both the template and user-defined ADL overloads) that says it’s UB if either argument refers
to an overlapping subobject.
This makes
true, makes
undefined behavior,
and makes
trivial.
The difficulty here is getting WG21 to accept a new precondition on
.
6. Acknowledgements
Thanks to Pablo Halpern for [N4158], to which this paper bears a striking resemblance —
Significantly different approaches to this problem have previously appeared in Rodrigo Castro Campos’s [N2754], Denis Bider’s [P0023R0] (introducing a core-language "relocation" operator), and Niall Douglas’s [P1029R3] (treating trivial relocatability as an aspect of move-construction in isolation, rather than an aspect of the class type as a whole). A less different approach is taken in Mungo Gill & Alisdair Meredith’s [P2786R0].
Thanks to Elias Kosunen, Niall Douglas, John Bandela, and Nicolas Lesser for their feedback on early drafts of P1144R0.
Thanks to Nicolas Lesser and John McCall for their review comments on the Clang implementation [D50119].
Many thanks to Matt Godbolt for allowing me to install my Clang fork on Compiler Explorer (godbolt.org). See also [Announcing].
Thanks to Howard Hinnant for appearing with me on [CppChat], and to Jon Kalb and Phil Nash for hosting us.
Thanks to Marc Glisse for his work integrating a "trivially relocatable" trait into GNU libstdc++ and for answering my questions on GCC bug 87106.
Thanks to Jens Maurer for his feedback on P1144R3 at Kona 2019, and to Corentin Jabot for championing P1144R4 at Prague 2020.
Thanks to Dana Jansens for her contributions re overlapping and non-standard-layout types (see [Subspace]), to Alisdair Meredith for our extensive discussions during the February 2023 drafting of [P2786R0], to Giuseppe D’Angelo and Thiago Maceira for contributing the Qt entries in table §2.1, and to Giuseppe D’Angelo for extensive review comments and discussion.
Appendix A: Straw polls
Polls taken at EWGI at Issaquah on 2023-02-10
Arthur O’Dwyer presented P1144R6. Alisdair Meredith presented [P2786R0] (which proposed a
-style facility, and expressed it as a contextual keyword instead
of an attribute). EWGI took the following straw polls (as well as polls on attribute syntax and on both papers' readiness for EWG).
SF | F | N | A | SA | |
---|---|---|---|---|---|
The problem presented in P1144/P2786 is worth solving. | 10 | 8 | 0 | 0 | 0 |
The problem being introduced in P1144/P2786 should be solved in a more general way instead of as proposed. | 3 | 0 | 5 | 6 | 4 |
The annotation should "trust the user" as in P1144R6’s ("sharp knife"),
instead of diagnosing as in P1144R6’s and P2786R0’s ("dull knife"). Three-way poll.
| — | 7 | 5 | 6 | — |
Polls taken at EWGI at Prague on 2020-02-13
Corentin Jabot championed P1144R4. EWGI discussed P1144R4 and Niall Douglas’s [P1029R3] consecutively, then took the following straw polls (as well as a poll on the attribute syntax).
SF | F | N | A | SA | |
---|---|---|---|---|---|
We believe that P1029 and P1144 are sufficiently different that they should be advanced separately. | 7 | 3 | 2 | 0 | 0 |
EWGI is ok to have the spelling as an attribute with an expression argument. | 3 | 5 | 1 | 1 | 0 |
EWGI thinks the author should explore P1144 as a customizable type trait. | 0 | 0 | 0 | 9 | 2 |
Forward P1144 to EWG. | 1 | 3 | 4 | 1 | 0 |
For polls taken September–November 2018, see [P1144R6].
Appendix B: Sample code
See [P1144R6]'s Appendix B for reference implementations of
,
, and
P1144R6’s version of the
library algorithm, plus
a conditionally trivially relocatable
.
I have implemented the entire Standard Library using the
attribute; you can find the source code on my GitHub and explore the resulting codegen on Godbolt Compiler Explorer.
Appendix C: Examples of non-trivially relocatable class types
Class contains pointer to self
This fictional
illustrates a mechanism that can apply
to any small-buffer-optimized class. libc++'s
uses this mechanism and is thus not trivially relocatable.
However, different mechanisms for small-buffer optimization exist. libc++'s
achieves small-buffer optimization
on a 24-byte buffer, without (necessarily) sacrificing trivial relocatability.
struct short_string { char * data_ = buffer_ ; size_t size_ = 0 ; char buffer_ [ 8 ] = {}; const char * data () const { return data_ ; } short_string () = default ; short_string ( const char * s ) : size_ ( strlen ( s )) { if ( size_ < sizeof buffer_ ) strcpy ( buffer_ , s ); else data_ = strdup ( s ); } short_string ( short_string && s ) { memcpy ( this , & s , sizeof ( * this )); if ( s . data_ == s . buffer_ ) data_ = buffer_ ; else s . data_ = nullptr ; } ~ short_string () { if ( data_ != buffer_ ) free ( data_ ); } };
Allocated memory contains pointer to self
needs somewhere to store its "past-the-end" node, commonly referred to
as the "sentinel node," whose
pointer points to the list’s last node.
If the sentinel node is allocated on the heap, then
can be trivially
relocatable; but if the sentinel node is placed within the
object itself
(as happens on libc++ and libstdc++), then relocating the
object requires
fixing up the list’s last node’s
pointer so that it points to the
new sentinel node inside the destination
object. This fixup of an arbitrary
heap object cannot be simulated by
.
Traditional implementations of
and
also store a "past-the-end"
node inside themselves and thus also fall into this category.
struct node { node * prev_ = nullptr ; node * next_ = nullptr ; }; struct list { node n_ ; iterator begin () { return iterator ( n_ . next_ ); } iterator end () { return iterator ( & n_ ); } list ( list && l ) { if ( l . n_ . next_ ) l . n_ . next_ -> prev_ = & n_ ; // fixup if ( l . n_ . prev_ ) l . n_ . prev_ -> next_ = & n_ ; // fixup n_ = l . n_ ; l . n_ = node {}; } // ... };
Class invariant depends on this
The
provided by [Boost.Interprocess] is an example of this category.
struct offset_ptr { uintptr_t value_ ; uintptr_t here () const { return uintptr_t ( this ); } uintptr_t distance_to ( void * p ) const { return uintptr_t ( p ) - here (); } void * get () const { return ( void * )( here () + value_ ); } offset_ptr () : value_ ( distance_to ( nullptr )) {} offset_ptr ( void * p ) : value_ ( distance_to ( p )) {} offset_ptr ( const offset_ptr & rhs ) : value_ ( distance_to ( rhs . get ())) {} offset_ptr & operator = ( const offset_ptr & rhs ) { value_ = distance_to ( rhs . get ()); return * this ; } ~ offset_ptr () = default ; };
Program invariant depends on this
In the following snippet,
is relocatable, but not
trivially relocatable, because the relocation operation of destroying a
at point A
and constructing a new
at point B has behavior that is observably different
from a simple
.
std :: set < void *> registry ; struct registered_object { registered_object () { registry . insert ( this ); } registered_object ( registered_object && ) = default ; registered_object ( const registered_object & ) = default ; registered_object & operator = ( registered_object && ) = default ; registered_object & operator = ( const registered_object & ) = default ; ~ registered_object () { registry . erase ( this ); } }; struct Widget : registered_object {};
Polymorphic downcast effectively relies on offset-into-self
Thanks to David Stone for this example.
In the following snippet,
is relocatable, but not trivially relocatable,
because its copy constructor and assignment operator do not copy the entire state of the
right-hand object. (Notice that
is initialized with
, not with a copy of
.)
struct Base { static int f ( Base * ) { return 21 ; } int ( * pf )( Base * ); Base ( int ( * pf )( Base * ) = f ) : pf ( pf ) {} Base ( const Base & o ) : pf ( f ) {} Base & operator = ( const Base & ) { return * this ; } }; struct Derived : Base { static int f ( Base * self ) { return (( Derived * ) self ) -> x ; } Derived () : Base ( f ) {} Derived ( const Derived & ) = default ; Derived & operator = ( const Derived & o ) { x = o . x ; return * this ; } int x = 42 ; }; int main () { Base && d = Derived (); Base && b = Base (); std :: swap ( b , d ); printf ( "%d \n " , b . pf ( & b )); }
The above snippet is isomorphic to a classically polymorphic hierarchy
with virtual methods. Here is the same snippet using
:
struct Base { virtual int f () { return 21 ; } }; struct Derived : Base { int f () override { return x ; } int x = 42 ; }; int main () { Base && b = Base (); Base && d = Derived (); std :: swap ( b , d ); printf ( "%d \n " , b . f ()); }
This is why (since P1144R5) the compiler will not consider types with virtual methods to be "naturally" trivially relocatable.
Appendix D: Implementation experience
A prototype Clang/libc++ implementation is at
-
github.com/Quuxplusone/llvm-project/tree/trivially-relocatable
-
godbolt.org, under the name "x86-64 clang (experimental P1144)"
Side-by-side case studies of
,
,
and
are given in [P1144R6].
As of November 2018, libstdc++ performs the
optimization
for any type which has manually specialized
.
(See this Godbolt.)
Manual specialization is also the approach used by [Qt], [Folly], and [BSL].
As of 2023-02-12, the only libstdc++ library type for which
has
been specialized is
; see [Deque].
Clang trunk provides a compiler builtin type trait
(see [D114732]), which is largely the same as the trait proposed in this paper. (There are slight
differences; e.g., Clang reports polymorphic types, reference types, and incomplete types
as trivially relocatable, whereas P1144 does not. I’m not aware that any of Clang’s differences were intentional.)
Clang trunk has no equivalent of the
attribute, so
is true only for
trivially copyable types and types marked with the
attribute. As of 2023-02-12, Clang trunk has no conception of a type which is non-trivial for purposes of calls
and yet is trivially relocatable. But Clang’s current status is compatible with P1144 (modulo the
few unintentional differences in
mentioned above).
Appendix E: Open questions
P1144R6
's spec lacks the magic either-direction-overlap-handling of
.
Either-direction-overlap-handling is implementable only for contiguous iterators (which can be converted to pointers);
it is not implementable for arbitrary random-access iterators. Is there ever a case where someone might want to e.g.
? Do we need a separate algorithm? [Qt] provides both
with magic overlap handling, and
without. [P2786R0] proposes only
with magic overlap handling. [P1144R6] proposed only
without.
In P1144R7, types with vptrs are never naturally trivially relocatable. Arthur claims that types with vptrs should be used only with inheritance, which means they never do value semantics (they slice instead), which means relocating one is always a bug, just like assigning or copying one. But [P2786R0] proposes to make polymorphic types naturally trivially relocatable. Is there any use-case for relocating polymorphic types? Is P1144 being too conservative here?
In P1144R7,
can use trivial relocation to
"close up the window" in the vector; this is also done in [Qt], [Folly], and [BSL]. The difference between
closing-the-window-by-move-assignment and closing-the-window-by-memmove is detectable in a conforming C++ program.
Therefore [P2786R0] proposes that relocation should not be allowed here.
Is there any implementation experience of P2786R0’s conservative position here?
([BSL]'s
follows P1144R7’s liberal position already.)
In P1144R6,
on an array
can use trivial relocation to
swap the elements. The difference between swap-by-move-assignment and swap-by-trivial-relocation is not detectable
in a conforming C++ program, because swapping
s with unequal allocators is UB. [P2786R0] conservatively proposes that relocation should not be allowed here, anyway.
Is there any usage experience in [Qt], [BSL], [Folly], or elsewhere that bears on this question?
(Arthur’s libc++ fork does relocation here, but doesn’t really count as usage experience since he’s the only one using it.)
We need a solution for potentially overlapping subobjects; see § 5.3 Overlapping base-class subobjects.