1. Background
Each version of C++ has improved the efficiency of returning objects by value. By the middle of the last decade, copy elision was reliable (if not technically guaranteed) in situations like this:
    Widget one() {
        return Widget();  // copy elision
    }
    Widget two() {
        Widget result;
        return result;  // copy elision
    }
In C++11, a completely new feature was added: a change to overload resolution which I will call implicit move. Even when copy elision is impossible, the compiler is sometimes required to implicitly move the return statement’s operand into the result object:
    std::shared_ptr<Base> three() {
        std::shared_ptr<Base> result;
        return result;  // copy elision
    }
    std::shared_ptr<Base> four() {
        std::shared_ptr<Derived> result;
        return result;  // no copy elision, but implicitly moved (not copied)
    }
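One way to observe the difference between copy elision and implicit move is to instrument the types involved. The following is a minimal sketch, not part of the proposal: the counters, the Holder class, and the function names same_type and converted are hypothetical, and the sketch assumes a compiler that implements [CWG1579].

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented Widget: counts copy and move constructions.
struct Widget {
    static int copies;
    static int moves;
    Widget() = default;
    Widget(const Widget&) { ++copies; }
    Widget(Widget&&) noexcept { ++moves; }
};
int Widget::copies = 0;
int Widget::moves = 0;

// Hypothetical class with one converting constructor per value category.
struct Holder {
    static int built_from_rvalue;
    Widget held;
    Holder(const Widget& w) : held(w) {}
    Holder(Widget&& w) : held(std::move(w)) { ++built_from_rvalue; }
};
int Holder::built_from_rvalue = 0;

// Same type as the return type: the copy is elided or, failing that,
// the local is implicitly moved. Either way it is not copied.
Widget same_type() {
    Widget result;
    return result;
}

// Different type: elision is impossible, but overload resolution first
// treats `result` as an rvalue and therefore selects Holder(Widget&&).
Holder converted() {
    Widget result;
    return result;
}
```

After calling both functions, Widget::copies should still be zero, and Holder::built_from_rvalue should record exactly one construction from an rvalue.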
The wording for this optimization was amended by [CWG1579]. The current wording in [class.copy.elision]/3 says:
In the following copy-initialization contexts, a move operation might be used instead of a copy operation:
- If the expression in a return statement is a (possibly parenthesized) id-expression that names an object with automatic storage duration declared in the body or parameter-declaration-clause of the innermost enclosing function or lambda-expression, or
- if the operand of a throw-expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) whose scope does not extend beyond the end of the innermost enclosing try-block (if there is one),
overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue. If the first overload resolution fails or was not performed, or if the type of the first parameter of the selected constructor is not an rvalue reference to the object’s type (possibly cv-qualified), overload resolution is performed again, considering the object as an lvalue.
The highlighted phrases above ("function", "constructor", "rvalue reference", and "the object’s type") indicate places where the wording diverges from a naïve programmer’s intuition. Consider the following examples.
1.1. Throwing is pessimized
Throwing is pessimized because of the highlighted word "function":
    void five() {
        Widget w;
        throw w;  // non-guaranteed copy elision, but implicitly moved (never copied)
    }
    Widget six(Widget w) {
        return w;  // no copy elision, but implicitly moved (never copied)
    }
    void seven(Widget w) {
        throw w;  // no copy elision, and no implicit move (the object is copied)
    }
Note: The comment in seven matches the current Standard wording, and matches the behavior of GCC. Most compilers (Clang 4.0.1+, MSVC 2015+, ICC 16.0.3+) already do this implicit move.
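Until the wording changes, a programmer who wants a move regardless of compiler has to ask for it explicitly. The sketch below illustrates that workaround. The instrumented Widget, the function names throw_plain and throw_moved, and the helper copies_during are all hypothetical and serve only to make the copy observable.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented Widget: counts copy and move constructions.
struct Widget {
    static int copies;
    static int moves;
    Widget() = default;
    Widget(const Widget&) { ++copies; }
    Widget(Widget&&) noexcept { ++moves; }
};
int Widget::copies = 0;
int Widget::moves = 0;

// Like seven(): whether the parameter is copied or moved into the
// exception object depends on the compiler and the language mode.
void throw_plain(Widget w) { throw w; }

// Workaround: the explicit cast forces the move constructor.
void throw_moved(Widget w) { throw std::move(w); }

// Hypothetical helper: resets the counters, runs f, swallows the Widget
// exception, and reports how many copy constructions f caused.
template <class F>
int copies_during(F f) {
    Widget::copies = 0;
    Widget::moves = 0;
    try {
        f();
    } catch (const Widget&) {
        // Caught by reference, so catching adds no copies.
    }
    return Widget::copies;
}
```

With this instrumentation, copies_during applied to throw_plain may report zero or one copy depending on the implementation, whereas throw_moved should report zero everywhere.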
1.2. Non-constructor conversion is pessimized
Non-constructor conversion is pessimized because of the highlighted word "constructor":
    struct From {
        From(Widget const&);
        From(Widget&&);
    };
    struct To {
        operator Widget() const&;
        operator Widget() &&;
    };
    From eight() {
        Widget w;
        return w;  // no copy elision, but implicitly moved (never copied)
    }
    Widget nine() {
        To t;
        return t;  // no copy elision, and no implicit move (the object is copied)
    }
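The effect on conversion operators can be made visible by recording which overload runs. This is a sketch under stated assumptions: the counters from_lvalue and from_rvalue and the function names convert_plain and convert_moved are hypothetical additions, and Widget is reduced to an empty class.

```cpp
#include <cassert>
#include <utility>

struct Widget {};

// Hypothetical instrumented version of To: each conversion operator
// records that it was called.
struct To {
    static int from_lvalue;
    static int from_rvalue;
    operator Widget() const& { ++from_lvalue; return Widget(); }
    operator Widget() && { ++from_rvalue; return Widget(); }
};
int To::from_lvalue = 0;
int To::from_rvalue = 0;

// Like nine(): which operator runs depends on the compiler and the
// language mode.
Widget convert_plain() {
    To t;
    return t;
}

// Workaround: the cast makes `t` an xvalue, so the && operator is chosen.
Widget convert_moved() {
    To t;
    return std::move(t);
}
```

convert_plain calls exactly one of the two operators, but which one is implementation-dependent; convert_moved should always call the rvalue-qualified operator.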
1.3. By-value sinks are pessimized
By-value sinks are pessimized because of the highlighted phrase "rvalue reference":
    struct Fish {
        Fish(Widget const&);
        Fish(Widget&&);
    };
    struct Fowl {
        Fowl(Widget);
    };
    Fish ten() {
        Widget w;
        return w;  // no copy elision, but implicitly moved (never copied)
    }
    Fowl eleven() {
        Widget w;
        return w;  // no copy elision, and no implicit move (the Widget object is copied)
    }
Note: The comment in eleven matches the current Standard wording, and matches the behavior of Clang, ICC, and MSVC. One compiler (GCC 5.1+) already does this implicit move.
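The by-value sink case can be checked the same way. In this sketch the instrumented Widget, the held member of Fowl, and the function names sink_plain and sink_moved are hypothetical; the paper's Fowl only declares its constructor, so storing the parameter in a member is an assumption made for illustration.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented Widget: counts copy and move constructions.
struct Widget {
    static int copies;
    static int moves;
    Widget() = default;
    Widget(const Widget&) { ++copies; }
    Widget(Widget&&) noexcept { ++moves; }
};
int Widget::copies = 0;
int Widget::moves = 0;

// By-value sink: takes a Widget by value and (by assumption) stores it.
struct Fowl {
    Widget held;
    Fowl(Widget w) : held(std::move(w)) {}
};

// Like eleven(): the constructor parameter is copied or moved from `w`
// depending on the compiler and the language mode.
Fowl sink_plain() {
    Widget w;
    return w;
}

// Workaround: the parameter of Fowl(Widget) is move-constructed from `w`.
Fowl sink_moved() {
    Widget w;
    return std::move(w);
}
```

sink_plain performs at most one copy of the Widget, and sink_moved should perform none: one move into the constructor parameter and one more into the member.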
1.4. Slicing is pessimized
Slicing is pessimized because of the highlighted phrase "the object’s type":
    std::shared_ptr<Base> twelve() {
        std::shared_ptr<Derived> result;
        return result;  // no copy elision, but implicitly moved (never copied)
    }
    Base thirteen() {
        Derived result;
        return result;  // no copy elision, and no implicit move (the object is copied)
    }
Note: The comment in thirteen matches the current Standard wording, and matches the behavior of Clang and MSVC. Some compilers (GCC 8.1+, ICC 18.0.0+) already do this implicit move.
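For slicing, counting the constructions of the base subobject shows which constructor initializes the result. The counters and the function names slice_plain and slice_moved below are hypothetical; Base and Derived are minimal stand-ins for the classes in the example above.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented Base: counts copy and move constructions.
struct Base {
    static int copies;
    static int moves;
    Base() = default;
    Base(const Base&) { ++copies; }
    Base(Base&&) noexcept { ++moves; }
};
int Base::copies = 0;
int Base::moves = 0;

struct Derived : Base {};

// Like thirteen(): the result is sliced out of `result` by either the
// copy or the move constructor, depending on the compiler and mode.
Base slice_plain() {
    Derived result;
    return result;
}

// Workaround: Base(Base&&) binds to the Derived xvalue, so the base
// subobject is moved.
Base slice_moved() {
    Derived result;
    return std::move(result);
}
```

slice_plain runs exactly one of the two Base constructors; slice_moved should always run the move constructor and never the copy constructor.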
We propose to remove all four of these unnecessary limitations.
2. Proposed wording relative to N4762
Modify [class.copy.elision]/3 as follows, where [deleted: text] marks wording to be removed:
In the following copy-initialization contexts, a move operation might be used instead of a copy operation:
- If the expression in a return statement is a (possibly parenthesized) id-expression that names an object with automatic storage duration declared in the body or parameter-declaration-clause of the innermost enclosing function or lambda-expression, or
- if the operand of a throw-expression is the name of a non-volatile automatic object (other than a [deleted: function or] catch-clause parameter) whose scope does not extend beyond the end of the innermost enclosing try-block (if there is one),
overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue. If the first overload resolution fails or was not performed, [deleted: or if the type of the first parameter of the selected constructor is not an rvalue reference to the object’s type (possibly cv-qualified),] overload resolution is performed again, considering the object as an lvalue. [Note: This two-stage overload resolution must be performed regardless of whether copy elision will occur. It determines the constructor to be called if elision is not performed, and the selected constructor must be accessible even if the call is elided. —end note]
Note: I believe that the two instances of the word "constructor" in the quoted note remain correct. They refer to the constructor selected to initialize the result object, as the very last step of the conversion sequence. This proposed change merely permits the conversion sequence to be longer than a single step; for example, it might involve a derived-to-base conversion followed by a move-constructor, or a user-defined conversion operator followed by a move-constructor. In either case, as far as the quoted note is concerned, that ultimate move-constructor is the "constructor to be called," and indeed it must be accessible even if elision is performed.
3. Proposed wording relative to P0527r1
David Stone’s [P0527] "Implicitly move from rvalue references in return statements" proposes to alter the current rules "references are never implicitly moved-from" and "catch-clause parameters are never implicitly moved-from." It accomplishes this by significantly refactoring clause [class.copy.elision]/3.
In the case that [P0527]'s changes are adopted into C++2a, we propose to modify the new [class.copy.elision]/3 as follows, where [deleted: text] marks wording to be removed:
A movable entity is a non-volatile object or an rvalue reference to a non-volatile type, in either case with automatic storage duration. [deleted: The underlying type of a movable entity is the type of the object or the referenced type, respectively.] In the following copy-initialization contexts, a move operation might be used instead of a copy operation:
- If the expression in a return statement is a (possibly parenthesized) id-expression that names a movable entity declared in the body or parameter-declaration-clause of the innermost enclosing function or lambda-expression, or
- if the operand of a throw-expression is a (possibly parenthesized) id-expression that names a movable entity whose scope does not extend beyond the end of the innermost enclosing try-block (if there is one),
overload resolution to select the constructor for the copy is first performed as if the entity were designated by an rvalue. If the first overload resolution fails or was not performed, [deleted: or if the type of the first parameter of the selected constructor is not an rvalue reference to the (possibly cv-qualified) underlying type of the movable entity,] overload resolution is performed again, considering the entity as an lvalue. [Note: This two-stage overload resolution must be performed regardless of whether copy elision will occur. It determines the constructor to be called if elision is not performed, and the selected constructor must be accessible even if the call is elided. —end note]
4. Implementation experience
This feature has effectively already been implemented in Clang since February 2018; see [D43322].
Under the diagnostic option -Wreturn-std-move (which is enabled as part of -Wmove, -Wmost, and -Wall), the compiler performs overload resolution according to both rules: the standard rule and also a rule similar to the one proposed in this paper. If the two resolutions produce different results, then Clang emits a warning diagnostic explaining that the return value will not be implicitly moved and suggesting that the programmer add an explicit std::move.
However, Clang does not diagnose the examples from §1.3 By-value sinks.
4.1. Plenitude of true positives
These warning diagnostics have proven helpful on real code. Many instances have been reported of code that is currently accidentally pessimized, and which would become optimized (with no loss of correctness) if this proposal were adopted:
- [SG14]: a clever trick to reduce code duplication by using conversion operators, rather than converting constructors, turned out to cause unnecessary copying in a common use-case.
- [Chromium]: a non-standard container library used iterator::operator const_iterator() && instead of const_iterator::const_iterator(iterator&&). (The actual committed diff is here.)
- [LibreOffice]: "An explicit std::move would be needed in the return statements, as there’s a conversion from VclPtrInstance to base class VclPtr involved."
However, we must note that about half of the true positives from the diagnostic are on code like the following example, which is not affected by this proposal:
    std::string fourteen(std::string&& s) {
        s += "foo";
        return s;  // no copy elision, and no implicit move (the object is copied)
    }
See [Khronos], [Folly], and three of the four diffs in [Chromium]. [AWS] is a particularly egregious variation. (The committed diff is here.)
    std::string fifteen() {
        std::string&& s = "hello world";
        return s;  // no copy elision, and no implicit move (the object is copied)
    }
Some number of programmers certainly expect a move here, and in fact [P0527] proposes to implicitly move in both of these cases. This paper does not conflict with [P0527], and we provide an alternative wording for the case that [P0527] is adopted.
4.2. Lack of false positives
In five months we have received a single "false positive" report ([Mozilla]), which complained that the move-constructor suggested by Clang was not significantly more efficient than the actually selected copy-constructor. The programmer preferred not to add the suggested std::move because the code ugliness was not worth the minor performance gain.
This proposal would give Mozilla that minor performance gain without the ugliness — the best of both worlds!
We have never received any report that Clang’s suggested move would have been incorrect.