ISO/IEC JTC1 SC22 WG21 N3170 = 10-0160 - 2010-10-17
Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org
Anthony Williams, anthony@justsoftwaresolutions.co.uk
Howard Hinnant, howard.hinnant@gmail.com
Introduction
Easy Concurrency
Early Call
Late Return
Multiple Get
Const Reference
Naming Types
Undefined Behavior
Deferred Functions
Copyable Versus Move-Only Types
Get Value Concurrently
External Synchronization
Moving Get
Move Invalidation
Atomic Operation
Summary
Solution
US 202 — 30.6.8/18
US 203 — 30.6.8
US 204 — 30.6.8/7-8
US 205 — 30.6.9/3
Wording
30.6.1 Overview [futures.overview]
30.6.6 Class template future [futures.unique_future]
30.6.7 Class template shared_future [futures.shared_future]
30.6.8 Class template atomic_future [futures.atomic_future]
30.6.9 Function template async [futures.async]
At the Summer 2010 meeting of the C++ Standards Committee, it became apparent that members had different models for how futures were intended to be used, and thus different views on the interface of futures. This paper explores the uses of futures, problems with those uses, and solutions to those problems. It captures and extends the discussion at the meeting. We analyze the implications of the current interface. We present a relatively minimal solution. Finally, we propose normative wording.
This paper specifically addresses the following national body comments.
The note in this paragraph says "unlike future, calling get more than once on the same atomic_future object is well defined and produces the result again." There is nothing in future that says anything negative about calling get more than once.
Remove this note, or add words to the requirements for future that reflect what this note says.
Both future and shared_future specify that calling most member functions on an object for which valid() == false produces undefined behavior. There is no such statement for atomic_future.
Add a new paragraph after 30.6.8 [futures.atomic_future]/2 with the same words as 30.6.7 [futures.shared_future]/3.
According to the definition of atomic_future, all members of atomic_future are synchronizing except constructors. However, it would probably be appropriate for a move constructor to be synchronizing on the source object. If not, the postconditions on paragraphs 7-8, might not be satisfied. This may be applicable if a collection of futures are being doled out to a set of threads that process their value.
Make the move constructor for atomic future lock the source.
The third sentence says "If the invocation is not deferred, a call to a waiting function on an asynchronous return object that shares the associated asynchronous state created by this async call shall block until the associated thread has completed." The next sentence says "If the invocation is not deferred, the join() on the created thread..." Blocking until a thread completes is not necessarily a join.
Decide whether the requirement is to block until finished or to call join, and rewrite to match.
There are many uses for futures. In this section, we concentrate on the use case for obtaining easy concurrency from sequential code. Given the following example,
void original( type arg )
{
    ....1
    ....2
    type var = work( arg );
    ....3
    if ( cond1 ) {
        ....4
        use1( var );
        ....5
    } else {
        ....6
        use2( var );
        ....7
    }
}
the task is to use async and future to obtain concurrency within the example.
Technique: The working technique is to move the call of work earlier, so that work executes concurrently with code ....2.
void early_call( type arg )
{
    ....1
    auto ftr = async( [=]{ return work( arg ); } );
    ....2
    type var = ftr.get();     // was: type var = work( arg );
    ....3
    if ( cond1 ) {
        ....4
        use1( var );
        ....5
    } else {
        ....6
        use2( var );
        ....7
    }
}
This code works in general with the existing unique future, which requires that get be called at most once. The approach works even when the return value is not stored.
void early_not_stored( type arg )
{
    ....1
    auto ftr = async( [=]{ return work( arg ); } );
    ....2
    use1( ftr.get() );        // was: use1( work( arg ) );
}
Technique: Another approach to easy concurrency is to change uses of any stored return value into accesses of the future value, effectively moving the point at which the return value is needed later. So, for the original example above, the work executes concurrently with code ....3 and either ....4 or ....6.
void late_return( type arg )
{
    ....1
    ....2
    auto ftr = async( [=]{ return work( arg ); } );   // was: type var = work( arg );
    ....3
    if ( cond1 ) {
        ....4
        use1( ftr.get() );    // was: use1( var );
        ....5
    } else {
        ....6
        use2( ftr.get() );    // was: use2( var );
        ....7
    }
}
Problem: This code works with unique future only if get() is called at most once. Any edit to the code that introduces a second call introduces undefined behavior.
void bad_second_use( type arg )
{
    ....1
    ....2
    auto ftr = async( [=]{ return work( arg ); } );   // was: type var = work( arg );
    ....3
    if ( cond1 ) {
        ....4
        use1( ftr.get() );    // was: use1( var );
        ....5
    } else {
        ....6
        use2( ftr.get() );    // was: use2( var );
        ....7
    }
    ....8
    use3( ftr.get() );        // second use is undefined
}
Workaround: The example can be made to work with a shared_future.
void good_second_use( type arg )
{
    ....1
    ....2
    shared_future<type> ftr = async( [=]{ return work( arg ); } );   // was: type var = work( arg );
    ....3
    if ( cond1 ) {
        ....4
        use1( ftr.get() );    // was: use1( var );
        ....5
    } else {
        ....6
        use2( ftr.get() );    // was: use2( var );
        ....7
    }
    ....8
    use3( ftr.get() );        // second use is defined
}
Problem: The return late approach fails for slightly more complicated examples. For instance, modify the else clause in our example to either read or write var.
void write_to_get( type arg )
{
    ....1
    ....2
    shared_future<type> ftr = async( [=]{ return work( arg ); } );   // was: type var = work( arg );
    ....3
    if ( cond1 ) {
        ....4
        use1( ftr.get() );    // was: use1( var );
        ....5
    } else {
        ....6
        if ( cond2 ) {
            ....7
            use2( ftr.get() );    // was: use2( var );
            ....8
        } else {
            ....9
            ftr.get() = something();    // was: var = something(); assign to const reference!
            ....10
        }
        ....11
    }
    ....12
    use3( ftr.get() );        // was: use3( var );
}
The problem is that the get() on shared_future returns a const reference, which means that the update of the variable is ill-formed.
Solution: The return late approach can work in general only if the get function returns a non-const reference to the appropriate storage. That is, the shared_future::get call serves as a replacement for naming the former variable that held the function return value.
Problem: Programmers must name the return type when declaring the shared_future; auto is not available within template argument lists.
Solution: The burden of manually naming the type can be eliminated by providing a second async that returns a shared_future.
void auto_shared( type arg )
{
    ....1
    ....2
    auto ftr = shared_async( [=]{ return work( arg ); } );   // was: type var = work( arg );
    ........
}
Solution: The burden of manually naming the type can also be eliminated by providing an operation to turn an rvalue unique future into an rvalue shared_future:
template< Type > shared_future< Type > future::share() { .... }
This changes the above code to:
void auto_shared( type arg )
{
    ....1
    ....2
    auto ftr = async( [=]{ return work( arg ); } ).share();   // was: type var = work( arg );
    ........
}
Observation: The cost of conversion between unique future and shared_future is very low because they share the same representation.
Problem: Failing to apply the workaround results in undefined behavior.
Strawman: To avoid undefined behavior, the library committee showed a strong preference [13-2-0-4] for having a second use of such a get throw an exception. Unfortunately, that approach turns a problem that is diagnosable at compile time into defined run-time behavior. That is, a programmer could exploit the defined throw to control normal program execution.
Solution: Leave the behavior as undefined, but encourage implementations to detect a second call and then either call std::terminate or throw an exception.
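For illustration, a minimal sketch of the hazard (the names and values below are invented for the example): the second call is the programming error; under this solution its behavior remains undefined, but an implementation may detect it and terminate or throw.

#include <future>

int main()
{
    std::future<int> ftr = std::async( []{ return 42; } );
    int first  = ftr.get();   // well defined; ftr is no longer valid
    int second = ftr.get();   // undefined behavior; a checking implementation
                              // may call std::terminate or throw future_error
    return first + second;
}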
Problem: Programmers are forced to use a shared_future because of only some of its uses. The core of the problem is that unique future was designed around move-only types, and thus permits only one call to get.
Solution: We can address the problem by separating the sharing of futures from their applicability to move-only types. In particular, we can make sharing a feature of the future type, and 'move only' a feature of operations on that future. There is support [9-5-1-3] for allowing multiple get operations on a unique future for a copyable type.
Note: This solution also substantially reduces the need for easy type conversion, because the unique future remains usable for the easy-concurrency transformation.
Problem: In generic contexts, different constraints on get depending on the template argument may lead to subtly broken code.
Solution: Separate the get operation into two variants, one intended to be called only once and one intended to be called multiple times. The latter does not support move-only types, and so would not be available in specializations of futures on move-only types. There is support [10-2-1-4] for renaming the only-once get to something else. Suggestions include release, pop, and drain.
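To make the split concrete, here is a toy sketch; the names release and get, and the toy storage, are assumptions for illustration only, not proposed wording.

#include <stdexcept>
#include <utility>

template <class R>
class toy_future {
    R value_;
    bool valid_ = true;
public:
    explicit toy_future( R v ) : value_( std::move(v) ) {}

    // Call-once variant: moves the value out; usable for move-only R.
    R release()
    {
        if ( !valid_ ) throw std::logic_error( "no state" );
        valid_ = false;
        return std::move( value_ );
    }

    // Call-many variant: copies the value; only instantiable for copyable R.
    R get() const
    {
        if ( !valid_ ) throw std::logic_error( "no state" );
        return value_;
    }

    bool valid() const { return valid_; }
};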
Technique: One of the mechanisms that supports easy concurrency is deferred functions. This mechanism enables the system to adapt to the amount of parallelism available, which avoids over-subscribing the system. That is, programmers can specify more concurrency than they think they need with confidence that the excess will not have a significant cost.
Observation: The single-use get of unique future keeps the cost of excess concurrency low by avoiding threads when they are not helpful. The implementation need not store the function result, but can instead return it directly. The implementation does not need to catch exceptions, but can instead pass them up unhindered.
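As a concrete illustration (a minimal sketch using the std::launch::deferred policy), a deferred invocation runs only when its result is requested, and it runs in the requesting thread, so no thread is created for the excess concurrency.

#include <future>
#include <iostream>

int main()
{
    // Deferred invocation: the function runs lazily, in the thread that calls get().
    auto ftr = std::async( std::launch::deferred, []{ return 6 * 7; } );
    // ... other work here; no extra thread has been created ...
    std::cout << ftr.get() << '\n';   // the function executes here; its result
                                      // is returned directly to the caller
}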
The current futures have seemingly inconsistent support for copyable and move-only argument types.
Observation: The current unique future supports only a single-use, value-returning get. This design is most appropriate to move-only types.
Observation: The current shared_future supports only a multi-use, const-reference-returning get. This design is most appropriate to copyable types. In particular, it does not support moving a result out of the future. This design means that its use with move-only types is restricted to accessing the result through the const reference returned. This may or may not be a problem, depending on the type (e.g. a const unique_ptr<T> still allows non-const access to the T object).
Observation: The single-use value get is move optimized, e.g. std::vector<int> v = ftr.get().
Observation: The const reference get is access optimized, e.g. int i = ftr.get()[3].
The shared_future is designed to be shared between threads, that is, to allow multiple concurrent get operations.
Note: There is some committee support [4-10-2-1] for renaming shared_future to multi_future.
Problem: The get operation on shared_future currently returns a const reference. This reference persists beyond the scope of any locking that operations on the future can provide. Thus, access through the reference requires programmer-provided external synchronization.
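A minimal sketch of the hazard, assuming a user type whose const interface is not concurrency-safe (the cached_value type below is invented for the example):

#include <future>

// Illustrative type: a const call mutates an internal cache,
// so its const interface is not safe for concurrent use.
struct cached_value {
    int raw = 0;
    mutable int cache = -1;
    int value() const
    {
        if ( cache < 0 ) cache = raw * 2;   // mutation under const
        return cache;
    }
};

void reader( std::shared_future<cached_value> ftr )
{
    // The const reference returned by get() outlives any locking inside the
    // future, so two threads calling value() concurrently race on 'cache'
    // unless the program supplies its own synchronization.
    int v = ftr.get().value();
    (void)v;
}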
Solution: Programmers can avoid that external synchronization by returning a type that is thread-safe for const qualification.
Problem: The objects in the standard library must meet this const requirement, but it is not presently a requirement on user types, and the existing specification for shared_future does not make such a requirement.
Solution: Normatively, make this const requirement apply to the argument types of shared_future.
Problem: Such a const requirement would fail for a get returning a non-const reference, which was our solution to an earlier problem.
Solution: We can provide a different get operation that returns a copy of the future's value. This copying would be done under a lock internal to the future's liaison state.
Note: This lock is not required for a unique future because its value a priori cannot be copied concurrently.
Note: The copy-returning get can exploit copy elision. Returning a copy is not necessarily any more expensive than returning a reference and copying the object outside the call.
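A toy sketch of this copy-under-lock approach, assuming a hypothetical liaison-state class (names invented for illustration, not proposed wording):

#include <mutex>

// Hypothetical sketch: the liaison state copies the value while holding its
// own lock, so no const concurrency-safety requirement is imposed on R and
// no reference escapes the scope of the lock.
template <class R>
class toy_liaison_state {
    std::mutex mtx_;
    R value_;                   // set once by the producer; details elided
public:
    explicit toy_liaison_state( R v ) : value_( v ) {}

    R copy_value()
    {
        std::lock_guard<std::mutex> hold( mtx_ );
        return value_;          // the copy is made while the lock is held;
                                // the caller's copy can then be elided or moved
    }
};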
Problem: The use of a non-const-reference-returning get requires care in managing external synchronization.
Solution: Do not provide a non-const-reference-returning get for shared_future. (This solution constrains expert programmers.)
Problem: A version of get that returns a copy is not applicable to move-only types.
Solution: Disable shared_future for move-only types. This solution is more extreme than the status quo.
Solution: Disable copying of move-only types out of shared_future, permitting only const lvalue access. This solution is the status quo.
Problem: Sharing a future for move-only types would enable "first available consumer" program structures, and hence disabling shared_future for move-only types, or providing only a reference-returning get, would limit programmers.
Solution: Provide a get that moves out its value, and throws an exception on a second call.
Problem: The above solution puts exception handling into non-exceptional code. Indeed, with a shared_future, one can expect multiple calls, because otherwise a unique future would suffice.
Strawman: Return a default-constructed object when the object has been previously moved. Unfortunately, this strawman puts an undesirable constraint on programmers.
Solution: We can avoid the multiple-move attempt by providing a get operation that move-assigns the value if and only if it has not been moved. That get operation would return a boolean indicating whether or not the move occurred. For example,
moveonly var;
....
if ( ftr.moving_get( var ) ) {
    ....
    // var has the value; use it.
} else {
    ....
    // another thread already moved the value;
    // var does not have the value; do something else.
}
Solution: We can avoid the multiple-move attempt by providing a get operation that calls a user function if and only if the value has not been moved. That function would receive the moved value as a parameter. For convenience, that get operation would return a boolean indicating whether or not the function was called. For example, we can forward the value to a function.
....
if ( ftr.conditional_get( [](moveonly arg){ forward(arg); } ) ) {
    ....
    // value was forwarded.
} else {
    ....
    // value was not forwarded.
}
Problem: A call-once get changes the future state to invalid. The wait operations have a precondition that the future be valid. For unique future this precondition is not a problem. However, for a shared_future, another thread could invalidate the precondition just before a call to wait.
Solution: Permit wait calls on an invalid future. Return immediately from the wait.
The working draft defines an atomic_future for direct sharing of future variables. This type is modeled on shared_future, which is not a bad approach at the high level. Indeed, the discussion above that applies to shared_future also applies to atomic_future. Unfortunately, the other details of the interface of atomic_future seem less sound.
Problem: The copy assign operation implies a transactional read and write, which is less performant than alternatives, as it implies acquiring a pair of locks, e.g. with std::lock().
Solution: Assign from a shared_future to an atomic_future, rather than from an atomic_future. Do the same with the copy constructor.
Problem: The copy assign operation returns a reference to the left-hand side, which is problematic, as the subsequent read implied by that reference is not coordinated with the write.
Solution: Return an rvalue rather than an lvalue.
Problem: There is no mechanism to obtain a shared_future from an atomic_future.
Solution: Add an implicit conversion operator on atomic_future yielding a shared_future.
Problem: There is no mechanism to swap the liaison state of an atomic_future.
Solution: Add an exchange operation, like those of chapter 29.
Problem: There is no mechanism to compare-and-swap the liaison state of an atomic_future.
Solution: Add a compare_exchange operation, like those of chapter 29. Unlike those operations, the measure of equality is whether or not the futures refer to the same liaison state.
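Taken together, a hypothetical declaration sketch of the operations proposed in this subsection (the class name and the exact signatures below are assumptions for illustration, not proposed wording):

#include <future>

// Hypothetical sketch only; declarations, no definitions.
template <class R>
class atomic_future_sketch {
public:
    atomic_future_sketch( const std::shared_future<R>& sf );             // copy in from shared_future
    std::shared_future<R> operator=( const std::shared_future<R>& sf );  // assign, returning an rvalue
    operator std::shared_future<R>() const;                              // implicit conversion out

    std::shared_future<R> exchange( std::shared_future<R> desired );
    bool compare_exchange( std::shared_future<R>& expected,
                           std::shared_future<R> desired );
    // Equality for compare_exchange means "refers to the same liaison state",
    // unlike the value equality used by the chapter 29 atomics.
};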
Observation: The packaged_task that sets a future's liaison state happens before successful get operations (30.6.4 [futures.state]), and so the get operation provides memory synchronization. Thus there appears to be no need for memory synchronization specifications on atomic_future operations.
Problem: The sequential consistency of atomic_future operations is not specified.
Solution: The solution is arguable. However, for consistency with other atomic operations that do not specify memory order, atomic_future operations should be sequentially consistent.
Problem: The liaison state is atomically reference counted, and extra work is performed on temporaries.
Solution: Add move operations. These operations only affect performance.
Solution: This consideration also applies to shared_future.
Problem: The get operation returns a const reference, as does shared_future. This return implies that future values must be safe for concurrent read-only access.
Solution: Add a locking get to copy out the future value, even when copies are mutating.
Problem: Such an interface would be incompatible with sharing the liaison state between shared_future and atomic_future, which in turn implies the solutions above may be ill-advised.
Solution: Give ourselves more time to develop the atomic_future interface by removing atomic_future from the library.
Problem: The existing interface is already in the FCD, and requires consensus to remove.
The existing futures interfaces have problems. Our solution consists of two primary approaches:
We preserve the distinction that a unique future has unique ownership of the get side of the liaison state. We also preserve the distinction that a shared_future enables multiple ownership of the get side of the liaison state. Finally, we preserve the distinction that an atomic_future enables concurrent access to a future variable.
The operations on unique future and shared_future discussed in this paper, and their analysis, are as follows.
| operation | unique future <moveonly> | unique future <copyable> | shared_future <moveonly> | shared_future <copyable> |
|---|---|---|---|---|
| get only once, returning rvalue: throw on second call | problem: no diagnostics, throw abuse | problem: no diagnostics, throw abuse | problem: non-exceptional throw | problem: non-exceptional throw |
| get only once, returning rvalue: undefined (terminate) on second call | okay: supports call early, diagnostics | okay: supports call early, diagnostics | unsupportable: no effective way to ensure that another thread will not grab the value | unsupportable: no effective way to ensure that another thread will not grab the value |
| conditional get, returning bool: move assign | unneeded: testing valid beforehand is sufficient | unneeded: testing valid beforehand is sufficient | okay: minimal, simple support for move-only types in shared_future | okay: minimal, simple support for move-only types in shared_future |
| conditional get, returning bool: callback | okay: generalizes support for move-only types in shared_future | okay: generalizes support for move-only types in shared_future | okay: generalizes support for move-only types in shared_future | okay: generalizes support for move-only types in shared_future |
| get called many times: returns rvalue | unsupportable: requires copying non-copyable types | okay: supports value futures | unsupportable: requires copying non-copyable types | okay: supports locked copying |
| get called many times: returns non-const lvalue | restricted: must use result in place | okay: supports return value late | tricky: requires external synchronization | tricky: requires external synchronization |
| get called many times: returns const lvalue | restricted: must use result in place, does not support general late return | restricted: does not support general late return | restricted: must use result in place, requires thread-safe const | restricted: requires thread-safe const |
| valid | okay | okay | tricky: one-way unstable; in the presence of moving gets, true does not imply that it will remain so | tricky: one-way unstable; in the presence of moving gets, true does not imply that it will remain so |
| wait operations | okay | okay | modify: needs update for invalid futures | modify: needs update for invalid futures |
Observations on the atomic_future interface are as follows.
| operation | variant | observation |
|---|---|---|
| copy assignment | const atomic_future& argument, atomic_future& lvalue result | problem: implies transactional semantics, uncoordinated read and write |
| copy assignment | const shared_future& argument, shared_future rvalue result | okay: similar to atomic (chapter 29) operations |
| copy constructor | atomic_future argument | problem: atomic types should be variables, not values; similar to atomic (chapter 29) operations |
| copy constructor | shared_future argument, shared_future rvalue result | okay: similar to atomic (chapter 29) operations |
| move assignment | atomic_future&& argument, atomic_future& lvalue result | problem: atomic types should be variables, not values; similar to atomic (chapter 29) operations |
| move assignment | shared_future&& argument, shared_future rvalue result | optional: implementations can avoid some reference counting |
| move constructor | atomic_future argument | problem: atomic types should be variables, not values; similar to atomic (chapter 29) operations |
| move constructor | shared_future argument, shared_future rvalue result | optional: implementations can avoid some reference counting |
| conversion operator, shared_future rvalue result | const member function | okay: similar to atomic (chapter 29) operations |
| conversion operator, shared_future rvalue result | && member function | optional: implementations can avoid some reference counting |
| exchange function, shared_future rvalue result | plain member function | okay: similar to atomic (chapter 29) operations |
| exchange function, shared_future rvalue result | && member function | optional: implementations can avoid some reference counting |
| compare_exchange function | plain member function | okay: similar to atomic (chapter 29) operations |
We choose generally conservative solutions to the issues raised by the national body comments.
The national body comment is:
The note in this paragraph says "unlike future, calling get more than once on the same atomic_future object is well defined and produces the result again." There is nothing in future that says anything negative about calling get more than once.
Remove this note, or add words to the requirements for future that reflect what this note says.
This comment indicates a misinterpretation of the intent of the future design. We address weaknesses in the design exposed by the misinterpretation and the subsequent committee discussion.
future::get call diagnosis
The current unique future makes multiple calls to get undefined behavior. We propose to add a suggestion that implementers detect a second call and throw future_error with an error condition of future_errc::no_state.
We ease use of the late-return pattern by providing an easy conversion from future to shared_future. We propose to add a future::share operation to enable effective use of auto.
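For illustration, a sketch of the intended usage, assuming share is adopted as proposed in the wording below:

#include <future>

void late_return_with_share( int arg )
{
    // share() converts the unique future returned by async into a
    // shared_future, so auto suffices and get() may be called repeatedly.
    auto ftr = std::async( [=]{ return arg * 2; } ).share();
    int first  = ftr.get();
    int second = ftr.get();   // well defined on shared_future
    (void)first; (void)second;
}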
shared_future::get
The current definition of shared_future enables data races on get. We propose to avoid this by adding a requirement that the argument type to shared_future meet the 'const means concurrency-safe' requirement in place for the standard library. [FIX Also for atomic_future if we keep it.]
The national body comment is:
Both future and shared_future specify that calling most member functions on an object for which valid() == false produces undefined behavior. There is no such statement for atomic_future. Add a new paragraph after 30.6.8 [futures.atomic_future]/2 with the same words as 30.6.7 [futures.shared_future]/3.
We believe the proposed resolution goes in the wrong direction.
Atomic futures may become invalid due to concurrent agents, and hence making operations on an invalid future undefined is unsupportable. While shared_future does not currently suffer from this problem, plausible future extensions could introduce it. So, we propose to change shared_future operations that currently require valid() == true to instead throw future_error with an error condition of future_errc::no_state.
The national body comment is:
According to the definition of atomic_future, all members of atomic_future are synchronizing except constructors. However, it would probably be appropriate for a move constructor to be synchronizing on the source object. If not, the postconditions on paragraphs 7-8, might not be satisfied. This may be applicable if a collection of futures are being doled out to a set of threads that process their value.
Make the move constructor for atomic future lock the source.
We could not agree on what the interface to atomic_future should be, nor could we find a use case for it. Therefore, we propose to remove atomic_future. However, if we continue to support atomic_future, then we concur with the proposed resolution.
The national body comment is:
The third sentence says "If the invocation is not deferred, a call to a waiting function on an asynchronous return object that shares the associated asynchronous state created by this async call shall block until the associated thread has completed." The next sentence says "If the invocation is not deferred, the join() on the created thread..." Blocking until a thread completes is not necessarily a join.
Decide whether the requirement is to block until finished or to call join, and rewrite to match.
[FIX Note changed proposal.] We propose to clarify that "block until finished" is the same as "as if join()", which preserves the happens-before chain.
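A minimal illustration of the guarantee being clarified (a sketch, not part of the proposed wording): with a non-deferred launch, a waiting function does not return until the created thread has completed, as if that thread had been joined, so writes made by the task are visible afterwards.

#include <future>
#include <atomic>
#include <cassert>

std::atomic<bool> done{ false };

int main()
{
    auto ftr = std::async( std::launch::async, []{
        done = true;          // last action of the asynchronous thread
        return 1;
    } );
    int r = ftr.get();        // blocks until the thread completes, as if joined,
                              // so the write to 'done' happens before this point
    assert( done.load() );    // guaranteed to pass
    return r - 1;
}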
Adjust the synopsis to reflect the changes below.
30.6.6 Class template future [futures.unique_future]
Edit paragraph 3 as follows.
The effect of calling any member function other than the destructor, the move-assignment operator, or valid on a future object for which valid() == false is undefined. [Note: Implementations are encouraged to detect this case and throw future_error with an error condition of future_errc::no_state. —end note]
Edit the synopsis as follows.
namespace std {
  template <class R>
  class future {
  public:
    future();
    future(future &&);
    future(const future& rhs) = delete;
    ~future();
    future& operator=(const future& rhs) = delete;
    future& operator=(future&&);
    shared_future<R> share();
    ....
After paragraph 11, add a new function entry as follows.
shared_future<R> share();
Effects: shared_future<R>(std::move(*this));
Postcondition: valid() == false
Remove paragraphs 13, 19, 21, and 24.
Requires: valid() == true.
30.6.7 Class template shared_future [futures.shared_future]
Edit paragraph 3 as follows.
Calling any member function other than the destructor, the copy-assignment operator, the move-assignment operator, or valid() on a shared_future object for which valid() == false shall throw future_error with an error condition of future_errc::no_state.
Edit paragraph 19 as follows.
Requires: The copy constructor of R and calls through the reference returned from get() shall not introduce a data race (17.6.3.10 [res.on.objects]).
Remove paragraphs 24, 26, and 29.
Requires: valid() == true.
30.6.8 Class template atomic_future [futures.atomic_future]
Remove this section. Otherwise, edit as below.
Edit paragraph 2 as follows.
Unlike future and shared_future, member functions of atomic_future other than constructors and destructors are synchronization operations (1.10). The copy constructor is a synchronizing operation on the argument object, but not the constructed object. All synchronizing operations on atomic_future are sequentially consistent (1.10, 29.3).
Remove paragraph 13.
Throws: future_error with an error condition of no_state if the precondition is not met.
After paragraph 14 add a new paragraph as follows.
Requires: The copy constructor of R and calls through the reference returned from get() shall not introduce a data race (17.6.3.10 [res.on.objects]).
Edit paragraphs 21, 25, and 29 as follows.
Throws: future_error with an error condition of no_state if valid() == false, otherwise future_error if an error condition occurs.
30.6.9 Function template async [futures.async]
Edit paragraph 5 as follows.
Synchronization: the invocation of async happens before (1.10) the invocation of f. [Note: this statement applies even when the corresponding future object is moved to another thread. —end note] If the invocation is not deferred, a call to a waiting function on an asynchronous return object that shares the associated asynchronous state created by this async call shall block until the associated thread has completed, as if joined ([thread.thread.member]). If the invocation is not deferred, the join() on the created thread happens-before (1.10) the first function that successfully detects the ready status of the associated asynchronous state returns or before the function that gives up the last reference to the associated asynchronous state returns, whichever happens first. If the invocation is deferred, the completion of the invocation of the deferred function happens-before the calls to the waiting functions return.