This paper proposes the addition of two new launch policies to std::async, one synchronous (launch::sync) and one asynchronous (launch::task).
It also suggests changes to the default launch policy.
launch::task is an asynchronous execution policy that is similar to the existing launch::async, except that it doesn't require the creation of a new thread for each task.
The current asynchronous policy, launch::async, specifies that execution occurs "as if in a new thread". The implementation is thus required to create a new thread for each task. This is expensive.
The motivation for this imposed cost is that the task is guaranteed to start with fresh, default-constructed, thread local variables, and that those thread local variables are guaranteed to be destroyed immediately after completion.
A common use of thread local variables is to locally cache objects that are expensive to recreate. For such uses, destroying and reinitializing the thread local variables imposes an additional source of inefficiency on top of the mandated thread creation. Reuse of such thread locals is actually desirable.
In most other cases, reusing thread local variables across tasks is harmless.
Therefore, a launch policy that would allow the implementation to reuse a thread for more than one task execution would be a significant performance enhancement.
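The cost described above can be observed with the existing policy. The following is a minimal sketch, not part of the proposal: cache_uses, use_cache and observe_cache_uses are hypothetical names, and the thread local counter stands in for an expensive cached object. On an implementation that follows the "as if in a new thread" wording, every task should start from a freshly initialized counter, so nothing is ever reused.

```cpp
#include <future>
#include <vector>

// Hypothetical per-thread cache: counts how often the current thread used it.
thread_local int cache_uses = 0;

// Hypothetical task body: touches the thread local "cache" and reports the count.
int use_cache() { return ++cache_uses; }

// Runs n tasks one after another with launch::async and records what each saw.
std::vector<int> observe_cache_uses(int n) {
    std::vector<int> seen;
    for (int i = 0; i < n; ++i) {
        // Each task runs "as if in a new thread", so its cache_uses starts at 0.
        seen.push_back(std::async(std::launch::async, use_cache).get());
    }
    return seen;
}
```

Under the proposed launch::task, an implementation would be allowed to hand the same thread to later tasks, in which case later tasks could observe counts greater than one.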
The common concerns about such thread reuse are:
The answers, in this proposal, are no and no.
Implementation-induced deadlocks are specifically disallowed, by introducing a requirement that a task using the task (and async) policy shall be assigned a thread no later than the first call to a wait function. The implementation may avoid spawning too many threads and oversubscribing the CPU by taking advantage of its freedom to use deferred or synchronous execution, if the user has included launch::deferred or launch::sync as an allowed policy for the std::async call.
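The deadlock concern can be made concrete with a task that waits for a task launched after it. The sketch below uses the existing std::launch::async, since launch::task is only proposed; the function name earlier_waits_for_later is an assumption made for illustration. A scheduler that queued both tasks behind a single worker thread would hang here, which is the situation the requirement above is meant to rule out.

```cpp
#include <future>

// An earlier task blocks until a later task supplies a value.
int earlier_waits_for_later() {
    std::promise<int> supplied;
    std::future<int> value = supplied.get_future();

    // Task 1 (launched first) waits on a result that only task 2 produces.
    std::future<int> first = std::async(std::launch::async,
        [&value] { return value.get() + 1; });

    // Task 2 (launched later) unblocks task 1.
    std::future<void> second = std::async(std::launch::async,
        [&supplied] { supplied.set_value(41); });

    second.get();        // task 2 has run
    return first.get();  // task 1 can now finish
}
```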
At program termination, completed or running tasks using the proposed launch::task policy have the thread local variables of their corresponding threads destroyed before static destruction takes place. This implies that exit may need to wait for the currently running tasks to complete. Tasks that are launched after static destruction starts behave as if launch::async has been used.
launch::sync is a synchronous policy that executes the task directly in the std::async call.
On its surface, a policy that executes the task immediately may seem superfluous; the user could have just executed the task instead of going through the trouble of using std::async.
Its advantages become more apparent if we consider that a routine may take a launch policy as a parameter, as in the following pseudocode:

    void routine( std::launch policy, args... )
    {
        /* ... */
        std::future<X> fx = std::async( policy, ... );
        /* ... */
    }
Such parameterization is desirable, for example, if we want to be able to experiment with different launch policies and pick the one that delivers the best performance.
In such cases, it is very convenient to be able to tell routine to execute everything synchronously, for the following reasons:
- If routine does not work as intended, the problem may have something to do with the asynchronous execution, or it may not. Switching to launch::sync allows us to quickly determine which of these two is the case.
- Running routine with launch::sync can be very useful both as a sanity check (is it by chance faster than the supposedly parallel version?) and as a baseline (how well does it scale?)
- Using launch::sync for some of the recursive calls allows us finer control over which branches are executed in parallel and which aren't.
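The parameterization discussed above can be approximated today. The sketch below is an emulation under stated assumptions, not the proposed facility: policy_ex and async_ex are hypothetical names, the sync case is emulated with std::packaged_task, and this version of routine merely sums a vector in two halves.

```cpp
#include <future>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for a policy set that includes a synchronous option.
enum class policy_ex { sync, async, deferred };

// Hypothetical wrapper around std::async that adds an emulated sync policy.
template <class F>
auto async_ex(policy_ex p, F f) -> std::future<decltype(f())> {
    if (p == policy_ex::sync) {
        // Emulated launch::sync: run the task right here, return a ready future.
        std::packaged_task<decltype(f())()> task(std::move(f));
        auto result = task.get_future();
        task();
        return result;
    }
    return std::async(p == policy_ex::async ? std::launch::async
                                            : std::launch::deferred,
                      std::move(f));
}

// Sums a vector in two halves; the policy decides how the first half is run.
long routine(policy_ex p, const std::vector<int>& v) {
    auto mid = v.begin() + v.size() / 2;
    std::future<long> left = async_ex(p, [&v, mid] {
        return std::accumulate(v.begin(), mid, 0L);
    });
    long right = std::accumulate(mid, v.end(), 0L);
    return left.get() + right;
}
```

Calling routine with policy_ex::sync gives a single-threaded run of the same code path, which is what makes it usable for debugging and as a timing baseline.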
In addition, launch::sync can be combined with other policies, to grant the implementation the option to execute in the calling thread. This allows the implementation to better balance the load if, for example, it detects that the task queue has grown too big.
Half-seriously, the policy also allows one to obtain a ready future holding a specific value or exception:

    std::future<int> x = std::async( std::launch::sync, []{ return 42; } );
    std::future<int> y = std::async( std::launch::sync, []() -> int { throw std::runtime_error( "Hello exceptional world!" ); } );
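For comparison, a ready future can be built in C++11 without the proposed policy by going through std::promise. The sketch below shows that route; ready_value, ready_error and is_ready are hypothetical helper names, not part of the proposal or the standard library.

```cpp
#include <chrono>
#include <exception>
#include <future>
#include <stdexcept>

// Builds a ready future holding a value, using only C++11 facilities.
std::future<int> ready_value(int v) {
    std::promise<int> p;
    p.set_value(v);
    return p.get_future();
}

// Builds a ready future holding an exception.
std::future<int> ready_error(const char* what) {
    std::promise<int> p;
    p.set_exception(std::make_exception_ptr(std::runtime_error(what)));
    return p.get_future();
}

// True if the future already holds a result, so get() would not block.
bool is_ready(const std::future<int>& f) {
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

The launch::sync form is shorter mainly because the promise bookkeeping is handled by std::async.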
The default launch policy is currently launch::async | launch::deferred and is unnamed.
This proposal suggests two changes. First, the default policy should be given a name, launch::default_.
Second, the default should be launch::sync | launch::async | launch::task | launch::deferred.
The default policy should be given a name both to simplify the specification and isolate any eventual changes to a single place, and to allow users to name it without spelling it out.
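Until such a name exists, users who want one have to define it themselves. The sketch below assumes the C++11 default of launch::async | launch::deferred; current_default, was_deferred, answer and run_with_named_default are hypothetical names used only for illustration.

```cpp
#include <chrono>
#include <future>

// User-side name for today's unnamed default (per the C++11 wording).
const std::launch current_default =
    std::launch::async | std::launch::deferred;

// Reports whether the implementation chose deferred execution for this future.
template <class T>
bool was_deferred(const std::future<T>& f) {
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::deferred;
}

int answer() { return 42; }

int run_with_named_default() {
    std::future<int> f = std::async(current_default, answer);
    // Either choice is valid here; get() runs a deferred task or waits for
    // an asynchronous one.
    (void)was_deferred(f);
    return f.get();
}
```

A user-defined constant like this would silently go stale if the default changed, which is one reason to have the library provide the name.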
The plain std::async call, which implicitly uses the default policy, is, for many programmers, their first encounter with parallelism in C++. It should make a good first impression, and good performance is essential. The default policy should afford the implementation maximum flexibility in meeting the performance expectations of a C++ programmer. That is why this paper suggests that the implementation should be free to choose among all of the available policies.
Currently, there is still not much code that depends on the default, so the change will be relatively painless. As more and more programmers take advantage of std::async, the default policy will progressively become more entrenched and harder to change. The time for a change is now.
(All edits are relative to ISO/IEC 14882-2011.)
Change enum class launch in the synopsis of <future> in 30.6.1 [futures.overview] p1 as follows:

    enum class launch : unspecified {
        async = unspecified,
        deferred = unspecified,
        task = unspecified,
        sync = unspecified,
        default_ = sync | async | task | deferred,
        implementation-defined
    };
Change the first sentence of 30.6.1 [futures.overview] p2 as follows:
The enum type launch is an implementation-defined bitmask type (17.5.2.1.3) with launch::async, launch::deferred, launch::task, and launch::sync denoting individual bits.
Change the first sentence of 30.6.8 [futures.async] p3 as follows:
Effects: The first function behaves the same as a call to the second function with a policy argument of launch::default_ (previously launch::async | launch::deferred) and the same arguments for F and Args.
Add the following two bullets to 30.6.8 [futures.async] p3:
- if policy & launch::task is non-zero — equivalent to the policy & launch::async case, except that the task may inherit the thread_local variables from a previous completed task execution, and the thread_local variables of the current execution are not necessarily destroyed immediately after its completion. If the async call happens before a call to exit or return from main, destructors for thread_local variables corresponding to the task's thread will run before those for static duration objects. The call to exit or the return from main may implicitly wait for currently running tasks using the launch::task policy to complete. If the exit call or return from main happens before a std::async call with launch::task policy then that call behaves as though it had used launch::async policy. [Note: in a long-lived program, implementations are encouraged to eventually destroy the thread_local variables of completed executions. — end note.]
- if policy & launch::sync is non-zero — calls INVOKE(DECAY_COPY(std::forward<F>(f)), DECAY_COPY(std::forward<Args>(args))...). Any return value is stored as the result in the shared state. Any exception propagated from the execution of INVOKE(DECAY_COPY(std::forward<F>(f)), DECAY_COPY(std::forward<Args>(args))...) is stored as the exceptional result in the shared state.
Add the following paragraph to 30.6.8 [futures.async] p3, after the bullets:
Tasks using the launch::async and launch::task policies shall be assigned a thread and begin execution no later than the first call to a wait function (30.6.4). [Note: In other words, the implementation is not allowed to deadlock if an earlier task waits for a later one. — end note.]
Change 30.6.8 [futures.async] p6 as follows:
Throws: system_error if policy is launch::async or launch::task and the implementation is unable to start a new thread.
Change 30.6.8 [futures.async] p7 as follows:
Error conditions:
resource_unavailable_try_again — if policy is launch::async or launch::task and the system is unable to start a new thread.
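From the caller's side, the error condition above surfaces as a std::system_error thrown by std::async. The sketch below shows one way a caller might react with the existing policies; async_or_deferred is a hypothetical helper, and falling back to launch::deferred is an illustrative choice rather than something the wording requires.

```cpp
#include <future>
#include <system_error>
#include <utility>

// Hypothetical helper: try an asynchronous launch; if no thread can be
// started, fall back to deferred execution in the thread that calls get().
template <class F>
auto async_or_deferred(F f) -> std::future<decltype(f())> {
    try {
        return std::async(std::launch::async, f);
    } catch (const std::system_error& e) {
        // Only handle the documented "unable to start a new thread" case.
        if (e.code() != std::errc::resource_unavailable_try_again) throw;
        return std::async(std::launch::deferred, std::move(f));
    }
}
```

With the proposed default policy, an implementation could make the same kind of choice itself instead of reporting the error to the caller.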
Thanks to Hans Boehm, Herb Sutter, Niklas Gustafsson and Anthony Williams.
— end