Document Number: N3113=10-0103
Date: 2010-08-18
Project: Programming Language C++
Peter Sommerlad <peter.sommerlad@hsr.ch>
Limiting async's launch strategies to two strategies (sync, async) plus a third value meaning "either one" makes it hard for vendors to provide better strategies in the future, and hard for users to write code that is portable with respect to the available strategies.
A bitmask type for the async launch strategy seems more suitable than a 3-way enum. However, adoption of the bitmask requirements (GB53) could not be voted in at Rapperswil; it came close, and it is expected to be voted in at Batavia. Further discussion provided the insight that an enum with corresponding overloaded bit operators should be chosen.
Providing only three possible values for the enum launch, and saying that launch::any means either launch::sync or launch::async, is very restrictive. It prevents future implementors from providing clever infrastructures that can be used simply by calling async(launch::any, ...). There is also no hook for an implementation to add alternatives to the launch enumeration, and no useful means to combine them (i.e., to interpret them as flags). We believe something like async(launch::sync | launch::async, ...) should be allowed; it becomes especially useful if one can also say something like async(launch::any & ~launch::sync, ...). This flexibility might limit the features usable in the function called through async(), but it opens a path to profiting effortlessly from improved hardware and software without complicating the programming model when just using async(launch::any, ...).
The visual distinction between launch::sync and launch::async is hard to see. In addition, launch::sync is not about synchronous execution but about deferring the function execution until its result is really wanted, which may be never. Therefore this document also suggests renaming the launch policy launch::sync to launch::deferred.
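The point that the result "may never" be wanted can be sketched as follows. This is a minimal illustration, assuming an implementation that provides the policy under the proposed name std::launch::deferred; the two helper function names are made up for this example.

```cpp
#include <cassert>
#include <future>

// Hypothetical helper: reports whether a deferred task ran although
// nobody ever asked for its result.
bool deferred_task_ran_without_request() {
    bool ran = false;
    {
        std::future<void> f =
            std::async(std::launch::deferred, [&ran] { ran = true; });
        // Neither get() nor wait() is called: the result is never wanted.
    }   // The future is destroyed here; the task is simply dropped.
    return ran;
}

// Hypothetical helper: reports whether a deferred task ran once its
// result was requested through get().
bool deferred_task_ran_on_get() {
    bool ran = false;
    std::future<void> f =
        std::async(std::launch::deferred, [&ran] { ran = true; });
    f.get();   // The task is invoked here, in the calling thread.
    return ran;
}
```

Under these assumptions the first helper returns false and the second returns true, which is behavior the name "sync" does not suggest.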
CH 36 provided the following proposal:
Change in 30.6.1 'enum class launch' to allow further implementation defined values and provide the following bit-operators on the launch values (operator|, operator&, operator~ delivering a launch value). Note: a possible implementation might use an unsigned value to represent the launch enums, but we shouldn't limit the standard to just 32 or 64 available bits in that case and also should keep the launch enums in their own enum namespace.
Change [future.async] p3 according to the changes to enum launch. Change the launch::any bullet to "the implementation may choose any of the policies it provides." Note: this can mean that an implementation may restrict the called function to take all required information by copy in case it will be called in a different address space, or even on a different processor type. To ensure that a call is performed either as launch::async or as launch::sync describes, one should call async(launch::sync|launch::async,...)
Discussion in Rapperswil:
The discussion revealed that the launch enum serves two aspects. On the one hand, there is an implementation view, where an "enum bit" can denote a specific async launch strategy, e.g., run the function in a thread pool or on a GPU. Such a specific launch mechanism imposes specific requirements on the function to be run asynchronously, such as copying all input values, or only referring to read-only data to avoid races. On the other hand, there is a user's view, where the enum should specify the requirements the user can guarantee for the asynchronously called function, and the implementation should be able to select an appropriate strategy, maybe even dynamically for very clever implementations (see below).
To allow this dual nature the enum should provide a hook for implementers to extend it and for users to combine enum values in a useful way, e.g., with a meaning "anything but sync", or "I don't care, because the function is a pure function and would not give any data races or undefined behavior".
The discussion also covered whether launch::any is a good name for (launch::sync|launch::async) or for "everything implementers think is safe". However, the name "default" is a keyword and thus unavailable. Nevertheless, the "default" used by the async() function overload without a launch strategy should at least be (launch::sync|launch::async).
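As a rough sketch of what this default means for user code, the two calls below should be interchangeable on an implementation without vendor extensions. The sketch assumes the renamed policy std::launch::deferred proposed in this paper (written launch::sync in the discussion above); the helper names are made up for illustration.

```cpp
#include <cassert>
#include <future>

int add(int a, int b) { return a + b; }

// Hypothetical helper: relies on the overload without a launch strategy.
int sum_with_default_policy(int a, int b) {
    std::future<int> f = std::async(add, a, b);
    return f.get();
}

// Hypothetical helper: spells out the minimum the default should cover.
int sum_with_explicit_policy(int a, int b) {
    std::future<int> f =
        std::async(std::launch::async | std::launch::deferred, add, a, b);
    return f.get();
}
```

Either way the caller obtains the same result; only where and when add() runs is left to the implementation.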
Minutes from Discussion in Rapperswil:
existing launch enums: sync, async
vendor and future launch enums: separate_process, other_endian, gpu
possible launch enum sets:
  nothing_outside_standard = async | sync
  no_restrictions_beyond_the_standard =
  what_implementers_think_is_safe =
  everything_implementers_have =
launch::default <= launch::no_restrictions_beyond_the_standard
launch::any = ?
std::async( task );
std::async( std::launch::async | std::launch::gpu, task );
Proposed text: The value launch::default is at least sync|async. Any vendor extensions shall place no additional restrictions on task interaction.
Further discussion on the reflector and by email provided additional input and changes.
Thanks to Detlef Vollmann, Lawrence Crowl, Pete Becker, Alberto Ganesh Barbati, Anthony Williams, Daniel Krügler, Hans Boehm, Michael Wang, Bjarne Stroustrup and the Concurrency subgroup for their comments and contributions to this paper.
This paper addresses and details the proposed resolution of FDIS NB comment CH 36. It uses terms of art to be introduced by a paper by Lawrence Crowl that is not yet numbered at the time of this writing.
In 30.6.1 p1 replace
enum class launch { any, async, sync };
with
enum class launch : unspecified {
    async = unspecified power of 2,
    deferred = unspecified power of 2,
    implementation defined
};
launch operator|( launch, launch );
launch operator&( launch, launch );
launch operator^( launch, launch );
launch operator~( launch );
launch& operator|=( launch&, launch );
launch& operator&=( launch&, launch );
launch& operator^=( launch&, launch );
At the end of 30.6.1 add
The enum type launch is an implementation-defined bitmask type (17.5.2.1.3). [ Note: implementations are encouraged to use bits for individual launch policies. For example, policy launch::deferred has a value of a power of 2. Furthermore, implementations can provide bitmasks to specify restrictions on task interaction by functions launched by async() applicable to a corresponding subset of available launch policies. —end note ]
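One way such a bitmask enum could be realized is sketched below. The type my_launch, its concrete values, the gpu enumerator standing in for a vendor extension, and the helper has_policy are all assumptions made for illustration; they are not part of the proposed wording.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the proposed enum.
enum class my_launch : unsigned {
    async    = 1u << 0,
    deferred = 1u << 1,
    gpu      = 1u << 2   // imaginary vendor extension
};

using my_launch_rep = std::underlying_type<my_launch>::type;

constexpr my_launch operator|(my_launch a, my_launch b) {
    return static_cast<my_launch>(static_cast<my_launch_rep>(a) |
                                  static_cast<my_launch_rep>(b));
}
constexpr my_launch operator&(my_launch a, my_launch b) {
    return static_cast<my_launch>(static_cast<my_launch_rep>(a) &
                                  static_cast<my_launch_rep>(b));
}
constexpr my_launch operator^(my_launch a, my_launch b) {
    return static_cast<my_launch>(static_cast<my_launch_rep>(a) ^
                                  static_cast<my_launch_rep>(b));
}
constexpr my_launch operator~(my_launch a) {
    return static_cast<my_launch>(~static_cast<my_launch_rep>(a));
}
inline my_launch& operator|=(my_launch& a, my_launch b) { return a = a | b; }
inline my_launch& operator&=(my_launch& a, my_launch b) { return a = a & b; }
inline my_launch& operator^=(my_launch& a, my_launch b) { return a = a ^ b; }

// True if every bit of `wanted` is set in `policy`; this mirrors the
// (policy & launch::async) == launch::async test used in the wording.
constexpr bool has_policy(my_launch policy, my_launch wanted) {
    return (policy & wanted) == wanted;
}
```

With these operators a user can express "anything but deferred" as everything & ~my_launch::deferred, where everything is whatever union of policies the implementation offers.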
Change 30.6.4 paragraph 2 as follows:
[ Note: The result can be any kind of object including a function to compute that result, as used by async when policy is launch::deferred. —end note ]
Change 30.6.9 p3 as follows:
Effects: The first function behaves the same as a call to the second function with a policy argument of (launch::async|launch::deferred) and the same arguments for F and Args. Implementations that would like to extend the behavior of the first overload are free to do so by adding their extensions to the launch policy under the "as if" rule.
The second function creates an associated asynchronous state that is associated with the returned future object. The further behavior of the second function depends on the policy argument as follows. If more than one bullet applies, the implementation may choose any applicable policy.
— ( policy & launch::async ) == launch::async — executes INVOKE(decay_copy(std::forward<F>(f)), decay_copy(std::forward<Args>(args))...) (20.8.2, 30.3.1.2) as if in a new thread of execution represented by a thread object, with the calls to decay_copy() being evaluated in the thread that called async. Any return value is stored as the result in the associated asynchronous state. Any exception propagated from the execution of INVOKE(decay_copy(std::forward<F>(f)), decay_copy(std::forward<Args>(args))...) is stored as the exceptional result in the associated asynchronous state. The thread object is stored in the associated asynchronous state and affects the behavior of any asynchronous return objects that reference that state.
— ( policy & launch::deferred ) == launch::deferred — Stores decay_copy(std::forward<F>(f)) and decay_copy(std::forward<Args>(args))... in the associated asynchronous state. These copies of f and args constitute a deferred function. Invocation of the deferred function evaluates INVOKE(g, xyz) where g is the stored value of decay_copy(std::forward<F>(f)) and xyz is the stored copy of decay_copy(std::forward<Args>(args)).... The associated asynchronous state is not made ready until the function has completed. The first call to a waiting function on an asynchronous return object referring to the associated asynchronous state created by this async call shall invoke the deferred function in the thread that called the waiting function; all other calls to waiting functions on asynchronous return objects sharing the same associated asynchronous state shall block until the deferred function has completed. [ Note: If this policy is specified together with other policies, such as when using a policy value of launch::async|launch::deferred, implementations should defer invocation or the selection of the policy when no more concurrency can be effectively exploited. —end note ]
Remove the following bullet:
— launch::any — the implementation may choose either policy at any call to async. [ Note: implementations should defer invocation when no more concurrency can be effectively exploited. —end note ]
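The observable difference between the two bullets can be sketched with a small experiment, assuming an implementation that provides std::launch::async and std::launch::deferred as proposed. The helper names are made up for this illustration, and the sketch relies on thread ids differing between the caller and the "new thread of execution".

```cpp
#include <cassert>
#include <future>
#include <stdexcept>
#include <thread>

// Hypothetical helper: runs a task under `policy` and reports whether it
// executed on the thread that called the waiting function get().
bool runs_on_waiting_thread(std::launch policy) {
    std::future<std::thread::id> f =
        std::async(policy, [] { return std::this_thread::get_id(); });
    return f.get() == std::this_thread::get_id();
}

// Hypothetical helper: reports whether an exception propagated from the
// task is stored in the asynchronous state and rethrown by get().
bool exception_is_stored(std::launch policy) {
    std::future<int> f = std::async(
        policy, []() -> int { throw std::runtime_error("task failed"); });
    try {
        f.get();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

Under these assumptions a deferred task reports the waiting thread's id, an async task reports a different one, and the exception thrown under the async policy surfaces at get().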
Change 30.6.9 p5 as follows:
Synchronization: Regardless of provided policy
async happens before (1.10) the invocation of f. [ Note: this statement applies even when the corresponding future object is moved to another thread. —end note ]
f happens-before (1.10) the calling thread makes ready the associated asynchronous state. [ Note: f might not be called at all, so its completion might never happen. —end note ]
async happens-before (1.10) the return from last function that releases the associated asynchronous state.
If (policy & launch::async) == launch::async
async call shall block until the associated thread has completed.
join() on the created thread object happens-before (1.10) the first function that successfully detects the ready status of the associated asynchronous state returns or happens-before (1.10) the return from the last function that releases the associated asynchronous state.
Change the note in 30.6.9 p 9 as follows:
[ Note: line #1 might not result in concurrency because the async call uses the default policy, which may use launch::deferred, in which case the lambda might not be invoked until the get() call; in that case, work1 and work2 are called on the same thread and there is no concurrency. —end note ] —end example ]
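A user who needs to know whether the default policy ended up deferring the call could probe the future as sketched below. This assumes a library in which wait_for() reports a distinct "deferred" status (std::future_status::deferred), which is not part of the wording changed by this paper; the helper is_deferred is a made-up name.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Hypothetical helper: true if the task behind `f` is a deferred function
// that will only run when its result is requested.
template <class T>
bool is_deferred(const std::future<T>& f) {
    // A zero timeout only queries the status; it does not invoke a
    // deferred function.
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::deferred;
}
```

With an explicit launch::deferred policy the probe returns true and with launch::async it returns false; with the default policy either answer is possible, which is exactly the caveat the note describes.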