Doc. no: N4482
Date: 2015-04-13
Reply-To: Christopher Kohlhoff <chris@kohlhoff.com>
This document is intended to provide some input to further discussion of executors by:
— attempting to relate executor requirements, as defined in N4478 Networking Library Proposal, to the terminology described in N4231 Terms and definitions related to threads and N4156 Light-Weight Execution Agents;
— capturing some aspects of the discussion related to executors from the February 2015 review of the Networking Library Proposal in Cologne; and
— exploring some of the related prior art.
At its most general, one can imagine an executor as something that, when given an arbitrary sequence of instructions, runs them as a thread of execution. In these terms, there are no particular requirements as to whether an executor gives a concurrent, parallel, or weakly parallel progress guarantee.
However, the executor model utilised in the Networking Library Proposal traffics only in arbitrary “normal” function objects executed on the abstract machine. A “normal” function object will eventually be allowed to execute all steps in its thread of execution, but only after it has executed its first step. That is, this executor model provides a parallel progress guarantee.
Therefore, let us define the executor requirements of the Networking Library Proposal as parallel function object executor requirements.
Of course, we can still use these executors to implement other abstractions that provide only a weakly parallel progress guarantee.
For instance, we can implement a scheduler for resumable functions on top of these executors. A given resumable function can have an associated executor, and the work to “resume” the resumable function is performed by an implementation-detail function object running on that executor. Even though that function object is executed with the stronger parallel guarantee, the entire resumable function can assume only weakly parallel progress.
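As a purely hypothetical sketch, where Resumable stands for some resumable-function handle with a resume() member and Executor is any type meeting the parallel function object executor requirements:

```cpp
// Hypothetical sketch: a suspended resumable function is resumed by
// submitting an implementation-detail function object to its associated
// executor. Both type names are invented for illustration.
template <class Executor, class Resumable>
void schedule_resume(Executor& ex, Resumable& r)
{
  // This function object runs with the executor's parallel progress
  // guarantee, but the resumable function as a whole may be suspended and
  // resumed many times, so it can assume only weakly parallel progress.
  ex.post([&r]{ r.resume(); });
}
```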
A library or program is allowed to implement the parallel function object executor requirements such that they provide a concurrent progress guarantee. One possible example is a new_thread_executor where every call to dispatch(), post() or defer() creates a new std::thread.
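As a sketch of this example (illustrative only; the work-counting and allocator aspects of the N4478 executor requirements are omitted):

```cpp
#include <thread>
#include <utility>

// Illustrative sketch: each submitted function object runs on its own
// std::thread, so every function object enjoys a concurrent progress
// guarantee. Work counting and allocator support are omitted for brevity.
class new_thread_executor
{
public:
  template <class F> void dispatch(F f) { std::thread(std::move(f)).detach(); }
  template <class F> void post(F f)     { std::thread(std::move(f)).detach(); }
  template <class F> void defer(F f)    { std::thread(std::move(f)).detach(); }
};
```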
However, the asynchronous operations framework used by the Networking Library Proposal does not require a concurrent progress guarantee — the parallel guarantee suffices for all uses. It is also worth noting that the implied executor requirements of the prior art surveyed below do not require the concurrent guarantee.
Yet, as N4156 suggests, there are times when a concurrent guarantee is required, and it would be beneficial to be able to detect the ability of an executor to meet this guarantee at compile time.
Therefore, it may be desirable to define new concurrent function object executor type requirements. These requirements may be a refinement of the parallel function object executor requirements, or they may be independent.
As a straw man proposal, consider use of the function name spawn() to mean: launch a function object with a concurrent progress guarantee. An executor may provide this member function if it is able to meet this guarantee.
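If such a member were adopted, the compile-time detection mentioned earlier might look like the following sketch, where the trait name has_spawn is invented for illustration:

```cpp
#include <type_traits>
#include <utility>

// Illustrative only: has_spawn<E, F>::value is true when executor type E
// has a spawn() member accepting a function object of type F. The trait
// name and detection technique are not part of any proposal.
template <class E, class F, class = void>
struct has_spawn : std::false_type {};

template <class E, class F>
struct has_spawn<E, F,
    decltype(std::declval<E&>().spawn(std::declval<F>()), void())>
  : std::true_type {};
```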
These hypothetical concurrent function object executor requirements are out of scope for a Networking Library Proposal. Were Networking Library executors to impose such a requirement, it would burden all implementations of the executor type requirements. Furthermore, for some executor types (such as bounded thread pools, and strands or serial executors) the concurrent progress guarantee is impossible to implement.
The Networking Library Proposal executor requirements for dispatch() say:
The executor may invoke f1 prior to returning from dispatch.
This requirement should be redefined in terms of forward progress. For example, a better definition may be to say that dispatch() is permitted to block the forward progress of the caller until f1() finishes execution.
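Under this wording, even a trivial executor that always runs the function object inline would be a conforming implementation of dispatch(); a sketch (type name invented):

```cpp
// Sketch: a dispatch() that always blocks the caller's forward progress
// until f1() finishes execution, which the redefinition above permits.
class inline_executor
{
public:
  template <class F>
  void dispatch(F f) { f(); } // invoke immediately in the calling thread
};
```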
The executor requirements for post() (and, in a similar fashion, defer()) say:
The executor shall not invoke f1 in the current thread of execution prior to returning from post.
When considered in terms of forward progress this may be an over-specification. A possible redefinition of these requirements may be to say that post() and defer() are not permitted to block the forward progress of the caller pending completion of f1().
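For example, a minimal sketch of a conforming post() (type and member names invented; the queue is assumed to be drained elsewhere, e.g. by a run() loop on pool threads):

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Illustrative sketch: post() merely enqueues f1 and returns, so it never
// blocks the caller's forward progress pending f1's completion.
class queueing_executor
{
public:
  template <class F>
  void post(F f)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    queue_.push_back(std::move(f));
  }

private:
  std::mutex mutex_;
  std::deque<std::function<void()>> queue_;
};
```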
When post() is defined in this way, it means that in certain very limited cases an implementation of post() may invoke f1 in the calling thread: for example, when the type of the function object f1 is "known" to the executor and is "known" not to block.
Is the choice between dispatch() and post() better presented as a hint to an executor?
dispatch() is provided as a tool to minimise the latency and overhead of f1's execution, provided doing so does not violate the rules of the executor. The widespread existence of dispatch() in the prior art attests to its utility.
However, a consequence of dispatch()'s specification is that its use can introduce deadlock. This occurs when the caller holds a mutex and f1 attempts to acquire the same mutex. Thus the choice between dispatch() and post() impacts program correctness. A hint, which by definition need not be respected, is an inappropriate way for the caller to express its intention.
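A sketch of this deadlock scenario, where Executor is any type meeting the executor requirements:

```cpp
#include <mutex>

std::mutex m;

// Sketch of the deadlock described above.
template <class Executor>
void caller(Executor& ex)
{
  std::lock_guard<std::mutex> lock(m); // the caller holds the mutex ...
  ex.dispatch([]
    {
      // ... and f1 tries to acquire it. If the executor chooses to invoke
      // f1 in the calling thread, m is locked recursively: deadlock (and,
      // for std::mutex, undefined behaviour).
      std::lock_guard<std::mutex> lock(m);
    });
}
```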
Where the distinction between dispatch() and post()/defer() is related to the forward progress of the caller, the distinction between post() and defer() relates to the forward progress of f1. The executor requirements say:
defer is used to convey the intention of the caller that the submitted function is a continuation of the current call context. The executor may use this information to optimize or otherwise adjust the way in which f is invoked.
A better way to define this may be to say that the use of post() means that we prefer that the caller does not block the first step of f1's progress, whereas defer() means that we prefer that the caller does block the first step of f1.
As the difference between post() and defer() uses the word "prefer", there is a case to be made that this is just a hint to the executor.
It may be more than a hint in circumstances where the caller knows something about the concrete executor being used. For example, on a specific bounded thread pool implementation, a call to post() will allow f1 to exploit the available concurrency in the pool even if the caller continues to execute. (Although, even if a hint is used to distinguish between post() and defer(), this may be addressed by simply specifying that the concrete executor will obey the hint in a certain way.)
In surveying the prior art, we will:
— Identify whether it provides an equivalent to dispatch.
— Identify whether it provides an equivalent to post.
— Identify whether it provides facilities for counting outstanding work, i.e. the equivalent of the on_work_started and on_work_finished functions.
The following table summarises the findings. Where a facility is provided it is marked "y", otherwise "n". In some cases, the prior art provides a variant of dispatch that always blocks the caller, rather than merely being permitted to. These are marked "y (v)".
library                       | dispatch | post | work counting
------------------------------|----------|------|--------------
Boost.Asio                    | y        | y    | y
Java                          | y        | n    | n
Grand Central Dispatch queues | y (v)    | y    | n
.NET                          | y (v)    | y    | y
.NET Reactive Extensions      | y        | n    | n
Thread Building Blocks        | y (v)    | y    | n
Although the Networking Library Proposal is based on Boost.Asio, the executors facility is a relatively new addition to that library, introduced to enable move-only completion handlers. In this section we discuss the original realisation of executor facilities in the library.
The Boost.Asio io_service class is an executor for function objects that provides a parallel progress guarantee.
The io_service::dispatch() and io_service::post() functions provide the dispatch and post semantics respectively.
Work counting is performed via the io_service::work class. Objects of this type automatically count work as they are constructed and destroyed.
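For reference, a small self-contained example of these Boost.Asio facilities:

```cpp
#include <boost/asio/io_service.hpp>
#include <iostream>

int main()
{
  boost::asio::io_service io;

  // post() never invokes the handler before returning.
  io.post([]{ std::cout << "posted\n"; });

  // dispatch() may invoke the handler immediately, but only from within a
  // thread that is calling run(); here the caller is not, so it is queued.
  io.dispatch([]{ std::cout << "dispatched\n"; });

  // io_service::work objects count outstanding work on construction and
  // destruction; while one exists, run() will not return for lack of work.
  { boost::asio::io_service::work work(io); }

  io.run(); // invokes both handlers in this thread
}
```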
Java (http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executor.html)
The Java Executor interface provides only a single method, execute. This method has dispatch semantics:
the Executor interface does not strictly require that execution be asynchronous. In the simplest case, an executor can run the submitted task immediately in the caller's thread.
Grand Central Dispatch provides dispatch queues, which let you execute arbitrary blocks of code either asynchronously or synchronously with respect to the caller. Dispatch queues may be "serial" or "concurrent".
The dispatch_sync function provides dispatch semantics. However, dispatch_sync always blocks the caller:
Submits a block object for execution on a dispatch queue and waits until that block completes. ... As an optimization, this function invokes the block on the current thread when possible.
The dispatch_async function provides post semantics:
Submits an application-defined function for asynchronous execution on a dispatch queue and returns immediately.
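For illustration, a small example using the plain C interface (dispatch_async_f and dispatch_sync_f are the function-pointer counterparts of the block-based calls quoted above):

```cpp
#include <dispatch/dispatch.h>
#include <cstdio>

// A dispatch_function_t takes a single void* context argument.
static void work(void* context)
{
  std::printf("running: %s\n", static_cast<const char*>(context));
}

int main()
{
  dispatch_queue_t queue = dispatch_queue_create("org.example.queue", NULL);

  // post semantics: returns immediately; work runs later on the queue.
  dispatch_async_f(queue, const_cast<char*>("async"), work);

  // dispatch semantics, but always blocks the caller until work completes.
  // On this serial queue it also waits for the previously submitted work.
  dispatch_sync_f(queue, const_cast<char*>("sync"), work);

  dispatch_release(queue);
}
```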
.NET SynchronizationContext (https://msdn.microsoft.com/en-us/library/system.threading.synchronizationcontext(v=vs.110).aspx)
The .NET SynchronizationContext class is used, amongst other things, to coordinate asynchronous operations in a threaded environment.
The Send function provides dispatch semantics; however, it always blocks the caller:
The Send method starts a synchronous request to send a message.
The Post function provides post semantics:
The Post method starts an asynchronous request to post a message.
The OperationStarted and OperationCompleted functions provide the work counting facilities.
.NET Reactive Extensions (https://msdn.microsoft.com/en-us/library/system.reactive.concurrency.scheduler(v=vs.103).aspx)
The Scheduler interface provides several overloads of a Schedule function, used to submit actions.
The existence of a concrete implementation class ImmediateScheduler indicates that the Schedule function provides dispatch semantics:
Represents an object that schedules units of work to run immediately on the current thread.
Thread Building Blocks (https://www.threadingbuildingblocks.org/docs/help/reference/task_scheduler/task_arena_cls.htm)
A task_arena class "represents an internal task scheduler object where a number of threads, limited by a maximal concurrency level, share and execute tasks".
The execute function provides dispatch semantics; however, it always blocks the caller:
If possible, the calling thread joins the arena and executes the specified functor, then leaves the arena ... If not possible to join, the call wraps the functor into a task, enqueues it into the arena, waits using an OS kernel synchronization object for a joining opportunity, and finishes after the task completion.
The enqueue function provides post semantics:
Enqueues a task into the arena to process specified functor and immediately returns.
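For illustration, a brief sketch of this task_arena interface (a real program would synchronise before exiting, since an enqueued task may not have run by then):

```cpp
#include <tbb/task_arena.h>
#include <cstdio>

int main()
{
  tbb::task_arena arena(4); // arena limited to a concurrency level of four

  // execute() has dispatch-like semantics, but always blocks the caller
  // until the functor completes.
  arena.execute([]{ std::puts("executed in the arena"); });

  // enqueue() has post-like semantics: it returns immediately and the
  // functor runs later on an arena thread.
  arena.enqueue([]{ std::puts("enqueued into the arena"); });
}
```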