Doc. no: N4482
Date: 2015-04-13
Reply-To: Christopher Kohlhoff <chris@kohlhoff.com>
This document is intended to provide some input to further discussion of executors by:
— attempting to relate executor requirements, as defined in N4478 Networking Library Proposal, to the terminology described in N4231 Terms and definitions related to threads and N4156 Light-Weight Execution Agents;
— capturing some aspects of the discussion related to executors from the February 2015 review of the Networking Library Proposal in Cologne; and
— exploring some of the related prior art.
At its most general, one can imagine an executor as something that, when given an arbitrary sequence of instructions, runs them as a thread of execution. In these terms, there are no particular requirements as to whether an executor gives a concurrent, parallel, or weakly parallel progress guarantee.
However, the executor model utilised in the Networking Library Proposal traffics only in arbitrary “normal” function objects executed on the abstract machine. A “normal” function object will eventually be allowed to execute all steps in its thread of execution, but only after it has executed its first step. That is, this executor model provides a parallel progress guarantee.
Therefore, let us define the executor requirements of the Networking Library Proposal as parallel function object executor requirements.
Of course, we can still use these executors to implement other abstractions that provide only a weakly parallel progress guarantee.
For instance, we can implement a scheduler for resumable functions on top of these executors. A given resumable function can have an associated executor, and the work to “resume” the resumable function is performed by an implementation-detail function object running on that executor. Even though that function object is executed with the stronger parallel guarantee, the entire resumable function can assume only weakly parallel progress.
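As a purely hypothetical sketch, where Resumable stands for some resumable-function handle with a resume() member and Executor is any type meeting the parallel function object executor requirements:

```cpp
// Hypothetical sketch: a suspended resumable function is resumed by
// submitting an implementation-detail function object to its associated
// executor. Both type names are invented for illustration.
template <class Executor, class Resumable>
void schedule_resume(Executor& ex, Resumable& r)
{
  // This function object runs with the executor's parallel progress
  // guarantee, but the resumable function as a whole may be suspended and
  // resumed many times, so it can assume only weakly parallel progress.
  ex.post([&r]{ r.resume(); });
}
```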
A library or program is allowed to implement the parallel function object executor requirements such that they provide a concurrent progress guarantee. One possible example is a new_thread_executor where every call to dispatch(), post() or defer() creates a new std::thread.
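As a sketch of this example (illustrative only; the work-counting and allocator aspects of the N4478 executor requirements are omitted):

```cpp
#include <thread>
#include <utility>

// Illustrative sketch: each submitted function object runs on its own
// std::thread, so every function object enjoys a concurrent progress
// guarantee. Work counting and allocator support are omitted for brevity.
class new_thread_executor
{
public:
  template <class F> void dispatch(F f) { std::thread(std::move(f)).detach(); }
  template <class F> void post(F f)     { std::thread(std::move(f)).detach(); }
  template <class F> void defer(F f)    { std::thread(std::move(f)).detach(); }
};
```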
However, the asynchronous operations framework used by the Networking Library Proposal does not require a concurrent progress guarantee — the parallel guarantee suffices for all uses. It is also worth noting that the implied executor requirements of the prior art surveyed below do not require the concurrent guarantee.
Yet, as N4156 suggests, there are times when a concurrent guarantee is required, and it would be beneficial to be able to detect the ability of an executor to meet this guarantee at compile time.
Therefore, it may be desirable to define new concurrent function object executor type requirements. These requirements may be a refinement of the parallel function object executor requirements, or they may be independent.
As a straw man proposal, consider use of the function name spawn() to mean: launch a function object with a concurrent progress guarantee. An executor may provide this member function if it is able to meet this guarantee.
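If such a member were adopted, the compile-time detection mentioned earlier might look like the following sketch, where the trait name has_spawn is invented for illustration:

```cpp
#include <type_traits>
#include <utility>

// Illustrative only: has_spawn<E, F>::value is true when executor type E
// has a spawn() member accepting a function object of type F. The trait
// name and detection technique are not part of any proposal.
template <class E, class F, class = void>
struct has_spawn : std::false_type {};

template <class E, class F>
struct has_spawn<E, F,
    decltype(std::declval<E&>().spawn(std::declval<F>()), void())>
  : std::true_type {};
```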
These hypothetical concurrent function object executor requirements are out of scope for a Networking Library Proposal. Were Networking Library executors to impose such a requirement, it would burden all implementations of the executor type requirements. Furthermore, for some executor types (such as bounded thread pools, and strands or serial executors) the concurrent progress guarantee is impossible to implement.
The Networking Library Proposal executor requirements for dispatch() say:
The executor may invoke f1 prior to returning from dispatch.
This requirement should be redefined in terms of forward progress. For example, a better definition may be to say that dispatch() is permitted to block the forward progress of the caller until f1() finishes execution.
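Under this wording, even a trivial executor that always runs the function object inline would be a conforming implementation of dispatch(); a sketch (type name invented):

```cpp
// Sketch: a dispatch() that always blocks the caller's forward progress
// until f1() finishes execution, which the redefinition above permits.
class inline_executor
{
public:
  template <class F>
  void dispatch(F f) { f(); } // invoke immediately in the calling thread
};
```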
The executor requirements for post() (and, in a similar fashion, defer()) say:
The executor shall not invoke f1 in the current thread of execution prior to returning from post.
When considered in terms of forward progress this may be an over-specification. A possible redefinition of these requirements may be to say that post() and defer() are not permitted to block the forward progress of the caller pending completion of f1().
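For example, a minimal sketch of a conforming post() (type and member names invented; the queue is assumed to be drained elsewhere, e.g. by a run() loop on pool threads):

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Illustrative sketch: post() merely enqueues f1 and returns, so it never
// blocks the caller's forward progress pending f1's completion.
class queueing_executor
{
public:
  template <class F>
  void post(F f)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    queue_.push_back(std::move(f));
  }

private:
  std::mutex mutex_;
  std::deque<std::function<void()>> queue_;
};
```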
When post() is defined in this way, it means that in certain very limited cases an implementation of post() may invoke f1 in the calling thread: for example, when the type of the function object f1 is "known" to the executor and is "known" not to block.
Is the choice between dispatch() and post() better presented as a hint to an executor?
dispatch() is provided as a tool to minimise the latency and overhead of f1's execution, provided doing so does not violate the rules of the executor. The widespread existence of dispatch() in the prior art attests to its utility.
However, a consequence of dispatch()'s specification is that its use can introduce deadlock. This occurs when the caller holds a mutex and f1 attempts to acquire the same mutex. Thus the choice between dispatch() and post() impacts program correctness. A hint, which by definition need not be respected, is an inappropriate way for the caller to express its intention.
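A sketch of this deadlock scenario, where Executor is any type meeting the executor requirements:

```cpp
#include <mutex>

std::mutex m;

// Sketch of the deadlock described above.
template <class Executor>
void caller(Executor& ex)
{
  std::lock_guard<std::mutex> lock(m); // the caller holds the mutex ...
  ex.dispatch([]
    {
      // ... and f1 tries to acquire it. If the executor chooses to invoke
      // f1 in the calling thread, m is locked recursively: deadlock (and,
      // for std::mutex, undefined behaviour).
      std::lock_guard<std::mutex> lock(m);
    });
}
```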
Where the distinction between dispatch() and post()/defer() is related to the forward progress of the caller, the distinction between post() and defer() relates to the forward progress of f1. The executor requirements say:
defer is used to convey the intention of the caller that the submitted function is a continuation of the current call context. The executor may use this information to optimize or otherwise adjust the way in which f is invoked.
A better way to define this may be to say that the use of post() means that we prefer that the caller does not block the first step of f1's progress, whereas defer() means that we prefer that the caller does block the first step of f1.
As the difference between post() and defer() uses the word "prefer", there is a case to be made that this is just a hint to the executor.
It may be more than a hint in circumstances where the caller knows something about the concrete executor being used. For example, on a specific bounded thread pool implementation, a call to post() will allow f1 to exploit the available concurrency in the pool even if the caller continues to execute. (Although, even if a hint is used to distinguish between post() and defer(), this may be addressed by simply specifying that the concrete executor will obey the hint in a certain way.)
In surveying the prior art, we will:
— Identify whether it provides an equivalent to dispatch.
— Identify whether it provides an equivalent to post.
— Identify whether it provides facilities for counting outstanding work, i.e. the equivalent of the on_work_started and on_work_finished functions.
The following table summarises the findings. Where a facility is provided it is marked "y", otherwise "n". In some cases, the prior art provides a variant of dispatch that always blocks the caller, rather than merely being permitted to. These are marked "y (v)".
library                       | dispatch | post | work counting
------------------------------|----------|------|--------------
Boost.Asio                    | y        | y    | y
Java                          | y        | n    | n
Grand Central Dispatch queues | y (v)    | y    | n
.NET                          | y (v)    | y    | y
.NET Reactive Extensions      | y        | n    | n
Thread Building Blocks        | y (v)    | y    | n
Although the Networking Library Proposal is based on Boost.Asio, the executors facility is a relatively new addition to that library, introduced to enable move-only completion handlers. In this section we discuss the original realisation of executor facilities in the library.
The Boost.Asio io_service class is an executor for function objects that provides a parallel progress guarantee.
The io_service::dispatch() and io_service::post() functions provide the dispatch and post semantics respectively.
Work counting is performed via the io_service::work class. Objects of this type automatically count work as they are constructed and destroyed.
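For reference, a small self-contained example of these Boost.Asio facilities:

```cpp
#include <boost/asio/io_service.hpp>
#include <iostream>

int main()
{
  boost::asio::io_service io;

  // post() never invokes the handler before returning.
  io.post([]{ std::cout << "posted\n"; });

  // dispatch() may invoke the handler immediately, but only from within a
  // thread that is calling run(); here the caller is not, so it is queued.
  io.dispatch([]{ std::cout << "dispatched\n"; });

  // io_service::work objects count outstanding work on construction and
  // destruction; while one exists, run() will not return for lack of work.
  { boost::asio::io_service::work work(io); }

  io.run(); // invokes both handlers in this thread
}
```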
Java (http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executor.html)
The Java Executor interface provides only a single method, execute. This method has dispatch semantics:
the Executor interface does not strictly require that execution be asynchronous. In the simplest case, an executor can run the submitted task immediately in the caller's thread.
Grand Central Dispatch provides dispatch queues, which let you execute arbitrary blocks of code either asynchronously or synchronously with respect to the caller. Dispatch queues may be "serial" or "concurrent".
The dispatch_sync function provides dispatch semantics. However, dispatch_sync always blocks the caller:
Submits a block object for execution on a dispatch queue and waits until that block completes. ... As an optimization, this function invokes the block on the current thread when possible.
The dispatch_async function provides post semantics:
Submits an application-defined function for asynchronous execution on a dispatch queue and returns immediately.
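For illustration, a small example using the plain C interface (dispatch_async_f and dispatch_sync_f are the function-pointer counterparts of the block-based calls quoted above):

```cpp
#include <dispatch/dispatch.h>
#include <cstdio>

// A dispatch_function_t takes a single void* context argument.
static void work(void* context)
{
  std::printf("running: %s\n", static_cast<const char*>(context));
}

int main()
{
  dispatch_queue_t queue = dispatch_queue_create("org.example.queue", NULL);

  // post semantics: returns immediately; work runs later on the queue.
  dispatch_async_f(queue, const_cast<char*>("async"), work);

  // dispatch semantics, but always blocks the caller until work completes.
  // On this serial queue it also waits for the previously submitted work.
  dispatch_sync_f(queue, const_cast<char*>("sync"), work);

  dispatch_release(queue);
}
```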
.NET SynchronizationContext (https://msdn.microsoft.com/en-us/library/system.threading.synchronizationcontext(v=vs.110).aspx)
The .NET SynchronizationContext class is used, amongst other things, to coordinate asynchronous operations in a threaded environment.
The Send function provides dispatch semantics; however, it always blocks the caller:
The Send method starts a synchronous request to send a message.
The Post function provides post semantics:
The Post method starts an asynchronous request to post a message.
The OperationStarted and OperationCompleted functions provide the work counting facilities.
.NET Reactive Extensions (https://msdn.microsoft.com/en-us/library/system.reactive.concurrency.scheduler(v=vs.103).aspx)
The Scheduler interface provides several overloads of a Schedule function, used to submit actions.
The existence of a concrete implementation class ImmediateScheduler indicates that the Schedule function provides dispatch semantics:
Represents an object that schedules units of work to run immediately on the current thread.
Thread Building Blocks (https://www.threadingbuildingblocks.org/docs/help/reference/task_scheduler/task_arena_cls.htm)
A task_arena class "represents an internal task scheduler object where a number of threads, limited by a maximal concurrency level, share and execute tasks".
The execute function provides dispatch semantics; however, it always blocks the caller:
If possible, the calling thread joins the arena and executes the specified functor, then leaves the arena ... If not possible to join, the call wraps the functor into a task, enqueues it into the arena, waits using an OS kernel synchronization object for a joining opportunity, and finishes after the task completion.
The enqueue function provides post semantics:
Enqueues a task into the arena to process specified functor and immediately returns.
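For illustration, a brief sketch of this task_arena interface (a real program would synchronise before exiting, since an enqueued task may not have run by then):

```cpp
#include <tbb/task_arena.h>
#include <cstdio>

int main()
{
  tbb::task_arena arena(4); // arena limited to a concurrency level of four

  // execute() has dispatch-like semantics, but always blocks the caller
  // until the functor completes.
  arena.execute([]{ std::puts("executed in the arena"); });

  // enqueue() has post-like semantics: it returns immediately and the
  // functor runs later on an arena thread.
  arena.enqueue([]{ std::puts("enqueued into the arena"); });
}
```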