Document Number:	N4032
Date:	2014-05-23
Author:	Anthony Williams Just Software Solutions Ltd

N4032: Comments on continuations and executors

Having implemented the concurrency extensions from D3904 from the Issaquah wiki, it is apparent that there are several aspects of the specification which are incomplete, and others which I find undesirable.

This paper attempts to enumerate those aspects, and proposes fixes and suggestions.

Executors

Abstract class vs concept

While I can understand the need for a concrete executor type that can be passed around, so functions that require executors do not have to be templates, there are various downsides to consider too:

The use of virtual functions constrains the interface, and prevents executors from returning anything from the add function (such as a task handle).
The use of virtual functions forces the use of std::function to wrap all the tasks. This prevents the use of move-only function objects such as std::packaged_task.

If we instead had an exector concept then we could allow implementations that handled these scenarios. We could also provide a type-erased generic_executor_ref which wrapped an underlying executor in a concrete type.

Uses that are already templates, such as std::async could now take the executor type directly as a reference, avoiding the need for virtual function calls, and potentially avoiding the use of a std::function wrapper as well.

If we had a movable_function template equivalent to std::function that only required that the target was movable rather than copyable, then this would be preferable to std::function for passing tasks around.

Scheduled Executors

The use of std::chrono::system_clock for the timeouts is odd: std::chrono::steady_clock should be the natural choice. However the use of timestamps for scheduling is very simplistic, and probably meets very few real use cases. Again, the use of an abstract base class and virtual functions is constraining, and prevents the use of varied scheduling functions for different scenarios.

My recommendation is to remove the scheduled_executor class entirely.

Concrete Executors

The list of concrete executors is limited, but workable. Those executors that have an underlying executor should return it via a generic_executor_ref rather than an executor*, to maintain the concrete interface whilst allowing flexibility in the type of the underlying executor.

One additional executor type that would be nice to have is a multiplexing executor that takes a set of underlying executors, and shares the tasks out between them. This and the loop_executor would provide basic building blocks for a thread pool.

The descriptions of the concrete executors are insufficiently detailed when it comes to the scheduling properties. We need wording to specify when the tasks are destructed, and any synchronization relationships. For those executors that use multiple threads we need to specify additional details about which threads the tasks are run on.

Enhancements to Futures

Executors and `std::async`

Rather than taking a reference to the executor base class, the overloads of std::async should instead just take a reference to anything that implements the Executor requirements. This will require careful wording of the form "shall only partipate in overload resolution if..." for the various overloads to avoid ambiguity.

Executors and `then()`

Rather than taking a reference to the executor base class, the overloads of then() should instead just take a reference to anything that implements the Executor requirements. Since then() only takes a nullary callable, and not an argument set, this is unambiguous.

`unwrap()` and invalid futures

The future returned from unwrap() is specified to be valid, regardless of whether or not the inner future is valid. However, if the outer future becomes ready with an inner future that is not valid, it is unspecified what the behaviour of the returned future should be. The inner future can never become ready (because it has no shared state), but there is now no way to detect this.

I therefore propose that if the inner future is not valid, the outer future becomes ready with an exception of type std::future_error, with an error code of std::future_errc::broken_promise.

`unwrap()` and `std::shared_future`

Firstly, std::shared_future<std::shared_future<R>>::unwrap() is specified to return a std::future<R>. I believe it should return a std::shared_future<R>, which can therefore reference the same shared state as the original inner future, rather than copying the value.

Secondly, std::shared_future<std::future<R>>::unwrap() is specified to move the inner future into the result:

When the inner future is ready, its value (or exception) is moved to the shared state of the returned future.

This moves the result out from under any other copies of that std::shared_future object which refer to the same result — in theory multiple threads could each have their own std::shared_future object referencing the same inner std::future, and each could call unwrap() in parallel. This results in a race condition, which I believe is undesirable.

Instead, I propose that std::shared_future<std::future<R>>::unwrap() is not permitted. Users should either unwrap the std::future<std::future<R>> before it is converted to a std::shared_future, or use a std::shared_future<std::shared_future<R>>, which can be safely unwrapped into a std::shared_future<R>.

`when_all` effects

The effects as specified are:

Each future and shared_future is waited upon and then copied into the collection of the output (returned) future, maintaining the order of the futures in the input collection.

This implies that the futures are waited on before the call to when_all returns, which would defeat the purpose of the call.

It also requires copying of std::future objects, which is not possible.

I propose that the wording is clarified to say that the waiting is done in the background, and that the futures are moved.

`when_any` effects

The effects as specified are:

Each future and shared_future is waited upon. When at least one is ready, all the futures are copied into the collection of the output (returned) future, maintaining the order of the futures in the input collection.

This implies that the futures are waited on before the call to when_any returns, which would defeat the purpose of the call.

It also requires copying of std::future objects, which is not possible.

I propose that the wording is clarified to say that the waiting is done in the background, and that the futures are moved.

`when_all` and deferred tasks

Futures that result from a call to std::async, or a call to then() on another future can refer to deferred tasks. It is currently unspecified what happens when such futures are passed to when_all().

Passing these to when_all should execute the deferred tasks before the call to when_all returns, just as then() does, as otherwise the future returned from when_all will never become ready, as the deferred tasks will not execute until their futures are waited on.

`when_any` and deferred tasks

Whereas when_all() requires all the futures to be ready before the result is ready, when_any only requires one of the supplied futures to be ready. I therefore propose that a call to when_any checks the passed futures in the order passed to see if they are already ready, or are deferred. If future f_i is ready then the result is ready, and no further futures are checked. If future f_i is deferred, then the deferred task is executed, the result is ready, and no further futures are checked.

Partial proposed wording

What follows is partial wording for addressing some of the issues raised in this paper.

Remove the executor and scheduled_executor classes (2.2.1 and 2.2.2). Replace them with an executor concept and generic_executor_ref class defined as follows:

2.2.x Requirements for Executor types

An Executor type is a class that manages the scheduling and execution of supplied tasks. The details of the scheduling and ordering of the tasks, along with the execution agents used to execute the tasks will vary between executors. In order for a type E to qualify as an Executor type, the following expressions must be supported, with the specified semantics, where e denotes a value of type E, and f denotes a value of a callable type F such that f() is well-formed, and F is CopyConstructible.

e.add(f)

Effects:

A copy of f is constructed in internal storage as if F g(f). The copy of f is executed at the time and manner specified for type E.

Throws:

Any exception thrown by the copy constructor of f. std::bad_alloc if sufficient internal storage cannot be allocated. Any other exceptions specified by E.

Synchronization:

The completion of the copy-construction of f into internal storage synchronizes-with the start of the execution of that copy.

e.num_pending_closures()

Returns:

A value implicitly convertible to size_t which is the number of tasks submitted to the executor but not yet started.

2.2.y Class generic_executor_ref

generic_executor_ref satisfies the Executor requirements (2.2.x). It wraps a reference to a concrete executor type.
class generic_executor_ref {
public:
    template<typename E>
    generic_executor_ref(E& e) noexcept;

    generic_executor_ref(generic_executor_ref const& other) noexcept;
    generic_executor_ref& operator=(generic_executor_ref const& other) noexcept;

    void add(std::function<void()> f);
    size_t num_pending_closures() const;
};
template<typename E> generic_executor_ref(E& e) noexcept;

Requires:

E shall satisfy the Executor requirements (2.2.x).

Effects:

Constructs a new instance of generic_executor_ref that refers to e.

generic_executor_ref(generic_executor_ref const& other) noexcept;

Effects:

Constructs a new instance of generic_executor_ref that refers to the same underlying executor as other.

generic_executor_ref& operator=(generic_executor_ref const& other) noexcept;

Postconditions:

*this refers to the same underlying executor as other.

Returns:

*this

void add(std::function<void()> f);

Effects:

e.add(f), where e is the underlying executor referred to by *this.

size_t num_pending_closures() const;

Returns:

e.num_pending_closures(), where e is the underlying executor referred to by *this.

Modify the effects clause of 3.4 (when_all):

Effects:

Each future and shared_future is waited upon and then copied into the collection of the output (returned) future, maintaining the order of the futures in the input collection.

If any of the futures supplied to a call to when_all refer to deferred tasks that have not started execution, those tasks are executed before the call to when_all returns. Once all such tasks have been executed, the call to when_all returns immediately.

The call to when_all does not wait for non-deferred tasks, or deferred tasks that have already started executing elsewhere, to complete before returning.

Once all the futures supplied to the call to when_all are ready, the futures are moved into the associated state of the future returned from the call to when_all, preserving the order of the futures supplied to when_all. That future is then ready.

The future returned by when_all will not throw an exception, but the futures held in the output collection may.

Modify the effects clause of 3.5 (when_any):

Effects:

Each future and shared_future is waited upon. When at least one is ready, all the futures are copied into the collection of the output (returned) future, maintaining the order of the futures in the input collection.

Each of the futures supplied to when_any is checked in the order supplied. If a given future is ready, then no further futures are checked, and the call to when_any returns immediately. If a given future refers to a deferred task that has not yet started execution, then no further futures are checked, that task is executed, and the call to when_any then returns immediately.

The call to when_any does not wait for non-deferred tasks, or deferred tasks that have already started executing elsewhere, to complete before returning.

Once at least one of the futures supplied to the call to when_any are ready, the futures are moved into the associated state of the future returned from the call to when_any, preserving the order of the futures supplied to when_any. That future is then ready.

The future returned by when_any will not throw an exception, but the futures held in the output collection may.

N4032: Comments on continuations and executors

Executors

Abstract class vs concept

Scheduled Executors

Concrete Executors

Enhancements to Futures

Executors and `std::async`

Executors and `then()`

`unwrap()` and invalid futures

`unwrap()` and `std::shared_future`

`when_all` effects

`when_any` effects

`when_all` and deferred tasks

`when_any` and deferred tasks

Partial proposed wording

2.2.x Requirements for Executor types

`e.add(f)`

`e.num_pending_closures()`

2.2.y Class `generic_executor_ref`

`template<typename E> generic_executor_ref(E& e) noexcept;`

`generic_executor_ref(generic_executor_ref const& other) noexcept;`

`generic_executor_ref& operator=(generic_executor_ref const& other) noexcept;`

`void add(std::function<void()> f);`

`size_t num_pending_closures() const;`

N4032: Comments on continuations and executors

Executors

Abstract class vs concept

Scheduled Executors

Concrete Executors

Enhancements to Futures

Executors and std::async

Executors and then()

unwrap() and invalid futures

unwrap() and std::shared_future

when_all effects

when_any effects

when_all and deferred tasks

when_any and deferred tasks

Partial proposed wording

2.2.x Requirements for Executor types

e.add(f)

e.num_pending_closures()

2.2.y Class generic_executor_ref

template<typename E> generic_executor_ref(E& e) noexcept;

generic_executor_ref(generic_executor_ref const& other) noexcept;

generic_executor_ref& operator=(generic_executor_ref const& other) noexcept;

void add(std::function<void()> f);

size_t num_pending_closures() const;

Executors and `std::async`

Executors and `then()`

`unwrap()` and invalid futures

`unwrap()` and `std::shared_future`

`when_all` effects

`when_any` effects

`when_all` and deferred tasks

`when_any` and deferred tasks

`e.add(f)`

`e.num_pending_closures()`

2.2.y Class `generic_executor_ref`

`template<typename E> generic_executor_ref(E& e) noexcept;`

`generic_executor_ref(generic_executor_ref const& other) noexcept;`

`generic_executor_ref& operator=(generic_executor_ref const& other) noexcept;`

`void add(std::function<void()> f);`

`size_t num_pending_closures() const;`