Document number: P2643R0
Date: 2022-09-15
Reply to: Gonzalo Brito Gadeschi <gonzalob _at_ nvidia.com>
Authors: Gonzalo Brito Gadeschi, Olivier Giroux, Thomas Rodgers
Audience: Concurrency
Improving C++ concurrency features
Revisions
This is the initial revision.
Introduction
When we applied P1135R6 to C++20, we introduced several new concurrency constructs to the C++ concurrency library:
- In <atomic>, the member functions wait, notify_one, and notify_all were added to the class template atomic<> and to the class atomic_flag, along with free-function versions of the same.
- In <semaphore>, the class template counting_semaphore<> and the class binary_semaphore were introduced.
- In <barrier> and <latch>, the class template barrier<> and the class latch were introduced.
Though each of these elements was long in coming, and had much implementation experience behind it, fresh user feedback tells us that some improvements can still be made.
Proposed direction
The following is a coarsely priority-ordered list of requests that both users and implementers have voiced over the last year:
- Add timed versions of atomic::wait.
The primary purpose of this facility is to make it easier to implement other concurrency facilities, but these other facilities often expose timed waiting interfaces themselves. Without timed versions of wait, the programmer is left to ad-hoc solutions for timed waiting facilities, and perhaps even for all waiting facilities (see the sketch after this list). Anecdotally, at least two implementations of C++20 have added internal timed versions of this facility to implement <semaphore>.
Adding timed versions of atomic::wait removes hurdles to the adoption of this facility for its intended purpose. It will also require a discussion of which facilities from <chrono> need to be present in <atomic> for freestanding implementations.
- Return the last observed value from atomic::wait.
After the return from wait, it is common for programs to reload the value of the atomic object. By necessity, the implementation of wait already loaded this value in order to compare it with the supplied operand and return non-spuriously. This is duplicate work which, in principle, could be optimized away by the compiler but conservatively isn't.
Returning the value from atomic::wait is a straightforward way to recover the performance lost to this duplicate work.
- Avoid spurious polling in atomic::wait with at least one of:
a. Add an overload of wait taking a predicate instead of a value.
When the program is waiting for a condition other than "not equal to", there is an added retry loop around the wait operation in the program. This loop causes each call to wait to be performed as if it were the first call to wait, oblivious to the fact that the program has already been waiting for some time. This leads to re-executing the short-term polling strategy.
Taking a predicate instead of a value allows us to push the program-defined condition inside of atomic::wait, delete the outer loop, and let the implementation track the time spent. At least two implementations currently implement atomic::wait in terms of a wait taking a predicate.
b. Add a hint operand to wait to steer the internal strategy.
By default, the short-term strategy inside of wait is to poll the atomic object's value for some time, so as to avoid limiting the responsiveness of the program to that of the operating system kernel's scheduler. Sometimes, however, it is known that either (a) an event cannot, or is not hoped to, occur in this short a window of time, (b) the program has already applied its own polling strategy before the call to wait, or (c) this call to wait is not the first and should be considered a long-term wait.
Taking a hint would let the program indicate whether the short-term strategy of atomic::wait should execute or not.
- Add timed versions of barrier::wait and latch::wait as well.
Since the other waiting facilities in the concurrency library have timed wait functions at this point, it makes sense to add timed versions of these too. Although this is a very weak reason to do anything, there is also no clear reason why we should not do it.
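To make the first and third requests concrete, the following is a minimal sketch of the ad-hoc timed wait that programs are left to write today; the helper name ad_hoc_wait_for is hypothetical, and the pure spin stands in for whatever polling strategy the program invents:
#include <atomic>
#include <chrono>

// Hypothetical helper: the standard offers no timed atomic wait, so the
// deadline handling and the polling strategy are entirely the program's
// responsibility. T is assumed equality-comparable for brevity.
template <class T, class Rep, class Period>
bool ad_hoc_wait_for(const std::atomic<T>& a, T old,
                     std::chrono::duration<Rep, Period> rel_time) {
  const auto deadline = std::chrono::steady_clock::now() + rel_time;
  // Each iteration behaves like a fresh, first call to wait: the program
  // has no way to tell the implementation it has already been waiting.
  while (a.load(std::memory_order_seq_cst) == old) {
    if (std::chrono::steady_clock::now() >= deadline)
      return false; // timed out while the value still equaled old
  }
  return true; // observed a value different from old
}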
Design
The design of the features above is mostly orthogonal, and this section explores them independently.
- Return the last observed value from the atomic::wait APIs: solved as T wait(…); that is, the return type changes from void to T.
- Fallible and timed versions of the wait APIs: solved by adding optional<T> try_wait(...), optional<T> try_wait_for(..., chrono::duration<Rep, Period> const&), and optional<T> try_wait_until(..., chrono::time_point<Clock, Duration> const&) member functions that return nullopt if the wait operation did not synchronize, and an optional<T> containing the T value observed if it did synchronize.
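For illustration, a short usage sketch of the proposed interfaces; none of these members exist in C++20, and the names and semantics are as proposed in this paper:
#include <atomic>
#include <chrono>
#include <optional>

void consumer(std::atomic<int>& state) {
  using namespace std::chrono_literals;

  // Proposed value-returning wait: no second load is needed after waking.
  int observed = state.wait(0);

  // Proposed fallible, timed wait: nullopt signals a timeout.
  if (std::optional<int> v = state.try_wait_for(observed, 50ms)) {
    // *v is the last observed value, which compared unequal to observed.
  } else {
    // Timed out; the value may still equal observed.
  }
}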
Wording
Return last observed value from atomic::wait
To [atomics.ref.generic.general]:
namespace std {
  template<class T> struct atomic_ref { // [atomics.ref.generic.general]
    T wait(T, memory_order = memory_order::seq_cst) const noexcept;
  };
}
UNRESOLVED QUESTION: are all atomic_ref types missing volatile wait overloads?
To [atomics.ref.ops]:
T wait(T old, memory_order order = memory_order::seq_cst) const noexcept;
- Preconditions: order is neither memory_order::release nor memory_order::acq_rel.
- Effects: Repeatedly performs the following steps, in order:
  - Evaluates load(order) and compares its value representation for equality against that of old.
  - If they compare unequal, returns the result of the evaluation of load(order) in the previous step.
  - Blocks until it is unblocked by an atomic notifying operation or is unblocked spuriously.
- Remarks: This function is an atomic waiting operation (atomics.wait) on atomic object *ptr.
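A minimal sketch of the Effects above; the blocking step is elided to a plain retry (a real implementation parks the thread between loads), and memcmp over the object representation approximates the specified value-representation comparison:
#include <atomic>
#include <cstring>

// Sketch only: T is trivially copyable (a requirement of atomic_ref
// anyway), and padding bits are ignored by this approximation.
template <class T>
T wait_sketch(const std::atomic_ref<T>& ref, T old, std::memory_order order) {
  for (;;) {
    T observed = ref.load(order);
    if (std::memcmp(&observed, &old, sizeof(T)) != 0)
      return observed; // return the already-loaded value, no reload needed
    // block until an atomic notifying operation or a spurious unblock
    // (elided; the loop simply retries)
  }
}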
To [atomics.ref.int]:
namespace std {
  template<> struct atomic_ref<integral> {
    integral wait(integral, memory_order = memory_order::seq_cst) const noexcept;
  };
}
To [atomics.ref.float]:
namespace std {
  template<> struct atomic_ref<floating-point> {
    floating-point wait(floating-point, memory_order = memory_order::seq_cst) const noexcept;
  };
}
To [atomics.ref.pointer]:
namespace std {
  template<class T> struct atomic_ref<T*> {
    T* wait(T*, memory_order = memory_order::seq_cst) const noexcept;
  };
}
To [atomics.types.generic.general]:
namespace std {
  template<class T> struct atomic {
    T wait(T, memory_order = memory_order::seq_cst) const volatile noexcept;
    T wait(T, memory_order = memory_order::seq_cst) const noexcept;
  };
}
To [atomics.types.operations]:
T wait(T old, memory_order order = memory_order::seq_cst) const volatile noexcept;
T wait(T old, memory_order order = memory_order::seq_cst) const noexcept;
- Preconditions: order is neither memory_order::release nor memory_order::acq_rel.
- Effects: Repeatedly performs the following steps, in order:
  - Evaluates load(order) and compares its value representation for equality against that of old.
  - If they compare unequal, returns the result of the evaluation of load(order) in the previous step.
  - Blocks until it is unblocked by an atomic notifying operation or is unblocked spuriously.
- Remarks: This function is an atomic waiting operation (atomics.wait).
To [atomics.types.int]:
namespace std {
  template<> struct atomic<integral> {
    integral wait(integral, memory_order = memory_order::seq_cst) const volatile noexcept;
    integral wait(integral, memory_order = memory_order::seq_cst) const noexcept;
  };
}
To [atomics.types.float]:
namespace std {
  template<> struct atomic<floating-point> {
    floating-point wait(floating-point, memory_order = memory_order::seq_cst) const volatile noexcept;
    floating-point wait(floating-point, memory_order = memory_order::seq_cst) const noexcept;
  };
}
To [atomics.types.pointer]:
namespace std {
  template<class T> struct atomic<T*> {
    T* wait(T*, memory_order = memory_order::seq_cst) const volatile noexcept;
    T* wait(T*, memory_order = memory_order::seq_cst) const noexcept;
  };
}
To [util.smartptr.atomic.shared]:
namespace std {
  template<class T> struct atomic<shared_ptr<T>> {
    shared_ptr<T> wait(shared_ptr<T> old, memory_order = memory_order::seq_cst) const noexcept;
  };
}
and
shared_ptr<T> wait(shared_ptr<T> old, memory_order order = memory_order::seq_cst) const noexcept;
- Preconditions: order is neither memory_order::release nor memory_order::acq_rel.
- Effects: Repeatedly performs the following steps, in order:
  - Evaluates load(order) and compares it to old.
  - If the two are not equivalent, returns the result of the evaluation of load(order) in the previous step.
  - Blocks until it is unblocked by an atomic notifying operation or is unblocked spuriously.
- Remarks: Two shared_ptr objects are equivalent if they store the same pointer and either share ownership or are both empty. This function is an atomic waiting operation (atomics.wait).
To [util.smartptr.atomic.weak]:
namespace std {
  template<class T> struct atomic<weak_ptr<T>> {
    weak_ptr<T> wait(weak_ptr<T> old, memory_order = memory_order::seq_cst) const noexcept;
  };
}
weak_ptr<T> wait(weak_ptr<T> old, memory_order order = memory_order::seq_cst) const noexcept;
- Preconditions: order is neither memory_order::release nor memory_order::acq_rel.
- Effects: Repeatedly performs the following steps, in order:
  - Evaluates load(order) and compares it to old.
  - If the two are not equivalent, returns the result of the evaluation of load(order) in the previous step.
  - Blocks until it is unblocked by an atomic notifying operation or is unblocked spuriously.
- Remarks: Two weak_ptr objects are equivalent if they store the same pointer and either share ownership or are both empty. This function is an atomic waiting operation (atomics.wait).
No changes to [atomics.nonmembers] are needed.
No changes to [atomic.flag]'s wait APIs are needed.
Fallible and timed versions of the ::wait APIs
To [atomics.ref.generic.general]:
namespace std {
template<class T> struct atomic_ref { // [atomics.ref.generic.general]
optional<T> try_wait(T, memory_order = memory_order::seq_cst) const noexcept;
template <class Rep, class Period>
optional<T> try_wait_for(
T, chrono::duration<Rep, Period> const& rel_time,
memory_order = memory_order::seq_cst
) const noexcept;
template <class Clock, class Duration>
optional<T> try_wait_until(
T, chrono::time_point<Clock, Duration> const& abs_time,
memory_order = memory_order::seq_cst
) const noexcept;
};
}
UNRESOLVED QUESTION: are all atomic_ref types missing volatile wait overloads?
To [atomics.ref.ops]:
optional<T> try_wait(T old, memory_order order = memory_order::seq_cst) const noexcept;
- Preconditions: order is neither memory_order::release nor memory_order::acq_rel.
- Effects: Performs the following steps in order:
  - Evaluates load(order) and compares its value representation for equality against that of old.
  - If they compare unequal, returns the result of the evaluation of load(order) in the previous step.
  - Otherwise, there is no effect and it returns nullopt.
- Remarks: This function is an atomic waiting operation (atomics.wait).
template <class Rep, class Period>
optional<T> try_wait_for(T old,
chrono::duration<Rep, Period> const& rel_time,
memory_order order = memory_order::seq_cst
) const noexcept;
template <class Clock, class Duration>
optional<T> try_wait_until(T old,
chrono::time_point<Clock, Duration> const& abs_time,
memory_order order = memory_order::seq_cst
) const noexcept;
- Preconditions: order is neither memory_order::release nor memory_order::acq_rel.
- Effects: Repeatedly performs the following steps, in order:
  - Evaluates load(order) and compares its value representation for equality against that of old.
  - If they compare unequal, returns the result of the evaluation of load(order) in the previous step.
  - Blocks until it is unblocked by an atomic notifying operation, is unblocked spuriously, or the timeout has expired. If it is unblocked by the timeout, there is no effect and it returns nullopt.
The timeout expires (thread.req.timing) when the current time is after abs_time (for try_wait_until) or when at least rel_time has passed from the start of the function (for try_wait_for).
An implementation should ensure that try_wait_for and try_wait_until do not consistently return nullopt in the absence of contending atomic operations.
- Throws: Timeout-related exceptions (thread.req.timing).
- Remarks: This function is an atomic waiting operation (atomics.wait) on atomic object *ptr.
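A minimal sketch, assuming the proposed try_wait_until above, of how try_wait_for reduces to it against a steady clock, matching the timeout wording; the free function and its name are illustrative, not the specified implementation:
#include <atomic>
#include <chrono>
#include <optional>

template <class T, class Rep, class Period>
std::optional<T> try_wait_for_sketch(
    const std::atomic_ref<T>& ref, T old,
    const std::chrono::duration<Rep, Period>& rel_time,
    std::memory_order order = std::memory_order::seq_cst) {
  // rel_time is measured from the start of the function (thread.req.timing),
  // so the deadline is computed once, up front.
  const auto deadline = std::chrono::steady_clock::now() + rel_time;
  return ref.try_wait_until(old, deadline, order);
}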
To [atomics.ref.int]:
namespace std {
template<> struct atomic_ref<integral> {
optional<integral> try_wait(integral, memory_order = memory_order::seq_cst) const noexcept;
template <class Rep, class Period>
optional<integral> try_wait_for(
integral, chrono::duration<Rep, Period> const& rel_time,
memory_order = memory_order::seq_cst
) const noexcept;
template <class Clock, class Duration>
optional<integral> try_wait_until(
integral, chrono::time_point<Clock, Duration> const& abs_time,
memory_order = memory_order::seq_cst
) const noexcept;
};
}
To [atomics.ref.float]:
namespace std {
template<> struct atomic_ref<floating-point> {
optional<floating-point> try_wait(floating-point, memory_order = memory_order::seq_cst) const noexcept;
template <class Rep, class Period>
optional<floating-point> try_wait_for(
floating-point, chrono::duration<Rep, Period> const& rel_time,
memory_order = memory_order::seq_cst
) const noexcept;
template <class Clock, class Duration>
optional<floating-point> try_wait_until(
floating-point, chrono::time_point<Clock, Duration> const& abs_time,
memory_order = memory_order::seq_cst
) const noexcept;
};
}
To [atomics.ref.pointer]:
namespace std {
template<class T> struct atomic_ref<T*> {
optional<T*> try_wait(T*, memory_order = memory_order::seq_cst) const noexcept;
template <class Rep, class Period>
optional<T*> try_wait_for(
T*, chrono::duration<Rep, Period> const& rel_time,
memory_order = memory_order::seq_cst
) const noexcept;
template <class Clock, class Duration>
optional<T*> try_wait_until(
T*, chrono::time_point<Clock, Duration> const& abs_time,
memory_order = memory_order::seq_cst
) const noexcept;
};
}
To [atomics.types.generic.general]:
namespace std {
  template<class T> struct atomic {
    optional<T> try_wait(T, memory_order = memory_order::seq_cst) const noexcept;
    optional<T> try_wait(T, memory_order = memory_order::seq_cst) const volatile noexcept;
    template <class Rep, class Period>
    optional<T> try_wait_for(
      T, chrono::duration<Rep, Period> const& rel_time,
      memory_order = memory_order::seq_cst
    ) const noexcept;
    template <class Rep, class Period>
    optional<T> try_wait_for(
      T, chrono::duration<Rep, Period> const& rel_time,
      memory_order = memory_order::seq_cst
    ) const volatile noexcept;
    template <class Clock, class Duration>
    optional<T> try_wait_until(
      T, chrono::time_point<Clock, Duration> const& abs_time,
      memory_order = memory_order::seq_cst
    ) const noexcept;
    template <class Clock, class Duration>
    optional<T> try_wait_until(
      T, chrono::time_point<Clock, Duration> const& abs_time,
      memory_order = memory_order::seq_cst
    ) const volatile noexcept;
  };
}
To [atomics.types.operations]:
optional<T> try_wait(T old, memory_order order = memory_order::seq_cst) const noexcept;
optional<T> try_wait(T old, memory_order order = memory_order::seq_cst) const volatile noexcept;
template <class Rep, class Period>
optional<T> try_wait_for(T old,
chrono::duration<Rep, Period> const& rel_time,
memory_order order = memory_order::seq_cst
) const noexcept;
template <class Rep, class Period>
optional<T> try_wait_for(T old,
chrono::duration<Rep, Period> const& rel_time,
memory_order order = memory_order::seq_cst
) const volatile noexcept;
template <class Clock, class Duration>
optional<T> try_wait_until(T old,
chrono::time_point<Clock, Duration> const& abs_time,
memory_order order = memory_order::seq_cst
) const noexcept;
template <class Clock, class Duration>
optional<T> try_wait_until(T old,
chrono::time_point<Clock, Duration> const& abs_time,
memory_order order = memory_order::seq_cst
) const volatile noexcept;
EDITORIAL: analogous to atomic_ref. Intentionally left out of the current revision of this paper.
To [atomics.types.int]:
namespace std {
template<> struct atomic<integral> {
optional<integral> try_wait(integral, memory_order = memory_order::seq_cst) const noexcept;
optional<integral> try_wait(integral, memory_order = memory_order::seq_cst) const volatile noexcept;
template <class Rep, class Period>
optional<integral> try_wait_for(
integral, chrono::duration<Rep, Period> const& rel_time,
memory_order = memory_order::seq_cst
) const noexcept;
template <class Rep, class Period>
optional<integral> try_wait_for(
integral, chrono::duration<Rep, Period> const& rel_time,
memory_order = memory_order::seq_cst
) const volatile noexcept;
template <class Clock, class Duration>
optional<integral> try_wait_until(
integral, chrono::time_point<Clock, Duration> const& abs_time,
memory_order = memory_order::seq_cst
) const noexcept;
template <class Clock, class Duration>
optional<integral> try_wait_until(
integral, chrono::time_point<Clock, Duration> const& abs_time,
memory_order = memory_order::seq_cst
) const volatile noexcept;
};
}
To [atomics.types.float]:
namespace std {
template<> struct atomic<floating-point> {
optional<floating-point> try_wait(floating-point, memory_order = memory_order::seq_cst) const noexcept;
optional<floating-point> try_wait(floating-point, memory_order = memory_order::seq_cst) const volatile noexcept;
template <class Rep, class Period>
optional<floating-point> try_wait_for(
floating-point, chrono::duration<Rep, Period> const& rel_time,
memory_order = memory_order::seq_cst
) const noexcept;
template <class Rep, class Period>
optional<floating-point> try_wait_for(
floating-point, chrono::duration<Rep, Period> const& rel_time,
memory_order = memory_order::seq_cst
) const volatile noexcept;
template <class Clock, class Duration>
optional<floating-point> try_wait_until(
floating-point, chrono::time_point<Clock, Duration> const& abs_time,
memory_order = memory_order::seq_cst
) const noexcept;
template <class Clock, class Duration>
optional<floating-point> try_wait_until(
floating-point, chrono::time_point<Clock, Duration> const& abs_time,
memory_order = memory_order::seq_cst
) const volatile noexcept;
};
}
To [atomics.types.pointer]:
namespace std {
template<class T> struct atomic<T*> {
optional<T*> try_wait(T*, memory_order = memory_order::seq_cst) const noexcept;
optional<T*> try_wait(T*, memory_order = memory_order::seq_cst) const volatile noexcept;
template <class Rep, class Period>
optional<T*> try_wait_for(
T*, chrono::duration<Rep, Period> const& rel_time,
memory_order = memory_order::seq_cst
) const noexcept;
template <class Rep, class Period>
optional<T*> try_wait_for(
T*, chrono::duration<Rep, Period> const& rel_time,
memory_order = memory_order::seq_cst
) const volatile noexcept;
template <class Clock, class Duration>
optional<T*> try_wait_until(
T*, chrono::time_point<Clock, Duration> const& abs_time,
memory_order = memory_order::seq_cst
) const noexcept;
template <class Clock, class Duration>
optional<T*> try_wait_until(
T*, chrono::time_point<Clock, Duration> const& abs_time,
memory_order = memory_order::seq_cst
) const volatile noexcept;
};
}
To [util.smartptr.atomic.shared]:
namespace std {
template<class T> struct atomic<shared_ptr<T>> {
optional<shared_ptr<T>> try_wait(shared_ptr<T>, memory_order = memory_order::seq_cst) const noexcept;
template <class Rep, class Period>
optional<shared_ptr<T>> try_wait_for(
shared_ptr<T>, chrono::duration<Rep, Period> const& rel_time,
memory_order = memory_order::seq_cst
) const noexcept;
template <class Clock, class Duration>
optional<shared_ptr<T>> try_wait_until(
shared_ptr<T>, chrono::time_point<Clock, Duration> const& abs_time,
memory_order = memory_order::seq_cst
) const noexcept;
};
}
EDITORIAL: analogous to the try_wait APIs of atomic_ref, with shared_ptr/weak_ptr tweaks. Intentionally left out of the current revision of this paper.
To [util.smartptr.atomic.weak]:
namespace std {
template<class T> struct atomic<weak_ptr<T>> {
optional<weak_ptr<T>> try_wait(weak_ptr<T>, memory_order = memory_order::seq_cst) const noexcept;
template <class Rep, class Period>
optional<weak_ptr<T>> try_wait_for(
weak_ptr<T>, chrono::duration<Rep, Period> const& rel_time,
memory_order = memory_order::seq_cst
) const noexcept;
template <class Clock, class Duration>
optional<weak_ptr<T>> try_wait_until(
weak_ptr<T>, chrono::time_point<Clock, Duration> const& abs_time,
memory_order = memory_order::seq_cst
) const noexcept;
};
}
EDITORIAL: analogous to the try_wait APIs of atomic_ref, with shared_ptr/weak_ptr tweaks. Intentionally left out of the current revision of this paper.
EDITORIAL: No changes to [atomics.nonmembers] are needed.
To [atomic.flag]:
namespace std {
struct atomic_flag {
bool try_wait(bool, memory_order = memory_order::seq_cst) const noexcept;
bool try_wait(bool, memory_order = memory_order::seq_cst) const volatile noexcept;
template <class Rep, class Period>
bool try_wait_for(
bool, chrono::duration<Rep, Period> const& rel_time,
memory_order = memory_order::seq_cst
) const noexcept;
template <class Rep, class Period>
bool try_wait_for(
bool, chrono::duration<Rep, Period> const& rel_time,
memory_order = memory_order::seq_cst
) const volatile noexcept;
template <class Clock, class Duration>
bool try_wait_until(
bool, chrono::time_point<Clock, Duration> const& abs_time,
memory_order = memory_order::seq_cst
) const noexcept;
template <class Clock, class Duration>
bool try_wait_until(
bool, chrono::time_point<Clock, Duration> const& abs_time,
memory_order = memory_order::seq_cst
) const volatile noexcept;
};
}
bool atomic_flag_try_wait(const atomic_flag* object, bool old) noexcept;
bool atomic_flag_try_wait(const volatile atomic_flag* object, bool old) noexcept;
bool atomic_flag_try_wait_explicit(const atomic_flag* object, bool old, memory_order order = memory_order::seq_cst) noexcept;
bool atomic_flag_try_wait_explicit(const volatile atomic_flag* object, bool old, memory_order order = memory_order::seq_cst) noexcept;
bool atomic_flag::try_wait(bool old, memory_order order = memory_order::seq_cst) const noexcept;
bool atomic_flag::try_wait(bool old, memory_order order = memory_order::seq_cst) const volatile noexcept;
For atomic_flag_try_wait, let order be memory_order::seq_cst. Let flag be object for the non-member functions, and this for the member functions.
- Preconditions: order is neither memory_order::release nor memory_order::acq_rel.
- Effects: Performs the following steps in order:
  - Evaluates flag->test(order) != old.
  - If the result of that evaluation is true, returns true.
  - Otherwise, it has no effects and returns false.
- Remarks: This function is an atomic waiting operation (atomics.wait).
EDITORIAL: analogous for the atomic_flag_try_wait_for/_until APIs. Intentionally omitted from the current revision of this paper.
UNRESOLVED QUESTION: do we need to change something else for the non-member versions of the try_wait, try_wait_for, and try_wait_until operations?
UNRESOLVED QUESTION: do we need to define a “try-wait” atomic operation in atomics.wait?
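For illustration, the proposed atomic_flag::try_wait amounts to a single non-blocking test against old; the helper name is hypothetical:
#include <atomic>

// Returns true as soon as the flag's value differs from old (here: as soon
// as the flag is set), and false otherwise, without blocking.
bool is_set(const std::atomic_flag& f) {
  return f.try_wait(false);
}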
To [thread.barrier]:
namespace std {
template <class CompletionFunction>
class barrier {
public:
bool try_wait(arrival_token&& tok) const;
template <class Rep, class Period>
bool try_wait_for(arrival_token&& tok, chrono::duration<Rep, Period> const& rel_time) const;
template <class Clock, class Duration>
bool try_wait_until(arrival_token&& tok, chrono::time_point<Clock, Duration> const& abs_time) const;
};
}
UNRESOLVED QUESTION: should we remove the const qualification from the new APIs if P2588 is accepted?
EDITORIAL: these changes are compatible with both adding try_wait overloads that accept a memory_order (P2628) and try_wait overloads that accept a bool parity instead of an arrival_token (P2629).
bool try_wait(arrival_token&& arrival) const;
- Preconditions: arrival is associated with the phase synchronization point for the current phase or the immediately preceding phase of the same barrier object.
- Effects: If arrival is associated with the synchronization point for a previous phase, the call returns true immediately without blocking. Otherwise, there are no effects, and the call returns false.
- Throws: system_error when an exception is required (thread.req.exception).
- Error conditions: Any of the error conditions allowed for mutex types (thread.mutex.requirements.mutex).
UNRESOLVED QUESTION: if P2588 is accepted, then try_wait is able to complete the phase and the Effects clause needs updating, e.g., as follows: “[…] Otherwise, if all threads have arrived, try_wait may complete the phase and return true, or the call has no effects and returns false.”
template <class Rep, class Period>
bool try_wait_for(arrival_token&& tok, chrono::duration<Rep, Period> const& rel_time) const;
template <class Clock, class Duration>
bool try_wait_until(arrival_token&& tok, chrono::time_point<Clock, Duration> const& abs_time) const;
EDITORIAL: try_wait_for and try_wait_until shall have analogous semantics.
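For illustration, a polling loop over the proposed barrier::try_wait; since a false return has no effects, the sketch assumes the arrival token remains usable across retries:
#include <barrier>
#include <utility>

void arrive_and_poll(std::barrier<>& b) {
  auto token = b.arrive();
  // Poll for phase completion instead of blocking in wait(); a true return
  // means the phase the token is associated with has completed.
  while (!b.try_wait(std::move(token))) {
    // the phase has not completed yet; do other useful work between polls
  }
}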
To [thread.latch]:
namespace std {
class latch {
public:
template <class Rep, class Period>
bool try_wait_for(chrono::duration<Rep, Period> const& rel_time) const;
template <class Clock, class Duration>
bool try_wait_until(chrono::time_point<Clock, Duration> const& abs_time) const;
};
}
template <class Rep, class Period>
bool try_wait_for(chrono::duration<Rep, Period> const& rel_time) const;
template <class Clock, class Duration>
bool try_wait_until(chrono::time_point<Clock, Duration> const& abs_time) const;
EDITORIAL: semantics intentionally omitted from the current revision of this paper.
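For illustration, a usage sketch of the proposed latch members; since their semantics are deferred above, the sketch assumes try_wait_for returns true if the counter reached zero within rel_time and false on timeout:
#include <chrono>
#include <latch>

bool finished_within_100ms(const std::latch& done) {
  using namespace std::chrono_literals;
  // Assumed semantics: true if the internal counter reached zero within
  // 100ms, false if the timeout expired first.
  return done.try_wait_for(100ms);
}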