1. Background
[LWG2445] was discussed and resolved by SG1 in Urbana.
1.1. LWG issue #2445
The definitions of compare and exchange in [util.smartptr.shared.atomic] ¶32 and [atomics.types.operations.req] ¶21 state:
Requires: The failure argument shall not be
memory_order_release
normemory_order_acq_rel
. The failure argument shall be no stronger than the success argument.The term "stronger" isn’t defined by the standard.
It is hinted at by [atomics.types.operations.req] ¶22:
When only one
memory_order
argument is supplied, the value ofsuccess
isorder
, and the value offailure
isorder
except that a value ofmemory_order_acq_rel
shall be replaced by the valuememory_order_acquire
and a value ofmemory_order_release
shall be replaced by the valuememory_order_relaxed
.Should the standard define a partial ordering for memory orders, where consume and acquire are incomparable with release?
1.2. Proposed SG1 resolution from Urbana
Add the following note:
[Note: Memory orders have the following relative strengths implied by their definitions:
—end note]
1.3. Further issue
Nonetheless:
-
The resolution isn’t on the LWG tracker.
-
The proposed note was never moved to the draft Standard.
Furthermore, the resolution which SG1 came to in Urbana resolves what "stronger" means by specifying a lattice, but isn’t not clear on what "The failure argument shall be no stronger than the success argument" means given the lattice.
There is no relationship, "stronger" or otherwise, between release and consume/acquire. The current wording says "shall be no stronger" which isn’t the same as "shall not be stronger" in this context. Is that on purpose? At a minimum it’s not clear and should be clarified.
Should the following be valid:
compare_exchange_strong(x, y, z, memory_order_release, memory_order_acquire);
Or does the code need to be:
compare_exchange_strong(x, y, z, memory_order_acq_rel, memory_order_acquire);
Similar questions can be asked for memory_order_consume
ordering on failure
.
Is there even a point in restricting success
/failure
orderings? On
architectures with load-linked/store-conditional instructions the load and store
are distinct instructions which can each have their own memory ordering (with
appropriate leading/trailing fences if required), whereas architectures with
compare-and-exchange already have a limited set of instructions to choose
from. The current limitation (assuming [LWG2445] is resolved) only seems to
restrict compilers on load-linked/store-conditional architectures.
The following code could be valid if the stored data didn’t need to be published nor ordered, whereas any retry needs to read additional data:
compare_exchange_strong(x, y, z, memory_order_relaxed, memory_order_acquire);
Even if—for lack of clever instruction—architectures cannot take advantage of such code, compiler are able to optimize atomics in all sorts of clever ways as discussed in [N4455].
2. Updated proposal
This paper proposes removing the "stronger" restrictions between
compare-exchange’s success
and failure
ordering, and doesn’t add a lattice
to order atomic orderings. The only remaining restriction is that memory_order_release
and memory_order_acq_rel
for failure
are still
disallowed: a failed compare-exchange doesn’t store, the current model is
therefore not sensible with these orderings.
There have been discussions about memory_order_release
loads, e.g. for
seqlock. Such potential changes are left up to future papers.
Modify [util.smartptr.shared.atomic] ¶32 as follows:
Requires: The failure argument shall not be
memory_order_release
normemory_order_acq_rel
.The failure argument shall be no stronger than the success argument.
Modify [atomics.types.operations.req] ¶21 as follows:
Requires: The failure argument shall not be
memory_order_release
normemory_order_acq_rel
.The failure argument shall be no stronger than the success argument.
Leave [atomics.types.operations.req] ¶22 as-is:
Effects: Atomically, compares the contents of the memory pointed to by
object
or bythis
for equality with that inexpected
, and iftrue
, replaces the contents of the memory pointed to byobject
or bythis
with that indesired
, and iffalse
, updates the contents of the memory inexpected
with the contents of the memory pointed to byobject
or bythis
. Further, if the comparison istrue
, memory is affected according to the value ofsuccess
, and if the comparison isfalse
, memory is affected according to the value offailure
.When only one
memory_order
argument is supplied, the value ofsuccess
isorder
, and the value offailure
isorder
except that a value ofmemory_order_acq_rel
shall be replaced by the valuememory_order_acquire
and a value ofmemory_order_release
shall be replaced by the valuememory_order_relaxed
.If the operation returns
true
, these operations are atomic read-modify-write operations (1.10). Otherwise, these operations are atomic load operations.
3. Acknowledgement
Thanks to John McCall for pointing out that the proposed resolution was still insufficient, and for providing ample feedback.